Feature: Add RoPE/ALiBi positional embeddings to the Optimus model

Currently, the Optimus model uses sinusoidal positional embeddings. Implement either RoPE (rotary position embeddings) or ALiBi (attention with linear biases). The chosen scheme should support extrapolation to context lengths longer than those used during pre-training.
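
As a starting point for discussion, here is a minimal pure-Python sketch of the ALiBi option. It is not tied to the Optimus codebase: the function names `alibi_slopes` and `alibi_bias` are hypothetical, the attention logits are assumed to have shape (heads, query, key), and the slope formula is the one commonly used when the number of heads is a power of two. The bias depends only on the distance between query and key positions, so it can be built for any sequence length at inference time, which is what makes extrapolation possible.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (other head counts need an
    interpolation scheme that this sketch does not cover).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch assumes num_heads is a power of two")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j], to be added to the attention logits before softmax.

    For a query at position i and a key at position j <= i, the bias is
    -slope_h * (i - j). Future positions (j > i) get -inf, i.e. a causal mask.
    """
    slopes = alibi_slopes(num_heads)
    return [
        [
            [-m * (i - j) if j <= i else -math.inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]


if __name__ == "__main__":
    # Small demonstration: print the bias matrix of the first head.
    for row in alibi_bias(num_heads=8, seq_len=4)[0]:
        print(row)
```

In a real implementation the same bias would presumably be built as a tensor and added to the scaled query-key scores, with the sinusoidal embeddings no longer added to the token embeddings. RoPE would instead rotate the query and key vectors by position-dependent angles and would need its own change in the attention layer.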