
Optimus Prime

(Yet another) PyTorch framework for training large language models.

How to use

Training

An example of how to use the framework can be found in training.py. Feel free to adapt it as needed. Also see Custom training below.

Inference

After training a model (or obtaining one from another source), there's an example of how to run inference in inference.py. It uses nucleus sampling, with adjustable top-p threshold and temperature values.
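The sampler in inference.py is not reproduced here, but the technique itself is easy to show. The sketch below is a minimal, self-contained top-p (nucleus) implementation in pure Python; the function name and signature are this example's own, not the framework's API:

```python
import math
import random

def nucleus_sample(logits, top_p=0.9, temperature=1.0, rng=random):
    """Sample a token index from `logits` using nucleus (top-p) sampling."""
    # Softmax with temperature (subtract the max for numerical stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sort token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalize over the kept tokens and draw one of them.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Lowering top_p shrinks the nucleus toward greedy decoding, while temperature below 1.0 sharpens the distribution before the cutoff is applied.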

Basic building blocks

Like its parent PyTorch, the framework is split into a number of modules. The most important ones are the OptimusDataLoader, the datasets, the Trainer, the tokenizers and the models. These can be combined and adapted in any way, shape or form to train a model from scratch.

Custom training

The usual workflow is to create and train a tokenizer (see optimus/tokenizers for an example), prepare a dataset (see optimus/datasets for an example), create a model architecture (see optimus/models for an example), and then use the data loader and trainer modules to train the model. The Trainer module has a number of useful options that can be enabled during training (mixed precision training, checkpointing, gradient accumulation, plotting the training loss, etc.; see optimus/trainer.py for everything the Trainer is capable of).

Of course, for any of the above steps, the provided defaults can be used instead of custom implementations.
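Among the Trainer options listed above, gradient accumulation is the least obvious. The idea can be illustrated without the framework at all: the toy loop below fits a one-parameter linear model by accumulating per-sample gradients and applying an optimizer step only every `accum` samples. All names here are this sketch's own, not the Trainer's actual API:

```python
def train_accumulated(data, steps=100, lr=0.01, accum=4):
    """Fit y = w * x by SGD, stepping only every `accum` micro-batches."""
    w = 0.0
    for _ in range(steps):
        grad = 0.0
        for k, (x, y) in enumerate(data, start=1):
            # Accumulate the mean-squared-error gradient for this sample.
            grad += 2 * (w * x - y) * x / len(data)
            # Apply one optimizer step per `accum` samples (and at the end).
            if k % accum == 0 or k == len(data):
                w -= lr * grad
                grad = 0.0
    return w
```

This is how accumulation lets a trainer simulate a large effective batch size while only ever holding a small micro-batch in memory.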

[!TIP] You can choose which GPUs to train on using the CUDA_VISIBLE_DEVICES environment variable. For example, you can train on the second available GPU on the system with CUDA_VISIBLE_DEVICES=1 python optimus/example_training.py.

Required packages

A number of packages are required to run the framework. Install them with:

pip install torch fire sentencepiece fastprogress matplotlib

License

See LICENSE.