# Optimus Prime

(Yet another) PyTorch framework for training large language models.
## How to use

### Training

An example of how to use the framework can be found in `training.py`. Feel free to adapt it as needed. Also see [Custom training](#custom-training).
### Inference

After training a model (or getting hold of one from other sources), there's an example of how to run inference in `inference.py`. It uses nucleus sampling, with adjustable top-p threshold and temperature values.
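Nucleus (top-p) sampling keeps only the smallest set of tokens whose cumulative probability reaches the top-p threshold, then samples from that set. The real implementation lives in `inference.py`; the sketch below is an illustrative, dependency-free version of the idea (the function name and signature are hypothetical, not the framework's API):

```python
import math
import random

def nucleus_sample(logits, top_p=0.9, temperature=1.0, rng=None):
    """Sample a token index from `logits` using nucleus (top-p) sampling."""
    rng = rng or random.Random()
    # Apply temperature, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort token indices by probability, descending.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens and draw one of them.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Lower `top_p` or lower `temperature` concentrates sampling on the most likely tokens; `top_p=1.0` falls back to plain temperature sampling.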
## Custom training

The usual workflow is to create a model architecture (see `optimus/models` for an example) and use the trainer modules to train the model. The `Trainer` module has a number of useful options which can be used during training (mixed precision training, checkpointing, gradient accumulation etc.; see `optimus/trainer.py` for what the `Trainer` is capable of).
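To make the gradient accumulation option concrete: summing properly scaled micro-batch gradients reproduces the full-batch gradient, which is what lets a large effective batch size fit in limited GPU memory. The sketch below demonstrates that equivalence for a one-parameter least-squares model in plain Python; it is illustrative only and not the framework's `Trainer` code:

```python
def loss_grad(w, x, y):
    # Gradient of the squared error (w*x - y)**2 with respect to w.
    return 2 * (w * x - y) * x

def full_batch_grad(w, data):
    # Mean gradient over the whole batch, computed in one pass.
    return sum(loss_grad(w, x, y) for x, y in data) / len(data)

def accumulated_grad(w, data, micro_batch_size):
    # Process the batch in micro-batches, accumulating scaled gradients,
    # the way a trainer with gradient accumulation would before stepping.
    acc = 0.0
    for i in range(0, len(data), micro_batch_size):
        micro = data[i:i + micro_batch_size]
        micro_mean = sum(loss_grad(w, x, y) for x, y in micro) / len(micro)
        # Weight each micro-batch gradient by its share of the full batch.
        acc += micro_mean * len(micro) / len(data)
    return acc
```

The two functions agree up to floating-point rounding, so the optimizer step taken after accumulation matches the step a single large batch would produce.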
> [!TIP]
> You can choose which GPUs to train on using the `CUDA_VISIBLE_DEVICES` environment variable. For example, you can train on the second available GPU on the system with `CUDA_VISIBLE_DEVICES=1 python optimus/example_training.py`.
## Required packages

There are a number of packages required to run the framework. There's a convenience `conda-env.yml` file that should cover most use cases. Grab your closest Python retailer and ask them to run the following command:

```shell
conda env create -f conda-env.yml
```
## License

See LICENSE.