From 174ba1aa1bfc617fb97a9613fcf41a643a0c0ff2 Mon Sep 17 00:00:00 2001
From: Alexandru Gherghescu <gherghescu_alex1@yahoo.ro>
Date: Wed, 24 Jan 2024 20:20:18 +0200
Subject: [PATCH] Update README file

---
 README.md | 53 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 66a8a72..d426eed 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,39 @@
 # Optimus Prime
 
-Implementation of the Transformer architecture and different variations on it,
-as well as the training loop, written from scratch in PyTorch.
+(Yet another) PyTorch framework for training large language models.
 
-Most of the code is easy to read and should be self-explanatory, as it is
-documented.
+## How to use
 
-There is an example of how to use the 'framework' in
-`optimus/example_training.py`.
+### Training
 
-## Training a custom model with a custom dataset
+An example of how to use the framework can be found in `training.py`. Feel free
+to adapt it as needed. Also see [Custom training](#custom-training).
 
-The requirements to train a model of a custom size from scratch are:
-1. Get a dataset from wherever, and process it to work with the other components
-   of the 'framework'. See `optimus/datasets/wikitext103.py` for an example.
-2. Create a tokenizer for the model, or use the existing LLama 2 tokenizer
-   (`optimus/llama32K.model`). To create a tokenizer, see
-   `optimus/tokenizer.py`. Creating a tokenizer requires a dataset ready.
-3. Create the model, and specify the training parameters (see
-   `optimus/example_training.py` for an example of what options you can
-   provide). Additionally, `optimus/trainer.py` can be directly modified, for
-   options currently not exposed through an interface.
+### Inference
+
+After training a model (or getting hold of one from other sources), see
+`inference.py` for an example of how to run inference. Feel free to adapt it
+as needed.
+
+## Basic building blocks
+
+Like its parent, PyTorch, the framework is split into a number of modules. The
+most important modules are the `OptimusDataLoader`, the `Dataset`s, the
+`Trainer`, the tokenizers and the models. These can be combined and adapted in
+any way, shape or form to train a model from scratch.
+
+## Custom training
+
+The usual workflow is to create and train a tokenizer (see `optimus/tokenizers`
+for an example), prepare a dataset (see `optimus/datasets` for an example),
+create a model architecture (see `optimus/models` for an example) and use the
+data loader and the trainer modules to train the model. The `Trainer` module
+has a number of useful options which can be used during training (mixed
+precision training, checkpointing, gradient accumulation, plotting the
+training loss etc.; see `optimus/trainer.py` for what the `Trainer` is capable
+of).
+
+Of course, the provided defaults can be used for any number of the steps above.
 
 > [!TIP]
 > You can choose which GPUs to train on, using the environment variable
@@ -29,7 +42,11 @@ The requirements to train a model of a custom size from scratch are:
 
 ## Required packages
 
-There are a number of packages required to run the thing. Get your closest
+There are a number of packages required to run the framework. Get your closest
 Python retailer and ask them to run the following command:
 
 `pip install torch fire sentencepiece fastprogress matplotlib`
+
+## License
+
+See [LICENSE](LICENSE).
-- 
GitLab
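The tokenizer step described under "Custom training" is not spelled out in the patch itself. As a rough sketch of what training a tokenizer involves, here is the plain `sentencepiece` workflow (one of the required packages); the corpus path, model prefix and vocabulary size are placeholder values, and the framework's own wrappers in `optimus/tokenizers` may look different:

```python
import sentencepiece as spm

# Train a subword tokenizer from a raw text corpus.
# 'corpus.txt' (one sentence per line) and the vocab size are placeholders.
spm.SentencePieceTrainer.train(
    input='corpus.txt',
    model_prefix='tokenizer',   # writes tokenizer.model / tokenizer.vocab
    vocab_size=32000,
)

# Load the trained model and tokenize some text.
sp = spm.SentencePieceProcessor(model_file='tokenizer.model')
print(sp.encode('Hello world!', out_type=str))
```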
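The `Dataset`s under `optimus/datasets` are likewise not shown in the patch. As an illustration of what such a component boils down to, a minimal PyTorch language-modelling dataset (the names here are illustrative, not the framework's actual classes) could look like:

```python
import torch
from torch.utils.data import Dataset

class TextDataset(Dataset):
    """Serve fixed-length sequences of token ids with next-token targets."""

    def __init__(self, token_ids: list[int], seq_len: int):
        self.data = torch.tensor(token_ids, dtype=torch.long)
        self.seq_len = seq_len

    def __len__(self):
        # Each item needs seq_len inputs plus one extra token for the targets.
        return (len(self.data) - 1) // self.seq_len

    def __getitem__(self, idx):
        start = idx * self.seq_len
        x = self.data[start : start + self.seq_len]
        y = self.data[start + 1 : start + self.seq_len + 1]  # shifted by one
        return x, y
```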
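The `Trainer` options mentioned under "Custom training" (mixed precision, gradient accumulation, checkpointing) correspond to standard PyTorch mechanisms. The following is a sketch of what such a training epoch does under the hood, not the actual code in `optimus/trainer.py`:

```python
import torch

def train_epoch(model, loader, optimizer, accum_steps=4, device='cuda'):
    """One epoch with mixed precision and gradient accumulation."""
    scaler = torch.cuda.amp.GradScaler()
    criterion = torch.nn.CrossEntropyLoss()
    model.train()
    optimizer.zero_grad()
    for step, (x, y) in enumerate(loader):
        x, y = x.to(device), y.to(device)
        # Run the forward pass in (automatically chosen) lower precision.
        with torch.cuda.amp.autocast():
            logits = model(x)
            loss = criterion(logits.view(-1, logits.size(-1)), y.view(-1))
        # Divide the loss so accum_steps batches act as one large batch.
        scaler.scale(loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
    # Checkpointing boils down to saving the relevant state dicts.
    torch.save({'model': model.state_dict(),
                'optimizer': optimizer.state_dict()}, 'checkpoint.pt')
```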
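For inference, `inference.py` is the reference example; as a bare-bones sketch, a greedy decoding loop, assuming the model maps a batch of token ids to logits of shape `(batch, seq, vocab)` and `sp` is a trained SentencePiece processor, might read:

```python
import torch

@torch.no_grad()
def generate(model, sp, prompt, max_new_tokens=50, device='cuda'):
    """Greedily extend `prompt` by max_new_tokens tokens."""
    ids = torch.tensor([sp.encode(prompt)], dtype=torch.long, device=device)
    model.eval()
    for _ in range(max_new_tokens):
        logits = model(ids)                  # (1, seq, vocab)
        next_id = logits[0, -1].argmax()     # most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return sp.decode(ids[0].tolist())
```

Temperature sampling, top-k filtering and a key-value cache are the usual refinements on top of this loop.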