From 174ba1aa1bfc617fb97a9613fcf41a643a0c0ff2 Mon Sep 17 00:00:00 2001
From: Alexandru Gherghescu <gherghescu_alex1@yahoo.ro>
Date: Wed, 24 Jan 2024 20:20:18 +0200
Subject: [PATCH] Update README file

---
 README.md | 53 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 66a8a72..d426eed 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,39 @@
 # Optimus Prime
 
-Implementation of the Transformer architecture and different variations on it,
-as well as the training loop, written from scratch in PyTorch.
+(Yet another) PyTorch framework for training large language models.
 
-Most of the code is easy to read and should be self-explanatory, as it is
-documented.
+## How to use
 
-There is an example of how to use the 'framework' in
-`optimus/example_training.py`.
+### Training
 
-## Training a custom model with a custom dataset
+An example of how to use the framework can be found in `training.py`. Feel free
+to adapt it as needed. Also see [Custom training](#custom-training).
 
-The requirements to train a model of a custom size from scratch are:
-1. Get a dataset from wherever, and process it to work with the other components
-   of the 'framework'. See `optimus/datasets/wikitext103.py` for an example.
-2. Create a tokenizer for the model, or use the existing LLama 2 tokenizer
-   (`optimus/llama32K.model`). To create a tokenizer, see
-   `optimus/tokenizer.py`. Creating a tokenizer requires a dataset ready.
-3. Create the model, and specify the training parameters (see
-   `optimus/example_training.py` for an example of what options you can
-   provide). Additionally, `optimus/trainer.py` can be directly modified, for
-   options currently not exposed through an interface.
+### Inference
+
+After training a model (or getting hold of one from other sources), see
+`inference.py` for an example of how to run inference. Feel free to adapt it
+as needed.
+
+## Basic building blocks
+
+Like its parent, PyTorch, the framework is split into a number of modules. The
+most important modules are the `OptimusDataLoader`, the `Dataset`s, the
+`Trainer`, the tokenizers and the models. These can be combined and adapted in
+any way, shape or form to train a model from scratch.
+
+## Custom training
+
+The usual workflow is to create and train a tokenizer (see `optimus/tokenizers`
+for an example), prepare a dataset (see `optimus/datasets` for an example),
+create a model architecture (see `optimus/models` for an example) and use the
+data loader and the trainer modules to train the model. The `Trainer` module
+has a number of useful options which can be used during training (mixed
+precision training, checkpointing, gradient accumulation, plotting the
+training loss etc.; see `optimus/trainer.py` for what the `Trainer` is capable
+of).
+
+Of course, the provided defaults can be used for any number of the steps above.
 
 > [!TIP]
 > You can choose which GPUs to train on, using the environment variable
@@ -29,7 +42,11 @@ The requirements to train a model of a custom size from scratch are:
 
 ## Required packages
 
-There are a number of packages required to run the thing. Get your closest
+There are a number of packages required to run the framework. Get your closest
 Python retailer and ask them to run the following command:
 
 `pip install torch fire sentencepiece fastprogress matplotlib`
+
+## License
+
+See [LICENSE](LICENSE).
-- 
GitLab
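The tokenizer step described under "Custom training" is not spelled out in the patch itself. As a rough sketch of what training a tokenizer involves, here is the plain `sentencepiece` workflow (one of the required packages); the corpus path, model prefix and vocabulary size are placeholder values, and the framework's own wrappers in `optimus/tokenizers` may look different:

```python
import sentencepiece as spm

# Train a subword tokenizer from a raw text corpus.
# 'corpus.txt' (one sentence per line) and the vocab size are placeholders.
spm.SentencePieceTrainer.train(
    input='corpus.txt',
    model_prefix='tokenizer',   # writes tokenizer.model / tokenizer.vocab
    vocab_size=32000,
)

# Load the trained model and tokenize some text.
sp = spm.SentencePieceProcessor(model_file='tokenizer.model')
print(sp.encode('Hello world!', out_type=str))
```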
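The `Dataset`s under `optimus/datasets` are likewise not shown in the patch. As an illustration of what such a component boils down to, a minimal PyTorch language-modelling dataset (the names here are illustrative, not the framework's actual classes) could look like:

```python
import torch
from torch.utils.data import Dataset

class TextDataset(Dataset):
    """Serve fixed-length sequences of token ids with next-token targets."""

    def __init__(self, token_ids: list[int], seq_len: int):
        self.data = torch.tensor(token_ids, dtype=torch.long)
        self.seq_len = seq_len

    def __len__(self):
        # Each item needs seq_len inputs plus one extra token for the targets.
        return (len(self.data) - 1) // self.seq_len

    def __getitem__(self, idx):
        start = idx * self.seq_len
        x = self.data[start : start + self.seq_len]
        y = self.data[start + 1 : start + self.seq_len + 1]  # shifted by one
        return x, y
```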
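The `Trainer` options mentioned under "Custom training" (mixed precision, gradient accumulation, checkpointing) correspond to standard PyTorch mechanisms. The following is a sketch of what such a training epoch does under the hood, not the actual code in `optimus/trainer.py`:

```python
import torch

def train_epoch(model, loader, optimizer, accum_steps=4, device='cuda'):
    """One epoch with mixed precision and gradient accumulation."""
    scaler = torch.cuda.amp.GradScaler()
    criterion = torch.nn.CrossEntropyLoss()
    model.train()
    optimizer.zero_grad()
    for step, (x, y) in enumerate(loader):
        x, y = x.to(device), y.to(device)
        # Run the forward pass in (automatically chosen) lower precision.
        with torch.cuda.amp.autocast():
            logits = model(x)
            loss = criterion(logits.view(-1, logits.size(-1)), y.view(-1))
        # Divide the loss so accum_steps batches act as one large batch.
        scaler.scale(loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
    # Checkpointing boils down to saving the relevant state dicts.
    torch.save({'model': model.state_dict(),
                'optimizer': optimizer.state_dict()}, 'checkpoint.pt')
```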
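For inference, `inference.py` is the reference example; as a bare-bones sketch, a greedy decoding loop, assuming the model maps a batch of token ids to logits of shape `(batch, seq, vocab)` and `sp` is a trained SentencePiece processor, might read:

```python
import torch

@torch.no_grad()
def generate(model, sp, prompt, max_new_tokens=50, device='cuda'):
    """Greedily extend `prompt` by max_new_tokens tokens."""
    ids = torch.tensor([sp.encode(prompt)], dtype=torch.long, device=device)
    model.eval()
    for _ in range(max_new_tokens):
        logits = model(ids)                  # (1, seq, vocab)
        next_id = logits[0, -1].argmax()     # most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return sp.decode(ids[0].tolist())
```

Temperature sampling, top-k filtering and a key-value cache are the usual refinements on top of this loop.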