diff --git a/doc/llm.md b/doc/llm.md
index 07c42fc84a15b05d8e616b6a0c9dc42095dc322c..f694a462a442e2f7c491a398a1df505a9713e4bf 100644
--- a/doc/llm.md
+++ b/doc/llm.md
@@ -12,12 +12,9 @@
 oct. 2023)](https://arxiv.org/abs/2310.11453)
 
 ## Quantization
-- ["LoRA: Low-Rank Adaptation of Large Language Models" (Hu et al. - jun.
-  2021)](https://arxiv.org/abs/2106.09685) - introduces fine-tuning using the
-  LoRA algorithm, which helps train models for downstream tasks faster
 
 ## Prompt engineering
-- ["Language models are few-shot learners"ă(Brown et al. (OpenAI) - may.
+- ["Language models are few-shot learners" (Brown et al. (OpenAI) - may.
   2020)](https://arxiv.org/abs/2005.14165) - introduces the GPT-3 (175B
   parameters) model and the technique of prompt engineering by using few-shot
   learning
@@ -43,6 +40,12 @@
 2023)](https://arxiv.org/abs/2307.09288) - LLaMA 2
 
 ## Fine-tuning
+- ["LoRA: Low-Rank Adaptation of Large Language Models" (Hu et al. - jun.
+  2021)](https://arxiv.org/abs/2106.09685) - introduces fine-tuning using the
+  LoRA algorithm, which helps train models for downstream tasks faster
+- ["QLoRA: Efficient Finetuning of Quantized LLMs" (Dettmers et al. - may.
+  2023)](https://arxiv.org/abs/2305.14314) - builds on top of LoRA and further
+  quantizes models to 4-bit to reduce memory usage
 
 ## Benchmarks
 