diff --git a/doc/llm.md b/doc/llm.md
index 07c42fc84a15b05d8e616b6a0c9dc42095dc322c..f694a462a442e2f7c491a398a1df505a9713e4bf 100644
--- a/doc/llm.md
+++ b/doc/llm.md
@@ -12,12 +12,9 @@
   oct. 2023)](https://arxiv.org/abs/2310.11453)
 
 ## Quantization
-- ["LoRA: Low-Rank Adaptation of Large Language Models" (Hu et al. - jun.
-  2021)](https://arxiv.org/abs/2106.09685) - introduces fine-tuning using the
-  LoRA algorithm, which helps train models for downstream tasks faster
 
 ## Prompt engineering
-- ["Language models are few-shot learners" (Brown et al. (OpenAI) - may.
+- ["Language models are few-shot learners" (Brown et al. (OpenAI) - may.
   2020)](https://arxiv.org/abs/2005.14165) - introduces the GPT-3 (175B
   parameters) model and the technique of prompt engineering by using few-shot
   learning
@@ -43,6 +40,12 @@
   2023)](https://arxiv.org/abs/2307.09288) - LLaMA 2
 
 ## Fine-tuning
+- ["LoRA: Low-Rank Adaptation of Large Language Models" (Hu et al. - jun.
+  2021)](https://arxiv.org/abs/2106.09685) - introduces fine-tuning using the
+  LoRA algorithm, which helps train models for downstream tasks faster
+- ["QLoRA: Efficient Finetuning of Quantized LLMs" (Dettmers et al. - may
+  2023)](https://arxiv.org/abs/2305.14314) - builds on top of LoRA, and further
+  quantizes models to 4-bit, to reduce memory usage
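+
+A minimal, illustrative sketch of the LoRA idea (not taken from either paper;
+the class name and hyperparameter defaults are placeholders): a frozen
+`nn.Linear` is wrapped with trainable low-rank matrices `A` and `B`, so the
+effective weight becomes `W + (alpha / r) * B A`.
+
+```python
+import torch
+import torch.nn as nn
+
+class LoRALinear(nn.Module):
+    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
+
+    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
+        super().__init__()
+        self.base = base
+        for p in self.base.parameters():
+            p.requires_grad = False  # pretrained weights stay frozen
+        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
+        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
+        self.scaling = alpha / r
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling
+```
+
+Only `lora_a` and `lora_b` receive gradients, so the trainable parameter count
+per layer drops from `in_features * out_features` to
+`r * (in_features + out_features)`.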
 
 ## Benchmarks