LLM Reading List
General
Attention is all you need - Łukasz Kaiser
Benchmark for evaluating code generation performance of LLMs
The Illustrated Transformer - Jay Alammar
All you need to know about Attention and Transformers - part 1 - Arjun Sarkar
Positional encoding - Hunter Phillips
A tutorial on LLM - Haifeng Li - An overview of dataset collection, fine-tuning, transfer learning, the types of applications where LLMs can be used, and more.
Why multi-head self attention works: math, intuitions and 10+1 hidden insights - AI Summer - A really good article; a minimal sketch of multi-head attention and positional encoding follows this list.
Retentive network: a successor to Transformer for large language models (Aug. 2023) - Microsoft Research - Paper introducing the retention mechanism as a replacement for attention in Transformers; a decent overview of the paper is available as a Medium post here. A simplified sketch of the retention recurrence also follows this list.
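
For orientation, here is a minimal NumPy sketch of the two mechanisms several of the entries above cover: sinusoidal positional encoding and multi-head scaled dot-product attention, following the formulation in "Attention is all you need". The dimensions, random projection matrices, and input below are illustrative assumptions, not any reference implementation.

    # Minimal sketch: sinusoidal positional encoding + multi-head
    # scaled dot-product self-attention. Weights are random stand-ins
    # for learned parameters.
    import numpy as np

    def sinusoidal_positional_encoding(seq_len, d_model):
        # PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
        # PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
        pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
        i = np.arange(0, d_model, 2)[None, :]              # (1, d_model/2)
        angles = pos / np.power(10000.0, i / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)
        pe[:, 1::2] = np.cos(angles)
        return pe

    def multi_head_attention(x, num_heads):
        # Self-attention: queries, keys and values all come from x.
        seq_len, d_model = x.shape
        d_head = d_model // num_heads
        rng = np.random.default_rng(0)
        w_q, w_k, w_v, w_o = (
            rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
            for _ in range(4))
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        # Split the model dimension into heads: (num_heads, seq_len, d_head).
        split = lambda t: t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
        q, k, v = split(q), split(k), split(v)
        # Scaled dot-product attention, computed per head.
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
        heads = weights @ v                                    # (heads, seq, d_head)
        # Concatenate heads and project back to d_model.
        concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
        return concat @ w_o

    x = np.random.default_rng(1).standard_normal((8, 64))     # 8 tokens, d_model=64
    x = x + sinusoidal_positional_encoding(8, 64)             # add position information
    out = multi_head_attention(x, num_heads=4)
    print(out.shape)                                           # (8, 64)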
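
And a similarly simplified sketch of the retention mechanism from the Retentive network paper, showing that its parallel form (a decay-masked QK^T) and its recurrent form (an O(d x d) running state) compute the same thing. The paper's rotary rotation of queries/keys and per-head decay scales are omitted here, and all inputs are random stand-ins.

    # Minimal sketch: parallel vs. recurrent retention (rotation and
    # multi-scale heads omitted for brevity).
    import numpy as np

    def retention_parallel(q, k, v, gamma):
        # o_n = sum_{m<=n} gamma^(n-m) * (q_n . k_m) * v_m, via a decay mask D.
        seq_len = q.shape[0]
        n = np.arange(seq_len)
        decay = np.where(n[:, None] >= n[None, :],
                         gamma ** (n[:, None] - n[None, :]), 0.0)
        return (decay * (q @ k.T)) @ v

    def retention_recurrent(q, k, v, gamma):
        # Same computation as a recurrence over a (d_k x d_v) state,
        # processing one token at a time.
        state = np.zeros((k.shape[1], v.shape[1]))
        outputs = []
        for q_n, k_n, v_n in zip(q, k, v):
            state = gamma * state + np.outer(k_n, v_n)
            outputs.append(q_n @ state)
        return np.stack(outputs)

    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((6, 16)) for _ in range(3))  # 6 tokens, d=16
    print(np.allclose(retention_parallel(q, k, v, 0.9),
                      retention_recurrent(q, k, v, 0.9)))       # True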
Systems
Llama.cpp 30B runs with only 6GB of RAM now (CPU)
Verification
LEVER: Learning to Verify Language-to-Code Generation with Execution
How Effective Are Neural Networks for Fixing Security Vulnerabilities
Courses/Tutorials
Neural Networks: Zero to Hero - A course by Andrej Karpathy on building neural networks, from scratch, in code.