[Retentive Network: A Successor to Transformer for Large Language Models (Aug. 2023) - Microsoft Research](https://arxiv.org/abs/2307.08621) - Paper introducing the retention mechanism as a replacement for attention in Transformers; a decent overview of the paper is available as a Medium post [here](https://medium.com/ai-fusion-labs/retentive-networks-retnet-explained-the-much-awaited-transformers-killer-is-here-6c17e3e8add8). A minimal sketch of the retention recurrence follows below.
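For reference, here is a minimal NumPy sketch of single-head retention as described in the paper, showing why the parallel (training) and recurrent (inference) forms compute the same output. This is an illustrative assumption-laden toy, not the authors' code: the fixed decay `gamma`, the function names, and the shapes are all hypothetical, and the paper's multi-scale heads and xPos-style rotations are omitted for brevity.

```python
import numpy as np

def retention_parallel(Q, K, V, gamma):
    """Parallel form: (Q K^T ⊙ D) V, where D[n, m] = gamma^(n-m) for n >= m, else 0.
    Trains like attention: one big masked matrix product over the sequence."""
    T = Q.shape[0]
    n = np.arange(T)
    D = np.where(n[:, None] >= n[None, :],
                 gamma ** (n[:, None] - n[None, :]), 0.0)
    return (Q @ K.T * D) @ V

def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + K_n^T V_n; output_n = Q_n S_n.
    Constant-size state per step, which is what makes autoregressive decoding cheap."""
    d_k, d_v = K.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))
    out = np.empty((Q.shape[0], d_v))
    for t in range(Q.shape[0]):
        S = gamma * S + np.outer(K[t], V[t])  # accumulate decayed key-value outer products
        out[t] = Q[t] @ S
    return out

# Both forms expand to output_n = sum_{m<=n} gamma^(n-m) (q_n . k_m) v_m,
# so they agree up to floating-point error.
rng = np.random.default_rng(0)
T, d = 5, 4
Q, K, V = rng.normal(size=(3, T, d))
assert np.allclose(retention_parallel(Q, K, V, 0.9),
                   retention_recurrent(Q, K, V, 0.9))
```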
## Systems
[Llama.cpp 30B runs with only 6GB of RAM now (CPU)](https://github.com/ggerganov/llama.cpp/discussions/638#discussioncomment-5492916)
...
...
[Neural Networks: Zero to Hero](https://karpathy.ai/zero-to-hero.html) - A course by Andrej Karpathy on building neural networks, from scratch, in code.
[MIT Deep Learning Book](https://github.com/janishar/mit-deep-learning-book-pdf)