Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments, by Sebastian Raschka, October 12, 2023 “LoRA is one of the …”
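The core idea the article benchmarks, adding a trainable low-rank update to a frozen weight matrix, fits in a few lines. A minimal sketch (illustrative only, not Raschka's experimental code; the layer size and the r/alpha values here are arbitrary):

```python
# Minimal LoRA-style layer: frozen base weight plus a trainable low-rank update
# scaled by alpha / r. Illustrative sketch, not the article's code.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)                      # freeze the pretrained weight and bias
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: update starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        # base output plus the scaled low-rank update (B @ A) applied to x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(768, 768)
print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```

Only lora_A and lora_B receive gradients, which is why LoRA finetuning is so much cheaper than updating the full weight matrix.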
What I Read: Distributed Training, Finetuning
https://sumanthrh.com/post/distributed-and-efficient-finetuning/ Everything about Distributed Training and Efficient Finetuning, by Sumanth R Hegde, last updated Oct 13, 2023 “practical guidelines and gotchas with multi-GPU and multi-node training”
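For context on what those guidelines apply to, here is a bare-bones multi-GPU data-parallel training loop with PyTorch DDP (a generic skeleton, not code from the post; the dummy model and random batches stand in for a real finetuning job):

```python
# Minimal DDP skeleton; launch with: torchrun --nproc_per_node=NUM_GPUS script.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # torchrun sets RANK / WORLD_SIZE / MASTER_ADDR
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across ranks
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                          # stand-in for a real dataloader loop
        x = torch.randn(32, 512, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```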
What I Read: Retrieval Augmented Generation at scale
https://medium.com/@neum_ai/retrieval-augmented-generation-at-scale-building-a-distributed-system-for-synchronizing-and-eaa29162521 Retrieval Augmented Generation at scale — Building a distributed system for synchronizing and ingesting billions of text embeddings, by Neum AI, Sep 28 “…getting a Retrieval Augmented Generation (RAG) application started is…”
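The retrieval step that such a system scales up to billions of embeddings can be sketched in a few lines. This toy version (nothing Neum AI-specific; the hash-based embed() is a stand-in for a real embedding model, and the in-memory array for a vector database) just ranks documents by cosine similarity:

```python
# Toy illustration of the retrieval step in a RAG pipeline.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Placeholder: a pseudo-random unit vector seeded from the text's hash.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

documents = [
    "LoRA adds low-rank adapters to frozen weights.",
    "Vector databases store embeddings for similarity search.",
    "Distributed training shards work across many GPUs.",
]
index = np.stack([embed(d) for d in documents])   # (n_docs, dim), built once at ingest time

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)                 # cosine similarity (vectors are unit-norm)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

print(retrieve("How do I search embeddings?"))
```

The article's focus is everything around this core loop: keeping the index synchronized with changing source data and ingesting it at scale.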
What I Read: Tiny Language Models
https://www.quantamagazine.org/tiny-language-models-thrive-with-gpt-4-as-a-teacher-20231005/ Tiny Language Models Come of Age, by Ben Brubaker, October 5, 2023 “To better understand how neural networks learn to simulate writing, researchers trained simpler versions on synthetic children’s stories.”
What I Read: To Understand Transformers, Focus on Attention
https://drscotthawley.github.io/blog/posts/Transformers1-Attention.html To Understand Transformers, Focus on Attention, by Scott H. Hawley, August 21, 2023
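The attention operation the post centers on is compact enough to write out directly. A minimal scaled dot-product attention sketch (illustrative, not code from the post; the shapes are arbitrary):

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # how much each query attends to each key
    return weights @ V                         # weighted sum of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```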