natural language processing – Page 2

What I Read: Mamba, State Space

By Andrew Fairless on February 11, 2025 / October 25, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mamba-and-state
A Visual Guide to Mamba and State Space Models
Maarten Grootendorst, Feb 19, 2024
“To further improve LLMs, new architectures are developed that might even outperform the Transformer architecture. One of…”

What I Watch: GenAI, Classify Text

By Andrew Fairless on January 21, 2025 / September 29, 2024

https://www.youtube.com/watch?v=Y4HQQPfyzwo
Is GenAI All You Need to Classify Text? Some Learnings from the Trenches
PyData, Sep 26, 2024
“…practical implications of using GenAI for text classification tasks. The speakers highlighted challenges such…”

What I Read: embedding models

By Andrew Fairless on January 6, 2025 / September 29, 2024

https://unstructured.io/blog/understanding-embedding-models-make-an-informed-choice-for-your-rag
Understanding embedding models: make an informed choice for your RAG
Maria Khalusova, Aug 13, 2024
“How do you choose a suitable embedding model for your RAG application?”

What I Read: Toy Models of Superposition

By Andrew Fairless on December 19, 2024 / September 29, 2024

https://transformer-circuits.pub/2022/toy_model/index.html
Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, …

What I Watch: How LLMs store facts

By Andrew Fairless on December 16, 2024 / September 3, 2024

How might LLMs store facts | Chapter 7, Deep Learning
3Blue1Brown, Aug 31, 2024
“Unpacking the multilayer perceptrons in a transformer, and how they may store facts”

What I Read: Fine-tuning

By Andrew Fairless on December 11, 2024 / September 3, 2024

https://openpipe.ai/blog/fine-tuning-best-practices-series-introduction-and-chapter-1-training-data
Fine-tuning Best Practices Series Introduction and Chapter 1: Training Data
Reid Mayo, Aug 1, 2024
“We’ll explore how to choose the best data, common methods for collecting it, and common methods…”

What I Read: Evaluating LLM-Evaluators

By Andrew Fairless on December 3, 2024 / August 26, 2024

https://eugeneyan.com/writing/llm-evaluators
Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)
Eugene Yan
“After reading this, you’ll gain an intuition on how to apply, evaluate, and operate LLM-evaluators. We’ll learn when to apply (i)…”

What I Read: Classifying pdfs

By Andrew Fairless on November 21, 2024 / August 26, 2024

https://snats.xyz/pages/articles/classifying_a_bunch_of_pdfs.html
Classifying all of the pdfs on the internet
Santiago Pedroza, 2024-08-18
“How would you classify all the pdfs in the internet? Well, that is what I tried doing this time.”

What I Read: Tool Retrieval, RAG

By Andrew Fairless on November 20, 2024 / August 26, 2024

https://jxnl.co/writing/2024/08/21/trade-off-tool-selection
Optimizing Tool Retrieval in RAG Systems: A Balanced Approach
Jason Liu, 2024/08/21
“When it comes to Retrieval-Augmented Generation (RAG) systems, one of the key challenges is deciding how to select and…”

What I Read: LLM Pre-training Post-training

By Andrew Fairless on November 18, 2024 / August 26, 2024

https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training
New LLM Pre-training and Post-training Paradigms: A Look at How Modern LLMs Are Trained
Sebastian Raschka, PhD, Aug 17, 2024
“Initially, the LLM training process focused solely on pre-training, but it has…”

Tag: natural language processing