natural language processing – Page 5

What I Read: Platonic Hypothesis

By Andrew Fairless on July 29, 2024May 24, 2024

https://phillipi.github.io/prh The Platonic Representation HypothesisMinyoung Huh, Brian Cheung, Tongzhou Wang, Phillip IsolaMITPosition Paper in ICML 2024 “Neural networks, trained with different objectives on different data and modalities, are converging toContinue readingWhat I Read: Platonic Hypothesis

What I Read: Game Theory, AI

By Andrew Fairless on July 25, 2024May 24, 2024

https://www.quantamagazine.org/game-theory-can-make-ai-more-correct-and-efficient-20240509 Game Theory Can Make AI More Correct and EfficientSteve NadisMay 9, 2024 “Researchers are drawing on ideas from game theory to improve large language models and make them moreContinue readingWhat I Read: Game Theory, AI

What I Read: Matryoshka Embedding

By Andrew Fairless on July 17, 2024May 6, 2024

https://huggingface.co/blog/matryoshka Introduction to Matryoshka Embedding ModelsTom AarsenJoshuaOmar SansevieroFebruary 23, 2024 “…Kusupati et al. (2022) were inspired to create embedding models whose embeddings could reasonably be shrunk without suffering too muchContinue readingWhat I Read: Matryoshka Embedding

What I Read: LLMs, Open Source

By Andrew Fairless on July 15, 2024May 6, 2024

https://www.aidancooper.co.uk/how-to-beat-proprietary-llms How to Beat Proprietary LLMs With Smaller Open Source ModelsAidan CooperApr 26, 2024 “…we explore the unique advantages of open source LLMs, and how you can leverage them toContinue readingWhat I Read: LLMs, Open Source

What I Read: Structured Generation, Constrained Decoding

By Andrew Fairless on June 20, 2024April 16, 2024

https://www.aidancooper.co.uk/constrained-decoding A Guide to Structured Generation Using Constrained DecodingApr 8, 2024 “…there are techniques that ensure language models only return outputs that conform to your requirements. This article serves asContinue readingWhat I Read: Structured Generation, Constrained Decoding

What I Read: Attention, transformers

By Andrew Fairless on June 18, 2024April 16, 2024

Attention in transformers, visually explained | Chapter 6, Deep Learning3Blue1Brown “Demystifying attention, the key mechanism inside transformers and LLMs.”

What I Read: Data Selection, LLMs

By Andrew Fairless on June 12, 2024April 15, 2024

https://www.cs.princeton.edu/~smalladi/blog/2024/04/04/dataselection Using LESS Data to Tune Models: Data Selection in the Era of LLMsMengzhou Xia and Sadhika MalladiApril 04 2024 “We describe how data selection for modern-day LLMs differs fromContinue readingWhat I Read: Data Selection, LLMs

What I Read: Mamba Explained

By Andrew Fairless on June 10, 2024April 15, 2024

https://thegradient.pub/mamba-explained Mamba ExplainedKola Ayonrinde27.Mar.2024 “Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens).”

What I Read: Chain-of-Thought Reasoning

By Andrew Fairless on June 3, 2024April 15, 2024

https://www.quantamagazine.org/how-chain-of-thought-reasoning-helps-neural-networks-compute-20240321 How Chain-of-Thought Reasoning Helps Neural Networks ComputeBen Brubaker3/21/24 11:15 AM “Large language models do better at solving problems when they show their work. Researchers are beginning to understand why.”

What I Read: text embeddings

By Andrew Fairless on May 20, 2024March 11, 2024

https://thegradient.pub/text-embedding-inversion/ Do text embeddings perfectly encode text?Jack Morris05.Mar.2024 “What if someone hacks into the database and gains access to all your text embedding vectors – would this be bad?….Put simply:Continue readingWhat I Read: text embeddings

Tag: natural language processing