training – Andrew Fairless, Ph.D.

What I Read: RL, PPO, GRPO

By Andrew Fairless on May 26, 2025 / February 22, 2025

https://yugeten.github.io/posts/2025/01/ppogrpo
A vision researcher’s guide to some RL stuff: PPO & GRPO
Yuge (Jimmy) Shi
January 31, 2025
“This is a deep dive into Proximal Policy Optimization (PPO), which is one of…”

What I Read: group relative policy optimization

By Andrew Fairless on May 22, 2025 / February 22, 2025

https://superb-makemake-3a4.notion.site/group-relative-policy-optimization-GRPO-18c41736f0fd806eb39dc35031758885
group relative policy optimization (GRPO)
Apoorv Nandan
Jan 31, 2025
“GRPO became popular primarily due to the success of deepseek r1, which used this algorithm to train reasoning capabilities into their…”

What I Read: memorization, novelty

By Andrew Fairless on May 1, 2025 / February 1, 2025

https://blog.kjamistan.com/how-memorization-happens-novelty.html
How memorization happens: Novelty
09 December 2024
“…repeated text and images incentivize training data memorization, but that’s not the only training data that machine learning models memorize. Let’s take a…”

What I Read: Age of Data

By Andrew Fairless on April 14, 2025 / January 5, 2025

https://amatria.in/blog/ageofdata
The end of the “Age of Data”? Enter the age of superhuman data and AI
Xavier Amatriain
December 24, 2024
“This post will argue that the age of data is far…”

What I Read: benchmark

By Andrew Fairless on February 26, 2025 / November 9, 2024

Run a benchmark they said. It will be fun they said. | PyData Amsterdam 2024
Vincent D. Warmerdam
Oct 22, 2024
“This is the story of a fun idea that turned into…”

What I Read: Toy Models of Superposition

By Andrew Fairless on December 19, 2024 / September 29, 2024

https://transformer-circuits.pub/2022/toy_model/index.html
Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, …

What I Read: Visual Guide, Quantization

By Andrew Fairless on October 30, 2024 / August 3, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
A Visual Guide to Quantization, Demystifying the Compression of Large Language Models
Maarten Grootendorst
Jul 22, 2024
“…I will introduce the field of quantization in the context of language modeling and…”

What I Read: bare metal to 70B

By Andrew Fairless on September 25, 2024 / July 8, 2024

https://imbue.com/research/70b-infrastructure
From bare metal to a 70B model: infrastructure set-up and scripts
The Imbue Team
June 25, 2024
“…we trained a 70B parameter model from scratch on our own infrastructure that outperformed…”

What I Read: How Machines ‘Grok’ Data

By Andrew Fairless on June 25, 2024 / April 23, 2024

https://www.quantamagazine.org/how-do-machines-grok-data-20240412
How Do Machines ‘Grok’ Data?
Anil Ananthaswamy
4/12/24
“By apparently overtraining them, researchers have seen neural networks discover novel solutions to problems.”

What I Read: Data Selection, LLMs

By Andrew Fairless on June 12, 2024 / April 15, 2024

https://www.cs.princeton.edu/~smalladi/blog/2024/04/04/dataselection
Using LESS Data to Tune Models: Data Selection in the Era of LLMs
Mengzhou Xia and Sadhika Malladi
April 04 2024
“We describe how data selection for modern-day LLMs differs from…”

Tag: training