https://transformer-circuits.pub/2022/toy_model/index.html
Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan,
What I Read: How Machines ‘Grok’ Data
https://www.quantamagazine.org/how-do-machines-grok-data-20240412
How Do Machines ‘Grok’ Data?
Anil Ananthaswamy
4/12/24
“By apparently overtraining them, researchers have seen neural networks discover novel solutions to problems.”
What I Read: High-Quality Human Data
https://lilianweng.github.io/posts/2024-02-05-human-data-quality/
Thinking about High-Quality Human Data
Lilian Weng
February 5, 2024
“High-quality data is the fuel for modern data deep learning model training.”