https://transformer-circuits.pub/2022/toy_model/index.html Toy Models of Superposition. Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, …
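The paper's core experiment is small enough to reproduce directly. Below is a minimal PyTorch sketch, assuming the ReLU-output toy model the paper describes: n sparse features reconstructed through an m-dimensional bottleneck as ReLU(WᵀWx + b). Dimensions, sparsity, and training length here are illustrative, and feature importances are left uniform for simplicity.

```python
import torch

n_features, m_hidden = 20, 5    # more features than hidden dimensions
sparsity = 0.95                 # probability that any given feature is zero

# Reconstruction model: x_hat = ReLU(W^T W x + b)
W = torch.nn.Parameter(0.1 * torch.randn(m_hidden, n_features))
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(10_000):
    # Synthetic sparse features in [0, 1]
    x = torch.rand(1024, n_features)
    x = x * (torch.rand(1024, n_features) > sparsity)
    x_hat = torch.relu(x @ W.T @ W + b)   # project down, then back up
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Off-diagonal entries of W^T W reveal features that share hidden
# directions, i.e. features stored in superposition.
print(torch.round(W.T @ W, decimals=2))
```

With sparsity this high, the trained network typically packs more than m features into the m hidden dimensions, which is the phenomenon the paper studies.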
What I Read: Sparse Networks
https://www.quantamagazine.org/sparse-neural-networks-point-physicists-to-useful-data-20230608/ Sparse Networks Come to the Aid of Big Physics. Steve Nadis, June 8, 2023. “A novel type of neural network is helping physicists with the daunting challenge of data analysis.”
What I Read: Machines Learn, Teach Basics
https://www.quantamagazine.org/machines-learn-better-if-we-teach-them-the-basics-20230201/ Machines Learn Better if We Teach Them the Basics. Max G. Levy, February 1, 2023. “A wave of research improves reinforcement learning algorithms by pre-training them as if they were human.”
What I Read: Deep Learning Recommendation Models
https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html Deep Learning Recommendation Models (DLRM): A Deep Dive. By Nishant Kumar, Data Science Professional. “This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation …”
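As a rough illustration of the architecture the article walks through, here is a compact PyTorch sketch assuming the usual DLRM layout: embedding tables for sparse categorical features, a bottom MLP for dense features, pairwise dot-product feature interactions, and a top MLP producing a click probability. Class name, vocabulary sizes, and layer widths below are placeholders.

```python
import torch
import torch.nn as nn

class TinyDLRM(nn.Module):
    def __init__(self, vocab_sizes, n_dense, d=16):
        super().__init__()
        # One embedding table per sparse categorical feature
        self.embs = nn.ModuleList([nn.Embedding(v, d) for v in vocab_sizes])
        # Bottom MLP maps dense features into the same d-dim space
        self.bottom = nn.Sequential(nn.Linear(n_dense, d), nn.ReLU())
        n_vecs = len(vocab_sizes) + 1              # embeddings + dense vector
        n_pairs = n_vecs * (n_vecs - 1) // 2       # pairwise interactions
        self.top = nn.Sequential(nn.Linear(d + n_pairs, 32), nn.ReLU(),
                                 nn.Linear(32, 1))

    def forward(self, dense, sparse):
        z = self.bottom(dense)                      # (B, d)
        vecs = [z] + [e(sparse[:, i]) for i, e in enumerate(self.embs)]
        T = torch.stack(vecs, dim=1)                # (B, n_vecs, d)
        inter = T @ T.transpose(1, 2)               # all pairwise dot products
        i, j = torch.triu_indices(T.shape[1], T.shape[1], offset=1)
        feats = torch.cat([z, inter[:, i, j]], dim=1)
        return torch.sigmoid(self.top(feats)).squeeze(-1)  # click probability
```

For example, TinyDLRM([1000, 500], n_dense=13) would handle two categorical features with vocabularies of 1,000 and 500 plus 13 dense features. The embedding tables, not the MLPs, dominate the memory footprint at production scale, which is the source of most of the deployment issues such articles discuss.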
What I Read: Attention with Performers
https://ai.googleblog.com/2020/10/rethinking-attention-with-performers.html Rethinking Attention with Performers. Friday, October 23, 2020. Posted by Krzysztof Choromanski and Lucy Colwell, Research Scientists, Google Research. “To resolve these issues, we introduce the Performer, a Transformer architecture with …”
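The Performer replaces exact softmax attention with a random-feature approximation (FAVOR+) whose cost is linear rather than quadratic in sequence length. Here is a rough numpy sketch of that kernel trick, assuming the positive random features described in the paper; the feature count and normalization details are simplified.

```python
import numpy as np

def favor_attention(Q, K, V, n_feats=256, seed=0):
    """Approximate softmax attention in time/memory linear in length L."""
    d = Q.shape[-1]
    Q, K = Q / d**0.25, K / d**0.25        # folds in the usual 1/sqrt(d)
    W = np.random.default_rng(seed).standard_normal((d, n_feats))

    def phi(X):  # positive random features: E[phi(q) . phi(k)] = exp(q . k)
        return np.exp(X @ W - (X**2).sum(-1, keepdims=True) / 2) / n_feats**0.5

    Qp, Kp = phi(Q), phi(K)                # (L, n_feats) each
    num = Qp @ (Kp.T @ V)                  # (L, d), no L x L matrix formed
    den = Qp @ Kp.sum(axis=0)              # per-query softmax normalizer
    return num / den[:, None]

L, d = 1024, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
out = favor_attention(Q, K, V)             # shape (1024, 64)
```

The key step is the parenthesization Qp @ (Kp.T @ V): computing the key-value summary first avoids ever materializing the L x L attention matrix, which is what makes long sequences tractable.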