regularization – Andrew Fairless, Ph.D.

What I Read: LLMs, School Math

By Andrew Fairless on March 5, 2025 / November 16, 2024

https://towardsdatascience.com/understanding-llms-from-scratch-using-middle-school-math-e602d27ec876?gi=551c5bfd7f21
Understanding LLMs from Scratch Using Middle School Math
Rohit Patel
Oct 19, 2024
“In this article, we talk about how Large Language Models (LLMs) work, from scratch, assuming only that…”

What I Read: sparsity, PyTorch, Hadamard product

By Andrew Fairless on December 2, 2024 / August 26, 2024

https://alexshtf.github.io/2024/07/07/HadamardParameterization.html
Alex Shtoff
Fun with sparsity in PyTorch via Hadamard product parametrization
Jul 7, 2024
“The beauty of sparsity inducing regularization is that we let our optimizer discover the sparsity patterns, instead…”

What I Read: Regularization, polynomial bases

By Andrew Fairless on November 7, 2024 / August 25, 2024

https://alexshtf.github.io/2024/06/03/PolynomialBasesRegProps.html
Alex Shtoff
Regularization properties of polynomial bases
Jun 3, 2024
“We used the Bernstein basis to demonstrate the importance of choosing a “good” polynomial basis, and that other well-known bases may…”

What I Read: KL All You Need

By Andrew Fairless on August 21, 2024 / June 4, 2024

https://blog.alexalemi.com/kl-is-all-you-need.html
KL is All You Need
Alexander A. Alemi
2024-01-08
“…the core of essentially all modern machine learning methods is a single universal objective: Kullback-Leibler (KL) divergence minimization…. Understand KL, understand the…”

What I Read: polynomial monster

By Andrew Fairless on March 26, 2024 / February 5, 2024

https://alexshtf.github.io/2024/01/25/Bernstein-Basis.html
Keeping the polynomial monster under control
Alex Shtoff
Jan 25, 2024
“…we saw that the Bernstein polynomials can be used to fit a high-degree polynomial curve with ease, without its shape…”

What I Read: polynomial features

By Andrew Fairless on March 25, 2024 / February 5, 2024

https://alexshtf.github.io/2024/01/21/Bernstein.html
Are polynomial features the root of all evil?
Alex Shtoff
Jan 21, 2024
“There’s nothing inherently wrong with high degree polynomials, and in contrast to what is typically taught, high degree…”

What I Read: Self-Attention in GPT

By Andrew Fairless on March 4, 2024 / January 25, 2024

https://twiecki.io/blog/2024/01/04/
An Intuitive Guide to Self-Attention in GPT: The Venetian Masquerade
Thomas Wiecki
January 4, 2024
“In AI, especially with something as intricate as self-attention, it’s easy to get lost in the…”

What I Read: Bonsai Networks, RNNs

By Andrew Fairless on October 3, 2023 / September 4, 2023

https://cprimozic.net/blog/growing-sparse-computational-graphs-with-rnns/
Growing Bonsai Networks with RNNs
Casey Primozic
“This writeup introduces what I’m calling Bonsai Networks – extremely sparse computational graphs produced by training and pruning RNNs. They provide an interpretable…”

What I Read: Ways Digital Minds Know

By Andrew Fairless on August 8, 2023 / July 9, 2023

https://moultano.wordpress.com/2023/06/28/the-many-ways-that-digital-minds-can-know/
The Many Ways that Digital Minds Can Know
Ryan Moulton
“…aspects of search engine quality are better analogies to describe the properties of LLMs than “generalization” and “memorization” are, and…”

What I Read: What, Why ChatGPT

By Andrew Fairless on July 12, 2023 / June 15, 2023

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
What Is ChatGPT Doing … and Why Does It Work?
February 14, 2023
Stephen Wolfram
“That ChatGPT can automatically generate something that reads even superficially like human-written text is remarkable, and…”

Tag: regularization