What I Read: Attention in Transformers, Visually Explained
Attention in transformers, visually explained | Chapter 6, Deep Learning, by 3Blue1Brown. “Demystifying attention, the key mechanism inside transformers and LLMs.”
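As a minimal sketch of the scaled dot-product attention the video demystifies (single head, no learned projections or masking; the NumPy framing and the toy shapes are my own illustrative assumptions, not the video's code):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq): how much each query matches each key
    weights = softmax(scores, axis=-1)  # each row is a probability distribution over tokens
    return weights @ V                  # each output is a weighted average of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings (arbitrary sizes).
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)  # (4, 8)
```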
What I Read: High-Dimensional Variance
https://gregorygundersen.com/blog/2023/12/09/covariance-matrices/ High-Dimensional Variance, by Gregory Gundersen, 09 December 2023. “A useful view of a covariance matrix is that it is a natural generalization of variance to higher dimensions.”
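The quoted claim can be written out in one line. These are the standard definitions (my restatement, not pulled from the post itself), showing how the covariance matrix recovers an ordinary variance along any direction:

```latex
% Scalar case:  Var(X) = E[(X - mu)^2].
% For a random vector X in R^d, the covariance matrix generalizes this:
\Sigma = \operatorname{Cov}(X) = \mathbb{E}\left[(X - \mu)(X - \mu)^{\top}\right],
\qquad \mu = \mathbb{E}[X].
% Projecting onto any unit vector u reduces it back to a scalar variance:
\operatorname{Var}(u^{\top} X) = u^{\top} \Sigma\, u .
```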
What I Read: Differentiable Trees
https://ericmjl.github.io/blog/2023/8/7/journal-club-differentiable-search-of-evolutionary-trees/ Journal Club: Differentiable Search of Evolutionary Trees, by Eric J. Ma, 07 August 2023. “…how the authors take a non-differentiable problem and turn it into a differentiable problem through interconversion between mathematical data structures.”
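The quoted move, turning a non-differentiable problem into a differentiable one, is easiest to see on a toy case. A generic sketch of the idea (a temperature-controlled softmax standing in for a hard argmax; this illustrates the general relaxation technique, not the paper's actual tree parameterization):

```python
import numpy as np

def softmax(x, temperature=1.0):
    z = x / temperature
    e = np.exp(z - z.max())  # shift by max for numerical stability
    return e / e.sum()

# Hard selection: pick exactly one option. Non-differentiable in `scores`,
# since argmax has zero gradient almost everywhere.
scores = np.array([1.0, 2.5, 0.3])
options = np.array([10.0, 20.0, 30.0])
hard_value = options[np.argmax(scores)]

# Soft relaxation: a weighted mixture of all options. Differentiable in
# `scores`, and it approaches the hard argmax as the temperature goes to 0.
for t in (1.0, 0.1, 0.01):
    soft_value = softmax(scores, temperature=t) @ options
    print(f"temperature={t}: soft={soft_value:.3f} (hard={hard_value})")
```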
What I Read: To Understand Transformers, Focus on Attention
https://drscotthawley.github.io/blog/posts/Transformers1-Attention.html To Understand Transformers, Focus on Attention, by Scott H. Hawley, 21 August 2023.