https://amaarora.github.io/posts/2024-07-04%20SWA.html
Sliding Window Attention, Longformer – The Long-Document Transformer, by Aman Arora, July 4, 2024
"…we will take a deep dive into Sliding Window Attention (SWA) that was introduced as part of…"
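For context, a minimal NumPy sketch (my own toy code, not from the post) of the idea: each token attends only to neighbours within a fixed window, so the attention pattern is a band rather than a full matrix. This dense version is for clarity only; the real savings come from never materialising the out-of-window scores.

    import numpy as np

    def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
        # True where position i may attend to position j, i.e. |i - j| <= window
        idx = np.arange(seq_len)
        return np.abs(idx[:, None] - idx[None, :]) <= window

    def sliding_window_attention(q, k, v, window):
        # Toy single-head attention with a banded (sliding-window) mask; dense for clarity only
        scores = q @ k.T / np.sqrt(q.shape[-1])
        scores = np.where(sliding_window_mask(len(q), window), scores, -np.inf)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v  # each output token mixes only nearby tokens

    rng = np.random.default_rng(0)
    q = k = v = rng.normal(size=(8, 4))
    print(sliding_window_attention(q, k, v, window=2).shape)  # (8, 4)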
What I Read: Transformers by Hand
https://towardsdatascience.com/deep-dive-into-transformers-by-hand-%EF%B8%8E-68b8be4bd813?gi=b2b3c1885179
Deep Dive into Transformers by Hand, by Srijanie Dey, PhD, Apr 12, 2024
"…the two mechanisms that are truly the force behind the transformers are attention weighting and feed-forward networks (FFN)."
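As a rough illustration of how those two mechanisms fit together, here is a dimensions-only toy block (my own sketch, not from the article; single head, no layer norm or multi-head splitting): attention weighting mixes information across tokens, then the FFN transforms each token independently.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def transformer_block(x, Wq, Wk, Wv, W1, W2):
        # Attention weighting: every token gathers information from every other token
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v
        h = x + attn                           # residual connection
        # Feed-forward network: applied to each token position independently
        ffn = np.maximum(0, h @ W1) @ W2
        return h + ffn

    rng = np.random.default_rng(0)
    d, d_ff, n = 4, 8, 6
    x = rng.normal(size=(n, d))
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    W1, W2 = rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d))
    print(transformer_block(x, Wq, Wk, Wv, W1, W2).shape)  # (6, 4)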
What I Read: Attention, transformers
Attention in transformers, visually explained | Chapter 6, Deep Learning, by 3Blue1Brown
"Demystifying attention, the key mechanism inside transformers and LLMs."
What I Read: Linear Algebra, Random
https://youtu.be/6htbyY3rH1w?si=IXTrcoIReps_ftFq
Is the Future of Linear Algebra.. Random?, by Mutual Information
"Randomization is arguably the most exciting and innovative idea to have hit linear algebra in a long time."
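One of the headline algorithms in this area is the randomized SVD in the Halko/Martinsson/Tropp sketch-and-solve style: project the matrix onto a random low-dimensional subspace, then do the expensive factorization on the small projected matrix. A toy NumPy version of the idea, mine rather than the video's:

    import numpy as np

    def randomized_svd(A, rank, oversample=10, rng=None):
        # Randomized range finder + SVD of the small sketch
        rng = np.random.default_rng(rng)
        m, n = A.shape
        Omega = rng.normal(size=(n, rank + oversample))   # random test matrix
        Q, _ = np.linalg.qr(A @ Omega)                    # orthonormal basis for the approximate range of A
        B = Q.T @ A                                       # small (rank+oversample) x n matrix
        U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
        return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

    # Toy usage: approximate an exactly rank-20 matrix
    rng = np.random.default_rng(0)
    A = rng.normal(size=(500, 20)) @ rng.normal(size=(20, 300))
    U, s, Vt = randomized_svd(A, rank=20)
    print(np.allclose(A, (U * s) @ Vt, atol=1e-6))        # True up to numerical error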
What I Read: Mamba Explained
https://thegradient.pub/mamba-explained
Mamba Explained, by Kola Ayonrinde, 27 Mar 2024
"Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens)."
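The long-sequence feasibility comes from Mamba being built on a state-space recurrence: each step updates a fixed-size state, so cost grows linearly with sequence length rather than quadratically as in full attention. The sketch below is not Mamba itself (no selective, input-dependent parameters and no hardware-aware parallel scan), just the plain linear recurrence underneath it:

    import numpy as np

    def ssm_scan(A, B, C, u):
        # Discrete state-space recurrence: x_t = A x_{t-1} + B u_t,  y_t = C x_t
        # One fixed-size state update per step, independent of how long the history is
        x = np.zeros(A.shape[0])
        ys = []
        for u_t in u:
            x = A @ x + B * u_t
            ys.append(C @ x)
        return np.array(ys)

    rng = np.random.default_rng(0)
    d_state = 4
    A = 0.9 * np.eye(d_state)          # stable state transition
    B = rng.normal(size=d_state)
    C = rng.normal(size=d_state)
    u = rng.normal(size=1000)          # scalar input sequence; 1_000_000 steps is just 1_000_000 cheap updates
    print(ssm_scan(A, B, C, u).shape)  # (1000,)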
What I Read: High-Dimensional Variance
https://gregorygundersen.com/blog/2023/12/09/covariance-matrices/
High-Dimensional Variance, by Gregory Gundersen, 09 December 2023
"A useful view of a covariance matrix is that it is a natural generalization of variance to higher dimensions."
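The quoted claim is easy to check numerically: the diagonal of a covariance matrix holds the per-dimension variances, while the off-diagonal entries hold the covariances between dimensions. A quick NumPy illustration (mine, not from the post):

    import numpy as np

    rng = np.random.default_rng(0)
    # Correlated 3-D data built by mixing independent normals
    X = rng.normal(size=(10_000, 3)) @ np.array([[2.0, 0.0, 0.0],
                                                 [0.5, 1.0, 0.0],
                                                 [0.0, 0.0, 0.3]])
    Sigma = np.cov(X, rowvar=False)                             # 3 x 3 covariance matrix
    print(np.allclose(np.diag(Sigma), X.var(axis=0, ddof=1)))   # True: diagonal entries are the variances
    print(Sigma)                                                # off-diagonal entries capture co-variation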