linear algebra – Page 3 – Andrew Fairless, Ph.D.

What I Read: Transformers by Hand

By Andrew Fairless on August 14, 2024May 25, 2024

https://towardsdatascience.com/deep-dive-into-transformers-by-hand-%EF%B8%8E-68b8be4bd813?gi=b2b3c1885179 Deep Dive into Transformers by HandSrijanie Dey, PhDApr 12, 2024 “…the two mechanisms that are truly the force behind the transformers are attention weighting and feed-forward networks (FFN).”

What I Read: Platonic Hypothesis

By Andrew Fairless on July 29, 2024May 24, 2024

https://phillipi.github.io/prh The Platonic Representation HypothesisMinyoung Huh, Brian Cheung, Tongzhou Wang, Phillip IsolaMITPosition Paper in ICML 2024 “Neural networks, trained with different objectives on different data and modalities, are converging toContinue readingWhat I Read: Platonic Hypothesis

What I Read: time series, Gaussian processes

By Andrew Fairless on July 18, 2024May 24, 2024

https://hendersontrent.github.io/posts/2024/05/gaussian-process-time-series Interpretable time-series modelling using Gaussian processesTrent HendersonMay 03, 2024 “…Gaussian processes (GP)… are an insanely powerful tool that can model an absurd range of data (including continuous and discrete)Continue readingWhat I Read: time series, Gaussian processes

What I Read: Flow Matching

By Andrew Fairless on June 26, 2024April 23, 2024

https://mlg.eng.cam.ac.uk/blog/2024/01/20/flow-matching.html An introduction to Flow MatchingTor Fjelde, Emile Mathieu, Vincent Dutordoir “Flow matching (FM) is a recent generative modelling paradigm which has rapidly been gaining popularity in the deep probabilisticContinue readingWhat I Read: Flow Matching

What I Read: Attention, transformers

By Andrew Fairless on June 18, 2024April 16, 2024

Attention in transformers, visually explained | Chapter 6, Deep Learning3Blue1Brown “Demystifying attention, the key mechanism inside transformers and LLMs.”

What I Read: Linear Algebra, Random

By Andrew Fairless on June 17, 2024April 15, 2024

https://youtu.be/6htbyY3rH1w?si=IXTrcoIReps_ftFq Is the Future of Linear Algebra.. Random?Mutual Information “Randomization is arguably the most exciting and innovative idea to have hit linear algebra in a long time.”

What I Read: Mamba Explained

By Andrew Fairless on June 10, 2024April 15, 2024

https://thegradient.pub/mamba-explained Mamba ExplainedKola Ayonrinde27.Mar.2024 “Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens).”

What I Read: High-Dimensional Variance

By Andrew Fairless on May 22, 2024March 11, 2024

https://gregorygundersen.com/blog/2023/12/09/covariance-matrices/ High-Dimensional VarianceGregory Gundersen09 December 2023 “A useful view of a covariance matrix is that it is a natural generalization of variance to higher dimensions.”

What I Read: Mamba, Easy Way

By Andrew Fairless on May 9, 2024March 7, 2024

https://jackcook.com/2024/02/23/mamba.html Mamba: The Easy WayJack CookFebruary 23, 2024 “Mamba appears to outperform similarly-sized Transformers while scaling linearly with sequence length…. If… you’re looking for a higher-level overview of Mamba’s bigContinue readingWhat I Read: Mamba, Easy Way

What I Read: Mamba

By Andrew Fairless on May 1, 2024March 7, 2024

https://jameschen.io/jekyll/update/2024/02/12/mamba.html Mamba No. 5 (A Little Bit Of…)James ChenFeb 12, 2024 “…I attempt to provide a walkthrough of the essence of the Mamba state space model architecture, occasionally sacrificing someContinue readingWhat I Read: Mamba

Tag: linear algebra