What I Read: Mamba
https://thegradient.pub/mamba-explained
"Mamba Explained" by Kola Ayonrinde, 27 Mar 2024: "Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens)."
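The quoted scaling claim rests on the state space formulation: each token is processed by updating a fixed-size hidden state, so compute grows linearly with sequence length rather than quadratically as in self-attention. A minimal NumPy sketch of the underlying linear SSM recurrence, illustrative only; Mamba's selective mechanism additionally makes the A, B, C parameters input-dependent:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    One fixed-cost update per token, so total work is O(sequence length),
    versus self-attention's O(sequence length^2) pairwise comparisons."""
    h = np.zeros(A.shape[0])       # fixed-size state, regardless of sequence length
    ys = []
    for x_t in x:                  # constant work per token
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)

# Toy example: scalar input stream, 4-dimensional state.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)                # stable state transition
B = rng.normal(size=4)
C = rng.normal(size=4)
y = ssm_scan(rng.normal(size=1000), A, B, C)  # 1M tokens would scan the same way
```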
What I Read: Chain-of-Thought Reasoning
https://www.quantamagazine.org/how-chain-of-thought-reasoning-helps-neural-networks-compute-20240321
"How Chain-of-Thought Reasoning Helps Neural Networks Compute" by Ben Brubaker, 21 Mar 2024: "Large language models do better at solving problems when they show their work. Researchers are beginning to understand why."
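The finding the article examines is a prompting effect: a model asked to produce intermediate steps gets multi-step problems right more often than one asked only for the answer. A small illustrative contrast between the two prompt styles; the toy question and prompt strings are my own, not the article's:

```python
question = "If a train travels 60 miles in 90 minutes, what is its speed in mph?"

# Direct prompt: asks only for the final answer.
direct_prompt = f"Q: {question}\nA:"

# Chain-of-thought prompt: nudges the model to externalize intermediate steps,
# which is what the article reports improves accuracy on multi-step problems.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# The intermediate computation a chain-of-thought response would surface:
# 90 minutes = 1.5 hours, and 60 miles / 1.5 hours = 40 mph.
print(direct_prompt)
print(cot_prompt)
```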
What I Read: Geometric Deep Learning
https://thegradient.pub/towards-geometric-deep-learning/
"Towards Geometric Deep Learning" by Michael Bronstein, 18 Feb 2023: "Geometric Deep Learning is an umbrella term for approaches considering a broad class of ML problems from the perspectives of symmetry and invariance."
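One concrete instance of that symmetry perspective: a network that sum-pools a shared per-element map over a set is invariant to permutations of its input (the DeepSets construction). A minimal NumPy sketch, with arbitrary illustrative weights and shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))        # per-element feature map, weights shared across elements

def set_embed(points):
    """Permutation-invariant embedding: apply a shared map to each point,
    then sum-pool. Reordering the input cannot change the output."""
    return np.tanh(points @ W).sum(axis=0)

points = rng.normal(size=(5, 3))             # a set of 5 points in R^3
shuffled = points[rng.permutation(5)]        # same set, different order
assert np.allclose(set_embed(points), set_embed(shuffled))
```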
What I Read: Realtime User Actions in Recommendation
https://medium.com/pinterest-engineering/how-pinterest-leverages-realtime-user-actions-in-recommendation-to-boost-homefeed-engagement-volume-165ae2e8cde8
"How Pinterest Leverages Realtime User Actions in Recommendation to Boost Homefeed Engagement Volume" by Xue Xia, Neng Gu, Dhruvil Deven Badani, et al. (Pinterest Engineering)
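The post's core idea is feeding a sequence of the user's most recent actions into the Homefeed ranking model at request time. A hedged sketch of the buffering side of such a pipeline; the class name, the 100-event window, and the padding scheme are assumptions for illustration, not Pinterest's actual implementation:

```python
from collections import deque

MAX_ACTIONS = 100   # hypothetical window size for the recent-action sequence

class RealtimeActionBuffer:
    """Keeps a user's most recent (action_type, item_id, timestamp) events so
    a ranking model can consume them as a sequence feature at request time."""
    def __init__(self):
        self.events = deque(maxlen=MAX_ACTIONS)   # oldest events drop automatically

    def record(self, action_type: str, item_id: int, ts: float) -> None:
        self.events.append((action_type, item_id, ts))

    def as_feature(self):
        # Pad to a fixed length so the model's input shape stays constant.
        pad = [("pad", 0, 0.0)] * (MAX_ACTIONS - len(self.events))
        return list(self.events) + pad

buf = RealtimeActionBuffer()
buf.record("repin", 12345, 1_700_000_000.0)
buf.record("click", 67890, 1_700_000_050.0)
features = buf.as_feature()   # would feed a sequence encoder inside the ranker
```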