Attention in transformers, visually explained | Chapter 6, Deep Learning (3Blue1Brown)
"Demystifying attention, the key mechanism inside transformers and LLMs."
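The video is a visual walkthrough, but the mechanism it covers reduces to a few lines. A minimal NumPy sketch of scaled dot-product attention, for reference; the shapes and names here are mine, not from the video:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating, for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays of query, key, and value vectors.
    # Each query scores every key; softmax turns scores into weights
    # that mix the value vectors.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted average of value vectors

# Tiny example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```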
What I Read: Linear Algebra, Random
https://youtu.be/6htbyY3rH1w?si=IXTrcoIReps_ftFq
Is the Future of Linear Algebra.. Random? (Mutual Information)
"Randomization is arguably the most exciting and innovative idea to have hit linear algebra in a long time."
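The video surveys randomized numerical linear algebra; one representative algorithm is the randomized SVD of Halko, Martinsson, and Tropp: sample the matrix's range with a random sketch, then run an exact SVD on the small projected problem. A hedged sketch, where the oversampling parameter and names are my choices:

```python
import numpy as np

def randomized_svd(A, k, oversample=10, rng=None):
    rng = np.random.default_rng(rng)
    m, n = A.shape
    # Sketch: hit A with a random Gaussian test matrix to capture its range.
    Omega = rng.normal(size=(n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)  # orthonormal basis for the sampled range
    # Project A onto that basis and run a small, cheap SVD there.
    B = Q.T @ A                     # (k + oversample, n): much smaller than A
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_b)[:, :k], s[:k], Vt[:k]

# On an exactly rank-5 matrix the rank-5 approximation is near-exact.
rng = np.random.default_rng(1)
A = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 300))
U, s, Vt = randomized_svd(A, k=5)
print(np.allclose(A, (U * s) @ Vt, atol=1e-6))  # True
```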
What I Read: Mamba Explained
https://thegradient.pub/mamba-explained
Mamba Explained (Kola Ayonrinde, 27 Mar 2024)
"Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens)."
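The cost argument behind that claim: Mamba is built on state-space models, whose recurrence runs in O(L) time in sequence length, versus attention's O(L²). A toy sketch of the plain discrete SSM recurrence, with the A, B, C names following the usual SSM notation; Mamba's actual contribution, making B, C, and the discretization step input-dependent ("selection"), is omitted here:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    # x: (L,) input sequence; A: (d, d); B, C: (d,)
    # Recurrence: h_t = A h_{t-1} + B x_t ;  output: y_t = C . h_t
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:            # one constant-cost step per token: O(L) total
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)

# Doubling the sequence doubles the work, which is what makes
# million-token contexts plausible for this model family.
L, d = 16, 4
rng = np.random.default_rng(2)
y = ssm_scan(rng.normal(size=L), 0.9 * np.eye(d),
             rng.normal(size=d), rng.normal(size=d))
print(y.shape)  # (16,)
```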
What I Read: Chain-of-Thought Reasoning
https://www.quantamagazine.org/how-chain-of-thought-reasoning-helps-neural-networks-compute-20240321
How Chain-of-Thought Reasoning Helps Neural Networks Compute (Ben Brubaker, 21 Mar 2024)
"Large language models do better at solving problems when they show their work. Researchers are beginning to understand why."
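The article's explanation, roughly: writing out intermediate steps buys a model extra serial computation that a single fixed-depth forward pass lacks. A toy illustration using parity, a classic hard case for fixed-depth models that becomes trivial as a chain of one-bit updates; the "scratchpad" framing is my analogy, not code from the article:

```python
def parity_with_scratchpad(bits):
    # Hard to compute in one shallow pass over long inputs, easy as a
    # sequence of tiny steps, each conditioned on the previous output.
    state = 0
    trace = []
    for b in bits:
        state ^= b           # one small step per "token"
        trace.append(state)  # the shown work
    return state, trace

answer, work = parity_with_scratchpad([1, 0, 1, 1, 0, 1])
print(work)    # [1, 1, 0, 1, 1, 0]
print(answer)  # 0
```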
What I Read: Reliance on AI-Assisted Decisions
https://statmodeling.stat.columbia.edu/2024/03/06/defining-optimal-reliance-on-model-predictions-in-ai-assisted-decisions/
Defining optimal reliance on model predictions in AI-assisted decisions (Jessica Hullman, 6 Mar 2024)
"…AI-assisted decision task is of interest as organizations deploy predictive models to assist human decision-making in domains like…"
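The post asks what well-calibrated reliance on a model would even look like. A toy simulation of one natural benchmark, deferring on each case to whichever agent is more likely to be correct; the setup and numbers are mine, not from Hullman's post:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
# Per-case probability that each agent decides correctly (varies by case).
p_human = rng.uniform(0.50, 0.90, n)
p_model = rng.uniform(0.50, 0.95, n)
human_ok = rng.random(n) < p_human
model_ok = rng.random(n) < p_model

# Benchmark policy: rely on the model exactly when it is more likely
# correct than the human on that case. This idealization assumes the
# per-case accuracies are known, which is what makes it a benchmark
# rather than an achievable strategy.
rely_on_model = p_model > p_human
optimal_ok = np.where(rely_on_model, model_ok, human_ok)

print("human alone:  ", human_ok.mean())
print("model alone:  ", model_ok.mean())
print("optimal blend:", optimal_ok.mean())  # beats either agent alone
```

Measured over- or under-reliance is then the gap between observed human+AI accuracy and this upper bound.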