https://www.quantamagazine.org/ai-needs-enormous-computing-power-could-light-based-chips-help-20240520 AI Needs Enormous Computing Power. Could Light-Based Chips Help? (Amos Zeeberg, 5/20/24 10:40 AM): “Optical neural networks, which use photons instead of electrons, have advantages over traditional systems. They also face…”
What I Read: How Machines ‘Grok’ Data
https://www.quantamagazine.org/how-do-machines-grok-data-20240412 How Do Machines ‘Grok’ Data? (Anil Ananthaswamy, 4/12/24): “By apparently overtraining them, researchers have seen neural networks discover novel solutions to problems.”
What I Read: Attention, transformers
Attention in transformers, visually explained | Chapter 6, Deep Learning (3Blue1Brown): “Demystifying attention, the key mechanism inside transformers and LLMs.”
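The mechanism the video demystifies boils down to one formula, softmax(QK^T / √d)·V. A minimal single-head sketch in plain NumPy (toy random matrices, no masking or multi-head machinery):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))  # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, 8-dim query vectors
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # each token's output is a weighted mix of the value rows
```

Each output row is a convex combination of the rows of V, with mixing weights set by query–key similarity.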
What I Read: Mamba Explained
https://thegradient.pub/mamba-explained Mamba Explained (Kola Ayonrinde, 3/27/24): “Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens).”
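The reason Mamba scales linearly in sequence length is that its core is a state-space recurrence, h_t = A·h_{t-1} + B·x_t, y_t = C·h_t, rather than all-pairs attention. A toy sketch with fixed matrices (real Mamba makes A, B, C input-dependent, i.e. "selective", and uses a parallel scan rather than this Python loop):

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    # Discrete linear state-space recurrence:
    #   h_t = A h_{t-1} + B x_t,   y_t = C h_t
    # Cost is O(sequence length): one fixed-size update per step.
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B @ x
        ys.append(C @ h)
    return np.array(ys)

# Toy example: 2-dim hidden state, scalar inputs and outputs.
A = 0.5 * np.eye(2)   # state decay
B = np.ones((2, 1))   # input projection
C = np.ones((1, 2))   # output readout
xs = np.ones((5, 1))  # a sequence of 5 scalar inputs
ys = ssm_scan(A, B, C, xs)
print(ys.ravel())
```

Because the state h has fixed size, memory does not grow with context length, which is what makes million-token sequences feasible.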
What I Read: Chain-of-Thought Reasoning
https://www.quantamagazine.org/how-chain-of-thought-reasoning-helps-neural-networks-compute-20240321 How Chain-of-Thought Reasoning Helps Neural Networks Compute (Ben Brubaker, 3/21/24 11:15 AM): “Large language models do better at solving problems when they show their work. Researchers are beginning to understand why.”
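“Showing their work” is often elicited with nothing more than a prompt cue; the phrase “Let's think step by step” comes from the zero-shot chain-of-thought literature. A sketch contrasting the two prompt styles (function names are my own, hypothetical):

```python
def direct(question: str) -> str:
    # Ask for the answer immediately.
    return f"Q: {question}\nA:"

def with_cot(question: str) -> str:
    # Zero-shot chain-of-thought: append a cue that elicits
    # intermediate reasoning steps before the final answer.
    return f"Q: {question}\nA: Let's think step by step."

q = "A shop sells pens at 3 for $2. How much do 12 pens cost?"
print(direct(q))
print(with_cot(q))
```

The intuition the article explores is that the generated intermediate steps give the model extra sequential computation, which fixed-depth transformers otherwise lack.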