https://huggingface.co/blog/mlabonne/merge-models
Merge Large Language Models with mergekit
January 9, 2024
Maxime Labonne
“Model merging is a technique that combines two or more LLMs into a single model. It’s a relatively new and
What I Read: Transformers by Hand
https://towardsdatascience.com/deep-dive-into-transformers-by-hand-%EF%B8%8E-68b8be4bd813?gi=b2b3c1885179
Deep Dive into Transformers by Hand
Srijanie Dey, PhD
Apr 12, 2024
“…the two mechanisms that are truly the force behind the transformers are attention weighting and feed-forward networks (FFN).”
What I Read: AI, Light-Based Chips
https://www.quantamagazine.org/ai-needs-enormous-computing-power-could-light-based-chips-help-20240520
AI Needs Enormous Computing Power. Could Light-Based Chips Help?
Amos Zeeberg
5/20/24 10:40 AM
“Optical neural networks, which use photons instead of electrons, have advantages over traditional systems. They also face