https://sakana.ai/transformer-squared
Transformer²: Self-Adaptive LLMs (sakana.ai, January 15, 2025): “Imagine a machine learning system that could adjust its own weights dynamically to thrive in unfamiliar settings, essentially illustrating a system that evolves as …”
What I Read: Model Merging
https://planetbanatt.net/articles/modelmerging.html
Model Merging and You (Eryk Banatt, August 2024): “Model Merging is a weird and experimental technique which lets you take two models and combine them together to get a new model.”
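The simplest form of what the article describes is linear interpolation of matching parameters. A minimal sketch, not taken from the article, using plain dicts of lists in place of real model state dicts (`merge_models`, `weights_a`, `weights_b`, and `alpha` are hypothetical names):

```python
def merge_models(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two 'state dicts' with matching parameter names.

    alpha=0.0 returns model A unchanged; alpha=1.0 returns model B.
    Real merging tools do this over tensors (and offer fancier schemes
    like SLERP or TIES), but the core idea is this weighted average.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }

# Toy example: two 'models' with one flat parameter each.
model_a = {"linear.weight": [1.0, 2.0, 3.0]}
model_b = {"linear.weight": [3.0, 4.0, 5.0]}
merged = merge_models(model_a, model_b, alpha=0.5)
# → {"linear.weight": [2.0, 3.0, 4.0]}
```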
What I Read: LLMs, School Math
https://towardsdatascience.com/understanding-llms-from-scratch-using-middle-school-math-e602d27ec876?gi=551c5bfd7f21
Understanding LLMs from Scratch Using Middle School Math (Rohit Patel, Oct 19, 2024): “In this article, we talk about how Large Language Models (LLMs) work, from scratch — assuming only that …”
What I Read: Transformers Inference Optimization
https://astralord.github.io/posts/transformer-inference-optimization-toolset
Transformers Inference Optimization Toolset (Aleksandr Samarin, Oct 1, 2024): “Large Language Models are pushing the boundaries of artificial intelligence, but their immense size poses significant computational challenges. As these models grow, …”
What I Read: embedding models
https://unstructured.io/blog/understanding-embedding-models-make-an-informed-choice-for-your-rag
Understanding embedding models: make an informed choice for your RAG (Maria Khalusova, Aug 13, 2024): “How do you choose a suitable embedding model for your RAG application?”