transformer – Andrew Fairless, Ph.D.

What I Read: Chatbot Limitations

By Andrew Fairless on May 19, 2025February 22, 2025

Chatbot Software Begins to Face Fundamental Limitations Chatbot Software Begins to Face Fundamental LimitationsAnil AnanthaswamyJanuary 31, 2025 “Recent results show that large language models struggle with compositional tasks, suggesting aContinue readingWhat I Read: Chatbot Limitations

What I Read: Short, Nvidia

By Andrew Fairless on May 8, 2025February 2, 2025

https://youtubetranscriptoptimizer.com/blog/05_the_short_case_for_nvda The Short Case for Nvidia StockJeffrey EmanuelJanuary 25, 2025 “…NVIDIA faces an unprecedented convergence of competitive threats that make its premium valuation increasingly difficult to justify… The company’s supposedContinue readingWhat I Read: Short, Nvidia

What I Read: Adaptive LLMs

By Andrew Fairless on April 30, 2025January 30, 2025

https://sakana.ai/transformer-squared Transformer²: Self-Adaptive LLMssakana.aiJanuary 15, 2025 “Imagine a machine learning system that could adjust its own weights dynamically to thrive in unfamiliar settings, essentially illustrating a system that evolves asContinue readingWhat I Read: Adaptive LLMs

What I Read: Tensor Dimensions, Transformers

By Andrew Fairless on April 29, 2025January 28, 2025

https://huggingface.co/blog/not-lain/tensor-dims Mastering Tensor Dimensions in TransformersHafedh HichriJanuary 12, 2025 “Most generative AI models are built using a decoder-only architecture. In this blog post, we’ll explore a simple text generation model,Continue readingWhat I Read: Tensor Dimensions, Transformers

What I Read: Model Merging

By Andrew Fairless on April 17, 2025January 18, 2025

https://planetbanatt.net/articles/modelmerging.html Model Merging and YouEryk BanattAugust 2024 “Model Merging is a weird and experimental technique which lets you take two models and combine them together to get a new model.”

What I Read: Multimodal LLMs

By Andrew Fairless on March 6, 2025November 17, 2024

https://magazine.sebastianraschka.com/p/understanding-multimodal-llms Understanding Multimodal LLMsSebastian Raschka, PhDNov 03, 2024 “In this article, I aim to explain how multimodal LLMs function. Additionally, I will review and summarize roughly a dozen other recentContinue readingWhat I Read: Multimodal LLMs

What I Read: LLMs, School Math

By Andrew Fairless on March 5, 2025November 16, 2024

https://towardsdatascience.com/understanding-llms-from-scratch-using-middle-school-math-e602d27ec876?gi=551c5bfd7f21 Understanding LLMs from Scratch Using Middle School MathRohit PatelOct 19, 2024 “In this article, we talk about how Large Language Models (LLMs) work, from scratch — assuming only thatContinue readingWhat I Read: LLMs, School Math

What I Read: Mamba, State Space

By Andrew Fairless on February 11, 2025October 25, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mamba-and-state A Visual Guide to Mamba and State Space ModelsMaarten GrootendorstFeb 19, 2024 “To further improve LLMs, new architectures are developed that might even outperform the Transformer architecture. One ofContinue readingWhat I Read: Mamba, State Space

What I Read: Transformers Inference Optimization

By Andrew Fairless on January 27, 2025October 19, 2024

https://astralord.github.io/posts/transformer-inference-optimization-toolset Transformers Inference Optimization ToolsetAleksandr SamarinOct 1, 2024 “Large Language Models are pushing the boundaries of artificial intelligence, but their immense size poses significant computational challenges. As these models grow,Continue readingWhat I Read: Transformers Inference Optimization

What I Watch: GenAI, Classify Text

By Andrew Fairless on January 21, 2025September 29, 2024

https://www.youtube.com/watch?v=Y4HQQPfyzwo Is GenAI All You Need to Classify Text? Some Learnings from the TrenchesPyDataSep 26, 2024 “…practical implications of using GenAI for text classification tasks. The speakers highlighted challenges suchContinue readingWhat I Watch: GenAI, Classify Text

Tag: transformer