https://sander.ai/2024/02/28/paradox.html The paradox of diffusion distillation. Sander Dieleman, February 28, 2024. “…let’s take a closer look at the various ways in which the number of sampling steps required to get good results…”
What I Read: LLM Chatbots, Browser
https://www.kdnuggets.com/2023/05/webllm-bring-llm-chatbots-browser.html Web LLM: Bring LLM Chatbots to the Browser. Bala Priya C, May 22, 2023. “Wouldn’t it be cool if you can run LLMs and LLM chatbots natively in your browser?”
What I Read: Dataset Distillation
https://ai.googleblog.com/2021/12/training-machine-learning-models-more.html Training Machine Learning Models More Efficiently with Dataset Distillation. Wednesday, December 15, 2021. Posted by Timothy Nguyen, Research Engineer, and Jaehoon Lee, Senior Research Scientist, Google Research. “For a machine learning (ML)…”
What I Read: Why Deep Learning Works
https://moultano.wordpress.com/2020/10/18/why-deep-learning-works-even-though-it-shouldnt/ Why Deep Learning Works Even Though It Shouldn’t. Ryan Moulton’s Articles, Ryan Moulton. “Stop talking about minima…. Nobody ever trains their model remotely close to convergence…. What really needs further research…”
What I Read: Do Wide and Deep Networks Learn the Same Things?
https://ai.googleblog.com/2021/05/do-wide-and-deep-networks-learn-same.html Do Wide and Deep Networks Learn the Same Things? Tuesday, May 4, 2021. Posted by Thao Nguyen, AI Resident, Google Research.
What I Read: Ensemble, knowledge distillation, and self-distillation
https://www.microsoft.com/en-us/research/blog/three-mysteries-in-deep-learning-ensemble-knowledge-distillation-and-self-distillation/ Three mysteries in deep learning: Ensemble, knowledge distillation, and self-distillation. Published January 19, 2021. By Zeyuan Allen-Zhu, Senior Researcher, and Yuanzhi Li, Assistant Professor, Carnegie Mellon University. “…besides this small…”