machine learning – Page 5 – Andrew Fairless, Ph.D.

What I Read: LLM Pre-training Post-training

By Andrew Fairless on November 18, 2024 / August 26, 2024

https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training
New LLM Pre-training and Post-training Paradigms
A Look at How Modern LLMs Are Trained
Sebastian Raschka, PhD
Aug 17, 2024
“Initially, the LLM training process focused solely on pre-training, but it has…”

What I Read: Open-endedness, Agentic AI

By Andrew Fairless on November 13, 2024 / August 25, 2024

https://press.airstreet.com/p/open-endedness-is-all-well-need
Open-endedness is all we’ll need: On “Agentic AI”
Air Street Capital, Nathan Benaich, and Alex Chalmers
Aug 15, 2024
“Agentic systems offer the promise of agents – a software system that…”

What I Read: Regularization, polynomial bases

By Andrew Fairless on November 7, 2024 / August 25, 2024

https://alexshtf.github.io/2024/06/03/PolynomialBasesRegProps.html
Alex Shtoff
Regularization properties of polynomial bases
Jun 3, 2024
“We used the Bernstein basis to demonstrate the importance of choosing a “good” polynomial basis, and that other well-known bases may…”

What I Read: Contextual Bandit, LinUCB

By Andrew Fairless on November 5, 2024 / August 15, 2024

https://truetheta.io/concepts/reinforcement-learning/lin-ucb
A Reliable Contextual Bandit Algorithm: LinUCB
DJ Rich
August 6, 2024
“A user visits a news website. Which articles should they be shown?”

What I Read: Visual Guide, Quantization

By Andrew Fairless on October 30, 2024 / August 3, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
A Visual Guide to Quantization, Demystifying the Compression of Large Language Models
Maarten Grootendorst
Jul 22, 2024
“…I will introduce the field of quantization in the context of language modeling and…”

What I Read: History, Transformer

By Andrew Fairless on October 28, 2024 / July 22, 2024

Shaping the Future of AI from the History of Transformer
Stanford CS25: V4 I Hyung Won Chung of OpenAI
Stanford Online
“I will provide a highly-opinionated view on the early history of…”

What I Read: Kernel, Convolutional Representations

By Andrew Fairless on October 24, 2024 / July 22, 2024

https://logb-research.github.io/blog/2024/ckn
Kernel Trick I – Deep Convolutional Representations in RKHS
Oussama Zekri, Ambroise Odonnat
July 18, 2024
“…we focus on the Convolutional Kernel Network (CKN) architecture proposed in End-to-End Kernel Learning with…”

What I Read: LLM evaluation

By Andrew Fairless on October 21, 2024 / July 22, 2024

https://hamel.dev/blog/posts/evals
Your AI Product Needs Evals
How to construct domain-specific LLM evaluation systems.
Hamel Husain
March 29, 2024
“…I’ve seen many successful and unsuccessful approaches to building LLM products. I’ve found that unsuccessful…”

What I Read: Data Flywheels, LLM

By Andrew Fairless on October 17, 2024 / July 20, 2024

https://www.sh-reya.com/blog/ai-engineering-flywheel
Data Flywheels for LLM Applications
Shreya Shankar
Jul 1, 2024
“This diagram illustrates my (idealized) architecture of an LLM pipeline, from input processing through evaluation and logging. It showcases ideas I’ll…”

What I Read: Improving Language Models, Practical Size

By Andrew Fairless on October 15, 2024 / July 14, 2024

https://amaarora.github.io/posts/2024-07-07%20Gemma.html
Gemma 2, Improving Open Language Models at a Practical Size
Aman Arora
July 9, 2024
“…we take a deep dive into the architectural components of Gemma 2 such as Grouped Query…”

Tag: machine learning