large language model – Page 4 – Andrew Fairless, Ph.D.

What I Read: Tool Retrieval, RAG

By Andrew Fairless on November 20, 2024 | August 26, 2024

https://jxnl.co/writing/2024/08/21/trade-off-tool-selection
Optimizing Tool Retrieval in RAG Systems: A Balanced Approach
Jason Liu
2024/08/21
“When it comes to Retrieval-Augmented Generation (RAG) systems, one of the key challenges is deciding how to select and…”

What I Read: LLM Pre-training Post-training

By Andrew Fairless on November 18, 2024 | August 26, 2024

https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training
New LLM Pre-training and Post-training Paradigms
A Look at How Modern LLMs Are Trained
Sebastian Raschka, PhD
Aug 17, 2024
“Initially, the LLM training process focused solely on pre-training, but it has…”

What I Read: Open-endedness, Agentic AI

By Andrew Fairless on November 13, 2024 | August 25, 2024

https://press.airstreet.com/p/open-endedness-is-all-well-need
Open-endedness is all we’ll need: On “Agentic AI”
Air Street Capital, Nathan Benaich, and Alex Chalmers
Aug 15, 2024
“Agentic systems offer the promise of agents – a software system that…”

What I Read: Visual Guide, Quantization

By Andrew Fairless on October 30, 2024 | August 3, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
A Visual Guide to Quantization, Demystifying the Compression of Large Language Models
Maarten Grootendorst
Jul 22, 2024
“…I will introduce the field of quantization in the context of language modeling and…”

What I Read: History, Transformer

By Andrew Fairless on October 28, 2024 | July 22, 2024

Shaping the Future of AI from the History of Transformer
Stanford CS25: V4 I Hyung Won Chung of OpenAI
Stanford Online
“I will provide a highly-opinionated view on the early history of…”

What I Read: LLM evaluation

By Andrew Fairless on October 21, 2024 | July 22, 2024

https://hamel.dev/blog/posts/evals
Your AI Product Needs Evals
How to construct domain-specific LLM evaluation systems.
Hamel Husain
March 29, 2024
“…I’ve seen many successful and unsuccessful approaches to building LLM products. I’ve found that unsuccessful…”

What I Read: Data Flywheels, LLM

By Andrew Fairless on October 17, 2024 | July 20, 2024

https://www.sh-reya.com/blog/ai-engineering-flywheel
Data Flywheels for LLM Applications
Shreya Shankar
Jul 1, 2024
“This diagram illustrates my (idealized) architecture of an LLM pipeline, from input processing through evaluation and logging. It showcases ideas I’ll…”

What I Read: Improving Language Models, Practical Size

By Andrew Fairless on October 15, 2024 | July 14, 2024

https://amaarora.github.io/posts/2024-07-07%20Gemma.html
Gemma 2, Improving Open Language Models at a Practical Size
Aman Arora
July 9, 2024
“…we take a deep dive into the architectural components of Gemma 2 such as Grouped Query…”

What I Read: Extrinsic Hallucinations, LLMs

By Andrew Fairless on October 7, 2024 | July 14, 2024

https://lilianweng.github.io/posts/2024-07-07-hallucination
Extrinsic Hallucinations in LLMs
Lilian Weng
July 7, 2024
“This post focuses on extrinsic hallucination. To avoid hallucination, LLMs need to be (1) factual and (2) acknowledge not knowing the answer…”

What I Read: What can LLMs never do?

By Andrew Fairless on October 1, 2024 | July 14, 2024

https://www.strangeloopcanon.com/p/what-can-llms-never-do
What can LLMs never do?
Rohit Krishnan
Apr 23, 2024
“Every time over the past few years that we came up with problems LLMs can’t do, they passed them with flying…”

Tag: large language model