https://hamel.dev/blog/posts/evals Your AI Product Needs Evals. How to construct domain-specific LLM evaluation systems. Hamel Husain. March 29, 2024. “…I’ve seen many successful and unsuccessful approaches to building LLM products. I’ve found that unsuccessful
What I Read: Detecting hallucinations, LLMs, semantic entropy
https://oatml.cs.ox.ac.uk/blog/2024/06/19/detecting_hallucinations_2024.html Detecting hallucinations in large language models using semantic entropy. Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn, Yarin Gal. 19 Jun 2024. “We show how one can use uncertainty to detect confabulations.”