What I Read: Evaluating LLM-Evaluators

https://eugeneyan.com/writing/llm-evaluators

Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)
Eugene Yan

“After reading this, you’ll gain an intuition on how to apply, evaluate, and operate LLM-evaluators. We’ll learn when to apply (i) direct scoring vs. pairwise comparisons, (ii) correlation vs. classification metrics, and (iii) LLM APIs vs. finetuned evaluator models.”