https://www.anthropic.com/index/evaluating-ai-systems
Challenges in evaluating AI systems
Oct 4, 2023
“…what many people working inside and outside of AI don’t fully appreciate is how difficult it is to build robust and reliable model evaluations… of model capabilities or safety.”