AI Con USA 2026 - AI Safety
Evals Are a Team Sport: Building Scalable Evaluation Pipelines for Trustworthy AI
Thursday, June 11, 2026 - 2:40pm to 3:25pm
Modern AI systems fail not only because of flawed models but also because evaluation is often treated as a one-time task rather than an ongoing discipline. This session addresses the challenge of scaling evaluation across teams and pipelines to ensure model reliability, fairness, and performance. Drawing on real-world experience in large-scale financial analytics, Anusha Dwivedula will examine how product and data teams can collaborate to design a continuous evaluation framework that integrates precision, recall, and drift metrics with observability, lineage, and quality controls. She will...
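To make the idea of a continuous evaluation check concrete, here is a minimal Python sketch of how precision, recall, and one drift metric might be combined into a recurring gate. It is not taken from the session: the function names, the choice of the Population Stability Index (PSI) as the drift metric, and the thresholds (0.8 for precision and recall, 0.2 for PSI, a common rule of thumb) are all illustrative assumptions.

```python
import math


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """PSI between two non-empty samples, binned on the baseline's range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    base, cur = fractions(baseline), fractions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))


def evaluate_batch(y_true, y_pred, baseline_scores, current_scores,
                   min_precision=0.8, min_recall=0.8, max_psi=0.2):
    """Run one evaluation cycle; thresholds are hypothetical defaults."""
    precision, recall = precision_recall(y_true, y_pred)
    psi = population_stability_index(baseline_scores, current_scores)
    return {
        "precision": precision,
        "recall": recall,
        "psi": psi,
        "passed": (precision >= min_precision
                   and recall >= min_recall
                   and psi <= max_psi),
    }
```

In a pipeline, a function like `evaluate_batch` would presumably run on every new batch of labeled predictions, with the resulting dictionary logged alongside dataset and model versions so that a failing check can be traced back through lineage.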