Troubleshooting and Resolving Common LLM Issues in Production
According to a recent survey, 61.7% of enterprise engineering teams either have a generative AI application or plan to have one within a year, and 14.1% are already in production. As enterprises race to deploy generative AI across their businesses, ensuring that LLMs are deployed reliably and responsibly is paramount. But how can enterprises and AI engineers evaluate and troubleshoot models in real time? In this session, Amber Roberts, a data scientist and machine learning engineer at Arize AI, covers emerging best practices drawn from her direct work advising enterprises on real issues. Whether teams are running LLM apps or using LLMs as an additional tool for human-in-the-loop evaluations, this session will help them mitigate the inevitable issues that arise, such as inaccurate responses and hallucinations.
Amber Roberts is a data scientist, machine learning engineer, and education lead at Arize AI, an AI observability and large language model evaluation platform. Before joining Arize, Amber was a Product Manager of AI/ML at Splunk and Head of Artificial Intelligence at Insight Data Science. A former astrophysicist and Carnegie Fellow, Amber holds an MS in Astrophysics from the Universidad de Chile.