Deep Papers
Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.
Deep Papers
Breaking Down EvalGen: Who Validates the Validators?
This week’s paper explores EvalGen, a mixed-initative approach to aligning LLM-generated evaluation functions with human preferences. EvalGen assists users in developing both criteria acceptable LLM outputs and developing functions to check these standards, ensuring evaluations reflect the users’ own grading standards.
Read it on the blog: https://arize.com/blog/breaking-down-evalgen-who-validates-the-validators/
To learn more about ML observability, join the Arize AI Slack community or get the latest on our LinkedIn and Twitter.