Deep Papers
Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.
Podcasting since 2023 • 40 episodes
Latest Episodes
LLMs as Judges: A Comprehensive Survey on LLM-Based Evaluation Methods
We discuss a major survey of the last few years of research on LLMs as judges. "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods" systematically examines the LLMs-as-Judges framework across five dimensions: functio...
28:57
Merge, Ensemble, and Cooperate! A Survey on Collaborative LLM Strategies
LLMs have revolutionized natural language processing, showcasing remarkable versatility and capabilities. But individual LLMs often exhibit distinct strengths and weaknesses, influenced by differences in their training corpora. This diversity p...
28:47
Agent-as-a-Judge: Evaluate Agents with Agents
This week, we break down the “Agent-as-a-Judge” framework—a new agent evaluation paradigm that’s kind of like getting robots to grade each other’s homework. Where typical evaluation methods focus solely on outcomes or demand extensive manual wo...
24:54
Introduction to OpenAI's Realtime API
We break down OpenAI’s Realtime API. Learn how to integrate language models into your applications for instant, context-aware responses that drive user engagement. Whether you’re building chatbots, dynamic content tools, or ...
29:56
Swarm: OpenAI's Experimental Approach to Multi-Agent Systems
As multi-agent systems grow in importance for fields ranging from customer support to autonomous decision-making, OpenAI has introduced Swarm, an experimental framework that simplifies the process of building and managing these systems. Swarm, ...
46:46