903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir

Up next

1002: Fable 5: The Full Story from Capabilities to Drama

Anthropic’s Claude Fable 5 was the most capable AI model ever released to the public and it lasted just three days before the US government forced it offline. Jon Krohn unpacks both halves of the story: what makes Fable 5 special, and why it was pulled. Fable 5 and its locked-dow ...

1001: How AI Erased My Career Moat, an Episode #1001 Special: Jon Krohn interviewed by Kirill Eremenko

For this episode #1001 special, the tables are turned: SuperDataScience founder Kirill Eremenko takes the host’s chair and Jon Krohn is the guest. They trace Jon Krohn’s path from an Oxford neuroscience PhD to a New York hedge fund to founding the AI consulting firm Y Carrot, why ...

Recommended Episodes

Metrics Driven Development
Practical AI

How do you systematically measure, optimize, and improve the performance of LLM applications (like those powered by RAG or tool use)? Ragas is an open source effort that has been trying to answer this question comprehensively, and they are promoting a “Metrics Driven Development” ...

Only as good as the data
Practical AI

You might have heard that “AI is only as good as the data.” What does that mean and what data are we talking about? Chris and Daniel dig into that topic in the episode exploring the categories of data that you might encounter working in AI (for training, testing, fine-tuning, ben ...

Measuring The Speed of AI Through Benchmarks
The Brave Technologist

David Kanter, Executive Director at MLCommons, discusses the work they're doing with MLPerf Benchmarks, creating the world's first industry standard approach to measuring AI speed and safety. He also shares ways they're testing AI and LLMs for harm, to measure—and, o ...

AI Today Podcast: How AI is Transforming Insurance, Interview with Connor Atchison, Wisedocs
AI Today Podcast

AI is proving transformational in every industry, including long established industries, and insurance is no exception. AI is able to optimize underwriting processes, enable more personalized insurance offerings, enhance the overall customer experience, as well as help with proce ...