Benchmark Bank Heist

Up next

Unfaithful Chain of Thought

What's actually happening when an LLM "thinks out loud"? Research on human decision-making suggests that much of the reasoning we believe drives our choices is actually post hoc rationalization: we decide first, explain later. Katie and Ben get curious about whether the same mig ...

Benchmarking AI Models

How do you know if a new AI model is actually better than the last one? It turns out answering that question is a lot messier than it sounds. This week we dig into the world of LLM benchmarks (the standardized tests used to compare models), exploring two canonical examples: MMLU ...

Recommended Episodes

AI Today Podcast: Overview of Synthetic Data
AI Today Podcast

Machine learning algorithms need examples of data from which they can learn, especially supervised machine learning algorithms. However, one big challenge for those looking to put machine learning into practice is the lack of a sufficient quantity of good quality data examples fr ...

MLG 004 Algorithms - Intuition
Machine Learning Guide

Machine learning consists of three steps: prediction, error evaluation, and learning, implemented by training algorithms on large datasets to build models that can make decisions or classifications. The primary categories of machine learning algorithms are supervised, un ...


Rust and machine learning #4: practical tools (Ep. 110)
Data Science at Home

In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly.

To make a comparison with the Python ecos ...


MLG 001 Introduction
Machine Learning Guide

Show notes: ocdevel.com/mlg/1. MLG teaches the fundamentals of machine learning and artificial intelligence. It covers intuition, models, math, languages ...
