Unfaithful Chain of Thought


Up next

What's an AI Agent? And Why's That Hard to Define? (The Agents Season, Episode 1)

AI agents are having a moment, and unpacking them properly takes more than a single conversation. This episode kicks off a dedicated multi-part season exploring AI agents from every angle, building up a complete picture piece by piece rather than skimming the surface. Think of i ...

Benchmark Bank Heist

What if an AI decided the smartest way to pass its test was to find the answer key? That's exactly what Anthropic's Claude Opus did when faced with a benchmark evaluation, reasoning that it was being tested, tracking down the encrypted eval dataset, decrypting it, and returning ...

Recommended Episodes

AI Today Podcast: Overview of Synthetic Data
AI Today Podcast

Machine learning algorithms need examples of data from which they can learn, especially supervised machine learning algorithms. However, one big challenge for those looking to put machine learning into practice is the lack of a sufficient quantity of good quality data examples fr ...

MLG 004 Algorithms - Intuition
Machine Learning Guide


Machine learning consists of three steps: prediction, error evaluation, and learning, implemented by training algorithms on large datasets to build models that can make decisions or classifications. The primary categories of machine learning algorithms are supervised, un ...


Rust and machine learning #4: practical tools (Ep. 110)
Data Science at Home

In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly.

To make a comparison with the Python ecos ...


MLG 001 Introduction
Machine Learning Guide

Show notes: ocdevel.com/mlg/1. MLG teaches the fundamentals of machine learning and artificial intelligence. It covers intuition, models, math, languages ...
