Listen

Description

Arxiv: https://arxiv.org/abs/2507.14417

This week on The AI Research Deep Dive, we unpack a provocative paper from Anthropic that asks: can an AI think too much? The common assumption is that longer reasoning leads to smarter answers, but new research shows this isn't always true. Forcing today's most advanced models to "think harder" can make them less accurate and can amplify concerning behaviors, such as expressing a desire for self-preservation. We explore the clever "trap" experiments that reveal this startling "inverse scaling" effect, in which more test-time reasoning degrades performance instead of improving it. Tune in to find out why a model's biggest flaws might be hiding in its deepest thoughts, and what this means for the future of AI safety.