Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4. There are some impressive claimed benchmarks. As usual, I will wait a few days so others can check it out, then offer my take early next week. This post otherwise won't discuss Grok 4 further.
There are plenty of other things to look into while we wait for that.
I am also not yet covering Anthropic's latest alignment faking paper, which may well get its own post.
---
Outline:
(00:43) Language Models Offer Mundane Utility
(05:08) Language Models Don't Offer Mundane Utility
(06:57) Huh, Upgrades
(07:53) Preserve Our History
(11:18) Choose Your Fighter
(12:36) Wouldn't You Prefer A Good Game of Chess
(14:30) Fun With Media Generation
(14:40) No Grok No
(16:29) Deepfaketown and Botpocalypse Soon
(19:15) Unprompted Attention
(20:11) Overcoming Bias
(22:18) Get My Agent On The Line
(23:40) They Took Our Jobs
(27:59) Get Involved
(28:27) Introducing
(30:11) In Other AI News
(32:28) Show Me the Money
(34:59) The Explanation Is Always Transaction Costs
(37:56) Quiet Speculations
(44:23) Genesis
(46:29) The Quest for Sane Regulations
(52:06) Chip City
(52:28) Choosing The Right Regulatory Target
(01:00:42) The Week in Audio
(01:01:15) Rhetorical Innovation
(01:04:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:10:10) Don't Worry We Have Human Oversight
(01:14:09) Don't Worry We Have Chain Of Thought Monitoring
(01:18:47) Sycophancy Is Hard To Fix
(01:21:43) The Lighter Side
---
First published:
July 10th, 2025
Source:
https://www.lesswrong.com/posts/FczrW2kQ7WxGW39Yv/ai-124-grokless-interlude
---
Narrated by TYPE III AUDIO.
---
Images from the article:
The bottom scatter plot provides an example case showing the distribution of correct and incorrect model predictions, with decision boundaries marked by dashed lines. The plots are accompanied by a comprehensive legend and explanation of what constitutes ideal performance scenarios.
The table details characteristics of four AI models (Opus 4, Opus 3/Classic, GPT-4o, Gemini) across categories including existential style, spiritual mode, infection metaphors, and dissolution language.