LessWrong posts by zvi

“AI #126: Go Fund Yourself” by Zvi

Description

The big AI news this week came on many fronts.

Google and OpenAI unexpectedly got 2025 IMO Gold using LLMs under test conditions, rather than a tool like AlphaProof. How they achieved this was a big deal in terms of expectations for future capabilities.

ChatGPT released GPT Agent, a substantial improvement on Operator that makes it viable on a broader range of tasks. For now I continue to struggle to find practical use cases where it is both worth using and a better tool than alternatives, but there is promise here.

Finally, the White House had a big day of AI announcements, laying out the AI Action Plan and three executive orders. I will cover that soon. The AI Action Plan's rhetoric is not great, and from early reports the rhetoric at the announcement event was similarly not great, with all forms of safety considered so [...]

---

Outline:

(01:41) Language Models Offer Mundane Utility

(03:04) Language Models Don't Offer Mundane Utility

(10:48) Huh, Upgrades

(11:40) 4o Is An Absurd Sycophant

(15:05) On Your Marks

(17:32) Choose Your Fighter

(18:12) When The Going Gets Crazy

(22:25) They Took Our Jobs

(26:36) Fun With Media Generation

(27:06) The Art of the Jailbreak

(30:15) Get Involved

(30:58) Introducing

(31:24) In Other AI News

(33:30) Show Me the Money

(40:39) Go Middle East Young Man

(48:21) Economic Growth

(49:40) Quiet Speculations

(57:44) Modest Proposals

(58:36) Predictions Are Hard Especially About The Future

(01:01:52) The Quest for Sane Regulations

(01:06:25) Chip City

(01:07:29) The Week in Audio

(01:07:50) Congressional Voices

(01:09:02) Rhetorical Innovation

(01:14:51) Grok Bottom

(01:18:22) No Grok No

(01:19:06) Aligning a Smarter Than Human Intelligence is Difficult

(01:22:36) Preserve Chain Of Thought Monitorability

(01:30:03) People Are Worried About AI Killing Everyone

(01:31:15) The Lighter Side

---

First published:

July 24th, 2025

Source:

https://www.lesswrong.com/posts/ygND532h4CotfPcp7/ai-126-go-fund-yourself

---

Narrated by TYPE III AUDIO.

---

Images from the article:

WhatsApp chat about AI identity with code snippet about chemical safety.

An academic evaluation report showing a 9.6/10 score with feedback.

ChatGPT conversation showing enthusiastic response about solving the Hodge Conjecture.

Gravestone with business quote about success and bad people by Amodei.

Data table showing rejected posts count by month from 2023-2025, lock icons present.

Line graph showing

Aerial view of Tesla Gigafactory construction site with large orange foundations.

Scatter plot comparing nervousness vs excitement about AI across global regions. Three distinct regional clusters - Anglosphere, Europe, and Asia - show varying attitudes.

Bar graph:

Network server racks with neatly organized yellow and red cable management.

ChatGPT interface showing a response about infidelity and emotional neglect.

Four panels showing AI chat experiments labeled

The image shows experimental results testing different compliance rates when users request the AI to call them names, with varying approaches and response strategies shown in each quadrant.

Pew Research graph comparing AI experts' and public's views on AI's future impact.

The visualization shows stark differences between expert and public opinion, with AI experts being significantly more optimistic (56% positive) compared to U.S. adults (17% positive) about AI's impact over the next 20 years. The graph also shows response categories for

News article screenshot. The headline reads:

Subheading: "Dependence on chatbots for reassurance and 'objective' evaluations of attractiveness can worsen the deepest insecurities"

Article appears to be from Black Mirror, dated July 18, 2025.

Mike Lee tweets:

Highlighted text excerpt discussing performative forgiveness and banality of actions.

Evan tweets:

A user tweets:

Pliny the Liberator tweets:

Screenshot of text discussing potential relationship between Wilson employees and implications.

Lines of code discussing plans about affair info and AI takeover.

Chat message excerpt discussing sending an email and plot storyline details.

Comparison table showing risk metrics across major AI companies: Anthropic, OpenAI, Meta, DeepMind, XAI.

Lines of numbered code showing AI-related discussion and actions.

Chart comparing AI company positions labeled

The visualization shows relative rankings for Anthropic (35%), OpenAI (33%), Meta (22%), DeepMind (20%), and XAI (18%) using geometric patterns.

Rep. Nancy Mace (R-SC) tweets:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.