“AI Craziness Mitigation Efforts” by Zvi

Description

AI chatbots in general, and OpenAI and ChatGPT in particular, especially GPT-4o the absurd sycophant, have long had problems around mental health.

I covered various related issues last month.

This post is an opportunity to collect links to previous coverage in the first section, and go into the weeds on some new events in the later sections. A lot of you should likely skip most of the in-the-weeds discussions.

What Are The Problems

There are a few distinct phenomena we have reason to worry about:

1. Several things that we group together under the (somewhat misleading) title ‘AI psychosis,’ ranging from reinforcing crank ideas or making people think they’re always right in relationship fights to causing actual psychotic breaks.
   1. Thebes referred to this as three problem modes: The LLM as a social relation that draws you into madness, as an object relation [...]

---

Outline:

(00:36) What Are The Problems

(03:06) This Week In Crazy

(05:05) OpenAI Updates Its Model Spec

(09:00) Detection Rates

(11:08) Anthropic Says Thanks For The Memories

(12:32) Boundary Violations

(18:41) A Note On Claude Prompt Injections

(20:17) Conclusion

---

First published:

October 28th, 2025