“Kimi K2 Thinking” by Zvi

Description

I previously covered Kimi K2, which now has a new thinking version. As I said at the time, back in July: price in that the thinking version is coming.

Is it the real deal?

That depends on what level counts as the real deal. It's a good model, sir, by all accounts. But there have been fewer accounts than we would expect if it were a big deal, and it doesn’t fall into any of my use cases.

Introducing K2 Thinking

Kimi.ai: Hello, Kimi K2 Thinking!

The Open-Source Thinking Agent Model is here.

SOTA on HLE (44.9%) and BrowseComp (60.2%)

Executes up to 200 – 300 sequential tool calls without human interference

Excels in reasoning, agentic search, and coding

256K context window

Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns.

K2 Thinking is now live on http://kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.

API here, Tech blog here, Weights and code here.

(Pliny jailbreak here.)

It's got 1T parameters, and Kimi and [...]

---

Outline:

(00:34) Introducing K2 Thinking

(02:15) Writing Quality

(03:07) Agentic Tool Use

(04:06) Overall

(05:08) Are Benchmarks Being Targeted?

(06:23) Just As Good Syndrome

(07:02) Reactions

(09:59) Otherwise It Has Been Strangely Quiet

---

First published:

November 11th, 2025

Source:

https://www.lesswrong.com/posts/SLrWSyS3FypLKyRL6/kimi-k2-thinking

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Bar graph showing AI model performance on telecom benchmark with agentic tool use.

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
