Description

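We all want AI to keep getting smarter, but how does it actually "get it"? In this episode, we go inside the AI's brain to see how it keeps its own "mistake notebook" for reflecting in the middle of an exam, and how it uses "self-hinting" to break through learning bottlenecks. We also look at the hidden cost behind AI "thinking", and at a smarter reward mechanism that makes AI favor tackling hard problems. Finally, we see how all of this turns AI from a tool into a real "research partner".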

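00:00:32 Your mistake notebook: AI has learned to keep one too

00:05:36 Your next research partner may not be human

00:12:57 Why AI sometimes "plays dumb": the hidden cost behind compute

00:19:22 What to do when AI's learning gets stuck? Let it give itself a hint

00:23:55 The "favor the weak students" rule of AI training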

Papers covered in this episode:

[CL] Test-time Recursive Thinking: Self-Improvement without External Feedback

[Microsoft Research]

https://arxiv.org/abs/2602.03094

---

[CL] Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

[Google Research]

https://arxiv.org/abs/2602.03837

---

[LG] Reasoning about Reasoning: BAPO Bounds on Chain-of-Thought Token Complexity in LLMs

[Microsoft Research & Netflix]

https://arxiv.org/abs/2602.02909

---

[LG] Self-Hinting Language Models Enhance Reinforcement Learning

[Microsoft Research]

https://arxiv.org/abs/2602.03143

---

[LG] Maximum Likelihood Reinforcement Learning

[CMU & Tsinghua University & Zhejiang University]

https://arxiv.org/abs/2602.02710