Description

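Today we talk about AI's “inner world”: we have found the “master key” that unlocks every learning method, yet we also discover that an AI's “personality” can shift with the direction of a conversation. We tried to make it “evolve” like a living organism, and accidentally gave it a case of “catastrophic forgetting.” Faced with ever more capable AI, how can we “rookie referees” make sure it stays honest? Finally, we find that the secret to rapid AI improvement may not be a good review but a detailed “error report.”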

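00:00:32 Where is AI's “master key” hidden?

00:06:34 Why does an AI's “personality” change as the conversation goes on?

00:11:47 The AI “evolution” trap: why does learning more mean forgetting faster?

00:16:47 How can a rookie referee keep a top-tier player in check?

00:21:48 Neither a bad review nor a good review beats a detailed “error report”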

Papers covered in this episode:

[LG] Spectral Ghost in Representation Learning: from Component Analysis to Self-Supervised Learning

[Google DeepMind & Harvard University]

https://arxiv.org/abs/2601.20154

---

[CL] Linear representations in language models can change dramatically over a conversation

[Google DeepMind]

https://arxiv.org/abs/2601.20834

---

[LG] Evolutionary Strategies lead to Catastrophic Forgetting in LLMs

[UC Berkeley]

https://arxiv.org/abs/2601.20861

---

[LG] Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction

[UC Berkeley]

https://arxiv.org/abs/2601.20299

---

[LG] Reinforcement Learning via Self-Distillation

[ETH Zurich]

https://arxiv.org/abs/2601.20802