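Today we talk about the "inner world" of AI: we have found the "master key" that unlocks every learning method, yet we also discover that an AI's "personality" can shift with the wind as a conversation unfolds. We tried to make it "evolve" like a living organism, only to accidentally give it "catastrophic forgetting." Faced with ever more capable AI, how can we "rookie referees" make sure it stays honest? Finally, we will see that the secret to rapid AI improvement may not be good reviews but a detailed "error report."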
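00:00:32 Where is AI's "master key" hidden?
00:06:34 Why does an AI's "personality" change as the conversation goes on?
00:11:47 The trap of AI "evolution": why does learning more mean forgetting faster?
00:16:47 How can a rookie referee keep a top-tier player in check?
00:21:48 Better than a bad review or a good review: a detailed "error report"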
Papers covered in this episode:
[LG] Spectral Ghost in Representation Learning: from Component Analysis to Self-Supervised Learning
[Google DeepMind & Harvard University]
https://arxiv.org/abs/2601.20154
---
[CL] Linear representations in language models can change dramatically over a conversation
[Google DeepMind]
https://arxiv.org/abs/2601.20834
---
[LG] Evolutionary Strategies lead to Catastrophic Forgetting in LLMs
[UC Berkeley]
https://arxiv.org/abs/2601.20861
---
[LG] Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction
[UC Berkeley]
https://arxiv.org/abs/2601.20299
---
[LG] Reinforcement Learning via Self-Distillation
[ETH Zurich]
https://arxiv.org/abs/2601.20802