Description

The 6 papers in this episode are:

[00:19] 🧠 Latent Implicit Visual Reasoning

[00:56] 🎬 Spatia: Video Generation with Updatable Spatial Memory

[01:36] 🧠 Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

[02:11] 🔍 How Much 3D Do Video Foundation Models Encode?

[02:58] 🎯 VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

[03:36] 🚀 GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递