
Description

【Sponsor】

Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.

Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd

【Contents】

The 15 papers in this episode are:

[00:33] 🚀 OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

[01:17] 💻 Code2World: A GUI World Model via Renderable Code Generation

[02:05] 🤖 UI-Venus-1.5 Technical Report

[02:58] 🧠 Chain of Mindset: Reasoning with Adaptive Cognitive Modes

[03:52] 🧠 SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

[04:29] 🔬 P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

[05:24] 🤖 Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

[05:58] 🔍 Prism: Spectral-Aware Block-Sparse Attention

[06:41] ⚡ DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents

[07:23] 🎬 Olaf-World: Orienting Latent Actions for Video World Modeling

[08:18] 🎨 Condition Errors Refinement in Autoregressive Image Generation with Diffusion Loss

[09:09] 🍌 Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling

[09:50] 🎯 SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models

[10:37] 🤖 BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation

[11:31] 🎬 TokenTrim: Inference-Time Token Pruning for Autoregressive Long Video Generation

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast itself:

Xiaohongshu: AI速递