The 13 papers covered in this episode:
[00:27] 🧠 Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding
[01:07] 🎬 InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion
[01:46] 🤖 MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
[02:22] 👁 UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture
[03:04] 🎨 ProEdit: Inversion-based Editing From Prompts Done Right
[03:58] ⏱ TimeBill: Time-Budgeted Inference for Large Language Models
[04:37] 🧠 See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
[05:16] 🌦 Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding
[05:48] 🧠 SVBench: Evaluation of Video Generation Models on Social Reasoning
[06:27] 🔍 InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search
[07:15] 🎨 SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
[08:11] 🤖 SWE-RM: Execution-free Feedback For Software Engineering Agents
[08:48] ⚡ A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication

[Follow Us]
You can also find us on the following platform for more beyond the podcast:
Xiaohongshu: AI速递