Description

The 15 papers in this episode are:

[00:20] 🌍 Solar Open Technical Report

[00:54] 🤖 User-Oriented Multi-Turn Dialogue Generation with Tool Use at Scale

[01:39] 🧠 MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

[02:11] 🖱 ShowUI-$π$: Flow-based Generative Models as GUI Dexterous Hands

[02:44] 🧠 KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions

[03:15] 🏆 ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

[04:07] 🧠 Ministral 3 (the Ministral 3 model series)

[04:51] ⚖ The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

[05:31] 🧭 VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory

[06:24] 🎬 End-to-End Video Character Replacement without Structural Guidance

[07:06] 🎬 Motion Attribution for Video Generation

[07:36] 🚀 SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

[08:12] ⚖ JudgeRLVR: Judge First, Generate Second for Efficient Reasoning

[08:46] 📊 Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

[09:25] 🔍 Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking

[Follow Us]

You can also find us on the following platform for more information beyond the podcast content.

Xiaohongshu: AI速递