The 15 papers in this episode:
[00:22] ⚙ DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
[01:04] 🔍 The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
[01:50] 🎬 Region-Constraint In-Context Generation for Instructional Video Editing
[02:33] 🎥 Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation
[03:08] 🔍 QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation
[03:58] 🤔 Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction
[04:35] 🧭 LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry
[05:13] 🎬 WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion
[06:08] 🔍 UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models
[06:45] 🧬 GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
[07:22] 🎨 Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs
[07:56] ⚡ LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding
[08:38] 📱 MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments
[09:20] ⚖ Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital
[10:00] 🎬 StoryMem: Multi-shot Long Video Storytelling with Memory

【Follow Us】
You can also find us on the following platform, where we share more than what is covered in the podcast
Xiaohongshu: AI速递