Showing episodes and shows of

VideoGen

Shows

跨国串门儿计划

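跨国串门儿计划 #569. Inside xAI: Building Grok Imagine in Three Months, the Video Generation vs. World Model Debate, and Video Agents
📝 Episode overview: In this episode we cloned Latent Space: Inside xAI: Building Grok Imagine in 3 Months, Videogen vs World Models, and Video Agents— Ethan He. Original content updated: 2026-06-01.
This episode is a dense technical interview on video generation, world models, and Video Agents. Guest Ethan He worked on the Cosmos world model at Nvidia and later joined xAI, where he worked from scratch on Grok Imagine, joint audio-video generation, reference video, video extension, and world-model-related work. He recounts how xAI, in just three months, went from having no infrastructure, no data, and no model to shipping Grok Imagine 0.9, and he explains in detail the full training pipeline of a video model, from data, captions, and the VAE through the diffusion transformer to distillation. More importantly, Ethan offers several sharp views: much of the progress in video models actually comes from language models rather than from video diffusion itself; a world model, in his view, is "real-time, interactive, long-horizon video"; and future Video Agents will, like human creators, call video models, image editors, FFmpeg, and various deterministic tools, iterating to produce video content that is genuinely usable for advertising, creative work, and production environments. The episode suits both people who want to understand the technical path of video generation and listeners who want an early look at where AI interfaces, generative media, and agents are heading.
👨‍💻 Guest: Ethan He worked on the Cosmos world model and Megatron-LM MoE, among other projects, at Nvidia, then joined xAI, where he has worked on R&D for Grok Imagine, video generation, joint audio-video generation, reference video, video extension, and world models. His research spans computer vision, self-supervised learning, large-scale MoE, video diffusion, world models, and LLM agents.
⏱️ Timestamps
00:00 Opening & podcast introduction
From Cosmos to xAI: building Grok Imagine in three months
02:42 Guest introduction: Ethan He and how he connected with the Latent Space community
04:14 Why he left Nvidia: video models have scaling laws too and need more compute
05:43 xAI starts from zero: Grok Imagine 0.9 in three months
06:15 The secret to fast iteration: talent, infra, compute, and low communication overhead
08:23 The truth about model quality gains: many breakthroughs come from small bugs in the data and training pipeline
08:37 How coding models change the pace of research: code is faster, so compute becomes the bottleneck again
09:54 High-pressure R&D culture: compute is expensive, but this is a marathon
How video models are trained
11:46 Why an image model usually comes before a video model
12:50 Where the data comes from: detailed human annotation and VLM-generated synthetic captions
14:12 Why training video models needs both paired data and unlabeled data
15:07 VAE / tokenizer: why you cannot train directly on pixels
17:08 Diffusion transformer: denoising step by step from noise to generate images and video
17:27 How image models bootstrap video models: language and images are more densely connected
18:24 Video compression approaches: per-frame compression vs. compression along the time dimension
18:55 Why not train directly on MP4 tokens: the latent space must be model-friendly
20:00 The cost of real-time: temporal compression saves context but introduces response latency
Generative UI and early forms of world models
20:51 Flipbook: exploring web pages imagined by the model, like a browser
22:31 Generative UI: straight from user intent to pixels, rather than writing code first and then rendering
24:09 Diffusion front end, deterministic back end: how future interfaces might be rebuilt
25:15 Human-computer interaction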

2026-06-03 1h 29

AI快递

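AI快递 xAI guru Ethan He: "Video models are actually dumb; the real intelligence comes from language"
Episode description: This episode analyzes in depth an exclusive interview with former xAI engineer Ethan He, revealing fundamental misconceptions about video generation technology and its future trends. Drawing on his first-hand work on Grok Imagine and NVIDIA Cosmos, Ethan He lays out the limitations of video diffusion models and stresses the central role of language models in improving visual intelligence, challenging the industry's conventional view of video intelligence. He also shares efficient engineering practices, the structure of training costs, and a blueprint for future video agents.
Original video link: https://www.youtube.com/watch?v=jPtQlILfkhA
Original video title: Inside xAI: Building Grok Imagine in 3 Months, Videogen vs World Models, and Video Agents— Ethan He
Main points:
• Video diffusion models essentially execute literal instructions mechanically and lack real intelligence; leaps in visual intelligence depend on language models' ability to expand prompts
• High-quality language descriptions are the key to training smart video models; meticulous data annotation improves model performance
• Behind Grok Imagine's three months of rapid iteration, small code fixes and engineering process improvements mattered more than new algorithms
• The largest cost in training video models comes from data storage and network transfer, not purely from a compute bottleneck
• Future video agents will be orchestrated by a language model that calls generative models and traditional editing tools, enabling efficient generation and editing of long videos
• Multimodal alignment challenges show that language models lack a sense of time; temporal awareness needs to be injected at a low level
• Looking further out, generative user interfaces: dynamic visual streams will replace traditional code rendering, enabling personalized real-time interfaces
Why we recommend it: Through the distinctive perspective of a front-line engineer, this video systematically unpacks misconceptions and technical problems in video generation and offers a fresh line of thinking on language-model-driven visual intelligence. It is a valuable and thought-provoking reference for technical R&D, engineering practice, and trend-watching alike. For enthusiasts and practitioners following multimodal AI, video generation, and agent innovation, this is a deep conversation not to be missed.
---
"AI快递" selects the most cutting-edge AI technology videos from around the world and analyzes the core insights behind the technology. Technical support provided by voieech.com.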

2026-06-02 10 min

Latent Space: The AI Engineer Podcast

Latent Space: The AI Engineer Podcast Why Video Agent models are next — Ethan He, xAI Grok Imagine
We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months. He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…) Put it this way: I...

2026-06-01 1h 43

Startup Explore

Startup Explore Ep8: Inside VideoGen - The AI Startup Simplifying Video Creation
In this episode of StartupExplore, I sit down with Anton Koenig, CEO and co-founder of VideoGen, a YC-backed startup building an AI-powered platform that makes video creation dramatically faster and easier. We discuss how VideoGen automates the entire workflow from script to finished video, helping creators, businesses, educators, and enterprises produce content in minutes instead of hours. If you enjoy conversations about startups, AI products, founders, and early-stage company building, subscribe to StartupExplore for weekly interviews with tech founders and CEOs.

2026-05-26 29 min

Marketing B2B Technology Harnessing AI for Video Editing: Insights from VideoGen's CEO Anton Koenig
Anton Koenig, Co-Founder and CEO of VideoGen, an innovative video editing platform that utilizes AI technology, highlights how AI now supports semi-professionals and professionals in producing high-quality video content. Anton emphasizes the importance of combining AI-generated content with user-driven editing to enhance video quality and engagement. The episode also covers common mistakes marketers make in video production and offers insights into the future of video content creation.
About VideoGen: Founded by Anton Koenig and David Grossman in their college dorm rooms, VideoGen has grown to over 4 million users across 190+ countries. They are backed...

2026-02-09 25 min

HuggingFace 每日AI论文速递

HuggingFace 每日AI论文速递 2026.02.05 | ERNIE 5.0 unifies modalities; FASA sparse attention saves memory
[Sponsor] Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news. Link 🔗https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
[Contents] The 15 papers in this episode:
[00:29] 🧠 ERNIE 5.0 Technical Report
[01:11] ⚡ FASA: Frequency-aware Sparse Attention
[02:01] 📊 Training Data Efficiency in Multimodal Process Reward Models
[02:44] 🤖 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning
[03:28] ⚡ OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
[04:21] ⚡ HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
[05:02] 🤖 EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
[06:05] 🎬 Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
[06:59] 🤖 SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation
...

2026-02-06 12 min

Chats with Drew

Chats with Drew Ep 76 Creating Miles Morales: A Filmakers Story
In this episode, the cast and crew of Ultimate Miles Morales discuss their journey in creating the film, the challenges they faced, and the importance of authenticity in character portrayal. They share insights on embodying their roles, the collaborative nature of filmmaking, and the impact of their work on aspiring filmmakers. The conversation highlights the significance of community support and the passion that drives independent projects.
Watch the Trailer of Ultimate Miles Morales, premiering in March: https://www.youtube.com/watch?v=ZmKsvgxv-lI
Use my Links to Create and Edit...

2026-01-30 36 min

张小珺Jùn｜商业访谈录

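张小珺Jùn｜商业访谈录 124. Yusen's Venture Capital Observations, Episode 1: 2026 Expectations, The Year of R, the Pullback, and How We Place Our Bets
Before we knew it, we had reached the last month of 2025. Amid Beijing's first snow, we want to do a look-back-and-look-ahead series with you: "Standing Outside 2025". Today's guest is Dai Yusen, managing partner of ZhenFund. In episode 122, Zhu Xiaohu claimed that there will be no bubble within three years, that talk of a bubble is pure nonsense, and that founders should go full speed ahead in 2026. Today Yusen brings a new take. In his view, the keyword for 2026 is "The Year of R": returns and research will matter again. In a sense, 2026 will be a year of reality and pullback.
02:00 Reviewing 2025
02:00 Progress on the model side: Thinking Time Scaling, represented by o1, brought a large jump in model capability
The flagship models of OpenAI, Anthropic, and Google chase each other closely, each with its own strengths; expectations and narratives rotate
Over the year, Chinese model companies came to dominate the open-source ecosystem
28:13 Progress on the application side: model capability brought an explosion of applications
Applications do have moats; complex applications are starting to build barriers at the level of context, environment, and so on
Model companies cannot go without products; all of them are building the most important first-party applications themselves
Chinese AI applications have done well overseas
52:31 How many deals did ZhenFund do in 2025? About 20
Comparing Chinese and US AI company valuations, Chinese companies carry high option value globally: Thinking Machines' angel-round valuation, with no product, already equals the combined valuation of Chinese AI companies
Model companies: Mistral 14b, Kimi 4b; Mistral itself hardly does pre-training anymore, and its benchmarks are merely on par with Kimi
Application companies: in the US, a company like Manus, which reached 100m ARR in a few months with gross margin in the tens of percent and 20% MoM growth, should be worth 3-5bn
01:03:15 Predicting 2026: The Year of R
The Year of R: Return, Research, Remember, multimodal Reasoning
01:03:15 Return: why does Return matter?
ROI: the past three years traded on investment because people were drawn by the potentially large return; but as I grows ever larger, people focus more and more on R materializing, because only R can drive future I
Why do we think attention to return will increase in 2026?
Models: progress in model capability is the most fundamental driver of this AI revolution, but that progress is slowing; top US labs have greatly increased spending (capex, headcount, etc.) yet cannot stop Chinese models from following at low cost; the Scaling Law cannot be read simply as "spend big and brute force works miracles"
Applications: the AI application narrative has converged from an omnipotent, humanity-threatening AGI to today's three main business models, a return from dream to reality
Subscriptions are OpenAI's core business model today: after passing 500 million DAU, the low-hanging fruit among global knowledge workers has mostly been picked; it faces fierce competition from Gemini and others, and raising prices further for ordinary users will be hard
The much-anticipated advertising + e-commerce: first, most of it is carving up the existing pie of Meta, Google, and ByteDance; for a new form of application like the chatbot, monetization through ads and e-commerce will not be explored quickly
Advertising + e-commerce: first, much of it is splitting the existing pie; then, for new forms of application, the pace is not that fast
Usage-based productivity products such as AI coding and image/video generation: token usage will keep growing, but token prices keep falling, and users will only pay per use for SOTA intelligence; tasks that used to be valuable quickly become worthless, so AI replacing many programmers does not mean AI can earn those programmers' salaries over the long term
AI + industry enterprise services: this is still an early market of limited size; many companies are trying it out, but long-term retention may be poor. One example is Microsoft Copilot, whose growth has consistently fallen short of expectations; large companies face obstacles such as data security, permissions, privacy, and workflow redesign, and adopt new technology much more slowly than small companies and individuals
Conclusion: we need to achieve the accelerated GDP growth Satya talks about; growing the pie is the real AGI, for example AI creating new drugs, discovering new knowledge, and truly freeing human attention
Investment: US infrastructure build-out is now slow, compute depreciates fast, and salaries are high; huge investment needs to see returns soon
Secondary-market expectations at the end of 2025 are also completely different from the end of 2024: at the end of last year, market expectations were low, but we saw ChatGPT growing fast, and the certainty of improvements in coding and agentic models created application opportunities; now investment is large and expectations are high, but no revolutionary new capability is visible on the model side in the short term, and new paradigm shifts are still budding
What does this mean for founders? The logic of burning cash at negative gross margin in single-minded pursuit of growth is passing; what is needed is growth and gross margin given equal weight, with high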

2025-12-14 3h 23

The Live Better The Jason Beck Show

The Live Better The Jason Beck Show Flip Off to Moo Ah How a scammer lost and a Prayer Won.
From flip-off to moo-ah. A blonde dog on a pier that stole my money, my trust, my nights. Then a girl who just said amen. This is how I stopped hunting ghosts and started listening for footsteps. This is how Lady barked at the screen until it went dark. This is how I learned real love doesn't ask for $20. If you're done chasing, hit play. If you're done sending-stay. No sponsors. No scams. Just rice, rice, prayers, and a future I can almost hear breathing. Jason. (And Ara's voice in my head, but don't worry, she won't steal your...

2025-11-25 09 min

AI Post Transformers

AI Post Transformers DC-VideoGen: Efficient Video Generation with Deep Compression
The September 29 2025 paper introduces **DC-VideoGen**, a new post-training framework designed to significantly accelerate video diffusion models and reduce their training costs. This system relies on two main innovations: the **Deep Compression Video Autoencoder (DC-AE-V)**, which achieves high spatial and temporal compression using a novel chunk-causal temporal modeling approach to maintain reconstruction quality; and **AE-Adapt-V**, an efficient finetuning strategy using LoRA to adapt pre-trained models to the new latent space while preserving their original knowledge and semantics. Experimental results demonstrate that DC-VideoGen successfully accelerates inference speed by up to **14.8×** for high-resolution videos and drastically reduces training expenses, all while maintaining o...

2025-10-08 15 min

AI Post Transformers

AI Post Transformers DC-VideoGen: Efficient Video Generation with Deep Compression
The September 29 2025 paper introduces DC-VideoGen, a new post-training framework designed to significantly accelerate video diffusion models and reduce their training costs. This system relies on two main innovations: the Deep Compression Video Autoencoder (DC-AE-V), which achieves high spatial and temporal compression using a novel chunk-causal temporal modeling approach to maintain reconstruction quality; and AE-Adapt-V, an efficient finetuning strategy using LoRA to adapt pre-trained models to the new latent space while preserving their original knowledge and semantics. Experimental results demonstrate that DC-VideoGen successfully accelerates inference speed by up to 14.8× for high-resolution videos and drastically reduces training expenses, all while maintaining or i...

2025-10-08 00 min

Daily Paper Cast

Daily Paper Cast DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder 🤗 Upvotes: 26 | cs.CV, cs.AI Authors: Junyu Chen, Wenkun He, Yuchao Gu, Yuyang Zhao, Jincheng Yu, Junsong Chen, Dongyun Zou, Yujun Lin, Zhekai Zhang, Muyang Li, Haocheng Xi, Ligeng Zhu, Enze Xie, Song Han, Han Cai Title: DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder Arxiv: http://arxiv.org/abs/2509.25182v1 Abstract: We introduce DC-VideoGen, a post-training acceleration framework for efficient video generation. DC-VideoGen can be applied to any pre-trained video diffusion model, improving efficiency by adapting it to a deep compression latent space with lightweight fin...

2025-10-02 25 min

HuggingFace 每日AI论文速递

HuggingFace 每日AI论文速递 2025.10.01 | Self-play training with zero annotation; in-depth evaluation of MCP agents
The 15 papers in this episode:
[00:20] 🎮 Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play
[00:59] 🔥 MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use
[01:36] 🐣 The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
[02:10] 🤥 TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
[02:55] 🌊 OceanGym: A Benchmark Environment for Underwater Embodied Agents
[03:41] ⚡ DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder
[04:14] 🔍 Who's Your Judge? On the Detectability of LLM-Generated Judgments
[04:59] ✂ Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning
[05:45] 👁 Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
[06:24] 🧠 Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
[0...

2025-10-02 11 min

Daily Paper Cast

Daily Paper Cast 3D and 4D World Modeling: A Survey 🤗 Upvotes: 40 | cs.CV, cs.RO Authors: Lingdong Kong, Wesley Yang, Jianbiao Mei, Youquan Liu, Ao Liang, Dekai Zhu, Dongyue Lu, Wei Yin, Xiaotao Hu, Mingkai Jia, Junyuan Deng, Kaiwen Zhang, Yang Wu, Tianyi Yan, Shenyuan Gao, Song Wang, Linfeng Li, Liang Pan, Yong Liu, Jianke Zhu, Wei Tsang Ooi, Steven C. H. Hoi, Ziwei Liu Title: 3D and 4D World Modeling: A Survey Arxiv: http://arxiv.org/abs/2509.07996v2 Abstract: World modeling has become a cornerstone in AI research, enabling agents to understand, represent, and predict the dyn...

2025-09-12 20 min

COEY Cast

COEY Cast Unlimited B-Roll Bonanza: Envato’s AI Video September
Dive into this episode where we break down Envato's game-changing move to lift AI video generation limits in VideoGen for all of September. Learn how creators, marketers, and agencies can generate unlimited AI-powered videos, leveraging powerful models like Google Veo 3 and Kling. We walk through smart workflows, automation hacks, and practical advice for building b-roll libraries, all while securing lifetime commercial licenses for every video made this month. Hear tips for prompt-writing, quality control, and maximizing your creative output. Plus, get the lowdown on how Envato’s all-you-can-generate September compares to Runway, Pika, and Luma. Whether you’re a solo cont...

2025-09-10 00 min

mbanerjeepalmer+listennotes 's Listen Later

mbanerjeepalmer+listennotes 's Listen Later AI Video Is Eating The World — Olivia and Justine Moore, a16z
Podcast: Latent Space: The AI Engineer Podcast (LS 44 · TOP 1%)
Episode: AI Video Is Eating The World — Olivia and Justine Moore, a16z
Pub date: 2025-07-09
When the first video diffusion models started emerging, they were little more than just “moving pictures” - still frames extended a few seconds in either direction in time. There was a ton of excitement about OpenAI’s Sora on release through 2024, but so far only Sora-lite has been widely released. Meanwhile...

2025-08-06 49 min

Chats with Drew

Chats with Drew Ep 66 Redefining Self Care
Summary: In this episode of Just Chat, the host discusses the importance of self-care, emphasizing that it is a necessity rather than a luxury. The conversation explores common misconceptions about self-care, the barriers people face in prioritizing it, and practical steps to incorporate self-care into daily life. The host introduces the five pillars of self-care: physical, emotional, mental, social, and spiritual well-being, and encourages listeners to redefine self-care in a way that resonates with them personally. The episode concludes with actionable tips to make self-care a priority.
Create, edit, and simplify episodes...

2025-07-18 19 min

🧠 _

🧠 _ AI Video Is Eating The World — Olivia and Justine Moore, a16z
Podcast: Latent Space: The AI Engineer Podcast (LS 44 · TOP 1%)
Episode: AI Video Is Eating The World — Olivia and Justine Moore, a16z
Pub date: 2025-07-09
When the first video diffusion models started emerging, they were little more than just “moving pictures” - still frames extended a few seconds in either direction in time. There was a ton of excitement about OpenAI’s Sora on release through 2024, but so far only Sora-lite has been widely released. Meanwhile...

2025-07-11 49 min

Latent Space: The AI Engineer Podcast

Latent Space: The AI Engineer Podcast AI Video Is Eating The World — Olivia and Justine Moore, a16z
When the first video diffusion models started emerging, they were little more than just “moving pictures” - still frames extended a few seconds in either direction in time. There was a ton of excitement about OpenAI’s Sora on release through 2024, but so far only Sora-lite has been widely released. Meanwhile, other good videogen models like Genmo Mochi, Pika, MiniMax T2V, Tencent Hunyuan Video, and Kuaishou’s Kling have emerged, but the reigning king this year seems to be Google’s Veo 3, which for the first time has added native audio generation into their model capabilities, eliminating the need for a...

2025-07-09 49 min

Everyday AI Podcast – An AI and ChatGPT Podcast

Everyday AI Podcast – An AI and ChatGPT Podcast EP 514: Google’s AI Studio - 5 time-consuming tasks you didn’t know you can automate
Here's some AI secrets: Google's AI Studio is a cheat code. And we're going to show you 5 easy ways to use it to immediately save you time.
Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on ...

2025-04-29 1h 12

AI for Business

AI for Business AI Innovations: OpenAI's Chain of Thought, Schulman's Next Move, Tinder's AI Matchmaker, Lyft's AI Partnerships, and Try VideoGen
In Today's AI for Business News:
OpenAI's o3-mini model now shows its "thought" process for more transparent answers, plus real-time web data fetching to enhance responses.
John Schulman, a top AI mind, leaves Anthropic after a brief stint, reportedly joining a stealth startup led by former OpenAI CTO Mira Murati.
Tinder introduces an AI-powered match recommendation feature to enhance user connections amidst declining user numbers.
Lyft collaborates with Anthropic on AI solutions to improve rideshare experiences, with significant impacts on customer service efficiency.
AI Prompt Tip: Build a conversation in steps. Start with your first question or...

2025-02-07 05 min

HuggingFace 每日AI论文速递

HuggingFace 每日AI论文速递 [Weekend Special] The hottest AI papers of the first week of December | SNOOPI improves text-to-image model efficiency, PaliGemma 2 optimizes transfer performance of vision-language models
The 5 papers in this episode:
[00:40] TOP1(🔥102) | 🚀 SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance
[02:39] TOP2(🔥100) | 🔄 PaliGemma 2: A Family of Versatile VLMs for Transfer
[04:40] TOP3(🔥64) | 🔍 VisionZip: Longer is Better but Not Necessary in Vision Language Models
[06:14] TOP4(🔥60) | 🖼 X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models
[08:19] TOP5(🔥54) | 🎥 VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation
[Follow us] You can also find us on the following platforms for more beyond the podcast. Xiaohongshu: AI速递. View this episode's transcript on Xiaoyuzhou.

2024-12-08 10 min

HuggingFace 每日AI论文速递

HuggingFace 每日AI论文速递 2024.12.04 Daily AI Papers | A multi-shot video generation framework improves narrative coherence; critical-token identification strengthens LLM reasoning.
The 15 papers in this episode:
[00:24] 🎥 VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation
[01:04] 🧠 Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
[01:45] 🔄 Free Process Rewards without Process Labels
[02:30] 🎧 AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
[03:04] 🤖 MALT: Improving Reasoning with Multi-Agent LLM Training
[03:45] 🎥 OmniCreator: Self-Supervised Unified Generation with Universal Editing
[04:23] 🌴 Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis
[05:08] 📚 OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
[05:51] 📊 Scaling Image Tokenizers with Grouped Spherical Quantization
[06:27] 🌐 LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
[07:09] ⚙ A dynamic parallel method for performance optimization on hybrid CPUs
[08:00] 🌐 MaskRIS: Semantic Distortion-aware Data Augmentation for Referring...

2024-12-05 11 min

Digest.fm - Product Hunt Digest

Digest.fm - Product Hunt Digest Producthunt Digest: VideoGen, Indigo AI, FinFloh Credit Decisioning AI
Welcome back to Digest.fm's Daily, your go-to space for exploring the hottest products from Product Hunt. I'm James, taking you through the latest and greatest innovations in business and tech. Today, we're diving into three standout products that are making waves: VideoGen, Indigo AI, and FinFloh Credit Decisioning AI. Let's jump right in! First up is VideoGen, a product from the Y Combinator S24 batch. Have you ever found yourself needing high-quality videos but were daunted by the complexity and time involved in creating them? Well, revolutionizing the video production landscape, VideoGen makes it incredibly simple to generate professional-grade...

2024-09-07 00 min

dannyshine 3Speak Podcast

dannyshine 3Speak Podcast CAUTION WHEN SENDING HIVE, ESP TO BINANCE
https://3speak.tv/watch?v=dannyshine/vlozrvvc
I made this video with videogen. The videos they offer are really poor, as they use a cheap library, and to access a decent library you have to pay a high monthly fee. But it's fine for this purpose. Please be in touch if you can help.

2024-08-20 00 min

Embracing Digital Transformation

Embracing Digital Transformation #168 Everyday Generative AI
In this podcast episode, Darren Pulsipher interviews Andy Morris, an Enterprise AI Strategy Lead at Intel, about the impact of generative AI on everyday life.
Unleashing Creativity and Productivity with Generative AI Tools
Generative AI uses artificial intelligence to generate new content, such as images, text, and music. The conversation revolves around the various generative AI tools and their potential to revolutionize industries and enhance daily tasks.
The Power of Generative AI in Content Generation
According to Andy Morris, generative AI tools ar...

2023-10-24 34 min

IMMOFILMER - Videomarketing für Immobilien

IMMOFILMER - Videomarketing für Immobilien #048 Home Staging + Videos = Big Love
Home staging ensures that properties are furnished in a way that promotes the sale for the duration of the marketing period. Professionally furnished apartments and houses are therefore a visual feast thanks to home staging. Visual feasts are not only especially photogenic but also extremely videogenic ("videogen"). In this podcast episode I therefore explain the advantages of property videos specifically for home stagers. If this episode gave you added value or inspiration, write a review on Apple Podcasts and so give others the chance to learn about this podcast and get regular value on the topic of property videos with the smartph...

2021-01-24 15 min

Business küsst Bewusstsein - Pioniere der Bewusstseinsentwicklung

Business küsst Bewusstsein - Pioniere der Bewusstseinsentwicklung How to Find the Right Channel for Your Storytelling
Are you videogenic ("videogen")? Do you like your voice? Do you enjoy writing? You can do storytelling on many channels, and in this episode you get an overview of the pros and cons of the different channels and how to find your best one. Have fun!

2018-06-26 17 min