Description

Key Points

A focused walkthrough of today’s agentic stack in practice. The episode tests Gemini 2.5 “computer use” for real browser tasks, compares it with Operator and Claude, and breaks down safety guardrails and why screenshot-loop agents remain slow. It covers where computer use fits alongside MCP and OS tools, then shifts to Veo 3.1’s new API features for reference-guided video. On the coding side, it explores Claude Code Plugins and community marketplaces, plus GitHub’s Spec Kit for spec-driven development on large codebases. The discussion touches on Cerebras for ultra-fast inference, DGX Spark for local experiments, and Karpathy’s NanoChat for training compact chat models. It closes with the “Agent Universe” demo: mapping industries via NAICS, generating value-flow diagrams, and turning stages into deployable agent roles, with open questions on architecture, tools, and handoff into real systems.