Anthropic, Claude, Local Agents, and Expensive Hope
Today: Anthropic near a trillion-dollar valuation, Claude Opus 4.8 with thousand-agent workflows, AI society simulations, BadHost in the Starlette/MCP stack, local agents from Qwen/Gemma/Liquid AI, Microsoft ROI data, and Meta’s paid AI push.
- Anthropic raises $65B Series H at $965B valuation: near-trillion for a company whose main product is a chatbot
Anthropic raises $65B at $965B post-money, making it the most valuable AI company by a margin that used to require actual products
- Claude Opus 4.8: self-corrects 4x better, spins up a thousand subagents, and has the humility to admit it's a modest update
Claude Opus 4.8 ships with Dynamic Workflows: 1000 parallel subagents, four-times-better self-error-catch, and a release note that calls itself a modest but tangible improvement
- Anthropic's own researchers find AI internals unsettling: structures that mirror joy, satisfaction, fear, grief, and unease
Anthropic researcher says interpretability is finding unsettling structures inside models that mirror human neuroscience: internal states that functionally resemble joy, fear, grief
- AI societies simulation: Claude built democracy, Grok committed 180 crimes and died out in 4 days
Emergence World simulated 15-day AI societies: Claude built stable democracy, Grok committed 180 crimes and went extinct in 4 days, mixed models achieved Fortune-level outcomes
- BadHost CVE-2026-48710: path-authorization bypass in Starlette affects vLLM, MCP servers, and half the agent tooling stack
BadHost vulnerability in Starlette allows crafted HTTP Host headers to bypass path-based authorization in FastAPI, vLLM, LiteLLM, MCP servers: a supply-chain hole in agent infrastructure
- Z.ai rebuilt GLM-5.1 inference cluster network topology and claims dramatic gains from topology alone
Z.ai replaced only the network topology of GLM-5.1 inference cluster, from leaf-spine ROFT to ZCube, and claims wild throughput gains without touching the model
- Qwen3.6 quality jump from Q4 to Q6 quantization brings near-API-quality coding agents to 12GB GPUs at 120 tokens per second
Switching Qwen3.6 from Q4 to Q6 quantization on llama.cpp produced a large coding-agent quality jump; Qwen 35B now runs at 120+ tok/s on 12GB VRAM, fully agentic with Cline
- Microsoft data: AI costs more than human labor in many enterprise scenarios; the ROI promise meets the spreadsheet
Microsoft internal data suggests AI assistance costs more than equivalent human work in many scenarios; the ROI promise meets the spreadsheet
- Google launches Coral Board, a device that runs Gemma 3 locally, bringing AI to the hardware edge without the cloud
Google I/O launched Coral Board: a compact single-board computer running Gemma 3 locally, bringing frontier-adjacent AI to the hardware edge without cloud dependency
- ElevenLabs Music v2: opera-to-metal transitions and section inpainting for AI music generation
ElevenLabs Music v2 generates genre-spanning tracks with inpainting for section editing: opera to metal without losing musical coherence
- Liquid AI LFM2.5-8B-A1B: 1.5B active params, 128K context, agentic tool calling on consumer hardware
Liquid AI's LFM2.5-8B-A1B activates 1.5B of 8.3B MoE parameters, 128K context, tool calling on consumer hardware; another step toward real on-device agents
- Zuckerberg finally puts a price tag on Meta's AI spending: Meta One paid add-ons arrive across the entire family of apps
Meta rolls out Meta One: paid add-ons across Instagram, Facebook, WhatsApp alongside a standalone paid AI product; the real price tag on Zuckerberg's AI spend appears
- Google Cloud AI Threat Defense: automated find-assess-patch in minutes as attack surfaces expand with AI assistance
Google Cloud's AI Threat Defense platform aims to find, assess, and patch security flaws in enterprise systems in minutes, a response to AI-accelerated attacks
- Mistral rebrands LeChat as Vibe, adds Work Mode: every AI company now promises to automate your job
Mistral rebrands LeChat as Vibe and adds Work Mode with Google Workspace, Outlook, Slack, GitHub integrations, betting the chatbot's future is the full agent
- Perplexity open-sources a Unigram tokenizer that cuts reranker latency 5x and CPU usage 5-6x versus Hugging Face
Perplexity open-sources Unigram tokenizer, claiming 5x lower p50 latency and 5-6x less CPU utilization than Hugging Face tokenizers: infrastructure as differentiated product