Vatican, AlphaProof, coding agents, auth.md
Today: AI ethics reaches the Vatican, AlphaProof Nexus solves open math problems with Lean-verified proofs, coding agents meet slower engineering discipline and skepticism, attribution hallucination gets benchmarked, and agent auth and token budgets become real infrastructure.
Stories
- At the Vatican launch of an AI encyclical, Anthropic's Christopher Olah argued models show signs of introspection while the document warned they imitate intelligence. — AI ethics enters religious and institutional language while Anthropic argues for model introspection
- Google DeepMind's AlphaProof Nexus solved nine open Erdős problems using Lean verification at a few hundred dollars per problem, though success stayed near 2.5 percent. — formal proof systems turn frontier math into cheap verified search with low hit rates
- A widely discussed essay argued for using AI to write better code more slowly, turning coding assistants into deliberate review partners instead of speed machines. — developers frame AI coding as slower but better review-oriented practice rather than pure acceleration
- George Hotz warned coding agents could become one of software's most costly mistakes because fast prototypes hide increasingly subtle bugs. — coding-agent skepticism hardens around hidden bugs and prototype quality debt
- Researchers introduced CiteVQA to test attribution hallucination, showing AI systems often cite passages that do not support their correct answers. — attribution hallucination becomes a measurable risk even when answers are correct
- OpenAI announced a strategic content partnership with Grupo Folha and Grupo UOL to bring Brazilian journalism into ChatGPT with attribution. — OpenAI expands news licensing and attribution partnerships beyond US and European publishers
- Hugging Face published a glossary for harnesses, scaffolds and other agent terms, trying to make agent discussions less ornamental and more precise. — agent deployment needs shared vocabulary before autonomy can be governed or debugged
- Together AI open-sourced OSCAR, an attention-aware 2-bit KV cache quantization method for long-context LLM serving. — long-context serving pressure pushes KV cache compression into attention-aware 2-bit methods
- WorkOS released auth.md, a proposed Markdown-based protocol for agents to discover registration flows, scopes and credential requirements. — agent authentication shifts from human sign-up pages toward machine-readable registration contracts
- Uber's COO said it is getting harder to justify money spent on AI token usage, turning tokenmaxxing into a finance problem. — enterprise buyers are scrutinizing token burn as AI spending moves from experiment to operating cost
- Scientists trained an AI model using an IBM quantum computer and reported correct answers the base model missed. — quantum-assisted AI claims remain intriguing but need careful separation of benchmark signal from marketing fog
- The Financial Times covered Heretic, extending the debate about derivative open-weight models and legal pressure beyond specialist forums. — follow-up: open-weight legal pressure becomes mainstream business coverage
- NuExtract3 was released as an open-weight 4B VLM for Markdown, OCR and structured extraction that can be self-hosted. — small self-hostable VLMs push document extraction into local workflows
- Claw-Anything benchmarked always-on personal assistants with broader access to a user's digital world, exposing how narrow current agent tests are. — agent benchmarks expand toward always-on assistants with broad access to a user's digital world