When OpenAI retires an embedding model like text-embedding-ada-002, your RAG pipeline doesn't crash; it just gets subtly worse until users lose trust. This episode unpacks the $40,000 re-embedding nightmare one company faced and explores three strategies to avoid it: event-driven re-embedding with PostgreSQL triggers, sidestepping embeddings entirely via the Model Context Protocol (MCP) for structured data, and client-side embedding caching with TTLs for gradual, non-breaking migrations. We also cover the VICE scoring model for choosing between vector search and traditional search, why top coding tools have abandoned vector RAG for AST-based retrieval, and the hybrid patterns that combine BM25, vector similarity, and cross-encoder reranking.
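
The client-side caching strategy mentioned above can be sketched roughly like this (a minimal illustration, not the implementation discussed in the episode; names such as `EmbeddingCache` and `embed_fn` are hypothetical). Entries computed against a retiring model simply age out via the TTL and are re-embedded with the replacement model on the next miss, so the migration happens gradually with no hard cutover:

```python
import hashlib
import time


class EmbeddingCache:
    """Client-side embedding cache keyed by (model, text) with a TTL,
    so embeddings from a retiring model expire and are recomputed
    with the replacement model on the next cache miss."""

    def __init__(self, embed_fn, ttl_seconds=7 * 24 * 3600):
        self.embed_fn = embed_fn   # callable: (model, text) -> vector
        self.ttl = ttl_seconds
        self._store = {}           # cache key -> (expires_at, vector)

    def _key(self, model, text):
        # Hash model and text together so switching models invalidates entries.
        return hashlib.sha256(f"{model}\x00{text}".encode()).hexdigest()

    def get(self, model, text):
        key = self._key(model, text)
        hit = self._store.get(key)
        now = time.time()
        if hit and hit[0] > now:
            return hit[1]          # fresh cache hit: no API call
        vector = self.embed_fn(model, text)  # miss or expired: re-embed
        self._store[key] = (now + self.ttl, vector)
        return vector
```

Because the model name is part of the cache key, pointing `embed_fn` at the new model naturally treats every lookup as a miss, while the TTL bounds how long stale old-model vectors can linger.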