Showing episodes and shows of
Nicolay Gerold
Shows
How AI Is Built
#052 Don't Build Models, Build Systems That Build Models
Nicolay here. Today I have the chance to talk to Charles from Modal, who went from doing a PhD on neural network optimization in the 2010s - when ML engineers could build models with a soldering iron and some sticks - to architecting serverless infrastructure for AI models. Modal is about removing barriers so anyone can spin up a hundred GPUs in seconds. The critical insight that stuck with me: "Don't build models, build systems that build models." Organizations often make the mistake of celebrating a one-time fine-tuned model that matches GPT-4...
2025-07-01
59 min
How AI Is Built
#051 Build systems that can be debugged at 4am by tired humans with no context
Nicolay here. Today I have the chance to talk to Charity Majors, CEO and co-founder of Honeycomb, who has recently been writing about the cost crisis in observability. "Your source of truth is production, not your IDE - and if you can't understand your code there, you're flying blind." The key insight is architecturally simple but operationally transformative: replace your 10-20 observability tools with wide structured events that capture everything about a request in one place. Most teams store the same request data across metrics, logs, traces, APM, and error tracking - creating...
2025-06-17
1h 05
How AI Is Built
#050 Bringing LLMs to Production: Delete Frameworks, Avoid Finetuning, Ship Faster
Nicolay here. Most AI developers are drowning in frameworks and hype. This conversation is about cutting through the noise and actually getting something into production. Today I have the chance to talk to Paul Iusztin, who's spent 8 years in AI - from writing CUDA kernels in C++ to building modern LLM applications. He currently writes about production AI systems and is building his own AI writing assistant. His philosophy is refreshingly simple: stop overthinking, start building, and let patterns emerge through use. The key insight that stuck with me: "If you...
2025-05-27
1h 06
How AI Is Built
#050 TAKEAWAYS Bringing LLMs to Production: Delete Frameworks, Avoid Finetuning, Ship Faster
Nicolay here. Most AI developers are drowning in frameworks and hype. This conversation is about cutting through the noise and actually getting something into production. Today I have the chance to talk to Paul Iusztin, who's spent 8 years in AI - from writing CUDA kernels in C++ to building modern LLM applications. He currently writes about production AI systems and is building his own AI writing assistant. His philosophy is refreshingly simple: stop overthinking, start building, and let patterns emerge through use. The key insight that stuck with me: "If you...
2025-05-27
11 min
How AI Is Built
#049 BAML: The Programming Language That Turns LLMs into Predictable Functions
Nicolay here. I think by now we are done marveling at the latest benchmark scores of the models. It doesn't tell us much anymore that the latest generation outscores the previous by a few basis points. If you don't know how the LLM performs on your task, you are just duct-taping LLMs into your systems. If your LLM-powered app can't survive a malformed emoji, you're shipping liability, not software. Today, I sat down with Vaibhav (co-founder of Boundary) to dissect BAML - a DSL that treats every LLM...
2025-05-20
1h 02
How AI Is Built
#049 TAKEAWAYS BAML: The Programming Language That Turns LLMs into Predictable Functions
Nicolay here. I think by now we are done marveling at the latest benchmark scores of the models. It doesn't tell us much anymore that the latest generation outscores the previous by a few basis points. If you don't know how the LLM performs on your task, you are just duct-taping LLMs into your systems. If your LLM-powered app can't survive a malformed emoji, you're shipping liability, not software. Today, I sat down with Vaibhav (co-founder of Boundary) to dissect BAML - a DSL that treats every LLM...
2025-05-20
1h 12
How AI Is Built
#048 TAKEAWAYS Why Your AI Agents Need Permission to Act, Not Just Read
Nicolay here. Most AI conversations obsess over capabilities. This one focuses on constraints - the right ones that make AI actually useful rather than just impressive demos. Today I have the chance to talk to Dexter Horthy, who recently put out a long piece called "12-factor agents". It's like the 10 commandments, but for building agents. One of them is "Contact humans with tool calls": the LLM can call humans for high-stakes decisions or "writes". The key insight is brutally simple. AI can get to 90% accuracy on most tasks - good e...
2025-05-13
07 min
How AI Is Built
#048 Why Your AI Agents Need Permission to Act, Not Just Read
Nicolay here. Most AI conversations obsess over capabilities. This one focuses on constraints - the right ones that make AI actually useful rather than just impressive demos. Today I have the chance to talk to Dexter Horthy, who recently put out a long piece called "12-factor agents". It's like the 10 commandments, but for building agents. One of them is "Contact humans with tool calls": the LLM can call humans for high-stakes decisions or "writes". The key insight is brutally simple. AI can get to 90% accuracy on most tasks - good e...
2025-05-11
57 min
How AI Is Built
#047 Architecting Information for Search, Humans, and Artificial Intelligence
Today on How AI Is Built, Nicolay Gerold sits down with Jorge Arango, an expert in information architecture. Jorge emphasizes that aligning systems with users' mental models is more important than optimizing backend logic alone. He shares a clear framework with four practical steps:Key Points:Information architecture should bridge user mental models with system data modelsInformation's purpose is to help people make better choices and act more skillfullyWell-designed systems create learnable (not just "intuitive") interfacesContext and domain boundaries significantly impact user understandingProgressive disclosure helps accommodate users with varying expertise levelsChapters00:00 Introduction to...
2025-03-27
57 min
How AI Is Built
#046 Building a Search Database From First Principles
Modern search is broken. There are too many pieces glued together:
- Vector databases for semantic search
- Text engines for keywords
- Rerankers to fix the results
- LLMs to understand queries
- Metadata filters for precision
Each piece works well alone. Together, they often become a mess. When you glue these systems together, you create: Data Consistency Gaps - your vector store knows about documents your text engine doesn't. Which is right? Timing Mismatches - new content appears in one system before another. Users see different results depending on which path their query takes. Complexity Explosion - every new component...
2025-03-13
53 min
How AI Is Built
#045 RAG As Two Things - Prompt Engineering and Search
John Berryman moved from aerospace engineering to search, then to ML and LLMs. His path: Eventbrite search → GitHub code search → data science → GitHub Copilot. He was drawn to more math and ML throughout his career. RAG Explained: "RAG is not a thing. RAG is two things." It breaks into search - finding relevant information - and prompt engineering - presenting that information to the model. These should be treated as separate problems to optimize. The Little Red Riding Hood Principle: when prompting LLMs, stay on the path of what models have seen in tra...
2025-03-06
1h 02
How AI Is Built
#044 Graphs Aren't Just For Specialists Anymore
Kuzu is an embedded graph database that implements Cypher as a library. It can be easily integrated into various environments - from scripts and Android apps to serverless platforms. Its design supports both ephemeral, in-memory graphs (ideal for temporary computations) and large-scale persistent graphs where traditional systems struggle with performance and scalability. Key Architectural Decisions: Columnar Storage: Kuzu stores node and relationship properties in separate, contiguous columns. This design reduces I/O by allowing queries to scan only the needed columns, unlike row-based systems (e.g., Neo4j) that read full records even wh...
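The columnar idea above can be sketched in a few lines of Python. This is an illustrative toy, not Kuzu's actual storage format: each node property lives in its own contiguous array, so a query touching one property never reads the others.

```python
import numpy as np

# Row-oriented layout: each node is a full record.
# Reading one property means iterating over every record.
rows = [{"id": i, "age": i % 90, "name": f"node{i}", "score": float(i)}
        for i in range(1000)]
ages_row = [r["age"] for r in rows]  # touches id, name, score too

# Column-oriented layout (the Kuzu-style idea): one contiguous array per property.
cols = {
    "id": np.arange(1000),
    "age": np.arange(1000) % 90,
    "name": np.array([f"node{i}" for i in range(1000)]),
    "score": np.arange(1000, dtype=np.float64),
}
# A query on `age` scans exactly one column - a single contiguous read.
ages_col = cols["age"]

assert list(ages_col) == ages_row
```

The payoff is proportional: a query needing 1 of 4 equally sized properties reads roughly a quarter of the data, and the contiguous layout is friendly to vectorized scans.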
2025-02-28
1h 03
How AI Is Built
#043 Knowledge Graphs Won't Fix Bad Data
Metadata is the foundation of any enterprise knowledge graph. By organizing both technical and business metadata, organizations create a "brain" that supports advanced applications like AI-driven data assistants. The goal is to achieve economies of scale - making data reusable, traceable, and ultimately more valuable. Juan Sequeda is a leading expert in enterprise knowledge graphs and metadata management. He has spent years solving the challenges of integrating diverse data sources into coherent, accessible knowledge graphs. As Principal Scientist at data.world, Juan provides concrete strategies for improving data quality, streamlining feature extraction, and enhancing model...
2025-02-20
1h 10
How AI Is Built
#042 Temporal RAG, Embracing Time for Smarter, Reliable Knowledge Graphs
Daniel Davis is an expert on knowledge graphs. He has a background in risk assessment and complex systems - from aerospace to cybersecurity. Now he is working on "Temporal RAG" in TrustGraph. Time is a critical - but often ignored - dimension in data. Whether it's threat intelligence, legal contracts, or API documentation, every data point has a temporal context that affects its reliability and usefulness. To manage this, systems must track when data is created, updated, or deleted, and ideally preserve versions over time. Three Types of Data: Observations - measurable, verifiable recordings (e.g., "the hat reads...
2025-02-13
1h 33
How AI Is Built
#041 Context Engineering, How Knowledge Graphs Help LLMs Reason
Robert Caulk runs Emergent Methods, a research lab building news knowledge graphs. With a Ph.D. in computational mechanics, he spent 12 years creating open-source tools for machine learning and data analysis. His work on projects like Flowdapt (model serving) and FreqAI (adaptive modeling) has earned over 1,000 academic citations. His team built AskNews, which he calls "the largest news knowledge graph in production." It's a system that doesn't just collect news - it understands how events, people, and places connect. Current AI systems struggle to connect information across sources and domains. Simple vector search misses crucial...
2025-02-06
1h 33
How AI Is Built
#040 Vector Database Quantization, Product, Binary, and Scalar
When you store vectors, each number takes up 32 bits. With 1,000 numbers per vector and millions of vectors, costs explode. A simple chatbot can cost thousands per month just to store and search through vectors. The Fix: Quantization. Think of it like image compression. JPEGs look almost as good as raw photos but take up far less space. Quantization does the same for vectors. Today we are back continuing our series on search with Zain Hasan, a former ML engineer at Weaviate and now a Senior AI/ML Engineer...
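To make the compression concrete, here is a minimal sketch of scalar quantization - one of the three techniques named in the episode title. It maps each 32-bit float to an 8-bit code (a 4x storage cut) using a global min/max range; the random data and the single shared range are simplifying assumptions for illustration.

```python
import numpy as np

def scalar_quantize(vectors: np.ndarray):
    """Map float32 values to uint8 codes: 32 bits -> 8 bits per dimension."""
    lo, hi = float(vectors.min()), float(vectors.max())
    scale = (hi - lo) / 255.0                      # width of one quantization step
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate float32 values from the uint8 codes."""
    return codes.astype(np.float32) * scale + lo

vecs = np.random.randn(1000, 8).astype(np.float32)
codes, lo, scale = scalar_quantize(vecs)
approx = dequantize(codes, lo, scale)

print(codes.nbytes / vecs.nbytes)  # 0.25 -> a quarter of the storage
```

The reconstruction error is bounded by half a quantization step, which is the JPEG-style trade-off: slightly blurrier vectors, dramatically cheaper storage and faster scans.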
2025-01-31
52 min
How AI Is Built
#039 Local-First Search, How to Push Search To End-Devices
Alex Garcia is a developer focused on making vector search accessible and practical. As he puts it: "I'm a SQLite guy. I use SQLite for a lot of projects... I want an easier vector search thing that I don't have to install 10,000 dependencies to use." Core Mantra: "Simple, Local, Scalable". Why SQLite Vec? "I didn't go along thinking, 'Oh, I want to build vector search, let me find a database for it.' It was much more like: I use SQLite for a lot of projects, I want something lightweight that wo...
2025-01-23
53 min
How AI Is Built
#038 AI-Powered Search, Context Is King, But Your RAG System Ignores Two-Thirds of It
Today, I (Nicolay Gerold) sit down with Trey Grainger, author of the book AI-Powered Search. We discuss the different techniques for search and recommendations and how to combine them. While RAG (Retrieval-Augmented Generation) has become a buzzword in AI, Trey argues that the current understanding of "RAG" is overly simplified - it's actually a bidirectional process he calls "GARRAG", where retrieval and generation continuously enhance each other. Trey uses a three-context framework for search architecture: Content Context - traditional document understanding and retrieval; User Context - behavioral signals driving personalization and recommendations; Domain Context - knowledge graphs and semantic un...
2025-01-09
1h 14
How AI Is Built
#037 Chunking for RAG: Stop Breaking Your Documents Into Meaningless Pieces
Today we are back continuing our series on search. We are talking to Brandon Smith about his work for Chroma. He led one of the largest studies in the field on different chunking techniques. So today we will look at how we can unfuck our RAG systems from badly chosen chunking hyperparameters. The biggest lie in RAG is that semantic search is simple. The reality is that it's easy to build, it's easy to get up and running, but it's really hard to get right. And if you don't have a good setup, it's near impossible to...
2025-01-03
49 min
How AI Is Built
#036 How AI Can Start Teaching Itself - Synthetic Data Deep Dive
Most LLMs you use today already use synthetic data. It's not a thing of the future. The large labs use a large model (e.g. GPT-4o) to generate training data for a smaller one (GPT-4o-mini). This lets you build fast, cheap models that do one thing well. This is "distillation". But the vision for synthetic data is much bigger: enable people to train specialized AI systems without having a lot of training data. Today we are talking to Adrien Morisot, an ML engine...
2024-12-19
48 min
How AI Is Built
#035 A Search System That Learns As You Use It (Agentic RAG)
Modern RAG systems are built on flexibility. At their core, they match each query with the best tool for the job. When you ask about sales numbers, they reach for SQL. When you need company policies, they use vector search or BM25. A question about sales figures might need SQL, while a search through policy documents works better with vector search. The key is building systems that can switch between these tools smoothly. But all types of retrieval...
2024-12-13
45 min
How AI Is Built
#034 Rethinking Search Inside Postgres, From Lexemes to BM25
Many companies use Elastic or OpenSearch and use 10% of the capacity. They have to build ETL pipelines, normalize data, and worry about race conditions. All in all: at the moment, when you want to do search on top of your transactional data, you are forced to build a distributed system. Not anymore. ParadeDB is building an open-source PostgreSQL extension to enable search within your database. Today, I am talking to Philippe Noël, the founder and CEO of ParadeDB. We talk about how t...
2024-12-05
47 min
How AI Is Built
#033 RAG's Biggest Problems & How to Fix It (ft. Synthetic Data)
RAG isn't a magic fix for search problems. While it works well at first, most teams find it's not good enough for production out of the box. The key is to make it better step by step, using good testing and smart data creation. Today, we are talking to Saahil Ognawala from Jina AI to start to understand RAG. To build a good RAG system, you need three things: ways to test it, methods to create training data, and plans to make it better over time. Testing starts with a set of example searches that...
2024-11-28
51 min
How AI Is Built
#032 Improving Documentation Quality for RAG Systems
Documentation quality is the silent killer of RAG systems. A single ambiguous sentence might corrupt an entire set of responses. But the hardest part isn't fixing errors - it's finding them. Today we are talking to Max Buckley on how to find and fix these errors. Max works at Google and has built a lot of interesting experiments with LLMs, using them to improve knowledge bases for generation. We talk about identifying ambiguities, fixing errors, creating improvement loops in the documents, and a lot more. Some insights: A single...
2024-11-21
46 min
How AI Is Built
#031 BM25 As The Workhorse Of Search; Vectors Are Its Visionary Cousin
Ever wondered why vector search isn't always the best path for information retrieval? Join us as we dive deep into BM25 and its unmatched efficiency in our latest podcast episode with David Tippett from GitHub. Discover how BM25 transforms search efficiency, even at GitHub's immense scale. BM25, short for Best Match 25, uses term frequency (TF) and inverse document frequency (IDF) to score document-query matches. It addresses limitations of TF-IDF, such as term saturation and document length normalization. Search Is About User Expectations: search isn't just about relevance but aligning with...
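For readers who want to see the scoring in code, here is a compact sketch of the Okapi BM25 formula over pre-tokenized documents. The toy corpus and the default parameters (k1=1.2, b=0.75) are illustrative; production engines precompute statistics in an inverted index rather than scanning the corpus per query.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with BM25 (Okapi)."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N          # average document length
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)     # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        f = tf[term]
        # k1 caps term-frequency saturation; b normalizes for document length -
        # exactly the two fixes over plain TF-IDF mentioned above.
        score += idf * (f * (k1 + 1)) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["search", "engine", "bm25"], ["vector", "search"], ["cooking", "pasta"]]
print(bm25_score(["search", "bm25"], corpus[0], corpus))
```

Note how a repeated term's contribution flattens out as f grows (saturation), so a document cannot win just by stuffing one keyword.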
2024-11-15
54 min
How AI Is Built
#030 Vector Search at Scale, Why One Size Doesn't Fit All
Ever wondered why your vector search becomes painfully slow after scaling past a million vectors? You're not alone - even tech giants struggle with this. Charles Xie, founder of Zilliz (the company behind Milvus), shares how they solved vector database scaling challenges at 100B+ vector scale. Key Insights:
- Multi-tier storage strategy: GPU memory (1% of data, fastest), RAM (10% of data), local SSD, object storage (slowest but cheapest)
- Real-time search solution: new data goes to a buffer (searchable immediately), the index builds in the background when the buffer fills, and buffer and main index results are combined
- Performance optimization: GPU acceleration for 10k-50k queries/second, customizable trade-offs bet...
2024-11-07
36 min
How AI Is Built
#029 Search Systems at Scale, Avoiding Local Maxima and Other Engineering Lessons
Modern search systems face a complex balancing act between performance, relevancy, and cost, requiring careful architectural decisions at each layer. While vector search generates buzz, hybrid approaches combining traditional text search with vector capabilities yield better results. The architecture typically splits into three core components: ingestion/indexing (requiring decisions between batch vs. streaming), query processing (balancing understanding vs. performance), and analytics/feedback loops for continuous improvement. Critical but often overlooked aspects include query understanding depth, systematic relevancy testing (avoid anecdote-driven development), and data governance as search systems naturally evolve into organizational data hubs.
2024-10-31
54 min
How AI Is Built
#028 Training Multi-Modal AI, Inside the Jina CLIP Embedding Model
Today we are talking to Michael Günther, a senior machine learning scientist at Jina, about his work on Jina CLIP. Some key points: Uni-modal embeddings convert a single type of input (text, images, audio) into vectors. Multimodal embeddings learn a joint embedding space that can handle multiple types of input, enabling cross-modal search (e.g., searching images with text). Multimodal models can potentially learn richer representations of the world, including concepts that are difficult or impossible to put into words. Types of Text-Image Models: CLIP-like models - separate vision and text transformer models; each tower maps inputs t...
2024-10-25
49 min
How AI Is Built
#027 Building the database for AI, Multi-modal AI, Multi-modal Storage
Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on the speed at which you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what went into the decision to use Rust as the main implementation language, the potential of AI on top of LanceDB, and more. "LanceDB is the database for AI...to manage their data, to...
2024-10-23
44 min
How AI Is Built
#026 Embedding Numbers, Categories, Locations, Images, Text, and The World
Today’s guest is Mór Kapronczay. Mór is the Head of ML at Superlinked. Superlinked is a compute framework for your information retrieval and feature engineering systems, where they turn anything into embeddings. When most people think about embeddings, they think about ada from OpenAI. You just take your text and throw it in there. But that’s too crude. OpenAI embeddings are trained on the internet. But your data set (most likely) is not the internet. You have different nuances. And you have more t...
2024-10-10
46 min
How AI Is Built
#025 Data Models to Remove Ambiguity from AI and Search
Today we have Jessica Talisman with us, who is working as an Information Architect at Adobe. She is (in my opinion) the expert on taxonomies and ontologies. That’s what you will learn today in this episode of How AI Is Built: taxonomies, ontologies, knowledge graphs. Everyone is talking about them; no one knows how to build them. But before we look into that, what are they good for in search? Imagine a large corpus of academic papers. When a user searches for "machine learning in healthcare", the system can: Recognize "ma...
2024-10-04
58 min
How AI Is Built
#024 How ColPali is Changing Information Retrieval
ColPali makes us rethink how we approach document processing. ColPali revolutionizes visual document search by combining late interaction scoring with visual language models. This approach eliminates the need for extensive text extraction and preprocessing, handling messy real-world data more effectively than traditional methods. In this episode, Jo Bergum, chief scientist at Vespa, shares his insights on how ColPali is changing the way we approach complex document formats like PDFs and HTML pages. Introduction to ColPali: combines late interaction scoring from ColBERT with a visual language model (PaliGemma); represents screenshots of documents as multi-vector...
2024-09-27
54 min
How AI Is Built
#023 The Power of Rerankers in Modern Search
Today, we're talking to Aamir Shakir, the founder and baker at mixedbread.ai, where he's building some of the best embedding and re-ranking models out there. We go into the world of rerankers, looking at how they can classify and deduplicate documents, prioritize LLM outputs, and delve into models like ColBERT. We discuss: the role of rerankers in retrieval pipelines; advantages of late interaction models like ColBERT for interpretability; training rerankers vs. embedding models and their impact on performance; incorporating metadata and context into rerankers for enhanced relevance; creative applications of rerankers beyond traditional search; challenges and future directions in the retrieval...
2024-09-26
42 min
How AI Is Built
#022 The Limits of Embeddings, Out-of-Domain Data, Long Context, Finetuning (and How We're Fixing It)
Text embeddings have limitations when it comes to handling long documents and out-of-domain data. Today, we are talking to Nils Reimers. He is one of the researchers who kickstarted the field of dense embeddings, developed sentence transformers, started Hugging Face’s Neural Search team, and now leads the development of search foundation models at Cohere. Tbh, he has too many accolades to count off here. We talk about the main limitations of embeddings: failing out of domain; struggling with long documents; being very hard to debug; being hard to formalize what actually is similar. Are you still not su...
2024-09-19
46 min
How AI Is Built
#021 The Problems You Will Encounter With RAG At Scale And How To Prevent (or fix) Them
Hey! Welcome back. Today we look at how we can get our RAG system ready for scale. We discuss common problems and their solutions when you introduce more users and more requests to your system. For this we are joined by Nirant Kasliwal, the author of fastembed. Nirant shares practical insights on metadata extraction, evaluation strategies, and emerging technologies like ColPali. This episode is a must-listen for anyone looking to level up their RAG implementations. "Naive RAG has a lot of problems on the retrieval end and then there's...
2024-09-12
50 min
How AI Is Built
#020 The Evolution of Search, Finding Search Signals, GenAI Augmented Retrieval
In this episode of How AI is Built, Nicolay Gerold interviews Doug Turnbull, a search engineer at Reddit and co-author of “Relevant Search”. They discuss how methods and technologies, including large language models (LLMs) and semantic search, contribute to relevant search results. Key Highlights: Defining relevance is challenging and depends heavily on user intent and context. Combining multiple search techniques (keyword, semantic, etc.) in tiers can improve results. LLMs are emerging as a powerful tool for augmenting traditional search approaches. Operational concerns often drive architectural decisions in large-scale search systems. Underappreciated techniques like LambdaMART may see a resurgence. Key Quot...
2024-09-05
52 min
The Data Stack Show
205: How to make LLMs Boring (Predictable, Reliable, and Safe), Featuring Nicolay Gerold
Highlights from this week’s conversation include: Nicolay’s Background and Journey in AI (0:39), Milestones in LLMs (4:30), Barriers to Effective Use of LLMs (6:39), Data-Centric AI Approach (10:17), Importance of Data Over Model Tuning (12:20), Capabilities of LLMs (15:08), Challenges in Structuring Data (18:28), JSON Generation Techniques (20:28), Utilizing Unused Data (22:36), Importance of Monitoring in AI (34:11), Challenges in AI Testing (37:40), Error Tracing in AI vs. Software (39:24), The AI Startup Landscape (40:53), Marketing for Technical Founders (42:41), Generative AI Hype Cycle (44:33), Connecting with Nicolay and Final Takeaways (47:59). The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk t...
2024-09-04
48 min
The Data Stack Show
The PRQL: What are LLMs Actually Good At? Featuring Nicolay Gerold
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
2024-09-02
01 min
How AI Is Built
#019 Data-driven Search Optimization, Analysing Relevance
In this episode, we talk data-driven search optimization with Charlie Hull. Charlie is a search expert from OpenSource Connections. He built Flax, one of the leading open-source search companies in the UK, has written “Searching the Enterprise”, and is one of the main voices on data-driven search. We discuss strategies to improve search systems quantitatively and much more. Key Points: Relevance in search is subjective and context-dependent, making it challenging to measure consistently. Common mistakes in assessing search systems include overemphasizing processing speed and relying solely on user complaints. Thre...
2024-08-30
51 min
How AI Is Built
#018 Query Understanding: Doing The Work Before The Query Hits The Database
Welcome back to How AI Is Built. We have got a very special episode to kick off season two. Daniel Tunkelang is a search consultant currently working with Algolia. He is a leader in the field of information retrieval, recommender systems, and AI-powered search. He worked for Canva, Algolia, Cisco, Gartner, and Handshake, to pick a few. His core focus is query understanding. Query understanding is about focusing less on the results and more on the query. The query of the user is the first-class citizen. It is about figuring out what the...
2024-08-15
53 min
How AI Is Built
Head vs Torso Queries | Grokking Search | Micro-lesson
Your queries are on a spectrum.

Head | Tail
High Volume | Low Volume
General | Specific
Few Queries | Many Queries

When we talk about volume, we talk about the number of searches with the same query term. Tail queries still account for a large share of total search volume, but spread across a distribution of many distinct queries. What counts as head, torso, and tail queries will always be application-specific, so you have to log the queries and build some analytics on them to identify them. Before you use them, you have to find them. Examine...
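The "log the queries and create some analytics" step can be sketched in a few lines. This is an illustrative toy: the query log and the 80% cut-off are made-up examples, and real systems would also carve out a torso band between head and tail.

```python
from collections import Counter

# Hypothetical query log; in practice this comes from your search analytics.
query_log = (
    ["iphone"] * 500 + ["laptop"] * 300                 # head: few distinct queries, high volume
    + ["red running shoes size 9"] * 3
    + ["waterproof hiking boots womens"] * 2            # tail: many distinct queries, low volume
)

counts = Counter(query_log)
total = sum(counts.values())

# Classify by cumulative share of traffic: the top queries that together
# cover 80% of searches form the head; everything after is tail.
cum, head = 0, []
for query, c in counts.most_common():
    cum += c
    head.append(query)
    if cum / total >= 0.8:
        break
tail = [q for q in counts if q not in head]

print(head, tail)
```

The exact threshold is a judgment call per application, which is the point of the lesson: you cannot know where your head ends until you measure your own distribution.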
2024-08-15
06 min
How AI Is Built
Facets vs Filters | Grokking Search | Micro-lesson
Facets let your users tune their search results. They are not filters. Filters eliminate search results outright; facets give users the ability to narrow the result set themselves. Facets allow the user in the frontend to limit the search results to a smaller, more specific set. They are extremely valuable in combination with head queries (reference), which return a large number of results. Read more: https://nicolaygerold.substack.com/p/facets-vs-filters-grokking-search Further reading: Daniel Tunkelang, Facets of Faceted Search; Elastic, Facets Guide; Elastic, Facets API Reference; Nielsen Norman Group, F...
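The distinction can be shown in a few lines of Python. This is a minimal sketch with a made-up product list, not any particular search engine's API: a filter removes non-matching items before they reach the user, while facets are value counts computed over the result set and offered back as narrowing options.

```python
from collections import Counter

products = [
    {"name": "Trail Runner", "brand": "Nike", "color": "red"},
    {"name": "Road Glide", "brand": "Nike", "color": "blue"},
    {"name": "Peak Boot", "brand": "Salomon", "color": "red"},
]

# A filter eliminates results up front: non-matching items never come back.
filtered = [p for p in products if p["color"] == "red"]

# Facets are computed over the result set and shown with counts,
# so the user can see what is available before narrowing down.
facets = {
    "brand": Counter(p["brand"] for p in products),
    "color": Counter(p["color"] for p in products),
}

print(facets["color"])  # Counter({'red': 2, 'blue': 1})
```

In a real engine the facet counts come from aggregations over the matched documents, so each facet value the user clicks becomes a filter for the next query.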
2024-08-13
04 min
How AI Is Built
Season 2 Trailer: Mastering Search
Today we are launching season 2 of How AI Is Built. Over the last few weeks, we spoke to a lot of regular listeners and past guests, collected feedback, and analyzed our episode data. We will be applying the learnings to season 2. This season will be all about search. We are trying to make it better, more actionable, and more in-depth. The goal is that at the end of this season, you have a full-fledged course on search in podcast form, with mini-courses on specific elements like RAG. We will be...
2024-08-08
04 min
How AI Is Built
#017 Unlocking Value from Unstructured Data, Real-World Applications of Generative AI
In this episode of "How AI is Built," host Nicolay Gerold interviews Jonathan Yarkoni, founder of Reach Latent. Jonathan shares his expertise in extracting value from unstructured data using AI, discussing challenging projects, the impact of ChatGPT, and the future of generative AI. From weather prediction to legal tech, Jonathan provides valuable insights into the practical applications of AI across various industries. Key Takeaways: Generative AI projects often require less data cleaning due to the models' tolerance for "dirty" data, allowing for faster implementation in some cases. The success of AI projects post-delivery is ensured through...
2024-07-16
36 min
How AI Is Built
#016 Data Processing for AI, Integrating AI into Data Pipelines, Spark
This episode of "How AI Is Built" is all about data processing for AI. Abhishek Choudhary and Nicolay discuss Spark and alternatives for processing data so it is AI-ready. Spark is a distributed system that allows for fast data processing by utilizing memory. It builds on the RDD (Resilient Distributed Dataset) abstraction to simplify distributed data processing. When should you use Spark to process your data for your AI systems?
→ Use Spark when:
- Your data exceeds terabytes in volume
- You expect unpredictable data growth
- Your pipeline involves multiple complex operations
- You already have a Spark cluster (e.g., Databricks)
Yo...
2024-07-12
46 min
How AI Is Built
#015 Building AI Agents for the Enterprise, Agent Cost Controls, Seamless UX
In this episode, Nicolay talks with Rahul Parundekar, founder of AI Hero, about the current state and future of AI agents. Drawing from over a decade of experience working on agent technology at companies like Toyota, Rahul emphasizes the importance of focusing on realistic, bounded use cases rather than chasing full autonomy.They dive into the key challenges, like effectively capturing expert workflows and decision processes, delivering seamless user experiences that integrate into existing routines, and managing costs through techniques like guardrails and optimized model choices. The conversation also explores potential new paradigms for agent interactions beyond...
2024-07-04
35 min
How AI Is Built
#014 Building Predictable Agents through Prompting, Compression, and Memory Strategies
In this conversation, Nicolay and Richmond Alake discuss various topics related to building AI agents and using MongoDB in the AI space. They cover the use of agents and multi-agents, the challenges of controlling agent behavior, and the importance of prompt compression. When you are building agents, build them iteratively. Start with simple LLM calls before moving to multi-agent systems. Main Takeaways: Prompt Compression: using techniques like prompt compression can significantly reduce the cost of running LLM-based applications by reducing the number of tokens sent to the model. This becomes crucial when scaling to...
2024-06-27
32 min
How AI Is Built
Data Integration and Ingestion for AI & LLMs, Architecting Data Flows | changelog 3
In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations. Kirk breaks down his approach using relatable concepts:
The "Two-Sided Funnel": This model streamlines data flow by converting various data sources into a standard format before distributing it.
Universal Data Streams: Kirk explains how he transforms diverse data into a single, manageable stream of information.
Parallel Processing: Learn about the "competing consumer model" that allows for faster data handling.
Building Blocks for Success: Discover the importance of well-defined interfaces and actor models in creating robust data systems.
Tech...
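The "competing consumer model" mentioned above can be sketched in a few lines: several workers pull from one shared queue, and whichever consumer is free takes the next message. This is a generic illustration of the pattern, not Graphlit's implementation:

```python
import queue
import threading

def run_competing_consumers(items, n_consumers=3):
    """Fan work out to several consumers competing on one shared queue."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def consume():
        while True:
            item = q.get()
            if item is None:  # poison pill: tells this worker to stop
                return
            with lock:
                results.append(item.upper())  # stand-in for real processing

    workers = [threading.Thread(target=consume) for _ in range(n_consumers)]
    for w in workers:
        w.start()
    for item in items:
        q.put(item)
    for _ in workers:      # one poison pill per worker
        q.put(None)
    for w in workers:
        w.join()
    return results

print(run_competing_consumers(["ingest", "parse", "index"]))
```

Note that result order is not guaranteed: whichever worker grabs an item first processes it, which is exactly what makes the pattern fast.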
2024-06-25
14 min
How AI Is Built
#013 ETL for LLMs, Integrating and Normalizing Unstructured Data
In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs).

Carbon is streamlining AI development by providing a platform for integrating unstructured data from various sources, enabling businesses to build innovative AI applications more efficiently while addressing data privacy and ethical concerns.

"I think people are trying to optimize around the chunking strategy... But for me, that seems a bit maybe not focusing on the right area of optimization. These embedding models themselves have gone just like, so much more...
2024-06-19
36 min
How AI Is Built
#012 Serverless Data Orchestration, AI in the Data Stack, AI Pipelines
In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly complex, spanning multiple teams, tools and cloud services, the need for unified orchestration and visibility has never been greater.

Orchestra is a serverless data orchestration tool that aims to provide a unified control plane for managing data pipelines, infrastructure, and analytics across an organization's modern data stack.

The core architecture involves users building pipelines as code which then run on Orchestra's serverless infrastructure. It can orchestrate tasks like...
2024-06-14
28 min
How AI Is Built
#011 Mastering Vector Databases, Product & Binary Quantization, Multi-Vector Search
Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hasan, an expert in vector databases from Weaviate. They break down complex topics like quantization, multi-vector search, and the potential of multimodal search, making them accessible for all listeners. Zain even shares a sneak peek into the future, where vector databases might connect our brains with computers!

Zain Hasan:
LinkedIn
X (Twitter)
Weaviate

Nicolay Gerold:
LinkedIn
X (Twitter)

Key Insights:
Vector databases can handle not just text, but also image, audio, and vi...
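One of the topics above, binary quantization, is simple enough to sketch directly: each vector dimension is collapsed to a single bit (its sign), and similarity becomes a cheap Hamming distance. A minimal sketch of the general idea, not Weaviate's implementation:

```python
def binary_quantize(vec):
    """Collapse each float dimension to one bit: positive -> 1, else 0."""
    return [1 if x > 0 else 0 for x in vec]

def hamming(a, b):
    """Count differing bits; lower means more similar binary codes."""
    return sum(x != y for x, y in zip(a, b))

query = binary_quantize([0.7, -0.1, 0.3, -0.9])
doc   = binary_quantize([0.5, -0.2, -0.4, -0.8])
print(hamming(query, doc))  # → 1 (codes differ only in the third bit)
```

The trade-off is the usual one in quantization: a 32x memory reduction per dimension, bought with some recall loss, which is why binary codes are often used for a fast first pass before rescoring with full-precision vectors.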
2024-06-07
40 min
How AI Is Built
#010 Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage
In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks down the basics using simple analogies, explaining how data architecture involves sorting, cleaning, and painting a picture with data, much like organizing Lego bricks to build a structure.

Summary by Section
Introduction
Anjan Banerjee, a data architect, discusses building complex AI and data systems
Explains the basics of data architecture using Lego and chat app examples
Sources and Tools
Identifying data sources...
2024-05-31
45 min
How AI Is Built
#009 Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack
Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, and strategies for optimizing your data pipelines. This episode is full of practical advice for anyone looking to build a high-performance data analytics platform.

Lakehouse architecture: A blend of data warehouse and data lake, addressing their shortcomings and providing a unified platform for diverse workloads.
Key components and decisions: Storage options (cloud or on-prem), table formats (Delta Lake, Iceberg, Apache Hudi), and query engines (Apache Spark, Polars).
Optimizations: Partitioning strategies, file...
2024-05-24
27 min
How AI Is Built
#008 Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models
Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG.

Key Points:
Knowledge Graphs: Graphlit utilizes knowledge graphs as a filtering layer on top of keyword metadata and vector search, aiding in information retrieval.
Storage for KGs: A single piece of content in their data model resides across multiple systems: a document store with...
2024-05-20
36 min
How AI Is Built
#007 Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture
From Problem to Requirements to Architecture.

In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the right tools, implementing effective data governance, and leveraging powerful concepts like software-defined assets. They discuss the challenges of keeping up with the ever-evolving tech landscape and offer practical advice for building sustainable data platforms. Tune in to discover how to simplify complex data pipelines, unlock the power of orchestration tools, and ultimately create more value from your data.

"Don't overcomplicate what you're actually doing."
"Getting your basic programming software...
2024-05-17
38 min
How AI Is Built
#006 Data Orchestration Tools, Choosing the right one for your needs
In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularity of Apache Airflow, the crowded market of orchestration tools, and the key problem that orchestrators solve. They also explore the components of a data orchestrator, the role of AI in data orchestration, and how to choose the right orchestrator for a project. They touch on the challenges of managing orchestrators, the importance of monitoring and optimization, and the need for product people to be more involved in the orchestration space. They also discuss...
2024-05-10
32 min
How AI Is Built
#005 Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals
In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-source library that helps developers test, evaluate, and fine-tune Retrieval Augmented Generation (RAG) applications, streamlining their path to production readiness.

Main Insights
Challenges of Open-Source Models: Open-source large language models (LLMs) can be powerful tools, but require significant post-training optimization for specific use cases.
Evaluation Before Deployment: Thorough testing and evaluation are key to preventing unexpected behaviors and hallucinations in deployed RAG applications. Ragas offers metrics and...
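Data-driven evaluation of a RAG pipeline usually starts with retrieval metrics. Here is a minimal recall@k sketch, a standard metric rather than anything specific to the Ragas API (which offers richer, LLM-assisted metrics):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents found in the top-k retrieved."""
    if not relevant_ids:
        raise ValueError("need at least one relevant document")
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Retriever returned d1, d3, d9; the labeled ground truth is d1 and d2.
print(recall_at_k(["d1", "d3", "d9"], ["d1", "d2"], k=2))  # → 0.5
```

Tracking a metric like this on a fixed test set before deployment catches retrieval regressions, which is where many "hallucinations" actually originate: the generator never saw the right context.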
2024-05-03
29 min
How AI Is Built
Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2
In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the traditional notion of columnar storage, allowing for more efficient handling of large multimodal datasets like images and embeddings. Weston discusses the goals driving LanceDB's development, including null value support, multimodal data handling, and finding an optimal balance for search performance.

Sound Bites
"A little bit more power to actually just try."
"We're becoming a little bit more feature complete with returns of arrow."
"Weird data representations that...
2024-04-29
21 min
How AI Is Built
#004 AI with Supabase, Postgres Configuration, Real-Time Processing, and more
Had a fantastic conversation with Christopher Williams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring:
Core components and how they power real-time AI solutions
Optimizing Postgres for AI workloads
The magic of PG Vector and other key extensions
Supabase's future and exciting new features
2024-04-26
31 min
How AI Is Built
#003 AI Inside Your Database, Real-Time AI, Declarative ML/AI
If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while keeping everything up-to-date with real-time calculations. It works with various databases and aims to make AI development less of a headache.

In this podcast, we explore:
How SuperDuperDB bridges the gap between AI and databases.
The benefits of real-time AI processes within your data deployment.
SuperDuperDB's framework for configuring AI workflows.
The future of AI-powered databases.

Takeaways
SuperDuperDB enables developers to apply AI processes...
2024-04-19
36 min
How AI Is Built
Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1
Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tuple). This means when you update data, Oriole tracks the changes needed to "undo" the update if necessary. Think of this like the "undo" function in a text editor. Instead of keeping a full copy of the old text, it just remembers what changed. This can be much smaller. This also saves space by eliminating the need for a garbage collection process. It...
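The text-editor analogy above maps directly to a tiny data structure: instead of keeping a full copy of each old row version, an undo log records only the per-column deltas needed to reverse an update. A conceptual sketch of the idea, not OrioleDB's actual C implementation:

```python
class UndoLogRow:
    """A row that records undo deltas instead of full old-version copies."""

    def __init__(self, values):
        self.values = dict(values)
        self.undo = []  # stack of (column, old_value) deltas

    def update(self, column, new_value):
        # Remember only what changed, like a text editor's undo stack.
        self.undo.append((column, self.values[column]))
        self.values[column] = new_value

    def rollback(self):
        # Reverse the most recent update by replaying its delta.
        column, old_value = self.undo.pop()
        self.values[column] = old_value

row = UndoLogRow({"name": "alice", "balance": 100})
row.update("balance", 80)
print(row.values["balance"])  # → 80
row.rollback()
print(row.values["balance"])  # → 100
```

A delta for one changed column is typically much smaller than a full tuple copy, which is also why this design can avoid a separate garbage-collection pass over dead row versions.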
2024-04-17
13 min
How AI Is Built
#002 AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation
Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any data into the schema your AI and software needs.

bem.ai is a data tool that focuses on transforming any data into the schema needed for AI and software. It acts as a system's interoperability layer, allowing systems that couldn't communicate before to exchange information. Learn what place LLMs play in data transformation, how to build reliable data infrastructure and more.

"Surprisingly, th...
2024-04-12
37 min
How AI Is Built
#001 Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure at LanceDB
Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning.

Machine learning and AI success depends on the speed you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what went into the decision to use Rust as the main implementation language, the potential of AI on top of LanceDB, and more.

"LanceDB is the database for AI...to manage their data, to...
2024-04-05
34 min
AI or Die
Beyond Code: AI Data & Design
This week we are joined by Nicolay Christopher Gerold. In this episode, Nicolay (Nico) brings a unique blend of expertise with his rich background in GenAI, NLP, and LLMs. Nico discusses a practical approach to the real-world applications and implications of generative AI technologies. During this episode, we dive into the multifaceted impacts of GenAI, discussing its role in data extraction, the challenges surrounding data governance and privacy, and the significance of synthetic data. Nico offers valuable insights on the future of AI user interfaces and the shift towards specialized AI models by making AI integrations more...
2024-02-26
48 min
Artificially Unintelligent
Artificially Unintelligent Teaser
Welcome to Artificially Unintelligent, a twice-a-week show with Nicolay Gerold and William Lindskog, where we explore the shift AI will unlock in the coming years. Each episode dives into a specific topic and gives you a jumping-off point to apply it or to dive even deeper. Stay tuned for more! Do you still want to hear more from us? Follow us on the Socials: Nicolay: LinkedIn | Newsletter William: LinkedIn | Twitter
2023-07-03
01 min