How Open Table Formats Changed Data Engineering

Description

In this episode of The Data Business Podcast, Lucas and Luna dive into the quiet but tectonic shift happening inside modern data lakes: the rise of open table formats like Apache Iceberg, Delta Lake, and Apache Hudi. Lucas explains how these formats solve the long-standing problem of making cloud object storage behave like a transactional database, enabling atomic commits, time travel queries, and cross-engine compatibility. He cites Netflix's role in developing and open-sourcing Iceberg, which he says now powers over an exabyte of data in production at companies like Apple and Airbnb. Luna brings up the cost implications: these formats reduce vendor lock-in and let companies swap query engines without rewriting pipelines. They contrast Iceberg's design philosophy with Delta Lake's tighter Databricks integration and discuss the real-world operational trade-offs. The episode closes with Lucas's take on why format wars matter more than most engineers realize, and how the data lakehouse architecture is becoming the default for new data platforms. No jargon without explanation, no vague promises, just a clear, specific look at an infrastructure layer that is quietly reshaping how data teams work.

#OpenTableFormats #ApacheIceberg #DeltaLake #ApacheHudi #DataLakehouse #DataEngineering #DataInfrastructure #Netflix #Databricks #CloudData #DataArchitecture #BigData #DataStorage #DataPlatform #BusinessAndTechnology #DataBusinessPodcast #FexingoBusiness #BusinessPodcast

Keep every episode free: buymeacoffee.com/fexingo
