Description

Hey everyone! Thank you so much for watching the 72nd episode of the Weaviate Podcast with Madelon Hulsebos! Madelon is one of the world's experts on Machine Learning with Tables and Tabular-Structured Data, and this was such an eye-opening conversation! We discussed all sorts of topics, from the relationship between tabular data and embeddings to searching through tables, semantic joins, more complex Text-to-SQL, using machine learning for query execution, using tabular data in search and recommendation reranking, and many more! This was easily one of the most knowledge-packed episodes of the Weaviate Podcast so far, so please don't hesitate to leave any questions or ideas you have related to the content discussed!

You can learn more about Madelon's incredible research career and publications / talks here: https://www.madelonhulsebos.com/. Papers such as GitTables are listed there!

Another nice nugget from the podcast - Madelon introduced me to the BIRD-SQL benchmark, which really expanded my understanding of Text-to-SQL (https://arxiv.org/pdf/2305.03111.pdf).

Chapters
0:00 Welcome Madelon!
0:58 Tabular Data and Embeddings
3:10 Tabular Representation Learning
5:48 Semantic Type Detection
9:50 Pandas as an LLM Tool
11:52 Table-Based Question Answering and Text-to-SQL
19:35 Joins with Machine Learning
21:38 Query Execution with Machine Learning
22:45 Graph Neural Networks
24:07 XGBoost
28:28 Merging Tables
32:10 Fact Representation
35:50 GPT-4V and Tables
39:00 Metadata in Embeddings
42:45 Table Retrieval in Weaviate
46:25 Exciting Future Directions!