Description

Weaviate podcast #33.

Thank you so much for watching the 33rd Weaviate Podcast! This episode features one of the heroes of Deep Learning for Search, Nils Reimers! Nils' work on SentenceBERT is one of the foundations for applying Deep Representation Learning to text search. This is the idea that personally inspired me to work in this field. Having seen the successes of Contrastive Representation Learning for Computer Vision, I was mind-blown by the possibility of this for NLP and text search. In addition to the scientific foundation, the software development of the Sentence Transformers library and the BEIR benchmark has been enormously impactful! It was an honor getting to ask Nils the questions I have about these things, from the role of Data Quality to Intent, Sparse Vectors, Long Document Encoding, Distribution Shift, and many more. I really hope you enjoy the podcast! We are so excited about the Cohere Multilingual embedding model and can't wait to see what else comes out of Cohere and their amazing team!



Cohere Multilingual ML Models with Weaviate: https://weaviate.io/blog/2022/12/Cohe...



Nils Reimers: https://scholar.google.com/citations?...



Mentioned in the podcast:



Cross-Encoders: https://weaviate.io/blog/2022/08/Usin...



How to choose a Sentence Transformer from HuggingFace: https://weaviate.io/blog/2022/10/How-...



Chapters

0:00 Cohere X Weaviate

0:22 Welcome Nils Reimers!

1:18 Origin Story

3:15 Learning Text Embeddings

6:54 Positive and Negative Sampling in Contrastive Learning

13:32 1 Billion Pairs for Text Embedding Optimization

15:44 Impact of Data Quality

18:40 New Cohere Multilingual Model!

24:50 Challenge of Debugging Multilingual Models

28:30 Intent in Search

30:40 Thoughts on ColBERT

33:50 Sparse Vectors in Search

36:17 Long Documents and Multi-Discourse

43:40 Entity Parsing in Query Understanding

46:08 Unknown Words and Distribution Shift

50:07 Re-Vectorizing with Fine-Tuning

53:07 More on Search Interfaces and Intent in Search

55:15 Thank you Nils!