Hugo speaks with Vincent Warmerdam, a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning.
In this episode, they dive deep into rethinking established methods in data science, machine learning, and AI. They explore Vincent's principled approach to the field, including:
The critical importance of exposing yourself to real-world problems before applying ML solutions
Framing problems correctly and understanding the data generating process
The power of visualization and human intuition in data analysis
Questioning whether algorithms truly address the actual problem at hand
The value of simple, interpretable models and when to consider more complex approaches
The importance of UI and user experience in data science tools
Strategies for preventing algorithmic failures by rethinking evaluation metrics and data quality
The potential and limitations of LLMs in the current data science landscape
The benefits of open-source collaboration and knowledge sharing in the community
Throughout the conversation, Vincent illustrates these principles with vivid, real-world examples from his extensive experience in the field. They also discuss Vincent's thoughts on the future of data science and his call to action for more knowledge sharing in the community through blogging and open dialogue.
LINKS
The livestream on YouTube (https://youtube.com/live/-CD66CI1pEo?feature=share)
Vincent's blog (https://koaning.io/)
CalmCode (https://calmcode.io/)
scikit-lego (https://koaning.github.io/scikit-lego/)
Vincent's book Data Science Fiction (WIP) (https://calmcode.io/book)
The Deon Checklist, an ethics checklist for data scientists (https://deon.drivendata.org/)
Of oaths and checklists, by DJ Patil, Hilary Mason and Mike Loukides (https://www.oreilly.com/radar/of-oaths-and-checklists/)
Vincent's Getting Started with NLP and spaCy course on Talk Python (https://training.talkpython.fm/courses/getting-started-with-spacy)
Vincent on Twitter (https://x.com/fishnets88)
:probabl. on Twitter (https://x.com/probabl_ai)
Vincent's PyData Amsterdam Keynote "Natural Intelligence is All You Need [tm]" (https://www.youtube.com/watch?v=C9p7suS-NGk)
Vincent's PyData Amsterdam 2019 talk: The profession of solving (the wrong problem) (https://www.youtube.com/watch?v=kYMfE9u-lMo)
Vanishing Gradients on Twitter (https://twitter.com/vanishingdata)
Hugo on Twitter (https://twitter.com/hugobowne)
Check out and subscribe to our lu.ma calendar (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) for upcoming livestreams!