Showing episodes and shows of
DataTalks.Club
Shows
DataTalks.Club
Linguistics and Fairness - Tamara Atanasoska
In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems. About the Speaker: Tamara works on ML explainability, interpretability and fairness as an Open Source Software Engineer at probable. She is a maintainer of fairlearn, contributor to scikit-learn and skops. Tamara has both a computer science/software engineering and a computational linguistics (NLP) background. During the event, the guest discussed their career journey from software engineering to open-source contributions, focusing on explainability in AI through Scikit-learn and Fairlearn. They explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emph...
2025-01-17
53 min
DataTalks.Club
DataTalks.Club 4th Anniversary AMA Podcast – Alexey Grigorev and Johanna Bayer
We talked about: 00:00 DataTalks.Club intro 00:00 DataTalks.Club anniversary "Ask Me Anything" event with Alexey Grigorev 02:29 The founding of DataTalks.Club 03:52 Alexey's transition from Java work to DataTalks.Club 04:58 Growth and success of DataTalks.Club courses 12:04 Motivation behind creating a free-to-learn community 24:03 Staying updated in data science through pet projects 26:37 Hosting a second podcast and maintaining programming skills 28:56 Skepticism about LLMs and their relevance 31:53 Transitioning to DataTalks.Club and personal reflections 33:32 Memorable moments and the first event's...
2024-10-26
53 min
DataTalks.Club
Working as a Core Developer in the Scikit-Learn Universe - Guillaume Lemaître
In this podcast episode, we talked with Guillaume Lemaître about navigating scikit-learn and imbalanced-learn. 🔗 CONNECT WITH Guillaume Lemaître LinkedIn - https://www.linkedin.com/in/guillaume-lemaitre-b9404939/ Twitter - https://x.com/glemaitre58 Github - https://github.com/glemaitre Website - https://glemaitre.github.io/ 🔗 CONNECT WITH DataTalksClub Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmN...
2024-07-26
52 min
DataTalks.Club
Berlin Buzzwords 2024
We stream the podcasts on YouTube, where each session is also recorded and published on our channel, complete with timestamps, a transcript, and important links. You can access all the podcast episodes here - https://datatalks.club/podcast.html 📚Check our free online courses ML Engineering course - http://mlzoomcamp.com Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp Analytics in Stock Markets - https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp LLM course - https://github.com/DataTalksClub/llm-zoomcamp Read about all our courses in one place - https://datatalks.club/blog/gui...
2024-07-06
37 min
DataTalks.Club
Building Production Search Systems - Daniel Svonava
Links: VectorHub: https://superlinked.com/vectorhub/?utm_source=community&utm_medium=podcast&utm_campaign=datatalks Daniel's LinkedIn: https://www.linkedin.com/in/svonava/ Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html This podcast is sponsored by VectorHub, a free open-source learning community for all things vector embeddings and information retrieval systems.
2024-03-22
58 min
DataTalks.Club
Stock Market Analysis with Python and Machine Learning - Ivan Brigida
We talked about: Ivan’s background How Ivan became interested in investing Getting financial data to run simulations Open, High, Low, Close, Volume Risk management strategy Testing your trading strategies Sticking to your strategy Important metrics and remembering about trading fees Important features Deployment How DataTalks.Club courses helped Ivan Ivan’s site and course sign-up Links: Exploring Finance APIs: https://pythoninvest.com/long-read/exploring-finance-apis Python Invest Blog Articles: https://pythoninvest.com/blog Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our even...
2024-01-24
55 min
DataTalks.Club
DataTalks.Club Anniversary Interview - Alexey Grigorev, Johanna Bayer
Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2023-10-12
57 min
DataTalks.Club
Lessons Learned from Freelancing and Working in a Start-up - Antonis Stellas
We talked about: Antonis' background The pros and cons of working for a startup Useful skills for working at a startup and the Lean way to work How Antonis joined the DataTalks.Club community Suggestions for students joining the MLOps course Antonis contributing to Evidently AI How Antonis started freelancing Getting your first clients on Upwork Pricing your work as a freelancer The process after getting approved by a client Wearing many hats as a freelancer and while working at a startup Other suggestions for getting clients as a freelancer Antonis' thoughts on the Data Engineering course Antonis...
2023-06-09
50 min
DataTalks.Club
Analytics for a Better World - Parvathy Krishnan
We talked about: Parvathy’s background Brainstorming sessions with nonprofits to establish data maturity Example of an Analytics for a Better World project The overall data maturity situation of nonprofits vs private sector Solving the skill gap Publicly available content The Analytics for a Better World Academy The Academy’s target audience How researchers can work with Analytics for a Better World Improving data maturity in nonprofit organizations People, processes, and technology Typical tools that Analytics for a Better World recommends to nonprofits Profiles in nonprofits Does Analytics for a Better World have a need for data engineers? The...
2023-03-03
54 min
DataTalks.Club
Accelerating the Adoption of AI through Diversity - Dânia Meira
We talked about: Dania’s background Founding the AI Guild Datalift Summit Coming up with meetup topics Diversity in Berlin Other types of diversity besides gender The pitfalls of lacking diversity Creating an environment where people can safely share their experiences How the AI Guild helps organizations become more diverse How the AI Guild finds women in the fields of AI and data science Advice for people in underrepresented groups Organizing a welcoming environment and creating a code of conduct AI Guild’s consulting work and community AI Guild team Dania’s resource recommendations Upcoming Datalift Summit ...
2023-02-24
57 min
DataTalks.Club
Staff AI Engineer - Tatiana Gabruseva
We talked about: Tatiana’s background Going from academia to healthcare to the tech industry What staff engineers do Transferring skills from academia to industry and learning new ones The importance of having mentors Skipping junior and mid-level straight into the staff role Convincing employers that you can take on a lead role Seeing failure as a learning opportunity Preparing for coding interviews Preparing for behavioral and system design interviews The importance of having a network and doing mock interviews How much do staff engineers work with building pipelines, data science, ETL, MLOps, etc.? Context switching Advice for th...
2023-02-17
55 min
DataTalks.Club
Navigating Career Changes in Machine Learning - Chris Szafranek
We talked about: Chris’s background Switching careers multiple times Freedom at companies Chris’s role as an internal consultant Chris’s sabbatical ChatGPT How being a generalist helped Chris in his career The cons of being a generalist and the importance of T-shaped expertise The importance of learning things you’re interested in Tips to enjoy learning new things Recruiting generalists The job market for generalists vs for specialists Narrowing down your interests Chris’s book recommendations Links: Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman Andrej Kar...
2023-02-03
55 min
DataTalks.Club
Preparing for a Data Science Interview - Luke Whipps
We talked about: Luke’s background Luke’s podcast - AI Game Changers How Luke helps people get jobs What’s changed in the recruitment market over the last 6 months Getting ready for the interview process Stage “zero” – the filter between the candidate and the company Preparing for the introduction stage – research and communication Reviewing the fundamentals during preparation Preparing for the technical part of the interview Establishing the hiring company’s expectations Depth vs breadth Overly theoretical and mathematical questions in interviews Bombing (failing) in the middle of an interview Applying to different roles within the same company Luke’s...
2023-01-27
54 min
DataTalks.Club
Indie Hacking - Pauline Clavelloux
We talked about: Pauline’s background Pauline’s work as a manager at IBM What is indie hacking? Pauline’s initial indie hacking projects Getting ready for launch Responsibilities and challenges in indie hacking Pauline’s latest indie hacking project Going live and marketing Challenges with Unreal Me Staying motivated with indie hacking projects Skills Pauline picked up while doing indie hacking projects Balancing a day job and indie hacking Micro SaaS and AboutStartup.io How Pauline comes up with ideas for projects Going from an idea on paper to building a project Pauline’s Twitter success Connecting with Pauline...
2023-01-20
51 min
DataTalks.Club
Doing Software Engineering in Academia - Johanna Bayer
We talked about: Johanna’s background Open science course and reproducible papers Research software engineering Convincing a professor to work on software instead of papers The importance of reproducible analysis Why academia is behind on software engineering The problems with open science publishing in academia The importance of standard coding practices How Johanna got into research software engineering Effective ways of learning software engineering skills Providing data and analysis for your project Johanna’s initial experience with software engineering in a project Working with sensitive data and the nuances of publishing it How often Johanna does hackathons, open sour...
2023-01-13
49 min
DataTalks.Club
Data-Centric AI - Marysia Winkels
We talked about: Marysia’s background What data-centric AI is Data-centric Kaggle competitions The mindset shift to data-centric AI Data-centric does not mean you should not iterate on models How to implement the data-centric approach Focusing on the data vs focusing on the model Resources to help implement the data-centric approach Data-centric AI vs standard data cleaning Making sure your data is representative Knowing when your data is good enough The importance of user feedback “Shadow Mode” deployment What to do if you have a lot of bad data or incomplete data Marysia’s role at PyData How Marysia...
2023-01-06
53 min
DataTalks.Club
From Software Engineer to Data Science Manager - Sadat Anwar
We talked about: Sadat’s background Sadat’s backend engineering experience Sadat’s pivot point as a backend engineer Sadat’s exposure to ML and Data Science Sadat’s Act Before you Think approach (with safety nets) Sadat’s street cred and transition into management The hiring process as an internal candidate The importance of people management skills The Brag List The most difficult part of transitioning to management Focusing on projects and setting milestones Sadat’s transition from EM to data science management How much domain knowledge is needed for management? The main difference between engineering and management H...
2022-12-09
52 min
DataTalks.Club
Teaching and Mentoring in Data Analytics - Irina Brudaru
We talked about: Irina’s background Irina as a mentor Designing curriculum and program management at AI Guild Other things Irina taught at AI Guild Why Irina likes teaching Students’ reluctance to learn cloud Irina as a manager Cohort analysis in a nutshell How Irina started teaching formally Irina’s diversity project in the works How DataTalks.Club can attract more female students to the Zoomcamps How to get technical feedback at work Antipatterns and overrated/overhyped topics in data analytics Advice for young women who want to get into data science/engineering Finding Irina online Fundamentals for data a...
2022-12-02
53 min
DataTalks.Club
Technical Writing and Data Journalism - Angelica Lo Duca
We talked about: Angelica’s background Angelica’s books Data journalism How Angelica got into data journalism The field of digital humanities and Angelica’s data journalism course Technical articles vs data journalism articles Transforming reports into data storytelling Are reports to stakeholders considered technical writing? Data visualization in articles Article length The process of writing an article Finding writing topics How Angelica got into writing a book (communication with publishers) The process for writing a book Brainstorming Reviews and revisions Conclusion Links: Data Journalism examples (FENCED OUT): https://www.washingtonpost.com/graphics/world...
2022-11-25
50 min
DataTalks.Club
From Digital Marketing to Analytics Engineering - Nikola Maksimovic
We talked about: Nikola’s background Making the first steps towards a transition to BI and Analytics Engineering Learning the skills necessary to transition to Analytics Engineering The in-between period – from Marketing to Analytics Engineering Nikola’s current responsibilities Understanding what a Data Model is Tools needed to work as an Analytics Engineer The Analytics Engineering role over time The importance of DBT for Analytics Engineers Where can one learn about data modeling theory? Going from Ancient Greek and Latin to understanding Data (Just-In-Time Learning) The importance of having domain knowledge to analytics engineering Suggestion for those wishing to tra...
2022-11-18
46 min
DataTalks.Club
Product Owners in Data Science - Anna Hannemann
We talked about: About Anna and METRO Anna’s background The importance of a technical background for data product owners What are product owners? Product owners vs product managers Anna’s work on recommender systems at METRO Expanding the data team Types of algorithms used for recommender systems What kind of knowledge and skills data product owners need to have Problems and ideas should come from the business How Anna handles all her responsibilities The process for starting work on new domains Product portfolio management ProductTank and Anna’s role in it Anna’s resource recommendations Li...
2022-11-11
54 min
DataTalks.Club
Building Data Science Practice - Andrey Shtylenko
We talked about: Audience Poll Andrey’s background What data science practice is Best DS practice in a traditional company vs IT-centric companies Getting started with building data science practice (finding out who you report to) Who the initiative comes from Finding out what kind of problems you will be solving (Centralized approach) Moving to a semi-decentralized approach Resources to learn about data science practice Pivoting from the role of a software engineer to data scientist The most impactful realization from data science practice Advice for individual growth Finding Andrey online Links: Data Tea...
2022-11-04
49 min
DataTalks.Club
Large-Scale Entity Resolution - Sonal Goyal
We talked about: Sonal’s background How the idea for Zingg came about What Zingg is The difference between entity resolution and identity resolution How duplicate detection relates to entity resolution How Sonal decided to start working on Zingg How Zingg works What Zingg runs on Switching from consultancy to working on a new open source solution Why Zingg is open source Open source licensing Working on Zingg initially vs now Zingg’s current and future team Sonal’s biggest current challenge Avoiding problems with entity/identity resolution through database design Identity resolution vs basic joins, data fusion...
2022-10-28
53 min
DataTalks.Club
From Data Science to DataOps - Tomasz Hinc
We talked about: Tomasz’s background What Tomasz did before DataOps (Data Science) Why Tomasz made the transition from Data science to DataOps What is DataOps? How is DataOps related to infrastructure? How Tomasz learned the skills necessary to become DataOps Becoming comfortable with terminal The overlap between DataOps and Data Engineering Suitable/useful skills for DataOps Minimal operational skills for DataOps Similarities between DataOps and Data Science Managers Tomasz’s interesting projects Confidence in results and avoiding going too deep with edge cases Conclusion Links: Terminal setup video, 19 minutes long: https://www.yout...
2022-10-21
51 min
DataTalks.Club
Data Science Career Development - Katie Bauer
We talked about: Katie’s background What is a data scientist? What is a data science manager? Quality of the craft How data leaders promote career growth Supporting senior data professionals Choosing the IC route vs the management route Managing junior data professionals Talking to senior stakeholders and PMs as a junior The importance of hiring juniors What skills do data scientist managers need to get hired? How juniors that are just starting out can set themselves apart from the competition Asking senior colleagues for help and the rubber duck channel The challenges of the head of data Co...
2022-10-14
53 min
DataTalks.Club
From Testing Phones to Managing NLP Projects - Alvaro Navas Peire
We talked about: Alvaro’s background Working as a QA (Quality Assurance) engineer Transitioning from QA to Machine Learning Gathering knowledge about ML field Searching for an ML job (improving soft skills and CV) Data science interview skills Zoomcamp projects Zoomcamp project deployment How to not undersell yourself during interviews Alvaro’s experience with interviews during his transition Alvaro’s Zoomcamp notes Alvaro’s coach The importance of mathematical knowledge to a transition into ML Preparing for technical interviews Alvaro’s typical workday Alvaro’s team’s tech stack The importance of a technical background to transitioning into ML
2022-10-07
48 min
DataTalks.Club
Responsible and Explainable AI - Supreet Kaur
We talked about: Supreet’s background Responsible AI Example of explainable AI Responsible AI vs explainable AI Explainable AI tools and frameworks (glass box approach) Checking for bias in data and handling personal data Understanding whether your company needs a certain type of data Data quality checks and automation Responsibility vs profitability The human touch in AI The trade-off between model complexity and explainability Is completely automated AI out of the question? Detecting model drift and overfitting How Supreet became interested in explainable AI Trustworthy AI Reliability vs fairness Bias indicators The future of explainable AI About DataBuzz The di...
2022-09-30
53 min
DataTalks.Club
Building Data Science Practice - Andrey Shtylenko
We talked about: Audience Poll Andrey’s background What data science practice is Best DS practice in a traditional company vs IT-centric companies Getting started with building data science practice (finding out who you report to) Who the initiative comes from Finding out what kind of problems you will be solving (Centralized approach) Moving to a semi-decentralized approach Resources to learn about data science practice Pivoting from the role of a software engineer to data scientist The most impactful realization from data science practice Advice for individual growth Finding Andrey online Links: Data Teams book: https://ww...
2022-09-30
49 min
DataTalks.Club
Leading Data Research - David Bader
We talked about: David’s background A day in the life of a professor David’s current projects Starting a school The different types of professors David’s recent papers Similarities and differences between research labs and startups Finding (or creating) good datasets David’s lab Balancing research and teaching as a professor David’s most rewarding research project David’s most underrated research project David’s virtual data science seminars on YouTube Teaching at universities without doing research Staying up-to-date in research David’s favorite conferences Selecting topics for research Convincing students to stay in academia and competing with i...
2022-09-16
58 min
DataTalks.Club
Dataset Creation and Curation - Christiaan Swart
We talked about: Christiaan’s background Usual ways of collecting and curating data Getting the buy-in from experts and executives Starting an annotation booklet Pre-labeling Dataset collection Human level baseline and feedback Using the annotation booklet to boost annotation productivity Putting yourself in the shoes of annotators (and measuring performance) Active learning Distance supervision Weak labeling Dataset collection in career positioning and project portfolios IPython widgets GDPR compliance and non-English NLP Finding Christiaan online Links: My personal blog: https://useml.net/ Comtura, my company: https://comtura.ai/ LI: https://www.linkedin.com/in/christiaan-swart-51a68967/ Tw...
2022-09-09
56 min
DataTalks.Club
Data Mesh 101 - Zhamak Dehghani
We talked about: Zhamak’s background What is Data Mesh? Domain ownership Determining what to optimize for with Data Mesh Decentralization Data as a product Self-serve data platforms Data governance Understanding Data Mesh Adopting Data Mesh Resources on implementing Data Mesh Links: Free 30-day code from O'Reilly: https://learning.oreilly.com/get-learning/?code=DATATALKS22 Data Mesh book: https://learning.oreilly.com/library/view/data-mesh/9781492092384/ LinkedIn: https://www.linkedin.com/in/zhamak-dehghani ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp Join DataTalks.Club: https://datatalks.club/sl...
2022-09-02
54 min
DataTalks.Club
Growing Data Engineering Team in a Scale-Up - Mehdi OUAZZA
We talked about: Mehdi’s background The difference between startup, scale-up and enterprise Hypergrowth Data platform engineers in a scale-up environment What a data platform is and who builds it Managing the fast pace of a scale-up while ensuring personal growth Should a senior data person consider a scale-up or an enterprise? Should a junior data person consider a scale-up or an enterprise? Sourcing talent for hyper-growth companies and developing a community culture Generating content and getting feedback Generalization vs specialization for data engineers in a scale-up The ratio of work between platform building and use case pipelines Be...
2022-08-26
53 min
DataTalks.Club
Lessons Learned About Data & AI at Enterprises - Alexander Hendorf
We talked about: Alexander’s background The role of Partner at Königsweg Being part of the data and AI community How Alexander became chair at PyData Alexander’s many talks and advice on giving them Explaining AI to managers Why being able to explain machine learning to managers is important The experimentational nature of AI and why it’s not a cure-all Innovation requires patience Convincing managers not to use AI or ML when there are better (simpler) solutions The role of MLOps in enterprises Thinking about the mid- and long-term when considering solutions Finding Alexander online
2022-08-19
54 min
DataTalks.Club
MLOps Architect - Danny Leybzon
We talked about: Danny’s background What an MLOps Architect does The popularity of MLOps Architect as a role Convincing an employer that you can wear many different hats Interviewing for the role of an MLOps Architect How Danny prioritizes work with data scientists Coming to WhyLabs when you’ve already got something in production vs nothing in production Market awareness regarding the importance of model monitoring How Danny (WhyLabs) chooses tools ONNX Common trends in tooling setups The most rewarding thing for Danny in ML and data science Danny’s secret for staying sane while wearing so many d...
2022-08-12
53 min
DataTalks.Club
Decoding Data Science Job Descriptions - Tereza Iofciu
We talked about: DataTalks.Club intro Tereza’s background Working as a coach Identifying the mismatches between your needs and that of a company How to avoid misalignments Considering what’s mentioned in the job description, what isn’t, and why Diversity and culture of a company Lack of a salary in the job description Way of doing research about the company where you will potentially work How to avoid a mismatch with a company other than learning from your mistakes Before data, during data, after data (a company’s data maturity level) The company’s tech stack Finding Te...
2022-08-05
49 min
DataTalks.Club
Data Science for Social Impact - Christine Cepelak
We talked about: Christine’s Background Private sector vs Public sector Public policy The challenges of being a community organizer How public policy relates to political science Programs that teach data science for public policy Data science for public policy vs regular data science The importance of ethical data science in public policy How data science in social impact projects differs from other projects Other resources to learn about data science for public policy Challenges with getting data in data science for public policy The problems with accessing public datasets about recycling Christine’s potential projects after Master’s degr...
2022-07-29
48 min
DataTalks.Club
Hiring Data Science Talent - Olga Ivina
We talked about: Olga’s career journey Hiring data scientists now vs 7 years ago The two qualities of an excellent data scientist What makes Alexey do this podcast How Alexey gets the latest information on data science How Olga checks a candidate’s technical skills How to make an answer stand out (showing your depth of knowledge) A strong mathematical background vs a strong engineering background When Auto ML will replace the need to have data scientists Should data scientists transition into management? (the importance of communication in an organization) Switching from a data analyst role to a data...
2022-07-22
52 min
DataTalks.Club
From Open-Source Maintainer to Founder - Will McGugan
We talked about: Will’s background Will’s open source projects S3Fs and PyFile systems Inspiration for open source projects Will as a freelancer Starting a company from a tweet (Rich and Textual) Building in public (Will’s approach to social media) The workforce and roadmap of Textualize.io The importance of working on open source for Textualize employees The workflow of and contributions to Textualize Getting your first thousand GitHub Stars (going viral) Suggestions for those who wish to start in the open-source space Finding Will online Links: Twitter: https://twitter.com/will...
2022-07-15
49 min
DataTalks.Club
Designing a Data Science Organization - Lisa Cohen
We talked about: Lisa’s background Centralized org vs decentralized org Hybrid org (centralized/decentralized) Reporting your results in a data organization Planning in a data organization Having all the moving parts work towards the same goals Which approach Twitter follows (centralized vs decentralized) Pros and cons of a decentralized approach Pros and cons of a centralized approach Finding a common language with all the functions of an org Finding the right approach for companies that want to implement data science How many data scientists does a company need? Who do data scientists report huge findings to? The im...
2022-07-08
51 min
DataTalks.Club
Developer Advocacy Engineer for Open-Source - Merve Noyan
We talked about: Merve’s background Merve’s first contributions to open source What Merve currently does at Hugging Face (Hub, Spaces) What it means to be a developer advocacy engineer at Hugging Face The best way to get open source experience (Google Summer of Code, Hacktoberfest, and sprints) The peculiarities of hiring as it relates to code contributions Best resources to learn about NLP besides Hugging Face Good first projects for NLP The most important topics in NLP right now NLP ML Engineer vs NLP Data Scientist Project recommendations and other advice to catch the eye of recr...
2022-07-01
50 min
DataTalks.Club
Data Scientists at Work - Mısra Turp
We talked about: Misra’s background What data scientists do Consultant data scientists vs in-house data scientists (and freelancers) Expectations for data scientists The importance of keeping up to date with AI developments (FOMA) How does DALL·E 2 work and should you care? Going to conferences to stay up to date The most pressing issue for data scientists Fighting FOMA and imposter syndrome Knowing when you have enough knowledge of a framework The “best” type of data scientist Being a generalist vs a specialist Advice for entry-level data entering an oversaturated market Catching the eye of big AI compani...
2022-06-24
58 min
DataTalks.Club
Freelancing and Consulting with Data Engineering - Adrian Brudaru
We talked about: Adrian’s background Freelancing vs Employment Risk and occupancy rate in freelancing The scariest part of freelancing Adrian’s first projects Freelancing 5 years later Pay rates in freelancing Acquiring skills while freelancing Working with recruitment agencies and networking Looking for projects and getting clients Freelancing vs consulting Clarity in clients’ expectations (scope of work) Building your network Freelancing platforms Adrian’s data loading prototype Going from freelancing to making your own product (and other investments) The usefulness of a portfolio Introverts in freelancing Is it possible to work for 3 months a year in freelancing? Choosing projects...
2022-06-17
52 min
DataTalks.Club
Getting a Data Engineering Job (Summary and Q&A) - Jeff Katz
We talked about: Summary of “Getting a Data Engineering Job” webinar Python and engineering skills Interview process Behavioral interviews Technical interviews Learning Python and SQL from scratch Is having non-coding experience a disadvantage? Analyst or engineer? Do you need certificates? Do I need a master’s degree? Fully remote data engineering jobs Should I include teaching on my resume? Object-oriented programming for data engineering Python vs Java/Scala SQL and Python technical interview questions GCP certificates Is commercial experience really necessary? From sales to engineering Solution engineers Wrapping up Links: Getting a Data Enginee...
2022-06-10
48 min
DataTalks.Club
Using Data for Asteroid Mining - Daynan Crull
We talked about: Daynan’s background Astronomy vs cosmology Applications of data science and machine learning in astronomy Determining signal vs noise What the data looks like in astronomy Determining the features of an object in space Ground truth for space objects Why water is an important resource in the space economy Other useful resources that can be found in asteroids Sources of asteroids The data team at an asteroid mining company Open datasets for hobbyists Mission and hardware design for asteroid mining Partnerships and hires Links: LinkedIn: https://www.linkedin.com/in/day...
2022-06-03
53 min
DataTalks.Club
Machine Learning in Marketing - Juan Orduz
We talked about: Juan’s background Typical problems in marketing that are solved with ML Attribution model Media Mix Model – detecting uplift and channel saturation Changes to privacy regulations and their effect on user tracking User retention and churn prevention A/B testing to detect uplift Statistical approach vs machine learning (setting a benchmark) Does retraining MMM models often improve efficiency? Attribution model baselines Choosing a decay rate for channels (Bayesian linear regression) Learning resource suggestions Bayesian approach vs Frequentist approach Suggestions for creating a marketing department Most challenging problems in marketing The importance of knowing marketing domain know...
2022-05-27
52 min
DataTalks.Club
From Academia to Data Analytics and Engineering - Gloria Quiceno
We talked about: Gloria’s background Working with MATLAB, R, C, Python, and SQL Working at ICE Job hunting after the bootcamp Data engineering vs Data science Using Docker Keeping track of job applications, employers and questions Challenges during the job search and transition Concerns over data privacy Challenges with salary negotiation The importance of career coaching and support Skills learned at Spiced Retrospective on Gloria’s transition to data and advice Top skills that helped Gloria get the job Thoughts on cloud platforms Thoughts on bootcamps and courses Spiced graduation project Standing out in a sea of appli...
2022-05-20
48 min
DataTalks.Club
Teaching Data Engineers - Jeff Katz
We talked about: Jeff’s background Getting feedback to become a better teacher Going from engineering to teaching Jeff on becoming a curriculum writer Creating a curriculum that reinforces learning Jeff on starting his own data engineering bootcamp Shifting from teaching ML and data science to teaching data engineering Making sure that students get hired Screening bootcamp applicants Knowing when it’s time to apply for jobs The curriculum of JigsawLabs.io The market demand of Spark, Kafka, and Kubernetes (or lack thereof) Advice for data analysts that want to move into data engineering The market demand of ETL...
2022-05-13
52 min
DataTalks.Club
From Roasting Coffee to Backend Development - Jessica Greene
We talked about: Jessica’s background Giving a talk at a tech conference about coffee Jessica’s transition into tech (How to get started) Going from learning to actually making money Landing your first job in tech Does your age matter when you’re trying to get a job? Challenges that Jessica faced in the beginning of her career Jessica’s role at PyLadies Fighting the Imposter Syndrome Generational differences in digital literacy and how to improve it Events organized by PyLadies Jessica’s beginnings at PyLadies (organizing events) Jessica’s experience with public speaking The impact of public speaki...
2022-05-06
52 min
DataTalks.Club
Recruiting Data Engineers - Nicolas Rassam
We talked about: Nicolas’ background The tech talent market in different countries Hiring data scientists vs data engineers A spike in interest for data engineering roles The importance of recruiters having technical knowledge The main challenges of hiring data engineers The difference in hiring junior, mid, and senior level data engineers Things recruiters look for in people who switch to a data engineering role The importance of knowing cloud tools The importance of knowing infrastructure tools Preparing for the interview The importance of a formal education The importance of having a project portfolio How your current domain influences the inte...
2022-04-29
49 min
DataTalks.Club
Storytime for DataOps - Christopher Bergh
We talked about: Christopher’s background The essence of DataOps Also known as Agile Analytics Operations or DevOps for Data Science Defining processes and automating them (defining “done” and “good”) The balance between heroism and fear (avoiding deferred value) The Lean approach Avoiding silos The 7 steps to DataOps Wanting to become replaceable DataOps is doable Testing tools DataOps vs MLOps The Head Chef at Data Kitchen What’s grilling at Data Kitchen? The DataOps Cookbook Links: DataOps Manifesto website: https://dataopsmanifesto.org/en/ DataOps Cookbook: https://dataops.datakitchen.io/pf-cookbook Recipes for DataOps Success: htt...
2022-04-22
52 min
DataTalks.Club
Machine Learning and Personalization in Healthcare - Stefan Gudmundsson
We talked about: Stefan’s background Applications of machine learning in healthcare Sidekick Health – gamified therapeutics How is working for King different from Sidekick Health? The rewards systems in gamified apps The importance of building a strong foundation for a data science team The challenges of building an app in the healthcare industry Dealing with ethics issues Sidekick Health’s personalized recommendations and content The importance of having the right approach in A/B tests (strong analytics and good data) The importance of having domain knowledge to work as a data professional in the healthcare industry Making a data-d...
2022-04-15
51 min
DataTalks.Club
Innovation and Design for Machine Learning - Liesbeth Dingemans
We talked about: Liesbeth’s background What is design? The importance of interaction in design Design as a process (Double Diamond technique) How long does it take to go from an idea to finishing the second diamond? Design thinking (Google’s PAIR) What is a Design Sprint and who should participate in it? Why should data specialists care about design? Challenging your task-giver (asking “why”) How to avoid the “Chinese whisper game” (reiterating the problem) Defining the roadmap for data science teams What is innovation? Bringing innovation to your management Task force-team approach to solving problems Innovation, resource management i...
2022-04-08
55 min
DataTalks.Club
Hacking Your Data Career - Marijn Markus
We talked about: Marijn’s background Standing out in data science Doing the opposite of what people tell you Don’t shoot the messenger (carefully sharing your findings) Advising the seniors Bite off more than you can chew, then chew Marijn’s side projects (finding value in doing things you find interesting) Building a project portfolio Marijn’s NGO project The importance of a team Open source intelligence (OSINT) The importance of soft skills for data experts Marijn’s LinkedIn growth strategy and tips Links: Twitter: https://twitter.com/MarijnMarkus LinkedIn: https://www.linkedin.com/in/marijnma...
2022-04-01
55 min
DataTalks.Club
Visualising Machine Learning - Meor Amer
We talked about: kDimensions Being self-employed Visual engineering Constrain yourself to get creative Coming up with ideas Visualising difficult concepts The process of creating visuals Creating visuals Learning to create visuals for engineers Consuming with intention to create Learning by breaking code Earning with visuals Adding visuals to blog posts Meor’s book: visual introduction to deep learning Links: A Visual Introduction to Deep Learning by Meor Amer: https://gumroad.com/a/63231091 kDimensions website: https://kdimensions.com/ Book to learn about Figma: https://figmabook.com/ Jack Butcher's approach: https://www.youtube.com/watch?v=azh...
2022-03-25
52 min
DataTalks.Club
From Math Teacher to Analytics Engineer - Juan Pablo
We talked about: Juan Pablo's background Data engineering resources Teaching calculus Transitioning to Analytics Data Analytics bootcamp Getting money while studying Going to meetups to get a job Looking for uncrowded doors Using LinkedIn Portfolio Talking to people at meetups Eight tips to get your first analytics job Consider contracts and temporary roles Getting experience with non-profits Create your own internship Networking Website for hosting a portfolio I’m a math teacher. What should I learn first? Analytics engineering Best suggestion: keep showing up Networking at online conferences Communication skills and being organized Links: Website: ht...
2022-03-18
50 min
DataTalks.Club
From Data Science to Data Engineering - Ellen König
We talked about: Ellen’s background Why Ellen switched from data science to data engineering The overlap between data science and data engineering Skills to learn and improve for data engineering Ways to pick up and improve skills (advice for making the transition) What makes a data engineering course “good” Languages to know for data engineering The easiest part of transitioning into data engineering The hardest part of transitioning into data engineering Common data engineering team distributions People who are both data scientists and data engineers Pet projects and other ways to pick up development skills Dealing with cloud...
2022-03-11
54 min
DataTalks.Club
Becoming a Data Engineering Manager - Rahul Jain
We talked about: Rahul’s background What do data engineering managers do and why do we need them? Balancing engineering and management Rahul’s transition into data engineering management The importance of updating your skill set Planning the transition to manager and other challenges Setting expectations for the team and measuring success Data reconciliation GDPR compliance Data modeling for Big Data Advice for people transitioning into data engineering management Staying on top of trends and enabling team members The qualities of a good data engineering team The qualities of a good data engineer candidate (interview advice) The difference betw...
2022-03-04
51 min
DataTalks.Club
A/B Testing - Jakob Graff
We talked about: Jakob’s background The importance of A/B tests Statistical noise A/B test example A/B tests vs expert opinion Traffic splitting, A/A tests, and designing experiments Noisy vs stable metrics – test duration and business cycles Z-tests, T-tests, and time series A/B test crash course advice Frequentist approach vs Bayesian approach A/B/C/D tests Pizza dough Links: Jakob's LinkedIn: https://www.linkedin.com/in/jakob-graff-a6113a3a/ Product Analyst role at Inkitt: https://jobs.lever.co/inkitt/d2b0427a-f37f-4002-975d-28bd60b56d70 ...
2022-02-25
54 min
DataTalks.Club
Machine Learning System Design Interview - Valerii Babushkin
We talked about: Valerii’s background Who goes through an ML system design interview System design vs ML system design Preparing for ML system design interviews Machine learning project checklist The importance of defining a goal and ways of measuring it What to do after you set a goal Typical components of an ML system Applying ML systems to real-world problems System design and coding in interviews for new graduates Humans in the validation of model performance Links: Valerii's Telegram channel (in Russian): t.me/cryptovalerii Join DataTalks.Club: https://datatalks.club/sl...
2022-02-18
54 min
DataTalks.Club
Career Coaching - Lindsay McQuade
We talked about: Lindsay’s background Spiced Academy Career coaching role Reframing your experience Helping with career problems Finding what interests you Tailoring a CV and “spray and pray” Career coaching outside a bootcamp Imposter syndrome After bootcamp Internships Working with recruiters Networking on LinkedIn Links: Lindsay's LinkedIn: https://www.linkedin.com/in/lindsay-mcquade/ Impostor questionnaire: http://impostortest.nickol.as/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2022-02-11
52 min
DataTalks.Club
Product Management Essentials for Data Professionals - Greg Coquillo
We talked about: Greg’s background Responsibilities of Data Product Manager Understanding customer journey Interviewing business partners and decision-makers Product sense, product mindset, and product roadmap Working backwards Driving the roadmap Building a roadmap in Excel Measuring success Advice for teams that don’t have a product manager Links: Greg's LinkedIn: https://www.linkedin.com/in/greg-coquillo/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2022-02-04
53 min
DataTalks.Club
Recruiting Data Professionals - Alicja Notowska
We talked about: Alicja’s background The hiring process Sourcing and recruiting Managing expectations Making the job description attractive Selecting profiles during sourcing Profile keywords The importance of a Master’s vs a Bachelor’s degree vs a PhD Improving CV Interview with the recruiter Salary expectations Advice for “career changers” Cover letters Data analysts Double Bachelor’s degrees The most difficult part of hiring Coursera courses on the CV Making a good impression on recruiters Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2022-01-28
57 min
DataTalks.Club
DataTalks.Club Behind the Scenes - Eugene Yan, Alexey Grigorev
We talked about: Alexey’s background Being a principal data scientist DataTalks.Club The beginning and growth of DataTalks.Club Sustaining the pace Types of talks Popular and favorite talks Making DataTalks.Club self-sufficient Alexey’s book and course Advice for people starting in data science and staying motivated Not keeping up to date with new tools Staying productive Learning technical subjects and keeping notes Inspiration and idea generation for DataTalks.Club Links: https://eugeneyan.com/writing/informal-mentors-alexey-grigorev/ Join DataTalks.Club: https://datatalks.club/slack.html Our event...
2022-01-21
50 min
DataTalks.Club
DTC's minis - From Data Engineering to MLOps - Sejal Vaidya
We don't have a new episode this week, but we have an amazing conversation with Sejal Vaidya from August. We talked about: Sejal's background Why transitioning to ML engineering Three phases of development of a project Why data engineers should get involved in ML Technologies Tips for people who want to transition Soft skills and understanding requirements Helpful resources Resources: ML checklist (https://twolodzko.github.io/ml-checklist.html) Machine Learning Bookcamp (https://mlbookcamp.com/) Made with ML course (https://madewithml.com) Full-stack deep learning (https://fullstackdeeplearning.com) Newsletters...
2022-01-14
16 min
DataTalks.Club
Becoming a Data Science Manager - Mariano Semelman
We talked about: Mariano’s background Typical day of a manager Becoming a manager Preparing for the transition Balancing projects and assumptions Search and recommendations Dealing with unfamiliar domains Structuring projects Connecting product and data science Rules of Machine Learning CRISP-DM and deployment Giving feedback Dealing with people leaving the team Doing technical work as a manager Dealing with bad hires Keeping up with the industry Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2022-01-07
1h 05
DataTalks.Club
Leading NLP Teams - Ivan Bilan
We talked about: Ivan’s role at Personio Ivan’s background Studying technical management Managing a software team NLP teams NLP engineers Becoming an NLP engineer Computer vision NLP engineer vs ML engineer Conversational designers Linguistics outside of chatbots When does a team need an NLP engineer or a linguist? The future of NLP NLP pipelines GPT-3 Problems of GPT-3 Does GPT-3 make everything obsolete? What NLP actually is? Does NLP solve problems better than humans? State of language translation NLP Pandect Links: https://github.com/ivan-bilan/The-NLP-Pandect https://github.com/ivan-bilan/The-Engineering-Manager-Pandect https://github.com/ivan...
2021-12-24
59 min
DataTalks.Club
Product Management for Machine Learning - Geo Jolly
We talked about Geo’s background Technical Product Manager Building ML platform Working on internal projects Prioritizing the backlog Defining the problems Observability metrics Avoiding jumping into “solution mode” Breaking down the problem Important skills for product managers The importance of a technical background Data Lead vs Staff Data Scientist vs Data PM Approvals and rollout Engineering/platform teams Data scientists’ role in the engineering team Scrum and Agile in data science Transitioning from Data Scientist to Technical PM Books to read for the transition Transitioning for non-technical people Doing user research Quality assurance in ML Advice for supporti...
2021-12-17
1h 02
DataTalks.Club
Moving from Academia to Industry - CJ Jenkins
We talked about: CJ’s background Evolutionary biology Learning machine learning Learning on the job and being honest with what you don’t know Convincing that you will be useful CJ’s first interview Transitioning to industry Tailoring your CV Data science courses Moving to Berlin Being selective vs ‘spray and pray’ Moving on to new jobs Plan for transitioning to industry Requirements for getting hired Publications, portfolios and pet projects Adjusting to industry Bad habits from academia Topics with long-term value CJ’s textbook Links: CJ's LinkedIn: https://www.linkedin.com/in/christina-je...
2021-12-10
59 min
DataTalks.Club
Advancing Big Data Analytics: Post-Doctoral Research - Eleni Tzirita Zacharatou
We talked about: Eleni’s background Spatial data analytics Responsibilities of a postdoc Publishing papers Best places for data management papers Differences between postdoc and PhD Helping students become successful Research at the DIMA group Identifying important research directions Reviewing papers Underrated topics in data management Research in data cleaning Collaborating with others Choosing the field for Master’s students Choosing the topic for a Master thesis Should I do a PhD? Promoting computer science to female students Links: https://www.user.tu-berlin.de/tzirita/ Join DataTalks.Club: https://data...
2021-12-03
1h 00
DataTalks.Club
Becoming a Data Product Manager - Sara Menefee
We talked about: Sara’s background Product designer’s responsibilities Data product manager’s responsibilities Planning with the team Design thinking and product design Data PMs vs regular PMs Skill requirements for Data PMs Going from a product designer to a data product manager Case studies Resources for learning about product management Data PM’s biggest challenge Multitasking and context switching Insights from user interviews Using new, unfamiliar tools Documentation Idea generation Do Data PMs need to know ML? Links: Product Management Courses: https://www.lennyrachitsky.com/course and https://www.reforge.com/masterin...
2021-11-26
59 min
DataTalks.Club
Data Science Manager vs Data Science Expert - Barbara Sobkowiak
We talked about: Barbara’s background Do you need a manager or an expert? Technical and non-technical requirements for managers Importance of technical skills for managers Responsibilities and skills of a manager Importance of technical background for managers Getting involved in business development and sales Developing the team Checking team’s work Data science expert Hiring experts Who should we hire first? Can an expert build a team? Data science managers in startups Project management Ensuring that projects provide value Questions before starting a project Women in data science Finding Barbara online General advice Link...
2021-11-19
59 min
DataTalks.Club
Ace Non-Technical Data Science Interviews - Nick Singh
We talked about: Nick’s background Being a career coach Overview of the hiring process Behavioral interviews for data scientists Preparing for behavioral interviews Handling "tricky" questions Project deep dive Business context Pacing, rambling, and honesty “What’s your favorite model?” What if I haven’t worked on a project that brought $1 mln? Different questions for different levels Product-sense interviews Identifying key metrics in unfamiliar domains Tech blogs Cold emailing Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2021-11-12
1h 01
DataTalks.Club
Becoming a Solopreneur in Data - Noah Gift
We talked about: Noah’s background Solopreneurship A day of a solopreneur Exponential vs linear work Escaping the office work - digging the tunnel Structuring goals Staying motivated Publishing books Planning out books Writing a book is like preparing to run a marathon Distributed income Getting started as a solopreneur Lowering expenses and adding time The right time to quit full-time Building a network Teaching at universities Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2021-11-05
59 min
DataTalks.Club
Building Business Acumen for Data Professionals - Thom Ives
Links: https://join.slack.com/t/integratedmlai/shared_invite/zt-r3hpj44k-gfhf1pzIt3jixrATyXCWnQ https://www.linkedin.com/in/thomives/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2021-10-30
1h 05
DataTalks.Club
Conquering the Last Mile in Data - Caitlin Moorman
We talked about: Caitlin’s background The last mile in data The Pareto Principle Failing to use data Making sure data is used Communicating with decision-makers Working backwards from the last mile Understanding how data drives decisions Sketching and prototyping Showing the benefits of power data Measurability Driving change in data Asking high-leverage questions Resistance from users Understanding domain experts Linear projects vs circular projects Recommendations for data analyst students Finding Caitlin online Links: Emelie's talk https://locallyoptimistic.com/post/linear-and-circular-projects-part-1/ https://locallyoptimistic.com/post/linear-and-circular-projects-part-2/ Join DataTalks.Club: ht...
2021-10-22
1h 02
DataTalks.Club
Similarities and Differences between ML and Analytics - Rishabh Bhargava
We talked about: Rishabh's background Rishabh’s experience as a sales engineer Prescriptive analytics vs predictive analytics The problem with the term ‘data science’ Is machine learning a part of analytics? Day-to-day of people that work with ML Rule-based systems to machine learning The role of analysts in rule-based systems and in data teams Do data analysts know data better than data scientists? Data analysts’ documentation and recommendations Iterative work - data scientists/ML vs data analysts Analyzing results of experiments Overlaps between machine learning and analytics Using tools to bridge the gap between ML and analytics Do companies...
2021-10-15
59 min
DataTalks.Club
Building and Leading Data Teams - Tammy Liang
We talked about: Tammy’s background Being the chief of data First projects as the first data person in a company Initial resistance Expanding the team Role of business analyst Platanomelon’s stack Order for growing the data team Demand forecasting Should analysts know machine learning Qualifications for the first data person in a company Providing accurate results Receiving insights in a timely manner Providing useful insights Giving ownership to the team Starting as the first data person in a company Data For Future podcast Supporting team members that are stuck Finding Tammy online Link...
2021-10-08
59 min
DataTalks.Club
What Researchers and Engineers Can Learn from Each Other - Mihail Eric
We talked about: Mihail’s background NLP and self-driving vehicles Transitioning from academia to the industry Machine learning researchers Finding open-ended problems Machine learning engineers Is data science more engineering or research? What can engineers and researchers learn from one another? Bridging the disconnect between researchers and engineers Breaking down silos Fluid roles Full-stack data scientists Advice to machine learning researchers Advice to machine learning engineers Reading papers Choosing between engineering or research if you’re just starting Confetti.ai Links: https://twitter.com/mihail_eric http://confetti.ai/ Join...
2021-10-01
1h 01
DataTalks.Club
Introducing Data Science in Startups - Marianna Diachuk
We talked about: Marianna’s background Being the only data scientist What should already be in the company How much experience do you need Identifying problems Prioritization What should the company already know? First week First month First quarter Managing expectations Solving problems without ML Project timelines Finding the best solution Evaluating performance Getting stuck Communicating with analysts Transitioning from engineering to data science Growing the team Stopping projects Questions for the company From research to production Wrapping up Links: Marianna's LinkedIn: https://www.linkedin.com/in/marianna-diachuk-53ba60116/ Jo...
2021-09-24
58 min
DataTalks.Club
Defining Success: Metrics and KPIs - Adam Sroka
We talked about: Adam’s background Adam’s laser and data experience Metrics and why do we care about them Examples of metrics KPIs KPI examples Derived KPIs Creating metrics — grocery store example Metric efficiency North Star metrics Threshold metrics Health metrics Data team metrics Experiments: treatment and control groups Accelerate metrics and timeboxing Links: Domino's article about measuring value: http://blog.dominodatalab.com/measuring-data-science-business-value Adam's article about skills useful for data scientists: https://towardsdatascience.com/how-to-apply-your-hard-earned-data-science-skillset-812585e3cc06 Adam's article about standing out: https://towardsdatascience.com/how-to-stand-out-as-a-great-data-scientist-in-2021-3b7a732114a9
2021-09-17
1h 02
DataTalks.Club
Making Sense of Data Engineering Acronyms and Buzzwords - Natalie Kwong
We talked about: Natalie’s background Airbyte What is ETL? Why ELT instead of ETL? Transformations How does ELT help analysts be more independent? Data marts and Data warehouses Ingestion DB ETL vs ELT Data lakes Data swamps Data governance Ingestion layer vs Data lake Do you need both a Data warehouse and a Data lake? Airbyte and ELT Modern data stack Reverse ETL Is drag-and-drop killing data engineering jobs? Who is responsible for managing unused data? CDC – Change Data Capture Slowly changing dimension Are there cases where ETL is preferable over ELT? Why is Airbyte open source? The...
2021-09-11
1h 00
DataTalks.Club
Mastering Algorithms and Data Structures - Marcello La Rocca
We talked about: Learning algorithms and data structures Resources for learning algorithms and data structures Most important data structures Learning the abstractions Learning algorithms if they aren’t needed at work Common mistakes when using wrong data structures Importance of data structures for data scientists Marcello’s book - Advanced Algorithms and Data Structures Bloom filters Where Bloom filters are useful Approximate nearest neighbours Searching for most similar vectors Knowing frameworks vs knowing internals of data structures Serializing Bloom filters Algorithmic problems in job interviews Important data structures for data scientists and data engineers Learning by doing Importance of c...
2021-09-03
1h 02
DataTalks.Club
Chief Data Officer - Marco De Sa
We talked about: Marco’s background Role of CDO Keeping track of many things Becoming a CDO Strategy vs tactics VP of Data vs CDO How many VPs of Data could there be? Splitting the work between VP and CDO Difference between CTO, CPO, and CDO Breaking down the goals and working backwards from them Assessing if we’re moving in the right direction Dealing with many meetings Being more effective Building the data-driven culture Challenges of working remotely Does CDO need deep technical skills? Importance of MBA The key skills for becoming a CDO Biggest challenges within OLX...
2021-08-27
1h 01
DataTalks.Club
Freelancing in Machine Learning - Mikio Braun
We talked about: Mikio’s background What Mikio helps with Moving from a full-time job to freelancing Finding clients and importance of a strong network Building a network Initial meetings with clients Understanding what clients need Template for the offer (Million dollar consulting) Deciding on rate type: hourly, daily, per project Taking vacations (and paying twice for them) Avoiding overworking Specializing: consulting as a product Working full-time as a principal vs being a consultant Is the overhead worth it? Getting a new client when you already have a project After freelancing: what’s next? Output of Mikio’s work ...
2021-08-20
1h 02
DataTalks.Club
Launching a Startup: From Idea to First Hire - Carmine Paolino
We talked about: Carmine’s background Carmine’s startup FreshFlow Doing user research Design thinking Entrepreneur first Finding co-founders: the “expertise edges” framework The structure of the EF program Coming up with the idea How important is going through a startup accelerator? Finding your first client Finding investors Consequences of having a bad investor Splitting responsibilities between co-founders Hiring The importance of delegating Making work attractive to hires Plans for the future Just-in-time supply chain What would you have done differently? Advice for people starting a startup Don’t focus on skills only Getting motivation Am I ready for a star...
2021-08-13
1h 07
DataTalks.Club
Approach Learning as ML Project - Vladimir Finkelshtein [mini]
We don't have an episode lined up for this week, but we recorded a small chat with Vladimir some time ago. Enjoy it! We talked about: Vladimir's background Learning by answering questions Don't be afraid of being wrong Winning books Learning random things Approach learning as a machine learning project Links: Vladimir on LinkedIn: https://www.linkedin.com/in/vladimir-finkelshtein/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2021-08-06
13 min
DataTalks.Club
Humans in the Loop - Lina Weichbrodt
We talked about: Lina’s background What we need to remember when starting a project (checklists) Make sure the problem is formalized and close to the core business Get the buy-in with stakeholders Building trust with stakeholders Don’t just focus on upsides – ask about concerns Turning a concern into a metric What happens when something goes wrong? Post mortem reporting Apply the 5 why’s If a lot of users say it’s a bug – it’s worth investigating Post mortem format Action points Debugging vs explaining the model Are there online versions of checklists? Make sure to log your input...
2021-07-30
57 min
DataTalks.Club
Running from Complexity - Ben Wilson
We talked about: Ben’s background Building solutions for customers Why projects don’t make it to production Why do people choose overcomplicated solutions? The dangers of isolating data science from the business unit The importance of being able to explain things Maximizing chances of making it into production The IKEA effect Risks of implementing novel algorithms If it can be done simply – do that first Don’t become the guinea pig for someone’s white paper The importance of stat skills and coding skills Structuring an agile team for ML work Timeboxing research Mentoring Ben’s book ‘Uncool techniques’ at...
2021-07-23
1h 11
DataTalks.Club
I Want to Build a Machine Learning Startup! - Elena Samuylova
We talked about: Elena’s background Why do a startup instead of being an employee? Where to get ideas for your startup Finding a co-founder What should you consider before starting a startup? Vertical startup vs infrastructure startup ‘AI First’ startups Building tools for engineers What skills do you need to start a startup? Startup risks How to be prepared to fail Work-life balance The part-time startup approach Startup investment models No resources and no technical expertise – what to do? Productionizing your services When to hire an expert Talking to people with a problem before solving the problem Starting...
2021-07-16
58 min
DataTalks.Club
Big Data Engineer vs Data Scientist - Roksolana Diachuk
Links: Twitter: https://twitter.com/dead_flowers22 LinkedIn: https://www.linkedin.com/in/roksolanadiachuk/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
2021-07-09
1h 01
DataTalks.Club
Build Your Own Data Pipeline - Andreas Kretz
We talked about: Andreas’s background Why data engineering is becoming more popular Who to hire first – a data engineer or a data scientist? How can I, as a data scientist, learn to build pipelines? Don’t use too many tools What is a data pipeline and why do we need it? What is ingestion? Can just one person build a data pipeline? Approaches to building data pipelines for data scientists Processing frameworks Common setup for data pipelines — car price prediction Productionizing the model with the help of a data pipeline Scheduling Orchestration Start simple Learning DevOps to implemen...
2021-07-02
1h 01
DataTalks.Club
From Software Engineering to Machine Learning - Santiago Valdarrama
We talked about: Santiago’s background “Transitioning to ML” vs “Adding ML as a skill” Getting over the fear of math for software developers Learning by explaining Seven lessons I learned about starting a career in machine learning Lesson 1 – Take the first step Lesson 2 – Learning is a marathon, not a sprint Lesson 3 – If you want to go quickly, go alone. If you want to go far, go together. Lesson 4 – Do something with the knowledge you gain Lesson 5 – ML is not just math. Math is not scary. Lesson 6 – Your ability to analyze a problem is the most important skill. Coding is secondary. ...
2021-06-25
59 min
DataTalks.Club
Analytics Engineer: New Role in a Data Team - Victoria Perez Mola
Links: https://www.notion.so/Analytics-Engineer-New-Role-in-a-Data-Team-9decbf33825c4580967cf3173eb77177 https://www.linkedin.com/in/victoriaperezmola/ Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Conference: https://datatalks.club/conferences/2021-summer-marathon.html
2021-06-18
59 min
DataTalks.Club
Data Governance - Jessi Ashdown, Uri Gilad
We talked about: Jessi’s background Uri’s background Data governance Implementing data governance: policies and processes Reasons not to have data governance Start with “why” Cataloging and classifying our data Let data work for you The human component Data quality Defining policies Implementing policies Shopping-cart experience for requesting data Proving the value of data catalog Using data catalog Data governance = data catalog? Links: Book: https://www.oreilly.com/library/view/data-governance-the/9781492063483/ Jessi’s LinkedIn: https://www.linkedin.com/in/jashdown/ Uri’s LinkedIn: https://linkedin.com/in/ugilad Uri’s Twitter: https://twitter.com/ug...
2021-06-11
57 min
DataTalks.Club
What Data Scientists Don’t Mention in Their LinkedIn Profiles - Yury Kashnitsky
We talked about: Yury’s background Failing fast: Grammarly for science Not failing fast: Keyword recommender Four steps to epiphany Lesson learned when bringing XGBoost into production When data scientists try to be engineers Joining a fintech startup: Doing NLP with thousands of GPUs Working at a Telco company Having too much freedom The importance of digital presence Work-life balance Quantifying impact of failing projects on our CVs Business trips to Perm: don’t work on the weekend What doesn’t kill you makes you stronger Links: Yury's course: https://mlcourse.ai/ Yury's Twitte...
2021-06-04
59 min
DataTalks.Club
How to Market Yourself (without Being a Celebrity) - Shawn Swyx Wang
We talked about: Shawn’s background and his book Marketing ourselves Components of personal marketing Personal brand for an average developer Picking a domain: what to write about? Being too niche Finding a good niche Learning in public Borrowed platforms vs own platform Starting on social media: Picking what they put down Career transitioning: mutual exchange of value Personal marketing for getting a new job Getting hired through the back door Finding content ideas Marketing yourself in public — summary Open-source knowledge Internal marketing: promoting ourselves at work Signature initiative Public speaking Wrapping up Discount for the coding career book...
2021-05-21
1h 02
DataTalks.Club
Effective Communication with Business for Data Professionals - Lior Barak
We talked about: DataTalks.Club intro Lior’s background Who is a data strategist? Improving communication between business and tech Building trust Putting data and business people together Dealing with pushbacks Building things in the lean way (and growing tomatoes) Starting with ugly code Convincing others to take our code MVP vs development and Hummus Talking to people who can’t code Break down the silos Hummus Hummus places in Berlin Lior’s book: Data is Like a Plate of Hummus Data chaos Links: Book: https://www.amazon.com/-/en/Sarah-Mayor/dp/B086L2...
2021-04-30
57 min
DataTalks.Club
Data Observability - Barr Moses
We covered: Barr’s background Market gaps in data reliability Observability in engineering Data downtime Data quality problems and the five pillars of data observability Example: job failing because of a schema change Three pillars of observability (good pipelines and bad data) Observability vs monitoring Finding the root cause Who is accountable for data quality? (the RACI framework) Service level agreements Inferring the SLAs from the historical data Implementing data observability Data downtime maturity curve Monte Carlo: data observability solution Open source tools Test-driven development for data Is data observability cloud agnostic? Centralizing data observability Detecting downstream and up...
2021-04-23
1h 01
DataTalks.Club
Transitioning from Project Management to Data Science - Ksenia Legostay
We talked about: Ksenia’s background Data analytics vs data science Skills needed for data analytics and data science Benefits of getting a master’s degree Useful online courses How project management background can be helpful for the career transition Which skills do PMs need to become data analysts? Going from working with spreadsheets to working with Python Kaggle Productionizing machine learning models Getting experience while studying Looking for a job Gap between theory and practice Learning plan for transitioning Last tips and getting involved in projects Links: Notes prepared by Ksenia with all th...
2021-04-09
1h 03
DataTalks.Club
Translating ML Predictions Into Better Real-World Results with Decision Optimization - Dan Becker
We talked about: How we make decisions with machine learning What is decision optimization Specifying the decision function Emulation for making the best decisions Decision optimization and reinforcement learning Getting started with decision optimization Trends in the industry Links: https://datatalks.club/people/danbecker.html https://www.decision.ai/ Join DataTalks.Club: https://datatalks.club/slack.html
2021-02-19
55 min