Description

David Hand, a professor of statistics, explains how ChatGPT and other large language models hide behind dark data, leading to misleading outputs. He also critiques the peer-review process and the perils of partial truths in scientific modeling.

- 00:00:00 - Introduction
- 00:01:34 - What is Dark Data? (missing data matters more than what you have)
- 00:07:03 - The perils of "changing definitions"
- 00:09:15 - David on writing and his selective process
- 00:20:15 - Theory-driven vs. data-driven models (& the constitution of LLMs)
- 00:32:08 - The dilemma of partial truths
- 00:34:40 - The "File Drawer Problem" & its adverse effects on clinical trials
- 00:39:09 - Regression to the mean (how random variations lead to misleading conclusions)
- 00:44:12 - Publication bias
- 00:48:03 - Open-access models and their pitfalls
- 00:54:06 - Why LLMs are simultaneously brilliant & stupid
- 01:03:40 - David’s daily routine
- 01:06:24 - The mean vs. median
- 01:11:07 - Every type of "Dark Data" listed (watch this first!)

SPONSORS:
- Patreon: https://patreon.com/curtjaimungal
- Crypto: https://tinyurl.com/cryptoTOE
- PayPal: https://tinyurl.com/paypalTOE
- Twitter: https://twitter.com/TOEwithCurt
- Discord Invite: https://discord.com/invite/kBcnfNVwqs
- iTunes: https://podcasts.apple.com/ca/podcast...
- Pandora: https://pdora.co/33b9lfP
- Spotify: https://open.spotify.com/show/4gL14b9...
- Subreddit r/TheoriesOfEverything: https://reddit.com/r/theoriesofeveryt...
- TOE Merch: https://tinyurl.com/TOEmerch

RESOURCES:
- YouTube Link: https://www.youtube.com/watch?v=41JBrC5e5tA
- Dark Data: https://amzn.to/446Fou1
- The Improbability Principle: https://amzn.to/3DOn1iX