Showing episodes and shows of Igor Krawczuk
Shows

muckrAIkers: AI, Reasoning or Rambling?
In this episode, we redefine AI's "reasoning" as mere rambling, exposing the "illusion of thinking" and "Potemkin understanding" in current models. We contrast the classical definition of reasoning (requiring logic and consistency) with Big Tech's new version, which is a generic statement about information processing. We explain how Large Rambling Models generate extensive, often irrelevant rambling traces that appear to improve benchmarks, largely due to best-of-N sampling and benchmark gaming. Words and definitions actually matter! Carelessness leads to misplaced investments and an overestimation of systems that are currently just surprisingly useful autocorrects.
Chapters: (00:00) - Intro (00:40...
2025-07-14, 1h 11m

Into AI Safety: Getting Into PauseAI w/ Will Petillo
Will Petillo, onboarding team lead at PauseAI, joins me to discuss the grassroots movement advocating for a pause on frontier AI model development. We explore PauseAI's strategy, talk about common misconceptions Will hears, and dig into how diverse perspectives still converge on the need to slow down AI development.
Will's links: personal blog on AI; his mindmap of the AI x-risk debate; game demos; AI-focused YouTube channel
Chapters: (00:00) - Intro (03:36) - What is PauseAI (10:10) - Will Petillo's journey into AI safety advocacy (21:13) - Understanding PauseAI (31:35) - Pursuing a pause (40:06) - Balancing advocacy in a complex world (45:54...
2025-06-23, 1h 48m

muckrAIkers: One Big Bad Bill
In this episode, we break down Trump's "One Big Beautiful Bill" and its dystopian AI provisions: automated fraud detection systems, centralized citizen databases, military AI integration, and a 10-year moratorium blocking all state AI regulation. We explore the historical parallels with authoritarian data consolidation and why this represents a fundamental shift away from the limited-government principles once held by US conservatives.
Chapters: (00:00) - Intro (01:13) - Bill, general overview (05:14) - Bill, AI overview (07:54) - Medicaid fraud detection systems (11:20) - Bias in AI systems and ethical concerns (17:58) - Centralization of data (30:04) - Military integration of AI (37:05) - Tax incentives...
2025-06-23, 53 min

muckrAIkers: Breaking Down the Economics of AI
Jacob and Igor tackle the wild claims about AI's economic impact by examining three main clusters of arguments: automating expensive tasks like programming, removing "cost centers" like call centers and corporate art, and claims of explosive growth. They dig into the actual data, debunk the hype, and explain why most productivity claims don't hold up in practice. Plus: MIT denounces a paper with fabricated data, and Grok randomly promotes white genocide myths.
Chapters: (00:00) - Recording date + intro (00:52) - MIT denounces paper (04:09) - Grok's white genocide (06:23) - Butthole convergence (07:13) - AI and the economy (14:50) - Automating profit...
2025-05-26, 1h 06m

muckrAIkers: DeepSeek: 2 Months Out
DeepSeek has been out for over two months now, and things have begun to settle down. We take this opportunity to contextualize the developments that have occurred in its wake, both within the AI industry and the world economy. As systems get more "agentic" and users are willing to spend increasing amounts of time waiting for their outputs, the value of supposed "reasoning" models continues to be peddled by AI system developers, but does the data really back these claims? Check out our DeepSeek minisode for a snappier overview!
EPISODE RECORDED 2025.03.30
Chapters: (00:40...
2025-04-09, 1h 31m

muckrAIkers: DeepSeek Minisode
DeepSeek R1 has taken the world by storm, causing a stock market crash and prompting further calls for export controls within the US. Since this story is still very much in development, with follow-up investigations and calls for governance being released almost daily, we thought it best to hold off for a little while longer to be able to tell the whole story. Nonetheless, it's a big story, so we provide a brief overview of all that's out there so far.
Chapters: (00:00) - Recording date (00:04) - Intro (00:37) - DeepSeek drop and reactions (04:27) - Export controls (08:05) - Skepticism...
2025-02-10, 15 min

muckrAIkers: Understanding AI World Models w/ Chris Canal
Chris Canal, co-founder of EquiStamp, joins muckrAIkers as our first ever podcast guest! In this ~3.5 hour interview, we discuss intelligence vs. competencies, the importance of test-time compute, moving goalposts, the orthogonality thesis, and much more. A seasoned software developer, Chris started EquiStamp in late 2023 as a way to improve our current understanding of model failure modes and capabilities. Now a key contractor for METR, EquiStamp evaluates the next generation of LLMs from frontier model developers like OpenAI and Anthropic. EquiStamp is hiring, so if you're a software developer interested in a fully remote opportunity with...
2025-01-27, 3h 19m

muckrAIkers: NeurIPS 2024 Wrapped 🌯
What happens when you bring over 15,000 machine learning nerds to one city? If your guess didn't include racism, sabotage and scandal, belated epiphanies, a spicy SoLaR panel, and many fantastic research papers, you wouldn't have captured my experience. In this episode we discuss the drama and takeaways from NeurIPS 2024. Posters available at time of episode preparation can be found on the episode webpage.
EPISODE RECORDED 2024.12.22
Chapters: (00:00) - Recording date (00:05) - Intro (00:44) - Obligatory mentions (01:54) - SoLaR panel (18:43) - Test of Time (24:17) - And now: science! (28:53) - Downsides of benchmarks (41:39) - Improving the...
2024-12-30, 1h 26m

muckrAIkers: OpenAI's o1 System Card, Literally Migraine Inducing
The idea of model cards, introduced as a measure to increase transparency and understanding of LLMs, has been perverted into the marketing gimmick characterized by OpenAI's o1 system card. To demonstrate the adversarial stance we believe is necessary to draw meaning from these press-releases-in-disguise, we conduct a close read of the system card. Be warned, there's a lot of muck in this one.
Note: all figures/tables discussed in the podcast can be found on the podcast website at https://kairos.fm/muckraikers/e009/
Chapters: (00:00) - Recorded 2024.12.08 (00:54) - Actual intro (03:00) - System...
2024-12-23, 1h 16m

muckrAIkers: How to Safely Handle Your AGI
While on the campaign trail, Trump made claims about repealing Biden's Executive Order on AI, but what will actually be changed when he gets into office? We take this opportunity to examine policies being discussed or implemented by leading governments around the world.
Chapters: (00:00) - Intro (00:29) - Hot off the press (02:59) - Repealing the AI executive order? (11:16) - "Manhattan" for AI (24:33) - EU (30:47) - UK (39:27) - Bengio (44:39) - Comparing EU/UK to USA (45:23) - China (51:12) - Taxes (55:29) - The muck
Links: SFChronicle article - US gathers allies to talk AI safety as Trump's vow to...
2024-12-02, 58 min

muckrAIkers: The End of Scaling?
Multiple news outlets, including The Information, Bloomberg, and Reuters [see sources], are reporting an "end of scaling" for the current AI paradigm. In this episode we look into these articles, as well as a wide variety of economic forecasting, empirical analysis, and technical papers, to understand the validity and impact of these reports. We also use this as an opportunity to contextualize the realized versus promised fruits of "AI".
Chapters: (00:23) - Hot off the press (01:49) - The end of scaling (10:50) - "Useful tools" and "agentic" "AI" (17:19) - The end of quantization (25:18) - Hedging (29:41) - The end...
2024-11-19, 1h 07m

muckrAIkers: US National Security Memorandum on AI, Oct 2024
October 2024 saw a National Security Memorandum and a US framework for using AI in national security contexts. We go through the content so you don't have to, pull out the important bits, and summarize our main takeaways.
Chapters: (00:48) - The memorandum (06:28) - What the press is saying (10:39) - What's in the text (13:48) - Potential harms (17:32) - Miscellaneous notable stuff (31:11) - What's the US government's take on AI? (45:45) - The civil side - comments on reporting (49:31) - The commenters (01:07:33) - Our final hero (01:10:46) - The muck
Links: United States National Security Memorandum on AI; Fact Sheet on the National...
2024-11-06, 1h 16m

muckrAIkers: Understanding Claude 3.5 Sonnet (New)
Frontier developers continue their war on sane versioning schema to bring us Claude 3.5 Sonnet (New), along with "computer use" capabilities. We discuss not only the new model, but also why Anthropic may have released this model and tool combination now.
Chapters: (00:00) - Intro (00:22) - Hot off the press (05:03) - Claude 3.5 Sonnet (New) Two 'o' 3000 (09:23) - Breaking down "computer use" (13:16) - Our understanding (16:03) - Diverging business models (32:07) - Why has Anthropic chosen this strategy? (43:14) - Changing the frame (48:00) - Polishing the lily
Links: Anthropic press release - Introducing Claude 3.5 Sonnet (New); Model Card Addendum...
2024-10-30, 1h 00m

muckrAIkers: Winter is Coming for OpenAI
Brace yourselves, winter is coming for OpenAI; at least, that's what we think. In this episode we look at OpenAI's recent massive funding round and ask "why would anyone want to fund a company that is set to lose a net 5 billion USD for 2024?" We scrape through a whole lot of muck to find the meaningful signals in all this news, and there is a lot of it, so get ready!
Chapters: (00:00) - Intro (00:28) - Hot off the press (02:43) - Why listen? (06:07) - Why might VCs invest? (15:52) - What are people saying (23:10) - How *is* OpenAI making...
2024-10-22, 1h 22m

muckrAIkers: Open Source AI and 2024 Nobel Prizes
The Open Source AI Definition is out after years of drafting; will it reestablish brand meaning for the "Open Source" term? Also, the 2024 Nobel Prizes in Physics and Chemistry are heavily tied to AI; we scrutinize not only this year's prizes, but also Nobel Prizes as a concept.
Chapters: (00:00) - Intro (00:30) - Hot off the press (03:45) - Open Source AI background (10:30) - Definitions and changes in RC1 (18:36) - "Business source" (22:17) - Parallels with legislation (26:22) - Impacts of the OSAID (33:58) - 2024 Nobel Prize context (37:21) - Chemistry prize (45:06) - Physics prize (50:29) - Takeaways (52:03) - What's the real muck? (01:00:27) - Outro
2024-10-16, 1h 01m

muckrAIkers: SB1047
Why is Mark Ruffalo talking about SB1047, and what is it anyway? Tune in for our thoughts on the now-vetoed California legislation that had Big Tech scared.
Chapters: (00:00) - Intro (00:31) - Updates from a relatively slow week (03:32) - Disclaimer: SB1047 vetoed during recording (still worth a listen) (05:24) - What is SB1047 (12:30) - Definitions (17:18) - Understanding the bill (28:42) - What are the players saying about it? (46:44) - Addressing critiques (55:59) - Open Source (01:02:36) - Takeaways (01:15:40) - Clarification on impact to big tech (01:18:51) - Outro
Links: SB1047 legislation page; SB1047 CalMatters page; Newsom vetoes SB1047; CAIS newsletter on SB1047; Prominent AI...
2024-09-30, 1h 19m

muckrAIkers: OpenAI's o1, aka. Strawberry
OpenAI's new model is out, and we are going to have to rake through a lot of muck to get the value out of this one!
⚠ Opt out of LinkedIn's GenAI scraping ➡️ https://lnkd.in/epziUeTi
Chapters: (00:00) - Intro (00:25) - Other recent news (02:57) - Hot off the press (03:58) - Why might someone care? (04:52) - What is it? (06:49) - How is it being sold? (10:45) - How do they explain it, technically? (27:09) - Reflection AI drama (40:19) - Why do we care? (46:39) - Scraping away the muck
Note: at around 32 minutes, Igor says the incorrect Llama model versio...
2024-09-23, 50 min

Into AI Safety: INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk
The almost-Dr. Igor Krawczuk joins me for what is the equivalent of four of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more? If you're interested in connecting with Igor, head on over to his website, or check out placeholder for thesis (it isn't published yet). Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here: the best article you'll ever read on Open Source AI; the best article you'll ever read on emergence in ML; Kate Crawford's...
2024-06-03, 2h 58m

Machine Learning Street Talk (MLST): #99 - CARLA CREMER & IGOR KRAWCZUK - X-Risk, Governance, Effective Altruism
YT version (with references): https://www.youtube.com/watch?v=lxaTinmKxs0
Support us! https://www.patreon.com/mlst
MLST Discord: https://discord.gg/aNPkGUQtc5
Carla Cremer and Igor Krawczuk argue that AI risk should be understood as an old problem of politics, power and control with known solutions, and that threat models should be driven by empirical work. The interaction between FTX and the Effective Altruism community has sparked a lot of discussion about the dangers of optimization, and Carla's Vox article highlights the need for an institutional turn when...
2023-02-05, 1h 39m