Showing episodes and shows of
Daniel Filan
Shows
LessWrong (30+ Karma)
“45 - Samuel Albanie on DeepMind’s AGI Safety Approach” by DanielFilan
YouTube link In this episode, I chat with Samuel Albanie about the Google DeepMind paper he co-authored called “An Approach to Technical AGI Safety and Security”. It covers the assumptions made by the approach, as well as the types of mitigations it outlines. Topics we discuss: DeepMind's Approach to Technical AGI Safety and Security; Current paradigm continuation; No human ceiling; Uncertain timelines; Approximate continuity and the potential for accelerating capability improvement; Misuse and misalignment; Societal readiness; Misuse mitigations; Misalignment mitigations; Samuel's thinking about technical AGI safety; Following Samuel's work. Daniel Filan (00:00:09): Hello, everybody. In t...
2025-07-07
1h 17
AXRP - the AI X-risk Research Podcast
45 - Samuel Albanie on DeepMind's AGI Safety Approach
In this episode, I chat with Samuel Albanie about the Google DeepMind paper he co-authored called "An Approach to Technical AGI Safety and Security". It covers the assumptions made by the approach, as well as the types of mitigations it outlines. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/07/06/episode-45-samuel-albanie-deepminds-agi-safety-approach.html Topics we discuss, and timestamps: 0:00:37 DeepMind's Approach to Technical AGI Safety and Security 0:04:29 Current paradigm continuation 0:19:13 No human ceiling 0:21:22 Uncertain timelines
2025-07-07
1h 15
AXRP - the AI X-risk Research Podcast
44 - Peter Salib on AI Rights for Human Safety
In this episode, I talk with Peter Salib about his paper "AI Rights for Human Safety", arguing that giving AIs the right to contract, hold property, and sue people will reduce the risk of their trying to attack humanity and take over. He also tells me how law reviews work, in the face of my incredulity. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/06/28/episode-44-peter-salib-ai-rights-human-safety.html Topics we discuss, and timestamps: 0:00:40 Why AI rights 0:18:34 Why not r...
2025-06-28
3h 21
AXRP - the AI X-risk Research Podcast
43 - David Lindner on Myopic Optimization with Non-myopic Approval
In this episode, I talk with David Lindner about Myopic Optimization with Non-myopic Approval, or MONA, which attempts to address (multi-step) reward hacking by myopically optimizing actions against a human's sense of whether those actions are generally good. Does this work? Can we get smarter-than-human AI this way? How does this compare to approaches like conservativism? Listen to find out. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/06/15/episode-43-david-lindner-mona.html Topics we discuss, and timestamps: 0:00:29 What MONA is
2025-06-15
1h 40
AXRP - the AI X-risk Research Podcast
42 - Owain Evans on LLM Psychology
Earlier this year, the paper "Emergent Misalignment" made the rounds on AI x-risk social media for seemingly showing LLMs generalizing from 'misaligned' training data of insecure code to acting comically evil in response to innocuous questions. In this episode, I chat with one of the authors of that paper, Owain Evans, about that research as well as other work he's done to understand the psychology of large language models. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/06/06/episode-42-owain-evans-llm-psychology.html Topics w...
2025-06-06
2h 14
AXRP - the AI X-risk Research Podcast
41 - Lee Sharkey on Attribution-based Parameter Decomposition
What's the next step forward in interpretability? In this episode, I chat with Lee Sharkey about his proposal for detecting computational mechanisms within neural networks: Attribution-based Parameter Decomposition, or APD for short. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/06/03/episode-41-lee-sharkey-attribution-based-parameter-decomposition.html Topics we discuss, and timestamps: 0:00:41 APD basics 0:07:57 Faithfulness 0:11:10 Minimality 0:28:44 Simplicity 0:34:50 Concrete-ish examples of APD 0:52:00 Which parts of APD are canonical 0:58:10 Hyperparameter selection 1:06:40 A...
2025-06-03
2h 16
AXRP - the AI X-risk Research Podcast
40 - Jason Gross on Compact Proofs and Interpretability
How do we figure out whether interpretability is doing its job? One way is to see if it helps us prove things about models that we care about knowing. In this episode, I speak with Jason Gross about his agenda to benchmark interpretability in this way, and his exploration of the intersection of proofs and modern machine learning. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/03/28/episode-40-jason-gross-compact-proofs-interpretability.html Topics we discuss, and timestamps: 0:00:40 - Why compact proofs
2025-03-28
2h 36
AXRP - the AI X-risk Research Podcast
38.8 - David Duvenaud on Sabotage Evaluations and the Post-AGI Future
In this episode, I chat with David Duvenaud about two topics he's been thinking about: firstly, a paper he wrote about evaluating whether or not frontier models can sabotage human decision-making or monitoring of the same models; and secondly, the difficult situation humans find themselves in in a post-AGI future, even if AI is aligned with human intentions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/03/01/episode-38_8-david-duvenaud-sabotage-evaluations-post-agi-future.html FAR.AI: https://far.ai/ FAR.AI on X (aka T...
2025-03-01
20 min
AXRP - the AI X-risk Research Podcast
38.7 - Anthony Aguirre on the Future of Life Institute
The Future of Life Institute is one of the oldest and most prominent organizations in the AI existential safety space, working on such topics as the AI pause open letter and how the EU AI Act can be improved. Metaculus is one of the premier forecasting sites on the internet. Behind both of them lies one man: Anthony Aguirre, whom I talk with in this episode. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/02/09/episode-38_7-anthony-aguirre-future-of-life-institute.html FAR.AI: https://far.ai/
2025-02-09
22 min
AXRP - the AI X-risk Research Podcast
38.6 - Joel Lehman on Positive Visions of AI
Typically this podcast talks about how to avert destruction from AI. But what would it take to ensure AI promotes human flourishing as well as it can? Is alignment to individuals enough, and if not, where do we go from here? In this episode, I talk with Joel Lehman about these questions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/01/24/episode-38_6-joel-lehman-positive-visions-of-ai.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR...
2025-01-25
15 min
AXRP - the AI X-risk Research Podcast
38.5 - Adrià Garriga-Alonso on Detecting AI Scheming
Suppose we're worried about AIs engaging in long-term plans that they don't tell us about. If we were to peek inside their brains, what should we look for to check whether this was happening? In this episode Adrià Garriga-Alonso talks about his work trying to answer this question. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Transcript: https://axrp.net/episode/2025/01/20/episode-38_5-adria-garriga-alonso-detecting-ai-scheming.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR.AI on YouTube: https://ww...
2025-01-20
27 min
AXRP - the AI X-risk Research Podcast
38.4 - Shakeel Hashim on AI Journalism
AI researchers often complain about the poor coverage of their work in the news media. But why is this happening, and how can it be fixed? In this episode, I speak with Shakeel Hashim about the resource constraints facing AI journalism, the disconnect between journalists' and AI researchers' views on transformative AI, and efforts to improve the state of AI journalism, such as Tarbell and Shakeel's newsletter, Transformer. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2025/01/05/episode-38_4-shakeel-hashim-ai-journalism.html FAR.AI: https...
2025-01-05
24 min
AXRP - the AI X-risk Research Podcast
38.4 - Peter Barnett on Technical Governance at MIRI
The Machine Intelligence Research Institute has recently shifted its focus to "technical governance". But what is that actually, and what are they doing? In this episode, I chat with Peter Barnett about his team's work on studying what evaluations can and cannot do, as well as verifying international agreements on AI development. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/12/14/episode-38_4-peter-barnett-technical-governance-at-miri.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch F...
2024-12-14
20 min
AXRP - the AI X-risk Research Podcast
38.3 - Erik Jenner on Learned Look-Ahead
Lots of people in the AI safety space worry about models being able to make deliberate, multi-step plans. But can we already see this in existing neural nets? In this episode, I talk with Erik Jenner about his work looking at internal look-ahead within chess-playing neural networks. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/12/12/episode-38_3-erik-jenner-learned-look-ahead.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR.AI on YouTube: https://w...
2024-12-12
23 min
AXRP - the AI X-risk Research Podcast
39 - Evan Hubinger on Model Organisms of Misalignment
The 'model organisms of misalignment' line of research creates AI models that exhibit various types of misalignment, and studies them to try to understand how the misalignment occurs and whether it can be somehow removed. In this episode, Evan Hubinger talks about two papers he's worked on at Anthropic under this agenda: "Sleeper Agents" and "Sycophancy to Subterfuge". Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/12/01/episode-39-evan-hubinger-model-organisms-misalignment.html Topics we discuss, and timestamps: 0:00:36 - Model organisms and s...
2024-12-01
1h 45
AXRP - the AI X-risk Research Podcast
38.2 - Jesse Hoogland on Singular Learning Theory
You may have heard of singular learning theory, and its "local learning coefficient", or LLC - but have you heard of the refined LLC? In this episode, I chat with Jesse Hoogland about his work on SLT, and using the refined LLC to find a new circuit in language models. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/11/27/38_2-jesse-hoogland-singular-learning-theory.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR.AI...
2024-11-27
18 min
AXRP - the AI X-risk Research Podcast
38.1 - Alan Chan on Agent Infrastructure
Road lines, street lights, and licence plates are examples of infrastructure used to ensure that roads operate smoothly. In this episode, Alan Chan talks about using similar interventions to help avoid bad outcomes from the deployment of AI agents. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/11/16/episode-38_1-alan-chan-agent-infrastructure.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR.AI on YouTube: https://www.youtube.com/@FARAIResearch The Alignment W...
2024-11-17
24 min
AXRP - the AI X-risk Research Podcast
38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
Do language models understand the causal structure of the world, or do they merely note correlations? And what happens when you build a big AI society out of them? In this brief episode, recorded at the Bay Area Alignment Workshop, I chat with Zhijing Jin about her research on these questions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/11/14/episode-38_0-zhijing-jin-llms-causality-multi-agent-systems.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR...
2024-11-14
22 min
Mutual Understanding
Shea Levy on why he disagrees with Less Wrong rationality, Part 1
In this podcast, Shea and I tried to hunt down a philosophical disagreement we seem to have by diving into his critique of rationality. We went off on what may or may not have been a big tangent about Internal Family Systems therapy, which I’m a big fan of, and which I think Shea thinks should have more caveats? Unfortunately, our conversation got cut short because partway through, Shea got a call and had to deal with some stuff. We hope to record a Part 2 soon! Transcript: Divia (00:01) Hey, I'm he...
2024-10-11
1h 23
AXRP - the AI X-risk Research Podcast
37 - Jaime Sevilla on AI Forecasting
Epoch AI is the premier organization that tracks the trajectory of AI - how much compute is used, the role of algorithmic improvements, the growth in data used, and when the above trends might hit an end. In this episode, I speak with the director of Epoch AI, Jaime Sevilla, about how compute, data, and algorithmic improvements are impacting AI, and whether continuing to scale can get us AGI. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/10/04/episode-37-jaime-sevilla-forecasting-ai.html T...
2024-10-04
1h 44
AXRP - the AI X-risk Research Podcast
36 - Adam Shai and Paul Riechers on Computational Mechanics
Sometimes, people talk about transformers as having "world models" as a result of being trained to predict text data on the internet. But what does this even mean? In this episode, I talk with Adam Shai and Paul Riechers about their work applying computational mechanics, a sub-field of physics studying how to predict random processes, to neural networks. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/09/29/episode-36-adam-shai-paul-riechers-computational-mechanics.html Topics we discuss, and timestamps: 0:00:42 - What computational mechanics i...
2024-09-29
1h 48
AXRP - the AI X-risk Research Podcast
New Patreon tiers + MATS applications
Patreon: https://www.patreon.com/axrpodcast MATS: https://www.matsprogram.org Note: I'm employed by MATS, but they're not paying me to make this video.
2024-09-28
05 min
Mutual Understanding
In what sense are there coherence theorems?
In this episode, Daniel Filan and I talk about Elliot Thornley’s LessWrong post There are no coherence theorems. Some other LessWrong posts we reference include: * A stylized dialogue on John Wentworth's claims about markets and optimization * Why Not Subagents Transcript: Divia (00:03) I'm here today with Elliot Thornley, who goes by EJT on LessWrong, and Daniel Filan, and Elliot is currently a postdoc at the Global Priorities Institute working on this sort of AI stuff and also some global population work. And at the end we're goi...
2024-09-20
1h 40
AXRP - the AI X-risk Research Podcast
35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
How do we figure out what large language models believe? In fact, do they even have beliefs? Do those beliefs have locations, and if so, can we edit those locations to change the beliefs? Also, how are we going to get AI to perform tasks so hard that we can't figure out if they succeeded at them? In this episode, I chat with Peter Hase about his research into these questions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/08/24/episode-35-peter-hase-llm-beliefs-easy-to-hard-generalization.html
2024-08-25
2h 17
AXRP - the AI X-risk Research Podcast
34 - AI Evaluations with Beth Barnes
How can we figure out if AIs are capable enough to pose a threat to humans? When should we make a big effort to mitigate risks of catastrophic AI misbehaviour? In this episode, I chat with Beth Barnes, founder of and head of research at METR, about these questions and more. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/07/28/episode-34-ai-evaluations-beth-barnes.html Topics we discuss, and timestamps: 0:00:37 - What is METR? 0:02:44 - What is an "eval"? 0:14:42 - H...
2024-07-28
2h 14
Europa's Children with Kenaz Filan
When East Meets West 20
In this episode we’re joined by Daniel D. of A Ghost in the Machine. Daniel recently wrote a great piece about the death of the American civic religion. We talk about that article and other pertinent DOOM topics. Ahnaf Ibn Qais, Daniel D., Kenaz Filan. Get full access to Notes from the End of Time with Kenaz Filan at www.notesfromtheendofti.me/subscribe
2024-07-07
1h 18
AXRP - the AI X-risk Research Podcast
33 - RLHF Problems with Scott Emmons
Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this approach when the models are smarter than the humans providing feedback. In this episode, I talk with Scott Emmons about his work categorizing the problems that can show up in this setting. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/06/12/episode-33-rlhf-problems-scott-emmons.html Topics we discuss, and timestamps: 0:00:33 - Deceptive...
2024-06-12
1h 41
AXRP - the AI X-risk Research Podcast
32 - Understanding Agency with Jan Kulveit
What's the difference between a large language model and the human brain? And what's wrong with our theories of agency? In this episode, I chat about these questions with Jan Kulveit, who leads the Alignment of Complex Systems research group. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast The transcript: axrp.net/episode/2024/05/30/episode-32-understanding-agency-jan-kulveit.html Topics we discuss, and timestamps: 0:00:47 - What is active inference? 0:15:14 - Preferences in active inference 0:31:33 - Action vs perception in active inference 0:46:07 - Feedback loops 1:01:32...
2024-05-30
2h 22
AXRP - the AI X-risk Research Podcast
31 - Singular Learning Theory with Daniel Murfet
What's going on with deep learning? What sorts of models get learned, and what are the learning dynamics? Singular learning theory is a theory of Bayesian statistics broad enough in scope to encompass deep neural networks that may help answer these questions. In this episode, I speak with Daniel Murfet about this research program and what it tells us. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: 0:00:26 - What is singular learning theory? 0:16:00 - Phase transitions 0:35:12 - Estimating the local learning coefficient
2024-05-07
2h 32
AXRP - the AI X-risk Research Podcast
30 - AI Security with Jeffrey Ladish
Top labs use various forms of "safety training" on models before their release to make sure they don't do nasty stuff - but how robust is that? How can we ensure that the weights of powerful AIs don't get leaked or stolen? And what can AI even do these days? In this episode, I speak with Jeffrey Ladish about security and AI. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: 0:00:38 - Fine-tuning away safety training 0:13:50 - Dangers of open LLMs vs internet search
2024-04-30
2h 15
AXRP - the AI X-risk Research Podcast
29 - Science of Deep Learning with Vikrant Varma
In 2022, it was announced that a fairly simple method can be used to extract the true beliefs of a language model on any given topic, without having to actually understand the topic at hand. Earlier, in 2021, it was announced that neural networks sometimes 'grok': that is, when training them on certain tasks, they initially memorize their training data (achieving their training goal in a way that doesn't generalize), but then suddenly switch to understanding the 'real' solution in a way that generalizes. What's going on with these discoveries? Are they all they're cracked up to be, and if so...
2024-04-25
2h 13
The Filan Cabinet
14 - The 2024 Eclipse
In this episode, I give you updates from my trip with friends to see the 2024 total solar eclipse. Questions answered include: - Why are we bothering to go see it? - How many of us will fail to make it to the eclipse? - Does it actually get darker during a total solar eclipse, or is that just an optical illusion? - What moral dilemma will we face, and what will we do? - Whose lav mic will mysteriously fail to work during their interview?
2024-04-25
1h 33
AXRP - the AI X-risk Research Podcast
28 - Suing Labs for AI Risk with Gabriel Weil
How should the law govern AI? Those concerned about existential risks often push either for bans or for regulations meant to ensure that AI is developed safely - but another approach is possible. In this episode, Gabriel Weil talks about his proposal to modify tort law to enable people to sue AI companies for disasters that are "nearly catastrophic". Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: 0:00:35 - The basic idea 0:20:36 - Tort law vs regulation 0:29:10 - Weil's proposal vs H...
2024-04-17
1h 57
AXRP - the AI X-risk Research Podcast
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
A lot of work to prevent AI existential risk takes the form of ensuring that AIs don't want to cause harm or take over the world---or in other words, ensuring that they're aligned. In this episode, I talk with Buck Shlegeris and Ryan Greenblatt about a different approach, called "AI control": ensuring that AI systems couldn't take over the world, even if they were trying to. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: 0:00:31 - What is AI control? 0:16:16 - Protocols for A...
2024-04-11
2h 56
Orkan Varan ile Sinema Minema
It would be rude to just call him the man who played Voldemort! | Who is Ralph Fiennes?
Produced & Presented by: Orkan Varan
2024-02-18
14 min
Pigeon Hour
Best of Pigeon Hour
Table of contents Note: links take you to the corresponding section below; links to the original episode can be found there. * Laura Duffy solves housing, ethics, and more [00:01:16] * Arjun Panickssery solves books, hobbies, and blogging, but fails to solve the Sleeping Beauty problem because he's wrong on that one [00:10:47] * Nathan Barnard on how financial regulation can inform AI regulation [00:17:16] * Winston Oswald-Drummond on the tractability of reducing s-risk, ethics, and more [00:27:48] * Nathan Barnard (again!) on why general intelligence is basically fake [00:34:10] * Daniel Filan on why I'm...
2024-01-24
1h 47
AXRP - the AI X-risk Research Podcast
26 - AI Governance with Elizabeth Seger
The events of this year have highlighted important questions about the governance of artificial intelligence. For instance, what does it mean to democratize AI? And how should we balance benefits and dangers of open-sourcing powerful AI systems such as large language models? In this episode, I speak with Elizabeth Seger about her research on these questions. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: - 0:00:40 - What kinds of AI? - 0:01:30 - Democratizing AI - 0:04:44 - How people talk about democ...
2023-11-26
1h 57
Knižný kompas | Podcast o knihách a čítaní
The legendary ELÁN in a unique book. Jožo Ráž, Jano Baláž, Boris Filan...
Listen to a report about the unique Elán book, discussed by: Jožo Ráž and Jano Baláž; lyricists Boris Filan and Ľuboš Zeman; the book's author Marcela Titzlová; Elán's manager Karolína "Karotka" Halenárová; Alan Lesyk, creator of the band's "bubák" mascot; Elán's external memory Oskar Lehotský; Gabriela Belopotocká and Marek Néma from the publishing house Ikar; and Roman Bomboš, presenter and music journalist. The Elán book was released in a regular trade edition HERE, or in a luxury edition limited to just 333 copies, which you can find only at LuxusnáKnižnica.sk. More book tips in the podca...
2023-11-20
59 min
The Filan Cabinet
13 - Aaron Silverbook on anti-cavity bacteria
In this episode, I speak with Aaron Silverbook about the bacteria that cause cavities, and how different bacteria can prevent them: specifically, a type of bacterium that you can buy at luminaprobiotic.com. This podcast episode has not been approved by the FDA. Specific topics we talk about include: How do bacteria cause cavities? How can you create an anti-cavity bacterium? What's going on with the competitive landscape of mouth bacteria? How dangerous is it to colonize your mouth with a novel bacterium? Why hasn't this product been available for 20 years already? Lumina Probiotic (the brand name...
2023-11-20
49 min
AXRP - the AI X-risk Research Podcast
25 - Cooperative AI with Caspar Oesterheld
Imagine a world where there are many powerful AI systems, working at cross purposes. You could suppose that different governments use AIs to manage their militaries, or simply that many powerful AIs have their own wills. At any rate, it seems valuable for them to be able to cooperatively work together and minimize pointless conflict. How do we ensure that AIs behave this way - and what do we need to learn about how rational agents interact to make that more clear? In this episode, I'll be speaking with Caspar Oesterheld about some of his research on this very...
2023-10-03
3h 02
The Filan Cabinet
12 - Holly Elmore on AI pause
In this episode, I talk to Holly Elmore about her advocacy around AI Pause - encouraging governments to pause the development of more and more powerful AI. Topics we discuss include: Why advocate specifically for AI pause? What costs of AI pause would be worth it? What might AI pause look like? What are the realistic downsides of AI pause? How the Effective Altruism community relates to AI labs. The shift in the alignment community from proving things about alignment to messing around with ML models. Holly's X (twitter) account PauseAI discord
2023-09-13
1h 29
Pigeon Hour
#6 Daniel Filan on why I'm wrong about ethics (+ Oppenheimer and what names mean in like a hardcore phil of language sense)
Note: the core discussion on ethics begins at 7:58 and moves into philosophy of language at ~1:12:19 Blurb and bulleted summary from Clong: This wide-ranging conversation between Daniel and Aaron touches on movies, business drama, philosophy of language, ethics and legal theory. The two debate major ethical concepts like utilitarianism and moral realism. Thought experiments around rational beings choosing to undergo suffering feature prominently. Meandering tangents explore the semantics of names and references. Aaron asserts that total utilitarianism does not imply that any amount of suffering can be morally justified by creating more happiness. His argument...
2023-08-07
2h 05
Pigeon Hour
#6 Daniel Filan on why I'm wrong about ethics (+ Oppenheimer and what names mean in like a hardcore phil of language sense)
Listen on: * Spotify * Apple Podcasts * Google Podcasts Note: the core discussion on ethics begins at 7:58 and moves into philosophy of language at ~1:12:19 Daniel’s stuff: * AI X-risk podcast * The Filan Cabinet podcast * Personal website and blog Blurb and bulleted summary from Clong This wide-ranging conversation between Daniel and Aaron touches on movies, business drama, philosophy of language, ethics and legal theory. The two debate major ethical concepts like utilitarianism and moral realism. Thought experiments around rational beings choosing to un...
2023-08-07
2h 05
AXRP - the AI X-risk Research Podcast
24 - Superalignment with Jan Leike
Recently, OpenAI made a splash by announcing a new "Superalignment" team. Led by Jan Leike and Ilya Sutskever, the team would consist of top researchers, attempting to solve alignment for superintelligent AIs in four years by figuring out how to build a trustworthy human-level AI alignment researcher, and then using it to solve the rest of the problem. But what does this plan actually involve? In this episode, I talk to Jan Leike about the plan and the challenges it faces. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Episode art by Hamish...
2023-07-27
2h 08
AXRP - the AI X-risk Research Podcast
23 - Mechanistic Anomaly Detection with Mark Xu
Is there some way we can detect bad behaviour in our AI system without having to know exactly what it looks like? In this episode, I speak with Mark Xu about mechanistic anomaly detection: a research direction based on the idea of detecting strange things happening in neural networks, in the hope that that will alert us of potential treacherous turns. We both talk about the core problems of relating these mechanistic anomalies to bad behaviour, as well as the paper "Formalizing the presumption of independence", which formulates the problem of formalizing heuristic mathematical reasoning, in the hope that...
2023-07-27
2h 05
AXRP - the AI X-risk Research Podcast
Survey, store closing, Patreon
Very brief survey: bit.ly/axrpsurvey2023 Store is closing in a week! Link: store.axrp.net/ Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast
2023-06-29
04 min
AXRP - the AI X-risk Research Podcast
22 - Shard Theory with Quintin Pope
What can we learn about advanced deep learning systems by understanding how humans learn and form values over their lifetimes? Will superhuman AI look like ruthless coherent utility optimization, or more like a mishmash of contextually activated desires? This episode's guest, Quintin Pope, has been thinking about these questions as a leading researcher in the shard theory community. We talk about what shard theory is, what it says about humans and neural networks, and what the implications are for making AI safe. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Episode art by...
2023-06-15
3h 28
AXRP - the AI X-risk Research Podcast
21 - Interpretability for Engineers with Stephen Casper
Lots of people in the field of machine learning study 'interpretability', developing tools that they say give us useful information about neural networks. But how do we know if meaningful progress is actually being made? What should we want out of these tools? In this episode, I speak to Stephen Casper about these questions, as well as about a benchmark he's co-developed to evaluate whether interpretability tools can find 'Trojan horses' hidden inside neural nets. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast Topics we discuss, and timestamps: - 00:00:42...
2023-05-02
1h 56
The Filan Cabinet
11 - Divia Eden and Ronny Fernandez on the orthogonality thesis
In this episode, Divia Eden and Ronny Fernandez talk about the (strong) orthogonality thesis - that arbitrarily smart intelligences can be paired with arbitrary goals, without additional complication beyond that of specifying the goal - with light prompting from me. Topics they touch on include: Why aren't bees brilliant scientists? Can you efficiently make an AGI out of one part that predicts the future conditioned on some plans, and another that evaluates whether plans are good? If minds are made of smaller sub-agents with more primitive beliefs and desires, does that shape their terminal goals? Also, how would...
2023-04-28
2h 37
The Filan Cabinet
10 - Jeffrey Heninger on Mormonism
In this episode I chat with Jeffrey Heninger about his religious beliefs and practices as a member of the Church of Jesus Christ of Latter-day Saints, sometimes colloquially referred to as "the Mormon church" or "the LDS church". Topics we talk about include: Who or what is God? How can we know things about God? In particular, what role does religious experience play? To what degree is modern morality downstream of Jesus? What's in the Book of Mormon? What does modern-day prophecy look like? What do Sunday services look like in the LDS church? What happens after you...
2023-04-15
2h 34
AXRP - the AI X-risk Research Podcast
20 - 'Reform' AI Alignment with Scott Aaronson
How should we scientifically think about the impact of AI on human civilization, and whether or not it will doom us all? In this episode, I speak with Scott Aaronson about his views on how to make progress in AI alignment, as well as his work on watermarking the output of language models, and how he moved from a background in quantum complexity theory to working on AI. Note: this episode was recorded before this story (vice.com/en/article/pkadgm/man-dies-by-suicide-after-talking-with-ai-chatbot-widow-says) emerged of a man committing suicide after discussions with a language-model-based chatbot, that i...
2023-04-12
2h 27
The Filan Cabinet
9 - Effective Altruism Global: Bay Area (2023)
Every year, the Centre for Effective Altruism runs a number of "Effective Altruism Global" (EA Global or EAG for short) conferences throughout the world. This year, I attended the one held in the San Francisco Bay Area, and talked to a variety of participants about their relationship with effective altruism, the community around that idea, and the conference. Timestamps: 00:00:16 - interview 1 00:07:06 - interview 2 00:15:46 - interview 3 00:22:35 - interview 4 00:31:22 - interview 5 00:38:30 - interview 6 00:44:18 - interview 7 00:48:59 - interview 8 00:53:14 - interview 9 00:56:22 - interview 10 01:01:08 - interview 11 01:06:50 - interview 12 Website for EA Global conferences
2023-03-13
1h 15
The Filan Cabinet
8 - John Halstead on climate doom
In this episode I chat with John Halstead about whether climate change will kill us all. He thinks it won't. Topics we talk about include: How did the effective altruism community come to have someone dedicated to the question of whether climate change will kill us all? How bad will climate change likely be? How is the role of carbon dioxide in the atmosphere different from that of other greenhouse gasses? How big would a volcano have to be to warm up the world by 10 degrees Celsius? How concerned should we be about climate change as a...
2023-03-12
1h 34
The Filan Cabinet
7 - Shea Levy on Objectivism
In this episode I speak with Shea Levy about Ayn Rand's philosophy of Objectivism, and what it has to say about ethics and epistemology. Topics we talk about include: What is Objectivism? Can you be an Objectivist and disagree with Ayn Rand? What's the Objectivist theory of aesthetics? Why isn't there a biography of Ayn Rand approved of by orthodox Objectivists? What's so bad about altruism, or views like utilitarianism? What even is selfishness? Can we be mistaken about what we perceive? If so, how? What is consciousness? Could it just be computation? Note that the episode...
2023-02-14
2h 50
AXRP - the AI X-risk Research Podcast
Store, Patreon, Video
Store: https://store.axrp.net/ Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Video: https://www.youtube.com/watch?v=kmPFjpEibu0
2023-02-07
02 min
The Filan Cabinet
6 - Oliver Habryka on LessWrong and other projects
In this episode I speak with Oliver Habryka, head of Lightcone Infrastructure, the organization that runs the internet forum LessWrong, about his projects in the rationality and existential risk spaces. Topics we talk about include: How did LessWrong get revived? How good is LessWrong? Is there anything that beats essays for making intellectual contributions on the internet? Why did the team behind LessWrong pivot to property development? What does the FTX situation tell us about the wider LessWrong and Effective Altruism communities? What projects could help improve the world's rationality? Oli on LessWrong Oli on...
2023-02-05
1h 58
AXRP - the AI X-risk Research Podcast
19 - Mechanistic Interpretability with Neel Nanda
How good are we at understanding the internal computation of advanced machine learning models, and do we have a hope at getting better? In this episode, Neel Nanda talks about the sub-field of mechanistic interpretability research, as well as papers he's contributed to that explore the basics of transformer circuits, induction heads, and grokking. Topics we discuss, and timestamps: - 00:01:05 - What is mechanistic interpretability? - 00:24:16 - Types of AI cognition - 00:54:27 - Automating mechanistic interpretability - 01:11:57 - Summarizing the papers - 01:24:43 - 'A Mathematical Framew...
2023-02-04
3h 52
The Filan Cabinet
5 - Divia Eden on operant conditioning
In this episode, I speak with Divia Eden about operant conditioning, and how relevant it is to human and non-human animal behaviour. Topics we cover include: How close are we to teaching grammar to dogs? What are the important differences between human and dog cognition? How important are unmodelled "trainer effects" in dog training? Why do people underrate positive reinforcement? How does operant conditioning relate to attachment theory? How much does successful dog training rely on the trainer being reinforced by the dog? Why is game theory so fake? Is everything really just about calmness? Divia's twitter...
2023-01-15
2h 33
The Adam Sank Show
LAST ASS: Sanks for the Memories
Our monster two-hour and 20-minute send-off! Featuring Steve Chazaro, J.B. Bercy, Ryan Frostig, Drew Lausch, Michelle Buteau, Julie Halston, Daniel Reichard, Patrick McCollum, Joanne Filan, Irene Bremis, Joey DeGrandis, Hunter Foster, Jennifer Cody, Stone & Stone, Frank DeCaro, Micheal Rice, Glenn Scarpelli and Stephen Wallem! With special appearances by Rocco Steele and two-time ghost guest Justin Utley! Plus, a final update on Natalia, the orphan dwarf. Make sure you stick around 'til the very end to hear all the listener voicemails. Thank you, and good night. Visit https://linktr.ee/AdamSank
2023-01-02
2h 19
The Filan Cabinet
4 - Peter Jaworski on paid plasma donation
In this episode, Peter Jaworski talks about the practice of paid plasma donation, whether it's ethical to allow it, and his work to advocate for it to be legalized in more jurisdictions. He answers questions such as: Which country used to run clinics in a former colony to pay their former colonial subjects for their plasma? Why can't we just synthesize what we need out of plasma? What percentage of US exports by dollar value does plasma account for? If I want to gather plasma, is it cheaper to pay donors, or not pay them? Is legal paid...
2022-11-08
1h 34
AXRP - the AI X-risk Research Podcast
New podcast - The Filan Cabinet
I have a new podcast, where I interview whoever I want about whatever I want. It's called "The Filan Cabinet", and you can find it wherever you listen to podcasts. The first three episodes are about pandemic preparedness, God, and cryptocurrency. For more details, check out the podcast website (thefilancabinet.com), or search "The Filan Cabinet" in your podcast app.
2022-10-13
01 min
The Filan Cabinet
3 - Ameen Soleimani on cryptocurrency
In this episode, cryptocurrency developer Ameen Soleimani talks about his vision of the cryptocurrency ecosystem, as well as his current project RAI: an ether-backed floating-price stablecoin. He answers questions such as: What's the point of cryptocurrency? If this is the beginning of the cryptocurrency world, what will the middle be? What would the sign be that cryptocurrency is working? How does RAI work? Does the design of RAI make it impossible for it to be widely used? What's wrong with how the US dollar works? Ameen on twitter: https://twitter.com/ameensol Reflexer Finance: https...
2022-09-18
1h 22
The Filan Cabinet
2 - Wayne Forkner on God
In this episode, Presbyterian Pastor Wayne Forkner talks about God, Christianity, and the Bible. He answers questions such as: What is 'God'? Why do people talk about Jesus so much more than the Father or the Holy Spirit? What is heaven actually like? If justification is by faith alone and not by works, why does the Bible say "A person is justified by works and not by faith alone"? How can people tell that out of all the religions, Christianity is the right one? His church's website: https://www.berkeleyopc.org/ His podcast, Proclaiming the...
2022-09-18
3h 38
The Filan Cabinet
1 - Carrick Flynn on his congressional campaign
In this episode, Carrick Flynn talks about his campaign to be the Democratic nominee for Oregon's 6th congressional district. In particular, we talk about his policies on pandemic preparedness and semiconductor manufacturing. He answers questions such as: Was he surprised by the election results? Should we expect another Carrick campaign? What specific things should or could the government fund to limit the spread of pandemics? Why would those work? What is working at a semiconductor plant like? Carrick's campaign site: https://www.carrickflynnfororegon.com/ Andrea Salinas' campaign site: https://www.andreasalinasfororegon.com/
2022-09-18
1h 16
AXRP - the AI X-risk Research Podcast
18 - Concept Extrapolation with Stuart Armstrong
Concept extrapolation is the idea of taking concepts an AI has about the world - say, "mass" or "does this picture contain a hot dog" - and extending them sensibly to situations where things are different - like learning that the world works via special relativity, or seeing a picture of a novel sausage-bread combination. For a while, Stuart Armstrong has been thinking about concept extrapolation and how it relates to AI alignment. In this episode, we discuss where his thoughts are at on this topic, what the relationship to AI alignment is, and what the open questions are.
2022-09-04
1h 46
AXRP - the AI X-risk Research Podcast
17 - Training for Very High Reliability with Daniel Ziegler
Sometimes, people talk about making AI systems safe by taking examples where they fail and training them to do well on those. But how can we actually do this well, especially when we can't use a computer program to say what a 'failure' is? In this episode, I speak with Daniel Ziegler about his research group's efforts to try doing this with present-day language models, and what they learned. Listeners beware: this episode contains a spoiler for the Animorphs franchise around minute 41 (in the 'Fanfiction' section of the transcript). Topics we discuss, and t...
2022-08-22
1h 00
AXRP - the AI X-risk Research Podcast
16 - Preparing for Debate AI with Geoffrey Irving
Many people in the AI alignment space have heard of AI safety via debate - check out AXRP episode 6 (axrp.net/episode/2021/04/08/episode-6-debate-beth-barnes.html) if you need a primer. But how do we get language models to the stage where they can usefully implement debate? In this episode, I talk to Geoffrey Irving about the role of language models in AI safety, as well as three projects he's done that get us closer to making debate happen: using language models to find flaws in themselves, getting language models to back up claims they make with citations, and figuring...
2022-07-02
1h 04
AXRP - the AI X-risk Research Podcast
15 - Natural Abstractions with John Wentworth
Why does anybody care about natural abstractions? Do they somehow relate to math, or value learning? How do E. coli bacteria find sources of sugar? All these questions and more will be answered in this interview with John Wentworth, where we talk about his research plan of understanding agency via natural abstractions. Topics we discuss, and timestamps: - 00:00:31 - Agency in E. Coli - 00:04:59 - Agency in financial markets - 00:08:44 - Inferring agency in real-world systems - 00:16:11 - Selection theorems - 00:20:22 - Abstraction and natural abstractions - 00:32:42 - Info...
2022-05-23
1h 36
AXRP - the AI X-risk Research Podcast
14 - Infra-Bayesian Physicalism with Vanessa Kosoy
Late last year, Vanessa Kosoy and Alexander Appel published some research under the heading of "Infra-Bayesian physicalism". But wait - what was infra-Bayesianism again? Why should we care? And what does any of this have to do with physicalism? In this episode, I talk with Vanessa Kosoy about these questions, and get a technical overview of how infra-Bayesian physicalism works and what its implications are. Topics we discuss, and timestamps: - 00:00:48 - The basics of infra-Bayes - 00:08:32 - An invitation to infra-Bayes - 00:11:23 - What is naturalized induction? ...
2022-04-06
1h 47
AXRP - the AI X-risk Research Podcast
13 - First Principles of AGI Safety with Richard Ngo
How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment. Topics we discuss, and timestamps: - 00:00:40 - The nature of intelligence and AGI - 00:01:18 - The nature of intelligence - 00:06:09 - AGI: what and how ...
2022-03-31
1h 33
The Nonlinear Library: LessWrong Top Posts
The ground of optimization by alexflint
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The ground of optimization, published by alexflint on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This work was supported by OAK, a monastic community in the Berkeley hills. This document could not have been written without the daily love of living in this beautiful community. The work involved in writing this cannot be separated from the sitting, chanting, cooking, cleaning, crying, correcting, fundraising, listening, laughing, and teaching...
2021-12-12
42 min
The Nonlinear Library: LessWrong Top Posts
2018 Review: Voting Results! by Ben Pace
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 2018 Review: Voting Results! , published by Ben Pace on the AI Alignment Forum. The votes are in! 59 of the 430 eligible voters participated, evaluating 75 posts. Meanwhile, 39 users submitted a total of 120 reviews, with most posts getting at least one review. Thanks a ton to everyone who put in time to think about the posts - nominators, reviewers and voters alike. Several reviews substantially changed my mind about many topics and ideas, and I was quite grateful for...
2021-12-11
13 min
The Nonlinear Library: LessWrong Top Posts
Cryonics signup guide #1: Overview by mingyuan
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Cryonics signup guide #1: Overview , published by mingyuan on the AI Alignment Forum. This is the introduction to a sequence on signing up for cryonics. In the coming posts I will lay out what you need to do, concretely and in detail. This sequence is intended for people who already think signing up for cryonics is a good idea but are putting it off because they're not sure what they actually need to do next. I...
2021-12-11
09 min
The Nonlinear Library: Alignment Forum Top Posts
The ground of optimization by Alex Flint
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The ground of optimization, published by Alex Flint on the AI Alignment Forum. This work was supported by OAK, a monastic community in the Berkeley hills. This document could not have been written without the daily love of living in this beautiful community. The work involved in writing this cannot be separated from the sitting, chanting, cooking, cleaning, crying, correcting, fundraising, listening, laughing, and teaching of the whole community. What is optimization...
2021-12-10
43 min
AXRP - the AI X-risk Research Podcast
12 - AI Existential Risk with Paul Christiano
Why would advanced AI systems pose an existential risk, and what would it look like to develop safer systems? In this episode, I interview Paul Christiano about his views of how AI could be so dangerous, what bad AI scenarios could look like, and what he thinks about various techniques to reduce this risk. Topics we discuss, and timestamps: - 00:00:38 - How AI may pose an existential threat - 00:13:36 - AI timelines - 00:24:49 - Why we might build risky AI - 00:33:58 - Takeoff speeds - 00:51:33 - Why AI c...
2021-12-02
2h 49
The Nonlinear Library: Alignment Section
Alignment Newsletter #22 by Rohin Shah
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Alignment Newsletter #22, published by Rohin Shah on the AI Alignment Forum. Highlights AI Governance: A Research Agenda (Allan Dafoe): A comprehensive document about the research agenda at the Governance of AI Program. This is really long and covers a lot of ground so I'm not going to summarize it, but I highly recommend it, even if you intend to work primarily on technical work. Technical AI alignment Agent foundations Agents and Devices: A Relative Definition...
2021-11-17
10 min
Future of Life Institute Podcast
Future of Life Institute's $25M Grants Program for Existential Risk Reduction
Future of Life Institute President Max Tegmark and our grants team, Andrea Berman and Daniel Filan, join us to announce a $25M multi-year AI Existential Safety Grants Program. Topics discussed in this episode include: - The reason Future of Life Institute is offering AI Existential Safety Grants - Max speaks about how receiving a grant changed his career early on - Daniel and Andrea provide details on the fellowships and future grant priorities Check out our grants programs here: https://grants.futureoflife.org/ Join our AI Existential Safety Community: https://futureoflife.org/team/ai-exis... Have any feedback about the podcast...
2021-10-19
24 min
AXRP - the AI X-risk Research Podcast
11 - Attainable Utility and Power with Alex Turner
Many scary stories about AI involve an AI system deceiving and subjugating humans in order to gain the ability to achieve its goals without us stopping it. This episode's guest, Alex Turner, will tell us about his research analyzing the notions of "attainable utility" and "power" that underlie these stories, so that we can better evaluate how likely they are and how to prevent them. Topics we discuss: - Side effects minimization - Attainable Utility Preservation (AUP) - AUP and alignment - Power-seeking - Power-seeking and al...
2021-09-25
1h 27
AXRP - the AI X-risk Research Podcast
10 - AI's Future and Impacts with Katja Grace
When going about trying to ensure that AI does not cause an existential catastrophe, it's likely important to understand how AI will develop in the future, and why exactly it might or might not cause such a catastrophe. In this episode, I interview Katja Grace, researcher at AI Impacts, who's done work surveying AI researchers about when they expect superhuman AI to be reached, collecting data about how rapidly AI tends to progress, and thinking about the weak points in arguments that AI could be catastrophic for humanity. Topics we discuss: - 00:00:34 - AI...
2021-07-24
2h 02
Towards Data Science
92. Daniel Filan - Peering into neural nets for AI safety
Many AI researchers think it’s going to be hard to design AI systems that continue to remain safe as AI capabilities increase. We’ve seen already on the podcast that the field of AI alignment has emerged to tackle this problem, but a related effort is also being directed at a separate dimension of the safety problem: AI interpretability. Our ability to interpret how AI systems process information and make decisions will likely become an important factor in assuring the reliability of AIs in the future. And my guest for this episode of the podcast has focu...
2021-07-14
1h 06
AXRP - the AI X-risk Research Podcast
9 - Finite Factored Sets with Scott Garrabrant
Being an agent can get loopy quickly. For instance, imagine that we're playing chess and I'm trying to decide what move to make. Your next move influences the outcome of the game, and my guess of that influences my move, which influences your next move, which influences the outcome of the game. How can we model these dependencies in a general way, without baking in primitive notions of 'belief' or 'agency'? Today, I talk with Scott Garrabrant about his recent work on finite factored sets that aims to answer this question. Topics we discuss:
2021-06-25
1h 38
AXRP - the AI X-risk Research Podcast
8 - Assistance Games with Dylan Hadfield-Menell
How should we think about the technical problem of building smarter-than-human AI that does what we want? When and how should AI systems defer to us? Should they have their own goals, and how should those goals be managed? In this episode, Dylan Hadfield-Menell talks about his work on assistance games that formalizes these questions. The first couple years of my PhD program included many long conversations with Dylan that helped shape how I view AI x-risk research, so it was great to have another one in the form of a recorded interview. Link to t...
2021-06-09
2h 23
AXRP - the AI X-risk Research Podcast
7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra
If you want to shape the development and forecast the consequences of powerful AI technology, it's important to know when it might appear. In this episode, I talk to Ajeya Cotra about her draft report "Forecasting Transformative AI from Biological Anchors" which aims to build a probabilistic model to answer this question. We talk about a variety of topics, including the structure of the model, what the most important parts are to get right, how the estimates should shape our behaviour, and Ajeya's current work at Open Philanthropy and perspective on the AI x-risk landscape. U...
2021-05-28
01 min
AXRP - the AI X-risk Research Podcast
7 - Side Effects with Victoria Krakovna
One way AI might pose an existential threat is by taking drastic actions to maximize its achievement of some objective function, such as taking control of the power supply or the world's computers. This might suggest a mitigation strategy of minimizing the degree to which AI systems have large effects on the world that are not absolutely necessary for achieving their objective. In this episode, Victoria Krakovna talks about her research on quantifying and minimizing side effects. Topics discussed include how one goes about defining side effects and the difficulties in doing so, her work...
2021-05-14
1h 19
AXRP - the AI X-risk Research Podcast
6 - Debate and Imitative Generalization with Beth Barnes
One proposal to train AIs that can be useful is to have ML models debate each other about the answer to a human-provided question, where the human judges which side has won. In this episode, I talk with Beth Barnes about her thoughts on the pros and cons of this strategy, what she learned from seeing how humans behaved in debate protocols, and how a technique called imitative generalization can augment debate. Those who are already quite familiar with the basic proposal might want to skip past the explanation of debate to 13:00, "what problems does it solve and does...
2021-04-08
1h 58
AXRP - the AI X-risk Research Podcast
5 - Infra-Bayesianism with Vanessa Kosoy
The theory of sequential decision-making has a problem: how can we deal with situations where we have some hypotheses about the environment we're acting in, but its exact form might be outside the range of possibilities we can consider? Relatedly, how do we deal with situations where the environment can simulate what we'll do in the future, and put us in better or worse situations now depending on what we'll do then? Today's episode features Vanessa Kosoy talking about infra-Bayesianism, the mathematical framework she developed with Alex Appel that modifies Bayesian decision theory to succeed in these types...
2021-03-10
1h 23
AXRP - the AI X-risk Research Podcast
4 - Risks from Learned Optimization with Evan Hubinger
In machine learning, optimization is typically done to produce a model that performs well according to some metric. Today's episode features Evan Hubinger talking about what happens when the learned model is itself doing optimization in order to perform well, how the goals of the learned model could differ from the goals we used to select it, and what would happen if they did differ. Link to the paper - Risks from Learned Optimization in Advanced Machine Learning Systems: arxiv.org/abs/1906.01820 Link to the transcript: axrp.net/episode/2021/02/17/episode-4-risks-from-learned-optimization-evan-hubinger.h...
2021-02-18
2h 13
AXRP - the AI X-risk Research Podcast
3 - Negotiable Reinforcement Learning with Andrew Critch
In this episode, I talk with Andrew Critch about negotiable reinforcement learning: what happens when two people (or organizations, or what have you) who have different beliefs and preferences jointly build some agent that will take actions in the real world. In the paper we discuss, it's proven that the only way to make such an agent Pareto optimal - that is, have it not be the case that there's a different agent that both people would prefer to use instead - is to have it preferentially optimize the preferences of whoever's beliefs were more accurate. We discuss his...
2020-12-11
58 min
AXRP - the AI X-risk Research Podcast
2 - Learning Human Biases with Rohin Shah
One approach to creating useful AI systems is to watch humans doing a task, infer what they're trying to do, and then try to do that well. The simplest way to infer what the humans are trying to do is to assume there's one goal that they share, and that they're optimally achieving the goal. This has the problem that humans aren't actually optimal at achieving the goals they pursue. We could instead code in the exact way in which humans behave suboptimally, except that we don't know that either. In this episode, I talk with Rohin Shah about...
2020-12-11
1h 08
AXRP - the AI X-risk Research Podcast
1 - Adversarial Policies with Adam Gleave
In this episode, Adam Gleave and I talk about adversarial policies. Basically, in current reinforcement learning, people train agents that act in some kind of environment, sometimes an environment that contains other agents. For instance, you might train agents that play sumo with each other, with the objective of making them generally good at sumo. Adam's research looks at the case where all you're trying to do is make an agent that defeats one specific other agent: how easy is it, and what happens? He discovers that often, you can do it pretty easily, and your agent can behave...
2020-12-11
58 min
Sidespor
Ruben Hughes "If I love it, I'm about it"
Art Director at Illum, Ruben Hughes, is the guest on today's episode of the podcast. Ruben is a true bon vivant with impeccable taste and a sense for quality. We go over Ruben's background and former life in New York, then jump ahead to how on earth he ended up in little Copenhagen. We also get into what he actually does as Art Director at Illum, some thoughts on branding, his philosophy on interior design, and of course his favorite bakery, as well as whether he prefers Rome or Paris. Links and more at sidespor.dk Sidespor on Facebook: https://www.facebook.com/si...
2020-01-12
00 min
Sidespor
2020 WHAT'S GOING TO HAPPEN?!
HAPPY NEW YEAR, FOR CRYING OUT LOUD! As expected, a thoroughly cliché, reflective episode of the podcast. We actually meant to share it with you yesterday, but we decided that 1/1-2020 was a cooler date. Of course it also gets a bit more serious; among other things, we settle whether you should wash your hair first or last in the shower, and whether sales are really all that great. We also look back on the decade and, of course, on 2019. These have been the most transformative years for us, and probably for many of you as well, and that...
2020-01-01
00 min
DJ GRIND | The Daily Grind
October 2019 Mix | DJ GRIND Fall Tour Promo Podcast
NEW PODCAST! The wait is over! My all-new podcast is ready for download, featuring some of my favorite tracks from my summer tour and lots of fresh music for fall. This set includes three of my latest remixes with Toy Armada, including our ‘Club Mix’ for Carly Rae Jepsen’s “Too Much,” our ‘Massive Mix’ for Gawler & Francci Richard’s “JOY,” and our ‘Anthem Mix’ for Celine Dion’s “Flying On My Own!” DJ GRIND 2019 Fall Tour Dates Catch me at these upcoming events! FRIDAY, OCTOBER 25 – Salt Lake City, UT Skyfall presents “Evil” @ Sky SLC www.skyfallslc.com SATURDAY, OCTOBER 26 – New Orleans, LA HNO ‘M...
2019-10-15
1h 22