Showing episodes and shows of Ajeya Cotra
ChinaTalk
Wagner, Two Years On (Kamil on Coups and Power)
How does Russia prevent uprisings, and what can other authoritarians learn from Moscow’s methods of coup control? For the second anniversary of the Wagner uprising, ChinaTalk interviewed London-based historian Kamil Galeev, who was also a classmate of Jordan’s at Peking University. We discuss: why the Wagner Group rebelled in 2023, and why the coup attempt ultimately failed; how Wagner shifted the Kremlin’s assessment of internal political challengers; similarities between post-Soviet doomerism and the American right; and historical examples of foreign policy influenced by a victimhood mentality.
2025-07-02
40 min
80,000 Hours Podcast
AGI disagreements and misconceptions: Rob, Luisa, & past guests hash it out
Will LLMs soon be made into autonomous agents? Will they lead to job losses? Is AI misinformation overblown? Will it prove easy or hard to create AGI? And how likely is it that it will feel like something to be a superhuman AGI? With AGI back in the headlines, we bring you 15 opinionated highlights from the show addressing those and other questions, intermixed with opinions from hosts Luisa Rodriguez and Rob Wiblin recorded back in 2023. Check out the full transcript on the 80,000 Hours website. You can decide whether the views we expressed (and...
2025-02-10
3h 12
The Valmy
Ajeya Cotra on AI safety and the future of humanity
Podcast: AI Summer. Episode: Ajeya Cotra on AI safety and the future of humanity. Release date: 2025-01-16. Ajeya Cotra works at Open Philanthropy, a leading funder of efforts to combat existential risks from AI. She has led the foundation’s grantmaking on technical research to understand and reduce catastrophic risks from advanced AI. She is co-author of Planned Obsolescence, a newsletter about AI futurism and AI alignment. Although a committed doomer herself, Cotra has worked hard to unde...
2025-01-17
1h 13
AI Summer
Ajeya Cotra on AI safety and the future of humanity
Ajeya Cotra works at Open Philanthropy, a leading funder of efforts to combat existential risks from AI. She has led the foundation’s grantmaking on technical research to understand and reduce catastrophic risks from advanced AI. She is co-author of Planned Obsolescence, a newsletter about AI futurism and AI alignment. Although a committed doomer herself, Cotra has worked hard to understand the perspectives of AI safety skeptics. In this episode, we asked her to guide us through the contentious debate over AI safety and—perhaps—explain why people with similar views on other issues frequently reach divergent views...
2025-01-16
1h 13
AI Safety Fundamentals
Biological Anchors: A Trick That Might Or Might Not Work
I've been trying to review and summarize Eliezer Yudkowsky's recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents. Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra's talking about and what's going on. The Open Philanthropy Project ("Open Phil") is a big effective altruist foundation interested in funding AI safety. It's got $20 billion, probably the majority of money in the field, so its decisions matter a lot and it’s very invested in getting things right. In 2020, it asked seni...
2025-01-04
1h 10
DealBook Summit
The A.I. Revolution
A panel of leading voices in A.I., including experts on capabilities, safety and investing, and policy and governance, tease out some of the big debates over the future of A.I. and try to find some common ground. The discussion is moderated by Kevin Roose, a technology columnist at The Times. Participants: Jack Clark, co-founder and head of policy at Anthropic; Ajeya Cotra, senior program officer for potential risks from advanced A.I. at Open Philanthropy; Sarah Guo, founder and managing partner at Conviction; Dan Hendrycks, director of the Center for A.I. Safety; Rana el Kaliouby, co-founder and...
2024-12-11
1h 31
AI-Generated Audio for Planned Obsolescence
OpenAI's CBRN tests seem unclear
OpenAI says o1-preview can't meaningfully help novices make chemical and biological weapons. Their test results don’t clearly establish this. https://planned-obsolescence.org/openais-cbrn-tests-seem-unclear
2024-11-21
13 min
AI-Generated Audio for Planned Obsolescence
Dangerous capability tests should be harder
We should be spending less time proving today’s AIs are safe and more time figuring out how to tell if tomorrow’s AIs are dangerous: planned-obsolescence.org/dangerous-capability-tests-should-be-harder
2024-08-20
08 min
80,000 Hours Podcast
#90 Classic episode – Ajeya Cotra on worldview diversification and how big the future could be
You wake up in a mysterious box, and hear the booming voice of God: “I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it. If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box. To get into heaven, you have to answer this correctly: Which way did the coin land?”You think briefly, and decide you should bet your eternal soul on tails. The fact that you woke up at all seems like pretty g...
2024-01-12
2h 59
LessWrong (Curated & Popular)
[HUMAN VOICE] "AI Timelines" by habryka, Daniel Kokotajlo, Ajeya Cotra, Ege Erdil
Support ongoing human narrations of curated posts: www.patreon.com/LWCurated How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still disagree substantially about the relevant timescales. For instance, here are their median timelines for one operationalization of transformative AI: Source: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j...
2023-11-17
1h 18
Pivot
AI Ethics at Code 2023
Platformer's Casey Newton moderates a conversation at Code 2023 on ethics in artificial intelligence, with Ajeya Cotra, Senior Program Officer at Open Philanthropy, and Helen Toner, Director of Strategy at Georgetown University’s Center for Security and Emerging Technology. The panel discusses the risks and rewards of the technology, as well as best practices and safety measures.Recorded on September 27th in Los Angeles. Learn more about your ad choices. Visit podcastchoices.com/adchoices
2023-10-25
28 min
AI-Generated Audio for Planned Obsolescence
Scale, schlep, and systems
This startlingly fast progress in LLMs was driven both by scaling up LLMs and doing schlep to make usable systems out of them. We think scale and schlep will both improve rapidly: planned-obsolescence.org/scale-schlep-and-systems
2023-10-10
09 min
The 80,000 Hours Podcast on Artificial Intelligence (September 2023)
Two: Ajeya Cotra on accidentally teaching AI models to deceive us
Originally released in May 2023. Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of...
2023-09-02
2h 49
AI-Generated Audio for Planned Obsolescence
Language models surprised us
Most experts were surprised by progress in language models in 2022 and 2023. There may be more surprises ahead, so experts should register their forecasts now about 2024 and 2025: https://planned-obsolescence.org/language-models-surprised-us
2023-08-29
08 min
80k After Hours
Highlights: #151 – Ajeya Cotra on accidentally teaching AI models to deceive us
This is a selection of highlights from episode #151 of The 80,000 Hours Podcast. These aren't necessarily the most important, or even most entertaining parts of the interview — and if you enjoy this, we strongly recommend checking out the full episode: Ajeya Cotra on accidentally teaching AI models to deceive us. And if you're finding these highlights episodes valuable, please let us know by emailing podcast@80000hours.org. Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or...
2023-08-02
25 min
The Inside View
Curtis Huebner on Doom, AI Timelines and Alignment at EleutherAI
Curtis, also known on the internet as AI_WAIFU, is the head of Alignment at EleutherAI. In this episode we discuss the massive orders of H100s from different actors, why he thinks AGI is 4-5 years away, why he thinks we're 90% "toast", his comment on Eliezer Yudkowsky's Death with Dignity, and what kind of Alignment projects are currently going on at EleutherAI, especially a project with Markov chains and the Alignment test project that he is currently leading. Youtube: https://www.youtube.com/watch?v=9s3XctQOgew Transcript: https://theinsideview.ai...
2023-07-16
1h 29
AI-Generated Audio for Planned Obsolescence
Could AI accelerate economic growth?
Most new technologies don’t accelerate the pace of economic growth. But advanced AI might do this by massively increasing the research effort going into developing new technologies.
2023-06-06
04 min
Hard Fork
The Surgeon General’s Social Media Warning + A.I.’s Existential Risks
The U.S. surgeon general, Dr. Vivek Murthy, says social media poses a “profound risk of harm” to young people. Why do some in the tech industry disagree? Then, Ajeya Cotra, an A.I. researcher, on how A.I. could lead to a doomsday scenario. Plus: Pass the hat. Kevin and Casey play a game they call HatGPT. On today’s episode: Ajeya Cotra is a senior research analyst at Open Philanthropy. Additional reading: The surgeon general issued an advisory about the risks of social media for young people. Ajeya Cotra...
2023-05-26
1h 13
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
[Bonus Episode] Connor Leahy on AGI, GPT-4, and Cognitive Emulation w/ FLI Podcast
[Bonus Episode] Future of Life Institute Podcast host Gus Docker interviews Conjecture CEO Connor Leahy to discuss GPT-4, magic, cognitive emulation, demand for human-like AI, and aligning superintelligence. You can read more about Connor's work at https://conjecture.dev. Future of Life Institute is the organization that recently published an open letter calling for a six-month pause on training new AI systems. FLI was founded by Jaan Tallinn, who we interviewed in Episode 16 of The Cognitive Revolution. We think their podcast is excellent. They frequently interview critical thinkers in AI like Neel Nanda, Ajeya Cotra...
2023-05-19
1h 41
80,000 Hours Podcast
#151 – Ajeya Cotra on accidentally teaching AI models to deceive us
Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of reasons. Today's guest Aj...
2023-05-12
2h 49
AI-Generated Audio for Planned Obsolescence
The costs of caution
Both AI fears and AI hopes rest on the belief that it may be possible to build alien minds that can do everything we can do and much more. AI-driven technological progress could save countless lives and make everyone massively healthier and wealthier: https://planned-obsolescence.org/the-costs-of-caution
2023-05-01
04 min
AI-Generated Audio for Planned Obsolescence
Continuous doesn't mean slow
Once a lab trains AI that can fully replace its human employees, it will be able to multiply its workforce 100,000x. If these AIs do AI research, they could develop vastly superhuman systems in under a year: https://planned-obsolescence.org/continuous-doesnt-mean-slow
2023-04-12
03 min
AI-Generated Audio for Planned Obsolescence
AIs accelerating AI research
Researchers could potentially design the next generation of ML models more quickly by delegating some work to existing models, creating a feedback loop of ever-accelerating progress. https://planned-obsolescence.org/ais-accelerating-ai-research
2023-04-04
05 min
AI-Generated Audio for Planned Obsolescence
Is it time for a pause?
The single most important thing we can do is to pause when the next model we train would be powerful enough to obsolete humans entirely. If it were up to me, I would slow down AI development starting now — and then later slow down even more: https://www.planned-obsolescence.org/is-it-time-for-a-pause/
2023-03-30
06 min
AI-Generated Audio for Planned Obsolescence
Alignment researchers disagree a lot
Many fellow alignment researchers may be operating under radically different assumptions from you: https://www.planned-obsolescence.org/disagreement-in-alignment/
2023-03-27
03 min
AI-Generated Audio for Planned Obsolescence
Training AIs to help us align AIs
If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/
2023-03-27
04 min
AI-Generated Audio for Planned Obsolescence
Playing the training game
We're creating incentives for AI systems to make their behavior look as desirable as possible, while intentionally disregarding human intent when that conflicts with maximizing reward: https://www.planned-obsolescence.org/the-training-game/
2023-03-27
07 min
AI-Generated Audio for Planned Obsolescence
Situational awareness
AI systems that have a precise understanding of how they’ll be evaluated and what behavior we want them to display will earn more reward than AI systems that don’t: https://www.planned-obsolescence.org/situational-awareness/
2023-03-27
07 min
AI-Generated Audio for Planned Obsolescence
"Aligned" shouldn't be a synonym for "good"
Perfect alignment just means that AI systems won’t want to deliberately disregard their designers' intent; it's not enough to ensure AI is good for the world: https://www.planned-obsolescence.org/aligned-vs-good/
2023-03-27
06 min
AI-Generated Audio for Planned Obsolescence
What we're doing here
We’re trying to think ahead to a possible future in which AI is making all the most important decisions: https://www.planned-obsolescence.org/what-were-doing-here/
2023-03-27
04 min
AI-Generated Audio for Planned Obsolescence
The ethics of AI red-teaming
If we’ve decided we’re collectively fine with unleashing millions of spam bots, then the least we can do is actually study what they can – and can’t – do: https://www.planned-obsolescence.org/ethics-of-red-teaming/
2023-03-27
02 min
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
The Embedding Revolution: Anton Troynikov on Chroma, Stable Attribution, and future of AI
We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com. (0:00) Preview (1:17) Sponsor (4:00) Anton breaks down the advantages of vector databases (4:45) How embeddings have created an AI-native way to represent data (11:50) Anton identifies the watershed moment and step changes in AI (12:55) OpenAI’s pricing (18:50) How Chroma works (33:04) Stable Attribution and...
2023-03-02
1h 28
TYPE III AUDIO (All episodes)
"Literature review of Transformative Artificial Intelligence timelines" by Jaime Sevilla
This is a linkpost for https://epochai.org/blog/literature-review-of-transformative-artificial-intelligence-timelines. We summarize and compare several models and forecasts predicting when transformative AI will be developed. Highlights: The review includes quantitative models, including both outside and inside view, and judgment-based forecasts by (teams of) experts. While we do not necessarily endorse their conclusions, the inside-view...
2023-02-10
10 min
Future of Life Institute Podcast
Ajeya Cotra on Thinking Clearly in a Rapidly Changing World
Ajeya Cotra joins us to talk about thinking clearly in a rapidly changing world. Learn more about the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:44 The default versus the accelerating picture of the future 04:25 The role of AI in accelerating change 06:48 Extrapolating economic growth 08:53 How do we know whether the pace of change is accelerating? 15:07 How can we cope with a rapidly changing world? 18:50 How could the future be utopian? 22:03 Is accelerating technological progress immoral? 25:43 Should we imagine concrete future scenarios? 31:15 How should we act in an accelerating world? 34:41 How Ajeya could be wrong about...
2022-11-10
44 min
The Valmy
Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
Podcast: Future of Life Institute Podcast. Episode: Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe. Release date: 2022-11-03. Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 AI safety research in general 02:04 Realistic scenarios for AI catastrophes 06:51 A dangerous AI model developed in the near future 09:10 Assumptions behind dangerous AI development 14:45 Can AIs learn long-term planning? 18:09 Can AIs understand human psychology? 22:32 Training an...
2022-11-05
54 min
Future of Life Institute Podcast
Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 AI safety research in general 02:04 Realistic scenarios for AI catastrophes 06:51 A dangerous AI model developed in the near future 09:10 Assumptions behind dangerous AI development 14:45 Can AIs learn long-term planning? 18:09 Can AIs understand human psychology? 22:32 Training an AI model with naive safety features 24:06 Can AIs be deceptive? 31:07 What happens after deploying an unsafe AI system? 44:03 What can we do to prevent an AI catastrophe? 53:58 The next episode
2022-11-03
54 min
TYPE III AUDIO (All episodes)
LessWrong: "How might we align transformative AI if it’s developed very soon?" by Holden Karnofsky
https://www.lesswrong.com/posts/rCJQAkPTEypGjSJ8X/how-might-we-align-transformative-ai-if-it-s-developed-very This post is part of my AI strategy nearcasting series: trying to answer key strategic questions about transformative AI, under the assumption that key events will happen very soon, and/or in a world that is otherwise very similar to today's. This post gives my understanding of what the set of available strategies for aligning transformative AI...
2022-11-03
1h 39
Future of Life Institute Podcast
Ajeya Cotra on Forecasting Transformative Artificial Intelligence
Ajeya Cotra joins us to discuss forecasting transformative artificial intelligence. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 Ajeya's report on AI 01:16 What is transformative AI? 02:09 Forecasting transformative AI 02:53 Historical growth rates 05:10 Simpler forecasting methods 09:01 Biological anchors 16:31 Different paths to transformative AI 17:55 Which year will we get transformative AI? 25:54 Expert opinion on transformative AI 30:08 Are today's machine learning techniques enough? 33:06 Will AI be limited by the physical world and regulation? 38:15 Will AI be limited by training data? 41:48 Are there human abilities that AIs cannot learn? 47:22 The next episode
2022-10-27
47 min
LessWrong (Curated & Popular)
"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra
https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I think that in the coming 15-30 years, the world could plausibly develop “transformative AI”: AI powerful enough to bring us into a new, qualitatively different future, via an explosion in science and technology R&D. This sort of AI could be sufficient to make this the most important century of all time for humanity. The most straightforward vision for developing transformative AI that I can imagine working with very litt...
2022-09-27
3h 07
LessWrong (Curated & Popular)
"Two-year update on my personal AI timelines" by Ajeya Cotra
https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines#fnref-fwwPpQFdWM6hJqwuY-12 Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I worked on my draft report on biological anchors for forecasting AI timelines mainly between ~May 2019 (three months after the release of GPT-2) and ~Jul 2020 (a month after the release of GPT-3), and posted it on LessWrong in Sep 2020 after an internal review process. At the time, my bottom line estimates from the bio anchors modeling exercise were:[1] Roughly ~15% probability of transformative AI by 2036[2] (16 years from posting the report; 14 years...
2022-09-22
39 min
Future Matters
#5: supervolcanoes, AI takeover, and What We Owe the Future
Future Matters is a newsletter about longtermism brought to you by Matthew van der Merwe and Pablo Stafforini. Each month we collect and summarize longtermism-relevant research, share news from the longtermism community, and feature a conversation with a prominent researcher. You can also subscribe on Substack, read on the EA Forum and follow on Twitter. 00:00 Welcome to Future Matters. 01:08 MacAskill — What We Owe the Future. 01:34 Lifland — Samotsvety's AI risk forecasts. 02:11 Halstead — Climate Change and Longtermism. 02:43 Good Judgment — Long-term risks and climate change. 02:54 Thorstad — Existential risk pessimism and the time of perils. 03:32 Hamilton — Space and existential risk. 04:07 Cassidy & Mani — Huge...
2022-09-13
31 min
EA Talks
What is an Effective Altruist? EARadio Trailer
Should we spend money on guide dogs for the blind or bed nets to protect kids from disease-carrying mosquitoes? Life is full of choices like this. So how can we help others best? And what happens when we fall short? This trailer was adapted from Ajeya Cotra's Introduction to EA. You can find the original talk here. To suggest an episode or interview, please e-mail us at contact@earad.io. If you're new to effective altruism, you can learn more about it by listening to the podcast or reading this...
2022-09-01
01 min
Clearer Thinking with Spencer Greenberg
Critiquing Effective Altruism (with Michael Nielsen and Ajeya Cotra)
Read the full transcript here. What is Effective Altruism? Which parts of the Effective Altruism movement are good and not so good? Who outside of the EA movement are doing lots of good in the world? What are the psychological effects of thinking constantly about the trade-offs of spending resources on ourselves versus on others? To what degree is the EA movement centralized intellectually, financially, etc.? Does the EA movement's tendency to quantify everything, to make everything legible to itself, cause it to miss important features of the world? To what extent do EA people rationalize spending...
2022-08-20
1h 38
EA Talks
Introduction to EA | Ajeya Cotra | EAGxBerkeley 2016
Ajeya Cotra introduces the core principles of effective altruism. This talk was taken from EAGxBerkeley 2016. Click here to watch the talk with the video. Effective Altruism is a social movement dedicated to finding ways to do the most good possible, whether through charitable donations, career choices, or volunteer projects. EA Global conferences are gatherings for EAs to meet. You can also listen to this talk along with its accompanying video on YouTube.
2022-08-09
35 min
EA Talks
SERI 2022: Timelines for Transformative AI and Language Model Alignment | Ajeya Cotra
Ajeya Cotra is a Senior Research Analyst at Open Philanthropy. She’s currently thinking about how difficult it may be to ensure AI systems pursue the right goals. Previously, she worked on a framework for estimating when transformative AI may be developed, as well as various cause prioritization and worldview diversification projects. She joined Open Philanthropy in July 2016 as a Research Analyst. Ajeya received a B.S. in Electrical Engineering and Computer Science from UC Berkeley, where she co-founded the Effective Altruists of Berkeley student group and taught a course on effective altruism. This video was first pu...
2022-08-06
28 min
The Inside View
Ethan Caballero–Scale is All You Need
Ethan is known on Twitter as the edgiest person at MILA. We discuss all the gossip around scaling large language models in what will later be known as the Edward Snowden moment of Deep Learning. In his free time, Ethan is a Master’s degree student at MILA in Montreal, and has published papers on out-of-distribution generalization and robustness generalization, accepted both as oral presentations and spotlight presentations at ICML and NeurIPS. Ethan has recently been thinking about scaling laws, both as an organizer and speaker for the 1st Neural Scaling Laws Workshop. Transcript: https://th...
2022-05-05
51 min
AXRP - the AI X-risk Research Podcast
13 - First Principles of AGI Safety with Richard Ngo
How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment. Topics we discuss, and timestamps: - 00:00:40 - The nature of intelligence and AGI - 00:01:18 - The nature of intelligence - 00:06:09 - AGI: what and how ...
2022-03-31
1h 33
Astral Codex Ten Podcast
Biological Anchors: A Trick That Might Or Might Not Work
https://astralcodexten.substack.com/p/biological-anchors-a-trick-that-might?utm_source=url Introduction I've been trying to review and summarize Eliezer Yudkowsky's recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents. Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra's talking about and what's going on. The Open Philanthropy Project ("Open Phil") is a big effective altruist foundation interested in funding AI safety. It's got $20 billion, probably the majority of money in the field, so its decisions matter a...
2022-02-24
1h 10
The Nonlinear Library: Alignment Section
(Part 4/4) Forecasting TAI with biological anchors by Ajeya Cotra. Timelines estimates and responses to objections
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part four of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 4: Timelines estimates and responses to objections This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it...
2021-12-23
1h 38
The Nonlinear Library: Alignment Section
(Part 3/4) Forecasting TAI with biological anchors by Ajeya Cotra. Hypotheses and 2020 training computation requirements
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part three of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 3: Hypotheses and 2020 training computation requirements This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it pu...
2021-12-23
1h 20
The Nonlinear Library: Alignment Section
(Part 2/4) Forecasting TAI with biological anchors by Ajeya Cotra. How training data requirements scale with parameter count
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part two of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 2: How training data requirements scale with parameter count This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We ar...
2021-12-23
1h 20
The Nonlinear Library: Alignment Section
(Part 1/4) Forecasting TAI with biological anchors by Ajeya Cotra. Overview, conceptual foundations, and runtime computation.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 1: Overview, conceptual foundations, and runtime computation This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it...
2021-12-18
1h 38
The Nonlinear Library: Alignment Section
(Part 1/2) Is power-seeking AI an existential risk? by Joseph Carlsmith
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Is power-seeking AI an existential risk?, published by Joseph Carlsmith. 1. Introduction Some worry that the development of advanced artificial intelligence will result in existential catastrophe -- that is, the destruction of humanity’s longterm potential. Here I examine the following version of this worry (it’s not the only version): By 2070: It will become possible and financially feasible to build AI systems with the following properties: Advanced capability: they outperform the best huma...
2021-12-18
1h 29
The Nonlinear Library: Alignment Section
(Part 2/2) Eliciting latent knowledge: How to tell if your eyes deceive you by Paul Christiano, Ajeya Cotra, and Mark Xu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part two of: Eliciting latent knowledge: How to tell if your eyes deceive you, published by Paul Christiano, Ajeya Cotra, and Mark Xu. Why we’re excited about tackling worst-case ELK We think that worst-case ELK — i.e. the problem of devising a training strategy to get an AI to report what it knows no matter how its mind is shaped internally — is one of the most exciting open problems in alignment theory (if not the mo...
2021-12-18
2h 02
The Nonlinear Library: Alignment Section
(Part 1/2) Eliciting latent knowledge: How to tell if your eyes deceive you by Paul Christiano, Ajeya Cotra, and Mark Xu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Eliciting latent knowledge: How to tell if your eyes deceive you, published by Paul Christiano, Ajeya Cotra, and Mark Xu. In this post, we’ll present ARC’s approach to an open problem we think is central to aligning powerful machine learning (ML) systems: Suppose we train a model to predict what the future will look like according to cameras and other sensors. We then use planning algorithms to find a sequence of a...
2021-12-15
1h 02
The Nonlinear Library: LessWrong Top Posts
The case for aligning narrowly superhuman models by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The case for aligning narrowly superhuman models, published by Ajeya Cotra on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and...
2021-12-12
53 min
The Nonlinear Library: LessWrong Top Posts
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has...
2021-12-12
01 min
The Nonlinear Library: LessWrong Top Posts
My research methodology by paulfchristiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by paulfchristiano on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most...
2021-12-11
23 min
The Nonlinear Library: LessWrong Top Posts
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" by Rob Bensinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models", published by Rob Bensinger on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Below, I’ve copied comments left by MIRI researchers Eliezer Yudkowsky and Evan Hubinger on March 1–3 on a draft of Ajeya Cotra’s "Case for Aligning Narrowly Superhuman Models." I've included back-and-forths with Cotra, and interjections by me and Rohin Shah. The section divisi...
2021-12-11
39 min
The Nonlinear Library: LessWrong Top Posts
Redwood Research’s current project
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project , published by Buck on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaway...
2021-12-11
22 min
The Nonlinear Library: LessWrong Top Posts
The theory-practice gap by Buck
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap, published by Buck on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of...
2021-12-11
10 min
The Nonlinear Library: Alignment Forum Top Posts
The case for aligning narrowly superhuman models by Ajeya Cotra
I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and how our giving should be structured). We are not seeking grant applications on this topic right now. Thanks to Daniel Dewey, Eliezer Yudkowsky, Evan Hubinger, Holden Karnofsky, Jared Kaplan, Mike Levine, Nick Beckstead, Owen Cotton-Barratt, Paul Christiano, Rob Bensinger, and Rohin Shah for comments on earlier drafts. A genre of t...
2021-12-10
50 min
The Nonlinear Library: Alignment Forum Top Posts
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the AI Alignment Forum. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has been on the high-level structure of the framework...
2021-12-10
01 min
The Nonlinear Library: Alignment Forum Top Posts
My research methodology by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most possible ML techniques for avoiding egre...
2021-12-10
23 min
The Nonlinear Library: Alignment Forum Top Posts
Seeking Power is Often Convergently Instrumental in MDPs by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Seeking Power is Often Convergently Instrumental in MDPs, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most poss...
2021-12-10
23 min
The Nonlinear Library: Alignment Forum Top Posts
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" by Rob Bensinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models", published by Rob Bensinger on the AI Alignment Forum. Below, I’ve copied comments left by MIRI researchers Eliezer Yudkowsky and Evan Hubinger on March 1–3 on a draft of Ajeya Cotra’s "Case for Aligning Narrowly Superhuman Models." I've included back-and-forths with Cotra, and interjections by me and Rohin Shah. The section divisions below correspond to the sections in Cotra's post. 0. Introd...
2021-12-10
39 min
The Nonlinear Library: Alignment Forum Top Posts
Redwood Research’s current project by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project, published by Buck Shlegeris on the AI Alignment Forum. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaways I’ve had from the project so far. There are...
2021-12-10
22 min
The Nonlinear Library: Alignment Forum Top Posts
The theory-practice gap by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap by Buck Shlegeris, published by Buck Shlegeris on the AI Alignment Forum. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of some different AI systems. You...
2021-12-10
10 min
The Nonlinear Library: Alignment Forum Top Posts
Paul's research agenda FAQ by Alex Zhu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Paul's research agenda FAQ, published by Alex Zhu on the AI Alignment Forum. I think Paul Christiano’s research agenda for the alignment of superintelligent AGIs presents one of the most exciting and promising approaches to AI safety. After being very confused about Paul’s agenda, chatting with others about similar confusions, and clarifying with Paul many times over, I’ve decided to write a FAQ addressing common confusions around his agenda. This F...
2021-12-06
33 min
The Nonlinear Library: Alignment Forum Top Posts
Against GDP as a metric for timelines and takeoff speeds by Daniel Kokotajlo
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Against GDP as a metric for timelines and takeoff speeds, published by Daniel Kokotajlo on the AI Alignment Forum. [Epistemic status: Strong opinion, lightly held] I think world GDP (and economic growth more generally) is overrated as a metric for AI timelines and takeoff speeds. Here are some uses of GDP that I disagree with, or at least think should be accompanied by cautionary notes: Timelines: Ajeya Cotra thinks of...
2021-12-06
23 min
The Nonlinear Library: Alignment Forum Top Posts
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda by Chi Nguyen
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda, published by Chi Nguyen on the AI Alignment Forum. Crossposted from the EA forum. You can read this post as a google docs instead (IMO much better to read). This document aims to clarify the AI safety research agenda by Paul Christiano (IDA) and the arguments around how promising it is. Target audience: All levels of...
2021-12-06
1h 05
The Nonlinear Library: Alignment Section
Teaching ML to answer questions honestly instead of predicting human answers by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Teaching ML to answer questions honestly instead of predicting human answers, published by Paul Christiano on the AI Alignment Forum. (Note: very much work in progress, unless you want to follow along with my research you'll probably want to wait for an improved/simplified/clarified algorithm.) In this post I consider the particular problem of models learning “predict how a human would answer questions” instead of “answer questions honestly.” (A special case of the problem from Ina...
2021-11-19
23 min
The Nonlinear Library: Alignment Section
My research methodology by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most possible ML techniques for avoiding egregious misalignment depend on deta...
2021-11-19
23 min
The Nonlinear Library: Alignment Section
How do we become confident in the safety of a machine learning system? by Evan Hubinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How do we become confident in the safety of a machine learning system?, published by Evan Hubinger on the AI Alignment Forum. Thanks to Rohin Shah, Ajeya Cotra, Richard Ngo, Paul Christiano, Jon Uesato, Kate Woolverton, Beth Barnes, and William Saunders for helpful comments and feedback. Evaluating proposals for building safe advanced AI—and actually building any degree of confidence in their safety or lack thereof—is extremely difficult. Previously, in “An overview of 11 proposals for bu...
2021-11-19
50 min
The Nonlinear Library: Alignment Section
Automating Auditing: An ambitious concrete technical research proposal by Evan Hubinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Automating Auditing: An ambitious concrete technical research proposal, published by Evan Hubinger on the AI Alignment Forum. This post was originally written as a research proposal for the new AI alignment research organization Redwood Research, detailing an ambitious, concrete technical alignment proposal that I’m excited about work being done on, in a similar vein to Ajeya Cotra’s “The case for aligning narrowly superhuman models.” Regardless of whether Redwood actually ends up working on this pro...
2021-11-19
22 min
The Nonlinear Library: Alignment Section
Techniques for enhancing human feedback by abergal, Ajeya Cotra, Nick_Beckstead
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Techniques for enhancing human feedback, published by abergal, Ajeya Cotra, Nick_Beckstead on the AI Alignment Forum. Training powerful models to maximize simple metrics (such as quarterly profits) could be risky. Sufficiently intelligent models could discover strategies for maximizing these metrics in perverse and unintended ways. For example, the easiest way to maximize profits may turn out to involve stealing money, manipulating whoever keeps records into reporting unattainably high profits, capturing regulators of the industry...
2021-11-17
04 min
The Nonlinear Library: Alignment Section
The case for aligning narrowly superhuman models by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The case for aligning narrowly superhuman models, published by Ajeya Cotra on the AI Alignment Forum. I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and how our giving should be structured). We are not seeking gran...
2021-11-17
00 min
The Nonlinear Library: Alignment Section
AMA on EA Forum: Ajeya Cotra, researcher at Open Phil by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AMA on EA Forum: Ajeya Cotra, researcher at Open Phil, published by Ajeya Cotra on the AI Alignment Forum. This is a linkpost for Hi all, I'm Ajeya, and I'll be doing an AMA on the EA Forum (this is a linkpost for my announcement there). I would love to get questions from LessWrong and Alignment Forum users as well -- please head on over if you have any questions for me! I’ll plan to...
2021-11-17
01 min
The Nonlinear Library: Alignment Section
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the AI Alignment Forum. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has been on the high-level structure of the framework, so the particular...
2021-11-17
01 min
The Nonlinear Library: Alignment Section
Iterated Distillation and Amplification by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Iterated Distillation and Amplification, published by Ajeya Cotra on the AI Alignment Forum. This is a guest post summarizing Paul Christiano’s proposed scheme for training machine learning systems that can be robustly aligned to complex and fuzzy values, which I call Iterated Distillation and Amplification (IDA) here. IDA is notably similar to AlphaGoZero and expert iteration. The hope is that if we use IDA to train each learned component of an AI then the ov...
2021-11-17
10 min
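The entry above compresses the whole IDA scheme into a single sentence, so here is a minimal sketch of the loop it describes: an amplification step in which a human plus copies of the current agent answer a question by decomposition, followed by a distillation step in which a faster agent is trained to imitate that slower system. All names below (Agent, amplify, distill, and the memorization stand-in for training) are illustrative placeholders, not code from the summarized post.

```python
# A minimal sketch of the Iterated Distillation and Amplification (IDA) loop.
# Names and the "memorization" stand-in for training are illustrative
# placeholders, not code from the post summarized above.

from typing import Callable, List


class Agent:
    """Stand-in for a learned model mapping questions to answers."""

    def __init__(self, policy: Callable[[str], str]):
        self.policy = policy

    def answer(self, question: str) -> str:
        return self.policy(question)


def amplify(decompose: Callable[[str], List[str]],
            recombine: Callable[[str, List[str]], str],
            agent: Agent) -> Callable[[str], str]:
    """Amplification: a human decomposes the question, delegates each
    sub-question to the current agent, then recombines the sub-answers,
    yielding a slower but hopefully more capable question-answerer."""
    def amplified(question: str) -> str:
        subanswers = [agent.answer(q) for q in decompose(question)]
        return recombine(question, subanswers)
    return amplified


def distill(amplified: Callable[[str], str], questions: List[str]) -> Agent:
    """Distillation: train a fast agent to imitate the amplified system
    (caricatured here as memorizing the amplified answers)."""
    dataset = {q: amplified(q) for q in questions}
    return Agent(lambda q: dataset.get(q, "unknown"))


def ida(agent: Agent, rounds: int, questions: List[str],
        decompose: Callable[[str], List[str]],
        recombine: Callable[[str, List[str]], str]) -> Agent:
    """Alternate amplification and distillation for a fixed number of rounds."""
    for _ in range(rounds):
        agent = distill(amplify(decompose, recombine, agent), questions)
    return agent
```

The hope, as the excerpt above begins to say, is that repeating this loop lets each distilled agent approach the capability of the human-plus-agents system it imitates without drifting away from what that system would endorse.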
The Nonlinear Library: Alignment Section
Alignment Newsletter #35 by Rohin Shah
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Alignment Newsletter #35, published by Rohin Shah on the AI Alignment Forum. Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. This week we don't have any explicit highlights, but remember to treat the sequences as though they were highlighted! Technical AI alignment Iterated amplification sequence Corrigibility (Paul Christiano): A corrigible agent is one which helps its operator...
2021-11-17
11 min
The Nonlinear Library: Alignment Section
Redwood Research’s current project by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project, published by Buck Shlegeris on the AI Alignment Forum. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaways I’ve had from the project so far. There are a bunch of p...
2021-11-16
22 min
The Nonlinear Library: Alignment Section
The theory-practice gap by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap, published by Buck Shlegeris on the AI Alignment Forum. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of some different AI systems. You can see here that we've drawn the...
2021-11-16
10 min
Cold Takes Audio
Why AI alignment could be hard with modern deep learning (guest post by Ajeya Cotra)
Why would we program AI that wants to harm us? Because we might not know how to do otherwise. https://www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/
2021-09-20
28 min
AXRP - the AI X-risk Research Podcast
7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra
If you want to shape the development and forecast the consequences of powerful AI technology, it's important to know when it might appear. In this episode, I talk to Ajeya Cotra about her draft report "Forecasting Transformative AI from Biological Anchors" which aims to build a probabilistic model to answer this question. We talk about a variety of topics, including the structure of the model, what the most important parts are to get right, how the estimates should shape our behaviour, and Ajeya's current work at Open Philanthropy and perspective on the AI x-risk landscape. U...
2021-05-28
01 min
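The description only gestures at the structure of the report's probabilistic model, so here is a deliberately toy sketch of the general shape of a biological-anchors-style forecast. Every constant and distribution below is my own assumption for illustration and none of them are figures from the draft report: put a distribution over the training compute required for transformative AI, project how much compute could plausibly be spent on a single run each year, and read off the probability that the requirement has been met by a given date.

```python
# Toy Monte Carlo in the general spirit of a biological-anchors forecast.
# Every constant and distribution is an illustrative assumption for this
# listing, not a number taken from the draft report discussed in the episode.

import random

random.seed(0)
N = 100_000

# Uncertainty over training FLOP required for transformative AI, expressed as
# a lognormal over orders of magnitude (assumed location and spread).
required_flop = [10 ** random.gauss(35, 3) for _ in range(N)]


def affordable_flop(year: int) -> float:
    """Assumed budget for one training run: 1e24 FLOP in 2020, growing
    tenfold every five years."""
    return 1e24 * 10 ** ((year - 2020) / 5)


def p_tai_by(year: int) -> float:
    """Fraction of sampled requirements already affordable by `year`."""
    budget = affordable_flop(year)
    return sum(flop <= budget for flop in required_flop) / N


for year in (2030, 2040, 2050, 2060):
    print(year, round(p_tai_by(year), 3))
```

The actual report is far richer, with several distinct biological anchors, explicit hardware-price and spending trajectories, and weights over the anchors; the toy version only shows why its output is a probability-by-year curve rather than a single arrival date.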
Effective Altruism: An Introduction – 80,000 Hours (April 2021)
Six: Ajeya Cotra on worldview diversification and how big the future could be
Imagine that humanity has two possible futures ahead of it: Either we’re going to have a huge future like that, in which trillions of people ultimately exist, or we’re going to wipe ourselves out quite soon, thereby ensuring that only around 100 billion people ever get to live. If there are eventually going to be 1,000 trillion humans, what should we think of the fact that we seemingly find ourselves so early in history? If the future will have many trillions of people, the odds of us appearing so strangely early are very low indeed. If w...
2021-04-12
2h 56
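The "how strange it is to be this early" worry in the blurb above is, at bottom, a short Bayesian calculation. Here is a toy version; the 50/50 prior, the population figures, and the self-sampling-style likelihoods are assumptions chosen purely for illustration, and whether this is even the right way to reason is part of what the episode debates.

```python
# Toy "early observer" update sketched from the blurb above. All numbers and
# the self-sampling-style likelihood rule are illustrative assumptions.

people_so_far = 1e11     # roughly the number of humans who have lived to date
total_if_huge = 1e15     # the "1,000 trillion humans" future
total_if_small = 1e11    # we wipe ourselves out soon

prior_huge = 0.5
prior_small = 0.5

# Likelihood of finding yourself among the earliest ~100 billion people,
# treating yourself as a random draw from everyone who ever lives.
like_huge = people_so_far / total_if_huge    # 1e-4
like_small = people_so_far / total_if_small  # 1.0

posterior_huge = (prior_huge * like_huge) / (
    prior_huge * like_huge + prior_small * like_small)

print(f"P(huge future | we find ourselves this early) ~ {posterior_huge:.6f}")
```

Under these assumptions the update against the huge future is roughly a factor of ten thousand; weighting hypotheses by how many observers they contain instead (as in the coin-flip puzzle in the next entry) largely cancels it, which is why the anthropic assumptions matter so much for how big the future could be.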
80,000 Hours Podcast
#90 – Ajeya Cotra on worldview diversification and how big the future could be
You wake up in a mysterious box, and hear the booming voice of God: “I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it. If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box. To get into heaven, you have to answer this correctly: Which way did the coin land?” You think briefly, and decide you should bet your eternal soul on tails. The fact that you woke up at all s...
2021-01-21
2h 59
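The intro above stops just short of the arithmetic behind betting on tails, so here it is worked out under one candidate rule, the Self-Indication Assumption, which weights each hypothesis by how many observers it creates. Treating SIA as the right rule is an assumption of this sketch; whether it is is precisely what the interview goes on to probe.

```python
# The coin-flip-and-boxes puzzle from the episode intro, under the
# Self-Indication Assumption (SIA): weight each hypothesis by the number of
# observers it creates. Using SIA at all is the assumption being illustrated.

prior_heads = 0.5
prior_tails = 0.5

observers_heads = 10              # ten boxes, one human in each
observers_tails = 10_000_000_000  # ten billion boxes, one human in each

# SIA: P(hypothesis | I exist) is proportional to prior * number of observers.
unnormalized_heads = prior_heads * observers_heads
unnormalized_tails = prior_tails * observers_tails

p_tails = unnormalized_tails / (unnormalized_heads + unnormalized_tails)
print(f"P(tails | I woke up in a box) ~ {p_tails:.9f}")  # ~0.999999999
```

On this rule the mere fact of waking up at all is a billion-to-one update toward the world with more boxes, which is the intuition the quoted intro is gesturing at.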
EA Talks
EAG 2017 London: Implementing cause prioritization at OpenPhil (Ajeya Cotra)
I go through a number of tangles that have come up in trying to translate cause prioritisation theory into practice at OpenPhil, some proposed patches, and remaining open questions. (Credit for most of these ideas goes to other people; I make attributions whenever I can.) Source: Effective Altruism Global (video). Effective Altruism is a social movement dedicated to finding ways to do the most good possible, whether through charitable donations, career choices, or volunteer projects. EA Global conferences are gatherings for EAs to meet. You can also listen to this talk along with its accompanying video...
2018-04-23
30 min