Showing episodes and shows of Ajeya Cotra
ChinaTalk
Wagner, Two Years On (Kamil on Coups and Power)
How does Russia prevent uprisings, and what can other authoritarians learn from Moscow’s methods of coup control? For the second anniversary of the Wagner uprising, ChinaTalk interviewed London-based historian Kamil Galeev, who was also a classmate of Jordan’s at Peking University. We discuss: why the Wagner Group rebelled in 2023, and why the coup attempt ultimately failed; how Wagner shifted the Kremlin’s assessment of internal political challengers; similarities between post-Soviet doomerism and the American right; and historical examples of foreign policy influenced by a victimhood mentality.
2025-07-02
40 min
80,000 Hours Podcast
AGI disagreements and misconceptions: Rob, Luisa, & past guests hash it out
Will LLMs soon be made into autonomous agents? Will they lead to job losses? Is AI misinformation overblown? Will it prove easy or hard to create AGI? And how likely is it that it will feel like something to be a superhuman AGI? With AGI back in the headlines, we bring you 15 opinionated highlights from the show addressing those and other questions, intermixed with opinions from hosts Luisa Rodriguez and Rob Wiblin recorded back in 2023. Check out the full transcript on the 80,000 Hours website. You can decide whether the views we expressed (and...
2025-02-10
3h 12
The Valmy
Ajeya Cotra on AI safety and the future of humanity
Podcast: AI Summer. Episode: Ajeya Cotra on AI safety and the future of humanity. Release date: 2025-01-16. Ajeya Cotra works at Open Philanthropy, a leading funder of efforts to combat existential risks from AI. She has led the foundation’s grantmaking on technical research to understand and reduce catastrophic risks from advanced AI. She is co-author of Planned Obsolescence, a newsletter about AI futurism and AI alignment. Although a committed doomer herself, Cotra has worked hard to unde...
2025-01-17
1h 13
AI Summer
Ajeya Cotra on AI safety and the future of humanity
Ajeya Cotra works at Open Philanthropy, a leading funder of efforts to combat existential risks from AI. She has led the foundation’s grantmaking on technical research to understand and reduce catastrophic risks from advanced AI. She is co-author of Planned Obsolescence, a newsletter about AI futurism and AI alignment. Although a committed doomer herself, Cotra has worked hard to understand the perspectives of AI safety skeptics. In this episode, we asked her to guide us through the contentious debate over AI safety and—perhaps—explain why people with similar views on other issues frequently reach divergent views...
2025-01-16
1h 13
AI Safety Fundamentals
Biological Anchors: A Trick That Might Or Might Not Work
I've been trying to review and summarize Eliezer Yudkowsky's recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents. Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra's talking about and what's going on. The Open Philanthropy Project ("Open Phil") is a big effective altruist foundation interested in funding AI safety. It's got $20 billion, probably the majority of money in the field, so its decisions matter a lot and it’s very invested in getting things right. In 2020, it asked seni...
2025-01-04
1h 10
DealBook Summit
The A.I. Revolution
A panel of leading voices in A.I., including experts on capabilities, safety and investing, and policy and governance, tease out some of the big debates over the future of A.I. and try to find some common ground. The discussion is moderated by Kevin Roose, a technology columnist at The Times. Participants: Jack Clark, co-founder and head of policy at Anthropic; Ajeya Cotra, senior program officer for potential risks from advanced A.I. at Open Philanthropy; Sarah Guo, founder and managing partner at Conviction; Dan Hendrycks, director of the Center for A.I. Safety; Rana el Kaliouby, co-founder and...
2024-12-11
1h 31
AI-Generated Audio for Planned Obsolescence
OpenAI's CBRN tests seem unclear
OpenAI says o1-preview can't meaningfully help novices make chemical and biological weapons. Their test results don’t clearly establish this. https://planned-obsolescence.org/openais-cbrn-tests-seem-unclear
2024-11-21
13 min
AI-Generated Audio for Planned Obsolescence
Dangerous capability tests should be harder
We should be spending less time proving today’s AIs are safe and more time figuring out how to tell if tomorrow’s AIs are dangerous: planned-obsolescence.org/dangerous-capability-tests-should-be-harder
2024-08-20
08 min
80,000 Hours Podcast
#90 Classic episode – Ajeya Cotra on worldview diversification and how big the future could be
You wake up in a mysterious box, and hear the booming voice of God: “I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it. If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box. To get into heaven, you have to answer this correctly: Which way did the coin land?”You think briefly, and decide you should bet your eternal soul on tails. The fact that you woke up at all seems like pretty g...
2024-01-12
2h 59
LessWrong (Curated & Popular)
[HUMAN VOICE] "AI Timelines" by habryka, Daniel Kokotajlo, Ajeya Cotra, Ege Erdil
Support ongoing human narrations of curated posts: www.patreon.com/LWCurated How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still disagree substantially about the relevant timescales. For instance, here are their median timelines for one operationalization of transformative AI: Source: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j...
2023-11-17
1h 18
Pivot
AI Ethics at Code 2023
Platformer's Casey Newton moderates a conversation at Code 2023 on ethics in artificial intelligence, with Ajeya Cotra, Senior Program Officer at Open Philanthropy, and Helen Toner, Director of Strategy at Georgetown University’s Center for Security and Emerging Technology. The panel discusses the risks and rewards of the technology, as well as best practices and safety measures.Recorded on September 27th in Los Angeles. Learn more about your ad choices. Visit podcastchoices.com/adchoices
2023-10-25
28 min
AI-Generated Audio for Planned Obsolescence
Scale, schlep, and systems
This startlingly fast progress in LLMs was driven both by scaling up LLMs and doing schlep to make usable systems out of them. We think scale and schlep will both improve rapidly: planned-obsolescence.org/scale-schlep-and-systems
2023-10-10
09 min
The 80,000 Hours Podcast on Artificial Intelligence (September 2023)
Two: Ajeya Cotra on accidentally teaching AI models to deceive us
Originally released in May 2023. Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of...
2023-09-02
2h 49
AI-Generated Audio for Planned Obsolescence
Language models surprised us
Most experts were surprised by progress in language models in 2022 and 2023. There may be more surprises ahead, so experts should register their forecasts now about 2024 and 2025: https://planned-obsolescence.org/language-models-surprised-us
2023-08-29
08 min
80k After Hours
Highlights: #151 – Ajeya Cotra on accidentally teaching AI models to deceive us
This is a selection of highlights from episode #151 of The 80,000 Hours Podcast. These aren't necessarily the most important, or even most entertaining parts of the interview — and if you enjoy this, we strongly recommend checking out the full episode: Ajeya Cotra on accidentally teaching AI models to deceive us. And if you're finding these highlights episodes valuable, please let us know by emailing podcast@80000hours.org. Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or...
2023-08-02
25 min
The Inside View
Curtis Huebner on Doom, AI Timelines and Alignment at EleutherAI
Curtis, also known on the internet as AI_WAIFU, is the head of Alignment at EleutherAI. In this episode we discuss the massive orders of H100s from different actors, why he thinks AGI is 4-5 years away, why he thinks we're 90% "toast", his comment on Eliezer Yudkowsky's Death with Dignity, and what kind of Alignment projects are currently going on at EleutherAI, especially a project with Markov chains and the Alignment test project that he is currently leading. Youtube: https://www.youtube.com/watch?v=9s3XctQOgew Transcript: https://theinsideview.ai...
2023-07-16
1h 29
AI-Generated Audio for Planned Obsolescence
Could AI accelerate economic growth?
Most new technologies don’t accelerate the pace of economic growth. But advanced AI might do this by massively increasing the research effort going into developing new technologies.
2023-06-06
04 min
Hard Fork
The Surgeon General’s Social Media Warning + A.I.’s Existential Risks
The U.S. surgeon general, Dr. Vivek Murthy, says social media poses a “profound risk of harm” to young people. Why do some in the tech industry disagree? Then, Ajeya Cotra, an A.I. researcher, on how A.I. could lead to a doomsday scenario. Plus: Pass the hat. Kevin and Casey play a game they call HatGPT. On today’s episode: Ajeya Cotra is a senior research analyst at Open Philanthropy. Additional reading: The surgeon general issued an advisory about the risks of social media for young people. Ajeya Cotra...
2023-05-26
1h 13
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
[Bonus Episode] Connor Leahy on AGI, GPT-4, and Cognitive Emulation w/ FLI Podcast
[Bonus Episode] Future of Life Institute Podcast host Gus Docker interviews Conjecture CEO Connor Leahy to discuss GPT-4, magic, cognitive emulation, demand for human-like AI, and aligning superintelligence. You can read more about Connor's work at https://conjecture.dev. Future of Life Institute is the organization that recently published an open letter calling for a six-month pause on training new AI systems. FLI was founded by Jaan Tallinn, who we interviewed in Episode 16 of The Cognitive Revolution. We think their podcast is excellent. They frequently interview critical thinkers in AI like Neel Nanda, Ajeya Cotra...
2023-05-19
1h 41
80,000 Hours Podcast
#151 – Ajeya Cotra on accidentally teaching AI models to deceive us
Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of reasons. Today's guest Aj...
2023-05-12
2h 49
AI-Generated Audio for Planned Obsolescence
The costs of caution
Both AI fears and AI hopes rest on the belief that it may be possible to build alien minds that can do everything we can do and much more. AI-driven technological progress could save countless lives and make everyone massively healthier and wealthier: https://planned-obsolescence.org/the-costs-of-caution
2023-05-01
04 min
AI-Generated Audio for Planned Obsolescence
Continuous doesn't mean slow
Once a lab trains AI that can fully replace its human employees, it will be able to multiply its workforce 100,000x. If these AIs do AI research, they could develop vastly superhuman systems in under a year: https://planned-obsolescence.org/continuous-doesnt-mean-slow
2023-04-12
03 min
AI-Generated Audio for Planned Obsolescence
AIs accelerating AI research
Researchers could potentially design the next generation of ML models more quickly by delegating some work to existing models, creating a feedback loop of ever-accelerating progress. https://planned-obsolescence.org/ais-accelerating-ai-research
2023-04-04
05 min
AI-Generated Audio for Planned Obsolescence
Is it time for a pause?
The single most important thing we can do is to pause when the next model we train would be powerful enough to obsolete humans entirely. If it were up to me, I would slow down AI development starting now — and then later slow down even more: https://www.planned-obsolescence.org/is-it-time-for-a-pause/
2023-03-30
06 min
AI-Generated Audio for Planned Obsolescence
Alignment researchers disagree a lot
Many fellow alignment researchers may be operating under radically different assumptions from you: https://www.planned-obsolescence.org/disagreement-in-alignment/
2023-03-27
03 min
AI-Generated Audio for Planned Obsolescence
Training AIs to help us align AIs
If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/
2023-03-27
04 min
AI-Generated Audio for Planned Obsolescence
Playing the training game
We're creating incentives for AI systems to make their behavior look as desirable as possible, while intentionally disregarding human intent when that conflicts with maximizing reward: https://www.planned-obsolescence.org/the-training-game/
2023-03-27
07 min
AI-Generated Audio for Planned Obsolescence
Situational awareness
AI systems that have a precise understanding of how they’ll be evaluated and what behavior we want them to display will earn more reward than AI systems that don’t: https://www.planned-obsolescence.org/situational-awareness/
2023-03-27
07 min
AI-Generated Audio for Planned Obsolescence
"Aligned" shouldn't be a synonym for "good"
Perfect alignment just means that AI systems won’t want to deliberately disregard their designers' intent; it's not enough to ensure AI is good for the world: https://www.planned-obsolescence.org/aligned-vs-good/
2023-03-27
06 min
AI-Generated Audio for Planned Obsolescence
What we're doing here
We’re trying to think ahead to a possible future in which AI is making all the most important decisions: https://www.planned-obsolescence.org/what-were-doing-here/
2023-03-27
04 min
AI-Generated Audio for Planned Obsolescence
The ethics of AI red-teaming
If we’ve decided we’re collectively fine with unleashing millions of spam bots, then the least we can do is actually study what they can – and can’t – do: https://www.planned-obsolescence.org/ethics-of-red-teaming/
2023-03-27
02 min
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
The Embedding Revolution: Anton Troynikov on Chroma, Stable Attribution, and future of AI
We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com. (0:00) Preview (1:17) Sponsor (4:00) Anton breaks down the advantages of vector databases (4:45) How embeddings have created an AI-native way to represent data (11:50) Anton identifies the watershed moment and step changes in AI (12:55) OpenAI’s pricing (18:50) How Chroma works (33:04) Stable Attribution and...
2023-03-02
1h 28
TYPE III AUDIO (All episodes)
"Literature review of Transformative Artificial Intelligence timelines" by Jaime Sevilla
This is a linkpost for https://epochai.org/blog/literature-review-of-transformative-artificial-intelligence-timelines. We summarize and compare several models and forecasts predicting when transformative AI will be developed. Highlights: The review includes quantitative models, including both outside and inside view, and judgment-based forecasts by (teams of) experts. While we do not necessarily endorse their conclusions, the inside-view...
2023-02-10
10 min
Future of Life Institute Podcast
Ajeya Cotra on Thinking Clearly in a Rapidly Changing World
Ajeya Cotra joins us to talk about thinking clearly in a rapidly changing world. Learn more about the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:44 The default versus the accelerating picture of the future 04:25 The role of AI in accelerating change 06:48 Extrapolating economic growth 08:53 How do we know whether the pace of change is accelerating? 15:07 How can we cope with a rapidly changing world? 18:50 How could the future be utopian? 22:03 Is accelerating technological progress immoral? 25:43 Should we imagine concrete future scenarios? 31:15 How should we act in an accelerating world? 34:41 How Ajeya could be wrong about...
2022-11-10
44 min
The Valmy
Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
Podcast: Future of Life Institute Podcast. Episode: Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe. Release date: 2022-11-03. Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 AI safety research in general 02:04 Realistic scenarios for AI catastrophes 06:51 A dangerous AI model developed in the near future 09:10 Assumptions behind dangerous AI development 14:45 Can AIs learn long-term planning? 18:09 Can AIs understand human psychology? 22:32 Training an...
2022-11-05
54 min
Future of Life Institute Podcast
Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 AI safety research in general 02:04 Realistic scenarios for AI catastrophes 06:51 A dangerous AI model developed in the near future 09:10 Assumptions behind dangerous AI development 14:45 Can AIs learn long-term planning? 18:09 Can AIs understand human psychology? 22:32 Training an AI model with naive safety features 24:06 Can AIs be deceptive? 31:07 What happens after deploying an unsafe AI system? 44:03 What can we do to prevent an AI catastrophe? 53:58 The next episode
2022-11-03
54 min
TYPE III AUDIO (All episodes)
LessWrong: "How might we align transformative AI if it’s developed very soon?" by Holden Karnofsky
https://www.lesswrong.com/posts/rCJQAkPTEypGjSJ8X/how-might-we-align-transformative-ai-if-it-s-developed-very This post is part of my AI strategy nearcasting series: trying to answer key strategic questions about transformative AI, under the assumption that key events will happen very soon, and/or in a world that is otherwise very similar to today's. This post gives my understanding of what the set of available strategies for aligning transformative AI...
2022-11-03
1h 39
Future of Life Institute Podcast
Ajeya Cotra on Forecasting Transformative Artificial Intelligence
Ajeya Cotra joins us to discuss forecasting transformative artificial intelligence. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org Timestamps: 00:00 Introduction 00:53 Ajeya's report on AI 01:16 What is transformative AI? 02:09 Forecasting transformative AI 02:53 Historical growth rates 05:10 Simpler forecasting methods 09:01 Biological anchors 16:31 Different paths to transformative AI 17:55 Which year will we get transformative AI? 25:54 Expert opinion on transformative AI 30:08 Are today's machine learning techniques enough? 33:06 Will AI be limited by the physical world and regulation? 38:15 Will AI be limited by training data? 41:48 Are there human abilities that AIs cannot learn? 47:22 The next episode
2022-10-27
47 min
LessWrong (Curated & Popular)
"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra
https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I think that in the coming 15-30 years, the world could plausibly develop “transformative AI”: AI powerful enough to bring us into a new, qualitatively different future, via an explosion in science and technology R&D. This sort of AI could be sufficient to make this the most important century of all time for humanity. The most straightforward vision for developing transformative AI that I can imagine working with very litt...
2022-09-27
3h 07
LessWrong (Curated & Popular)
"Two-year update on my personal AI timelines" by Ajeya Cotra
https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines#fnref-fwwPpQFdWM6hJqwuY-12 Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I worked on my draft report on biological anchors for forecasting AI timelines mainly between ~May 2019 (three months after the release of GPT-2) and ~Jul 2020 (a month after the release of GPT-3), and posted it on LessWrong in Sep 2020 after an internal review process. At the time, my bottom line estimates from the bio anchors modeling exercise were:[1] Roughly ~15% probability of transformative AI by 2036[2] (16 years from posting the report; 14 years...
2022-09-22
39 min
Future Matters
#5: supervolcanoes, AI takeover, and What We Owe the Future
Future Matters is a newsletter about longtermism brought to you by Matthew van der Merwe and Pablo Stafforini. Each month we collect and summarize longtermism-relevant research, share news from the longtermism community, and feature a conversation with a prominent researcher. You can also subscribe on Substack, read on the EA Forum and follow on Twitter. 00:00 Welcome to Future Matters. 01:08 MacAskill — What We Owe the Future. 01:34 Lifland — Samotsvety's AI risk forecasts. 02:11 Halstead — Climate Change and Longtermism. 02:43 Good Judgment — Long-term risks and climate change. 02:54 Thorstad — Existential risk pessimism and the time of perils. 03:32 Hamilton — Space and existential risk. 04:07 Cassidy & Mani — Huge...
2022-09-13
31 min
EA Talks
What is an Effective Altruist? EARadio Trailer
Should we spend money on guide dogs for the blind or bed nets to protect kids from disease-carrying mosquitoes? Life is full of choices like this. So how can we help others best? And what happens when we fall short? This trailer was adapted from Ajeya Cotra's Introduction to EA. You can find the original talk here. To suggest an episode or interview, please e-mail us at contact@earad.io. If you're new to effective altruism, you can learn more about it by listening to the podcast or reading this...
2022-09-01
01 min
Clearer Thinking with Spencer Greenberg
Critiquing Effective Altruism (with Michael Nielsen and Ajeya Cotra)
Read the full transcript here. What is Effective Altruism? Which parts of the Effective Altruism movement are good and not so good? Who outside of the EA movement are doing lots of good in the world? What are the psychological effects of thinking constantly about the trade-offs of spending resources on ourselves versus on others? To what degree is the EA movement centralized intellectually, financially, etc.? Does the EA movement's tendency to quantify everything, to make everything legible to itself, cause it to miss important features of the world? To what extent do EA people rationalize spending...
2022-08-20
1h 38
EA Talks
Introduction to EA | Ajeya Cotra | EAGxBerkeley 2016
Ajeya Cotra introduces the core principles of effective altruism. This talk was taken from EAGxBerkeley 2016. Click here to watch the talk with the video. Effective Altruism is a social movement dedicated to finding ways to do the most good possible, whether through charitable donations, career choices, or volunteer projects. EA Global conferences are gatherings for EAs to meet. You can also listen to this talk along with its accompanying video on YouTube.
2022-08-09
35 min
EA Talks
SERI 2022: Timelines for Transformative AI and Language Model Alignment | Ajeya Cotra
Ajeya Cotra is a Senior Research Analyst at Open Philanthropy. She’s currently thinking about how difficult it may be to ensure AI systems pursue the right goals. Previously, she worked on a framework for estimating when transformative AI may be developed, as well as various cause prioritization and worldview diversification projects. She joined Open Philanthropy in July 2016 as a Research Analyst. Ajeya received a B.S. in Electrical Engineering and Computer Science from UC Berkeley, where she co-founded the Effective Altruists of Berkeley student group and taught a course on effective altruism. This video was first pu...
2022-08-06
28 min
The Inside View
Ethan Caballero–Scale is All You Need
Ethan is known on Twitter as the edgiest person at MILA. We discuss all the gossip around scaling large language models in what will later be known as the Edward Snowden moment of Deep Learning. In his free time, Ethan is a Master’s degree student at MILA in Montreal, and has published papers on out-of-distribution generalization and robustness generalization, accepted both as oral presentations and spotlight presentations at ICML and NeurIPS. Ethan has recently been thinking about scaling laws, both as an organizer and speaker for the 1st Neural Scaling Laws Workshop. Transcript: https://th...
2022-05-05
51 min
AXRP - the AI X-risk Research Podcast
13 - First Principles of AGI Safety with Richard Ngo
How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment. Topics we discuss, and timestamps: - 00:00:40 - The nature of intelligence and AGI - 00:01:18 - The nature of intelligence - 00:06:09 - AGI: what and how ...
2022-03-31
1h 33
Astral Codex Ten Podcast
Biological Anchors: A Trick That Might Or Might Not Work
https://astralcodexten.substack.com/p/biological-anchors-a-trick-that-might?utm_source=url Introduction I've been trying to review and summarize Eliezer Yudkowsky's recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents. Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra's talking about and what's going on. The Open Philanthropy Project ("Open Phil") is a big effective altruist foundation interested in funding AI safety. It's got $20 billion, probably the majority of money in the field, so its decisions matter a...
2022-02-24
1h 10
The Nonlinear Library: Alignment Section
(Part 4/4) Forecasting TAI with biological anchors by Ajeya Cotra. Timelines estimates and responses to objections
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part four of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 4: Timelines estimates and responses to objections This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it...
2021-12-23
1h 38
The Nonlinear Library: Alignment Section
(Part 3/4) Forecasting TAI with biological anchors by Ajeya Cotra. Hypotheses and 2020 training computation requirements
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part three of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 3: Hypotheses and 2020 training computation requirements This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it pu...
2021-12-23
1h 20
The Nonlinear Library: Alignment Section
(Part 2/4) Forecasting TAI with biological anchors by Ajeya Cotra. How training data requirements scale with parameter count
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part two of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 2: How training data requirements scale with parameter count This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We ar...
2021-12-23
1h 20
The Nonlinear Library: Alignment Section
(Part 1/4) Forecasting TAI with biological anchors by Ajeya Cotra. Overview, conceptual foundations, and runtime computation.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Forecasting TAI with biological anchors, published by Ajeya Cotra. Part 1: Overview, conceptual foundations, and runtime computation This report emerged from discussions with our technical advisors Dario Amodei and Paul Christiano. However, it should not be treated as representative of either of their views; the project eventually broadened considerably, and my conclusions are my own. This is a work in progress and does not represent Open Philanthropy’s institutional view. We are making it...
2021-12-18
1h 38
The Nonlinear Library: Alignment Section
(Part 1/2) Is power-seeking AI an existential risk? by Joseph Carlsmith
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Is power-seeking AI an existential risk?, published by Joseph Carlsmith. 1. Introduction Some worry that the development of advanced artificial intelligence will result in existential catastrophe -- that is, the destruction of humanity’s longterm potential. Here I examine the following version of this worry (it’s not the only version): By 2070: It will become possible and financially feasible to build AI systems with the following properties: Advanced capability: they outperform the best huma...
2021-12-18
1h 29
The Nonlinear Library: Alignment Section
(Part 2/2) Eliciting latent knowledge: How to tell if your eyes deceive you by Paul Christiano, Ajeya Cotra, and Mark Xu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part two of: Eliciting latent knowledge: How to tell if your eyes deceive you, published by Paul Christiano, Ajeya Cotra, and Mark Xu. Why we’re excited about tackling worst-case ELK We think that worst-case ELK — i.e. the problem of devising a training strategy to get an AI to report what it knows no matter how its mind is shaped internally — is one of the most exciting open problems in alignment theory (if not the mo...
2021-12-18
2h 02
The Nonlinear Library: Alignment Section
(Part 1/2) Eliciting latent knowledge: How to tell if your eyes deceive you by Paul Christiano, Ajeya Cotra, and Mark Xu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is part one of: Eliciting latent knowledge: How to tell if your eyes deceive you, published by Paul Christiano, Ajeya Cotra, and Mark Xu. In this post, we’ll present ARC’s approach to an open problem we think is central to aligning powerful machine learning (ML) systems: Suppose we train a model to predict what the future will look like according to cameras and other sensors. We then use planning algorithms to find a sequence of a...
2021-12-15
1h 02
The Nonlinear Library: LessWrong Top Posts
The case for aligning narrowly superhuman models by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The case for aligning narrowly superhuman models, published by Ajeya Cotra on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and...
2021-12-12
53 min
The Nonlinear Library: LessWrong Top Posts
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has...
2021-12-12
01 min
The Nonlinear Library: LessWrong Top Posts
My research methodology by paulfchristiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by paulfchristiano on the LessWrong. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most...
2021-12-11
23 min
The Nonlinear Library: LessWrong Top Posts
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" by Rob Bensinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models", published by Rob Bensinger on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Below, I’ve copied comments left by MIRI researchers Eliezer Yudkowsky and Evan Hubinger on March 1–3 on a draft of Ajeya Cotra’s "Case for Aligning Narrowly Superhuman Models." I've included back-and-forths with Cotra, and interjections by me and Rohin Shah. The section divisi...
2021-12-11
39 min
The Nonlinear Library: LessWrong Top Posts
Redwood Research’s current project
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project , published by Buck on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaway...
2021-12-11
22 min
The Nonlinear Library: LessWrong Top Posts
The theory-practice gap by Buck
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap, published by Buck on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of...
2021-12-11
10 min
The Nonlinear Library: Alignment Forum Top Posts
The case for aligning narrowly superhuman models by Ajeya Cotra
I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and how our giving should be structured). We are not seeking grant applications on this topic right now. Thanks to Daniel Dewey, Eliezer Yudkowsky, Evan Hubinger, Holden Karnofsky, Jared Kaplan, Mike Levine, Nick Beckstead, Owen Cotton-Barratt, Paul Christiano, Rob Bensinger, and Rohin Shah for comments on earlier drafts. A genre of t...
2021-12-10
50 min
The Nonlinear Library: Alignment Forum Top Posts
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the AI Alignment Forum. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has been on the high-level structure of the framework...
2021-12-10
01 min
The Nonlinear Library: Alignment Forum Top Posts
My research methodology by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most possible ML techniques for avoiding egre...
2021-12-10
23 min
The Nonlinear Library: Alignment Forum Top Posts
Seeking Power is Often Convergently Instrumental in MDPs by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Seeking Power is Often Convergently Instrumental in MDPs, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most poss...
2021-12-10
23 min
The Nonlinear Library: Alignment Forum Top Posts
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" by Rob Bensinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models", published by Rob Bensinger on the AI Alignment Forum. Below, I’ve copied comments left by MIRI researchers Eliezer Yudkowsky and Evan Hubinger on March 1–3 on a draft of Ajeya Cotra’s "Case for Aligning Narrowly Superhuman Models." I've included back-and-forths with Cotra, and interjections by me and Rohin Shah. The section divisions below correspond to the sections in Cotra's post. 0. Introd...
2021-12-10
39 min
The Nonlinear Library: Alignment Forum Top Posts
Redwood Research’s current project by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project, published by Buck Shlegeris on the AI Alignment Forum. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaways I’ve had from the project so far. There are...
2021-12-10
22 min
The Nonlinear Library: Alignment Forum Top Posts
The theory-practice gap by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap by Buck Shlegeris, published by Buck Shlegeris on the AI Alignment Forum. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of some different AI systems. You...
2021-12-10
10 min
The Nonlinear Library: Alignment Forum Top Posts
Paul's research agenda FAQ by Alex Zhu
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Paul's research agenda FAQ, published by Alex Zhu on the AI Alignment Forum. I think Paul Christiano’s research agenda for the alignment of superintelligent AGIs presents one of the most exciting and promising approaches to AI safety. After being very confused about Paul’s agenda, chatting with others about similar confusions, and clarifying with Paul many times over, I’ve decided to write a FAQ addressing common confusions around his agenda. This F...
2021-12-06
33 min
The Nonlinear Library: Alignment Forum Top Posts
Against GDP as a metric for timelines and takeoff speeds by Daniel Kokotajlo
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Against GDP as a metric for timelines and takeoff speeds, published by Daniel Kokotajlo on the AI Alignment Forum. [Epistemic status: Strong opinion, lightly held] I think world GDP (and economic growth more generally) is overrated as a metric for AI timelines and takeoff speeds. Here are some uses of GDP that I disagree with, or at least think should be accompanied by cautionary notes: Timelines: Ajeya Cotra thinks of...
2021-12-06
23 min
The Nonlinear Library: Alignment Forum Top Posts
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda by Chi Nguyen
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda, published by Chi Nguyen on the AI Alignment Forum. Crossposted from the EA forum. You can read this post as a google docs instead (IMO much better to read). This document aims to clarify the AI safety research agenda by Paul Christiano (IDA) and the arguments around how promising it is. Target audience: All levels of...
2021-12-06
1h 05
The Nonlinear Library: Alignment Section
Teaching ML to answer questions honestly instead of predicting human answers by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Teaching ML to answer questions honestly instead of predicting human answers, published by Paul Christiano on the AI Alignment Forum. (Note: very much work in progress, unless you want to follow along with my research you'll probably want to wait for an improved/simplified/clarified algorithm.) In this post I consider the particular problem of models learning “predict how a human would answer questions” instead of “answer questions honestly.” (A special case of the problem from Ina...
2021-11-19
23 min
The Nonlinear Library: Alignment Section
My research methodology by Paul Christiano
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My research methodology, published by Paul Christiano on the AI Alignment Forum. (Thanks to Ajeya Cotra, Nick Beckstead, and Jared Kaplan for helpful comments on a draft of this post.) I really don’t want my AI to strategically deceive me and resist my attempts to correct its behavior. Let’s call an AI that does so egregiously misaligned (for the purpose of this post). Most possible ML techniques for avoiding egregious misalignment depend on deta...
2021-11-19
23 min
The Nonlinear Library: Alignment Section
How do we become confident in the safety of a machine learning system? by Evan Hubinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How do we become confident in the safety of a machine learning system?, published by Evan Hubinger on the AI Alignment Forum. Thanks to Rohin Shah, Ajeya Cotra, Richard Ngo, Paul Christiano, Jon Uesato, Kate Woolverton, Beth Barnes, and William Saunders for helpful comments and feedback. Evaluating proposals for building safe advanced AI—and actually building any degree of confidence in their safety or lack thereof—is extremely difficult. Previously, in “An overview of 11 proposals for bu...
2021-11-19
50 min
The Nonlinear Library: Alignment Section
Automating Auditing: An ambitious concrete technical research proposal by Evan Hubinger
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Automating Auditing: An ambitious concrete technical research proposal, published by Evan Hubinger on the AI Alignment Forum. This post was originally written as a research proposal for the new AI alignment research organization Redwood Research, detailing an ambitious, concrete technical alignment proposal that I’m excited about work being done on, in a similar vein to Ajeya Cotra’s “The case for aligning narrowly superhuman models.” Regardless of whether Redwood actually ends up working on this pro...
2021-11-19
22 min
The Nonlinear Library: Alignment Section
Techniques for enhancing human feedback by abergal, Ajeya Cotra, Nick_Beckstead
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Techniques for enhancing human feedback, published by abergal, Ajeya Cotra, Nick_Beckstead on the AI Alignment Forum. Training powerful models to maximize simple metrics (such as quarterly profits) could be risky. Sufficiently intelligent models could discover strategies for maximizing these metrics in perverse and unintended ways. For example, the easiest way to maximize profits may turn out to involve stealing money, manipulating whoever keeps records into reporting unattainably high profits, capturing regulators of the industry...
2021-11-17
04 min
The Nonlinear Library: Alignment Section
The case for aligning narrowly superhuman models by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The case for aligning narrowly superhuman models, published by Ajeya Cotra on the AI Alignment Forum. I wrote this post to get people’s takes on a type of work that seems exciting to me personally; I’m not speaking for Open Phil as a whole. Institutionally, we are very uncertain whether to prioritize this (and if we do where it should be housed and how our giving should be structured). We are not seeking gran...
2021-11-17
00 min
The Nonlinear Library: Alignment Section
AMA on EA Forum: Ajeya Cotra, researcher at Open Phil by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AMA on EA Forum: Ajeya Cotra, researcher at Open Phil, published by Ajeya Cotra on the AI Alignment Forum. This is a linkpost for Hi all, I'm Ajeya, and I'll be doing an AMA on the EA Forum (this is a linkpost for my announcement there). I would love to get questions from LessWrong and Alignment Forum users as well -- please head on over if you have any questions for me! I’ll plan to...
2021-11-17
01 min
The Nonlinear Library: Alignment Section
Draft report on AI timelines by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Draft report on AI timelines, published by Ajeya Cotra on the AI Alignment Forum. Hi all, I've been working on some AI forecasting research and have prepared a draft report on timelines to transformative AI. I would love feedback from this community, so I've made the report viewable in a Google Drive folder here. With that said, most of my focus so far has been on the high-level structure of the framework, so the particular...
2021-11-17
01 min
The Nonlinear Library: Alignment Section
Iterated Distillation and Amplification by Ajeya Cotra
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Iterated Distillation and Amplification, published by Ajeya Cotra on the AI Alignment Forum. This is a guest post summarizing Paul Christiano’s proposed scheme for training machine learning systems that can be robustly aligned to complex and fuzzy values, which I call Iterated Distillation and Amplification (IDA) here. IDA is notably similar to AlphaGoZero and expert iteration. The hope is that if we use IDA to train each learned component of an AI then the ov...
2021-11-17
10 min
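The entry above compresses the whole IDA scheme into a single sentence, so here is a minimal sketch of the loop it describes: an amplification step in which a human plus copies of the current agent answer a question by decomposition, followed by a distillation step in which a faster agent is trained to imitate that slower system. All names below (Agent, amplify, distill, and the memorization stand-in for training) are illustrative placeholders, not code from the summarized post.

```python
# A minimal sketch of the Iterated Distillation and Amplification (IDA) loop.
# Names and the "memorization" stand-in for training are illustrative
# placeholders, not code from the post summarized above.

from typing import Callable, List


class Agent:
    """Stand-in for a learned model mapping questions to answers."""

    def __init__(self, policy: Callable[[str], str]):
        self.policy = policy

    def answer(self, question: str) -> str:
        return self.policy(question)


def amplify(decompose: Callable[[str], List[str]],
            recombine: Callable[[str, List[str]], str],
            agent: Agent) -> Callable[[str], str]:
    """Amplification: a human decomposes the question, delegates each
    sub-question to the current agent, then recombines the sub-answers,
    yielding a slower but hopefully more capable question-answerer."""
    def amplified(question: str) -> str:
        subanswers = [agent.answer(q) for q in decompose(question)]
        return recombine(question, subanswers)
    return amplified


def distill(amplified: Callable[[str], str], questions: List[str]) -> Agent:
    """Distillation: train a fast agent to imitate the amplified system
    (caricatured here as memorizing the amplified answers)."""
    dataset = {q: amplified(q) for q in questions}
    return Agent(lambda q: dataset.get(q, "unknown"))


def ida(agent: Agent, rounds: int, questions: List[str],
        decompose: Callable[[str], List[str]],
        recombine: Callable[[str, List[str]], str]) -> Agent:
    """Alternate amplification and distillation for a fixed number of rounds."""
    for _ in range(rounds):
        agent = distill(amplify(decompose, recombine, agent), questions)
    return agent
```

The hope, as the excerpt above begins to say, is that repeating this loop lets each distilled agent approach the capability of the human-plus-agents system it imitates without drifting away from what that system would endorse.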
The Nonlinear Library: Alignment Section
Alignment Newsletter #35 by Rohin Shah
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Alignment Newsletter #35, published by Rohin Shah on the AI Alignment Forum. Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. This week we don't have any explicit highlights, but remember to treat the sequences as though they were highlighted! Technical AI alignment Iterated amplification sequence Corrigibility (Paul Christiano): A corrigible agent is one which helps its operator...
2021-11-17
11 min
The Nonlinear Library: Alignment Section
Redwood Research’s current project by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Redwood Research’s current project, published by Buck Shlegeris on the AI Alignment Forum. Here’s a description of the project Redwood Research is working on at the moment. First I’ll say roughly what we’re doing, and then I’ll try to explain why I think this is a reasonable applied alignment project, and then I’ll talk a bit about the takeaways I’ve had from the project so far. There are a bunch of p...
2021-11-16
22 min
The Nonlinear Library: Alignment Section
The theory-practice gap by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The theory-practice gap, published by Buck Shlegeris on the AI Alignment Forum. [Thanks to Richard Ngo, Damon Binder, Summer Yue, Nate Thomas, Ajeya Cotra, Alex Turner, and other Redwood Research people for helpful comments; thanks Ruby Bloom for formatting this for the Alignment Forum for me.] I'm going to draw a picture, piece by piece. I want to talk about the capability of some different AI systems. You can see here that we've drawn the...
2021-11-16
10 min
Cold Takes Audio
Why AI alignment could be hard with modern deep learning (guest post by Ajeya Cotra)
Why would we program AI that wants to harm us? Because we might not know how to do otherwise. https://www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/
2021-09-20
28 min
AXRP - the AI X-risk Research Podcast
7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra
If you want to shape the development and forecast the consequences of powerful AI technology, it's important to know when it might appear. In this episode, I talk to Ajeya Cotra about her draft report "Forecasting Transformative AI from Biological Anchors" which aims to build a probabilistic model to answer this question. We talk about a variety of topics, including the structure of the model, what the most important parts are to get right, how the estimates should shape our behaviour, and Ajeya's current work at Open Philanthropy and perspective on the AI x-risk landscape. U...
2021-05-28
01 min
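The description only gestures at the structure of the report's probabilistic model, so here is a deliberately toy sketch of the general shape of a biological-anchors-style forecast. Every constant and distribution below is my own assumption for illustration and none of them are figures from the draft report: put a distribution over the training compute required for transformative AI, project how much compute could plausibly be spent on a single run each year, and read off the probability that the requirement has been met by a given date.

```python
# Toy Monte Carlo in the general spirit of a biological-anchors forecast.
# Every constant and distribution is an illustrative assumption for this
# listing, not a number taken from the draft report discussed in the episode.

import random

random.seed(0)
N = 100_000

# Uncertainty over training FLOP required for transformative AI, expressed as
# a lognormal over orders of magnitude (assumed location and spread).
required_flop = [10 ** random.gauss(35, 3) for _ in range(N)]


def affordable_flop(year: int) -> float:
    """Assumed budget for one training run: 1e24 FLOP in 2020, growing
    tenfold every five years."""
    return 1e24 * 10 ** ((year - 2020) / 5)


def p_tai_by(year: int) -> float:
    """Fraction of sampled requirements already affordable by `year`."""
    budget = affordable_flop(year)
    return sum(flop <= budget for flop in required_flop) / N


for year in (2030, 2040, 2050, 2060):
    print(year, round(p_tai_by(year), 3))
```

The actual report is far richer, with several distinct biological anchors, explicit hardware-price and spending trajectories, and weights over the anchors; the toy version only shows why its output is a probability-by-year curve rather than a single arrival date.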
Effective Altruism: An Introduction – 80,000 Hours (April 2021)
Six: Ajeya Cotra on worldview diversification and how big the future could be
Imagine that humanity has two possible futures ahead of it: Either we’re going to have a huge future like that, in which trillions of people ultimately exist, or we’re going to wipe ourselves out quite soon, thereby ensuring that only around 100 billion people ever get to live. If there are eventually going to be 1,000 trillion humans, what should we think of the fact that we seemingly find ourselves so early in history? If the future will have many trillions of people, the odds of us appearing so strangely early are very low indeed. If w...
2021-04-12
2h 56
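The "how strange it is to be this early" worry in the blurb above is, at bottom, a short Bayesian calculation. Here is a toy version; the 50/50 prior, the population figures, and the self-sampling-style likelihoods are assumptions chosen purely for illustration, and whether this is even the right way to reason is part of what the episode debates.

```python
# Toy "early observer" update sketched from the blurb above. All numbers and
# the self-sampling-style likelihood rule are illustrative assumptions.

people_so_far = 1e11     # roughly the number of humans who have lived to date
total_if_huge = 1e15     # the "1,000 trillion humans" future
total_if_small = 1e11    # we wipe ourselves out soon

prior_huge = 0.5
prior_small = 0.5

# Likelihood of finding yourself among the earliest ~100 billion people,
# treating yourself as a random draw from everyone who ever lives.
like_huge = people_so_far / total_if_huge    # 1e-4
like_small = people_so_far / total_if_small  # 1.0

posterior_huge = (prior_huge * like_huge) / (
    prior_huge * like_huge + prior_small * like_small)

print(f"P(huge future | we find ourselves this early) ~ {posterior_huge:.6f}")
```

Under these assumptions the update against the huge future is roughly a factor of ten thousand; weighting hypotheses by how many observers they contain instead (as in the coin-flip puzzle in the next entry) largely cancels it, which is why the anthropic assumptions matter so much for how big the future could be.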
80,000 Hours Podcast
#90 – Ajeya Cotra on worldview diversification and how big the future could be
You wake up in a mysterious box, and hear the booming voice of God: “I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it. If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box. To get into heaven, you have to answer this correctly: Which way did the coin land?” You think briefly, and decide you should bet your eternal soul on tails. The fact that you woke up at all s...
2021-01-21
2h 59
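The intro above stops just short of the arithmetic behind betting on tails, so here it is worked out under one candidate rule, the Self-Indication Assumption, which weights each hypothesis by how many observers it creates. Treating SIA as the right rule is an assumption of this sketch; whether it is is precisely what the interview goes on to probe.

```python
# The coin-flip-and-boxes puzzle from the episode intro, under the
# Self-Indication Assumption (SIA): weight each hypothesis by the number of
# observers it creates. Using SIA at all is the assumption being illustrated.

prior_heads = 0.5
prior_tails = 0.5

observers_heads = 10              # ten boxes, one human in each
observers_tails = 10_000_000_000  # ten billion boxes, one human in each

# SIA: P(hypothesis | I exist) is proportional to prior * number of observers.
unnormalized_heads = prior_heads * observers_heads
unnormalized_tails = prior_tails * observers_tails

p_tails = unnormalized_tails / (unnormalized_heads + unnormalized_tails)
print(f"P(tails | I woke up in a box) ~ {p_tails:.9f}")  # ~0.999999999
```

On this rule the mere fact of waking up at all is a billion-to-one update toward the world with more boxes, which is the intuition the quoted intro is gesturing at.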
EA Talks
EAG 2017 London: Implementing cause prioritization at OpenPhil (Ajeya Cotra)
I go through a number of tangles that have come up in trying to translate cause prioritisation theory into practice at OpenPhil, some proposed patches, and remaining open questions. (Credit for most of these ideas goes to other people; I make attributions whenever I can.) Source: Effective Altruism Global (video). Effective Altruism is a social movement dedicated to finding ways to do the most good possible, whether through charitable donations, career choices, or volunteer projects. EA Global conferences are gatherings for EAs to meet. You can also listen to this talk along with its accompanying video...
2018-04-23
30 min