Showing episodes and shows of LessWrong

Shows
LessWrong (30+ Karma)
“Analyzing A Critique Of The AI 2027 Timeline Forecasts” by Zvi
There was what everyone agrees was a high quality critique of the timelines component of AI 2027, by the LessWrong user and Substack writer Titotal. It is great to have thoughtful critiques like this. The way you get actual thoughtful critiques like this, of course, is to post the wrong answer (at length) on the internet, and then respond by listening to the feedback and by making your model less wrong. This is a high-effort, highly detailed, real engagement on this section, including giving the original authors opportunity to critique the critique, and warnings to beware...
2025-06-24
54 min
LessWrong (30+ Karma)
“LessWrong Feed [new, now in beta]” by Ruby
The modern internet is replete with feeds such as Twitter, Facebook, Insta, TikTok, Substack, etc. They're bad in ways but also good in ways. I've been exploring the idea that LessWrong could have a very good feed. I'm posting this announcement with disjunctive hopes: (a) to find enthusiastic early adopters who will refine this into a great product, or (b) to find people who'll lead us to an understanding that we shouldn't launch this, or should launch it only if designed in a very specific way. You can check it out right now: www.lesswrong.com/feed
2025-05-28
15 min
LessWrong (30+ Karma)
“LessWrong Community Weekend - Applications are open” by jt
We are open for applications: https://airtable.com/appkj2FkJDGMtM2MA/pagiUldderZqbuBaP/form >>> All info on the main event page
2025-05-14
04 min
LessWrong (Curated & Popular)
“PSA: The LessWrong Feedback Service” by JustisMills
At the bottom of the LessWrong post editor, if you have at least 100 global karma, you may have noticed this button. Many people click the button, and are jumpscared when it starts an Intercom chat with a professional editor (me), asking what sort of feedback they'd like. So, that's what it does. It's a summon Justis button. Why summon Justis? To get feedback on your post, of just about any sort. Typo fixes, grammar checks, sanity checks, clarity checks, fit for LessWrong, the works. If you use the LessWrong editor...
2025-05-13
04 min
LessWrong (30+ Karma)
“PSA: The LessWrong Feedback Service” by JustisMills
At the bottom of the LessWrong post editor, if you have at least 100 global karma, you may have noticed this button. Many people click the button, and are jumpscared when it starts an Intercom chat with a professional editor (me), asking what sort of feedback they'd like. So, that's what it does. It's a summon Justis button. Why summon Justis? To get feedback on your post, of just about any sort. Typo fixes, grammar checks, sanity checks, clarity checks, fit for LessWrong, the works. If you use the LessWrong...
2025-05-12
04 min
LessWrong (30+ Karma)
[Linkpost] “How people use LLMs” by Elizabeth
This is a link post. I've gotten a lot of value out of the details of how other people use LLMs, so I'm delighted that Gavin Leech created a collection of exactly such posts (link should go to the right section of the page but if you don't see it, scroll down). https://kajsotala.fi/2025/01/things-i-have-been-using-llms-for/ https://nicholas.carlini.com/writing/2024/how-i-use-ai.html https://www.lesswrong.com/posts/CYYBW8QCMK722GDpz/how-much-i-m-paying-for-ai-productivity-software-and-the https://www.avitalbalwit.com/post/how-i-use-claude https://andymasley.substack.com/p/how-i-use-ai https://benjamincongdon.me/blog/2025/02/02/How-I-Use-AI-Early-2025/ https://www.jefftk.com/p/examples-of-how-i-use-llms https://simonwillison.net...
2025-04-28
01 min
LessWrong (Curated & Popular)
“LessWrong has been acquired by EA” by habryka
Dear LessWrong community, It is with a sense of... considerable cognitive dissonance that I announce a significant development regarding the future trajectory of LessWrong. After extensive internal deliberation, modeling of potential futures, projections of financial runways, and what I can only describe as a series of profoundly unexpected coordination challenges, the Lightcone Infrastructure team has agreed in principle to the acquisition of LessWrong by EA. I assure you, nothing about how LessWrong operates on a day-to-day level will change. I have always cared deeply about the robustness and integrity of our institutions, and I...
2025-04-01
01 min
LessWrong (Curated & Popular)
“Policy for LLM Writing on LessWrong” by jimrandomh
LessWrong has been receiving an increasing number of posts and comments that look like they might be LLM-written or partially-LLM-written, so we're adopting a policy. This could be changed based on feedback. Humans Using AI as Writing or Research Assistants: Prompting a language model to write an essay and copy-pasting the result will not typically meet LessWrong's standards. Please do not submit unedited or lightly-edited LLM content. You can use AI as a writing or research assistant when writing content for LessWrong, but you must have added significant value beyond what the AI produced, the result...
2025-03-25
04 min
LessWrong (Curated & Popular)
“Arbital has been imported to LessWrong” by RobertM, jimrandomh, Ben Pace, Ruby
Arbital was envisioned as a successor to Wikipedia. The project was discontinued in 2017, but not before many new features had been built and a substantial amount of writing about AI alignment and mathematics had been published on the website. If you've tried using Arbital.com in the last few years, you might have noticed that it was on its last legs - no ability to register new accounts or log in to existing ones, slow load times (when it loaded at all), etc. Rather than try to keep it afloat, the LessWrong team worked with MIRI to migrate the...
2025-02-20
08 min
LessWrong (Curated & Popular)
“LessWrong audio: help us choose the new voice” by PeterH
We make AI narrations of LessWrong posts available via our audio player and podcast feeds. We’re thinking about changing our narrator's voice. There are three new voices on the shortlist. They’re all similarly good in terms of comprehension, emphasis, error rate, etc. They just sound different—like people do. We think they all sound similarly agreeable. But, thousands of listening hours are at stake, so we thought it’d be worth giving listeners an opportunity to vote—just in case there's a strong collective preference. Listen and vote
2024-12-12
01 min
LessWrong (Curated & Popular)
“(The) Lightcone is nothing without its people: LW + Lighthaven’s first big fundraiser” by habryka
TLDR: LessWrong + Lighthaven need about $3M for the next 12 months. Donate here, or send me an email, DM or signal message (+1 510 944 3235) if you want to support what we do. Donations are tax-deductible in the US. Reach out for other countries, we can likely figure something out. We have big plans for the next year, and due to a shifting funding landscape we need support from a broader community more than in any previous year. I've been running LessWrong/Lightcone Infrastructure for the last 7 years. During that time we have grown into the primary infrastructure provider for the rationality...
2024-11-30
1h 03
LessWrong (Curated & Popular)
“Reliable Sources: The Story of David Gerard” by TracingWoodgrains
This is a linkpost for https://www.tracingwoodgrains.com/p/reliable-sources-how-wikipedia-admin, posted in full here given its relevance to this community. Gerard has been one of the longest-standing malicious critics of the rationalist and EA communities and has done remarkable amounts of work to shape their public images behind the scenes. Note: I am closer to this story than to many of my others. As always, I write aiming to provide a thorough and honest picture, but this should be read as the view of a close onlooker who has known about much within this story...
2024-07-11
1h 22
LessWrong (Curated & Popular)
[HUMAN VOICE] "How could I have thought that faster?" by mesaoptimizer
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated This is a linkpost for https://twitter.com/ESYudkowsky/status/144546114693741363 I stumbled upon a Twitter thread where Eliezer describes what seems to be his cognitive algorithm that is equivalent to Tune Your Cognitive Strategies, and have decided to archive / repost it here. Source: https://www.lesswrong.com/posts/rYq6joCrZ8m62m7ej/how-could-i-have-thought-that-faster Narrated for LessWrong by Perrin Walker. Share feedback on this narration.
2024-04-12
03 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "My PhD thesis: Algorithmic Bayesian Epistemology" by Eric Neyman
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated In January, I defended my PhD thesis, which I called Algorithmic Bayesian Epistemology. From the preface: For me as for most students, college was a time of exploration. I took many classes, read many academic and non-academic works, and tried my hand at a few research projects. Early in graduate school, I noticed a strong commonality among the questions that I had found particularly fascinating: most of them involved reasoning about knowledge, information, or uncertainty under constraints. I decided that this...
2024-04-12
13 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Toward a Broader Conception of Adverse Selection" by Ricki Heicklen
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated This is a linkpost for https://bayesshammai.substack.com/p/conditional-on-getting-to-trade-your “I refuse to join any club that would have me as a member” - Marx[1] Adverse Selection is the phenomenon in which information asymmetries in non-cooperative environments make trading dangerous. It has traditionally been understood to describe financial markets in which buyers and sellers systematically differ, such as a market for used cars in which sellers have the information advantage, where resulting feedback loops can lead to market coll...
2024-04-12
21 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Social status part 1/2: negotiations over object-level preferences" by Steven Byrnes
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/SPBm67otKq5ET5CWP/social-status-part-1-2-negotiations-over-object-level Narrated for LessWrong by Perrin Walker. Share feedback on this narration.
2024-04-05
50 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Using axis lines for good or evil" by dynomight
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/Yay8SbQiwErRyDKGb/using-axis-lines-for-good-or-evil Narrated for LessWrong by Perrin Walker. Share feedback on this narration.
2024-04-05
12 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Scale Was All We Needed, At First" by Gabriel Mukobi
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/xLDwCemt5qvchzgHd/scale-was-all-we-needed-at-first Narrated for LessWrong by Perrin Walker. Share feedback on this narration.
2024-04-05
15 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Acting Wholesomely" by OwenCB
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/Cb7oajdrA5DsHCqKd/acting-wholesomely Narrated for LessWrong by Perrin Walker. Share feedback on this narration.
2024-04-05
27 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "My Clients, The Liars" by ymeskhout
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/h99tRkpQGxwtb9Dpv/my-clients-the-liars Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-03-21
13 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Deep atheism and AI risk" by Joe Carlsmith
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/sJPbmm8Gd34vGYrKd/deep-atheism-and-ai-risk Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓
2024-03-21
46 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Speaking to Congressional staffers about AI risk" by Akash, hath
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/2sLwt2cSAag74nsdN/speaking-to-congressional-staffers-about-ai-risk Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-03-10
24 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "CFAR Takeaways: Andrew Critch" by Raemon
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/Jash4Gbi2wpThzZ4k/cfar-takeaways-andrew-critch Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-03-10
09 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "And All the Shoggoths Merely Players" by Zack_M_Davis
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/8yCXeafJo67tYe5L4/and-all-the-shoggoths-merely-players Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-20
21 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Updatelessness doesn't solve most problems" by Martín Soto
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/g8HHKaWENEbqh2mgK/updatelessness-doesn-t-solve-most-problems-1 Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-20
25 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Believing In" by Anna Salamon
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/duvzdffTzL3dWJcxn/believing-in-1 Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-14
25 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Attitudes about Applied Rationality" by Camille Berger
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/5jdqtpT6StjKDKacw/attitudes-about-applied-rationality Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓
2024-02-14
07 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/PhTBDHu9PKJFmvb4p/a-shutdown-problem-proposal Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-09
12 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI" by Jeremy Gillen, peterbarnett
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/GfZfDHZHCuYwrHGCd/without-fundamental-advances-misalignment-and-catastrophe Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-03
1h 41
LessWrong (Curated & Popular)
[HUMAN VOICE] "The case for ensuring that powerful AIs are controlled" by ryan_greenblatt, Buck
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/kcKrE9mzEHrdqtDpE/the-case-for-ensuring-that-powerful-ais-are-controlled Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-02-02
1h 04
LessWrong (Curated & Popular)
[HUMAN VOICE] "There is way too much serendipity" by Malmesbury
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Crossposted from substack. As we all know, sugar is sweet and so are the $30B in yearly revenue from the artificial sweetener industry. Four billion years of evolution endowed our brains with a simple, straightforward mechanism to make sure we occasionally get an energy refuel so we can continue the foraging a little longer, and of course we are completely ignoring the instructions and spend billions on fake fuel that doesn’t actually grant any energy. A classic ca...
2024-01-22
12 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "How useful is mechanistic interpretability?" by ryan_greenblatt, Neel Nanda, Buck, habryka
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/tEPHGZAb63dfq2v8n/how-useful-is-mechanistic-interpretability Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-01-21
41 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training" by evhub et al
This is a linkpost for https://arxiv.org/abs/2401.05566 Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Source: https://www.lesswrong.com/posts/ZAsJv7xijKTfZkMtr/sleeper-agents-training-deceptive-llms-that-persist-through Narrated for LessWrong by Perrin Walker. Share feedback on this narration. [Curated Post] ✓ [125+ Karma Post] ✓
2024-01-21
08 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Meaning & Agency" by Abram Demski
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated The goal of this post is to clarify a few concepts relating to AI Alignment under a common framework. The main concepts to be clarified: Optimization. Specifically, this will be a type of Vingean agency. It will split into Selection vs Control variants. Reference (the relationship which holds between map and territory; aka semantics, aka meaning). Specifically, this will be a teleosemantic theory. The main new concepts employed will be endorsement and legitimacy. TLDR: Endorsement of a pr...
2024-01-07
30 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "A case for AI alignment being difficult" by jessicata
This is a linkpost for https://unstableontology.com/2023/12/31/a-case-for-ai-alignment-being-difficult/ Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated This is an attempt to distill a model of AGI alignment that I have gained primarily from thinkers such as Eliezer Yudkowsky (and to a lesser extent Paul Christiano), but explained in my own terms rather than attempting to hew close to these thinkers. I think I would be pretty good at passing an ideological Turing test for Eliezer Yudkowsky on AGI alignment difficulty (but not AGI timelines), though what I'm...
2024-01-02
28 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible" by Gene Smith and Kman
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated TL;DR version: In the course of my life, there have been a handful of times I discovered an idea that changed the way I thought about the world. The first occurred when I picked up Nick Bostrom’s book “Superintelligence” and realized that AI would utterly transform the world. The second was when I learned about embryo selection and how it could change future generations. And the third happened a few months ago when I read a message from a frie...
2023-12-17
1h 01
LessWrong (Curated & Popular)
[HUMAN VOICE] "Moral Reality Check (a short story)" by jessicata
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated This is a linkpost for https://unstableontology.com/2023/11/26/moral-reality-check/ Janet sat at her corporate ExxenAI computer, viewing some training performance statistics. ExxenAI was a major player in the generative AI space, with multimodal language, image, audio, and video AIs. They had scaled up operations over the past few years, mostly serving B2B, but with some B2C subscriptions. ExxenAI's newest AI system, SimplexAI-3, was based on GPT-5 and Gemini-2. ExxenAI had hired away some software engineers from Google and...
2023-12-15
39 min
LessWrong (Curated & Popular)
2023 Unofficial LessWrong Census/Survey
The Less Wrong General Census is unofficially here! You can take it at this link. It's that time again. If you are reading this post and identify as a LessWronger, then you are the target audience. I'd appreciate it if you took the survey. If you post, if you comment, if you lurk, if you don't actually read the site that much but you do read a bunch of the other rationalist blogs or you're really into HPMOR, if you hung out on rationalist tumblr back in the day, or if none of those exactly fit...
2023-12-14
02 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "What are the results of more parental supervision and less outdoor play?" by Julia Wise
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated Crossposted from Otherwise. Parents supervise their children way more than they used to. Children spend less of their time in unstructured play than they did in past generations. Parental supervision is way up. The wild thing is that this is true even while the number of children per family has decreased and the amount of time mothers work outside the home has increased. Source: https://www.lesswrong.com/posts...
2023-12-13
12 min
LessWrong (Curated & Popular)
[HUMAN VOICE] "Shallow review of live agendas in alignment & safety" by technicalities & Stag
Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated You can’t optimise an allocation of resources if you don’t know what the current one is. Existing maps of alignment research are mostly too old to guide you and the field has nearly no ratchet, no common knowledge of what everyone is doing and why, what is abandoned and why, what is renamed, what relates to what, what is going on. This post is mostly just a big index: a link-dump for as many currently active AI safety agend...
2023-12-04
1h 02
LessWrong (Curated & Popular)
[HUMAN VOICE] "Social Dark Matter" by Duncan Sabien
The author's Substack: https://substack.com/@homosabiens Support ongoing human narrations of LessWrong's curated posts: www.patreon.com/LWCurated You know it must be out there, but you mostly never see it. Author's Note 1: In something like 75% of possible futures, this will be the last essay that I publish on LessWrong. Future content will be available on my substack, where I'm hoping people will be willing to chip in a little commensurate with the value of the writing, and (after a delay) on my personal site (not y...
2023-11-28
1h 05
LessWrong (Curated & Popular)
Social Dark Matter
You know it must be out there, but you mostly never see it. Author's Note 1: I'm something like 75% confident that this will be the last essay that I publish on LessWrong. Future content will be available on my substack, where I'm hoping people will be willing to chip in a little commensurate with the value of the writing, and (after a delay) on my personal site. I decided to post this final essay here rather than silently switching over because many LessWrong readers would otherwise never find out that they could still get new Duncan content elsewhere.
2023-11-17
53 min
LessWrong (Curated & Popular)
"RSPs are pauses done right" by evhub
COI: I am a research scientist at Anthropic, where I work on model organisms of misalignment; I was also involved in the drafting process for Anthropic’s RSP. Prior to joining Anthropic, I was a Research Fellow at MIRI for three years. Thanks to Kate Woolverton, Carson Denison, and Nicholas Schiefer for useful feedback on this post. Recently, there’s been a lot of discussion and advocacy around AI pauses—which, to be clear, I think is great: pause advocacy pushes in the right direction and works to build a good base of public support for x...
2023-10-15
12 min
LessWrong (Curated & Popular)
"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba
In 2023, MIRI has shifted focus in the direction of broad public communication—see, for example, our recent TED talk, our piece in TIME magazine “Pausing AI Developments Isn’t Enough. We Need to Shut it All Down”, and our appearances on various podcasts. While we’re continuing to support various technical research programs at MIRI, this is no longer our top priority, at least for the foreseeable future. Coinciding with this shift in focus, there have also been many organizational changes at MIRI over the last several months, and we are somewhat overdue to announce them in public. Th...
2023-10-15
06 min
LessWrong (Curated & Popular)
"Comparing Anthropic's Dictionary Learning to Ours" by Robert_AIZI
Readers may have noticed many similarities between Anthropic's recent publication Towards Monosemanticity: Decomposing Language Models With Dictionary Learning (LW post) and my team's recent publication Sparse Autoencoders Find Highly Interpretable Directions in Language Models (LW post). Here I want to compare our techniques and highlight what we did similarly or differently. My hope in writing this is to help readers understand the similarities and differences, and perhaps to lay the groundwork for a future synthesis approach. First, let me note that we arrived at similar techniques in similar ways: both Anthropic and my team follow the lead o...
2023-10-15
08 min
LessWrong (Curated & Popular)
"Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
Neural networks are trained on data, not programmed to follow rules. We understand the math of the trained network exactly – each neuron in a neural network performs simple arithmetic – but we don't understand why those mathematical operations result in the behaviors we see. This makes it hard to diagnose failure modes, hard to know how to fix them, and hard to certify that a model is truly safe. Luckily for those of us trying to understand artificial neural networks, we can simultaneously record the activation of every neuron in the network, intervene by silencing or stimulating them, and...
2023-10-10
04 min
LessWrong (Curated & Popular)
"Announcing Dialogues" by Ben Pace
As of today, everyone is able to create a new type of content on LessWrong: Dialogues. In contrast with posts, which are for monologues, and comment sections, which are spaces for everyone to talk to everyone, a dialogue is a space for a few invited people to speak with each other. I'm personally very excited about this as a way for people to produce lots of in-depth explanations of their world-models in public. I think dialogues enable this in a way that feels easier — instead of writing an explanation for anyone who reads, you...
2023-10-10
07 min
LessWrong (Curated & Popular)
"Evaluating the historical value misspecification argument" by Matthew Barnett
ETA: I'm not saying that MIRI thought AIs wouldn't understand human values. If there's only one thing you take away from this post, please don't take away that. Recently, many people have talked about whether some of the main MIRI people (Eliezer Yudkowsky, Nate Soares, and Rob Bensinger[1]) should update on whether value alignment is easier than they thought given that GPT-4 seems to follow human directions and act within moral constraints pretty well (here are two specific examples of people talking about this: 1, 2). Because these conversations are often hard to follow without much context, I'll just...
2023-10-10
11 min
LessWrong (Curated & Popular)
"Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn" by Zvi
Response to: Evolution Provides No Evidence For the Sharp Left Turn, due to it winning first prize in The Open Philanthropy Worldviews contest. Quintin’s post is an argument about a key historical reference class and what it tells us about AI. Instead of arguing that the reference makes his point, he is instead arguing that it doesn’t make anyone’s point - that we understand the reasons for humanity’s sudden growth in capabilities. He says this jump was caused by gaining access to cultural transmission which allowed partial preservation of in-lifetime learning across generations, which was...
2023-10-10
16 min
LessWrong (Curated & Popular)
"Thomas Kwa's MIRI research experience" by Thomas Kwa and others
Moderator note: the following is a dialogue using LessWrong’s new dialogue feature. The exchange is not completed: new replies might be added continuously, the way a comment thread might work. If you’d also be excited about finding an interlocutor to debate, dialogue, or getting interviewed by: fill in this dialogue matchmaking form. Hi Thomas, I'm quite curious to hear about your research experience working with MIRI. To get us started: When were you at MIRI? Who did you work with? And what problem were you working on? Source: https://www.lessw...
2023-10-06
52 min
LessWrong (Curated & Popular)
"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its reference category, and unjustified in that it is far from meeting its own standards. We’ve already seen dire consequences of the inability to detect bad actors who deflect investigation into potential problems, but by its nature you can never be sure you’ve found all the damage done by epistemic obfuscation because the point is to be self-cloaking. My concern here is for the underlying dynamics of EA’s weak epistemic immune system...
2023-10-03
41 min
LessWrong (Curated & Popular)
"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a predefined set of unrelated follow-up questions after a suspected lie, and feeding the LLM's yes/no answers into a logistic regression classifier. Despite its simplicity, this lie detector is highly accurate and surprisingly general. When trained on examples...
2023-10-03
07 min
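The detector described in the entry above is concrete enough to sketch: ask a fixed set of unrelated yes/no follow-up questions after a suspected lie, then feed the answers to a logistic regression classifier. Below is a minimal illustrative sketch of that setup; the follow-up questions, feature values, and labels are placeholder assumptions, not the paper's data or code.

```python
# Minimal sketch of the black-box lie-detector setup described above, assuming we
# already have the model's yes/no answers to a fixed set of unrelated follow-up
# questions. All values here are placeholders for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: yes(1)/no(0) answers to K follow-up questions asked right after a
# suspected statement; label 1 means the statement was in fact a lie.
X = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
])
y = np.array([1, 0, 1, 0, 1, 0])

detector = LogisticRegression().fit(X, y)

# Score a new suspected statement from its follow-up answers.
new_answers = np.array([[1, 0, 1, 1, 1]])
print("P(lie) =", detector.predict_proba(new_answers)[0, 1])
```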
LessWrong (Curated & Popular)
"The Lighthaven Campus is open for bookings" by Habryka
Lightcone Infrastructure (the organization that grew from and houses the LessWrong team) has just finished renovating a 7-building physical campus that we hope to use to make the future of humanity go better than it would otherwise. We're hereby announcing that it is generally available for bookings. We offer preferential pricing for projects we think are good for the world, but to cover operating costs, we're renting out space to a wide variety of people/projects. Source: https://www.lesswrong.com/posts/memqyjNCpeDrveayx/the-lighthaven-campus-is-open-for-bookings Narrated for LessWrong by TYPE III...
2023-10-03
05 min
LessWrong (Curated & Popular)
"'Diamondoid bacteria' nanobots: deadly threat or dead-end? A nanotech investigation" by titotal
A lot of people are highly concerned that a malevolent AI or insane human will, in the near future, set out to destroy humanity. If such an entity wanted to be absolutely sure they would succeed, what method would they use? Nuclear war? Pandemics? According to some in the x-risk community, the answer is this: The AI will invent molecular nanotechnology, and then kill us all with diamondoid bacteria nanobots. Source: https://www.lesswrong.com/posts/bc8Ssx5ys6zqu3eq9/diamondoid-bacteria-nanobots-deadly-threat-or-dead-end-a Narrated for LessWrong by TYPE III AUDIO.
2023-10-03
37 min
LessWrong (Curated & Popular)
"The King and the Golem" by Richard Ngo
This is a linkpost for https://narrativeark.substack.com/p/the-king-and-the-golem Long ago there was a mighty king who had everything in the world that he wanted, except trust. Who could he trust, when anyone around him might scheme for his throne? So he resolved to study the nature of trust, that he might figure out how to gain it. He asked his subjects to bring him the most trustworthy thing in the kingdom, promising great riches if they succeeded. Soon, the first of them arrived at his palace to try. A teacher brought her...
2023-09-29
08 min
LessWrong (Curated & Popular)
"Sparse Autoencoders Find Highly Interpretable Directions in Language Models" by Logan Riggs et al
This is a linkpost for Sparse Autoencoders Find Highly Interpretable Directions in Language Models. We use a scalable and unsupervised method called Sparse Autoencoders to find interpretable, monosemantic features in real LLMs (Pythia-70M/410M) for both residual stream and MLPs. We showcase monosemantic features, feature replacement for Indirect Object Identification (IOI), and use OpenAI's automatic interpretation protocol to demonstrate a significant improvement in interpretability. Source: https://www.lesswrong.com/posts/Qryk6FqjtZk9FHHJR/sparse-autoencoders-find-highly-interpretable-directions-in Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration.
2023-09-27
10 min
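For readers unfamiliar with the technique named in the entry above, here is a minimal sparse-autoencoder sketch: a single hidden layer with an L1 sparsity penalty on its feature activations, trained to reconstruct model activations. The layer sizes, penalty weight, and the random stand-in activations are assumptions for illustration, not the authors' configuration.

```python
# Minimal sparse-autoencoder sketch (PyTorch). Layer sizes, the L1 coefficient, and
# the random stand-in for residual-stream activations are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features

sae = SparseAutoencoder(d_model=512, d_hidden=4096)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3

activations = torch.randn(1024, 512)  # stand-in for a batch of real LLM activations
optimizer.zero_grad()
reconstruction, features = sae(activations)
# Reconstruction error plus an L1 penalty that pushes most features to zero.
loss = ((reconstruction - activations) ** 2).mean() + l1_coeff * features.abs().mean()
loss.backward()
optimizer.step()
print("reconstruction + sparsity loss:", loss.item())
```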
LessWrong (Curated & Popular)
"Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth
Epistemic status: model which I find sometimes useful, and which emphasizes some true things about many parts of the world which common alternative models overlook. Probably not correct in full generality. Consider Yoshua Bengio, one of the people who won a Turing Award for deep learning research. Looking at his work, he clearly “knows what he’s doing”. He doesn’t know what the answers will be in advance, but he has some models of what the key questions are, what the key barriers are, and at least some hand-wavy pseudo-models of how things work. For inst...
2023-09-27
08 min
LessWrong (Curated & Popular)
"There should be more AI safety orgs" by Marius Hobbhahn
I’m writing this in my own capacity. The views expressed are my own, and should not be taken to represent the views of Apollo Research or any other program I’m involved with. TL;DR: I argue why I think there should be more AI safety orgs. I’ll also provide some suggestions on how that could be achieved. The core argument is that there is a lot of unused talent and I don’t think existing orgs scale fast enough to absorb it. Thus, more orgs are needed. This post can also serve as a call...
2023-09-25
29 min
LessWrong (Curated & Popular)
"The Talk: a brief explanation of sexual dimorphism" by Malmesbury
Cross-posted from substack. "Everything in the world is about sex, except sex. Sex is about clonal interference." – Oscar Wilde (kind of) As we all know, sexual reproduction is not about reproduction. Reproduction is easy. If your goal is to fill the world with copies of your genes, all you need is a good DNA-polymerase to duplicate your genome, and then to divide into two copies of yourself. Asexual reproduction is just better in every way. It's pretty clear that, on a direct one-v-one cage match, an asexual organism would have muc...
2023-09-22
30 min
LessWrong (Curated & Popular)
"A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX" by jacobjacob
Patrick Collison has a fantastic list of examples of people quickly accomplishing ambitious things together since the 19th Century. It does make you yearn for a time that feels... different, when the lethargic behemoths of government departments could move at the speed of a racing startup: [...] last century, [the Department of Defense] innovated at a speed that puts modern Silicon Valley startups to shame: the Pentagon was built in only 16 months (1941–1943), the Manhattan Project ran for just over 3 years (1942–1946), and the Apollo Program put a man on the moon in under a decade (1961–1969). In the 1950s alone, the United...
2023-09-20
45 min
LessWrong (Curated & Popular)
"AI presidents discuss AI alignment agendas" by TurnTrout & Garrett Baker
This is a linkpost for https://www.youtube.com/watch?v=02kbWY5mahQ None of the presidents fully represent my (TurnTrout's) views. TurnTrout wrote the script. Garrett Baker helped produce the video after the audio was complete. Thanks to David Udell, Ulisse Mini, Noemi Chulo, and especially Rio Popper for feedback and assistance in writing the script. Source: https://www.lesswrong.com/posts/7M2iHPLaNzPNXHuMv/ai-presidents-discuss-ai-alignment-agendas YouTube video kindly provided by the authors. Other text narrated for LessWrong by TYPE III AUDIO. Share feedback on this...
2023-09-19
23 min
LessWrong (Curated & Popular)
"UDT shows that decision theory is more puzzling than ever" by Wei Dai
I feel like MIRI perhaps mispositioned FDT (their variant of UDT) as a clear advancement in decision theory, whereas maybe they could have attracted more attention/interest from academic philosophy if the framing was instead that the UDT line of thinking shows that decision theory is just more deeply puzzling than anyone had previously realized. Instead of one major open problem (Newcomb's, or EDT vs CDT) now we have a whole bunch more. I'm really not sure at this point whether UDT is even on the right track, but it does seem clear that there are some thorny issues...
2023-09-18
02 min
LessWrong (Curated & Popular)
"Sum-threshold attacks" by TsviBT
How do you affect something far away, a lot, without anyone noticing? (Note: you can safely skip sections. It is also safe to skip the essay entirely, or to read the whole thing backwards if you like.) Source: https://www.lesswrong.com/posts/R3eDrDoX8LisKgGZe/sum-threshold-attacks Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓
2023-09-11
19 min
LessWrong (Curated & Popular)
"A list of core AI safety problems and how I hope to solve them" by Davidad
Context: I sometimes find myself referring back to this tweet and wanted to give it a more permanent home. While I'm at it, I thought I would try to give a concise summary of how each distinct problem would be solved by an Open Agency Architecture (OAA), if OAA turns out to be feasible. Source: https://www.lesswrong.com/posts/D97xnoRr6BHzo5HvQ/one-minute-every-moment Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓
2023-09-09
12 min
LessWrong (Curated & Popular)
"Report on Frontier Model Training" by Yafah Edelman
This is a linkpost for https://docs.google.com/document/d/1TsYkDYtV6BKiCN9PAOirRAy3TrNDu2XncUZ5UZfaAKA/edit?usp=sharing Understanding what drives the rising capabilities of AI is important for those who work to forecast, regulate, or ensure the safety of AI. Regulations on the export of powerful GPUs need to be informed by understanding of how these GPUs are used, forecasts need to be informed by bottlenecks, and safety needs to be informed by an understanding of how the models of the future might be trained. A clearer understanding would enable policy makers to target...
2023-09-09
35 min
LessWrong (Curated & Popular)
"Defunding My Mistake" by ymeskhout
Until about five years ago, I unironically parroted the slogan All Cops Are Bastards (ACAB) and earnestly advocated to abolish the police and prison system. I had faint inklings I might be wrong about this a long time ago, but it took a while to come to terms with its disavowal. What follows is intended to be not just a detailed account of what I used to believe but most pertinently, why. Despite being super egotistical, for whatever reason I do not experience an aversion to openly admitting mistakes I’ve made, and I find it very difficult to un...
2023-09-08
11 min
LessWrong (Curated & Popular)
"What I would do if I wasn’t at ARC Evals" by LawrenceC
In which: I list 9 projects that I would work on if I wasn’t busy working on safety standards at ARC Evals, and explain why they might be good to work on. Epistemic status: I’m prioritizing getting this out fast as opposed to writing it carefully. I’ve thought for at least a few hours and talked to a few people I trust about each of the following projects, but I haven’t done that much digging into each of these, and it’s likely that I’m wrong about many material facts. I also...
2023-09-08
25 min
LessWrong (Curated & Popular)
"The U.S. is becoming less stable" by lc
We focus so much on arguing over who is at fault in this country that I think sometimes we fail to alert on what's actually happening. I would just like to point out, without attempting to assign blame, that American political institutions appear to be losing common knowledge of their legitimacy, and abandoning certain important traditions of cooperative governance. It would be slightly hyperbolic, but not unreasonable to me, to term what has happened "democratic backsliding". Source: https://www.lesswrong.com/posts/r2vaM2MDvdiDSWicu/the-u-s-is-becoming-less-stable# Narrated for LessWrong by TYPE III...
2023-09-05
03 min
LessWrong (Curated & Popular)
"Meta Questions about Metaphilosophy" by Wei Dai
To quickly recap my main intellectual journey so far (omitting a lengthy side trip into cryptography and Cypherpunk land), with the approximate age that I became interested in each topic in parentheses: Source: https://www.lesswrong.com/posts/fJqP9WcnHXBRBeiBg/meta-questions-about-metaphilosophy Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓
2023-09-05
05 min
LessWrong (Curated & Popular)
"OpenAI API base models are not sycophantic, at any size" by Nostalgebraist
In "Discovering Language Model Behaviors with Model-Written Evaluations" (Perez et al 2022), the authors studied language model "sycophancy" - the tendency to agree with a user's stated view when asked a question. The paper contained the striking plot reproduced below, which shows sycophancy increasing dramatically with model size, while being largely independent of RLHF steps, and even showing up at 0 RLHF steps, i.e. in base models! [...] I found this result startling when I read the original paper, as it seemed like a bizarre failure of calibration. How would the base LM know that this "Assistant" c...
2023-09-04
04 min
LessWrong (Curated & Popular)
"Dear Self; we need to talk about ambition" by Elizabeth
I keep seeing advice on ambition, aimed at people in college or early in their career, that would have been really bad for me at similar ages. Rather than contribute (more) to the list of people giving poorly universalized advice on ambition, I have written a letter to the one person I know my advice is right for: myself in the past. Source: https://www.lesswrong.com/posts/uGDtroD26aLvHSoK2/dear-self-we-need-to-talk-about-ambition-1 Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓ [Cu...
2023-08-30
13 min
LessWrong (Curated & Popular)
"Assume Bad Faith" by Zack_M_Davis
I've been trying to avoid the terms "good faith" and "bad faith". I'm suspicious that most people who have picked up the phrase "bad faith" from hearing it used, don't actually know what it means—and maybe, that the thing it does mean doesn't carve reality at the joints. People get very touchy about bad faith accusations: they think that you should assume good faith, but that if you've determined someone is in bad faith, you shouldn't even be talking to them, that you need to exile them. What does "bad faith" mean, though? It do...
2023-08-28
12 min
LessWrong (Curated & Popular)
"Book Launch: "The Carving of Reality," Best of LessWrong vol. III" by Raemon
The Carving of Reality, third volume of the Best of LessWrong books, is now available on Amazon (US). The Carving of Reality includes 43 essays from 29 authors. We've collected the essays into four books, each exploring two related topics. The "two intertwining themes" concept was first inspired as I looked over the cluster of "coordination" themed posts, noting a recurring motif of not only "solving coordination problems" but also "dealing with the binding constraints that were causing those coordination problems." Source: https://www.lesswrong.com/posts/Rck5CvmYkzWYxsF4D/book-launch-the-carving-of-reality-best-of-lesswrong-vol-iii
2023-08-28
05 min
LessWrong (Curated & Popular)
"Large Language Models will be Great for Censorship" by Ethan Edwards
LLMs can do many incredible things. They can generate unique creative content, carry on long conversations in any number of subjects, complete complex cognitive tasks, and write nearly any argument. More mundanely, they are now the state of the art for boring classification tasks and therefore have the capability to radically upgrade the censorship capacities of authoritarian regimes throughout the world. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort. Thanks to ev_ and Kei for suggestions on this post. Source: https://www.lesswrong.com/posts/oqvsR2...
2023-08-23
15 min
LessWrong (Curated & Popular)
"Ten Thousand Years of Solitude" by agp
This is a linkpost for the article "Ten Thousand Years of Solitude", written by Jared Diamond for Discover Magazine in 1993, four years before he published Guns, Germs and Steel. That book focused on Diamond's theory that the geography of Eurasia, particularly its large size and common climate, allowed civilizations there to dominate the rest of the world because it was easy to share plants, animals, technologies and ideas. This article, however, examines the opposite extreme. Diamond looks at the intense isolation of the tribes on Tasmania - an island the size of Ireland. After waters rose, Tasmania...
2023-08-22
07 min
LessWrong (Curated & Popular)
"6 non-obvious mental health issues specific to AI safety" by Igor Ivanov
Intro: I am a psychotherapist, and I help people working on AI safety. I noticed patterns of mental health issues highly specific to this group. It's not just doomerism; there are way more of them that are less obvious. If you struggle with a mental health issue related to AI safety, feel free to leave a comment about it and about things that help you with it. You might also support others in the comments. Sometimes such support makes a lot of difference and people feel like they are not alone. All the e...
2023-08-22
06 min
LessWrong (Curated & Popular)
"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël
I gave a talk about the different risk models, followed by an interpretability presentation, then I got a problematic question, "I don't understand, what's the point of doing this?" Hum. Feature viz? (left image) Um, it's pretty but is this useful?[1] Is this reliable? GradCam (a pixel attribution technique, like on the above right figure), it's pretty. But I’ve never seen anybody use it in industry.[2] Pixel attribution seems useful, but accuracy remains the king.[3] Induction heads? Ok, we are maybe on track to retro engineer the mechanism of regex in LLMs. Cool. The considerations in the...
2023-08-21
1h 18
LessWrong (Curated & Popular)
"Inflection.ai is a major AGI lab" by Nikola
Inflection.ai (co-founded by DeepMind co-founder Mustafa Suleyman) should be perceived as a frontier LLM lab of similar magnitude as Meta, OpenAI, DeepMind, and Anthropic based on their compute, valuation, current model capabilities, and plans to train frontier models. Compared to the other labs, Inflection seems to put less effort into AI safety. Thanks to Laker Newhouse for discussion and feedback. Source: https://www.lesswrong.com/posts/Wc5BYFfzuLzepQjCq/inflection-ai-is-a-major-agi-lab Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+...
2023-08-15
06 min
LessWrong (Curated & Popular)
"Feedbackloop-first Rationality" by Raemon
I've been workshopping a new rationality training paradigm. (By "rationality training paradigm", I mean an approach to learning/teaching the skill of "noticing what cognitive strategies are useful, and getting better at them.") I think the paradigm has promise. I've beta-tested it for a couple weeks. It’s too early to tell if it actually works, but one of my primary goals is to figure out if it works relatively quickly, and give up if it isn’t delivering. The goal of this post is to: Convey the framework. See if people find it compe...
2023-08-15
15 min
LessWrong (Curated & Popular)
"When can we trust model evaluations?" bu evhub
In "Towards understanding-based safety evaluations," I discussed why I think evaluating specifically the alignment of models is likely to require mechanistic, understanding-based evaluations rather than solely behavioral evaluations. However, I also mentioned in a footnote why I thought behavioral evaluations would likely be fine in the case of evaluating capabilities rather than evaluating alignment:However, while I like the sorts of behavioral evaluations discussed in the GPT-4 System Card (e.g. ARC's autonomous replication evaluation) as a way of assessing model capabilities, I have a pretty fundamental concern with these sorts of techniques as a mechanism for...
2023-08-09
17 min
LessWrong (Curated & Popular)
"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez
TL;DR: This document lays out the case for research on “model organisms of misalignment” – in vitro demonstrations of the kinds of failures that might pose existential threats – as a new and important pillar of alignment research. If you’re interested in working on this agenda with us at Anthropic, we’re hiring! Please apply to the research scientist or research engineer position on the Anthropic website and mention that you’re interested in working on model organisms of misalignment. Source: https://www.lesswrong.com/posts/ChDH335ckdvpxXaXX/model-organisms-of-misalignment-the-case-for-a-new-pillar-of-1
2023-08-09
35 min
LessWrong (Curated & Popular)
"ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks" by Beth Barnes
Blogpost version. Paper. We have just released our first public report. It introduces methodology for assessing the capacity of LLM agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild. Background: ARC Evals develops methods for evaluating the safety of large language models (LLMs) in order to provide early warnings of models with dangerous capabilities. We have public partnerships with Anthropic and OpenAI to evaluate their AI systems, and are exploring other partnerships as well. Source:...
2023-08-04
08 min
LessWrong (Curated & Popular)
"The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate" by Adam David Long
Summary of Argument: The public debate among AI experts is confusing because there are, to a first approximation, three sides, not two sides to the debate. I refer to this as a 🔺three-sided framework, and I argue that using this three-sided framework will help clarify the debate (more precisely, debates) for the general public and for policy-makers. Source: https://www.lesswrong.com/posts/BTcEzXYoDrWzkLLrQ/the-public-debate-about-ai-is-confusing-for-the-general Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓
2023-08-04
07 min
LessWrong (Curated & Popular)
"My current LK99 questions" by Eliezer Yudkowsky
So this morning I thought to myself, "Okay, now I will actually try to study the LK99 question, instead of betting based on nontechnical priors and market sentiment reckoning." (My initial entry into the affray, having been driven by people online presenting as confidently YES when the prediction markets were not confidently YES.) And then I thought to myself, "This LK99 issue seems complicated enough that it'd be worth doing an actual Bayesian calculation on it"--a rare thought; I don't think I've done an actual explicit numerical Bayesian update in at least a year. In the pr...
2023-08-04
09 min
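For readers who have not seen one, an "explicit numerical Bayesian update" of the kind mentioned in the entry above can be sketched in a few lines using the odds form of Bayes' rule; the prior and likelihood ratios below are made-up placeholders, not the post's numbers.

```python
# Toy explicit numerical Bayesian update in odds form: posterior odds equal prior
# odds times the product of likelihood ratios. All numbers are placeholders.
prior_odds = 0.2 / 0.8                # prior P(YES) = 20%
likelihood_ratios = [3.0, 0.5, 1.5]   # P(evidence | YES) / P(evidence | NO) per observation

posterior_odds = prior_odds
for lr in likelihood_ratios:
    posterior_odds *= lr

posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"Posterior P(YES) = {posterior_prob:.2f}")
```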
LessWrong (Curated & Popular)
"Thoughts on sharing information about language model capabilities" by paulfchristiano
I believe that sharing information about the capabilities and limits of existing ML systems, and especially language model agents, significantly reduces risks from powerful AI—despite the fact that such information may increase the amount or quality of investment in ML generally (or in LM agents in particular). Concretely, I mean to include information like: tasks and evaluation frameworks for LM agents, the results of evaluations of particular agents, discussions of the qualitative strengths and weaknesses of agents, and information about agent design that may represent small improvements over the state of the art (insofar as that in...
2023-08-02
19 min
LessWrong (Curated & Popular)
"Yes, It's Subjective, But Why All The Crabs?" by johnswentworth
Some early biologist, equipped with knowledge of evolution but not much else, might see all these crabs and expect a common ancestral lineage. That’s the obvious explanation of the similarity, after all: if the crabs descended from a common ancestor, then of course we’d expect them to be pretty similar. … but then our hypothetical biologist might start to notice surprisingly deep differences between all these crabs. The smoking gun, of course, would come with genetic sequencing: if the crabs’ physiological similarity is achieved by totally different genetic means, or if functionally-irrelevant mutations differ across crab-species by more...
2023-07-31
11 min
LessWrong (Curated & Popular)
"Self-driving car bets" by paulfchristiano
This month I lost a bunch of bets. Back in early 2016 I bet at even odds that self-driving ride sharing would be available in 10 US cities by July 2023. Then I made similar bets a dozen times because everyone disagreed with me. Source: https://www.lesswrong.com/posts/ZRrYsZ626KSEgHv8s/self-driving-car-bets Narrated for LessWrong by TYPE III AUDIO. Share feedback on this narration. [125+ Karma Post] ✓ [Curated Post] ✓
2023-07-31
08 min
LessWrong (Curated & Popular)
"Cultivating a state of mind where new ideas are born" by Henrik Karlsson
In the early 2010s, a popular idea was to provide coworking spaces and shared living to people who were building startups. That way the founders would have a thriving social scene of peers to percolate ideas with as they figured out how to build and scale a venture. This was attempted thousands of times by different startup incubators. There are no famous success stories. In 2015, Sam Altman, who was at the time the president of Y Combinator, a startup accelerator that has helped scale startups collectively worth $600 billion, tweeted in reaction that “not [providing coworking spaces] is pa...
2023-07-31
24 min
LessWrong (Curated & Popular)
"Rationality !== Winning" by Raemon
I think "Rationality is winning" is a bit of a trap. (The original phrase is notably "rationality is systematized winning", which is better, but it tends to slide into the abbreviated form, and both forms aren't that great IMO)It was coined to counteract one set of failure modes - there were people who were straw vulcans, who thought rituals-of-logic were important without noticing when they were getting in the way of their real goals. And, also, there outside critics who'd complain about straw-vulcan-ish actions, and treat that as a knockdown argument against "rationality."
2023-07-28
14 min
LessWrong (Curated & Popular)
"Brain Efficiency Cannell Prize Contest Award Ceremony" by Alexander Gietelink Oldenziel
Previously Jacob Cannell wrote the post "Brain Efficiency", which makes several radical claims: that the brain is at the pareto frontier of speed, energy efficiency and memory bandwidth, and that this represents a fundamental physical frontier. Here's an AI-generated summary: The article “Brain Efficiency: Much More than You Wanted to Know” on LessWrong discusses the efficiency of physical learning machines. The article explains that there are several interconnected key measures of efficiency for physical learning machines: energy efficiency in ops/J, spatial efficiency in ops/mm^2 or ops/mm^3, speed efficiency in time/delay for key learned task...
2023-07-28
13 min
LessWrong (Curated & Popular)
"Grant applications and grand narratives" by Elizabeth
The Lightspeed application asks: “What impact will [your project] have on the world? What is your project’s goal, how will you know if you’ve achieved it, and what is the path to impact?” LTFF uses an identical question, and SFF puts it even more strongly (“What is your organization’s plan for improving humanity’s long term prospects for survival and flourishing?”). I’ve applied to all three of these grants at various points, and I’ve never liked this question. It feels like it wants a grand narrative of an amazing, systemic project that will meas...
2023-07-28
11 min
LessWrong (Curated & Popular)
"Cryonics and Regret" by MvB
This post is not about arguments in favor of or against cryonics. I would just like to share a particular emotional response of mine as the topic became hot for me after not thinking about it at all for nearly a decade. Recently, I have signed up for cryonics, as has my wife, and we have made arrangements for our son to be cryopreserved just in case longevity research does not deliver in time or some unfortunate thing happens. Last year, my father died. He was a wonderful man, good-natured, intelligent, funny, caring and, most...
2023-07-28
03 min
LessWrong (Curated & Popular)
"Introduction to abstract entropy" by Alex Altair
https://www.lesswrong.com/posts/REA49tL5jsh69X3aM/introduction-to-abstract-entropy#fnrefpi8b39u5hd7 This post, and much of the following sequence, was greatly aided by feedback from the following people (among others): Lawrence Chan, Joanna Morningstar, John Wentworth, Samira Nedungadi, Aysja Johnson, Cody Wild, Jeremy Gillen, Ryan Kidd, Justis Mills and Jonathan Mustin. Illustrations by Anne Ore. Introduction & motivation: In the course of researching optimization, I decided that I had to really understand what entropy is.[1] But there are a lot of other reasons why the concept is worth studying:...
2022-10-30
45 min
LessWrong (Curated & Popular)
"Two-year update on my personal AI timelines" by Ajeya Cotra
https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines#fnref-fwwPpQFdWM6hJqwuY-12 Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. I worked on my draft report on biological anchors for forecasting AI timelines mainly between ~May 2019 (three months after the release of GPT-2) and ~Jul 2020 (a month after the release of GPT-3), and posted it on LessWrong in Sep 2020 after an internal review process. At the time, my bottom line estimates from the bio anchors modeling exercise were:[1] Roughly ~15% probability of transformative AI by 2036[2] (16 years from posting the report; 14 years...
2022-09-22
39 min
LessWrong (Curated & Popular)
"Language models seem to be much better than humans at next-token prediction" by Buck, Fabien and LawrenceC
https://www.lesswrong.com/posts/htrZrxduciZ5QaCjw/language-models-seem-to-be-much-better-than-humans-at-next Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. [Thanks to a variety of people for comments and assistance (especially Paul Christiano, Nostalgebraist, and Rafe Kennedy), and to various people for playing the game. Buck wrote the top-1 prediction web app; Fabien wrote the code for the perplexity experiment and did most of the analysis and wrote up the math here, Lawrence did the research on previous measurements. Epistemic status: we're pretty confident of our work here, but haven't engaged in a super t...
2022-09-15
27 min
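A toy version of the top-1 prediction comparison described in the entry above might look like the sketch below: the model's argmax next-token guess is scored against the actual next token, and a human playing the same game is scored the same way. The model choice (gpt2) and the scoring loop are illustrative assumptions, not the authors' web app or perplexity code.

```python
# Rough sketch of the top-1 next-token game: score the model's argmax guess against
# the actual next token at every position of a text. Model and text are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."
ids = tokenizer(text, return_tensors="pt").input_ids[0]

correct = 0
with torch.no_grad():
    for i in range(1, len(ids)):
        logits = model(ids[:i].unsqueeze(0)).logits[0, -1]
        correct += int(logits.argmax().item() == ids[i].item())

print(f"Model top-1 accuracy on this text: {correct / (len(ids) - 1):.2%}")
# A human plays the same game by guessing the next token at each position
# and is scored identically.
```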
LessWrong (Curated & Popular)
"Unifying Bargaining Notions (1/2)" by Diffractor
https://www.lesswrong.com/posts/rYDas2DDGGDRc8gGB/unifying-bargaining-notions-1-2 Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is a two-part sequence of posts, in the ancient LessWrong tradition of decision-theory-posting. This first part will introduce various concepts of bargaining solutions and dividing gains from trade, which the reader may or may not already be familiar with. The upcoming part will be about how all introduced concepts from this post are secretly just different facets of the same underlying notion, as originally discovered by John Harsanyi back...
2022-09-09
46 min
LessWrong (Curated & Popular)
"«Boundaries», Part 1: a key missing concept from utility theory" by Andrew Critch
https://www.lesswrong.com/posts/8oMF8Lv5jiGaQSFvo/boundaries-part-1-a-key-missing-concept-from-utility-theory Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is Part 1 of my «Boundaries» Sequence on LessWrong. Summary: «Boundaries» are a missing concept from the axioms of game theory and bargaining theory, which might help pin down certain features of multi-agent rationality (this post), and have broader implications for effective altruism discourse and x-risk (future posts). 1. Boundaries (of living systems) Epistemic status: me describing what I mean. With the exception of some relatively recent and isola...
2022-07-28
18 min
LessWrong (Curated & Popular)
"It’s Probably Not Lithium" by Natália Coelho Mendonça
https://www.lesswrong.com/posts/7iAABhWpcGeP5e6SB/it-s-probably-not-lithium A Chemical Hunger (a), a series by the authors of the blog Slime Mold Time Mold (SMTM) that has been received positively on LessWrong, argues that the obesity epidemic is entirely caused (a) by environmental contaminants. The authors’ top suspect is lithium (a)[1], primarily because it is known to cause weight gain at the doses used to treat bipolar disorder. After doing some research, however, I found that it is not plausible that lithium plays a major role in the obesity epidemic, and that a lot of th...
2022-07-05
1h 11