The Economics of Book Slop

Description

In this episode, Seth and Andrey break down “AI and the Quantity and Quality of Creative Products: Have LLMs Boosted Creation of Valuable Books?” by Imke Reimers and Joel Waldfogel, presented at the NBER Digital Economics and AI conference. Imke and Joel are a great team of digitization researchers, with particular expertise in Amazon book sales data. The paper uses Amazon data to ask whether AI has increased the number of books being published and whether those books are better or worse.

A hypothesis of the article is that heavily AI-assisted books may have low average quality, but are so easy to produce that you get lots of ‘shots on goal’ for an outlier good book. A few valuable books are added alongside masses of slop. But if you assume free disposal of slop, you would accept this as a positive exchange.

Does their data change our views on this topic? We’ll read to find out, and along the way bring in Borges’ Library of Babel, the economics of free disposal, preferential attachment models, and the digitization-of-music literature.

Priors

Hypothesis 1: Has AI increased the number of books released from 2022 to 2025?

* Andrey’s View:

* Prior: Yes, by about 50%. The fall in the cost of writing a book has been so great that the number must have gone up. Analogous to how students are producing far more written work with AI assistance.

* Key caveat: The definition of “book” matters enormously — from a major publisher release to a random PDF online. The looser the definition, the bigger the number.

* Seth’s View:

* Prior: Yes, by about 3x. To the extent that slop gets dumped on the market and is allowed in, a dramatic increase is inevitable. Though he acknowledges it’s still an empirical question — AI also lowered the cost of everything else, including Substack.

Hypothesis 2: Has AI increased the average quality of books released?

* Andrey’s View:

* Prior: Average quality goes down. ~1% chance it goes up. The slop influx is substantial. Imagine a science fiction author with one semi-popular book who now milks it into a series of increasingly sloppy sequels — that author exists and AI just gave them a turbo boost.

* Seth’s View:

* Prior: Average quality goes down. ~10% chance it goes up. He raises the “free disposal” argument — authors who would have written anyway only use AI if it makes the book better, which is a force pushing quality up. But the slop influx probably wins. He remains unwilling to put the probability at zero: “Maybe we’re making some real gems here.”

Hypothesis 3 (The Thinker): By 2030, will total social surplus from book reading by humans be higher or lower because of AI?

* Andrey’s View:

* Prior: 25% chance it goes up. People are reading fewer books over time regardless of AI. Nonfiction manuals and textbooks have a clear substitute in ChatGPT. The form factor of the book seems to be on a secular decline, and new AI-generated books won’t be so good as to reverse that trend.

* Seth’s View:

* Prior: 75% chance it goes up. LLMs may be complements to reading rather than substitutes — he cites using an LLM to track character names while reading Dostoevsky’s Demons as a present-day example. Good books are a complement to everything else in the economy. If AI makes context and curated knowledge more valuable, books have a real role in the 5-to-10-year time horizon. “I don’t care if my job gets automated because I’ll just move to the woods and read books” — Tyler Cowen, representative of no one but Seth.

Links + Shownotes

* AI and the Quantity and Quality of Creative Products: Have LLMs Boosted Creation of Valuable Books? – The central paper of the episode by Imke Reimers and Joel Waldfogel (NBER, 2025).

* Can an AI Interview You Better Than a Human? – Recent Justified Posteriors episode referenced during the discussion.

* BookStat – The independent data provider the authors use to calibrate ratings-to-sales conversions for Amazon books.

Scholars Mentioned

* Imke Reimers – Co-author of the paper; Associate Professor of Economics at Cornell University.

* Joel Waldfogel – Co-author of the paper; Frederick R. Kappel Chair in Applied Economics at the University of Minnesota Carlson School of Management. Previously co-authored the digitization-and-music paper referenced in the episode.

* Tyler Cowen – Economist quoted on the idea of moving to the woods to read books once automation arrives, and on the question of whether you really want to read the 100th automatically generated biography about an imaginary person. Everyone on the internet is saying how they love him this week, so we’ll join in — we love this guy, and have had the honor and exhilaration of being personally encouraged by him.

* Jorge Luis Borges – Author of The Library of Babel, invoked by Seth to frame the question of what a “book” even is — and whether every possible book has, in some sense, already been written.

* Nicholas Decker — Economist as Reporter – A Substack post about economists being more like journalists in the modern era, cited approvingly in the posteriors section.

* Frank Herbert – Author of the Dune series; his son’s continuations offered up (by Seth) as exhibit A in the case for sequelitis-as-slop.

* Brandon Sanderson – Fantasy author; Andrey volunteers his later-series books as a possible example of quality decline, before declining to name specific titles.

Connections

* The Library of Babel – Borges’ short story imagining a library containing every possible 410-page book that can be written in a 25-symbol alphabet. Seth invokes it to ask: if AI can generate any text, what does “a new book” even mean?

* The Barnes Foundation – Seth closes with a defense of collage-as-art, citing Albert Barnes’ idiosyncratic collection of Impressionists, Post-Impressionists, and rusty keys as a model for the authorial value in curation and juxtaposition — even if you didn’t write every word.

Discord Community Link: https://discord.gg/KCJwgkTj

Justified Posteriors Podcast Transcript

“AI and the Quantity and Quality of Creative Products: Have LLMs Boosted Creation of Valuable Books?”

Hosts: Seth Benzell & Andrey Fradkin

SETH: Welcome to the Justified Posteriors Podcast, the podcast that updates its beliefs about the economics of AI and technology. I’m Seth Benzell, racing against the machine for authorial glory before AI transcends all human writers. Coming to you from Chapman University in sunny Southern California.

ANDREY: And I’m Andrey Fradkin, looking forward to SLOP detection technologies across all my media surfaces, coming to you from San Francisco, California.

SETH: Andrey, how’s it going, man? It’s been a while since we’ve done a paper episode.

ANDREY: I know, I know. It’s great to actually get back to our core of reading and analyzing a paper. And it’s a particularly fun day to be thinking big exuberant thoughts about the quality of society improving because it’s Mardi Gras. We’re recording this on Fat Tuesday. I’ve got my James Carville shirt on, I’ve got my Mardi Gras beads. Are you doing anything special for Mardi Gras this year?

SETH: You know, Mardi Gras is not my religious holiday, but I am flying to Austin for a fun adventure there. But for me, my sort of Mardi Gras actually happened last week, which was the NBER Digital Economics and AI conference.

ANDREY: What a transition. So what parades and what crews were present at that conference?

SETH: Well, we had the structural crew, we had the reduced form crew. We had the economists and then the business school professors.

ANDREY: No macroeconomists. My macro paper was —

SETH: No, no, no. There was one macro paper, one macro paper allowed.

ANDREY: We allow one. Amazing. Any sort of themes jump out at you from the conference?

SETH: Yeah. I think half the papers were AI papers, which I think is more than we’ve had in the past. Digital economics really started as a group thinking about the internet and the spread of the internet. And AI has until this point not been the dominant theme in the group, but it obviously is becoming so. And of course, there was a lot of discussion about what the future of research will look like given how easy it is to produce slop — and also maybe non-slop — with AI.

ANDREY: So speaking of producing slop, today we’re going to be discussing a paper that was presented at that conference. Would you maybe tell us the title and the authors?

SETH: Sure. The title is “AI and the Quantity and Quality of Creative Products: Have LLMs Boosted Creation of Valuable Books?” It’s by our friends Imke Reimers and Joel Waldfogel.

ANDREY: Oh, great guys. Hopefully we can get Imke on the show sometime, or Joel. So — production of slop. A lot of people I know who write have a lot of anxiety around AI coming after their turf. I remember when I was in undergrad there was this idea of the logical cold computer that can never do creative writing, and maybe you should specialize in skills that are complements to that, like long-form writing. And now it seems like increasingly we can use AI for everything. I’m not telling this audience anything it doesn’t know. But this article is actually trying to use some data to get at the question: is AI helping us write more books? Is it helping us write better books? And it’s going to look across fiction and nonfiction.

SETH: Yeah. So why don’t we get to our priors, Andrey?

Laying Out Our Priors

ANDREY: Sure — what are your priors on this subject?

SETH: So it’s a straightforward paper, which is why I really like it, but it gives us some deep things to think about around this question of AI making better writing easier, but also making slop easier. The first prior I’d like to ask you about: do we think that AI increased the number of books released from 2022 to 2025?

ANDREY: Yes. I mean, yeah.

SETH: But think of all the things you could do instead of writing books now.

ANDREY: I think the fall in the cost of writing a book has been so great that surely numbers have increased. One analogy is that our students are able to write a lot of essays with substantially less effort.

SETH: Yeah, the amount of words submitted by my students has increased dramatically. I’m with you on this, Andrey. I would be really surprised if the number of books written goes down as a result of AI. I do maintain it’s still an empirical question in principle, because AI also decreased the cost of doing other things — so maybe people substitute into essay writing or Substack instead. But yeah, end of the day, 99% sure the number of books written goes up.

ANDREY: Yeah. And I guess there’s a more subtle question here, which is by how much, and I’m substantially less sure of that.

SETH: What’s your intuition? Give me a point estimate. You feel like 2x?

ANDREY: I think before I read this paper, if I had to introspect, I would think it would be more like up by 50% or something like that. Nothing huge. So that would be my prior.

SETH: My prior would be a lot bigger. To the extent that you think what’s going to happen is a lot of slop getting dumped on the market — conditional on that slop being allowed in — you’ve got to anticipate a big increase. So I’m going to guess like 3x going in.

ANDREY: Well, yeah. And I think this is kind of where the definition of what a book is really starts to matter. Is it that a major publishing house published the book? Is it that there’s a PDF on a random website? The looser the definition, the bigger the numbers surely are.

SETH: I mean, in one sense — are you familiar with Borges’ Library of Babel, Andrey?

ANDREY: Are you trying to insult me or is this a joke?

SETH: Of course you are familiar. And what that story imagines is a library which is very, very large but not infinite: it has every 410-page permutation of letters. So in a certain sense, every possible book has already been written, Andrey. Just take a deck of playing cards and randomly select one letter at a time.

ANDREY: Yeah, yeah.

SETH: All right. But anyway, the definition we’re going to be working with in this paper is: released on Amazon. The Library of Babel is ruled out.

ANDREY: Yes, yes.
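As a rough check on “very, very large but not infinite,” here is a minimal sketch that counts the books in Borges’ library. It uses the dimensions given in the story itself (410 pages per book, 40 lines per page, about 80 characters per line, 25 distinct symbols); the variable names are illustrative.

```python
import math

# Dimensions as given in Borges' story (80 characters per line is approximate there).
SYMBOLS = 25
PAGES, LINES_PER_PAGE, CHARS_PER_LINE = 410, 40, 80

chars_per_book = PAGES * LINES_PER_PAGE * CHARS_PER_LINE  # characters in one book

# The number of distinct books is SYMBOLS ** chars_per_book. That integer is far
# too long to print, so count its decimal digits with logarithms instead.
digits = math.floor(chars_per_book * math.log10(SYMBOLS)) + 1

print(f"{chars_per_book:,} characters per book")
print(f"the number of distinct books has {digits:,} decimal digits")
```

The count is finite, but it has well over a million digits, which is why “released on Amazon” is the more workable definition of a book.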

SETH: Okay, second prior, Andrey. Conditional on this definition — needing to be released on Amazon as at least an ebook — would you say that AI will increase the average quality of books released, or decrease it? What’s your percentage chance that average quality goes up?

ANDREY: Yeah, the average will go down. For sure the average has got to go down, at least with the current AI technologies.

SETH: What about free disposal, Andrey?

ANDREY: What do you mean free disposal? The average book made is a different question from the average one that’s read.

SETH: What I’m trying to say by free disposal is that the books that would have been written anyway have free disposal of the technology. They only use it if it makes the book better. So that should be a force that boosts the average quality of books. Of course there’s going to be a slop influx, but there are at least two offsetting effects here.

ANDREY: Yeah, I agree the average could in theory go up, but I think the slop increase is substantial. One way to think about it — imagine you’re a science fiction author and you’ve written one semi-popular book. You can now milk that as part of a series. And unfortunately, we’ve all experienced this. The next books become sloppier and sloppier. And I wouldn’t be surprised if authors lean into the slop so they don’t have to write as much for their subsequent books.

SETH: Right. You’re imagining there’s some quality threshold you have to reach just to have the self-respect to post it online, and that AI can help you clear that bar. But then conditional on clearing it, you don’t invest more in quality — you just release this giant lump of books at minimum quality.

ANDREY: Yeah. And that was already true before AI. Some people were already doing that.

SETH: Do you have any authors in mind that you want to throw some shade at?

ANDREY: No, no, no.

SETH: He’s too nice. I’ve got a couple in mind. The Frank Herbert sons — the additional Dune sequels — I’ve been told are slop. I’ve read pages of them and been warned away from the rest. So that would be an example of selling out a brand name in terms of books.

ANDREY: Yeah. I think some of the Brandon Sanderson later-series books are not that great.

SETH: Is that Wheel of Time, or is that — there’s a magic sword. There’s always a magic sword.

ANDREY: There’s always a magic sword.

SETH: Okay, so anyway — our prediction is that the amount of mediocre magic swords will increase and outweigh the increase in quality of good magic swords. What about Dungeon Crawler Carl?

ANDREY: Definitely fell off in the later books.

SETH: Oh man, I didn’t realize you were an isekai fan.

ANDREY: Is it eye-suh-kai?

SETH: Isekai: “other world” books. Maybe LitRPG is the more Western term. All right, home audience: you’ve been warned. Don’t read Dungeon Crawler Carl past Book 2.

ANDREY: Once it gets to Book 3 or 4, that’s when it really falls off.

SETH: Book 2 is fine.

ANDREY: Book 2 is fine.

SETH: Okay. I came in thinking the increase in slop books would be even larger — like 3x — which should bring down my prediction about average quality. At least some of the data we’ll look at speaks to this at the book level. And I want to be a little optimistic. I want to say there’s like a 10% chance that average quality goes up. Maybe we’re making some real gems here. I don’t want to put it at 0%.

ANDREY: Never put it at 0.

SETH: Never. No dogmatic priors.

ANDREY: Closer to 1%.

SETH: 1%. All right. But to be clear, this paper makes claims about books by rank, books by percentile, and average over everything. So we’re going to talk about all of that. Now I’m going to give you a thinker, because those two priors were too easy. Let’s zoom out. Do you think that by 2030, the total social surplus from book reading by humans will be higher or lower because of AI? I specify “by humans” because AIs will obviously benefit a lot from reading books.

ANDREY: Yeah, the general trend, as I understand it, is that people are reading fewer books over time and doing other things more.

SETH: Certainly physical print book lines are getting shut down.

ANDREY: Yeah. There might be a different trend for romance novels. But generally, my base-rate prediction is that people are reading less over time and there’s no way the new books are going to be so good that they overcome that trend. So the social surplus from reading books goes down. Another reason it goes down: a lot of the surplus from nonfiction manuals and textbooks now has a pretty clear substitute in ChatGPT knowing everything. So yeah, I would say it will go down on average.

SETH: Give me a percentage on it going up.

ANDREY: 25%.

SETH: 25%. Andrey, I have almost the opposite intuition. On the demand side, I definitely agree that a big hit to the usefulness of books is people talking to LLMs instead of reading — clearly for technical manuals, that’s a giant advantage of LLMs. But by 2030, there’s unlikely to be a giant effect of people having more free time due to automation. There’s at least an angle where LLMs unlock our ability to spend more time on deep work and deep learning. Tyler Cowen talks about this — he says he doesn’t care if his job gets automated because he’ll just move to the woods and read books. I empathize with that.

ANDREY: Absolutely not representative.

SETH: Another idea is that LLMs will be complements to reading, not substitutes. Right now someone has told me that Dostoevsky’s Demons explains the thinking of Silicon Valley thought leaders, and I’m one-third of the way in. At this point it seems to have no connection at all. But keeping track of all these Russian diminutives and surnames is much easier with an LLM to give you updated character lists for each chapter. LLM as complement.

ANDREY: Have you heard of SparkNotes?

SETH: SparkNotes can’t say “give me no spoilers past chapter 3, page 2.” Okay — supply side: it’s going to be much easier to write books as well as shorter-form content. But again, with free disposal, it makes it easier to gather data and ideas for good books. And good books are in some deep sense a complement to everything else in the economy. As long as they’re not perfect substitutes for everything else, total welfare from books can still go up. In the long run, I think the social surplus from all kinds of media is going to go up. When I think about reading a book, you’re not just reading a list of facts — it’s a collection of what was meaningful for the writer. So if AI makes context and curated knowledge more valuable, I see a real role for books in the 5-to-10-year time horizon. I’ll say 75% chance that social value from books goes up by 2030 because of AI.

ANDREY: To be clear, you said 2030, which is at the low end of your 5-to-10-year range. I really do believe the form factor of the book is on a secular decline. And I don’t want to make a general claim about all written content — that’s too strong. But the book itself — it’s hard for me to see how that makes a comeback, especially given that other forms of media are going to become more and more compelling relative to books.

SETH: Well, good points. Let’s read this paper and see if any of the information therein moves your thinking.

ANDREY: Can I have a prior about whether any of the information in it moves my prior?

SETH: Sure. What’s your meta-prior?

ANDREY: My meta-prior? Specifically on that last point? It’s damn near close to zero.

The Evidence

SETH: All right, let’s go to the evidence. This paper starts off with some interesting background. First, they cite a survey showing that 45% of authors — including a large subsample of published physical-book authors — reported using AI in 2025. 48% reported not using AI, with the vast majority of those saying they found it actively unethical. So there’s a real holdout group. Do you think this is just sour grapes, or is it collective action?

ANDREY: I think some people have taken an ideological position. I don’t think it’s all sour grapes. For an artistic or creative endeavor, it’s a very valid choice not to use AI. Though I do think some of this is driven by mistaken beliefs about what AI is and isn’t capable of.

SETH: Okay. Speaking of what AI is and isn’t capable of: BookAutoAI.com, a source of tools for people to help write books with AI, suggests that AI is best for genre fiction such as romance, sci-fi, mystery, and horror; can help structure nonfiction but requires editing for expertise and tone; and has low suitability for literary fiction, satire, poetry, and academic or personal writing. I was a little surprised by this list. I feel like GPT-3 was pretty decent at poetry.

ANDREY: I think people who know poetry would beg to differ on GPT-3’s abilities.

SETH: I have a New Orleans story about this. For our listeners who’ve ever made it to Frenchmen Street in New Orleans — on a party night, you’ll find young men sitting on the street with typewriters who will write you a poem for a donation. Right after GPT-3 was released, I found myself down there on a Friday night and paid for a poem. I then gave GPT-3 the same topic. And I think the GPT-3 poem was better.

ANDREY: Yeah, I do think poetry is a genre of maxes, not averages, if that makes sense.

SETH: Fair enough. All great writing is. But anyway — interesting to see what’s on that list and what’s not. We’d expect literary fiction to see the least AI effect since it has the highest bar to clear. And spoiler alert: we’re going to see some of these themes show up when we look at where the actual growth in book publishing was — because they did write a lot more books.

ANDREY: The paper has a little bit of light theory. They want to think about ex ante book quality as drawing from a normal distribution. The normal distribution assumption is useful because you only have to worry about average and variance. If LLMs lower the cost such that we’re increasing the number of books made but decreasing average quality, what you might get is that book quality at a specific rank may increase even as book quality by percentile decreases. To make it concrete: we write 10 times as many books and the average quality is lower, but the very best book might be better because we’re getting so many more shots on goal.

SETH: And this very much relates to Joel and Luis Aguiar’s classic paper about music and ex ante predictability. Digitization made it a lot easier to create new music. Even though the average music by new entrants (people who wouldn’t have otherwise been supported by a record label) is worse, what you care about is the max. A lot of people who you wouldn’t have expected to produce great music end up producing hits. That’s one of the big benefits of digitization, and it’s very natural to view this book paper as attempting to make a very similar argument.
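The shots-on-goal logic can be sketched with the normal model described above. This is an illustration under made-up numbers, not the paper's calibration: the 10x increase in books and the half-standard-deviation drop in mean quality are assumptions, and `quality_at_rank` uses a common plotting-position approximation to the expected order statistic.

```python
from statistics import NormalDist

def quality_at_rank(n_books, rank, mean, sd=1.0):
    """Approximate quality of the rank-th best of n_books normal draws.

    Uses the quantile at 1 - (rank - 0.5) / n_books as a rough stand-in
    for the expected order statistic.
    """
    return NormalDist(mean, sd).inv_cdf(1 - (rank - 0.5) / n_books)

def quality_at_percentile(pct, mean, sd=1.0):
    """Quality at a given percentile (0 to 100) of the distribution."""
    return NormalDist(mean, sd).inv_cdf(pct / 100)

# Hypothetical scenario: 10x as many books, mean quality half an sd lower.
before = {"n_books": 1_000, "mean": 0.0}
after = {"n_books": 10_000, "mean": -0.5}

for rank in (1, 10, 100):
    b = quality_at_rank(before["n_books"], rank, before["mean"])
    a = quality_at_rank(after["n_books"], rank, after["mean"])
    print(f"rank {rank:>3}: before {b:.2f}, after {a:.2f}")

b90 = quality_at_percentile(90, before["mean"])
a90 = quality_at_percentile(90, after["mean"])
print(f"90th percentile: before {b90:.2f}, after {a90:.2f}")
```

Under these assumptions quality at a fixed rank rises while quality at a fixed percentile falls. The result depends on the tails being normal, which is the assumption questioned next in the conversation.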

ANDREY: Right. One thing I wanted to run by you: to what extent do you think it’s important that ex ante book quality is actually normally distributed? LLMs might shift the quality distribution in a more complex way than just shifting the average or variance. Intuitively, maybe AI makes it easier to write a good-enough book, but somehow reduces the rate of home runs because it makes books more similar. I’m not sure the normal model is right.

SETH: Yeah. Generally my intuition is that with a lot more entry, if there’s enough variance in the process, some entrants are going to be at the head of the quality distribution. But I agree that in this market, maybe these entrants just don’t have enough variance. They’re never going to reach the truly great books by using AI to write it. That’s my hunch, but I could be wrong.

ANDREY: So your intuition is that ex ante quality of books is heavy-tailed for humans.

SETH: Yes. And maybe it’s not heavy-tailed for AIs. There’s some sense in which softmax is preventing the computer from doing heavy-tailed stuff — it wants to do modal stuff.

ANDREY: And it raises an additional question: why do cultural products become popular in the first place? These are social processes. By preferential attachment arguments, you might get ex ante identical content having very different popularities.

SETH: Right. If we’re in a pure preferential attachment world where all books are truly average quality and we’re just creating more of them, but the amount of potential readers is fixed — then in any case, I think we’re willing to start with the intuition that more shots on goal should give you more superstars, but we both have caveats there.

ANDREY: Well, I wanted to make the point that if the total amount of reading attention is fixed, this shouldn’t really affect how many reads the top book gets. The argument I was making is that something from the new AI-assisted books might become preferentially attached to — not because it’s good, but because of preferential attachment — even if total readership is constant.
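A toy version of that argument, assuming a simple rich-get-richer rule, can be sketched as follows. Every book here is identical in quality and total readership is fixed, so any gap between books comes purely from early luck compounding. The function name, the number of books, and the number of readers are all hypothetical.

```python
import random

def simulate_reads(n_books=20, n_readers=5_001, seed=0):
    """Rich-get-richer allocation of a fixed pool of readers across identical books.

    Every book starts with one read. Each new reader picks a book with
    probability proportional to its current read count.
    """
    rng = random.Random(seed)
    reads = [1] * n_books
    book_ids = list(range(n_books))
    for _ in range(n_readers):
        chosen = rng.choices(book_ids, weights=reads, k=1)[0]
        reads[chosen] += 1
    return reads

reads = simulate_reads()
total = sum(reads)
print(f"top book share:    {max(reads) / total:.1%}")
print(f"bottom book share: {min(reads) / total:.1%}")
print(f"equal-split share: {1 / len(reads):.1%}")
```

Changing `seed` changes which book wins, which is the point: in this setup popularity says nothing about quality.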

SETH: It’s a little hard to think about in the traditional preferential attachment framework, but I share that intuition. Okay — one last idea here, a riff from our Discord. Jonathan Becker writes: “I’m curious about short versus medium-term differences. One mental model — could be wrong — is that books take a long time to go from idea to publication. A story you could tell is that good ideas in the pipeline when LLMs come out get pulled forward by the tech, but the arrival rate of good ideas and good execution on them remains unchanged in the long run. I don’t fully buy the story, but maybe there’s something interesting there.” Andrey, you’re nodding vigorously.

ANDREY: I think it’s totally a possibility. I can totally imagine it. A lot of publication dates for prestige publishers are set in advance, and maybe there are overruns anyway. But yes, it’s certainly possible that some of what we’re seeing is just pulling forward publications rather than net new ones. The authors don’t try to address this point.

SETH: Okay. So now let’s get to what they actually do in the paper. They’re looking at Amazon. Andrey, do you want to lead us through the data?

ANDREY: Yeah — I should disclose that my current employer is Amazon, Incorporated. I do not speak on their behalf. I do not actually know how the Books product works. I’ve never looked at the data, so I have no inside information about it.

SETH: But he has been on Bezos’s yacht.

ANDREY: No, I haven’t. I don’t want this misinformation circulating. Okay. So this data is not super easy to get. They use some scraping techniques to get a count of the number of books available for different categories, with publication dates, by using some filters. They end up with aggregate monthly time series of numbers of new works published across 30 categories. They also have a random sample of books from all categories and months for which they do a bunch of analysis.

SETH: Right. So they get author, date of release, and total and average ratings for 10.3 million randomly selected books between 2020 and 2025. Then they have comprehensive coverage of 480,000 books from 2008 to 2025 across 8 specific categories, as well as some additional information grabbed at each 100-point rank. One limitation: they get total number of ratings and average rating, but not the distribution of ratings, and not number of people actually buying the book. So they’re going to have to estimate that.

ANDREY: It’s very common in papers about Amazon to estimate purchases by making an assumption about the relationship between sales rank and actual purchases. The number of reviews is also used as a proxy for purchases. Of course, this embeds an assumption that the review rate per purchase is constant over time and across works, and you can imagine why that may or may not be a good assumption.

SETH: Yeah. So what they do is buy data from BookStat, which puts together comprehensive data on published physical books as well as ebooks, where they have actual total number of sales. Then from Amazon they’ve got the number of ratings for each of those books. Basically they go from number of ratings to number of sales via a regression model. It’s not amazing, but until Jeff Bezos decides to reveal sales of all products, that’s the best we can do.

ANDREY: Yeah, this is all pretty standard stuff in the literature. I don’t have too many issues with it specifically.
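The ratings-to-sales step can be sketched as a small regression. The discussion only says “a regression model,” so the log-log form below is an assumption, and the calibration numbers are synthetic values invented for illustration, not BookStat data.

```python
import math

def fit_loglog(ratings, sales):
    """OLS fit of log(sales) = a + b * log(ratings); returns (a, b).

    Requires strictly positive ratings and sales, since logs are taken.
    """
    xs = [math.log(r) for r in ratings]
    ys = [math.log(s) for s in sales]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def predict_sales(n_ratings, a, b):
    """Convert a rating count into estimated sales using the fitted line."""
    return math.exp(a + b * math.log(n_ratings))

# Synthetic calibration sample: books where both ratings and sales are known.
# Generated without noise from sales = 50 * ratings ** 0.9, purely for illustration.
calib_ratings = [5, 20, 80, 300, 1200]
calib_sales = [50 * r ** 0.9 for r in calib_ratings]

a, b = fit_loglog(calib_ratings, calib_sales)
print(f"fitted elasticity b = {b:.3f}, scale exp(a) = {math.exp(a):.1f}")
print(f"estimated sales for a book with 100 ratings: {predict_sales(100, a, b):.0f}")
```

A fit like this inherits the caveat raised above: it only transfers to other books if ratings per sale behave the same way outside the calibration sample.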

SETH: Okay. Finally, a small detail — they’re only measuring the number of ratings at one point in time. So they have to normalize everyone by adjusting the number of ratings by days since release, assuming a growth rate in ratings so we’re always comparing apples to apples. Okay. That’s the data collection. Let’s get to the results.
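The age adjustment can be sketched as a projection to a common horizon. The growth curve here is an assumption: cumulative ratings are taken to grow like a power of days since release, and both that functional form and `gamma = 0.5` are placeholders rather than the paper's values.

```python
def ratings_at_horizon(n_ratings, days_since_release, horizon_days=365, gamma=0.5):
    """Project an observed rating count to a common book age.

    Assumes cumulative ratings grow like days ** gamma. The functional form
    and the default gamma are illustrative assumptions only.
    """
    if days_since_release <= 0:
        raise ValueError("days_since_release must be positive")
    return n_ratings * (horizon_days / days_since_release) ** gamma

# Two hypothetical books observed on the same scrape date.
old_book = ratings_at_horizon(400, days_since_release=1460)  # four years old
new_book = ratings_at_horizon(120, days_since_release=91)    # about three months old

print(f"old book projected to 365 days: {old_book:.0f} ratings")
print(f"new book projected to 365 days: {new_book:.0f} ratings")
```

The raw counts favor the older book, but after projecting both to the same age the newer one comes out ahead, which is the apples-to-apples comparison the adjustment is meant to deliver.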

ANDREY: First big result — did people write more books?

SETH: People wrote a lot more books. Figure 3 in the paper is quite striking. About a 3x increase overall by the end of the period.

ANDREY: About a 3x. And it varies a lot by category. A lot more self-help, travel, and sports and outdoors — and not as much new content in education and teaching. Not a lot more parenting. See, this is why society is screwed up.

SETH: Yeah. You have AI that allows you to write more useful stuff, and instead you just write travel books.

ANDREY: Travel, self-help, sports and outdoors. Any surprises? We did say literature would see the least effect. Literature is only 1.3x, so that prediction was kind of correct. For those of you at home thinking about writing a business and economics book — business and money was only 1.6x, so perhaps not completely saturated. Maybe a little surprising that law is only 2x. But romance is 3x. Teen and young adult is 3.5x.

SETH: I’ll just say — some of this increase seems to be happening before 2023. There are existing trends in the industry toward more self-published work. But some of the action, certainly past 2024, is just stratospheric. It’s hard to imagine it’s anything other than AI.

ANDREY: Yeah, the trend is just such an explosion. It kind of has to be AI.

SETH: There’s no other explanation. This isn’t COVID, dude.

ANDREY: Yeah, exactly. This is not interest rates going up. As we know, all authors have a little widget on their computer showing the long-run real interest rate, and when it goes up, they write faster.

SETH: Okay. So that’s the first big result: a dramatic increase in the number of books on Amazon, heterogeneous by category. Next, they think about average quality across all books as measured by ratings, average quality adjusting for percentile, and book quality conditional on rank position. So 100th best book, 200th best book, etc. Pretty striking results here too. What do you see, Andrey?

ANDREY: We see a fall in the average number of book ratings after 2023. And let me ask — how do they calculate their standard errors?

SETH: Good question. And I should clarify — this is number of ratings, not average rating. That’s actually a very important distinction.

ANDREY: Yeah, the standard errors are clustered on category by release month. I’m heartened it’s by category at least, because there could be category-specific preference shocks. Risk-averse — our second favorite word on this podcast after “eigenvalue.”

SETH: Yes, the listeners thought we’d forgotten about clustering our standard errors, but rest assured, we still got it. So the takeaway is: if you’re willing to take number of ratings as a proxy for number of sales, and number of sales as a proxy for quality, it kind of looks like quality is going up by rank position but going down by percentile — which is consistent with the story of more shots on goal, but worse shots on average.

ANDREY: Yeah. For books in the top 2,000, the average number of ratings has gone up. But to me, this is not about quality. I just think there are shocks to overall readership that are correlated with all sorts of things: how Amazon’s algorithm works, societal trends, even the weather in the Northeast. This is just not a good measure of quality. It’s a measure of aggregate demand for a category. And attributing that to AI versus all sorts of other factors that affect aggregate demand — that’s a bridge too far, personally.

SETH: Okay, well let’s go to the next figure, which explicitly compares categories that are seeing a lot of growth in production from AI versus categories that aren’t. Now, you might say the categories with a lot of AI books are so because of a demand shock, and that’s an endogenous response.

ANDREY: That is what I might say.

SETH: You might also say that now we’re measuring something about supply, which would be convenient for the paper. But it does go in the direction the AI story would predict.

ANDREY: Yeah. And there’s no evidence in this paper that any of the books in the top 2,000 have been written by an AI. I want an AI detection algorithm run on these 2,000 books before I’m convinced, because I’m not even sure that AI was actually used here. And I haven’t seen any evidence that any of these top 2,000 books in a category have been produced by someone who’s unlikely to produce at a higher rate than before.

SETH: Fair enough. But the survey did say that 45% of authors use AI — including a third who were published physical-book authors. That’s non-trivial.

ANDREY: But they’re very different from the new entrants we’re talking about when we talk about slop. I can use AI to look up who the King of France was in 1650. That’s not slop. Slop is detectable. So I just don’t know if the ratings boost is very attributable to AI. And they also show — in Figure 7 — that for the top 100 books, there’s actually no treatment effect from high AI-category exposure. No effect at the very, very top.

SETH: Let me put up Figure 7.

ANDREY: Yeah. And I’m kind of like — look, now this becomes quite a bit more ambiguous. If you’re asking “are the top books getting better?”, you could have looked at the top 100 books and found nothing. Which is exactly what you see.

SETH: Right. And you could tell a Pareto story where most of the value is in the top 100 books. I mean, the one thing they really do decisively show is that first figure — Figure 3. This explosion in the number of books has to be AI, and it really is heterogeneous by category. I don’t think this is all demand response.

ANDREY: No, I absolutely don’t think it’s all demand response. But it doesn’t need to be much demand response to create an apparent effect on ratings. And I want to mention one other thing about ratings, since it’s a hobby horse of mine: the technology by which ratings are solicited is constantly changing. The ratings-per-sale ratio is not constant. I’ve looked at tons of datasets for platforms where this thing is moving around, and it doesn’t need to move by a lot to create an apparent change in ratings that doesn’t reflect a real change in sales.
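
A back-of-the-envelope sketch in Python of the ratings-per-sale point. The numbers are made up for illustration (10,000 sales, a ratings-per-sale ratio drifting from 2.0% to 2.5%) and do not come from the paper or from any platform's data.

```python
def observed_ratings(sales: float, ratings_per_sale: float) -> float:
    """Ratings a book accumulates if a fixed share of buyers leave one."""
    return sales * ratings_per_sale


sales = 10_000  # hypothetical, and held constant across both periods
ratings_before = observed_ratings(sales, 0.020)  # assumed ratio before
ratings_after = observed_ratings(sales, 0.025)   # assumed ratio after

# Apparent growth in "demand" if ratings are read as a proxy for sales.
apparent_change = ratings_after / ratings_before - 1
print(f"Sales change: 0%, apparent change in ratings: {apparent_change:.0%}")
```

Here a half-point move in the solicitation rate shows up as roughly a 25% jump in ratings with no change in sales at all.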

SETH: Important point. Your main outcome measure is not directly connected to the thing you care about. Okay. So there’s a little bit of a welfare exercise at the end where they plug this into a model of aggregate demand. It’s got even more assumptions built in, and they admit it’s heroic. Anything you want to say about that before we move into posteriors?

ANDREY: Not particularly. Let’s go posteriors mode.

Justifying Our Posteriors

SETH: Okay. First question: do you think AI is increasing the number of books written? You were at near 100%. Does this move you to 100%?

ANDREY: Yeah, yeah.

SETH: I mean, they have a pretty comprehensive survey of Amazon, and we’ve documented that Amazon books have gone up. I don’t see how you could doubt it at this point. I do want to make a broader point, though. Nicholas Decker recently wrote a Substack about how economists should be more like journalists in the modern era.

ANDREY: I liked that essay.

SETH: And I think this is a great example of that. If you talked to an industry insider, they might have had a sense that the number of books is going up. But it wasn’t a widely known fact. Imke and Joel noticed this phenomenon, put out this really nice dataset and these really nice plots, and now everyone’s aware of it. A great example of economists being journalists. I also want to note a result we didn’t talk about: the increase in book writing comes from both new and returning authors. Returning authors are writing more books, even though a lot of the additional books are from authors who already produce a lot.

ANDREY: Yes, that’s right.

SETH: Okay. Second prior: has AI increased the average quality of books released from 2022 to 2025? We both thought we’d just get a lot more slop that outweighs everything. Where are you after reading this?

ANDREY: I think it’s consistent with what we said. But am I moved very much by it? Not particularly, because the evidence on ratings isn’t convincing to me on quality.

SETH: I think you should update because you thought the number of books would increase only 50%, and instead it’s about 3x. With more slop books, the average quality should fall more.

ANDREY: Sorry — I did move on the number. But on the question of whether average quality fell, I understand your point. With more slop books, average true quality should fall more. So I have to update a bit on that, but I’m not updating very much based on the ratings alone, even though they’re directionally consistent with a fall in quality.

SETH: Yeah. I came into this thinking maybe there was a 10% chance average quality would increase. Whether or not this data fully convinces me, the number of ratings going down for the average book is a data point. And then there’s just the absolute explosion in the number of books, including in categories I think are mid — such as self-help and travel.

ANDREY: How dare you, Seth? This podcast wouldn’t exist without self-help books.

SETH: Oh damn — let me say they’re high variance. Heavy-tailed. Okay, I’m going to go down from 10% chance that average quality went up to 5%. I still won’t go all the way to zero, because this evidence doesn’t speak decisively to quality.

ANDREY: Yeah, fair enough.

SETH: Okay. Final and most intriguing question — I want to spend a minute here. By 2030, will the total social surplus from reading books be higher or lower because of AI? Your prior was 25% chance it goes up, and you said you’d be unmoved. Tell me — did this move you?

ANDREY: I’m unmoved. My main reasoning was a secular trend of declining readership of books. I want to see a reversal in that before I update.

SETH: Well, we are seeing the number of ratings go up. That’s not nothing.

ANDREY: I understand, but this is not how you make that argument. I’d look at time-use surveys, measures of book consumption versus other media. My understanding is that all such measures continue to decline over time.

SETH: Interesting. I was just looking at the American Time Use Survey data. Until recently there wasn’t actually a “reading for pleasure” line — it was all TV. Americans watch 2 hours of TV a day.

ANDREY: That’s what they do. Wait — we count as TV, right?

SETH: Yes. Streaming, online video. If you’re watching this on YouTube, this is TV. So be like an average American and watch us on YouTube. What would you have loved to see in this paper that would have moved you?

ANDREY: I would love a textual analysis — something about what’s actually in the books. I’d want an AI detection algorithm run on the top 2,000 books, and I’d want some measure of actual content quality — reading level, readability, grammar. I know I keep beating this drum.

SETH: You’d need a budget for it, but it’s not inconceivable. You could buy a couple thousand books, spend on the tokens to read them, and look at a couple of different quality metrics — readability, grammar, AI detection. That would be a really spicy paper, and this is just a first step toward it.
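
One of the quality metrics mentioned here, readability, is cheap to sketch in Python. The version below uses the standard Flesch Reading Ease formula, but the syllable counter is a crude vowel-run heuristic and the sample sentences are invented, so treat it as an illustration rather than a production scorer. Grammar checking and AI detection are not covered.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: one syllable per run of vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        raise ValueError("text contains no words")
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / sentences)
        - 84.6 * (syllables / len(words))
    )


# Invented sample passages, standing in for excerpts from purchased books.
plain = "The cat sat. The dog ran."
dense = (
    "Notwithstanding institutional considerations, multidimensional "
    "heterogeneity complicates econometric identification."
)

plain_score = flesch_reading_ease(plain)
dense_score = flesch_reading_ease(dense)
print(f"plain: {plain_score:.1f}, dense: {dense_score:.1f}")
```

Run across a few thousand books, a score like this could be compared between high-AI and low-AI categories, which is the kind of content-based measure being asked for.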

ANDREY: Yes.

SETH: Okay — where do I end up? I was at 75% chance that social value from books goes up by 2030. I was more optimistic about the long-term trend of AI rewarding deep reading and deep knowledge, and about the general complementarity argument — as society becomes more productive, everything is more complementary to everything else, and as long as books are not perfect substitutes for other things, everything getting better is a gross complement to reading. Does this move me? I’m slightly reassured to see that the number of ratings is going up. And it’s good to see that the amount of writing has jumped so dramatically — it suggests that somebody thinks they’re writing for someone. Those 3x new books being written aren’t people intentionally screaming into the void. At least some of them think they’re creating value. So maybe I go from 75% to 76%.

ANDREY: I inch up.

SETH: Okay. Any closing thoughts before we wrap up this intriguing, provocative, but in some ways limited analysis of AI’s effects on book production and consumption?

ANDREY: Look, I think this is getting at something very profound that’s changing in our society. We have no idea if the person who claims to have written something has had the thoughts required to write it — let alone has actually typed those words in that specific order. And we don’t know as a society how to even think about that. Questions about assigning credit, about how much we should update from a piece of text, about whether we should downweight arguments written by AI or treat them as equal — a lot of our intuitions about the value of content, especially writing but not only writing, are going to have to be rethought.

SETH: I want to say one last thing. I do hope people understand that collage is art. Collage has value, even if you’re only copying and pasting from different sources. And of course AI can also create collages. I think there is authorial voice in that and an art in that. I’m reminded of the Barnes Museum in Philadelphia — a fantastic collection by a man who invented an eye drop that prevents blindness in babies and used his fortune to collect amazing Impressionists and Post-Impressionists. The most striking thing about the collection is not that he did a great job choosing winners — there’s a mix — but unlike the Philadelphia Art Museum next door where everything is organized chronologically by artist, what you get is one man’s vision: a Matisse next to a Dürer print next to a rusty key. It creates a completely unique new effect. I don’t think there’s anything necessarily dehumanizing about the idea that humans will move up the value chain and maybe not be writing every individual word, but will find the value in composing and in the juxtaposition of words.

ANDREY: Yeah, I do think there’s something potentially dehumanizing, though. Let’s say I put my name on a work where I didn’t come up with the words — and when we’re having a conversation, you might find me not as articulate or poetic as my writing implies. Right now we have the intuition that speaking ability and writing ability are very strongly tied to each other. Maybe incorrectly.

SETH: Yeah. Writing as a window into the soul of the author. And for certain kinds of reading, maybe that isn’t important. But for certain kinds, it is. Tyler Cowen has talked about this too — do you really want to read the 100th automatically generated biography about an imaginary person? No. Some of the value of an autobiography is that it was a real person. So yes, in some forms of writing, collage doesn’t get you there.

ANDREY: Yeah.

SETH: All right. Well, this has been a fascinating conversation as always. Keep your posteriors justified — and sign up for our Discord, which you’ll find in the show notes.

