Studies purport to identify the sources of information that generative AI models like ChatGPT, Gemini, and Claude draw on to provide overviews in response to search prompts. The information seems compelling, but different studies produce different results. Complicating matters is the fact that the kinds of sources AI uses one month aren’t necessarily the same the next month. In this short midweek episode, Neville and Shel look at a couple of these reports and the challenges communicators face relying on them to help guide their content marketing placements.
Links from this episode:
The next monthly, long-form episode of FIR will drop on Monday, December 29.
We host a Communicators Zoom Chat most Thursdays at 1 p.m. ET. To obtain the credentials needed to participate, contact Shel or Neville directly, request them in our Facebook group, or email fircomments@gmail.com.
Special thanks to Jay Moonah for the opening and closing music.
You can find the stories from which Shel’s FIR content is selected at Shel’s Link Blog. You can catch up with both co-hosts on Neville’s blog and Shel’s blog.
Disclaimer: The opinions expressed in this podcast are Shel’s and Neville’s and do not reflect the views of their employers and/or clients.
Raw Transcript:
Shel Holtz Hi everybody, and welcome to episode number 490 of For Immediate Release. I’m Shel Holtz.
Neville Hobson And I’m Neville Hobson. One of the big questions behind generative AI is also one of the simplest: What is it actually reading? What are these systems drawing on when they answer our questions, summarize a story, or tell us something about our own industry? A new report from Muck Rack in October offers one of the clearest snapshots we’ve seen so far. They analyzed more than a million links cited by leading AI tools and discovered something striking.
When you switch citations on, the model doesn’t just add footnotes, it changes the answer itself. The sources it chooses shape the narrative, the tone, and even the conclusion. We’ll dive into this next.
Those sources are overwhelmingly from earned media. Almost all the links AI cites come from non-paid content, and journalism plays a huge role, especially when the query suggests something recent. In fact, the most common publication date for a cited article is yesterday. It’s a very different ecosystem from SEO, where you can sometimes pay your way to the top. Here, visibility depends much more on what is credible, current, and genuinely covered. So that gives us one part of the picture.
AI relies heavily on what is most available and most visible in the public domain. But that leads to another question, a more unsettling one, raised by a separate study published in JMIR Mental Health in November. Researchers examined how well GPT-4o performs when asked to generate proper academic citations. And the answer is: not well at all. Nearly two thirds of the citations were either wrong or entirely made up.
The less familiar the topic, the worse the accuracy became. In other words, when AI doesn’t have enough real sources to draw from, it fills the gaps confidently. When you put these two pieces of research side by side, a bigger story emerges. On the one hand, AI tools are clearly drawing on a recognizable media ecosystem: journalism, corporate blogs, and earned content. On the other hand, when those sources are thin, or when the task shifts from conversational answers to something more formal, like scientific referencing, the system becomes much less reliable. It starts inventing the citations it thinks should exist.
We end up with a very modern paradox. AI is reading more than any of us ever could, but not always reliably. It’s influenced by what is published, recent, and visible, yet still perfectly capable of fabricating material when the trail runs cold. There’s another angle to this that’s worth noting.
Nature reported last week that more than 20% of peer reviews for a major AI conference were entirely written by AI, many containing hallucinated citations and vague or irrelevant analysis. So if you think about that in the context of the Muck Rack findings in particular, it becomes part of a much bigger story. AI tools are reading the public record, but increasing parts of that public record are now being generated by AI itself.
The oversight layer we use to catch errors is starting to be automated as well. And that creates a feedback loop where flawed material can slip into the system and later be treated as legitimate source material. For communicators, that’s a reminder that the integrity of what AI reads is just as important as the visibility of what we publish. All this raises fundamental questions. How much does earned media now underpin what AI says about a brand?
If citations actively reshape AI outputs, what does that mean for accuracy and trust? How do we work in a world where AI can appear transparent, citing its sources, while still producing invented references in other contexts? The Muck Rack and JMIR studies show that training data coverage, not truth, determines what AI cites. So the question, what is AI reading, has two answers, I think. It reads what is most visible and recent in the public domain, and it invents what it thinks should exist when the knowledge isn’t there. That gap between the real and the fabricated is now a core communication risk for organizations. How do you see it, Shel? Thoughts on that?
Shel Holtz It is a very, very complex issue. I was looking at a study from Profound called AI Search Volatility. And what it found was that search engines within the AI context, the search that ChatGPT and Gemini and Claude conduct, are probabilistic rather than deterministic, which means that they’re designed to give different answers and to cite different resources, even for the same query over time.
Another thing that this study found was that there is citation drift. That is, the domains cited in July aren’t necessarily the ones that were cited in June for the same prompts. Look at the share of domains cited in July that weren’t present in June: nearly 60% for Google AI Overviews, just over 54% for ChatGPT, over 53% for Copilot, and over 40% for Perplexity. So 40 to 60% of the domains cited in AI responses are going to be different a month later for the same prompt. And this volatility increases over time, going from 70 to 90 percent over a six-month period.
So you look at one of these studies, it’s a snapshot in time, and it’s not necessarily telling you that you should be using this information as a strategy to guide where you’re going to publish your content if the sources are going to drift. And by the way, a Profound study by their AEO specialist, a guy named Josh Bliskolp, found that AI relies heavily on social media and user-generated content, which is different from what the Muck Rack study found. They were probably getting a snapshot in time where the citations had drifted. So, while I think all these studies are interesting, I think what it tells us as communicators looking to show up in these answers is that we need to be everywhere.
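The drift metric Shel describes is simple to compute for your own prompts: take the domains an AI cited for a prompt in two different months and measure the share of this month’s domains that were absent last month. A minimal Python sketch, using made-up domain lists rather than anything from the Profound report:

```python
# Citation drift: the share of domains cited this month that were
# absent from last month's citations for the same prompt.
# These domain sets are illustrative, not data from the Profound study.

june = {"example-news.com", "vendor-blog.com", "wikipedia.org", "trade-journal.com"}
july = {"example-news.com", "reddit.com", "wikipedia.org", "new-outlet.com", "trade-journal.org"}

new_in_july = july - june                    # domains that appear only in July
drift = len(new_in_july) / len(july) * 100   # percent of July's domains that are new

print(f"Citation drift: {drift:.0f}% (new domains: {sorted(new_in_july)})")
```

Run that comparison month over month for your priority prompts and you can see for yourself how stable, or unstable, the sources behind the answers actually are.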
Neville Hobson Yeah, I’ve been trying to get my head around this, I must admit. Reading these reports, the Nature one kind of threw me sideways when I found it, because I thought, how relevant is that to the topic we’re discussing in this podcast? And then my further research showed it is relevant, as the content is being fed back into the system, and that’s showing up in search results. You’re right. In another sense, I think you can take all these survey reports and dissect them every which way to Christmas.
But they have credibility in my eyes, certainly, particularly Muck Rack’s. I find the JMIR one equally good, but it touches on areas that I’m not wholly familiar with. This one in Nature is equally good, and what it shows is quite troubling, I think. Listening to how you were describing the Profound report on citation consistency over time, I just kept thinking about the Nature one as an example, let’s say. Measuring citation consistency over time sounds great, but what if the citations are fake, full of hallucinations, full of invalid information? Where does that sit? That’s my question, I suppose.
Shel Holtz Well, yeah, this shouldn’t surprise anybody who’s been paying attention. AI still confabulates. It still says at the bottom, I think, of ChatGPT or Gemini that these things are still prone to misinformation. They are configured more to satisfy your query than they are to be accurate. So when they can’t find or don’t know an accurate citation, they’ll make one up.
We still have attorneys filing briefs that cite cases that don’t actually exist. So this is the nature of the beast right now. If you’re not verifying the information that you get before you do something with it, that’s on you. That’s not on the AI. They’re telling you that these things still hallucinate. They’re working on it. They hope to have it fixed one of these days, but they’re not quite sure how to do that. It’s not like just going in and turning a dial or flipping a switch; the researchers are struggling to figure this out. And if it were that easy, they would have done it by now.
Neville Hobson Sure. Although what you just said does not come across at all in any of the communication you see from any of the chatbot makers, except in four-point type at the bottom: you know, it can hallucinate, you need to do your verification. I don’t hear that clear call, a kind of warning shot, if you like, from anyone when they’re talking about all this stuff, and that needs to change in that case. I don’t feel it comes across as being as bad as what I got from what you were saying.
Although the point does rear its head quite clearly, and it’s got to be repeated again and again: you’ve got to double-check everything that you run through. Well, not run through an AI, but the results you get when you do a search. So, you know, it’s all very well talking about citation consistency over a time frame from one month to the next. You’ve got to check that yourself. The question will arise, I think, for many: how do you do that? You might use a chatbot to do it, would you? Of course you would, because it’s a tool you’ve got in your armory, but then you’ve got to check that too.
Shel Holtz Well, I’ve got Google in my armory too. If I see it make an assertion and it has a citation, I’m going to go to Google and look it up. I’m not going to look up the URL that the chatbot presented. I’m going to type in the information about the report or the study or the white paper or whatever it was that is cited and see if I can find it. And then if I can and it’s the right one, I’m going to check and see if the link is the same one that the AI provided.
I did a white paper. I used Google Gemini’s Deep Research for the first pass of it, and it was loaded with citations. Where I spent my time wasn’t in doing the initial research, it was in validating every citation that it provided before I passed it along to people. So that’s got to be part of the workflow with these things for now. I hope they fix it one day, but for now, you can’t just crank one of these things out and, you know, submit it to a judge or, you know, use it in your medical practice or pass it along to your boss. You have to validate that it’s all accurate.
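Part of that validation pass can be scripted. Here’s a rough sketch, assuming the Python requests library and an arbitrary similarity threshold: it fetches each URL the AI supplied and checks whether the page’s title plausibly matches the title the AI claimed. Anything it flags still needs the by-hand check Shel describes:

```python
import re
import difflib

import requests


def check_citation(claimed_title: str, url: str, threshold: float = 0.6) -> str:
    """Fetch an AI-supplied URL and report whether the page's <title>
    plausibly matches the title the AI claimed for it."""
    try:
        resp = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        return f"UNREACHABLE ({exc.__class__.__name__})"
    if resp.status_code != 200:
        return f"HTTP {resp.status_code}"
    match = re.search(r"<title[^>]*>(.*?)</title>", resp.text, re.I | re.S)
    if not match:
        return "no <title> found; check manually"
    page_title = match.group(1).strip()
    score = difflib.SequenceMatcher(None, claimed_title.lower(), page_title.lower()).ratio()
    verdict = "plausible" if score >= threshold else "SUSPECT; verify by hand"
    return f"{verdict} (similarity {score:.2f}, page title: {page_title!r})"


# A hypothetical citation, as an AI might present it:
print(check_citation("AI Search Volatility", "https://example.com/report"))
```

A “plausible” score here is a screen, not a verdict; the point is to surface the citations most worth checking by hand first.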
Neville Hobson Yeah. By the way, didn’t you say once, quite a long time ago now, I expect, that you didn’t use Google anymore? It was only ChatGPT or Gemini.
Shel Holtz I switch back and forth based on which one is performing better on the benchmarks. I also find that the three primary models, ChatGPT, Google Gemini, and Claude, are better at different things. So I tend to use different ones for different things. But Gemini 3 is spectacular. This most recent upgrade that just came out, I think it was last week, wasn’t it? It’s amazing. So I have sort of shifted most of my work using one of the large language models to Gemini right now. I still use ChatGPT for a few things right now. Of course, they’re going to come out with their own big upgrade, probably; well, there’s some speculation, before the end of the year. So we’ll see where they land. But right now, I find Google Gemini is best for a number of things. And by the way, Nano Banana Pro, the image generator: if I were the product manager for Photoshop or for Canva, I’d be worried, because you can just upload an image and edit it in Nano Banana with plain text. Just tell it what you want done and it does it, and it’s pretty awesome. I’ve been playing with it. I can tell you what I did with it, but it’s spectacular.
Neville Hobson Okay, so yeah.
Shel Holtz And fast. You compare that to OpenAI’s image generator, which takes minutes. You’re just sitting there watching this gray blob slowly resolve. Nano Banana: boom, there it is.
Neville Hobson Yeah, I see a lot of people posting examples of what it can do. It looks pretty good. So going back to this, though: let’s talk a bit about the verification, because I think many people, I don’t know how many it might be, maybe it’s a small number, need some guidance on what to do with that. It’s quite an additional step, you might argue, in what some people see as the speed and simplicity of using an AI tool to conduct your research, for instance, or to summarize a PDF file or whatever it might be. So what would be your tips for a communicator on building this into the workflow so that it becomes a natural part of what they’re doing and not a pain in the ass, frankly? What would you say to them?
Shel Holtz Yeah, well, my tip is to build it into the workflow. It’s still going to save you, well, first of all, it’s still going to save you time. For me to go through and validate the facts that are presented in a bit of AI research takes me less time than it would to conduct the research and draft the white paper myself. And by the way, I want to be sure everyone understands: I do heavily edit the white paper for language, for style. I rewrite entire sections based on how I would say this. But for that first draft, I think that’s the point: you have to look at these as a first draft. This is why we have interns, right? To crank out first drafts of things and save us the time. And I still like that metaphor of AI being a really smart intern who doesn’t go home at the end of the day, doesn’t need a paycheck, just works 24 hours, and never gets sick. I think that’s an apt metaphor.
But to just ignore the need to review these things and think it’s going to give you a finished product, that’s a mistake. And you need to come up with a workflow, define your own, but it has to include validation of the information that it provided. If it doesn’t, you’re setting yourself up for some real grief. I mean, if you share the results of that with somebody who is important in your career, in your life, and they make decisions based on it that turn out to be bad decisions because it was a confabulated citation, then that’s going to roll right back on you. So you have to build it into the workflow, just like any other workflow. This is the step that comes after the first step.
Neville Hobson I wonder, tell me what you think, is this significantly more concerning if you’re in academia, say, or working for a scientific firm on the science side of things, where peer-reviewed, citation-led work on research for medical breakthroughs, or whatever it might be, or scientific discoveries, typically takes months, if not years, to go through a process? What would you do if you were in that situation, where you know they’re relying on this, and it’s now emerging that academic papers in particular are becoming, what, untrustworthy? That, to me, is a pretty big deal if many people see it that way. I’m just curious how you’d discuss that with someone.
Shel Holtz I don’t think my guidance would change. There is an obligation to ensure that what you are sharing is accurate. And if you are using gen AI to produce some or all of a report, that obligation extends to fact-checking. I mean, hire an intern to do the fact-checking so that you have time to do other things. There’s a reason to have an intern. We’ve heard this question: if AI can do what an intern can do, what will an intern do? And the answer may be: validate what the AI cranks out.
But the risk is so severe that this just needs to become a matter of routine. And especially in science, where these things can be translated into medicines and treatment protocols and the like, you don’t want to be responsible for people getting sicker or dying because you had a confabulated resource or a citation that you didn’t check before you moved on to the next step with this. And if the peer review of the document that you have created produces those errors, if the peers that are reviewing it find the fictitious citations or the wrong citations, it’s your reputation that’s on the line. No one’s going to blame the AI. They’re going to blame you. So your credibility is on the line.
One other point I want to make here in terms of what I would recommend: I would go back to Gini Dietrich’s PESO model, paid, earned, social and owned, and recognize that that model hasn’t changed in the age of AI. If you want to be cited, don’t chase the shiny object of the latest report that says it’s reading this, it’s reading that. The fact that it shifts from month to month means you need to be in all those places. And before AI, we were paying a lot of attention to the PESO model. I’d hate to see it fall by the wayside as lazy people think they can get away with just doing one thing because it’s gotten so easy: AI reads this. Well, that’s this month. Next month, you’re toast.
Neville Hobson Yeah, of course. I recall that many people I know still talk about how they don’t need an intern anymore because they have an AI.
Shel Holtz Yeah, well, then they’re either spending a lot of time validating what the AI produced or they’re putting garbage out into the world.
Neville Hobson I sense not a lot of time, actually. So this then comes back to: you’ve got to put in the time. Some of the work that I’ve been doing recently on research reminds me of something I did, I guess, two weeks ago, which was checking the links in a report that cited this, this, and this. And I would say, of the 65 or so links I checked, 15 were 404s, or unknown, or, you know, even the browser errors you get when it can’t connect to something. So no one had checked those. But I’m okay with that, because that’s why I’m here. That’s what I will do. And you’ve got to do it. I agree, you’ve got to do it.
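That kind of 404 sweep is the easiest part to automate. A minimal sketch, again assuming the Python requests library: it tries a HEAD request first, falls back to GET for servers that reject HEAD, and reports anything dead or unreachable:

```python
import requests


def sweep_links(urls):
    """Return (url, reason) pairs for links that are dead or unreachable."""
    dead = []
    for url in urls:
        try:
            resp = requests.head(url, timeout=10, allow_redirects=True)
            if resp.status_code >= 400:
                # Some servers reject HEAD; retry with GET before judging.
                resp = requests.get(url, timeout=10, allow_redirects=True)
            if resp.status_code >= 400:
                dead.append((url, f"HTTP {resp.status_code}"))
        except requests.RequestException as exc:
            dead.append((url, f"no connection ({exc.__class__.__name__})"))
    return dead


# A hypothetical list of links pulled from a report:
for url, reason in sweep_links(["https://example.com/gone", "https://example.com/ok"]):
    print(f"{reason}: {url}")
```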
Shel Holtz Well, exactly. Yeah, and the net is still a gain for you, the communicator. You’re still going to save time. You’re just not going to save as much as you think you will if you don’t have to do anything other than write a prompt. There’s more to it than that.
Neville Hobson Right. So to conclude on that, then: we kind of rang the alarm bell in my narrative intro about, you know, this report in Nature in particular that flagged up all these fake citations. Seen that way, then, if you’ve got a report that has lots of links in there and all sorts of things being said, you have to manually check each one. And that then comes
Shel Holtz Yes, yes you do.
Neville Hobson back to good old Google probably. But it’s not just the tool, it’s the framework under which you do it in that for instance, minor thing. But if I was doing that, now I would be doing it on a clean interface like the browser I’m not logged into, probably different browser perhaps than I normally using even a different computer if I really wanted to take to extreme level. But it gives you more confidence that your own persona, if you like, is not influencing anything even unbeknownst to you that it might be doing. I mean, I’m not saying it is, but this gives you the, the best way of doing it, I would say so this is best practice. So we should write a best practice guide on this, perhaps. But you know, it’s it’s food for thought.
Shel Holtz It certainly is. And by the way, I think I said paid, earned, social and owned when I was running down what the letters in PESO stand for. The S is actually shared, which includes social but has a few other things in it. Go look up Gini Dietrich’s PESO model, folks, and you’ll find it.
Neville Hobson I think she did an update to this for the AI age. I seem to recall a lot of talk about that. Yeah, as well as a tool for ChatGPT that you could use, you know, based on that, basically.
Shel Holtz She did. Yeah, she did. I believe she did. And that’ll be a 30 for this episode of For Immediate Release.
The post FIR #490: What Does AI Read? appeared first on FIR Podcast Network.