Mapping Psychiatric Disorders with GWAS
I wanted to record a brief episode about a paper I just read that came out in December 2025. The title is “Mapping the genetic landscape across fourteen psychiatric disorders,” and it was published in Nature.
At a high level, this was a genome-wide association study (GWAS) looking across a large number of psychiatric diagnoses. The basic goal was to identify genetic variants that are correlated with different psychiatric disorders, and then see whether those disorders naturally cluster together in “genetic space.”
They looked at fourteen psychiatric disorders, spanning both childhood-onset and adult-onset conditions, using a very large sample of over a million cases. The authors examined which alleles, particularly single nucleotide polymorphisms (SNPs), were most strongly associated with each diagnosis.
Brief review of GWAS
The goal of GWAS is to find genetic variants that are significantly more frequent in people with a particular condition than in controls. Usually researchers are looking at single nucleotide polymorphisms, or SNPs (pronounced “snips”): positions where a single DNA letter differs between people, which can be enough to change how a gene functions.
Most individual variants have very small effects, but when you look across many of them at once, patterns can emerge. GWAS is essentially a very large, statistical pattern-matching exercise across the genome.
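To make the per-variant test concrete, here is a minimal sketch of an allelic association test for a single SNP: a 2x2 table of allele counts in cases versus controls, an odds ratio, and a chi-square test. The counts are hypothetical numbers I made up, and real GWAS pipelines typically use regression models with covariates rather than this bare test. The 5e-8 cutoff is the conventional genome-wide significance threshold.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Odds of carrying the alternate allele in cases relative to controls
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # Upper-tail probability of a chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical allele counts (invented for illustration, not from the paper)
odds_ratio, chi2, p_value = allelic_association(
    case_alt=21_000, case_ref=79_000,
    ctrl_alt=20_000, ctrl_ref=80_000,
)

GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS significance cutoff
print(f"odds ratio = {odds_ratio:.3f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
print("genome-wide significant:", p_value < GENOME_WIDE_THRESHOLD)
```

Notice that the allele is only slightly more common in cases (21% versus 20%), so the odds ratio is close to 1, yet the large sample is what lets such a small effect reach significance.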
The five genetic clusters
So the researchers ran a large analysis across these million-plus cases to find SNPs that were more common in each disorder than in controls, and then asked whether these variants clustered in meaningful ways across diagnoses.
For example, they might find that a SNP in gene X is more frequent in schizophrenia cases than in controls, and is also more common in bipolar disorder cases. By looking at which disorders share many of these SNPs, they could cluster them together.
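As a toy illustration of that clustering idea, the sketch below groups disorders whose pairwise genetic correlation passes a cutoff. The correlation values and the 0.4 threshold are invented for illustration, and this simple thresholding is a stand-in of my own, not the factor model the authors actually fit.

```python
# Hypothetical pairwise genetic correlations (invented, not from the paper)
correlations = {
    ("schizophrenia", "bipolar disorder"): 0.7,
    ("OCD", "Tourette syndrome"): 0.5,
    ("OCD", "anorexia"): 0.5,
    ("Tourette syndrome", "anorexia"): 0.3,
    ("schizophrenia", "OCD"): 0.2,
    ("bipolar disorder", "anorexia"): 0.2,
}

def cluster_by_threshold(correlations, threshold):
    """Group disorders into connected components of the 'correlation >= threshold' graph."""
    neighbors = {}
    for (a, b), r in correlations.items():
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if r >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from this disorder
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

clusters = cluster_by_threshold(correlations, threshold=0.4)  # hypothetical cutoff
for cluster in clusters:
    print(cluster)
```

With these made-up numbers, schizophrenia and bipolar disorder land in one group and OCD, Tourette syndrome, and anorexia in another, because the cross-group correlations fall below the cutoff.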
What they found were five broad factors, where disorders within each factor shared a high degree of genetic overlap.
The first factor was labeled SB, standing for schizophrenia and bipolar disorder, the two diagnoses it groups together.
They also identified an internalizing factor. In psychiatric and psychological research, internalizing generally refers to patterns like negative affect, rumination, self-criticism, and inwardly directed distress. This is often contrasted with externalizing, which involves outwardly directed behaviors like impulsivity, aggression, or blaming others.
A third factor was a compulsive factor, which included diagnoses such as OCD, Tourette syndrome, and anorexia.
The fourth was a neurodevelopmental factor, including diagnoses like autism.
The final factor was a substance use disorder factor, which included most of the specific substance use disorders such as cannabis use disorder, stimulant use disorder, nicotine use disorder, and so on.
The p-factor
On top of these five factors, the authors also used a hierarchical model with a single node at the very top, which they called the p-factor. This idea isn’t new and has appeared in prior work. The p-factor represents genetic variants that seem to be shared across all psychiatric disorders, regardless of which specific cluster they fall into.
So the structure looks something like this: a general p-factor at the top, five broad factors underneath it, and then individual diagnoses within each factor. Some genetic variants are associated broadly with all of the tested disorders, while others are more specific to one factor.
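If it helps to see that hierarchy written out, here is a rough sketch of it as a nested data structure. It only includes the example diagnoses I mentioned for each factor, so the lists are partial, and the helper function `path_to` is just a hypothetical name for illustration.

```python
# Three levels: p-factor -> five broad factors -> individual diagnoses.
# Diagnosis lists are partial: only the examples named in this episode.
hierarchy = {
    "p-factor": {
        "SB": ["schizophrenia", "bipolar disorder"],
        "internalizing": [],  # member diagnoses not listed in this episode
        "compulsive": ["OCD", "Tourette syndrome", "anorexia"],
        "neurodevelopmental": ["autism"],
        "substance use": [
            "cannabis use disorder",
            "stimulant use disorder",
            "nicotine use disorder",
        ],
    }
}

def path_to(diagnosis):
    """Return [top node, factor, diagnosis], or None if the diagnosis is not listed."""
    for top, factors in hierarchy.items():
        for factor, diagnoses in factors.items():
            if diagnosis in diagnoses:
                return [top, factor, diagnosis]
    return None

print(path_to("autism"))
print(path_to("OCD"))
```

A variant that loads on the top node would be relevant to every path in this tree, while a factor-specific variant only matters for the diagnoses under that one branch.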
What do the genes actually do?
The authors then asked a natural next question: what do these genes actually do, biologically? What kinds of proteins do they encode, and what processes are they involved in?
Some of the findings were, in a way, reassuringly obvious. The substance use disorder factor, for example, included genes encoding alcohol dehydrogenase, an enzyme directly involved in ethanol metabolism. There was also a SNP in a nicotinic receptor subunit.
That fits common sense. Some people drink alcohol and feel terrible; they get no pleasure, and instead maybe some nausea or flushing. And people are unlikely to become addicted to something that feels bad. Other people find alcohol calming or even euphoric, which makes repeated use and addiction more likely. Differences in metabolism or receptor binding could plausibly contribute to that.
Beyond substance use disorders, the findings were less straightforward. For the p-factor, the shared genes tended to be involved in very general biological processes, such as gene regulation. Nothing particularly specific or mechanistic stood out.
For the schizophrenia-bipolar (SB) factor, there were genes involved in neuronal development, particularly excitatory neurons. The internalizing factor also showed some involvement of excitatory neuron genes, but the signal was less consistent. And for the compulsive factor, there wasn’t anything especially compelling in terms of functional interpretation.
What does this mean for the DSM?
One way to interpret these results is that the DSM may be splitting diagnoses too finely, drawing artificial boundaries between disorders that are not well supported biologically. Clinically, we already see this problem. At a single point in time, it can be genuinely difficult, or impossible, to distinguish bipolar disorder with psychosis from schizophrenia. The shared genetic signal between those diagnoses has been described before, and this paper reinforces that overlap.
It naturally raises the question of whether these are truly distinct disorders, or whether they are different presentations of the same underlying condition.
On the other hand, there’s an important limitation here. Psychiatric diagnoses are not static. People’s diagnoses change over time. Many clinicians have seen patients move from a bipolar diagnosis to schizophrenia, and then later back to bipolar disorder depending on who evaluates them and at what point in their illness.
That diagnostic uncertainty inevitably contaminates genetic studies like this. It doesn’t invalidate the findings, but it likely distorts them to some degree.
There’s one more point I’d emphasize. While there are clearly genetic risk factors for psychiatric conditions, I don’t think we’re going to find all the answers at the level of genes alone.
If you think of the brain loosely as an information-processing system, an analogy that is not perfect, but still useful, consider this: take a very large, complex software program like Adobe Photoshop or Microsoft Excel that has millions of lines of code. Now run a million identical copies of that program on different computers.
Some percentage of those programs will crash or behave unexpectedly. And the code is identical in every case. The failures happen because users push the software into regimes the original programmers didn’t anticipate.
If you ran a GWAS on that situation, you wouldn’t learn much from the “code,” because the code is the same everywhere, even though failures still occur.
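Here is a small simulation of that thought experiment, under assumptions I am making up: every copy runs the same code version, each copy gets a random workload, and a workload above a hypothetical threshold causes a crash. I use 100,000 copies instead of a million just to keep it fast.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

CODE_VERSION = "v1.0"    # every copy runs identical code
CRASH_THRESHOLD = 0.99   # hypothetical: workloads above this crash the program
N_COPIES = 100_000       # scaled down from a million to keep the run quick

def run_copy():
    """Simulate one installation: same code, different environment."""
    workload = random.random()  # stands in for how hard the user pushes it
    crashed = workload > CRASH_THRESHOLD
    return CODE_VERSION, crashed

results = [run_copy() for _ in range(N_COPIES)]
crashed_versions = [version for version, crashed in results if crashed]
ok_versions = [version for version, crashed in results if not crashed]

def frequency(versions, target):
    """Fraction of copies in a group that carry the target code version."""
    return sum(v == target for v in versions) / len(versions)

crash_rate = len(crashed_versions) / N_COPIES
freq_in_crashed = frequency(crashed_versions, CODE_VERSION)
freq_in_ok = frequency(ok_versions, CODE_VERSION)

print(f"crash rate: {crash_rate:.3%}")
print(f"version frequency in crashed copies: {freq_in_crashed}")
print(f"version frequency in healthy copies: {freq_in_ok}")
```

The “variant” frequency is the same in the crashed and healthy groups, so a case-control comparison of the code has nothing to detect, even though crashes clearly happen.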
I think something similar applies to psychiatric disorders. Genetics matter and may reveal some mechanistic vulnerabilities, but even genetically “normal” brains can experience psychiatric disorders when put in unexpected or extreme environments. So we shouldn’t expect gene-level analyses to give us a complete explanation of psychiatric illness.