Novelty is a slippery concept in genetics: Literally, it simply means that a gene’s connection to a certain condition has never previously been noted. But many researchers say that being first to implicate a gene isn’t enough. Genes also must pass a stringent statistical bar to become candidates, or they remain merely curiosities.
In autism, this issue has come to the fore in the past few months.
For example, in March, an analysis of the genetic blueprints of more than 5,000 people revealed 61 genes linked to autism. The researchers claimed that 18 of those genes were ‘new candidates’ for the condition. But 17 of them had previously been found in people with autism.
Ryan Yuen, a researcher on the study, defends the genes’ status as new candidates, saying earlier reports didn’t firmly establish the link. “A gene needs to reach a certain mutation rate with a large-enough sample size so that we have enough confidence to call it an autism risk gene,” says Yuen, a scientist at the Hospital for Sick Children in Toronto, Canada. The 18 genes had not previously crossed that statistical threshold, he says.
To identify genes linked to a condition, researchers compare genetic variants in people who have the condition with those in people who don’t. Alternatively, they may check the variants against a model that predicts the background rate of variation in the genome. If a certain variant occurs more often than expected — above a certain statistical threshold — it’s considered to be significant.
In other words, whether a gene should be considered a ‘novel candidate’ for autism depends not just on whether it’s been linked to the condition before, but on the strength of that link.
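The comparison against a background model can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any study mentioned here: it assumes mutation counts follow a Poisson distribution, and the sample size, per-person mutation rate and significance threshold are all hypothetical values chosen for the example.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), summed term by term."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0
    # Start from the probability of exactly `observed` events, computed in
    # log space so the factorial cannot overflow.
    term = math.exp(-expected + observed * math.log(expected)
                    - math.lgamma(observed + 1))
    total = 0.0
    k = observed
    while True:
        total += term
        k += 1
        term *= expected / k  # pmf(k) = pmf(k - 1) * expected / k
        # Stop once the terms are shrinking and no longer change the sum.
        if k > expected and term <= total * 1e-16:
            break
    return min(total, 1.0)

# Hypothetical inputs, chosen only for illustration.
N_PEOPLE = 5000                  # people sequenced
MUTATION_RATE_PER_PERSON = 2e-5  # assumed background rate for one gene
THRESHOLD = 2.5e-6               # assumed significance cutoff

# Number of mutations in this gene expected by chance alone.
expected_count = N_PEOPLE * MUTATION_RATE_PER_PERSON

def is_significant(observed_count):
    """True if the observed count is rarer under the model than the cutoff."""
    return poisson_upper_tail(observed_count, expected_count) < THRESHOLD

for observed_count in (1, 4, 5):
    p = poisson_upper_tail(observed_count, expected_count)
    print(observed_count, p, is_significant(observed_count))
```

Under these assumed numbers, four mutations in the gene would fall just short of the cutoff and five would clear it, which shows how a gene can be seen repeatedly in people with a condition and still not count as statistically established.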
Not everyone agrees that a statistical definition of novelty makes sense, however.
“Novelty is not defined by how many cases are reported, but [by] whether it’s the first time to identify the link,” says Hengye Man, associate professor of biology at Boston University. In 2013, Man and his team became the first to link a gene called KIAA2022 to autism; the same gene turned up in Yuen’s study. Man says he doesn’t understand how Yuen’s group can claim KIAA2022 as a novel candidate.
The good news is that Man’s finding held up in another study. But there are countless examples of genes being mistakenly tied to conditions, says David Cutler, associate professor of human genetics at Emory University in Atlanta. “There’s a lot of stuff in the literature that is simply wrong,” he says.
Some of the flimsier findings are based on reports of a mutation in one person or a handful of people with a condition. When a study involves only a few people, the connection could be due to chance — even if there’s a plausible biological explanation for how a mutation could cause autism.
“The literature is full of these one-offs,” says Evan Eichler, professor of genome sciences at the University of Washington.
Sometimes missteps occur even when scientists try to do their statistical due diligence. In February, Eichler and his colleagues identified 91 genes tied to neurodevelopmental disorders, including autism. The team flagged 38 as novel genes.
Eichler’s team selected 208 genes previously associated with neurodevelopmental conditions and sequenced those genes in more than 11,000 people with the conditions. Because the previous studies that implicated those genes had scanned whole genomes, the researchers should have corrected for an entire genome’s worth of genes. Instead, they corrected for only the 208 genes. As a result, they inadvertently set the statistical bar far too low.
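A rough sketch shows how much the number of genes corrected for matters. It assumes a Bonferroni correction, one common approach in which the overall error rate is divided by the number of tests; the article does not say which correction the team used. The 0.05 error rate, the figure of roughly 20,000 genes in the genome and the example p-value are likewise assumptions made for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the overall error rate near alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

ALPHA = 0.05              # assumed overall error rate
TARGETED_GENES = 208      # genes sequenced in the targeted study
GENOME_WIDE_GENES = 20000 # rough, assumed count of genes in the genome

targeted_cutoff = bonferroni_threshold(ALPHA, TARGETED_GENES)
genome_wide_cutoff = bonferroni_threshold(ALPHA, GENOME_WIDE_GENES)

# A hypothetical gene whose p-value sits between the two cutoffs.
example_p_value = 1e-4

print("cutoff for 208 genes:", targeted_cutoff)
print("cutoff for whole genome:", genome_wide_cutoff)
print("passes targeted cutoff:", example_p_value < targeted_cutoff)
print("passes genome-wide cutoff:", example_p_value < genome_wide_cutoff)
```

With these assumed figures, the cutoff for 208 genes is nearly 100 times more lenient than the genome-wide one, so a gene like the hypothetical example would count as significant under the first and not under the second.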
Cutler and his colleagues pointed out the error in a critique of the study posted to the preprint server bioRxiv. When they reanalyzed the data using the correct threshold, they found that only about half of the 91 genes retained their ties to autism. Eichler and his team are working to correct the error.
Human error aside, statisticians have developed standards for ruling out chance associations in genetics, but not all researchers employ them, says Joseph Buxbaum, professor of psychiatry at the Icahn School of Medicine at Mount Sinai in New York and a co-author on the critique. “The statistical police should be the top-tier journals,” he says. Nature Genetics, the prestigious journal in which Eichler and his team published the results, missed the error.
Overall, however, there is far more agreement today than disagreement, says Matthew State, chair of psychiatry at the University of California, San Francisco. Different groups might use slightly different statistical methods, but most seem to embrace the same threshold for significance, he says.
That statistical agreement is important, because new findings can prompt biologists to embark on labor-intensive and costly paths to investigate a gene’s role in a condition. If geneticists erroneously connect a gene to a condition — something that used to occur more often than it does now, State says — those investigations will be in vain. “The amount of wasted time that emanated from unreliable candidate gene studies across all of psychiatry is heartbreaking,” he says.
Given the disagreement over what counts as novel, autism geneticists should perhaps abandon claims of novelty altogether, Eichler says. Novel genes might attract the attention of journal editors, but “finding that the same gene has been seen two or three times is in fact really important information in itself,” he says. “Maybe we should just disavow the use of the word ‘novel’ forever.”