An unbiased search of more than 2,000 whole genomes for variants in the regions between genes has not found any that contribute to autism [1].
The study does not prove that those variants don’t exist, but suggests that finding them will require thousands more whole genomes than researchers have now.
The stakes are high: Focusing on mutations in genes — or the coding genome — has uncovered only a small proportion of autism risk factors so far.
“The noncoding genome has the potential to provide us more insight into when and where and which cell type these variants are affecting. There’s only so much information you get at the level of the gene,” says co-lead investigator Stephan Sanders, assistant professor of psychiatry at the University of California, San Francisco. But “to get anywhere near a single variant, we’re going to need thousands and thousands of families,” he says.
The finding is at odds with three other studies published in the past two years that reported that certain noncoding regions carry an excess of autism variants [2,3,4]. In those studies, researchers looked for variants in a subset of noncoding regions that they predicted would play a role in autism.
But Sanders and his colleagues say it’s a mistake to conduct these sorts of directed searches. “We know so little that we can’t really make informed judgements about this,” Sanders says.
A search based on hypotheses may lead researchers to inadvertently “cherry-pick” results, says Naomi Wray, professorial research fellow of molecular bioscience at the University of Queensland in Australia, who was not involved in the studies. “I think now we have to move away from that. And I think this study has provided a nice framework for other people to build upon.”
Some researchers still believe in prioritizing certain parts of the genome, however.
“I understand where they are coming from; we have all been badly burned by the poor results of candidate-gene approaches,” says Lucia Peixoto, assistant professor of biomedical sciences at Washington State University Spokane, who was not involved in the studies. “However, I am still a believer in narrowing down to candidates.”
Sanders and his colleagues analyzed 2,076 whole genomes from 519 families (four genomes per family) that have one child with autism. They identified every type of DNA variation in the noncoding genome that is present in individuals with autism but not in their unaffected parents.
These so-called ‘de novo variants’ include changes to a single DNA base, flips in the orientation of DNA segments, and structural rearrangements, such as deletions of DNA. The team also looked at rare inherited variants. Their analysis is the first published overview of all these types of variation in the noncoding genomes of people with autism.
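The idea of a de novo variant, one seen in the child but in neither parent, can be illustrated with a minimal Python sketch. This is not the researchers' pipeline: real variant calling also weighs sequencing quality and coverage. The variant tuples and the function name `de_novo_variants` are hypothetical.

```python
# Minimal sketch (hypothetical data model, not the study's pipeline):
# each variant is a (chromosome, position, reference, alternate) tuple.

def de_novo_variants(child, mother, father):
    """Return variants present in the child but absent from both parents."""
    return child - (mother | father)


child = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr7", 42, "G", "A")}
mother = {("chr1", 1000, "A", "G")}
father = {("chr2", 500, "C", "T")}

# Only the chr7 variant is absent from both parents.
novel = de_novo_variants(child, mother, father)
```

Rare inherited variants, which the team also examined, would instead be those the child shares with a parent.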
People with autism do not have an excess of these variants in regulatory regions near autism genes, the researchers found. The researchers then looked for links to autism in select regions of the noncoding genome.
Other teams prioritized certain regions — say, those that are conserved during evolution or are near an autism gene — based on their hypotheses. Sanders and his colleagues instead considered five categories of noncoding regions — based on function or nearby gene type, for example — and 68 items, such as timing of expression or proximity to an autism gene, within these categories.
They then compiled every possible combination of these items, taking one from each category. After eliminating redundancy and removing combinations that contain fewer than seven variants, they arrived at 13,704 possible combinations — for example, regions that control genes, are conserved and are near a gene expressed in the brain.
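The enumeration described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the three categories, their items and the "any" wildcard are hypothetical stand-ins for the five categories and 68 items, and the redundancy-removal step is omitted. Only the threshold of seven variants comes from the description above.

```python
from itertools import product

# Hypothetical categories and items; the study used five categories and 68 items.
CATEGORIES = {
    "function": ["any", "promoter", "enhancer"],
    "conservation": ["any", "conserved"],
    "nearby_gene": ["any", "brain_expressed", "autism_gene"],
}

MIN_VARIANTS = 7  # combinations with fewer variants are discarded


def matches(variant, combo):
    """A variant matches when it carries every non-'any' item in the combination."""
    return all(item == "any" or item in variant for item in combo)


def enumerate_combinations(variants, categories=CATEGORIES, min_variants=MIN_VARIANTS):
    """Take one item from each category and keep combinations with enough variants."""
    kept = {}
    for combo in product(*categories.values()):
        count = sum(matches(v, combo) for v in variants)
        if count >= min_variants:
            kept[combo] = count
    return kept


# Toy data: each variant is the set of annotations it carries (hypothetical).
variants = [{"promoter", "conserved"}] * 7 + [{"enhancer"}] * 3
kept = enumerate_combinations(variants)
```

With the toy data, the 18 possible combinations shrink to the few that contain at least seven variants, which is the same pruning that left the researchers with 13,704 combinations to test.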
Variants in some of these combinations are unusually common in people with autism; others are more common in controls. But all of these associations disappear after a statistical correction that accounts for the likelihood that an association would arise by chance in at least one of these thousands of combinations. The study appeared 26 April in Nature Genetics.
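Why a seemingly strong signal can vanish is easier to see with numbers. The sketch below uses a Bonferroni correction, a simple and conservative method chosen here for illustration; the study's own correction procedure may differ. The variant counts are invented, and the function names are hypothetical.

```python
from math import comb

N_TESTS = 13_704  # number of combinations tested, as reported above
ALPHA = 0.05      # conventional significance level (an assumption)


def binomial_two_sided_p(cases, controls):
    """Exact two-sided binomial p-value, assuming variants are equally
    likely to fall in cases or controls when there is no association."""
    n = cases + controls
    observed = comb(n, cases)
    # Sum the probabilities of all outcomes no more likely than the observed one.
    total = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return total / 2 ** n


def survives_bonferroni(p_value, n_tests=N_TESTS, alpha=ALPHA):
    """Require the p-value to beat alpha divided by the number of tests."""
    return p_value < alpha / n_tests


# Invented example: 15 variants in people with autism, 5 in controls.
p = binomial_two_sided_p(15, 5)
passes = survives_bonferroni(p)
```

In this invented example the split of 15 to 5 gives a nominal p-value of about 0.04, which looks significant on its own but falls far short of the corrected threshold of roughly 0.0000036.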
“If you don’t build this up in a systematic way that corrects for the number of comparisons you have done, you will make quite erroneous claims of what’s associated with autism,” says co-lead investigator Michael Talkowski, associate professor of neurology at Harvard University.
Others maintain that focusing on certain classes of noncoding variation is still a valid approach. Often, the choice is backed by solid biological evidence, says Evan Eichler, professor of genome sciences at the University of Washington in Seattle. Eichler led two of the studies that took directed approaches to whole-genome analysis.
“What’s nice about their study is they have identified additional categories for future investigation,” Eichler says.
The rationale for zeroing in on particular categories is likely to get stronger with time, others say, perhaps improving the results of directed searches.
“I do think a ‘biased’ approach could work if we can improve the ability to annotate functional consequence of noncoding variants,” says Yufeng Shen, assistant professor of systems biology and biomedical informatics at Columbia University.
What’s more, the new study is still too small to make sweeping conclusions about the best method for analyzing whole genomes, says Jonathan Sebat, chief of the Beyster Center for Genomics and Neuropsychiatric Diseases at the University of California, San Diego.
“Unless you are studying a trait with a relatively simple genetic architecture, 500 cases and 500 controls is not going to cut it,” he says.
In a study published earlier this month, Sebat and his colleagues reported that people with autism carry an excess of structural variants inherited from their fathers in regulatory regions near genes. This study included 9,274 whole genomes from 2,600 families that have at least one person with autism.
Sanders and Talkowski have since repeated their analysis in more than 1,900 families with autism. Their unpublished results so far indicate that an unbiased approach is still necessary at this sample size, Sanders says. They estimate that more than 24,000 whole genomes are needed to obtain statistically valid results.