Two independent teams have identified the genetic culprits of three rare, inherited diseases by sequencing the genomes of several members of the same family. One of the studies appears today in Science [1].
As the cost of whole-genome sequencing plummets, this family-based approach will reveal candidate genes not just for rare diseases but for common, complex disorders such as autism, experts say.
“It really shows the power of the technology,” notes Michael Zwick, assistant professor of human genetics at Emory University, who was not involved with either study.
Zwick’s team is searching for genetic causes of autism by sequencing the X chromosome in boys with the disorder. “Whole-genome sequencing is certainly on the near horizon for autism,” Zwick says.
In the Science study, researchers sequenced the genomes of a family of four: two healthy parents and a son and daughter. Both children have Miller syndrome, which is characterized by face and limb deformities, and primary ciliary dyskinesia, a disorder in which the lungs are filled with mucus.
By directly comparing the DNA of the parents and the children, the researchers narrowed the search to four candidate genes for the two diseases.
In the second study, geneticist James Lupski decoded his own genome to find the mutations responsible for Charcot-Marie-Tooth neuropathy, a disease that causes atrophy of peripheral nerves. Lupski has spent his career trying to track down the genetic culprits of his disease.
The analysis, published 1 April in the New England Journal of Medicine, reveals that Lupski carries more than one million single-letter variations in regions that code for genes [2]. Two of these mutations appear on the same gene, SH3TC2, which several studies had already linked to Charcot-Marie-Tooth disease [3].
Lupski’s team sequenced the gene in nine members of his family and found that three of his siblings, who have the disease, also carry both mutations. Four healthy siblings and his two healthy parents carry one of the mutations or neither.
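The family pattern described above amounts to a segregation check: everyone with the disease should carry both mutations, and every healthy relative should carry one or neither. A minimal Python sketch of that check follows. The names `variant_a` and `variant_b`, the person labels, and the split of mutations among healthy relatives are hypothetical placeholders, not data from the study.

```python
# Hypothetical illustration only: which of two mutations each relative carries.
# The individual assignments below are invented to match the pattern in the text.
BOTH = {"variant_a", "variant_b"}

family = {
    "proband": set(BOTH),
    "affected_sibling_1": set(BOTH),
    "affected_sibling_2": set(BOTH),
    "affected_sibling_3": set(BOTH),
    "healthy_sibling_1": {"variant_a"},
    "healthy_sibling_2": {"variant_b"},
    "healthy_sibling_3": set(),
    "healthy_sibling_4": {"variant_a"},
    "mother": {"variant_a"},
    "father": {"variant_b"},
}
affected = {"proband", "affected_sibling_1", "affected_sibling_2", "affected_sibling_3"}


def segregates_with_disease(carriers, affected_people):
    """True if carrying both variants coincides exactly with having the disease."""
    for person, variants in carriers.items():
        has_both = BOTH <= variants  # subset test: both variants present
        if has_both != (person in affected_people):
            return False
    return True


print(segregates_with_disease(family, affected))
```

A single healthy relative carrying both mutations would make the check fail, which is why sequencing the wider family adds evidence beyond the one genome.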
These rare diseases are ideally suited for this kind of analysis because they arise from inherited, highly penetrant variants that almost always cause disease.
Similar searches for autism genes are not likely to be as successful, at least at first, because autism doesn’t have a simple inheritance pattern. Most variants linked to the disorder so far have not proven to be highly penetrant.
“[Autism sequencing studies] will be harder to interpret and understand, but obviously the first step is getting all the variation,” Zwick says.
He adds that autism research would greatly benefit from family sequencing in particular, because it would allow researchers to figure out which mutations are inherited and which arise spontaneously.
The first efforts to find disease-causing genes, in the 1980s, relied on linkage studies, which screen families with a history of a particular disease [4]. By looking for genetic markers that appear in affected family members but not in healthy ones, researchers pinpoint chromosomal regions involved in disease. Linkage studies work best, however, for rare, single-gene disorders.
The next wave of research used the genome-wide association approach, which looks for variants that are more common in individuals with a disease. Because there is so much variability in the genome, these studies typically require thousands of cases and controls to meet statistical significance. Some studies turned up dozens of variants that increase the risk of various diseases, from cancer and diabetes to schizophrenia and autism.
Healthy people also carry these variations, however, so they are seldom the main cause of any disease.
In contrast, whole-genome sequencing, which involves decoding all of an individual’s three billion DNA base pairs, has the potential to turn up rare variants that usually or always cause disease. This technology is developing rapidly: The first human genomes in 2001 cost upwards of $3 billion each to decode [5,6]. Today, the wholesale cost is down to about $4,400, and falling [7].
Still, there are technical hurdles that limit whole-genome studies. For example, sequencing machines occasionally misread a string of code, perhaps mistaking a G base for an A.
“There are many different reasons why errors can occur — from the level of the fundamental chemistry to reading out the data,” says Leroy Hood, lead investigator of the Science study, and a pioneer of sequencing technology.
Hood’s team identified roughly 70 percent of those sequencing errors by looking at inheritance patterns in the family of four. This is thanks to simple Mendelian genetics: at every spot in the genome, an individual inherits one allele from the mother and one from the father. If a child carries a combination of alleles that could not possibly have been inherited, it’s sure to be either a spontaneous mutation or a sequencing error.
The typical error rate today is roughly 1,000 times higher than the rate of new mutations. “If you look at what the parents have and say it’s impossible for the child to have this, then it’s probably an error,” says Hood, president of the Institute for Systems Biology in Seattle.
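The Mendelian check Hood describes can be sketched in a few lines of Python. This is an illustration of the general idea, not the team's actual pipeline; the function names and the two-letter genotype strings are assumptions made for the example. The second function takes the 1,000-to-1 ratio quoted above at face value and assumes that errors and new mutations are equally likely to produce an impossible genotype, which is a simplification.

```python
from itertools import product


def mendelian_consistent(mother, father, child):
    """Genotypes are two-letter strings such as 'AG'.

    Returns True if the child's genotype can be formed by taking
    one allele from the mother and one from the father.
    """
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


def share_due_to_error(error_to_mutation_ratio):
    """Fraction of impossible genotypes expected to be sequencing errors,
    under the simplifying assumption described in the lead-in."""
    return error_to_mutation_ratio / (error_to_mutation_ratio + 1)


# Mother AG, father GG: the child can be AG or GG, but never AA.
print(mendelian_consistent("AG", "GG", "AG"))  # consistent with inheritance
print(mendelian_consistent("AG", "GG", "AA"))  # flagged: error or new mutation
print(share_due_to_error(1000))
```

Under that assumption, nearly all flagged sites would be errors rather than spontaneous mutations, which matches Hood's remark that an impossible genotype is "probably an error."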
An enormous challenge in whole-genome sequencing is interpreting the massive glut of data. Each of us carries large numbers of mutations that are unique and do not cause disease. When scientists analyze a genomic read-out, they have trouble spotting which mutations confer risk of disease and which are benign variants.
Studying family members helps clear this hurdle. The genomes of family members tend to be similar, so when someone with a disease has a variant that a healthy sibling lacks, it’s more likely to be linked to disease.
Hood’s team found, for instance, that sequencing just one of the children in the family resulted in a pool of thousands of candidate genes. Analyzing both affected children narrowed that list to 34 candidates, and adding in the parents cut it down to just 4.
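The narrowing effect of adding relatives can be illustrated with simple set operations. The sketch below assumes a recessive model, in which each affected child has two damaged copies of the causal gene and each healthy parent has exactly one. That model is an assumption chosen for illustration and is not stated in the article; the gene names and counts are invented toy data.

```python
# Toy data: person -> {gene: number of damaged copies (0, 1 or 2)}.
# Gene names and counts are hypothetical.
hits = {
    "son":      {"GENE_A": 2, "GENE_B": 2, "GENE_C": 2, "GENE_D": 2},
    "daughter": {"GENE_A": 2, "GENE_B": 2, "GENE_C": 2, "GENE_E": 2},
    "mother":   {"GENE_A": 1, "GENE_B": 2, "GENE_C": 1},
    "father":   {"GENE_A": 1, "GENE_B": 1, "GENE_C": 0},
}


def two_hit_genes(data, person):
    """Genes in which this person has both copies damaged."""
    return {gene for gene, copies in data[person].items() if copies == 2}


# Step 1: one affected child alone.
one_child = two_hit_genes(hits, "son")

# Step 2: keep only genes hit twice in both affected children.
both_children = one_child & two_hit_genes(hits, "daughter")

# Step 3: require each healthy parent to carry exactly one damaged copy.
with_parents = {
    gene
    for gene in both_children
    if all(hits[parent].get(gene, 0) == 1 for parent in ("mother", "father"))
}

print(len(one_child), len(both_children), len(with_parents))
```

In this toy family, GENE_B drops out because the healthy mother has two damaged copies, and GENE_C drops out because the father has none, so the children's genotype could not have been inherited. The same logic applied genome-wide gives the kind of shrinking candidate list the team reports.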
“Whether this [approach] will work effectively on the more complicated, multi-gene diseases remains to be seen — though we are certainly going to look at that,” Hood says.
Autism geneticists are delving into cheaper, scaled-down projects that may produce data that are easier to interpret.
For instance, Huda Zoghbi’s laboratory at Baylor College of Medicine is focusing on the ‘autism interactome’, a collection of about 800 genes that are either known to be involved in autism or interact with those that are. The team is sequencing these genes to see whether mutations in them are more likely to crop up in individuals with autism than in family members or healthy controls.
Meanwhile, geneticists Matthew State of Yale University and Evan Eichler at the University of Washington are sequencing the exome, the one percent of the genome that codes for proteins, in individuals from the Simons Simplex Collection, a gene bank of families with autism. Compared with whole-genome sequencing, whole-exome sequencing is a bargain at about $2,000 per person.
Beyond the savings, State says exome searches are particularly suited for this group. The collection has samples from simplex families, which have only one child with autism, meaning that the samples are more likely to contain spontaneous variants not inherited from healthy parents. Because spontaneous variants tend to have larger effects, they are more likely to cause the dysfunction of a protein-coding gene.
“[Whole-exome sequencing] is an extremely efficient way of going after that pool of variation,” State says. What’s more, he notes, variants found in the exome are more easily translated into mouse models.
Sequencing approaches will help determine how these autism risk variants are transmitted in families, and why people carrying them display such a wide variety of symptoms.
“Knowing the complete variation within the genome is the ultimate — it’s every geneticist’s ideal,” State says. But because of the complexity involved in autism genetics, he adds, “it’s going to be tremendously important to look at families.”
1. Roach J.C. et al. Science 328, 636-639 (2010)
2. Lupski J.R. et al. New Engl. J. Med. 362, 1181-1191 (2010)
3. Azzedine H. et al. Neurology 67, 602-606 (2006)
4. Tsui L.C. et al. Science 230, 1054-1057 (1985)
5. Venter J.C. et al. Science 291, 1304-1351 (2001)
6. Lander E.S. et al. Nature 409, 860-921 (2001)
7. Drmanac R. et al. Science 327, 78-81 (2010)