For the past 15 years, doctors have heard that whole genome sequencing would soon transform medicine. Being able to pinpoint specific mutations and variations in a patient's DNA would identify diseases and the therapies needed to treat them. But sequencing a whole genome has remained too slow, the interpretation too complex, and the cost too high for routine use in the hospital setting.
The team applied the new sequencing protocol to seven babies in a neonatal intensive care unit at Children's Mercy Hospitals and Clinics. They showed that in just over two days, neonatologists could study a baby's symptoms, take a drop of blood, and have the DNA isolated and the whole genome sequenced and analyzed to make a rapid disease diagnosis.
"The significance of this is pretty straightforward. We can now consider whole genome sequencing to be relevant for hospital medicine," said Kingsmore in a press conference on the paper.
Need for Speed
Kingsmore's team chose to sequence the whole genomes of newborns because disease progression in infants is extremely rapid. Of the 4.1 million babies born in the U.S. every year, about one in 20 is admitted to a neonatal intensive care unit. At Children’s Mercy Hospital, the leading cause for admission is an illness that is likely genetic.
Doctors know of more than 3500 genetic diseases, most of them childhood diseases, that are caused by a mutation in a single gene. Because of the large number, it can be hard for doctors to pick the correct gene to test using standard genome sequencing. Often babies die, or get better and are discharged, before the results of a gene test are returned.
"The waiting might not be the hardest part for families poised to receive a diagnosis in neonatal intensive care units, but it can be destructive nonetheless. While they wait on pins and needles for their newborn baby’s diagnosis, parents anguish, nurture false hope, wrestle with feelings of guilt, and all the while treatment and counseling are delayed," says Kelly Lamarco, senior editor of Science Translational Medicine.
To reduce sequencing time, the clinicians at Children's Mercy teamed with Illumina scientists, who tweaked the DNA preparation chemistry and the flow cell design and imaging speed on one of their commercial sequencers. Meanwhile, bioinformatics specialists at the Center for Pediatric Genomic Medicine at Children’s Mercy created two new software programs to analyze genomes in a fast and medically meaningful way.
On the sequencing side, the biggest changes were in preparing the DNA library and the way the processed DNA moved through and was imaged in Illumina's Hi-Seq 2500 sequencer, says Joel Fellis, market manager for sequencing systems for the company and a co-author of the paper.
Preparing a DNA library requires that highly trained and qualified technicians prepare samples, which may take 12 hours for one batch (2). Automating part of the process can cut library preparation time (3), so the Illumina technicians took 500 nanograms of DNA from each baby's sample and prepared it by standard methods of shearing, repairing, tailing, and ligation. Instead of PCR amplification, the team purified the sequencing libraries using a bead-based machine from Beckman Coulter. They then quantified the samples with real-time PCR. The entire process took about 4.5 hours, rather than the standard 16 hours.
"Part of the quick speed is fast chemistry," says Fellis. Once the material was ready, the team loaded it into their sequencer. Based on hardware changes and improved onboard chemistry, the instrument forces flow cells to process templates faster. Fluorescently tagged DNA clusters also appear brighter and the imaging area is smaller. Based on the changes, the sequencer generated up to 140 gigabases of sequence in less than 30 hours.
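As a rough back-of-the-envelope check, the throughput figure above can be converted into average genome coverage and an hourly output rate. The sketch below assumes a human genome size of about 3.1 gigabases, a figure that does not come from the article, and it treats 140 gigabases and 30 hours as exact values even though the article gives them as upper bounds. The function names are illustrative only.

```python
# Back-of-the-envelope throughput arithmetic; a sketch, not the authors' calculation.

GENOME_SIZE_GB = 3.1  # assumption: approximate human genome size in gigabases


def mean_coverage(yield_gb: float, genome_gb: float = GENOME_SIZE_GB) -> float:
    """Average number of times each base is read, given the total sequence yield."""
    return yield_gb / genome_gb


def throughput_gb_per_hour(yield_gb: float, run_hours: float) -> float:
    """Average sequence output per hour of instrument time."""
    return yield_gb / run_hours


coverage = mean_coverage(140)            # 140 gigabases, from the article
rate = throughput_gb_per_hour(140, 30)   # 30 hours, from the article

print(f"{coverage:.0f}x coverage, {rate:.1f} Gb/hour")
# prints "45x coverage, 4.7 Gb/hour"
```

Under these assumptions, a single run reads each position in the genome about 45 times on average, which is why one run can cover one whole genome.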
Not the End Game
Speed, of course, is part of the goal but "it's not the end game," says Fellis. Researchers and clinicians are asking for low-cost, high-speed, and high-accuracy sequencing in one system. At the beginning of 2012, Illumina, along with Ion Torrent, promised that by the end of the year their next-generation sequencing systems would cycle through all the sequencing chemistry and call bases in a single day. It was an exciting possibility then, and showing that it can be done is a big step forward, says Jeff Schloss, program director for technology development coordination at the National Human Genome Research Institute (NHGRI) in Washington, D.C.
It’s important to remember, though, that what is driving genome sequencing into the express lane is its drop in cost. "It's not cost versus rate to get sequencing done. The cost has come down, from millions of dollars to something on the order of 10,000. But if it weren’t for cost reductions, doing sequencing quickly would be silly," says Schloss.
Now that scientists have gotten the cost curve to drop, they are beginning to address other sequencing features, such as speed, quality, and how to analyze all the data from a genome. In whole genome analysis at hospitals, the biggest challenge can be picking which of the estimated 20,000–25,000 genes to focus on. In the case of the new 50-hour genome, Kingsmore's group developed two new software packages to focus the genome analysis on one or a few genes of interest.
The first program maps the symptoms of a set of recessive genetic diseases to the mutations in genes that may cause those symptoms. Doctors pick symptoms from a drop-down menu, and the selection focuses the analysis and interpretation of a patient’s sequencing results. Right now the software analyzes 595 genes, but Kingsmore's colleagues at Children's Mercy are expanding the system to include all 3500 known disease genes, many of which an ordering physician would not have heard of.
The second software program characterizes each of the four million variants found in a person’s genome and estimates their consequences for diseases. The team, and possibly others who adopt the system, can use the first software system to determine what genes to look at, and then use the second program to determine which genes look like they are causing problems. The two-step process, coupled with the genomic information, narrows down four million variants to a handful that a clinician can review and use to make a diagnosis.
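The two-step narrowing described above can be sketched in a few lines of Python. This is a toy illustration, not the Children's Mercy software: the gene names, the symptom table, the `predicted_damaging` flag, and the choice to take the union of genes across selected symptoms are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    predicted_damaging: bool  # stand-in for the second program's consequence estimate


# Hypothetical symptom-to-gene table; the real system is described as covering 595 genes.
SYMPTOM_TO_GENES = {
    "seizures": {"GENE_A", "GENE_B"},
    "liver failure": {"GENE_B", "GENE_C"},
}


def candidate_genes(symptoms):
    """Step 1: collect every gene linked to any selected symptom (union is an assumption)."""
    genes = set()
    for symptom in symptoms:
        genes |= SYMPTOM_TO_GENES.get(symptom, set())
    return genes


def shortlist(variants, symptoms):
    """Step 2: keep only predicted-damaging variants that fall in a candidate gene."""
    genes = candidate_genes(symptoms)
    return [v for v in variants if v.gene in genes and v.predicted_damaging]


variants = [
    Variant("GENE_A", "c.100A>G", False),  # in a candidate gene, but predicted benign
    Variant("GENE_B", "c.250del", True),   # in a candidate gene and predicted damaging
    Variant("GENE_D", "c.7C>T", True),     # damaging, but unrelated to the symptoms
]

hits = shortlist(variants, ["seizures"])
print([f"{v.gene}:{v.change}" for v in hits])
# prints "['GENE_B:c.250del']"
```

Running the symptom filter first keeps the expensive question, whether a variant is likely to cause disease, confined to a small set of genes, which is how millions of variants can shrink to a handful for review.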
Schloss, who runs NHGRI's technology development for the $1000 genome project, says the improvements in chemistry and analysis of next-generation sequencers are important for moving sequencing to the clinic. But in the future, he says, technology using direct DNA base reads, such as nanopore sequencers, will be the most accurate and give even higher quality genomic data.
In these sequencing designs, strands of DNA are pushed through biological or synthetic pores while the instrument records an electronic signal based on the changes in ionic current that accompany each nucleotide as it is incorporated during DNA polymerization (4). There is even a design, tapping into high-tech physics, that reads nucleotides based on the way electrons pair up, or tunnel, and move through each base as it crosses the pore. Oxford Nanopore, along with several other researchers and companies, is developing direct DNA base read techniques, which could be ready at the end of this year or early next year.
Because groups like Kingsmore's are now showing that sequencing in the clinic is possible, the sequencing express lane for even more clinical applications will likely drive innovation for faster, cheaper, and better genomic processing technology. And, as with any product that follows that philosophy, the sequencing tests will become routine at hospitals. They may one day even be as standard as carpool lanes and E-ZPass on the interstate.
1. Saunders, C., et al. 2012. Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci. Transl. Med. 4: 154ra135.
2. Farias-Hesson, E., et al. 2010. Semi-automated library preparation for high-throughput DNA sequencing platforms. J. Biomed. Biotechnol. 2010, Article ID 617469: 1–8.
3. Rohland, N., and Reich, D. 2012. Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res. 22: 939–946.
4. Niedringhaus, T., et al. 2011. Landscape of next-generation sequencing technologies. Anal. Chem. 83: 4327–4341.