Finding the true $1000 genome
Jeffrey Perkel, Ph.D.

In early 2012, Illumina released the HiSeq 2500, an update of its HiSeq 2000. The HiSeq 2500 can generate roughly enough data for a 30x genome (~120 Gb) in 24 hours. Recently, Stephen Kingsmore of the Center for Pediatric Genomic Medicine at Children's Mercy Hospital in Kansas City, MO, published a paper demonstrating that this instrument can be coupled with some clever bioinformatics processing to perform pediatric genomic diagnostics in just over two days.

The technique, called STAT-Seq, costs $13,500 and returns results in 50 hours. That includes 4.5 hours for sample preparation, 26 hours for sequencing, and 15 hours for base calling, sequence alignment, and variant calling. The remaining few hours go to data interpretation.
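The time budget quoted above can be checked with simple arithmetic. The short Python sketch below uses only the figures given in the text; the variable names are illustrative and do not come from the STAT-Seq paper.

```python
# Stage durations in hours, as quoted in the article.
stages = {
    "sample preparation": 4.5,
    "sequencing": 26.0,
    "base calling, alignment, and variant calling": 15.0,
}

# Total turnaround quoted in the article.
TOTAL_HOURS = 50.0

# Hours explicitly accounted for by the three named stages.
accounted = sum(stages.values())

# Whatever is left over is the time available for data interpretation.
interpretation = TOTAL_HOURS - accounted

print(f"Named stages: {accounted} h")
print(f"Left for interpretation: {interpretation} h")
```

On these numbers, the three named stages take 45.5 hours, which leaves about 4.5 hours for interpretation.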

Following sequencing, a home-built software pipeline filters the ~4 million genetic variants found in each genome by their location in protein-coding genes, allele frequency, protein class, disease likelihood, and so on. These data are then passed to a tool called SSAGA.
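To make the filtering step concrete, here is a minimal Python sketch of the idea. It is not the Children's Mercy pipeline: the article does not give the actual criteria or cutoffs, so the `Variant` fields, the 1% allele-frequency threshold, and the gene names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A hypothetical, heavily simplified variant record."""
    gene: str
    in_coding_gene: bool       # does the variant fall in a protein-coding gene?
    allele_frequency: float    # population frequency, 0.0 to 1.0
    predicted_damaging: bool   # stand-in for a disease-likelihood call


# Assumed cutoff: keep only rare variants. The real threshold is not stated.
MAX_ALLELE_FREQUENCY = 0.01


def filter_variants(variants, max_af=MAX_ALLELE_FREQUENCY):
    """Keep variants that pass every filter; drop the rest."""
    return [
        v for v in variants
        if v.in_coding_gene
        and v.allele_frequency <= max_af
        and v.predicted_damaging
    ]


# Placeholder data; the gene names are invented.
variants = [
    Variant("GENE_A", True, 0.001, True),    # rare, coding, damaging: kept
    Variant("GENE_B", True, 0.200, True),    # too common: dropped
    Variant("GENE_C", False, 0.001, True),   # not in a coding gene: dropped
    Variant("GENE_D", True, 0.002, False),   # not predicted damaging: dropped
]

candidates = filter_variants(variants)
print([v.gene for v in candidates])
```

Each filter is cheap on its own; the point is that applying them in combination shrinks millions of variants to a list small enough for a person to review.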

SSAGA “is a new clinicopathological correlation tool that maps the clinical features of 591 well established recessive genetic diseases with pediatric presentations to corresponding phenotypes and genes known to cause the symptoms,” according to the STAT-Seq publication (2). In short, says Emily Farrow, a research scientist at Children's Mercy Hospital and coauthor of the paper, clinicians input a patient's clinical symptoms so the software can narrow its search field. “We're filtering out the noise, so to speak, and really narrowing down the area that we have to look at,” she explains.
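The narrowing Farrow describes can be pictured as a lookup from clinical features to candidate genes. The Python sketch below is a toy illustration under stated assumptions, not SSAGA itself: the disease table, feature names, gene names, and the "any shared feature" matching rule are all hypothetical.

```python
# Hypothetical disease table: each entry links clinical features to genes.
# The real tool covers 591 recessive diseases; these three are invented.
DISEASE_TABLE = {
    "disease_A": {"features": {"seizures", "hypotonia"}, "genes": {"GENE_A"}},
    "disease_B": {"features": {"jaundice"}, "genes": {"GENE_B", "GENE_C"}},
    "disease_C": {"features": {"hypotonia", "cardiomyopathy"}, "genes": {"GENE_D"}},
}


def candidate_genes(patient_features, table=DISEASE_TABLE):
    """Return genes for every disease sharing at least one feature with the patient."""
    genes = set()
    for entry in table.values():
        if entry["features"] & patient_features:  # non-empty set intersection
            genes |= entry["genes"]
    return genes


def narrow_variants(variant_genes, patient_features):
    """Keep only the variant genes that fall inside the candidate set."""
    wanted = candidate_genes(patient_features)
    return [g for g in variant_genes if g in wanted]


# A clinician enters the patient's features; the search space shrinks accordingly.
search_space = candidate_genes({"hypotonia"})
kept = narrow_variants(["GENE_B", "GENE_D", "GENE_X"], {"hypotonia"})
print(sorted(search_space))
print(kept)
```

A production tool would presumably use a controlled vocabulary for symptoms and a ranking rather than a yes/no match; the sketch shows only why entering symptoms reduces the number of genes that need inspection.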

Farrow says her lab has now processed seven STAT-Seq families (22 samples total). By comparison, in 2012 her lab sequenced more than 1,200 exomes and some 500 targeted gene panels.

Researchers hoping to take advantage of processes like STAT-Seq can send their samples to Children's Mercy, or pass them to a more local core facility. In many cases, basic bioinformatics analyses no longer represent a financial or even intellectual bottleneck: cloud-based, automated pipelines like DNAnexus and Illumina's BaseSpace service have, in some sense, democratized the field. All of which means the power of the human genome will finally make its way to the people who paid for it. Not bad for 10 years' work, or even 60.

1.) 2013. “Another layer on reality.”

2.) Saunders, C.J. 2012. Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci. Transl. Med. 4:154ra135.
