Is the $1000 genome in sight? The answer to that question could come in early 2013, when the Archon Genomics X-prize competition is held to see what technology can deliver the sequences of 100 genomes in 30 days at an accuracy of one error in 1,000,000 bases with 98% completeness. The reward for these efforts: $10 million and the knowledge that you have completed arguably one of the most widely cited challenges in genomics today, and finally eliminated what has become a common tag line for many a news story. But in the meantime, if 2011 was any indication, new methods and techniques to improve sequencing and expand the range of sequencing applications still seem to be on the rise. And it’s these improvements and adaptations that the editors of BioTechniques are celebrating with our end-of-year list of the top sequencing articles published in the journal in 2011.
Moving from upstream preparation methods to applications, for many researchers the real value of next-generation sequencing platforms is how they expand the scope of their studies. More genomes can be sequenced at a faster pace, a greater number of transcripts can be analyzed in a single instrument run, or, in this particular case, an expanded number and range of STR loci can be processed faster for forensic investigations. In an article published in the August 2011 issue of BioTechniques, Sarah Fordyce and her colleagues demonstrated that using next-generation sequencing to profile STR loci not only generates data comparable to conventional approaches (PCR-based fragment analysis, for example), but can identify base substitution variations and repeats that would be missed by the other methods. In combination with a bioinformatic tool the authors created to assess sequence lengths and frequencies, this new way of profiling STRs is sure to extend the range of forensic testing in the future.
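To make the idea concrete, here is a minimal sketch of how reads covering an STR locus might be tallied by both length and sequence. It is not the tool the authors created; the function names, the flanking sequences, and the exact-match extraction are all assumptions made for illustration.

```python
from collections import Counter

def tally_str_alleles(reads, left_flank, right_flank):
    """Count the sequences found between two flanks (hypothetical helper).

    Assumes exact flank matches on the forward strand; a real pipeline
    would tolerate sequencing errors and handle both strands.
    """
    counts = Counter()
    for read in reads:
        start = read.find(left_flank)
        if start == -1:
            continue
        start += len(left_flank)
        end = read.find(right_flank, start)
        if end == -1:
            continue
        counts[read[start:end]] += 1
    return counts

def summarize(counts, motif_len):
    """Return (length, repeat units, sequence, reads, frequency) rows."""
    total = sum(counts.values())
    return [
        (len(seq), len(seq) / motif_len, seq, n, n / total)
        for seq, n in counts.most_common()
    ]

# Illustrative reads only: two alleles share a length but differ by one base.
reads = [
    "CCACGT" + "GATA" * 3 + "TTGACC",
    "CCACGT" + "GATA" * 3 + "TTGACC",
    "CCACGT" + "GATAGACAGATA" + "TTGACC",
    "CCACGT" + "GATA" * 4 + "TTGACC",
]
for row in summarize(tally_str_alleles(reads, "ACGT", "TTGA"), motif_len=4):
    print(row)
```

In this toy data, two of the alleles are both 12 bases long, so a length-only readout would merge them, while the sequence-level tally keeps them apart.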
There is no questioning the power of next-generation sequencing, but with power comes the responsibility to use it wisely. Sequencing amplicons to high depth with next-generation sequencing platforms can lead to the identification of minor variants, nucleotide changes that are typically found at low frequencies. But because polymerase-based amplification can introduce bias, there is a need to identify optimized amplification conditions and parameters that decrease the chance of introducing artifactual mutations that could be mistaken for true minor variants. In a Perspective article published in the September 2011 issue of BioTechniques, Jeroen Aerssens and his colleagues detailed optimal primer design metrics, amplification conditions, and polymerases to avoid artifactual mutations. Using several case studies, the authors specify the pitfalls of sequencing amplicons, information that should prove essential to all researchers working with next-generation sequencing systems.
In February 2011, we published a Benchmark article from Harold Smith, a researcher at the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health, describing a new bioinformatic approach to locating insertion sites within next-generation sequencing data. In species such as Drosophila, locating active mobile genetic elements, or transposons, and their sites of insertion can be a tremendous challenge, especially when using a reference genome sequence. Smith’s computational method relies on split-end alignments using two ends of a sequence from a short standard sequencing library, a reference genome and a transposon-specific reference. The biggest advantage of this approach, as Smith notes in the article, is that no new software is needed and current alignment platforms can be adapted to this analysis without the need for major modifications – thus providing a simple solution to a complicated problem.
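The logic of a split-end search can be sketched in a few lines. The toy below is not Smith's pipeline: it substitutes exact string matching for a real aligner, and the function name, the `end_len` parameter, and the sequences are hypothetical. It also ignores mismatches, reverse-strand reads, and target-site duplications.

```python
from collections import Counter

def find_insertion_junctions(reads, genome, transposon, end_len=8):
    """Toy split-end search (assumed, simplified): one end of a read must
    match the genome and the other end must match the transposon."""
    junctions = Counter()
    for read in reads:
        if len(read) < 2 * end_len:
            continue
        head, tail = read[:end_len], read[-end_len:]

        # Case 1: the read runs from the genome into the transposon.
        pos = genome.find(head)
        if pos != -1 and tail in transposon:
            i = end_len
            # Extend the genomic match to the right until it breaks.
            while (i < len(read) and pos + i < len(genome)
                   and read[i] == genome[pos + i]):
                i += 1
            junctions[pos + i] += 1
            continue

        # Case 2: the read runs from the transposon into the genome.
        pos = genome.find(tail)
        if pos != -1 and head in transposon:
            j, k = len(read) - end_len, pos
            # Extend the genomic match to the left until it breaks.
            while j > 0 and k > 0 and read[j - 1] == genome[k - 1]:
                j -= 1
                k -= 1
            junctions[k] += 1
    return junctions

# Illustrative sequences only; the insertion sits before genome index 15.
genome = "ACGTACCGTTAGGCTAACGGATCCTTGACA"
transposon = "CATGATGAAATAACATAAGGTGGTCCCGTCGG"
reads = [
    genome[5:15] + transposon[:10],    # spans the left junction
    transposon[-10:] + genome[15:25],  # spans the right junction
    genome[0:20],                      # purely genomic, should be ignored
]
print(find_insertion_junctions(reads, genome, transposon))
```

Reads that support the same junction coordinate accumulate in the counter, so well-supported insertion sites stand out from isolated spurious matches.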
No article published in 2011 better demonstrates the expanding number of applications exploiting next-generation sequencing platforms than this December 2011 Report by Jonathan Scolnick and his colleagues. Finding aptamers has traditionally required numerous rounds of selection and optimization to identify high-affinity binders through the systematic evolution of ligands by exponential enrichment (SELEX) method. But here, the authors developed a sequencing-based method where, following an initial round of selection using a 33-mer oligonucleotide library, all binding oligonucleotides were sequenced to locate those sequences that were enriched. The sequencing depth of next-generation platforms enabled Scolnick and colleagues to find shorter stretches of sequence that bind strongly (those strongly enriched in sequence reads) and thereby design efficient aptamers more quickly than with the traditional SELEX approach.
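One way to picture the enrichment step is to count short subsequences in the reads from the bound pool and compare them with a background pool. The sketch below is an assumption-laden illustration, not the authors' published analysis: the function names, the pseudocount smoothing, the choice of k, and the sequences are all hypothetical.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every length-k subsequence across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def rank_enriched(selected, background, k, pseudocount=1.0):
    """Rank k-mers by (frequency in selected pool) / (frequency in background).

    The pseudocount is a simple smoothing assumption so that k-mers
    absent from the background do not cause division by zero.
    """
    sel = kmer_counts(selected, k)
    bg = kmer_counts(background, k)
    sel_total = sum(sel.values())
    bg_total = sum(bg.values())
    scores = {}
    for kmer, n in sel.items():
        f_sel = (n + pseudocount) / (sel_total + pseudocount)
        f_bg = (bg[kmer] + pseudocount) / (bg_total + pseudocount)
        scores[kmer] = f_sel / f_bg
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative reads only: a shared core is embedded in each bound sequence.
selected = ["AAGGTTGGCC", "CCGGTTGGAA", "TAGGTTGGAT"]
background = ["ACGTACGTAC", "TTGACCATGA", "GGCATCAGTC"]
for kmer, score in rank_enriched(selected, background, k=6)[:3]:
    print(kmer, round(score, 2))
```

A k-mer shared by many bound reads rises to the top of the ranking, which mirrors the idea of spotting short, strongly enriched binding sequences within longer library members.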
Our selection of DNA sequencing articles for 2011 provides a clear indication that sequencing is not just about whole genome assembly anymore; scientists are now finding novel applications for next-generation sequencing platforms in everything from affinity binder selection to forensic analysis. In the end, generating that $1000 genome will definitely be a watershed moment for genomics, but the applications and approaches developed on the road to the $1000 genome may prove to be worth just as much.