²Chemical and Nuclear Engineering, University of New Mexico, Albuquerque, NM, USA
³Cancer Research and Treatment Center, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
Materials and methods
We have developed a highly sensitive single-molecule clonal amplification method called dual primer emulsion PCR (DPePCR) for next-generation DNA sequencing. The approach is similar in concept to standard emulsion PCR; however, in DPePCR both primers are attached to the beads, so after PCR amplification both strands of the PCR products are attached to the beads. The ends of each strand can be freed for analysis by restriction digestion of the bridged PCR fragments, which allows efficient paired-end sequencing of fragment libraries.
The concept of studying single molecules in high throughput using emulsions is not new and has been used for single-molecule evaluation for a number of years (1) before the approach was first applied to next-generation sequencing (2). However, for emulsion technology to be applicable to next-generation sequencing, a method for capturing the contents from the droplets is required. To accomplish this, Dressman et al. (3) described an approach to amplify single DNA molecules onto beads for detection and enumeration of genetic variation. They named their technology BEAMing, for “beads, emulsion, amplification, and magnetics.” This approach has now evolved to emulsion PCR (ePCR). The DNA bound to beads generated from ePCR provides an excellent template for high-throughput sequencing, because PCR can amplify a single molecule of DNA into many clonal molecules per bead (2). Due to that fact, a number of next-generation sequencing approaches utilize emulsions and beads for DNA amplification prior to sequencing (4,5,6,7,8).
An alternative approach to amplify DNA for next-generation sequencing is the bridge amplification strategy, used by Illumina and originally reported by Bing et al. (www.promega.com/geneticidproc/ussymp7proc/0726.html). Bridge amplification is a technology that uses a single aqueous compartment; however, the individual amplicons are constrained by primers bound to a solid phase that are extended and amplified (9). As the name implies, the extension product from one primer forms a bridge to the other primer.
In this manuscript, we describe a novel approach called dual primer emulsion PCR (DPePCR), which combines concepts from both emulsion PCR and bridge amplification for the generation of simple fragment libraries for paired-end next-generation sequencing. The DPePCR strategy can amplify short DNA fragments (less than ~300 bp, including genome fragment and primers) and enables sequencing of both ends of a DNA fragment. This effectively shortens library preparation time, increases the library complexity (2) when compared with the construction of a mate-paired DNA library, and will contribute to the $1,000-genome goal.
Normal ePCR has been extensively applied to next-generation DNA sequencing (10). Most of the next-generation sequencing approaches are restricted to short read lengths (5), and therefore to optimize resequencing of a human genome, mate-paired or paired-end sequencing is important. However, the construction of mate-paired libraries for next-generation sequencing is difficult and time-consuming (10). Fortunately, fragment library construction is quite simple, and the described DPePCR strategy (Figure 1) enables paired-end sequencing of fragment libraries with essentially no modification of existing sequencing approaches.
To perform DPePCR, both forward and reverse primers are attached to 1-µm beads (Figure 1B) (see Supplementary Materials) that are included in a modified ePCR protocol (10). Additionally, since the amplicons are confined to the droplets, the amplification efficiency is increased by including free primers in the aqueous phase (Figure 1C). After 120 PCR cycles, a single DNA fragment in an emulsion drop can be amplified effectively. After amplification, we have found the DNA to be highly stable in the double-stranded state. Denaturing conditions will denature the DNA; however, since both strands are present on the bead, the double-stranded state immediately reforms, which prevents sequencing of the DNA. To overcome this issue, type IIs restriction enzyme recognition sites (i.e., BceAI and AcuI) were placed at the ends of the amplicons, immediately adjacent to the unknown sequence, during library construction (Figure 1A). The DPePCR product can then be digested with the corresponding restriction enzymes (i.e., BceAI and AcuI), and capping adaptors are ligated to the free end of the dsDNA (Figure 1D). This provides the ability to sequence from both strands of the DPePCR product using standard sequencing by ligation (SBL) (Figure 1D) (2).
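The effect of a type IIs digest on a bead-bound strand can be illustrated with a minimal sketch. It assumes a single recognition site, models only the top-strand cut, and ignores the staggered cut on the complementary strand. The function name `type_iis_digest`, the example site `CTGAAG`, and the cut offset of 16 bases are illustrative assumptions, not values taken from the protocol; actual recognition sequences and cut positions should be checked against the enzyme supplier's documentation.

```python
def type_iis_digest(strand, site, cut_offset):
    """Split a strand at a fixed distance downstream of a recognition site.

    strand     -- 5'->3' sequence, bead-attached end first (hypothetical layout)
    site       -- recognition sequence of the type IIs enzyme (assumed value)
    cut_offset -- number of bases between the end of the site and the cut
                  on this strand (assumed value)

    Returns (retained, released): the part still attached to the bead and
    the part that is freed by the digest.
    """
    idx = strand.find(site)
    if idx == -1:
        raise ValueError("recognition site not found in strand")
    cut = idx + len(site) + cut_offset
    if cut > len(strand):
        raise ValueError("cut position lies beyond the end of the strand")
    return strand[:cut], strand[cut:]


# Toy strand: a short primer carrying the site, followed by 20 unknown bases.
primer = "GGTT" + "CTGAAG"
insert = "ACGTACGTACGTACGTACGT"
retained, released = type_iis_digest(primer + insert, "CTGAAG", 16)

# The retained strand ends with a tag of cut_offset bases of the insert.
tag = retained[len(primer):]
```

Because the enzyme cuts at a fixed distance from its site, the length of the tag that remains on the bead is set by the enzyme rather than by the size of the library fragment.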
The SBL sequencing strategy for DPePCR beads is identical to sequencing from standard ePCR beads (Figure 1D). The difference is that since there are two paired-end fragments, both ends can be sequenced independently from both the 3′→5′ and 5′→3′ directions using four different anchor primers (Figure 1D).
To validate the formation of strong double-stranded DPePCR product, the beads (following DPePCR) were treated with 0.1 M NaOH (without restriction enzyme digestion). After denaturing, a Cy5-labeled oligo (Cy5-labeled FDV2-PM; see Supplementary Material) was annealed to the bead-bound DNA fragment (Supplementary Material). The results indicated that the Cy5-oligo could not hybridize to the DPePCR product (data not shown), which suggests that the double-stranded DPePCR product had formed in a bridged conformation. Following restriction enzyme digestion, we were able to anneal the anchor primers (Supplementary Material) and successfully sequence the DNA on the beads from both ends of the fragment.
Theoretically, the DNA fragment on each bead arose from a single molecule in an emulsion and should therefore be clonal, which is the critical requirement for the DPePCR. To validate the clonal amplification, we used DPePCR and SBL in the G.007 Polonator (Westborough, MA, USA) to sequence a Streptococcus pyogenes genome fragment library; when sequenced, the clonally amplified beads showed a single color during each SBL cycle (Figure 2, A and B), which indicated the beads were clonal. Additionally, a random sampling of ~106 reads was mapped to the S. pyogenes genome (AE014074) to ensure we were sequencing paired ends. The reads were NNNNNNN from the forward primer site, and NNNNNNN(NN)NNNNNN from the reverse priming site [note that the (NN) bases were not sequenced]. To map to the S. pyogenes genome, the reverse priming site sequence was first mapped (to both strands of the genome) and then the forward priming site sequence was mapped to the opposite strand of the genome. Roughly 25% of the reads mapped to the genome, and the average separation between the paired ends in the genome was found to be ~100 bases (Figure 2C); this is consistent with the size range that was selected in the gel purification step of the library preparation (Supplementary Material).
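The two-step mapping described above can be sketched as follows. This is a minimal illustration and not the pipeline used for the analysis: it assumes exact matching, treats the two unsequenced bases of the reverse read as wildcards, defines the separation as the number of genome bases lying between the two tags, and uses a hypothetical upper bound `max_sep` of 300 bases. All function names and the toy genome are invented for the example.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hits(genome, pattern, length):
    """Return (strand, start) for every match on either strand.

    start is always the leftmost coordinate on the forward strand.
    A lookahead is used so that overlapping matches are also reported.
    """
    hits = []
    for strand, text in (("+", genome), ("-", revcomp(genome))):
        for m in re.finditer("(?=" + pattern + ")", text):
            if strand == "+":
                start = m.start()
            else:
                start = len(genome) - m.start() - length
            hits.append((strand, start))
    return hits


def map_pair(genome, fwd_tag, rev_head, rev_tail, gap=2, max_sep=300):
    """Map the reverse read first, then the forward read on the opposite strand."""
    rev_pattern = re.escape(rev_head) + "." * gap + re.escape(rev_tail)
    rev_len = len(rev_head) + gap + len(rev_tail)
    fwd_hits = find_hits(genome, re.escape(fwd_tag), len(fwd_tag))
    placements = []
    for rev_strand, rev_start in find_hits(genome, rev_pattern, rev_len):
        for fwd_strand, fwd_start in fwd_hits:
            if fwd_strand == rev_strand:
                continue  # the forward tag must lie on the opposite strand
            left_end = min(fwd_start + len(fwd_tag), rev_start + rev_len)
            right_start = max(fwd_start, rev_start)
            separation = right_start - left_end
            if 0 <= separation <= max_sep:
                placements.append((rev_strand, rev_start, fwd_start, separation))
    return placements


# Toy genome: forward tag on the + strand, reverse read on the - strand.
genome = "TTTTT" + "ACGTACG" + "C" * 20 + revcomp("GATTACAGGCCTTAA") + "TTTTT"
pairs = map_pair(genome, "ACGTACG", "GATTACA", "CCTTAA")
```

Each returned tuple holds the strand and start of the reverse tag, the start of the forward tag, and the separation between them; collecting the separations over many read pairs gives a distribution of the kind shown in Figure 2C.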
Additionally, as a final validation for the approach, we measured the library complexity. The complexity of the library is defined as the diversity of DNA molecules that are sequenced. For example, when a sequencing library is prepared and subjected to extensive amplification (PCR or rolling circle amplification) the complexity is reduced. The fragment library was found to have high complexity, with >99% of the reads being unique, thus improving upon the traditional mate-pair library production protocol (2,10).
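One simple way to express this measure is the fraction of sequenced read pairs that are distinct. The sketch below assumes that definition; the exact metric used for the reported value is not specified here, and the function name `library_complexity` and the example reads are hypothetical.

```python
def library_complexity(read_pairs):
    """Fraction of read pairs that are distinct.

    read_pairs -- iterable of (forward_tag, reverse_tag) tuples
    """
    read_pairs = list(read_pairs)
    if not read_pairs:
        raise ValueError("no reads supplied")
    return len(set(read_pairs)) / len(read_pairs)


# Toy example: four read pairs, one of which is a duplicate.
example_reads = [
    ("ACGTACG", "GATTACACCTTAA"),
    ("TTGACCA", "CCGGATATTGACA"),
    ("ACGTACG", "GATTACACCTTAA"),
    ("GGCATTC", "AACCGGTTTACGA"),
]
complexity = library_complexity(example_reads)
```

A library that has been over-amplified before sequencing would contain many duplicate pairs and give a value well below 1.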
In conclusion, we have presented DPePCR, an efficient emulsion PCR strategy for paired-end next-generation sequencing. The DPePCR concept will be applicable to any next-generation sequencing platform using emulsion PCR.
This work was supported by the National Institutes of Health and the National Human Genome Research Institute (NIH/NHGRI $1,000 Genome grant no. R21HG004350, to J.S.E.). This paper is subject to the NIH Public Access Policy.
The authors declare no competing interests.
Address correspondence to Jeremy S. Edwards, University of New Mexico, Molecular Genetics and Microbiology, 915 Camino de Salud NE MSC08 4660, Albuquerque, NM 87131, USA. e-mail: [email protected]