Reducing the impact of PCR-mediated recombination in molecular evolution and environmental studies using a new-generation high-fidelity DNA polymerase
Rates of recombination obtained for Phusion are highly variable across conditions. Previous research has proposed that enzymes with higher processivity yield more chimeric sequences (26). This is confirmed under high-cycle, high-template-concentration conditions, with an average yield of 71% chimeras (Figure 2). Surprisingly, however, the same enzyme yields no chimeras when the initial template concentration is sufficiently low (1.4 × 10³ starting molecules) and the cycle number is reduced to 30 (Figure 2). The processivity-enhanced enzyme makes up for its high rate of chimera formation by being able to amplify initial concentrations that are four to five orders of magnitude lower than what the strict proofreading polymerase can amplify (Table 1), effectively reducing chimera formation to zero (Table 2). Non-proofreading polymerases like Taq might also benefit from less concentrated initial templates, but they are less desirable for molecular evolution studies. Furthermore, we attribute Phusion's ability to amplify low template concentrations to its enhanced processivity. Ordinary Taq polymerases might not be able to amplify concentrations that are low enough to reduce artifact formation.
The percentages of chimeras formed per reaction are higher for certain conditions in the present survey than have been reported in much of the literature, which we believe is due to the greater complexity of our starting templates (i.e., eight actin haplotypes). The rates of recombination for a proofreading enzyme (VentR) under standard conditions (30 cycles) averaged ~40% recombinants in the present study, while the highest available literature reports are 32% for Taq polymerase after 30 cycles on 7 distinct initial haplotypes (30), 16% for the proofreading Expand H-F system (Boehringer, Mannheim, Germany) after 25 cycles using 8 initial haplotypes (27), and 31% recombinants using Taq polymerase across multiple loci in polyploid cotton (41). Since the chimera formation rates reported elsewhere in the literature range from 1–5% across a variety of enzymes (15–19,24,26), we attribute the higher rates in our experiment [as well as in two others (27,30)] to the higher number of initial haplotypes: most studies used 2–4 initial haplotypes to test chimera formation. In contrast, ≤35% recombinants were reported for only two MHC loci (31), which may indicate a possible influence of the template itself. Furthermore, we demonstrated that chimera formation increases as diversity in the original sample increases. This reinforces the idea that molecular environmental studies might be plagued with a slew of artificial sequences (20). Moreover, initial template concentrations (i.e., abundances) in environmental samples will most likely be unequally distributed, which might influence the formation of artifacts. However, differential abundance of templates probably has a larger impact on the detection of true diversity, and our analyses indicate that multiple (>2) PCR reactions will be required to capture the true diversity of a sample even when abundances of templates are equivalent.
We find that the majority of chimeras contain more than one breakpoint, indicating that more than two parental sequences can be involved in PCR-mediated recombination. This high rate of crossover is independent of cycle number or initial template concentration (Figure 3). This observation will create problems for chimera-detection software that bases its search criteria on finding one breakpoint per sequence. For example, when we used the online software Bellerophon (40) to detect the chimeras in the present data set, an average of only 65 ± 18% of chimeras was detected. Even more worrisome, there is a false-positive rate of 40 ± 31% (see Supplementary Material 3 for details).
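The detection and false-positive percentages above come from comparing a detector's calls against sequences whose chimeric status is already known. A minimal Python sketch of that comparison follows. The function name, the sequence IDs, and the toy data are hypothetical, and because the text does not spell out which denominator the reported false-positive rate uses, the sketch reports both common definitions rather than assuming one.

```python
def detection_summary(all_ids, true_chimera_ids, flagged_ids):
    """Compare a chimera detector's calls against a known truth set.

    All three arguments are iterables of sequence identifiers.
    The names and the output keys are illustrative assumptions.
    """
    all_ids = set(all_ids)
    chimeras = set(true_chimera_ids) & all_ids
    flagged = set(flagged_ids) & all_ids
    non_chimeras = all_ids - chimeras

    true_pos = flagged & chimeras        # real chimeras that were flagged
    false_pos = flagged & non_chimeras   # parental sequences wrongly flagged

    def ratio(numerator, denominator):
        # Avoid ZeroDivisionError on empty categories.
        return numerator / denominator if denominator else float("nan")

    return {
        # Fraction of real chimeras that the detector found.
        "sensitivity": ratio(len(true_pos), len(chimeras)),
        # Definition 1: wrong calls as a fraction of everything flagged.
        "false_calls_among_flagged": ratio(len(false_pos), len(flagged)),
        # Definition 2: wrong calls as a fraction of all non-chimeras.
        "false_positive_rate": ratio(len(false_pos), len(non_chimeras)),
    }


# Toy data (hypothetical): 10 clones, 4 of them known chimeras.
clones = [f"s{i}" for i in range(1, 11)]
known_chimeras = ["s1", "s2", "s3", "s4"]
detector_calls = ["s1", "s2", "s3", "s5", "s6"]

summary = detection_summary(clones, known_chimeras, detector_calls)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reporting both denominators makes it easier to compare results across studies that define "false positive" differently.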
Capturing the full diversity within a sample requires a combination of multiple PCR reactions that have been performed under chimera-reducing conditions. On average, PCR reactions at high cycle numbers are unable to recover all diversity (average 4 ± 1 out of 8 starting haplotypes, Figure 2), even if all three replicates are combined (Supplementary Material 1), and have the added bias of generating false haplotypes. While a low cycle number improves recovery (7 ± 1 out of 8 starting haplotypes; Figure 2), it is likely that a single PCR experiment will not capture all the diversity. For example, in chimera-reducing conditions with a low cycle number and the lowest possible initial template concentration, individual PCR reactions detected an average of four out of the eight haplotypes, but performing three replicates will certainly describe all diversity in this eight-haplotype system (Supplementary Material 1). Hence we reiterate the necessity of replicating PCR reactions to assess biodiversity (19,24,26), and we add the recommendation that these replicate PCR runs be undertaken with minimal DNA concentrations and cycle numbers, which need to be established on a sample-by-sample basis.
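The effect of pooling replicates can be illustrated with a short Python sketch: each replicate is treated as the set of haplotypes observed among its clones, and recovery is scored per replicate and for the union of all replicates. The function name, the haplotype labels (H1 to H8), and the example observations are hypothetical and are not data from this study.

```python
def haplotype_recovery(replicates, expected):
    """Score how many expected haplotypes each replicate and the pool recover.

    replicates: list of iterables of observed sequence labels, one per PCR.
    expected:   iterable of the true starting haplotype labels.
    Labels not in `expected` (e.g., chimeras) are ignored when scoring.
    """
    expected = set(expected)
    per_replicate = [len(set(rep) & expected) for rep in replicates]
    pooled = set().union(*(set(rep) for rep in replicates)) & expected
    missing = sorted(expected - pooled)
    return per_replicate, len(pooled), missing


# Hypothetical eight-haplotype system with three replicate reactions.
starting = [f"H{i}" for i in range(1, 9)]
observed = [
    ["H1", "H2", "H3", "H4"],
    ["H3", "H4", "H5", "H6", "chimera_1"],
    ["H1", "H6", "H7", "H8"],
]

per_rep, pooled_count, missing = haplotype_recovery(observed, starting)
print("recovered per replicate:", per_rep)
print("recovered when pooled:", pooled_count, "of", len(starting))
print("still missing:", missing)
```

In this toy case each reaction recovers only half of the starting haplotypes, yet the pooled set is complete, which is the pattern described above for chimera-reducing conditions.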
This work was supported by the National Institutes of Health (NIH; AREA Award no. 1R15GM081865-01) and the National Science Foundation (nos. OCE-0648713 and DEB 043115) to L.A.K., and the Conselho National de Pesquisa e Desenvolvimento, Brazil (no. GDE 200853/2007-4) to D.J.G.L.
The authors declare no competing interests. This paper is subject to the NIH Public Access Policy.
Address correspondence to Daniel J. G. Lahr, Graduate Program in Organismic and Evolutionary Biology, University of Massachusetts, 319 Morrill Science Center, Amherst, MA, 01003, USA.
1.) Bapteste, E., H. Brinkmann, J.A. Lee, D.V. Moore, C.W. Sensen, P. Gordon, L. Durufle, T. Gasterlan. 2002. The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba, and Mastigamoeba. Proc. Natl. Acad. Sci. USA 99:1414-1419.
2.) Grant, J., Y.I. Tekle, O.R. Anderson, D.J. Patterson, and L.A. Katz. 2009. Multigene evidence for the placement of a heterotrophic amoeboid lineage Leukarachnion sp. among photosynthetic stramenopiles. Protist. 160:376-385.
3.) Baldauf, S.L. 2003. The deep roots of eukaryotes. Science 300:1703-1706.
4.) Nikolaev, S.I., C. Berney, J.F. Fahrni, I. Boliver, S. Polet, A.P. Mylnikov, V.V. Aleshin, N.B. Petrov, and J. Pawlowski. 2004. The twilight of Heliozoa and rise of Rhizaria, a new supergroup of amoeboid eukaryotes. Proc. Natl. Acad. Sci. USA 101:8066-8071.
5.) Tekle, Y.I., J. Grant, O.R. Anderson, T.A. Nerad, J.C. Cole, D.J. Patterson, and L.A. Katz. 2008. Phylogenetic placement of diverse amoebae inferred from multigene analyses and assessment of clade stability within ‘Amoebozoa’ upon removal of varying rate classes of SSU-rDNA. Mol. Phylogenet. Evol. 47:339-352.
6.) Yoon, H.S., J. Grant, Y. Tekle, M. Wu, B. Chaon, J. Cole, J. Logsdon, D. Patterson. 2008. Broadly sampled multigene trees of eukaryotes. BMC Evol. Biol. 8:14.
7.) Holt, R.A., and S.J. Jones. 2008. The new paradigm of flow cell sequencing. Genome Res. 18:839-846.
8.) Allen, E.E., and J.F. Banfield. 2005. Community genomics in microbial ecology and evolution. Nat. Rev. Microbiol. 3:489-498.
9.) Keller, M., and K. Zengler. 2004. Tapping into microbial diversity. Nat. Rev. Microbiol. 2:141.
10.) Barns, S.M., C.F. Delwiche, J.D. Palmer, and N.R. Pace. 1996. Perspectives on archaeal diversity, thermophily and monophyly from environmental rRNA sequences. Proc. Natl. Acad. Sci. USA 93:9188-9193.
11.) Costas, B.A., G. McManus, M. Doherty, and L.A. Katz. 2007. Use of species-specific primers and PCR to measure the distributions of planktonic ciliates in coastal waters. Limnol. Oceanogr. Methods 5:163-173.
12.) Dawson, S.C., and N.R. Pace. 2002. Novel kingdom-level eukaryotic diversity in anoxic environments. Proc. Natl. Acad. Sci. USA 99:8324-8329.
13.) Doherty, M., B.A. Costas, G.B. McManus, and L.A. Katz. 2007. Culture-independent assessment of planktonic ciliate diversity in coastal northwest Atlantic waters. Aquat. Microb. Ecol. 48:141-154.
14.) Edgcomb, V.P., D.T. Kysela, A. Teske, A. de Vera Gomez, and M.L. Sogin. 2002. Benthic eukaryotic diversity in the Guaymas Basin hydrothermal vent environment. Proc. Natl. Acad. Sci. USA 99:7658-7662.
15.) Brakenhoff, R.H., J.G. Schoenmakers, and N.H. Lubsen. 1991. Chimeric cDNA clones: a novel PCR artifact. Nucleic Acids Res. 19:1949.
16.) Meyerhans, A., J.-P. Vartanian, and S. Wain-Hobson. 1990. DNA recombination during PCR. Nucleic Acids Res. 18:1687-1691.
17.) Judo, M.S., A.B. Wedel, and C. Wilson. 1998. Stimulation and suppression of PCR-mediated recombination. Nucleic Acids Res. 26:1819-1825.
18.) Bradley, R.D., and D.M. Hillis. 1997. Recombinant DNA sequences generated by PCR amplification. Mol. Biol. Evol. 14:592-593.
19.) Kanagawa, T. 2003. Bias and artifacts in multitemplate polymerase chain reactions (PCR). J. Biosci. Bioeng. 96:317.
20.) Hugenholtz, P., and T. Huber. 2003. Chimeric 16S rDNA sequences of diverse origin are accumulating in the public databases. Int. J. Syst. Evol. Microbiol. 53:289-293.
21.) Berney, C., J. Fahrni, and J. Pawlowski. 2004. How many novel eukaryotic ‘kingdoms’? Pitfalls and limitations of environmental DNA surveys. BMC Biol. 2:13.
22.) von Wintzingerode, F., U.B. Gobel, and E. Stackebrandt. 1997. Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiol. Rev. 21:213.
23.) Lenz, T.L., and S. Becker. 2008. Simple approach to reduce PCR artefact formation leads to reliable genotyping of MHC and other highly polymorphic loci—implications for evolutionary analysis. Gene 427:117.
24.) Acinas, S.G., R. Sarma-Rupavtarm, V. Klepac-Ceraj, and M.F. Polz. 2005. PCR-induced sequence artifacts and bias: insights from comparison of two 16S rRNA clone libraries constructed from the same sample. Appl. Environ. Microbiol. 71:8966-8969.
25.) Liesack, W., H. Weyland, and E. Stackebrandt. 1991. Potential risks of gene amplification by PCR as determined by 16S rDNA analysis of a mixed-culture of strict barophilic bacteria. Microb. Ecol. 21:191-198.
26.) Qiu, X., L. Wu, H. Huang, P.E. McDonel, A.V. Palumbo, J.M. Tiedje, and J. Zhou. 2001. Evaluation of PCR-generated chimeras, mutations, and heteroduplexes with 16S rRNA gene-based cloning. Appl. Environ. Microbiol. 67:880-887.
27.) Speksnijder, A.G.C.L., G.A. Kowalchuk, S. De Jong, E. Kline, J.R. Stephen, and H.J. Laanbroek. 2001. Microvariation artifacts introduced by PCR and cloning of closely related 16S rRNA gene sequences. Appl. Environ. Microbiol. 67:469-472.
28.) Suzuki, M., M.S. Rappe, and S.J. Giovannoni. 1998. Kinetic bias in estimates of coastal picoplankton community structure obtained by measurements of small-subunit rRNA gene PCR amplicon length heterogeneity. Appl. Environ. Microbiol. 64:4522-4529.
29.) Suzuki, M.T., and S.J. Giovannoni. 1996. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl. Environ. Microbiol. 62:625-630.
30.) Wang, G.C., and Y. Wang. 1997. Frequency of formation of chimeric molecules as a consequence of PCR coamplification of 16S rRNA genes from mixed bacterial genomes. Appl. Environ. Microbiol. 63:4645-4650.
31.) Yu, W., K.J. Rusterholtz, A.T. Krummel, and N. Lehman. 2006. Detection of high levels of recombination generated during PCR amplification of RNA templates. Biotechniques 40:499-507.
32.) Zaphiropoulos, P.G. 1998. Non-homologous recombination mediated by Thermus aquaticus DNA polymerase I. Evidence supporting a copy choice mechanism. Nucleic Acids Res. 26:2843-2848.
33.) Kurata, S., T. Kanagawa, Y. Magariyama, K. Takatsu, K. Yamada, T. Yokomaku, and Y. Kamagata. 2004. Reevaluation and reduction of a PCR bias caused by reannealing of templates. Appl. Environ. Microbiol. 70:7545-7549.
34.) Wang, Y., D.E. Prosen, L. Mei, J.C. Sullivan, M. Finney, and P.B. Vander Horn. 2004. A novel strategy to engineer DNA polymerases for enhanced processivity and improved performance in vitro. Nucleic Acids Res. 32:1197-1207.
35.) Clewley, J.P., and C. Arnold. 1997. MegAlign. The multiple alignment module of LaserGene. Methods Mol. Biol. 70:119-129.
36.) Thompson, J.R., L.A. Marcelino, and M.F. Polz. 2002. Heteroduplexes in mixed-template amplifications: formation, consequence and elimination by ‘reconditioning PCR’. Nucleic Acids Res. 30:2083-2088.
37.) Maddison, W.P., and D.R. Maddison. 1989. Interactive analysis of phylogeny and character evolution using the computer program MacClade. Folia Primatol (Basel) 53:190-202.
38.) Wilgenbusch, J.C., and D.L. Swofford. 2003. Inferring evolutionary trees with PAUP*. Current Protocols in Bioinformatics. John Wiley & Sons, Malden, MA.
39.) Boston, R.C., and A.E. Sumner. 2003. STATA: a statistical analysis system for examining biomedical data. Adv Exp Med Biol. 537:353-369.
40.) Huber, T., G. Faulkner, and P. Hugenholtz. 2004. Bellerophon: a program to detect chimeric sequences in multiple sequence alignments. Bioinformatics 20:2317-2319.
41.) Cronn, R., M. Cedroni, T. Haselkorn, C. Grover, and J.F. Wendel. 2002. PCR-mediated recombination in amplification products derived from polyploid cotton. Theor. Appl. Genet. 104:482-489.