2Department of Biological Sciences, Smith College, Northampton, MA, USA
PCR-mediated recombination can greatly impact estimates of diversity, both in environmental studies and in analyses of gene family evolution. Here we measure chimera (PCR-mediated recombinant) formation by analyzing a mixture of eight partial actin sequences isolated from the amoeba Arcella hemisphaerica amplified under a variety of conditions that mimic standard laboratory situations. We further compare a new-generation proofreading processivity-enhanced polymerase to both a standard proofreading enzyme and previously published results. Proofreading polymerases are preferred over other polymerases in instances where evolutionary inferences must be made. Our analyses reveal that reducing the initial template concentration is as critical as reducing the number of cycles for decreasing chimera formation and improving accuracy. Furthermore, assessing the efficiency of recovery of original haplotypes demonstrates that multiple PCR reactions are required to capture the actual genetic diversity of a sample. Finally, the experiments confirm that processivity-enhanced polymerases enable a substantial decrease in PCR-mediated recombination through reducing starting template concentration, without compromising the robustness of PCR reactions.
Polymerase chain reaction (PCR)–based methods are the norm in molecular evolution studies of non-model taxa and in explorations of environmental DNA. For example, degenerate PCR is often used in systematic studies where numerous diverse taxa are to be sampled (1,2,3,4,5,6). Despite advances and dropping costs in mega- and metagenomic sequencing techniques (7,8,9), PCR methods remain key in hypothesis-driven environmental studies (10,11,12,13,14). PCR is used in such studies mainly because of its reproducibility, which enables the targeting of specific genes of interest from diverse taxa.
One worrisome aspect of PCR-based studies is the phenomenon of PCR-mediated recombination, or chimera formation (15,16). Chimeras are formed when incompletely extended DNA fragments anneal to closely related sequences generating recombinants between starting templates (17,18,19). It can be difficult to differentiate original haplotypes from chimeras, leading to overestimation of biological diversity in environmental studies (20,21,22). Interpretations about the fate of genes in molecular evolution studies can also be compromised by the presence of chimeras, as has been shown in tests of positive selection in the major histocompatibility complex (MHC) in sticklebacks (23).
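The template-switching mechanism described above can be illustrated with a toy example. This is a minimal sketch, not the procedure used in this study: the two 10-nt "haplotypes", the breakpoint, and the function names `make_chimera` and `count_differences` are all hypothetical and chosen only for illustration.

```python
def make_chimera(template_a: str, template_b: str, breakpoint: int) -> str:
    """Join the 5' part of template_a to the 3' part of template_b.

    Models an incompletely extended strand copied from template_a that
    anneals to template_b in a later cycle and is extended to full length.
    """
    if len(template_a) != len(template_b):
        raise ValueError("templates must be aligned to the same length")
    if not 0 < breakpoint < len(template_a):
        raise ValueError("breakpoint must fall inside the sequence")
    return template_a[:breakpoint] + template_b[breakpoint:]


def count_differences(seq_x: str, seq_y: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    return sum(1 for x, y in zip(seq_x, seq_y) if x != y)


# Hypothetical aligned haplotypes that differ at two positions.
hap_a = "ACGTACGTAC"
hap_b = "ACGAACGTTC"

chimera = make_chimera(hap_a, hap_b, breakpoint=5)

# The recombinant matches neither parent, so a naive count of unique
# sequences would report three haplotypes where only two exist.
unique_sequences = {hap_a, hap_b, chimera}
```

Because the chimera differs from each parent at only one of the two variable sites, it looks like a plausible third haplotype, which is why such products can inflate diversity estimates.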
Most experimental work on PCR-mediated recombination has used traditional enzymes such as Taq polymerase to determine rates of chimera formation under conditions normally used in studies of environmental microbial samples [e.g., bacterial and archaeal 16S SSU-rDNA surveys (22,24,25,26,27,28,29,30,31)]. These studies determined that chimera formation can be reduced for most DNA polymerases when the cycle number is lowered and the extension time is increased (17,19,31,32,33). It is recommended that the lowest number of cycles be determined experimentally and that it be ~20 cycles or fewer. These suggestions can be easily followed in experiments with high-quality DNA from organisms of known genome complexity (23,26). However, when dealing with DNA extracted from organisms that may have highly complex genomes, or with preparations containing chemical compounds that are not completely removed (e.g., environmental DNA from sediments), it is more difficult to optimize PCR for downstream applications such as cloning and sequencing (22,24,27,30). Additionally, in molecular evolution studies where single nucleotide polymorphisms are important for inferring evolutionary processes (e.g., population studies or analyses of the rare biosphere), the high error rate of Taq polymerase is not desirable. To address such difficulties, a new generation of DNA polymerases has emerged that combine proofreading capabilities with enhanced DNA binding motifs (34), including Phusion (Finnzymes, Espoo, Finland), PfuUltra (Stratagene, La Jolla, CA, USA), and Pfx50 (Invitrogen, Carlsbad, CA, USA). These enzymes have not yet been analyzed for dynamics of chimera formation.
Our goal is to understand the formation of PCR-mediated recombinants when many closely related sequences are present in the same reaction, and when a high number of cycles is required to generate robust products. A low primer-to-target amplicon ratio is assumed to be the main reason for mismatch pairing in later cycles, which leads to chimera formation (15,16,17,24); thus, we also surveyed different initial DNA concentrations. Varying DNA concentrations is also relevant because, in genomic DNA extractions, the absolute number of genome copies varies according to genome size, and the subsequent high copy number of members of large gene families could lead to increased PCR recombination (23).
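One way to see why initial template concentration and cycle number are linked is an idealized doubling model. The sketch below is an assumption made for illustration, not an analysis from this study: it supposes perfect twofold amplification in every cycle and ignores plateau effects and primer depletion, so real reactions will deviate from it. The function names and the copy numbers are hypothetical.

```python
from math import log2


def ideal_amplicon_copies(initial_copies: float, cycles: int) -> float:
    """Amplicon copies after `cycles` rounds, assuming perfect doubling."""
    return initial_copies * 2 ** cycles


def cycles_equivalent_to_dilution(fold_dilution: float) -> float:
    """Cycles of ideal doubling needed to make up for a given fold dilution."""
    return log2(fold_dilution)


# Hypothetical starting amounts: an undiluted and a 10-fold diluted template.
undiluted = ideal_amplicon_copies(1_000_000, cycles=25)
diluted = ideal_amplicon_copies(100_000, cycles=25)

# Under this model a 10-fold dilution delays the reaction by about 3.3 cycles,
# so the late, amplicon-rich cycles are reached later for a fixed cycle number.
delay_in_cycles = cycles_equivalent_to_dilution(10)
```

In this simplified view, diluting the template and removing cycles both lower the amplicon concentration reached by the end of the run, which is the condition assumed to favor mismatch pairing.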
Here we analyze the formation of chimeras from a set of eight paralogous protein-coding genes by comparing the following experimental conditions: (i) a processivity-enhanced, proofreading polymerase to a traditional proofreading polymerase; (ii) high cycle number to standard cycle number; and (iii) a range of initial template concentrations. These sets of conditions are relevant to numerous research areas as parameters fall within recommendations and are likely to be used in standard laboratory practice.
Materials and methods
Origin of templates
We chose to investigate a set of eight paralogous haplotypes of the actin gene extracted from the testate amoeba Arcella hemisphaerica. The eight haplotypes differ by 2.4–20.5% in nucleotide sequence. Actin clones were obtained from previous work in A. hemisphaerica as described in Tekle et al. (5), except that the resulting clones were purified using the PureLink kit (Invitrogen). To generate templates for the experiment (Figure 1), we eliminated the vector by diluting each purification to 25 ng/µL and amplifying each separately with Arcella-specific degenerate primers designed from an alignment with >30 actin paralogs from this taxon: AhemAct-F (5′-GARGARCAYCCYGTYTTGTTGAC-3′) and AhemAct-R (5′-TAYTTYCTYTCDGGRGGAGCAAT-3′). Phusion Hot Start polymerase (Cat. no. F540; New England BioLabs, Ipswich, MA, USA) was used under the following conditions: 35 cycles of 98°C denaturing for 15 s, 56°C annealing for 15 s, and 72°C extension for 45 s. These primers yield an actin fragment that is 670 bp long. We performed these experiments using appropriate negative and positive controls, and the amplified products were sequenced to check for quality (data not shown). Each amplified product was then purified using Microclean (The Gel Company, San Francisco, CA, USA). Finally, all haplotypes were individually diluted to 1 ng/µL and mixed (Figure 1).
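To make the degenerate primers above concrete, the following sketch counts how many distinct oligonucleotides each primer represents under the standard IUPAC ambiguity codes (R = A/G, Y = C/T, D = A/G/T). The helper names `degeneracy` and `expand` are hypothetical and not part of the published protocol; whitespace in a primer string is ignored.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(primer: str) -> str:
    """Uppercase the primer and drop any whitespace."""
    return "".join(primer.upper().split())


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in _clean(primer))


def expand(primer: str) -> list[str]:
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in _clean(primer))
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text (5' to 3').
ahem_act_f = "GARGARCAYCCYGTYTTGTTGAC"
ahem_act_r = "TAYTTYCTYTCDGGRGGAGCAAT"

fold_f = degeneracy(ahem_act_f)  # two R and three Y positions
fold_r = degeneracy(ahem_act_r)  # three Y, one D, and one R position
```

By this count the forward primer is a pool of 32 sequences and the reverse primer a pool of 48, each 23 nt long.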