Recovery of original haplotypes per treatment
The number of original haplotypes recovered out of the eight initially used varied among treatments (Figure 2), but did not differ significantly with number of cycles (one-way ANOVA: df = 23, F = 2.52, P = 0.12) or starting template concentration (one-way ANOVAs; 30 cycles: df = 11, F = 1.85, P = 0.21; 50 cycles: df = 11, F = 2.91, P = 0.1). Nevertheless, a few trends emerge. For Phusion at 50 cycles, all template concentrations recovered, on average, an assortment of four of the eight original haplotypes. At 30 cycles, the intermediate concentrations (1.4 × 10^7 and 1.4 × 10^5 molecules/µL) recovered seven haplotypes on average, with individual PCR reactions recovering all eight (Figure 2, Supplementary Material 1); the higher and lower template concentrations recovered four haplotypes on average. VentR polymerase recovered an average of five original haplotypes under all conditions. Some sequences were more likely to be recovered than others: original haplotypes 2 and 5 were recovered in almost all experiments, whereas original haplotype 3 was recovered in only half the experiments (Supplementary Material 1).
Number of breakpoints in chimeric sequences
We determined breakpoints and participant original haplotypes for each chimeric haplotype. Both the distribution of the number of breakpoints per sequence and the distribution of breakpoints along the sequence suggest that, under PCR conditions, recombination events are random. Chimeras ranged from having a single breakpoint with two clear parental sequences to having eight breakpoints and six parental sequences alternating in participation (see Supplementary Material 2 for a complete tally). The majority of chimeras (65%) had more than one breakpoint, and in most cases there were more than two parental sequences for each chimera (Figure 3). There is no clear pattern between number of breakpoints and template concentration or number of cycles: the distribution of sequences with breakpoints follows a Poisson distribution when taken together (Poisson regression likelihood ratio chi-squared = 7.83, P = 0.02, df = 2; Pearson goodness-of-fit chi-squared = 110.67, P = 0.99, df = 152), and follows that distribution when partitioned by concentration (P = 0.01) or cycle number (P = 0.04) (Figure 3). Additionally, we were unable to detect a correlation between sequence features (local similarity, conservation) and susceptibility to being a breakpoint. The distributions of breakpoints along the sequence are not significantly different from those expected under a normal distribution (Shapiro-Wilk W = 0.95, P = 0.04; Supplementary Material 4).
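As an illustration of the Pearson goodness-of-fit idea used above, the following minimal Python sketch computes the Pearson chi-squared statistic for counts under a constant-mean Poisson model and divides it by its degrees of freedom. A ratio close to (or below) 1 indicates no excess variation relative to Poisson expectations. This is a simplified, intercept-only version and is not the regression model fitted in this study; the function name `pearson_dispersion` and the `example_counts` values are hypothetical and serve only to show the arithmetic. The final lines apply the same ratio to the reported statistic (110.67 with 152 degrees of freedom).

```python
from statistics import mean


def pearson_dispersion(counts):
    """Pearson chi-squared, df, and chi2/df for a constant-mean Poisson model.

    Assumption: every observation shares one Poisson mean (no covariates),
    so the expected value and variance of each count both equal the sample mean.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = mean(counts)
    if mu <= 0:
        raise ValueError("mean count must be positive")
    # Pearson statistic: squared deviation scaled by the Poisson variance (= mu).
    chi2 = sum((c - mu) ** 2 / mu for c in counts)
    df = n - 1  # one parameter (the mean) was estimated
    return chi2, df, chi2 / df


# Hypothetical breakpoint counts per chimera (illustrative only, not study data).
example_counts = [1, 2, 1, 3, 2, 4, 1, 2, 3, 1]
chi2, df, ratio = pearson_dispersion(example_counts)
print(f"chi2 = {chi2:.2f}, df = {df}, chi2/df = {ratio:.2f}")

# The same ratio for the statistic reported in the text.
reported_ratio = 110.67 / 152
print(f"reported chi2/df = {reported_ratio:.2f}")
```

For the reported values the ratio is about 0.73, which is consistent with the absence of overdispersion relative to a Poisson model.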
Haplotype participation in chimeric sequences
The frequency with which a specific original haplotype was involved in chimeric events corresponds to the frequency with which that haplotype was recovered overall in PCR reactions (Supplementary Material 5). Original haplotype 2 was recovered the most times across experiments (51 clones overall), and it was also involved as part of a chimera in 81 cases. Original haplotype 3 was the least recovered haplotype across experiments (16 clones overall) and was also the least likely haplotype to participate in chimeras (23 counts). Such a pattern in the composition of chimeras is further evidence that recombination events are mostly random. The more readily available sequences are more likely to participate in recombination events, without any bias toward a particular haplotype or group of haplotypes.
Discussion
Chimeras are more likely to be observed when a high number of cycles is combined with a high initial template concentration (Table 2). In contrast, no recombinants were observed in the low cycle/low concentration conditions. Although our analyses corroborate the inference that high cycle numbers induce chimera formation in PCR (17,18,19)(26,30,33), they also highlight the importance of initial template concentration in chimera formation (Figure 2). The effect of template concentration is revealed here because of the wider range of concentrations analyzed (seven orders of magnitude) compared with previous research (26). Such a range is enabled by the ability of Phusion, a processivity-enhanced polymerase, to amplify very low concentrations of template (Table 1).