In contrast, we found that the use of fusion enzymes (Phusion, Herculase II Fusion) markedly reduced stutter product formation. Quality scores >20 were obtained for nearly 100% of the bases derived from samples that had mononucleotide repeats ≤13 bases in length (Figures 1 and 2). Considerable improvement in quality scores for repeats of 14 and 15 bases was also observed. Two sequences we tested with 16- and 17-base repeats were little improved with fusion enzymes over reactions that used Taq DNA polymerase. This improvement could be due to a number of phenomena that are not mutually exclusive.
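For readers less familiar with the quality metric, a Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so the Q > 20 threshold used here corresponds to less than a 1% chance of a miscalled base. The following minimal Python sketch illustrates how the proportion of bases exceeding that threshold could be computed. The function names and the example scores are hypothetical and are not taken from this study's data or analysis pipeline.

```python
def phred_error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def fraction_above(quals, threshold=20):
    """Fraction of bases whose quality score is strictly greater than threshold."""
    if not quals:
        return 0.0
    return sum(1 for q in quals if q > threshold) / len(quals)


# Hypothetical per-base quality scores for a single read (illustrative only).
example_quals = [35, 40, 38, 22, 18, 12, 30, 41]

if __name__ == "__main__":
    print(phred_error_probability(20))    # about 0.01, i.e. a 1% error estimate
    print(fraction_above(example_quals))  # 6 of the 8 scores exceed 20
```

Applied to each read, a summary of this kind would give the per-sample proportion of bases above Q20 that is compared across polymerases.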
Since the formation of stutter products necessitates the dissociation of the DNA polymerase, it is possible that the increased processivity of the fusion-based enzymes decreases the likelihood of dissociation during replication of a mononucleotide repeat and therefore reduces stutter product formation. If this were the mechanism, one would expect to see similar results from other enzymes with high processivity. Our quality results from KAPAHiFi, however, showed no consistent improvement over AmpliTaq Gold polymerase for samples with mononucleotide repeats greater than 12 bp. This is consistent with previous work (3) that failed to find a link between processivity and frameshift error.
A separate possible mechanism for reducing stutter product formation that we considered is proofreading ability. A study of T7, T4, and Pfu DNA polymerases found that exonuclease-deficient mutants of the enzymes produced more mutations than their proofreading native forms, indicating that proofreading ability may enable the enzymes to correct frameshift mutations. However, the ability of proofreading polymerases to correct frameshift mutations was greatly reduced as the repeat size reached 8 nucleotides (10). Kroutil et al. (10) found that proofreading T7 DNA polymerase decreased deletion frameshift errors by 160× relative to a proofreading-deficient mutant for repeats 3 nucleotides in length, but this advantage decreased to only 7× for runs of 8 nucleotides, indicating an upper limit to the ability of proofreading polymerases to reduce stuttering. Our data also indicate that proofreading ability has little effect in reducing frameshift errors associated with mononucleotide repeats. Our trials using KAPAHiFi, a proofreading polymerase with the lowest error rate currently available (2.8 × 10⁻⁷ errors per nucleotide), yielded little to no improvement in sequence quality versus the non-proofreading AmpliTaq Gold enzyme. This result fits well with the hypothesized process of slipped-strand mispairing. Despite formation of a loop in one strand during replication of a long mononucleotide repeat, the 3′ end of the nascent strand could be paired properly with the template strand at any point along the repeat, leaving the 3′-to-5′ proofreading activity nothing to operate on.
An additional possibility is that some property of polymerases makes them susceptible to dissociation, at least in vitro, when their active sites are entirely occupied with repetitive sequences (9). One explanation for the ability of fusion enzymes to decrease stutter is that the additional DNA binding domain effectively increases the contact surface with the DNA, enabling accurate replication of larger repeat regions. The maximum rate of mutation in homopolymer runs has been found to occur in vitro at runs 8 bp in length (9), which corresponds to the estimated number of nucleotides that fill the active site of Taq and many other DNA polymerases (12,13,14). It is interesting to note that in this study the quality of sequences generated with Phusion (as well as Herculase II Fusion) declined rapidly after 13 mononucleotides (Figure 2). This represents a 5-bp improvement over previous studies that found maximal mutation rates in runs ≤8 bp (9,10) and corresponds to the 4–5 bp estimated to interact with the Sso7d protein (29,30,31). This suggests that the decreased stutter product formation observed with the Phusion enzyme in this study results from the increased contact between enzyme and DNA afforded by the fusion of the Sso7d protein to the polymerase.
Another potential benefit of the Sso7d protein is its ability to increase the melting temperature of dsDNA by up to 39°C (32). This attribute may be generally beneficial in the amplification of A/T-rich amplicons. Some DNA melting may occur at 72°C during the elongation phase, which would terminate elongation (19) and could result in stutter product formation.
Although the use of fusion enzymes improved sequence quality for a number of our samples, the improvement appears to reach a limit at mononucleotide repeats of 15 bp, with no improvement in the quality of sequences with longer runs. Further optimization of the Phusion-based PCR and/or the cycle sequencing reaction may yield further improvements in sequence quality; however, the initial trials we have performed have not yielded appreciable gains.
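As a practical illustration of this limit, the short Python sketch below shows one way a reference sequence could be screened for its longest mononucleotide run, so that amplicons likely to exceed the 15-bp limit can be flagged before sequencing. The function name, the constant, and the example sequence are hypothetical; the 15-bp cutoff simply reflects the limit observed in this study and should not be read as a general threshold.

```python
from itertools import groupby

# Assumption: cutoff taken from the limit observed in this study, not a general rule.
OBSERVED_LIMIT_BP = 15


def longest_homopolymer(seq):
    """Return (base, length) for the longest mononucleotide run in seq.

    Ties are resolved in favor of the run that occurs first.
    """
    best_base, best_len = "", 0
    for base, run in groupby(seq.upper()):
        run_len = sum(1 for _ in run)
        if run_len > best_len:
            best_base, best_len = base, run_len
    return best_base, best_len


# Hypothetical amplicon fragment containing a 16-base T run (not from the study).
example = "ACGA" + "T" * 16 + "GCA"

if __name__ == "__main__":
    base, length = longest_homopolymer(example)
    if length > OBSERVED_LIMIT_BP:
        print(f"{base} run of {length} bp exceeds the observed limit")
```

A screen of this kind only identifies candidate problem regions; whether a given run actually degrades read quality still has to be assessed from the sequencing traces.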
Additional improvements in quality for sequences with repeats >15 bp may require development of other accessory proteins, or novel polymerases. We note the recent report of SsoDPo1 (33), which can form trimeric complexes with DNA resulting in a large contact surface and extreme processivity (900 bp).
Here we have reported an unexplored attribute of fusion polymerases, and a resulting simple and cost-effective way to reduce genotyping errors in PCR-based sequencing. These findings will be of broad utility to investigators interested in optimizing sequence quality and allele detection in simple sequence repeats. Our findings also indicate that neither processivity nor proofreading ability alone can account for the mitigation of slipped-strand intermediates and suggest that the reduction of stutter product formation may be a result of increasing the polymerase contact surface.
We would like to thank Angela Holliss and Jeff Gross of the Advanced Analysis Centre Genomics Facility at the University of Guelph for their sequencing expertise. Annabel Por, John Gerrath, and members of the Ontario Agricultural College Herbarium Floral Diversity Research Group assisted with field collections. This work was supported by Genome Canada through the Ontario Genomics Institute (grant no. 047741, to S.G.N.); and the Canadian Foundation for Innovation (grant no. 460042, to S.G.N.).
The authors declare no competing interests.
Address correspondence to Aron J. Fazekas, Department of Integrative Biology, University of Guelph, Guelph, Ontario, N1G 2W1 Canada. e-mail: [email protected]