We have frequently encountered a problem when screening microsatellite libraries constructed in topoisomerase I-functionalized pCR®2.1 plasmids (Invitrogen, Carlsbad, CA, USA). A high proportion (up to 90%) of isolated colonies generate double sequence in automated DNA sequencing runs (see Figure 1 for an example). Because base calls are difficult to resolve in double sequence traces, these clones have had to be excluded from further locus development. In discussing this problem with colleagues, we have found further instances of this cloning anomaly, which decreases the efficiency and yield of sequencing. Here, we report a cause of the difficulty and show that sequencing with an internal pCR2.1 reverse primer eliminates the double sequence problem and provides clean reads from traces.
TA cloning using pCR2.1 is a multistep process. One critical step involves using Vaccinia topoisomerase I to functionalize the vector. Topoisomerase I accomplishes this by cleaving the plasmid after the sequence 5′-CCCTT-3′ at base pair position 294, linearizing the circular vector (1). Linearized vector is then combined with PCR-amplified DNA products that carry single 3′ A overhangs. Insertion of PCR-amplified products recircularizes the plasmids, which are then transformed into either electrocompetent or chemically competent cells. Positive bacterial colonies are subsequently isolated, and their plasmids are purified and sequenced using one of three different priming sites contained in the vector (M13+, M13-, or T7).
Inspection of sequence trace files for double sequence colonies led us to hypothesize that the problem was due to the anomalous action of topoisomerase I in binding and cleaving at two 5′-CCCT-3′ target sites in the plasmid. We based this hypothesis on the observation that one of the two sequences in problematic colonies was common to all double sequence traces (Figure 1). We identified this sequence as a truncated version of the vector pCR2.1, missing a 74-bp fragment excised between the two CCCT motifs (Figure 2). The first CCCT site, at positions 290–293, incorporates a portion of the intended cleavage and cloning site in the vector, but cuts one base further upstream than expected (Figure 2A). Cleavage also occurs after a second CCCT site, at positions 364–367, followed by the subsequent recircularization of the plasmid deleted for the intervening fragment (Figure 2B). Sequencing of colonies containing both the excised version of pCR2.1 and vectors with cloned inserts using any of the three standard primers M13+, M13-, or T7 would therefore generate double sequence.
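The coordinates given above fix the size of the deletion: cleavage after position 293 and again after position 367 removes bases 294–367, that is 367 − 293 = 74 bp. The following Python sketch illustrates that bookkeeping only. The 400-base toy sequence and the function name `excise_between_cuts` are assumptions made for illustration; they are not the real pCR2.1 sequence or part of any published tool.

```python
def excise_between_cuts(seq, first_cut_after, second_cut_after):
    """Delete the bases lying between two cleavage points.

    Both positions are 1-based and name the base after which the
    strand is cleaved (293 and 367 in the text).
    """
    if not 0 <= first_cut_after <= second_cut_after <= len(seq):
        raise ValueError("cut positions must be ordered and inside the sequence")
    excised = seq[first_cut_after:second_cut_after]
    truncated = seq[:first_cut_after] + seq[second_cut_after:]
    return truncated, excised


# Toy stand-in for the vector (NOT real pCR2.1 sequence): CCCT motifs are
# placed at positions 290-293 and 364-367 to mirror the coordinates in the text.
toy_vector = "N" * 289 + "CCCT" + "A" * 70 + "CCCT" + "N" * 33

truncated, excised = excise_between_cuts(toy_vector, 293, 367)
print(len(excised))    # length of the deleted fragment
print(len(truncated))  # length of the vector after the deletion
```

In this toy case the truncated sequence keeps the first CCCT motif and loses the second, which matches the description of a recircularized plasmid lacking the intervening fragment.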
Support for this hypothesis comes from the observation that a number of nontarget cleavage sites are known for Vaccinia topoisomerase I, including CCATT, CCCTA, and CCCTC (1,2,3). Rates and probabilities of cleavage can also depend on the length and identity of the flanking sequences (3). Furthermore, corruption of genomic libraries by vector sequence (commonly referred to as vector contamination) is widespread (4) and may also be due to this type of irregular enzymatic activity.
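A simple way to see where such sites might lie in a vector is to scan its sequence for the motifs listed above. The sketch below is a hedged illustration: it searches one strand only, ignores the flanking-sequence effects mentioned in the text, and uses a made-up test sequence. The function name `find_cleavage_motifs` is hypothetical, and the bare CCCT entry reflects the hypothesis of this report rather than an established site list.

```python
import re

# Canonical site plus the nontarget sites named in the text. The bare CCCT
# entry is the site proposed in this report (an assumption, not a reference list).
MOTIFS = ["CCCTT", "CCATT", "CCCTA", "CCCTC", "CCCT"]


def find_cleavage_motifs(seq, motifs=MOTIFS):
    """Return sorted (start, motif) pairs, 1-based, on the given strand only."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # A zero-width lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={re.escape(motif)})", seq):
            hits.append((match.start() + 1, motif))
    return sorted(hits)


toy = "GGCCCTTAAGCCATTGGCCCTAGG"  # hypothetical sequence for illustration
for start, motif in find_cleavage_motifs(toy):
    print(start, motif)
```

A fuller screen would also scan the reverse complement and weight hits by sequence context, neither of which is attempted here.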
Identification of the anomalous sequence as a deleted version of pCR2.1 suggested a strategy for eliminating the double sequence problem. We reasoned that sequencing with an oligonucleotide primer unique to the excised fragment should generate only one trace, specific for vectors containing cloned inserts. To test this hypothesis, we resequenced eight problematic double trace clones generated from a Diachasma alloeum (Hymenoptera: Braconidae) CA-dinucleotide repeat-enriched library using a 21-bp primer specific for the excised region (5′-GCCAGTGTGATGGATATCTGC-3′; hereafter referred to as inseq-1). We also sequenced 288 newly isolated colonies from the library using both the inseq-1 and the standard M13- primer to assess the effectiveness of the internal primer in a general screen for microsatellite inserts.
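The reasoning behind the primer choice can be stated as a simple test: a suitable primer has a priming site in the intact vector and none in the deleted vector. The Python sketch below illustrates that test with exact string matching on both strands. Only the inseq-1 sequence is taken from the text; the flanking sequences, the templates, and the function names are placeholders assumed for illustration and do not represent pCR2.1.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T/N sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def has_priming_site(template, primer):
    """True if the primer matches either strand of the template exactly."""
    template, primer = template.upper(), primer.upper()
    return primer in template or reverse_complement(primer) in template


def discriminates(primer, intact, deleted):
    """True if the primer can anneal to the intact vector but not the deleted one."""
    return has_priming_site(intact, primer) and not has_priming_site(deleted, primer)


INSEQ_1 = "GCCAGTGTGATGGATATCTGC"  # primer sequence quoted in the text

# Hypothetical templates: the flanks are placeholders, not pCR2.1 sequence.
left_flank, right_flank = "ATATATATAT", "TATATATATA"
intact_toy = left_flank + reverse_complement(INSEQ_1) + right_flank
deleted_toy = left_flank + right_flank

print(discriminates(INSEQ_1, intact_toy, deleted_toy))
```

Exact matching is a deliberate simplification; real primer evaluation would also consider partial complementarity and annealing conditions.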
The inseq-1 primer was designed using Primer3 (5). The 21-bp primer was positioned inside the excised region of pCR2.1, between the insertion site for cloned PCR products and the priming site for T7. An enriched CA-repeat D. alloeum library was constructed using the protocol of Hamilton et al. (6). The subgenomic microsatellite library was PCR-amplified with the primer sequence 5′-GCGGTACCCGGGAAGCTTGG-3′ and cloned using a TOPO® TA Cloning kit (Invitrogen). Cells were grown on ampicillin-selective plates coated with 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) and picked via blue/white screening. Plasmids were isolated using the Eppendorf® Perfectprep® Plasmid 96 Vac Direct Bind kit (Brinkmann Instruments, Westbury, NY, USA). Plasmids were sequenced from the M13- primer on an ABI Prism® 3730XL automated DNA analyzer using the BigDye® Terminator v3.1 system (Applied Biosystems, Foster City, CA, USA). Almost 90% of the 96 initially sequenced clones generated double sequence traces. From this pool, we chose eight colonies, designated C1–C8, that also appeared to contain CA repeats for reanalysis. Resequencing of these eight clones was performed as before using both the inseq-1 and M13- primers. In addition, we replated the D. alloeum library and isolated 288 new colonies. Plasmids were purified from each of these colonies as described above, and sequences were generated using both the inseq-1 and M13- primers for comparison.
For each of the eight previously problematic colonies (C1–C8), resequencing with the M13- primer produced the same compromised double trace patterns. In contrast, sequencing with the inseq-1 primer resulted in clean and readable single traces for all eight colonies. Moreover, all eight clones contained CA-dinucleotide repeats suitable for microsatellite marker development.
To further confirm the nature and severity of the double sequence problem, we sequenced an additional 288 newly isolated colonies from the D. alloeum library using both the inseq-1 and M13- primers. A total of 79.9% of the clones (230/288) sequenced with the inseq-1 primer produced a single, uncompromised trace signal, of which 120 proved to contain CA-dinucleotide repeats. The remaining 20.1% of clones consisted of a combination of failed cycle sequencing reactions and traces too small to read (many of these did appear to contain library fragments and were subsequently resequenced successfully). In comparison, just 11.5% (33/288) of clones sequenced with the M13- primer produced resolvable traces, of which 13 contained microsatellites. All microsatellites identified by sequencing with the M13- primer were also characterized using the inseq-1 primer. Further sequencing of microsatellite libraries constructed for the rusty crayfish (Orconectes rusticus) and Mexican fruit fly (Anastrepha ludens) confirmed both the double sequence problem and the utility of inseq-1 as its solution.
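The percentages above follow directly from the reported counts, and the same counts allow a rough comparison of yield between the two primers. The short Python sketch below reproduces that arithmetic. The per-288 microsatellite yields and the fold differences it prints are derived here from the reported counts and are not figures stated in the text; the variable and function names are illustrative.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL = 288
# Counts reported in the text: (readable traces, clones with CA repeats).
results = {"inseq-1": (230, 120), "M13-": (33, 13)}

for primer, (readable, with_repeats) in results.items():
    # Share of all colonies that were readable, and that yielded a microsatellite.
    print(primer, percent(readable, TOTAL), percent(with_repeats, TOTAL))

# Ratio of inseq-1 to M13- counts for readable traces and for microsatellites.
fold_readable = results["inseq-1"][0] / results["M13-"][0]
fold_repeats = results["inseq-1"][1] / results["M13-"][1]
print(round(fold_readable, 1), round(fold_repeats, 1))
```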
It remains to be resolved when excision of the 74-bp fragment of pCR2.1 occurs during the cloning process. One possibility is that many copies of the vector are cut and recircularized prior to cell transformation. In this case, competent cells acquire an excised plasmid along with a second, insert-containing vector. A second possibility is that the deletion occurs during or after cell transformation and involves topoisomerase I excision of both plasmid and cloned insert sequences in vivo. We consider this second scenario more likely, because it explains why affected colonies appear as positive white clones during color media selection, why only a single cloned insert is found for each colony following primer sequencing, and why we fail to find excised plasmids alone in transformed cells. Finally, it is not clear whether the excision of plasmid sequence is peculiar to these microsatellite libraries. While numerous reports of double sequence after TOPO cloning lead us to suspect that plasmid excisions are not related to the insert type, insert type cannot be ruled out as a contributing factor. Regardless of when the artifact is generated, however, the use of the inseq-1 primer eliminates the secondary signal problem completely, greatly increasing the efficiency and yield of the microsatellite library screening process.
The authors wish to thank Andy Michel and two anonymous reviewers for helpful comments on a previous draft of this manuscript, as well as the University of Notre Dame Sequencing Facility. We would also like to thank the many people who responded to our posting on the EvolDir (Evolutionary Directory), confirming that they too had noticed this cloning anomaly.
The authors declare no competing interests.

