The small-intelligent primer design is, in principle, compatible with most PCR-based NNS randomization strategies, so a suitable PCR-based mutagenesis approach can readily be selected for library construction. For instance, a standard QuikChange site-directed mutagenesis protocol (Stratagene, Agilent Technologies, Santa Clara, CA, USA) can be adopted when one site is to be randomized (Figure 1, Step II, Option A). In conjunction with a double-stranded primer mixture (NDT, VMA, ATG, and TGG), a small-intelligent library can be easily constructed. Overlapping PCR is suitable for the randomization of two nearby sites, including two contiguous sites (Figure 1, Step II, Option B), while megaprimer-based PCR can be used for the randomization of two sites located far from each other (Figure 1, Step II, Option C) (19,20). When coupled with these procedures, two-site randomized small-intelligent libraries can be constructed using two sets of single-stranded primers (four oligonucleotides per set: NDT, VMA, ATG, and TGG). Although the libraries could also be constructed using double-stranded primers carrying two randomized sites, 16 oligonucleotides instead of eight would be required in that case, so using the single-stranded primers greatly reduces primer costs. However, this primer design cannot be applied to two contiguous randomized sites, for which 16 mutagenic oligonucleotides (forward) and one nonmutagenic oligonucleotide (reverse) are needed. It is worth noting that megaprimer-based PCR has been shown to be applicable to megaprimers with sizes up to a few kilobases (21). Thus, in principle, the small-intelligent system can be applied in any case where two sites are to be randomized.
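The codon content of the four-primer mixture can be checked by expanding the degenerate codons against the standard genetic code. The following is a minimal sketch rather than part of the published protocol; the helper names `expand` and `encoded` are hypothetical, and the standard (not organism-specific) codon table is assumed.

```python
from itertools import product

# IUPAC degenerate base codes that appear in NDT, VMA, and NNS
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "D": "AGT", "V": "ACG", "M": "AC", "S": "CG"}

# Standard genetic code, codons ordered with T, C, A, G at each position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def expand(degenerate):
    """Return every concrete codon matched by one degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in degenerate))]

def encoded(degenerate_codons):
    """Map each concrete codon in the mixture to its one-letter amino acid."""
    return {codon: CODON_TABLE[codon]
            for degenerate in degenerate_codons
            for codon in expand(degenerate)}

small_intelligent = encoded(["NDT", "VMA", "ATG", "TGG"])
nns = encoded(["NNS"])

for name, library in (("small-intelligent", small_intelligent), ("NNS", nns)):
    amino_acids = set(library.values()) - {"*"}
    stops = [codon for codon, aa in library.items() if aa == "*"]
    print(name, len(library), "codons,", len(amino_acids), "amino acids, stops:", stops)
```

Under these assumptions the mixture expands to 12 (NDT) + 6 (VMA) + 1 + 1 codons, one per amino acid, whereas NNS expands to 32 codons that include the TAG stop codon.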
Moreover, Wang et al. recently reported a method for constructing multiple-site saturation mutagenesis libraries (NNS randomization) by combining overlapping PCR and megaprimer-based PCR, and the strategy was successfully applied to the randomization of an E. coli K12 malic enzyme at three target positions (22). Combined with this PCR strategy, the small-intelligent system could, in principle, randomize more than two sites simultaneously using (2 × N − 2) sets of single-stranded oligonucleotides, where N represents the number of mutagenized sites. For a saturation mutagenesis library containing three randomized sites, however, the minimal library size is 8000 for the small-intelligent system and 32,768 for NNS randomization, so an extensive screening effort is needed in both cases. In response to such drawbacks, Reetz et al. reported their iterative CASTing method (23), which has proven to be a valuable means of accelerating enzyme evolution. The essence of the methodology is to construct several small focused libraries (each containing only two spatially close randomized sites) and to combine the positive hits by repeating the process until final hits with the desired catalytic properties are obtained. For such a small library, about 3100 transformants are required for 95% library coverage using NNS randomization, whereas only about 1200 transformants are needed with our small-intelligent system. The screening effort is thus reduced by a factor of about 2.6. The following case studies therefore focused mainly on library construction when no more than two sites are randomized.
Theoretical analysis and case studies
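The library sizes and transformant numbers quoted for NNS and small-intelligent randomization can be reproduced with a short calculation. This sketch assumes the commonly used oversampling estimate T = −V ln(1 − F), where V is the number of equally likely variants and F is the desired coverage; the text does not state which formula was used, so this is an assumption, and the function name `transformants_needed` is hypothetical.

```python
import math

def transformants_needed(codons_per_site, sites, coverage=0.95):
    """Estimate the transformants required to reach the given library coverage.

    Assumes all codon combinations are equally likely and applies
    T = -V * ln(1 - F), with V = codons_per_site ** sites.
    """
    variants = codons_per_site ** sites
    return -variants * math.log(1.0 - coverage)

# Library sizes at the DNA level: 32 codons per site for NNS,
# 20 codons per site for the small-intelligent mixture
print("three sites:", 32 ** 3, "(NNS) vs", 20 ** 3, "(small-intelligent)")

nns_two_sites = transformants_needed(32, 2)
small_intelligent_two_sites = transformants_needed(20, 2)
print("two sites, 95% coverage:",
      round(nns_two_sites), "(NNS) vs",
      round(small_intelligent_two_sites), "(small-intelligent)")
print("reduction factor:", round(nns_two_sites / small_intelligent_two_sites, 2))
```

With these assumptions the estimate gives roughly 3070 and 1200 transformants for the two-site case, a ratio of about 2.6, in line with the figures above.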
To evaluate the feasibility of the small-intelligent strategy, one active-site residue of HheA was fully randomized using both conventional NNS randomization and the small-intelligent method. For each library, about 55 randomly picked colonies were sequenced, and the encoded amino acids were analyzed. In the small-intelligent library, codons for all 20 amino acids were identified, and no rare codons or stop codons were observed. In the conventional NNS randomized library, only 16 amino acids were obtained, and two stop codons and three rare E. coli codons were present.
Moreover, the amino acid biases encoded by the above gene sequences were analyzed. Theoretically, for NNS randomization (Figure 2A), Ser, Arg, and Leu have the highest frequency of occurrence, followed by Ala, Gly, Pro, Thr, and Val, while the remaining amino acids have the lowest frequency of occurrence; such amino acid biases can be eliminated by applying the small-intelligent strategy (Figure 2B). Our results correlated well with the theoretical amino acid bias expectations for both NNS and small-intelligent randomization. In our experimental data from NNS randomization, Ser had the highest redundancy, followed by Arg and Leu, while Glu, Gln, Asn, and Lys, which are among the amino acids with the lowest theoretical frequency of occurrence, were not obtained (Figure 2C). In comparison, the amino acid biases observed in the small-intelligent library are much less pronounced than those in the constructed NNS mutagenesis library (Figure 2D), and they might become even smaller with a larger sequencing sample. Taken together, these results demonstrate that randomization using the small-intelligent strategy can greatly improve the quality of the constructed library by reducing library size and inherent amino acid biases, which in turn facilitates library screening.
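The theoretical bias pattern described above follows directly from counting how many codons each amino acid receives in a given degenerate scheme. The sketch below illustrates this using the standard genetic code; the function name `codon_counts` is hypothetical. It also assumes that every codon in a mixture is equally represented, which for the small-intelligent mixture would require mixing the four primers in proportion to the number of codons each contributes (an assumption not stated in this section).

```python
from collections import Counter
from itertools import product

# IUPAC degenerate base codes that appear in NDT, VMA, and NNS
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "D": "AGT", "V": "ACG", "M": "AC", "S": "CG"}

# Standard genetic code, codons ordered with T, C, A, G at each position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codon_counts(degenerate_codons):
    """Count codons per amino acid ('*' = stop) for a set of degenerate codons."""
    counts = Counter()
    for degenerate in degenerate_codons:
        for bases in product(*(IUPAC[b] for b in degenerate)):
            counts[CODON_TABLE["".join(bases)]] += 1
    return counts

nns_counts = codon_counts(["NNS"])
si_counts = codon_counts(["NDT", "VMA", "ATG", "TGG"])

# Group amino acids by their number of codons to expose the bias tiers
for name, counts in (("NNS", nns_counts), ("small-intelligent", si_counts)):
    tiers = {}
    for amino_acid, n in sorted(counts.items()):
        tiers.setdefault(n, []).append(amino_acid)
    print(name, dict(sorted(tiers.items(), reverse=True)))
```

Under these assumptions NNS yields three tiers (three codons for Ser, Arg, and Leu; two for Ala, Gly, Pro, Thr, and Val; one for each of the rest and for the stop codon), while the small-intelligent mixture yields a single tier of one codon per amino acid.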