Multi-tag pyrosequencing has become a key method in the analysis of microbial community composition. However, it is well known that kinetic bias during the initial PCR amplification of such microbial communities can dramatically distort amplicon abundance prior to downstream emulsion PCR and pyrosequencing. Here we present a simple protocol combining length-heterogeneity PCR fingerprinting with pyrosequencing to ensure the linearity of microbial community amplification. The method employs a fluorescently labeled reverse primer along with multi-tagged forward primers to initially amplify the microbial community. The resulting labeled amplicons are then fingerprinted, purified, and quantitated prior to emulsion PCR and pyrosequencing. Our data demonstrate that (i) use of this protocol results in a distribution of sequences showing linear amplification following emulsion PCR when compared with the initial length-heterogeneity PCR fingerprints, and (ii) the added tags and labels do not have a negative effect on overall microbial community profiles.
Amplification of homologous genes using PCR involves complex hybridization kinetics, and amplification bias is inevitable (1, 2). Several factors, including the GC content of the template, length of the target amplicon, structural differences, and binding bias of primers, may contribute to this bias (3). Variability in PCR efficiency is also commonly observed, and this may distort the amplicon abundances and subsequently affect downstream processes (3-6). Therefore, information on the relative efficiency of PCR amplification in mixed-template samples is essential to the interpretation of these experiments, and various methods have been introduced to minimize kinetic bias (1, 2).
Multi-tag pyrosequencing (MTPS) has revolutionized the field of microbial ecology, enabling the routine analysis of microbial communities from many different environments (7-9). As a new quality control step in pyrosequencing of microbial communities, we describe the use of length heterogeneity PCR (LH-PCR) fingerprinting to optimize the amount of microbial community DNA prior to the MTPS process (2, 10). Our novel protocol combines the fingerprinting technique and pyrosequencing methodology to ensure linearity in microbial community amplification. Specifically, we utilize a FAM (6-carboxyfluorescein) labeled reverse primer along with multi-tagged forward primers to amplify the community; these tagged products can then be used directly in the emulsion PCR process for multitag pyrosequencing.
To test our protocol, we designed different tags/barcodes to be added to the universal forward fungal ITS or bacterial 16S primers (Supplementary Table 1) in a microtiter plate format (Life Technologies, Grand Island, NY, USA). Bacterial primers were designed to amplify the first two variable regions of the 16S rDNA, while fungal primers amplified the ITS1 portion of the ITS region. The reverse primers had a FAM (6-carboxyfluorescein) label for use in fingerprinting. The forward and reverse primers had the titanium adapters A or B (Roche Diagnostics, Indianapolis, IN, USA), respectively, for emulsion PCR.
We used a standard protocol for the initial LH-PCR using Taq Gold DNA polymerase (Applied Biosystems, Foster City, CA, USA) and FAM-labeled reverse primer (8, 11). The PCR products were visualized on a 1% agarose gel and diluted according to the strength of the products (1:10–1:20 dilutions). They were then mixed with ILS600 size standard (Promega, Madison, WI, USA) in a 1:20 ratio with HiDi formamide and run on an ABI 3130XL fluorescent sequencer (Applied Biosystems). The results were analyzed using the GeneMapper software v4.1 (Applied Biosystems) and the community profiles were assessed for reproducibility. The same PCR product was purified with Agencourt Ampure magnetic beads (Beckman Coulter, Indianapolis, IN, USA) and quantified with a DTX880 Multimode Fluorescent detector (Beckman Coulter) using excitation at 485 nm and emission at 535 nm. The appropriate number of molecules of purified product was calculated using all the peak areas from the LH-PCR fingerprint and used in emulsion PCR according to the manufacturer's protocol (Roche Diagnostics) with minor modifications.
It is important to note that the total yield of PCR products often varies among samples with comparable amounts of starting DNA. For example, when DNA from a tissue biopsy sample is extracted, both host eukaryotic DNA and target bacterial DNA will be isolated (data not shown). Thus, each individual community PCR needs to be optimized prior to pooling for the MTPS emulsion PCR, as excess DNA can inhibit the reaction. Typically, we perform PCR on multiple dilutions of a community DNA both to optimize the signal strength and to assure linear amplification, i.e., that there is no kinetic bias in the amplification. Figure 1 depicts the LH-PCR of three dilutions of a community DNA sample. The data show that the 1:100 dilution and 32 cycles produce the optimum community profile for this sample.
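The dilution comparison can be made concrete by normalizing each fingerprint to relative peak areas and checking that the profile stays stable from one dilution to the next. The following Python sketch assumes that peak tables are available as mappings from fragment size to peak area; the function names, the example peak values, and the 0.05 tolerance are hypothetical illustrations and are not part of the protocol described here.

```python
def relative_profile(peaks):
    """Convert {fragment_size_bp: peak_area} to relative abundances summing to 1."""
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("fingerprint has no signal")
    return {size: area / total for size, area in peaks.items()}


def max_profile_shift(peaks_a, peaks_b):
    """Largest absolute change in relative abundance of any peak between two runs."""
    a = relative_profile(peaks_a)
    b = relative_profile(peaks_b)
    sizes = set(a) | set(b)  # a peak missing from one run counts as zero
    return max(abs(a.get(s, 0.0) - b.get(s, 0.0)) for s in sizes)


# Hypothetical peak areas (arbitrary fluorescence units) for two dilutions.
dilution_1_to_10 = {312: 4000, 340: 3000, 355: 3000}
dilution_1_to_100 = {312: 410, 340: 290, 355: 300}

shift = max_profile_shift(dilution_1_to_10, dilution_1_to_100)
TOLERANCE = 0.05  # assumed acceptance threshold, to be tuned per laboratory
print(f"max shift = {shift:.3f}; stable = {shift <= TOLERANCE}")
```

A small shift suggests that the relative peak heights are independent of template concentration over the tested range, whereas a large shift would point to kinetic bias or inhibition at one of the dilutions.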
Previously, we had used the PicoGreen assay (Life Technologies) or agarose gel imaging to quantify the PCR products. However, both of these methods measure total DNA mass, so the number of molecules must be estimated from an average product size. We, along with other groups, have observed variability in the number of reads per sample after pooled MTPS (12). To reduce this variability, the FAM-labeled LH-PCR amplicons are used to quantify the number of molecules in each of the multi-tagged products prior to pooling. By simply summing the areas of all the peaks in a sample LH-PCR fingerprint, we can normalize amplification products prior to sequencing. Thus, one has a direct measure of the number of molecules in each sample and can pool an equal number of molecules for each multi-tagged sample.
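The normalization step can be illustrated with a short calculation. The sketch below is written in Python and assumes that the summed peak area of a fingerprint is proportional to the number of FAM-labeled molecules per unit volume, and that all samples were diluted and injected under identical conditions. The function name, the sample names, the peak areas, and the 5 microliter target volume are hypothetical and do not come from the protocol.

```python
def pooling_volumes(total_areas, target_volume_ul=5.0):
    """Return the volume (microliters) of each sample to pool for equal signal.

    total_areas: {sample_name: summed LH-PCR peak area per microliter of product}
    The weakest sample contributes target_volume_ul; samples with more signal
    contribute proportionally less, so area * volume is equal for all samples.
    """
    if any(area <= 0 for area in total_areas.values()):
        raise ValueError("every sample needs a positive summed peak area")
    weakest = min(total_areas.values())
    return {name: target_volume_ul * weakest / area
            for name, area in total_areas.items()}


# Hypothetical fingerprints: one list of peak areas per tagged sample.
fingerprints = {
    "sample_A": [1200.0, 800.0, 2000.0],
    "sample_B": [500.0, 1500.0],
    "sample_C": [4000.0, 2500.0, 1500.0],
}

# Sum all peak areas in each fingerprint, then derive pooling volumes.
total_areas = {name: sum(areas) for name, areas in fingerprints.items()}
volumes = pooling_volumes(total_areas)

for name in sorted(volumes):
    print(f"{name}: pool {volumes[name]:.2f} microliters")
```

Because the calculation relies only on labeled-molecule signal, it does not need an average fragment size, which is the quantity that mass-based assays have to assume.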
Following the initial PCR optimization steps, recovered emulsion PCR beads were subjected to pyrosequencing on a Roche GS Junior instrument according to the original protocol (Roche Diagnostics), with some minor modifications. Figure 2 and Table 1 compare the original protocol, using 50 emPCR cycles and loading 500,000 emPCR beads, with our optimized protocol, using 40 emPCR cycles and loading 360,000 emPCR beads. Under typical conditions, 40 cycles on an MJ Research PTC-200 thermal cycler (BioRad Laboratories, Hercules, CA, USA) gave more uniform signal strength over the exposure than the 50 cycles suggested in the original protocol (Supplementary Figure 1B vs. 1A).
Data analysis (Table 1) demonstrated that our optimized 40-cycle protocol with fewer beads loaded yields more passed-filter reads (137,181 versus 82,496) as well as fewer short-quality reads (5,982 versus 25,606), partly because of less crosstalk among the wells. We also examined the effect of differential loading of the same emulsion PCR on pyrosequencing (Supplementary Figure 2 and Supplementary Table 2). When twice as many recovered beads from the same emulsion PCR were loaded in the second run (MTPS42), there were 1.5 times more raw reads. However, loading this number of beads also resulted in fewer passed-filter reads (94,733 vs. 102,531) than the standard procedure after analysis with a 250 bp cutoff.
An important implication of our protocol is that accurate quantitation of the number of molecules in the pooled samples using the FAM-labeled products facilitates the emPCR step in the MTPS process, because this measurement is directly correlated with the number of molecules in the pooled sample and is independent of the size distribution of the fragments. This allows for much greater flexibility in the size range of PCR products that can be pooled, and permits routine pooling of archaeal, bacterial, and fungal PCR products in the same MTPS run. Additionally, the size distribution of the LH-PCR fingerprint and the size histogram of the reads from the GS Junior sequencing run can be compared to determine whether the size profiles are similar for quality assurance (Figure 3). This helps to determine whether short reads after an MTPS run originate from the initial community PCR or from the pyrosequencing process itself, thus allowing the process to be optimized.
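One possible way to put a number on the comparison between fingerprint and read-length histogram is to bin both into the same size classes and measure how much of the two normalized profiles is shared. The Python sketch below is only an illustration under stated assumptions: the function names, the 10 bp bin width, the `offset_bp` parameter (an assumed length difference between labeled amplicons and trimmed reads), and all example values are hypothetical and were not used in this study.

```python
from collections import Counter


def binned_profile(weighted_sizes, bin_width=10):
    """Sum weights into size bins and scale the result to fractions of the total."""
    bins = Counter()
    for size, weight in weighted_sizes:
        bins[int(size // bin_width) * bin_width] += weight
    total = sum(bins.values())
    if total <= 0:
        raise ValueError("profile has no signal")
    return {start: weight / total for start, weight in bins.items()}


def profile_overlap(fingerprint_peaks, read_lengths, offset_bp=0, bin_width=10):
    """Shared fraction (0 to 1) between a fingerprint and a read-length histogram.

    fingerprint_peaks: {fragment_size_bp: peak_area}
    read_lengths: iterable of read lengths in bp
    offset_bp: assumed length added to each read to match amplicon sizes
    """
    fp = binned_profile(fingerprint_peaks.items(), bin_width)
    rd = binned_profile(((n + offset_bp, 1) for n in read_lengths), bin_width)
    # For two profiles that each sum to 1, the summed minimum is the overlap.
    return sum(min(fp.get(b, 0.0), rd.get(b, 0.0)) for b in set(fp) | set(rd))


# Hypothetical fingerprint and two hypothetical sets of read lengths.
fingerprint = {312: 500, 340: 300, 355: 200}
clean_reads = [313] * 50 + [341] * 30 + [356] * 20
reads_with_short_fraction = clean_reads + [120] * 100

clean_overlap = profile_overlap(fingerprint, clean_reads)
short_overlap = profile_overlap(fingerprint, reads_with_short_fraction)
print(f"clean run: {clean_overlap:.2f}; run with short reads: {short_overlap:.2f}")
```

In this sketch, short reads that have no counterpart in the fingerprint lower the overlap, which would point to the sequencing step rather than the community PCR as their source. Peaks that fall close to a bin boundary can be split between bins, so the bin width should be chosen with the sizing precision of the instrument in mind.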
Additionally, one issue that our new quality control protocol can directly address is the impact of contamination in microbial community analysis, since the same sample is both fingerprinted and sequenced for quality assurance. The fingerprinting can reveal background amplification even when no band is visible in an ethidium bromide-stained agarose gel. We routinely include negative controls with different tags in our MTPS runs. Occasionally we observe a low number of reads in the negative controls after pyrosequencing, which most often do not map to the same taxa as the sample. Interpretation of such data after sequencing is at the researcher's discretion, i.e., one can subtract these taxa from the community analysis or ignore them.
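For readers who choose to subtract negative-control taxa, the bookkeeping might look like the following Python sketch. It assumes per-taxon read counts are already available for the sample and for the negative control; the function name, the `min_control_reads` threshold, and the taxon names and counts are hypothetical examples, not results from this work.

```python
def remove_control_taxa(sample_counts, control_counts, min_control_reads=1):
    """Drop taxa from a sample that were also detected in the negative control.

    sample_counts, control_counts: {taxon_name: read_count}
    min_control_reads: assumed minimum control reads for a taxon to be flagged
    Returns (kept_counts, removed_counts) so that removals remain auditable.
    """
    flagged = {taxon for taxon, n in control_counts.items()
               if n >= min_control_reads}
    kept = {t: n for t, n in sample_counts.items() if t not in flagged}
    removed = {t: n for t, n in sample_counts.items() if t in flagged}
    return kept, removed


# Hypothetical read counts for one tagged sample and its negative control.
sample = {"Bacteroides": 5200, "Faecalibacterium": 3100, "Ralstonia": 12}
negative_control = {"Ralstonia": 9, "Bradyrhizobium": 4}

kept, removed = remove_control_taxa(sample, negative_control)
print("kept:", kept)
print("removed:", removed)
```

Returning the removed taxa alongside the filtered profile keeps the decision visible, which matters because the choice between subtracting and ignoring control taxa is left to the researcher.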
In summary, we present a novel protocol that combines LH-PCR fingerprinting and pyrosequencing to ensure linearity of microbial community amplification through accurate quantitation of the number of molecules in a sample. Our approach enhances the quality control of the entire process and facilitates a clearer understanding of the organismal composition of various microbial communities.
This work was supported by 1 RC2 AA019405-01 from NIH. This paper is subject to the NIH Public Access Policy.
PMG has a financial interest in BioSpherex, LLC.
Address correspondence to Masoumeh Sikaroodi, M.Sc., Department of Environmental Science and Policy, George Mason University, 10900 University Blvd, MSN 4D4, Manassas, VA, USA. Email: [email protected]
1.) Suzuki, M.T., and S.J. Giovannoni. 1996. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl. Environ. Microbiol. 62:625-630.
2.) Suzuki, M., M.S. Rappe, and S.J. Giovannoni. 1998. Kinetic bias in estimates of coastal picoplankton community structure obtained by measurements of small-subunit rRNA gene PCR amplicon length heterogeneity. Appl. Environ. Microbiol. 64:4522-4529.
3.) Alon, S., F. Vigneault, S. Eminaga, D.C. Christodoulou, J.G. Seidman, G.M. Church, and E. Eisenberg. 2011. Barcoding bias in high-throughput multiplex sequencing of miRNA. Genome Res. 21:1506-1511.
4.) Polz, M.F., and C.M. Cavanaugh. 1998. Bias in Template-to-Product Ratios in Multitemplate PCR. Appl. Environ. Microbiol. 64:3724-3730.
5.) Reysenbach, A.L., L.J. Giver, G.S. Wickham, and N.R. Pace. 1992. Differential Amplification of rRNA Genes by Polymerase Chain Reaction. Appl. Environ. Microbiol. 58:3417-3418.
6.) Wagner, A., N. Blackstone, P. Cartwright, M. Dick, B. Misof, P. Snow, G.P. Wagner, J. Bartels, M. Murtha, and J. Pendleton. 1994. Surveys of Gene Families Using Polymerase Chain Reaction: PCR Selection and PCR Drift. Syst. Biol. 43:250-261.
7.) Andersson, A.F., M. Lindberg, H. Jakobsson, F. Bäckhed, P. Nyrén, and L. Engstrand. 2008. Comparative analysis of human gut microbiota by barcoded pyrosequencing. PLoS One 3:e2836.
8.) Gillevet, P., M. Sikaroodi, A. Keshavarzian, and E.A. Mutlu. 2010. Quantitative assessment of the human gut microbiome using multitag pyrosequencing. Chem. Biodivers. 7:1065-1075.
9.) Wu, G.D., J.D. Lewis, C. Hoffmann, Y.Y. Chen, R. Knight, K. Bittinger, J. Hwang, J. Chen. 2010. Sampling and pyrosequencing methods for characterizing bacterial communities in the human gut using 16S sequence tags. BMC Microbiol. 10:206-215.
10.) Mills, D.K., K. Fitzgerald, C.D. Litchfield, and P.M. Gillevet. 2003. A comparison of DNA profiling techniques for monitoring nutrient impact on microbial community composition during bioremediation of petroleum-contaminated soils. J. Microbiol. Methods 54:57-74.
11.) Gillevet, P.M., M. Sikaroodi, and A.P. Torzilli. 2009. Analyzing salt-marsh fungal diversity: comparing ARISA fingerprinting with clone sequencing and pyrosequencing. Fungal Ecol. 2:160-167.
12.) Harris, J.K., J.W. Sahl, T.A. Castoe, B.D. Wagner, D.D. Pollock, and J.R. Spear. 2010. Comparison of normalization methods for construction of large, multiplex amplicon pools for next-generation sequencing. Appl. Environ. Microbiol. 76:3863-3868.