Finally, the technique was used to sequence linear molecules of Candidatus Phytoplasma mali, a plant-pathogenic mycoplasma with a small genome of ∼600 kb that is 21.4% GC and is characterized by large terminal inverted repeats and covalently closed hairpin ends (20). The DNA was sheared to approximately 3-kb fragments, and a 25-ng aliquot was sequenced using random hexamers in a manner similar to that described previously. From a single SMRT cell with 2 × 45 min movies, only 870 post-filter reads were generated, of which 63 mapped, with a mean consensus accuracy of 84.4%. The mean mapped read length was 817 bp and the coverage was only 0.08×. The poor mapping rate (63 of 870 reads, roughly 7%) is most likely due to a greater percentage of low-quality reads from this particular sample. Although the yield is poor, direct sequencing of these linear DNA molecules also shows some promise. A blastn (21) search using the NCBI server against the refseq_genomic database identified Candidatus Phytoplasma mali as the most likely taxonomic hit (Supplementary Table 1). This suggests that it is possible to obtain enough information from very few mapped reads to begin to identify the genomes present in a sample. However, the difference in data yield between the S. aureus and Ca. Phytoplasma mali samples makes it clear that further optimization of the method is required to improve the number of reads that can be mapped when sequencing linear molecules from a variety of genomic samples.
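The read statistics quoted for Ca. Phytoplasma mali can be checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the function and variable names are illustrative, the genome size is the approximate ∼600 kb figure given in the text, and coverage is assumed to mean average depth (total mapped bases divided by genome length).

```python
# Back-of-the-envelope check of the mapping statistics quoted in the text.
# All inputs come from the text; genome size is only approximate (~600 kb),
# so the derived values are estimates rather than reported results.


def mapping_rate(mapped_reads: int, total_reads: int) -> float:
    """Fraction of post-filter reads that mapped to the reference."""
    return mapped_reads / total_reads


def fold_coverage(mapped_reads: int, mean_read_length_bp: float,
                  genome_size_bp: int) -> float:
    """Average depth: total mapped bases divided by genome length."""
    return mapped_reads * mean_read_length_bp / genome_size_bp


post_filter_reads = 870
mapped_reads = 63
mean_mapped_read_length_bp = 817
genome_size_bp = 600_000  # assumption: the approximate size given in the text

mapped_bases = mapped_reads * mean_mapped_read_length_bp
rate = mapping_rate(mapped_reads, post_filter_reads)
depth = fold_coverage(mapped_reads, mean_mapped_read_length_bp, genome_size_bp)

print(f"mapped bases:  {mapped_bases}")
print(f"mapping rate:  {rate:.1%}")
print(f"fold coverage: {depth:.3f}")
```

Under these assumptions, about 51.5 kb of sequence mapped, the mapping rate is roughly 7%, and the average depth is a little under 0.09-fold, so the coverage value of 0.08 reported above appears to correspond to fold coverage of the genome.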
The method described here uses the PacBio RS platform for direct sequencing, enabling the generation of sequence data from small single- and double-stranded DNA genomes. This technique could potentially also be applied to circularized molecules, e.g., amplicons or sheared fragments that have been circularized. However, the additional circularization and cleanup steps would leave only relatively minor savings in time and DNA compared with current PacBio protocols. The direct sequencing technique could allow plasmids present in a bacterial sample to be identified in an extremely straightforward and fast manner. Although there is an indication that different genomes may be more or less accessible with this method, we have demonstrated its application to sequencing ssDNA and dsDNA viruses, plasmid vector models for methylation studies, antibiotic resistance gene-carrying plasmids, and the entire genome of a clinically relevant microbial pathogen. All of these were sequenced without the need for library preparation, and it is possible to generate sequence data within 8 h from <1 ng of DNA without a PCR amplification step. The fact that our method can be performed without a priori knowledge of any sequence and with no organism-specific reagents, coupled with its simplicity and speed, makes it particularly well suited for use in acute disease and infectious outbreak scenarios.
We would like to thank NEB for providing the M13 primers, which were not available for purchase at the time; Sascha Sauer of the Max-Planck-Institute for Molecular Genetics for isolating and providing the Candidatus Phytoplasma mali sample; Theresa Feltwell of The Wellcome Trust Sanger Institute for culturing S. aureus TW20 and performing the plasmid prep; and Albert Jeltsch and Tomasz Jurkowski for providing the original Dam constructs. This work was supported by Wellcome Trust grant 098051 (PC, MQ, HS) and The Cambridge Cancer Center (TC, WR). ESGI – The research leading to these results has received funding from the Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 262055.
The authors declare no competing interests.
Address correspondence to Paul Coupland, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, UK. Email: [email protected]
1.) Eid, J., A. Fehr, J. Gray, K. Luong, J. Lyle, G. Otto, P. Peluso, D. Rank. 2009. Real-time DNA sequencing from single polymerase molecules. Science 323:133-138.
2.) Korlach, J., A. Bibillo, J. Wegener, P. Peluso, T.T. Pham, I. Park, S. Clark, G.A. Otto, and S.W. Turner. 2008. Long, processive enzymatic DNA synthesis using 100% dye-labeled terminal phosphate-linked nucleotides. Nucleosides Nucleotides Nucleic Acids 27:1072-1083.
3.) Korlach, J., K.P. Bjornson, B.P. Chaudhuri, R.L. Cicero, B.A. Flusberg, J.J. Gray, D. Holden, R. Saxena. 2010. Real-time DNA sequencing from single polymerase molecules. Methods Enzymol. 472:431-455.
4.) Levene, M.J., J. Korlach, S.W. Turner, M. Foquet, H.G. Craighead, and W.W. Webb. 2003. Zero-mode waveguides for single-molecule analysis at high concentrations. Science 299:682-686.
5.) McCarthy, A. 2010. Third generation DNA sequencing: Pacific Biosciences’ single molecule real time technology. Chem. Biol. 17:675-676.
6.) Schadt, E.E., S. Turner, and A. Kasarskis. 2010. A window into third-generation sequencing. Hum. Mol. Genet. 19:R227-R240.
7.) Carneiro, M.O., C. Russ, M.G. Ross, S.B. Gabriel, C. Nusbaum, and M.A. Depristo. 2012. Pacific Biosciences sequencing technology for genotyping and variation discovery in human data. BMC Genomics 13:375.
8.) Rasko, D.A., D.R. Webster, J.W. Sahl, A. Bashir, N. Boisen, F. Scheutz, E.E. Paxinos, R. Sebra. 2011. Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany. N. Engl. J. Med. 365:709-717.
9.) Travers, K.J., C.S. Chin, D.R. Rank, J.S. Eid, and S.W. Turner. 2010. A flexible and efficient template format for circular consensus sequencing and SNP detection. Nucleic Acids Res. 38:e159.
10.) Clark, T.A., I.A. Murray, R.D. Morgan, A.O. Kislyuk, K.E. Spittle, M. Boitano, A. Fomenkov, R.J. Roberts, and J. Korlach. 2012. Characterization of DNA methyltransferase specificities using single-molecule, real-time DNA sequencing. Nucleic Acids Res. 40:e29.
11.) Song, C.X., T.A. Clark, X.Y. Lu, A. Kislyuk, Q. Dai, S.W. Turner, C. He, and J. Korlach. 2012. Sensitive and specific single-molecule sequencing of 5-hydroxymethylcytosine. Nat. Methods 9:75-77.
12.) Murray, I.A., T.A. Clark, R.D. Morgan, M. Boitano, B.P. Anton, K. Luong, A. Fomenkov, S.W. Turner. 2012. The methylomes of six bacteria. Nucleic Acids Res.
13.) Holden, M.T., J.A. Lindsay, C. Corton, M.A. Quail, J.D. Cockfield, S. Pathak, R. Batra, J. Parkhill. 2010. Genome sequence of a recently emerged, highly transmissible, multi-antibiotic- and antiseptic-resistant variant of methicillin-resistant Staphylococcus aureus, sequence type 239 (TW). J. Bacteriol. 192:888-892.
14.) Robicsek, A., G.A. Jacoby, and D.C. Hooper. 2006. The worldwide emergence of plasmid-mediated quinolone resistance. Lancet Infect. Dis. 6:629-640.
15.) Svara, F., and D.J. Rankin. 2011. The evolution of plasmid-carried antibiotic resistance. BMC Evol. Biol. 11:130.
16.) Haenni, M., E. Saras, V. Metayer, B. Doublet, A. Cloeckaert, and J.Y. Madec. 2012. Spread of the blaTEM-52 gene is mainly ensured by IncI1/ST36 plasmids in Escherichia coli isolated from cattle in France. J. Antimicrob. Chemother.
17.) Miró, E., C. Segura, F. Navarro, L. Sorli, P. Coll, J.P. Horcajada, F. Alvarez-Lerma, and M. Salvadó. 2010. Spread of plasmids containing the bla(VIM-1) and bla(CTX-M) genes and the qnr determinant in Enterobacter cloacae, Klebsiella pneumoniae and Klebsiella oxytoca isolates. J. Antimicrob. Chemother. 65:661-665.
18.) Valverde, A., R. Canton, M.P. Garcillan-Barcia, A. Novais, J.C. Galan, A. Alvarado, F. de la Cruz, F. Baquero, and T.M. Coque. 2009. Spread of bla(CTX-M-14) is driven mainly by IncK plasmids disseminated among Escherichia coli phylogroups A, B1, and D in Spain. Antimicrob. Agents Chemother. 53:5204-5212.
19.) Dionisio, F., I. Matic, M. Radman, O.R. Rodrigues, and F. Taddei. 2002. Plasmids spread very fast in heterogeneous bacterial communities. Genetics 162:1525-1532.
20.) Kube, M., B. Schneider, H. Kuhl, T. Dandekar, K. Heitmann, A.M. Migdoll, R. Reinhardt, and E. Seemuller. 2008. The linear chromosome of the plant-pathogenic mycoplasma ‘Candidatus Phytoplasma mali’. BMC Genomics 9:306.
21.) Altschul, S.F., T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D.J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.