2 Department of Microbiology and Plant Pathology, Forestry and Agricultural Biotechnology Institute, University of Pretoria, South Africa
3 Department of Biochemistry, Forestry and Agricultural Biotechnology Institute, University of Pretoria, South Africa
Robust molecular markers such as microsatellites are important tools used to understand the dynamics of natural populations, but their identification and development are typically time consuming and labor intensive. The recent emergence of so-called next-generation sequencing raised the question as to whether this new technology might be applied to microsatellite development. Following this view, we considered whether deep sequencing using the 454 Life Sciences/Roche GS-FLX genome sequencing system could lead to a rapid protocol to develop microsatellite primers as markers for genetic studies. For this purpose, genomic DNA was sourced from three unrelated organisms: a fungus (the pine pathogen Fusarium circinatum), an insect (the pine-damaging wasp Sirex noctilio), and the wasp's associated nematode parasite (Deladenus siricidicola). Two methods, FIASCO (fast isolation by AFLP of sequences containing repeats) and ISSR-PCR (inter-simple sequence repeat PCR), were used to generate microsatellite-enriched DNA for the 454 libraries. From the resulting 1.2–1.7 megabases of DNA sequence data, we were able to identify 873 microsatellites that have sufficient flanking sequence available for primer design and potential amplification. This approach to microsatellite discovery was substantially more rapid, effective, and economical than other methods, and this study has shown that pyrosequencing provides an outstanding new technology that can be applied to this purpose.
Microsatellites, or simple sequence repeats (SSRs), are DNA sequences that consist of tandem repeats of 1–6 nucleotides and occur at varying frequencies in the genomes of nearly every known organism and organelle (1). They belong to a class of highly mutable genomic sequences known as variable number of tandem repeat (VNTR) elements (2,3) that show extensive intraspecific polymorphism in both eukaryotic (4,5,6) and prokaryotic (7,8) genomes. Because of their ease of use, co-dominance, and high levels of polymorphism (9), microsatellites have been particularly valuable in genome mapping, forensics, paternity testing, population genetics, conservation or management of biological resources, and molecular typing of microbial strains (9,10,11,12).
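To make the definition above concrete, the following is a minimal sketch of how tandem repeats with a unit of 1–6 nucleotides might be located in a sequence with a regular expression. It is an illustration only and not the procedure used in this study; the function name `find_ssrs` and the default threshold of five copies are assumptions chosen for the example.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return (start, end, unit, copies) for each tandem repeat found in seq.

    Thresholds are illustrative assumptions, not values from the study.
    """
    seq = seq.upper()
    # The lazy quantifier tries the shortest repeat unit first; the
    # backreference \1 then requires further identical copies in tandem.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits

# Example: a (CA)6 dinucleotide repeat embedded in unique sequence.
print(find_ssrs("GGATCACACACACACATTGC"))
```

Coordinates are 0-based and end-exclusive, following Python slicing. A stricter screen would probably apply different minimum copy numbers to each unit length, since short mononucleotide runs are common and rarely useful as markers.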
Both the identification and development of microsatellite markers represent significant challenges. This is especially true for organisms with little or no available sequence data, where the development of microsatellite markers requires the protracted steps of generating clone libraries and sequencing them (9,13). For species with known genome sequences, in silico scanning of genome databases using bioinformatics tools can be used to identify microsatellites and to design primers targeting these regions (5,11). However, genome sequences are available for relatively few eukaryotes, and generating them is generally beyond the limited budgets of most research programs. In addition, microsatellite loci often cannot be transferred across distantly related species (14) and usually need to be identified and characterized de novo for each species, which can be a time-intensive and expensive exercise. In general, the success with which microsatellite markers are obtained and the size of the clone libraries that must be constructed are related to the frequency of microsatellite sequences in the genome of interest (15,16). However, the frequency of microsatellites in the genomes of plants, animals, fungi, and prokaryotes has been reported to differ significantly (5), and in some cases researchers have reported extreme difficulty in obtaining any microsatellite sequences (17).
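The in silico approach described above depends on each repeat being flanked by enough unique sequence for a primer to be placed on either side. The sketch below illustrates that filtering step under stated assumptions: the function name `primer_candidates`, the five-copy minimum, and the 20-nucleotide flank threshold are hypothetical and are not taken from this study or from any particular primer design tool.

```python
import re

# Repeat unit of 1-6 nt present in at least five tandem copies
# (an assumed threshold for illustration).
SSR = re.compile(r"([ACGT]{1,6}?)\1{4,}")

def primer_candidates(reads, min_flank=20):
    """Yield (read_id, unit, copies, left_flank, right_flank) for repeats
    that leave at least min_flank nucleotides on both sides."""
    for read_id, seq in reads.items():
        seq = seq.upper()
        for m in SSR.finditer(seq):
            left, right = seq[:m.start()], seq[m.end():]
            # Repeats too close to either end of the read are skipped,
            # because no primer could be designed on that side.
            if len(left) >= min_flank and len(right) >= min_flank:
                unit = m.group(1)
                yield read_id, unit, len(m.group(0)) // len(unit), left, right

# Hypothetical reads: r1 has flanks on both sides, r2 starts with the repeat.
reads = {
    "r1": "ACGTACGGTCATGCAGTTCA" + "AG" * 8 + "TCGGATCCATGCAATGCCTA",
    "r2": "AG" * 8 + "TCGGATCCATGCAATGCCTA",
}
for read_id, unit, copies, left, right in primer_candidates(reads):
    print(read_id, unit, copies, len(left), len(right))
```

In practice the retained flanks would be passed to a primer design program, which would also check melting temperature and product size; those steps are outside the scope of this sketch.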
A number of methods are available to identify microsatellites (17). Of these, the most commonly used employ targeted enrichment of DNA for microsatellites (16,18). One is known as inter-simple sequence repeat PCR (ISSR-PCR) (19). In this procedure, ISSR primers, which contain microsatellite motifs and three anchoring nucleotides at the 5′ terminal end, are used to amplify regions of the genome that are thought to be abundant in microsatellites. The PCR products are cloned and subsequently sequenced to determine the presence of microsatellite sequences (20). More recently, DNA enrichment strategies involving hybridization of probes containing microsatellite sequences to genomic DNA fragments have been introduced (16). After exclusion of the non-hybridized DNA that presumably lacks repeat regions, the remaining microsatellite-rich fragments are cloned and sequenced to identify the microsatellite sequences. One of these approaches, known as fast isolation by AFLP of sequences containing repeats (FIASCO), also uses amplified fragment length polymorphism (AFLP) (21) to aid in the enrichment process. Both the ISSR-PCR and FIASCO methods have been widely used in contemporary studies to isolate microsatellites from a wide variety of eukaryotic species (22,23,24,25).
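The structure of an ISSR primer as described above (three anchoring nucleotides at the 5′ end followed by a microsatellite motif) can be sketched in a few lines. The example assumes that the anchor is written with standard IUPAC degenerate base codes and expands it into the individual primer sequences it represents. The function name `issr_primers` and the anchor, motif, and copy number shown are hypothetical; they are not the primers used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def issr_primers(anchor, motif, copies):
    """Expand a degenerate 5' anchor and append the repeat block.

    Sequences are written 5'->3', so the anchor comes first.
    """
    repeat = motif.upper() * copies
    choices = (IUPAC[code] for code in anchor.upper())
    return ["".join(bases) + repeat for bases in product(*choices)]

# Hypothetical primer: anchor DHB followed by (CA)7.
primers = issr_primers("DHB", "CA", 7)
print(len(primers), primers[0])
```

A three-letter anchor in which each code stands for three bases expands to 27 distinct sequences, which in the laboratory would be ordered as a single degenerate oligonucleotide rather than as separate primers.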
Development of new microsatellite markers has been streamlined to some extent by optimizing the numerous steps in microsatellite identification and increasing sequencing throughput, making the process cheaper, more efficient, and more successful (15,16). However, with the currently available methods, certain factors limit the success rate of microsatellite isolation: cloning efficiency, the need to sequence large numbers of cloned fragments, and the need for a multitude of hybridization probes for enrichment. The recent appearance of next-generation sequencing technologies such as Roche 454 genome sequencing (26), which uses pyrosequencing, raised the question of whether these technologies could make microsatellite discovery more effective.
Materials and methods
In this study, the 454 Life Sciences/Roche GS-FLX genome sequencing system (Roche Applied Science, Penzberg, Germany) (26) was used to identify microsatellite sequences directly from microsatellite-enriched genomic DNA. To provide a broadly applicable test, we evaluated this method on three unrelated eukaryotes with little genome information available: Fusarium circinatum (a fungal ascomycete), Sirex noctilio (a hymenopteran insect), and Deladenus siricidicola (a tylenchid nematode).