2 Centre for Reproduction and Genomics, Department of Anatomy & Structural Biology, University of Otago, Dunedin, New Zealand
3 Department of Zoology, University of Otago, Dunedin, New Zealand
4 Anatomy Otago Genomics Sequencer Unit, Department of Anatomy & Structural Biology, University of Otago, Dunedin, New Zealand
Microsatellites are the genetic markers of choice for many population genetic studies, but must be isolated de novo using recombinant approaches where prior genetic data are lacking. Here we utilized high-throughput genomic sequencing technology to produce millions of base pairs of short fragment reads, which were screened with bioinformatics toolsets to identify primers that amplify polymorphic microsatellite loci. Using this approach we isolated 13 polymorphic microsatellites for the blue duck (Hymenolaimus malacorhynchos), a species for which limited genetic data were available. Our genomic approach eliminates recombinant genetic steps, significantly reducing the time and cost requirements of marker development compared with traditional approaches. While this application of genomic sequencing may seem obvious to many, this study is, to the best of our knowledge, the first attempt to describe the use of genomic sequencing for the development of microsatellite markers in a non-model organism or indeed any organism.
Microsatellites have emerged as one of the most popular genetic markers for a wide range of applications in population genetics, conservation biology, and evolutionary biology (1). Their high mutation rates and simple Mendelian mode of inheritance make them particularly suitable for the study of fine-scale population structure, mating systems, and pedigrees. Their major drawback is that, for most species, they must be developed de novo. Classically, microsatellite development requires the construction of a genomic library enriched for repeated motifs, isolation and sequencing of microsatellite-containing clones, primer design, optimization of PCR amplification for each primer pair, and a test of polymorphism on a few unrelated individuals (2).
Most of these steps are expensive, time-consuming, or both. Moreover, because microsatellite isolation by molecular cloning requires the construction of genetically modified bacteria, biosecurity and cultural sensitivity issues must be addressed in certain jurisdictions, which raises the cost of marker development and lengthens project timelines. The entire process, while not overly complicated, is slow enough that many research teams use external services provided by academic institutions or private companies. The costs of such services vary, but are largely governed by the number of polymorphic loci to be supplied and the timeframe for delivery. A survey of service providers indicates that, on average, US$5,000–10,000 will deliver ∼10 polymorphic loci in anywhere from one to more than three months.
We have explored the advantages offered by a new genomic shotgun sequencing technology coupled with fast and efficient bioinformatics tools to eliminate the most intensive wet lab steps in a simple, fast, and economical way. Figure 1 shows the general outline of the procedure for isolating microsatellites using the genomic shotgun approach. Instead of creating a genomic library enriched for microsatellites, shotgun sequencing is conducted using the Genome Sequencer FLX (GS-FLX) System (Roche, Penzberg, Germany), which produces hundreds of thousands of reads between 200 and 300 bp in length.
The essence of the approach is to generate enough random sequences to isolate a satisfactory number of repeated motifs by chance. A full run on the GS-FLX System using an LR70 plate typically produces more than 400,000 reads at an average read length of approximately 250 bases, or ∼100 Mbp of sequence. In most eukaryotic genomes, microsatellites at or above the 15-bp threshold size occur approximately every 0.85–1 kb (3). Thus a full run on the FLX system might be expected to include approximately 100,000 microsatellite sequences. Scaling down to the 1/16th format, the smallest partition available on the FLX, the expectation might be to isolate 6,000–7,000 microsatellites in a single run. Assuming a random distribution of microsatellites through the sequencing reads, perhaps two-thirds of these microsatellites will be too close to either fragment end to enable design of flanking PCR primers, leaving roughly 2,000 microsatellites of 15 bp or greater to be considered for use as markers. If only those reads with longer, pure repeats are selected, which are widely held to be associated with higher polymorphism (4), the number of loci obtained would drop further, but could remain in the 10–100 range, which would be adequate for all but the most intensive applications of these markers. This reduction at each stage of the process, though hard to predict precisely, represents the main limitation of the approach.
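The back-of-envelope estimate above can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only: the function name and parameter names are hypothetical, and the default values are simply the round figures quoted in the text (400,000 reads, 250-bp mean read length, one microsatellite per 1 kb, a 1/16th plate partition, and one-third of repeats leaving enough flanking sequence for primer design).

```python
def expected_microsatellite_yield(
    reads=400_000,            # reads from a full GS-FLX run (figure quoted in the text)
    mean_read_bp=250,         # approximate average read length
    bp_per_microsat=1_000,    # one >=15-bp microsatellite per ~0.85-1 kb
    plate_fraction=1 / 16,    # smallest plate partition
    usable_fraction=1 / 3,    # share of repeats not too close to a read end
):
    """Rough expected counts at each stage of the reduction (illustrative only)."""
    total_bp = reads * mean_read_bp                      # raw sequence from a full run
    full_run = total_bp / bp_per_microsat                # microsatellites in a full run
    partition = full_run * plate_fraction                # microsatellites in one partition
    usable = partition * usable_fraction                 # with room for flanking primers
    return {
        "total_bp": total_bp,
        "full_run": full_run,
        "partition": partition,
        "usable": usable,
    }


if __name__ == "__main__":
    # Compare the two ends of the quoted density range (0.85 kb and 1 kb).
    for spacing in (850, 1_000):
        estimate = expected_microsatellite_yield(bp_per_microsat=spacing)
        print(spacing, {k: round(v) for k, v in estimate.items()})
```

With the 1-kb spacing the sketch gives 100 Mbp of sequence, 100,000 microsatellites per full run, 6,250 per 1/16th partition, and about 2,080 with usable flanks, in line with the figures above. The further reduction to longer, pure repeats is not modeled because the text gives no fraction for it.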
To test the method in practice, we trialed this approach on a previously unstudied species, New Zealand's endangered blue duck (Hymenolaimus malacorhynchos). The blue duck is a particularly pertinent choice for two reasons. First, avian genomes have been shown to contain a lower frequency of microsatellites than those of other organisms (5). Second, the blue duck is endangered due to habitat fragmentation and predation by introduced species; its range, population size, and genetic diversity are now severely reduced (6). Both factors suggest that the blue duck should be a challenging species in which to isolate polymorphic microsatellites, and therefore a good test of the general validity of our new methodology.