First, genomic DNA was extracted from clotted blood samples using a standard proteinase K and phenol/chloroform protocol (7). Purified DNA was then subjected to shotgun sequencing on the GS-FLX System. This technique has been described in detail elsewhere (8), but briefly, a single-stranded DNA (ssDNA) sequencing library was constructed using Roche kits and reagents. Five micrograms of genomic DNA was fragmented into pieces of 300–800 bp by nebulization. Short adaptors were added to each fragment such that different adaptors were ligated to the 3′ and 5′ ends. The blue duck ssDNA library had an average fragment length of 460 bp and a concentration of 2538 pg/µl. Emulsion PCR (emPCR) was carried out at a concentration of 1 copy per bead in six emulsion oils, to give 43,800 enriched beads. Amplified fragments were sequenced on 1/16th of an LR70 plate. From this single run, a total of 17,215 sequenced fragments passed the quality filters, giving 4.1 Mb of sequence with an average read length of 243 bp.
The 17,215 reads obtained were converted into a single FASTA format file and screened for perfect microsatellites (di-, tri-, and tetranucleotides) with at least eight repeats using MSATCOMMANDER version 0.8.1 (9). This software has an inbuilt workflow that enables the simultaneous detection of repeat motifs and the design of PCR primers to amplify these repeats when the flanking regions allow it (9). If the microsatellite detected is too close to the extremity of the read, the sequence is automatically discarded.
Results and discussion
In our 17,215 reads, we detected 73 dinucleotide, 107 trinucleotide, and 51 tetranucleotide repeats, and it was possible to design primers for the amplification of 24 markers (17 dinucleotides, 5 trinucleotides, and 2 tetranucleotides). Amplification and polymorphism of each of these markers were assessed using eight putatively unrelated blue ducks from the same region. All 24 markers were successfully amplified using touchdown PCR (see Supplementary Methods). Electrophoresis of the amplified products was performed using an ABI PRISM 3100 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA), and alleles were scored with GENEMAPPER version 3.7 (Applied Biosystems). Thirteen of the 24 microsatellite markers displayed polymorphism, ranging from 2 to 4 alleles per locus (Table 1), and thus are informative for genetic studies of blue duck. Since these individuals come from the same region, this result is likely to be conservative, and more polymorphism might be expected at a larger geographic scale. All 13 sequences have been submitted to GenBank (accession nos. FJ457101–FJ457113).
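The screening criteria described above (perfect di-, tri-, and tetranucleotide repeats with at least eight repeat units) can be illustrated with a short script. This is a minimal sketch and not the MSATCOMMANDER implementation: the function names, the regular-expression approach, and the 20 bp minimum flank used to reject repeats near the read ends are assumptions made for illustration only.

```python
import re

MIN_REPEATS = 8  # threshold stated in the text


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def is_primitive(motif):
    """False if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def find_perfect_repeats(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, n_repeats) for perfect 2-, 3-, and 4-bp repeats."""
    seq = seq.upper()
    hits = []
    for unit_len in (2, 3, 4):
        # One captured unit followed by at least (min_repeats - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits


def far_from_ends(start, motif, n_repeats, read_len, min_flank=20):
    """Hypothetical flank check: min_flank is an assumed value, not from the text."""
    end = start + len(motif) * n_repeats
    return start >= min_flank and read_len - end >= min_flank


if __name__ == "__main__":
    demo = [">read1", "GG" + "AC" * 8 + "TT", ">read2", "CAG" * 7]
    for name, seq in read_fasta(demo):
        print(name, find_perfect_repeats(seq))
```

Reads whose repeat fails the flank check would be discarded before primer design, mirroring the behavior described for repeats too close to the extremity of a read.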
To check whether the approach used is biased toward markers from one or a limited number of genomic positions, the location of orthologous sequences on the chicken genome assembly (build 2.1) was evaluated using a previously described approach (10). Briefly, the BLASTN program (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was run for each of the 24 reads for which suitable primers were designed. Repeat motifs were masked using the DUST filter in order to avoid numerous random matches to other repeated motifs throughout the genome. The default values for the parameters of the BLASTN algorithm were kept [i.e., stringent search settings used in Dawson et al. (10)]. Of the 24 loci tested, 16 (67%) provided a unique match (hit) with an E-value of 1 × 10⁻¹⁰ or lower. These 16 loci were distributed over eight different chromosomes (1, 2, 3, 6, 7, 11, 13, and Z), with no obvious aggregation patterns and a minimum distance between two loci of 5.5 Mbp (loci Hmal05 and Hmal07 on chromosome 7).
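The filtering logic behind this mapping step can be sketched as follows. This is an illustrative outline only: it assumes the BLASTN results have already been parsed into (locus, chromosome, position, E-value) tuples, and the locus names and coordinates in the example are invented placeholders, not the values obtained in this study.

```python
from collections import defaultdict
from itertools import combinations

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def unique_matches(hits, cutoff=E_VALUE_CUTOFF):
    """Keep loci with exactly one hit at or below the E-value cutoff.

    hits: iterable of (locus, chromosome, position, evalue) tuples.
    Returns {locus: (chromosome, position)}.
    """
    by_locus = defaultdict(list)
    for locus, chrom, pos, evalue in hits:
        if evalue <= cutoff:
            by_locus[locus].append((chrom, pos))
    return {locus: placed[0]
            for locus, placed in by_locus.items() if len(placed) == 1}


def min_distance_same_chromosome(placements):
    """Return (distance, locus_a, locus_b) for the closest pair of loci
    on the same chromosome, or None if no chromosome carries two loci."""
    best = None
    for (a, (chr_a, pos_a)), (b, (chr_b, pos_b)) in combinations(
            sorted(placements.items()), 2):
        if chr_a == chr_b:
            d = abs(pos_a - pos_b)
            if best is None or d < best[0]:
                best = (d, a, b)
    return best


# Hypothetical example data (names and coordinates are placeholders).
example_hits = [
    ("locA", "7", 1_000_000, 1e-30),
    ("locB", "7", 6_500_000, 1e-15),
    ("locC", "1", 2_000_000, 1e-12),  # two qualifying hits: not unique
    ("locC", "2", 3_000_000, 1e-11),
    ("locD", "Z", 500_000, 1e-5),     # fails the E-value cutoff
]

if __name__ == "__main__":
    placed = unique_matches(example_hits)
    print(placed)
    print(min_distance_same_chromosome(placed))
```

Grouping the retained loci by chromosome in this way makes it straightforward to check for clustering, which is the concern the mapping step is meant to address.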
As with any new method, it is important to evaluate its efficiency against that of commonly used ones. To do so, we conducted a survey of all the primer notes (n = 126) published in two recent issues (September 2008 and November 2008) of the journal Molecular Ecology Resources. No fewer than 44 different studies were cited for the description of the protocols used, showing a large variation in the methods employed for marker isolation. However, only seven studies (5.5%) did not use a protocol based on the design of a partial genomic library enriched for microsatellite repeats and cloning; these generally took advantage of the existence of a closely related, heavily sequenced species. Because of inconsistencies in the figures provided in the different primer notes, the number of studies contributing data (n) varies across the successive stages of marker development. Across all 126 studies, where the necessary data were available, we found that on average, 190 inserts (n = 92) were sequenced, resulting in the detection of 72 repeated motifs (n = 52). From these, 37 primer pairs were designed on average (n = 113), resulting in 14 polymorphic loci per study (n = 119). Ninety-one studies (72%) described the development of ≤13 polymorphic markers. This comparison shows that our approach gave a level of efficiency similar to that of most studies using traditional methods to isolate microsatellite markers in non-model organisms.
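The stage-by-stage yields implied by these figures can be worked out with simple arithmetic, as in the sketch below. The numbers are those reported above; the ratios for the survey are rough, since each survey average comes from a different subset of studies (n varies by stage), and the variable and function names are illustrative only.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# This study (counts from the text).
repeats_detected = 73 + 107 + 51   # di-, tri-, and tetranucleotide repeats
primer_pairs = 24
polymorphic = 13

# Survey averages (from the text; each from a different subset of studies).
survey_repeats = 72
survey_primer_pairs = 37
survey_polymorphic = 14

if __name__ == "__main__":
    print("Repeats detected:", repeats_detected)
    print("This study, primer pairs per repeat (%%): %.1f"
          % percent(primer_pairs, repeats_detected))
    print("This study, polymorphic per primer pair (%%): %.1f"
          % percent(polymorphic, primer_pairs))
    print("Survey, primer pairs per repeat (%%): %.1f"
          % percent(survey_primer_pairs, survey_repeats))
    print("Survey, polymorphic per primer pair (%%): %.1f"
          % percent(survey_polymorphic, survey_primer_pairs))
```

On these figures, 13 of 24 primer pairs (about 54%) yielded polymorphic loci here, against roughly 38% (14 of 37) for the survey averages, while the absolute number of polymorphic markers is similar (13 versus 14).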