Despite the initial controversy surrounding its undertaking, the HapMap project has already yielded data that are permitting a fuller understanding of the human genome. In a recent issue of Nature Genetics, two papers describe strategies that employ HapMap data to explore the frequency and characteristics of deletion polymorphisms.
The HapMap consortium, which published the first phase of its multiyear project in October 2005, was created to measure the association of common SNPs. Once the SNPs that associate together as a haplotype have been determined, the task of genotyping can arguably be reduced to the determination of so-called tag SNPs, which act as surrogates for the untested SNPs with which they are in perfect linkage disequilibrium. At this first stage, the HapMap consists of data on over a million SNPs identified in 269 geographically diverse individuals, representing one polymorphism per 5 kb or so of sequence.
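To make the notion of a tag SNP in "perfect linkage disequilibrium" concrete, the following minimal sketch computes the standard r-squared measure between two biallelic sites. It assumes phased haplotypes encoded as 0/1 pairs; the function name and the toy data are illustrative assumptions and are not taken from the HapMap project. An r-squared of 1 means that one site fully predicts the other.

```python
from typing import List, Tuple

# Each haplotype is (allele at site A, allele at site B), coded 0 or 1.
# This encoding is an assumption made for the sketch.
Haplotype = Tuple[int, int]


def r_squared(haplotypes: List[Haplotype]) -> float:
    """Return r^2 = D^2 / (pA(1-pA) pB(1-pB)) for two biallelic sites."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    d = p_ab - p_a * p_b                       # linkage disequilibrium coefficient D
    return d * d / denom


# Toy data: site A always carries the same allele as site B (a perfect tag),
# versus two sites whose alleles are combined independently.
perfect_tag = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
```

With these toy inputs, `r_squared(perfect_tag)` evaluates to 1 and `r_squared(independent)` to 0, which are the two extremes between which real marker pairs fall.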
Although the HapMap project is limited to the cataloging of single nucleotide base changes, the studies from Conrad et al. and McCarroll et al. show that HapMap data can also be used to detect common, small-scale deletions. Conrad et al. homed in on the critical regions by focusing on data that appeared to contravene Mendelian inheritance in characteristic ways. In two of the four geographical locations in which HapMap data were collected, samples were collected from parent-child trios, so the genotypes of the mother, father, and child are all known. Technical limitations of the genotyping strategies used mean that a hemizygous genotype is often miscalled as homozygous. As a result, deletions can show up as an apparently impossible inheritance pattern, for example a child called homozygous for an allele that one parent appears not to carry. By scanning SNP data for runs of these non sequitur calls, putative deletions should, in theory, be amenable to mapping. Starting with a similar hypothesis, McCarroll et al. inspected regions that revealed Mendelian inconsistencies, null genotypes, or Hardy-Weinberg disequilibrium. Both screens would be expected to produce a number of false positives, so the two groups used secondary analysis and carefully selected statistical thresholds, respectively, to identify the most reliable data.
Once candidate regions were obtained, the groups used different wet-lab methods to gather supporting evidence. As an initial test, Conrad et al. performed quantitative PCR on putative deletions; in each case tested, DNA levels were consistent with the prediction. For fuller analysis, the group also used arrayCGH, finding experimental support for 86% of predicted deletions. For their part, McCarroll et al. used FISH, an Illumina BeadArray assay, and PCR methods to back up their predictions. FISH was performed in cases in which the deletion was long enough to be picked up by existing probes, and the candidate deletion was confirmed in five out of five cases. The BeadArray allele-specific fluorescence assay and the PCR evidence, applied to a larger number of cases, indicated that a high percentage of the candidate regions were bona fide deletions.
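The trio-based screen described above can be sketched in a few lines. This is a simplified illustration only, not the published method of either group: it flags any Mendelian-inconsistent call rather than only the deletion-specific patterns, it ignores missing genotypes, and it applies none of the secondary analysis or statistical thresholds the papers relied on. The function names, the `min_run` parameter, and the toy genotypes are assumptions made for the example.

```python
from typing import List, Tuple

Genotype = Tuple[str, str]                     # the two called alleles at one SNP
Trio = Tuple[Genotype, Genotype, Genotype]     # (father, mother, child)


def is_mendelian_consistent(father: Genotype, mother: Genotype,
                            child: Genotype) -> bool:
    """True if the child could have received one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def inconsistent_runs(trio_calls: List[Trio],
                      min_run: int = 2) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP indices of consecutive inconsistent
    calls that span at least `min_run` SNPs (hypothetical threshold)."""
    runs = []
    start = None
    for i, (father, mother, child) in enumerate(trio_calls):
        if not is_mendelian_consistent(father, mother, child):
            if start is None:
                start = i                      # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(trio_calls) - start >= min_run:
        runs.append((start, len(trio_calls) - 1))
    return runs


# Toy data for SNPs ordered along a chromosome. At indices 1 and 2 the father
# and child are imagined to be hemizygous (one copy deleted) but are called
# homozygous, so the child appears to lack any paternal allele.
calls = [
    (("A", "A"), ("A", "G"), ("A", "G")),      # consistent
    (("A", "A"), ("G", "G"), ("G", "G")),      # inconsistent
    (("C", "C"), ("T", "T"), ("T", "T")),      # inconsistent
    (("C", "T"), ("C", "C"), ("C", "T")),      # consistent
    (("G", "G"), ("A", "A"), ("A", "A")),      # isolated inconsistency
]
candidate_regions = inconsistent_runs(calls, min_run=2)
```

Requiring a run of neighboring inconsistent SNPs, rather than accepting a single one, is one simple way to separate a candidate deletion from an isolated genotyping error; here the isolated call at the last SNP is ignored and only the run at indices 1 to 2 is reported.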
Together, these methodologies promise to expand knowledge of human variation beyond the relatively well-mapped SNPs to the less intensively studied copy number polymorphisms (CNPs). In addition, together with a companion paper in the same issue from a team at Perlegen, McCarroll et al. provide evidence suggesting that nearby SNPs exhibit significant linkage disequilibrium with the validated deletions. This means that tag SNPs may be as informative for deletion polymorphisms as they are for single nucleotide polymorphisms, opening up the possibility of assaying disease-associated deletions through the relatively straightforward analysis of surrogate SNP markers. The potential utility for disease analyses is particularly relevant given the findings from both papers that these deletions appear with significant frequency in coding regions. Nevertheless, despite the promise of the methodologies for both ongoing analyses of genomic variation and eventual application to genome-wide association studies, the fact remains that the deletion sets identified in each paper are mostly nonoverlapping. This implies that the false negative rate for any given method is likely to be high and that a full analysis of deletion-associated copy number polymorphisms awaits further methodological developments. -ND

Conrad et al. 2006. A high-resolution survey of deletion polymorphism in the human genome. Nature Genetics 38(1):75-81.
McCarroll et al. 2006. Common deletion polymorphisms in the human genome. Nature Genetics 38(1):86-92.
Red No Longer rec-less
Recombination in bacteria, particularly Escherichia coli and Salmonella typhimurium, as a means to manipulate DNA in vitro has become a popular technique that is rapidly replacing restriction enzyme cloning for such tasks as expression vector construction and BAC cloning. A critical component of this useful and malleable system, also known by the moniker “recombineering,” is the λ phage Red operon. It was recognized in the late 1990s that, using the Red recombination system, homologous recombination could be achieved between linear dsDNA molecules with only small amounts of sequence overlap (frequently as little as 40 bp). Three Red proteins act in concert to bring about efficient recombination: Bet (or Redβ), which binds ssDNA; Exo (or Redα), which provides 5′ to 3′ exonuclease functionality; and Gam (or Redγ), which inhibits the activity of the endogenous RecBCD complex, enhancing the stability of the resulting recombination products.
Now, a new paper in Molecular Biotechnology, from a group that has already produced seminal work in this field, extends the usefulness and efficacy of the Red recombination system by investigating transient expression of the RecA protein in E. coli cells during recombination reactions. This expression was shown to improve the robustness of the procedure by making the cells better able to withstand the insult of the transformation process, confirming suppositions based on previously published work. The recA gene was transformed into cells as part of the Red operon cassette, carried on an inducible, low-copy-number, temperature-sensitive plasmid. Co-expression of RecA with Bet, Exo, and Gam was shown to improve the transformation efficiency approximately 4-fold; this advantage was evident for cells transformed both chemically and by electroporation. The authors further demonstrated the utility of the system by engineering a BAC clone, which was stable through two rounds of recombineering. -SS
Wang et al. 2006. An improved recombineering approach by adding RecA to λ Red recombination. Molecular Biotechnology 32:43-53.
Multiplexing Methylation
As the CpG methylation of genes involved in tumor initiation and progression has become more widely accepted as a marker and prognostic indicator of cancer, researchers have become more active in seeking higher-throughput technologies to simultaneously screen larger numbers of samples or larger numbers of methylation sites. A common aim of many of these methodologies is the development of assays with clinical application, particularly for early identification and diagnosis of cancers, enabling doctors to provide more rapid treatment and to follow the efficacy of such regimens. A variety of techniques, many based on the bisulfite modification protocol, have been developed over the years that have moved researchers closer to the high-throughput goal, with mixed success. Previously developed techniques include combining bisulfite-PCR (bisulfite treatment to modify unmethylated cytosines followed by PCR amplification) with MALDI-TOF mass spectrometry or with methylation-sensitive microarrays. However, assays using bisulfite conversion suffer from a number of inherent drawbacks, including their limited sensitivity to partially methylated sequences.
To try to alleviate these drawbacks, Cheng et al., in a recent Genome Research article, used a modified medley of techniques, combining bisulfite modification with PCR, ligase detection reaction (LDR), and a Universal DNA Microarray. This new protocol, the authors promise, provides greater accuracy and better specificity while allowing high throughput through multiplexing. Flexibility in the PCR step is introduced by using two rounds of amplification, the second with a universal primer. This amplification is independent of the methylation status, which is detected later by the LDR step. So-called 3′ zip-code sequences on the LDR primers direct the ligation products to their correct location on the Universal Array, while 5′ fluorescent molecules provide a detectable signal. Although not perfect, this multiplexed, quantitative protocol correlates well with percentage methylation values previously determined using other techniques, while also providing greater sensitivity and higher throughput; it promises to be an important advance towards the clinical application of methylation detection methodologies. -SS

Cheng et al. 2005. Multiplexed profiling of candidate genes for CpG island methylation status using a flexible PCR/LDR/Universal Array assay. Genome Research [Epub ahead of print, December 20, 2005].