A genome walking strategy for the identification of eukaryotic nucleotide sequences adjacent to known regions
Claudia Leoni1, Raffaele Gallerani1, 2, Luigi R. Ceci2
1, Department of Biochemistry and Molecular Biology, University of Bari, Bari
2, Institute of Biomembranes and Bioenergetics, Italian National Research Council, Trani, Italy
BioTechniques, Vol. 44, No. 2, February 2008, pp. 229–235

Determination of nucleotide sequences adjacent to a known region is a recurring need in many genome scale studies. Various methods have been developed based on PCR techniques in order to fulfill this aim and overcome the time-consuming approach of screening genomic libraries. Usually these protocols rely on specific requirements and strategies, such as the presence of suitable nucleotide restriction sites and ligation of specific single- or double-strand linkers, thus limiting their application to a certain extent. In this paper we present an alternative PCR-based protocol, consisting of four main steps: (i) extension of a sequence-specific primer; (ii) 3′-tailing of extended single-strand DNA; (iii) PCR; and (iv) nested PCR amplifications. This method, which appears to be a valid alternative to the other PCR-based protocols, was used for the identification of sequences flanking the cDNA encoding region of the Lhcb1.1 gene (one member of the multigene family coding for the light harvesting protein Lhcb1) in the spinach genome.


Isolating and identifying nucleotide sequences flanking known regions is a common requirement in a number of studies of gene and genome characterization. It may be used for identification of regulatory sequences outside cDNA coding regions and gaps in genome sequencing projects, or for mapping of insertional mutagenesis events produced by retroviruses and transposable elements (1,2).

Subsequent to the introduction of the PCR technique, several protocols have been developed to identify nucleotide sequences outside known regions. With the exception of “inverse PCR” (3), these methods adopt the strategy of linking a small stretch of synthetic DNA to the known sequence, thereby allowing PCR amplifications to be carried out. Certain methods combine preliminary restriction digestions of genomic DNA with ligations of either double-strand DNA cassettes, as in “vectorette PCR” (4), “splinkerette PCR” (5), “capture-PCR” (6), and “T-linker PCR” (7), or single-strand oligonucleotides, as in “panhandle” (8) and “boomerang PCR” (an atypical PCR performed using a single primer, working in both directions) (9). “Inverse PCR” also requires restriction digestion of genomic DNA, but the restricted fragments are self-ligated and subjected to PCR using reverse-oriented primers (3). As an alternative to restriction digestion of genomic DNA, in the ligation-mediated method proposed by Mueller and Wold (10), a double-strand DNA cassette is ligated to blunt-end DNA fragments obtained by primer extension of a specific oligonucleotide carried out on chemically nicked DNA. Other methods perform PCR amplifications using degenerate primers coupled with sequence-specific primers, such as the so-called TAIL (thermal asymmetric interlaced) PCR (11) and the UFW (universal fast walking) method (1), or the method by Levano-Garcia et al. (12), in which consensus-degenerate primers are used. Critical overviews of some of the above-mentioned methods have already been reported (9,13).

All of these methods have been successfully used, but with certain limitations due to the requirement for restriction sites, the efficiency of ligation reactions, or the annealing of degenerate primers. Accordingly, there are only a few commercial kits for genome walking, such as the GenomeWalker kit (Clontech, Mountain View, CA, USA) and UVS1 Vectorette Genomic Systems (Sigma-Aldrich, St. Louis, MO, USA). These kits, based on restriction digestions of genomic DNA and ligation of a double-strand DNA cassette, are ready to use only for the human genome (as well as mouse and rat for the GenomeWalker kit), but require isolation and restriction digestions of genomic DNA if intended for use with other organisms. Alternatively, Evrogen (Moscow, Russia) provides a customized service based on a similar approach.

The genome walking approach that we have developed is independent of the presence of specific restriction sites and does not require the use of random primers or ligation of single- or double-strand linkers. Our method is based mainly on a classical 5′-RACE approach, but it is applicable for both 5′ and 3′ genome walking. Whereas a similar protocol was applied with success only for a bacterial genome (Microcystis aeruginosa) (14), our method has broader potential for application since it has been developed for the genome of a higher eukaryote.
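The logic of the four steps named in the abstract (primer extension, 3′-tailing, PCR, nested PCR) can be illustrated with a toy in silico model. The Python sketch below is only an illustration under stated assumptions, not the laboratory protocol. The sequences are invented, all function names are hypothetical, the tail is assumed to be poly-dC (suggested by the use of a poly-dG-containing anchor primer), the anchor primer is simplified to a plain run of G, and extension is assumed to run to the end of the template.

```python
"""Toy in silico model of the four steps; all sequences are invented."""

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def primer_extension(template: str, primer: str) -> str:
    """Step (i): extend `primer` along `template` up to the template's 5' end.

    Both strings are written 5'->3'. The primer anneals where the template
    contains its reverse complement; the new strand is returned 5'->3'.
    """
    site = template.find(revcomp(primer))
    if site < 0:
        raise ValueError("primer does not anneal to the template")
    return revcomp(template[: site + len(primer)])


def tail_3prime(strand: str, base: str = "C", length: int = 10) -> str:
    """Step (ii): add a homopolymer tail to the 3' end (assumed poly-dC)."""
    return strand + base * length


def pcr(top: str, forward: str, reverse: str) -> str:
    """Steps (iii) and (iv): return the amplicon bounded by two primers.

    `top` is the strand the forward primer is identical to; the reverse
    primer anneals to it, so its site on `top` is its reverse complement.
    """
    start = top.find(forward)
    end = top.find(revcomp(reverse), max(start, 0))
    if start < 0 or end < 0:
        raise ValueError("a primer does not anneal")
    return top[start : end + len(reverse)]


def walk_upstream(genome: str, gsp1: str, gsp2: str, gsp3: str,
                  tail_len: int = 10) -> str:
    """Run the four steps with three nested gene-specific primers."""
    anchor = "G" * tail_len                       # simplified anchor primer
    extended = primer_extension(genome, gsp1)     # (i)
    tailed = tail_3prime(extended, "C", tail_len)  # (ii)
    first = pcr(revcomp(tailed), anchor, gsp2)    # (iii)
    return pcr(first, anchor, gsp3)               # (iv)


# Invented toy data: an "unknown" flank followed by a "known" region.
UPSTREAM = "ATTGCACGTTAGCTAACGGTACCTTAGGCA"
KNOWN = "ATGGCTTCAAGCACTATGGCACTCTCTTCACCAACC"
GENOME = UPSTREAM + KNOWN

# Antisense primers, each placed closer to the unknown flank than the last.
GSP1 = revcomp(KNOWN[24:36])
GSP2 = revcomp(KNOWN[12:24])
GSP3 = revcomp(KNOWN[0:12])

product = walk_upstream(GENOME, GSP1, GSP2, GSP3)
# Strip the anchor and the known part to recover the flanking sequence.
flank = product[10 : product.find(KNOWN[:12])]
print(flank)
```

In this model the nested product consists of the anchor, the previously unknown flank, and the start of the known region, which is why sequencing such a product reveals the adjacent sequence. Walking in the opposite direction would use sense-oriented primers on the complementary strand in the same way.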

In this paper we show the application of the genome walking method for the identification of the nucleotide sequences flanking the cDNA coding region of the spinach Lhcb1.1 gene (one member of the multigene family coding for the light harvesting protein Lhcb1) (15).

Materials and Methods

Standard Nucleic Acids Isolation and Manipulation Methods

Genomic DNA was purified from two-week-old spinach seedlings (var. America), using the GenElute Plant Genomic DNA kit (Sigma-Aldrich).

Restriction digestion of DNA, cloning, plasmid DNA isolation, and sequencing were carried out according to standard procedures [(16); pGEM-T Easy Vector System II manual (Promega, Madison, WI, USA)].


Oligonucleotides specific for spinach DNA were from Operon (Alameda, CA, USA) or Sigma-Aldrich. Oligonucleotide sequences are reported in Table 1. As the poly-dG-containing primer, we used the 5′-RACE abridged anchor primer (AAP) from Invitrogen (Carlsbad, CA, USA).

Table 1. Oligonucleotides Used in the Genome Walking Experiments
