Recent technological advances have allowed the application of high-throughput methods to the identification of transcription factor binding regions on a genomic scale. The combination of chromatin immunoprecipitation (ChIP) with microarray and other high-throughput technologies has enabled the identification of protein-DNA interactions on an unprecedented scale. With this ability, the identification of all regions of DNA where a given transcription factor binds has become a theoretical possibility.
With the genomes of a variety of organisms now sequenced, including those of yeast and humans, the question of how expression of the genetic information contained within the genome is regulated has become pressing. Recent efforts to investigate the interactions of transcription factor proteins with DNA on a genomic scale have underlined the complexity of these regulatory networks.
Several approaches can be used to identify the sites of protein-DNA interaction. If the consensus sequence that directs the binding of a protein to a specific region of DNA is known, this information can be used in a bioinformatics approach to identify other potential binding regions. If the binding site requirements are not known, techniques such as systematic evolution of ligands by exponential enrichment (SELEX) can be used to elucidate this information (1). Electrophoretic mobility shift assays can also be used to determine binding specificity (2). While these approaches can identify potential target sites of DNA binding proteins, their utility is limited for two reasons. Technically, such approaches are labor-intensive and provide very low throughput. Biologically, many additional factors are likely to be involved in the interaction of a protein with a specific region of DNA, including other associated proteins, the local chromatin structure, and interactions with other, often far removed, regions of chromatin.
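The consensus-based bioinformatics approach mentioned above can be illustrated with a minimal sketch. It assumes the consensus is written in IUPAC nucleotide codes and that an exact match on either strand counts as a potential site. The function names and the example sequences are hypothetical and are not taken from the cited studies.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_consensus_sites(sequence, consensus):
    """Scan both strands for exact matches to an IUPAC consensus.

    Returns a sorted list of (start, strand, matched_sequence) tuples,
    where start is the 0-based position on the forward strand.
    """
    sequence = sequence.upper()
    core = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + core + "))")
    n, k = len(sequence), len(consensus)

    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(sequence)]
    for m in pattern.finditer(reverse_complement(sequence)):
        # Convert the reverse-strand offset back to forward coordinates.
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence containing one degenerate match.
    for hit in find_consensus_sites("AAGATAAGG", "WGATAR"):
        print(hit)
```

A scan of this kind reports only sequence matches. As the paragraph above notes, it cannot account for associated proteins or chromatin structure, so its hits are candidates rather than confirmed binding sites.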
Previous work in our laboratory, and in others’, has led to the development of a number of techniques that allow the high-throughput identification and analysis of DNA binding by a variety of DNA binding proteins. In both mammalian (3,4) and yeast (5,6) systems, high-throughput microarray-based analysis has been used to study the binding of transcription factor proteins to potential regulatory DNA elements. These analyses demonstrate that high-throughput methods can confirm and expand our knowledge of previously identified protein-DNA interactions, as well as identify previously unknown interactions. Many of these techniques are currently being used by the ENCODE Project Consortium, a group of laboratories whose aim is to identify all functional elements within the human genome (7).
ChIP-on-Chip: Microarray Analysis of Protein-DNA Interactions
The most dramatic improvement in the throughput of identifying transcription factor-DNA interactions came with the combination of microarray analysis with the ChIP technique. Initial uses of ChIP required prior knowledge of a potential target DNA sequence in order to detect binding of a protein to the DNA regions of interest. By combining microarray analysis with ChIP, the identification of protein binding sites became possible on a genomic scale, limited only by the DNA features present on the microarray.
The standard protocol for ChIP-on-Chip (or ChIP-chip, as it is sometimes called) is shown in Figure 1. As with the basic ChIP method, a DNA binding protein is covalently attached to its target DNA using formaldehyde cross-linking. After shearing of the chromatin, the protein-DNA adduct is immunoprecipitated from nuclear extracts using an antibody specific for the protein in question or for an epitope sequence appended to the end of the protein. The hemagglutinin and myc tag sequences are commonly used for this purpose. After immunoprecipitation, the enriched DNA sample is commonly, but not always, amplified by one of several methods. The final step is to fluorescently label the enriched DNA and the reference DNA. In the case of yeast ChIP, the reference sample is usually composed of DNA from the mock immunoprecipitation of a strain where the target transcription factor is untagged. In mammalian ChIP, the reference is usually composed of the input sheared chromatin or mock immunoprecipitations using normal, nonspecific sera from mice or rabbits.
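The text above does not specify how the labeled enriched and reference samples are compared. One common way to summarize two-channel microarray data is a per-probe log2 ratio of the two intensities. The sketch below assumes that approach, with median centering and a fixed cutoff. The function names, the pseudocount, the threshold, and the probe intensities are all hypothetical choices for illustration.

```python
import math
from statistics import median


def log2_enrichment(ip, ref, pseudocount=1.0):
    """Compute median-centered log2(IP / reference) for each shared probe.

    ip, ref: dicts mapping probe name -> fluorescence intensity.
    The pseudocount (an assumed value) avoids division by zero.
    """
    raw = {
        probe: math.log2((ip[probe] + pseudocount) / (ref[probe] + pseudocount))
        for probe in ip
        if probe in ref
    }
    # Centering on the median assumes most probes are not bound.
    center = median(raw.values())
    return {probe: value - center for probe, value in raw.items()}


def enriched_probes(ratios, threshold=1.0):
    """Return probe names whose centered log2 ratio meets the cutoff."""
    return sorted(p for p, r in ratios.items() if r >= threshold)


if __name__ == "__main__":
    # Hypothetical intensities for five array features.
    ip = {"p1": 799, "p2": 99, "p3": 99, "p4": 99, "p5": 199}
    ref = {"p1": 99, "p2": 99, "p3": 99, "p4": 99, "p5": 99}
    ratios = log2_enrichment(ip, ref)
    print(enriched_probes(ratios))
```

In practice the cutoff and normalization would depend on the array platform and on the choice of reference sample described above (mock immunoprecipitation or input chromatin).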
