While a PhD student at Harvard and the Massachusetts Institute of Technology, Aiden began to unravel the 3-D architecture of whole genomes by helping to develop a technique called Hi-C. As reported in Science in 2009, the method couples proximity-based ligation with massively parallel sequencing to enable the unbiased identification of chromatin interactions across an entire genome (1). More recently, Aiden and his collaborators have refined the technique to reduce noise and sharpen its resolution.
In a new study published in Cell, the researchers reported an effort to comprehensively map more than 15 billion chromatin contacts genome-wide using an approach called “in situ Hi-C” (2). By performing the DNA-DNA proximity ligation in intact nuclei, they dramatically decreased the random noise that occurred in the original Hi-C approach as a result of shaking up the DNA in solution. “Just like reducing the shaking of a camera results in a sharper picture, performing Hi-C in situ reduced the blurriness of our maps and allowed us to create the first unbiased genome-wide maps of DNA folding at the resolution of individual genes,” said co-first author Suhas Rao, a researcher at the Baylor College of Medicine and Rice University.
“We developed in situ Hi-C in order to take high-resolution snapshots of all of the DNA contacts inside the nucleus—collisions between two distinct fragments of DNA brought into close spatial proximity—in order to better understand how this amazing feat of packaging happens,” Aiden said. “We anticipate that the ease of use and improved resolution achieved by the in situ Hi-C protocol will make it a widely used addition to the toolkit for studying many biological processes, from cell differentiation to immune response to disease, just to name a few.”
Form Follows Function
In situ Hi-C combines the original Hi-C protocol with a nuclear ligation assay, in which DNA is digested using a restriction enzyme, DNA-DNA proximity ligation is performed in intact nuclei, and the resulting ligation junctions are quantified. The new protocol facilitates the generation of much denser Hi-C maps and requires three days instead of one week.
The researchers constructed in situ Hi-C maps, which are lists of DNA-DNA contacts produced by a Hi-C experiment, for nine human and mouse cell lines. Whereas the original Hi-C experiments had a map resolution of 1 megabase, the largest of the new maps contained about 5 billion contacts and had a map resolution of about 1 kilobase. The researchers also generated eight in situ Hi-C maps at 5-kilobase resolution, each containing between approximately 400 million and 1 billion contacts.
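To make the idea of map resolution concrete, the short Python sketch below bins a list of contacts into a count table at two bin widths. It is only an illustration, not the authors' pipeline: the function name `bin_contacts` and the toy coordinates are hypothetical, and treating resolution as a fixed bin width is a simplifying assumption.

```python
from collections import Counter


def bin_contacts(contacts, bin_width):
    """Count contacts per pair of genomic bins.

    contacts:  iterable of (pos_a, pos_b) base-pair positions on one chromosome
    bin_width: bin size in base pairs (a stand-in for map resolution)
    Returns a Counter keyed by (bin_i, bin_j) with bin_i <= bin_j.
    """
    counts = Counter()
    for pos_a, pos_b in contacts:
        # Integer division assigns each position to its bin; sorting makes
        # the pair order-independent, since a contact has no direction.
        i, j = sorted((pos_a // bin_width, pos_b // bin_width))
        counts[(i, j)] += 1
    return counts


# Hypothetical toy contacts (base-pair coordinates), for illustration only.
toy_contacts = [
    (1_200, 48_500),
    (1_900, 48_100),
    (3_000_000, 3_400_000),
    (250_000, 2_750_000),
]

coarse = bin_contacts(toy_contacts, 1_000_000)  # 1-megabase bins
fine = bin_contacts(toy_contacts, 1_000)        # 1-kilobase bins

print(sorted(coarse.items()))
print(sorted(fine.items()))
```

At 1-megabase bins, the first two toy contacts fall into a single bin and look like generic short-range signal. At 1-kilobase bins, they appear as two contacts linking one specific pair of loci about 47 kilobases apart. Finer bins spread the same reads over far more cells, which is one reason kilobase-scale maps need billions of contacts.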
“When we first began this project, using Hi-C to systematically map all chromatin loops across the genome was thought to be infeasible. It wasn’t believed that sufficient resolution could be attained,” Aiden said. “It took a great deal of effort in refining the protocol, improving the signal-to-noise ratio and making it easier to perform, to get it to the point that it was even possible to create the gene resolution 3-D maps that we present in our study.”
Even after the researchers improved the technology, they were faced with massive amounts of data from the Hi-C experiments. “We had to build a whole suite of analysis tools from scratch, utilizing big-data tools like parallelized pipelines for high-performance computer clusters, dynamic programming algorithms, and graphics processing units to efficiently deal with the deluge of data we generated,” Aiden said.
Whereas Hi-C previously revealed that the genome is partitioned into numerous domains that fall into two distinct compartments, in situ Hi-C showed that genomes are partitioned into previously undetected small contact domains, with a median length of 185 kilobases, that are associated with distinct patterns of histone marks and segregate into six sub-compartments. “Contact domains marked with different flavors of chemical marks tended to congregate in different nuclear neighborhoods, suggesting that histone modifications may act as a nuclear zip code system, directing pieces of the genome to different spatial locations in the nucleus depending on their function,” explained co-first author Miriam Huntley, a graduate student at Harvard University.
The researchers also identified about 10,000 loops, which form when two pieces of DNA that lie far apart along the linear genome are pinched together. “We’ve created the first reliable genome-wide catalog of DNA loops in the human genome,” Aiden said. “This has been a long-standing goal in the field for decades.” These loops were conserved across cell types and species. “This suggests an extremely important role for the chromatin loops we observed, as they have been preserved over hundreds of millions of years of evolution,” Aiden said.
Moreover, the loops frequently linked promoters and enhancers and correlated with gene activation. “The loops that we mapped across the genome are incredibly important for cellular function. They often have genes at one end of the loop, and when the loop forms, the gene turns on,” Aiden said. “Depending on which genes get looped to, a cell can activate the gene expression program that can help turn it into an eye cell or a heart cell or a lung cell. Understanding how cells use the same starting material and end up diversifying into the thousands of different types of cells in the human body is one of the important open questions in biology.”
Mapping Cell Diversity and Disease
One major advantage of the Hi-C approach is that it probes all possible DNA contacts in the cell, providing global views of genome structure at multiple scales, Rao explained. “This ability to simultaneously resolve chromatin structures at varied size scales is crucial to synthesizing a more comprehensive model of nuclear organization.” But the main limitation is that it takes snapshots of DNA contacts averaged across millions of cells. “A major question arising from our work is how these structures change over time and how heterogeneous they are across the cell population. Single-cell approaches, like single-cell Hi-C and microscopy, will be tremendously valuable in addressing these questions.”
Susan Gasser, director of the Friedrich Miescher Institute for Biomedical Research, agreed that this is the main limitation of the approach. “The problem with Hi-C is that it captures long-range interactions in a population of cells, and then averages out the noise, which may actually have contained all the interesting and functional transient interactions. What is missing is live imaging technologies that reveal how frequently these interactions occur, and how stable they are once they occur,” she said. “The Hi-C technique is continually being improved with better sequencing and more sensitivity. There needs to be a lot more imaging coupled with the Hi-C to confirm what it really means in space and time. That is the future of genome organization.”
Moving forward, another important avenue of research will be to map the 3-D genomes of cells associated with various disease states such as cancer. Moreover, the study may have important implications for the ongoing effort to interpret how alterations in the genomic sequence cause disease. “Many studies over the last decade have identified pieces of the genome that are associated with all kinds of diseases, but more often than not, those pieces of the genome don’t actually lie in genes,” Aiden said. “We’ve known that these non-coding variants are often involved in regulating genes, like genetic switches, but the big challenge has been figuring out which genes they actually control. Using our maps, we can start to unravel this mystery.”
1. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009 Oct 9;326(5950):289-93. doi: 10.1126/science.1181369.
2. Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, Sanborn AL, Machol I, Omer AD, Lander ES, Aiden EL. A 3-D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell. 2014 Dec 18;159(7):1665-80. doi: 10.1016/j.cell.2014.11.021. Epub 2014 Dec 11.