In math, you learn early on that there are elegant and brute force ways to solve a problem, and the approach you take often depends on the problem at hand. You can prove there are infinitely many prime numbers quite simply, but if you want to find ever-larger primes, it takes a supercomputer. Biologists tend to approach their experiments in a similar way: elegantly when possible but using brute force when necessary.
Jay Shendure and his colleagues in the Genome Sciences department at the University of Washington are one such group. Recently, Shendure’s lab, along with colleagues from Illumina in San Diego, California, made quite a splash when they described a combinatorial indexing approach that enabled large-scale labeling of DNA. Using the technique, the team demonstrated robust haplotype phasing of genomic DNA, a process that had proven extremely tricky for genome researchers in the past (1). Phasing, however, was only a starting point; the researchers quickly had other ideas in mind for their indexing technique.
“We realized this concept could also be very powerful for capturing molecular information from large numbers of single cells,” said Shendure, who soon turned his sights onto the chromatin landscapes of single cells.
Decades of research have taught us that changes in chromatin conformation are associated with gene activity; open the chromatin up, and a gene is accessible to transcription factors and can be transcribed. Therefore, the state of chromatin in any particular cell or group of cells can tell researchers a lot about what that cell has done, is doing, or is about to do.
When researchers first started generating chromatin state maps a few years back, the samples were mainly collections of cells, not single cells. Although these early maps gave unique insights into chromatin regulation and gene expression, researchers now know that the expression patterns of individual cells vary dramatically. To understand how systems work, you need to look at how individual cells work together, and that means looking at single cells. But the brute force isolation of massive numbers of single cells, which would be required to assess chromatin states, just did not seem feasible.
What Shendure recognized, however, was that it might be possible to “tag” open sites in chromatin with a unique barcode within each cell. This would then create the possibility of sequencing all of the single-cell samples at once and then “deconvoluting” the barcodes to assign chromatin states to specific single cells. The approach would mean very little cell sorting had to be done (thus avoiding brute force isolation of single cells), and the pooled sequencing runs would make better use of the high-throughput capacity of today’s next-generation sequencing platforms. All of this was possible using his combinatorial indexing approach.
The chromatin profiling scheme, which Shendure and his co-authors reported in the journal Science (2), goes a little like this. First, you need to tag nucleosome-free regions. The assay for transposase-accessible chromatin using sequencing (ATAC-seq) gave the authors a way to barcode these regions: transposases loaded with barcoded sequencing adapters target “open regions” of chromatin and tag them. This created a first level of barcoding within the cells. From there, 15–25 nuclei are randomly distributed to each well of a 96-well plate and lysed, and a second barcode is introduced during PCR. All that is left is to pool the PCR products and sequence.
The truly surprising thing is that, even with only two barcodes, the rate of “collision,” in which two or more nuclei receive the same pair of barcodes, is extremely low. Thus, when you aggregate sequences with the same combination of barcodes after sequencing, most often these will be from the same single cell.
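To get a feel for how the collision rate depends on the experimental design, here is a minimal Python sketch of the two-barcode scheme. It is a toy model, not the authors' analysis code. It assumes each nucleus picks its first-round barcode uniformly at random and that each well of the plate contributes one distinct second-round (PCR) barcode. The number of first-round barcodes is not given in this article, so the value of 96 used below is purely an assumption, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def expected_collision_fraction(n_first_barcodes, nuclei_per_well):
    """Approximate fraction of occupied barcode pairs that hold more than one nucleus.

    Computed as a ratio of expected counts within a single well, assuming
    first-round barcodes are assigned uniformly at random.
    """
    keep = 1 - 1 / n_first_barcodes
    # Probability that a given first-round barcode appears in the well at all.
    p_used = 1 - keep ** nuclei_per_well
    # Probability that it appears exactly once.
    p_once = (nuclei_per_well / n_first_barcodes) * keep ** (nuclei_per_well - 1)
    return (p_used - p_once) / p_used


def simulate_plate(n_first_barcodes=96, n_wells=96, nuclei_per_well=20, seed=0):
    """Return one (first_barcode, second_barcode, nucleus_id) record per nucleus."""
    rng = random.Random(seed)
    records = []
    for well in range(n_wells):
        for nucleus in range(nuclei_per_well):
            first = rng.randrange(n_first_barcodes)  # round 1: transposase barcode
            second = well                            # round 2: one PCR barcode per well
            records.append((first, second, f"nucleus_{well}_{nucleus}"))
    return records


def group_by_barcode_pair(records):
    """'Deconvolute' the pool: collect everything that shares a barcode pair."""
    groups = defaultdict(set)
    for first, second, nucleus_id in records:
        groups[(first, second)].add(nucleus_id)
    return groups


def observed_collision_fraction(groups):
    """Fraction of observed barcode pairs that contain more than one nucleus."""
    collided = sum(1 for nuclei in groups.values() if len(nuclei) > 1)
    return collided / len(groups)


if __name__ == "__main__":
    groups = group_by_barcode_pair(simulate_plate())
    print(f"simulated collision fraction: {observed_collision_fraction(groups):.3f}")
    print(f"expected collision fraction:  {expected_collision_fraction(96, 20):.3f}")
```

In this model, the collision fraction falls as the number of first-round barcodes grows and rises as more nuclei are placed in each well, which is the trade-off that sets how many cells a single plate can profile.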
To prove this was the case, Shendure and his colleagues performed a series of experiments using the combinatorial indexing strategy to define chromatin accessibility for cells in mixtures, even when there was no reference cell to work with. Their data indicated that the indexing is robust; in fact, they profiled more than 15,000 single cells for the article.
But just how many single-cell chromatin landscapes could be explored with the new technique? “The sky is the limit. We certainly won't run out of cells!” quipped Shendure. “Seriously, I think the concept of combinatorial barcoding, if further developed as we plan to do, could possibly enable tens to hundreds of thousands of single cells to be queried in a single experiment.”
And this leads to the intriguing possibility that one day we may be able to generate a human cell atlas that defines the transcriptional landscapes of all cell types in the human body. “I think that now is the time to get going on planning what such an effort might look like. I think the technologies for making this feasible–both for RNA and chromatin accessibility–are there as of this year. There is room for continued technical developments, and certainly there should be investments in that, but that shouldn't stop such an effort from getting going.”
1. Amini, S. et al. “Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing” Nat. Genet. 46: 1343-1349, 2014.
2. Cusanovich, D.A. et al. “Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing” Science 348 (6237): 910-914, 2015.