In cancer genome studies, mutational heterogeneity across cancer types and between individual patients has contributed to a large, unwieldy, and suspect list of genes associated with the disease. But a new software program promises to weed out some of the potential artifacts in that list.
“As we got more data, we understood that our background model is inaccurate,” said senior investigator Gad Getz, director of Cancer Genome Computational Analysis at the Broad Institute of the Massachusetts Institute of Technology and Harvard and author of a new article published online this week in Nature (1).
Measuring mutation rates in 3083 tumor–normal pairs across 27 types of cancer, Getz and a multisite research team found that mutation rates can span 3 orders of magnitude, from 1 mutation in the entire exome in pediatric cancers to thousands of mutations in melanoma and lung cancers.
For some cancers, such as acute myeloid leukemia, the variation in mutation rates was extreme. “The fact that mutation frequencies vary over a thousand fold range for a specific type of cancer is a dramatic illustration of how seemingly similar cancers are actually very different,” noted Douglas Fowler, an assistant professor of genome sciences at the University of Washington in Seattle, who was not involved with the study. “It's a strong argument for taking a personalized approach to cancer diagnosis and treatment.”
In addition, the researchers found sequence-context specific variability. That is, some individual nucleotides, based on which bases flank them, are more likely to be mutated. This depends on different mutational mechanisms that occur in the cell, such as spontaneous deamination of methylated CpGs. Different cancers show different characteristic base changes, the group found.
A third type of heterogeneity, variation across the genome, was also measured in 126 tumor–normal pairs across 10 cancer types. The team found that mutational frequencies varied by more than 5-fold from one region of the genome to another.
Mutational heterogeneity is strongly tied to the timing of DNA replication and also to transcriptional activity—two factors that may explain many of the suspect genes that inadvertently turn up in association studies. For reasons that are not entirely clear, late-replicating regions of the genome seem to have higher mutation rates. Meanwhile, genes that are highly expressed tend to have lower mutation rates due to transcription-coupled repair.
To account for all 3 types of variation, Getz and his colleagues created MutSigCV. Using MutSigCV to reanalyze previously published lung cancer data, the team narrowed a list of 450 suspect genes down to 11, most of which had previously been linked to lung cancer.
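Why a better background model shrinks the gene list can be seen in a small numerical sketch. The Python below is not MutSigCV and does not reproduce its method. It is a simplified Poisson test, and every number in it (the two background rates, the gene length, the cohort size, the mutation count) is hypothetical and chosen only for illustration. It asks how surprising the same mutation count looks for one gene under a single genome-wide background rate versus a higher rate of the kind that might be expected for a late-replicating, weakly expressed gene.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with lam > 0.

    The upper tail is summed directly so that very small
    probabilities are not lost to floating-point rounding.
    """
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        # Poisson probability of exactly i events, computed in log space.
        term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        total += term
        # Past the mean, terms shrink quickly; stop once they are negligible.
        if i > lam and term <= 1e-16 * total:
            break
        i += 1
    return min(total, 1.0)


def gene_p_value(observed, coding_bases, n_patients, rate_per_base):
    """P-value for seeing at least `observed` mutations in one gene,
    given an assumed background rate per base per patient."""
    expected = coding_bases * n_patients * rate_per_base
    return poisson_tail(observed, expected)


# All values below are hypothetical, for illustration only.
UNIFORM_RATE = 2e-6   # one background rate applied to every gene
LOCAL_RATE = 8e-6     # assumed higher rate for a late-replicating, weakly expressed gene
OBSERVED = 20         # mutations seen in the gene across the cohort
CODING_BASES = 15_000
N_PATIENTS = 100

p_uniform = gene_p_value(OBSERVED, CODING_BASES, N_PATIENTS, UNIFORM_RATE)
p_local = gene_p_value(OBSERVED, CODING_BASES, N_PATIENTS, LOCAL_RATE)

print(f"uniform background:       p = {p_uniform:.2e}")
print(f"gene-specific background: p = {p_local:.2e}")
```

With these made-up inputs, the uniform rate predicts about 3 mutations in the gene and the gene-specific rate about 12. The same 20 observed mutations look overwhelmingly significant in the first case and only marginal, roughly 0.02 before any correction for testing thousands of genes, in the second.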
The group is now reanalyzing data from their own projects and those of other research groups. “This is just beginning. As more data is accumulated, we will have more accurate models,” Getz said.
1. Lawrence, M. S., P. Stojanov, P. Polak, G. V. Kryukov, K. Cibulskis, A. Sivachenko, S. L. Carter, C. Stewart, C. H. Mermel, S. A. Roberts, et al. 2013. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature advance online publication (June).