When a strand of RNA is churned out of the transcription machinery in a cell, its message is not set in stone. Since the 1980s, scientists have known that mRNA can be modified after transcription, but quantifying the frequency of this so-called RNA editing has been challenging. Last year, researchers found more than 10,000 sites in the human genome that underwent RNA editing (1), but some questioned how many of those sites were simply the result of sequencing errors.
In a new study (2), Peng and colleagues used RNA-seq, or whole transcriptome shotgun sequencing, to sequence the RNA in a lymphoblastoid cell line from “YH,” a male Han Chinese individual whose genome was sequenced in 2008. They then compared YH’s genome sequence with the RNA-seq data and passed the results through a pipeline of 12 filters to reduce errors, for example by discarding segments that could match multiple parts of the genome and segments containing known single-nucleotide polymorphisms.
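The filtering idea can be illustrated with a minimal Python sketch. It is not the study's actual pipeline: it covers only the two example filters named above, and the `Candidate` record, its fields, and the `known_snps` set are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A genome position where DNA and RNA reads were compared (hypothetical record)."""
    chrom: str
    pos: int
    dna_base: str      # base seen in the genome sequence
    rna_base: str      # base seen in the RNA-seq reads
    n_alignments: int  # number of genome locations the supporting reads match


def filter_candidates(candidates, known_snps):
    """Keep DNA/RNA differences that pass two illustrative filters."""
    kept = []
    for c in candidates:
        if c.dna_base == c.rna_base:
            continue  # no DNA/RNA difference, so not a candidate edit
        if c.n_alignments > 1:
            continue  # reads match multiple parts of the genome
        if (c.chrom, c.pos) in known_snps:
            continue  # position is a known single-nucleotide polymorphism
        kept.append(c)
    return kept


# Toy input: only the first candidate should survive both filters.
candidates = [
    Candidate("chr1", 100, "A", "G", 1),
    Candidate("chr1", 200, "A", "G", 3),  # ambiguous alignment
    Candidate("chr2", 50, "C", "T", 1),   # known SNP
    Candidate("chr2", 60, "T", "T", 1),   # no difference
]
known_snps = {("chr2", 50)}
kept = filter_candidates(candidates, known_snps)
```

Each filter here is a simple early exit, so further checks could be added as extra conditions in the same loop.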
After applying these filters, the team pinpointed 22,688 RNA-editing sites. The vast majority of these, around 93%, were conversions of adenosine to inosine, known to be caused by adenosine deaminases acting on RNA (ADARs). Inosine behaves like guanosine in base pairing, changing the structure of the resulting transcript.
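Because inosine is read as guanosine during sequencing, adenosine-to-inosine edits show up in the data as A-to-G differences between DNA and RNA. As a rough sketch of how such a share could be tallied, the Python below counts substitution types over a list of (DNA base, RNA base) pairs. The function names and toy data are hypothetical, and the sketch assumes that bases are already reported relative to the transcript strand.

```python
from collections import Counter


def substitution_counts(sites):
    """Count each DNA>RNA substitution type, ignoring positions with no difference."""
    return Counter(f"{dna}>{rna}" for dna, rna in sites if dna != rna)


def a_to_g_fraction(sites):
    """Fraction of all DNA/RNA differences that are A-to-G (0.0 if there are none)."""
    counts = substitution_counts(sites)
    total = sum(counts.values())
    return counts["A>G"] / total if total else 0.0


# Toy data, not taken from the study: three A-to-G sites and one C-to-T site.
sites = [("A", "G"), ("A", "G"), ("A", "G"), ("C", "T")]
fraction = a_to_g_fraction(sites)
```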
Meanwhile, other studies have found higher rates of non-adenosine RNA edits. “The discrepancy is probably caused by the more accurate methods used in our study,” says Peng. “The mechanism of non-A-to-G editing is unclear so far, and we are not sure whether non-A-to-G editing does exist. Thus, lower percentages of A-to-G in previous studies may indicate higher false positive in their data sets.”
In addition, Peng’s team found that RNA editing sites tend to be clustered together and are found more often in double-stranded RNA than in single-stranded elements. The sites were widespread throughout the genome, with about 1% of total editing sites falling within coding regions.
In the end, the new paper offers evidence that RNA editing is widespread even when filters are put in place to minimize false positives from sequencing errors. Now, the analysis method can be applied to additional individuals and cell types to begin getting a handle on the implications of RNA editing.
“The accurate method of RNA editing identification could help researchers to better understand this kind of post-transcriptional modification,” says Peng. “Furthermore, it can also help to find the connection between RNA editing and human development and diseases.”
1. Li, M., I. X. Wang, Y. Li, A. Bruzel, A. L. Richards, J. M. Toung, and V. G. Cheung. 2011. Widespread RNA and DNA sequence differences in the human transcriptome. Science 333(6038):53-58.
2. Peng, Z., Y. Cheng, B. C. Tan, L. Kang, Z. Tian, Y. Zhu, W. Zhang, Y. Liang, X. Hu, X. Tan, J. Guo, Z. Dong, Y. Liang, L. Bao, and J. Wang. 2012. Comprehensive analysis of RNA-seq data reveals extensive RNA editing in a human transcriptome. Nature Biotechnology advance online publication (February).