Any undergrad, or at least any pre-med, could tell you the central dogma of molecular biology: DNA makes RNA, which in turn makes proteins. According to a paper published in Science last year, however, the transcription of DNA into RNA is not as faithful as previously thought (1). Mingyao Li from the University of Pennsylvania and her collaborators identified 10,000 exonic sites at which the mRNA sequence differed from the corresponding DNA sequence, which they termed RNA-DNA differences (RDDs).
“We have a lot of experience sequencing RNA and DNA from many human cell types, and we have never seen this phenomenon,” said Yoav Gilad, an associate professor at the University of Chicago and co-author of one of three comments on the paper published in Science (2-4). “It was so unexpected, so novel, and so widespread, so we immediately searched for it in our own data. But we could not find it.”
All three comments noted a number of anomalies in the initial paper. Scientists have long been aware of RNA editing, but only two types of substitution have been seen to date: adenosine to inosine (read as guanine) and cytosine to uracil. Li’s group, however, reported all 12 possible nucleotide substitutions, and at a frequency never dreamed of. Yet the newly described RDDs were only about half as common as the two established types.
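The count of 12 follows from simple arithmetic: each of the four bases can be replaced by any of the other three. A minimal sketch in Python, assuming mismatches are read from cDNA (so uracil appears as T and inosine as G), enumerates the types and separates the two established editing signatures from the rest. The variable names are illustrative and not taken from any of the papers.

```python
from itertools import permutations

BASES = "ACGT"  # cDNA alphabet; uracil is read as T after reverse transcription

# Every ordered pair of distinct bases is one substitution type: 4 * 3 = 12.
substitutions = [f"{ref}>{alt}" for ref, alt in permutations(BASES, 2)]

# The two established editing types as they would appear in sequencing data:
# A-to-I is read as A>G, and C-to-U is read as C>T.
KNOWN_EDITING = {"A>G", "C>T"}

# Substitution types with no established editing mechanism behind them.
unexplained = [s for s in substitutions if s not in KNOWN_EDITING]

print(len(substitutions), "possible types;", len(unexplained), "without a known mechanism")
```

Ten of the 12 types therefore have no established editing mechanism, which is part of why the report drew such scrutiny.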
Moreover, the RDD sites were dramatically enriched at the ends of RNA sequencing reads and tended to appear on only one strand. One possible explanation is sequencing or alignment error, especially where reads span splice junctions (which the RDD sites often did, because the reads were aligned to a transcriptome rather than a genome). The comments also pointed out that the DNA sequencing coverage used for the analysis was quite low, with some regions sequenced only four times. In regions with higher coverage, the number of RDDs was lowest and fell within the range expected from previous RNA editing studies.
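The two biases described above are straightforward to quantify once the reads supporting a candidate site are in hand. The sketch below is only an illustration of the idea, not the method used in any of the comments: the function names, the tuple layout `(position_in_read, read_length, strand)`, and the 5-base window are all assumptions made for the example.

```python
from collections import Counter

def read_end_fraction(observations, end_window=5):
    """Fraction of supporting reads whose mismatch lies within
    `end_window` bases of either end of the read.

    observations: list of (pos, read_len, strand), with pos 0-based.
    """
    if not observations:
        raise ValueError("no observations supplied")
    near_end = sum(
        1
        for pos, read_len, _ in observations
        if min(pos, read_len - 1 - pos) < end_window
    )
    return near_end / len(observations)

def dominant_strand_fraction(observations):
    """Fraction of supporting reads that come from the more common strand."""
    if not observations:
        raise ValueError("no observations supplied")
    counts = Counter(strand for _, _, strand in observations)
    return max(counts.values()) / len(observations)

# Hypothetical reads supporting one candidate site (made-up values).
site = [(0, 50, "+"), (2, 50, "+"), (48, 50, "+"), (25, 50, "-")]
end_frac = read_end_fraction(site)
strand_frac = dominant_strand_fraction(site)
```

For 50-base reads, mismatches scattered uniformly along the read would fall in a 5-base end window about 20% of the time (10 of 50 positions), and a genuine variant should be supported by reads from both strands. Values far above those baselines, such as the 0.75 seen for both measures in the made-up site here, are the kind of signal that suggests an artifact.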
In a published response to these comments, Li and her co-authors reported additional assays that they claim validate their results (5). The group used two new sequence alignment methods and compared RNA sequences to genomic DNA rather than a transcriptome. This was to ensure that all potential paralogous regions that may have arisen through gene duplication were included, and that an RDD was not simply a transcript of a paralogous gene. In their response, the researchers wrote that “we view the discovery of RDDs as the important point and find the exact number to be less salient.”
In the end, Gilad remains unconvinced by the new data. “Everyone knew these sites existed, but no one knew that there were 10,000 of them and that all of the nucleotide changes could occur—that was the novelty in their paper. They may have answered anecdotal cases, but they did not address the difficulties the three comments raised.”
1. Li, M., I. X. Wang, Y. Li, A. Bruzel, A. L. Richards, J. M. Toung, and V. G. Cheung. 2011. Widespread RNA and DNA sequence differences in the human transcriptome. Science 333(6038):53-58.
2. Kleinman, C. L., and J. Majewski. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335(6074):1302.
3. Pickrell, J. K., Y. Gilad, and J. K. Pritchard. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335(6074):1302.
4. Lin, W., R. Piskol, M. H. Tan, and J. B. Li. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335(6074):1302.
5. Li, M., I. X. Wang, and V. G. Cheung. 2012. Response to comments on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335(6074):1302.