Antonio Giraldez, a geneticist at Yale University, has long been fascinated with embryogenesis. “How is gene expression regulated? How is it coordinated so you can go from a fertilized egg to an embryo with differentiated cells—retinal cells, heart cells, muscle cells?” he questioned. “And what I’d like to specifically understand is the very first step of that process. It’s called the maternal-to-zygotic transition.”
This discovery was made using ribosome footprinting, a technique that allowed Giraldez’s group to explore the coding potential of genes in vertebrates. “Historically, there were criteria used to define coding genes. They had somewhat arbitrary cut-offs on the size of the open reading frame,” Giraldez explained. “But by looking at how the ribosome is moving on the messenger RNA (mRNA), moving every three nucleotides, we can say that they are being translated, even in these short open reading frames.”
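The three-nucleotide stepping Giraldez describes can be illustrated with a small calculation. The sketch below is not the group's published pipeline; it is a minimal illustration that assumes each ribosome footprint has already been reduced to a single transcript position, and the function names and thresholds (`min_reads`, `min_frame0`) are hypothetical choices made for this example.

```python
from collections import Counter


def frame_fractions(footprint_positions, orf_start):
    """Fraction of footprints in each reading frame (0, 1, 2) of a candidate ORF.

    footprint_positions: transcript coordinates, one per footprint (assumed input).
    orf_start: coordinate of the candidate start codon.
    """
    counts = Counter(
        (pos - orf_start) % 3
        for pos in footprint_positions
        if pos >= orf_start
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in range(3)]


def looks_translated(footprint_positions, orf_start, min_reads=10, min_frame0=0.6):
    """Crude periodicity test: enough footprints, mostly in frame 0.

    Both thresholds are illustrative assumptions, not published values.
    """
    in_orf = [pos for pos in footprint_positions if pos >= orf_start]
    if len(in_orf) < min_reads:
        return False
    return frame_fractions(in_orf, orf_start)[0] >= min_frame0


# Toy data: footprints that step every three nucleotides, plus one stray read.
periodic = [100, 103, 106, 109, 112, 115, 118, 121, 124, 127, 130, 101]
# Toy data: footprints spread evenly over all positions (no periodicity).
uniform = list(range(200, 230))

print(looks_translated(periodic, orf_start=100))
print(looks_translated(uniform, orf_start=200))
```

In this toy case the first set of footprints concentrates in one frame and passes the test, while the evenly spread set does not. The point is that the signal depends on the pattern of ribosome movement, not on the length of the open reading frame.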
This does not mean that all long non-coding RNAs are translated into peptides, but it does challenge commonly held notions about what these sequences are doing—and, perhaps, calls for renaming some RNAs. “I think a fraction of what we now call long non-coding RNAs will probably be translated into protein and those proteins will have a function. But they could also have some other kind of intrinsic function in translation, perhaps regulating the stability of the mRNA in some way,” he said.
A Curious Story
Across the Atlantic at the University of Sussex, senior research fellow Juan Pablo Couso also serendipitously discovered a smORF-encoded peptide. “It was a curious story. We work with mutant flies. I was looking at a fly in my microscope, and it was clearly a spontaneous mutant, and he had no legs,” Couso recounted. “We isolated that mutation and mapped it to a region of the genome where there was no gene characterized, only some RNAs. One of those RNAs seemed to be non-coding so we took a closer look at that.”
In that region, Couso’s group found small open reading frames (smORFs) that encoded 42 amino acids and 32 amino acids, but they thought these were too short to influence the leg mutation. Then they noticed an 11-amino-acid sequence encoded in a smORF; when they removed the sequence, leg development was compromised (2).
“This is 10 times smaller than the previous minimum gene coding sequence for ‘normal’ genes,” Couso said. “But 11 amino acids was all that was needed to develop proper legs in the fly. We were completely amazed.”
Encouraged, Couso and colleagues started looking at other non-coding RNAs in the fruit fly and discovered a new smORF that encoded two similar peptides, each fewer than 30 amino acids long. They soon learned that these small proteins helped regulate calcium transport.
“This is something that is expressed in all muscles. The best place to characterize this was in the heart and we saw that it was important to regular heart contraction,” he said. “The amazing surprise was that those same peptides also concern humans. In fact, it’s been known for a while to be involved in arrhythmias.”
Couso argues that these studies on smORFs show that genes can be much smaller than previously thought. And, because of their diminutive nature, they may be missed easily by geneticists working to decipher the role of RNAs in embryogenesis and development.
The True Coding Potential of the Genome
Giraldez and Couso are not the only researchers to stumble onto smORF-related discoveries. Alexander Schier and his team at Harvard University recently discovered a small peptide in zebrafish called Toddler that promotes cell movement during embryogenesis (3). And Couso believes this is only the tip of the iceberg: many more discoveries are sure to follow.
“We have underestimated the coding potential of the genome,” said Couso. “And these peptides could affect all aspects of biology, not just embryogenesis. I foresee that we’re going to have a lot of exciting discoveries in the next few years—and we’ll see there are many, many more of these peptides that affect embryogenesis, the development of disease, the regular function of the nervous system, and perhaps much more.”
Giraldez agreed but cautioned that there is still a lot of work to be done. “We are now seeing that many of these non-coding RNAs are translated and possibly code for a protein. It raises many questions. What is the function of each of these coding sequences? Is it regulatory? Is it coding a protein? Is it irrelevant? We don’t know yet.” But until researchers systematically remove these so-called non-coding RNAs to test their influence on the course of development, we won’t be any closer to understanding their role in embryogenesis.
“We have read the book of the genome. But we still need to identify many of the words in that book,” said Giraldez. “These studies allow us to start assigning meaning to different genes. It’s not a complete picture, of course. But it allows us to start classifying and identifying some of those words. It gives us hints to the meaning of those words. And one day, understanding those meanings will help us better understand embryo development.”
1. Bazzini AA, Johnstone TG, Christiano R, Mackowiak SD, Obermayer B, Fleming ES, Vejnar CE, Lee MT, Rajewsky N, Walther TC, Giraldez AJ. Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J. 2014 May 2;33(9):981-93.
2. Ladoukakis E, Pereira V, Magny EG, Eyre-Walker A, Couso JP. Hundreds of putatively functional small open reading frames in Drosophila. Genome Biol. 2011 Nov 25;12(11):R118.
3. Pauli A, Norris ML, Valen E, Chew GL, Gagnon JA, Zimmerman S, Mitchell A, Ma J, Dubrulle J, Reyon D, Tsai SQ, Joung JK, Saghatelian A, Schier AF. Toddler: an embryonic signal that promotes cell movement via Apelin receptors. Science. 2014 Feb 14;343(6172):1248636.