By Victoria Foe
Originally published in Friday Harbor Labs Tide Bite
I came to FHL in 2000 to study cell division, to take advantage of the sea urchins, sea stars and sand dollars whose huge, glassy-clear, permeable and easy-to-manipulate eggs make them ideal subjects for that work. Local species have eggs that ripen at different times, providing an almost year-round source of perfect research material. For experiments on marine cells, FHL’s flowing seawater tables are the “lab benches” that land-locked cell scientists can only dream of.
Then in 2013, the Seaver Institute gave me the opportunity to tackle a question that had haunted me since graduate school: does DNA replication activate new gene expression? If so, this would provide an elegant mechanism compelling embryonic development to unfold in lock-step with cell division. Answering this required looking directly at DNA and RNA with an electron microscope, and thousands of hours searching tangles of DNA for those rare instances when DNA and RNA synthesis had collided. I will write a future Tide Bite about what I discovered. But this Bite is drawn from a related review I’m just now completing. In this time of the COVID-19 pandemic it is also a reminder that great catastrophes can bring about great transformations.
This story is about another virus that turned the world upside down. Like coronaviruses, this transformer used RNA, not DNA, as its genetic material. The events in question occurred about two billion years ago, when life was simpler. At that time Earth was home to only two of life’s three domains: Bacteria and the ever-so-slightly more sophisticated Archaea. The organisms of these two domains are unicellular beings, with small DNA genomes, no discrete nucleus and limited possibilities. There existed then no beautiful diatoms, no gliding amoebae, no sun-capturing algae, no sea stars, or octopuses, or grasses, or trees, or tree frogs, or birds, or mammals.
The genes that a bacterial or archaean cell uses to conduct its small affairs are sufficiently short and few that they all fit like beads on one circular chromosome. These units of information, hard-wired together, are individually activated as needed and transcribed into messenger RNAs (mRNAs), transportable cassettes that each direct the synthesis of a specific protein.
We do not know which species fell prey to the virus, or even whether the disease attacked one or many species. But we are pretty sure the pathogen was a Group II retrotransposon (Rogozin et al. 2012, Lambowitz and Zimmerly 2004). When it infects its victim, a retrotransposon presents its mRNA-like genome for translation and the host innocently translates it, mistaking it for one of its own. This piece of malicious code contains instructions for making reverse transcriptase and integrase. Reverse transcriptase reverse-copies the viral RNA into DNA. Integrase inserts the viral DNA seamlessly into the host’s necklace of genes. Unlike coronavirus infections, retrotransposon infections are usually permanent, since viral DNA, once integrated, is rarely lost from the host chromosome. As a consequence, thenceforth the host and her descendants must repair, replicate and transcribe the parasite’s DNA together with her own, much as a lark feeds the cuckoo chick in her nest, but in perpetuity. To ensure the newly transcribed copies of itself are released to infect other cells and other sites in the host’s chromosome, the retrotransposon RNA contains two special stretches of sequence. The first self-folds into molecular scissors able to cut RNA molecules at specific sites. The second, formed by the two ends of the viral RNA contacting one another, creates the sequence those scissors recognize. Thus, the vicious and infectious virus dissects itself out of the host mRNA. In the process, the host’s own transcripts are left in fragments.
Mutation would eventually reduce its virulence, but not before the virus had inserted itself throughout its victim’s chromosome. In so doing, it permanently changed how the host’s descendants could regulate their genes. In the Eukarya (life’s third domain, which includes ourselves and the life forms we see all about us) genes are enormously longer than in Bacteria and Archaea and are laid out most oddly. And in this oddness, the telltale fingerprints of the virus remain. Eukaryotic genes exist as many discontinuous fragments of protein-encoding sequence interrupted by long stretches of non-coding “junk” DNA. RNA polymerases transcribe the short stretches of coding DNA plus the long runs of junk sequence into one long continuous piece of RNA. Production of mRNA thus requires clipping out the junk and suturing the coding sequences together. The boundaries between the junk and the coding RNA are encoded by the same sequence that even present-day bacterial Group II retrotransposons use to mark boundaries between viral and host sequences. In eukaryotes, the splicing is now done by an RNA/protein complex called a spliceosome. But the RNA moiety at the heart of the spliceosome is that same self-folding sequence retrotransposons use as scissors (Lambowitz and Zimmerly 2004, Rogers 1990). What has changed, though, is that eukaryotes have seized control of the scissors: now the pieces of coding RNA are stitched back together to make the eukaryote’s mRNA, and it is the stranded relics of viral DNA (long since drained of all contagion and slowly turned by mutation to noncoding junk) that are cut out, broken down and recycled. Under my electron microscope, I often glimpse this RNA pruning underway.
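For readers who like to see the logic spelled out, the clip-and-suture step can be pictured with a toy sketch in Python. Everything in it is an illustrative assumption: the segment labels, the function name `splice` and the short made-up sequences are mine, and a real spliceosome recognizes boundary signals in the RNA rather than reading labels. The junk segments here are written to begin with GU and end with AG, as most real spliceosomal junk sequences do.

```python
from typing import List, Tuple

# A primary transcript modelled as labelled pieces: (kind, RNA sequence).
# kind is "coding" or "junk"; the sequences are invented for illustration.
Segment = Tuple[str, str]


def splice(transcript: List[Segment]) -> Tuple[str, List[str]]:
    """Stitch the coding pieces together in order; return them with the excised junk."""
    coding = [seq for kind, seq in transcript if kind == "coding"]
    junk = [seq for kind, seq in transcript if kind == "junk"]
    return "".join(coding), junk


primary_transcript = [
    ("coding", "AUGGCU"),
    ("junk", "GUAAGUCCUUUAG"),   # hypothetical junk, bounded by GU ... AG
    ("coding", "GAAUUC"),
    ("junk", "GUGAGUAACAG"),     # hypothetical junk, bounded by GU ... AG
    ("coding", "UAA"),
]

mrna, excised = splice(primary_transcript)
discarded = sum(len(seq) for seq in excised)
total = sum(len(seq) for _, seq in primary_transcript)

print("mRNA:", mrna)
print("excised junk:", excised)
print(f"{discarded} of {total} transcribed bases are cut out and recycled")
```

Even in this tiny example more than half of what is transcribed is thrown away, which hints at how much time a polymerase can spend traversing sequence that never reaches the finished message.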
RNA polymerases fall off chromosomes when cells divide, so at the start of each new cell cycle they must load again at the beginning of a gene and commence anew the journey down its length (Shermoen and O’Farrell 1991). Therefore, the inclusion of lengths of junk DNA in genes acts as a timing fuse, determining when each gene’s first mRNA (and thence protein) appears. Some genes take under a minute to transcribe, some take hours, a few are so long they take days. In different genes and species, natural selection has altered the lengths of junk DNA inserted between splice sites, sometimes by orders of magnitude. Bacteria, archaeans and eukaryotes all turn genes on and off by regulatory molecules that gate the loading of RNA polymerase. But the additional tool of using the transcription of junk DNA as a delay timer allowed eukaryotes to create vastly more complex genetic circuits. One such example is the auto-inhibitory feedback circuit with a long delay set by junk DNA, which produces the oscillations that drive segment formation in vertebrate embryos (Takashima et al. 2011). The resulting oscillatory gene activation sequentially lays down, along the body axis, the blocks of cells that produce vertebrae, ribs, muscles and so on.
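Why a delay should turn a simple off-switch into a clock can be seen in a minimal sketch, offered under stated assumptions rather than as the model used by Takashima et al. The elongation rate of 1,500 bases per minute is an assumed round figure, and the feedback equation, its parameters `beta` and `hill`, the dimensionless time units and the function names are all hypothetical choices made for illustration. A protein represses its own gene, but the repression acts on the protein level that existed one transcription delay earlier.

```python
def transcription_minutes(gene_length_bp: int, rate_bp_per_min: float = 1500.0) -> float:
    """Time for one polymerase to traverse a gene (the rate is an assumed round figure)."""
    return gene_length_bp / rate_bp_per_min


def simulate(delay: float, t_end: float = 100.0, dt: float = 0.01,
             beta: float = 2.0, hill: int = 4) -> list:
    """Euler integration of dx/dt = beta / (1 + x(t - delay)**hill) - x.

    x is the level of a protein that represses its own production, with the
    repression acting after a lag. Time is in arbitrary units.
    """
    lag = int(round(delay / dt))
    steps = int(round(t_end / dt))
    x = [0.0]
    for i in range(steps):
        past = x[i - lag] if i >= lag else 0.0   # no protein before time zero
        production = beta / (1.0 + past ** hill)
        x.append(x[i] + dt * (production - x[i]))
    return x


def late_swing(levels: list, fraction: float = 0.3) -> float:
    """Peak-to-trough range over the final part of the run."""
    tail = levels[int(len(levels) * (1.0 - fraction)):]
    return max(tail) - min(tail)


print("3,000-base gene:", transcription_minutes(3_000), "minutes")
print("3,000,000-base gene:", transcription_minutes(3_000_000) / 60, "hours")
print("late swing, no delay:  ", late_swing(simulate(delay=0.0)))
print("late swing, long delay:", late_swing(simulate(delay=5.0)))
```

With no delay the protein level settles to a steady value and stays there. With a long delay the same circuit keeps overshooting and undershooting, because each correction arrives too late, and the level cycles indefinitely.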
Also, by selective use of alternate splice sites, eukaryotes can make multiple variants of a protein from a single gene. The Down syndrome cell adhesion molecule gene (Dscam) takes this to an extreme. Dscam encodes cell surface receptors used for axon identity and guidance during nervous system development in organisms as diverse as fruit flies and humans. By virtue of combinatorial use of many alternative splice sites, the single Dscam gene can potentially generate over 38,000 slightly different versions of the DSCAM protein (Schmucker et al. 2000)!
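The arithmetic behind that figure is simple multiplication. In the fruit fly gene described by Schmucker et al. (2000), four positions in the message can each be filled by exactly one alternative drawn from a cluster of 12, 48, 33 and 2 variants respectively. The short Python sketch below just multiplies those choices together; the dictionary and variable names are mine.

```python
from math import prod

# Number of mutually exclusive alternatives at each variable position of
# Drosophila Dscam, as reported by Schmucker et al. (2000).
alternatives_per_cluster = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}

# One alternative is chosen independently at each position, so the number
# of possible messages is the product of the cluster sizes.
potential_isoforms = prod(alternatives_per_cluster.values())
print(f"{potential_isoforms:,} potential DSCAM variants")
```

This is a count of what the splicing scheme makes possible, not a measurement of how many variants any one fly actually produces.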
In summary, a catastrophic viral invasion of the ancestor of all eukaryotes introduced a radical new tool for gene regulation, one that would facilitate the development of more complex life forms. Much of this regulation is based on gene length and is independent of base sequence, explaining why so much of what was once thought to be junk DNA is now integral to eukaryotic genomes. In humans, for example, over 80 percent of the genome is transcribed into RNA, yet only about 1 percent encodes messenger RNAs. No crisis has ever been turned more spectacularly to advantage than that confrontation, two billion years ago, between a primitive cell and its virus. Some tiny cell recast its calamity and brought forth a world populated with what Darwin called “endless forms most beautiful and most wonderful.”
Dr. Foe is a Research Professor emeritus at the University of Washington. She has been a full-time member of the Friday Harbor Labs research community for the past 20 years, was a founding member of the FHL Center for Cell Dynamics and has been a Guggenheim and a MacArthur Fellow.
References:
Lambowitz A.M. and S. Zimmerly. 2004. Mobile Group II introns. Annu Rev Genet: 38 (1-35).
Rogers J.H. 1990. The role of introns in evolution. FEBS Letters: 268 / 2 (339-343).
Rogozin I.B., Carmel L., Csuros M. and E.V. Koonin. 2012. Origin and evolution of spliceosomal introns. Biology Direct: 7 / 1 (11-28).
Schmucker D., Clemens J.C., Shu H., Worby C.A., Xiao J., Muda M., Dixon J.E. and S.L. Zipursky. 2000. Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell: 101 / 6 (671-684).
Shermoen A.W. and P.H. O’Farrell. 1991. Progression of the cell cycle through mitosis leads to abortion of nascent transcripts. Cell: 67 / 2 (303-310).
Takashima Y., Ohtsuka T., Gonzalez A., Miyachi H. and R. Kageyama. 2011. Intronic delay is essential for oscillatory expression in the segmentation clock. PNAS: 108 / 8 (3300-3305).