Periodic Reporting for period 3 - ERVE (Systematic discovery of functional elements in RNA virus genomes: an Encyclopedia of RNA Virus Elements)
Reporting period: 2018-09-01 to 2020-02-29
Computational analysis of virus genomes provides a practical and cost-effective way forward that can be used to precisely and efficiently target follow-up experimental research. The genome sequences of RNA viruses evolve very rapidly so that there is considerable diversity between different isolates of a single virus species. For medically or economically important species, there are often dozens or even hundreds of sequences available. By comparing the sequences of different virus isolates and computationally analysing the patterns of changes at different nucleotide positions (a technique known as "comparative genomics"), we can predict novel functional elements and often gain extensive insight into their function. Part of our work involves developing new comparative genomic techniques for virus genome analysis. We are also taking some of the most interesting newly discovered features into the lab to experimentally characterize exactly what their function is during virus infection. By enhancing our understanding of the molecular biology of many virus species, the project lays essential groundwork for follow-up advances in diverse virus control strategies.
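As a minimal illustration of the comparative genomics idea described above, the following Python sketch scores each column of a sequence alignment by how well it is conserved across isolates. It is a toy example under stated assumptions, not the project's actual method: the function name `column_conservation` and the four short aligned sequences are hypothetical, and real analyses use much more refined statistics (for example, ones that account for codon position and phylogeny).

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """For each alignment column, return the fraction of sequences
    that share the most common nucleotide at that position."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must be aligned to the same length")
    scores = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs)
        most_common_count = counts.most_common(1)[0][1]
        scores.append(most_common_count / len(aligned_seqs))
    return scores


# Hypothetical aligned isolates of one virus species (toy data).
toy_alignment = [
    "AUGGCUAAA",
    "AUGGCCAAA",
    "AUGGCAAAG",
    "AUGGCGAAA",
]
conservation = column_conservation(toy_alignment)
```

Columns that stay unusually conserved while their neighbours vary are the kind of signal that can hint at an overlapping functional element, which is then a candidate for experimental follow-up.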
One such mechanism is "ribosomal frameshifting" whereby a proportion of translating ribosomes are stimulated to shift into an alternative reading frame to produce a "hybrid" protein. Frameshifting is normally stimulated by signals within the mRNA which induce a fixed expression ratio of frameshift and non-frameshift products, e.g. the Gag and Gag-Pol polyproteins of HIV. However, we discovered and characterized the only two known examples of protein-stimulated frameshifting (one in the cardioviruses and the other in the arteriviruses) where frameshifting depends on a viral protein binding to signals in the mRNA. This allows the virus to modulate the efficiency of frameshifting as the amount of virus protein changes over time in an infected cell and thus regulate virus gene expression.
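The way a frameshift yields a "hybrid" protein can be sketched in a few lines of Python. This is a simplified illustration only: the mRNA sequence is invented, the function names and the `slip_pos` parameter are hypothetical, and the model simply restarts decoding one nucleotide back rather than modelling the real slippage of tRNAs on the slip site. The standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna, start=0):
    """Translate from `start` until a stop codon or the end of the mRNA."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_with_frameshift(mrna, slip_pos, shift=-1):
    """Decode zero-frame codons up to `slip_pos` (assumed to be a multiple
    of 3 with no upstream stop codon), then continue in the shifted frame."""
    upstream = "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(0, slip_pos, 3)
    )
    return upstream + translate(mrna, start=slip_pos + shift)


toy_mrna = "AUGGCUUUUUAAGGCAUUAG"  # invented sequence, not from a real virus
non_frameshift_product = translate(toy_mrna)
frameshift_product = translate_with_frameshift(toy_mrna, slip_pos=9)
```

In this toy case the non-frameshift product terminates at the zero-frame stop codon, while ribosomes that shift by -1 bypass it and extend the same N-terminal sequence with amino acids encoded in the alternative frame.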
In another group of viruses called the potyviruses (the largest and most important group of plant RNA viruses) we discovered a new gene, termed pipo, that is absolutely essential for virus spread within infected plants. Unusually, expression of pipo depends not on ribosomes slipping into an alternative reading frame, but instead on the virus replication enzyme slipping during synthesis of a small percentage of the virus mRNAs. This is a new phenomenon in this type of virus and led to a series of follow-up studies to investigate the molecular mechanism.
One of the ways in which we are investigating virus gene expression is the relatively recently developed technique known as Ribosome Profiling. This technique relies on the fact that a translating ribosome covers around 30 nucleotides of mRNA. Nucleotide sequences that are not thus protected by translating ribosomes can be enzymatically digested, leaving millions of 30-nucleotide-long RNA fragments that can be analyzed using High-Throughput Sequencing technology and then mapped back to virus and host mRNAs, to give a global snapshot of the positions of translating ribosomes. We have been using Ribosome Profiling to study gene expression in murine hepatitis coronavirus (a model for SARS and MERS coronaviruses), avian infectious bronchitis coronavirus, equine torovirus, enteroviruses, human astrovirus, and murine leukemia virus.
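The mapping step described above can be sketched as follows. This is a toy illustration under stated assumptions, not the project's pipeline: the reference sequence is invented, the function names are hypothetical, footprints are matched by exact string search (real analyses use dedicated short-read aligners and tolerate mismatches), and reading frame is assigned from the footprint 5' end without the offset correction normally applied to locate the ribosomal decoding site.

```python
from collections import Counter

FOOTPRINT_LENGTH = 30  # approximate length of mRNA protected by one ribosome


def map_footprints(reference, footprints):
    """Count footprint 5'-end positions for reads that match the
    reference exactly; unmatched reads are ignored."""
    starts = Counter()
    for read in footprints:
        pos = reference.find(read)
        if pos != -1:
            starts[pos] += 1
    return starts


def frame_counts(starts, cds_start=0):
    """Tally footprints by reading frame relative to the start of the
    coding sequence (frame 0, 1 or 2)."""
    frames = [0, 0, 0]
    for pos, count in starts.items():
        frames[(pos - cds_start) % 3] += count
    return frames


# Invented 60-nucleotide reference mRNA (toy data).
toy_reference = "AUGGCUAGCCGAUUACGGCAUCGAUGCCUAGGAUCCGAUACGGAUUCAGCGAUAACGGUA"

# Simulated footprints: 30-nucleotide fragments cut from known positions,
# plus one read that does not originate from this reference.
toy_footprints = [
    toy_reference[p:p + FOOTPRINT_LENGTH] for p in (0, 3, 3, 6, 10)
] + ["C" * FOOTPRINT_LENGTH]

footprint_starts = map_footprints(toy_reference, toy_footprints)
frames = frame_counts(footprint_starts)
```

Because ribosomes move along the mRNA one codon at a time, footprints from genuinely translated regions tend to pile up in one reading frame, which is what makes this kind of tally useful for spotting translation in unexpected frames.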
A key aspect of our work is to identify and characterize "hidden genes" in RNA viruses. We identified and characterized a novel protein (termed UP) encoded in the genomes of most human-infecting enteroviruses including poliovirus type 1, and a novel protein (termed XP) encoded in the genomes of human and other mammalian astroviruses. Very recently, through comparative genomic analysis of SARS-CoV and related coronaviruses, we identified a candidate new gene (termed 3c) that we are currently investigating experimentally.
We are also using targeted laboratory work to further investigate the most significant computational findings. This is leading to advances in our understanding of the biology of particular virus species of medical or economic significance, thus laying the groundwork for advances in virus control strategies, including the rational design of virus vaccine candidates and the identification of potential targets for antiviral drugs. Due to the unique constraints under which RNA viruses evolve, they have developed a variety of novel gene expression strategies and other unique molecular mechanisms, many of which have applications in biotechnology and as tools for fundamental molecular biology research. One focus of our research is to identify and characterize new mechanisms of this kind. A significant sub-project uses Ribosome Profiling to understand the dynamics of virus gene expression, and we have developed new computational tools to refine and interpret the data produced by this technique.