CORDIS - EU research results

Understanding (harmful) virus-host interactions by linking virology and bioinformatics

Periodic Reporting for period 1 - VIROINF (Understanding (harmful) virus-host interactions by linking virology and bioinformatics)

Reporting period: 2020-09-01 to 2022-08-31

VIROINF is an EU-financed Marie Skłodowska-Curie Action (MSCA) Innovative Training Network aiming to understand (harmful) virus-host interactions by linking virology and bioinformatics. Emerging and existing viruses are currently at the forefront of scientific priorities. Viruses will remain an urgent topic given the continuously growing and aging global population, creating major challenges for health science, technology, and society as a whole. The VIROINF Innovative Training Network focuses on virus-host interaction by combining virus research with specifically designed bioinformatics tools, with the aim of helping to prevent infections and to enable vaccinations and treatments.
The objective of VIROINF is to train Early Stage Researchers (ESRs) as a new generation of young scientists with excellent understanding and skills in virology, cell biology, and bioinformatics, to the great benefit and advancement of the field. The consortium consists of 27 high-profile universities, research institutions, and companies. Coordination is at the Friedrich Schiller University Jena, Germany. The host institutions and partners are located in Austria, Belgium, France, Germany, Israel, the Netherlands, Spain, Switzerland, and the United Kingdom.

The overall scientific objectives of the VIROINF project, to be tackled through close interaction between virology and bioinformatics, are:
- Modelling of virus-host interactions
> Virus identification
> Host prediction
> Virus-interactions
> Virus regulation
> Virus products
- Modelling of virus evolution in hosts
> Microevolution: virus quasispecies
> Macroevolution: natural selection of viruses
To improve virus identification, two large Illumina shotgun sequencing datasets of the honey bee gut have been produced. The datasets are being analyzed jointly and will aid in designing a machine-learning algorithm for viral sequence identification.
Furthermore, a bioinformatic pipeline for the identification and classification of Crassvirales bacteriophages has been developed. This resource will be used to increase the number and diversity of this group of bacteriophages in open-access databases.
As the specific hosts of bacteria-infecting viruses are usually unknown, a Viral-Tagging (VT) approach was used to establish a dataset of virus-host species interactions, including bacteriophage-bacteria interactions, which is being used to develop the machine-learning host prediction model.
During productive infection, viral proteins and RNAs interact with cellular pathways to reprogram their host cells for efficient virus replication and immune evasion. We developed and implemented computational methods to analyze virus-host interactions in the specific context of cytomegalovirus infection, using an integrative analysis of a broad range of high-throughput data sets. We developed a software package for the analysis of nucleotide conversion sequencing data and a computational method for trajectory inference from time-course data. In addition, barcoded viruses were generated to enable us to determine the exact infection dose in single cells.
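As a rough illustration of the basic quantity behind nucleotide conversion sequencing, the sketch below counts T-to-C mismatches in reads that are already aligned to a reference without gaps. It is not the software package mentioned above: the function names `count_conversions` and `conversion_rate` are hypothetical, and the gap-free alignment and the T-to-C conversion type are simplifying assumptions.

```python
def count_conversions(reference, read, ref_base="T", read_base="C"):
    """Return (conversions, convertible_sites) for one gap-free aligned read.

    Assumption: `reference` and `read` have the same length and are aligned
    position by position (no insertions or deletions).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same aligned length")
    convertible = 0
    conversions = 0
    for ref_nt, read_nt in zip(reference.upper(), read.upper()):
        if ref_nt != ref_base:
            continue  # only reference T positions can show a T>C conversion
        convertible += 1
        if read_nt == read_base:
            conversions += 1
    return conversions, convertible


def conversion_rate(reference, reads):
    """Pooled conversion rate over all reads covering the same reference."""
    total_conversions = 0
    total_sites = 0
    for read in reads:
        conversions, sites = count_conversions(reference, read)
        total_conversions += conversions
        total_sites += sites
    return total_conversions / total_sites if total_sites else 0.0
```

A real analysis would additionally have to separate genuine conversions from sequencing errors and SNPs, which this sketch does not attempt.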
Viruses circulate predominantly in their natural reservoir host but frequently also infect other species. Influenza A viruses (IAV) are known etiological agents of occasional flu pandemics and annual seasonal epidemics among humans. The ability of influenza A viruses to adapt efficiently across a spectrum of mammalian hosts is at least in part due to the segmented genome of the virus, which allows new combinations of segments to be generated through reassortment upon coinfection. It is currently largely accepted that IAV genome packaging is a non-random process. To investigate how this might be regulated, we use two complementary techniques: (i) RNA proximity ligation, which provides a framework for identifying interaction partners, and (ii) SHAPE, which provides reactivities of individual bases. We are developing experimental and computational methods to apply these techniques to the elucidation of the IAV interactome.
Newly discovered viral products, especially enzymes, have a high chance of showing new and different properties that can open up new or improved applications. Phage cocktail metagenomes are distinctive in that they contain many closely related genome sequences, and bacteriophage genes are not well represented in the databases used for annotation. We established a computational pipeline to assemble as many full-length viral genomes as possible by combining the state-of-the-art viral metagenome assembler metaviralSPAdes with the general metagenome assembler metaSPAdes.
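The report does not describe how the outputs of the two assemblers are combined. Purely as an illustrative assumption, one simple strategy is to pool the contigs from both runs and drop any contig that is contained in a longer one, on either strand. The function names and the `min_length` parameter below are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_contigs(primary, secondary, min_length=1000):
    """Pool contigs from two assemblies and remove contained duplicates.

    Assumption: containment is tested by exact substring match on either
    strand, which is quadratic and only suitable for small contig sets.
    """
    # Longest first, so shorter contigs are tested against longer ones.
    candidates = sorted(list(primary) + list(secondary), key=len, reverse=True)
    kept = []
    for contig in candidates:
        if len(contig) < min_length:
            continue
        rc = reverse_complement(contig)
        if any(contig in longer or rc in longer for longer in kept):
            continue  # already represented by a longer contig
        kept.append(contig)
    return kept
```

Exact matching will not collapse the closely related genomes typical of phage cocktails, so a practical pipeline would more likely rely on alignment-based clustering with an identity threshold.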
Viruses display high genetic diversity both within and among viral species, as well as within and among infected hosts.
Low-frequency variants are of particular interest because they may harbour drug resistance mutations or affect virulence. We aim to distinguish viral haplotypes in RNA-Seq data and to characterise sequence-based evolution in order to understand the role of quasispecies in virus pathogenesis and evolution.
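To make the notion of a low-frequency variant concrete, the following minimal sketch reports non-consensus bases per position from a set of reads that are assumed to be already aligned to the same coordinates and of equal length. The function name `minor_variants` and the thresholds `min_freq` and `min_depth` are hypothetical illustration choices, not the project's method.

```python
from collections import Counter


def minor_variants(aligned_reads, min_freq=0.01, min_depth=10):
    """Return (position, base, frequency) for non-consensus bases.

    Assumption: all reads have the same length and share coordinates;
    characters other than A, C, G, T (e.g. gaps or N) are ignored.
    """
    if not aligned_reads:
        return []
    results = []
    for pos in range(len(aligned_reads[0])):
        counts = Counter(
            read[pos] for read in aligned_reads if read[pos] in "ACGT"
        )
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to estimate a frequency
        consensus, _ = counts.most_common(1)[0]
        for base, n in sorted(counts.items()):
            if base != consensus and n / depth >= min_freq:
                results.append((pos, base, n / depth))
    return results
```

Per-position frequencies alone do not reconstruct haplotypes, which requires linking variants that co-occur on the same reads, and they do not separate true variants from sequencing errors.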
Phylogenetic trees are the most widespread presentation for viral phylogenies in the literature. Several tree-building methods and software tools exist, but these methods can produce incorrect results for viral phylogenies because of the complex evolutionary relationships that are relevant for viruses. We have started to infer the evolutionary linkages between distinct genomic sites in order to develop methods that accurately reconstruct viral phylogenies in the presence of disruptive processes such as virus recombination. The necessary genotypic data are derived from controlled in vivo evolution experiments with Deformed wing virus (DWV) in the honeybee.
We implemented an end-to-end deep learning pipeline for viral identification using the TensorFlow framework, with graphics processing units (GPUs) to accelerate training. A thorough analysis of the large datasets is still in progress.
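The report does not specify how sequences are fed into the network. A common choice for sequence-based deep learning, assumed here only for illustration, is one-hot encoding of nucleotides into a fixed-length matrix. The sketch uses plain Python rather than TensorFlow, and the name `one_hot` and the padding scheme are hypothetical.

```python
BASES = "ACGT"


def one_hot(sequence, length):
    """Encode a nucleotide string as a `length` x 4 matrix of floats.

    Assumption: sequences are truncated or zero-padded to `length`, and
    ambiguous characters such as N become all-zero rows.
    """
    seq = sequence.upper()[:length]
    matrix = [[1.0 if nt == base else 0.0 for base in BASES] for nt in seq]
    # Pad short sequences with all-zero rows up to the fixed length.
    matrix.extend([0.0] * len(BASES) for _ in range(length - len(seq)))
    return matrix
```

A nested list of this shape can be converted to an array or tensor by whichever numerical library the training code uses.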
Viral-Tagging (VT) is a high-throughput means of experimentally linking phages to a target host. A new VT approach based on a single-cell sorter for anaerobic bacteria was adopted to separate phage-infected anaerobic or facultative bacteria, together with their infecting phage(s), from non-infected anaerobic or facultative bacteria using fluorescence-activated cell sorting.
To identify RNA-RNA interactions that play a key role in genome packaging, we produced a set of eight reassortants of different influenza A viruses through reverse genetics. These reassortants were evaluated for viral fitness in vitro and subjected to chemical modification and subsequent mutational profiling to identify potential regions of RNA-RNA interaction.
Using CrassUS, we identified new Crassvirales genomes in coprolites, non-human primates' feces, and the Gut Phage Database, among others. According to our genomic analyses, some of these genomes represent new genera and species of the widespread Crassvirales bacteriophage order.
We observed that phages change their phenotype with regard to the range of bacterial strains they can lyse. The phage genomes have been sequenced, and their analysis is in progress.
We have explored the application of Direct Coupling Analysis (DCA) to large datasets of real SARS-CoV-2 sequences sampled at different time points during the pandemic. We hope to identify mutational hotspots in SARS-CoV-2 evolution and to predict the associated fitness changes.
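DCA itself fits a global statistical model to an alignment and is beyond a short example. As a much simpler stand-in that conveys the idea of co-variation between sites, the sketch below computes the mutual information between two columns of a multiple sequence alignment. This is explicitly not DCA, and the function name is hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Assumption: `alignment` is a non-empty list of equal-length strings;
    no pseudocounts or sequence reweighting are applied.
    """
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Unlike DCA, mutual information cannot distinguish direct couplings from correlations that arise indirectly through a third site or through shared ancestry.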
Location of VIROINF consortium partners.