Cross-species Transmission History and Evolution in the LIneage of Gammaretroviruses and related Endogenous Retroviruses

Final Report Summary - C THE LIGER (Cross-species Transmission History and Evolution in the LIneage of Gammaretroviruses and related Endogenous Retroviruses)

Non technical summary of the project

“C THE LIGER” is a molecular evolution project that aims to study the viral or host factors that can predict the ability of Gammaretroviruses to move between hosts belonging to different species. The project was tailored to provide training and career development to Dr Gkikas Magiorkinis, and in that respect it has been successful. During the project Dr Magiorkinis developed his skills in genomics and Endogenous Retroviruses, especially with respect to evolution and bioinformatics. Dr Magiorkinis managed the available resources of the project and followed the Career Development Plan as set at the beginning of the project. He successfully applied for an MRC Clinician Scientist Fellowship; the scheme gives him the opportunity to develop his own group and lab on Endogenous Retroviruses. As a result, “C THE LIGER” has had to terminate early, before completing its research objectives, but it has produced the data and bioinformatics algorithms needed to serve those objectives. Dr Magiorkinis aims to continue the research plan of “C THE LIGER” and to use the developed algorithms and data, although he will have to prioritise the research objectives of his new fellowship. In conclusion, “C THE LIGER” has successfully given Dr Magiorkinis the chance to integrate into the University of Oxford and to develop the skills and experience needed to start his independent research group.

Technical outline of the work done on the project

We have built two reference sequence libraries from sequences available in GenBank: a partial pol sequence library covering all available ERV families and XRV species, and a full-length genome library covering all available Gammaretroviruses and Class-I ERV families. The pol reference library has been used for the data-mining procedure, because pol is the best-conserved part of the genome, while the full-length genome library will be used to construct alignments of Gammaretroviruses and Class-I ERVs. Both libraries will be used in subsequent analyses.

We have closely collaborated with Dr. Robert Gifford from ADARC who has mined the available mammalian genomes to locate ERVs. The procedure has been performed repeatedly to ensure that all the possible ERVs from mammals have been located; our role was to provide feedback on the extracted ERVs to improve mining algorithms.

Dr Magiorkinis has built a bioinformatics algorithm to measure the genetic integrity of each Class-I ERV element. For each gene, the algorithm measures the proportion of the Open Reading Frame that is intact. It determines whether an ERV replicates through a retrotransposing or a re-infecting life cycle by looking at the integrity, and thus the functionality, of the env gene, as well as at deletions shared between the genomes of different ERV elements.
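As an illustration of the kind of integrity measure described above, the following is a minimal sketch in Python. It is not the project's algorithm: the function name `orf_intact_fraction`, the use of the standard stop codons, and the definition of "intact" as the fraction of codons read before the first premature stop are all assumptions made for this example.

```python
# Hypothetical sketch: the definition of "intact" used here is an assumption,
# not the measure implemented in the project.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def orf_intact_fraction(gene_seq: str) -> float:
    """Fraction of in-frame codons read before the first premature stop.

    The sequence is assumed to start at the first codon of the gene.
    A stop codon at the very end is treated as the normal terminator.
    """
    seq = gene_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Drop the regular terminal stop codon, if present.
    if codons and codons[-1] in STOP_CODONS:
        codons.pop()
    if not codons:
        return 0.0
    for index, codon in enumerate(codons):
        if codon in STOP_CODONS:
            # Premature stop: only the codons before it are intact.
            return index / len(codons)
    return 1.0


if __name__ == "__main__":
    # Toy sequences, not real ERV data.
    print(orf_intact_fraction("ATGAAACCCTAA"))  # no premature stop
    print(orf_intact_fraction("ATGAAATAGCCC"))  # premature stop at codon 3
```

Under these assumptions, an env gene scoring close to 1.0 would be a candidate for a functional envelope, while a low score would point to a disrupted one. A fuller treatment would also need to handle frameshifts and the shared deletions mentioned above.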

Dr Magiorkinis has built an algorithm to measure cross-species transmission. The algorithm uses a maximum parsimony approach that treats the host species of each ERV as a discrete character; this method reconstructs all the possible scenarios of host-switching history and identifies the scenario with the smallest number of switches. We define a cross-species transmission event as a point on the tree where long subtending branches belong to a different host species in the most parsimonious scenario.
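The parsimony idea can be illustrated with a short Python sketch. It uses Fitch's small-parsimony algorithm, a standard technique chosen here for illustration and not necessarily the method implemented in the project. The nested-tuple tree encoding, the function name `fitch_min_switches`, and the toy host labels are assumptions; the tree is assumed to be rooted and strictly binary, and branch lengths are ignored.

```python
# Hypothetical sketch of Fitch parsimony with host species as a discrete
# character. A leaf is a host name (str); an internal node is a 2-tuple.
from typing import Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_min_switches(node: Tree) -> Tuple[Set[str], int]:
    """Return (candidate host set at this node, minimum number of switches)."""
    if isinstance(node, str):
        # A leaf is observed in exactly one host species.
        return {node}, 0
    left, right = node
    left_hosts, left_cost = fitch_min_switches(left)
    right_hosts, right_cost = fitch_min_switches(right)
    shared = left_hosts & right_hosts
    if shared:
        # Both subtrees agree on at least one host: no switch needed here.
        return shared, left_cost + right_cost
    # No shared host: one host switch must occur on a branch below this node.
    return left_hosts | right_hosts, left_cost + right_cost + 1


if __name__ == "__main__":
    # Toy tree of five ERV elements labelled by host species.
    tree = (("human", "human"), ("mouse", ("mouse", "human")))
    hosts_at_root, switches = fitch_min_switches(tree)
    print(sorted(hosts_at_root), switches)
```

This bottom-up pass only gives the minimum number of host switches and the candidate hosts at each node. Locating individual transmission events as defined above would additionally need a top-down pass to assign a host to each internal node, together with the branch lengths.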

Research-training activities
The fellow has engaged in the following training activities:
A) Training in molecular evolution of ERVs (Training through research)
B) Training in mining and analysing large genomes (Hands-on training)
C) Training in advanced bioinformatics analyses and software programming (Hands-on training, seminars)
D) Training in advanced statistics: the Phylogenetic Comparative Method (Hands-on training)

During the project Dr Magiorkinis has produced 4 publications that are connected (directly or indirectly) with his activity in “C THE LIGER”:
- 2 publications (1 lead-author, 1 middle-author) in PLoS Computational Biology
- 1 lead-author review in the Philosophical Transactions of the Royal Society B
- 1 middle-author publication in Hepatology