CORDIS - EU research results

Reconstructing gene family evolution in the ash genus (Fraxinus)

Periodic Reporting for period 1 - FraxiFam (Reconstructing gene family evolution in the ash genus (Fraxinus))

Reporting period: 2015-07-01 to 2017-06-30

This project undertook research on the structure and evolution of the genome of the European ash tree, Fraxinus excelsior. Ash tree populations in Europe are highly threatened by the invasive fungal pathogen Hymenoscyphus fraxineus, which causes ash dieback. This project supplies essential foundational knowledge for modern approaches to the breeding of increased disease resistance in ash trees. A major finding of the project is that there appears to have been a whole genome duplication event in the history of ash that is shared with the olive tree. This finding was published in the journal Nature in a paper on the ash genome that was widely publicised by news media. Insights from the project have been communicated to the European research community working on ash dieback. As well as having important implications for ongoing research programmes on disease resistance in ash, this finding unexpectedly impacts research on the biosynthesis of olive oil. The project also researched the genomes of 27 other species and sub-species of ash from around the world and has led to the release of genome assemblies for these, allowing future research on the genetic basis for the resistance of some of these species to ash dieback and the emerald ash borer.
A reference genome of the European ash, Fraxinus excelsior, was sequenced and annotated by the host lab group and their collaborators. The MSCA Fellow, Dr Endymion Cooper, used the annotated genome to analyse evidence for past whole genome duplications by plotting the distribution of synonymous substitution rates between pairs of related genes in the F. excelsior genome. The same analysis was performed using the genomes of six other species, including olive. A script for automating this process was created and published online (https://github.com/EndymionCooper/KSPlotting/blob/master/kSPlotter.py). The analysis revealed that ash and olive appear to share at least two ancestral whole genome duplications, one of which appears to be restricted to the family Oleaceae.
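
The idea behind this kind of analysis can be illustrated with a minimal sketch. It is not the published kSPlotter.py script, and it assumes that synonymous substitution rates (Ks) for pairs of related genes have already been estimated elsewhere. The function names, the bin width and the upper Ks cut-off are hypothetical choices made for illustration only. A burst of gene pairs with similar Ks values shows up as a peak in the binned distribution, which is the kind of signal that can point to a past whole genome duplication.

```python
from typing import List, Sequence


def ks_histogram(ks_values: Sequence[float],
                 bin_width: float = 0.1,
                 max_ks: float = 3.0) -> List[int]:
    """Count gene pairs per Ks bin.

    Values that are not strictly between 0 and max_ks are dropped
    (the cut-off is an illustrative assumption, not taken from the report).
    """
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0.0 < ks < max_ks:
            # Clamp the index to guard against floating-point rounding at the top edge.
            idx = min(int(ks / bin_width), n_bins - 1)
            counts[idx] += 1
    return counts


def local_peaks(counts: Sequence[int], min_count: int = 1) -> List[int]:
    """Return indices of interior bins that are strictly higher than both neighbours."""
    peaks = []
    for i in range(1, len(counts) - 1):
        if (counts[i] >= min_count
                and counts[i] > counts[i - 1]
                and counts[i] > counts[i + 1]):
            peaks.append(i)
    return peaks
```

On real data the histogram would normally be plotted and inspected by eye, and candidate peaks would be compared between species (here, ash and olive) before drawing any conclusion about shared duplications.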

The Fellow also performed synteny analysis. Although this analysis showed regions of multiple synteny between ash and monkey flower, the low contiguity of the Fraxinus excelsior genome meant that the approach was not as useful as expected for detecting past shared whole genome duplications.

Much of the Fellow's work was dedicated to producing high contiguity genome assemblies for 27 other species and sub-species of Fraxinus, to allow analyses of gene family evolution in the genus. These assemblies proved difficult due to relatively high levels of heterozygosity (up to 5.13%) in these genomes. The Fellow tested a range of state-of-the-art assembly approaches (including ABySS, SOAPdenovo2, Redundans, and Platanus). However, none of these approaches delivered high quality assemblies. Typically the assemblies were highly discontiguous and greatly exceeded the expected genome size because the assemblers failed to handle heterozygous regions properly. The best assemblies were generated by Dr Laura Kelly, at QMUL's School of Biological and Chemical Sciences, using the CLC assembler. She provided her expertise in this area to the Fellow to support FraxiFam's research activities. The Fellow finished these CLC assemblies by scaffolding with SSPACE and filling scaffold gaps with GapCloser. De novo genome assemblies for 28 accessions of Fraxinus were built and made publicly available on the project website: http://www.ashgenome.org/worldwide.

To further improve the contiguity of a subset of these genomes (representing phylogenetic diversity within Fraxinus), the Fellow extracted high molecular weight DNA from four Fraxinus species, one from each of the major clades of the Fraxinus phylogeny. This DNA, together with DNA from F. pennsylvanica provided by collaborators in the USA, was sequenced with long mate-pair libraries on the Illumina HiSeq platform. These data were used to further improve the genome assemblies of these five species. The final assemblies had scaffold N50s ranging from 18.5 kbp to 50.5 kbp and were thus suitable for de novo gene annotation.
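
For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows how it is commonly computed from a list of scaffold lengths. The function name is a hypothetical choice for illustration. N50 is the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the total assembly length.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a set of scaffold (or contig) lengths, in the same unit as the input."""
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the running total always reaches the total length")
```

A higher N50 means that more of the assembly sits in long scaffolds, which is why it matters for annotation: genes are more likely to be found whole on a single scaffold.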

In order to achieve robust gene annotations for these five genomes, RNASeq data were required. The Fellow therefore extracted RNA from various tissues of four of these five species (RNASeq data were already available for F. pennsylvanica via collaborators in the USA) and sequenced it for use in improved annotations. The resulting transcriptomic datasets were assembled using Trinity, yielding between 100K and 133K putative transcripts per species.

To analyse gene families among 28 species and sub-species of Fraxinus, genes were annotated in each genome based on the F. excelsior annotation and placed into orthologous groups. This provides a database of gene families in the genus Fraxinus. Mapping these gene families onto the history of Fraxinus requires an accurate phylogeny, and until now the only phylogenies available have been based on low numbers of genes. To build one, genes were selected that are present in all of the species and in three outgroup species, and that have suitable variation to be phylogenetically informative. This resulted in over 250 genes that were used to build a new phylogeny for Fraxinus.
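
The presence filter described above can be sketched as follows. This is only an illustration under stated assumptions: the data layout (orthologous group ID mapped to taxon mapped to a list of gene IDs), the function name and the optional single-copy requirement are hypothetical and are not taken from the project. The further check for phylogenetically informative variation is not covered here.

```python
from typing import Dict, Iterable, List


def select_shared_orthogroups(orthogroups: Dict[str, Dict[str, List[str]]],
                              required_taxa: Iterable[str],
                              single_copy: bool = True) -> List[str]:
    """Return IDs of orthologous groups with genes in every required taxon.

    With single_copy=True (an assumption made for this sketch), groups with
    more than one gene in any required taxon are also excluded.
    """
    required = set(required_taxa)
    selected = []
    for og_id, members in orthogroups.items():
        # Every required taxon (ingroup species and outgroups) must contribute a gene.
        if not all(members.get(taxon) for taxon in required):
            continue
        if single_copy and any(len(members[taxon]) != 1 for taxon in required):
            continue
        selected.append(og_id)
    return sorted(selected)
```

In such a scheme the required taxa would be the Fraxinus species and sub-species plus the three outgroup species, and the surviving groups would then be screened for sequence variation before tree building.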
The work accomplished by the Fellow has enhanced our understanding of the genomes of both ash and olive. This lays strong foundations for future genome analysis of these trees. These two species are of great importance to the EU. Ash is an important forestry species, and a critical component of the natural environment of Europe. Olive is the basis of a large industry in olive oil production and key to the landscapes of southern Europe.
Endymion Cooper, the research fellow
Ash dieback in Kent, UK