Final Report Summary - DOUBLE-UP (The importance of gene and genome duplications for natural and artificial organism populations)
DOUBLE-UP (project no. 322739) aimed to study the importance of gene and genome duplications for plant evolution and speciation through (WP1) the analysis of whole plant genome sequences, (WP2) the experimental elucidation of the mechanisms of duplication retention and intolerance, (WP3-WP4) the simulation of evolving artificial (duplicated) regulatory networks and digital life forms, and (WP5) an experimental evolution study.
DOUBLE-UP started by revisiting our previously proposed clustering of plant paleopolyploidizations around the Cretaceous-Paleogene (K/Pg) boundary (Fawcett et al., 2009). We analyzed the complete genome sequences of more than 40 plant species in order to date several tens of whole genome duplications (WGDs) in various species. We used a state-of-the-art Bayesian dating framework and tested whether these WGDs follow a model in which polyploid abundance simply increases randomly over time, or instead cluster significantly in time around the K/Pg boundary. We found a strongly non-random pattern, with a majority of WGDs clustered around the K/Pg boundary, and argued that WGDs are usually an evolutionary dead end but that environmental upheaval can facilitate their survival and establishment (Vanneste et al., 2014a, b). In addition, we analyzed several other newly sequenced complete plant genomes and showed that several of them have likewise undergone a WGD close to the K/Pg boundary (Cai et al., 2014; Zhang et al., 2016; Zhang et al., 2017; Olsen et al., 2016; Unver et al., 2017).
In preparation for WP2 (see above), we performed a large-scale study in which we investigated duplicate retention for more than 9000 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families (Li et al., 2016).
For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or primarily multi-copy in all species. The distinction between single-copy and multi-copy gene families is reflected in their functional annotation: single-copy genes are mainly involved in the maintenance of genome stability and organelle function, and multi-copy genes in signaling, transport, and metabolism.
These genes were then used (WP2) to experimentally test the impact of their duplication (overexpression) on plant fitness by engineering about 100 transgenic Arabidopsis lines, each carrying an extra copy of a specific gene selected from either group. The majority of the transgenic lines showed reduced fitness compared to the controls. We found that, in the diploid genomic background, duplication of genes with a history of higher duplicate retention has a greater detrimental impact on plant fitness than duplication of single-copy genes. In particular, increased expression of the multi-copy genes associated with the thylakoid was the least tolerated among all genes tested. Unexpectedly, these results indicate that variation in the expression level of multi-copy genes is less well tolerated than variation in single-copy gene expression in Arabidopsis, a finding that requires further research.
Through WP3 and WP4, with the development of our novel bio-inspired gene regulatory network (GRN) controller, we studied the effect of gene and genome duplication in silico. Besides emergent behavior, we observed increased adaptive potential in simulated digital organisms that underwent WGD.
Finally, in WP5, a work package developed later in the project, we set up an evolutionary experiment with the unicellular green alga Chlamydomonas to compare the genomic and phenotypic adaptation of polyploids and non-polyploids to stressful environments.
Although the experiments are still ongoing, our results suggest that recent polyploids grow as well as the ancestral non-polyploid lines under normal conditions. When populations are grown in a stressful (saline) environment, the initial adaptation of non-polyploids, over the first 30 generations, is on average faster than that of the polyploids. In the longer term, however, both populations continue to adapt, and the polyploids recover relatively quickly from the initial shock of the stress: within about 50-70 generations their fitness becomes comparable to that of non-polyploid populations, which represents a larger adaptation margin within several tens of generations. We do not yet know whether these adaptation trends have a genetic basis.