Mid-Term Report Summary - DOUBLE-UP (The importance of gene and genome duplications for natural and artificial organism populations)
DOUBLE-UP (project no. 322739) aims to study the importance and significance of gene and genome duplications for the evolution of natural and artificial organism populations through (WP1) the analysis of whole plant genome sequences, (WP2) the experimental validation of ‘duplicated’ genes, (WP3) the evolution of artificial (duplicated) regulatory networks, and (WP4) the use of simulated evolutionary robots or digital life forms. In late 2013 and early 2014, we started the project by revisiting our previously proposed clustering of plant paleopolyploidizations around the K/Pg boundary (Fawcett et al., 2009), using the latest genome sequence data sets and phylogenetic dating methods available. We analyzed data from more than 40 plant species for which the complete genome sequence was available and dated several tens of whole genome duplications (WGDs) in various species, which correspond to approximately 20 independent plant WGDs. Using a state-of-the-art Bayesian dating framework, we compared our WGD age estimates with a null model that assumes random WGD occurrence, to test whether these 20 plant WGDs follow a model in which polyploid abundance simply increases randomly over time or instead cluster statistically significantly in time around the K/Pg boundary. We found a strongly non-random pattern, with many WGDs clustering around the K/Pg boundary, and argued that WGDs are usually an evolutionary dead end but that environmental upheaval can facilitate their survival and establishment (Vanneste et al., 2014a, b). In addition, we analyzed several other newly sequenced complete plant genomes and showed that several of these have also undergone a WGD close to the K/Pg boundary. Examples are the orchids Phalaenopsis equestris (Cai et al., 2014) and Dendrobium catenatum (Zhang et al., 2016), and the seagrass Zostera marina (Olsen et al., 2016).
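The comparison of WGD age estimates with a null model of random WGD occurrence can be illustrated with a small Monte Carlo test. This is only a minimal sketch under simplifying assumptions, not the Bayesian dating analysis used in the project: the null model here draws ages uniformly from a fixed time window, the clustering statistic is a simple count within a window around the boundary, and the function names, the window width, the maximum age, and the example ages are all hypothetical placeholders rather than project results.

```python
import random

KPG_AGE_MA = 66.0  # approximate age of the K/Pg boundary, in million years (Ma)


def count_near_boundary(ages, boundary=KPG_AGE_MA, half_width=10.0):
    """Count ages that fall within +/- half_width Ma of the boundary."""
    return sum(1 for age in ages if abs(age - boundary) <= half_width)


def clustering_p_value(ages, max_age=150.0, half_width=10.0, n_sim=10000, seed=0):
    """Monte Carlo p-value for clustering around the boundary.

    Null model (an assumption of this sketch): each WGD age is drawn
    independently and uniformly from [0, max_age].
    """
    rng = random.Random(seed)
    observed = count_near_boundary(ages, half_width=half_width)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.uniform(0.0, max_age) for _ in ages]
        if count_near_boundary(simulated, half_width=half_width) >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_sim + 1)


# Synthetic placeholder ages in Ma, NOT the project's WGD age estimates.
example_ages = [12.0, 35.0, 58.0, 61.0, 63.0, 64.0, 66.0, 67.0, 69.0, 71.0, 74.0, 110.0]
p_value = clustering_p_value(example_ages)
print(p_value)
```

A small p-value means that a uniform null model rarely produces as many ages near the boundary as observed. A null model in which polyploid abundance increases over time, as considered in the text, would replace the uniform draw with a different sampling distribution.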
In preparation for WP2 (see above), we also worked on an overarching view of ‘gene duplicability’, which has been lacking so far, mainly because previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. To this end, we recently completed a large-scale study in which we investigated duplicate retention for more than 9000 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families (Li et al., 2015). For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or primarily multi-copy in all species. The distinction between single-copy and multi-copy gene families is reflected in their functional annotation: single-copy genes are mainly involved in the maintenance of genome stability and organelle function, and multi-copy genes in signaling, transport, and metabolism. The single-copy gene set provides an excellent set of genes for experimental validation in future experiments (WP2), to determine what makes these genes ‘duplication-resistant’.
In WP3, we have started modeling the evolution of ‘organisms’ with artificial genomes. During an evolutionary simulation, individual organisms undergo a reproduction–mutation–selection life cycle. Among other things, we model several kinds of mutations on the artificial genome sequence, such as point mutations, gene duplications, deletions and rearrangements, and WGDs, each with its own parametrizable mutation rate. The gene network is extracted from the genome and specifies the dynamics of gene expression over time, which represent development. The phenotype of an organism is defined as a pattern of gene expression of a subset of genes, oscillatory patterns of all or some genes, or a specific response pattern of some genes to a specific input pattern.
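The mutation step of such a life cycle, with a separate rate for each mutation type, might look roughly as follows. This is a minimal sketch and not the project's model: the genome representation (a list of genes, each a list of integer symbols), the alphabet size, the function name `mutate`, and the keys of the `rates` dictionary are all assumptions made for illustration.

```python
import random

ALPHABET_SIZE = 4  # assumed number of distinct symbols per genome position


def mutate(genome, rates, rng):
    """Return a mutated copy of `genome`, a list of genes (each a list of ints).

    `rates` maps the hypothetical keys 'point', 'duplication', 'deletion',
    'rearrangement' and 'wgd' to probabilities in [0, 1].
    """
    genes = [list(g) for g in genome]
    # Whole genome duplication: append a full copy of every gene.
    if rng.random() < rates["wgd"]:
        genes += [list(g) for g in genes]
    mutated = []
    for gene in genes:
        # Gene deletion: drop this gene entirely.
        if rng.random() < rates["deletion"]:
            continue
        # Point mutations: resample each symbol independently
        # (the new symbol may happen to equal the old one).
        gene = [rng.randrange(ALPHABET_SIZE) if rng.random() < rates["point"] else s
                for s in gene]
        mutated.append(gene)
        # Gene duplication: insert a copy right after the original.
        if rng.random() < rates["duplication"]:
            mutated.append(list(gene))
    # Rearrangement: swap the positions of two randomly chosen genes.
    if len(mutated) >= 2 and rng.random() < rates["rearrangement"]:
        i, j = rng.sample(range(len(mutated)), 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(42)
genome = [[0, 1, 2, 3], [3, 2, 1, 0], [1, 1, 2, 2]]
rates = {"point": 0.01, "duplication": 0.05, "deletion": 0.05,
         "rearrangement": 0.02, "wgd": 0.001}
child = mutate(genome, rates, rng)
print(len(genome), len(child))
```

Keeping the rates in one dictionary makes it straightforward to run simulations with and without WGDs by changing a single parameter.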
Fitness can be defined by, for example, how well a specific optimal or target gene expression pattern is matched, possibly combined with measures of other phenotypic characteristics, such as the time or energy needed to develop. We are still building and evaluating this model.
In addition, we have developed a first version of a bio-inspired robot controller that combines an artificial genome with an agent-based control system (Yao et al., 2014, 2015). A gene regulatory network, switched on by environmental cues and following the rules of transcriptional regulation, provides output signals to actuators. Whereas the artificial genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system that are also present in naturally occurring biological systems. A design that separates the static part of the gene regulatory network from the conditionally active part contributes to better general adaptive behavior (Yao et al., 2014; Yao et al., submitted). We are now ready to start using this framework in an Alife swarm robot environment to study the effects of gene and genome duplication in artificial organism populations (WP4).
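As an illustration of the fitness definition mentioned for the WP3 model, the match to a target gene expression pattern can be combined with a development cost in a single score. This is a minimal sketch under stated assumptions: the mean squared error as the mismatch measure, the linear time penalty, and the names `fitness`, `dev_time`, and `time_weight` are hypothetical choices, not taken from the project.

```python
def fitness(expression, target, dev_time=0.0, time_weight=0.0):
    """Score in (0, 1]; 1.0 means a perfect match at no development cost.

    Assumed form: 1 / (1 + mean squared error + time_weight * dev_time).
    """
    if len(expression) != len(target) or not target:
        raise ValueError("expression and target must be non-empty and equally long")
    mse = sum((e - t) ** 2 for e, t in zip(expression, target)) / len(target)
    return 1.0 / (1.0 + mse + time_weight * dev_time)


target_pattern = [0.0, 1.0, 0.5]
print(fitness([0.0, 1.0, 0.5], target_pattern))
print(fitness([0.2, 0.8, 0.5], target_pattern, dev_time=10.0, time_weight=0.01))
```

Other phenotypic characteristics, such as the energy needed to develop, could enter the denominator as additional weighted terms in the same way.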
Rik Audenaert, (CFO)
Record Number: 187764 / Last updated on: 2016-08-23