Deciphering de novo gene birth in populations

Project description

A new perspective on de novo gene emergence

Research into the evolution of new genes and the emergence of novel traits has identified several mechanisms, including gene duplication and de novo gene emergence. While we know that the latter involves the evolution of genes from previously non-genic DNA sequences, the precise mechanism remains elusive. The key objective of the EU-funded NovoGenePop project is to understand how de novo gene birth occurs and to explore the hypothesis that the process leads to differences in gene content between individuals. To investigate this, researchers will develop novel computational methods for evaluating all transcripts in different biological systems and will identify the mutations that may have led to the evolution of new genes.


Genes are fundamental units of life and their origin has fascinated researchers since the beginning of the molecular era. Many studies on the formation of new genes in genomes have focused on gene duplication and the subsequent divergence of the two gene copies. In recent years, however, we have learnt that genes can also arise de novo from previously non-genic sequences. The discovery of de novo genes has been made possible by the sequencing of complete genomes and the comparison of gene sets between closely related species. Here we wish to test a novel hypothesis: we propose that the dynamics of de novo gene formation in populations result in substantial differences in gene content between individuals. If they exist, these differences would not be visible to current methods for studying gene variation, which are based on comparing the sequences of each individual to a common set of reference genes. To test our hypothesis, we will need to develop novel computational approaches to first obtain an accurate representation of all transcripts and translated open reading frames in each individual, and then integrate this information at the population level. We propose to apply these methods to two very distinct biological systems: a large collection of Saccharomyces cerevisiae world isolates and a human lymphoblastoid cell line (LCL) panel. For this, we will collect and generate RNA sequencing (RNA-Seq) and ribosome profiling (Ribo-Seq) data. In order to identify de novo gene birth events that occurred within populations, as opposed to phylogenetically conserved genes that have been lost in some individuals, we will also generate similar data from a set of closely related species in each of the two systems. By combining these data with genomics data, we will identify the spectrum of mutations associated with de novo gene birth at an unprecedented level of detail and uncover footprints of adaptation linked to the birth of new genes.
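
The distinction drawn above, between genes born de novo within a population and conserved genes lost in some individuals, can be illustrated with a minimal sketch. It is not the project's method: the function name `classify_orfs`, the labels, and the input layout (one set of translated ORF identifiers per individual, plus one set for the closely related species) are all assumptions made for illustration, and it presumes that ORFs have already been matched across samples.

```python
from typing import Dict, Set


def classify_orfs(
    individual_orfs: Dict[str, Set[str]],
    outgroup_orfs: Set[str],
) -> Dict[str, str]:
    """Label each ORF by its presence pattern (hypothetical scheme).

    individual_orfs: ORF identifiers detected as translated in each individual.
    outgroup_orfs:   ORF identifiers detected in closely related species.
    """
    n_individuals = len(individual_orfs)
    all_orfs: Set[str] = set().union(*individual_orfs.values())

    labels: Dict[str, str] = {}
    for orf in sorted(all_orfs):
        carriers = sum(orf in orfs for orfs in individual_orfs.values())
        if carriers == n_individuals:
            # Present in every individual: no within-population variation.
            labels[orf] = "fixed"
        elif orf in outgroup_orfs:
            # Variable, but also seen in related species: more likely a
            # conserved gene that some individuals have lost.
            labels[orf] = "conserved_lost_in_some"
        else:
            # Variable and absent from related species: candidate for
            # de novo birth within the population.
            labels[orf] = "candidate_de_novo"
    return labels


# Toy data: three hypothetical strains and one pooled outgroup set.
individuals = {
    "strain_A": {"orf1", "orf2", "orf3"},
    "strain_B": {"orf1", "orf3"},
    "strain_C": {"orf1", "orf2"},
}
outgroup = {"orf1", "orf2"}

labels = classify_orfs(individuals, outgroup)
```

In this toy example, `orf3` is the only ORF that varies between strains and is absent from the outgroup, so it is the only one flagged as a candidate. A real analysis would also have to handle expression thresholds, sequence homology and sampling depth, none of which are modelled here.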

Host institution

Net EU contribution
€ 2 453 751,00
Doctor Aiguader 88
08003 Barcelona


Este Cataluña Barcelona
Activity type
Research Organisations
Total cost
€ 2 453 751,00

Beneficiaries (1)