CORDIS - EU research results

Detecting Polygenic Adaptation Targeting Gene Expression Regulation In Humans Using eQTL Networks.

Periodic Reporting for period 1 - PATTERNS (Detecting Polygenic Adaptation Targeting Gene Expression Regulation In Humans Using eQTL Networks.)

Reporting period: 2020-04-01 to 2022-03-31

In humans, many phenotypes involved in local adaptation, such as the immune response to specific pathogens or the metabolism of nutrients such as sugars, fats and proteins, are "polygenic": they are determined by several genes or genomic regions. Polygenic adaptation has been proposed as a major adaptive mechanism for such complex phenotypes. In this model, several slightly advantageous mutations at independent genomic loci increase in frequency simultaneously in the population. Most of these advantageous mutations are believed to lie within non-coding, regulatory regions of the genome. However, detecting signatures of polygenic adaptation, in particular outside coding regions, has proved challenging. Most approaches to detecting polygenic adaptation combine signatures of positive selection across functionally homogeneous sets of genes or variants. Few studies have looked at regulatory variants, and none have accounted for the tissue-specificity of gene expression. Here, we proposed to combine network biology and population genetics approaches to detect polygenic adaptation acting on complex phenotypes through gene expression regulation, and to identify and characterise the biological functions evolving under polygenic adaptation, taking into account the tissue-specificity of their expression.

This project aimed to answer the following questions:
Q1. How can we efficiently detect polygenic selection targeting regulatory variants?
Q2. Which phenotypes and biological functions have been targeted by polygenic adaptation in humans?

These questions have led to two main results:
1. The development of a statistical approach to detect polygenic selection signals. Its power has been carefully assessed using simulations, as has its sensitivity to confounding scenarios.
2. The identification of groups of genetic variants regulating the expression of groups of functionally related genes, which can be used as a basis to detect polygenic adaptation targeting gene expression levels.

This project aimed to increase our general understanding of the processes that shaped present-day genetic diversity in human populations, and in particular the impact of polygenic selection on genome-wide diversity. Applying the developed approach will provide a quantitative assessment of the proportion of gene expression variation that can be attributed to groups of genetic variants under polygenic adaptation. In addition, the analysis of polygenic selection in several datasets providing samples of different tissues from hundreds of individuals should provide insights into how evolutionary processes affect phenotypes expressed in various human tissues. Finally, by cross-referencing these results with GWAS databases, we should improve our understanding of the role of polygenic adaptation in the evolution of the risk of developing complex diseases. This could help anthropologists and biologists to better understand how complex phenotypes evolve and how side effects of selection can sometimes lead to an increase in disease risk.
We started this project by developing an approach based on expression quantitative trait loci (eQTL) networks to identify groups of regulatory mutations targeting the same genes. This approach aims to group regulatory mutations according to the biological function they regulate, and to detect enrichment in selection signals among loci regulating the same biological function. We first reviewed the state of the art in polygenic selection detection and explained the rationale of our approach in the following opinion piece:
* M. Fagny and F. Austerlitz (2021). Polygenic Adaptation: Integrating Population Genetics and Gene Regulatory Networks. Trends in Genetics 37(7):631-638. doi: 10.1016/j.tig.2021.03.005.
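
The idea of testing for enrichment in selection signals among loci that regulate the same function can be illustrated with a simple permutation test. This is only a minimal sketch and not the procedure used in the project: the function name, the per-variant "selection score" and the choice of the mean as summary statistic are all assumptions made for illustration.

```python
import random
from statistics import mean


def module_enrichment_pvalue(scores, module, n_perm=1000, seed=0):
    """One-sided permutation p-value (hypothetical helper).

    scores: dict mapping variant id -> selection score (higher = stronger signal)
    module: iterable of variant ids that regulate the same biological function
    Returns the fraction of random variant sets of the same size whose mean
    score is at least as high as the observed module mean.
    """
    rng = random.Random(seed)
    all_variants = list(scores)
    module = list(module)
    observed = mean(scores[v] for v in module)
    hits = 0
    for _ in range(n_perm):
        # Draw a random set of variants of the same size as the module.
        sample = rng.sample(all_variants, len(module))
        if mean(scores[v] for v in sample) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 100 variants whose score equals their index.
toy_scores = {f"rs{i}": float(i) for i in range(100)}
top_module = [f"rs{i}" for i in range(95, 100)]
p_top = module_enrichment_pvalue(toy_scores, top_module)
```

A real analysis would also need to match random sets on properties such as allele frequency and linkage disequilibrium, which this sketch ignores.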

We then performed a statistical analysis of this approach using simulations. We showed that three tests were efficient at detecting polygenic selection acting on many low-effect variants: Zeng's E, Fay and Wu's H and FST, although with some risk of confounded results for the first two tests.
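
As an illustration of the three statistics named above, the sketch below computes the unnormalised forms of Fay and Wu's H and Zeng's E from an unfolded site frequency spectrum, and a simple two-population FST for a single biallelic site. It follows the standard textbook definitions and is not taken from the project's scripts; the function names are hypothetical, the published tests additionally divide H and E by their standard deviation under neutrality (omitted here), and the FST shown assumes equal weighting of the two populations.

```python
def theta_estimators(sfs):
    """Estimators of theta from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of sites where the derived allele is carried by
    i of the n sampled chromosomes, for i = 1 .. n - 1.
    Returns (theta_W, theta_pi, theta_H, theta_L).
    """
    n = len(sfs) + 1
    a_n = sum(1.0 / i for i in range(1, n))
    pairs = n * (n - 1) / 2.0
    theta_w = sum(sfs) / a_n
    theta_pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    theta_h = sum(i * i * x for i, x in enumerate(sfs, start=1)) / pairs
    theta_l = sum(i * x for i, x in enumerate(sfs, start=1)) / (n - 1)
    return theta_w, theta_pi, theta_h, theta_l


def fay_wu_h(sfs):
    """Unnormalised Fay and Wu's H: theta_pi - theta_H."""
    _, theta_pi, theta_h, _ = theta_estimators(sfs)
    return theta_pi - theta_h


def zeng_e(sfs):
    """Unnormalised Zeng's E: theta_L - theta_W."""
    theta_w, _, _, theta_l = theta_estimators(sfs)
    return theta_l - theta_w


def fst_two_pops(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic site and two equally
    weighted populations with allele frequencies p1 and p2."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # site is monomorphic overall; FST is undefined, report 0
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t
```

Under the neutral expectation, where the number of sites with derived allele count i is proportional to 1/i, all four theta estimators coincide, so both unnormalised statistics are zero. An excess of high-frequency derived alleles pushes H below zero.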

Finally, we identified groups of regulatory variants regulating groups of genes in 29 tissues. Using eQTL results from the GTEx v8 data set, we built 29 tissue-specific eQTL bipartite networks. We showed that these groups are well defined, and we identified in each tissue-specific network between 9 and 200 groups of genetic variants, called regulatory modules, depending on the tissue.
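
To make the notion of a bipartite eQTL network concrete, the sketch below builds one from a list of (variant, gene) associations and groups nodes into connected components. This is a deliberately crude stand-in: connected components are not the same as regulatory modules obtained by community detection, the module-detection method actually used is not described here, and the identifiers and function names are hypothetical.

```python
from collections import defaultdict, deque


def build_bipartite(eqtl_pairs):
    """Adjacency sets for a bipartite graph with variant and gene nodes.

    eqtl_pairs: iterable of (variant_id, gene_id) significant associations.
    Nodes are tagged with their type so a variant and a gene never collide.
    """
    adj = defaultdict(set)
    for variant, gene in eqtl_pairs:
        v_node = ("variant", variant)
        g_node = ("gene", gene)
        adj[v_node].add(g_node)
        adj[g_node].add(v_node)
    return adj


def connected_groups(adj):
    """Return a list of (variants, genes) sets, one per connected component,
    found by breadth-first search."""
    seen = set()
    groups = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        variants, genes = set(), set()
        while queue:
            node = queue.popleft()
            kind, name = node
            (variants if kind == "variant" else genes).add(name)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append((variants, genes))
    return groups


# Toy associations (made-up identifiers).
toy_pairs = [("rs1", "GENE_A"), ("rs2", "GENE_A"), ("rs2", "GENE_B"), ("rs3", "GENE_C")]
toy_groups = connected_groups(build_bipartite(toy_pairs))
```

In dense real networks most nodes fall into a single large component, which is why a modularity-based or similar community detection step is needed to obtain meaningful modules.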

All scripts and simulations necessary to reproduce our results can be downloaded from this GitHub repository: https://github.com/maudf/PATTERNS/.

We have disseminated research stemming from this project at several scientific meetings. Notably, a poster was presented at the Cold Spring Harbor Network Biology meeting (2021), and a seminar was given at the Musée de l'Homme as part of the AGene seminar series. In the context of this project, a short introduction to population genetics for the general public was also presented during a seminar series on evolution organised by “Fête le savoir” http://fetelesavoir.com/. Finally, we are currently synthesising the results obtained during this project as articles to be published in peer-reviewed, open-access international journals.
This project has led to several results that will have impacts at different levels.

The first objective has provided an approach to detect polygenic adaptation signatures in regulatory regions. It is available to everybody, and can be used by researchers to detect polygenic adaptation in any population or eukaryotic species. The results of this part are transferable to other models and other biological questions. Applied to humans, this approach helps to better understand the history of human populations. Applied to other species, such as species of agronomic interest, it could help to identify the molecular bases of adaptation to various environmental constraints, and the varieties best adapted to the new climatic conditions brought about by global warming.

The second objective of this project was to identify the phenotypes under local polygenic selection. This has important impacts, as it helps to better understand the evolution of phenotypes involved in adaptation to the local environment. In addition to providing valuable information about the adaptive processes that human populations went through, it can yield important results in the domain of global health. For example, it can help to explain why some people are more or less susceptible to infectious diseases or metabolic disorders.
Example of an eQTL network