Periodic Reporting for period 1 - ClimAHealth (Assessing the role of climate adaptation on human evolution and its implications for health.)
Reporting period: 2021-10-15 to 2023-10-14
Considering that the mechanisms for coping with temperature changes involve energy production, they are closely related to energy balance. Consequently, past adaptive events that altered the ability of human populations to expend energy in heat production might contribute to differential risk of obesity between current populations.
ClimAHealth searches for signals of climate-driven adaptation across human populations. We employ methods capable of detecting the signals left in the genome by both recent and old selective events, complemented by a diverse range of machine learning approaches. This will allow us to test whether signals of adaptation are more frequent in genes implicated in energy metabolism, which are relevant for coping with temperature changes. An excess of adaptation signals within these genes would suggest past adaptation to temperature conditions. It would also support the functional relevance of genetic variants within these energy metabolism genes, i.e. their impact on human traits: for a genetic variant to undergo positive selection, it must affect a trait that influences an individual’s ability to cope with external factors and thereby contributes to reproductive success. Given the connection of these genes with energy balance, ClimAHealth will also assess whether they tend to be associated with obesity-related traits.
We used these genomic features to predict the probability of recent adaptation (iHS, the integrated haplotype score) across the genome. To do so, we developed a novel machine learning method: Mixture Density Regressions (MDRs). This approach allows the distribution of iHS across the genome to be more complex than a single Gaussian distribution. It also predicts iHS from multiple predictors at once, so we have been able to assess the influence of multiple genomic factors on adaptation simultaneously.
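The idea of a mixture density regression can be illustrated with a minimal sketch. It assumes a Gaussian mixture whose weights, means and standard deviations are linear functions of the predictors; the component layout, coefficient values and function names below are hypothetical and chosen only for illustration, not taken from the project's implementation.

```python
import math

def softmax(logits):
    """Turn unnormalized scores into mixture weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(coefs, intercept, x):
    """Linear predictor: intercept + sum(coef_i * x_i)."""
    return intercept + sum(c * xi for c, xi in zip(coefs, x))

def mixture_params(x, components):
    """Weights, means and standard deviations of the mixture given predictors x."""
    weights = softmax([linear(*c["logit"], x) for c in components])
    means = [linear(*c["mean"], x) for c in components]
    # exp() keeps every standard deviation strictly positive
    sds = [math.exp(linear(*c["log_sd"], x)) for c in components]
    return weights, means, sds

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_density(y, x, components):
    """Conditional density p(y | x) under the mixture."""
    weights, means, sds = mixture_params(x, components)
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))

def negative_log_likelihood(ys, xs, components):
    """Loss that a fitting procedure would minimize over the coefficients."""
    return -sum(math.log(mixture_density(y, x, components)) for y, x in zip(ys, xs))

# Hypothetical two-component model with one standardized genomic predictor.
# Each entry is (coefficients, intercept).
components = [
    {"logit": ([0.5], 0.0), "mean": ([0.1], 0.0), "log_sd": ([0.0], 0.0)},
    {"logit": ([-0.5], -1.0), "mean": ([0.8], 1.5), "log_sd": ([0.2], 0.3)},
]
xs = [[-1.0], [0.0], [1.5]]  # made-up predictor values
ys = [0.2, -0.4, 2.1]        # made-up response values
print(negative_log_likelihood(ys, xs, components))
```

Because every mixture parameter depends on the predictors, the shape of the predicted distribution, and not only its mean, can change along the genome. With a single component the model reduces to an ordinary Gaussian regression.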
The MDR approach successfully modeled the probability of recent adaptation (iHS) across several human populations. Notably, we identified multiple genomic factors associated with this probability. This suggests that signals of recent adaptation are not randomly distributed across the genome, which makes them less likely to be false positives. Our computer simulations fully supported this interpretation. We simulated the genomic history of a human population that had not experienced adaptive events. If our approach is sensitive only to adaptation, it should not detect signals of positive selection in these simulated genomes. This was indeed the case, reinforcing the idea that the MDR approach is not misled by factors that resemble adaptation.
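The logic of this check can be sketched as follows, assuming each simulated genomic window receives a score and that a window counts as a detection when its score reaches a chosen threshold. The scores, the threshold and the function name are hypothetical; the report does not specify how detections were defined.

```python
def false_positive_rate(neutral_scores, threshold):
    """Fraction of windows simulated WITHOUT selection that reach the threshold."""
    if not neutral_scores:
        raise ValueError("neutral_scores must not be empty")
    hits = sum(1 for score in neutral_scores if score >= threshold)
    return hits / len(neutral_scores)

# Made-up scores for windows from a simulation with no adaptive events.
neutral_scores = [0.02, 0.10, 0.05, 0.31, 0.08, 0.01, 0.12, 0.04]
rate = false_positive_rate(neutral_scores, threshold=0.5)
print(f"False-positive rate at threshold 0.5: {rate:.3f}")
```

A rate close to zero on neutral simulations is what one would expect from a method that responds only to adaptation.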
In the next step, we are considering not only recent but also older adaptation events (within the last 120,000 years). To do so, we are leveraging Flex-Sweep, an approach recently developed in the host lab of the outgoing phase of this project. This novel approach, based on deep learning, outperforms previous methods in detecting complex (old) signals of adaptation. We used the probability of adaptation calculated by Flex-Sweep as input to our analyses. The data generated by this approach are more complex than the iHS statistic, so we are currently developing a new modeling framework to handle them. This framework follows the same rationale as the MDRs, aiming to model the probability of adaptation across the genome, but it extends to much more complex distributions, making it flexible enough to analyze data generated by Flex-Sweep. By comparing multiple modeling approaches, we aim to select the best one for predicting the probability of adaptation.
In a preliminary analysis of one population, we successfully modeled the probability of adaptation as predicted by Flex-Sweep. The models showed a relatively good fit, indicating that they can predict the probability of adaptation across the genome. Our models again found multiple factors associated with adaptation (both recent and old in this case). This further suggests that adaptation has been relatively frequent in human populations. Importantly, one of the factors used as a predictor in the models was the distance to genes related to brown adipose tissue (BAT), a tissue implicated in the dissipation of heat. BAT could therefore be implicated in the adaptation of human populations to different temperatures. We generated a list of BAT-related genes by selecting those closely related to Uncoupling Protein 1 (a hallmark of BAT). Notably, we found a higher probability of adaptation around BAT genes than in the rest of the genome. Given the link between BAT and environmental temperature, these adaptation signals provide preliminary evidence for climate adaptation in humans. This also suggests that genetic variants in BAT genes are functionally relevant, as signals of adaptation should occur in genomic regions that affect human phenotypes. Finally, we have already started to test the association between obesity-related traits and genes implicated in energy metabolism, finding some instances of such association.
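One simple way to compare the probability of adaptation near BAT genes with the rest of the genome is a permutation test on the difference in means, sketched below. This is an illustrative assumption, not the analysis used in the project: the scores are made up, the function name is hypothetical, and shuffling windows independently ignores the clustering of neighbouring genomic regions that a real analysis would need to account for.

```python
import random
from statistics import mean

def permutation_test(near_scores, other_scores, n_permutations=999, seed=0):
    """One-sided test: is the mean score near the gene set higher than elsewhere?"""
    rng = random.Random(seed)
    observed = mean(near_scores) - mean(other_scores)
    pooled = list(near_scores) + list(other_scores)
    n_near = len(near_scores)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Reassign the "near" label at random while keeping group sizes fixed
        rng.shuffle(pooled)
        diff = mean(pooled[:n_near]) - mean(pooled[n_near:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction so the p-value is never exactly zero
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value

# Made-up probabilities of adaptation per genomic window.
near_bat = [0.62, 0.71, 0.55, 0.80, 0.66]
rest_of_genome = [0.21, 0.35, 0.18, 0.40, 0.27, 0.30, 0.15, 0.33, 0.25, 0.29]
observed, p_value = permutation_test(near_bat, rest_of_genome)
print(f"Difference in means: {observed:.3f}, permutation p-value: {p_value:.3f}")
```

A small p-value would indicate that the higher scores near the gene set are unlikely to arise from a random assignment of windows to the two groups.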
The heterogeneity of the human genome has greatly complicated the search for recent adaptation signals, making the topic controversial. However, our innovative modeling approaches have provided robust evidence of relatively frequent adaptation in recent evolutionary times. As a result, ClimAHealth is contributing to answering critical questions in human evolutionary biology. It is also opening new avenues of research by applying novel modeling techniques to the study of positive selection.