Understanding epigenetic inheritance and the structures of non-amyloid prion condensates using deep mutagenesis scans

Periodic Reporting for period 1 - DEEPCONSTRUCT (Understanding epigenetic inheritance and the structures of non-amyloid prion condensates using deep mutagenesis scans)

Reporting period: 2022-08-01 to 2024-07-31

Proteins are essential molecules that perform a vast array of functions within living organisms. They are made up of long chains of amino acids, which fold into specific three-dimensional structures that determine their function. The relationship between sequence, structure, and function is a central theme in biology. Understanding how a specific sequence leads to a particular function is a complex challenge because even small changes in the sequence (mutations) can significantly alter the protein's behavior. Indeed, mutations in proteins can lead to disease through different mechanisms: by destabilizing the protein fold, by affecting the specific function of the protein, or by causing the protein to aggregate. Predicting the effects of these mutations, or variants, is crucial for understanding their potential impact on health. Accurate predictions can aid in diagnosing genetic disorders, developing treatments, and understanding the underlying mechanisms of disease.

To improve our understanding of how mutations lead to genetic diseases, we set the following objectives:

- Generate a large dataset of how mutations in human proteins involved in genetic diseases affect the stability of their three-dimensional folds
- Identify which mutations cause genetic diseases through protein destabilization, and analyze the importance of destabilization as a disease mechanism across different diseases and proteins
- Use the data to develop predictive models of how mutations affect protein stability to cover a larger number of pathogenic mutations

We have generated a large-scale dataset, which we have called Human Domainome 1.0, capturing the effects of >500,000 mutations on the fold stability of >500 human proteins. We find that >50% of disease-causing mutations act through protein destabilization. We also find, however, that the contribution of stability changes to disease varies across proteins and diseases: while destabilization is an excellent predictor of pathogenicity in some diseases, this is not always the case, indicating that mechanisms other than destabilization can play an important role in some proteins. We have combined predictions from state-of-the-art variant effect predictors with our protein stability data to systematically identify sites in proteins that are important for protein function independently of protein stability. Finally, we have compared the effects of mutations on protein stability between members of the same protein family, showing extensive conservation of effects within families. This allowed us to develop thermodynamic models of protein families that predict stability changes in a large set of human proteins and disease mutations.
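
One way to picture the idea of finding sites that matter for function independently of stability is a residual analysis: fit the predicted variant effect as a function of the measured stability effect, then look for positions where variants are predicted to be far more damaging than their stability change explains. The sketch below is only an illustration under stated assumptions. The report does not describe the actual method, so the linear fit, the function names, the threshold, and the toy numbers are all hypothetical, and both scores are assumed to follow the convention that more negative means more damaging.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("stability scores must not all be identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def site_residuals(variants):
    """Mean residual per position.

    variants: iterable of (position, stability_score, predicted_effect).
    A strongly negative residual means the variants at that position are
    predicted to be more damaging than their stability effect explains.
    """
    variants = list(variants)
    slope, intercept = fit_line(
        [v[1] for v in variants], [v[2] for v in variants]
    )
    per_site = defaultdict(list)
    for pos, stability, predicted in variants:
        per_site[pos].append(predicted - (intercept + slope * stability))
    return {pos: mean(res) for pos, res in per_site.items()}


def candidate_functional_sites(variants, threshold):
    """Positions whose mean residual falls below the (hypothetical) threshold."""
    residuals = site_residuals(variants)
    return sorted(pos for pos, r in residuals.items() if r < threshold)


# Toy, made-up data: (position, stability score, predicted effect).
toy_variants = [
    (10, -2.0, -2.0),  # destabilizing and predicted damaging
    (10, -1.0, -1.0),
    (11, 0.0, 0.0),    # neutral on both scales
    (11, -0.5, -0.5),
    (12, 0.0, -3.0),   # stable, yet predicted damaging
    (12, -0.1, -2.8),
]

print(candidate_functional_sites(toy_variants, threshold=-1.0))
```

In this toy example only position 12 stands out, because its variants leave stability almost unchanged while being predicted strongly damaging. A real analysis would need to handle measurement noise and the non-linear relationship between folding energy and observed abundance, which this sketch ignores.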

This work is now published as a preprint (https://www.biorxiv.org/content/10.1101/2024.04.26.591310v1) and is currently under peer review at a scientific journal.
Domainome 1.0 is a first important step in the comprehensive experimental analysis of mutations in human proteins. It forms part of an ongoing global effort to determine the consequences of every mutation in every human protein and to produce reference atlases for the mechanistic interpretation of clinical variants. Beyond human genetics, it is also part of a broader effort to produce large, well-calibrated datasets that quantify how changes in sequence alter the biophysical properties of proteins. These multimodal biophysical measurements for millions of proteins and variants will enable the development of machine learning approaches for accurate prediction and engineering of proteins from sequence.

This will have a direct impact on clinical genetics by enabling accurate interpretation of novel genetic variants, a better understanding of disease mechanisms, and improved diagnosis and treatment. Ultimately, these advances will contribute to a more personalized approach to medicine, tailored to the genetics of each individual.
Human Domainome 1.0