Dissecting the functional importance of eukaryotic protein phosphorylation

Periodic Reporting for period 4 - PhosFunc (Dissecting the functional importance of eukaryotic protein phosphorylation)

Reporting period: 2019-10-01 to 2020-09-30

Living cells have to adapt to changes in their environmental conditions. Such changes are very often first sensed at the cell membrane and communicated internally by reversible post-translational modification (PTM) of proteins in complex and dynamic signaling networks. Protein phosphorylation, which is catalyzed by specialized enzymes called protein kinases, is one of the most common and well-studied PTMs. This regulation not only needs to be efficient in mounting an appropriate response but must also be evolutionarily adaptable. Studying the underlying evolutionary process that gives rise to PTM signaling systems will allow us to better understand the functional relevance of PTM regulation in extant species.

Divergence of expression patterns is often asserted as the main driving force in generating phenotypic diversity. However, several studies have challenged this view. Recent advances in mass spectrometry (MS) have led to an increase in throughput, with thousands of phosphosites discovered for some model organisms, making it possible for the first time to study the evolution of PTMs. The first evolutionary studies have shown that the modifications impose only weak evolutionary constraint. This might be explained by the existence of a significant number of sites that serve no biological role in present-day species but are a by-product of the high evolutionary rate of creation and destruction of phosphorylation sites.

The increased throughput in identification of protein phosphorylation sites, along with the lack of sequence conservation at phosphosites and the potential existence of non-functional sites, has made it a tremendous challenge to identify functional PTM sites among the many thousands of phosphosites identified to date. For example, there are over 200,000 phosphosites that have been experimentally determined in human proteins, of which only 3% have a curated, described function. Determining the extent of non-functional phosphorylation and developing methods to rank sites according to functional importance remain major bottlenecks in current studies of cell signaling. Tackling these issues will have an impact on many areas of fundamental cell biology (e.g. cell-cycle, DNA damage, response to stress, etc.). Protein kinases and phosphorylation signaling networks are very often mutated in cancer and hijacked during infection. Understanding the function of protein phosphorylation will facilitate our understanding of how cancer mutations or some pathogens change these regulatory networks in disease.

To study the contribution of protein phosphorylation to fitness, we developed in this project a combined computational and genetic approach to study phosphosite function in the model organism S. cerevisiae. For this purpose we first devised an approach to classify phosphorylation sites according to their evolutionary age using phosphorylation data from 18 different fungal species. This work conclusively showed that most phosphorylation sites that exist in present-day species have arisen recently in evolution. We then developed a genetic approach to study the functional importance of protein phosphorylation by measuring the impact of mutating each of nearly 500 phosphorylation sites in S. cerevisiae. From this work we estimate that on the order of 60% of phosphosites do not show phenotypes when mutated. In addition, by comparing the growth profiles of the phosphorylation mutants with those elicited by gene knock-outs, we could predict the functional role of the mutated phosphorylation sites. Based on these genetic measurements we could also see that conservation alone was not a good predictor of which phosphosites are more important. To take into account multiple characteristics of protein phosphorylation sites, we built a computational predictor of functional relevance that integrates multiple features, such as the degree of conservation and regulation, among others. This predictor is capable of scoring phosphorylation sites according to their relative importance to the cell, which can now facilitate the study of the regulation of almost any cellular process.
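
As an illustration of how several site features can be integrated into a single functional score, the following is a minimal sketch only. It assumes a plain logistic regression trained by gradient descent on two invented features (conservation and regulation, each scaled to 0-1); the feature values, site names and model choice are all hypothetical and are not the predictor developed in the project.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with full-batch gradient descent."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def score_site(x, weights, bias):
    """Return a 0-1 score; higher means more likely functional under this toy model."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented training set: [conservation, regulation]; label 1 = site with known function.
TRAIN_X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]

WEIGHTS, BIAS = train_logistic(TRAIN_X, TRAIN_Y)

# Hypothetical uncharacterised sites to be ranked by the trained model.
CANDIDATES = {"siteA": [0.85, 0.7], "siteB": [0.15, 0.2], "siteC": [0.5, 0.9]}
RANKED = sorted(
    CANDIDATES,
    key=lambda s: score_site(CANDIDATES[s], WEIGHTS, BIAS),
    reverse=True,
)
print(RANKED)
```

The point of the sketch is that a site with moderate conservation but strong regulation can still rank well, which a conservation-only criterion would miss.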

Altogether, this work allows us to understand which phosphorylation sites are most important for the cell and therefore more likely to have important regulatory functions. This has wide-reaching implications for the study of cell biology and the study of misregulation during disease.
To study the functional importance of protein phosphorylation we first developed a computational approach to estimate the age of phosphorylation sites using a collection of protein phosphorylation data for 18 different fungal species (Studer et al. Science 2016). Based on this analysis we have determined that most phosphosites are of very young origin. The rate at which phosphosites are created and destroyed during evolution is similar to the rate of changes in gene expression. Our work has conclusively shown that changes in protein post-translational regulation can also be a strong contributor to evolutionary differences. These evolutionary studies suggest that a significant fraction of phosphosites may be under little to no evolutionary pressure to be maintained. One explanation could be that a significant fraction of phosphosites do not have any functional role in present-day species. To prioritize those phosphosites that could be most critical for the cell we devised a series of approaches as initially detailed in the proposal. One approach was to identify regions within domain families that show high conservation of phosphorylation across species (published in Strumillo et al. Nature Communications 2019). A second approach was a supervised machine learning method (Ochoa et al. Nature Biotech 2020) that ranked highly both phosphosites with previously known functions and positions that, when mutated, are likely to cause strong impacts on protein function or cellular defects. Finally, we have also developed a chemical-genomic approach to study the function of protein phosphorylation through experimental means (Vietez et al. in revision). We generated a library of nearly 500 phospho-deficient mutants of S. cerevisiae and measured their fitness differences in a panel of 102 conditions.
Of all phospho-mutants, around 40% exhibited growth phenotypes in at least one condition, suggesting these phosphosites are likely functional. We then compared their growth profiles across the 102 conditions with those elicited by gene deletions, allowing us to identify phosphosites whose phenotypes resembled the loss-of-function of specific genes.
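
The profile comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: growth profiles are represented as lists of per-condition fitness scores, similarity is measured with a Pearson correlation, and a phenotype is called when any score exceeds an arbitrary cutoff. The metric, the cutoff of 2.0, the five-condition profiles and the names geneX, geneY, mutant1 and mutant2 are all hypothetical and do not come from the project data.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length fitness profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no similarity information
    return cov / math.sqrt(var_a * var_b)


def has_phenotype(profile, threshold=2.0):
    """True if the mutant deviates strongly in at least one condition."""
    return any(abs(score) >= threshold for score in profile)


def best_matching_deletion(mutant_profile, deletion_profiles):
    """Name of the gene deletion whose profile correlates best with the mutant."""
    return max(
        deletion_profiles,
        key=lambda gene: pearson(mutant_profile, deletion_profiles[gene]),
    )


# Invented fitness scores over five conditions (the real panel had 102).
MUTANTS = {
    "mutant1": [-3.0, 0.2, -2.5, 0.1, 0.0],
    "mutant2": [0.1, -0.3, 0.2, 0.0, 0.4],
}
DELETIONS = {
    "geneX": [-4.0, 0.0, -3.0, 0.3, -0.1],
    "geneY": [0.1, -3.5, 0.2, -2.8, 0.0],
}

# Fraction of mutants showing a phenotype in at least one condition.
FRACTION = sum(has_phenotype(p) for p in MUTANTS.values()) / len(MUTANTS)
# Gene deletion that most resembles the first phospho-mutant.
BEST = best_matching_deletion(MUTANTS["mutant1"], DELETIONS)
print(FRACTION, BEST)
```

A phospho-mutant whose profile closely tracks that of a gene deletion is a candidate for a site whose loss mimics the loss of that gene's function.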

These different projects have allowed us to prioritise phosphorylation sites that we think are highly relevant for the cell. The functional relevance and structural implications of phosphorylation are areas we will continue to develop further in the future.
Overall, the results obtained in the course of this ERC grant have advanced our capacity to identify which phosphorylation sites are most critical for the cell and which biological processes are under regulation. Given that most cellular processes are regulated by phosphorylation, this work will have wide-ranging future applications.
Artist's impression of phosphosite and machine learning. Credit: Spencer Phillips