CORDIS - EU research results

Painting the landscape of translational control of gene expression through meta-analysis of genome wide data sets

Final Report Summary - TRANSLATOMICS (Painting the landscape of translational control of gene expression through meta-analysis of genome wide data sets)

Translational control of gene expression is a post-transcriptional mechanism that regulates how many proteins are synthesized per mRNA per unit of time. Such regulation is important for many biological processes, ranging from development to memory formation, and is dysregulated in diseases such as cancer and fibrosis. Despite this, knowledge of how the cell uses translational control to regulate gene expression at a global level is limited. Importantly, to understand how dysregulated translational control contributes to human diseases, better genome-wide characterization of the dynamics of translational control is essential. We have applied a multidisciplinary approach in which many genome-wide studies are mined using bioinformatics to derive rules and mechanisms of translational control that can then be explored using cellular assays. Such information can then be exploited in efforts to unravel why and how translational control of gene expression contributes to human diseases. Advances in this area may benefit society via downstream development of novel treatment approaches for diseases such as cancer, and results from this project are indeed currently being used in the context of breast and brain cancer. The project is structured in several phases: establishing which genes are regulated by translation; how these form functional units that are regulated in concert; whether such units are regulated independently or whether there are underlying constraints on their combinatorial regulation; and finally which mechanisms mediate such regulation.

Thus, our first objective was to identify genes that can be regulated by translation. We therefore searched online database repositories and identified a set of contrasts (i.e. independent comparisons) that describe regulation of translation. Contrasts were removed when they lacked a biologically sound underpinning (i.e. they had been included only to obtain orthogonal contrasts in the models, leading e.g. to comparisons of time-points in a non-sound fashion). Contrasts that failed to show differential translation after adjustment for multiple testing using the Benjamini-Hochberg false discovery rate (FDR) procedure were also removed, because they were considered non-informative. To assess the number of genes that were regulated among these contrasts, we randomly selected increasing numbers of contrasts (1000 permutations) and calculated the mean number of genes found to be differentially translated in at least one study. This identified a large proportion of all genes (more than one third) as regulated by translation.
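The two statistical steps described above (Benjamini-Hochberg adjustment and the random selection of increasing numbers of contrasts) can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the dictionary layout and the toy input are assumptions made purely for illustration, and only the 1000-permutation default is taken from the text.

```python
import random


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone and never exceed 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def accumulation_curve(regulated_by_contrast, n_permutations=1000, seed=0):
    """Mean number of genes regulated in at least one of k random contrasts.

    regulated_by_contrast (hypothetical layout): dict mapping a contrast
    name to the set of genes called differentially translated in it.
    Returns a list whose entry k-1 is the mean for k contrasts.
    """
    rng = random.Random(seed)
    contrasts = list(regulated_by_contrast)
    curve = []
    for k in range(1, len(contrasts) + 1):
        total = 0
        for _ in range(n_permutations):
            chosen = rng.sample(contrasts, k)
            # Union of the gene sets of the k selected contrasts.
            union = set().union(*(regulated_by_contrast[c] for c in chosen))
            total += len(union)
        curve.append(total / n_permutations)
    return curve
```

In this sketch a gene would be assigned to a contrast's set when its adjusted p-value falls below the chosen FDR threshold, and a contrast with an empty set would be dropped as non-informative. The last entry of the curve is then the total number of genes regulated in at least one contrast.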

Next, among genes that were regulated by translation, we identified a set (>100) of co-regulation modules which define higher order organization of mRNA translation. Using statistical approaches, these were validated to make certain that they provide non-overlapping information about patterns of translation. To further assess the validity of the identified modules, we examined how they defined co-regulation in contrasts that were not used to define the modules. This step is essential to make certain that the modules are generally applicable and do not represent over-fitting to the data used to identify them. To assess this, we first used additional data sets that were published during module identification (but only those that were amenable to solid statistical anota analysis, i.e. had sufficient replication and data from cytoplasmic RNA). As in the analysis of the contrasts used for module identification, some modules were regulated in multiple contrasts while others were regulated in only a single contrast. In total, >90% of the modules were regulated in at least one contrast with an FDR<0.05. To further assess the validity of the model, we used data sets for which there was no data from cytoplasmic mRNA for adjustment of the data on translated mRNA. Such analysis identified >90% of all modules as regulated under at least one condition. In total, >95% of the modules could be validated (FDR<0.05) by external in vitro data sets. Thus we have identified a set of translational operons which coordinate expression of genes.
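The validation criterion used above (a module counts as validated when it is regulated in at least one held-out contrast at FDR<0.05) can be written down as a short sketch. The function name, the nested-dictionary layout and the example values are hypothetical. How the module-level FDR values are obtained is outside the scope of this illustration.

```python
def validated_fraction(module_fdr, threshold=0.05):
    """Fraction of modules regulated in at least one held-out contrast.

    module_fdr (hypothetical layout): dict mapping a module name to a dict
    of {contrast name: module-level FDR in that contrast}.
    Returns (fraction, set of validated module names).
    """
    if not module_fdr:
        raise ValueError("no modules supplied")
    validated = {
        module
        for module, fdr_by_contrast in module_fdr.items()
        if any(q < threshold for q in fdr_by_contrast.values())
    }
    return len(validated) / len(module_fdr), validated


# Toy example with made-up values: M2 is not regulated in any contrast.
example = {
    "M1": {"c1": 0.01, "c2": 0.50},
    "M2": {"c1": 0.20, "c2": 0.30},
    "M3": {"c1": 0.04},
    "M4": {"c2": 0.001},
}
fraction, validated = validated_fraction(example)
```

Only contrasts that were not used to define the modules should enter such a calculation, since the aim is to detect over-fitting.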

We then used network approaches to see if the modules were regulated in concert, i.e. if there were underlying constraints on how the modules could be regulated in a combinatorial fashion. This enabled identification of a second level of regulation where modules were not regulated as units but as pairs, such that when one module showed increased activity the second module showed reduced activity. This regulatory dichotomy, which limits how translation can be regulated, possibly to achieve homeostasis, was validated using the model developed here and data generated in a parallel project. This gave confidence that these regulatory patterns indeed exist and supported the view that the model offers an ideal opportunity for identification of regulatory RNA-elements. RNA-elements are believed to be the underlying mechanism by which post-transcriptional processes target individual mRNAs for regulation. Identification of these is therefore essential to understand regulatory patterns.
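The pairwise pattern described above, where one module's activity rises while its partner's falls, can be illustrated with a simple stand-in. The report does not specify the network approach that was used, so this sketch swaps in a plain Pearson correlation of per-contrast module activity scores. The function names, the input layout and the -0.8 cut-off are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def antagonistic_pairs(activity, threshold=-0.8):
    """List module pairs whose activities are strongly anti-correlated.

    activity (hypothetical layout): dict mapping a module name to a list of
    activity scores, one per contrast, in the same contrast order.
    threshold is an arbitrary illustrative cut-off, not a value from the
    report.
    """
    pairs = []
    for a, b in combinations(sorted(activity), 2):
        r = pearson(activity[a], activity[b])
        if r <= threshold:
            pairs.append((a, b, r))
    return pairs
```

Each reported pair corresponds to an edge in an "antagonism" network between modules. Whether an edge reflects a real regulatory constraint would still need confirmation in independent data, as was done in the project.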
We realized that such complex analyses required a more sophisticated pipeline for identification of regulatory RNA-elements, one that also includes efficient approaches to cross-compare RNA-elements with known RNA-elements and with those identified in other modules. This pipeline is complete, and we are currently progressing with functional validation of a first set of regulatory RNA-elements identified from the modules. The functional validation is carried out in a high-throughput fashion using approaches that have been adopted for this setup within the scope of this project.
Thus, the project has provided information about how translational control is orchestrated at multiple levels, information that will be important for further studies of the underlying mechanisms of translational control. The work is also currently being applied (in independent projects) to several human cancers and may as such facilitate development of future treatment strategies.