Final Report Summary - TRANSLATOMICS (Painting the landscape of translational control of gene expression through meta-analysis of genome wide data sets)
Our first objective was to identify genes that can be regulated by translation. We therefore searched online database repositories and identified a set of contrasts (i.e. independent comparisons) that describe regulation of translation. Contrasts were removed when they lacked a sound biological basis (e.g. contrasts included only to obtain orthogonal contrasts in the models, which led to biologically unsound comparisons of time points). Contrasts that failed to show differential translation after adjustment for multiple testing using the Benjamini-Hochberg false discovery rate (FDR) procedure were also removed because they were considered non-informative. To assess the number of genes that were regulated among the remaining contrasts, we randomly selected increasing numbers of contrasts (1000 permutations) and calculated the mean number of genes found to be differentially translated in at least one study. This identified a large proportion of all genes (more than one third) as regulated by translation.
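The two statistical steps in this paragraph, the Benjamini-Hochberg adjustment and the random selection of increasing numbers of contrasts, can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names, the representation of each contrast as a set of gene identifiers, and the fixed random seed are all assumptions made for illustration.

```python
import random


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mean_genes_regulated(contrasts, k, n_perm=1000, seed=0):
    """Mean number of genes regulated in at least one of k random contrasts.

    contrasts: list of sets, each holding the genes called differentially
    translated in one contrast (hypothetical input format).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_perm):
        chosen = rng.sample(contrasts, k)       # k contrasts, no repeats
        total += len(set().union(*chosen))      # genes hit at least once
    return total / n_perm


if __name__ == "__main__":
    toy = [{"geneA", "geneB"}, {"geneB", "geneC"}, {"geneD"}]
    for k in range(1, len(toy) + 1):
        print(k, mean_genes_regulated(toy, k))
```

Plotting the returned mean against k gives a curve showing how the number of translationally regulated genes grows as more contrasts are included.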
Next, among genes that were regulated by translation, we identified a set (>100) of co-regulation modules which define a higher-order organization of mRNA translation. Statistical approaches were used to verify that the modules provide non-overlapping information about patterns of translation. To further assess the validity of the identified modules, we examined how they defined co-regulation in contrasts that were not used to define the modules. This step is essential to make certain that the modules are generally applicable and do not represent over-fitting to the data used to identify them. We first used additional data sets that were published during module identification (but only those that were amenable to solid statistical anota analysis, i.e. had sufficient replication and data from cytoplasmic RNA). Similar to the analysis of contrasts used for module identification, some modules were regulated in multiple contrasts while others were regulated in only a single contrast. In total, >90% of the modules were regulated in at least one contrast with an FDR<0.05. To further assess the validity of the model, we used data sets for which there was no data from cytoplasmic mRNA to adjust the data on translated mRNA. This analysis identified >90% of all modules as regulated under at least one condition. In total, >95% of the modules could be validated (FDR<0.05) by external in vitro data sets. Thus we have identified a set of translational operons which coordinate expression of genes.
We then used network approaches to see if the modules were regulated in concert, i.e. if there were underlying constraints on how the modules could be regulated in a combinatorial fashion. This enabled identification of a second level of regulation where modules were regulated not as single units but as pairs, such that when one module showed increased activity the second module showed reduced activity. This regulatory dichotomy, which limits how translation can be regulated, possibly to achieve homeostasis, was validated using the model developed here and data generated in a parallel project. This gave confidence that these regulatory patterns indeed exist and supported the view that the model offers an ideal opportunity for identification of regulatory RNA-elements. RNA-elements are believed to be the underlying mechanism by which post-transcriptional processes target individual mRNAs for regulation. Identification of these elements is therefore essential to understand regulatory patterns.
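The report does not specify which network approach was used. As a stand-in, the idea of opposing module pairs can be illustrated by flagging pairs whose activity is strongly negatively correlated across contrasts. In this minimal Python sketch, the per-module activity scores, the module names and the correlation threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(vx * vy)


def opposing_pairs(activity, threshold=-0.8):
    """Return (module_a, module_b, r) for strongly anti-correlated pairs.

    activity: dict mapping a module name to its activity score in each
    contrast (hypothetical input; the same contrast order is assumed
    for every module).
    """
    pairs = []
    for a, b in combinations(sorted(activity), 2):
        r = pearson(activity[a], activity[b])
        if r <= threshold:
            pairs.append((a, b, r))
    return pairs


if __name__ == "__main__":
    toy = {
        "M1": [1, 2, 3, 4],
        "M2": [4, 3, 2, 1],
        "M3": [1, 2, 3, 5],
    }
    for a, b, r in opposing_pairs(toy):
        print(a, b, round(r, 3))
```

Each reported pair can be read as an edge in a network of modules. A pair found this way is only a candidate and would still need to be confirmed in independent data, as was done in the project.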
We realized that such complex analysis approaches required a more sophisticated pipeline for identification of regulatory RNA-elements, one that also includes efficient approaches for comparing newly identified RNA-elements with known RNA-elements and with those identified in other modules. This pipeline is complete, and we are currently taking a first set of regulatory RNA-elements identified from the modules forward to functional validation. The functional validation is carried out in a high-throughput fashion using approaches that have been adopted for this setup within the scope of this project.
Thus the project has provided information about how translational control is orchestrated at multiple levels. This information will be important for further studies into the mechanisms underlying translational control. The work is also currently being applied (in independent projects) to several human cancers and may as such facilitate development of future treatment strategies.