Cross-Study Analysis of Cancer Gene Expression Datasets

Final Activity Report Summary - CSACGED (Cross-study analysis of cancer gene expression datasets)

A wealth of genome-wide gene expression data is available in public databases, yet this extraordinary material has been poorly exploited so far. The objective of the project was to develop the integration of high-throughput gene expression data in the field of cancer, and of thyroid cancer in particular.

We found that appropriate methodological approaches alleviate the need to remove study-specific biases, a challenging task, before integrative studies are carried out. Taking advantage of this, we explored a new frontier in data integration: the integration of in vitro and in vivo expression data.

In a first study we characterised gene expression in normal thyroid cells stimulated in vitro with thyroid stimulating hormone (TSH). The steady-state expression of TSH-associated genes in vitro resembled that of in vivo autonomous adenomas, benign tumours in which the TSH receptor is constitutively activated. Interestingly, several genes inhibiting the effect of TSH stimulation were expressed in the cultures but not in the tumours. This suggests that these negative feedback mechanisms are involved in tumourigenesis. It also demonstrates that expression data integration is a powerful tool for characterising in vivo gene expression quantitatively and objectively from gene expression in in vitro systems.

The same principle was applied to characterise the genes differentially expressed between Ukrainian post-Chernobyl, radiation-induced thyroid cancers and sporadic thyroid cancers from France. We reasoned that, in the absence of exposure to high radiation levels, the French cancers are due to a failure to cope with hydrogen peroxide, a compound produced in large quantities during thyroid hormone synthesis and a potent DNA-damaging agent. Symmetrically, most people exposed to high-level radiation in the vicinity of the Chernobyl plant did not develop thyroid cancer; those who did, did so because their thyroid cells failed to repair the damage that radiation caused in their DNA. Thus, different susceptibility factors might underlie French and Ukrainian tumours.

We derived from published in vitro data a set of 118 genes that are expressed differently when cells are exposed to radiation and to hydrogen peroxide. We then showed that machine learning algorithms accurately predict whether a tumour is French or Ukrainian on the basis of these 118 genes. We further showed that accurate prediction is also possible on the basis of 13 genes involved in the repair of DNA double-strand breaks. These results support the existence of a molecular radiation susceptibility signature. Such a signature would have a wide range of applications. Beyond its own biological significance, the work carried out during this project suggests that gene expression in well-defined functional in vitro assays could lie at the foundation of a general functional taxonomy of cancers.
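The prediction step described above, classifying tumours by origin from a fixed gene set, can be illustrated with a minimal sketch. The report does not name the algorithms used, so the nearest-centroid classifier and the leave-one-out cross-validation below are assumptions chosen for simplicity. The data are synthetic: the number of tumours per class, the number of shifted genes and the size of the shift are all hypothetical, and only the gene count of 118 comes from the text.

```python
import random


def centroid(rows):
    """Per-gene mean of a list of expression profiles."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_x, train_y, sample):
    """Assign the sample to the class whose centroid is nearest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        d = sq_dist(sample, centroid(rows))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def loo_accuracy(xs, ys):
    """Leave-one-out cross-validation: each tumour is predicted
    by a model built from all the other tumours."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


def make_synthetic(n_per_class=13, n_genes=118, n_shifted=20,
                   shift=2.0, seed=0):
    """Hypothetical data: Gaussian noise, with the first n_shifted
    genes raised in one class. Not real tumour measurements."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label in ("sporadic", "post-Chernobyl"):
        for _ in range(n_per_class):
            profile = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == "post-Chernobyl":
                for g in range(n_shifted):
                    profile[g] += shift
            xs.append(profile)
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic()
    print(f"Leave-one-out accuracy: {loo_accuracy(xs, ys):.2f}")
```

Holding each tumour out in turn matters here because the number of genes far exceeds the number of samples, so accuracy measured on the training tumours themselves would be over-optimistic. The same functions could be run on a 13-gene subset by slicing each profile before calling `loo_accuracy`.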