CORDIS - EU research results

STATISTICAL ANALYSIS AND AI ON PUBLIC CANCER DATA FOR THE DEVELOPMENT OF A HEALTH MONITORING SERVICE

Periodic Reporting for period 1 - U-MON (STATISTICAL ANALYSIS AND AI ON PUBLIC CANCER DATA FOR THE DEVELOPMENT OF A HEALTH MONITORING SERVICE)

Reporting period: 2020-12-01 to 2022-01-31

Cancer and cardiovascular diseases are stealth diseases that develop without physical complaints. By the time they are noticed, they have typically progressed to a stage where cure rates are low, the treatment burden for the patient is heavy and treatment costs for society are high. Our envisaged innovation has the potential to overcome this and change healthcare from reactive to preventive: a personalized health monitoring service that will detect any serious disease at an early stage.
To achieve this, we have identified microRNAs as broad-spectrum biomarkers in body fluids that can be measured cost-effectively. MicroRNAs are expressed in an organ-specific manner, so profiling of microRNAs in blood or urine could potentially be used to detect organ diseases such as cancer. We defined two research goals as objectives for the innovation scheme:

1) To elucidate microRNA panels whose expression levels are associated with specific types of cancer by statistical analysis of public data on RNA expression in tumor tissues.

2) To develop deconvolution algorithms, by using statistical modelling and AI, that can be used to extract the presence of specific cancer types from small RNA expression profiles.

The innovation associate has successfully completed research aim 1 and has provided a catalog of microRNAs that show tissue- and/or tumor-specific expression profiles. Algorithms were developed to detect specific cancers from biofluids, and these can now be tested in the laboratory. The foundation was laid for a multi-cancer deconvolution algorithm that will be improved as additional data become available. We concluded that disease detection and specification on the basis of small RNA profiling is indeed possible and that we will further invest in the development of a personalized health monitor.
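To illustrate what deconvolution means in this context, the sketch below treats a biofluid profile as a non-negative mixture of tissue signature profiles and estimates the mixing weights. It is only a sketch under stated assumptions: the report does not describe the model that was used, so non-negative least squares solved with multiplicative updates is a stand-in technique, and the tissue signatures, microRNA values and the function name `deconvolve` are hypothetical.

```python
# Hypothetical signature profiles: expression of four microRNAs per tissue.
# These numbers are invented for illustration and are not project data.
SIGNATURES = {
    "liver":  [10.0, 1.0, 0.0, 1.0],
    "breast": [0.0, 8.0, 1.0, 1.0],
    "colon":  [1.0, 0.0, 9.0, 1.0],
}


def deconvolve(signatures, mixture, iterations=500):
    """Estimate non-negative weights w so that sum_t w[t] * signature[t] ~ mixture.

    Uses multiplicative updates for non-negative least squares, which keep
    every weight non-negative as long as all inputs are non-negative.
    """
    tissues = list(signatures)
    n = len(mixture)
    weights = {t: 1.0 for t in tissues}
    for _ in range(iterations):
        # Profile predicted by the current weights.
        fitted = [sum(weights[t] * signatures[t][i] for t in tissues) for i in range(n)]
        updated = {}
        for t in tissues:
            numerator = sum(s * m for s, m in zip(signatures[t], mixture))
            denominator = sum(s * f for s, f in zip(signatures[t], fitted))
            updated[t] = weights[t] * numerator / denominator if denominator > 0 else 0.0
        weights = updated
    return weights


# A synthetic "plasma" profile built as 0.6 liver + 0.3 breast + 0.1 colon.
mixture = [6.1, 3.0, 1.2, 1.0]
estimate = deconvolve(SIGNATURES, mixture)
print({tissue: round(weight, 3) for tissue, weight in estimate.items()})
```

On this noise-free toy mixture the estimated weights approach the proportions used to build it. Real profiles are noisy and signatures overlap, so a practical method would need normalization and validation on laboratory samples.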
To generate a local database of microRNA profiling data from different tissues and tumors, we started by mining the public TCGA (The Cancer Genome Atlas) database to include processed microRNA sequencing profiles from 9781 healthy tissue and tumor samples encompassing all relevant organs. A second dataset was compiled by obtaining unprocessed sequencing data for 2458 samples from the public Sequence Read Archive (SRA), which we processed ourselves using sRNAbench, publicly available software co-developed by the innovation associate during their PhD. The TCGA dataset was investigated with statistical methods to determine whether tissue specificity of microRNA expression could be found and whether tumorigenesis changes tissue expression profiles. Principal component analysis showed that a number of tissues indeed have distinctive microRNA expression profiles (Figure 1). Differential expression analysis and machine learning (LASSO) identified microRNA panels that are expressed specifically in one tissue or enriched in a limited number of tissues. These organ-microRNA panels were confirmed by analysis of the SRA dataset. Surprisingly, the only tissue that showed no tissue-specific or tissue-enriched microRNAs in these analyses was the lung.
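The LASSO step can be pictured with a small example: an L1 penalty shrinks the coefficients of uninformative microRNAs to exactly zero, so the surviving features form a candidate panel. The sketch below is a minimal coordinate-descent implementation on invented data. The microRNA names (`miR-A`, `miR-B`, `miR-C`), the expression values, the penalty value and the function `lasso_panel` are all hypothetical, and the report does not say which LASSO variant or software the project used.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_panel(expression, labels, names, penalty=0.05, sweeps=100):
    """Fit a LASSO regression of a 0/1 tissue label on microRNA expression.

    expression: list of samples, each a list of microRNA levels.
    Returns only the microRNAs whose coefficient is non-zero.
    """
    n, p = len(expression), len(names)
    # Center features and labels so no intercept term is needed.
    means = [sum(row[j] for row in expression) / n for j in range(p)]
    x = [[row[j] - means[j] for j in range(p)] for row in expression]
    label_mean = sum(labels) / n
    y = [value - label_mean for value in labels]

    weights = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Residual with feature j left out of the current fit.
            partial = [
                y[i] - sum(x[i][k] * weights[k] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(x[i][j] * partial[i] for i in range(n)) / n
            scale = sum(x[i][j] ** 2 for i in range(n)) / n
            weights[j] = soft_threshold(rho, penalty) / scale if scale > 0 else 0.0
    return {name: w for name, w in zip(names, weights) if w != 0.0}


# Invented data: the first three samples are "liver" (label 1), the rest are other tissues.
names = ["miR-A", "miR-B", "miR-C"]
expression = [
    [9, 5, 2], [10, 4, 3], [11, 6, 4],   # liver samples
    [1, 5, 3], [0, 6, 4], [2, 4, 2],     # other tissues
]
labels = [1, 1, 1, 0, 0, 0]
panel = lasso_panel(expression, labels, names)
print(panel)
```

In this toy data only `miR-A` separates the liver samples, so it is the only microRNA that keeps a non-zero coefficient. With thousands of samples and hundreds of microRNAs, the penalty would normally be chosen by cross-validation.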
To understand whether the tissue-specific microRNA panels could be used as reporters of organ problems, the presence of these microRNAs in plasma samples from healthy individuals was assessed using sequencing data generated in house at VUMC. A number of organ-specific and organ-enriched microRNAs were indeed detected in plasma samples (Figure 2). This indicates that, with plasma or potentially urine as the source for microRNA profiling, organ diseases could be detected through changes in organ-specific microRNA panels.
By comparing tumor tissue microRNA profiles with healthy tissue microRNA profiles, we assessed whether tumorigenesis is associated with changes in microRNA profiles that could directly lead to cancer detection. Several tumor tissues, including breast, colon and prostate, indeed showed significant changes in microRNAs, but these were not always changes in tissue-specific microRNAs.
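A tumor-versus-healthy comparison for a single microRNA can be sketched as a log2 fold change plus a test statistic. The example below is illustrative only: the counts are invented, the pseudocount of 1 is a common convention rather than a project setting, and Welch's t statistic is an assumed choice, since the report does not state which differential expression test was applied.

```python
import math
import statistics


def log2_fold_change(tumor, healthy, pseudocount=1.0):
    """log2 ratio of mean expression, with a pseudocount to avoid division by zero."""
    return math.log2(
        (statistics.mean(tumor) + pseudocount) / (statistics.mean(healthy) + pseudocount)
    )


def welch_t(tumor, healthy):
    """Welch's t statistic for two groups with possibly unequal variances."""
    standard_error = math.sqrt(
        statistics.variance(tumor) / len(tumor)
        + statistics.variance(healthy) / len(healthy)
    )
    return (statistics.mean(tumor) - statistics.mean(healthy)) / standard_error


# Invented normalized counts for one microRNA in three tumor and three healthy samples.
tumor_counts = [29, 31, 33]
healthy_counts = [6, 7, 8]

fold_change = log2_fold_change(tumor_counts, healthy_counts)
t_statistic = welch_t(tumor_counts, healthy_counts)
print(f"log2 fold change = {fold_change:.2f}, Welch t = {t_statistic:.2f}")
```

A real analysis would run this across all microRNAs and correct the resulting p-values for multiple testing before calling any change significant.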
Overall, the results from this project confirm observations from the literature that tissue-specific or tissue-enriched microRNA panels exist and can be found in body fluids. Moreover, changes in the expression of microRNA panels were found to be associated with specific forms of cancer. The discovered cancer-microRNA panels can be used separately or in combination as algorithms for cancer detection in body fluids, and it is likely that this can be extended to the detection of other organ diseases. The absence of a lung or lung cancer enriched microRNA panel could mean that lung cancer detection by this method will be difficult. In the literature, however, multiple studies have shown the feasibility of lung cancer detection by microRNA panels in blood, indicating that our current panels may be improved and extended with additional data.
Besides indications for the feasibility of multi-organ cancer detection, the building and curating of 2 large databases of tissue-specific microRNA expression profiles has generated a valuable resource for biomarker discovery and validation.
The results of the project form the basis for a scientific manuscript that is currently in preparation and is expected to be ready for submission to a peer-reviewed journal in 2022.
In conclusion, the work performed by the innovation associate has advanced our understanding of microRNA profiles associated with specific organs and with the occurrence of cancer in specific organs. The observations made in the project indicate that detecting cancer and specifying its organ of origin is realistic and worth pursuing further by the company, with the eventual aim of bringing a minimally invasive cancer detection test to the market.
Figure 1
Figure 2