CORDIS - EU research results

Foundation models for molecular diagnostics - machine learning with biological ‘common sense’

Project description

Foundation models for reliable cancer prediction

Molecular diagnostics plays a crucial role in personalised medicine. However, current AI models struggle to learn from molecular patient profiles and to make robust predictions, because molecular disease biology is complex and training data are limited. The ERC-funded FoundationDX project addresses this gap by developing foundation models from the biomolecular data available for healthy and diseased tissues. Using self-supervised learning, a pivotal driver of AI, the project seeks to build a comprehensive representation of cell biology without requiring well-annotated patient data. This approach is intended to enable reliable prediction of cancer subtypes and prognosis. The project aims to offer powerful machine-learning solutions to molecular diagnostics problems that were previously intractable.

Objective

Molecular diagnostics is crucial in fulfilling the promise of personalized medicine. While we are amidst an AI revolution, current machine learning (ML) models struggle to learn effectively from molecular (‘omics’) patient profiles and fail to make robust predictions. Perhaps this is not a surprise. After all, molecular disease biology is immensely complex, and we ask ML models to predict things as complicated as patient prognosis, based on limited training data and without them ‘knowing’ anything about molecular biology.

To address this, I will create foundation models on top of the vast troves of available biomolecular data, such as multi-omics profiles in healthy and diseased tissues, high-resolution single-cell data and biological knowledge graphs. This approach relies on self-supervised learning (SSL), an important driver of AI, which offers the opportunity to learn a comprehensive representation of the multimodal biology of the cell without the need for well-annotated patient data.

Starting from this strong basis, the FoundationDX model can then reliably predict cancer subtype or prognosis, as it no longer needs to start from scratch on datasets that have too many dimensions and too few samples. Effectively, we give our systems biological ‘common sense’, forgoing the need for millions of labeled training samples. This uniquely enables us to address one of the most clinically relevant questions: which treatment is best for the patient?

The FoundationDX research program is designed to deliver key insights into how the SSL revolution can be used to drive progress in the field of molecular diagnostics. It contains a ‘clinical-grade’ benchmarking module and solves three urgent diagnostic challenges, including noninvasive subtyping of pediatric brain cancer. The time for powerful, robust and generalizable, knowledge-aware machine learning solutions to previously intractable molecular diagnostics problems has come. FoundationDX aims to deliver this.

Host institution

UNIVERSITAIR MEDISCH CENTRUM UTRECHT
Net EU contribution
€ 2 000 000,00
Address
HEIDELBERGLAAN 100
3584 CX Utrecht
Netherlands

Region
West-Nederland Utrecht Utrecht
Activity type
Higher or Secondary Education Establishments
Total cost
€ 2 000 000,00

Beneficiaries (1)