CORDIS - EU research results

An AI Platform integrating imaging data and models, supporting precision care through prostate cancer’s continuum

Periodic Reporting for period 2 - ProCAncer-I (An AI Platform integrating imaging data and models, supporting precision care through prostate cancer’s continuum)

Reporting period: 2022-04-01 to 2023-09-30

In Europe, prostate cancer (PCa) is the second most frequent type of cancer in men and the third most lethal. Current clinical practices often lead to overdiagnosis and overtreatment of indolent tumors and suffer from a lack of precision. This calls for advanced AI models that go beyond the state of the art (SoA) by deciphering non-intuitive, high-level medical image patterns and that improve performance in discriminating indolent from aggressive disease, predicting recurrence early, detecting metastases, and predicting the effectiveness of therapies. To date, efforts are fragmented and based on single-institution, size-limited, and vendor-specific datasets, while the available public PCa datasets contain only a few hundred cases, making model generalizability impossible.
ProCAncer-I’s vision is fourfold: to deliver a platform featuring a collection of PCa multiparametric MRI (mpMRI) images that is unique worldwide in terms of data quantity, quality, and diversity; to deliver novel AI-based clinical tools for advancing the characterization of PCa lesions, the assessment of metastatic potential, and the early detection of disease recurrence; to design and seamlessly integrate an open source framework for the development, sharing, and deployment of AI models and tools; and to develop a concrete plan to sustain and exploit project results.
ProCAncer-I has developed nine concrete and clinically relevant use cases that span the care continuum of PCa, as shown in Figure 1. These serve as demonstrator use cases for validating the added value and usability of the platform. The ProCAncer-I project has achieved significant milestones from the clinical viewpoint, including the definition of detailed study protocols for retrospective and prospective studies, the development and translation of a model for informed consent forms, and the submission for approval to local Ethics Committees. The study received ethical approval from all relevant committees and is registered on ClinicalTrials.gov. The project also established robust procedures and technologies for data anonymization, developed the ProCAncer-I platform as a secure cloud-based infrastructure supporting AI model development, and delivered locally installed eCRF and data upload tools to clinical partners, enhancing data control capabilities.
A significant focus of our work has been on defining ontologies and catalogue mechanisms. The MOLGENIS platform functions as the primary metadata catalogue, aligned with the DCAT-AP specification. Moreover, the project uses the OMOP-CDM, along with its extensions introduced within ProCAncer-I, as a common data model for storing clinical and imaging-related metadata. Collaboration with the OHDSI Medical Imaging Working Group continued, focusing on integrating annotation, segmentation, and curation data as radiomics features, and led to the creation of two extensions to the OMOP-CDM (MI-CDM and R-CDM).
Substantial effort was also devoted to enhancing the platform with image pre-processing and curation tools, including bias field correction and image enhancement. Another significant result has been the development of master models, which serve as foundational models for different tasks and methodologies. Efforts were directed towards creating classification master models based on radiomics and deep learning (DL), alongside segmentation master models for whole prostate gland, prostate zone, and lesion segmentation. Several partners conducted fairness, learning curve, and feature importance analyses to understand diverse model requirements and the impact of features on performance. Other partners analyzed performance using manually annotated lesion or whole prostate gland segmentation masks, offering a comparison between different model requirements, whereas others assessed radiomics features on UC7a using predicted whole prostate gland segmentations.
Regarding the development of the DL-based master models, we have: i) studied the impact of different factors on classification performance, including model types, clinical features, crop sizes, and data amounts; ii) compared unsupervised and supervised approaches, providing a learning curve analysis; iii) compared 2D and 3D data performance in prostate and lesion segmentation models, and designed a DL-based lesion segmentation model together with strategies to address overfitting; and iv) investigated the performance of self-supervised learning (SSL) models in 3D classification using 2D orphan data stored in DICOM format, comparing it with that of the models trained in previous chapters.
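As an illustration of what a learning curve analysis measures, the following is a minimal sketch in plain Python. It is not the project's pipeline: the data are synthetic one-dimensional samples, the classifier is a simple nearest-centroid rule, and all function names (`make_synthetic_data`, `fit_centroids`, `learning_curve`) are hypothetical. The idea is only to show how test accuracy is recorded as the training set grows.

```python
import random

def make_synthetic_data(n, seed):
    """Hypothetical 1-D data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def fit_centroids(train):
    """Return the mean of each class, or None if a class is missing."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    if not xs0 or not xs1:
        return None
    return (sum(xs0) / len(xs0), sum(xs1) / len(xs1))

def predict(centroids, x):
    """Assign x to the class whose centroid is nearer."""
    m0, m1 = centroids
    return 1 if abs(x - m1) < abs(x - m0) else 0

def accuracy(centroids, test):
    """Fraction of test samples classified correctly."""
    correct = sum(1 for x, y in test if predict(centroids, x) == y)
    return correct / len(test)

def learning_curve(train, test, sizes):
    """Fit on the first n training samples for each n; record test accuracy."""
    curve = []
    for n in sizes:
        centroids = fit_centroids(train[:n])
        if centroids is not None:  # skip subsets that contain a single class
            curve.append((n, accuracy(centroids, test)))
    return curve

if __name__ == "__main__":
    train = make_synthetic_data(200, seed=0)
    test = make_synthetic_data(500, seed=1)
    for n, acc in learning_curve(train, test, [10, 50, 200]):
        print(f"train size {n:4d}: test accuracy {acc:.3f}")
```

A curve that is still rising at the largest training size suggests that more data would help, which is how such an analysis can reveal data scarcity.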
The project has implemented a multitier structure that encompasses Master, Vendor-Specific, and Vendor-Neutral AI models. These models address the challenge of AI dependability in diverse diagnostic settings. The strategy of training the Master models on the complete dataset anticipated the emerging trend of providing foundation models in diagnostic imaging. Furthermore, each of the developed models is innovative in its own right (e.g. concerning the model architecture, data preparation and cropping, the use of priors, etc.), as evidenced by the partners’ scientific publications.
Furthermore, during model development, data scarcity was observed, particularly in some use cases, as indicated by the learning curves of many master models. To address this issue and the absence of fully labeled data, a novel class of models leveraging semi-supervised learning techniques has been studied. Self-supervised learning (SSL) combined with multiple instance learning (MIL) techniques was used to develop models for two specific use cases (UC1 and UC2). The resulting models were found to perform similarly to fully supervised ones.
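To make the multiple instance learning idea concrete: in MIL, a label is available only for a bag (for example, a whole examination) and not for its individual instances (for example, slices or patches), so instance-level scores must be pooled into one bag-level score. The sketch below shows two common pooling rules in plain Python. It is an illustration under stated assumptions, not the project's models: the scores and attention logits are made up, and the function names are hypothetical.

```python
import math

def bag_score_max(instance_scores):
    """Max pooling: the bag is as suspicious as its most suspicious instance."""
    if not instance_scores:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_scores)

def bag_score_attention(instance_scores, attention_logits):
    """Attention pooling: softmax-weighted average of the instance scores."""
    if not instance_scores or len(instance_scores) != len(attention_logits):
        raise ValueError("need one attention logit per instance")
    # Subtract the maximum logit for numerical stability of the softmax.
    m = max(attention_logits)
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * s for w, s in zip(weights, instance_scores))

if __name__ == "__main__":
    # Hypothetical per-slice scores for one examination (one bag).
    slice_scores = [0.05, 0.10, 0.92, 0.20]
    # In a trained model these logits would be learned; here they are invented.
    logits = [0.0, 0.0, 3.0, 0.5]
    print("max pooling:      ", bag_score_max(slice_scores))
    print("attention pooling:", bag_score_attention(slice_scores, logits))
```

Max pooling needs no extra parameters, whereas attention pooling lets a model learn which instances matter; which of these (if either) the project used is not stated in the report.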
Other points of innovation concern the exploration of causal inference and reasoning in deep learning models and the design of uncertainty estimation methods. A pioneering approach has been established in which deep learning models are built to exploit causal inference and reasoning. The goal was to let models capture weak causal signals within a scene, that is, to formulate how the presence of a feature in one part of the image influences the appearance of a different feature in another part of the image. The methodology was evaluated on the PI-CAI challenge dataset of prostate MRI images for diagnosing prostate cancer; it improved classification performance and produced more robust predictions by emphasizing the relevant parts of the image.
Given the significance of incorporating a confidence score for AI model outcomes, pioneering methods for integrating uncertainty estimation have been devised and are presently undergoing testing.
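The report does not say which uncertainty estimation methods were devised. Purely as an illustration of what a confidence score can look like, the sketch below averages several predicted probabilities for the same case (for example, from an ensemble or from repeated stochastic forward passes) and reports the entropy of the averaged prediction. The function names and example values are hypothetical.

```python
import math

def mean_probability(probabilities):
    """Average several predicted probabilities for the positive class."""
    if not probabilities:
        raise ValueError("need at least one prediction")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return sum(probabilities) / len(probabilities)

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable: 0 when certain, 1 at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def predict_with_uncertainty(probabilities):
    """Return (mean probability, predictive entropy) for one case."""
    p = mean_probability(probabilities)
    return p, binary_entropy(p)

if __name__ == "__main__":
    # Hypothetical outputs of three models for two different cases.
    print(predict_with_uncertainty([0.90, 0.92, 0.88]))  # members agree
    print(predict_with_uncertainty([0.10, 0.90, 0.50]))  # members disagree
```

A higher entropy marks a case where the prediction should be treated with more caution, for example by flagging it for review.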
Another area in which the project is contributing beyond the current SoA is that of standards. Examples include:
- The Oncology CDM extension for representing prostate cancer data at the levels of granularity and abstraction required to support cancer research.
- An extension of the W3C DCAT-AP metadata model. Our extension adds constraints to the basic DCAT model and proposes a set of controlled vocabularies to be used for specific attributes, thereby ensuring future interoperability across different imaging data portals within the European Union. In addition, we have implemented a FAIR Data Point (FDP) on top of the ProCAncer-I catalogue that exposes the available DCAT-AP metadata over HTTP in a machine-readable, standardized format (RDF), ensuring the FAIRification of the ProCAncer-I datasets.
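To give a sense of what machine-readable DCAT metadata looks like, the sketch below builds a minimal dataset description as JSON-LD, one of several RDF serializations, using only the Python standard library. It is an assumption-laden illustration: the URIs under `example.org`, the title, the keywords, and the function name `build_dcat_dataset` are all hypothetical, and the additional constraints and controlled vocabularies of the project's DCAT-AP extension are not reproduced here. Only the `dcat:` and `dct:` namespaces and terms come from the published vocabularies.

```python
import json

def build_dcat_dataset(dataset_uri, title, description, keywords, access_url):
    """Build a minimal DCAT dataset description as a JSON-LD dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": access_url},
        },
    }

if __name__ == "__main__":
    # All values below are placeholders, not real catalogue entries.
    record = build_dcat_dataset(
        dataset_uri="https://example.org/catalogue/dataset/1",
        title="Example prostate mpMRI collection",
        description="Hypothetical dataset entry used for illustration only.",
        keywords=["prostate cancer", "mpMRI"],
        access_url="https://example.org/catalogue/dataset/1/access",
    )
    print(json.dumps(record, indent=2))
```

A harvester that understands DCAT can read such a record without knowing anything about the portal that produced it, which is the interoperability benefit the standard aims for.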
[Image: clinical-challenges-procancer-i.png]