EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement

Medical institutions produce huge amounts of digital data, including images, biosignals and associated text, such as diagnostic reports.
The application of modern machine learning techniques to biomedical data represents a unique opportunity to radically change and improve medicine in the coming decades. However, the difficulty of obtaining large datasets annotated by experts, paired with the heterogeneity of clinical data, limits progress in this direction. Working in the domain of histopathology (the gold standard for the diagnosis of several diseases, including cancer), ExaMode helped address these challenges by designing and developing methods that link concepts found in thousands of clinical reports to the related medical images without human interaction. This makes it possible to create representations of medical knowledge that are multimodal and multilingual, and that can serve as a basis for computer-assisted diagnostic algorithms and products.

The ExaMode consortium is composed of seven partners that collaborated to reach the project objectives. The academic partners, the University of Applied Sciences Western Switzerland (HES-SO Valais, Switzerland, coordinator), the University of Padova (Italy) and the Radboud University Medical Center (Netherlands), collaborated closely to develop methods and tools for dealing with data heterogeneity and for extracting multimodal knowledge from reports and digital pathology images. The industrial partners Tooploox (Poland) and SIRMA AI (Bulgaria) focused on new products that match the requirements of hospitals and histopathology departments. SURF (Netherlands) provided High-Performance Computing (HPC) resources to the consortium and actively collaborated on the development of the research tools and product prototypes. The clinical partners (Azienda Sanitaria Provinciale Di Catania and Radboud University Medical Center) provided data and annotations in accordance with ethics requirements, as well as clinical guidance for research and tool validation.
One year after the beginning of ExaMode, the Covid-19 pandemic created a global challenge that affected the consortium and, in particular, the hospitals involved in the project, which were reorganized to fight the pandemic. Nevertheless, ExaMode obtained impressive results, including the completion of all 47 deliverables and 10 milestones, the publication of over 120 scientific papers that had received over 1500 citations by the end of the project in mid-2023, and the release of 25 open source software libraries that support the application of machine learning to multimodal biomedical data.
Among the results obtained in the context of ExaMode, the following are particularly noteworthy.
The first is data, a fundamental resource for machine learning. During ExaMode, the consortium acquired tens of thousands of high-resolution microscopy images and related anonymized reports from clinical practice, as well as data from publicly available sources (including, for instance, the scientific biomedical literature).
Second, several software resources were produced to extract knowledge from heterogeneous clinical data. Most of the libraries were released as open source (https://www.examode.eu/software/). They support important applications such as dealing with data variability, data compression, segmentation and classification of digital pathology images, the extraction of concepts from text, the manual annotation and exploration of medical text, the visualization and exploration of biomedical concepts and ontologies, data transfer between centers, weakly supervised learning, multimodal learning and multimodal ontologies.
Finally, several product prototypes were developed by the companies involved in ExaMode and tested in clinical settings. SIRMA AI developed Histographer and SNOMEDICO: respectively, a platform that supports pathologists in making more informed decisions (by exploiting the similarity to other cases in their clinical practice or in the scientific literature) and a REST API service for structuring diagnosis information based on the SNOMED CT ontology. Tooploox developed three prototypes (VIRTUM P1, P2 and P3) that make it possible to manage and annotate histopathological slides, to detect tissue abnormalities, and to aid researchers by linking the PubMed Central knowledge base with visual context references.

Progress beyond the state of the art: Starting from the domain of digital pathology, ExaMode developed new technologies and tools to handle vast amounts of biomedical data more efficiently. The project advances the management and analysis of data from various sources, including text and images, leading to faster training, more precise predictions, and decision-making support without the need to involve human experts for data annotation.
Socio-economic impact: Industrial partners developed prototypes based on the project's research. These new tools streamline workflows, improve research efficiency, and enhance remote access and collaboration, making their users more productive. Several outcomes of the project, including weakly supervised learning from clinical images and reports and multimodal ontologies, have the potential to be a game changer for biomedical sciences in the next decade: by design, they allow statistical analysis across data of different modalities, and they can improve in-depth diagnostics and precision medicine.
Wider societal implications of the project: With a high number of publications, citations, open source libraries and prototypes, ExaMode has made significant contributions to the machine learning and biomedical research domains. The use of AI and machine learning in digital pathology can speed up diagnoses, improving patient care and outcomes. Moreover, ExaMode's approach to data handling and analysis can potentially be extended to other fields such as radiology, helping to establish EU leadership in these sectors.

Periodic Reporting for period 3 - EXA MODE (EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement)
