Periodic Reporting for period 3 - EXA MODE (EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement)
Reporting period: 2022-01-01 to 2023-06-30
The application of modern machine learning techniques to biomedical data represents a unique opportunity to radically change and improve medicine in the upcoming decades. However, the difficulty of obtaining large datasets annotated by experts, paired with the heterogeneity of clinical data, limits progress in this direction. Working in the domain of histopathology (the gold standard for the diagnosis of several diseases, including cancer), ExaMode helped address these challenges by devising and developing methods to link concepts included in thousands of clinical reports to the related medical images without human interaction. This allows the creation of representations of medical knowledge that are multimodal and multilingual and that can serve as a basis for computer-assisted diagnostic algorithms and products.
One year after the beginning of ExaMode, the Covid-19 pandemic created a global challenge that affected the consortium and, in particular, the hospitals involved in the project, which were reorganized in order to fight the pandemic. Nevertheless, ExaMode obtained impressive results, including the completion of all 47 deliverables and 10 milestones, the publication of over 120 scientific papers that had received over 1500 citations by the project's end in mid-2023, and the release of 25 open-source software libraries that empower the application of machine learning to multimodal biomedical data.
Among the results obtained in the context of ExaMode, the following are particularly noteworthy.
First, data are a fundamental resource for machine learning. During ExaMode, the consortium acquired tens of thousands of high-resolution microscopy images and related anonymized reports from clinical practice, as well as data from other publicly available sources (including, for instance, the scientific biomedical literature).
Second, several software resources were produced to extract knowledge from heterogeneous clinical data. Most libraries were released as open source (https://www.examode.eu/software/) and support important applications such as handling data variability, data compression, segmentation and classification of digital pathology images, the extraction of concepts from text, the manual annotation and exploration of medical text, the visualization and exploration of biomedical concepts and ontologies, data transfer between centers, weakly supervised learning, multimodal learning and multimodal ontologies.
Finally, several product prototypes were developed by the companies involved in ExaMode and tested in clinical settings. SIRMA AI developed Histographer and SNOMEDICO: respectively, a platform to support pathologists in making more informed decisions (exploiting the similarity to other cases in their clinical practice or in the scientific literature) and a REST API service for structuring diagnosis information based on the SNOMED CT ontology. Tooploox developed three prototypes (VIRTUM P1, P2 and P3), which make it possible to manage and annotate histopathological slides to detect tissue abnormalities, and to aid researchers by interconnecting the PubMed Central knowledge base with visual context references.
Socio-economic impact: Industrial partners developed prototypes based on the project's research. These new tools streamline workflows, improve research efficiency, and enhance remote access and collaboration, making their users more productive. Several outcomes of the project, including weakly supervised learning from clinical images and reports and multimodal ontologies, have the potential to be game changers for the biomedical sciences in the next decade, enabling by design statistical analysis across data of different modalities and improving in-depth diagnostics and precision medicine.
Wider societal implications of the project: With a high number of publications, citations, open-source libraries and prototypes, ExaMode has made significant contributions to the machine learning and biomedical research domains. The use of AI and machine learning in digital pathology can speed up diagnoses, improving patient care and outcomes. Moreover, ExaMode's approach to data handling and analysis can potentially be extended to other fields such as radiology, helping to establish EU leadership in these sectors.