Periodic Reporting for period 2 - CLARIFY (CLoud ARtificial Intelligence For pathologY)
Reporting period: 2021-11-01 to 2024-02-29
The main goal of CLARIFY was to develop a robust automated digital diagnostic environment based on artificial intelligence (AI) and cloud-oriented data algorithms to facilitate automatic histological image interpretation and diagnosis with the aim of maximizing the benefits of digital pathology.
Throughout the project, CLARIFY has provided insights into the field of digital pathology to develop customized technical solutions and has leveraged AI models to objectively evaluate clinicopathologic parameters, standardizing assessments and reducing variability among pathologists, thereby enhancing tumor classification and histopathological interpretation. Specifically, CLARIFY has pioneered novel and robust methods for data-driven Whole Slide Image (WSI) interpretation across three selected diseases. Cutting-edge deep neural network models and tailored architectures were employed for feature extraction and classification. CLARIFY introduced both diagnostic and prognostic classification pipelines, along with content-based image retrieval capabilities. Furthermore, CLARIFY has explored and constructed an advanced decentralized data flow management architecture for histological images and related metadata, seamlessly linking necessary analysis tools within a federated cloud environment.
As a primary outcome, CLARIFY has nurtured the growth of 12 young scientists whose work supports better-informed decisions in pathology. Two PhD theses have already been successfully defended within the project's framework, with others scheduled for defense in the coming months.
- Collaboration among clinical and technical partners to develop cancer-specific annotation protocols, together with the development of a user-friendly web-based application for histological image navigation and annotation.
- Development of a web application for pathologists that includes several methods developed in the project (CBMIR, segmentation, classification, automatic scoring).
- Development of a dataset and notebook search system using information retrieval techniques, and a data quality control framework using active learning techniques.
- Development of a notebook-based decentralized workflow management system for scaling privacy-preserving, data-centric workflows.
- Development of a blockchain design for data protection.
- Development of an automated preprocessing pipeline for detecting artifacts in WSIs.
- Development of fully automated pipelines including tissue classification, ROI extraction, and AI models for diagnostic and prognostic urinary bladder cancer predictions.
- Development of a mitosis detection approach and a graph-based molecular subtype prediction in breast cancer.
- Development of a framework for spitzoid tumor analysis inspired by the biopsy examination process.
- Development of novel probabilistic deep learning models enabling training with limited or imperfect annotations for histopathological images.
- Development of a tailored network for Content-Based Histopathological Image Retrieval (CBHIR).
- Publication of 60 research papers in international conferences (34) and journals (26), which resulted in 32 openly released software code bases and datasets.
- Defense of five PhD theses. The others are in process.
OPEN ENTERPRISE-SCALE INFRASTRUCTURE OF QUALITY-CONTROLLED WSIs
CLARIFY has investigated and built an advanced decentralized data flow management architecture for WSIs and related metadata, linking the necessary analysis tools in a federated cloud environment. The architecture is built on a computational notebook environment and can handle large, concentrated data streams from different sources to enable federated machine learning and privacy-preserving distributed workflows. The CLARIFY decentralized data management framework gives access to an open, comprehensive, and quality-controlled database of WSIs and associated metadata using blockchain and smart contract techniques. Using information retrieval and semantic web techniques, the framework allows users to discover datasets and notebooks from distributed sources and promotes efficient workflow construction. The data management framework integrates and builds on advanced features in cloud computing and blockchain to address the needs of modern information architectures.
AUTOMATIC WSI INTERPRETATION AND NOVEL PATTERN IDENTIFICATION
CLARIFY has developed novel and robust methods for data-driven WSI interpretation across the three selected diseases. State-of-the-art deep neural network models and tailored architectures were used for feature extraction and classification across different tasks. Preprocessing pipelines were proposed for the automatic detection and removal of artifacts, which is crucial for weakly supervised learning, inference, and quality assessment of WSIs. CLARIFY introduced both diagnostic and prognostic classification pipelines for the example diseases, as well as content-based image retrieval for WSIs with efficient feature-matching strategies. Various learning strategies were adopted depending on the annotations available and the task type, including fully supervised, weakly supervised, semi-supervised, and unsupervised learning. Different data labeling methods were explored, including active learning and crowdsourcing. The incorporation of explainable AI techniques provided interpretability and facilitated new insights, especially in prognostic tasks.
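As an illustration only, the general idea behind content-based image retrieval with feature matching can be sketched as follows. This is a minimal sketch under stated assumptions, not the CLARIFY implementation: the feature vectors are hand-written stand-ins for embeddings that a trained network would produce, cosine similarity is an assumed matching criterion, and the names `cosine_similarity`, `retrieve`, and the `patch_*` identifiers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query, index, top_k=3):
    """Return the top_k (patch_id, similarity) pairs most similar to the query."""
    scored = [
        (patch_id, cosine_similarity(query, features))
        for patch_id, features in index.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical index: patch identifier -> feature vector (toy values,
# standing in for embeddings extracted from WSI patches).
index = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.9, 0.1, 0.0],
    "patch_c": [0.0, 1.0, 0.0],
}

results = retrieve([1.0, 0.0, 0.0], index, top_k=2)
for patch_id, score in results:
    print(patch_id, round(score, 4))
```

In this toy setup the query is ranked against every stored vector exhaustively; a real system indexing many WSI patches would more plausibly rely on an approximate nearest-neighbor index, which is outside the scope of this sketch.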
APPLICATIONS FOR DIGITAL PATHOLOGY
CLARIFY has leveraged artificial intelligence (AI) models to objectively evaluate clinicopathologic parameters and to improve the diagnostic process for the selected diseases, standardizing assessments, reducing variability among pathologists and thereby enhancing tumor classification and histopathological interpretation.
CLARIFY has provided insights into the field of digital pathology to develop tailored technical solutions. As a proof of concept, a prototype of a web application targeting pathologists was established, incorporating several methods developed in the project.