Periodic Reporting for period 3 - BIGPICTURE (Central Repository for Digital Pathology)
Reporting period: 2024-02-01 to 2025-07-31
Building on existing assets, such as those derived from the ELIXIR infrastructure, Bigpicture establishes the first European platform compliant with ethical and regulatory requirements that connects pathologists, researchers, AI developers, patients, and industry. Our vision is to become a catalyst in the digital transformation of pathology, allowing AI to reach maturity in the field. Bigpicture will enable the development and validation of trustworthy technologies that help diagnose and predict a wide range of diseases and improve the quality and efficiency of toxicity studies. We will create the tools and workflows to support the collection and disclosure of large series of images and metadata by European pathology departments. By engaging stakeholders and building consensus among them, Bigpicture will contribute to a regulatory framework for digital pathology and AI-based methods. Finally, Bigpicture envisions sustainability of its platform through a community-based model that creates value and reciprocity in use.
The backend technical infrastructure of the Bigpicture repository is in operation with improved data-submission workflows, enhanced metadata handling, and dataset landing pages. Integration with Life Science AAI for authentication and authorization is complete, cybersecurity scanning is in place, indirect data access was demonstrated via Grand Challenge, and a filesystem-based access layer enables use without manual downloads. A new SOP and ticketing system streamline dataset ingestion.
To support the collection of harmonized and high-value data, a (meta)data model for directly accessible datasets was developed and updated. The node coordinator network has submitted data to the repository, and additional clinical partners have prepared datasets for submission. The non-clinical data transporter was further updated and deployed across EFPIA partner sites, resulting in successful submissions from two EFPIA partners. In total, 65,407 Whole Slide Images (WSIs) have been successfully submitted, with ongoing uploads from all contributing parties. As part of the honest broker mechanism, the REMS service for managing data access requests and the Perun group management service have been implemented and integrated with LS-AAI.
Despite the unexpected bankruptcy of partner Cytomine, progress on AI tools continued: the platform was transferred to ULiege, and a new version was released that enables the integration and execution of AI models. Tools for DICOM conversion and indirect data access were developed and demonstrated. Several AI models were created and integrated (quality control, image retrieval, toxicopathology), alongside work on domain adaptation and model interpretability. Foundational steps towards model and annotation interoperability were taken via new metadata standards. Validation platforms and demonstrators were set up with pathologists, and planning for a Bigpicture foundation model began alongside ethical and technical assessments.
The foundations for responsible data sharing, namely the Data Protection Impact Assessment (DPIA), the pseudonymisation strategy, and the ethics advisory board, have been delivered. A comparison of national ethical and legal frameworks was completed. A major milestone was reached with the formal sign-off of the Data Sharing Agreement and the Hosting & Processing Agreement in July 2024. Further outputs include DPIA updates, a white paper on digital pathology in toxicology, reports on trustworthy AI and validation resources, and a hands-on workshop for pathologists.
To plan for sustainability, a mid-term business plan was developed. Ongoing stakeholder engagement, including with data contributors, refined value propositions and informed long-term sustainability and exploitation planning.
Traditional one-to-one data sharing with tailor-made agreements hampers the collection of large datasets that are representative of real-world use cases. In Bigpicture we created a single data sharing agreement that meets the diverse needs of data providers, which is crucial for building impactful collections. Setting up a resource for AI development in pathology at the scale of Bigpicture is unprecedented and is attracting international attention from researchers, large and small companies, and many other stakeholders. The number of partners, the range of disease areas, and the unique combination of data from the clinical and preclinical fields offer new perspectives on the use of AI in pathology.
To enable contributions from varied sources, we developed tools and processes that let providers prepare cohorts on-premises, within their local setups: quality management for high-quality WSIs, and software to find cases, extract data, convert images to DICOM, and package uploads. For EFPIA partners, a dedicated Data Transporter supports regulated environments. By unlocking large, representative datasets, Bigpicture allows industry to build AI with broader applicability and impact.
Progress has also been limited by missing standards. Bigpicture goes beyond recommendations by delivering a comprehensive metadata standard spanning clinical and pre-clinical data, releasing DICOM conversion tools, and contributing to an ISO standard for digital/computational pathology. These steps reduce integration friction and help AI developers deploy solutions across heterogeneous clinical workflows.