Periodic Reporting for period 1 - SEARCH (Synthetic hEalthcare dAta goveRnanCe Hub)
Reporting period: 2024-10-01 to 2025-09-30
SEARCH directly addresses these challenges by combining synthetic data generation, federated analytics, and robust governance frameworks to enable secure, scalable, and regulation-ready medical data innovation. Through extensive engagement with clinicians, data managers, and technical stakeholders, the project ensures that solutions are grounded in real operational requirements.
During the first reporting period, SEARCH established the foundational elements required for this ecosystem: user and technical requirements (56 prioritised needs across 25 organisations), mapping of 140 real-world clinical datasets, and a modular federated architecture for secure data discovery, harmonisation, and model training. The project also delivered the first Synthetic Data Assessment and Credibility (SDAC) Framework, providing structured methods for evaluating privacy, fidelity, utility, and fairness of synthetic datasets—essential for regulatory readiness.
SEARCH’s overall objectives are to:
Develop high-fidelity multimodal synthetic data aligned with FAIR principles.
Build a federated platform enabling secure distributed analytics while preserving data sovereignty.
Apply and validate these methods across six clinical studies.
Establish pathways for exploitation, regulatory alignment, and long-term sustainability.
Through these activities, SEARCH aims to reduce barriers to data access, accelerate AI development, and strengthen Europe’s capacity for safe, trustworthy digital health innovation.
A consortium-wide consultation captured input from across partner organisations, resulting in prioritised user, technical, and clinical requirements that now guide the platform’s development. In parallel, partners completed a comprehensive mapping of 140 datasets across oncology, cardiovascular, neurological and gastrointestinal domains, representing more than 6.5 million data items. Metadata schemas and interoperability specifications were defined using established standards including HL7 FHIR, SNOMED CT, DICOM and OMOP, providing a harmonised structure for future federated data preparation.
The project delivered the initial technical architecture for the SEARCH platform, detailing modular components for data curation, harmonisation, federated learning, and synthetic data generation.
Foundational synthetic data generation methods were developed across structured/tabular data, physiological signals, radiology and endoscopy images, and genomics. Early evaluation procedures were aligned with the SDAC framework to assess privacy, fidelity and utility.
Clinical teams defined the structure, data flows and variable requirements for six validation studies in oncology, cardiovascular disease and gastrointestinal medicine, establishing the basis for downstream model testing and clinical decision-support evaluation.
Together, these advances provide the essential groundwork for federated platform deployment, synthetic data generation at scale, and clinical validation activities planned for the next period.
A major step beyond current practice is the creation of the Synthetic Data Assessment and Credibility (SDAC) framework, which consolidates privacy, fidelity, utility and fairness checks into a single validation process for synthetic medical data—one of the first structured approaches of its kind in Europe.
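As an illustration only, the idea of consolidating several checks into a single validation outcome can be sketched as follows. This is not the SDAC framework itself: the metric choices (total variation distance as a fidelity proxy, exact-match rate as a crude privacy proxy), the `Check` class, and all function names are assumptions made for this sketch. Utility and fairness scores are assumed to come from separate evaluations and would be passed in as further `Check` entries.

```python
from collections import Counter
from dataclasses import dataclass


def total_variation(real, synthetic):
    """Total variation distance between two categorical samples (0 = identical)."""
    p, q = Counter(real), Counter(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(
        abs(p[c] / len(real) - q[c] / len(synthetic)) for c in categories
    )


def exact_match_rate(real_rows, synthetic_rows):
    """Share of synthetic rows that copy a real row verbatim (crude privacy proxy)."""
    real_set = set(real_rows)
    return sum(row in real_set for row in synthetic_rows) / len(synthetic_rows)


@dataclass
class Check:
    """One assessed dimension (hypothetical structure, not taken from SDAC)."""
    dimension: str          # e.g. "privacy", "fidelity", "utility", "fairness"
    value: float            # measured score
    limit: float            # acceptance threshold chosen by the assessor
    higher_is_better: bool  # direction in which the score improves

    @property
    def passed(self):
        if self.higher_is_better:
            return self.value >= self.limit
        return self.value <= self.limit


def consolidate(checks):
    """Return (overall_pass, list of failed dimensions)."""
    failed = [c.dimension for c in checks if not c.passed]
    return (not failed, failed)
```

In this sketch a dataset is accepted only if every dimension passes, so a strong fidelity score cannot offset a privacy failure. A real assessment would need far more robust metrics and documented thresholds.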
The project also defined a modular architecture for federated learning and synthetic data generation, clearly separating local (on-premise) and central components to maintain data sovereignty while enabling multi-institution model development in line with GDPR and emerging EU health data policies.
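The separation between local and central components can be illustrated with a minimal federated-averaging sketch. It assumes a one-parameter linear model and plain sample-weighted averaging. The function names and the model are hypothetical and do not describe the SEARCH platform's actual implementation. The point is only that raw records stay on site, while model parameters and sample counts are the only values sent to the central component.

```python
def local_update(weight, data, lr=0.05):
    """Runs on-premise: one gradient step of least squares for y = weight * x.

    `data` is a list of (x, y) pairs that never leaves the site; only the
    updated weight and the number of samples are returned.
    """
    n = len(data)
    grad = sum(2 * (weight * x - y) * x for x, y in data) / n
    return weight - lr * grad, n


def aggregate(updates):
    """Runs centrally: sample-weighted average of the site weights."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def federated_round(weight, sites, lr=0.05):
    """One round: every site trains locally, then the centre averages."""
    return aggregate([local_update(weight, data, lr) for data in sites])


def train(sites, rounds=50, lr=0.05):
    """Repeat federated rounds starting from a zero weight."""
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(weight, sites, lr)
    return weight
```

For example, `train([[(1, 2), (2, 4)], [(3, 6)]])` lets two sites jointly fit a slope close to 2 without either site sharing its individual records. A production system would add secure aggregation, authentication and auditing on top of this basic pattern.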
To maximise future uptake, further progress will require: continued technical refinement of synthetic data pipelines; targeted engagement with regulators and HTA bodies; contribution to international standardisation; and alignment of exploitation pathways with clinical priorities. These will support sustainable deployment across healthcare, research and industry.