A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy

Periodic Reporting for period 2 - DataTools4Heart (A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy)

Reporting period: 2024-04-01 to 2025-09-30

Cardiovascular disease remains the leading cause of death in Europe, yet cardiology data are still fragmented across hospitals, languages and national systems. This fragmentation, combined with strict but heterogeneous data protection practices, makes it difficult to reuse health data for research and to develop AI tools that are robust, fair and deployable across borders. At the same time, the EU is introducing an ambitious regulatory framework for digital health and AI, including the AI Act and the European Health Data Space, which calls for interoperable standards, privacy-preserving data reuse and strong governance.
DataTools4Heart (DT4H) addresses these challenges by building a European health data “toolbox” for cardiology. The project develops a standards-based Common Data Model and ingestion pipelines to harmonise cardiology data, a multilingual clinical NLP suite for cardiology reports, a federated and privacy-preserving AI infrastructure, and an open platform with virtual assistants and audit trails. These tools are demonstrated in real-world heart-failure use cases across several European centres. Social sciences, law and ethics are integrated through a dedicated work package that interprets EU law, designs governance mechanisms and drafts a sector-specific EU Code of Conduct, while clinicians and patients help shape requirements and workflows. Together, these elements are expected to deliver a reusable reference infrastructure for trustworthy cardiology AI with potential to scale to other diseases and to support EU strategies on health data and AI.
During this reporting period, DT4H moved from design to a functioning ecosystem of tools and methods. On the governance side, the consortium analysed the AI Act, the European Health Data Space and related guidance, translated them into concrete requirements for system architecture, access control and logging, developed a comprehensive ethics framework and prepared an EU Code of Conduct for health data sharing and reuse under Article 40 GDPR. The Data Management Plan was updated as a living document aligned with FAIR principles and with the federated architecture of the project.
On the technical side, the project delivered a mature data backbone. A heart-failure Common Data Model based on HL7 FHIR and major clinical terminologies was finalised and implemented; a modular open-source Data Ingestion Suite, built on the toFHIR engine, was deployed and validated in multiple European centres; and a Feature Extraction Suite formalised cohorts and AI-ready variables as machine-readable definitions. A federated metadata catalogue was introduced to expose harmonised feature availability and codebooks across sites.
For unstructured data, DT4H developed a multilingual clinical NLP stack with cardiology-tuned language models, annotated corpora and named entity recognition systems for seven European languages, integrated with terminology services.
In parallel, the project built a federated learning framework (FLCore) with machine learning and survival-analysis models, fairness-oriented and uncertainty-aware aggregation, and complementary tools for differentially private synthetic data generation that underpin the CardioSynth pipeline.
All components were integrated in a single web-based platform that includes shared workspaces, the catalogue, NLP and synthetic-data services, federated learning, a permissioned blockchain for audit trails and a virtual assistant capable of interacting with the ecosystem in natural language. Clinical partners, through regular clinical–technical meetings and demonstrations in realistic heart-failure scenarios, guided the design and initial validation of these tools.
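To make the federated learning idea concrete, the following is a minimal sketch of sample-size-weighted parameter averaging (the generic FedAvg scheme), in which each centre shares only model parameters and its local sample count, never patient-level records. It is an illustration under stated assumptions, not FLCore's implementation: the function name `federated_average` and its arguments are hypothetical, and the fairness-oriented and uncertainty-aware aggregation mentioned above is not reproduced here.

```python
from typing import List, Sequence


def federated_average(
    site_weights: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Combine per-site parameter vectors into one global vector.

    Hypothetical helper for illustration only: plain FedAvg-style
    averaging, weighting each site by its local sample count.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need exactly one sample count per site vector")
    dim = len(site_weights[0])
    if any(len(w) != dim for w in site_weights):
        raise ValueError("all sites must send vectors of the same length")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each global parameter is the sample-weighted mean across sites.
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical centres: one with 1 patient, one with 3 patients.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In this toy setting the larger centre dominates the average, which is one reason a framework may replace the plain weighting with fairness-oriented alternatives.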
DT4H advances the state of the art by combining, in a single architecture, elements that are usually developed in isolation. At the data layer, the project offers not just an ETL pipeline but a standards-based Common Data Model, a multi-model ingestion suite and a FHIR-native feature store with declarative definitions of populations and variables, all exposed via a federated catalogue that supports FAIR and reproducible data preparation across heterogeneous hospitals. In language technologies, DT4H delivers one of the first coordinated sets of cardiology-specific transformer models, corpora and extraction tools across multiple European languages, moving beyond single-language or generic biomedical NLP. In AI, the project integrates federated learning, survival-analysis models, fairness-oriented and uncertainty-aware aggregation, synthetic data generation and privacy-enhancing technologies into a coherent framework, complemented by explainability tools and a blockchain-based audit layer.
These results open the door to large-scale, multi-centre, multilingual research in cardiology without centralising patient-level data. To ensure further uptake, the project focuses on open-source releases, clear documentation and containerised deployment, as well as alignment with major standards and EU regulation. Remaining needs for full exploitation include extended real-world validation in additional centres, continued hardening and maintenance of the open-source components, suitable funding and procurement instruments for hospitals and SMEs to adopt federated infrastructures, and sustained collaboration with other European initiatives to ensure interoperability and long-term sustainability.