CORDIS - EU research results

A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy

Periodic Reporting for period 1 - DataTools4Heart (A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy)

Reporting period: 2022-10-01 to 2024-03-31

Healthcare data re-use in Europe faces ethical and legal issues, a high diversity in data formats and languages, and a lack of technical and clinical interoperability. To improve this situation, the DataTools4Heart project aims to co-create, develop and demonstrate a comprehensive, federated, privacy-preserving cardiology data toolbox including standardised data ingestion and harmonisation tools, multilingual natural language processing, federated machine learning and data synthesis methods, as well as virtual assistants to help scientists and clinicians navigate through large-scale multi-source cardiology data, while complying with European regulations and data standards. Specifically, the project will build an innovative, federated, privacy-preserving cardiology data toolbox that will improve health data re-usability while strictly complying with ethical and legal standards and requirements. Furthermore, we will develop a common data extraction tool that will enhance metadata and data interoperability while taking into account data heterogeneity across European regions and cardiology units. Moreover, we will introduce a multilingual NLP suite, including cardiology-specific entity recognition and machine translation, for standardised data structuring of cardiology reports across European regions. By developing an open platform with dedicated virtual assistants, we will help clinicians, researchers and data scientists to structure and navigate through existing and new health data sets in cardiology. Ultimately, we will demonstrate and optimise the value of the proposed cardiology health data toolbox based on concrete clinical use cases drawn from real-world practice in the outpatient and emergency cardiology units.
The DataTools4Heart project has made significant progress towards its scientific objectives. Clinical and technical partners have collaborated to define the healthcare and clinical requirements of the project. Based on these specifications, a total of 27 FHIR Profile definitions, 5 FHIR Code System definitions and 16 FHIR Value Set definitions have been published in the Git repository of the Common Data Model. Moreover, the Data Ingestion Suite has been designed and configured for deployment at all clinical sites.
Clinical case corpora have been constructed for all seven consortium languages (English, Spanish, Italian, Romanian, Czech, Swedish and Dutch). Selection criteria and guidelines have been implemented for the generation of synthetic and artificial clinical texts, and neural machine translation of these content collections has been carried out to generate translated versions for all seven languages.
In parallel, the first complete version of the DataTools4Heart federated learning module was designed and developed, enabling federated machine learning without the need for hospitals to share their data. The prototype was installed and tested at four clinical sites (ICRC, GEM, KUH, UCL), for which approvals from the hospitals' ethical committees were obtained, while the installation process has been initiated at two more clinical sites (UMCU, AMC). Furthermore, the module was installed at three technical partners (UB, BSC, SRDC) for experimentation and for testing the integration between the different components of the platform. A Secure Multi-Party Computation (SMPC) service was developed to ensure privacy-preserving federated learning and was subjected to rigorous testing.
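To illustrate the general idea behind federated learning with privacy-preserving aggregation, the following is a minimal sketch and not the project's implementation. It assumes a toy one-parameter model, weighted averaging of local updates, and pairwise additive masking as a simplified stand-in for an SMPC service. All function names, the modulus and the fixed-point scale are hypothetical choices made for this example.

```python
import random

MODULUS = 2**61 - 1   # assumed modulus for masked integer arithmetic
SCALE = 10**6         # assumed fixed-point scale for encoding floats


def encode(value):
    """Map a float to a fixed-point integer modulo MODULUS."""
    return round(value * SCALE) % MODULUS


def decode(value):
    """Map a fixed-point integer modulo MODULUS back to a float."""
    if value > MODULUS // 2:
        value -= MODULUS
    return value / SCALE


def local_update(w, data, lr=0.05):
    """One gradient step for y = w * x on a site's private (x, y) pairs."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def pairwise_masks(n_sites, dim, rng):
    """Random masks, one vector per site, that sum to zero modulo MODULUS."""
    masks = [[0] * dim for _ in range(n_sites)]
    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            for k in range(dim):
                r = rng.randrange(MODULUS)
                masks[i][k] = (masks[i][k] + r) % MODULUS
                masks[j][k] = (masks[j][k] - r) % MODULUS
    return masks


def masked_report(w, n_samples, mask):
    """What a site sends: [n * w, n] in fixed point, hidden behind its mask."""
    payload = [encode(w * n_samples), encode(n_samples)]
    return [(p + m) % MODULUS for p, m in zip(payload, mask)]


def aggregate(reports):
    """Server side: masks cancel in the sum, leaving only the totals."""
    totals = [sum(col) % MODULUS for col in zip(*reports)]
    return decode(totals[0]) / decode(totals[1])


def federated_round(w_global, site_data, rng):
    """One round: local training at each site, then masked aggregation."""
    masks = pairwise_masks(len(site_data), 2, rng)
    reports = [
        masked_report(local_update(w_global, data), len(data), mask)
        for data, mask in zip(site_data, masks)
    ]
    return aggregate(reports)


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical sites, each holding private samples of y = 2x.
    sites = [
        [(1.0, 2.0), (2.0, 4.0)],
        [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
        [(1.0, 2.0), (4.0, 8.0)],
    ]
    w = 0.0
    for _ in range(50):
        w = federated_round(w, sites, rng)
    print(w)  # should be close to the true slope of 2
```

In this sketch the server only ever sees masked reports and their sum, so no individual site's update or raw data is exposed to it. A production SMPC service would additionally have to handle key agreement between sites, dropped participants and malicious parties, none of which is modelled here.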
We also developed a web-based, AI-powered Virtual Assistants prototype using Large Language Models. The prototype does not yet have access to real patient data, but it is able to simulate the end dialogues using manufactured data. Significant progress has been made towards developing the DataTools4Heart integrated platform, which includes the AI-powered Virtual Assistants for clinicians and a permissioned blockchain layer for tracking user operations. Key tasks involved designing the platform architecture, developing the AI virtual assistants and the blockchain platform, and establishing protocols for tool inclusion.
Finally, after establishing the clinical study protocols and defining the minimally required datasets, all clinical partners have applied for ethical approval for the three different sub-studies. Where a clinical centre has received approval from its local committees, the data access process has been started. In close collaboration with WP2, clinical partners have started mapping their local databases to the common data model, and the federated learning module has been installed and tested for connectivity, remote triggering, and model training using publicly available data.
DataTools4Heart will build the first cardiology Common Data Model (CDM) in HL7 FHIR by representing our Common Data Element definitions as the building blocks for cardiovascular data analysis. We will provide a flexible, modular Data Ingestion and Feature Extraction Suite to set up standardised data ingestion pipelines in CDMs and produce datasets ready for federated machine learning. DataTools4Heart will develop new language-specific models for cardiology to capture the characteristics of clinical narratives in the 7 project languages and will research and develop semantic annotation and entity normalisation systems capable of working with multiple languages. Moreover, DataTools4Heart will introduce centre dropout, advanced fairness techniques and uncertainty-aware aggregation to create a novel federated learning pipeline characterised by increased fairness and efficiency. At the same time, DataTools4Heart will develop the first multi-lingual virtual assistants to allow easy access and use of clinical information in the 7 languages of the project. Ultimately, the project will exploit multi-source and large-scale structured and unstructured cardiology health data to develop and validate novel, federated machine learning models for diagnostic pathway support, including referral, for cardiologists in both outpatient clinics and emergency units.