Periodic Reporting for period 1 - DataTools4Heart (A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy)
Reporting period: 2022-10-01 to 2024-03-31
Clinical case corpora have been constructed for all seven consortium languages (English, Spanish, Italian, Romanian, Czech, Swedish and Dutch). Selection criteria and guidelines have been implemented for the generation of synthetic and artificial clinical texts, and neural machine translation of these collections has been carried out to produce translated versions in all seven languages.
In parallel, the first complete version of the DataTools4Heart federated learning module was designed and developed, enabling federated machine learning without the need for hospitals to share their data. The prototype was installed and tested at four clinical sites (ICRC, GEM, KUH, UCL) for which approvals from the hospitals’ ethics committees were obtained, while installation has been initiated at two more clinical sites (UMCU, AMC). Furthermore, the module was installed at three technical partners (UB, BSC, SRDC) for experimentation and for testing the integration between the different components of the platform. A Secure Multi-Party Computation (SMPC) service was developed to ensure privacy-preserving federated learning and was subjected to rigorous testing.
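To illustrate the general idea behind combining federated learning with SMPC, the following is a minimal sketch of secure aggregation using additive secret sharing. It is not the DataTools4Heart implementation: the function names, the field modulus `PRIME`, the fixed-point factor `SCALE`, and the simulation of all parties inside one process are assumptions made purely for illustration. Each site splits its model update into random shares, so that no single party sees another site's update, yet the sum of all shares still yields the aggregate.

```python
import secrets

# Illustrative parameters (assumptions, not taken from the project).
PRIME = 2**61 - 1   # modulus of the finite field used for the shares
SCALE = 10**6       # fixed-point factor to map floats to integers


def encode(x):
    """Map a float to a field element using fixed-point encoding."""
    return int(round(x * SCALE)) % PRIME


def decode(v):
    """Map a field element back to a float (values above PRIME/2 are negative)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(value, n_parties):
    """Split a field element into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(updates):
    """Sum per-site update vectors without revealing any individual update.

    All parties are simulated in one process here; in a real deployment each
    share would be sent to a different party over the network.
    """
    n = len(updates)
    dim = len(updates[0])
    # partial[j][k] accumulates the shares that party j receives for weight k.
    partial = [[0] * dim for _ in range(n)]
    for update in updates:
        for k, x in enumerate(update):
            for j, s in enumerate(share(encode(x), n)):
                partial[j][k] = (partial[j][k] + s) % PRIME
    # Only the per-party partial sums are combined, never the raw updates.
    totals = [sum(partial[j][k] for j in range(n)) % PRIME for k in range(dim)]
    return [decode(t) for t in totals]


def secure_mean(updates):
    """Federated averaging step computed on the securely aggregated sum."""
    return [t / len(updates) for t in secure_sum(updates)]
```

For example, `secure_mean([[0.5, -1.0], [1.5, 2.0], [1.0, -4.0]])` averages three hypothetical site updates; each individual share is uniformly random and reveals nothing about a site's values on its own.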
We also developed a web-based prototype of the AI-powered Virtual Assistants using Large Language Models. The prototype does not yet have access to real patient data, but it can simulate the envisaged dialogues using artificially generated data. Significant progress has been made towards developing the DataTools4Heart integrated platform, which includes the AI-powered Virtual Assistants for clinicians and a permissioned blockchain layer for tracking user operations. Key tasks involved designing the platform architecture, developing the AI virtual assistants and blockchain platform, and establishing protocols for tool inclusion.
Finally, after establishing the clinical study protocols and defining the minimally required datasets, all clinical partners have applied for ethical approval for the three sub-studies. Where a clinical centre has received approval from its local committee, the data access process has been started. In close collaboration with WP2, clinical partners have begun mapping their local databases to the common data model, and the federated learning module has been installed and tested for connectivity, remote triggering, and model training using publicly available data.