Scaling Up secure Processing, Anonymization and generation of Health Data for EU cross border collaborative research and Innovation

Project information

SECURED

Grant agreement ID: 101095717

DOI

10.3030/101095717

EC signature date 23 November 2022

Start date 1 January 2023

End date 31 December 2025

Funded under

Health

Total cost

€ 6 999 723.75

EU contribution

€ 6 999 723.25

Coordinated by

UNIVERSITEIT VAN AMSTERDAM
Netherlands

Periodic Reporting for period 1 - SECURED (Scaling Up secure Processing, Anonymization and generation of Health Data for EU cross border collaborative research and Innovation)

Reporting period: 2023-01-01 to 2024-06-30

In SECURED, we provide a comprehensive collaboration platform, the SECURED Innohub, which offers a secure and trusted environment for decentralized, cooperative processing of health data using Secure Multiparty Computation (SMPC) and Homomorphic Encryption (HE) techniques. It also supports synthetic data generation (SDG), data anonymization, and anonymization assessment for health data providers and users, while also checking that any adopted health dataset is free of bias and, where bias is found, applying unbiasing. Our goal is to promote widespread use of health datasets across Europe by connecting EU health data hubs, the health data analytics research community, healthcare innovators (such as SMEs), and end users. To achieve the above vision, in SECURED we follow two
The SECURED vision is to initiate an EU-wide, cross-border health data collaboration ecosystem that enables data providers, researchers, and innovators to develop new AI-based data analytics solutions and foster innovation. To that end, the project provides, through its Innohub, tools and services for anonymization, de-anonymization, Secure Multiparty Computation and unbiasing of health data. A broad range of users can use these to build privacy-preserving health applications that scale up without significant overheads. To accomplish this goal, the project is divided into five work packages. The first coordinates the project work and ensures its coherence, mitigates risks, and prepares outreach and dissemination of the project achievements. The second focuses on anonymization, de-anonymization and synthetic data generation, with the goal of enabling Innohub users to produce datasets that can be securely shared among interested parties. The third focuses on three primary research areas for data processing: Federated Learning (FL), Unbiased Artificial Intelligence (UB), and Secure Multi-Party Computation (SMPC)/Homomorphic Encryption (HE). The fourth work package covers the design and integration of the SECURED platform and of the overall SECURED ecosystem through its Innohub. Finally, in the second half of the project, the fifth work package will validate the technologies and demonstrate the use of the SECURED Innohub in the context of the use cases.
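As an illustration of what an anonymization assessment can involve, the sketch below computes the k-anonymity level of a small table. This is a minimal example under stated assumptions: k-anonymity is used here only as a well-known privacy model and is not confirmed to be one of the models used in SECURED; the function name `k_anonymity`, the column names and the records are all hypothetical and made up for the example.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values on all quasi-identifier columns (0 if no records)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0

# Hypothetical, already-generalized toy records (not real data).
records = [
    {"age_band": "30-39", "postcode": "10**", "diagnosis": "A"},
    {"age_band": "30-39", "postcode": "10**", "diagnosis": "B"},
    {"age_band": "40-49", "postcode": "20**", "diagnosis": "A"},
    {"age_band": "40-49", "postcode": "20**", "diagnosis": "C"},
    {"age_band": "40-49", "postcode": "20**", "diagnosis": "A"},
]

# The smallest group has two records, so the table is 2-anonymous
# with respect to (age_band, postcode).
k = k_anonymity(records, ["age_band", "postcode"])
print(k)
```

A higher value of `k` means each individual hides in a larger group; a value of 1 means at least one record is unique on its quasi-identifiers and may be easier to re-identify.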

SECURED has already accomplished a number of achievements through the work carried out in the first half of the project. In relation to the second work package, the anonymization architecture and technical designs have been completed in the form of microservices, and the initial framework has been extended with new privacy models that target the use cases. A comprehensive review of validated re-identification and de-anonymization attacks on health data has been performed, providing a solid foundation for the benchmarking efforts and for the implementation of such attacks in the SECURED Innohub and related library (subject to the availability of source code). The main efforts in synthetic data generation have gone towards developing differential privacy for cancer tissue, foetal heartbeat and mammogram generation. In relation to work package three, a preliminary analysis of available SMPC and HE libraries has been performed, considering computational tasks relevant to the SECURED use cases, such as neural network inference and image processing analysis based on matrix multiplication. Furthermore, selected HE libraries have been assessed against specific KPIs and benchmarked with relevant micro-benchmarks. A risk analysis of federated learning libraries has been conducted, with the aim of evaluating best practices for developing federated learning solutions that are versatile, lightweight and scalable to the project use cases. A first selection of available libraries was made accordingly. Regarding health dataset bias assessment and unbiasing, a first analysis has been made of fairness for generative AI, which emerged as a highly relevant property for SECURED during the project. In this regard, we have envisioned and started working on three research axes to provide a complete solution: improving data generation over the state of the art, giving more functionality to the user, and improving the security of the generative models.
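To make the link between SMPC and matrix multiplication concrete, the following sketch shows additive secret sharing, one common SMPC building block. It is an illustrative assumption, not the protocol or library selected by the project: the modulus `P`, the three-party setup and all function names are hypothetical. A private input vector is split into random shares, each party multiplies a public weight matrix by its own share locally, and only the sum of the partial results reveals the product.

```python
import secrets

# Prime modulus for share arithmetic (illustrative choice; inputs are
# assumed to be non-negative integers whose results stay below P).
P = 2**61 - 1

def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def share_vector(vec, n_parties):
    """Share every entry of vec; return one share vector per party."""
    per_entry = [share(v, n_parties) for v in vec]
    return [[entry[i] for entry in per_entry] for i in range(n_parties)]

def local_matvec(weights, share_vec):
    """A party multiplies the public matrix by its own share, locally."""
    return [sum(w * s for w, s in zip(row, share_vec)) % P for row in weights]

def reconstruct(partials):
    """Add the parties' partial results to recover (W @ x) mod P."""
    return [sum(col) % P for col in zip(*partials)]

weights = [[1, 2, 3], [4, 5, 6]]   # public matrix, e.g. one linear layer
x = [10, 20, 30]                   # private input held by the data owner

party_shares = share_vector(x, n_parties=3)
partials = [local_matvec(weights, s) for s in party_shares]
result = reconstruct(partials)
print(result)  # equals the plain product W @ x
```

Because the matrix is public and the operation is linear, no communication between parties is needed until reconstruction. Multiplying two secret-shared values requires additional machinery (for example precomputed multiplication triples), which this sketch does not cover.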

As outlined above, the project has made a number of technological advances over the state of the art. New anonymization and de-anonymization techniques, SMPC/HE implementations, UB methods and federated learning solutions all improve on existing solutions, with a specific focus on e-health settings and, in particular, on the application domains defined by the use cases: real-time tumor classification, telemonitoring of child patients, synthetic data generation for health education, and privacy-preserving access to genomic data.
