Scaling up multi-party computation, data anonymisation techniques, and synthetic data generation
It is essential to speed up and facilitate innovation in data-driven tools and services for wellbeing, prevention, diagnosis, treatment and follow-up of care, among others. However, developers' limited access to health data and to secure testing environments hinders the development of innovative data-driven digital health products and services.
Therefore, the proposals are expected to scale up multi-party computation, data anonymisation techniques and synthetic data generation. To ensure privacy, the data analytics should be conducted in a distributed way among processors that grant third parties access to analysis outcomes but not to the underlying data. Developers should have access to distributed testing data sources and to cloud and computing resources at large scale, with a view to improving the speed and robustness of multi-party computation solutions for innovators. The aim is to allow secure, GDPR-compliant data processing for research and clinical purposes.
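The idea of sharing analysis outcomes but not the underlying data can be illustrated with additive secret sharing, one common building block of multi-party computation. The following is a minimal sketch only, not a method prescribed by this topic: the function names, the modulus and the example counts are illustrative assumptions, and a real deployment would also need secure channels and protection against dishonest parties.

```python
import secrets

# Assumption: all values are non-negative integers whose total stays below PRIME.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n_parties random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values):
    """Simulate a sum over parties in which no party sees another party's value.

    Party i splits its value into shares and sends share j to party j.
    Each party then publishes only the sum of the shares it received.
    """
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the combined total is revealed to the third party.
    return sum(partial_sums) % PRIME


# Hypothetical patient counts held by three data holders.
total = secure_sum([120, 75, 310])
```

Each individual share is a uniformly random number, so a single share reveals nothing about the value it was derived from; only the aggregate is disclosed.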
The proposals should consider the use of synthetic, i.e. artificially generated, data, as such data allow researchers and developers to test, verify and fine-tune algorithms in large-scale data experiments without using re-identifiable personal data.
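As a deliberately simple illustration of artificially generated data, the sketch below builds synthetic records by sampling each column independently from the values observed in a real data set. This is an assumption-laden toy, not a recommended generator: the function name and the example fields are hypothetical, correlations between columns are lost, rare values can still leak, and no formal privacy guarantee is provided.

```python
import random


def synthesise(records, n_rows, seed=None):
    """Draw each column independently from its observed values.

    Synthetic rows follow the per-column distributions of the input,
    but a row is not a copy of any single real record by construction.
    """
    if not records:
        return []
    rng = random.Random(seed)
    columns = list(records[0].keys())
    # One pool of observed values per column.
    pools = {c: [r[c] for r in records] for c in columns}
    return [{c: rng.choice(pools[c]) for c in columns} for _ in range(n_rows)]


# Hypothetical input records.
real = [
    {"age_band": "30-39", "diagnosis": "asthma"},
    {"age_band": "60-69", "diagnosis": "diabetes"},
    {"age_band": "60-69", "diagnosis": "asthma"},
]
synthetic = synthesise(real, n_rows=5, seed=0)
```

Generators intended for real use would typically model joint distributions and be evaluated for both utility and disclosure risk.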
In addition, the proposed anonymisation techniques will have to be sophisticated and robust enough to tackle the challenge of anonymised data sets that can still be traced back to individuals.
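One widely used way to quantify the risk that an anonymised data set can be traced back to individuals is k-anonymity: every combination of quasi-identifiers (attributes such as age band or postcode area) should be shared by at least k records. The sketch below is a minimal check under that assumption; the function names and the example fields are hypothetical, and k-anonymity alone does not rule out all re-identification attacks.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group, i.e. the k the data set achieves."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def risky_groups(records, quasi_identifiers, k):
    """List quasi-identifier combinations shared by fewer than k records."""
    sizes = group_sizes(records, quasi_identifiers)
    return [combo for combo, size in sizes.items() if size < k]


# Hypothetical records with direct identifiers already removed.
records = [
    {"age_band": "30-39", "postcode_area": "10", "diagnosis": "asthma"},
    {"age_band": "30-39", "postcode_area": "10", "diagnosis": "diabetes"},
    {"age_band": "80-89", "postcode_area": "27", "diagnosis": "asthma"},
]
achieved_k = k_anonymity(records, ["age_band", "postcode_area"])
unique_combos = risky_groups(records, ["age_band", "postcode_area"], k=2)
```

In this example the third record is the only one in its group, so anyone who knows that person's age band and postcode area could link the diagnosis to them even though no name is present.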
The proposals are expected to foster the development of secure, interoperable, transparent - and therefore trustworthy - cross-border health data hubs that can facilitate the provision of the required testing environments for innovators. This will support the uptake of new data tools, technologies and digital solutions for health care.
To this end, integrating national and regional health data hubs, repositories and research infrastructures is an appropriate way to address the scope of the topic. The proposals are expected to address all of the following areas:
- Consolidate and scale up multi-party computation and data anonymisation techniques and synthetic data generation to support health technology providers, in particular SMEs.
- Support the development of innovative, unbiased, AI-based and distributed tools, technologies and digital solutions for the benefit of researchers, patients and providers of health services, while maintaining a high level of data privacy.
- Advance the state-of-the-art of de-identification techniques, to tackle the challenge of anonymised datasets that can be traced back to individuals.
- Develop innovative anonymisation techniques demonstrating that effective data quality and usefulness can be preserved without compromising privacy.
- Explore and develop further the techniques of creating synthetic data, also dynamically on demand for specific use cases.
- Widen the basis for GDPR-compliant research and innovation on health data.
- Ensure wide uptake and scalability of the methodologies and tools developed, and promote high standards of transparency and openness that go well beyond documentation and extend to aspects such as assumptions, architecture, code and any underlying data.