Periodic Reporting for period 1 - FeatureCloud (Privacy preserving federated machine learning and blockchaining for reduced cyber risks in a world of distributed healthcare)
Reporting period: 2019-01-01 to 2020-04-30
In addition to the first core apps of the FeatureCloud platform itself, we have worked on demonstrating the principle of federated machine learning. While the FeatureCloud prototype platform was taking shape, we developed stand-alone solutions for typical medical application scenarios. We began with sPLINK, a federated genome-wide association study (GWAS) tool that mimics the standard non-federated GWAS tool PLINK (https://www.biorxiv.org/content/10.1101/2020.06.05.136382v1). We demonstrate that currently available distributed GWAS software (so-called meta-analysis tools) loses substantial accuracy when outcomes or confounders are distributed heterogeneously across the participating sites. In contrast, sPLINK gives exactly the same results as PLINK. It thus has the potential to become the new standard tool for genotyping: it requires no exchange of raw data between the participating institutions/hospitals and, on top of that, suffers no loss of accuracy compared to state-of-the-art centralized tools. sPLINK implements federated Chi-squared tests, as well as federated multimodal linear and logistic regression models. Likewise, we have developed a first software prototype for federated survival analysis: FedSurv. It combines federated statistical modelling with differential privacy based on Laplacian noise to generate privacy-preserving Kaplan-Meier plots.
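To illustrate why a federated Chi-squared test can reproduce the centralized result exactly, the following minimal sketch lets each site share only a 2x2 table of allele counts, which an aggregator sums before computing the statistic. It is an illustration under simplifying assumptions, not the sPLINK implementation: the function names (local_table, aggregate, chi_squared) and the data encoding are hypothetical, and a real deployment would add further protection for the exchanged counts.

```python
from typing import List, Sequence, Tuple

# Rows: cases, controls. Columns: minor-allele count, major-allele count.
Table = Tuple[Tuple[int, int], Tuple[int, int]]


def local_table(genotypes: Sequence[int], phenotypes: Sequence[int]) -> Table:
    """Count alleles per phenotype group at a single site.

    Assumed encoding (hypothetical): genotype = number of minor alleles
    (0, 1 or 2); phenotype = 1 for case, 0 for control.
    """
    counts = [[0, 0], [0, 0]]
    for g, p in zip(genotypes, phenotypes):
        row = 0 if p == 1 else 1
        counts[row][0] += g        # minor alleles
        counts[row][1] += 2 - g    # major alleles
    return ((counts[0][0], counts[0][1]), (counts[1][0], counts[1][1]))


def aggregate(tables: List[Table]) -> Table:
    """Sum the per-site tables; only these counts leave a site."""
    a = sum(t[0][0] for t in tables)
    b = sum(t[0][1] for t in tables)
    c = sum(t[1][0] for t in tables)
    d = sum(t[1][1] for t in tables)
    return ((a, b), (c, d))


def chi_squared(table: Table) -> float:
    """Pearson Chi-squared statistic of a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


# Two hypothetical sites with toy data.
site1 = ([0, 1, 2, 1, 0], [1, 1, 1, 0, 0])
site2 = ([2, 2, 0, 1], [1, 0, 0, 0])

federated = chi_squared(aggregate([local_table(*site1), local_table(*site2)]))
pooled = chi_squared(local_table(site1[0] + site2[0], site1[1] + site2[1]))
```

Because the summed tables are identical to the table computed on the pooled data, federated and pooled are equal here. A meta-analysis that combines per-site test statistics has no such guarantee.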
• Improved security of Health and Care services, data and infrastructures.
By addressing the evident roadblock in medical data mining (data mining is centralized, but clinical data is distributed), we improve the cyber security of computational health care services, patient data and communication infrastructure by design and by architecture. FeatureCloud’s federated machine learning engine removes the need to share sensitive data with a cloud.
• Less risk of data privacy breaches caused by cyberattacks.
FeatureCloud significantly reduces the risk of data privacy breaches caused by cyberattacks on health cloud services or on the communication channels between hospital and cloud. Instead of bringing the data to the AI, we bring the AI to the data.
• Increased patient trust and safety.
Using trusted authority technology such as blockchains, we work on ensuring that patients retain full control over the access rights to their own sensitive data. Combined with the guarantee, by design, that no sensitive data that could be traced back to individual patients is exchanged while training the federated AI, this will increase patient trust and safety significantly. Our FeatureCloud platform is in full accordance with EU GDPR and NISD policies, and it is developed with respect to the criteria for software-supported medical devices of the FDA and EMA, respectively.
FeatureCloud furthermore contributes to the following most significant impacts not mentioned in the work programme:
• The novel FeatureCloud technology will create new market opportunities.
The replicability and scalability of FeatureCloud's client-side machine learning concepts will have an enormous impact worldwide and foster pan-European business, e.g. through spin-offs and start-ups serving a huge emerging market in privacy-aware machine learning.
• The European society will benefit from new levels of personalized medicine, new possibilities for research into complex diseases such as cancer, and lower costs of medical research.
FeatureCloud enables open science without boundaries, cross-domain and pan-European. This will particularly allow new levels of cancer research, because FeatureCloud overcomes current privacy, ethical, security, and safety restrictions and will thus enable what was not possible to date. This can help to reduce rising health costs in Europe while raising medical quality at the same time.