Development of a platform for federated and privacy-preserving machine learning in support of drug discovery


For full details of the topic, please refer to the call text

The project is expected to deliver a federated and privacy-preserving machine learning platform, initially validated on publicly accessible data, that is demonstrably safe enough (privacy-preserving in the face of legitimate and illegitimate (attempted) access and use) and scalable enough to be deployed, in yearly evaluation runs, on a significant share of the private data held in the actual preclinical data warehouses of the participating major pharmaceutical companies. This effort will be driven mainly by the applicant consortium and enabled by the EFPIA partners.

Enabled by an ever-expanding arsenal of model systems, analysis methods, libraries of chemical compounds and other agents (such as biologics), the amount of data generated during drug discovery programmes has never been greater, yet the biological complexity of many diseases still defies pharmaceutical treatment. Hand in hand with rising regulatory expectations, this growing complexity has inflated the research intensity and associated cost of the average discovery project. It is therefore imperative that the insights gained from these data investments are maximised to enable efficient future research. This could be supported by the big data analysis and machine learning approaches that are currently driving the digital transformation across all industries.

The in silico predictions from the platform developed within the project will increasingly replace costly and time-consuming in vitro testing. This will save cost and time on compound synthesis and on measurement in assays and preclinical studies, and will therefore increase the efficiency of pharmaceutical discovery research. Although outside the direct scope of the present topic, applying similar concepts to clinical data, to enable faster recruitment of better-targeted patients, holds the longer-term promise of reducing development costs.

The concepts developed within the project will be generic and will apply not only to the pharmaceutical discovery and clinical development setting, but also to other clinical applications, including real-world evidence analysis. Beyond the health area, they will prove relevant to multiple alternative industrial and other commercial or non-commercial settings where parties are interested in different predictive models that benefit from indirect access to the same volumes of private data. By providing data owners with the confidence that their data and the corresponding predictive models will remain private, this project will facilitate access to much larger data sets and therefore improve performance over that of conventional machine learning approaches.

For knowledge and ICT partners, federated learning presents a line of research and product development beyond that of data federation.

Applicants should indicate how their proposal will affect the competitiveness and industrial leadership of Europe, for example by engaging suitable SMEs.