CORDIS - EU research results

Developing a Data Quality and Utility Label for the European Health Data Space

A vast number of health datasets exist across Europe, drawn from multiple sources (individual care, medical registries, social, environmental, behavioural, wellbeing, clinical trials, research, administrative, etc.) and of varying quality. This represents a tremendous opportunity to reuse these data for purposes other than those for which they were originally collected, and to spur the development of better prevention strategies, diagnoses, treatments and care plans.

The European Health Data Space (EHDS) will provide a common EU framework for the secondary use of health data, for purposes such as research, innovation, regulatory activities, policymaking and personalised medicine. It will give data users access to large amounts of health data through health data access bodies, which the EHDS legal provisions empower to overcome existing limitations on the processing of health data for secondary uses.

To support data users in discovering and selecting datasets for their purposes, there is a growing need for a data quality and utility framework that articulates the characteristics and potential usefulness of datasets. This framework will also help data holders identify and address areas for improvement, which can in turn allow for wider and better use of these datasets.

Several initiatives have developed or are developing guidelines and recommendations for health data quality. However, these typically focus on specific data types (e.g. the 1+ Million Genomes Initiative[[https://b1mg-project.eu/work-packages/wp3]]) or areas of application (e.g. the activities of the European Medicines Agency (EMA) and Heads of Medicines Agencies (HMA) Big Data Steering Group to support medicines regulation[[https://www.ema.europa.eu/en/about-us/how-we-work/big-data]]). Similarly, previous studies and initiatives have addressed specific dimensions of ‘data quality’ for health data, but none offers a framework that suits the breadth of data types and encompasses the quality and utility elements proposed in the EHDS legal provisions. The proposed framework should take into account the various needs of data users while avoiding an excessive burden on data holders, who will need to produce the data quality and utility label.

Proposals should address all of the following activities:

  • Perform a mapping of existing data quality and utility principles, initiatives and frameworks (e.g. EMA/HMA Big Data Stakeholders Group data quality efforts, TEHDAS Data Quality Working Group[[https://tehdas.eu/packages/]], EOSC-LIFE[[https://www.eosc-life.eu/]], Health Data Research UK’s data quality and utility framework[[Development of a data utility framework to support effective health data curation: https://informatics.bmj.com/content/28/1/e100303?utm_source=twitter&utm_medium=social&utm_term=hootsuite&utm_content=sme&utm_campaign=usage]]) and of relevant data principles, resources and tools (FAIR, FAIR Cookbook, etc.)[[See definition of FAIR data in the introduction to this work programme part.]];
  • Conduct stakeholder consultations involving all relevant users and holders of health data, EHDS Health Data Access Bodies (HDABs) and other relevant actors, in order to validate data user needs and to take adequate account of relevant initiatives when developing the proposed framework;
  • Develop a framework (a set of technical specifications) for the data quality and utility label that supports the implementation of the EHDS legal provisions and the roll-out of the label by data holders and EHDS Health Data Access Bodies;
  • Pilot and evaluate the use of the proposed framework (as a label and as a maturity model) on a sample of datasets representing the wide range of data types (such as electronic health records, genomics datasets, medical registries, administrative data, etc.), taking into account the needs of all data users identified;
  • Develop recommendations for the successful implementation and adoption of the data quality and utility label and maturity model across European Member States, considering their maturity levels regarding the secondary use of health data.

The consortium should be composed of representatives of data users, data holders, health data access bodies and other stakeholders relevant to the secondary use of health data, adequately covering the diversity of health data types and users’ needs across European Member States.