CORDIS - EU research results

INNOVATIVE APPLICATIONS OF ASSESSMENT AND ASSURANCE OF DATA AND SYNTHETIC DATA FOR REGULATORY DECISION SUPPORT

Periodic Reporting for period 1 - INSAFEDARE (INNOVATIVE APPLICATIONS OF ASSESSMENT AND ASSURANCE OF DATA AND SYNTHETIC DATA FOR REGULATORY DECISION SUPPORT)

Reporting period: 2023-11-01 to 2025-04-30

New medical devices increasingly use data-driven innovations such as Artificial Intelligence (AI) to improve patient care and diagnostics. Unlike traditional devices, which were verified to operate predictably, AI-driven devices are more complex and their behaviour is often more opaque. These devices learn the required functionality from a large set of data, and then apply that training to interpret the data provided by individual patients. These software and data innovations introduce risks that complicate the decisions European regulators must make to assure the safe and effective operation of medical devices. Testing requires datasets that represent the target population. Such datasets can be difficult to obtain because of data privacy constraints, and the difficulty is compounded because datasets must be produced specifically for each medical application.

Developing realistic synthetic datasets offers a potential solution to these challenges. These are precise, computer-generated datasets that exhibit the same statistical properties as the equivalent real dataset. Compared with anonymised and de-identified datasets, synthetic datasets have three advantages: a) they avoid the lengthy approval processes required for anonymised data; b) they offer access to variables that may be considered sensitive and are therefore excluded from anonymised and de-identified datasets; and c) they are immune to cross-referencing and harvesting of information with other datasets.
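
To make "the same statistical properties" concrete, the following is a minimal sketch, not the project's method. It assumes a toy approach in which each numeric column is modelled as an independent Gaussian; the records, function names, and column meanings are hypothetical and purely illustrative. Real generators would also need to preserve correlations between variables, which this sketch ignores.

```python
import random
import statistics


def fit_columns(rows):
    """Estimate the mean and sample standard deviation of each numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample_synthetic(params, n, seed=0):
    """Draw n synthetic rows, sampling each column independently (an assumption)."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(mu, sigma) for mu, sigma in params) for _ in range(n)]


# Hypothetical illustrative records: (age in years, systolic pressure in mmHg).
real = [(54, 131), (61, 140), (47, 122), (70, 150), (58, 135), (66, 144)]

params = fit_columns(real)
synthetic = sample_synthetic(params, n=5000)
synthetic_params = fit_columns(synthetic)

# Compare per-column statistics of the real and synthetic data.
for (mu_r, sd_r), (mu_s, sd_s) in zip(params, synthetic_params):
    print(f"real mean={mu_r:.1f} sd={sd_r:.1f} | synthetic mean={mu_s:.1f} sd={sd_s:.1f}")
```

No synthetic row corresponds to a real record, yet the per-column means and standard deviations stay close to those of the source data.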

The INSAFEDARE project is developing a toolkit to enable cost-effective, high-assurance decision-making within regulatory compliance processes for medical devices. The project will provide guidance on the quality and safety assurance of datasets as a tool for validation. It will also identify how synthetic datasets can be used to establish assurance in advance of formal certification processes, reducing risks for device developers and improving efficiency for regulatory bodies. In addition, the project will develop tools for the discovery, integration, and querying of multiple datasets, and for supporting the sustainable, dynamic, through-life surveillance of medical devices, while capturing the impact of new evidence from newly published datasets.
The initial work in the project included assessments of the requirements of the stakeholder communities targeted by the project: clinical trial scientists, medical device manufacturers, independent assessors and regulatory bodies. It also included foundational research that was completed and reported by the project partners: a review of ML methods for producing synthetic datasets, a literature review and classification of digital health applications, a review of the data types used in various digital health use cases, and an initial standards gap analysis. The requirements assessments and foundational research are driving the technology developments launched during this reporting period, for deliverables that will be completed in the next reporting period.

An important technology milestone was achieved with the development of the Data Integration Pipeline Tools, which enable users to define workflows as a series of interconnected tasks, each encapsulated within a Docker container. The modular architecture and model-based design simplify the management of complex data handling processes. The tools support integration with various data sources, such as files and databases, and offer flexibility in how these sources are connected. Key features include visual workflow design, automated execution, and built-in monitoring. The tools have been developed to be user-friendly and adaptable to various synthetic data generation scenarios. The accompanying deliverable includes a comprehensive review of current data pipeline orchestration technologies, evaluated against 12 key characteristics derived from the project’s requirements. The review concludes that no existing solution fully meets all of these criteria.
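
The idea of a workflow as interconnected, containerised tasks can be sketched as a dependency graph. The example below is an illustrative assumption and does not reflect the actual design or interface of the Data Integration Pipeline Tools: the `Task` class, the task names, and the image names are all hypothetical. It orders tasks so that each one runs after its dependencies and builds, without executing, a `docker run` command for each.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Task:
    name: str
    image: str              # hypothetical container image for this task
    depends_on: tuple = ()  # names of tasks that must finish first


def execution_order(tasks):
    """Return task names ordered so that dependencies come before dependents."""
    graph = {task.name: set(task.depends_on) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


def docker_command(task):
    """Build (but do not run) the command that would start the task's container."""
    return ["docker", "run", "--rm", task.image]


# Hypothetical workflow: two data sources are merged, then used for generation.
tasks = [
    Task("load_csv", "example/load-csv:latest"),
    Task("load_db", "example/load-db:latest"),
    Task("merge", "example/merge:latest", depends_on=("load_csv", "load_db")),
    Task("generate_synthetic", "example/synth:latest", depends_on=("merge",)),
]

by_name = {task.name: task for task in tasks}
order = execution_order(tasks)
commands = [docker_command(by_name[name]) for name in order]

for name, command in zip(order, commands):
    print(name, "->", " ".join(command))
```

Keeping each task in its own container means a step can be replaced or reconnected without changing the others, which is one way to read the modularity described above.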
INSAFEDARE is developing four technological pillars that go beyond the state of the art, as its foundation for delivering advances in the healthcare industry:
+ Innovative data quality framework
+ Advanced data integration framework
+ Novel algorithms for synthetic data generation
+ Regulatory compliance-driven metrics for assessing datasets
These technologies will also form the basis for engaging stakeholders on extending Europe’s regulatory practices and standards.