Periodic Reporting for period 1 - LAGO (LESSEN DATA ACCESS AND GOVERNANCE OBSTACLES)
Reporting period: 2022-11-01 to 2023-10-31
WP2 addresses the legal, ethical and societal aspects and provides support to the consortium in multiple cases.
WP3 assesses the current research data landscape in the Fighting Crime and Terrorism (FCT) domain, analysing practices, data sharing procedures, and potential barriers and enablers to the adoption of a Research Data Ecosystem (RDE), and formulating recommendations for enabling access to research data. The first version of the requirements has been prepared and prioritized. A Reference Model for the RDE has been proposed, focusing at a high level on the actors and processes that enable access to research data in a trusted and secure way. The main functionality areas identified so far relate to the setup of the RDE, the onboarding of new participants, data creation procedures, dataset publishing, search, request and exchange, and model training and testing. The RDE Reference Architecture derives from the Reference Model and focuses more on the logical view and technical aspects: it divides the solution into multiple technical and software components, which together form the envisioned ecosystem, and details the interactions among them in terms of services and exchanged messages.
WP4 delivers methodologies and tools for data creation, annotation, anonymization, synthesis and watermarking. During the first period, multiple tools have been developed to guide end users in creating meaningful datasets. These tools are based on the principles of delivering high-quality datasets and meaningful annotations, and they take into account methods for security and privacy preservation.
In WP5, the first version of the Data Quality Assessment tool has been developed, with the goal of providing users with indicators of the quality of the data being shared. The first version of the Risk Assessment tool has also been developed. It aims to make users aware of the risks of sharing data with specific characteristics (data types, usage purposes, presence of personal information, FCT domain, etc.) and proposes mitigation measures to reduce the risks of sharing those data for research purposes. To trace events occurring in the RDE in a secure way, an Ethereum-based prototype has been built, and custom smart contracts have been defined to support the decentralised authentication mechanisms envisioned in WP6. Research data usage also includes procedures for model training and testing with data received from a provider. A sandbox environment is under development to allow end users to test trained models without disclosing data outside their premises. The sandbox environment is based on containerization technologies, to ensure portability of the solutions. For cases in which access to data is not possible, a Federated Learning (FL) approach is under development as a complementary strategy for model training.
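To illustrate the idea behind Federated Learning, the following is a minimal sketch of federated averaging, a common FL technique, in which each data holder trains locally and only model weights leave its premises. It is not the LAGO implementation: the linear model, the function names (`local_update`, `federated_average`, `train`) and the toy data are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample], lr: float) -> List[float]:
    """One gradient-descent step for y = w*x + b, computed on a client's own data."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train(clients: Sequence[Sequence[Sample]], rounds: int = 1000, lr: float = 0.05) -> List[float]:
    """Run federated rounds: only weights are exchanged, never the raw samples."""
    global_weights = [0.0, 0.0]
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_weights, data, lr) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical toy data: two clients whose samples all lie on y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
w, b = train(clients)
```

In this sketch the coordinator sees only the weight vectors returned by each client, which is the property that makes the approach suitable when raw data cannot be shared.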
WP6 is responsible for defining a governance model for the FCT Research Data Ecosystem. To this end, a trust establishment mechanism based on the Verifiable Credentials standard has been adopted for accrediting participants in the RDE in a trusted way, with defined roles and responsibilities. To enable interoperability, semantic harmonization work is ongoing, aimed at defining a LAGO vocabulary to be used as a reference for modelling concepts and metadata related to the RDE processes. Relevant existing ontologies have been identified, and their concepts will be incorporated into the LAGO vocabulary to foster the reuse of open standards. In addition, the proposed governance framework allows participants to define their own licenses, so that data providers can define the conditions under which their data will be used and enforce a usage agreement between the parties before the data is transferred.
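As a rough illustration of credential-based accreditation, the sketch below checks whether a credential shaped like a W3C Verifiable Credential grants a given role. It is a simplified assumption, not the LAGO mechanism: the `role` claim, the `did:example:` identifiers and the `has_role` helper are hypothetical, the issuer is assumed to be a plain string, and cryptographic proof verification is deliberately left out.

```python
# Top-level properties expected in a credential shaped like the W3C data model.
REQUIRED_FIELDS = ("@context", "type", "issuer", "credentialSubject")


def has_role(credential: dict, trusted_issuers: set, role: str) -> bool:
    """Structural check only: a real deployment must also verify the proof."""
    if any(field not in credential for field in REQUIRED_FIELDS):
        return False
    if "VerifiableCredential" not in credential["type"]:
        return False
    if credential["issuer"] not in trusted_issuers:
        return False
    # "role" is a hypothetical claim name used for this illustration.
    return credential["credentialSubject"].get("role") == role


# Hypothetical credential issued by an accreditation authority.
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential"],
    "issuer": "did:example:accreditation-authority",
    "credentialSubject": {"id": "did:example:university-lab", "role": "data-provider"},
}
trusted = {"did:example:accreditation-authority"}

is_provider = has_role(credential, trusted, "data-provider")
```

A check of this kind would run before a participant is allowed to publish or request datasets, with the set of trusted issuers defined by the governance model.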
Definition of demonstration scenarios and planning of demonstration rounds have been addressed in WP7. Test scenarios have been derived from use cases and divided into unit test scenarios and system scenarios.
WP8 deals with the planning and implementation of dissemination and communication activities, the preparation of an exploitation plan, community building, and training activities.
Finally, WP9 activities focused on the fulfilment of the four Ethics Requirements laid out by the European Commission.