Periodic Reporting for period 1 - CEDAR (Common European Data Spaces and Robust AI for Transparent Public Governance)
Reporting period: 2024-01-01 to 2025-06-30
CEDAR will identify, collect, fuse, harmonise, protect, and share 10+ new high-quality datasets. This will involve digitising data from public administration archives and generating synthetic data to improve real-world data quality. The project also aims to harmonise and standardise different public and private data sources into new unified datasets. Furthermore, it seeks to enable fair and secure access to these datasets and to integrate them with the Common European Data Spaces.
Methods, tools, and guidelines will be developed to digitise, protect, and integrate data in order to address significant issues such as corruption, in line with the European Strategy for Data, the development of Common European Data Spaces (CEDS), and the European Data Act. This will lead to improved transparency and accountability in public governance, promote European values and rights in the digital world, and enrich the European data ecosystem and economy.
In WP2, the data for the pilots and the requirements of the data sources were collected. Pilot clusters and their dictionaries were populated, and Greek Governance Data overnight decisions related to the CEDAR pilots were collected. A framework for understanding data needs was integrated into the project. The datasets gathered for the pilots were harmonised and aligned, and a comprehensive review of state-of-the-art techniques for the anonymisation and pseudonymisation of structured textual data was carried out. Deliverable D2.1 was submitted.
In WP3, preparation of the data and the data model began. A state-of-the-art analysis of MLOps methodology was conducted, and connectors suitable for the project were selected. A framework for cybersecurity risk assessment was developed, and a state-of-the-art overview of penetration testing methodologies was produced. Deliverable D3.1 was submitted.
In WP4, a baseline implementation of the Relationship Extraction tool and a tool for recognising named objects were developed. Drone datasets and audio-related datasets were studied, and a person verification model was evaluated. Publicly available and comparable data from other countries were investigated. Two further components were developed: a baseline for analysing the knowledge graph and possible correlations between its nodes, and a tool for analysing digital footprints.
In WP6, news posts about CEDAR activities were shared across the website and social media channels. Partners co-organised or participated in conferences and international meetings. Video material presenting the goals and activities of the Slovenian pilot team was prepared, and the objectives of the Slovenian pilot were presented through social media. A plan for scientific exploitation was also prepared. Deliverables D6.1 and D6.2 were submitted.
In WP7, the periodic financial reporting was completed. A Data Management Plan (DMP) was developed, and data dictionaries were created for each pilot. The work was aligned with WP2 with respect to ethical and legal requirements. Quality control and risk management were also performed across the WPs. Deliverables D7.1 and D7.2 were submitted.