Periodic Reporting for period 1 - DataPACT (DATAPACT: COMPLIANCE BY DESIGN OF DATA/AI OPERATIONS AND PIPELINES)
Reporting period: 2025-01-01 to 2025-12-31
DataPACT validates its core results through a strong selection of seven complementary use cases offered by SMEs, large companies, and public sector organizations in relevant areas, including media and entertainment, healthcare, smart cities, law enforcement and security, customer relationship management, manufacturing, and public data. DataPACT helps them ensure straightforward and cost-effective compliance with existing and emerging regulations and guidelines, shorter time-to-market for compliant data solutions, fair and unbiased data-driven systems, respect for privacy and other fundamental rights, and lower and transparent environmental impact for intensive data/AI pipeline operations.
DataPACT gathers a balanced consortium of 18 partners from 16 countries, consisting of two large companies, two public sector organizations, four SMEs, two research centers, and seven universities, covering relevant aspects related to legal, ethical, social, environmental, and technical compliance of data/AI operations and pipelines.
Main achievements include:
- Establishment of the Data Management Plan (Deliverable D1.4) as a living document guiding data management throughout the project.
- Development of the state-of-the-art overview for expected results, identifying key challenges and knowledge gaps (part of Deliverable D1.1).
- Consolidated identification and mapping of technical, market, and use case requirements (Deliverables D1.1 and D5.1).
- Gathering of requirements through AS-IS and TO-BE interviews, user stories, and a survey.
- Cataloguing of relevant tools and mapping them against expected results and user stories.
- Identification of relevant legislation and national transpositions, forming the basis for automated compliance solutions and the first iteration of the Compliance Framework (Deliverable D4.1).
- Definition and documentation of the DataPACT architecture (Deliverable D1.2).
- Release of the first integrated compliance-aware data/AI pipeline lifecycle toolbox, incorporating tools from the PipelineR, GreenR, and AppleyR collections (Deliverable D2.1).
- Release of the first version of the Compliance Toolbox, delivering early results across the PolicyR and TrustR toolsets (Deliverable D3.1).
- Establishment of the project’s exploitation strategy with a phased workflow and identification of Key Exploitable Results (Deliverable D6.1).
- Technical and market requirements were consolidated, and regulatory, ethical, societal, and sustainability considerations were translated into shared architectures, interoperable tools, and compliant data models.
- The first version of the Compliance-aware Pipeline Lifecycle Toolbox was released, allowing data/AI pipelines to be designed, executed, and monitored with embedded compliance, traceability, and environmental awareness.
- The toolbox integrates functionality such as policy-aware operation definition, pseudonymisation, pipeline simulation, compliance monitoring, and energy and emissions tracking, supporting evidence generation and automated compliance assessment.
- The first integrated version of the Compliance Toolbox was developed, operationalising compliance-by-design principles through tools addressing privacy, consent, access control, contracting, trust, fairness, robustness, and explainability.
- The PolicyR and TrustR tools enable machine-readable policy specification, dynamic consent management, automated negotiation, trust and reputation modelling, bias inspection, fairness assessment, and transparent AI decision-making.
- The first version of the DataPACT Compliance Framework was established, providing the legal, ethical, and social foundations for compliance assessment across the entire pipeline lifecycle.
- The framework aligns with key EU regulations, including GDPR, the DGA, the Data Act, and the AI Act, and introduces AssessR tools that connect regulatory knowledge, LLM-based compliance assistants, and certification mechanisms with operational toolchains.
Together, these developments establish a baseline approach that embeds compliance and ethical considerations into data and AI pipelines, improving their reliability and sustainability.