CORDIS - EU research results

A privacy layer to power all research and AI workflows

Periodic Reporting for period 2 - PrivacyForDataAI (A privacy layer to power all research and AI workflows)

Reporting period: 2025-01-01 to 2025-12-31

The project aims to develop and industrialize a privacy-preserving data solution for analytics and AI projects. The core objective is to let researchers and data scientists work with sensitive data without accessing it directly. The solution acts as a protective layer between the data source and the user, so that every data processing operation carries a privacy guarantee. This matters increasingly as concerns about data privacy grow and regulations such as GDPR apply.
Key objectives include:
* Enabling access to sensitive data for research and innovation by removing privacy-related barriers.
* Automating privacy protection by applying Differential Privacy (DP) principles to all data processing.
* Providing synthetic versions of the data to facilitate exploration and analysis without compromising the confidentiality of the original information.
* Supporting a broad range of analyses, from statistical analysis to machine learning and AI, while seamlessly integrating with existing workflows.
* Contributing to EU priorities in citizen rights protection and economic development, boosting trust in technology and technical progress.
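To make the Differential Privacy objective concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query. It is a textbook illustration, not Sarus's implementation: the function names (`laplace_noise`, `dp_count`), the sample data, and the choice of `epsilon` are all assumptions made for this example, and the floating-point noise sampling is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and Laplace noise of scale 1/epsilon gives epsilon-DP.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical sensitive column: ages of seven individuals.
ages = [23, 35, 41, 29, 62, 57, 38]
rng = random.Random(0)  # seeded only to make the illustration repeatable
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(f"noisy count of people aged 40+: {noisy:.2f}")
```

The analyst only ever sees `noisy`, never the underlying rows. A smaller `epsilon` adds more noise and therefore gives stronger privacy at the cost of accuracy.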
During the first year, effort was split between technical implementation and go-to-market initiatives. The technical implementation proceeded as planned, without significant hurdles.
Main technical achievements include:
* Unstructured Type Support: Sarus now supports free-text columns, using pre-trained small language models.
* Flexibility for New Types: The Sarus transformer-based synthetic data (SD) model makes it easy to add support for new data types.
* DP-LLM-FT Module: Sarus has built a DP-LLM-FT module to fine-tune LLMs with DP guarantees without exposing training data.
* DP-RAG Module: A DP-RAG module for RAG queries with DP guarantees was built and open-sourced.
* Backbone for data manipulation: The backbone has been delivered, covering user tracking, recursive DP compilation, pushes to external tables, and performance improvements.
* Qrlew: An open-source tool (Qrlew) that rewrites SQL queries so that their results satisfy DP has been developed.
* Advanced Types: Improved handling of ranges and possible values, with types inferred from data, validated and modifiable by the user. These types are propagated through SQL transforms.
* Docker Support: Support for pre-validated, cross-language, Docker-based computations, which allows a computation written in any language to be packaged as a Docker image and composed with other Sarus operations.
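The Qrlew and Advanced Types items above rely on a common idea: if each column carries a declared range, that range can be propagated through a transform and then used to bound the sensitivity of an aggregate. The sketch below illustrates this idea only. The `Range` class, its methods, `dp_sum`, and the salary data are hypothetical names invented for this example, not Qrlew or Sarus APIs.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Declared bounds of a numeric column (hypothetical type descriptor)."""
    low: float
    high: float

    def scale(self, factor: float) -> "Range":
        """Bounds of `column * factor`; a negative factor swaps the ends."""
        a, b = self.low * factor, self.high * factor
        return Range(min(a, b), max(a, b))

    def clamp(self, x: float) -> float:
        """Force a value into the declared bounds."""
        return min(max(x, self.low), self.high)


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_sum(values, bounds: Range, epsilon: float, rng: random.Random) -> float:
    """Noisy sum of `values` after clamping each one to `bounds`.

    With clamped values, adding or removing one row changes the sum by at
    most max(|low|, |high|), which is the sensitivity used for the noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = max(abs(bounds.low), abs(bounds.high))
    total = sum(bounds.clamp(v) for v in values)
    if sensitivity == 0:
        return float(total)  # every clamped value is 0, nothing to hide
    return total + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical column: monthly salary declared to lie in [0, 10000].
monthly_bounds = Range(0.0, 10000.0)
monthly = [2500.0, 4100.0, 3300.0, 12000.0]  # last value exceeds the range

# Transform: annual = monthly * 12. The declared range follows the transform.
annual_bounds = monthly_bounds.scale(12)
annual = [m * 12 for m in monthly]

rng = random.Random(0)  # seeded only to make the illustration repeatable
print(annual_bounds)
print(dp_sum(annual, annual_bounds, epsilon=1.0, rng=rng))
```

Because the range is derived from the declared type rather than from the observed rows, the out-of-range salary is clamped instead of inflating the noise scale, and the bound itself leaks nothing about the data.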
The project results represent a significant advance in data privacy, offering:
* An automated approach to privacy protection: By automatically applying DP to all data processing, Sarus considerably reduces both the risk of data leakage and the compliance workload.
* A comprehensive solution for structured and unstructured data: Sarus's ability to handle diverse data types, including free text and multi-table databases, significantly broadens its potential use cases.
* Improved usability for data scientists: Sarus tools and SDKs allow data scientists to work with sensitive data using their usual methods and libraries.
* An open-source tool for the community: The open-sourcing of Qrlew and other resources promotes transparency and collaboration in data privacy.

Key needs for continued adoption and success of Sarus include:
* Continued research: Further research is needed to improve DP algorithms and the efficiency of data processing.
* Demonstration and validation: It is essential to continue to demonstrate the value of Sarus through concrete use cases and validate its compliance with regulations.
* Market access and funding: Further efforts are needed to establish business partnerships and secure funding to support the growth of Sarus.
* Regulatory and standards support: It is important to work with data protection authorities to establish clear regulatory and standards frameworks for the use of solutions such as Sarus.