A Comprehensive Framework enabling the Delivery of Trustworthy Datasets for Efficient AIoT Operation

Project information

PANDORA

Grant agreement ID: 101135775

DOI

10.3030/101135775

EC signature date: 29 March 2024

Start date: 1 April 2024

End date: 31 March 2027

Funded under

Digital, Industry and Space

Total cost

€ 8 991 728,75

EU contribution

€ 8 991 728,75

Coordinated by

ETHNICON METSOVION POLYTECHNION
Greece

Periodic Reporting for period 1 - PANDORA (A Comprehensive Framework enabling the Delivery of Trustworthy Datasets for Efficient AIoT Operation)

Reporting period: 2024-04-01 to 2025-09-30

PANDORA aims to devise and implement a comprehensive framework enabling the delivery of trustworthy datasets for smart space ecosystems, as well as the deployment and green operation of AIoT systems in such spaces. PANDORA spans two phases: (1) prior to AIoT system deployment; (2) post AIoT system deployment and operation. Phase 1 proposes and combines a series of novel techniques such as synthetic data generation, quantification of uncertainties, and data summarization for the delivery of trustworthy datasets, as well as explainable AI and domain-informed model training/testing in smart space ecosystems. Phase 2 defines novel AIaaS and CaaS techniques for the robust, explainable, green and continual operation of AIoT systems deployed in such spaces. To achieve this, PANDORA employs a multidisciplinary group of experts from different scientific and industrial fields. The trustworthiness and applicability of the PANDORA framework will be tested through five pilot cases with AIoT applications in smart buildings, factories and critical infrastructures.
The objectives of the project are:
1. Advance research excellence in the development of resilient, transparent, and human-centred AI approaches towards optimised and autonomous data processing and use.
2. Provide novel methods, mechanisms, and tools for the development of customisable, and trustworthy datasets for model-based AI developments.
3. Support the development and the continual autonomous operations of robust and energy efficient “data in AI” pipelines across the computing continuum.
4. Provide a cross-sector and multi-variant smart data space to realise the PANDORA framework and validate the data-enabled trustworthy AI mechanisms in real life scenarios.
5. Foster synergetic approaches in the EU industrial and scientific research communities and promote international collaboration on efficient and trustworthy AI approaches.
6. Enhance the EU multidisciplinary competencies in the fields of industrial AI, data and robotics and embrace open innovation.

Some of the project highlights during this period are:
- Scientific and technical advances across trustworthy AI components. PANDORA delivered major progress in federated representation learning, uncertainty quantification, semi-automated data labelling, synthetic data generation, dimensionality reduction, and energy-efficient continual learning. Several KPIs under Objectives 1–3 were already achieved or exceeded (e.g. +70% federated detection accuracy, +35k× energy gains, +50% improvement in annotation accuracy).
- Deployment of core architecture, data backbone, and first operational testbeds. The project defined the 4+1 PANDORA architecture, delivered the GDPR-compliant Data Collection Mechanism, released initial AaaS/CaaS/IM platform components, and instantiated PANDORA testbeds enabling end-to-end data pipelines for training, inference, monitoring, and orchestration.
- Consolidation of cross-sector pilots and validation framework. All use-case scenarios, requirements, and pipeline blueprints were defined, and GDPR-compliant data flows were established.

Work highlights per WP:
WP2 delivered the complete architectural foundation of the PANDORA framework and produced the requirements analysis framework guiding all pilots. Use-case scenarios, business/data/user requirements and pipeline blueprints were aligned across domains.

WP3 advanced key scientific components: synthetic data generation (tSDG/vSDG), uncertainty quantification (QU-MAD), explainability (GENEO & causal models), automated labelling/completion (NNTL-MVI), and dimensionality reduction/fusion (DRFEC). Multiple KPIs were successfully achieved, with validation across industrial datasets.

WP4 developed core methods enabling resilient and energy-efficient AI pipelines, including continual, domain-informed and explainable AI, continual inference acceleration, federated representation learning, and adaptive distribution of inference tasks. Models were validated on real datasets with significant performance gains.

WP5 delivered key platform components, i.e. AaaS, CaaS, Intent Manager, authentication and UI. Initial integration of training, inference, monitoring, and orchestration pipelines was achieved. Testbeds were instantiated, enabling operational deployment of PANDORA components in industrial settings.

WP6 produced complete pipeline instantiation documents for all pilots, with validation procedures and KPI-driven evaluation variables. Preparatory steps for on-site experimentation were initiated across all use cases.

PANDORA delivered several scientific and technological advances that go beyond the current state of the art in trustworthy AI, data lifecycle automation, and computing-continuum operations. These innovations demonstrate measurable performance gains and new methodological approaches.

- Breakthroughs in federated and continual learning efficiency. Advanced federated representation learning methods achieved substantially higher anomaly detection performance than existing FL baselines. Novel MAML-based continual learning techniques reduced training time and energy consumption, enabling autonomous and low-cost model updates across edge–cloud systems.
- Uncertainty quantification and explainability methods for industrial AI. The QU-MAD component provided significant improvements in probabilistic accuracy. The GENEO-based causal reasoning and the ChronoEpilogi algorithm delivered domain-informed explainability advances, achieving high conciseness, fidelity, and stability in explanations.
- Automated data labelling, completion and dimensionality reduction. PANDORA developed a novel neural transfer-learning framework achieving improvements in annotation accuracy and reduction in manual interventions for missing-value reconstruction. The DRFEC component achieved >50% dimensionality reduction while preserving predictive performance, outperforming mainstream DR techniques.
- Synthetic data generation for visual and time-series data. PANDORA components enabled the generation of realistic, customizable synthetic datasets that support training robustness and data augmentation and reduce dependency on costly real data.
- Real-time inference acceleration and adaptive edge–cloud optimization. Novel continual inference models (Continual Nyströmformers, DeepCoT) reduced inference and execution times beyond those of current state-of-the-art streaming inference systems. In addition, dynamic placement mechanisms reduce inference latency and energy consumption through cluster-aware model distribution.
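
The report does not describe how the dynamic placement mechanisms work. The sketch below only illustrates the general idea of choosing where to run a model by trading off latency against energy, and is not PANDORA's method. The `Node` type, the `choose_node` function, the cost formula, and all numbers are hypothetical and were made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one edge or cloud node (illustrative values only)."""
    name: str
    latency_ms: float      # estimated end-to-end inference latency on this node
    energy_j: float        # estimated energy per inference on this node
    free_memory_mb: float  # memory currently available on this node


def choose_node(nodes, model_memory_mb, latency_weight=0.5):
    """Return the feasible node with the lowest weighted, normalised cost.

    Assumes latency and energy estimates are strictly positive.
    latency_weight=1.0 optimises latency only; 0.0 optimises energy only.
    """
    # Keep only the nodes that can actually hold the model.
    feasible = [n for n in nodes if n.free_memory_mb >= model_memory_mb]
    if not feasible:
        raise ValueError("no node has enough free memory for the model")

    # Normalise each metric by its maximum so the two terms are comparable.
    max_latency = max(n.latency_ms for n in feasible)
    max_energy = max(n.energy_j for n in feasible)

    def cost(n):
        return (latency_weight * n.latency_ms / max_latency
                + (1.0 - latency_weight) * n.energy_j / max_energy)

    return min(feasible, key=cost)


# Made-up cluster used only to exercise the function.
nodes = [
    Node("edge-a", latency_ms=12.0, energy_j=0.8, free_memory_mb=256),
    Node("edge-b", latency_ms=20.0, energy_j=0.3, free_memory_mb=512),
    Node("cloud", latency_ms=85.0, energy_j=2.5, free_memory_mb=16384),
]

print(choose_node(nodes, model_memory_mb=200).name)
print(choose_node(nodes, model_memory_mb=200, latency_weight=1.0).name)
print(choose_node(nodes, model_memory_mb=1024).name)
```

In this toy setup a small model goes to the low-energy edge node when both criteria are weighted equally, moves to the fastest edge node when only latency matters, and falls back to the cloud once it no longer fits in edge memory. A real cluster-aware scheme would presumably also account for network conditions, load, and model partitioning, none of which is modelled here.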

Potential Impacts:
- Higher efficiency: Reduced data processing costs, fewer human interventions, and improved model performance directly support predictive maintenance, quality inspection, and safety-critical operations.
- Trustworthy, transparent AI adoption: Explainability and UQ capabilities build operator trust and support regulatory alignment.
- Reduced dependency on real data: Synthetic data and automated labelling lower the barriers for AI deployment in data-sparse sectors.
- Energy-efficient AI: Reductions in computational cost support sustainability and lower environmental impact.
