Periodic Reporting for period 2 - AI4REALNET (AI for REAL-world NETwork operation)
Reporting period: 2024-10-01 to 2025-09-30
The project consolidated the conceptual foundations of human–AI decision-making in safety-critical systems, creating a hypervision tool that supports operators through AI recommendations, contextual visualisations, and KPIs (robustness, transparency, explainability, usability). Upgraded digital environments (Grid2Op, FLATLAND, and BlueSky) now include new scenarios and KPIs, enabling realistic cross-domain testing. The framework was demonstrated in power grid, railway, and air traffic management (ATM) use cases, showing its adaptability across domains. Interoperability connectors link the framework to the agents from WP2 and WP3, enabling real-time exchange of KPIs and AI recommendations.
WP2 – Knowledge-Assisted AI and Transparency
Three knowledge-assisted AI approaches were developed, including hierarchical and distributed reinforcement learning (RL) agents for the power grid and railway domains. These integrate human knowledge, decomposition methods, and graph neural networks (GNNs) to enhance scalability and robustness. Key advances include imitation learning agents for transparent decision-making, GNNs for improved failure prediction, and “what-if” visual tools for operator understanding. Progress in Explainable AI (XAI), safety, and human–machine interaction improved AI trustworthiness. Community engagement was promoted through open-source releases and workshops.
WP3 – Human-Centred and Autonomous AI
WP3 developed human-centred, uncertainty-aware, and multi-objective AI systems. A flexible agent-as-a-service platform enables simulation rollouts and KPI computation. Research advanced epistemic uncertainty estimation (evidential networks, conformal prediction) and multi-objective RL, including a SoftGNN imitation learning agent for preference-based decision-making. Human–AI co-learning was enhanced through adjustable autonomy, inverse reinforcement learning, and interactive evolutionary optimisation (CMA-ES) for airspace sectorisation, offering transparency and control. Multi-agent architectures now enable cooperative negotiation under human supervision, defining the role of the human “director”.
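To illustrate the kind of uncertainty estimate that conformal prediction provides, the following is a minimal sketch of split conformal prediction for a scalar forecast. It is not the project's implementation: the function names, the calibration data (predicted and observed line loadings in percent), and the miscoverage level alpha are all assumptions chosen for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # 1-indexed rank ceil((n + 1) * (1 - alpha)) among the sorted scores.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee this coverage level.
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a point prediction with half-width q."""
    return (point_prediction - q, point_prediction + q)


# Hypothetical calibration set: (predicted, observed) values.
calibration = [
    (50.0, 52.0), (60.0, 57.0), (70.0, 71.0),
    (80.0, 84.0), (65.0, 64.0), (55.0, 57.5),
    (75.0, 72.5), (85.0, 85.5), (90.0, 88.0),
]

# Nonconformity score: absolute error of the underlying model.
scores = [abs(observed - predicted) for predicted, observed in calibration]

# Target 80% coverage (alpha = 0.2), an illustrative choice.
q = conformal_quantile(scores, alpha=0.2)
low, high = prediction_interval(78.0, q)
print(f"half-width={q}, interval=({low}, {high})")
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha, which is what makes the width usable as a risk signal for an operator.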
WP4 – Evaluation and Validation
WP4 defined a comprehensive evaluation framework (Deliverable D4.1 “Evaluation and Test Protocols”) comprising 62 KPIs and 12 scenarios for assessing AI reliability, adaptability, and efficiency. Domain-specific perturbation agents simulate cyberattacks and failures to test robustness and resilience. Human-centred evaluation covers usability and human–AI collaboration, with protocols to assess workload, trust, and user experience via the InteractiveAI platform.
WP2 introduced hybrid models combining domain knowledge and data-driven learning for generalisation and transparency. Physics-informed neural networks, adaptive action-space reduction, and hybrid RL–decision-tree frameworks improved interpretability. Distributed and hierarchical RL with graph-based coordination enhanced scalability. A soft-label imitation learning agent outperformed deep RL baselines while preserving explainability and safety.
WP3 achieved uncertainty quantification and multi-objective optimisation advances, supporting risk-aware, trustworthy AI. Interactive evolutionary algorithms and preference-based learning promoted transparency and operator control, while multi-agent negotiation architectures introduced cooperative autonomy with human oversight.
WP4’s validation framework integrates technical, ethical, and socio-technical dimensions. It establishes new metrics for scalability, robustness, and human–AI augmentation. Resilience is quantified through recovery metrics after perturbations, complementing the AI Act’s concept of robustness. Human-centred metrics assess co-learning, autonomy, and long-term impacts, ensuring AI augments human capability.
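As an illustration of how resilience can be quantified through recovery after a perturbation, the sketch below computes three simple quantities from a KPI time series: the maximum drop, the cumulative shortfall, and the recovery time. These definitions, the function name, the 5% tolerance, and the sample data are assumptions for illustration and are not taken from Deliverable D4.1.

```python
def recovery_metrics(kpi, baseline, t_perturb, tolerance=0.05):
    """Summarise recovery of a higher-is-better KPI after a perturbation.

    kpi       -- KPI value at each time step
    baseline  -- nominal KPI level before the perturbation
    t_perturb -- index of the step at which the perturbation occurs
    tolerance -- relative band below baseline that still counts as recovered
    """
    threshold = baseline * (1 - tolerance)
    post = kpi[t_perturb:]

    # Largest drop below the baseline after the perturbation.
    max_drop = max(0.0, baseline - min(post))

    # Cumulative shortfall relative to the baseline over the post-event window.
    cumulative_loss = sum(max(0.0, baseline - value) for value in post)

    # Recovery time: steps until the KPI is back above the threshold
    # and stays there for the rest of the series (None if it never does).
    recovery_time = None
    for i in range(len(post)):
        if all(value >= threshold for value in post[i:]):
            recovery_time = i
            break

    return {
        "max_drop": max_drop,
        "cumulative_loss": cumulative_loss,
        "recovery_time": recovery_time,
    }


# Hypothetical KPI trace: a perturbation hits at step 2.
trace = [1.0, 1.0, 0.6, 0.7, 0.9, 0.97, 1.0]
metrics = recovery_metrics(trace, baseline=1.0, t_perturb=2)
print(metrics)
```

Separating the depth of the drop from the time to recover is a deliberate choice here: two agents can suffer the same initial degradation yet differ widely in how quickly they restore nominal operation.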