WP1 – Conceptual and Framework Development
The project consolidated the conceptual foundations of human–AI decision-making in safety-critical systems and created a hypervision tool that supports operators through AI recommendations, contextual visualisations, and KPIs (robustness, transparency, explainability, usability). The upgraded digital environments (Grid2Op, FLATLAND, and BlueSky) now include new scenarios and KPIs, enabling realistic cross-domain testing. The framework was demonstrated in the power grid, railway, and air traffic management (ATM) domains, showing that it adapts across them. Interoperability connectors link the framework to the agents from WPs 2–3, enabling real-time exchange of KPIs and AI recommendations.
WP2 – Knowledge-Assisted AI and Transparency
Three knowledge-assisted AI approaches were developed, including hierarchical and distributed reinforcement learning (RL) agents for the power grid and railway domains. These integrate human knowledge, decomposition methods, and graph neural networks (GNNs) to enhance scalability and robustness. Key advances include imitation learning agents for transparent decision-making, GNNs for improved failure prediction, and “what-if” visual tools for operator understanding. Progress in Explainable AI (XAI), safety, and human–machine interaction improved AI trustworthiness. Community engagement was promoted through open-source releases and workshops.
WP3 – Human-Centred and Autonomous AI
WP3 developed human-centred, uncertainty-aware, and multi-objective AI systems. A flexible agent-as-a-service platform enables simulation rollouts and KPI computation. Research advanced epistemic uncertainty estimation (evidential networks, conformal prediction) and multi-objective RL, including a SoftGNN imitation learning agent for preference-based decision-making. Human–AI co-learning was enhanced through adjustable autonomy, inverse reinforcement learning, and interactive evolutionary optimisation (CMA-ES) for airspace sectorisation, which offers transparency and control. Multi-agent architectures now enable cooperative negotiation under human supervision and define the role of the human “director”.
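As an illustration of the conformal prediction idea mentioned above, the following is a minimal sketch of generic split conformal prediction for a regression output. It is not the project's implementation: the function names, the calibration residuals, and the use of a symmetric interval are all assumptions made for this example.

```python
import math


def conformal_quantile(abs_residuals, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals.

    abs_residuals: absolute errors |y - y_hat| on a held-out calibration set
    (hypothetical data; any point predictor could have produced them).
    """
    n = len(abs_residuals)
    if n == 0:
        raise ValueError("calibration set must not be empty")
    # Split conformal uses the ceil((n + 1) * (1 - alpha))-th smallest residual.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this coverage level:
        # only an unbounded interval carries the guarantee.
        return math.inf
    return sorted(abs_residuals)[rank - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a new point prediction."""
    return (point_prediction - q, point_prediction + q)


# Usage: 7 calibration residuals, target coverage 75% (alpha = 0.25).
calibration = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.4]
q = conformal_quantile(calibration, alpha=0.25)
low, high = prediction_interval(10.0, q)
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha, regardless of the underlying predictor. This is one reason the technique suits uncertainty-aware decision support.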
WP4 – Evaluation and Validation
WP4 defined a comprehensive evaluation framework (Deliverable D4.1 “Evaluation and Test Protocols”) comprising 62 KPIs and 12 scenarios for assessing AI reliability, adaptability, and efficiency. Domain-specific perturbation agents simulate cyberattacks and failures to test robustness and resilience. Human-centred evaluation covers trust, usability, and collaboration, with protocols to assess workload, trust, and user experience via the InteractiveAI platform.
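To make the role of a perturbation agent concrete, the sketch below runs a toy control task with and without perturbed observations and compares how long the controller survives. Everything in it is hypothetical: the one-dimensional environment, the function names, and the survival-ratio metric are illustrative assumptions and are not among the KPIs or scenarios defined in D4.1.

```python
import random


def survival_steps(policy, perturb, max_steps=50, limit=1.0, seed=0):
    """Run one episode of a toy 1-D regulation task.

    The episode fails once |state| exceeds `limit`; the return value is the
    number of steps completed before failure (max_steps if none occurred).
    """
    rng = random.Random(seed)
    state = 0.0
    for step in range(max_steps):
        observed = perturb(state, rng)  # perturbation agent tampers here
        state += policy(observed) + rng.uniform(-0.1, 0.1)  # small disturbance
        if abs(state) > limit:
            return step
    return max_steps


def robustness_ratio(policy, perturb, seeds):
    """Survival time under perturbation relative to the nominal case."""
    nominal = sum(survival_steps(policy, no_attack, seed=s) for s in seeds)
    perturbed = sum(survival_steps(policy, perturb, seed=s) for s in seeds)
    return perturbed / nominal


def no_attack(state, rng):
    """Nominal case: the agent observes the true state."""
    return state


def sensor_attack(state, rng):
    """Hypothetical perturbation: large additive noise on the observation."""
    return state + rng.uniform(-3.0, 3.0)


def proportional_policy(observation):
    """Simple controller that pushes the observed state back towards zero."""
    return -0.5 * observation


ratio = robustness_ratio(proportional_policy, sensor_attack, seeds=range(5))
```

A ratio of 1.0 means the perturbation did not shorten any episode, while values near 0 indicate that the controller fails almost immediately under attack.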