Periodic Reporting for period 1 - AI4REALNET (AI for REAL-world NETwork operation)
Reporting period: 2023-10-01 to 2024-09-30
[WP2] The state of the art has been investigated, and the challenges and research directions have been identified in a project position paper. Key developments include:
- Inventoried the knowledge present in the three AI4REALNET domains and started development of knowledge-assisted AI approaches
- Algorithm testing for distributed reinforcement learning (RL) in the Grid2Op environment and exploration of hierarchical RL approaches. Initial research on graph RL for power grid topology optimisation
- Advanced work in explainable AI, focusing on analysing agent behaviours. Development of methods to embed ethical dimensions into reinforcement learning alongside theoretical research on human factors for effective human-AI collaboration
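To illustrate the hierarchical RL direction mentioned above, the following is a minimal, hypothetical sketch: a high-level policy learns (via tabular Q-learning) which subgoal to hand to a low-level controller that executes primitive actions. The 1-D corridor environment, the subgoal set, and all parameters are illustrative assumptions, not project code.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10                      # corridor positions 0..N; reaching N ends the episode
subgoals = [N // 2, N]      # high-level options: reach the midpoint or the end

def low_level_step(pos, subgoal):
    # Greedy primitive controller: move one cell towards the subgoal.
    return pos + (1 if subgoal > pos else -1)

q = np.zeros((N + 1, len(subgoals)))   # high-level Q over (position, subgoal)
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(500):
    pos = 0
    while pos != N:
        # Epsilon-greedy choice of subgoal at the high level.
        g = rng.integers(len(subgoals)) if rng.random() < eps else int(np.argmax(q[pos]))
        start, steps = pos, 0
        # Low-level rollout until the chosen subgoal is reached.
        while pos != subgoals[g]:
            pos = low_level_step(pos, subgoals[g])
            steps += 1
        # Terminal bonus minus a per-step cost for the whole option.
        reward = (10.0 if pos == N else 0.0) - 0.1 * steps
        q[start, g] += alpha * (reward + gamma * np.max(q[pos]) - q[start, g])
```

In this toy setting the learned high-level policy at the start position prefers the subgoal that reaches the terminal state directly, since the intermediate subgoal only adds discounting; the decomposition pays off when the low-level controller, not shown here, must itself be learned per subgoal.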
[WP3] Key developments include:
- Developed requirements for human-AI interaction at generic and use-case levels, specifying roles and tasks and abstracting interaction protocols into a unified representation. Began research on uncertainty estimation methods, focusing on their impact on decision-making [Task 3.1]
- Collected RL training KPIs focused on conflicting objectives, reviewed multi-objective RL approaches, and explored methods to synthesise objectives, enabling agent training without immediate human data
- Identified RL methods for human-AI co-learning, including behaviour cloning, inverse RL, and imitation learning, with synthetic data considered for use cases where real data is limited [Task 3.3]
- Conceptualised workflows for human supervision in automated processes, designing a two-stage agent training plan that uses offline and online RL to align AI with human performance and build trust
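Behaviour cloning, one of the co-learning methods identified in WP3, reduces policy learning to supervised learning on expert (state, action) pairs. The sketch below is a hypothetical illustration under assumed toy conditions: a 2-D state space, a rule-based "expert", and a nearest-neighbour clone; none of this reflects the project's actual models or data.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(state):
    # Toy expert: choose action 1 when the first state feature dominates.
    return 1 if state[0] > state[1] else 0

# Collect expert demonstrations (the "human data" the clone learns from).
states = rng.uniform(-1.0, 1.0, size=(200, 2))
actions = np.array([expert_policy(s) for s in states])

def cloned_policy(state):
    # 1-nearest-neighbour clone: copy the expert action of the closest
    # demonstrated state. Any supervised learner could be substituted here.
    idx = np.argmin(np.linalg.norm(states - state, axis=1))
    return actions[idx]

# Evaluate agreement with the expert on held-out states.
test_states = rng.uniform(-1.0, 1.0, size=(100, 2))
agreement = np.mean([cloned_policy(s) == expert_policy(s) for s in test_states])
```

The same pipeline accepts synthetic demonstrations in place of real ones, which is the fallback considered in Task 3.3 for use cases where human data is scarce.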
From WP2, there are two main contributions: a) a model to break the complexity arising from the curse of dimensionality in large-scale reinforcement learning problems, which uses information-theoretic quantities such as mutual information to find an effective way to split large state and action spaces, and b) a machine learning method for identifying and predicting power grid failures in deep reinforcement learning. These methods have the potential to be replicated in different contexts and domains, and they address challenges such as the scalability of reinforcement learning and the explainability of AI agents' behaviour.
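The mutual-information idea behind contribution a) can be sketched as follows: estimate how much each state component tells you about an outcome of interest, and use that ranking to decide which components must stay coupled and which can be split into a separate sub-problem. The estimator and the toy two-component state below are illustrative assumptions, not the project's model.

```python
import numpy as np

def mutual_information(x, y):
    # Empirical mutual information (in nats) between two discrete sequences.
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(x_vals), len(y_vals)))
    for i, j in zip(x_idx, y_idx):
        joint[i, j] += 1
    joint /= joint.sum()                       # empirical joint distribution
    px = joint.sum(axis=1, keepdims=True)      # marginal of x
    py = joint.sum(axis=0, keepdims=True)      # marginal of y
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

rng = np.random.default_rng(1)
# Toy state with two components: the first drives the outcome, the second is noise.
s1 = rng.integers(0, 2, size=5000)
s2 = rng.integers(0, 2, size=5000)
outcome = s1 ^ (rng.random(5000) < 0.05)       # outcome follows s1 ~95% of the time

# High-MI components should stay coupled to the decision sub-problem;
# near-zero-MI components are candidates for splitting off.
mi_s1 = mutual_information(s1, outcome)
mi_s2 = mutual_information(s2, outcome)
```

Here the informative component scores well above the noise component, so a split that separates the two loses little information about the outcome; on a real grid the same ranking would be computed over many candidate partitions of the state and action spaces.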
Given that WP3 started in month M6, most of the work is in a preliminary phase or at a conceptual level, mainly proposing architectures for co-learning methods that focus on inverse RL, imitation learning, and behaviour cloning, and defining human-AI interaction protocols and requirements.