Distributed Learning-Based Control for Multi-Agent Systems

Periodic Reporting for period 1 - DiLeBaCo (Distributed Learning-Based Control for Multi-Agent Systems)

Reporting period: 2019-10-07 to 2021-04-06

Multi-agent systems offer tremendous potential to improve the quality of life in modern society. For instance, robotic networks can increase food production or carry out search and rescue missions. Fleets of autonomous cars will reduce traffic congestion and fuel consumption while increasing road safety. With 45% of all freight transported by road, transportation accounts for 26% of total EU energy consumption and 18% of greenhouse gas emissions. Reducing fuel consumption in this area will therefore have a significant environmental impact. One way to achieve this is platooning, where heavy-duty vehicles drive close together in groups, reducing their aerodynamic drag and thus increasing their fuel efficiency.

The challenge in controlling autonomous systems is their increasing complexity, which arises from the interactions between the multiple autonomous agents in a system and from the complex, dynamic environments in which they operate. State-of-the-art classical control methods are therefore either overly conservative, leading to poor control performance, or unable to guarantee safety. In particular, since these classical methods rely on analytical models, they may even be impossible to apply when such models are unavailable.

The objective of the research project is therefore to fuse methods from machine learning with control approaches in order to guarantee both performance and safety of the controlled systems. In particular, previously seen data (from experience) or simulated data (rollouts) are used to learn the information that is missing due to the complexity of the systems, i.e. about the optimal control policies, about the dynamical model of the systems, or about the complex and dynamic environments in which they are navigated. The specific goals of the project are 1) to develop novel control algorithms by fusing methods from machine learning with control approaches, 2) to guarantee safety and performance of the developed algorithms, and 3) to focus on data efficiency, scalability, and computational efficiency of these methods, so that they can be applied online, in real time, and to complex multi-agent systems.
This research project focuses on complex autonomous systems, where the complexity mainly arises from interactions between multiple agents, from tightly constrained and dynamic environments, and from interactions with humans. To cope with these complexities, classical model-based robust control methods can be overly conservative or even impossible to implement. Therefore, data-driven control methods have been developed: methods from machine learning have been fused with control algorithms in order to achieve high performance and guarantee safety for the controlled complex systems.

Data-driven methods have been developed to learn optimal control policies for complex interconnected and multi-agent systems, leading to (locally) optimal performance and safe control behavior. The results have been validated in extensive simulations of distributed linear systems and of multi-agent nonlinear systems. For the latter, an application example is the time-optimal navigation of multiple agents to desired goal positions while ensuring collision avoidance between all agents. The optimal control policies are iteratively learned from previously seen data and employed in a decentralized way (without communication between the agents), yielding a scalable (applicable to a large number of subsystems) data-driven control method with locally optimal control performance and guaranteed safety (collision avoidance).
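The core idea of iteratively learning a policy from previously seen data can be illustrated with a minimal sketch in the spirit of Learning Model Predictive Control (LMPC): each completed trajectory is stored with its recorded cost-to-go, and the next iteration re-plans greedily against that stored data. All system details below (a 1-D integrator, unit step cost, the initial detour trajectory) are illustrative assumptions, not the project's actual models.

```python
# Sketch: iterative, data-driven policy improvement from stored trajectories.
# The system, costs, and data here are assumed for illustration only.

def rollout(x0, policy, goal=0, max_steps=50):
    """Simulate the (assumed) 1-D system x <- x + u until the goal is reached."""
    traj = [x0]
    while traj[-1] != goal and len(traj) < max_steps:
        traj.append(traj[-1] + policy(traj[-1]))
    return traj

def learn_policy(safe_set):
    """Greedy one-step policy: move to the stored state with lowest cost-to-go."""
    def policy(x):
        candidates = [u for u in (-1, 0, 1) if x + u in safe_set]
        return min(candidates, key=lambda u: safe_set[x + u])
    return policy

def store(traj, safe_set):
    """Record each visited state with its cost-to-go (steps left to the goal)."""
    for i, x in enumerate(traj):
        cost = len(traj) - 1 - i
        safe_set[x] = min(cost, safe_set.get(x, cost))

# Iteration 0: a feasible but suboptimal trajectory with a detour (cost 7).
initial = [5, 6, 5, 4, 3, 2, 1, 0]
safe_set = {}
store(initial, safe_set)

# Subsequent iterations can only improve, since the previous trajectory
# always remains a feasible fallback inside the stored safe set.
for _ in range(3):
    traj = rollout(5, learn_policy(safe_set))
    store(traj, safe_set)

print(len(initial) - 1, "->", len(traj) - 1)  # prints: 7 -> 5
```

The safe set doubles as the safety mechanism here: the policy only ever moves to states from which a feasible path to the goal has already been demonstrated.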

Furthermore, a hierarchical control framework has been developed, in which previously recorded data is used to learn a higher-level strategy that guides the lower-level optimization problem. The advantage is that the underlying optimization problem becomes less complex and can therefore be solved online. Moreover, both good control performance and safety of the controlled system, the latter enforced through a finite state machine, are guaranteed, even for control tasks that require navigating tightly constrained and dynamic environments shared with human-driven cars. One considered control scenario is autonomous driving in a tight parking lot in which human-driven cars drive around and park in empty spots. Since the environment is dynamic and tightly constrained, the exact optimization problem that would need to be solved to control the autonomous car is too complex to solve online in real time. Therefore, using the novel hierarchical control framework, higher-level strategies are learned from previously recorded data; in this scenario they are defined as "passing left", "passing right", or "yielding". These higher-level strategies are evaluated online and then guide the online optimization problem via additional constraints. The performance and safety of this control task were validated in extensive simulations and in experiments on the BARC platform at the Model Predictive Control Laboratory at UC Berkeley.
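The hierarchical idea can be sketched as follows: a strategy is predicted from recorded demonstrations and translated into constraints that shrink the feasible set of the low-level optimizer. The features, demonstration data, nearest-neighbour predictor, and constraint encoding below are all illustrative assumptions, not the framework's actual components.

```python
# Hedged sketch of strategy learning guiding a lower-level optimization.
# Features: (lateral gap on the left, lateral gap on the right,
# obstacle speed) -> strategy observed in that recorded situation.
import math

demonstrations = [
    ((2.5, 0.5, 0.0), "passing left"),
    ((0.4, 2.8, 0.0), "passing right"),
    ((0.3, 0.4, 1.2), "yielding"),
    ((3.0, 0.2, 0.1), "passing left"),
]

def predict_strategy(features):
    """1-nearest-neighbour strategy selection from the recorded data."""
    return min(demonstrations,
               key=lambda d: math.dist(d[0], features))[1]

def strategy_constraints(strategy):
    """Translate the strategy into (hypothetical) bounds on lateral
    position, handed to the low-level optimizer as extra constraints."""
    return {"passing left": (0.5, 3.0),
            "passing right": (-3.0, -0.5),
            "yielding": (0.0, 0.0)}[strategy]

situation = (2.2, 0.3, 0.0)  # wide gap on the left, obstacle stationary
strategy = predict_strategy(situation)
print(strategy, strategy_constraints(strategy))  # prints: passing left (0.5, 3.0)
```

The design point is that the low-level problem no longer needs to reason over discrete choices such as which side to pass on; the learned strategy fixes that choice, leaving a simpler continuous problem to solve online.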
The developed data-driven optimal control methods achieve global (for linear systems) or local (for nonlinear systems) optimality while guaranteeing safety. To the best of our knowledge, this result has not been achieved before by distributed or decentralized control algorithms for general distributed systems.
The developed data-driven hierarchical control framework combines strategies learned from recorded data with a lower-level optimization problem that, guided by these strategies, can be solved online. In this way, a high level of interpretability is maintained. Safety is guaranteed through a finite state machine, and good performance is achieved through the data-driven approach. These advantages of the novel framework over a state-of-the-art controller have been demonstrated and confirmed in experiments.
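The role of a finite state machine as a safety layer can be illustrated with a minimal sketch: the learned controller is applied only while the state machine certifies the situation as safe, and a conservative fallback takes over otherwise. The states, gap thresholds, hysteresis, and braking fallback below are illustrative assumptions.

```python
# Sketch of a finite-state-machine safety supervisor over a learned policy.
# Thresholds and actions are assumed for illustration only.

SAFE_GAP = 5.0  # minimum allowed gap to an obstacle (assumed, in metres)

def supervisor_step(state, gap):
    """Transition the FSM based on the current gap to the obstacle."""
    if state == "NOMINAL" and gap < SAFE_GAP:
        return "FALLBACK"
    if state == "FALLBACK" and gap >= 2 * SAFE_GAP:  # hysteresis on recovery
        return "NOMINAL"
    return state

def control(state, learned_action):
    """Apply the learned action only in the nominal state; brake otherwise."""
    return learned_action if state == "NOMINAL" else -1.0

state = "NOMINAL"
for gap in [12.0, 6.0, 4.0, 3.0, 8.0, 11.0]:
    state = supervisor_step(state, gap)
    print(gap, state, control(state, learned_action=0.3))
```

Because the fallback is independent of the learned component, safety guarantees rest on the simple, verifiable state machine rather than on the data-driven policy itself.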

Expected results by the end of the project:
The developed methods will be used in the platooning research project at KTH. Here, safety in terms of collision avoidance is indispensable, while optimal performance in terms of fuel reduction and time optimality is highly desirable in view of the climate objectives of the EU. The results will be demonstrated at the Smart Mobility Lab at KTH. Furthermore, while the methods developed so far are partially based on analytical models of the systems, paired with machine learning techniques, an entirely data-driven optimal control approach will be investigated for systems whose analytical models are very difficult to obtain.
Complex maneuvers in a tightly constrained parking lot by the hierarchical data-driven control framework
Trajectories for data generation and from optimized distributed control policies by Decentralized LMPC