Periodic Reporting for period 4 - OCAL (Optimal Control at Large)
Reporting period: 2023-05-01 to 2024-10-31
OCAL addressed precisely this challenge by developing a framework for approximately solving optimal control problems that is computationally tractable and provides theoretical approximation guarantees. In the context of approximate dynamic programming, the starting point was the formulation of optimal control problems as linear programs. Since these programs are infinite dimensional for continuous state and action spaces, we developed randomised methods that rely on finite dimensional function approximation and the sampling of constraints as a basis for algorithms. Our approach enjoys close connections to statistical learning theory, providing a direct link to data-driven approximation and resulting in the desired theoretical guarantees. Besides uncovering theoretical properties of these methods, however, our work showed that scaling them up to large-scale systems is far from trivial computationally: empirically, an unreasonably large number of constraints is required to ensure that the approximate linear program remains bounded. We addressed this issue by moving away from random constraint sampling and developing structured, iterative constraint sampling methods. This nicely complemented our parallel work on the approximate solution of dynamic programming problems for finite state-action problems, where high-performance, parallel software was developed for performing the approximation, drawing on a theoretical connection to non-smooth variants of Newton’s method. To demonstrate the efficacy of these methods, in addition to benchmark problems we also applied them to a simulation case study on insulin injection for the treatment of diabetes.
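For readers unfamiliar with the linear programming view of dynamic programming, the following is a sketch of the kind of formulation described above. It assumes a discounted infinite-horizon problem; the report does not state the problem class. The symbols (stage cost \ell, dynamics f, discount factor \gamma, weighting measure c, basis functions \phi_i, sampled pairs (x_j, u_j)) are illustrative notation and not necessarily the project's own.

```latex
% Assumed setting (not stated in the report): discounted infinite-horizon cost,
% stage cost \ell, dynamics x^+ = f(x,u,\xi), discount factor \gamma \in (0,1),
% and a state-relevance weighting measure c.
\begin{align*}
\text{Exact LP:}\quad
  & \sup_{V}\ \int_{X} V(x)\, c(\mathrm{d}x) \\
  & \text{s.t.}\ V(x) \le \ell(x,u) + \gamma\, \mathbb{E}_{\xi}\big[ V(f(x,u,\xi)) \big]
    \quad \forall (x,u) \in X \times U, \\[1ex]
\text{Sampled approximation:}\quad
  & \max_{\alpha \in \mathbb{R}^{n}}\ \sum_{i=1}^{n} \alpha_i \int_{X} \phi_i(x)\, c(\mathrm{d}x) \\
  & \text{s.t.}\ \sum_{i=1}^{n} \alpha_i \phi_i(x_j)
    \le \ell(x_j,u_j) + \gamma \sum_{i=1}^{n} \alpha_i\, \mathbb{E}_{\xi}\big[ \phi_i(f(x_j,u_j,\xi)) \big],
    \quad j = 1,\dots,N.
\end{align*}
```

In this notation the exact program has one constraint per state-action pair, hence infinitely many when X and U are continuous. The sampled version keeps only N of them, which is why too few or poorly placed samples can leave the maximisation over \alpha unbounded.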
In the context of model predictive control, our work focused on the use of data to alleviate the need to develop a model. We showed that, though primarily inspired by deterministic linear problems where it is exact, this approach can also be used to approximate nonlinear and stochastic problems through regularisation. We were moreover able to establish a close link between the choice of regulariser and the various sources of uncertainty entering the problem. The approximation methodology resulted in a very powerful method that we were able to apply to practical problems, both in simulation and in experiments; examples range from quadrotor control in the lab, to energy management and urban traffic management.
In a parallel stream, we developed methods for removing the "M" in MPC. Model Predictive Control (MPC) methods are very popular in industry and academia, but their reliance on a model sometimes hampers their deployment in settings where models are difficult to obtain and maintain; an example is energy management in buildings and districts. In this context, we worked on Data Enabled Predictive Control (DeePC) methods, which replace the model in the optimisation problem solved by MPC with constraints involving so-called Hankel matrices constructed from data. The key challenge we addressed is dealing with systems that are subject to uncertainty; the key ingredient is appropriate regularisation based on methods from stochastic programming and robust optimisation.
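To illustrate how Hankel matrices built from data can stand in for a model, the following minimal Python sketch uses a hypothetical scalar system y[k+1] = a*y[k] + b*u[k]. The system, the data values and all function names are assumptions chosen for illustration; this is not the project's software. It shows that any weighted combination of the Hankel columns is again a trajectory of the same linear system, which is the property the data-based constraints exploit. The converse, that every trajectory can be written this way, additionally requires sufficiently rich data and is not checked here.

```python
def hankel(signal, depth):
    """Return the Hankel matrix with `depth` rows built from a scalar signal."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


def simulate(a, b, u, y0=0.0):
    """Simulate the toy system y[k+1] = a*y[k] + b*u[k], starting from y[0] = y0."""
    y = [y0]
    for uk in u[:-1]:
        y.append(a * y[-1] + b * uk)
    return y


def combine(matrix, g):
    """Compute the matrix-vector product matrix @ g without external libraries."""
    return [sum(row[j] * g[j] for j in range(len(g))) for row in matrix]


# Recorded input/output data from the toy system (values chosen arbitrarily).
a, b = 0.5, 1.0
u_data = [1.0, -1.0, 2.0, 0.0, 1.0, -2.0, 1.0, 3.0]
y_data = simulate(a, b, u_data)

depth = 4  # length of each trajectory window stored in a column
H_u = hankel(u_data, depth)
H_y = hankel(y_data, depth)

# Any weighting g of the columns yields another length-4 input/output trajectory.
g = [0.3, -1.0, 0.5, 2.0, -0.7]
u_new = combine(H_u, g)
y_new = combine(H_y, g)

# Largest violation of the system dynamics along the combined trajectory;
# it should be zero up to floating-point rounding.
residual = max(
    abs(y_new[k + 1] - (a * y_new[k] + b * u_new[k])) for k in range(depth - 1)
)
print(residual)
```

In a predictive controller of this type, the vector g becomes a decision variable of the optimisation problem, and regularising it is one way to cope with noisy or nonlinear data.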
OCAL resulted in numerous publications in the best venues of automatic control and the successful completion of three doctoral theses, with two more nearing completion at the end of the project. Our computational work was also released as two open-source, parallel, high-performance implementations, for GPUs and clusters respectively.
Thanks to the project, we now better understand the advantages and limitations of using numerical methods for linear systems and linear programs to approximate the dynamic programming formulation of optimal control. Besides our methodological results that others can now build on (such as the introduction of the relaxed Bellman operator and the characterisation of its fundamental properties), OCAL also developed open-source computational tools that other groups can use to deploy dynamic programming solutions to optimal control problems of unprecedented scale.
Similarly, in the context of Model Predictive Control, the OCAL results contributed to the development of the Data Enabled Predictive Control methodology, arguably the leading contender for model-free predictive control today. Our results allowed us to lay solid theoretical foundations for the practice of using regularisation in Data Enabled Predictive Control. This enabled the deployment of the methods to difficult nonlinear problems such as power systems or the control of urban mobility systems.