Control without Trust: A Distributionally Robust Approach

Project information

TRUST

Grant agreement ID: 949796

DOI

10.3030/949796

EC signature date: 26 October 2020

Start date: 1 April 2021

End date: 31 March 2026

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 1 499 515,00

EU contribution

€ 1 499 515,00

Coordinated by

TECHNISCHE UNIVERSITEIT DELFT
Netherlands

Periodic Reporting for period 2 - TRUST (Control without Trust: A Distributionally Robust Approach)

Reporting period: 2022-10-01 to 2024-03-31

Overall objectives:
================
The project TRUST aims to develop scalable and provably reliable data-driven control methodologies for industrial-size applications. To reach this end, TRUST utilizes recent robust decision-making models from operations research, dynamic programming characterization from control theory, and real-time computational tools from machine learning. Specifically, the following are the three main methodological objectives of TRUST:

(i) build more reliable data-driven models for decision-making;
(ii) provide rigorous frameworks in which the performance of the control mechanism is provably guaranteed;
(iii) develop scalable, fast, and memory-efficient computational solutions using the rich literature of convex optimization and other algorithmic techniques.

The framework anticipated in TRUST to deliver the above objectives falls into the broad area of distributionally robust decision-making. Many decision problems in science, engineering, and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many training samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution—especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples.
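For concreteness, the Wasserstein distributionally robust problem described above is commonly written as below. The notation is assumed here for illustration and is not taken from the report: x is the decision in a feasible set X, ξ the uncertain parameter, ℓ the loss, W the Wasserstein distance, ε the radius of the ambiguity set, and the nominal distribution is taken to be the empirical distribution of the N training samples (one common choice).

```latex
\[
\min_{x \in \mathcal{X}} \;
\sup_{Q \,:\, W(Q, \widehat{P}_N) \le \varepsilon} \;
\mathbb{E}_{\xi \sim Q}\bigl[\ell(x, \xi)\bigr],
\qquad
\widehat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\widehat{\xi}_i}.
\]
```

Setting ε = 0 recovers the sample-average problem, while a larger ε hedges against distributions further away from the training data.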

Important output:
================
The techniques developed in TRUST enable us to convert distributionally robust control problems into instances of structured convex optimization problems recognizable by solvers. We aim to develop open-source software that automates this conversion and bridges the mathematical model to relevant applications at an industrial level, providing an efficient and easy-to-use tool that requires no expert knowledge of the technical disciplines. This is showcased by the reproducibility of our numerical results via the open-source software developed during the project. All codes and publications are publicly available at https://www.dcsc.tudelft.nl/~mohajerin/pub_complete.html.

Besides the methodological outcomes envisioned in TRUST, we have also made progress in applying them to real-world problems, including health monitoring of energy systems, smart decision-making in transportation and logistics, leakage detection in water distribution networks, and anomaly detection in autonomous vehicles.

TRUST contains four work packages (WP) to accomplish its main objectives. The results pertaining to these work packages, both on the theoretical/fundamental side and in the form of open-source software tools, are reported through several key publications, which are briefly explained in the following. In the references below, the abbreviation "J" stands for journals, "C" for conferences, and "P" for submitted manuscripts that are under review; the numbers follow the publication numbers on my personal homepage: https://www.dcsc.tudelft.nl/~mohajerin/pub_complete.html. All the papers are also available on arXiv, and their arXiv numbers are provided in the reference list.

WP1 proposes data-driven decision-making models that have several conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample performance guarantees (i.e. performance on unseen datasets). These models have interesting ramifications for statistical learning and motivate new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation, or minimum mean square error estimation, among others. Two main tasks in this direction are to introduce nonparametric and parametric ambiguity sets which have been proposed in [J23] and [J31], respectively.

In the next step, in WP2, we develop efficient algorithms to solve dynamic decision-making schemes, also called dynamic programming (DP). The DP technique is the workhorse of optimal decision-making and is a particularly useful scheme for data-driven control problems under uncertainty. Initially in [C40], and later extended in [J35], we propose two novel numerical schemes for the approximate implementation of the DP operation concerned with finite-horizon optimal control of discrete-time systems with input-affine dynamics. The main idea is inspired by an interesting analogy between the convex conjugate operator and the Fourier transform, paving the way to leverage algorithmic techniques from this literature for dynamic decision-making problems in control. In [P8], we make a further explicit analogy between control and optimization across four problem classes with a unified solution characterization. This novel framework, in turn, allows for a systematic transformation of algorithms from one domain to the other. This leads to novel first-order control algorithms that share the same per-iteration complexity as standard value iteration and Q-learning, but interestingly exhibit convergence behavior and sensitivity to the discount factor similar to those of second-order algorithms such as policy iteration and Zap Q-learning.
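To make the convex conjugate operator mentioned above concrete, here is a minimal sketch of a brute-force discrete Legendre-Fenchel transform on a one-dimensional grid. The function name, grids, and test function are assumptions chosen for illustration; this is not the numerical scheme of [C40] or [J35], only the operator those schemes build on. The analogy with the Fourier transform rests on a standard fact: conjugation turns an infimal convolution into a sum, much as the Fourier transform turns a convolution into a product.

```python
import numpy as np

def discrete_conjugate(x_grid, f_vals, s_grid):
    """Brute-force discrete Legendre-Fenchel transform.

    Returns f*(s) = max_x (s * x - f(x)) for each slope s in s_grid,
    with the maximum taken over the points of x_grid.
    """
    # Rows index slopes, columns index grid points.
    return np.max(np.outer(s_grid, x_grid) - f_vals[None, :], axis=1)

# Illustrative check: f(x) = 0.5 * x^2 is its own conjugate.
x_grid = np.linspace(-3.0, 3.0, 601)
s_grid = np.linspace(-2.0, 2.0, 41)
f_star = discrete_conjugate(x_grid, 0.5 * x_grid**2, s_grid)
```

This version costs one pass over the whole grid per slope; it is meant to show what is being computed, not how to compute it quickly.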

Most recently, we have extended the above results in two directions motivated by the control problems anticipated in WP2. First, in [P4] we focus on a class of distributionally robust optimization (DRO) problems where, unlike in the growing body of literature, the objective function is potentially nonlinear in the distribution. Second, we developed some of these results further in a dynamic setting in [P6], where we establish a collection of closed-loop guarantees and propose a scalable, Newton-type optimization algorithm for distributionally robust model predictive control (DRMPC) applied to linear systems, zero-mean disturbances, convex constraints, and quadratic costs.

Another important focus in TRUST is to propose efficient algorithms that can be implemented online with moderate memory usage. In WP3, we investigate the power of Online Convex Optimization (OCO), an area that has received notable attention in the control literature thanks to its flexible real-time nature and powerful performance guarantees. In [J34], we propose new step-size rules and OCO algorithms that simultaneously exploit the structure of the problem and possible additional information available in the real-time implementation.
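As a point of reference for the OCO setting, the following is a minimal sketch of textbook projected online gradient descent with a step size proportional to 1/sqrt(t). It is not one of the step-size rules or algorithms proposed in [J34]; the function name, the Euclidean-ball constraint, and the toy stream of quadratic losses are all assumptions made for illustration.

```python
import numpy as np

def online_gradient_descent(grads, x0, radius, base_step=1.0):
    """Projected online gradient descent on a Euclidean ball.

    grads: list of callables; grads[t](x) returns the gradient of the
           round-t loss at x, revealed only after x has been played.
    Returns the array of decisions played in each round.
    """
    x = np.asarray(x0, dtype=float)
    played = []
    for t, grad in enumerate(grads, start=1):
        played.append(x.copy())                     # commit to a decision
        x = x - (base_step / np.sqrt(t)) * grad(x)  # gradient step
        norm = np.linalg.norm(x)
        if norm > radius:                           # project back onto the ball
            x = x * (radius / norm)
    return np.array(played)

# Toy stream (hypothetical): f_t(x) = 0.5 * ||x - c_t||^2 with noisy targets c_t.
rng = np.random.default_rng(0)
targets = np.array([1.0, -1.0]) + 0.1 * rng.standard_normal((200, 2))
grads = [lambda x, c=c: x - c for c in targets]
iterates = online_gradient_descent(grads, x0=np.zeros(2), radius=5.0)
```

Only the current decision is stored between rounds, which is what makes this family of methods attractive when memory is limited.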

One of the main goals of TRUST, envisioned in WP4, is to develop open-source codes that reproduce the numerical results and facilitate the use of the project outcomes in real-world applications. The codes of all the above publications are available via the publication link (https://www.dcsc.tudelft.nl/~mohajerin/pub_complete.html) as well as a dedicated GitHub page to ensure the reproducibility of the results. Besides the research directions and outcomes anticipated in TRUST, we have also made progress in other domains, including transportation and logistics, leakage detection in water distribution networks, and fault diagnosis in switched inverter-based microgrid systems. The following publications and manuscripts under review report this progress in detail, with a pointer to the specific subtasks in each of them.

================
Published articles:
================
[C40] "Fast Approximate Dynamic Programming for Infinite-Horizon Continuous-State Markov Decision Processes", Neural Information Processing Systems (NeurIPS), 2021, [arXiv:2102.08880]

[J23] "Distributionally Robust Inverse Covariance Estimation: The Wasserstein Shrinkage Estimator", Viet Anh Nguyen, Daniel Kuhn, and Peyman Mohajerin Esfahani, Operations Research (OR), 2021, [arXiv:1805.07194]

[J31] "Bridging Bayesian and Minimax Mean Square Error Estimation via Wasserstein Distributionally Robust Optimization", Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Daniel Kuhn, and Peyman Mohajerin Esfahani, Mathematics of Operations Research (MOR), vol. 48, no. 1, pp. 1-37, 2023, [arXiv:1911.03539]

[J34] "Adaptive Online Optimization with Predictions: Static and Dynamic Environments", Pedro Zattoni Scroccaro, Arman Sharifi Kolarijani, and Peyman Mohajerin Esfahani, IEEE Transactions on Automatic Control (TAC), vol. 68, no. 5, pp. 2906-2921, 2023, [arXiv:2205.00446]

[J35] "Fast Approximate Dynamic Programming for Input-Affine Dynamics", Amin Kolarijani and Peyman Mohajerin Esfahani, IEEE Transactions on Automatic Control (TAC), vol. 68, no. 10, pp. 6315-6322, 2023, [arXiv:2008.10362]

================
Manuscripts under review:
================
[P3] "Learning in Inverse Optimization: Incenter Cost, Augmented Suboptimality Loss, and Algorithms", Pedro Zattoni Scroccaro, Bilge Atasoy, and Peyman Mohajerin Esfahani, submitted for publication, May 2023, [arXiv:2305.07730]

[P4] "Nonlinear Distributionally Robust Optimization", Mohammad Rayyan Sheriff and Peyman Mohajerin Esfahani, submitted for publication, May 2023, [arXiv:2306.03202]

[P6] "Distributionally Robust Model Predictive Control: Closed-loop Guarantees and Scalable Algorithms", Robert D. McAllister and Peyman Mohajerin Esfahani, submitted for publication, September 2023, [arXiv:2309.12758]

[P8] "From Optimization to Control: Quasi Policy Iteration", Amin Kolarijani and Peyman Mohajerin Esfahani, submitted for publication, November 2023, [arXiv:2311.11166]

The results reported in the previous part have been published in the flagship venues in the technical disciplines of control, operations research, and machine learning. The common expectation for such publications is the extension of the existing results in the literature in a formal and rigorous manner. These extensions are expected to be numerically, and possibly experimentally, validated at a level applicable to real-world applications.

The primary focus of TRUST is on the methodological side. We have made progress in developing computational algorithms for control problems with an order of magnitude lower complexity than the state of the art. From the modeling viewpoint, we also propose simple data-driven models known as inverse optimization that, despite their simplicity, prove to be a strong data-driven tool for replicating highly complicated behavior emerging in real-world settings. We have also made progress in revealing new connections between the two rich algorithmic literatures of convex optimization and fixed-point problems. These connections allow us to propose new algorithms that outperform well-known existing methods such as (accelerated) value iteration.
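For readers less familiar with the baseline named above, standard value iteration repeatedly applies the Bellman operator until its fixed point is reached. The sketch below shows this on a hypothetical two-state, two-action discounted-cost problem; the array layout, function name, and toy numbers are assumptions for illustration, and the code is the classical baseline rather than any of the new algorithms developed in the project.

```python
import numpy as np

def value_iteration(P, c, gamma, tol=1e-10, max_iter=10_000):
    """Standard value iteration for a finite discounted-cost MDP.

    P: array (A, S, S); P[a, s, s2] is the probability of moving s -> s2 under action a.
    c: array (S, A); stage cost of taking action a in state s.
    Returns the value function and a greedy policy.
    """
    V = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = c[s, a] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = c + gamma * (P @ V).T
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:   # fixed point reached (numerically)
            return V_new, Q.argmin(axis=1)
        V = V_new
    return V, Q.argmin(axis=1)

# Toy problem (hypothetical): action 0 stays put, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
c = np.array([[1.0, 0.5],    # state 0: staying costs 1, switching costs 0.5
              [0.0, 0.5]])   # state 1: staying is free, switching costs 0.5
V, policy = value_iteration(P, c, gamma=0.9)
```

Each sweep is a contraction with factor gamma, which is why the convergence of this baseline slows down as the discount factor approaches one.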

On the application side, we have some preliminary results indicating that the tools developed in TRUST can be applied to, and compete with the state of the art in, several application domains including transportation, autonomous vehicles, and robotics. One example is the strong performance of our proposed framework in the Amazon Last Mile Routing Research Challenge [P5] (the citation follows the publication numbers on my personal homepage: https://www.dcsc.tudelft.nl/~mohajerin/pub_complete.html). In this challenge, the main goal is to learn models that replicate the routing preferences of human drivers, using thousands of real-world routing examples. Our proposed learning model achieves a score that ranks 2nd compared with the 48 models that qualified for the final round of the challenge.

================
Reference:
================
[P5] "Inverse Optimization for Routing Problems", Pedro Zattoni Scroccaro, Piet van Beek, Peyman Mohajerin Esfahani, Bilge Atasoy, submitted for publication, July 2023, [arXiv:2307.07357].
