Periodic Reporting for period 1 - TRUST-ML (Trust-ML: An Optimization-based Platform for Building Trust in Machine Learning Models used for Power Systems)
Reporting period: 2022-06-15 to 2024-06-14
A complementary verification approach, built on advanced solvers from the ML community, enabled multiple verification problems to be solved simultaneously (e.g. checking for the violation of all line flow constraints at once rather than solving a separate verification problem for each constraint). For that, we introduced an exact transformation that converts the "worst-case" violation across a set of potential violations into a series of ReLU-based layers that augment the original neural network. This allows verifiers to interpret these outputs directly. Additionally, since power system ML models often must be verified to satisfy power flow constraints, we proposed a dualization procedure which encodes linear equality and inequality constraints directly into the verification problem, in a manner which is mathematically consistent with the specialized verification tools. To demonstrate these innovations, we verified problems associated with data-driven security-constrained DC-OPF solvers. We built and tested our first set of innovations using the α,β-CROWN solver, and we benchmarked against Gurobi 10.0. Our contributions achieved speedups that can exceed 100x and allow a higher degree of verification flexibility.
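To illustrate how a worst-case violation can be written with ReLU operations only, the sketch below uses the standard identity max(a, b) = b + ReLU(a - b) and folds it over a vector of violations. This is a minimal illustration under our own assumptions, not necessarily the exact transformation developed in the project; the function names and the violation values are hypothetical.

```python
import numpy as np

def relu(x):
    """Elementwise rectified linear unit."""
    return np.maximum(x, 0.0)

def max_via_relu(values):
    """Maximum of a 1-D array using only additions and ReLUs.

    Each loop iteration corresponds to one extra ReLU-based layer
    appended to a network whose outputs are the entries of `values`.
    """
    values = np.asarray(values, dtype=float)
    running = values[0]
    for v in values[1:]:
        # Identity: max(a, b) = b + ReLU(a - b)
        running = v + relu(running - v)
    return running

# Hypothetical per-line flow violations (positive = constraint violated).
violations = np.array([-0.30, 0.12, -0.05, 0.07])
worst_case = max_via_relu(violations)
print(worst_case)
```

With a single scalar output of this kind, one verification query of the form "is the worst-case violation positive?" covers every constraint in the set.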
On top of verification, this project also designed scalable techniques to collect high-fidelity, maximally representative training data for ML model construction. To that end, we designed two new approaches. In the first approach, we performed a systematic investigation into the various nonlinear objective functions which can be used to explore the feasible space associated with the optimal power flow (OPF) problem. A total of 40 nonlinear objective functions were tested, and their results were compared to the data generated by a novel exhaustive rejection sampling routine. The Hausdorff distance, which is a min-max set dissimilarity metric, was then used to assess how well each nonlinear objective function performed (i.e. how well the tested objective functions were able to explore the nonconvex power flow space). Exhaustive test results were collected from five PGLib test cases and systematically analyzed. In the second approach, using bilevel optimization, we introduced a data collection routine that sequentially solves for optimal power flow solutions which are “optimally far” from previously acquired voltage, power, and load profile data points. The routine, termed RAMBO, samples critical data close to a system’s boundaries much more effectively than a random sampling benchmark. Simulated test results were collected on the 30-, 57-, and 118-bus PGLib test cases.
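The Hausdorff distance used in the first approach can be sketched for two finite point sets as follows. This is a minimal illustration assuming a Euclidean metric and small sets that fit in memory; the function name and the sample points are hypothetical and are not taken from the project's data.

```python
import numpy as np

def hausdorff_distance(A, B):
    """Symmetric Hausdorff distance between two finite point sets.

    A and B are arrays of shape (n_points, n_dims). The Euclidean
    metric is an assumption made for this sketch.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Pairwise distances, shape (len(A), len(B)).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    # For each point, the distance to the nearest point of the other
    # set (min), then the worst such distance (max).
    a_to_b = d.min(axis=1).max()
    b_to_a = d.min(axis=0).max()
    return max(a_to_b, b_to_a)

# Hypothetical reference samples (e.g. from rejection sampling) and a
# candidate data set that only covers part of the space.
reference = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
candidate = np.array([[0.0, 0.0], [1.0, 0.0]])
gap = hausdorff_distance(reference, candidate)
print(gap)
```

A smaller value indicates that the candidate set leaves no reference point far from its nearest sample, which is the sense in which the metric measures how well a sampling strategy explores the space.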