Periodic Reporting for period 1 - VERLAN (Verification and Language Theory)
Reporting period: 2021-09-01 to 2023-08-31
The main conclusions of the action are threefold.
(1) Where refactoring is concerned, it was discovered that program behavior without "goto" statements is closely connected to a recently solved problem in process algebra.
(2) For equivalence checking, it was shown that several fragments of programming languages admit tractable equivalence checking.
(3) For learning, it was found that restricted behavior may come with a more complex representation, which encumbers learning algorithms.
Refactoring: The action looked into the feasibility of certain refactoring operations. In limited settings, such as one where the packet scheduling behavior of a network device must be organized to work on particular hardware, it was shown that refactoring is possible and in fact feasible (this result was published at OOPSLA 2024). Progress was also made towards a characterization of program behavior that *requires* the use of non-local control flow (e.g. "goto" statements), and thus cannot be refactored to avoid it (some of these results were published at ESOP 2022; other work is currently under submission at a conference).
Equivalence checking: The possibility of verifying the correctness of a refactoring operation after the fact was researched, and this was also found to be feasible in limited cases (this result is currently under submission at a conference). Moreover, several fragments of programming languages were identified, including the parsing sub-language of P4, a programming language for network switches, and a limited subset of traditional ("while-based") programming languages that include probabilistic behavior. For these fragments, tractable equivalence checking algorithms were developed that rely on the simplification that arises from looking only at the fragment (these results were published at PLDI 2021 and ICALP 2023, respectively).
Learning: Because restricted fragments of programming languages may have a more complex representation, no new learning algorithms were found. However, an existing learning algorithm due to Angluin was generalized to learn program behavior where certain equations hold, which constitutes a limited fragment (this result was published at CMCS 2022).
For refactoring, the project demonstrated that certain methods of programmable packet scheduling can be implemented, and then refactored to run on fixed hardware. This strikes a balance between flexibility (in terms of scheduling algorithms that can be programmed) and efficiency (by the use of special-purpose hardware). If programmable packet scheduling takes off, this development may find its way into industrial network hardware.
Within the same theme, our progress towards a characterization of program behavior that requires non-local control flow is general, and can indeed be used to argue similar properties of programming languages that contain non-traditional or unusual primitives for control flow. This is the first time such a result goes beyond existing and well-known control flow constructs such as "if" and "while". Developments here are mainly theoretical at the time of reporting, but may pay practical dividends later on.
For equivalence checking, advances were made in verifying the correctness of refactoring post hoc. This is the first time an algorithm was developed for this problem at the level of control flow verification. The algorithm may be applied as a way to validate complex program transformations that can be hard to verify in general, as well as in decompilation. Other algorithms developed during the project, such as the equivalence checking algorithm for while-based programs with probabilistic behavior, could have similar applications within probabilistic programming.
The learning algorithm developed goes beyond an earlier variation of Angluin's algorithm, which learns program behavior represented in the form of trees; in fact, our work provides a common perspective on both Angluin's classical algorithm and this variation. The framework that covers this algorithm can be instantiated to learn different types of program behavior.