## Periodic Reporting for period 3 - Mathador (Type and Proof Structures for Concurrent Software Verification)

Reporting period: 2020-04-01 to 2021-09-30

The key goal of this project is to design a theory of types for fine-grained shared-memory concurrent programs. Such a type theory is an integrated system in which one can write concurrent programs and their correctness proofs in a common language for programming and proving. The programs and the proofs can be combined, structured, and encapsulated, so that the types serve as interfaces that abstract the internal properties of both the concurrent programs and their proofs.

Having types as a common interface for both programs and proofs makes type theory a system for *compositional* specification of program properties. Indeed, type theory is widely recognized as a theory of composition in logics and programming languages. Up to now, however, type theory has only been applied to the sequential and purely functional programming model. The goal of this project is to apply it to the stateful and concurrent model (i.e. to shared-memory concurrency).

This is essential because the concurrent programming model is precisely where the need for compositional reasoning is greatest. The lack of composition in concurrency is a well-recognized and perennial problem, often considered one of the great challenges of theoretical computer science. Indeed, while a number of existing formal systems can verify that an individual concurrent algorithm or data structure is correct, it is much more difficult to verify their composition. To achieve this verification at scale, one must be able to reuse the proofs already developed for the individual programs. Without such reuse, the proof of the composition leads to a combinatorial explosion in the number and size of proof obligations.

Thus, designing a type theory for concurrency will address the key theoretical and practical challenge in the verification (and thus also the understanding) of concurrent software, which has been a long-standing problem in computer science. Such a type theory will give us a novel language in which to write concurrent programs so that they are amenable to verification. In a way, the key advance that the project aims for will do for concurrent programs and their proofs what the idea of structured programming did for sequential programs: make us write (concurrent) software in a way that makes it easier to understand, maintain, and verify.

Towards this goal, our key advances to date have been in formulating, and integrating into a common type-theoretic system, several mathematical concepts that enable the compositional development of concurrent programs and their proofs. More specifically, we have so far designed and introduced:

- Resources, which are a special form of state transition systems that we use as (part of) the types of concurrent programs.

- Morphisms between resources, which we use as a notion of function types. Morphisms act on programs to produce morphed programs.

- A special form of simulations, which we use to support proving mathematical theorems about resources, the programs that resources type, and morphed programs.

- An algebraic theory of Partial Commutative Monoids (PCMs). PCMs have been used before in concurrent and sequential separation logics as a way to formalize ghost state. We have now developed and put to practical use the associated notion of morphisms on PCMs. In turn, this led to the development of compatibility relations, which serve as the preconditions for such PCM morphisms and which form an abstract algebra that generalizes the idea of disjointness. We also developed and put to use a number of algebraic constructions on PCMs, PCM morphisms, and compatibility relations, such as the sub-PCM construction, morphism kernels and equalizers, and morphism composition and restriction.
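
To make the last item more concrete, the following is a minimal sketch in Python, not taken from the project's Coq libraries. It assumes a simplified reading of the morphism law: whenever the join of `x` and `y` is defined and the two are compatible, the images also join, and the morphism distributes over the join. All names (`heap_join`, `set_join`, `values_of`, `compatible`, `morphism_law_holds`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Optional

# None models the undefined result of a partial join.
Heap = Optional[Dict[str, int]]
ValSet = Optional[FrozenSet[int]]

HEAP_UNIT: Dict[str, int] = {}          # unit of the heap PCM
SET_UNIT: FrozenSet[int] = frozenset()  # unit of the value-set PCM


def heap_join(x: Heap, y: Heap) -> Heap:
    """Source PCM: finite maps under disjoint union (undefined on key overlap)."""
    if x is None or y is None or x.keys() & y.keys():
        return None
    return {**x, **y}


def set_join(a: ValSet, b: ValSet) -> ValSet:
    """Target PCM: finite sets under disjoint union (undefined on overlap)."""
    if a is None or b is None or a & b:
        return None
    return a | b


def values_of(x: Heap) -> ValSet:
    """Candidate morphism: map a heap to the set of values it stores."""
    return None if x is None else frozenset(x.values())


def compatible(x: Dict[str, int], y: Dict[str, int]) -> bool:
    """Assumed compatibility relation: the two heaps store disjoint values."""
    return not (set(x.values()) & set(y.values()))


def morphism_law_holds(x: Dict[str, int], y: Dict[str, int]) -> bool:
    """Check the assumed law for one pair of heaps."""
    joined = heap_join(x, y)
    if joined is None or not compatible(x, y):
        return True  # the law imposes no obligation on this pair
    image = set_join(values_of(x), values_of(y))
    return image is not None and values_of(joined) == image


h1, h2, h3 = {"a": 1}, {"b": 2}, {"c": 1}
print(heap_join(h1, h2))                         # keys are disjoint, so defined
print(heap_join(h1, {"a": 5}))                   # key overlap, so undefined
print(compatible(h1, h3))                        # both heaps store the value 1
print(set_join(values_of(h1), values_of(h3)))    # images do not join
print(morphism_law_holds(h1, h2))
```

In this sketch, `h1` and `h3` join as heaps because their keys are disjoint, yet their images do not join because both store the value 1. The compatibility relation is exactly the extra precondition that rules out such pairs, which is the sense in which it strengthens plain disjointness.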

The development of resources falls under WP1 of the proposal. The development of resource morphisms and simulations falls under WP2. The development of the algebra of PCMs falls under WP3.

A paper describing the first three of the above notions in detail has been published at the OOPSLA 2019 conference.

We have applied these ideas (in their preliminary form) to formally verify some representative and challenging benchmark examples, such as concurrent stack data structures, several different variants of locks including a flat combiner structure, some non-linearizable data structures, and concurrent graph structures. All of these have been verified formally in our extension of the Coq theorem prover with concurrency, and we have made the software artifacts publicly available.

For example, the library for formally reasoning about Partial Commutative Monoids (PCMs) has been released as a stand-alone package that may be useful to projects beyond concurrency verification: https://sympa.inria.fr/sympa/arc/coq-club/2018-04/msg00111.html

The metatheory behind the designed system, as well as all the examples, is available online at: https://doi.org/10.5281/zenodo.3365991 and http://software.imdea.org/fcsl/

Our verification of the algorithm for snapshotting an array (originally due to Prasad Jayanti), which exemplifies our general type-theoretic methodology, has been published at the ECOOP 2017 conference and falls under WP4 of the proposal.

A significant amount of work remains to be done.

- In the immediate future, we plan to prepare for publication the research ideas that have been mathematically developed and formalized in the Coq theorem prover (with the proofs available at the above links), but have not yet been published. Chief among them is the development of the algebra of PCMs, mentioned above. We are currently preparing for publication the idea that the algebra of PCMs can explain the concept of framing (a key concept in separation logic) as a standard algebraic construction of a Galois connection.

- We tested our mathematical theories by formalizing a number of challenging benchmark examples in our extension of the proof assistant Coq (some completed examples have already been mentioned above). In this process we learned several lessons that we plan to build on in the future.

For example, we noticed that the traditional approach to the verification of concurrent programs based on so-called "linearization points" leads to formal proofs that are sub-optimal and quite bulky. One can develop such formal proofs, but in practice, they are filled with tedious book-keeping details that obscure the mathematical essence of the underlying argument.

We are now finding ways to reorganize these proofs by decomposing them into reusable libraries. In particular, we have found that if the proofs are based not on linearization points but on the idea of a visibility relation, they decompose better and enable more reuse. Visibility relations have been a common abstraction in distributed systems, but have seen little use in the verification of concurrent structures (a notable exception is the work on "Aspect-oriented Linearizability Proofs" by Chakraborty, Henzinger, Sezgin and Vafeiadis from CONCUR'13 and LMCS Vol 11, 2015). While Chakraborty et al. apply the method of visibility relations to concurrent queues, we will apply it to other data structures and develop the general constructions underpinning it. We have already carried out the exercise for snapshot algorithms. As an outcome, we expect to develop a theory of visibility that will mesh well with the theory of types, resources, and morphisms that we have already developed.

The bulk of the remaining project effort will be focused in this direction.

- The development of resource morphisms that we carried out for concurrency is very general, and should apply to notions of effect other than shared-memory concurrency.

Thus, we are also working on applying these concepts to develop a type theory of algebraic effects. Algebraic effects have recently been proposed as a flexible abstraction for programming with effects, with very pleasing mathematical properties. However, it has been somewhat challenging to design a type system that can closely track these effects in the types, for the purpose of statically ensuring that they are used correctly. We expect to be able to contribute to designing such a type system, as an offshoot of the main project.
