CORDIS - EU research results

Strong Modular proof Assistance: Reasoning across Theories

Periodic Reporting for period 4 - SMART (Strong Modular proof Assistance: Reasoning across Theories)

Reporting period: 2021-09-01 to 2022-08-31

Formal proof technology delivers an unparalleled level of certainty and security. Nevertheless, applying proof assistants to the verification of complex theories and designs is still extremely laborious. High-profile certification projects, such as seL4, CompCert, and Flyspeck, require tens of person-years. We recently demonstrated that this effort can be significantly reduced by combining reasoning and learning in so-called hammer systems: 40% of the Flyspeck, HOL4, Isabelle/HOL, and Mizar top-level lemmas can be proved automatically.

In the project we aimed to develop stronger systems that combine automated reasoning with artificial intelligence. To this end, we worked on combining and reusing several hammer components. The four main work packages of the project aimed to develop and improve: (a) uniform learning methods, (b) reusable ATP encoding components for different foundational aspects, (c) integration of proof reconstruction, and (d) methods for knowledge extraction, reuse and content merging. The combination of these methods has improved the efficiency of AI and automated reasoning in several ITP systems.
The project has mostly been executed as planned. We made more progress on objectives (a) and (d), and less progress on objective (c), than we initially envisioned.

For objective (a), we have developed a first version of the CIC0 logic, which combines data from the various type-theoretic systems. We have investigated the various machine learning tasks that arise in theorem proving and created benchmarks for them. We have also worked on characterizations of mathematical knowledge that are better suited to machine learning methods, and improved neural network methods tailored to theorem proving data and tasks.

For objective (b), we have developed a first version of the CoqHammer translation, along with improvements to it for several different Coq libraries. Its performance is much better than expected on Coq's standard library, but we achieved much less than we expected on the Mathematical Components Library and other developments that rely on similar foundations. For hammering set theory, we have developed the Isabelle/Mizar object logic, moved important parts of Mizar knowledge to that foundation, and created isomorphisms between the concepts in the object logics. We have developed a number of benchmarks for proof assistant ATP methods, including formalizations of category theory in Coq and game theory in Isabelle. Sledgehammer for Isabelle/HOL has been improved by the integration of the learned ATP Enigma.

For objective (c), we have developed first proof certification methods in Coq by extending the Ben-Yelles term synthesis algorithm and adding heuristic rewriting. This constitutes the first general proof reconstruction mechanism for intuitionistic type theory. We have not, however, managed to create a reconstruction mechanism that uses any information from the found proofs beyond the premises used or the unsatisfiable core reported by SMT solvers. Furthermore, we have developed tactic-prediction models and integrated them into several ITPs.

For objective (d), we have extended the alignments between proof assistant libraries from statistical ones to neural ones, in particular constructing an alignment of six proof assistants. We have also developed first statistical and deep auto-formalization systems.
We have developed several new proof advice systems, including CoqHammer, the first proof advice system for a prover based on intuitionistic type theory. We have used combinations of proof libraries to improve the learning part of the provers. In particular, the learning component of CoqHammer, HOLyHammer and MizAR is shared. Because the systems are based on different logics, their translations differ, but they can all use learning at the level of the underlying ATPs, including the Enigma system that we improved in the project. The learning techniques are also useful beyond theorem proving: the proposed graph neural network changes improve on the general machine learning problem of graph isomorphism. The various formal proof developments aimed at testing strong proof automation also advance the state of the art in formalization. Finally, the developed techniques have also given rise to improvements in program synthesis.
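To illustrate what a learning component shared across hammers might look like, the following is a minimal sketch of k-nearest-neighbour premise selection, a simple technique of the kind used for this task in hammer-style tools. It is not the project's implementation: the function name `select_premises`, the toy `LIBRARY`, the feature sets, and the IDF weighting are all assumptions made for illustration. The point is that the learner only sees symbolic features and proof dependencies, so the same code could in principle serve provers with different logics, while the logic-specific translation to the ATP remains separate.

```python
from collections import Counter
from math import log


def select_premises(goal_features, library, k=2, n_premises=3):
    """Rank library facts as candidate premises for a goal (hypothetical sketch).

    library maps a lemma name to (feature set, list of premises used in its proof).
    """
    n = len(library)
    # Inverse document frequency: features shared by few lemmas weigh more.
    doc_freq = Counter(f for feats, _ in library.values() for f in feats)
    idf = {f: log(n / c) + 1.0 for f, c in doc_freq.items()}

    def similarity(feats):
        return sum(idf[f] for f in goal_features & feats)

    # The k proved lemmas whose features are closest to the goal's features.
    nearest = sorted(
        library, key=lambda name: similarity(library[name][0]), reverse=True
    )[:k]

    scores = Counter()
    for name in nearest:
        feats, premises = library[name]
        sim = similarity(feats)
        if sim == 0:
            continue
        scores[name] += sim  # a similar lemma is itself a candidate premise
        for p in premises:
            scores[p] += sim / len(premises)  # so are the facts its proof used
    return [name for name, _ in scores.most_common(n_premises)]


# Toy data, invented for this example only.
LIBRARY = {
    "add_comm": ({"add", "nat", "eq"}, ["add_zero", "add_succ"]),
    "mul_comm": ({"mul", "nat", "eq"}, ["mul_zero", "add_comm"]),
    "append_nil": ({"append", "list", "eq"}, ["list_induct"]),
}

ranked = select_premises({"add", "nat", "eq"}, LIBRARY)
print(ranked)
```

In this sketch the selected names would then be handed to a prover-specific translation, which encodes the corresponding statements for an external ATP.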