Proof assistants are increasingly used to verify hardware and software and to formalize mathematics. However, despite the success stories, they remain very laborious to use. The situation has improved with the integration of first-order automatic theorem provers -- superposition provers and SMT (satisfiability modulo theories) solvers -- through middleware such as Sledgehammer for Isabelle, codeveloped by the PI; but this research has now reached the point of diminishing returns. Only so much can be done when viewing automatic provers as black boxes.
To make interactive verification more cost-effective, we propose to deliver very high levels of automation to users of proof assistants by fusing and extending two lines of research: automatic and interactive theorem proving. This is our grand challenge. Our starting point is that first-order automatic provers are the best tools available for performing most of the logical work. Our approach will be to enrich superposition and SMT with higher-order reasoning in a careful manner, to preserve their desirable properties. We will design proof rules and strategies, guided by representative benchmarks from interactive verification.
With higher-order superposition and higher-order SMT in place, we will develop highly automatic provers building on modern superposition provers and SMT solvers, following a novel stratified architecture. To reach end users, these new provers will be integrated in proof assistants, including Coq, Isabelle, and the TLA+ Proof System, and will be available as backends to more specialized verification tools. The users of proof assistants and similar tools stand to experience substantial productivity gains: In the past five years, the success rate of automatic provers on interactive proof obligations from a representative benchmark suite has risen from 47% to 77%; with this project, we aim at 90%--95% proof automation.
Fields of science
Funding SchemeERC-STG - Starting Grant
1081 HV Amsterdam
See on map
78153 Le Chesnay Cedex
See on map
54052 Nancy Cedex
See on map