CORDIS - EU research results

Large-Scale Formal Proof for the Working Mathematician

Periodic Reporting for period 4 - ALEXANDRIA (Large-Scale Formal Proof for the Working Mathematician)

Reporting period: 2022-03-01 to 2023-08-31

Mathematics lies at the heart of science and technology, affecting our daily lives in everything from aircraft to finance. However, as the world of mathematics becomes ever more abstract and complex, questions of correctness become pressing. Proofs can be hundreds of pages long and even the greatest minds make serious errors.

Our solution is to involve interactive theorem provers (specifically, Isabelle/HOL): we encode the rules of logic in software, eliminating human error. But can it work? What about the effort needed to convince the software of "mathematically obvious" facts? Can we encompass the immense world of mathematics within the rigid framework of a formal logic? How do we manage—and search—the huge libraries of mathematics already codified? And these libraries are still incomplete.

We—mathematicians and computer scientists working together—addressed these issues through several parallel approaches. One was to consolidate and extend the Isabelle/HOL libraries for analysis, algebra, etc. Another was to do pilot studies in the formalisation of a wide variety of mathematical topics, concerning infinite series, quantum computing, number theory and much more, including recent results. Yet another focused on verified techniques for doing algebraic computations, such as root isolation.

ALEXANDRIA aimed to create a proof environment and methodology useful to working mathematicians, utilising the best technology available across computer science and focused on the management and use of large-scale mathematical knowledge, both theorems and algorithms. The project objectives included:

• The formalisation of core mathematical knowledge
• Codification of core mathematical algorithms, including proofs of their correctness
• Tools for searching and applying this knowledge, based on AI and language models
• Linking this knowledge to the mathematics literature

By the end we discovered that we could indeed make it work, in every study we tried. Our software of choice, Isabelle/HOL, allowed us to write proofs that were formal yet human readable, and with reasonable effort.

At the start, the two mathematicians on the team selected small topics to formalise. This allowed them to familiarise themselves with the advantages and limitations of Isabelle/HOL and to plan their next steps. They formalised results on advanced mathematical topics, including quantum computation and irrationality and transcendence criteria for infinite series. We found several errors in mathematics papers.

We worked heavily on consolidating and reorganising our libraries and formalised mathematics, producing a user manual identifying the main topics in the Analysis library's 150K proof lines and 95 formal theories. A substantial amount of new material was added to the libraries, covering advanced analysis and algebra: many tens of thousands of lines of formal proofs.

The proposal included the speculative aim of using AI to support users. We developed a tool to perform intelligent search in our mathematical libraries through natural language queries, achieved by indexing all four million lines of Isabelle's Archive of Formal Proofs. We also wanted something like GitHub's Copilot, which could make suggestions based on material found in existing proofs. We needed time to realise these ideas concretely, but towards the end, the project achieved impressive results involving the use of AI to generate proofs, e.g. automatic formalisation (that is, translating normal mathematical text into our formalism).

Verified Computer Algebra is another project focus. Computers have long been able to perform algebraic manipulations, but much of this software is not trustworthy. We developed advanced, formally verified algorithms for root-finding and other operations on polynomials.
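
To give a flavour of the kind of polynomial computation involved, here is a minimal, unverified Python sketch that counts the distinct real roots of a polynomial in an interval using the classical Sturm sequence. It is an illustration only: Sturm's theorem is chosen as a well-known technique, not as a description of the algorithms the project verified, and all function names (sturm_chain, count_real_roots and so on) are hypothetical. Polynomials are assumed to be lists of rational coefficients, lowest degree first, and the interval endpoints are assumed not to be roots.

```python
from fractions import Fraction


def trim(p):
    """Return a copy of p without trailing zeros (lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Formal derivative of the polynomial p."""
    return trim([i * c for i, c in enumerate(p)][1:])


def remainder(a, b):
    """Remainder of dividing a by the nonzero polynomial b over the rationals."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a)  # the leading term cancels exactly with Fractions
    return a


def sturm_chain(p):
    """Sturm sequence p, p', -rem(p, p'), ... up to the last nonzero term."""
    p = trim([Fraction(c) for c in p])
    chain = [p, derivative(p)]
    while chain[-1]:
        r = remainder(chain[-2], chain[-1])
        chain.append([-c for c in r])
    return chain[:-1]  # drop the trailing zero polynomial


def evaluate(p, x):
    """Evaluate p at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def sign_changes(chain, x):
    """Number of sign changes in the chain evaluated at x, ignoring zeros."""
    values = [v for v in (evaluate(q, x) for q in chain) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))


def count_real_roots(p, a, b):
    """Count distinct real roots of p in the interval (a, b).

    Assumes a < b and that neither endpoint is a root of p.
    """
    chain = sturm_chain(p)
    a, b = Fraction(a), Fraction(b)
    if evaluate(chain[0], a) == 0 or evaluate(chain[0], b) == 0:
        raise ValueError("endpoints must not be roots of the polynomial")
    return sign_changes(chain, a) - sign_changes(chain, b)


# x^3 - x has the three real roots -1, 0 and 1.
print(count_real_roots([0, -1, 0, 1], -2, 2))
```

Exact rational arithmetic is used so that the sign tests are not affected by rounding. A sketch like this is still only as trustworthy as its author, which is precisely the gap that formally verified algorithms are meant to close.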

As the project progressed, we tackled more and more ambitious pieces of mathematics, all fairly recent (within the past half century). Separate projects on irrational convergent theories, ordinal partitions and Grothendieck schemes ended up taking over half of a special issue announced by the journal Experimental Mathematics. We tackled advanced topics in extremal graph theory, block designs, additive combinatorics and higher-order category theory, all relevant to today's mathematics.

The theorems we formalised include Szemerédi's regularity lemma, Roth's theorem on arithmetic progressions, Lucas's theorem, Fisher's inequality, the Plünnecke–Ruzsa inequality, Kneser's theorem, the Cauchy–Davenport theorem, Khovanskii's theorem, the Balog–Szemerédi–Gowers theorem and much more.
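
For orientation, one of the results listed above can be stated compactly. The following is the standard textbook form of the Cauchy–Davenport theorem, given purely as an illustration and not taken from the project's Isabelle/HOL formalisation. The symbols are chosen for this statement only: p is a prime, and A and B are nonempty subsets of the integers modulo p.

```latex
% Cauchy–Davenport theorem (standard statement; notation is illustrative).
% p is a prime; A and B are nonempty subsets of Z/pZ.
\[
  |A + B| \;\ge\; \min\bigl(p,\; |A| + |B| - 1\bigr),
  \qquad
  A + B = \{\, a + b : a \in A,\ b \in B \,\}.
\]
```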

• Our main conclusion is that there is no clear limit to what sort of mathematics can be formalised. We have formalised material across the mathematical landscape: combinatorics, analysis, number theory, Ramsey theory. By the end we had formalised the work of some of the greatest mathematicians of our day—Erdős, Gowers, Roth, Szemerédi—and uncovered numerous small errors.

• Another key conclusion: Isabelle/HOL relies on a relatively simple formalism, higher-order logic, thought by some to be unsuitable. This view is now disproved. And the simple formalism has strong advantages: (1) fewer technical quirks to entrap users, (2) better automation, hence better productivity, and (3) legible formal proofs, in which the original mathematical ideas remain visible.

The project developed methodologies that could be applied by mathematicians today, although a major effort of persuasion remains. For that, members of the team have taken on numerous speaking engagements. We launched a seminar series in the Cambridge mathematics department and have delivered lectures in universities and at high-profile conferences.

The project produced 63 outputs, including 13 journal articles, 15 conference papers (some at highly selective AI conferences), 2 book chapters, and 32 contributions to Isabelle's Archive of Formal Proofs, with more coming.

Dissemination is also taking place over the Internet: the project webpage includes a comprehensive list of accomplishments, and the PI's blog discusses aspects of the underlying ideas for a more general mathematical audience.

We have achieved the first ever formalisations of many results, across the spectrum of mathematical topics. We are the first to formalise the existence of an algebraic closure for an arbitrary field, and the first to formalise additive combinatorics, block design theory, ordinal partition theory and many other advanced topics. We proved the appropriateness of Isabelle/HOL as a software tool for mathematicians, and we introduced new methodologies for working with large hierarchies of definitions, as in Grothendieck schemes.

Our work on AI and language models, e.g. to automatically formalise mathematical material, defines the state of the art. We can soon expect to see a new class of user support based on the ability to find relevant material, both theorems and reusable fragments of proofs. The next step is the evaluation and refinement of our search engine in a broader user setup. Our intelligent search efforts required the development of considerable infrastructure (i.e. information extraction), which benefits the broad machine learning aspects of the project.

Figures:
• Graph of theory library imports, to drive machine learning
• A landmark development: Grothendieck schemes
• Graph of fact relationships, to drive machine learning