A Semantic Foundation for Persistent Programming

Project Information

PERSIST

Grant agreement ID: 101003349

DOI

10.3030/101003349

EC signature date 23 February 2021

Start date 1 March 2021

End date 28 February 2026

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 1 999 300,00

EU contribution

€ 1 999 300,00

Coordinated by

MAX-PLANCK-GESELLSCHAFT ZUR FORDERUNG DER WISSENSCHAFTEN EV
Germany

Periodic Reporting for period 2 - PERSIST (A Semantic Foundation for Persistent Programming)

Reporting period: 2022-09-01 to 2024-02-29

The project concerns the correctness of computer applications that operate on data that outlives the program's execution, such as financial transactions and medical records. Traditionally, such data has been stored on hard disks, possibly distributed across the cloud. However, this data may well be stored differently in the future if non-volatile memory (NVM) catches on. NVM is an emerging technology that provides access to persistent storage (storage that preserves its contents after a crash or a power failure) orders of magnitude faster than hard disks, and thus has the potential to make applications run much faster.

Nevertheless, correctly using NVM is extremely difficult because its programming model rests on very shaky foundations. The persistency semantics of the mainstream architectures (x86, Arm, etc.) is unclear and full of counterintuitive behaviours. Errors in NVM programs can lead to data corruption and even to irrecoverable corruption of the internal program state, which cannot be fixed by rebooting the machine.

In response, PERSIST's goal is to develop a solid mathematical basis for determining the possible outcomes of persistent programs and reasoning about their correctness. More specifically, we aim to produce:

1. Formal persistency models for mainstream hardware architectures and programming languages,
2. Firmly-grounded higher-level abstractions to ease persistent programming, and
3. Effective testing and verification techniques for persistent programs.

So far, we have made significant breakthroughs on all three fronts.

First, we have developed a formal model of Intel-x86's consistency and persistency semantics. This involved formalizing advanced architectural features that were seldom used prior to the advent of NVM. These features enable programmers to bypass the usual cache coherence mechanisms provided by the architecture in order to gain more predictable persistency semantics and/or higher performance in specific scenarios (e.g. when some data is accessed only once).
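To give a feel for the kind of question a persistency model answers, the following is a minimal sketch of a toy model, far simpler than the formal x86 model and not derived from it. It assumes that stores to different locations may reach NVM in any order, that stores to the same location persist in program order, and that a hypothetical synchronous `flush` operation forces all earlier stores to its location into NVM. The function `post_crash_states` and the operation names are illustrative only.

```python
from itertools import product

def post_crash_states(program, locations):
    """Enumerate every NVM state a crash could leave behind (toy model).

    Assumptions of this sketch, not of any real architecture:
    - every location initially holds 0 in NVM;
    - a store may or may not have persisted by the time of the crash;
    - stores to one location persist in program order;
    - ("flush", loc) blocks until all earlier stores to loc have persisted.
    """
    states = set()
    # A crash may happen before any operation, between operations, or at the end.
    for crash_point in range(len(program) + 1):
        executed = program[:crash_point]
        candidates = []
        for loc in locations:
            values = [0]  # values that NVM might hold for this location
            for op in executed:
                if op[1] != loc:
                    continue
                if op[0] == "store":
                    values.append(op[2])   # new value may or may not persist
                elif op[0] == "flush":
                    values = values[-1:]   # older values are now overwritten in NVM
            candidates.append(values)
        # Locations persist independently, so take every combination.
        for combo in product(*candidates):
            states.add(combo)
    return states

LOCATIONS = ("data", "flag")

# Write the payload, then set a flag announcing that the payload is valid.
unflushed = [("store", "data", 42), ("store", "flag", 1)]
flushed = [("store", "data", 42), ("flush", "data"), ("store", "flag", 1)]

print(sorted(post_crash_states(unflushed, LOCATIONS)))
print(sorted(post_crash_states(flushed, LOCATIONS)))
```

In this toy model the unflushed program can leave `(data, flag) = (0, 1)` in NVM, that is, a flag announcing data that never persisted, whereas the flushed variant rules that state out. Real architectures offer several flush and fence instructions with subtler ordering guarantees, which is part of what a formal model has to pin down.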

Second, we have developed a framework for specifying the correctness of persistent libraries, such as persistent sets, and a proof technique for establishing one of the most common correctness conditions for such libraries ("durable linearizability"). The crux of our technique is to reuse an existing correctness proof for a version of the abstraction without the persistency guarantees and only prove that the additional code relevant for persistency is actually sufficient for establishing the property of interest.

Finally, we have made several contributions concerning the automated verification ("model checking") of concurrent and persistent programs. For instance, we have developed TruSt, a first truly stateless model checking algorithm that employs optimal dynamic partial order reduction and is parametric in the choice of consistency/persistency model. As such, it can be applied directly to implementations of higher-level persistent data structures. In follow-up work, we have considered combining TruSt with preemption bounding, a technique for greatly enhancing the scalability of model checking at the expense of losing completeness. Besides such enumerative verification approaches, we have also applied SMT-based model checking approaches to verify persistency invariants of Px86 programs, but have noticed that the enumerative approaches tend to work better for the verification of persistent programs.
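The idea of enumerative verification and the trade-off made by preemption bounding can be illustrated with a deliberately naive sketch. It is not TruSt: it enumerates every interleaving of two threads under sequential consistency, with no partial order reduction and no persistency, and all names (`schedules`, `preemptions`, `run`, `explore`, the thread ids `A` and `B`) are hypothetical. A preemption is counted here as a switch away from a thread that still has operations left to run.

```python
def schedules(n_a, n_b):
    """Yield every interleaving of two threads as a string of thread ids."""
    if n_a == 0 and n_b == 0:
        yield ""
        return
    if n_a > 0:
        for rest in schedules(n_a - 1, n_b):
            yield "A" + rest
    if n_b > 0:
        for rest in schedules(n_a, n_b - 1):
            yield "B" + rest

def preemptions(schedule, sizes):
    """Count switches away from a thread that has not yet finished."""
    done = {t: 0 for t in sizes}
    count = 0
    for prev, cur in zip(schedule, schedule[1:]):
        done[prev] += 1
        if cur != prev and done[prev] < sizes[prev]:
            count += 1
    return count

def run(schedule, threads):
    """Execute one schedule under sequential consistency; return (r1, r2)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    pc = {t: 0 for t in threads}
    for t in schedule:
        op = threads[t][pc[t]]
        pc[t] += 1
        if op[0] == "write":          # ("write", location, value)
            mem[op[1]] = op[2]
        else:                         # ("read", location, register)
            regs[op[2]] = mem[op[1]]
    return (regs["r1"], regs["r2"])   # specific to the example program below

def explore(threads, bound=None):
    """Run all schedules (optionally preemption-bounded); collect outcomes."""
    sizes = {t: len(ops) for t, ops in threads.items()}
    explored, outcomes = 0, set()
    for s in schedules(sizes["A"], sizes["B"]):
        if bound is not None and preemptions(s, sizes) > bound:
            continue
        explored += 1
        outcomes.add(run(s, threads))
    return explored, outcomes

# Classic "store buffering" shape: each thread writes one variable, reads the other.
SB = {
    "A": [("write", "x", 1), ("read", "y", "r1")],
    "B": [("write", "y", 1), ("read", "x", "r2")],
}

print(explore(SB))            # all schedules
print(explore(SB, bound=0))   # only schedules without preemptions
```

On this example the unbounded search runs 6 schedules but finds only 3 distinct outcomes, which hints at the redundancy that partial order reduction targets. With a bound of 0 only 2 schedules are run and the outcome `(1, 1)` is missed, which shows concretely how bounding trades completeness for scalability.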

We have also worked on testing the persistency semantics of CPU implementations, but do not yet have publications on this line of work.

In the remainder of the project, we expect two major results:
1. To define persistency semantics at the level of a programming language like C/C++, and establish correctness of compilation to the hardware persistency models.
2. To enhance the scalability of model checking for programming patterns that appear frequently in persistent programs.
