Natural Program Repair

Periodic Reporting for period 1 - NATURAL (Natural Program Repair)

Reporting period: 2021-02-01 to 2022-07-31

Automatic bug fixing, i.e. the idea of having programs that fix other programs, is a long-standing dream that is increasingly embraced by the software engineering community. Indeed, despite the significant effort that humans put into reviewing code and running software test campaigns, programming mistakes slip by, with severe consequences. Fixing those mistakes automatically has recently been the focus of a number of potentially promising techniques. Proposed approaches are however recurrently criticized as being shallow (i.e. they mostly address unit test failures, which are often neither hard nor important problems).
Initial successes in automatic bug fixing are based on scenarios such as the following: once a bug is localized, candidate patches are generated automatically and repeatedly, through trial and error, until a valid patch is produced. Developers can then revise the produced patch. While the reported achievements are certainly worthwhile, they do not address what we believe is a more comprehensive challenge of software engineering: to systematically fix features of a software system based on end-user requirements.
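The trial-and-error scenario described above can be illustrated with a minimal sketch. It is not the NATURAL project's implementation: the function names (`repair`, `passes_tests`), the toy `add` function and the two string-rewriting "patches" are all hypothetical and chosen only to make the loop concrete.

```python
from typing import Callable, Iterable, Optional

# Hypothetical buggy program: subtraction was written instead of addition.
BUGGY_SOURCE = "def add(a, b):\n    return a - b\n"


def passes_tests(source: str) -> bool:
    """Toy validation step: run the candidate program against two test cases."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        return namespace["add"](2, 3) == 5 and namespace["add"](0, 0) == 0
    except Exception:
        # A candidate that does not even run is treated as invalid.
        return False


def repair(source: str,
           candidate_patches: Iterable[Callable[[str], str]],
           is_valid: Callable[[str], bool]) -> Optional[str]:
    """Apply candidate patches one by one until one passes validation."""
    for apply_patch in candidate_patches:
        patched = apply_patch(source)
        if is_valid(patched):
            return patched
    return None  # no candidate passed the tests


# Hypothetical patch candidates: each replaces the suspicious operator.
candidates = [
    lambda s: s.replace(" - ", " * "),
    lambda s: s.replace(" - ", " + "),
]

fixed = repair(BUGGY_SOURCE, candidates, passes_tests)
print(fixed)
```

In this sketch a patch counts as valid only because it passes the available tests. With an incomplete test suite, a patch that passes may still be incorrect, which is why developer review remains part of the scenario.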
The ambition of the NATURAL project is to develop a methodology for building an intelligent agent that can receive a natural language description of a problem that a user faces with a software feature, and then synthesize code that addresses this problem and meets the user’s expectations. Such a repair bot would be a trustworthy software contributor that (i) targets real bugs in production by exploiting bug reports, which remain largely under-explored, (ii) aligns with the conversational needs of collaborative work by generating explanations for patch suggestions, and (iii) shifts the repair paradigm towards the design of self-improving systems by yielding novel algorithms that iteratively integrate feedback from humans.
For the first reporting period, we have mainly worked to:
1. Understand how we can address the differences in quality that exist among user bug reports. A first result was achieved by devising a technique to take into account the specificities of each bug report for the bug localization step. This result has been published in the Elsevier Journal of Systems and Software.

2. Explore techniques for assessing the correctness of patches by taking into account the new constraint about the incompleteness of test suites. Two major contributions were made in this direction: we leveraged natural language processing to predict patch correctness based on the bug report content, and to predict patch correctness based on the failing test cases. These results have been published in premier venues of our research domain, namely the IEEE/ACM International Conference on Automated Software Engineering (ASE) and ACM Transactions on Software Engineering and Methodology (TOSEM).

3. Devise code and patch representation learning techniques for the tasks of patch generation. Preliminary results have been obtained and are under review.
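To give an intuition for the second item above, predicting patch correctness from bug report content, the following toy sketch ranks candidate patches by how closely their textual description matches the bug report. It is only an illustration under strong assumptions: the published approaches rely on learned models, whereas this sketch substitutes a plain bag-of-words cosine similarity, and the bug report, patch descriptions and function names are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm = (math.sqrt(sum(c * c for c in counts_a.values()))
            * math.sqrt(sum(c * c for c in counts_b.values())))
    return dot / norm if norm else 0.0


def rank_patches(bug_report: str, patch_descriptions: list) -> list:
    """Order patch descriptions from most to least similar to the bug report."""
    return sorted(patch_descriptions,
                  key=lambda d: cosine_similarity(bug_report, d),
                  reverse=True)


# Invented example data.
bug_report = "Crash with null pointer when the user name is empty"
descriptions = [
    "Rename logging variable",
    "Add null check for empty user name",
]

ranking = rank_patches(bug_report, descriptions)
print(ranking[0])
```

The design choice illustrated here is that the bug report acts as an additional, natural language oracle alongside the test suite: a patch whose description has nothing in common with the reported problem is treated as less likely to be correct, even if it passes the tests.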
The NATURAL project includes 4 work packages, each aiming to make progress beyond the state of the art.

In WP1: The novelty is that we will mine bug reports beyond mere superficial token-matching, while exploring feedback mechanisms to enhance low-quality natural language-based user bug reports.
We have already advanced the state of the art by devising a simple localization approach that learns to select, for a given bug report, the right operators for processing that report before feeding it to the localization pipeline.

In WP2: The novelty is that we will initiate a new direction of test suite augmentation to support bug report-driven automatic program repair. We will build on code search to bypass the oracle problem in test generation. As a first step, we have collaborated with other researchers on improving the state of the art in detecting test flakiness (cf. PEELER @ICSME 2022).

In WP3: The novelty is that we will specialize template-based automatic program repair by guiding template selection based on bug types and we will leverage the accuracy of Big Code models to further guarantee correctness. Our first contributions with BATS (@TOSEM) and Quatrain (@ASE 2022) have already significantly advanced the state of the art in correctness prediction.

In WP4: The novelty is that we will resolutely turn to building a self-learning bot that is a dynamic actor engaged in conversations with practitioners about repair attempts. We have developed representation learning models that make it possible to generate descriptions of patches that are more reliable than those presented in the literature so far. These contributions are currently under review.