Periodic Reporting for period 2 - NATURAL (Natural Program Repair)
Reporting period: 2022-08-01 to 2024-01-31
Initial successes in automatic bug fixing are based on scenarios such as the following: once a bug is localized, patches are generated repeatedly and automatically, through trial and error, until a valid patch is produced. The produced patch can then be revised by developers. While the reported achievements are certainly worthwhile, they do not address what we believe is a more comprehensive challenge of software engineering: to systematically fix features of a software system based on end-user requirements.
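The trial-and-error scenario described above is often called generate-and-validate repair. It can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the toy `add` bug, and the operator-swapping candidates are all assumptions made for the example, and the test suite stands in for the validity check.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence

Patch = Callable[[str], str]  # a patch maps buggy source text to patched source text


def generate_and_validate(
    buggy_source: str,
    candidate_patches: Iterable[Patch],
    passes_tests: Callable[[str], bool],
) -> Optional[str]:
    """Try candidates one by one; return the first patched source that passes the tests."""
    for apply_patch in candidate_patches:
        patched = apply_patch(buggy_source)
        if passes_tests(patched):
            return patched  # a plausible patch, still to be reviewed by a developer
    return None  # search space exhausted without a valid patch


# Toy setting (hypothetical): the fault has been localized to the expression `a - b`.
BUGGY = "def add(a, b):\n    return a - b\n"


def operator_mutations(old: str, replacements: Sequence[str]) -> Iterator[Patch]:
    """Yield one candidate patch per replacement expression."""
    for new in replacements:
        yield lambda src, new=new: src.replace(old, new, 1)


def passes_tests(source: str) -> bool:
    """Run a tiny test suite against the patched source; any crash counts as failure."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        add = namespace["add"]
        return add(2, 3) == 5 and add(-1, 1) == 0
    except Exception:
        return False


fixed = generate_and_validate(
    BUGGY,
    operator_mutations("a - b", ["a * b", "a + b", "a / b"]),
    passes_tests,
)
print(fixed)
```

In this sketch, a patch is accepted as soon as it passes the available tests. It may therefore still be incorrect with respect to what the end user actually expects, which is the gap the paragraph above points to.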
The ambition of the NATURAL project is to develop a methodology for building an intelligent agent that can receive a natural language description of a problem that a user faces with a software feature, and then synthesize code that addresses this problem and meets the user’s expectations. Such a repair bot would be a trustworthy software contributor that (i) targets real bugs in production by exploiting bug reports, which remain largely under-explored, (ii) aligns with the conversational needs of collaborative work by generating explanations for patch suggestions, and (iii) shifts the repair paradigm towards the design of self-improving systems through novel algorithms that iteratively integrate feedback from humans.
1. Understand how code search mechanisms work and how they can help find ingredients for program repair. The results of this study were compiled and published in ACM Computing Surveys.
2. Develop practical approaches for program repair using Ensemble Learning. The approach and results are published at the IEEE/ACM International Conference on Software Engineering.
3. Devise code and patch representation learning techniques for the task of patch generation. One strong result, CodeGrid, has been published at the International Symposium on Software Testing and Analysis (ISSTA).
In WP1: The novelty is that we will mine bug reports beyond superficial token matching, while exploring feedback mechanisms to enhance low-quality natural language-based user bug reports.
We have already advanced the state of the art by devising a simple localization approach that learns to select, for a given bug report, the right operators for processing the report before feeding it to the localization pipeline.
In WP2: The novelty is that we will initiate a new direction of test suite augmentation to support bug report-driven automatic program repair. We will build on code search to bypass the oracle problem in test generation. As a first step, we have collaborated with other researchers on improving the state of the art in detecting test flakiness (cf. PEELER @ICSME 2022). We have also demonstrated that large language models can be leveraged to generate patches from bug reports (poster papers accepted at the IEEE/ACM International Conference on Software Engineering 2024, to be published).
In WP3: The novelty is that we will specialize template-based automatic program repair by guiding template selection based on bug types, and we will leverage the accuracy of Big Code models to further guarantee correctness. Our first contributions, BATS (@TOSEM) and Quatrain (@ASE 2022), have already significantly advanced the state of the art in patch correctness prediction.
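The idea of guiding template selection by bug type can be illustrated with a small sketch. Everything in it is an assumption made for the example: the two bug types, the keyword lists, and the two fix templates are hypothetical. The keyword matcher is only a stand-in for a learned bug-type classifier, and the Big Code correctness check is not shown.

```python
from typing import Callable, Dict, List, Optional, Tuple

FixTemplate = Callable[[str], str]  # rewrites one suspicious statement


def guard_none(stmt: str) -> str:
    """Template: wrap a call such as `x.foo()` in a None check on its receiver."""
    receiver = stmt.split(".", 1)[0].strip()
    return f"if {receiver} is not None:\n    {stmt.strip()}"


def tighten_upper_bound(stmt: str) -> str:
    """Template: turn an inclusive upper bound into an exclusive one."""
    return stmt.replace("<=", "<", 1)


# Hypothetical mapping from bug type to the templates worth trying for it.
TEMPLATES_BY_BUG_TYPE: Dict[str, List[FixTemplate]] = {
    "null_dereference": [guard_none],
    "off_by_one": [tighten_upper_bound],
}

# Stand-in for a learned classifier: keywords that hint at each bug type.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "null_dereference": ("nonetype", "null pointer", "nullpointerexception"),
    "off_by_one": ("index out of range", "out of bounds", "off by one"),
}


def classify_bug_report(report: str) -> Optional[str]:
    """Return the first bug type whose keywords appear in the report, if any."""
    text = report.lower()
    for bug_type, words in KEYWORDS.items():
        if any(word in text for word in words):
            return bug_type
    return None


def candidate_patches(report: str, suspicious_stmt: str) -> List[str]:
    """Apply only the templates associated with the predicted bug type."""
    bug_type = classify_bug_report(report)
    templates = TEMPLATES_BY_BUG_TYPE.get(bug_type, []) if bug_type else []
    return [template(suspicious_stmt) for template in templates]


report = "App crashes with 'list index out of range' when opening the last page"
print(candidate_patches(report, "while i <= len(pages):"))
```

Restricting the search to templates that match the predicted bug type shrinks the number of candidates to validate. Each candidate would still need to be checked, for example against tests or a correctness predictor.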
In WP4: The novelty is that we will resolutely turn to building a self-learning bot that is a dynamic actor engaged in conversations with practitioners about repair attempts. We have developed representation learning models that make it possible to generate patch descriptions that are more reliable than those presented in the literature so far. CodeGrid, presented at ISSTA, has demonstrated that the spatiality of code can be leveraged to learn good representations.