Learning to Find Software Bugs

Project Information

LearnBugs

Grant agreement ID: 851895

DOI

10.3030/851895

Closed project

EC signature date 18 September 2019

Start date 1 March 2020

End date 31 August 2025

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 1 458 375,00

EU contribution

€ 1 458 375,00

Coordinated by

UNIVERSITY OF STUTTGART
Germany

Periodic Reporting for period 4 - LearnBugs (Learning to Find Software Bugs)

Reporting period: 2024-09-01 to 2025-08-31

Software has become the cornerstone of modern society, economy, and life. Because software is created by humans, however, every non-trivial program contains bugs, i.e. programming errors that may have disastrous consequences. Traditional approaches to finding bugs include automated bug detection tools, which search for instances of bug patterns that recur across projects and application domains. However, automated bug detection currently cannot reach its full potential: each bug detector addresses one bug pattern and one programming language, and creating new bug detectors is feasible only for program analysis experts.

The objective of this project has been to radically change the way automated bug detection tools are created. The core idea is to replace manually written program analyses with trained machine learning models. To this end, developers train a bug detector for a particular bug pattern with examples of buggy and non-buggy code, which the model learns to distinguish. The project set out to realize this vision by developing a reusable framework that addresses several fundamental challenges at the intersection of software engineering, programming languages, and machine learning, e.g.: (i) How to support developers in creating large amounts of training data of buggy and non-buggy code examples? (ii) How to represent programs in a way suitable for advanced machine learning techniques?
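
To make the core idea concrete, the following is a minimal sketch of a bug detector that is trained rather than hand-written. It is not the project's framework: the bug pattern (swapped call arguments), the feature encoding, the perceptron learner, and all names (`features`, `train`, `is_buggy`, the example calls) are assumptions chosen for illustration only.

```python
from collections import defaultdict

def features(call):
    # Hypothetical encoding: one feature per (argument position, argument name).
    _callee, args = call
    return [f"arg{i}:{name}" for i, name in enumerate(args)]

def train(examples, epochs=10):
    # Plain perceptron: adjust the weights whenever a prediction is wrong.
    weights = defaultdict(float)
    for _ in range(epochs):
        for call, label in examples:
            predicted = sum(weights[f] for f in features(call)) > 0
            if predicted != label:
                step = 1.0 if label else -1.0
                for f in features(call):
                    weights[f] += step
    return weights

def is_buggy(weights, call):
    # A positive score means the call resembles the buggy training examples.
    return sum(weights.get(f, 0.0) for f in features(call)) > 0

# Assumed toy data: calls with arguments in the expected order (non-buggy) ...
correct_calls = [
    ("setSize", ("width", "height")),
    ("resize", ("width", "height")),
    ("moveTo", ("x", "y")),
    ("drawPoint", ("x", "y")),
]
examples = [(call, False) for call in correct_calls]
# ... and the same calls with swapped arguments (buggy).
examples += [((name, args[::-1]), True) for name, args in correct_calls]

weights = train(examples)

# The trained model is applied to a call site it has never seen.
print(is_buggy(weights, ("setBounds", ("height", "width"))))
print(is_buggy(weights, ("setBounds", ("width", "height"))))
```

The sketch touches both challenges named above in miniature: the buggy examples are derived automatically from correct ones (challenge i), and each call is reduced to features a model can consume (challenge ii). A realistic detector would need far more data and a richer program representation.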

The project has achieved its objectives and, boosted by the general advances in artificial intelligence, even exceeded the initial expectations. As such, the project has made significant contributions toward increasing the reliability, security, and efficiency of complex software systems used by millions of people.

The project has led to ground-breaking results in both learning-based bug detection and learning-based program repair. First, we have established the general, conceptual framework of neural software analysis, which in particular provides a way to construct learning-based bug detectors. Second, we made significant progress toward our goal of automatically injecting bugs, which can then be used as training data, e.g. using our award-winning SemSeed technique. Third, we presented new benchmarks and metrics for evaluating neural models of code, e.g. the IdBench benchmark suite and our CrystalBLEU metric. Fourth, we have provided several learning-based bug detectors, such as DeepBugs and Nalin. Finally, the project has gone beyond the detection of software bugs by also automating the process of fixing software bugs. A particularly noteworthy contribution in this field is RepairAgent, which, at the time of its release, was the first LLM agent to successfully address the problem of automated program repair.
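
As an illustration of how injected bugs can serve as training data, the sketch below seeds a simple off-by-one bug into Python source using the standard `ast` module. It is not the SemSeed technique, whose details are not described here; the mutation rule (turning one `<` into `<=`) and the names `FlipComparison`, `inject_bug`, and `in_range` are assumptions made for this example.

```python
import ast

class FlipComparison(ast.NodeTransformer):
    """Assumed mutation rule: turn one `<` comparison into `<=`."""

    def __init__(self):
        self.done = False

    def visit_Compare(self, node):
        self.generic_visit(node)
        if not self.done and isinstance(node.ops[0], ast.Lt):
            node.ops[0] = ast.LtE()
            self.done = True
        return node

def inject_bug(source):
    # Returns a mutated copy of the source, or None if no `<` was found.
    tree = ast.parse(source)
    mutator = FlipComparison()
    mutated = mutator.visit(tree)
    if not mutator.done:
        return None
    # ast.unparse requires Python 3.9 or newer.
    return ast.unparse(ast.fix_missing_locations(mutated))

original = "def in_range(i, n):\n    return i < n\n"
buggy = inject_bug(original)

# Each pair is (code, label), with label 1 marking the seeded bug.
training_pairs = [(original, 0), (buggy, 1)]
print(buggy)
```

Pairs produced this way could feed a classifier that learns to tell the buggy variant from the original. A purely syntactic rule like this one yields fairly artificial bugs, which is one reason more realistic seeding is a research problem in its own right.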

Neural software analysis
