CORDIS - EU research results

Learning to Find Software Bugs

Periodic Reporting for period 4 - LearnBugs (Learning to Find Software Bugs)

Reporting period: 2024-09-01 to 2025-08-31

Software has become the cornerstone of modern society, economy, and life. Since software is created by humans, though, every non-trivial program contains various bugs, i.e. programming errors that may have disastrous consequences. Traditional approaches for finding bugs include automated bug detection tools. Such tools search for instances of bug patterns that recur across projects and application domains. However, automated bug detection currently cannot reach its full potential because each bug detector addresses only one bug pattern and one programming language, while creating new bug detectors is feasible only for program analysis experts.

The objective of this project has been to radically change the way automated bug detection tools are created. The core idea is to replace manually written program analyses with trained machine learning models. To this end, developers train a bug detector for a particular bug pattern with examples of buggy and non-buggy code, which the model learns to distinguish. The project realizes this vision by developing a reusable framework that addresses several fundamental challenges at the intersection of software engineering, programming languages, and machine learning, e.g.: (i) How to support developers in creating large amounts of training data of buggy and non-buggy code examples? (ii) How to represent programs in a way suitable for advanced machine learning techniques?
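To make the idea of training a detector from buggy and non-buggy examples concrete, the following is a minimal toy sketch and not the project's framework. It assumes a single illustrative bug pattern (two swapped call arguments), creates buggy examples by injecting that pattern into presumed-correct calls, and uses a simple perceptron as a stand-in for a neural model. All names (`CORRECT_CALLS`, `inject_swapped_args`, `features`, `train`, `is_buggy`) and the seed data are hypothetical.

```python
from collections import defaultdict

# Hypothetical seed corpus of presumed-correct calls:
# (function name, first argument name, second argument name).
CORRECT_CALLS = [
    ("set_size", "width", "height"),
    ("resize", "width", "height"),
    ("set_size", "w", "h"),
    ("draw_rect", "width", "height"),
]


def inject_swapped_args(call):
    """Create a buggy variant of a call by swapping its two arguments."""
    func, first, second = call
    return (func, second, first)


def features(call):
    """Represent a call by which argument name appears at which position."""
    _, first, second = call
    return [f"arg0={first}", f"arg1={second}"]


def train(examples, epochs=10):
    """Train a perceptron; label +1 means buggy, -1 means non-buggy."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for call, label in examples:
            score = sum(weights[f] for f in features(call))
            predicted = 1 if score > 0 else -1
            if predicted != label:
                # Move the weights toward the correct label.
                for f in features(call):
                    weights[f] += label
    return weights


def is_buggy(weights, call):
    """Flag a call as buggy if its feature weights sum to a positive score."""
    return sum(weights.get(f, 0.0) for f in features(call)) > 0


# Training data: original calls as non-buggy, injected variants as buggy.
examples = [(call, -1) for call in CORRECT_CALLS]
examples += [(inject_swapped_args(call), 1) for call in CORRECT_CALLS]
model = train(examples)

# The function name "scale" never occurs in the training data.
print(is_buggy(model, ("scale", "height", "width")))
print(is_buggy(model, ("scale", "width", "height")))
```

The sketch touches both challenges named above in miniature: the injection function stands in for creating training data at scale, and the `features` function stands in for the choice of program representation, which in a realistic setting would be far richer than argument names and positions.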

The project has achieved its objectives and, boosted by the general advances in artificial intelligence, even exceeded the initial expectations. As such, the project has made significant contributions toward increasing the reliability, security, and efficiency of complex software systems used by millions of people.
The project has led to ground-breaking results in both learning-based bug detection and learning-based program repair. First, we have established the general, conceptual framework of neural software analysis, which in particular provides a way to construct learning-based bug detectors. Second, we have made significant progress toward our goal of automatically injecting bugs, which can then be used as training data, e.g. using our award-winning SemSeed technique. Third, we have presented new benchmarks and metrics for evaluating neural models of code, e.g. the IdBench benchmark suite and our CrystalBLEU metric. Fourth, we have provided several learning-based bug detectors, such as DeepBugs and Nalin. Finally, the project has gone beyond the detection of software bugs by also automating the process of fixing them. A particularly noteworthy contribution in this field is RepairAgent, which, at the time of its release, was the first LLM agent to successfully address the problem of automated program repair.
Neural software analysis