CORDIS - EU research results

Learning to Find Software Bugs

Periodic Reporting for period 3 - LearnBugs (Learning to Find Software Bugs)

Reporting period: 2023-03-01 to 2024-08-31

Software has become the cornerstone of modern society, economy, and life. Since software is created by humans, though, every non-trivial program contains various bugs, i.e., programming errors that may have disastrous consequences. Established approaches to finding bugs include automated bug detection tools. Such tools search for instances of bug patterns that recur across projects and application domains. However, automated bug detection currently cannot reach its full potential, for two reasons: each bug detector addresses only one bug pattern and one programming language, and creating new bug detectors is feasible only for program analysis experts.

The objective of this proposal is to radically change the way automated bug detection tools are created. The core idea is to replace manually written program analyses with trained machine learning models. To this end, developers will train a bug detector for a particular bug pattern with examples of buggy and non-buggy code, which the model learns to distinguish. The project will realize this vision by developing a reusable framework that addresses several fundamental challenges at the intersection of software engineering, programming languages, and machine learning, e.g.: (i) How to support developers in creating large amounts of training data of buggy and non-buggy code examples? (ii) How to represent programs in a way suitable for advanced machine learning techniques?
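To make the core idea concrete, here is a minimal sketch of a bug detector that is trained from labeled examples rather than written by hand. Everything in it is an assumption made for illustration: the bug pattern (an off-by-one loop bound), the six training snippets, the regex tokenizer, and the bag-of-tokens naive Bayes model. It is not the project's framework, which targets far richer program representations and neural models.

```python
import math
import re
from collections import Counter


def tokenize(code):
    """Split a snippet into identifier, number, and single-character operator tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


class TokenNaiveBayes:
    """Toy bag-of-tokens classifier. Label 1 = buggy, label 0 = non-buggy."""

    def fit(self, snippets, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for code, label in zip(snippets, labels):
            self.counts[label].update(tokenize(code))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def _log_score(self, tokens, label):
        total = sum(self.counts[label].values())
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for tok in tokens:
            # Laplace smoothing keeps unseen tokens from zeroing out the score.
            score += math.log(
                (self.counts[label][tok] + 1) / (total + len(self.vocab))
            )
        return score

    def predict(self, code):
        tokens = tokenize(code)
        return int(self._log_score(tokens, 1) > self._log_score(tokens, 0))


# Hypothetical training data for one bug pattern: an off-by-one loop bound.
buggy = [
    "for i in range(len(xs) + 1): total += xs[i]",
    "for j in range(len(names) + 1): print(names[j])",
    "for k in range(len(rows) + 1): check(rows[k])",
]
non_buggy = [
    "for i in range(len(xs)): total += xs[i]",
    "for j in range(len(names)): print(names[j])",
    "for k in range(len(rows)): check(rows[k])",
]

detector = TokenNaiveBayes().fit(buggy + non_buggy, [1] * 3 + [0] * 3)
```

Calling `detector.predict(...)` on a previously unseen loop returns 1 if the snippet looks like the buggy examples and 0 otherwise. A token-count model like this only picks up surface cues, which is one reason the question of how to represent programs for learning, raised in (ii), matters.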

The proposed project has the potential to revolutionize how software developers find bugs. To date, no other research has addressed the problem of automatically learning bug detection tools. If successful, the project will "democratize" bug detection by enabling all software developers, instead of a few program analysis experts, to create and share bug detection tools. Ultimately, the project will contribute to increasing the reliability, security, and efficiency of complex software systems used by millions of people.
After about half of the project period, several key results have been achieved. First, we established the general conceptual framework of neural software analysis, which in particular provides a way to construct learning-based bug detectors. Second, we made significant progress toward our goal of automatically injecting bugs, which can then be used as training data, e.g. using our award-winning SemSeed technique. Third, we presented new benchmarks and metrics for evaluating neural models of code, e.g. the IdBench benchmark suite and our CrystalBLEU metric. Finally, we provided several learning-based bug detectors, such as DeepBugs and Nalin.
We expect the second half of the project period to yield even more effective techniques for injecting large numbers of bugs into existing code, further learning-based bug detectors, and insights into how and why neural models of code make their predictions.
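The idea of injecting bugs into existing code to obtain training data can be illustrated with a small sketch. It assumes one simple, hypothetical mutation, swapping the first two arguments of a call, chosen only because it is easy to show. It does not describe how SemSeed works. The class and function names are invented for this example, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class SwapFirstTwoArgs(ast.NodeTransformer):
    """Swap the first two positional arguments of one call in the tree."""

    def __init__(self):
        self.injected = False

    def visit_Call(self, node):
        # Visit children first, so a nested call is considered before its parent.
        self.generic_visit(node)
        if not self.injected and len(node.args) >= 2:
            node.args[0], node.args[1] = node.args[1], node.args[0]
            self.injected = True
        return node


def inject_bug(source):
    """Return a mutated copy of `source`, or None if no call has two arguments."""
    tree = ast.parse(source)
    mutator = SwapFirstTwoArgs()
    tree = mutator.visit(tree)
    return ast.unparse(tree) if mutator.injected else None


def make_training_examples(snippets):
    """Pair each snippet (label 0) with its injected-bug variant (label 1)."""
    examples = []
    for source in snippets:
        buggy = inject_bug(source)
        if buggy is None:
            continue  # nothing to mutate in this snippet
        # Normalize the original through the same parse/unparse round trip,
        # so that formatting differences do not leak the label.
        clean = ast.unparse(ast.parse(source))
        examples.append((clean, 0))
        examples.append((buggy, 1))
    return examples
```

For instance, `make_training_examples(["copy_file(src, dst)", "log(msg)"])` yields one clean and one mutated version of the `copy_file` call and skips `log(msg)`, which has nothing to swap. The sketch treats the original code as correct, which is itself an assumption: real code bases already contain bugs, so labels obtained this way are noisy.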
Neural software analysis