Software has become the cornerstone of modern society, economy, and life. Because software is created by humans, though, every non-trivial program contains bugs, i.e., programming errors that may have disastrous consequences. Established approaches to finding bugs include automated bug detection tools, which search for instances of bug patterns that recur across projects and application domains. However, automated bug detection currently cannot reach its full potential: each bug detector addresses only one bug pattern and one programming language, and creating new bug detectors is feasible only for program analysis experts.
The objective of this project has been to radically change the way automated bug detection tools are created. The core idea is to replace manually written program analyses with trained machine learning models. To this end, developers train a bug detector for a particular bug pattern with examples of buggy and non-buggy code, which the model learns to distinguish. The project has pursued this vision by developing a reusable framework that addresses several fundamental challenges at the intersection of software engineering, programming languages, and machine learning, for example: (i) How can developers be supported in creating large amounts of training data consisting of buggy and non-buggy code examples? (ii) How can programs be represented in a way that is suitable for advanced machine learning techniques?
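To make the core idea concrete, the following minimal sketch illustrates one possible reading of challenges (i) and (ii) for a single, assumed bug pattern: accidentally swapped function arguments. It is not the project's framework. Buggy training examples are generated by swapping the arguments of calls that are presumed correct, each call is represented by simple position-and-name features, and a small perceptron stands in for the far more capable models a real bug detector would use. All function names and example calls are hypothetical.

```python
import ast
from collections import Counter


def swap_args(call_src: str) -> str:
    """Seed a bug: swap the two positional arguments of a call expression."""
    call = ast.parse(call_src, mode="eval").body
    if not (isinstance(call, ast.Call) and len(call.args) == 2):
        raise ValueError("expected a call with exactly two positional arguments")
    call.args.reverse()
    return ast.unparse(call)  # requires Python 3.9+


def features(call_src: str) -> list[str]:
    """Represent a call as 'position:argument' features (a deliberately naive encoding)."""
    call = ast.parse(call_src, mode="eval").body
    return [f"{i}:{ast.unparse(arg)}" for i, arg in enumerate(call.args)]


def make_dataset(correct_calls: list[str]) -> list[tuple[str, int]]:
    """Label the given calls as non-buggy (0) and their swapped variants as buggy (1)."""
    data = [(src, 0) for src in correct_calls]
    data += [(swap_args(src), 1) for src in correct_calls]
    return data


def train(dataset: list[tuple[str, int]], epochs: int = 10) -> Counter:
    """Train a perceptron; a stand-in for a real learned model."""
    weights: Counter = Counter()
    for _ in range(epochs):
        for src, label in dataset:
            feats = features(src)
            predicted = 1 if sum(weights[f] for f in feats) > 0 else 0
            if predicted != label:
                for f in feats:
                    weights[f] += label - predicted
    return weights


def is_buggy(weights: Counter, call_src: str) -> bool:
    """Classify a call as buggy if its feature weights sum to a positive score."""
    return sum(weights[f] for f in features(call_src)) > 0


if __name__ == "__main__":
    # Hypothetical calls assumed to be correct; real training data would be mined from code.
    model = train(make_dataset(["resize(width, height)", "move(x, y)"]))
    print(is_buggy(model, "scale(height, width)"))
    print(is_buggy(model, "scale(width, height)"))
```

Even in this toy setting, the detector is derived from examples rather than from a hand-written analysis, and it generalizes to a callee (`scale`) that never appeared in the training data. The position-and-name encoding is only a placeholder for the richer program representations that challenge (ii) asks about.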
The project has achieved its objectives and, boosted by general advances in artificial intelligence, even exceeded the initial expectations. In doing so, it has made significant contributions toward increasing the reliability, security, and efficiency of complex software systems used by millions of people.