Periodic Reporting for period 1 - MirandaTesting (Testing Program Analyzers Ad Absurdum)
Reporting period: 2023-07-01 to 2025-12-31
Program analysis refers to automatically examining a piece of software with the goal of detecting correctness issues or verifying their absence. Program analyzers are tools that implement program-analysis techniques. In recent years, as day-to-day life increasingly depends on software, more and more program analyzers are being built and used in practice. In fact, there is an abundance of popular analyzers developed both in academia and industry.
We rely on program analyzers to “guard” software reliability, but who will guard the guards? Program analyzers are highly complex tools, implementing sophisticated algorithms and performance optimizations. In addition, analyzers typically integrate several self-contained, core analysis components, such as specialized solvers, which are already complex by themselves. Due to this overall complexity, program analyzers are all the more likely to contain correctness issues. The most dangerous kind of correctness issue in analyzers is a critical bug, which we define as a bug leading to a wrong response, e.g. returning ‘correct’ for incorrect software, or leading to a right response for the wrong reasons. The latter type of critical bug is relevant because it is also likely to result in a wrong analyzer response under different circumstances.
In modern society, critical bugs may have disastrous consequences, e.g. when analyzing software used for transportation, banking, or secure communication. As a concrete example, consider the Astrée analyzer, which has been used to verify the absence of runtime errors in the flight-control software of Airbus A340 and A380. What if it missed an error? It is, therefore, imperative to check program analyzers for critical bugs.
Verifying the absence of critical bugs in a program analyzer is prohibitively expensive. Unlike verification, automated test generation can be used to find such bugs effectively. Existing testing approaches, however, are still limited for this application domain.
The goal of this project is to develop an overarching methodology for more rigorous testing of program analyzers than ever before. The key idea is to first expose more information about why a program analyzer reaches a particular response for a certain piece of code, e.g. why was the code found correct? This information is then used to interrogate the analyzer further, aiming to force it into a contradiction. In other words, anything the analyzer says during interrogation can and will be used against it. Finding a contradiction signifies that an analyzer response or its justification for a response is wrong, and that a critical bug has been detected. We call this methodology “interrogation testing”.
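The idea above can be illustrated with a minimal, hypothetical sketch (not the project's actual implementation): a toy "analyzer" returns a verdict together with a justification, here an inferred invariant over inputs and outputs, and the interrogation step cross-checks that justification on concrete executions. Any violation is a contradiction, i.e. evidence of a critical bug. All names (`toy_analyzer`, `interrogate`) are illustrative assumptions.

```python
def toy_analyzer(program):
    """Pretend analysis: returns a verdict plus a claimed invariant
    (the justification for the verdict)."""
    # Deliberately buggy justification: claims the output is always
    # strictly positive, which is wrong for input 0.
    return "correct", lambda inp, out: out > 0

def interrogate(program, invariant, inputs):
    """Use the analyzer's own justification against it: run the program
    on concrete inputs and collect violations of the claimed invariant."""
    contradictions = []
    for inp in inputs:
        out = program(inp)
        if not invariant(inp, out):
            contradictions.append((inp, out))
    return contradictions

square = lambda x: x * x
verdict, invariant = toy_analyzer(square)
found = interrogate(square, invariant, range(-3, 4))
# The claimed invariant "out > 0" fails for input 0, exposing that the
# analyzer's justification (and hence possibly its verdict) is wrong.
print(found)  # → [(0, 0)]
```

In the real setting, the justification would be richer (e.g. inferred invariants, proof fragments, or solver models), and the interrogation would query the analyzer itself rather than only executing the program, but the principle is the same: a response that contradicts its own justification reveals a critical bug.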
If successful, this project will enable systematic testing of entire program-analyzer classes. As a result, analyzers will exhibit fewer critical bugs, potentially preventing catastrophic outcomes in safety-critical domains.
1. Christoph Hochrainer, Anastasia Isychev, Valentin Wüstholz and Maria Christakis. Fuzzing Processing Pipelines for Zero-Knowledge Circuits. In Proceedings of the 31st International Conference on Computer and Communications Security (CCS'25), 2025. ACM.
2. David Kaindlstorfer, Anastasia Isychev, Valentin Wüstholz and Maria Christakis. Interrogation Testing of Program Analyzers for Soundness and Precision Issues. In Proceedings of the 39th International Conference on Automated Software Engineering (ASE'24), 2024. ACM.
3. Markus Fleischmann, David Kaindlstorfer, Anastasia Isychev, Valentin Wüstholz and Maria Christakis. Constraint-Based Test Oracles for Program Analyzers. In Proceedings of the 39th International Conference on Automated Software Engineering (ASE'24), 2024. ACM.
4. Jiradet Ounjai, Valentin Wüstholz and Maria Christakis. Green Fuzzer Benchmarking. In Proceedings of the 32nd International Symposium on Software Testing and Analysis (ISSTA'23), 2023. ACM.
5. Muhammad Numair Mansur, Valentin Wüstholz and Maria Christakis. Dependency-Aware Metamorphic Testing of Datalog Engines. In Proceedings of the 32nd International Symposium on Software Testing and Analysis (ISSTA'23), 2023. ACM.
We expect interrogation testing to also achieve societal impact. Program analyzers will exhibit fewer critical bugs, thereby increasing the quality of analyzed software, especially in safety-critical settings. This may even impact software certification. Specifically, analyzers are being used to check whether software meets certification requirements, e.g. for road vehicles, safety-related electrical control systems, or safety-related railway software. We envision a future where analyzers used for certification must be interrogation tested, thereby indirectly raising the certification standards for safety-critical software. Moreover, by improving analyzer quality, users will place more trust in analysis results, making program analyzers even more widely applicable. In short, software - a key innovation driver in our society - will become more reliable.