Periodic Reporting for period 1 - S3 (Semantics of Software Systems)
Reporting period: 2023-09-01 to 2026-02-28
IT workers are expensive and scarce. So why can't we further automate boring, repetitive activities such as testing and debugging? The problem is that we lack computer-readable _specifications_ (so-called _oracles_) for what the system should do or not do. And the oracle problem, in contrast to testing and debugging techniques themselves, is insufficiently researched.
For decades, this _oracle problem_ has been a roadblock to automated test generation, trusted software repairs, and accurate monitoring of software.
TESTING. System invariants encode _languages_ for automatically generating test inputs and provide _oracles_ for checking test results: "In the TLS server, the
DEBUGGING. System invariants allow narrowing down causes of software behavior ("The X.509 public key certificate is not recognized if
MONITORING. System invariants enable detecting abnormal behavior at runtime ("In 'log4j', logging a
The central achievement of the project is the _Fandango_ test generator, a framework for generating large numbers of inputs and interactions for software. The key feature of Fandango is that it is _language-based_: it takes a _specification_ of the input or the interaction and generates conforming inputs. These specifications combine well-known context-free grammars (for syntax) with _constraints_ over grammar elements, which makes input properties easy to specify. Constraints are written using the well-known Python language and library (users can even define their own functions), which makes Fandango both expressive and familiar. Fandango comes with 250+ pages of documentation and tutorial.
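The combination of a grammar with constraints can be illustrated with a minimal, self-contained Python sketch. It does not use Fandango's specification syntax, API, or search strategy: the toy grammar, the `expand`, `satisfies`, and `generate` functions, and the "name,age" record format are all hypothetical, and a naive generate-and-filter loop stands in for Fandango's evolutionary search [4].

```python
import random

# Hypothetical toy grammar (not Fandango syntax): each nonterminal maps to
# a list of alternatives; each alternative is a list of symbols.
GRAMMAR = {
    "<start>": [["<name>", ",", "<age>"]],
    "<name>": [["<letter>"], ["<letter>", "<name>"]],
    "<letter>": [[c] for c in "abcdefghijklmnopqrstuvwxyz"],
    "<age>": [["<digit>"], ["<digit>", "<age>"]],
    "<digit>": [[c] for c in "0123456789"],
}


def expand(symbol, rng, depth=0, max_depth=8):
    """Randomly derive a string from `symbol` using GRAMMAR."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, keep only the shortest alternatives
        # so that recursive rules are guaranteed to terminate.
        shortest = min(len(alt) for alt in alternatives)
        alternatives = [alt for alt in alternatives if len(alt) == shortest]
    chosen = rng.choice(alternatives)
    return "".join(expand(s, rng, depth + 1, max_depth) for s in chosen)


def satisfies(candidate):
    """Constraint over grammar elements, written as plain Python."""
    name, age = candidate.split(",")
    return len(name) >= 3 and 18 <= int(age) <= 99


def generate(n, seed=0, max_tries=100_000):
    """Generate up to `n` grammar-conforming inputs that meet the constraint."""
    rng = random.Random(seed)
    results = []
    for _ in range(max_tries):
        candidate = expand("<start>", rng)
        if satisfies(candidate):
            results.append(candidate)
            if len(results) == n:
                break
    return results


if __name__ == "__main__":
    for line in generate(5):
        print(line)
```

The grammar guarantees syntactic validity, while the Python predicate narrows the generated inputs to those with the desired semantic property. Filtering random derivations becomes inefficient as constraints get stricter, which is one motivation for guided search such as the evolutionary approach in [4].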
Current and future S3 contributions all revolve around Fandango: testing, debugging, and monitoring all use the Fandango language and system. This allows for easy interaction and synergies within the project, as well as future extensions.
– the concept of language-based software testing [1],
– the ability to automatically statically mine input grammars from existing code [2],
– the ability to learn program models from executions, being able to predict inputs for given outputs [3],
– the application of evolutionary algorithms in the FANDANGO test generator for highly effective generation of complex inputs [4],
– and finally the usage of I/O grammars for comprehensive protocol testing [5].
[1] STEINHÖFEL, Dominic; ZELLER, Andreas. Language-based software testing. Communications of the ACM, 2024, vol. 67, no. 4, pp. 80-84.
[2] BETTSCHEIDER, Leon; ZELLER, Andreas. Inferring Input Grammars from Code with Symbolic Parsing. arXiv preprint arXiv:2503.08486, 2025. Accepted for publication in ACM Transactions on Software Engineering, 2026.
[3] MAMMADOV, Tural, et al. Learning program behavioral models from synthesized input-output pairs. ACM Transactions on Software Engineering and Methodology, 2024.
[4] ZAMUDIO AMAYA, José Antonio; SMYTZEK, Marius; ZELLER, Andreas. FANDANGO: Evolving Language-Based Testing. Proceedings of the ACM on Software Engineering, 2025, vol. 2, no. ISSTA, pp. 894-916.
[5] NEUHAUS, Stephan; ZAMUDIO AMAYA, José Antonio; ZELLER, Andreas. Personalized Fuzzing: A Case Study with the FANDANGO Fuzzer on a GNSS Module. In: Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2025, pp. 86-91.