Semantics of Software Systems

Project information

Grant agreement ID: 101093186

DOI

10.3030/101093186

EC signature date 22 August 2023

Start date 1 September 2023

End date 31 August 2028

Funded by

European Research Council (ERC)

Total cost

€ 2 500 000.00

EU contribution

€ 2 500 000.00


Coordinated by

CISPA - HELMHOLTZ-ZENTRUM FUR INFORMATIONSSICHERHEIT GGMBH
Germany

Periodic Reporting for period 1 - S3 (Semantics of Software Systems)

Reporting period: 2023-09-01 to 2026-02-28

*What if we had software bots that tirelessly test, debug, and monitor our software systems?*

IT workers are expensive and scarce. So why can't we further automate boring, repetitive activities such as testing and debugging? The problem is that we lack computer-readable _specifications_ (so-called _oracles_) for what the system should or should not do. And the oracle problem, in contrast to testing and debugging techniques themselves, is insufficiently researched.
For decades, this _oracle problem_ has been a roadblock to automated test generation, trusted software repairs, and accurate monitoring of software.

Building on groundbreaking research to infer input languages of systems, S3 introduces a unified approach to _learning oracles automatically_. It takes a given software system; _infers_ and _decodes_ its inputs and outputs; and runs _experiments_ to extract _models_ of how the system behaves, capturing its semantics by predicting output features for given input features. These models, named _system invariants_, make it possible to _fully automate_ critical software development activities:

TESTING. System invariants encode _languages_ for automatically generating test inputs and provide _oracles_ for checking test results: "In the TLS server, the in the must be the same as in the ."
DEBUGGING. System invariants allow narrowing down causes of software behavior ("The X.509 public key certificate is not recognized if contains a zero byte"). Generated tests and oracles ensure reliable automated repair.
MONITORING. System invariants enable detecting abnormal behavior at runtime ("In 'log4j', logging a containing '"jndi:"' opens "). Problematic queries can be isolated and investigated until the problem is fixed.
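
To make the role of an invariant as an oracle more concrete, here is a minimal sketch in Python. It does not show S3's learned models or any S3 API: the echo service, the request fields `payload` and `declared_length`, and the function names are all hypothetical, and the invariant is written by hand rather than inferred. The point is only that one predicate over input and output features can serve both as a test oracle and as a runtime monitor.

```python
def echo_invariant(request: dict, response: dict) -> bool:
    """Hypothetical invariant relating an output feature to input features:
    the response payload must equal the request payload."""
    return response["payload"] == request["payload"]


def buggy_echo(request: dict) -> dict:
    """Deliberately faulty toy service (an assumption for illustration):
    it trusts the declared length instead of the actual payload size."""
    memory = request["payload"] + b"SECRET"
    return {"payload": memory[: request["declared_length"]]}


def violations(system, requests, invariant):
    """Run each request and return those whose behavior breaks the invariant.

    Used on generated requests, this acts as a test oracle; used on live
    traffic, the same check acts as a monitor.
    """
    return [r for r in requests if not invariant(r, system(r))]


requests = [
    {"payload": b"hi", "declared_length": 2},  # well-formed request
    {"payload": b"hi", "declared_length": 8},  # length exceeds the payload
]
failing = violations(buggy_echo, requests, echo_invariant)
```

Here `failing` contains only the second request, which is exactly the kind of input one would then isolate and investigate.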

The central achievement of the project is the _Fandango_ test generator – a framework for generating myriads of inputs and interactions for software. The key feature of Fandango is that it is _language-based_, meaning that it takes a _specification_ of the input or the interaction and generates conforming inputs. These specifications combine well-known context-free grammars (for syntax) with _constraints_ over grammar elements, allowing an easy specification of input properties. To specify constraints, users can make use of the well-known Python language and library (and even define their own functions), which makes Fandango both expressive and familiar. Fandango comes with 250+ pages of documentation and tutorials.
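
The idea of combining a grammar with constraints can be illustrated with a small, self-contained Python sketch. This is not Fandango's specification language or API: the toy grammar, the `<name>`/`<age>` nonterminals, and all function names are assumptions made for illustration, and the naive generate-and-filter loop merely stands in for the evolutionary search that Fandango uses.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a sequence of symbols.
GRAMMAR = {
    "<start>": [["<name>", ",", "<age>"]],
    "<name>": [["<letter>"], ["<letter>", "<name>"]],
    "<letter>": [[c] for c in "abcdefghijklmnopqrstuvwxyz"],
    "<age>": [["<digit>"], ["<digit>", "<age>"]],
    "<digit>": [[c] for c in "0123456789"],
}


def expand(symbol, rng, depth=0, max_depth=8):
    """Expand a symbol into a derivation tree of the form (symbol, children)."""
    if symbol not in GRAMMAR:
        return (symbol, [])  # terminal symbol
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so recursion ends.
        alternative = min(alternatives, key=len)
    else:
        alternative = rng.choice(alternatives)
    return (symbol, [expand(s, rng, depth + 1, max_depth) for s in alternative])


def to_string(tree):
    """Concatenate the terminal symbols of a derivation tree."""
    symbol, children = tree
    return symbol if not children else "".join(to_string(c) for c in children)


def find(tree, symbol):
    """Return the string of the first subtree labelled `symbol`, or None."""
    node_symbol, children = tree
    if node_symbol == symbol:
        return to_string(tree)
    for child in children:
        found = find(child, symbol)
        if found is not None:
            return found
    return None


def constraint(tree):
    """A constraint over grammar elements, written as ordinary Python."""
    return 18 <= int(find(tree, "<age>")) <= 99 and len(find(tree, "<name>")) >= 3


def generate(n, seed=0, max_tries=100_000):
    """Naive generate-and-filter: keep only inputs that satisfy the constraint."""
    rng = random.Random(seed)
    results = []
    for _ in range(max_tries):
        tree = expand("<start>", rng)
        if constraint(tree):
            results.append(to_string(tree))
            if len(results) == n:
                break
    return results
```

Calling `generate(5)` yields strings that are syntactically valid by construction and also satisfy the semantic constraint. Rejection sampling like this becomes impractical as constraints get tighter, which is one motivation for guiding generation with search instead.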

Current and future S3 contributions all revolve around Fandango: testing, debugging, and monitoring all use the Fandango language and system. This allows for easy interaction and synergies within the project, as well as future extensions.

The most significant achievements to date are:

– the concept of language-based software testing [1],
– the ability to automatically statically mine input grammars from existing code [2],
– the ability to learn program models from executions, being able to predict inputs for given outputs [3],
– the application of evolutionary algorithms in the Fandango test generator for highly effective generation of complex inputs [4],
– and finally the usage of I/O grammars for comprehensive protocol testing [5].

[1] STEINHÖFEL, Dominic; ZELLER, Andreas. Language-based software testing. Communications of the ACM, 2024, vol. 67, no. 4, pp. 80-84.
[2] BETTSCHEIDER, Leon; ZELLER, Andreas. Inferring Input Grammars from Code with Symbolic Parsing. arXiv preprint arXiv:2503.08486, 2025. Accepted for publication in ACM Transactions on Software Engineering, 2026.
[3] MAMMADOV, Tural, et al. Learning program behavioral models from synthesized input-output pairs. ACM Transactions on Software Engineering and Methodology, 2024.
[4] ZAMUDIO AMAYA, José Antonio; SMYTZEK, Marius; ZELLER, Andreas. FANDANGO: Evolving Language-Based Testing. Proceedings of the ACM on Software Engineering, 2025, vol. 2, no. ISSTA, pp. 894-916.
[5] NEUHAUS, Stephan; ZAMUDIO AMAYA, José Antonio; ZELLER, Andreas. Personalized Fuzzing: A Case Study with the FANDANGO Fuzzer on a GNSS Module. In: Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2025, pp. 86-91.
