Automated Program Analysis for Advanced Web Applications

Periodic Reporting for period 3 - PAW (Automated Program Analysis for Advanced Web Applications)

Reporting period: 2018-08-01 to 2020-01-31

Web applications that execute in the user's web browser constitute a substantial part of modern software. JavaScript is the main programming language of the web, although alternatives are emerging, in particular TypeScript and Dart. Despite advances in the design of languages and libraries, it remains difficult to prevent errors when programming such web applications. Although the basic principles of software verification have been known for decades and researchers have developed an abundance of techniques for formal reasoning about programs, modern software still contains many errors, as everyday users can testify.

The PAW project is creating novel automated program analysis algorithms for preventing errors in web applications. Our approach combines static and dynamic analysis techniques. Our prototype implementations are made openly available to facilitate reuse.

The overall objectives of the project are to:
1) enable analysis of programs that use new programming language features,
2) develop analysis abstractions that enable analysis of complex libraries and frameworks,
3) expand the capabilities of automated testing techniques,
4) support migration and evolution of software, and
5) provide reusable program analysis infrastructure.

So far, we have produced a number of scientific results that span all five objectives and that have been published at top conferences and journals:

Type Test Scripts for TypeScript Testing, Kristensen and Møller, OOPSLA 2017.
"TypeScript applications often use untyped JavaScript libraries. To support static type checking of such applications, the typed APIs of the libraries are expressed as separate declaration files. This raises the challenge of checking that the declaration files are correct with respect to the library implementations. Previous work has shown that mismatches are frequent and cause TypeScript's type checker to misguide the programmers by rejecting correct applications and accepting incorrect ones.
This paper shows how feedback-directed random testing, which is an automated testing technique that has mostly been used for testing Java libraries, can be adapted to effectively detect such type mismatches. Given a JavaScript library with a TypeScript declaration file, our tool TSTEST generates a type test script, which is an application that interacts with the library and tests that it behaves according to the type declarations. Compared to alternative solutions that involve static analysis, this approach finds significantly more mismatches in a large collection of real-world JavaScript libraries with TypeScript declaration files, and with fewer false positives. It also has the advantage that reported mismatches are easily reproducible with concrete executions, which aids diagnosis and debugging."
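
The idea of a type test script can be illustrated with a minimal sketch. It is not TSTEST's implementation: the `TypeDesc` descriptors, the `Declared` records, the two library functions `abs` and `half`, and the `runTypeTest` driver are all hypothetical names chosen for illustration. The sketch assumes a library whose declared result types are plain primitives, calls its functions with random inputs, feeds well-typed results back into a pool of inputs for later calls, and records every call whose result does not match the declared type.

```typescript
// Hypothetical, simplified stand-in for the types found in a declaration file.
type TypeDesc = "number" | "string" | "boolean";

// One declared library function: its claimed signature plus the untyped implementation.
interface Declared {
  name: string;
  params: TypeDesc[];
  result: TypeDesc;
  impl: (...args: unknown[]) => unknown;
}

interface Mismatch {
  fn: string;
  args: unknown[];
  expected: TypeDesc;
  actual: string;
}

// Small seeded generator so that a reported mismatch can be replayed.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Produce a fresh random value of the requested primitive type.
function freshValue(t: TypeDesc, rng: () => number): unknown {
  switch (t) {
    case "number":
      return Math.floor(rng() * 200) - 100;
    case "string":
      return "s" + Math.floor(rng() * 10);
    case "boolean":
      return rng() < 0.5;
  }
}

function runTypeTest(decls: Declared[], iterations: number, seed = 1): Mismatch[] {
  const rng = makeRng(seed);
  // Feedback pool: well-typed results of earlier calls become inputs to later calls.
  const pool: Record<TypeDesc, unknown[]> = { number: [], string: [], boolean: [] };
  const mismatches: Mismatch[] = [];
  for (let i = 0; i < iterations; i++) {
    const decl = decls[Math.floor(rng() * decls.length)];
    const args = decl.params.map((p) => {
      const seen = pool[p];
      return seen.length > 0 && rng() < 0.5
        ? seen[Math.floor(rng() * seen.length)]
        : freshValue(p, rng);
    });
    let result: unknown;
    try {
      result = decl.impl(...args);
    } catch {
      continue; // exceptions are ignored in this sketch
    }
    const actual = typeof result;
    if (actual === decl.result) {
      pool[decl.result].push(result);
    } else {
      mismatches.push({ fn: decl.name, args, expected: decl.result, actual });
    }
  }
  return mismatches;
}

// Hypothetical library under test.
const lib: Declared[] = [
  { name: "abs", params: ["number"], result: "number", impl: (x) => Math.abs(x as number) },
  // Declared to return a number, but the implementation returns a string for negative input.
  { name: "half", params: ["number"], result: "number",
    impl: (x) => ((x as number) < 0 ? "negative" : (x as number) / 2) },
];

const report = runTypeTest(lib, 200);
console.log(`${report.length} mismatches, for example:`, report[0]);
```

Each recorded mismatch carries the concrete arguments that triggered it, which mirrors the reproducibility advantage described in the abstract. A real tool would also have to handle object types, callbacks, and generics, which this sketch leaves out.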

Practical Initialization Race Detection for JavaScript Web Applications, Adamsen, Møller, and Tip, OOPSLA 2017 (ACM SIGPLAN Distinguished Paper).
"Event races are a common source of subtle errors in JavaScript web applications. Several automated tools for detecting event races have been developed, but experiments show that their accuracy is generally quite low. We present a new approach that focuses on three categories of event race errors that often appear during the initialization phase of web applications: form-input-overwritten errors, late-event-handler-registration errors, and access-before-definition errors. The approach is based on a dynamic analysis that uses a combination of adverse and approximate execution. Among the strengths of the approach are that it does not require browser modifications, expensive model checking, or static analysis. In an evaluation on 100 widely used websites, our tool InitRacer reports 1085 initialization races, while providing informative explanations of their causes and effects. A manual study of 218 of these reports shows that 111 of them lead to uncaught exceptions and at least 47 indicate errors that affect the functionality of the websites."

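To make one of the error categories named above concrete, the following sketch simulates a late-event-handler-registration error. It does not reproduce InitRacer: `FakeButton`, `Step`, and `runSchedule` are hypothetical names, and the page is modelled as a fixed list of steps rather than a real browser. Under these assumptions, the sketch shows that the same page behaves correctly under one event ordering and silently loses a user click under another.

```typescript
// Minimal stand-in for a DOM element; hypothetical, not a browser API.
class FakeButton {
  private handlers: Array<() => void> = [];

  addClickListener(handler: () => void): void {
    this.handlers.push(handler);
  }

  // Runs all registered handlers and returns how many ran; 0 means the click was lost.
  click(): number {
    this.handlers.forEach((h) => h());
    return this.handlers.length;
  }
}

// The three events whose relative order matters during page initialization.
type Step = "parse-button" | "user-click" | "register-handler";

function runSchedule(schedule: Step[]): { handled: number; lost: number } {
  let button: FakeButton | undefined;
  let handled = 0;
  let lost = 0;
  for (const step of schedule) {
    switch (step) {
      case "parse-button":
        button = new FakeButton(); // the element becomes visible and clickable
        break;
      case "register-handler":
        button?.addClickListener(() => { handled++; }); // a script loads and attaches the handler
        break;
      case "user-click":
        if (button !== undefined && button.click() === 0) {
          lost++; // the user clicked before any handler existed
        }
        break;
    }
  }
  return { handled, lost };
}

// Benign order: the handler is registered before the user interacts.
const benign = runSchedule(["parse-button", "register-handler", "user-click"]);
// Adverse order: the user clicks as soon as the button exists.
const adverse = runSchedule(["parse-button", "user-click", "register-handler"]);

console.log(benign, adverse);
```

Trying the adverse ordering deliberately, instead of waiting for it to occur by chance, is the intuition this sketch is meant to convey; how the actual tool constructs and explores such orderings is described in the paper itself.
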
A Survey of Dynamic Analysis and Test Generation for JavaScript, Andreasen, Gong, Møller, Pradel, Selakovic, Sen, and Staicu, ACM Computing Surveys, 50(5).
JavaScript has become one of the most prevalent programming languages. Unfortunately, some of the unique properties that contribute to this popularity also make JavaScript programs prone to errors and difficult for program analyses to reason about. These properties include the highly dynamic nature of the language, a set of unusual language features, a lack of encapsulation mechanisms, and the "no crash" philosophy. This paper surveys dynamic program analysis and test generation techniques for JavaScript targeted at improving the correctness, reliability, performance, security, and privacy of JavaScript-based software.

Systematic Approaches for Increasing Soundness and Precision of Static Analyzers, Andreasen, Møller, and Nielsen, SOAP 2017.
"Building static analyzers for modern programming languages is difficult. Often soundness is a requirement, perhaps with some well-defined exceptions, and precision must be adequate for producing useful results on realistic input programs. Formally proving such properties of a complex static analysis implementation is rarely an option in p
We expect to continue developing novel program analysis techniques for web-based software. In particular, we plan to continue the development of the TAJS analyzer and to investigate possibilities for analyzing Node.js software.