
Evolving Language Ecosystems

Periodic Reporting for period 4 - ELE (Evolving Language Ecosystems)

Reporting period: 2021-04-01 to 2022-09-30

A computer language is more than the code that programmers write or compilers translate into machine instructions. Modern languages are characterized by rich ecosystems that include compilers, interpreters, IDEs, libraries, help pages, manuals and discussion forums. To remain relevant, languages need to evolve: they must be augmented with new features, their libraries must be adapted to new end-user requirements, and their implementations must change to meet new performance goals. How can this be achieved without disrupting the entire ecosystem? The ELE project explores the fundamental techniques and algorithms for evolving entire language ecosystems. Our purpose is to reduce the cost of wide-ranging changes to programming languages and obviate the need for devising entirely new languages.
To date, the project has led to 21 research publications in leading venues in the fields of programming languages and software engineering. We mention some of the highlights here.

Our TOPLAS'18 paper titled "Feature-Specific Profiling" starts from the premise that while high-level languages come with significant readability and maintainability benefits, their performance remains difficult to predict. For example, programmers may unknowingly use language features inappropriately, which causes their programs to run slower than expected. To address this issue, we introduce feature-specific profiling, a technique that reports performance costs in terms of linguistic constructs. Feature-specific profiling helps programmers find expensive uses of specific features of their language. We describe the architecture of a profiler that implements our approach, explain prototypes of the profiler for two languages with different characteristics and implementation strategies, and provide empirical evidence for the approach’s general usefulness as a performance debugging tool.
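The idea of reporting costs per linguistic construct, rather than per function, can be illustrated with a minimal sketch. This is not the architecture of the profiler described in the paper; it is a simplified, instrumentation-based illustration in Python. The class `FeatureProfiler`, the method `feature`, and the feature names used below are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class FeatureProfiler:
    """Toy profiler that charges elapsed time to named language features.

    Hypothetical illustration only: costs are inclusive, so time spent in a
    nested feature is also counted for the enclosing one.
    """

    def __init__(self, clock=time.perf_counter):
        self.clock = clock               # injectable clock keeps the sketch testable
        self.costs = defaultdict(float)  # feature name -> accumulated time

    @contextmanager
    def feature(self, name):
        """Mark the enclosed code as running on behalf of feature `name`."""
        start = self.clock()
        try:
            yield
        finally:
            self.costs[name] += self.clock() - start

    def report(self):
        """Return (feature, time) pairs, most expensive first."""
        return sorted(self.costs.items(), key=lambda kv: kv[1], reverse=True)
```

A language implementation could wrap the code it runs for, say, contract checking or keyword-argument handling in `with profiler.feature("..."):`, so that `report()` lists which features account for the most time, regardless of which functions happened to use them.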

Our JFP'18 paper titled "How to Evaluate the Performance of Gradual Type Systems" starts from the observation that a sound gradual type system ensures that untyped components of a program can never break the guarantees of statically typed components. This assurance relies on runtime checks, which in turn impose performance overhead in proportion to the frequency and nature of interaction between typed and untyped components. The literature on gradual typing lacks rigorous descriptions of methods for measuring the performance of gradual type systems. This gap has consequences for the implementors of gradual type systems and for developers who use such systems. Without systematic evaluation of mixed-typed programs, implementors cannot precisely determine how improvements to a gradual type system affect performance, and developers cannot predict whether adding types to part of a program will significantly degrade (or improve) its performance. This paper presents the first method for evaluating the performance of sound gradual type systems. The method quantifies both the absolute performance of a gradual type system and the relative performance of two implementations of the same gradual type system. To validate the method, the paper reports on its application to twenty programs and three implementations of Typed Racket.
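To see why the overhead grows with the frequency of interaction between typed and untyped code, consider the following minimal sketch in Python. It is not how Typed Racket enforces types; the decorator `typed_boundary` and the function `scale` are hypothetical, and the sketch assumes plain positional or keyword parameters annotated with ordinary classes.

```python
import inspect
from functools import wraps


def typed_boundary(fn):
    """Check annotated arguments and the result on every call to `fn`.

    Hypothetical stand-in for the checks a sound gradual type system inserts
    where untyped code calls typed code.
    """
    sig = inspect.signature(fn)

    @wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is not inspect.Parameter.empty:
                wrapper.checks += 1  # one runtime check per annotated argument
                if not isinstance(value, expected):
                    raise TypeError(
                        f"{fn.__name__}: {name} must be {expected.__name__}"
                    )
        result = fn(*args, **kwargs)
        expected = sig.return_annotation
        if expected is not inspect.Signature.empty:
            wrapper.checks += 1      # one more check for the result
            if not isinstance(result, expected):
                raise TypeError(
                    f"{fn.__name__}: result must be {expected.__name__}"
                )
        return result

    wrapper.checks = 0  # total number of runtime checks performed so far
    return wrapper


@typed_boundary
def scale(x: float, factor: float) -> float:
    return x * factor
```

Each call to `scale` from unchecked code performs three checks, so a loop that crosses the boundary a hundred times performs three hundred. The cost depends on how often the boundary is crossed and on what is checked, not on how much of the program is typed.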

Our OOPSLA'18 paper titled "Julia: Dynamism and Performance Reconciled by Design" looks at Julia, a programming language for the scientific community that combines features of productivity languages, such as Python or MATLAB, with characteristics of performance-oriented languages, such as C++ or Fortran. Julia has many productivity features: dynamic typing, automatic memory management, rich type annotations, and multiple dispatch. At the same time, it lets programmers control memory layout and uses a specializing just-in-time compiler that eliminates some of the overhead of those features. This paper details these choices and reflects on their implications for performance and usability.

Our POPL'18 paper titled "Correctness of Speculative Optimizations with Dynamic Deoptimization" starts from the premise that high-performance dynamic language implementations make heavy use of speculative optimizations to achieve speeds close to those of statically compiled languages. These optimizations are typically performed by a just-in-time compiler that generates code under a set of assumptions about the state of the program and its environment. In certain cases, a program may execute code compiled under assumptions that are no longer valid. The implementation must then deoptimize the program on the fly; this entails finding semantically equivalent code that does not rely on invalid assumptions, translating program state to that expected by the target code, and transferring control. This paper looks at the interaction between optimization and deoptimization, and shows that reasoning about speculation is surprisingly easy when assumptions are made explicit in the program representation. This insight is demonstrated on a compiler intermediate representation, named sourir, modeled after the high-level representation for a dynamic language. Traditional compiler optimizations such as constant folding, unreachable code elimination, and function inlining are shown to be correct in the presence of assumptions. Furthermore, the paper establishes the correctness of compiler transformations specific to deoptimization: namely unrestricted deoptimization, predicate hoisting, and assume composition.
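The three steps of deoptimization named above (finding equivalent code, translating state, transferring control) can be sketched in a few lines of Python. This is not sourir and not the paper's formalization; the functions `optimized_sum` and `baseline_sum` and the list `deopt_log` are hypothetical, and the "optimized" version only stands in for code a just-in-time compiler might emit under the assumption that every element is an int.

```python
deopt_log = []  # indices at which a deoptimization happened (illustration only)


def baseline_sum(values, start=0, total=0):
    """Generic version: makes no assumption about element types.

    `start` and `total` let execution resume in the middle of the loop.
    """
    for v in values[start:]:
        total = total + v
    return total


def optimized_sum(values):
    """Version specialized under the explicit assumption 'all elements are ints'."""
    total = 0
    for i, v in enumerate(values):
        # Explicit assumption: the guard is part of the program itself.
        if type(v) is not int:
            deopt_log.append(i)
            # Deoptimize: hand the current state (loop index, running total)
            # to the equivalent generic code and transfer control to it.
            return baseline_sum(values, start=i, total=total)
        total += v  # stands for a fast integer-only addition
    return total
```

For example, `optimized_sum([1, 2, 2.5, 4])` runs the specialized loop for the first two elements, fails the guard at index 2, and finishes in `baseline_sum` with `start=2` and `total=3`, producing the same result as calling `baseline_sum` on the whole list.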
Jan Vitek