Learning from Big Code: Probabilistic Models, Analysis and Synthesis

Objective

The goal of this proposal is to fundamentally change the way we build and reason about software. We aim to develop new kinds of statistical programming systems that provide probabilistically likely solutions to tasks that are difficult or impossible to solve with traditional approaches.

These statistical programming systems will be based on probabilistic models of massive codebases (also known as "Big Code"), built by combining advanced programming language techniques with powerful machine learning and natural language processing techniques. To solve a particular challenge, a statistical programming system will query a probabilistic model, compute the most likely predictions, and present those to the developer.
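The query-and-predict loop described above can be illustrated with a deliberately tiny sketch. It is not the model proposed in this project: it assumes a simple bigram model over code tokens, and the names `train_bigram_model`, `predict_next` and `TOY_CORPUS` are hypothetical, introduced only for illustration. A real system would learn from a massive codebase and use far richer program representations.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each token directly follows another token."""
    model = defaultdict(Counter)
    for tokens in corpus:
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, prev_token, k=3):
    """Return up to k (token, probability) pairs, most likely first."""
    counts = model.get(prev_token)
    if not counts:
        return []  # token never seen in the corpus: no prediction
    total = sum(counts.values())
    return [(tok, n / total) for tok, n in counts.most_common(k)]


# Hypothetical stand-in for "Big Code": four pre-tokenized statements.
TOY_CORPUS = [
    ["f", "=", "open", "(", "path", ")"],
    ["data", "=", "f", ".", "read", "(", ")"],
    ["f", ".", "close", "(", ")"],
    ["text", "=", "f", ".", "read", "(", ")"],
]

model = train_bigram_model(TOY_CORPUS)

# Query the model: which tokens most likely follow a "."?
suggestions = predict_next(model, ".")
for token, prob in suggestions:
    print(f"{token}: {prob:.2f}")
```

In this toy corpus, `read` follows `.` twice and `close` once, so the ranked list puts `read` first. The ranked list is what would be presented to the developer.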

Based on probabilistic models of "Big Code", we propose to investigate new statistical techniques along three fundamental research directions: i) statistical program synthesis, where we develop techniques that automatically synthesize and predict new programs; ii) statistical prediction of program properties, where we develop techniques that predict important facts about programs (e.g., types); and iii) statistical translation of programs, where we investigate techniques for translating programs (e.g., from one programming language to another, or to a natural language).

We believe the research direction outlined in this interdisciplinary proposal opens a new and exciting area of computer science. This area will combine sophisticated statistical learning and advanced programming language techniques for building the next-generation statistical programming systems.

We expect the results of this proposal to have an immediate impact upon millions of developers worldwide, triggering a paradigm shift in the way tomorrow's software is built, as well as a long-lasting impact on scientific fields such as machine learning, natural language processing, programming languages and software engineering.

Field of science

  • /natural sciences/computer and information sciences/data science/natural language processing
  • /natural sciences/computer and information sciences/software
  • /humanities/languages and literature/languages - general
  • /natural sciences/computer and information sciences/artificial intelligence/machine learning

Call for proposal

ERC-2015-STG

Funding Scheme

ERC-STG - Starting Grant

Host institution

EIDGENOESSISCHE TECHNISCHE HOCHSCHULE ZUERICH
Address
Raemistrasse 101
8092 Zuerich
Switzerland
Activity type
Higher or Secondary Education Establishments
EU contribution
€ 1 500 000

Beneficiaries (1)

EIDGENOESSISCHE TECHNISCHE HOCHSCHULE ZUERICH
Switzerland
EU contribution
€ 1 500 000
Address
Raemistrasse 101
8092 Zuerich
Activity type
Higher or Secondary Education Establishments