CORDIS - EU research results

A Theory for Understanding, Designing, and Training Deep Learning Systems

Periodic Reporting for period 5 - THUNDEEP (A Theory for Understanding, Designing, and Training Deep Learning Systems)

Reporting period: 2024-09-01 to 2025-07-31

Deep learning, in the form of artificial neural networks, is one of the most rapidly evolving fields in machine learning, with wide-ranging impact on real-world applications. Neural networks can efficiently represent complex predictors and are nowadays routinely trained with success. Unfortunately, our scientific understanding of neural networks remains rudimentary: most methods used to design and train these systems rest on rules of thumb and heuristics, and there is a drastic theory-practice gap in our understanding of why these systems actually work. We believe this poses a significant risk to the long-term health of the field, as well as an obstacle to widening the applicability of deep learning beyond what current methods achieve. The goal of this project is to develop principled tools for understanding, designing, and training deep learning systems, based on rigorous theoretical results. This is a major challenge in this rapidly evolving field, and any progress along these lines is expected to have a substantial impact on the theory and practice of creating such systems. To do so, we focus on three interrelated sources of performance loss in neural network learning: the optimization error (how to train a given network in a computationally efficient manner); the estimation error (how to ensure that training a network on a finite training set yields good performance on future examples); and the approximation error (how architectural choices affect the types of functions a network can compute).
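The three error sources above can be made precise via the standard excess-risk decomposition from statistical learning theory; the notation below is illustrative and not taken from the project itself:

```latex
% One standard decomposition of the excess risk of a learned predictor
% \hat{h} over a network class \mathcal{H} (illustrative notation):
\[
L(\hat h) - L^{*}
  = \underbrace{L(\hat h) - L(h_{\mathrm{ERM}})}_{\text{optimization error}}
  + \underbrace{L(h_{\mathrm{ERM}}) - L(h_{\mathcal H})}_{\text{estimation error}}
  + \underbrace{L(h_{\mathcal H}) - L^{*}}_{\text{approximation error}},
\]
% where L is the expected loss, L^{*} the best achievable (Bayes) risk,
% h_{\mathcal H} the best predictor in \mathcal{H}, and h_{\mathrm{ERM}}
% the empirical risk minimizer on the finite training set.
```

Each of the project's three objectives targets one term of this sum: efficient training controls the first, generalization guarantees the second, and architectural expressiveness the third.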
The project made a substantial impact on the research community's understanding of deep learning and artificial neural networks, and led to significant advances across all of the project's objectives. Notable highlights include a much better understanding of the role of a network's depth and width in the types of functions it can express; a more rigorous understanding of deep learning systems' ability to generalize successfully (and the types of generalization behavior that can be expected); and new practical algorithms pertaining to deep learning, such as dataset reconstruction from trained neural networks. In addition, the project led to several successful offshoots in convex and non-convex optimization, which underlie the training of neural networks, such as results on finding stationary points and on the performance of stochastic gradient-based methods across several settings.
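To make the object of study concrete, here is a minimal, self-contained sketch (not code from the project) of the kind of stochastic gradient-based training of a neural network whose behavior the project analyses: a one-hidden-layer ReLU network fitted to a toy regression task, with all sizes and learning-rate choices hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) on [-2, 2]
X = rng.uniform(-2, 2, size=(256, 1))
y = np.sin(X)

# One hidden ReLU layer of width 32 (hypothetical architecture choice)
W1 = rng.normal(0, 1.0, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 1)); b2 = np.zeros(1)

def forward(x):
    """Return hidden activations and network output."""
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h, h @ W2 + b2

lr, batch = 0.05, 32
for step in range(2000):
    # Sample a minibatch (stochastic gradient step)
    idx = rng.integers(0, len(X), batch)
    xb, yb = X[idx], y[idx]
    h, pred = forward(xb)
    err = pred - yb  # gradient of squared loss w.r.t. the output

    # Backpropagation by hand
    gW2 = h.T @ err / batch
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (h > 0)       # gradient through the ReLU
    gW1 = xb.T @ dh / batch
    gb1 = dh.mean(axis=0)

    # Gradient descent update
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred_all = forward(X)
mse = float(np.mean((pred_all - y) ** 2))
```

Even this tiny example exhibits the questions the project studies: the loss surface is non-convex, yet plain stochastic gradient descent reliably drives the training error down, and the trained network generalizes to unseen inputs from the same distribution.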
By the end of the project, we had significantly advanced the rigorous understanding of deep learning and placed it on a much firmer theoretical footing. The objectives we pursued yielded new insights into how to design and train deep learning systems in practice.
[Figure: reconstruction.png]