Robust algorithms for learning from modern data

Project information

SEQUOIA

Grant agreement ID: 724063

DOI

10.3030/724063

Closed project

EC signature date 16 March 2017

Start date 1 September 2017

End date 31 August 2023

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 1 998 750,00

EU contribution

€ 1 998 750,00

Coordinated by

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE
France

Periodic Reporting for period 4 - SEQUOIA (Robust algorithms for learning from modern data)

Reporting period: 2022-03-01 to 2023-08-31

Machine learning is needed and used everywhere, from science to industry, with a growing impact on many disciplines. While first successes were due at least in part to simple supervised learning algorithms used primarily as black boxes on medium-scale problems, modern data pose new challenges. Scalability is an important issue of course: with large amounts of data, many current problems far exceed the capabilities of existing algorithms despite sophisticated computing architectures. But beyond this, the core classical model of supervised machine learning, with the usual assumptions of independent and identically distributed data, or well-defined features, outputs and loss functions, has reached its theoretical and practical limits.

In this new setting, existing optimization-based algorithms are no longer adequate. The main objective of this proposal is to push the frontiers of supervised machine learning, in terms of (a) scalability to data with massive numbers of observations, features, and tasks, (b) adaptability to modern computing environments, in particular for parallel and distributed processing, (c) provable adaptivity and robustness to problem and hardware specifications, and (d) robustness to non-convexities inherent in machine learning problems.

To achieve the expected breakthroughs, we will design a novel generation of learning algorithms amenable to a tight convergence analysis under realistic assumptions and to efficient implementations. They will help transition machine learning algorithms towards the same widespread, robust use as numerical linear algebra libraries. Outcomes of the research described in this proposal will include algorithms that come with strong convergence guarantees and are well tested on real-life benchmarks from computer vision, bioinformatics, audio processing and natural language processing. For both distributed and non-distributed settings, we will release open-source software adapted to widely available computing platforms.

During the project, we have focused primarily on five important topics:

- Distributed algorithms:
We have proposed a general framework for the analysis of distributed optimization algorithms that makes it possible both to (a) derive new algorithms together with their convergence bounds, and (b) prove complexity lower bounds, showing that no algorithm can ever achieve a better complexity. This was done for both centralized and decentralized approaches, and for machine learning problems as well as problems beyond machine learning.

- Stochastic gradient algorithms:
In a series of papers, we provided a refined analysis of stochastic gradient techniques for positive definite kernel methods, showing in particular that (1) they can converge exponentially fast in some common scenarios, and (2) multiple passes over the data can be beneficial (the first time this has been proved mathematically).

- Analysis of neural network training:
In a series of papers, we analyzed how gradient descent could lead to global convergence guarantees, which is particularly difficult because this is a non-convex optimization problem.

- Black-box algorithms by kernel sums-of-squares:
In a series of papers, we have developed a new framework for black-box optimization, which opens up a new area of research by combining two previously unrelated lines of work.

- Automatic proofs:
Given how technical the convergence proofs of optimization algorithms are, dedicated tools that automate these proofs and help discover both new proofs and new algorithms have been a key achievement.
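
To make the decentralized setting mentioned under distributed algorithms more concrete, the sketch below shows gossip averaging, a standard textbook building block for decentralized optimization. It is not one of the project's algorithms; the ring topology, the equal 1/3 weights and the function names `gossip_round` and `decentralized_average` are assumptions chosen for illustration. Each node only ever communicates with its two neighbours, yet all nodes converge to the global mean.

```python
import numpy as np


def gossip_round(values):
    """One synchronous averaging round on a ring (illustrative assumption).

    Each node replaces its value by the mean of its own value and the
    values held by its left and right neighbours.
    """
    return (values + np.roll(values, 1) + np.roll(values, -1)) / 3.0


def decentralized_average(values, rounds):
    """Run `rounds` gossip rounds; no node ever sees the full data set."""
    values = np.asarray(values, dtype=float)
    for _ in range(rounds):
        values = gossip_round(values)
    return values


if __name__ == "__main__":
    local_values = np.arange(8.0)  # node i starts with the value i
    print(decentralized_average(local_values, rounds=100))
```

The averaging weights sum to one for every node and are symmetric, so the global mean is preserved at every round while the disagreement between nodes shrinks geometrically. The rate of that shrinkage depends on the communication graph, which is the kind of quantity that convergence bounds for decentralized methods are typically expressed in.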

These results have all been published in the most prestigious venues in machine learning and optimization.

The ERC project has made it possible to pursue research at the forefront of machine learning and optimization, notably with a new framework for distributed optimization and a new set of algorithms for black-box optimization, both of which have strong potential for future research and applications.
