
Robust algorithms for learning from modern data

Objective

Machine learning is needed and used everywhere, from science to industry, with a growing impact on many disciplines. While the first successes were due at least in part to simple supervised learning algorithms used primarily as black boxes on medium-scale problems, modern data pose new challenges. Scalability is, of course, an important issue: with large amounts of data, many current problems far exceed the capabilities of existing algorithms despite sophisticated computing architectures. But beyond this, the core classical model of supervised machine learning, with the usual assumptions of independent and identically distributed data, or well-defined features, outputs and loss functions, has reached its theoretical and practical limits.

Existing optimization-based algorithms are poorly suited to this new setting. The main objective of this proposal is to push the frontiers of supervised machine learning, in terms of (a) scalability to data with massive numbers of observations, features, and tasks, (b) adaptability to modern computing environments, in particular for parallel and distributed processing, (c) provable adaptivity and robustness to problem and hardware specifications, and (d) robustness to non-convexities inherent in machine learning problems.

To achieve the expected breakthroughs, we will design a novel generation of learning algorithms amenable to a tight convergence analysis with realistic assumptions and efficient implementations. They will help transition machine learning algorithms towards the same widespread, robust use as numerical linear algebra libraries. Outcomes of the research described in this proposal will include algorithms that come with strong convergence guarantees and are well-tested on real-life benchmarks coming from computer vision, bioinformatics, audio processing and natural language processing. For both distributed and non-distributed settings, we will release open-source software, adapted to widely available computing platforms.

Field of science

  • /natural sciences/computer and information sciences/artificial intelligence/computer vision
  • /natural sciences/computer and information sciences/data science/natural language processing
  • /natural sciences/mathematics/pure mathematics/algebra/linear algebra
  • /natural sciences/computer and information sciences/artificial intelligence/machine learning/supervised learning
  • /natural sciences/computer and information sciences/artificial intelligence/machine learning

Call for proposal

ERC-2016-COG

Funding Scheme

ERC-COG - Consolidator Grant

Host institution

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE
Address
Domaine De Voluceau Rocquencourt
78153 Le Chesnay Cedex
France
Activity type
Research Organisations
EU contribution
€ 1 998 750

Beneficiaries (1)

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE
France
EU contribution
€ 1 998 750
Address
Domaine De Voluceau Rocquencourt
78153 Le Chesnay Cedex
Activity type
Research Organisations