CORDIS - EU research results

Understanding and Fixing Bottlenecks in Optimization for Modern Machine Learning

Project description

Understanding modern machine learning training bottlenecks

Despite the growing use of modern machine learning models, their development remains largely undocumented and difficult to comprehend. The high costs and resource-intensive training processes complicate understanding, while current theoretical frameworks offer limited insights. As a result, this limits the accessibility of machine learning to industries and researchers without significant resources. Supported by the Marie Skłodowska-Curie Actions programme, the Bernar project aims to enhance our understanding of bottlenecks in neural network training, their negative impact on optimisation, and how to address these challenges. The project will identify where additional algorithmic resources are needed, uncover novel bottlenecks, and develop a theory for early detection of bottlenecks to improve performance.

Objective

Modern machine learning models have been successfully deployed across fields, from scientific studies to technological developments in industry, but their development remains poorly understood. The training of a large language model such as GPT-3 is estimated to cost $4.6M, and public attempts to replicate the training process alone required teams of engineers rotating on call for months, monitoring various statistics and constantly tweaking the training procedure when it broke. Existing theoretical frameworks offer limited insights into this process, as they do not capture the main difficulties that arise in practice when training neural networks, leaving practitioners to rely on error-prone heuristics and expensive trial and error. This not only leads to a large development cost dominated by wasted resources, but also limits the possible impact of machine learning to areas considered profitable by industries that have the resources to carry out this development.

The objective of this project is to build a better understanding of how recently identified bottlenecks in neural network training slow down optimization and how to address them. The specific aims are to: (a) understand the impact of class imbalance on the dynamics of neural networks to identify where to allocate algorithmic resources; (b) develop a theory to capture optimization difficulties early in training to guide the development of algorithms that improve performance during this crucial phase; and (c) identify new bottlenecks that arise from applications to new data types.

The project combines the experimental expertise of the postdoctoral fellow and the theoretical expertise of the host institution to identify and describe the real impact of data characteristics on neural network training. Understanding these bottlenecks will help develop more efficient and reliable algorithms, along with best-practice guidelines that depend on properties of the data.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Keywords

Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

HORIZON-TMA-MSCA-PF-EF - HORIZON TMA MSCA Postdoctoral Fellowships - European Fellowships

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

HORIZON-MSCA-2024-PF-01

Coordinator

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE
Net EU contribution

Net EU financial contribution. The sum of money that the participant receives, deducted by the EU contribution to its linked third party. It considers the distribution of the EU financial contribution between direct beneficiaries of the project and other types of participants, like third-party participants.

€ 226 420,56
Address
DOMAINE DE VOLUCEAU ROCQUENCOURT
78153 Le Chesnay Cedex
France

Activity type
Research Organisations
Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data