Project description
Solutions for faster distributed learning
Distributed learning faces two major challenges: delays caused by slower workers, known as stragglers, and high communication costs from sending large data. Gradient coding (GC) can help with stragglers, while using 1-bit data reduces communication loads. However, current methods do not address both issues together, especially when transmitting 1-bit data. Existing GC techniques also struggle to work with 1-bit encoded vectors. Supported by the Marie Skłodowska-Curie Actions programme, the 1-Bit GC-DL project tackles these problems by developing new approaches for distributed learning. It introduces the 1-Bit GC-DL method, which uses 1-bit gradient coding to handle stragglers while reducing data size. Additionally, a second method, 1-Bit LA-GC-DL, further cuts training time by selecting only key workers for each iteration.
Objective
In distributed learning, the gradient coding (GC) technique has been adopted to mitigate the negative impact of stragglers on training time. Separately, to reduce the high communication burden, 1-bit gradient vectors can be transmitted instead of real-valued ones. However, the existing distributed learning method based on 1-bit data does not take stragglers into account. In addition, current GC techniques are designed only for schemes in which real-valued encoded vectors are transmitted, and they are difficult to apply when 1-bit vectors are transmitted.
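To make the communication saving concrete, the following is a minimal sketch of 1-bit (sign) compression of a gradient vector. It is an illustration under stated assumptions, not the project's encoder: the helper names `sign_compress` and `pack_bits`, the choice to map zero to +1, and the example gradient values are all hypothetical.

```python
import struct

def sign_compress(gradient):
    """Map each real-valued entry to +1 or -1 (assumption: zero maps to +1)."""
    return [1 if g >= 0 else -1 for g in gradient]

def pack_bits(signs):
    """Pack a list of +1/-1 values into bytes, 8 entries per byte."""
    out = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

# Hypothetical 9-dimensional gradient, for illustration only.
gradient = [0.31, -1.2, 0.0, 4.5, -0.07, 2.2, -3.3, 0.9, -0.5]

signs = sign_compress(gradient)
payload = pack_bits(signs)

# Size of the same vector sent as 32-bit floats, for comparison.
full_precision_bytes = len(struct.pack(f"{len(gradient)}f", *gradient))

print(len(payload), "bytes as 1-bit data vs", full_precision_bytes, "bytes as float32")
```

Only the sign of each entry survives, so the server must work with an approximation of the gradient; this is the trade-off that any aggregation rule for 1-bit data has to account for.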
To overcome the above drawbacks and to reduce the communication overhead and the training time simultaneously, this project aims to propose novel distributed learning methods based on GC with 1-bit data. First, this project will propose a distributed learning method named 1-Bit GC-DL, which develops a 1-bit GC strategy to encode the locally computed gradient vectors of the allocated subsets into 1-bit data. Based on that, the aggregation rule at the central server for the received 1-bit data will be designed, which guarantees that the central server computes an approximate version of the true gradient vector in the presence of a certain number of stragglers. Second, to further reduce the training time of 1-Bit GC-DL, this project will propose a lazily aggregated distributed learning method based on 1-bit GC, i.e. 1-Bit LA-GC-DL, by combining 1-Bit GC-DL with the lazily aggregated strategy. In 1-Bit LA-GC-DL, only a fraction of the workers participate in local training during each iteration, and this project will provide the criterion for selecting the participating workers based on Age of Information. The proposed methods will be compared with other state-of-the-art methods in the context of distributed learning on both simulated and realistic datasets under practical scenarios.
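The two ideas described above can be sketched roughly as follows. This is a toy illustration only: the project text does not specify the encoding, the aggregation rule, or the selection criterion, so the cyclic subset allocation, the sign-of-sum encoding, the coordinate-wise majority vote, and the "largest age first" selection used here are all assumptions, and every function name is hypothetical.

```python
def cyclic_allocation(num_workers, redundancy):
    """Assumed allocation: worker i holds subsets i, i+1, ..., i+redundancy-1 (mod n)."""
    return [[(i + j) % num_workers for j in range(redundancy)]
            for i in range(num_workers)]

def encode_1bit(subset_grads, assigned):
    """Sum the gradients of the assigned subsets, then keep only the signs."""
    dim = len(subset_grads[0])
    total = [sum(subset_grads[k][d] for k in assigned) for d in range(dim)]
    return [1 if t >= 0 else -1 for t in total]

def aggregate(received):
    """Assumed server rule: coordinate-wise majority vote over received 1-bit vectors."""
    dim = len(received[0])
    votes = [sum(vec[d] for vec in received) for d in range(dim)]
    return [1 if v >= 0 else -1 for v in votes]

def select_by_age(ages, k):
    """Assumed criterion: pick the k workers with the largest age
    (iterations since they last participated); ties go to the lower index."""
    order = sorted(range(len(ages)), key=lambda i: (-ages[i], i))
    return sorted(order[:k])

# Hypothetical per-subset gradients: 4 data subsets, 3 model parameters.
subset_grads = [
    [1.0, -2.0, 0.5],
    [0.5, -1.0, -0.2],
    [2.0, 0.5, -1.0],
    [-0.5, -1.5, -0.3],
]
dim = len(subset_grads[0])
true_sum = [sum(g[d] for g in subset_grads) for d in range(dim)]
true_sign = [1 if t >= 0 else -1 for t in true_sum]

allocation = cyclic_allocation(num_workers=4, redundancy=2)
encoded = [encode_1bit(subset_grads, assigned) for assigned in allocation]

# Worker 3 is a straggler in this round, so the server only hears from workers 0-2.
received = [encoded[w] for w in (0, 1, 2)]
estimate = aggregate(received)

# Age-based selection for the next round (ages are illustrative values).
selected = select_by_age(ages=[0, 3, 1, 5], k=2)

print("estimate:", estimate, "true sign:", true_sign, "selected workers:", selected)
```

In this toy instance the redundancy of two subsets per worker means every subset is still covered when one worker is missing, and the vote happens to recover the sign of the true gradient. That outcome is specific to these example values and is not a general guarantee.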
Keywords
Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
HORIZON.1.2 - Marie Skłodowska-Curie Actions (MSCA)
MAIN PROGRAMME
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
HORIZON-TMA-MSCA-PF-EF - HORIZON TMA MSCA Postdoctoral Fellowships - European Fellowships
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
HORIZON-MSCA-2023-PF-01
Coordinator
Net EU financial contribution. The sum of money that the participant receives, deducted by the EU contribution to its linked third party. It considers the distribution of the EU financial contribution between direct beneficiaries of the project and other types of participants, like third-party participants.
100 44 STOCKHOLM
Sweden
The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.