CORDIS - EU research results

Re-thinking Efficiency in Deep Learning under Accelerators and commodity and processors

Periodic Reporting for period 2 - REDIAL (Re-thinking Efficiency in Deep Learning under Accelerators and commodity and processors)

Reporting period: 2022-03-01 to 2023-08-31

The emergence of deep learning in recent years has revolutionized computer functionality in numerous areas, including facial recognition, autonomous driving, and language translation. Although deep learning has become a vital tool for tackling many challenges in artificial intelligence and machine learning, the high resource requirements for routine deep learning activities, such as model training, are hindering progress and limiting participation in the field to only large corporations. The goal of REDIAL is to overcome these technical challenges by improving the efficiency of deep learning. This will involve developing a theoretical understanding of current approaches to deep learning efficiency and creating new architectures and methods for training and inference that can handle core efficiency bottlenecks, such as limited parallelization and excessive on-chip data movement. This project aims to facilitate the adoption of analog processing in accelerators and produce new deep architectures and algorithms that promote high efficiency. The ultimate goal is to revolutionize the way models are trained and deployed on constrained devices, thereby paving the way for new innovations in machine learning.
Since the beginning of REDIAL we have made two major strides towards our aims. The first is in the direction of micro-controller-based model compression. The second advances collaborative machine learning through new forms of federated learning.

By inventing a series of core algorithms with theoretical underpinnings and applying them to micro-controllers, we have substantially increased the machine learning capabilities of these devices. The methods include a novel neural architecture search method called uNAS and the first differentiable pruning method able to operate under the extreme constraints encountered by micro-controllers.
As a result, our methods allow a micro-controller to execute machine learning inference in a low-cost and privacy-preserving manner not seen before. We also performed a first-of-its-kind measurement study to understand in depth the capabilities of micro-controllers under common deep learning operations.
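To illustrate the kind of constraint this line of work targets, the sketch below keeps only the largest-magnitude weights that fit a micro-controller's memory budget. The function name, budget model, and toy sizes are hypothetical simplifications for illustration, not the uNAS or differentiable pruning methods themselves.

```python
import numpy as np

def prune_to_budget(weights, budget_bytes, bytes_per_weight=4):
    """Keep only the largest-magnitude weights that fit a memory budget.

    A simplified, hypothetical illustration of constraint-driven pruning
    for micro-controllers; not REDIAL's actual method.
    """
    flat = np.concatenate([w.ravel() for w in weights])
    max_params = budget_bytes // bytes_per_weight
    if flat.size <= max_params:
        return weights  # model already fits the budget
    # Threshold = magnitude of the max_params-th largest weight
    threshold = np.sort(np.abs(flat))[-max_params]
    return [np.where(np.abs(w) >= threshold, w, 0.0) for w in weights]

# Example: shrink a toy two-layer model to fit 256 bytes (64 float32 weights)
rng = np.random.default_rng(0)
layers = [rng.normal(size=(10, 10)), rng.normal(size=(10, 5))]
pruned = prune_to_budget(layers, budget_bytes=256)
kept = sum(int(np.count_nonzero(w)) for w in pruned)
print(kept)  # at most 64 nonzero weights remain
```

In practice the real constraints are richer (flash vs. SRAM, operator support, activation memory at peak), which is why a measurement study of micro-controller behaviour under deep learning operations matters.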

The second stride advances the ability of individuals to collaborate when building machine learning models, through a technique known as federated learning. We invented a framework called Flower that has become globally popular and is now the most widely used way to develop federated learning workloads in the community. A key supporting innovation is the secure aggregation strategy we developed within this framework, complementing its overall design.
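The aggregation step at the heart of federated learning can be sketched as a weighted average of client model updates (the classic FedAvg rule). This is a minimal conceptual illustration, not Flower's actual API; the function name and toy clients are assumptions.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg).

    A minimal sketch of the aggregation step underlying federated
    learning frameworks such as Flower; not Flower's actual API.
    """
    total = sum(client_sizes)
    return [
        sum(size / total * w[i] for w, size in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Two clients, each holding one weight matrix; client 2 has 3x the data
w_a = [np.ones((2, 2))]
w_b = [np.zeros((2, 2))]
avg = federated_average([w_a, w_b], client_sizes=[1, 3])
print(avg[0])  # every entry is 0.25 = (1*1 + 3*0) / 4
```

Secure aggregation addresses the weakness of this plain version: here the server sees each client's raw update, whereas a secure protocol lets it recover only the aggregate.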

Our closing work is more general than the first strand described above, although related to it: a meta-learning-informed pruning strategy. We demonstrate that this technique outperforms all currently known methods of its type. The core idea is to account for second-order effects during the pruning process in a way that remains computationally tractable.
We would highlight the two core initiatives, theoretically sound advances in model compression and advances in federated learning, as the directions in which we have clearly pushed forward the state of the art. In these two areas we can enable outcomes that would previously have been impossible.
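One classical way to make the second-order idea concrete is the Optimal Brain Damage saliency, which estimates the loss increase from zeroing a weight using the diagonal of the Hessian. The sketch below shows this textbook approximation, not necessarily the meta-learning method developed in REDIAL; the toy curvature values are invented.

```python
import numpy as np

def second_order_saliency(weights, hessian_diag):
    """Optimal Brain Damage-style saliency: estimated loss increase
    from zeroing each weight, 0.5 * H_ii * w_i^2.

    Illustrates why second-order information changes pruning decisions;
    a classic diagonal approximation, not REDIAL's specific method.
    """
    return 0.5 * hessian_diag * weights ** 2

w = np.array([0.1, 1.0, 0.5])
h = np.array([100.0, 0.01, 1.0])  # toy loss curvature per weight
s = second_order_saliency(w, h)
print(np.argsort(s))  # prune lowest-saliency first: [1 2 0]
```

Note how the smallest-magnitude weight (0.1) has the highest saliency because of its high curvature, so magnitude-only pruning would make the wrong choice here; accounting for second-order effects avoids that mistake.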

In relation to results expected in the remainder of REDIAL, we anticipate the following:

= Development of co-designed accelerators that correspond to the algorithmic contributions we have made thus far
= Improved data-movement methodologies
= Increased outreach and dissemination.