Final Report Summary - AOC (Adversary-Oriented Computing)
Computing is nowadays mainly distributed. At one extreme, we have small-scale distributed systems where a multiprocessor machine hosts hundreds of computing devices that communicate through shared memory. Each algorithm running on such a system has to be distributed to leverage the underlying parallelism. At the other extreme, we have large-scale distributed systems connecting computing nodes located all over the world. Here again, algorithms are inherently distributed, for they connect and leverage geographically distinct data. Between these two extremes lies a large class of distributed architectures.
Distribution generally boosts efficiency, namely throughput and storage capacity. In principle, it also boosts robustness: a computing system that operates on several machines can be more reliable than one that operates on a single machine, which constitutes a single point of failure. But achieving robustness is not trivial and leads to distributed algorithms that are intricate and sometimes inefficient. Not surprisingly, the complexity of these distributed algorithms depends on the severity of the failures to be tolerated. This severity is typically modeled by an adversary: an abstract entity that is assumed to control part of the distributed system. Algorithms designed for an adversary that can only crash one machine are intuitively simpler than those designed to cope with an adversary that can make several nodes send malicious messages whose sole purpose is to defeat the entire computation.
This project addresses the general problem of how to design, in a modular manner, robust distributed algorithms whose complexity depends dynamically on the kind of adversary the computation actually faces. In short, given an adversary A against which we want our distributed system to be robust, we wish to devise an algorithm A1 that encompasses a subalgorithm A2 that is more efficient than A1 if the adversary turns out to be weaker than A. Furthermore, we want A2 to be designed and tested independently of A1. We argue that this modular approach is key to designing strongly robust distributed algorithms and systems: the complexity of both the algorithms and the code can be significantly reduced.
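The modular idea can be illustrated with a minimal sketch. It is not taken from the project; it is a toy example under stated assumptions: a value is replicated on n >= 2f + 1 replicas, all correct replicas hold the same value, and the adversary A controls at most f replicas that may return arbitrary values. The names fast_read, robust_read and read are hypothetical. Here fast_read plays the role of A2 (it contacts only f + 1 replicas and aborts if they disagree), and read plays the role of A1 (it falls back to a majority vote over all replicas).

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical model: a replica is a function returning its copy of the value.
Replica = Callable[[], int]


def fast_read(replicas: Sequence[Replica], f: int) -> Optional[int]:
    """A2: query only f + 1 replicas; return None (abort) unless all agree."""
    answers = [replica() for replica in replicas[: f + 1]]
    # Among f + 1 identical answers, at least one comes from a correct
    # replica, so a unanimous answer is the correct value.
    return answers[0] if len(set(answers)) == 1 else None


def robust_read(replicas: Sequence[Replica], f: int) -> int:
    """Fallback: query every replica and keep the value reported f + 1 times."""
    value, votes = Counter(replica() for replica in replicas).most_common(1)[0]
    if votes < f + 1:
        raise RuntimeError("more than f replicas appear to be faulty")
    return value


def read(replicas: Sequence[Replica], f: int) -> int:
    """A1: try the cheap path first, then fall back to the robust one."""
    if len(replicas) < 2 * f + 1:
        raise ValueError("need at least 2f + 1 replicas")
    result = fast_read(replicas, f)
    return result if result is not None else robust_read(replicas, f)


def honest() -> int:
    return 7


def liar() -> int:
    return 9


# No faulty replica: the fast path alone answers.
print(read([honest, honest, honest], f=1))
# One faulty replica among the first f + 1: the fast path aborts and the
# fallback outvotes the faulty answer.
print(read([liar, honest, honest], f=1))
```

Because fast_read either returns a correct value or aborts, it can be tested on its own, without any knowledge of the fallback; this is the kind of independence the modular approach aims for.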
The project came up with novel techniques to achieve this strong form of robustness while minimizing complexity. A complete theoretical framework was devised and applied in classical settings such as shared memory and message passing. We also explored new models, such as those with persistent memory and remote shared memory (RDMA), and produced the first theoretical papers to reason about such models. We also explored new environments, such as those considered for distributed machine learning, and initiated the research on their robustness. Last but not least, the project resulted in a new scalable protocol for building distributed cryptocurrencies. An associated Proof of Concept ERC project has just been launched.