Self Management for large-scale distributed systems based on structured overlay Networks and components

Exploitable results

The Challenge: Large Internet Applications

As Internet applications become larger and more complex, the task of managing them becomes overwhelming. "Abnormal" events such as software updates, faults, threats, and performance hotspots become normal and even frequent occurrences. The goal of SELFMAN is to handle these events automatically by making the applications self-managing. They will reconfigure themselves to handle changes in their environment or requirements without human intervention but according to high-level management policies. We focus on four axes of self management, namely self configuration, self healing, self tuning, and self protection.

The SELFMAN project approached the problem by combining the complementary strengths of structured overlay networks and advanced component models. Structured overlay networks are already self-organizing. They originated with peer-to-peer file sharing applications, but have matured to provide strong guarantees and efficient communication and storage operations. As such, they form an ideal foundation for hosting self-managing services. To achieve this, we leverage the overlay network by rebuilding it using advanced component models. These make the overlay network modular and add the hooks needed for self management. The hooks provide introspection, reflection, and dynamic reconfiguration abilities. In this way, the overlay network can host an application that manages itself. SELFMAN has reconstructed the structured overlay network as part of a self-managing component architecture and used it to build high-level self-managing services.
We have built powerful services, including transactional replicated storage and media streaming, and we have built scalable Internet applications on top of these services: a media streaming product, PeerTV, that provides competitive quality of service at much lower cost than existing products; a Distributed Wikipedia that is competitive in performance with the standard Wikipedia but is scalable; and a decentralized drawing application, DeTransDraw, for mobile phones (gPhones) that allows local editing while keeping global drawing coherence.

Breakthrough results

SELFMAN has developed two breakthrough results as well as numerous scientific advances at all levels of self management for distributed systems. The breakthrough results are in distributed transactional storage and media distribution:

PeerTV. PeerTV is a state-of-the-art application, developed as a product by partner Peerialism, that distributes video over the Internet with live streaming and progressive download. It uses advanced technology based on peer-to-peer structured overlay networks, optimization algorithms, and component models. It uses the Chandelier dynamic peer-to-peer optimization algorithm and was developed with an integrated simulation tool, MyP2PWorld, which radically sped up development and shortened time to market. It uses new techniques for firewall hole punching and NAT traversal of video streams. PeerTV has QoS comparable to that of leading distribution providers but at much lower cost. PeerTV was developed for and is being used by the Swedish company MPS Broadband AB.

Scalaris. Scalaris is an open-source library that provides a scalable global storage service. Scalaris is built on top of a structured peer-to-peer network and implements a key/value store that supports transactions. It survives node crashes and network problems using replication and a sophisticated consensus algorithm.
It scales to hundreds of nodes and provides strong data consistency in the face of concurrent operations, node failures, and network problems. It does load balancing with an algorithm that uses estimates of global knowledge to reduce the storage load deviation between nodes. Scalaris was used to implement a version of Wikipedia that won first prize in the IEEE International Scalable Computing Challenge 2008. It is competitive in performance with the actual Wikipedia backend (14,000 read+write transactions per second on a synthetic benchmark) but is more robust and scalable.

Major scientific results

SELFMAN has produced many scientific results. Here are the three most important:

- Atomic transactions on a peer-to-peer network. We have developed a transaction manager that works for data stored on structured peer-to-peer networks. The heart of the transaction manager is the algorithm that implements the atomic commit. It is an optimized version of Lamport's Paxos uniform consensus algorithm that needs only four communication steps instead of six in the common case when there are no failures. The improvement was possible by exploiting information from the data replication in the structured peer-to-peer network. The transaction manager works correctly under realistic Internet conditions, where at any time nodes can crash or the network can have problems. The Scalaris library uses the atomic commit algorithm to implement strong data consistency and atomic transactions.
- Merge algorithm for network partitioning. We have solved the network partitioning problem for structured overlay networks. If the network is partitioned, the overlay splits into several smaller overlays, each of which continues to provide its service as best it can with the nodes it contains. If the partition goes away (the network is repaired), then the merge algorithm will automatically combine the smaller overlays back into a single large overlay, thus restoring the complete service.
This behavior can be seen as a reversible phase transition, in analogy with thermodynamics. It can be used to build extremely robust Internet applications that survive network partitioning.
- Methodology for building self-managing applications. The SELFMAN project has pushed the state of the art in software development techniques for self-managing applications. Out of the practical experience of SELFMAN, we have derived a development methodology for large-scale distributed systems based on the concept of weakly interacting feedback structures. A feedback structure is a hierarchy of interacting feedback loops that together maintain one global system property. This gives a much more natural and powerful way of designing large systems than the traditional layered approach. We have applied this methodology to decentralized distributed systems and in particular to the Beernet and Scalaris systems. Considered in this way, Scalaris consists of six weakly interacting feedback structures. We are continuing to develop and extend this methodology, especially for the development of extremely robust applications, by generalizing the idea of reversible phase transitions.
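
To make the notion of a feedback structure more concrete, the following is a minimal sketch of two feedback loops arranged in a small hierarchy: an inner loop that repairs a replica set after crashes, and an outer loop that tunes the inner loop's target. It is an illustration only, not code from Scalaris or Beernet. The names `FeedbackLoop`, `ReplicaSet`, `recruit`, `set_target`, the monitor/correct/actuate split, and the tuning policy are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FeedbackLoop:
    """One loop: observe the system, decide on an action, apply it."""
    monitor: Callable[[Any], Any]        # system -> observation
    correct: Callable[[Any], Any]        # observation -> action
    actuate: Callable[[Any, Any], None]  # (system, action) -> side effect

    def step(self, system: Any) -> Any:
        action = self.correct(self.monitor(system))
        self.actuate(system, action)
        return action


class ReplicaSet:
    """Toy managed subsystem (hypothetical): live replicas plus spare nodes."""

    def __init__(self, target, live, spares):
        self.target = target
        self.live = set(live)
        self.spares = list(spares)
        self.crashes_seen = 0

    def crash(self, node):
        self.live.discard(node)
        self.crashes_seen += 1


def recruit(system, count):
    # Move up to `count` spare nodes into the live set.
    for _ in range(min(count, len(system.spares))):
        system.live.add(system.spares.pop(0))


def set_target(system, target):
    system.target = target


def run_example():
    rs = ReplicaSet(target=3, live=["a", "b", "c"], spares=["d", "e", "f"])

    # Inner loop: keep the number of live replicas at the current target.
    repair = FeedbackLoop(
        monitor=lambda s: s.target - len(s.live),
        correct=lambda deficit: max(0, deficit),
        actuate=recruit,
    )
    # Outer loop: raise the target as crashes accumulate (made-up policy).
    tune = FeedbackLoop(
        monitor=lambda s: s.crashes_seen,
        correct=lambda crashes: 3 + min(crashes, 2),
        actuate=set_target,
    )

    rs.crash("a")
    tune.step(rs)    # outer loop adjusts the inner loop's goal
    repair.step(rs)  # inner loop restores the property
    return rs


if __name__ == "__main__":
    rs = run_example()
    print(sorted(rs.live), rs.target, rs.spares)
```

The two loops interact only through the shared `target` value, which is one simple reading of "weakly interacting": each loop can be understood and tested on its own, while together they maintain a single property of the replica set.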