Clouds have revolutionised the computing landscape by offering a variety of service models on top of shared commodity hardware. Recent years have seen the cloud landscape transform to accommodate increasingly sophisticated services, including HPC-like services, Big Data Analytics, Machine Learning and Artificial Intelligence. The success in providing and extending these services depends on how well the cloud can manage heterogeneity, not only in the services being offered but also in the software tools and hardware needed to support them. CloudLightning provides a novel, single, extensible architecture for the next generation of cloud computing. It is a framework for managing heterogeneity at any scale, capable of provisioning heterogeneous cloud resources to deliver services that the user specifies in a bespoke service description language. This framework coherently addresses a number of topical issues in cloud computing, including: incorporating and dynamically constructing HPC environments; efficiently managing heterogeneous resources at scale (e.g. with respect to utilisation and power consumption); readily incorporating new hardware and new types of hardware; addressing over-provisioning by using profiled services; making complex service workflows available through a Blueprint-as-a-Service (BPaaS) delivery model; and automating service discovery, resource selection and service deployment without having to use resource reservation.
The CloudLightning architecture is designed to be highly scalable and extensible (via a novel Plug and Play mechanism), embracing different types of heterogeneity. A unique feature of this approach is that it facilitates both the incorporation and the dynamic construction of HPC environments. In the former case, HPC machines can be added to the CloudLightning resource fabric by registering the resource manager of the HPC machine as a CloudLightning resource. In the latter case, HPC-like environments can be dynamically constructed, in order to support a particular service, from resources co-located on the same low-latency network, thus providing a mechanism to offer HPC-as-a-Service.
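The two routes to HPC described above can be illustrated with a minimal Python sketch. It is not the CloudLightning implementation or its API: the names `Resource`, `ResourceFabric`, `register` and `build_hpc_environment`, and the idea of tagging each resource with a network identifier, are assumptions introduced purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of one entry in the resource fabric."""
    name: str
    kind: str     # e.g. "gpu", "cpu", or "hpc-manager" for a registered HPC resource manager
    network: str  # assumed identifier of the low-latency network the resource sits on


@dataclass
class ResourceFabric:
    """Hypothetical registry standing in for the resource fabric."""
    resources: list = field(default_factory=list)

    def register(self, resource: Resource) -> None:
        # Incorporation: an existing HPC machine joins the fabric by
        # registering its resource manager like any other resource.
        self.resources.append(resource)

    def build_hpc_environment(self, kind: str, count: int):
        # Dynamic construction: pick `count` resources of the requested kind
        # that all share one low-latency network; return None if no single
        # network can supply that many.
        by_network = defaultdict(list)
        for resource in self.resources:
            if resource.kind == kind:
                by_network[resource.network].append(resource)
        for members in by_network.values():
            if len(members) >= count:
                return members[:count]
        return None


fabric = ResourceFabric()
fabric.register(Resource("cluster-rm", "hpc-manager", "net-a"))
for i in range(3):
    fabric.register(Resource(f"gpu-{i}", "gpu", "net-b"))
fabric.register(Resource("gpu-3", "gpu", "net-c"))

environment = fabric.build_hpc_environment("gpu", 3)
```

In this toy setting a request for three GPUs is satisfied from `net-b` alone, whereas a request for four returns `None` because no single network holds that many; a real system would also have to account for availability, placement policy and the service's profile.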
An important objective of CloudLightning was to remove the burdens of low-level service provisioning, optimisation and orchestration from the cloud consumer. A related objective was to locate decisions pertaining to resource usage with individual resource components, where optimal decisions could be made. To achieve these objectives, a system was created that is composed of a hierarchy of resource managers and employs self-organisation and self-management strategies. By addressing the inefficient use of resources, CloudLightning can facilitate savings to the cloud provider and the cloud consumer through reduced power consumption and improved service delivery, with hyperscale systems particularly in mind.