Serverless technologies allow developers to build and run services and applications in the cloud without having to think about servers. Code runs on demand, scales automatically, and is billed only for its execution time.
Serverless computing offers new opportunities for extreme-scale analytics: it lets users run embarrassingly parallel computations with great simplicity and virtually unlimited scalability, thanks to the automated management of cloud resources. Building on this, our novel programming abstractions and tools will considerably simplify the deployment of data analytics code to the cloud. The main objective of the project is therefore to create CloudButton: a Serverless Data Analytics Platform. To achieve this ambitious objective, the project defines the following goals:
- Create a High Performance Serverless Compute Engine for Big Data. This is the foundational technology for the CloudButton platform, and it must overcome the limitations of existing serverless platforms. In particular, it includes extensions for i) stateful and highly performant execution of serverless tasks, ii) optimized elasticity and operations management of functions through new locality-aware scheduling algorithms, iii) efficient QoS management of the containers that host serverless functions, and iv) a Serverless Execution Framework supporting typical dataflow models.
- Support for Mutable Shared Data in Serverless Computing. To simplify the transition from sequential to (massively) parallel code, we will design a new middleware that allows developers to quickly spawn and share mutable data structures in a serverless computing platform. Our Mutable Shared Data middleware will i) offer an easy-to-use programming framework for adding state to serverless computing, ii) provide dynamic data replication and tunable consistency to match the performance requirements of serverless data analytics, and iii) integrate this framework with an in-memory data grid for performance.
- Design Novel Serverless Cloud Programming Abstractions. The goal is to provide a new programming model for serverless cloud infrastructures that can express a wide range of existing data-intensive applications with minimal changes. The programming model should at the same time i) preserve the benefits of a serverless execution model in terms of resource efficiency, performance, scalability and fault tolerance, and ii) explicitly support stateful functions in applications, while offering guarantees with respect to the consistency and durability of the state.
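
To make the goals above more concrete, the following is a minimal local sketch of the programming style they describe: an embarrassingly parallel map over independent data chunks, combined with a small piece of shared mutable state. It is not the CloudButton API. A thread pool from the Python standard library stands in for the serverless engine, and the names `parallel_map`, `SharedCounter`, `word_count` and `count_into` are hypothetical, chosen only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCounter:
    """Hypothetical stand-in for a mutable shared object.

    A real middleware would replicate this state across machines;
    here a lock protects it inside a single process.
    """

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        # Atomically add to the counter and return the new value.
        with self._lock:
            self._value += amount
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def word_count(chunk):
    """Count whitespace-separated words in one chunk of text."""
    return len(chunk.split())


def parallel_map(func, items, max_workers=8):
    """Apply func to every item independently (assumed local stand-in
    for invoking one serverless function per item)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the inputs.
        return list(pool.map(func, items))


def count_into(counter, chunk):
    """Task body: compute a partial result and update the shared state."""
    n = word_count(chunk)
    counter.add(n)
    return n


chunks = ["a b c", "d e", "f"]
counter = SharedCounter()
counts = parallel_map(lambda chunk: count_into(counter, chunk), chunks)
```

In this sketch, each chunk is processed independently, which is what makes the computation embarrassingly parallel, while the counter illustrates the kind of mutable state that tasks would otherwise have no simple way to share. Replication, tunable consistency and durability are not modelled here.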