Periodic Reporting for period 2 - CloudDBAppliance (European Cloud In-Memory Database Appliance with Predictable Performance for Critical Applications)
Reporting period: 2018-06-01 to 2019-11-30
There was a need for a new cloud database appliance architecture:
- to support Hybrid Transactional/Analytical Processing (HTAP),
- to enable enterprise-critical applications to run on the cloud,
- to ensure mainframe-class quality of service in terms of predictable performance and resilience (high availability).
The partners of the CloudDBAppliance European consortium joined forces to overcome these challenges and cooperated intensively to make this happen, achieving the following results:
- A revised architecture of the components (the LeanXcale vertically scalable in-memory operational database, the ActivePivot fast analytical engine, the Hadoop data lake (with Spark optimized by Atos), the time series algorithms and the data streaming engine) to take advantage of new hardware capabilities (NUMA, GPU, non-volatile RAM) and to ensure predictable performance. The revised architecture also allows integration between the components and the new data management tools to ensure mainframe capabilities (high availability mechanisms and optimized use of hardware resources).
- Optimized implementations of these standalone components on the BullSequana and of the CloudDBAppliance prototype, together with their performance evaluation results.
- The results of implementing the five use cases on the prototype, which validate and evaluate the added value of the CloudDBAppliance prototype.
- Scientific and educational: 8 PhD theses, 1 MSc thesis, 8 training activities, 36 publications, a course and 2 special lectures.
For the technical part, the global architecture and the evaluation plan were specified during the first reporting period. The partners then developed and tested their individual components on the BullSequana S. Some algorithms necessary to the project were designed, and an initial version of all the algorithms was implemented. The second period was dedicated to testing and optimizing the components and algorithms to obtain the targeted QoS (performance and robustness) and to cross-component integration on the final platform available to the partners.
For the end-user validation, the five use cases were defined and designed during the first reporting period to take advantage of the technological components, and the functionalities of the first prototype version of all use cases were developed. During the second reporting period, the use cases were implemented, tested and optimized to fully leverage the functionalities of the CloudDBAppliance prototype and to demonstrate its benefits.
The dissemination and communication activities are highlighted on the project's public website, http://clouddb.eu/, and on social media. Two ADITCA workshops were organized.
The results of the exploitation activities cover:
- the innovation created by the project, reflected by the number of software artefacts and by the many demonstrators,
- the knowledge acquired in implementing the CloudDBAppliance prototype and the use case demonstrators, which is already exploited in educational courses and is opening the way to proposing new solutions to customers,
- the transfer of academic results to the community (the INRIA algorithms are open-sourced) or to industry (an agreement between UPM and LeanXcale for exploiting the results),
- the enhanced products: the LeanXcale database and ActivePivot are already available on the market.
1) The delivery of the appliance prototype featuring:
• An in-memory operational database blended with a fast analytical engine delivering real-time data, able to answer continuous analytical queries over operational data.
• An in-memory data streaming engine integrated with the operational database, which provides the framework for parallel analytics algorithms on operational data streams and is complemented by a time series analytics library.
• An operational data lake that enables the use of Spark and Hadoop analytics frameworks on operational data.
• A hardware appliance leveraging the BullSequana S server, ranging from 2 to 32 processors (up to 896 cores and 1792 hardware threads), up to 32 GPUs and 48 TB RAM, and 64 TB of non-volatile RAM.
• A management system to deploy and monitor the prototype, based on the Atos Codex AI suite, which provides for the deployment of Platform as a Service (PaaS) and/or Software as a Service (SaaS) cloud on various Infrastructure as a Service clouds, or on HPC clusters.
The prototype provides Online Transaction Processing (OLTP), Online Analytical Processing (OLAP) and on-the-fly real-time analytics on data streams within a single platform.
Moreover, the prototype exhibits the following non-functional features:
• Horizontal scalability, inherited from the built-in features of the components.
• Vertical scalability, thanks to the NUMA awareness that the project implemented in the components.
• High availability through replication.
• Smart placement through efficient resource allocation algorithms.
• Dynamic migration across appliances.
2) The components of the optimized software ecosystem, and their performance:
- the scalable in-memory operational database leverages a new data engine, a new query engine and a new transactional engine. It is now able to scale beyond 100 cores, to make efficient use of tens of TB of memory and to deliver over 100,000 TPM in the TPC-C industrial OLTP benchmark.
- the scalable ActivePivot in-memory analytics engine brings an innovative architecture that benefits from a custom memory allocator, compressed data structures, and NUMA awareness and partitioning. It is already in production at several customers, where it unlocks capabilities that were not previously possible.
- the extension that makes Spark NUMA aware shows improvements in run duration and CPU consumption. The bigger the input dataset, the higher the gain. The experiment with GPU acceleration of PySpark (the Python API of Spark) workflows, using the Nvidia RAPIDS library (Rapids/AI), demonstrated the feasibility of this integration and showed execution three times faster.
- the scalable data streaming engine brings innovative optimizations in the implementation of the tuple structure, the window processing and the streaming operators. It is highly available and supports self-reconfiguration. It attained the targeted scalability.
- the parallel incremental sketching approach of the ParCorr algorithm proves to be tens of times faster. ParCorr scales linearly, up to 100 million time series.
- the implementation of the ParCorr algorithm on top of the data streaming engine, and its extension with machine learning capabilities.
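To make the idea of incremental sketching more concrete, the following minimal Python sketch shows how a random-projection summary of a sliding window can be updated without reprocessing the whole window. It is an illustration under stated assumptions, not the ParCorr implementation: the class `IncrementalSketch`, the helper `sketch_from_scratch`, the split of the sliding window into fixed-size "basic windows", and the reuse of the same random +/-1 vectors for every basic window are all hypothetical simplifications chosen for this example.

```python
import random
from collections import deque


class IncrementalSketch:
    """Sliding-window random-projection sketch, updated one basic window at a time."""

    def __init__(self, basic_len, n_basic, dims, seed=0):
        rng = random.Random(seed)
        self.basic_len = basic_len   # number of values in one basic window
        self.n_basic = n_basic       # number of basic windows in the sliding window
        # Assumption: the same +/-1 vectors are reused for every basic window,
        # which keeps the update rule simple for this illustration.
        self.vectors = [
            [rng.choice((-1.0, 1.0)) for _ in range(basic_len)]
            for _ in range(dims)
        ]
        self.partials = deque()      # projections of the basic windows in scope
        self.sketch = [0.0] * dims   # sum of those partial projections

    def project(self, block):
        """Project one basic window onto every random vector."""
        return [sum(r * x for r, x in zip(vec, block)) for vec in self.vectors]

    def push(self, block):
        """Slide the window forward by one basic window and return the sketch."""
        if len(block) != self.basic_len:
            raise ValueError("block must contain exactly basic_len values")
        if len(self.partials) == self.n_basic:
            # Remove the contribution of the basic window that leaves the scope.
            expired = self.partials.popleft()
            self.sketch = [s - e for s, e in zip(self.sketch, expired)]
        fresh = self.project(block)
        self.partials.append(fresh)
        self.sketch = [s + f for s, f in zip(self.sketch, fresh)]
        return list(self.sketch)


def sketch_from_scratch(sketcher, blocks):
    """Reference computation: project every block in the window and sum."""
    total = [0.0] * len(sketcher.vectors)
    for block in blocks:
        total = [t + p for t, p in zip(total, sketcher.project(block))]
    return total


if __name__ == "__main__":
    blocks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    sketcher = IncrementalSketch(basic_len=4, n_basic=2, dims=3, seed=42)
    for block in blocks:
        current = sketcher.push(block)
    # The incremental result matches a full recomputation over the last window.
    print(current == sketch_from_scratch(sketcher, blocks[-2:]))
```

Each call to `push` costs time proportional to one basic window rather than to the full sliding window, and because every series keeps its own sketch, the updates can run independently in parallel. In a sketch-based correlation search, series whose sketches are close would typically be treated as candidate pairs and then verified on the raw data.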
3) These technological advances make it possible to enhance our selected use cases beyond the state of the art (SOTA). For example, for mobile phone number migration, CloudDBAppliance can provide a technological solution for a national central database for cell number portability, capable of serving a large country with up to 100 million cell-phone users.