Periodic Reporting for period 1 - MAESTRO (Middleware for memory and data-awareness in workflows)
Reporting period: 2018-09-01 to 2020-02-29
The Maestro project will build a data and memory-aware middleware framework for High Performance Computing (HPC). This will provide a bridge between the operating system and the applications. The goal is to provide better control of the flow of data across the many layers of memory in HPC.
High Performance Computing (HPC) and High Performance Data Analytics (HPDA) open up the opportunity to address a wide variety of questions and challenges. The number and complexity of the challenges that HPC and HPDA can help with are limited by the performance of computer software and hardware. Increasingly, that performance is limited by how fast data can be moved within the memory and storage of the hardware. So far, little work has been done to improve data movement.
How will Maestro help?
Maestro will develop a new framework to improve the performance of data movement in HPC and HPDA. The framework will consider two key components: data and memory.
Data movement awareness: Moving data in computer memory has not always been a performance bottleneck. Until recently, performance was limited by the number of calculations that could be completed. Great improvements have been made in computational performance, but the software for memory has not changed during this time. Maestro will develop a better understanding of the performance barriers of data movement.
Memory awareness: Memory in computer hardware is increasingly complex. Historically, applications have been unaware of the particulars of memory layout. However, as memory becomes more complex, software performance is limited by data movement across the layers of memory. To improve performance, it is now important that software has an 'awareness' of memory and of how to optimise data movement.
Maestro will develop a framework to improve the performance of data movement in applications. It will make complex memory hierarchies easier to use, which will help by:
● improving the performance of software, and therefore reducing the energy consumption and CPU hours used by software;
● encouraging new communities to take up parallel computing and HPC systems by lowering the memory performance barrier.
Maestro has the potential to influence a broad range of human discovery and knowledge, as every computational application relies on data movement.
The main technical achievements of the initial part of the Maestro project are the following:
● Establishment of detailed requirements, justified through relevant HPC use cases, to influence the design decisions for the Maestro middleware
● Specification of the core middleware and its first implementations
● Design of the execution framework architecture as well as development of the access semantics
● Realisation of the first working prototype implementation of the MIO interface
● Specification of demonstrators for the Maestro technology as well as some early prototypes
● Data models: A new approach has been specified and implemented that provides a much higher abstraction level than existing approaches
● Workflow management: A concept for introducing data- and memory-awareness in existing workflow frameworks has been formulated
● Dynamic Provisioning: A new solution for dynamic provisioning of storage has been implemented and demonstrated