An exascale supercomputer is not “yet another big machine”. For a system that costs hundreds of millions of euros, draws power on the order of tens of megawatts, and has a lifetime of a decade at most, judicious management of its resources is of the utmost importance. Quite frequently, a machine of this size cannot operate at full power 24/7, and energy consumption is now a primary concern in keeping its environmental footprint and operational costs at acceptable levels without neglecting its ultimate purpose: to equip highly critical applications with the computational capacity to solve extremely resource-hungry problems.
On the application developer side, achieving scalable performance and high system throughput has always been a cumbersome task. To make things even more challenging, next-generation HPC applications can no longer be regarded as simple, monolithic blocks. The Big Data and Machine Learning revolution and the emergence of Edge Computing and IoT, together with the scale of modern HPC systems and cloud datacentres, are rapidly changing the way we solve scientific problems, which are now based on the composition of what we call workflows. Existing solutions may render the execution of such workflows on a large-scale supercomputer either impossible or extremely time-consuming, in terms of both development and execution time.
The ultimate goal of REGALE was to pave the way for next-generation HPC applications to run on energy-efficient exascale systems. To accomplish this, we defined an open architecture, built a number of instantiations of this architecture, and incorporated into the resulting system the sophistication needed to equip supercomputing systems with the mechanisms and policies for effective resource utilization and the execution of complex applications. Additionally, we promoted five pilots for key sectors by integrating them with appropriate workflow engines, thereby relieving developers of the burden of managing computational resources and helping them achieve better performance and scalability.