Periodic Reporting for period 4 - Hi-EST (Holistic Integration of Emerging Supercomputing Technologies)
Reporting period: 2019-11-01 to 2020-04-30
In summary, Hi-EST addressed the following objectives:
1. Advance research frontiers in Adaptive Learning Algorithms by proposing Deep Learning techniques for guiding task and data placement decisions, and the first known Adaptive Learning Architecture for exa-scale Supercomputers.
2. Advance research frontiers in Task Placement and Scheduling by proposing novel topology-aware workload placement strategies, and by extending unifying performance models for heterogeneous workloads to cover an unprecedented number of workload types.
3. Advance research frontiers in Data Placement strategies by studying data stores on top of heterogeneous sets of key/value stores connected to Active Storage technologies, as well as by proposing the first known uniform API to access hierarchical data stores based on key/value stores.
4. Advance research frontiers in Software Defined Environments by developing policies for the upcoming disaggregated data centres, and by creating placement algorithms that combine data and task placement into one single decision-making process.
During the project, SMUFIN, a somatic mutation finder developed at the Barcelona Supercomputing Center (BSC), was optimized to take advantage of the project's advances. The work performed illustrated how the data-intensive nature of processing human genome data remains both a computational and a memory challenge. However, we described techniques and mechanisms to overcome the memory challenge and to alleviate the computational one. In particular, we demonstrated how accelerators can be used to shuffle data to minimize inter-thread communication, and how they can cooperate with the CPU to build large Bloom filters. Results showed that BSC is now able to process the genomes of approximately 250 patients per MWh of energy consumed, compared with approximately 18 patients per MWh for the previous generation of the pipeline.
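To make the Bloom-filter component concrete, the following is a minimal, illustrative sketch of the data structure in Python. It is not the SMUFIN or GPU implementation described above; the `BloomFilter` class, its sizing, and its hash scheme are assumptions chosen only to show how membership queries over large genomic k-mer sets can be answered in a fixed memory budget with a small false-positive rate.

```python
import hashlib

class BloomFilter:
    """Illustrative Bloom filter: a fixed-size bit array with k hash functions.
    Membership tests may return false positives, never false negatives."""

    def __init__(self, size_bits: int, num_hashes: int):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray(size_bits // 8 + 1)

    def _indexes(self, item: str):
        # Derive k independent indexes by salting a cryptographic hash.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item: str) -> bool:
        return all((self.bits[idx // 8] >> (idx % 8)) & 1
                   for idx in self._indexes(item))

# Example: record a k-mer and query membership.
bf = BloomFilter(size_bits=1 << 20, num_hashes=4)
bf.add("ACGTACGT")
print("ACGTACGT" in bf)  # True
```

In the project's setting, the appeal of the structure is that its memory footprint is fixed in advance regardless of how many items are inserted, which is what makes CPU/accelerator cooperation on very large filters practical.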
Over the course of the project, 22 research papers were accepted for publication (10 journal and 12 conference papers), contributing results across the four research pillars. Additionally, three patents were filed, and a spinoff (Nearby Computing) was created out of the results of the project.
For this reason, there is an urgent need for significant advances in the methods, mechanisms, and algorithms for the integrated management of heterogeneous supercomputing workloads. This is a huge challenge, since managing even a homogeneous set of distributed workloads on homogeneous infrastructures is already an NP-hard problem. The level of dynamism expected of future-generation supercomputing significantly raises the complexity of the problem. Addressing this grand challenge is the ultimate goal of the Hi-EST project.
In particular, Hi-EST plans to advance research frontiers in four different areas:
1. Adaptive Learning Algorithms: by proposing a novel use of Deep Learning techniques for guiding task and data placement decisions;
2. Task Placement: by proposing novel algorithms to map heterogeneous sets of tasks on top of systems enabled with Active Storage capabilities, and by extending unifying performance models for heterogeneous workloads to cover an unprecedented number of workload types;
3. Data Placement: by proposing novel algorithms to map data on top of heterogeneous sets of key/value stores connected to Active Storage technologies; and
4. Software Defined Environments (SDE): by extending SDE description languages with a vocabulary, not yet in existence, for describing Supercomputing workloads, which will be leveraged to combine data and task placement into one single decision-making process.