The growing need to process and access extremely large volumes of heterogeneous data, the rise of data-intensive applications, and the steep growth of data sets call into question the traditional compute-centric view of HPC. The flat storage hierarchies found in classic HPC architectures, uncoordinated file accesses, and limited bandwidth make the centralized back-end parallel file system a serious bottleneck in traditional systems. At the same time, the underlying storage technology is undergoing a disruptive change: emerging multi-tier storage hierarchies based on fast non-volatile memory can significantly lower the pressure on the back-end file system. Maximizing performance, however, still requires careful control to avoid congestion and to balance compute and storage performance. Unfortunately, appropriate interfaces and policies for managing such an enhanced I/O stack are still lacking.
The main objective of the ADMIRE project is the creation of an active I/O stack that dynamically adjusts computation and storage requirements through intelligent global coordination, elasticity of computation and I/O, and the scheduling of storage resources along all levels of the storage hierarchy, while offering quality-of-service (QoS), energy efficiency, and resilience for accessing extremely large data sets in very heterogeneous computing and storage environments.
The specific scientific-technical objectives of ADMIRE are:
Objective 1: Enable the efficient use of new storage tiers by subjecting storage to HPC scheduling decisions and establishing a distributed control that, based on global monitoring, can dynamically adapt storage allocations to changing application demands.
Objective 2: Increase application throughput of HPC systems by leveraging the performance advantage of fast, node-local storage tiers through novel European ad-hoc storage systems and in-transit/in-situ processing facilities.
Objective 3: Balance computation and data transfers by providing elastic mechanisms to dynamically adjust the ratio between the allocations of compute and storage resources.
Objective 4: Reduce I/O interference via globally coordinated minimization of data transfers between storage tiers, while conveying and enforcing end-to-end QoS needs.
Objective 5: Provide tools to co-design applications and storage systems with the goal of minimizing data movement, targeting different HPC architectures.
Objective 6: Increase power efficiency in data management operations by reducing data movement and adopting low-power storage and CPU technologies.
An integrated and operational prototype will be validated and demonstrated with several real-world data-intensive applications from various domains, including climate/weather, life sciences, physics, remote sensing, and deep learning. The consortium comprises leading European companies, research organizations, and universities, bringing together several PRACE members and Centres of Excellence for HPC applications.