Periodic Reporting for period 2 - ESCAPE (Energy-efficient SCalable Algorithms for weather Prediction at Exascale)
Reporting period: 2017-04-01 to 2018-09-30
The biggest challenge for state-of-the-art NWP arises from the need to simulate complex physical phenomena within tight production schedules. Existing extreme-scale application software of weather and climate services is (i) not very efficient on existing CPU-type processors, reaching only 5% of the peak performance, mostly due to a lack of arithmetic intensity, and (ii) ill-equipped to adapt to rapidly evolving options for new processor hardware, mostly due to a lack of flexibility for mapping specific computational problems onto heterogeneous computing units. This problem is exacerbated by other drivers for hardware development that are not necessarily optimal for weather and climate simulations.
ESCAPE will redress this imbalance through innovation actions that fundamentally reform Earth-system modelling. ESCAPE addresses the ETP4HPC Strategic Research Agenda 'Energy and resiliency' priority topic, developing a holistic understanding of energy-efficiency for extreme-scale applications using heterogeneous architectures, accelerators and special compute units. The three key steps towards much enhanced performance and energy-efficiency for weather and climate modelling are:
• Defining and encapsulating the fundamental algorithmic building blocks ('Weather & Climate Dwarfs') of weather and climate prediction models. This is the prerequisite for any subsequent co-design, optimization, and adaptation efforts.
• Combining ground-breaking frontier research on algorithm development for use in extreme-scale, high-performance computing applications aiming to minimize time- and cost-to-solution.
• Synthesizing the complementary skills of all project partners in all necessary domains, at the interface between applied and computational science.
Further, limited-area prediction models were installed at ECMWF in the first phase. They serve as reference standards for performance evaluations once selected dwarfs are reintegrated in the models. These reference installations also include the use of the Atlas library. This capability allows gauging the impact of running optimized dwarfs on novel hardware in full-sized forecast systems.
In the second period the work focused on the following.
Nine dwarfs were created and used in the project for hardware adaptation, performance optimisation and energy measurements. Multiple resolutions for different processes and multiple resolutions through multigrid preconditioning were investigated, finding significant potential for speedup and reduction in the number of elliptic solver iterations.
Different dwarfs were ported to different architectures (CPU, GPU and MIC) using directive-based approaches based on standards supported by vendors, revealing several key features of the directives essential for providing both portability and performance portability of the ESCAPE dwarfs.
A complete DSL definition and implementation was delivered, capable of representing dynamical cores on unstructured meshes as well as structured grids. The use of the language and the performance obtained for multiple architectures were demonstrated for the MPDATA unstructured dwarf. Accelerator-capable dwarfs were delivered that will become part of future HPC benchmarks for typical weather applications after the conclusion of the project.
Platform-specific optimization of dwarfs was performed on both single- and multi-node CPU and GPU systems. Dwarfs related to the dynamical core as well as column-based physics were selected, with specific focus on the formulation relevant to spectral transformations as used in ECMWF’s IFS code.
Modelling of the achieved performance based on measurements was investigated. The models used key performance drivers such as data flow vs locality and communication patterns and their dependence on precision, accuracy and clock-speed to accurately represent dwarf performance with a simplified parameterization that is based on meaningful predictors.
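As an illustration of what a simplified performance parameterization with meaningful predictors can look like, the sketch below uses the well-known roofline bound, which limits attainable performance by either peak compute or arithmetic intensity times memory bandwidth. This is a stand-in technique chosen for illustration, not the model developed in ESCAPE. The function names and all numbers (peak rate, bandwidth, flops and values per grid point) are hypothetical assumptions.

```python
# Minimal roofline-style sketch. All names and numbers are hypothetical
# and are not taken from the ESCAPE performance models.

def arithmetic_intensity(flops_per_point, values_per_point, bytes_per_value):
    """Floating-point operations per byte moved to or from memory."""
    return flops_per_point / (values_per_point * bytes_per_value)


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: limited by peak compute or by memory traffic."""
    return min(peak_gflops, intensity * bandwidth_gbs)


def fraction_of_peak(intensity, peak_gflops, bandwidth_gbs):
    """Attainable performance as a fraction of the peak rate."""
    return attainable_gflops(intensity, peak_gflops, bandwidth_gbs) / peak_gflops


if __name__ == "__main__":
    peak = 1000.0      # GFLOP/s, assumed processor peak
    bandwidth = 100.0  # GB/s, assumed memory bandwidth

    # Assumed kernel: 8 flops per grid point, 2 values moved per point.
    for label, nbytes in (("double", 8), ("single", 4)):
        ai = arithmetic_intensity(8, 2, nbytes)
        share = fraction_of_peak(ai, peak, bandwidth)
        print(f"{label}: {ai:.2f} flop/byte, {share:.0%} of peak")
```

With these assumed numbers, the kernel is memory-bound in both precisions, and halving the bytes per value doubles the arithmetic intensity and hence the attainable fraction of peak. This is one way in which precision can enter such a model as a predictor.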
For selected dwarfs, the simulator framework DCworms was used to ingest the performance parameterization and to estimate overall computing and energy performance at scale. Based on these simulations, criteria for the distinction between various hardware/software choices were derived, as well as a possible strategy for choosing a specific processor. Four reference Limited-Area Models (LAM) were installed at ECMWF to assess the performance and scalability of full-scale models as well as to determine the representativeness of dwarfs for the full workload. Energy-aware metrics and an energy measurement methodology were proposed that characterize entire models sufficiently well.
Finally, the second dissemination workshop and the final dissemination assembly (as a webinar) were delivered, and the Young Scientists Summer School was held in Copenhagen in the second project phase.
By the end of the project, we expect to have better insight into the computing performance and energy efficiency of those model components that drive overall cost. This insight will be based on porting these components to different processor types and on performing optimizations according to standardized metrics and by employing open-access tools. Performance and energy efficiency can now be encapsulated in parametric models to drive optimization efforts. This expertise will provide guidance for model configurations that provide the best cost-benefit across all applications considered in ESCAPE.
The socio-economic benefit of the ESCAPE developments follows the value that weather and climate prediction contributes to all areas of our society. As ESCAPE is the first project of its kind for weather and climate prediction, we expect significant impact on the ways of working in our community, and in collaboration with computing science.