Energy-efficient SCalable Algorithms for weather and climate Prediction at Exascale

Periodic Reporting for period 2 - ESCAPE-2 (Energy-efficient SCalable Algorithms for weather and climate Prediction at Exascale)

Reporting period: 2020-04-01 to 2021-09-30

ESCAPE-2 has paved the way for the European weather and climate prediction community towards exascale performance on future HPC architectures. It did so by developing bespoke and novel mathematical and algorithmic concepts, combining them with proven methods, and thereby reassessing the mathematical foundations of Earth-system models. ESCAPE-2 has also invested in significantly more productive programming models for the weather and climate community, through which novel algorithm development will be accelerated and future-proofed. As one of its main outcomes, the project provided exascale-ready production benchmarks to be operated on EuroHPC infrastructures. In addition, ESCAPE-2 has combined cross-disciplinary uncertainty quantification tools for high-performance computing (here URANIE, originating from the energy sector) with ensemble-based weather and climate models to quantify the effect of model- and data-related uncertainties on forecasting, a capability that weather and climate prediction has pioneered since the 1960s.
Amongst others, WP1 has:
• developed an alternative dynamical core for the Integrated Forecasting System (IFS) based on a discontinuous Galerkin discretization (IFS-DG). More precisely, this action implemented an MPI-parallel dynamical core prototype based on a semi-implicit (SI), semi-Lagrangian (SL), discontinuous Galerkin (DG) discretization of the three-dimensional rotating Euler equations in both spherical and Cartesian geometry; the code includes BiCGSTAB, GCR and GMRES solvers for the SI time-stepping part, which were assessed for convergence on canonical validation benchmarks and equipped with complementary fault-tolerant procedures;
• identified fault-tolerant approaches for numerical weather prediction and developed fault-tolerant linear solvers specifically tailored to the efficient (time-critical) forward-in-time solutions employed by numerical weather prediction models;
• developed multigrid preconditioners for another alternative dynamical core based on finite volumes (IFS-FVM) and experimented with other multigrid and multilevel numerical approaches in the framework of the ESCAPE-2 finite-volume dwarfs;
• developed a radiation scheme based on the integration of machine learning approaches into the existing and popular RTE+RRTMGP radiation package;
• provided a range of dwarfs and models based on different numerical approaches, implementing either specific algorithms or full dynamical cores for atmospheric and ocean dynamics, assembled in a single testing suite for the purpose of benchmarking exascale architectures.
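The Krylov solvers named in the first item above can be illustrated with a minimal sketch of GCR (generalized conjugate residual). This is not the IFS-DG implementation: the 1D periodic Helmholtz-type operator is only an assumed stand-in for the linear system that arises in a semi-implicit time step, and the names helmholtz_apply, gcr and gamma are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def helmholtz_apply(x, gamma):
    """Apply (I - gamma * Laplacian) on a periodic 1D grid (assumed stand-in operator)."""
    n = len(x)
    return [(1.0 + 2.0 * gamma) * x[i] - gamma * (x[i - 1] + x[(i + 1) % n])
            for i in range(n)]


def gcr(apply_a, b, tol=1e-8, max_iter=200):
    """Solve A x = b with full (non-restarted) GCR; returns (x, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    directions = []                  # stored pairs (p, A p), with ||A p|| = 1
    for iteration in range(1, max_iter + 1):
        p, ap = list(r), apply_a(r)
        # Orthogonalise A p against all previously stored A p_j.
        for pj, apj in directions:
            beta = dot(ap, apj)
            p = [a - beta * c for a, c in zip(p, pj)]
            ap = [a - beta * c for a, c in zip(ap, apj)]
        scale = math.sqrt(dot(ap, ap))
        if scale == 0.0:             # breakdown: no new search direction
            return x, iteration
        p = [a / scale for a in p]
        ap = [a / scale for a in ap]
        alpha = dot(r, ap)           # step that minimises the residual along A p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        directions.append((p, ap))
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
    return x, max_iter


if __name__ == "__main__":
    n, gamma = 64, 4.0
    rhs = [math.sin(2.0 * math.pi * i / n) + 0.1 * (i % 5) for i in range(n)]
    solution, iterations = gcr(lambda v: helmholtz_apply(v, gamma), rhs)
    residual = [bi - ai for bi, ai in zip(rhs, helmholtz_apply(solution, gamma))]
    print(iterations, math.sqrt(dot(residual, residual)))
```

Full GCR stores every search direction, so memory grows with the iteration count; production solvers typically restart or truncate, which this sketch omits for brevity.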
WP2 has:
• developed an entire DSL toolchain for weather and climate models;
• demonstrated the DSL on a broad range of dwarfs; owing to the careful design of the toolchain, all computations of the selected dwarfs were supported by the language and could be integrated into the dwarfs;
• developed the intermediate representation (HIR) that defines an interface between the frontend and the backend, so that the two could be developed independently over the course of the project; this also allows additional frontends to be developed in future, provided they generate a valid HIR specification that can be digested by our DSL compiler toolchain (Dawn);
• adopted a modular design with open interfaces that will allow future reuse of the tools for performance portability across different models and communities, in part thanks to the standard specification of the IRs.
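The role of an intermediate representation as the contract between frontend and backend can be sketched in a few lines. The dictionary below is a toy, hypothetical IR invented for illustration; it is not the HIR format and not Dawn's API. The point is only that two independently written backends (backend_loops and backend_shifted, both hypothetical names) consume the same description and agree on the result.

```python
# Toy "IR": a 1D stencil described purely as (offset, weight) taps.
# This format is an assumption made for illustration, not the HIR specification.
LAPLACIAN_IR = {"name": "laplacian_1d", "taps": [(-1, 1.0), (0, -2.0), (1, 1.0)]}


def backend_loops(ir, field):
    """Backend 1: point-by-point loops over a periodic field."""
    n = len(field)
    out = [0.0] * n
    for i in range(n):
        for offset, weight in ir["taps"]:
            out[i] += weight * field[(i + offset) % n]
    return out


def backend_shifted(ir, field):
    """Backend 2: accumulate whole shifted copies of the periodic field."""
    n = len(field)
    out = [0.0] * n
    for offset, weight in ir["taps"]:
        k = offset % n
        shifted = field[k:] + field[:k]   # shifted[i] == field[(i + offset) % n]
        out = [o + weight * s for o, s in zip(out, shifted)]
    return out


if __name__ == "__main__":
    field = [float(i * i) for i in range(8)]
    print(backend_loops(LAPLACIAN_IR, field))
    print(backend_shifted(LAPLACIAN_IR, field))
```

Because the stencil is described once, a new backend can be added, or an existing one tuned for a particular architecture, without touching the description of the computation.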
WP3 has:
• produced version v1.0 of the HPCW benchmark, released on 20 September 2021, and deployed it on different HPC systems at BULL, DKRZ, BSC and ECMWF;
• developed and successfully implemented a verification mechanism for two example setups from the HPCW benchmarks;
• extended the Kronos I/O workload simulator to execute relevant weather and climate dwarfs in its workload and to report and analyze the runtime statistics that the dwarfs generate.
WP4 has:
• defined a common Verification, Validation and Uncertainty Quantification (VVUQ) framework;
• performed UQ and numerical precision analyses on a weather/climate shallow water toy model and a radiation dwarf;
• performed UQ analysis on a comprehensive forecasting system;
• integrated a new workflow management of URANIE for HPC environments;
• defined a new calibration module for URANIE.
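The idea behind sampling-based uncertainty quantification can be sketched as follows. This does not use URANIE and is far simpler than the shallow water toy model mentioned above: as an assumed stand-in, it propagates an uncertain fluid depth through the shallow-water gravity wave speed sqrt(g*H) and reports the ensemble mean and spread of a travel time. The function names and all parameter values are hypothetical.

```python
import math
import random
import statistics


def travel_time(depth_m, distance_m=1.0e6, gravity=9.81):
    """Time for a shallow-water gravity wave (speed sqrt(g*H)) to cover distance_m."""
    return distance_m / math.sqrt(gravity * depth_m)


def propagate_uncertainty(mean_depth, depth_std, members=2000, seed=0):
    """Monte Carlo ensemble: sample the uncertain depth, return (mean, spread) of the output."""
    rng = random.Random(seed)        # fixed seed so the sketch is reproducible
    times = [travel_time(rng.gauss(mean_depth, depth_std)) for _ in range(members)]
    return statistics.mean(times), statistics.stdev(times)


if __name__ == "__main__":
    mean_time, spread = propagate_uncertainty(mean_depth=1000.0, depth_std=50.0)
    print(f"ensemble mean: {mean_time:.1f} s, ensemble spread: {spread:.1f} s")
```

For an expensive forecasting system each ensemble member is a full model run, which is why workflow management on HPC systems matters for this kind of analysis.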
WP5 performed:
• planning and organisation of two dissemination workshops;
• planning and organisation of the ESCAPE-2 Summer school;
• updates to the ESCAPE-2 Website;
• publications and presentations at conferences and workshops.
WP6 has:
• successfully concluded the project with all deliverables submitted.
To achieve progress in weather and climate prediction, it is crucial to overcome the performance wall for Earth-system model simulations at kilometre-scale spatial resolutions. This is not simply a matter of increasing computing power as measured by theoretical floating-point operation rates. To address Europe's grand societal challenges such as climate change adaptation, especially under continued budgetary constraints in many European countries, it is imperative to identify, apply and implement flexible software engineering design principles for current and future weather and climate models.
Our vision is to implement the so-called separation of concerns, where domain science and multi-disciplinary abstractions are separated by a formal interface. This interface facilitates rapid and simultaneous development of the key weather and climate model algorithms together with hardware adaptation, avoiding conflicting choices made in one or the other. This approach is orthogonal to the existing manual and inflexible code adaptation paradigm, and therefore moves far beyond the state of the art.
However, the separation of concerns also needs to build on a flexible framework so that scientific accuracy and numerical stability can be traded off against computational performance through viable mathematical or algorithmic choices. ESCAPE-2 achieves this by combining world-leading mathematical and algorithmic expertise in efficient forward-in-time computing to enhance algorithmic robustness and resilience to failure at scale, while minimizing time- and energy-to-solution for highly scalable, high-order algorithms. This opens novel pathways for the efficient use of future HPC architectures and provides a foundation for subsequent advances in scalability and, potentially, time-parallelism. Ultimately, this transformational research will be tested in a new class of weather and climate prediction community benchmarks, facilitating co-design and promoting a standardized, objective and widely accepted HPC hardware evaluation for this application domain.
ESCAPE-2 also combined research on an open-source, trans-disciplinary and (exa-)scalable VVUQ package, typically used outside the weather and climate community, with Earth-system modeling applications and the associated uncertainty estimation expertise. This transfer aims to investigate the suitability of a simple package for higher-order and complex problems for which uncertainty quantification can presently only be done through very compute- and data-intensive simulations.