Periodic Reporting for period 1 - AI4PEX (Artificial Intelligence and Machine Learning for Enhanced Representation of Processes and Extremes in Earth System Models)
Reporting period: 2024-04-01 to 2025-09-30
Beta versions of ML-enhanced observational data streams strengthen process understanding and model constraints. A database of low-level cloud patterns, which robustly distinguishes stratocumulus-to-cumulus regimes, provides cloud classes for later model evaluation and parameterisation. An ML approach was extended to reconstruct 4D fields of temperature and dissolved inorganic carbon from EO and in situ data, delivering beta products for ocean heat and carbon transport. AI4PEX also delivered beta products for global information-content and persistence metrics, advances toward empirically constrained global energy flux products, a neural-network river-discharge and routing prototype, and groundwork for a dataset of biomass dynamics.
Research focused on ML-enhanced process representations in atmosphere, ocean and land models. For the atmosphere, the ICON-A-MLe configuration coupled an equation-discovery cloud-cover scheme with automatic tuning, producing 20-year AMIP runs that reduced cloud and top-of-atmosphere radiation biases, while a scale-aware NN cloud scheme was implemented in HadGEM3-GC5.0/UKESM alongside emulators. For the ocean, mesoscale eddy closures in NEMO were benchmarked and an emulator for surface chlorophyll and carbon fluxes was developed. For the land, hybrid JSBACH parameterisations of stomatal conductance and photosynthesis reduced carbon- and water-flux biases.
AI4PEX focused on developing uncertainty quantification (UQ) for ML and hybrid modelling. Methodological advances included enhanced UQ in variational data assimilation (DA) with stochastic priors, neural state estimation under irregular sampling, and differentiable DA frameworks. Activities developed and tested ML emulators and stochastic parameterisations, established a benchmarking framework for hybrid ESMs, and investigated the transfer of NN emulators from offline to online coupling. In the ocean and land, emulation and variational inference approaches were extended to biogeochemistry and hybrid soil-carbon models.
AI4PEX advanced ML-based evaluation techniques, including dynamic mode decomposition and physics-aware Koopman operators. ML-enhanced representations of the atmosphere have been partially analysed in 25-year simulations with the HadGEM3-GC5.0 model, which show global consistency but regional biases. Substantial improvements were made to the ML-enhanced ICON-A-MLe model, which shows reduced biases in cloud cover and radiative fluxes.
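As an illustration of the dynamic mode decomposition mentioned above, the following is a minimal sketch of the standard "exact DMD" algorithm in NumPy. It is not the project's implementation; the function name `dmd`, the `rank` parameter and the synthetic test field are assumptions made for this example only.

```python
import numpy as np

def dmd(snapshots, rank):
    """Exact DMD of a (n_space, n_time) snapshot matrix.

    Returns the eigenvalues and spatial modes of the best-fit linear
    operator that advances one snapshot to the next.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]      # time-shifted pairs
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]     # truncate to `rank`
    V = Vh.conj().T
    # Reduced operator: A_tilde = U^H Y V S^{-1}
    A_tilde = U.conj().T @ Y @ V / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes: Y V S^{-1} W
    modes = (Y @ V / s) @ W
    return eigvals, modes

if __name__ == "__main__":
    # Hypothetical test field: a damped rotation embedded in 5 "grid points".
    theta = 0.3
    A = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
    P = np.random.default_rng(0).normal(size=(5, 2))
    x, cols = np.array([1.0, 0.0]), []
    for _ in range(40):
        cols.append(P @ x)
        x = A @ x
    eigvals, modes = dmd(np.column_stack(cols), rank=2)
    print("eigenvalue magnitudes:", np.abs(eigvals))
```

In an evaluation setting, the eigenvalue magnitudes indicate growth or decay of each mode and their phases give oscillation frequencies, which can then be compared between model output and observations.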
AI4PEX made progress in detecting and attributing changes in climate extremes to their dynamical and thermodynamical drivers. A causal representation-learning framework was developed for dynamical adjustment and factorial analysis of temperature and precipitation trends, enabling robust separation of circulation-driven and thermodynamic components. Research examined dynamical contributions to summer warming using multiple decomposition approaches, and refined probabilistic extreme event attribution by incorporating regional aerosol covariates. Applying automated differentiable architecture search to discover high-performing convolutional NNs for precipitation downscaling of extreme events showed that ML methods improve the downscaled representation of a global model compared with simple interpolation.
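The idea of separating circulation-driven and thermodynamic components can be sketched with a much simpler stand-in than the causal representation-learning framework described above: a ridge regression of a temperature series on circulation predictors, with the residual taken as the non-circulation part. The function name `dynamical_adjustment`, the `alpha` parameter and the synthetic data are assumptions for illustration and do not reproduce the project's method.

```python
import numpy as np

def dynamical_adjustment(temp, circ, alpha=1.0):
    """Split a temperature series into circulation-driven and residual parts.

    temp : (n_time,) temperature series
    circ : (n_time, n_feat) circulation predictors (e.g. pressure-pattern indices)
    alpha: ridge penalty (assumed value, to be tuned in practice)
    """
    t_anom = temp - temp.mean()
    c_anom = circ - circ.mean(axis=0)
    # Ridge solution: (C^T C + alpha I)^{-1} C^T t
    gram = c_anom.T @ c_anom + alpha * np.eye(circ.shape[1])
    beta = np.linalg.solve(gram, c_anom.T @ t_anom)
    dynamic = c_anom @ beta          # circulation-driven component
    residual = t_anom - dynamic      # remaining (e.g. thermodynamic) component
    return dynamic, residual

if __name__ == "__main__":
    # Synthetic example: circulation noise plus a slow forced trend.
    rng = np.random.default_rng(1)
    n = 200
    circ = rng.normal(size=(n, 3))
    trend = 0.02 * np.arange(n)
    temp = circ @ np.array([1.0, -0.5, 0.3]) + trend + 0.1 * rng.normal(size=n)
    dynamic, residual = dynamical_adjustment(temp, circ)
    print("corr(residual, trend):", np.corrcoef(residual, trend)[0, 1])
```

In this toy setting the residual retains most of the imposed trend, because the synthetic circulation predictors are nearly uncorrelated with it. Real applications need care where circulation itself responds to forcing.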
- Supporting the transferability of ML components into process-based models and across modelling centres, and minimising tooling incompatibilities for online ML inference within ESM code infrastructures.
- Online learning and adaptive calibration were identified as both promising and technically demanding. AI4PEX laid theoretical and practical groundwork for online learning, yet porting these approaches to domain-level or fully coupled ESMs remains a conceptual and structural challenge.
- Assessing and developing confidence in hybrid frameworks. Causal regularisation and causal representation learning already provide a proof of concept for improving robustness to distribution shifts, attribution schemes and data biases. The next step is to bring these tools into operational ESM development and benchmark frameworks.
- Consolidation of datasets, benchmarks and protocols is a prerequisite for broad impact.