CORDIS - EU research results

innovative MachIne leaRning to constrain Aerosol-cloud CLimate Impacts (iMIRACLI)

Periodic Reporting for period 1 - iMIRACLI (innovative MachIne leaRning to constrain Aerosol-cloud CLimate Impacts)

Reporting period: 2020-01-01 to 2021-12-31

Climate change is one of the most urgent problems facing mankind. Yet the uncertainty in the non-greenhouse-gas radiative forcing associated with air pollution and its effect on clouds (aerosol-cloud interactions) limits our understanding of climate sensitivity. Progress has been hampered by the difficulty of disentangling aerosol effects on clouds and climate from their co-variability with confounding factors. Further obstacles are limitations in remote sensing, low signal-to-noise ratios, and computational challenges such as the scale and heterogeneity of datasets. Innovative techniques developed by the AI and machine learning community show huge potential but have not yet found their way into climate sciences, and climate scientists are currently not trained to capitalise on these advances.

The iMIRACLI Innovative Training Network (ITN) builds on the hypothesis that merging machine learning and climate science will provide a breakthrough in the exploration of existing datasets and advance our understanding of aerosol-cloud forcing and climate sensitivity. Its innovative training plan matches each early-stage researcher (ESR) with supervisors from climate and data sciences as well as a non-academic advisor and secondment, and provides state-of-the-art training in data and climate science. Partners from the non-academic sector provide training in a commercial context.

The overall objective of iMIRACLI is to train and shape a new generation of climate data scientists with a solid foundation in climate sciences and competence in the latest machine learning techniques, well prepared for employment in the academic and commercial sectors.
This innovative approach aims to answer our top-level science question: Can we develop and expand machine learning solutions for the analysis of the exploding amounts of climate data, to deliver a breakthrough in climate research, by tracing and quantifying the impact of aerosol perturbations from the microscale to the imprints on large-scale climate?

We address this top-level objective through a combination of questions emerging from climate and data sciences:

SQ1 Process-level detection: What is the imprint of aerosol perturbations on cloud microphysical properties? How can these be detected in observations despite weather noise and the presence of confounding factors?
SQ2 Process-level attribution: To what extent are statistical relationships between cloud-, precipitation-, radiation- and aerosol quantities, as found in remote sensing data, attributable to a causal relationship?
SQ3 Climate change detection and attribution: What has been the contribution of aerosols to observed climate change, such as global and regional temperature and precipitation change patterns?
SQ4 Learning feature representations from heterogeneous data sources: How can we marry modern machine learning tools with atmospheric science to extract meaningful and expressive feature representations in the challenging regime of heterogeneous, multiscale and low signal-to-noise climate data?
SQ5 Physically-constrained spatiotemporal modelling: How can we construct performant and expressive machine learning tools that model spatiotemporal dependencies in aerosol-cloud interactions whilst conforming to the appropriate physical constraints?
SQ6 Causal inference: Can causal associations be inferred based on statistical dependencies? How can we extend the existing causal discovery methods to be applicable to climate data? Can new hypotheses be generated and tested?

The key to creating a new generation of climate data scientists is a successful training programme, which we delivered despite the difficulties associated with COVID-19. We created a dedicated e-learning platform based on Oxford’s Canvas system and provided access to all students and supervisors. All students have completed an individual training needs analysis. Unfortunately, the first in-person summer school, to be held at Oxford, had to be cancelled. It was successfully replaced with an equivalent online programme, which remains accessible to the full consortium, and with an online hackathon jointly organised with the Climate Informatics conference. Likewise, the second summer school in Valencia had to be cancelled but was successfully replaced by another hackathon. To make up for the missing in-person interactions, we are organising an extra event ahead of the 2022 EGU conference in Vienna (at which iMIRACLI has a session). Despite the circumstances, student feedback has been consistently positive.

In terms of our science objective, significant progress is being made across all questions and deliverables:
• ESRs have set up reference datasets from Earth Observations as well as synthetic datasets for the testing of observational strategies. These are being applied to train machine learning models of clouds' responses to varying environmental factors.
• Causal inference techniques are being developed that allow a process-based attribution of observed aerosol-cloud relationships to specific perturbations. These include the use of instrumental variables and the exploitation of opportunistic experiments, such as the tracks of shipping pollution in clouds.
• The causal attribution of key indicators of large-scale climate change to aerosol perturbations is being explored in the context of multi-variable methods.
• New methods to learn feature representations in heterogeneous datasets are being developed, tackling issues of data heterogeneity as well as the task of labelling data for supervised machine-learning techniques.
• Novel methods for spatio-temporal and physics-constrained modelling are being developed with applications for statistical downscaling of low resolution climate data to the high resolutions required for impact studies, and for the study of cirrus cloud properties using explainable machine learning.
• New methods for causal inference and network tools are being developed and applied to causal attribution in time series datasets across time scales, alongside graphical criteria for optimal adjustment in causal models.
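
To illustrate one technique named in the list above, instrumental variables, the following is a minimal sketch on purely synthetic data. It is not the project's code or method. All variable names, coefficients and the data-generating process are assumptions chosen for illustration: an unobserved confounder (standing in for, say, meteorology) drives both an aerosol proxy and a cloud property, while an instrument influences the cloud property only through the aerosol proxy.

```python
import numpy as np


def simulate(n=20000, true_effect=-0.5, seed=0):
    """Generate synthetic data with an unobserved confounder (illustrative only)."""
    rng = np.random.default_rng(seed)
    confounder = rng.normal(size=n)  # hypothetical unobserved driver
    instrument = rng.normal(size=n)  # affects the aerosol proxy only
    aerosol = instrument + confounder + rng.normal(size=n)
    cloud = true_effect * aerosol + 2.0 * confounder + rng.normal(size=n)
    return instrument, aerosol, cloud


def ols_slope(x, y):
    """Naive regression slope of y on x; biased when a confounder is omitted."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def iv_slope(z, x, y):
    """Single-instrument IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))


if __name__ == "__main__":
    z, x, y = simulate()
    print("naive OLS slope:", ols_slope(x, y))
    print("IV slope:       ", iv_slope(z, x, y))
```

Under these assumed coefficients, the naive slope is pulled away from the simulated effect by the confounder, whereas the instrumental-variable estimate should lie close to it. With real observations, the hard part is justifying that a candidate instrument affects clouds only through aerosol, which this sketch simply assumes.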

iMIRACLI students have already submitted and presented a wide range of results at leading climate and machine learning conferences, and many of the results are now being written up as journal publications or extended conference abstracts.

This is a frontier training and research project aiming to transform climate science through the introduction of the latest AI and machine learning techniques. At the same time, it drives the development of these techniques by confronting them with complex climate science questions and big, heterogeneous data challenges. We expect all our results to expand the state of the art, and the developed methodologies to be widely used in the climate and machine learning communities to increase our understanding of anthropogenic climate change.

There is significant public interest in applications of AI and machine learning for “good”, as evidenced by the United Nations ITU AI for Good event series on AI for Climate Science, which emerged from iMIRACLI discussions (https://aiforgood.itu.int/eventcat/discovery-ai-and-climate-science/). The series is co-hosted by the iMIRACLI PI and course director.

After a successful setup of the project, iMIRACLI is now entering its productive phase and we are looking forward to showcasing further results in the future.