CORDIS - EU research results

Causality Relations Using Nonlinear Data Assimilation

Periodic Reporting for period 4 - CUNDA (Causality Relations Using Nonlinear Data Assimilation)

Reporting period: 2021-03-01 to 2022-08-31

A major problem in understanding complex nonlinear geophysical systems is to determine which processes drive which other processes, that is, what the causal relations are. Several methods to find causal relations exist, but most do not allow for interactions among drivers when they influence a target, even though such interactions are a crucial part of any nonlinear system. Another issue with existing methods is that uncertainty quantification is ignored, while it is essential for any scientific debate and for forecasting. A special case is causal discovery in numerical models, where existing methods get stuck in evaluating very high-dimensional integrals. To find causal relations in the real world we need the best data sets available, and these come from combining observations with numerical models that encode the known physical laws. This combination is done in a process called data assimilation. However, existing data-assimilation methods for high-dimensional systems are either linear or weakly nonlinear, whereas fully nonlinear data-assimilation methods are needed.

In this proposal we tackle these problems by 1) developing a new causal discovery framework that allows for nonlinear interactions between drivers, 2) robustly embedding causality into a Bayesian framework, moving from testing causality to estimating causality strength and its uncertainty in a systematic way, 3) developing a method to infer causal strength in computer models that is feasible in high-dimensional settings, 4) developing fully nonlinear data assimilation for high-dimensional systems, and 5) as an example, using nonlinear data assimilation on the outstanding problem of interocean exchange around South Africa for a 30-year period, and using the causal discovery methods from both 1) and 3) to understand the drivers of the interocean exchange, including how they interact and including uncertainty estimates.

If successful, the project will provide the research community with exciting new tools to understand complex high-dimensional systems in the geosciences and beyond. Via the specific ocean example we will provide an unprecedented view of the workings of the ocean in the highly complex region around South Africa, which is of crucial importance for our understanding of the role of the ocean in the climate system. Broader societal impacts include substantial improvements, via successful fully nonlinear data assimilation, to weather and climate prediction and to many other disciplines, from traffic control and combustion to modelling the human brain. They also include step changes, via nonlinear causal discovery, in our understanding of the workings of many complex real-world systems, such as those in economics, medicine, and brain research.
The work can be divided into nonlinear data assimilation, causal discovery, and applications of these methodologies to real-world problems. More has been done than can be reported here due to space limitations.

Quick progress was made on the data-assimilation side, where we explored five main approaches: tempered particle filters, equivalent-weight particle filters, synchronization, data assimilation on the Wasserstein space, and particle flow filters. We made strong progress on all these methods, and determined which method is most useful for which kind of problem. We also worked on exciting new methods to estimate errors in model equations, resulting, for example, in the only comprehensive method that can be applied in operational weather prediction, and we worked on a method to increase the efficiency of solvers for large linear systems via so-called randomized preconditioners. More details can be found in the publications, and in papers that will appear or be submitted soon.
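To make the idea behind tempering concrete, the following minimal sketch shows only the weighting step: raising a Gaussian likelihood to a power beta < 1 spreads the importance weights over more particles, which is why a tempered filter can assimilate an accurate observation in several gentle stages. It is an illustration under stated assumptions, not the project's implementation: the one-dimensional setup, the particle grid, the observation values, and the function names `likelihood_weights` and `effective_sample_size` are all hypothetical, and the resampling and move steps of a full tempered particle filter are omitted.

```python
import math

def effective_sample_size(weights):
    """Effective sample size (sum w)^2 / sum w^2 of unnormalised weights."""
    total = sum(weights)
    return total ** 2 / sum(w ** 2 for w in weights)

def likelihood_weights(particles, y_obs, obs_var, beta=1.0):
    """Gaussian likelihood weights raised to the tempering exponent beta."""
    return [math.exp(-beta * (y_obs - x) ** 2 / (2.0 * obs_var))
            for x in particles]

# Hypothetical setup: 50 evenly spaced particles and one accurate observation.
particles = [-3.0 + 6.0 * i / 49 for i in range(50)]
y_obs, obs_var = 2.5, 0.01

# Full likelihood (beta = 1): only a few particles carry noticeable weight.
ess_full = effective_sample_size(
    likelihood_weights(particles, y_obs, obs_var, beta=1.0))

# Tempered likelihood (beta = 0.1): the weights are spread more evenly.
ess_tempered = effective_sample_size(
    likelihood_weights(particles, y_obs, obs_var, beta=0.1))

print(f"ESS with full likelihood:     {ess_full:.1f}")
print(f"ESS with tempered likelihood: {ess_tempered:.1f}")
```

In this toy setting the tempered weights give a larger effective sample size than the full-likelihood weights; a complete scheme would apply a sequence of exponents that sum to one.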

However, the major breakthrough has been the development of a so-called Particle Flow Filter halfway through the project. This is the first fully nonlinear data-assimilation (Bayesian inference) method that can be applied to systems of arbitrary dimension, and this achievement has solved a major problem that has plagued the data-assimilation community, including weather, ocean, and climate forecasters, for decades. We are still working on several papers, applying the method to a global atmospheric model, to high-resolution storm forecasting, and to ocean reanalysis.
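The general idea of moving particles from the prior to the posterior along a flow can be sketched in one dimension. The example below uses a kernel-based flow in the style of Stein variational gradient descent; that this resembles the project's Particle Flow Filter is an assumption, and the Gaussian prior, the single observation, the fixed kernel bandwidth, the step size, and the names `score`, `flow_step`, and `run_flow` are all hypothetical choices made for illustration.

```python
import math

def score(x, y_obs=2.0, prior_var=1.0, obs_var=1.0):
    """Gradient of the log posterior for a prior N(0, prior_var) and a
    direct observation y_obs of x with error variance obs_var."""
    return -x / prior_var + (y_obs - x) / obs_var

def flow_step(particles, step=0.05, bandwidth=0.5):
    """Move every particle one pseudo-time step along the kernel flow."""
    n = len(particles)
    moved = []
    for xi in particles:
        drift = 0.0
        for xj in particles:
            k = math.exp(-(xj - xi) ** 2 / (2.0 * bandwidth ** 2))
            attraction = k * score(xj)                  # pull towards high posterior density
            repulsion = k * (xi - xj) / bandwidth ** 2  # keeps the particles spread out
            drift += attraction + repulsion
        moved.append(xi + step * drift / n)
    return moved

def run_flow(n_particles=20, n_steps=1500):
    """Start from evenly spaced particles and iterate the flow."""
    particles = [-2.0 + 4.0 * i / (n_particles - 1) for i in range(n_particles)]
    for _ in range(n_steps):
        particles = flow_step(particles)
    return particles

def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

if __name__ == "__main__":
    m, v = mean_and_variance(run_flow())
    # Analytic posterior for this toy problem: mean 1.0, variance 0.5.
    print(f"particle mean {m:.3f}, particle variance {v:.3f}")
```

For this linear-Gaussian toy problem the exact posterior has mean 1.0 and variance 0.5, so the particle statistics can be compared against it. With a small ensemble and a fixed bandwidth the spread is only approximate, and high-dimensional applications would need choices not shown here.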

After two years of hard work, sometimes encountering dead ends, we have managed to put causality on a completely new footing by including all possible interactions between the drivers. Interestingly, nonlinear multivariate interactions had so far been ignored in causality estimation, even though most physical, biological, and other systems are strongly governed by this category of interactions. Furthermore, we implemented causality in a Bayesian framework, allowing for proper uncertainty quantification for scientific reasoning.
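Why interactions between drivers matter can be seen in a textbook toy case, sketched below. The target z is the exclusive-or of two binary drivers x and y, with all four driver combinations equally likely. Mutual information is used here as a generic dependence measure; it is an assumption made for illustration and not the project's causal framework, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter
from itertools import product

def mutual_information(pairs):
    """Mutual information in bits between the two components of a list of
    equally likely (a, b) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((marg_a[a] / n) * (marg_b[b] / n)))
        for (a, b), c in joint.items()
    )

# All four equally likely driver states, with the target z = x XOR y.
states = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

mi_x = mutual_information([(x, z) for x, y, z in states])        # driver x alone
mi_y = mutual_information([(y, z) for x, y, z in states])        # driver y alone
mi_xy = mutual_information([((x, y), z) for x, y, z in states])  # both drivers jointly

print(mi_x, mi_y, mi_xy)  # prints 0.0 0.0 1.0
```

Each driver on its own carries no information about the target, while the two together determine it completely, so a method that only examines drivers one at a time would miss this relation entirely.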

In parallel, we worked on causal discovery in numerical models, exploring the model equations. It was hard to find a good postdoc to lead this work, but in the last year this was solved. We have developed a new method for information propagation in complex systems based on model equations, the first of its kind. This has turned out to be a much harder problem than anticipated, partly because the existing literature is less useful than previously thought: it is either misleading or incomplete, or it cannot be generalized to the higher-dimensional systems of interest. We essentially had to start from scratch, but have been successful, and a paper is in preparation.

For the application of these methodologies to the ocean we collected data from satellites, ships, buoys, floats, and moorings, and combined them with the ocean model NEMO, using the PDAF data-assimilation software system. Inflexibility of the ocean model made us change to MOM6, and we are working hard on finishing the 30-year reanalysis and the causal discovery. We performed nonlinear data assimilation on an evolving hurricane (Patricia, 2015) and applied our causal discovery framework to understand the rapid intensification process in these huge storms. We found that ocean surface processes, vertical alignment processes related to strong convective rain systems, and favourable outflow conditions in the upper troposphere combine to make this process happen. We used our newly developed Particle Flow Filter on cloud property retrievals of marine shallow cumulus and are applying the causal framework to better understand cloud microphysics.

Overall, the main results so far can be summarized as follows:
1) The development of the first fully nonlinear data-assimilation method for high-dimensional systems that converges to the true solution without any approximation.
2) The development of the first nonlinear causal discovery framework that includes nonlinear interactions between drivers.
3) The development of the first causal discovery framework for numerical models that explores the model equations and can handle high-dimensional systems.
4) Application of the causal discovery methodology of 2) to a realistic hurricane simulation using the nonlinear data assimilation of 1).

These main results have been published or are about to be published. The nonlinear data-assimilation methodology has been implemented in the community data-assimilation system DART, and in the near-operational data-assimilation system JEDI. This will allow for maximum exposure of the methodology to both the academic and the operational communities.
