IMProved multivariate frequency Analysis of flood extremes by copuLAs in a non-stationary environment

Final Report Summary - IMPALA (IMProved multivariate frequency Analysis of flood extremes by copuLAs in a non-stationary environment)

Project objectives:
=============
One of the biggest challenges in flood frequency analysis that water resources managers currently face is modelling two or more inter-dependent flood variables (floods at river confluences; flood peaks and the respective volumes and durations) while accounting for the non-stationarity of the environment. The IMPALA project offers a multidisciplinary solution to this problem: copula-based multivariate frequency modelling of flood extremes that includes information on historical and regional ungauged extremes and respects the effects of the changing environment, together with further development of methods for spatial data extension and their verification on a Europe-wide scale.
The specific objectives of the IMPALA project are as follows:
- to study the dependence structure between the flood characteristics (flood peaks and the respective volumes and durations; floods at river confluences), which is often non-linear, by means of copulas, and to reduce the uncertainty in the choice and estimation of the fitted copula family, of the multivariate design values and of their return periods, which arises from the lack of available data in the right tail of the joint distribution function of the variables,
- to improve the frequency modelling of the marginal distributions of multivariate flood characteristics by improving the statistical background for the inclusion of historical floods and reconstructed flood extremes from ungauged basins in a regional flood frequency analysis,
- to expand the spatial scope of the analysis, i.e. to evaluate the flood risk related to reconstructed flood events using different pan-European flood datasets available through various European projects such as FRIEND, HYDRATE, COST Action ES0901 (FloodFreq), and others; and
- to investigate the potential effect of non-stationarity on flood frequency estimates, both for the marginal distributions of flood peaks and volumes and for their dependence structure (copula family, copula parameters), and compare the results with the outcomes of the model assuming stationarity.

Work performed and the main results:
===========================
The work related to the concepts of the IMPALA project started even before the start of the project itself. The researcher and his colleagues from the University of Technology in Bratislava have long co-operated with the host institute, the Institute of Hydraulic Engineering and Water Resources Management (IHEWRM) at the Vienna University of Technology (VUT). The first significant results of this co-operation were presented in the paper of Gaál et al. (2012, WRR), where the dependence between flood peaks and volumes was analyzed regionally in terms of flood time scales, defined as the ratio of these variables. The results of this paper underpinned the importance of a process-based analysis in catchment hydrology. Generally, studies that fit univariate or multivariate statistical models to observed data in any method of frequency estimation do not go beyond the tools of statistics: they are usually carried out in an automated way and do not explore the hydrological or meteorological drivers of the flood processes. The overall philosophy of the hydrology group of IHEWRM, led by Prof. Günter Blöschl, is that in any regional flood analysis it is essential to differentiate between the flood driving mechanisms. The IMPALA project was therefore designed to follow this concept.

During the 24 months of the IMPALA project, equal focus was placed on two subtopics: (a) data-based identification of convective rainfall events and analysis of storm properties, and (b) regional and process-based analysis of the dependence between flood peaks and flood volumes. It may seem that the first subtopic departs completely from the general goals of the project, but this is not the case. We dealt with methods for differentiating between convective and stratiform rainfall events for two main reasons. First, rainfall events are among the most important meteorological drivers of flood events. At the same time, the type of rainfall is an indicator of the type of flood process: convective rainstorms are precursors of flash floods, while stratiform rainfall is generally associated with synoptic flood processes. Secondly, there are remarkable analogies between storms and floods. The basic requirement of both kinds of study is the identification of independent events. In both cases this is a complex procedure, and one of the main goals is to minimize subjective decisions in the process of event identification. As soon as events are delineated, one can perform their multivariate analysis, preferably by means of copulas. In the case of floods, the bivariate statistical analysis usually relates to flood peaks and flood volumes (although some studies also consider the duration); for storms there are more characteristics whose interrelationships are of particular interest both from a theoretical point of view and for practitioners, e.g. in engineering design.

In the paper of Gaál et al. (2014, HESS), a novel methodology was proposed for identifying storm events of convective character on the basis of high-resolution climatological data and lightning activity. Identification of intense summer storms was based on the hypothesis that thunderstorms with strong convective lifting are commonly associated with lightning. The presence of lightning activity in the vicinity of the stations was therefore taken as an indicator of storms with convective character, and station-based intensity thresholds were defined that differentiate between events with and without lightning with an acceptably small probability of misclassification. The results of the paper proved useful in the typology of flood processes; nevertheless, since the analysis of rainstorms was restricted to the warm half-year, further research was needed to explore the drivers of the snow-related flood events in the cold half-year.
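The idea of a station-based intensity threshold can be illustrated with a minimal Python sketch. The summary does not state how the thresholds were actually derived, so the rule used here is an assumption made only for illustration: pick the smallest observed intensity such that at most a chosen fraction of the no-lightning events exceeds it. The function name, the parameter `max_misclassification` and the sample values are all hypothetical.

```python
def intensity_threshold(no_lightning_intensities, max_misclassification=0.1):
    """Smallest observed intensity exceeded by at most the given
    fraction of no-lightning events (assumed rule, not the paper's)."""
    n = len(no_lightning_intensities)
    for candidate in sorted(set(no_lightning_intensities)):
        exceedances = sum(1 for x in no_lightning_intensities if x > candidate)
        if exceedances / n <= max_misclassification:
            return candidate
    # Not reached for non-empty input: the maximum has zero exceedances.
    return None


# Invented peak intensities (mm/h) of events without lightning at one station.
no_lightning = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
threshold = intensity_threshold(no_lightning, max_misclassification=0.1)
```

Under this assumed rule, an event at the station whose intensity exceeds `threshold` would be labelled convective, at the cost of mislabelling at most 10% of the no-lightning events in the sample.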

In the second paper (Molnar et al., 2015, HESS), an empirical study was published of the variability in the Clausius-Clapeyron rates of increase in precipitation intensity with air temperature. The novelty of the study is that it is event based; the paper made use of the methodology for identifying convective and stratiform subsets on the basis of the presence of lightning strikes (Gaál et al. 2014, HESS). One of the key questions of the study was how storm type contributes to the rainfall intensity–temperature relation. It was concluded that the scaling rates for all events mixed together were systematically higher than those of the individual lightning and no-lightning subsets because of the mixing of stratiform events at low temperatures and convective events at high temperatures. These findings again emphasize the importance of separating processes. Generally, the results of the paper have serious implications for future hydrological risk: as a consequence of climate change, the mean air temperature is expected to rise, which implies higher scaling rates, and this might be manifested in more severe rainstorms and, consequently, flash floods.
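A scaling rate of this kind can be sketched in Python as the slope of a log-linear fit of intensity against temperature. This is only an illustration under stated assumptions: ordinary least squares on event values is assumed here and may differ from the estimation procedure of the paper, the function name is hypothetical, and the data are synthetic, generated with a growth of 7% per degree Celsius, which is roughly the Clausius-Clapeyron rate.

```python
import math


def scaling_rate(temperatures, intensities):
    """Fractional increase of intensity per degree, from a least-squares
    fit of ln(intensity) on temperature (assumed estimator)."""
    logs = [math.log(p) for p in intensities]
    n = len(temperatures)
    mean_t = sum(temperatures) / n
    mean_l = sum(logs) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(temperatures, logs))
    sxx = sum((t - mean_t) ** 2 for t in temperatures)
    slope = sxy / sxx
    # exp(slope) is the multiplicative change per degree of warming.
    return math.exp(slope) - 1.0


# Synthetic events: intensity grows by 7% per degree Celsius.
temps = [5.0, 10.0, 15.0, 20.0, 25.0]
peak_intensities = [2.0 * 1.07 ** t for t in temps]
rate = scaling_rate(temps, peak_intensities)
```

Applying the same function separately to the lightning subset, the no-lightning subset and the pooled sample is one way to reproduce the kind of comparison described above.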

The dependence between flood peaks and corresponding flood volumes in Austria was analysed in a series of three peer-reviewed publications. In the first one (Gaál et al., 2014, HSJ), the flood peak-volume relationship was examined in terms of the Spearman rank correlation coefficient. The paper aimed at finding the climatological and/or hydrological attributes that control this bivariate relationship in a regional context. The results indicate that climate-related factors are more important than catchment-related ones in controlling the consistency between the two variables. Spearman’s rank correlation coefficients typically range from about 0.2 in the high alpine catchments to about 0.8 in the lowlands. The weak dependence in the high alpine catchments is due to the mix of flood types, including long-duration snowmelt floods, synoptic floods and flash floods. In the lowlands, the flood durations vary less in a given catchment, which is related to the filtering of the distribution of all storms by the catchment response time to produce the distribution of flood-producing storms. In this paper, only a regional analysis of the rank correlation between flood peaks and volumes was performed; therefore, a further study was carried out to find the most suitable statistical models to fit this bivariate relationship.
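For readers unfamiliar with the measure, the following Python sketch computes the Spearman rank correlation as the Pearson correlation of the ranks, with tied values given their average rank. The function names and the peak and volume values are invented for illustration and are not data from the study.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented flood peaks (m3/s) and volumes (10^6 m3) for one catchment.
peaks = [120.0, 85.0, 200.0, 150.0, 95.0]
volumes = [30.0, 22.0, 41.0, 60.0, 20.0]
rho = spearman_rho(peaks, volumes)
```

Because only the ranks enter the calculation, the coefficient is unaffected by any monotone transformation of peaks or volumes, which is also why it pairs naturally with a copula-based analysis.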

In the paper of Szolgay et al. (2015, PIAHS), the bivariate relationship of flood peaks and volumes was examined in terms of copulas, with a particular focus on the type and seasonality of flood generation processes. The scientific goal was to assess whether the suitability of the selected bivariate statistical models (copula families) depends on the type of the floods (rainfall-fed floods that are dominant in summer vs. snowmelt-fed floods that occur in the winter season), and the size of the data samples. It was concluded that uncertainty in the choice of any bivariate statistical model can be reduced by a deeper hydrological analysis of the dependence structure between flood peaks and volumes, particularly by considering the model’s suitability and the flood generation mechanisms in the target region.

In the last paper (in preparation; to be submitted before June 2015), the core idea of the second paper was examined in more detail. More precisely, answers to the following scientific questions were sought: Do different flood types have statistically different bivariate flood peak-volume distributions? How similar are, for a given flood process, the empirical copulas between pairs of sites? Does the copula fitting procedure lead to a different set of acceptable parametric copulas for different flood types? It is concluded that when empirical copulas for different flood processes are compared locally, flash floods are indeed distinguishable from both the synoptic and the snowmelt floods. It is argued that this is a consequence of a larger coefficient of tail dependence in the case of flash floods, which means a higher likelihood that high flood peaks are associated with high flood volumes. The copula-fitting procedure also produced different findings for the individual flood type subsets. While the extreme value copulas were the most acceptable models for fitting pairs of peaks and volumes for the synoptic and flash floods, the Frank copula showed the best performance for the snowmelt floods. These results support the basic concept of the IMPALA project, i.e. a process-based approach to statistical analysis.
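The two quantities mentioned above can be sketched in Python as follows. The empirical copula is the standard rank-based estimate; the second function is a simple finite-sample proxy for upper tail dependence, namely the share of events with a high peak that also have a high volume. The coefficient of tail dependence itself is a limit as the level `q` approaches 1, and the estimator used in the paper is not stated in this summary, so this proxy is an assumption. The function names and data are invented, and ties are assumed to be absent.

```python
def pseudo_observations(values):
    """Ranks rescaled to (0, 1) by n + 1; assumes no tied values."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def empirical_copula(x, y, u, v):
    """Share of events whose pseudo-observations are both below (u, v)."""
    pu, pv = pseudo_observations(x), pseudo_observations(y)
    return sum(1 for a, b in zip(pu, pv) if a <= u and b <= v) / len(x)


def upper_tail_concentration(x, y, q):
    """Share of events with x above level q that also have y above q
    (rough finite-sample proxy for upper tail dependence)."""
    pu, pv = pseudo_observations(x), pseudo_observations(y)
    high_x = [(a, b) for a, b in zip(pu, pv) if a > q]
    if not high_x:
        return float("nan")
    return sum(1 for _, b in high_x if b > q) / len(high_x)


# Invented flood peaks and volumes for one flood type at one site.
peaks = [120.0, 85.0, 200.0, 150.0, 95.0]
volumes = [30.0, 22.0, 41.0, 60.0, 20.0]
c_mid = empirical_copula(peaks, volumes, 0.5, 0.5)
tail = upper_tail_concentration(peaks, volumes, 0.5)
```

With real samples, evaluating `upper_tail_concentration` separately for flash, synoptic and snowmelt floods at levels such as 0.8 or 0.9 would give a first impression of whether one flood type shows stronger joint extremes, although short records make such estimates very uncertain.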

During the course of the IMPALA project, an overview article (Blöschl et al., 2015, WIREs Water) was also prepared. It examines whether floods have changed in the past and explores the driving processes of such changes in the atmosphere, the catchments and the river system based on examples from Europe. Methods are reviewed for assessing whether floods may increase in the future. It is argued that an integrated flood risk management approach is needed for dealing with future flood risk with a focus on reducing the vulnerability of the societal system.

The final results and their potential impact and use:
=====================================
The project IMPALA aims at improving the copula-based frequency modelling of multivariate flood data through improved assessment of the marginal distributions of the variables. The research field of copulas is multifaceted, with plenty of practical applications and a number of scientific questions still to be answered. Nevertheless, the shortness of the time series of the individual variables remains a general problem of copula-based frequency modelling, leading to unreliable inferences about the type of the fitted copula family or the magnitudes of the T-year design quantiles. As the innovation of the project IMPALA, we propose a method for increasing the density of the data in the extreme tail of the copulas (which is of major interest in hydrological design) by incorporating a considerably larger number of extremes in the right tail of the marginal distributions of the variables.

An important outcome of the project IMPALA is the process-based analysis of the bivariate flood peak-volume relationship. Even though this analysis represents a minor deviation from the original goals of the project, the results indicate that this was a prudent decision. It was found that the flood genesis (quantified by flood types including synoptic rain storms, flash flood events and snowmelt floods) is an important control in the regional analysis of bivariate flood relationships. It was concluded that uncertainty in the choice of any bivariate statistical model can be reduced by a deeper hydrological analysis of the dependence structure between flood peaks and volumes, particularly by considering the model’s suitability and the flood generation mechanisms in the target region.

Because the project IMPALA relates to many theoretical and practical questions (e.g. design value estimation, flood risk assessment, non-stationarity of the environment and its effect on the hydrological cycle), the target groups of the communication and dissemination plan for the project’s results are highly diverse: the international scientific community, professional end-users and flood mitigation policy makers, young researchers and university students, and the general public, including national, regional and local authorities, NGOs and pressure groups.