Statistical challenges and approaches
for the analysis and verification/validation
of weather and climate extremes

Dr. B. Casati, Ouranos, Montreal, Canada

Extreme events can be defined in different ways, depending on the users and the purpose of the study. Extremes can be maxima or minima; they can be regarded as rare events; they can be defined by their magnitude or by their socio-economic impacts. A definition often used in statistics (and which embraces some of the previous ones) is: extreme events are events in the tail of the distribution. Before addressing the analysis and verification/validation of extremes, it is essential to define the extremes themselves clearly.

Several recent studies in weather, climate and related sciences have focused on extremes. From the user's perspective, extremes have high social and economic impacts. Climate studies project an increase in the frequency and magnitude of extremes. The progressively higher temporal and spatial resolution of NWP models makes it possible to better resolve (some) extreme events. Analysis and verification/validation of extreme events play a key role since (1) they enhance understanding of the capability of our models in predicting extremes, and (2) they help decision makers in developing adaptation strategies to mitigate impacts and losses.

Statistical analysis of extreme events is challenging for several reasons. Extremes are often characterized by large values (and outliers): robust and resistant statistics are necessary for their treatment. Extreme events are often rare events: these are characterized by a small sample, and therefore large uncertainties. Observations of extreme events might also be poor or sub-sampled because of technical difficulties related to the extreme weather itself (e.g., extreme cold temperatures in the Arctic are generally recorded in winter, when fewer observations are available). Pooling in space and time alleviates the effects of the small sample size; however, inhomogeneity and non-stationarity ought to be taken into account. Alternatively, statistical analysis of moderate extremes can be used to infer the behaviour of more extreme events. In a categorical approach, small samples can also result in unstable statistics, over-sensitivity to the bias and/or non-informative asymptotic limits.

Extreme Value Theory (EVT) is the branch of statistics which studies the properties of extreme values and enables them to be fitted with theoretical distributions (or probability models). Such theoretical distributions describe the behaviour of extremes through the estimation of a few key parameters. The use of theoretical distributions and their parameters leads to three major advantages: (1) Properties of the population and/or of very large extremes can be inferred from the EVT theoretical distributions (even if such extremes are not observed in the actual sample). (2) Trends in the extremes (e.g., due to climate change) and/or dependence of extreme events on specific covariates (e.g., annual and diurnal cycles, North Atlantic Oscillation) can be accounted for by using a non-stationary model for the EVT distribution parameters. (3) EVT distributions are well suited to right-skewed distributions characterized by large values and outliers (such as the distributions of extremes). EVT distribution parameters can therefore provide, for example, robust and resistant measures of the typical values and variability of extremes. The use of EVT distributions and their parameters often plays a key role in extracting the signal from empirical (and often noisy) extreme data.
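As an illustration of a non-stationary model for the distribution parameters, the Generalized Extreme Value (GEV) distribution, one of the EVT families, can be written with a time-dependent location parameter. The linear trend in the location, and the choice to keep scale and shape constant, are assumptions made here for simplicity; the symbols \mu_0, \mu_1, \sigma and \xi are introduced only for this sketch.

```latex
% GEV distribution function with a time-dependent location parameter
G(x; t) = \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu(t)}{\sigma} \right]^{-1/\xi} \right\},
\qquad 1 + \xi \, \frac{x - \mu(t)}{\sigma} > 0, \quad \sigma > 0, \quad \xi \neq 0

% Assumed covariate model: linear trend in the location only
\mu(t) = \mu_0 + \mu_1 t

% Value exceeded with probability 1/T in the year t
x_T(t) = \mu(t) + \frac{\sigma}{\xi} \left\{ \left[ -\ln\left( 1 - \frac{1}{T} \right) \right]^{-\xi} - 1 \right\}
```

Under these assumptions a trend in the extremes appears as a non-zero \mu_1, and other covariates (e.g., a seasonal cycle or an NAO index) could enter \mu(t) in the same way.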

The capabilities of EVT have only recently been explored by the climate and weather research communities. The use of EVT differs radically between the two communities, because of their different spatial and temporal matching requirements and different time scales. Climate does not require exact time matching, so climate studies focus on the analysis and comparison of marginal distributions obtained over 30 or more years (e.g., future versus present climate). In weather forecasts, on the other hand, the events require more exact matching in space and time, so verification approaches focus on the behaviour of the joint distribution of forecasts and observations. As a result, analysis and validation of climate extremes use EVT for univariate distributions, whereas extreme weather verification uses EVT for bivariate distributions. The approaches of the climate and weather communities are complementary -- both communities could gain from exchanging methods and collaborating.

Extreme weather verification

Early verification studies analyzed the behaviour of traditional binary categorical verification scores for extreme and rare events (Schaefer, 1990; Doswell et al., 1990; Marzban, 1998; Goeber, 2004). Most of these studies showed that the verification scores in extreme and rare event situations are overly sensitive to the bias and encourage either under- or over-forecasting. A few studies investigated the mathematical limits of the scores as the events become rarer: Schaefer (1990) showed that the Equitable Threat Score (ETS) converges to the Threat Score (TS) as the base rate tends to zero; Doswell et al. (1990) showed that the Peirce Skill Score (PSS) converges to the hit rate as the number of correct rejections becomes larger relative to the other entries in the contingency table. Goeber (2004) and Stephenson et al. (2008) showed that most of the categorical scores converge to trivial limits (either zero or infinity) as the events get more and more extreme and rare.
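The two limits quoted above can be checked numerically. The following is a minimal sketch using the standard contingency-table definitions of the scores; the counts are hypothetical, and the event is made rarer simply by increasing the number of correct rejections while the other entries stay fixed.

```python
def categorical_scores(a, b, c, d):
    """Binary categorical scores from a 2x2 contingency table.

    a = hits, b = false alarms, c = misses, d = correct rejections.
    """
    n = a + b + c + d
    a_random = (a + b) * (a + c) / n  # hits expected by chance
    return {
        "base_rate": (a + c) / n,
        "bias": (a + b) / (a + c),
        "hit_rate": a / (a + c),
        "TS": a / (a + b + c),
        "ETS": (a - a_random) / (a + b + c - a_random),
        "PSS": a / (a + c) - b / (b + d),
    }


# Hypothetical counts: hits, false alarms and misses are held fixed while
# the correct rejections grow, so the base rate tends to zero.
for d in (100, 10_000, 1_000_000):
    s = categorical_scores(10, 20, 15, d)
    print(f"base rate {s['base_rate']:.6f}  "
          f"TS {s['TS']:.4f}  ETS {s['ETS']:.4f}  "
          f"hit rate {s['hit_rate']:.4f}  PSS {s['PSS']:.4f}")
```

As the base rate decreases, the printed ETS approaches the TS and the PSS approaches the hit rate, so neither pair of scores carries independent information for very rare events.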

Ferro (2007) and Stephenson et al. (2008) propose a verification approach based on EVT for bivariate distributions. Both studies are inspired by the study by Coles et al. (1999) on dependence measures for extreme values. Ferro (2007) proposes a probability model to represent the asymptotic behaviour of the joint distribution of forecasts and observations as the events get more and more rare. Stephenson et al. (2008) propose a non-vanishing measure for extreme event verification, the Extreme Dependency Score (EDS). Both the probability model parameters and the EDS depend on the rate of convergence to zero of the hit rate as the events get rarer. Note that the EDS is one of the two parameters considered in Ferro (2007).
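For reference, the EDS is commonly written in terms of the contingency-table counts as EDS = 2 ln((a + c)/n) / ln(a/n) - 1, where a is the number of hits, c the number of misses and n the sample size. The sketch below evaluates this expression for two hypothetical tables; the function name and the counts are illustrative only.

```python
import math


def extreme_dependency_score(hits, misses, n):
    """EDS = 2 ln((a + c) / n) / ln(a / n) - 1, with a = hits, c = misses.

    The score is undefined when there are no hits (a = 0).
    """
    if hits <= 0:
        raise ValueError("EDS is undefined when there are no hits")
    base_rate = (hits + misses) / n
    return 2.0 * math.log(base_rate) / math.log(hits / n) - 1.0


# Hypothetical rare event with base rate 0.01 in a sample of one million cases.
# Every observed event forecast: a = 10000, c = 0.
print(extreme_dependency_score(10_000, 0, 1_000_000))
# Unbiased random forecast: a/n equals the base rate squared (a = 100, c = 9900).
print(extreme_dependency_score(100, 9_900, 1_000_000))
```

The first case gives the maximum value of 1 and the second gives a value of zero (to rounding), whatever the base rate, which is the sense in which the measure does not vanish as the events get rarer. Note that false alarms do not enter the expression, so the bias should be examined separately.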

Neighbourhood (fuzzy) verification approaches (e.g., Ebert, 2008) have recently been used for extreme weather verification to alleviate the requirement of exact spatial and temporal matching between observed and forecast events (quite a strict requirement, given the low predictability of extreme and rare events).
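To illustrate how a neighbourhood approach relaxes the matching requirement, the sketch below computes one neighbourhood measure, the Fractions Skill Score (FSS; Roberts and Lean, 2008), on a synthetic case. The choice of the FSS is an assumption made here (the text above does not single out a method), and the grid, the threshold and the displaced 3x3 feature are hypothetical.

```python
def neighbourhood_fractions(binary, window):
    """Fraction of event points in a window x window box around each point.

    `window` is assumed odd; points outside the domain count as non-events.
    """
    half = window // 2
    ny, nx = len(binary), len(binary[0])
    fractions = [[0.0] * nx for _ in range(ny)]
    for i in range(ny):
        for j in range(nx):
            count = 0
            for ii in range(max(0, i - half), min(ny, i + half + 1)):
                for jj in range(max(0, j - half), min(nx, j + half + 1)):
                    count += binary[ii][jj]
            fractions[i][j] = count / window ** 2
    return fractions


def fractions_skill_score(forecast, observed, threshold, window):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    fb = [[1 if v >= threshold else 0 for v in row] for row in forecast]
    ob = [[1 if v >= threshold else 0 for v in row] for row in observed]
    pf = neighbourhood_fractions(fb, window)
    po = neighbourhood_fractions(ob, window)
    num = sum((f - o) ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    den = sum(f * f + o * o for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    return 1.0 - num / den if den > 0 else float("nan")


def field_with_block(row0, col0, size=3, n=30, value=50.0):
    """Synthetic n x n field that is zero except for one size x size block."""
    field = [[0.0] * n for _ in range(n)]
    for i in range(row0, row0 + size):
        for j in range(col0, col0 + size):
            field[i][j] = value
    return field


observed = field_with_block(10, 10)
forecast = field_with_block(10, 14)  # same feature, displaced by 4 grid points
for window in (1, 5, 9):
    print(window, round(fractions_skill_score(forecast, observed, 25.0, window), 3))
```

With exact grid-point matching (window of 1) the displaced forecast scores zero, while larger neighbourhoods progressively give credit for a feature that is forecast close to where it was observed.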

There is no consensus among weather services on which technique is optimal for verifying extremes. Extreme events themselves are often defined by the exceedance of fixed thresholds, which depend on the local climatology or on some specific application. A suite of standard statistics would help monitoring and intercomparisons.

Climate extremes analysis and validation

Two approaches for the analysis of extremes are adopted in climate studies:

  1. Evaluation and analysis of indices associated with extreme phenomena.

    Temperature Indices
    frost days = n of days with Tmin < 0C
    cold days = n of days with Tmax < 10th percentile
    cold nights = n of days with Tmin < 10th percentile
    summer days = n of days with Tmax > 25C
    warm days = n of days with Tmax > 90th percentile
    warm nights = n of days with Tmin > 90th percentile
    diurnal T range = average of (Tmax - Tmin)
    standard deviation of Tmean

    Precipitation Indices
    total annual snowfall accumulation
    total snow to total precip ratio
    days with precipitation
    days with rain
    average precip intensity for precip days
    average rainfall intensity for rain days
    max number of consecutive dry days
    highest 5-day precip accumulation amount
    very wet days = n of days with P > 95th percentile
    heavy precipitation days = n of days with P > 10 mm

    A suite of standard indices related to extremes in daily temperature and precipitation has been identified (Frich et al., 2002) and later extended (see the Index List) by the joint CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI), in an international effort coordinated by WMO/WCRP. These indices are used both to monitor extremes in the present climate (e.g., Vincent and Mekis, 2006) and to analyze the evolution of extremes with climate change (e.g., Tebaldi et al., 2006). The statistical significance of the index trends is usually assessed, along with their distributions and geographical location.

  2. Evaluation and analysis of return values obtained from extreme value distributions.
    EVT describes two types of theoretical distributions for fitting extreme values: Block Maxima (e.g., maximum annual temperature) are described by the Generalized Extreme Value (GEV) distribution, whereas the magnitudes of Peaks over Threshold (PoT) are fitted with the Generalized Pareto (GP) distribution. Most climate studies use GEV distributions to define extreme event return values (the value exceeded on average once every T years), and then analyze the changes in the return values with climate change (e.g., Kharin et al., 2007; Fowler et al., 2007). Stationary fits based on L-moments are usually applied to two (present and future) time windows; however, recent studies are considering maximum likelihood estimation (MLE) to determine covariates and the time evolution of the extreme distributions with climate change. The PoT approach has also recently been embraced, since it enables the use of a larger data sample than the Block Maxima approach.
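The two approaches listed above can be sketched in a few lines. The first part computes some of the threshold-based indices from daily series; the 1 mm wet-day threshold is an assumption, and percentile-based indices are omitted because they need a climatological base period. The second part fits a stationary GEV distribution to block maxima by L-moments, using a widely used polynomial approximation for the shape parameter, and derives a return value. The data and the "true" parameters are synthetic and hypothetical.

```python
import math
import random


# --- Approach 1: indices from daily series ---------------------------------
def frost_days(tmin):
    return sum(1 for t in tmin if t < 0.0)


def summer_days(tmax):
    return sum(1 for t in tmax if t > 25.0)


def max_consecutive_dry_days(precip, wet_threshold=1.0):
    """Longest run of days with precipitation below the (assumed) threshold."""
    longest = run = 0
    for p in precip:
        run = run + 1 if p < wet_threshold else 0
        longest = max(longest, run)
    return longest


def max_5day_precip(precip):
    return max(sum(precip[i:i + 5]) for i in range(len(precip) - 4))


# --- Approach 2: GEV fit by L-moments and return values --------------------
def sample_l_moments(values):
    """First three sample L-moments from probability-weighted moments."""
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def fit_gev_lmoments(values):
    """Return (location, scale, shape xi); assumes the fitted shape is not 0."""
    l1, l2, l3 = sample_l_moments(values)
    c = 2.0 / (3.0 + l3 / l2) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # approximation; k = -xi
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * math.gamma(1.0 + k))
    loc = l1 - scale * (1.0 - math.gamma(1.0 + k)) / k
    return loc, scale, -k


def gev_quantile(loc, scale, xi, p):
    y = -math.log(p)
    if abs(xi) < 1e-9:  # Gumbel limit
        return loc - scale * math.log(y)
    return loc + scale / xi * (y ** (-xi) - 1.0)


def gev_return_value(loc, scale, xi, period):
    """Value exceeded on average once every `period` blocks (e.g., years)."""
    return gev_quantile(loc, scale, xi, 1.0 - 1.0 / period)


# Synthetic block maxima drawn from a GEV with hypothetical parameters.
rng = random.Random(42)
maxima = [gev_quantile(30.0, 2.0, 0.1, min(max(rng.random(), 1e-12), 1 - 1e-12))
          for _ in range(1000)]
loc, scale, xi = fit_gev_lmoments(maxima)
print("fitted 20-year return value:", round(gev_return_value(loc, scale, xi, 20), 2))
print("true 20-year return value:  ", round(gev_return_value(30.0, 2.0, 0.1, 20), 2))
```

A sample of 1000 maxima is used only to keep the illustration stable; with the 30 to 50 values typical of an observed record or a model time window, the sampling uncertainty of the return value is considerably larger and should be quantified (e.g., by bootstrap).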
Because of the large scale mismatch between climate model grids and station observations, validation of climate models is usually performed against re-analyses (Gleckler et al., 2008). The representativeness error is particularly large when considering precipitation extremes. Upscaling techniques (to interpolate observed extremes to the model grid) or areal correction factors (to infer extreme local values from the grid-box areal-average estimates) are sometimes applied. Direct comparison of climate models versus observed extremes is usually used to define downscaling relations.
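The effect of the scale mismatch on extremes can be seen with a very simple upscaling. The sketch below uses plain block averaging, which is only one possible upscaling choice and is assumed here for illustration; the fine-scale precipitation field is hypothetical.

```python
def block_average(field, factor):
    """Average a 2-D field over non-overlapping factor x factor blocks."""
    ny, nx = len(field), len(field[0])
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for i in range(0, ny, factor):
        row = []
        for j in range(0, nx, factor):
            cells = [field[ii][jj]
                     for ii in range(i, i + factor)
                     for jj in range(j, j + factor)]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse


# Hypothetical daily precipitation (mm) on a fine grid: one very local extreme
# and one area of moderate, widespread rain.
fine = [
    [80.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 0.0, 4.0, 4.0],
]
coarse = block_average(fine, 2)
print("fine-grid maximum:  ", max(max(row) for row in fine))
print("coarse-grid maximum:", max(max(row) for row in coarse))
```

The local 80 mm extreme becomes a 20 mm grid-box average, whereas the widespread 4 mm rainfall is unchanged: the areal average damps localized extremes much more than widespread ones, which is why point observations and grid-box values of extreme precipitation are not directly comparable.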

Some key references for EVT
References for weather extremes verification
References for climate extremes analysis and validation