The spatial multi-event contingency table methodology is well suited to verifying high-resolution forecasts because it gives credit to forecasts that are "close" to the truth in some way, even when they are not exactly correct.
The performance of a set of deterministic forecasts is often represented
by a simple 2 x 2 contingency
table that represents the joint distribution of forecasts and observations
for a specified event criterion or threshold (for example, rain exceeding
| ||Observed yes||Observed no|
|Forecast yes||hits||false alarms|
|Forecast no||misses||correct negatives|
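As an illustration, the four cells of the table can be counted from matched forecast and observation pairs. The following is a minimal sketch only: the function name `contingency_table`, the 1 mm/h threshold, and the sample rain rates are all hypothetical and chosen just for this example.

```python
def contingency_table(forecasts, observations, threshold):
    """Count the four cells of a 2 x 2 contingency table for one event threshold."""
    table = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_negatives": 0}
    for f, o in zip(forecasts, observations):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            table["hits"] += 1
        elif forecast_yes:
            table["false_alarms"] += 1
        elif observed_yes:
            table["misses"] += 1
        else:
            table["correct_negatives"] += 1
    return table


# Hypothetical rain rates (mm/h) at six matched points.
forecast_rain = [0.0, 2.5, 1.2, 0.4, 3.0, 0.0]
observed_rain = [0.0, 1.8, 0.3, 1.5, 2.2, 0.1]

table = contingency_table(forecast_rain, observed_rain, threshold=1.0)
print(table)
```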
Now, for the same observed event criterion, consider a range of K
thresholds on the forecasts (for example, forecast rain exceeding 1 mm/h,
2 mm/h, 5 mm/h, etc.). These can be viewed as possible decision thresholds
for taking action, such as issuing a warning. Instead of having only a
single event category, the contingency table now contains multiple categories
corresponding to the K forecast thresholds.
| ||Observed yes||Observed no|
|Forecast >= threshold1||hits1||false alarms1|
|Forecast < threshold1||misses1||correct negatives1|
|Forecast >= threshold2||hits2||false alarms2|
|Forecast < threshold2||misses2||correct negatives2|
|Forecast >= thresholdK||hitsK||false alarmsK|
|Forecast < thresholdK||missesK||correct negativesK|
By using multiple thresholds, a deterministic forecast system can be evaluated across a range of possible decision thresholds (instead of just one) using ROC and relative value. This enables a fairer comparison against ensemble prediction systems or other probabilistic forecasts.
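To make this concrete, each forecast threshold yields one 2 x 2 table and hence one point on a ROC diagram, using the standard definitions hit rate = hits / (hits + misses) and false alarm rate = false alarms / (false alarms + correct negatives). The sketch below assumes the observed event has already been reduced to yes/no; the function name `roc_points` and the sample data are hypothetical.

```python
def roc_points(forecasts, observed_events, forecast_thresholds):
    """Return (threshold, hit rate, false alarm rate) for each forecast threshold."""
    points = []
    for threshold in forecast_thresholds:
        hits = false_alarms = misses = correct_negatives = 0
        for f, event in zip(forecasts, observed_events):
            warned = f >= threshold  # decision: act if the forecast reaches the threshold
            if warned and event:
                hits += 1
            elif warned:
                false_alarms += 1
            elif event:
                misses += 1
            else:
                correct_negatives += 1
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        hit_rate = hits / n_events if n_events else float("nan")
        false_alarm_rate = false_alarms / n_non_events if n_non_events else float("nan")
        points.append((threshold, hit_rate, false_alarm_rate))
    return points


# Hypothetical forecast rain rates (mm/h) and whether the observed event occurred.
forecast_rain = [0.0, 2.5, 1.2, 0.4, 3.0, 0.0, 6.0, 1.5]
observed_event = [False, True, False, True, True, False, True, False]

points = roc_points(forecast_rain, observed_event, [1.0, 2.0, 5.0])
for threshold, hit_rate, false_alarm_rate in points:
    print(threshold, hit_rate, false_alarm_rate)
```

Plotting hit rate against false alarm rate for all K thresholds traces out the ROC curve for the deterministic forecast.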
For an ensemble prediction system with M members, each forecast threshold k now has M probability categories (at least 1 member >= thresholdk, at least 2 members >= thresholdk, etc.), yielding a total of K x M categories.
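A minimal sketch of how the K x M categories might be tabulated is given below. It assumes that a "yes" forecast for category (k, m) means at least m members reach threshold k; the function name `ensemble_tables` and the small three-member ensemble are hypothetical.

```python
def ensemble_tables(ensemble_forecasts, observed_events, thresholds):
    """Build one 2 x 2 table per (threshold, minimum member count) pair.

    ensemble_forecasts: one list of M member values per case.
    observed_events: one True/False flag per case.
    """
    n_members = len(ensemble_forecasts[0])
    tables = {}
    for threshold in thresholds:
        for min_members in range(1, n_members + 1):
            table = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_negatives": 0}
            for members, event in zip(ensemble_forecasts, observed_events):
                n_exceeding = sum(value >= threshold for value in members)
                warned = n_exceeding >= min_members
                if warned and event:
                    table["hits"] += 1
                elif warned:
                    table["false_alarms"] += 1
                elif event:
                    table["misses"] += 1
                else:
                    table["correct_negatives"] += 1
            tables[(threshold, min_members)] = table
    return tables


# Hypothetical 3-member ensemble rain forecasts (mm/h) for four cases.
ensemble = [[0.5, 1.2, 2.5], [0.0, 0.2, 0.8], [3.0, 2.2, 5.5], [1.1, 0.4, 0.9]]
events = [True, False, True, False]

tables = ensemble_tables(ensemble, events, thresholds=[1.0, 2.0])
print(len(tables))  # K x M = 2 x 3 tables
```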
An alternative to multiple intensity thresholds is multiple "closeness" thresholds, for example, a forecast event within 10 km of the location of interest, within 20 km, within 30 km, etc. Forecasters conceptually interpret high-resolution model output in this way. The verification results can therefore be used to assess the performance of high-resolution forecasts where exact spatial matching of forecast and observed events is difficult or unimportant.
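The closeness criterion can be sketched as a simple distance test. The example below assumes flat Cartesian coordinates in km and treats the forecast as "yes" at a given radius if any forecast event lies within that radius of the location of interest; the function name `event_within` and the coordinates are hypothetical.

```python
import math


def event_within(location, event_locations, radius_km):
    """Return True if any forecast event lies within radius_km of the location."""
    x0, y0 = location
    return any(math.hypot(x - x0, y - y0) <= radius_km for x, y in event_locations)


# Hypothetical layout: location of interest at the origin, two forecast events
# at 13 km and 50 km from it (coordinates in km).
location_of_interest = (0.0, 0.0)
forecast_events = [(12.0, 5.0), (30.0, 40.0)]

decisions = {
    radius: event_within(location_of_interest, forecast_events, radius)
    for radius in (10, 20, 30)
}
print(decisions)
```

Each radius then plays the same role as an intensity threshold: it defines one yes/no forecast that can be scored against the observation in its own 2 x 2 table.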
Other forecast decision criteria are possible, depending on the application.
Decision criteria can be combined to produce multi-dimensional contingency
tables. The spatial multi-category contingency table described by Atger
(2001) is a good example. In the case below, with J distance thresholds
and K intensity thresholds, the number of categories would be J x K for
single-model forecasts and J x K x M for ensemble prediction systems.
| ||Forecast within distance1||...||Forecast within distanceJ|
| ||Observed yes||Observed no||...||...||Observed yes||Observed no|
|Forecast >= threshold1||hits11||false alarms11||...||...||hitsJ1||false alarmsJ1|
|Forecast < threshold1||misses11||correct negatives11||...||...||missesJ1||correct negativesJ1|
|Forecast >= threshold2||hits12||false alarms12||...||...||hitsJ2||false alarmsJ2|
|Forecast < threshold2||misses12||correct negatives12||...||...||missesJ2||correct negativesJ2|
|Forecast >= thresholdK||hits1K||false alarms1K||...||...||hitsJK||false alarmsJK|
|Forecast < thresholdK||misses1K||correct negatives1K||...||...||missesJK||correct negativesJK|
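The combined J x K table can be sketched by looping over both criteria. This is one plausible reading rather than Atger's exact procedure: it assumes a "yes" forecast for category (j, k) means that some forecast point within distance j of the location has a value of at least threshold k. The function name `spatial_tables`, the case layout, and the data are hypothetical.

```python
import math


def spatial_tables(cases, distances_km, thresholds):
    """Build one 2 x 2 table per (distance, threshold) pair.

    Each case is (location, observed_event, forecast_points), where
    forecast_points is a list of (x_km, y_km, value) tuples.
    """
    tables = {}
    for distance in distances_km:
        for threshold in thresholds:
            table = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_negatives": 0}
            for (x0, y0), event, points in cases:
                warned = any(
                    value >= threshold and math.hypot(x - x0, y - y0) <= distance
                    for x, y, value in points
                )
                if warned and event:
                    table["hits"] += 1
                elif warned:
                    table["false_alarms"] += 1
                elif event:
                    table["misses"] += 1
                else:
                    table["correct_negatives"] += 1
            tables[(distance, threshold)] = table
    return tables


# Two hypothetical cases: forecast rain (mm/h) at 0, 8 and 15 km from the location.
cases = [
    ((0.0, 0.0), True, [(0.0, 0.0, 0.5), (8.0, 0.0, 1.5), (15.0, 0.0, 6.0)]),
    ((0.0, 0.0), False, [(0.0, 0.0, 0.0), (8.0, 0.0, 0.2), (15.0, 0.0, 2.5)]),
]

tables = spatial_tables(cases, distances_km=[10, 20], thresholds=[1.0, 5.0])
print(len(tables))  # J x K = 2 x 2 tables
```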
Atger, F., 2001: Verification of intense precipitation forecasts from single models and ensemble prediction systems. Nonlin. Proc. Geophys., 8, 401-417.