Journal of Hydrology 276 (2003) 210–223 www.elsevier.com/locate/jhydrol
Non-stationary pooled flood frequency analysis Juraj M. Cunderlika,*, Donald H. Burnb,1 a
Department of Civil Engineering, University of Waterloo, Waterloo,ON, N2L 3G1, Canada Department of Civil Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
b
Received 4 September 2002; accepted 24 January 2003
Abstract The presence of significant non-stationarity in a hydrologic time series cannot be ignored when estimating design values for future time horizons. This paper introduces a second-order non-stationary approach to pooled flood frequency analysis. The proposed approach separates the non-stationary pooled quantile function into a local time-dependent component, comprising the location and scale distribution parameters, and a regional component that can be regarded as time-independent under the assumption of second-order non-stationarity. A local trend analysis is used for identification, local significance assessment and estimation of the changes in the time-dependent components. A regional trend analysis based on a regional bootstrapresampling algorithm is then applied for assessment of the changes at a regional scale. A Monte-Carlo experiment is used for evaluating the performance of the method in the estimation of trend magnitudes in the location and scale parameters of a nonstationary series. The model is then applied on a study catchment from a homogeneous region. The results show that ignoring even a weakly significant non-stationarity in the data series may seriously bias the quantile predicted for time horizons as near as 0 – 20 years in the future. q 2003 Elsevier Science B.V. All rights reserved. Keywords: Annual maximum flood; Non-stationarity; Local and regional trend; Mann– Kendall test; Pooled flood frequency; L-moments
1. Introduction According to Thompson and Perry (1998), floods caused the highest percentage of deaths (26%) among all natural hazards evaluated around the World during the period 1963– 1992. With a total of 32% they also have the leading position in total significant damage * Corresponding author. Tel.: þ1-418-654-3751; fax: þ 1-418654-2600. E-mail addresses:
[email protected] (J.M. Cunderlik),
[email protected] (D.H. Burn). 1 Tel.: þ1-519-888-4567x3338; fax: þ1-519-888-6197.
and the second position (32%, after droughts) in the total number of persons affected. During the last several years many hydrological studies have identified significant changes (trends) in the annual maximum flood (AMF) time series (Grew and Werrity, 1995; Changnon and Kunkel, 1995; Westmacott and Burn, 1997; Robson et al., 1998; Reed et al., 1999; Lins and Slack, 1999; Loukas and Quick, 1996, 1999; Cayan et al., 1999; Jain and Lall, 2001; Douglas et al., 2000; Adamowski and Bocci, 2001; Zhang et al., 2001; Cunderlik and Burn, 2002b and others). Most of these studies have focused on the identification of trends in the location parameter of
0022-1694/03/$ - see front matter q 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0022-1694(03)00062-3
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
the AMF series. So far, minimal attention has been paid to possible changes in the scale and shape parameters of the AMF series, mostly due to the lack of both long data records and convenient methods for trend detection in higher moments. There are few methods available for estimating trends in the scale and shape parameters of a sample distribution function. Vinnikov and Robock (2002) estimated linear trends in higher moments using a simple formula Eby0 ðtÞn c ¼ b0 þ b1 t 0
ð1Þ
where y ðtÞ are residuals from the de-trended time series ðy0 ðtÞ ¼ yðtÞ 2 a0 2 a1 tÞ; bi are parameters of the trend function and n is the moment order. The linear approximation in Eq. (1) is not adequate since the raising of the de-trended series to the power of n transforms the true trend into a non-linear form. Another method frequently used for the estimation of trends in higher moments is the method of moving windows. A time window of fixed length is moved with constant step size over a time series and for each stop, distribution parameters are estimated by any conventional estimation technique such as the method of moments, L-moments or maximum likelihood. The performance of this method depends on three parameters, the length of the series, the length of the window and the time step. An optimal trade-off between the latter two parameters can be found by a simulation experiment. Plag and Tsimplis (1999) applied this method to the moving harmonic analysis of a century-long climatological data records in the North and Baltic Seas. Unfortunately, the method of moving windows is less convenient for AMF series because of their relatively short record lengths. A time series whose distribution parameters change over time is called non-stationary. The source of the non-stationarity in hydrologic records can be a natural catastrophe or periodicity (forest fires, El Nin˜o, solar activities), anthropogenic activity (land use changes due to deforestation, urbanization) or in changing climate (natural or anthropogenic). When significant non-stationarity is identified in the AMF time series, then the standard statistical methods cannot be applied for flood frequency estimation (FFE) because the parameters describing the location, scale and shape properties of the series change over time. Also, the parametric approach to FFE requires
211
a choice of the distribution function of AMF to be made, which is, under non-stationary conditions, time dependent. New approaches are needed, that would allow predicting the time-dependent distribution parameters at some time point in the future. North (1980) developed a time-dependent flood frequency model (FFM) in which both the time of occurrence and the flood magnitude were time-dependent variables. Duckstein et al. (1987) outlined the Bayesian approach to the estimation of the posterior flood distribution. Strupczewski et al. (2001) and Strupczewski and Kaczmarek (2001)) developed a nonstationary flood frequency procedure comprising 56 models of flood distribution and trend function. The authors then estimated model parameters via the maximum likelihood and weighted least squares techniques. In this paper we propose a second-order nonstationary approach to pooled flood frequency analysis. The second-order approach assumes nonstationarity in the first two moments of the time series. The main objective was to develop a pooled FFM that separates the non-stationary pooled quantile function into a local time-dependent component comprising the location and scale distribution parameters and a regional component that can be regarded as timeindependent under the assumption of second-order non-stationarity. Parameters of the local component are estimated from a time series decomposed into a trend component and a residual time-dependent random variable that represents irregular fluctuations around the trend. Non-stationarity in the location parameter is then estimated from the trend component and in the scale parameter from the variability in the residual component. The local significance of nonstationarity is tested using the Mann – Kendall test. The regional significance of non-stationarity is assessed by a regional bootstrap resampling technique. A Monte-Carlo experiment is further used for evaluation of the performance of the method in the estimation of trend magnitudes in the location and scale parameters of a non-stationary series. Section 2 describes the non-stationary time-series model adopted in this study and the technique used for estimation of the model’s parameters and the significance and uncertainty involved at local and regional scales. This is followed by an introduction of the case study and Section 4 discusses the results.
212
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
2. Methodology
the general form
2.1. Non-stationary pooled flood frequency model
mðtÞ ¼ a0 þ a1 t þ a2 t2 þ ··· þ an tn
A time series whose parameters change over time is called non-stationary. Due to large sampling variability involved in the estimation of higher moments from the average 30 –50 years long hydrologic series, it is convenient to focus on the nonstationarity in the first two moments of the series. As we limit our attention to non-stationarity in the first two moments we will adopt a second-order nonstationary FFM. A time series is said to be weakly or second-order non-stationary if
mðtÞ – m0 and=or sðtÞ – s0 and=or sðt;t 2 kÞ – sk ð2Þ where m is the mean, s is the variance and sðt;t 2 kÞ is the covariance between two observations k time periods apart. The time-dependent quantile function of a second-order non-stationary time series can be expressed by QðF;tÞ ¼ vðtÞqðFÞ
ð3Þ
where QðF;tÞ is the quantile with a probability F [ ð0;1Þ at a time t; vðtÞ is a time-dependent local component and qðFÞ is a time-invariant regional component of the quantile function defined by
vðtÞ ¼ mðtÞsðtÞ qðFÞ ¼ f½F; Q1 ; Q2 ;…; Qn
ð4Þ
where mðtÞ and sðtÞ are non-stationary location and scale parameters, f is a distribution-specific scale frequency factor and Qi is a set of n stationary parameters describing the pooled growth curve. It is convenient to standardize the original data by their mean value and then set the parameters Q1 and Q2 to unity. Eq. (3) separates the nonstationary quantile function into a local timedependent component comprising the location and scale distribution parameters and a regional component that can be regarded as time-independent under the assumption of second-order non-stationarity. The location and the scale parameter are a function of time that can be expressed in
ð5Þ
sðtÞ ¼ b0 þ b1 t þ b2 t2 þ ··· þ bn tn where ai ; and bi are sets of n parameters. If n ¼ 1 then the functions reduce to a linear form. In a pooled FFM focused on a gauged site the location and scale parameters are usually estimated from local data. The location parameter is substituted by the index flood (mean or median annual flood) and the scale parameter is estimated from a measure describing the variability of floods (such as second moment or L-moment). The shape parameters are estimated regionally—from averages obtained from local values of distribution’s shape describing measures such as coefficient of skewness and kurtosis. 2.2. Estimation and prediction of the time-dependent parameters of FFM The following model of the standardized AMF time series ðQmax ðtÞÞ is proposed for the identification and prediction of the changes in the location and scale parameters of the second-order non-stationary flood frequency distribution Qmax ðtÞ ¼ tðtÞ þ 1ðtÞ
ð6Þ
where tðtÞ is a trend component and 1ðtÞ is a residual time-dependent random variable representing irregular fluctuations around the trend. The variability of the fluctuations is a function of time. The distribution function of Qmax ðtÞ is unknown since its parameters change over time. The trend component in Eq. (6) can be represented by
tðtÞ ¼ a0 þ a1 t þ a2 t2 þ · · · þ an tn
ð7Þ
where ai is a set of n parameters and can be used for predicting the location parameter (the index flood) pperiods ahead of time t: Let us define the de-trended series by 1ðtÞ ¼ Qmax ðtÞ 2 tðtÞ
ð8Þ
The residual time-dependent component 1ðtÞ contains information about the trend in the variability (scale) of the series. The trend in the scale parameter can be
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
213
Fig. 1. Components of a second-order non-stationary time series (A—original time series ðQmax ðtÞÞ; B—de-trended A series with stationary mean ðQmax ðtÞ 2 tðtÞÞ; C—de-trended B series with stationary mean and variance ðQstmax ðtÞÞ).
estimated using the transformed series 10 ðtÞ ¼ l1ðtÞ 2 1l
ð9Þ
where 1 is the average residual component from the series after the trend component tðtÞ has been removed. The trend in the transformed series 10 ðtÞ can be expressed as 2
wðtÞ ¼ b0 þ b1 t þ b2 t þ · · · þ bn t
n
ð10Þ
where bi is a set of n parameters. When we remove the trend in the variability of the series wðtÞ; we will get a second-order stationary time series ðQstmax ðtÞÞ defined as 8 1ðtÞ 2 wðtÞ ;1ðtÞ $ 1 > > > increasing trend in wðtÞ > > < 1ðtÞ þ wðtÞ ;1ðtÞ , 1 st Qmax ðtÞ ¼ > > 1ðtÞ þ wðtÞ ;1ðtÞ $ 1 > > decreasing trend in wðtÞ > : 1ðtÞ 2 wðtÞ ;1ðtÞ , 1 ð11Þ The parameters of the distribution of Qstmax ðtÞ are no longer a function of time and hence the distribution function of Qstmax ðtÞ can be determined.
The components of the time series Qmax ðtÞ are shown in Fig. 1 using artificial data. Assuming an average record length of a typical hydrologic time series to be rarely more than 50 years it is convenient to reduce the trends in the location and scale parameters to a linear form
tðtÞ ¼ a0 þ a1 t
wðtÞ ¼ b0 þ b1 t
ð12Þ
If the form of trend in the location and scale parameters is assumed linear, then the slopes of trends (SL) in tðtÞ and wðtÞ; i.e. a1 and b1 ; can be estimated non-parametrically using Sen’s robust slope estimator (Sen, 1968) SL ¼ MedðSij Þ;
Sij ¼
Xi 2 X j ;j , i i2j
ð13Þ
where Xi and Xj are data values at time i and j; respectively. In order to be able to compare trend slopes from different sites, it is desirable to standardize the values of SL by SL SLS ¼ 100½% Qmax
ð14Þ
214
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
where SLS is the trend slope standardized by the site’s max Þ: The variable average annual maximum flood (Q SLS describes the yearly trend change in annual maximum floods as a percentage of the average flood value calculated from a regionally common observation period. Theil (1950) proposed a method for testing H0: SL ¼ SL0 against the alternative H1: SL – SL0 based on Kendall t statistics. First the pairs of observations (ti ; Xi 2 SL·ti Þ are arranged in ascending order by the magnitude. Then the values Xi 2 SL·ti are compared with the values Xj 2 SL·tj appearing below the ith values. Let P be the number of such comparisons that result in a pair (Xi 2 SL·ti ; Xj 2 SL·tj ) that is in natural order and let R be the number of such comparisons that result in a pair that is in reverse natural order. Define S ¼ P 2 R: Then the test statistic is defined as
t¼
S nðn 2 1Þ=2
ð15Þ
The H0 is rejected at the a level if the computed value of t is either positive and larger than the tp critical for given record length n and a=2 or negative and smaller then the negative of the tp critical for given n and a=2 (Daniel, 1990). Theil (1950) also derived confidence intervals for SL based on the hypothesis procedure described earlier. Using Eq. (13) all N possible sample slopes Sij are computed and arranged in ascending order. Let k¼
N 2 ðta=2 2 2Þ 2
ð16Þ
the Sen’s method needs to be adjusted by the parameter l: Similarly the slope of the trend in the location parameter determined by Eq. (12) underestimates the true trend if a significant trend in the scale parameter is present in the series and needs an adjustment that depends on the magnitude of the estimated Sen’s slope b1 : The values of the parameters k and l can be derived for different probability distributions using Monte-Carlo simulations. For twoparameter distributions with fixed shape parameter k and l take one specific value. For three or more parameter distributions the parameters depend on the distribution shape parameters. FFE from a non-stationary time series requires a different understanding than a traditional stationary approach. The distribution parameters and the distribution type itself change in time and so does the probability of a design value of an extreme event. The ‘initial’ (stationary) distribution parameters mst ðt0 Þ; sst ðt0 Þ at a given starting time point t0 cannot be estimated from the raw observed data Qmax ðtÞ: Together with k-shape parameters that are considered stationary they can only be determined from the transformed stationary series Qstmax ðtÞ (Eq. (11)). After the set of initial, stationary parameters is determined, their predictions can be made for a time horizon t; p-periods ahead. The prediction of the location and scale parameters at time t ðt ¼ t0 þ pÞ; p-periods ahead, assuming stationarity at t0 can be estimated from
mðtÞ ¼ mst ðt0 Þ þ a01 ðt0 þ pÞ
ð18Þ
Then the lower (upper) confidence limit for SL is the kth value of Sij counting from the smallest (largest) value upward (downward). If the magnitudes of the trends in the location a1 and scale b1 parameters are estimated using Sen’s non-parametric method, then the estimated values need to be adjusted using the approximate formulas
A given quantile estimate has another parameter besides the probability F; which is the prediction horizon t ðQðF; tÞÞ: Using Sen’s slope estimator the sample quantile function at time t; p-periods ahead, assuming stationarity at t0 can be rewritten as
a01 ¼ k1 a1 þ k2 b1 < a1 þ kb1
QðF;tÞ ¼ ½mst ðt0 Þ þ a01 ðt0 þ pÞ £ ½ðsst ðt0 Þ þ b01 ðt0 þ pÞÞ
ð17Þ
b01 ¼ l1 a1 þ l2 b1 < lb1 where ki and li are distribution dependent parameters. If a trend in the scale parameter is present in the series that originated from an extremal-type of distribution, then the de-trending operation (Eq. (8)) removes part of this trend. Therefore the slope b1 estimated by
sðtÞ ¼ sst ðt0 Þ þ b01 ðt0 þ pÞ
£ f½F;1;1; Q3 ; Q4 ;…; Qn ð19Þ where f is a scale frequency factor that depends on the probability distribution. The parameters of a stationary distribution (quantile) function can be estimated by a number
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
of methods. The method of L-moments (Hosking, 1990) has become popular for parameter estimation in hydrology. L-moments are linear functions of the sample values, thus avoiding non-linear transformations of data. They are virtually unbiased, have relatively small sampling variance and are less sensitive to outliers than traditional moments. Hosking defines L-moments to be the quantities
lr ¼
ð1 0
xðFÞPpr21 ðFÞdF; r ¼ 1; 2; …;
test for trend at a significance level of a; H0 should be rejected if lZl $ za=2 ; where FN ðza=2 Þ is the standard normal cumulative distribution function and Z is the test statistic 8 S21 > > pffiffiffiffiffiffiffiffi if S . 0 > > > < VarðSÞ Z¼ 0 ð22Þ if S ¼ 0 > > > S þ 1 > > : pffiffiffiffiffiffiffiffi if S , 0 VarðSÞ
ð20Þ where S is defined by
where Ppr ðFÞ ¼
r X k¼0
ppr;k F k
ppr;k ¼ ð21Þr2k
r k
!
215
r þk
!
S¼ ð21Þ
k
In the method of L-moments, the location parameter (the index flood, l1 ) the scale ðl2 Þ and the shape (LCSðl3 =l2 Þ; LCKðl4 =l2 Þ) parameters are estimated from linear combinations of the observed data series. The set of n stationary parameters Qi of a dimensionless growth curve in Eq. (19) is estimated using the method of L-moments regionally, from a group of catchments that share some degree of hydrological similarity with the catchment of interest. The estimation is based on calculating an average value of an L-moment ratio weighted by the degree of the similarity between a given site and the site on which the group is focused. 2.3. Non-stationarity significance assessment 2.3.1. Local non-stationarity significance The second-order non-stationary model described in Section 2.2 should be applied for FFE when one or both time dependent distribution parameters significantly change with time. The significance of the change can be assessed using the Mann – Kendall test (MK test) (Mann, 1945; Kendall, 1975) of the randomness of data against trend. The MK test is based entirely on ranks and hence is robust to nonnormality. According to Mann, the null hypothesis H0 states that the data are a sample of N independent and identically distributed random variables. The alternative hypothesis H1 is that the distribution of xi and xj is not identical for all i; j # N with i – j: In a two-sided
N21 X
N X
sgnðxi 2 xj Þ
j¼1 i¼jþ1
8 1 if u . 0 > > < if u ¼ 0 sgnðuÞ ¼ 0 > > : 21 if u , 0
ð23Þ
Kendall obtained the theoretical mean and variance of S under the assumption of no trend as EðSÞ ¼ 0 " VarðSÞ ¼ nðn 2 1Þð2n þ 5Þ 2
X
# tðt 2 1Þð2t þ 5Þ =18
t
ð24Þ where t is the extent of any tie. A continuity correction for improving the approximation to normality is recommended when less than 10 data values is available. The results of the MK test are evaluated in this work at the 10% local significance level. 2.3.2. Regional non-stationarity significance The occurrence of one extreme high or low value (possibly erratic) at the beginning or at the end of a hydrologic time series may lead to a significant result of the Mann – Kendall test. Therefore it is necessary to check whether similar results are observed also at the neighboring gauged sites and if yes, then to assess their regional significance. The regional significance represents more reliable information than the local significance. In regional significance analysis the objective is to assess whether the proportion of sites with significant local trend (local non-stationarity in
216
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
a given distribution parameter) from a given homogeneous pooling group can be regarded as significant at a regional scale. The number of significant sites is affected by the presence of cross-correlation in the data set, which tends to increase the expected number of trends under the hypothesis of no trend in the region. A regional bootstrap re-sampling technique preserves the cross-correlation in the original data set. This allows the impact of cross-correlation to be determined in establishing the critical value for the percentage of stations that by chance exhibit a trend at a given local significance level. On the other hand, the bootstrapping procedure does not retain in the resampled data the temporal structure (trends) that exists in the original data. In the regional bootstrap technique a vector containing data from all sites is associated with each year from the common regional observation period. A new synthetic data set is then obtained by combining randomly selected vectors, until the total number of station-years from the resampled data set is equal to the number of stationyears in the original data set. Then the MK test for trend detection is applied to the data from each station in the resampled data set and the percentage of results that are significant at the local significance level ðLaÞ is determined. This process is repeated Nsim times resulting in a distribution for the percentage of results that are locally significant at the La level. From this distribution, the value that is exceeded Ra times is selected as the regional significance level. In this study the number of simulation Nsim is set to 1000. After the regional significance is assessed and found positive at the Ra significance level, regional confidence limits can be constructed to express the regional uncertainty involved in the FFM. We suggest calculating the regional confidence limits from Nsim simulated sets of regional resampled data. The lower (upper) confidence limit for SL will be the kth value of SLij (i ¼ 1; 2; …; Nsim ; j ¼ 1; 2; …; Nreg Þ counting from the smallest (largest) value upward (downward) where Nreg is the number of sites in a pooling group and N ¼ Nsim £ Nreg : 2.3.2.1. Site-focused flood pooling. A group of hydrologically similar catchments that are needed for the regional non-stationarity significance assessment, and for the construction of the regional confidence limits, can be identified by site-focused
pooling. The site-focused pooling applied in this study is based on the seasonality of AMFs. The similarity measure is defined as the root mean squared difference between relative frequencies of flood occurrence in each month (Cunderlik and Burn, 2002a) vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi uX u 12 u ðRFik 2 RFjk Þ2 u t k¼1 ð25Þ Simði;jÞ ¼ 12 where RFki ðRFkj Þ are relative frequencies of flood occurrence in catchment iðjÞ: Similarity indices Simði;jÞ defined by Eq. (25) are calculated between the site of interest and all potential sites that may form the pooling group. The pooling process then involves successively adding sites, starting with the most similar site, and continuously adding the next most similar site. In each step, after a new site is added into the pooling group, the homogeneity of the group is examined. The pooled homogeneity is tested using H2 (based on LCV (coefficient of linear variation) and LCS (coefficient of linear skewness)) and H3 (LCS and LCK (coefficient of linear kurtosis)) tests defined by Hosking and Wallis (1997). The null hypothesis for both tests is H0: H2ðH3Þ , H p (pooling group is homogeneous) against the alternative H1: H2ðH3Þ $ H p (pooling group is heterogeneous). The limiting threshold H p for the H2 and H3 measures is set to 2. If adding a site to the pooling group causes pooled heterogeneity then the site is removed from the group and the next most similar site is added. The pooling is terminated when no site is left that if included would not cause pooled heterogeneity. Thus, the upper limit for the group’s size is set by the pooled homogeneity.
3. Application 3.1. South BC mountains climate region The non-stationary pooled FFM presented in this study is demonstrated on a pooling group of catchments from the South British Columbia Mountains Climate region (SBCM) defined by Hare and Thomas (1979). The SBCM region belongs to the Cordillera physiographic region (Fremlin, 1974) and the Montane Cordillera ecozone (Environment Canada, 1989; Wiken,
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
217
Fig. 2. The AMF time series from the Columbia River at Donald (WSC station number 08NB005). Asterisk-line shows the decreasing trend in the location parameter calculated using Sen’s slope estimator.
1996) thus roughly representing a climatologically and physiographically homogeneous area. The catchments selected in this study belong to the Reference Hydrometric Basin Network (RHBN) (Harvey et al., 1999) that is characterized by catchments with relatively pristine and stable landuse conditions. The pooling group is focused on the Columbia River at Donald (Water Survey of Canada station number 08NB005). Fig. 2 shows the AMF time series from this site with a line representing Sen’s slope calculated from this record. The line shows a visible decreasing trend in the location parameter of the series.
4. Results The site-focused pooling (Eq. (25)) identified 14 hydrologically similar sites prior to the test of pooled heterogeneity exceeding its limit. The value of H2 for the group was 1.745 and H3 0.941. All sites included in the pooling group overlap the 40year long record of the Columbia River with a minimum ratio of 90%. Fig. 3 shows the location of the Columbia River catchment (black) together
with the 14 other catchments attributed to its pooling group (four of them are too small to be discerned), and an approximated polygonal delineation of the SBCM region. After transforming all records to be stationary, the GEV distribution was chosen as the most appropriate pooled distribution according to the L-moment ratio diagram (not shown) and the goodness of fit z statistic defined by Hosking and Wallis (1997) (zGEV ¼ 0:602 against zGPA ¼ 24:742; zGLO ¼ 3:008; zPE3 ¼ 21:089 and zLN3 ¼ 0:727). The stationary, pooled coefficient of linear skewness (LCS ¼ 0.218) was calculated as an average from all sites in the pooling group, weighted by the similarity measure given by Eq. (25). The distribution-dependent parameters k and l in Eq. (17) for samples drawn from the GEV distribution with pooled LCS ¼ 0.218 were estimated by a Monte-Carlo experiment. The experiment involved the definition of 25 scenarios (50 for both parameters) with seven different magnitudes (0, ^ 0.1, ^ 0.5 and ^ 0.9% change of the stationary location (scale) parameter per year) of the slopes in the location and scale parameters. For each scenario 1000 samples with a record length of 500 years
218
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
Fig. 3. Location of catchments attributed to the Columbia River (black) pooling group with approximated polygonal delineation of SBCM region.
were generated and used for estimation of the correction parameters k and l: The simulation results are shown in Figs. 4 and 5. Fig. 4 depicts the planar relation between the adjusted magnitude of the trend in the location
parameter ða01 Þ and the trend magnitudes in the location ða1 Þ and scale ðb1 Þ estimated using Sen’s method. Similarly Fig. 5 shows the relationship for the adjusted trend magnitude in the scale parameter ðb01 Þ: We can see that the relation between
Fig. 4. The relation between the adjusted magnitude of the trend in the location parameter ða01 Þ and the trend magnitudes in the location ða1 Þ and scale ðb1 Þ estimated using Sen’s method.
Fig. 5. The relation between the adjusted magnitude of the trend in the scale parameter ðb01 Þ and the trend magnitudes in the location ða1 Þ and scale ðb1 Þ estimated using Sen’s method.
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
b01 and a1 can be neglected. The parameters k and l were estimated by linear regression and inserted into Eq. (17) a01 ¼ a1 þ 0:651b1
and b01 ¼ 1:165b1
ð26Þ
The relative BIAS and RMSE of the slope magnitudes estimated from samples with different record lengths using Eq. (26) are summarized in Figs. 6 and 7. Figs. 6 and 7 depict 12 results from eight slopecombination scenarios (results 1– 6 (7 –12) for estimating the slope in the location (scale) parameter) with medium slope magnitudes (^ 0.5%). Two scenarios are unique for each parameter and four are used for assessing the performance of both parameters (giving a total of 12 results). For better orientation the small figures located in the upper right corners of Figs. 6 and 7 explain the eight defined scenarios (also numbers of scale scenarios are in italics). We can see that the overall spread of the relative BIAS increases from around ^ 5% for the record length of 200 years to around ^ 15% for 50-year long samples. The lowest
219
values are observed for scenarios where the slope in the scale parameter is zero (scenarios 1 and 2). The BIAS values for scenarios with slopes in the scale parameter (dashed line) have a larger spread than the values for scenarios with slopes in the location parameter (solid line). The largest magnitude of BIAS (2 13.7%) calculated from 50-year long samples was observed in the estimation of slopes in the scale parameter for the scenario with no trend in the location parameter and negative trend in the scale parameter (scenario 8). The relative RMSE in most scenarios sharply increases when the record length is smaller than 100 years (Fig. 7). For 200-year long records the average RMSE is 30% and this value increases to 65% for 50-year long records. The lowest values are again observed for scenarios where the slope in the scale parameter is zero. The results for scenarios with slopes in the scale parameter (dashed line) have in general higher values of RMSE than those for scenarios with slopes in the location parameter (solid line) and are grouped in the upper part of the chart.
Fig. 6. Relative BIAS of the estimated slope magnitudes in the location and scale parameters from the GEV distribution with LCS ¼ 0.218 for different slope-combination scenarios (shown in the upper right corner) with medium slope magnitudes (^0.5% of the stationary location parameter per year). Location slope scenarios are depicted by solid lines and scale slope scenarios by dashed lines and italics numbers.
220
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
Fig. 7. Relative RMSE of the estimated slope magnitudes in the location and scale parameters from the GEV distribution with LCS ¼ 0.218 for different slope-combination scenarios (shown in the upper right corner) with medium slope magnitudes (^0.5% of the stationary location parameter per year). Location slope scenarios are depicted by solid lines and scale slope scenarios by dashed lines and italics numbers.
The Mann –Kendall test identified a locally significant negative trend in the location parameter of the AMF series of the Columbia River at Donald at the 10% significance level and no locally significant trend in the scale parameter. The standardized slope value for the location parameter was 2 0.398 and was significantly different from zero according to the Theil test at the 10% significance level. The upper and lower 95% confidence bands for this value are þ 0.018 and 2 0.812, respectively. There are four other sites with locally significant negative trend in the location parameter at the 10% significance level found in the focused pooling group of the Columbia River at Donald. According to the regional resampling scenario this trend is regionally significant at the 10% significance level and thus it is unlikely that the trend observed at the Columbia River is caused by a measurement error or by an extreme outlier. Theil’s test for Sen’s slope estimator showed that 13 out of 15 records had slopes of the location parameter significantly different from zero. There were only two sites with significantly decreasing trends in the scale parameter at the 10% significance level, but this result was not regionally significant. The slopes of these trends
were also significantly different from zero according to Theil’s test. Since the trends in the scale parameter were not regionally significant, the regional 95% confidence bands were estimated using the bootstrap algorithm only for the trend in the location parameter. The upper standardized band is 0.30 and the lower band 2 0.99. The results are summarized in Table 1. The individual records belonging to the pooling group were made stationary according to the significance of trends in the location and scale parameters. The location and scale parameters were derived from the focused record and used for quantile estimation and prediction. The stationary, pooled shape parameter LCS (0.218) was determined as weighted average according to the site similarity with the focused site. Quantiles of a return period of 100 years were estimated for a given prediction horizon t according to Eq. (19). Fig. 8 depicts the values of QGEV 100 predicted for different time horizons together with local and regional 95% confidence limits. The horizontal gray line represents the value of QGEV 100 (1242 m3 s21) calculated from the raw record that was available for the focused site from 1960 to 1999.
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
221
Table 1 Pooling group focused on the Columbia River at Donald (WSC station number 08NB005) for the common observation period 1960–1999. ‘S’ marks results significant at the 10% significance level, aðbÞ ¼ trend in location (scale) parameter WSC ID
Sim(i,j) ( –)
Rec. len. (years)
MK test a
MK test b
Theil’s test a
Theil’s test b
08NB005 08JE001 08NE001 08ID001 08KH006 08ND013 05BB001 08JB002 08NF001 05DA007 08LA001 08FB007 08NH005 08NE006 08FB006
0.000 4.643 6.019 6.705 7.828 8.297 8.737 10.156 10.531 10.680 11.367 12.375 12.642 13.807 14.070
40 40 36 38 40 36 40 40 40 40 40 36 36 36 36
S – – – – – – S – – – S – S S
– – – – S – – – – – – S – – –
S S S S S S S S – – S S S S S
– – – – S – – – – – – S – – –
38.3
S
–
S
–
Pooling group
From Fig. 8 we can see that if we ignore even a weakly significant non-stationarity in the location parameter of our data series then we could design values in 2002 overestimate the QGEV 100 by 84 m3 s21 (7%) and in 2020 by 142 m3 s21 (13%).
In practical engineering hydrology overestimating the design values as a result of ignoring significant decreasing trends in the series may lead to over designed objects and unnecessary expenses. The more serious cases arise from ignoring increasing trends in maximum floods, which may
Fig. 8. Values of the QGEV 100 predicted for different time horizons together with local (L) and regional (R) 95% confidence limits. The horizontal 3 21 gray line represents the value of QGEV 100 (1242 m s ) calculated from the raw record that was available for the focused site from the common observation period 1960–1999.
222
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223
lead to under designed objects and significant losses when more severe floods occur than are expected. 5. Conclusions Non-stationarity in hydrologic time series as an outcome of natural and/or anthropogenic activities at local or global scales is a feature that cannot be ignored in contemporary hydrology. Currently used statistical methods for modeling hydrologic time series and for estimating their extreme values for a given recurrence period cannot be reliably applied in cases when significant non-stationarity is found in the record. New approaches are critically needed that allow predicting the parameters of time-dependent distributions at a point in time. This paper has outlined a second-order nonstationary approach to pooled flood frequency analysis. The proposed model separates the pooled quantile function into a local time-dependent component and a regional time-invariant component. The results showed that the 50-year long samples generated from different trend-scenarios assuming GEV distribution yielded relative BIAS of the estimated slopes in the location (scale) parameter within the range of ^ 7% (^ 15%) and to the average relative RMSE of 60% (80%). The results quickly improve as the record length of the generated samples increases. The model was applied to a group of catchments from a homogeneous region in British Columbia with regionally significant decreasing trend in the location parameter of the AMF series. The results showed that ignoring even a weakly significant non-stationarity in the data series may seriously bias the quantile estimation for horizons as near as 0 – 20 years in the future. Future work will be focused on identification of linkages between non-stationarity in hydrologic and climatic records and on improving the prediction reliability using climatic data.
Acknowledgements JMC gratefully acknowledges the support of a NATO Science Fellowship administered by
the Natural Sciences and Engineering Research Council of Canada (NSERC). Environment Canada provided the data for the RHBN stations. This research was supported in part by a grant from NSERC. The authors appreciated helpful comments from two anonymous reviewers.
References Adamowski, K., Bocci, C., 2001. Geostatistical regional trend detection in river flow data. Hydrological Processes 15, 3331–3341. Cayan, D.R., Redmond, K.T., Riddle, L.G., 1999. ENSO and hydrologic extremes in the western United States. Journal of Climate 12, 2881–2893. Changnon, S.A., Kunkel, K.E., 1995. Climate-related fluctuation in Midwestern floods during 1921 – 1985. Journal of Water Resources Planning and Management 121 (4), 326–334. Cunderlik, J.M., Burn, D.H., 2002a. Analysis of the linkage between rain and flood regime and its application to regional flood frequency estimation. Journal of Hydrology 261, 115 –131. Cunderlik, J.M., Burn, D.H., 2002b. Local and regional trends in monthly maximum flows in southern British Columbia. Canadian Water Resources Journal 27/2, 191– 212. Daniel, W.W., 1990. Applied Nonparametric Statistics, second ed, PWS-KENT, Boston, 635 p. Douglas, E.M., Vogel, R.M., Kroll, C.N., 2000. Trends in floods and low flows in the United States: impact of spatial correlation. Journal of Hydrology 240, 90–105. Duckstein, L., Bobee, B., Bogardi, I., 1987. Bayesian forecasting of hydrologic variables under changing climatology. IAHS Publication No. 168. Environment Canada, 1989. Ecoclimatic Regions of Canada, Ecological Land Classification Series, vol. 23. Supply and Services Canada, 118 p. Fremlin, G. (Ed.), 1974. The National Atlas of Canada, AshtonPotter Ltd, p. 254. Grew, H., Werrity, A., 1995. Changes in flood frequency and magnitude in Scotland. BHS Fifth National Hydrology Symposium, Edinburgh, 3.1–3.9. Hare, F.K., Thomas, M.K., 1979. Climate Canada, second ed, Wiley, New York, 230 p. Harvey, K.D., Pilon, P.J., Yuzyk, T.R., 1999. Canada’s Reference Hydrometric Basin Network (RHBN), Partnerships in Water Resources Management. Proceedings of the CWRA 51st Annual Conference, Nova Scotia,. Hosking, J.R.M., 1990. L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of Royal Statistical Society, Series B 52 (2), 105–124. Hosking, J.R.M., Wallis, J.R., 1997. Regional Frequency Analysis: An Approach based on L-moments, Cambridge University Press, Cambridge, 223 p. Jain, S., Lall, U., 2001. Floods in a changing climate: does the past represent the future? Water Resources Research 37 (12), 3193–3205.
J.M. Cunderlik, D.H. Burn / Journal of Hydrology 276 (2003) 210–223 Kendall, M.G., 1975. Rank Correlation Measures, Charles Griffin, London. Lins, H.F., Slack, J.R., 1999. Streamflow trends in the United States. Geophysical Research Letters 26 (2), 227–230. Loukas, A., Quick, M.C., 1996. Effect of climate change on hydrologic regime of two climatically different watersheds. Journal of Hydrologic Engineering 1 (2), 77–87. Loukas, A., Quick, M.C., 1999. The effect of climate change on floods in British Columbia. Nordic Hydrology 30, 231 –256. Mann, H.B., 1945. Non-parametric tests against trend. Econometrica 13, 245–259. North, M., 1980. Time-dependent stochastic model of floods. Journal of Hydraulic Division ASCE 106 (HY5), 649–665. Plag, H.P., Tsimplis, M.N., 1999. Temporal variability of the seasonal sea-level cycle in the Northern Sea and Baltic Sea in relation to climate variability. Global and Planetary Change 20, 173–203. Reed, D.W., Jakob, D., Robson, A.J., 1999. Statistical procedures for flood frequency estimation. In: Robson, A.J., Reed, D.W. (Eds.), Flood Estimation Handbook, vol. 3. Institute of Hydrology, p. 338. Robson, A.J., Jones, T.K., Reed, D.W., Bayliss, A.C., 1998. A study of national trend and variation in UK floods. International Journal of Climatology 18, 165–182.
223
Sen, P.K., 1968. Estimates of the regression coefficient based on Kendall’s tau. Journal of American Statistical Association 63, 1379–1389. Strupczewski, W.G., Kaczmarek, Z., 2001. Non-stationary approach to at-site flood frequency modelling II. Weighted least squares estimation. Journal of Hydrology 248, 143 –151. Strupczewski, W.G., Singh, V.P., Feluch, W., 2001. Non-stationary approach to at-site flood frequency modelling I. Maximum likelihood estimation. Journal of Hydrology 248, 123 –142. Theil, H., 1950. A rank-invariant method of linear and polynomial regression analysis. I. Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings 53, 386 –392. Thompson, R.D., Perry, A. (Eds.), 1998. Applied Climatology, Principles and Practice, Routledge, London, p. 352. Vinnikov, K.Y., Robock, A., 2002. Trends in moments of climatic indices. Geophysical Research Letters 29 (2), 14/1–14/4. Westmacott, J.R., Burn, D.H., 1997. Climate change effects on the hydrologic regime within the Churchill–Nelson River Basin. Journal of Hydrology 202, 263–279. Wiken, E.B., 1996. Ecozones of Canada, Environment Canada, Ottawa. Zhang, X., Harvey, K.D., Hogg, W.D., Yuzyk, T.R., 2001. Trends in Canadian streamflow. Water Resources Research 37 (4), 987 –998.