Accepted Manuscript
Considering historical flood events in flood frequency analysis: Is it worth the effort? Thomas Schendel, Rossukon Thongwichian PII: DOI: Reference:
S0309-1708(17)30479-7 10.1016/j.advwatres.2017.05.002 ADWR 2842
To appear in:
Advances in Water Resources
Received date: Revised date: Accepted date:
12 November 2015 16 January 2017 5 May 2017
Please cite this article as: Thomas Schendel, Rossukon Thongwichian, Considering historical flood events in flood frequency analysis: Is it worth the effort?, Advances in Water Resources (2017), doi: 10.1016/j.advwatres.2017.05.002
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Highlights • We studied the utility of including historic floods for flood frequency
CR IP T
analysis.
• Bias decreases for large return periods, but increases for smaller ones.
• Including historical data substantially reduces root mean square error.
AN US
• We generalized an approach to estimate the historic period to several known floods.
• Exact knowledge of the historic period does only marginally increase
AC
CE
PT
ED
M
performance.
1
ACCEPTED MANUSCRIPT
Considering historical flood events in flood frequency analysis: Is it worth the effort?
CR IP T
Thomas Schendela,∗, Rossukon Thongwichianb a
b
Oranienburger Straße 33, 16775 Gransee, Germany Thammasat University, Department of Biotechnology, Faculty of Science and Technology, Pathumthani, 12121 Thailand
AN US
Abstract
Information about historical floods can be useful in reducing uncertainties in flood frequency estimation. Since the start of the historical record is often defined by the first known flood, the length of the true historical period
M
M remains unknown. We have expanded a previously published method of estimating M to the case of several known floods within the historical
ED
period. We performed a systematic evaluation of the usefulness of including historical flood events into flood frequency analysis for a wide range of return periods and studied bias as well as relative root mean square error
PT
(RRMSE). Since we used the generalized extreme value distribution (GEV) as parent distribution, we were able to investigate the impact of varying the
CE
skewness on RRMSE. We confirmed the usefulness of historical flood data regarding the reduction of RRMSE, however we found that this reduction
AC
is less pronounced the more positively skewed the parent distribution was. Including historical flood information had an ambiguous effect on bias: depending on length and number of known floods of the historical period, bias ∗
Corresponding author. Email:
[email protected]
Preprint submitted to Advances in Water Resources
May 6, 2017
ACCEPTED MANUSCRIPT
was reduced for large return periods, but increased for smaller ones. Finally, we customized the test inversion bootstrap for estimating confidence inter-
CR IP T
vals to the case that historical flood events are taken into account into flood frequency analysis.
Keywords: flood frequency analysis, historical floods, confidence intervals
1
1. Introduction
Water resources design and management requires the analysis of extreme
3
flood events, especially the determination of floods with defined return pe-
4
riods. A common approach therefore is the annual block maxima approach,
5
where for each year the peak streamflow is determined and a distribution
6
is fitted to this series of maxima. Eventually this distribution is used to
7
estimate the return level for a defined return period. However, due to the
8
finite sample size, the estimated return levels are typically associated with
9
a wide range of uncertainty. One way to reduce this uncertainty is the in-
10
clusion of historical recorded flood events before the onset of systematical
11
record. These historical records are characterized by the knowledge of one
12
or several (usually very extreme) flood events, while for the majority of the
13
years within the historical period the maximal annual streamflow is unknown.
14
But since they were not recorded, it is assumed that they were smaller than
15
the smallest mentioned flood event. In many case studies, the implementa-
AC
CE
PT
ED
M
AN US
2
16
tion of historical data into flood frequency analysis was demonstrated, e.g.
17
by using traditional methods of statistical analysis [8] [28] [22] or Bayesian
18
statistics [26] [33] [9] [25]. The advantages of including historical floods were
19
systematically studied e.g. by Frances et al. (1994) [8]. By assuming the log 3
ACCEPTED MANUSCRIPT
Gumbel distribution as parent distribution, for different types of historical
21
data (flood events are explicitly known vs. only known that a threshold was
22
surpassed) the statistical gain of including historical information was studied
23
in dependence on the desired return period.
CR IP T
20
However, in many cases the true length of the historical period M is not
25
known, especially when the historical record starts already with a known
26
flood. Assuming that M is simply the period from the first known flood to
27
the start of the systematic record (denoted as L) will substantially underes-
28
timate the true length of the historical period, which was already mentioned
29
in Hirsch and Stedinger (1987) [11]. Ga´al et al. (2010) [9] combined a visual
30
inspection method with Bayesian analysis in order to estimate M , but the au-
31
thors called their approach themselves as ”subjective”. Another systematic
32
study was performed by Strupczewski et al. (2014) [31] (using frequentist
33
statistics), where the benefit of including the largest historical flood for flood
34
frequency analysis was studied, using the Weibull and Gumbel distribution
35
as underlying parent distribution. The authors presented a sound approach
36
for estimating M if one flood is known during M . However, this investi-
37
gation led to the rather pessimistic claim, that it is not sure ”whether the
38
whole operation is worth a candle”. Although the root mean square error
39
was substantially reduced by including historical information (for the flood
40
with return period of 100 years), the authors judged the occurring positive
CE
PT
ED
M
AN US
24
bias (for the same return level) in conjunction with the uncertainty of flood
42
magnitude and length of the historic period very critical. However, we have
43
some doubts regarding the used approach. First, only samples were consid-
44
ered were the maximal flood is within the historical period and does not occur
AC 41
4
ACCEPTED MANUSCRIPT
in the systematic record (with length N ). This selection itself may already
46
introduce a bias, since samples with very large floods in the systematic record
47
(or reverse, only modestly large floods in the historic period) are neglected.
48
Second, for estimating the historical period M , the formula M = 2 · L is
49
used, although the authors mention elsewhere that M = 2 · L + N should
50
be utilized if the maximal flood occurs in the historical period. Both factors
51
may result in a positive bias compared to a flood frequency analysis using
52
the systematic record only.
AN US
CR IP T
45
While the assessment of uncertainty of the results of flood frequency anal-
54
ysis is an integral part of the results obtained by Bayesian statistics (credible
55
interval), the calculation of confidence intervals is often neglected in the
56
framework of frequentist statistics. Nevertheless, attempts to estimate confi-
57
dence intervals in the presence of historical information have been made for
58
example by McDonald et al. (2014) [22], where the variance is estimated by
59
the delta method and the appropriate return level is assumed to be normal
60
distributed. However, for the cases of extreme events, it was shown that con-
61
fidence intervals are typically not symmetrical distributed (see for example
62
[24]), which leads to the use of bootstrap approaches ([3],[17],[1]) or Profile
63
likelihood ([24], [32]). In a previous article, we have customized a rarely used
64
bootstrap method, the test inversion bootstrap, to the case of systematic an-
65
nual flood record for the generalized extreme value distribution (GEV) [29]
CE
PT
ED
M
53
and shown its superiority against other bootstrap approaches as well as the
67
Profile Likelihood. The main idea of the test inversion is that it emphasizes
68
that the sample does not represent the parent distribution (nor its variance),
69
but given a certain confidence level, the sample is seen as a clearly defined
AC 66
5
ACCEPTED MANUSCRIPT
70
over- or underestimation of the underlying distribution. In this article we aim to evaluate the usefulness of using historical flood
72
information for flood frequency analysis using the GEV as parent distribu-
73
tion. We will investigate this matter studying the bias and relative root mean
74
square error (RRMSE) for a wide range of different skewness of the under-
75
lying distribution, which extends the investigations performed by Frances et
76
al. (1994) [8] and Strupczewski et al. (2014) [31]. We intend to generalize
77
the estimation of the length of the historical period (presented in [31] for the
78
largest historical flood) to the case that several floods are known for the his-
79
torical period. Furthermore, we want to adapt the test inversion bootstrap
80
approach for estimating confidence intervals to the case of historical flood
81
information.
82
2. Methods
83
2.1. Estimating the length of the historical period
ED
M
AN US
CR IP T
71
First of all, we would like to discuss a rather simple case for flood fre-
85
quency analysis using historical data, shown in Fig. 1a. Given are a sys-
86
tematic record with the length of N years and k known floods during the
87
historical period M . For this case, we assume that the historical period M
88
is known. As example, if a city was founded at a river side and the historic
89
accounts of flood reports (since its founding) are well preserved, it can be
AC
CE
PT
84
90
assumed that the historical period is simply the period between the found-
91
ing of the city and the onset of systematic records. It is assumed that for
92
the unrecorded years, the annual maximal streamflow’s are lower than the
93
smallest recorded flood xl within the historical period - otherwise they would 6
ACCEPTED MANUSCRIPT
T M
N
CR IP T
streamflow x
b streamflow x
a
T L
(years)
N
M
(years)
Figure 1: Systematic record of N annual maximal streamflow combined with a) known historical period M ; b) unknown historical period M with length L between first known flood and the begin of the systematic record.
have been mentioned too. The likelihood function L˜ in this case would read
95
[8] : L˜ =
M k
F (xl )M −k
k Y j=1
f (xj )
N Y
f (xi ) ,
(1)
i=1
M
AN US
94
with F (x) the cumulative distributive and f (x) the probability density func-
97
tion. The term F (xl )M −k accounts for all the unrecorded years. By maximiz-
98
˜ the corresponding parameters of F can be determined. ing the likelihood L,
99
However, Fig. 1b shows a much more realistic version of how historical flood
100
data is usually available: the historical period starts already with a distinc-
101
tive flood event, but nothing certain is known for the years before this event.
102
However, it is very likely that if a flood with a similar magnitude had oc-
103
curred some years before, it would have been recorded too. Therefore, taking
104
the historical period M = L (with L the number of years from the first his-
105
torical flood to the start of the systematic record) will lead to substantial
106
bias [11] [31]. Strupczewski et al. (2014) [31] inferred that for the largest
107
historical flood, the best estimate would be M = 2L (disregarding the sys-
108
tematic record), since the position of the largest flood (and therefore also L)
AC
CE
PT
ED
96
7
ACCEPTED MANUSCRIPT
can be seen as a uniform random variable on the interval [0, 1, ..., M ] with the
110
expected value E(L) = M/2 (if the systematic record is neglected, otherwise
111
it would be E(L) = (M + N )/2).
112
CR IP T
109
We would like to generalize this finding to the situation that several floods are known in the historical period. We propose following steps:
114
1. Determine the value of the smallest flood xl in the historical period (the
115
assumption here, of course is that in all non-recorded years the maximal
116
streamflow was lower than this particular flood).
117
2. Consider both, the historical period and the systematic record: How many
118
floods are larger or equal to the value xl ? Denote the result with k.
119
3. The estimate for M is then given by:
AN US
113
L+N −1 . k
(2)
M
M =L+
The reasoning behind this formula is that for the unrecorded years, the max-
121
imal streamflow is lower than the kth largest flood. Since it is assumed that
122
the maximal annual floods are independent, picking the k largest floods is
123
not different from picking any k numbers from the interval [1, 2, ..., M + N ]
124
(note that in contrast to [31] the interval starts with 1). Therefore, the prob-
125
ability that one of the k mentioned flood events occurs for a given year, is the
126
same for each year. Drawing k uniform distributed random numbers from the
127
continuous interval [0, 1] will yield an expected value for the position of the
AC
CE
PT
ED
120
128
largest value (which corresponds to L) of k/(k+1) [4]. Correspondingly, using
129
the interval [1, 2, ..., M + N ], L can be estimated to 1 + (M + N − 1)k/(k + 1).
130
This leads to Eq. 2. The estimated M can be used to calculate the appro-
131
priate likelihood given in Eq. 1.
8
ACCEPTED MANUSCRIPT
132
2.2. The GEV and estimation of its parameters The generalized extreme value distribution (GEV) is a theoretical sound
134
choice to fit the annual series of maximal streamflow, since the Fisher-
135
Tippett-Theorem [7] [10] states that the GEV is the limiting distribution
136
of maxima of a series of independent and identically distributed observa-
137
tions. It is also widely used in hydrology [16] [27] [30] [19] [5] [18] [12] [23].
138
For the cumulative GEV distribution, we used following notation (with the
139
shape parameter a, scale parameter b, location parameter c, and the peak
140
flow x): Fa,b,c (x) = e−(1+a
1 x−c − a ) b
− x−c b
F0,b,c (x) = e−e
AN US
CR IP T
133
; a 6= 0; 1 + a
; a = 0.
x−c >0 b (3)
Noteworthy, in hydrology often an alternative parametrization of the GEV
142
with −a instead of a is used. Anyway, for a ≥ 0 the GEV has no upper
143
boundary, and for a > 0 the GEV is heavily tailed (since the moments
144
greater than 1/a are not defined). In contrast, for a < 0, the distribution
145
has a finite upper bound where all moments the distribution has a finite
146
upper bound. The case a = 0 is known as the Gumbel distribution, which is
147
unbound but all of its moments are defined.
ED
PT
Utilizing Eq. 1 implies the use of maximum likelihood in order to estimate
the parameters of the GEV. However, from previous studies [13] [21][6] it is
AC
149
CE
148
M
141
150
known that MLE performs bad in terms of bias as well as root mean square
151
error compared to the methods of probability weighted moments (PWM) for
152
small sample sizes. Coles and Dixon (1999) [6] have attributed the com-
153
parable good performance of PWM to its assumption that the mean of the 9
ACCEPTED MANUSCRIPT
distribution exists (shape parameter a < 1) and if such a constraint is in-
155
troduced to the likelihood, its performance is even better than for PWM.
156
In detail, they introduced a penalty function P that is multiplied to the
157
likelihood function: L˜Pa,b,c (x) = P (a) · L˜a,b,c (x) with
1 P (a) = g(a) 0 g(a) = exp −
a≤0
AN US
158
CR IP T
154
0
(5)
(6)
M
a≥1 a 1−a
(4)
159
The penalty increases with increasing shape parameter a. However, the
160
choice of g(a) in Eq. 6 is suboptimal. If LP denotes the penalized log-
likelihood (LP := log(L˜P )), then the derivative of the penalized log-likelihood
162
with respect to a is:
∂LP ∂L 1 = − ∂a ∂a (1 − a)2 0 < a < 1.
CE
PT
ED
161
There are two main problems associated with this derivative, more exactly
164
with the penalizing term:
165
1. It is not unambiguously defined for a = 0. The left hand limes is 0,
166
the right one -1. This can be problematic for algorithms searching for local
167
maxima of the penalized likelihood.
168
2. Even small (but positive) values of the shape parameter a are already
AC
163
10
ACCEPTED MANUSCRIPT
noticeable penalized. However, Coles and Dixon (1999) [6] themselves have
170
shown that robust measures of estimator performance (like the median) show
171
negative bias despite the significant positive bias that is observed for the mean
172
estimators (see also the results in the Appendix). Thus, small values of a
173
should, if possible, not noticeable be penalized.
Therefore, we would like to introduce a different function g(a): a3 g(a) = exp − . 1−a
(7)
AN US
174
CR IP T
169
For this choice of g(a) the first and second derivative (with respect to a) are
176
unambiguously defined. Moreover, small values of a are penalized very little,
177
while for a → 1 the penalizing effect is similar to Eq. 6. The performance
178
of the modified penalized likelihood is studied in the Appendix for relevant
179
parameters used in this article.
180
2.3. Details for simulations
ED
M
175
For the performed simulations, we always chose the systematical record
182
length N = 50. We considered the number of known floods k in the historical
183
period k = 1, 2, and 4. The true historical period M was chosen to be
184
M = 200. However, for studying cases where M is not known, we determined
185
for each run the length L by subtracting all those years from M before one
186
of the k largest floods occurs first and then used an estimate for M (Eq. 2).
187
This enabled us to compare both cases. For the choice of parameters of
188
the parent distribution (the GEV, from where the samples are drawn), the
189
location parameter c and the scale parameter b can be arbitrarily chosen,
190
since c only shifts the whole distribution and b scales it around c. Therefore,
191
we chose c = 0 and b = 1. Note that c = 0 leads to negative flood events.
AC
CE
PT
181
11
ACCEPTED MANUSCRIPT
While this does not interfere with the calculations, it should be kept in mind
193
that the relative bias and relative root mean square error (RRMSE) reported
194
in this article is necessarily higher than in practical applications, since c is
195
supposed to be positive, which leads to a larger mean. For the remaining
196
parameter a (the shape parameter), we chose a = 0, 0.1, 0.2, 0.3. All choices
197
of a represent cases without upper boundary, and the skewness ranges from
198
1.14 to 13.48. Relative bias and RMSE are calculated for the return levels
199
associated with following return periods T : T = 50, 100, 200, 500, 1000
200
years.
201
2.4. Test inversion bootstrapping
AN US
CR IP T
192
The use of the test inversion bootstrapping (TIB) for estimating confi-
203
dence intervals (CI) was mainly advocated by Carpenter [2], [3]. The under-
204
lying idea of TIB can be explained by considering the definition of confidence
205
intervals (CI). Let us assume that from any sample with the sample size N a
207
c of W is estimated. If we knew the true underlying distribution, statistic W
ED
206
M
202
ci (estimates we could construct the range of values of W where 95% of all W
of samples) would be located. If this interval would be symmetrically cho-
209
sen, 2.5% of all samples would yield an estimate smaller than the lower and
210
another 2.5% of all samples would yield an estimate larger than the upper
211
boundary. This would be the symmetric and double-sided 95% CI. However,
212
the problem is we do not know the true distribution and all we have is a
AC
CE
PT
208
213
214
215
216
ci . sample and the corresponding estimate W
Under these circumstances, estimating the 95% CI requires the following
way of thinking: Determining the upper boundary of the CI (the single-sided ci underestimates 97.5% CI), we need to assume that the sample estimate W 12
ACCEPTED MANUSCRIPT
W such that the probability of non-exceedance is only 2.5% (regarding all
218
samples of the true, but unknown distribution). If that is the case, how would
219
the parameters of the true distribution (and hence W ) look like? Or more
220
specifically: What are the parameters were 2.5% of all values are smaller
221
CR IP T
217
ci ? This would define the upper boundary of the CI. Likewise, the than W
lower boundary of the 95% CI can be determined by assuming that 2.5% of
223
all samples of the true (yet unknown) distribution would exceed the sample
224
225
ci . estimate W
AN US
222
A simple example might illustrate this point. Let x be uniform distributed
226
on the interval [0, R] with unknown R. The cumulative distribution function
227
F (x) is then given by
229
230
b would be R b = 2x1 . For of the 95% double-sided CI? The mean estimate R
the upper boundary, the probability of non-exceedance need to be 2.5%. Therefore,
232
CE
PT
231
If only one value x1 is drawn from this distribution, what are the boundaries
ED
228
x ; 0 ≤ x ≤ R. R
M
F (x) =
x1 R → R = 40 · x1 .
F (x1 ) = 0.025 =
The upper boundary of the 95% CI is forty times larger than the sampled value. For the lower boundary, the probability of non-exceedance need to be
234
97.5%, therefore
AC 233
x1 R x1 →R = . 0.975
F (x1 ) = 0.975 =
13
ACCEPTED MANUSCRIPT
235
The lower boundary is roughly only 2.6% larger than the estimate x1 . The
236
complete 95% CI is given by (1.026 x1 ; 40 x1 ). Note that the boundaries of
238
b = 2x1 . the CI are not symmetrically placed around R
CR IP T
237
What most methods designed to determine CI imply is that the parame-
ters estimated from the sample can represent the true distribution and then
240
the interval were the appropriate percentage of samples are located serve as
241
estimate for the CI. But this is not true: In order to estimate the boundaries
242
of the CI we need to assume that the sample is more or less a significant over-
243
or underestimation of the true value. Therefore, loosely spoken, the distribu-
244
tion estimated from the sample cannot represent adequately the variability of
245
the distribution (which influences of course the width of the CI) and we need
246
to estimate the parameters of the true distribution under the assumption
247
that the sample was a defined over- or underestimation.
M
AN US
239
For distributions with a single parameter (like the simple example of
249
the uniform distribution presented above) the CI can be determined exactly
250
without the necessity of using bootstrap simulations. The TIB approach
251
itself was developed initially for one parameter Θ under the presence of a
252
nuisance parameters η ([3]). It uses the sample estimate for the nuisance
254
PT
single-sided CI for Θ (Θα ) are estimated (using bootstrap simulations) such that
AC
255
parameter η = ηb for the subsequent bootstrap simulations. The α · 100 %
CE
253
ED
248
b obs ); Θα , ηb) = 1 − α . F (Θ(x
256
For each value of Θ, bootstrap simulations can be carried out (with each
257
bootstrap simulation having the same sample size as the observed sample),
258
ci can be estimated. Subsequently, the and for each sample i the value Θ 14
ACCEPTED MANUSCRIPT
259
b obs ) can be determined. If this probability of non-exceedance regarding Θ(x
probability of non-exceedance reaches 1 − α, the boundary of the α · 100 %
261
single-sided CI (Θα ) has been found.
CR IP T
260
However, return levels of flood events are an explicit function of all three
263
parameters of the GEV. The TIB approach was customized in a previous
264
publication (Schendel and Thongwichian (2015) [29]). Before we discuss how
265
historical information should be included in the estimation of CI by the TIB
266
approach, we would like to shortly summarize the findings for solely system-
267
atic records. The problem in this situation is that only one constraint is avail-
268
able (the confidence level) but three variables are to determine. The main
269
idea was to maximize the likelihood function, such that the most likely set of
270
parameters (that suffice the respective confidence level) is chosen. Normally,
271
this would be a computationally demanding task, since it would require run-
272
ning several thousand bootstrap simulations for each set of two parameters
273
in order to determine the third one. However, since random values xi drawn
274
from the parameter set (a = aj , b = 1, c = 0) can be easily adjusted to every
275
GEV with the parameters (a = aj , b = bk , c = cl ) by:
PT
ED
M
AN US
262
277
(8)
we first draw for each aj B samples (with sample size N ), estimate the
CE
276
xnew = x i · bk + c l . i
parameters (b a, bb, b c) (using for example PL) and hence, calculate the return
levels of interest Q (for each sample). If we are interested in the (1−α)·100%
279
CI, we chose α/2 · (B + 1)th and 1 − α/2 · (B + 1)th smallest values of return
AC
278
280
levels [20]. In general, we will denote this values as qap , which is the pth
281
quantile of the return level of the chosen T -year return period for given a.
282
For each aj the parameters b and c can be derived as follows (Qest T denotes 15
ACCEPTED MANUSCRIPT
the T -year return level estimated from the observed sample, and L denotes
284
the log-likelihood function of the GEV): a − c) zi : = 1 + (xobs b i (Qest T − c) b = qap
CR IP T
283
(9)
N
N
X −1 (Qest − c) a+1X L = −N ln( T p )− ln(zi ) − zi a qa a i=1 i=1
AN US
N ∂L + = 0= ∂c (Qest T − c) N 1 X qap (xiobs − Qest −a T ) − (a + 1) . + z i 2 z (Qest − c) i T i=1
(10)
(11)
Equation 9 ensures that the pth quantile of the T -year return level matches
286
the T -year return level estimated from the observed sample. Eq. 11 allows
287
to numerically determine c and subsequently b. For each a, these parameters
288
enable us to determine the log-likelihood function (Eq. 10). By varying a, the
289
value a that maximizes Eq. 10 can be determined. The better performance of
290
this approach in comparison to other methods to estimate the CI was shown
291
for the GEV in [29].
PT
ED
M
285
In order to implement historical information, we would like to consider
293
first the situation depicted in Fig.1a with known historical period M . The
294
bootstrap runs in order to determine qap can be performed as described be-
295
fore, the historical information is included as follows: M values are randomly
296
drawn from the GEV, the k largest values are chosen (given that for the ob-
297
served time series k floods are known from the historical period) and for the
298
remaining values it is assumed that they are smaller than the kth largest
299
value. Using Eq. 1 with Eq. 3, Eq. 5, and Eq. 7, the desired return level can
AC
CE
292
16
ACCEPTED MANUSCRIPT
be estimated for each run and subsequently the qap can be determined. To
301
determine the parameters of the GEV that will eventually lead to the bound-
302
aries of the CI, Eq. 10 and Eq. 11 need to be altered as follows (S denotes the
303
smallest known historical flood, apart from that all known historical flood
304
events will be added to xiobs ):
AN US
a − c) zi : = 1 + (xobs b i a λ : = 1 + (S − c) b (Qest T − c) b = qap
CR IP T
300
(12)
M
N +k N +k X (Qest a+1 X −1 T − c) L = −(N + k) ln( )− ln(zi ) − zi a p qa a i=1 i=1 M 1 −(M − k)λ− a + ln k
ED
a+1 S − Qest N +k ∂L = 0 = (M − k)λ− a qap est T 2 + est + ∂c (QT − c) (QT − c) N +k 1 X qap (xiobs − Qest −a T ) + z − (a + 1) . i 2 z (Qest − c) i T i=1
(13)
All estimations of CI in this article used B = 9999 simulation runs for deter-
306
mining qap .
For the other situation of historical information (Fig.1b), only L (the
CE
307
PT
305
time period between the first known historical flood and the beginning of
309
the systematic record) is known, while the historical period M needs to be
310
estimated. The correct approach in estimating CI using the TIB would be
311
to run several sets of bootstrap simulations with different M (in addition to
312
the varying parameters of the GEV), and use the value of M that maximizes
313
the likelihood function. Unfortunately, this would inevitably lead to M = L.
AC
308
17
ACCEPTED MANUSCRIPT
314
The reason can be illustrated using a simpler example: If a uniform distri-
315
bution is defined on the interval [0, N ] and a sample x1 , ..., xn is given, then
317
b = max(x1 , ..., xn ) the maximum likelihood estimation of N will lead to N
CR IP T
316
[4], which is unfortunately just a lower bound for N . Therefore, in order
to account somehow for the variability of the unknown variable M , we in-
319
terpreted M as a random variable for the bootstrap simulations. To derive
320
the distribution for M , let us first consider the case that k random numbers
321
are drawn from the uniform distribution on the interval [0, 1]. The proba-
322
bility density function f (g) for the maximum number g and the cumulative
323
distribution function F (g) are then given by:
AN US
318
f (g) = k g k−1
M
F (g) = g k .
To explain the expression derived for f (g), k − 1 numbers need to be lower
325
than g and due to permutation, each of the k numbers can be the maximum.
326
With these results, we can infer the result for the cumulative distribution
327
function F (L), where k floods are distributed on the interval [1, 2, ..., M + N ]
328
(assuming that k<
CE
PT
ED
324
F (L) =
L+N −1 M +N −1
k
.
We can solve this equation for M (r denotes a uniform random number which
330
represents F (L)):
AC
329
M =1+
L+N −1 1
rk
−N
(14)
331
For each single bootstrap run, a new random variable r is chosen, which leads
332
to a new estimation of M . Afterwards, in order to determine the parameters 18
15
●
10
●
● ●
5
relative bias in %
systematic M=L estimated M known M
● ●
0 0
200
400
600
return period in years
AN US
● ● ● ● ● ●
CR IP T
ACCEPTED MANUSCRIPT
800
1000
Figure 2: Relative bias in dependence on return period for a = 0.2 4) using only the floods were known during M .
M
systematic record; using M = L; ◦) using estimate of M ; •) using true M = 200. Four
of the GEV that will eventually lead to the boundaries of the CI, for M the
334
estimation of Eq. 2 is used and the procedure is carried out the same way as
335
for known M .
336
3. Results
337
3.1. Effect of including historical information for bias and root mean square
PT
CE error
AC
338
ED
333
339
Bias and RMSE where studied for a various numbers of known floods
340
within the historical period, shape parameters of the GEV, and return pe-
341
riods. Although we cannot present all the results due to space limitations,
342
we will still discuss all the typical cases. In Fig. 2 the relative bias is shown 19
systematic k=1 flood k=2 floods k=4 floods
●
8
●
6
●
●
● ●
4
relative bias in %
10
●
● ●
2
●
0
●
0
200
400
600
return period in years
AN US
●
CR IP T
12
ACCEPTED MANUSCRIPT
800
1000
Figure 3: Relative bias in dependence on return period for a = 0.2 •) systematic record period M (estimated).
M
only; one known flood; ◦) two known floods; 4) four known floods in the historical
for the case of four known flood events and a shape parameter a = 0.2 in
344
dependence on the return period for different methods of estimating M . Ob-
345
viously, M = L leads to a large bias while the estimation of M via Eq. 2
346
yields much better results. Using the correct value M = 200 reduces the bias
347
slightly more. Compared to the use of the systematic record only (N = 50),
348
the inclusion of historical information (using Eq. 2) seems to reduce the bias
349
for large return periods, while it may increase it for smaller return periods.
AC
CE
PT
ED
343
350
Fig. 3 depicts for the same shape parameter a = 0.2 the relative bias for
351
one, two, and four known floods in the historical period. The bias reduces
352
with the increase of the number of known floods. Again, inclusion of his-
353
torical flood events seems to reduce the bias for large return periods, while 20
●
k=1 flood k=2 floods k=4 floods
● ●
●
●
●
●
0
●
200
●
400
●
600
return period in years
AN US
●
CR IP T
●
0.50 0.55 0.60 0.65 0.70 0.75 0.80
ratio of RRMSE
ACCEPTED MANUSCRIPT
800
1000
Figure 4: Ratio of RMSE between using historical information vs. using the systematic
M
record only for a = 0.2 and •) one known flood event; ◦) two known flood events; ) four known flood events in the estimated historical period M .
it increases them for smaller ones, especially if only one flood is known. In
355
Fig. 4 the ratio of RMSE for using historical information versus using the
356
systematic record only is depicted for various return periods, using a = 0.2.
357
The more flood events are known in the historical period, the larger the re-
358
duction in RMSE. Moreover, this reduction is slightly more pronounced for
359
large return periods. In order to compare the effect of skewness of the parent
360
distribution on RMSE, Fig. 5 shows the ratio of RMSE for using historical
AC
CE
PT
ED
354
361
information (with four known floods) versus using the systematic record only
362
for four different values of the shape parameter a. It seems that the larger
363
a, the smaller the reduction of RMSE gets if historical information is used
364
(compared to the use of the systematic record only). Moreover, the increase 21
a=0 a=0.1 a=0.2 a=0.3
●
0.60 0.55
● ●
0.50
●
●
●
●
●
●
●
0.45
●
0
200
400
600
return period in years
AN US
ratio of RRMSE
0.65
●
CR IP T
0.70
ACCEPTED MANUSCRIPT
800
1000
Figure 5: Ratio of RMSE between using historical information vs. using the systematic
M
record only with four known floods and shape parameter •) a = 0; ◦) a = 0.1; ) a = 0.2; 4) a = 0.3. Historical period M estimated.
of RMSE reduction for large return periods is more pronounced for smaller
366
a. Finally, Fig. 6 compares the relative decrease of RMSE for the cases that
367
M needs to be estimated (using Eq. 2) or if M is known for two different
368
values of the shape parameter a (a = 0 and a = 0.3 (and one known historic
369
flood event)). While knowledge of M decreases RMSE further, this decrease
370
is rather small compared to the overall reduction of RMSE due to the im-
371
plementation of historical information: For example, for the 100-year return
AC
CE
PT
ED
365
372
level, a = 0, and estimated M , RMSE is 63.8% of RMSE if the systematic
373
record is used only, but 58.5% if M is known. For a = 0.3 the difference is
374
similar: 76.7% for estimated M , but 71.8% for known M .
22
●
0.8
M estimated; a=0 M known; a=0.0 M estimated; a=0.3 M known; a=0.3
0.7
ratio of RMSE
0.9
●
● ●
0.6
●
●
●
●
● ● ●
0
200
400
600
return period in years
AN US
0.5
●
CR IP T
1.0
ACCEPTED MANUSCRIPT
800
1000
Figure 6: Ratio of RMSE between using historical information vs. using the systematic
M
record only, for one known flood and •) a = 0 and estimated M ; ◦) a = 0 and known M ; a = 0.3 and estimated M ; 4) a = 0.3 and known M .
3.2. Coverage error of the test inversion bootstrap
ED
375
In order to investigate the performance of the test inversion bootstrap
377
for confidence interval (CI) estimation, we chose a parent distribution with
378
the parameters a = 0.2, b = 1, c = 0, a systematic record length N = 50,
379
a known historical period M = 200 with one known flood event. We study
380
one-sided CI of flood events with return periods of 50, 100, 200, 500, and 1000
381
years for the following single-sided confidence levels: 2.5%, 5%, 10%, 90%,
AC
CE
PT
376
382
95%, and 97.5%. From the parent distribution, we draw 5000 samples and
383
determine the one-sided CI stated above. Then the percentage of samples,
384
were the true value of the parent distribution of the respective return level is
385
within the one-sided CI (which means lower than the boundary of the single23
ACCEPTED MANUSCRIPT
50-yr. r.l
100-yr r.l.
200-yr. r.l. 500-yr r.l.
1000-yr r.l.
2.5%
2.7
2.6
2.7
2.7
2.6
5%
5.4
5.0
5.0
4.8
10%
10.5
10.1
10.2
10.0
10.2
90%
91.1
91.1
91.2
90.9
90.7
95%
95.3
95.3
95.4
95.6
95.6
97.5% 97.4
97.6
97.6
97.7
97.9
CR IP T
CI
AN US
4.9
Table 1: Percentage of simulated results for which the single-sided confidence intervals (CI) (determined by test inversion bootstrapping) covered the true values of the return level (r.l.) of the parent distribution (GEV distribution with shape parameter a=0.2) for
M
different return periods for explicit known historical period M .
sided CI) will be determined. Consequently, deviations from the confidence
387
level can be regarded as coverage error. In Table 1 the respective percentage
388
of samples, where the true value is inside of the CI, is reported. The coverage
389
error is reasonable small. There seems to be a slight tendency to overestimate
390
the boundaries of large confidence levels (the upper boundary of double-sided
391
CI).
PT
CE
A similar investigation for the case that the historical period M is not
AC
392
ED
386
393
explicitly known (and therefore needs to be estimated), is numerically much
394
more cumbersome. Because L (number of years between first known flood
395
and begin of the systematic record) in Eq. 14 is different for each sample,
396
theoretically the whole bootstrap simulations to determine qap (used e.g. in 24
ACCEPTED MANUSCRIPT
50-yr. r.l
100-yr r.l.
200-yr. r.l. 500-yr r.l.
1000-yr r.l.
2.5%
3.2
3.0
3.2
2.9
2.9
5%
6.0
6.0
5.9
6.0
10%
11.4
11.6
11.5
11.8
11.8
90%
88.3
88.0
87.7
87.4
87.3
95%
93.4
93.1
93.0
92.8
92.9
97.5% 96.0
96.1
96.2
96.4
96.4
CR IP T
CI
AN US
6.0
Table 2: Percentage of simulated results for which the single-sided confidence intervals (CI) (determined by test inversion bootstrapping) covered the true values of the return level (r.l.) of the parent distribution (GEV distribution with shape parameter a=0.2) for
M
different return periods for unknown historical period M .
Eq. 13) need to be carried out for each of the 5000 samples. Moreover, the
398
overall rank (including the floods of the systematic record) of the largest
399
known flood within the historical period may vary from sample to sample,
400
which needs to require separate bootstrap simulations for each rank of the
401
known historical flood. Using the same parent distribution as before (in-
402
cluding the underlying historical period M of 200 years). The bootstrap
403
simulations were carried out not only for a variety of shape parameter a,
AC
CE
PT
ED
397
404
but also for the following L: 1, 50, 100, 150, 200 and rank of the historical
405
flood 1,2,3. This simulations made it possible to interpolate qap for all possi-
406
ble L. The results for the coverage probabilities are given in Table 2. The
407
boundaries of small and large confidence levels are both underestimated. For 25
ACCEPTED MANUSCRIPT
double-sided CI, this results into too narrow boundaries. The reason for the
409
larger coverage error in case of unknown M might be due to the fact that
410
the variability of M could not fully be implemented in the TIB approach (as
411
described in section 2.4).
412
3.3. Case Study: Gauge Cochem (Moselle)
CR IP T
408
The gauge Cochem at the river Moselle is located in the west of Germany.
414
Historical floods for this gauge have been investigated by Sartor et al. (2010)
415
[28]. In detail, for the four largest historical floods between 1572-1819, the
416
maximal streamflow was estimated to following values:
417
1572/1573: 4170-4400 m3 /s
418
1651: 4500 m3 /s
419
1740: 4170 m3 /s
420
1784: 5750 m3 /s
M
AN US
413
From the data source of the Federal Waterways and Shipping Adminis-
422
tration, Germany (WSV) (provided by the Federal Institute of Hydrology,
423
Germany (BfG)) maximal annual streamflow was available for the hydrolog-
424
ical years 1819-2014 (starting from 01.11. and ending at 31.10.). Hence, the
425
systematic record contains N = 196 years. The length of the true historical
426
period M is unknown, the first known flood starts in the hydrological year
427
1573. Therefore, L is given by 1819 − 1573 = 246. The minimal known
CE
PT
ED
421
historic flood had a streamflow of about 4170 m3 /s. Such a streamflow was
429
reached in the systematic record once (22.12.1993). Thus, we can assign an
430
order k = 5 to the minimal known flood event within the historical period.
431
Using Eq. 2, we can estimate M to M ≈ 334 years. However, before we use
432
Eq. 1, one point still needs to be considered: The maximal streamflow of the
AC
428
26
ACCEPTED MANUSCRIPT
return
estimate 95% CI m3
m3
historic information included
95% CI
estimate 95% CI
[
]
l.b. [
50-yr.
3790
3540
4250
3850
100-yr.
4080
3770
4700
4190
200-yr.
4350
3970
5150
4510
500-yr.
4690
4200
5750
4910
1000-yr. 4920
4350
6210
s
s
]
l.b. [
5200
m3 s
95% CI 3
] u.b. [ ms ]
3600
4110
3880
4530
4130
4960
4430
5540
AN US
s
] [
m3
level
s
] u.b. [
m3
CR IP T
systematic record only
4630
5970
Table 3: Estimated return levels for Cochem (Moselle) together with the lower (l.b.) and upper bound (u.b.) of the symmetric and double-sided 95 % confidence intervals (CI) for
M
various return periods with and without using historic flood events.
flood event in 1572/1573 is not known exactly, but somewhere in the inter-
434
val [4170 m3 /s;4400 m3 /s]. In order to deal with this uncertainty, we need
435
to insert following term into Eq. 1: F (4400 m3 /s) − F (4170 m3 /s). This
PT
ED
433
term describes the probability that the flood was smaller than 4400 m3 /s
437
but larger than 4170 m3 /s. Using the test inversion bootstrap approach as
438
described in section: 2.4, the double-sided 95% confidence interval is calcu-
439
lated with and also without utilizing historical flood events.The results are
AC
CE
436
440
presented in Table 3 and depicted in Fig. 7. Implementing historical flood
441
information reduces the length of the estimated confidence intervals for this
442
gauge.
27
● ● ● ●
3500
● ● ●
0
200
400
600
return period in years
AN US
●
CR IP T
5500 5000 4500
● ●
4000
streamflow in m^3/s
6000
ACCEPTED MANUSCRIPT
800
1000
Figure 7: Estimated upper (4) and lower (◦) bound of the 95% confidence interval as
M
well as the mean estimate () regarding extreme flood events at Cochem (Moselle) using
443
4. Discussion
ED
historical flood information (dashed line) vs. using the systematic record only (solid line).
In this article, we have shown that information about historical flood
445
events is very useful in flood frequency analysis. Especially, uncertainty due
446
to sampling error can be reduced substantially. However, for bias the situ-
447
ation is ambiguously: taking historical floods into account will reduce bias
448
for return levels of large return periods (where ”‘large”’ is determined by the
AC
CE
PT
444
449
length of the historical period and the number of known floods), but it will
450
introduce a slight bias for floods with smaller return periods. We have gen-
451
eralized and validated a method to estimate the true length of the historical
452
period (developed originally by Strupczewski et al. (2014) [31] for the largest 28
ACCEPTED MANUSCRIPT
historical flood) to the case of several known floods within the historical pe-
454
riod. Furthermore, we found that explicit knowledge of the true length of
455
the historical period M does only lead to minor improvements of bias and
456
RMSE compared to the use of estimated M . By utilizing different values of
457
the shape parameter a of the GEV, we found that reduction of RMSE due to
458
historical information is more pronounced for less skewed parent distributions
459
(small shape parameter a) and that for these cases RMSE also significantly
460
reduces more if the return period is increased. Finally, we have successfully
461
adapted the test inversion bootstrap for estimating confidence intervals for
462
the use of historical flood information.
AN US
CR IP T
453
We would like to emphasize that including historical floods will not in
464
general introduce an additional bias, only for return levels associated with
465
certain return periods. However, this bias is originally not an effect of the
466
unknown length of the historical period M (although this contributes to bias),
467
but seems to be a feature of the likelihood estimation when using censored
468
data. That bias can even be reduced for large return periods is indicative
469
that the shape parameter a is more precisely (with less bias) estimated due
470
to the use of historical information, but with the effect that in turn the scale
471
parameter b seems to increase. This can explain why bias increases for small
472
return periods, but decreases for larger ones. However, a positive bias is
473
in terms of risk assessment more desirable than a negative one (assuming
CE
PT
ED
M
463
the same absolute value). Given that for the maximum (as well as for the
475
penalized) likelihood the median bias is negative (using systematic record
476
only), a slight positive bias might be even welcome. More importantly, in
477
practice the parent distribution is not known, and therefore it is impossible
AC 474
29
ACCEPTED MANUSCRIPT
to distinguish between sampling error and bias. But we know that RMSE
479
is reduced substantially if historical information is included, which seems to
480
be an important justification for including historical information into flood
481
frequency analysis.
CR IP T
478
Our investigation has shown, that RMSE reduction is less pronounced
483
if the parent distribution is more positively skewed (implies larger shape
484
parameter a). Increasing a means that the distribution is more heavy-tailed
485
and hence, the probability distribution decreases more slowly for large and
486
increasing streamflow. Therefore, for large a the variability of extreme flood
487
events increases such that even small changes in the probability of exceedance
488
will lead to large changes of streamflow. Hence, extreme flood events carry
489
less information (resulting in less RMSE reduction) if the parent distribution
490
is more heavy-tailed.
M
AN US
482
We have confirmed the results of Hirsch and Stedinger (1987) [11] and
492
Strupczewski et al. (2014) [31] that assuming that the true historical period
493
is equal to the difference between the first known flood and begin of the
494
systematic record (M = L) is wrong (although often used in literature, like
495
in Frances et al. (1994) [8], Sartor et al. (2010) [28]). We have generalized
496
an idea (developed in [31]) for estimating M from one known flood in the
497
historic period to several ones. We could show that this estimate of M leads
498
to only marginal larger bias and RMSE compared to explicit knowledge of M .
CE
PT
ED
491
Therefore, we do not support the statement ”every effort should be made to
500
establish M as exact as possible” (Hirsch and Stedinger (1987) [11], similar
501
in Strupczewski et al. (2014) [31]), but believe that the estimation method
502
presented in this work is already a good starting point. Noteworthy, in
AC 499
30
ACCEPTED MANUSCRIPT
Frances et al. (1994) [8] it was already shown that at least for return periods
504
of interest, knowledge that a flood exceeds a certain threshold is nearly as
505
good as knowing the exact runoff. Thus, neither the true historic period nor
506
the exact magnitude of a historic flood event need to be known exactly for
507
benefitting from historical data. In contrast, Strupczewski et al. (2014) [31]
508
has provoked the question, whether including historical data in presence of
509
the uncertainties of the true historic period as well as flood magnitudes, ”is
510
worth a candle”. Given the facts, we can answer this question confidently:
511
It even deserves the spotlight!
AN US
CR IP T
503
One problem not tackled in this paper is the misspecification of the parent
513
distribution. While the Fisher-Tippet-Theorem states that the GEV is the
514
limiting distribution of a series of independent and identically distributed ob-
515
servations [7][10], smaller flood events (e.g. with return periods smaller than
516
2 years) might not be in the respective limit. Therefore, the GEV is not
517
the only valid choice as parent distribution. However, respecting the Fisher-
518
Tippett-theorem means that every function used to fit a series of annual
519
maximal streamflow must converge to the GEV at least for events associated
520
with large return periods. Since the GEV can be heavy-tailed (some moments
521
from a certain order on are infinite, and therefore not defined) for particular
522
parameter values, any other distribution used to fit annual maximal stream-
523
flow needs to have these feature too. This is for many distributions typically
CE
PT
ED
M
512
used in hydrology for fitting annual maximal streamflow, like the Weibull or
525
the Log-Pearson 3 distribution, simply not the case. In contrast, a version of
526
the generalized logistic distribution (used e.g. in the UK for flood frequency
527
analysis [15][22]) or the Wakeby distribution (see e.g. [14]) would be possible
AC 524
31
ACCEPTED MANUSCRIPT
choices. Anyway, for large return periods it is much more likely to assume the
529
validity of the Fisher-Tippet-Theorem than for smaller ones. Since known
530
historic floods are typically associated with large return periods, inclusion
531
of this historic data can even reduce errors caused by the wrong use of the
532
model distribution. This is simply because all the distributions will have the
533
same functional behaviour for extreme events.
CR IP T
528
Finally, we would like to present a possible application of this work to a
535
case that has nothing to do with historical flood events. Let us assume that
536
recently a very large flood has occurred. This flood leads to a reevaluation
537
of the flood frequency analysis. As long as the reevaluation of the flood
538
frequency analysis can be unambiguously attributed to this last flood event,
539
we need to expand the observation period. Otherwise, as we have seen for
540
historical floods, the results would be upward biased. By determining the
541
rank k of the last flood, additional (N − 1)/k (given a time series of length
542
N ) years in the future need to be assumed to be lower than the last flood
543
event. Ironically, studying the past leads to future implications.
544
References
546
548
M
ED
PT
single site and pooled frequency analysis. Hydrolog Sci J 2003; 48(1):2538 http://dx.doi.org/doi:10.1623/hysj.48.1.25.43485.
AC
547
[1] Burn DH. The use of resampling for estimating confidence intervals for
CE
545
AN US
534
[2] Carpenter
JR. J
Roy
Test Stat
inversion Soc
bootstrap
549
tervals.
550
http://dx.doi.org/doi:10.1111/1467-9868.00169.
32
Ser
B
confidence
in-
1999;61(1):159-172
ACCEPTED MANUSCRIPT
551
[3] Carpenter JR, Bithell J. Bootstrap confidence intervals:
When,
which, what?
A practical guide for medical statisticians. Stat
553
Med 2000;19(9):1141-1164 http://dx.doi.org/doi:10.1002/(SICI)1097-
554
0258(20000515)19:9¡1141::AID-SIM479¿3.0.CO;2-F.
CR IP T
552
[4] Castillo E, Hadib AS. A method for estimating parameters and quantiles
556
of distributions of continuous random variables. Comput Stat Data An
557
(1995);20(4):421-439 http://dx.doi.org/10.1016/0167-9473(94)00049-O.
558
[5] Chowdhury JU, Stedinger JR, Lu LH. Goodness of fit tests for re-
559
gional generalized extreme value flood distributions. Water Resour Res
560
1991;27(7):1765-1776 http://dx.doi.org/doi:10.1029/91WR00077.
AN US
555
[6] Coles SG, Dixon MJ. Likelihood-based inference for extreme value mod-
562
els. Extremes, Kluwer Academic Publishers, Boston, 1999;2(1):5-23
563
http://dx.doi.org/doi:10.1023/A:1009905222644.
ED
M
561
[7] Fisher RA, Tippett LHC. Limiting forms of the frequency distribution
565
of the largest or smallest member of a sample. Proc Cambridge Phil Soc
566
1928;24(2):180-190 http://dx.doi.org/10.1017/S0305004100015681.
568
tematic and historical or paleoflood data based on the two-parameter general extreme value models. Water Resour Res 1994;30(6):1653-1664
AC
569
[8] Frances F, Salas JD, Boes DC. Flood frequency analysis with sys-
CE
567
PT
564
570
http://dx.doi.org/doi:10.1029/94WR00154.
571
[9] Ga´al L, Szolgay J, Kohnov´a S, Hlavˇcov´a K. Inclusion of his-
572
torical information in flood frequency analysis using a Bayesian
573
MCMC technique:
a case study for the power dam Orl´ık, 33
ACCEPTED MANUSCRIPT
574
Czech Republic. Contributions Geophy Geodesy 2014;40(2):121-147
575
http://dx.doi.org/doi:10.2478/v10126-010-0005-5. [10] Gnedenko
577
mum
578
http://dx.doi.org/doi:10.2307/1968974.
579
BV.
d’une
[11] Hirsch
Sur
serie
RM,
la
distribution
aleatoire.
Stedinger
Ann
JR.
limite
du
terme
maxi-
CR IP T
576
of
Math
Plotting
1943;44(3):423-453
positions
for
historical
floods and their precision. Water Resour Res 1987;23(4):715-727
581
http://dx.doi.org/doi:10.1029/WR023i004p00715.
AN US
580
[12] Hosking JRM, Wallis JR, Wood EF. An appraisal of the regional flood
583
frequency procedure in the UK Flood Studies Report. Hydrol Sci J
584
1985;30(1):85-109 http://dx.doi.org/doi:10.1080/02626668509490973.
585
[13] Hosking
JRM,
M
582
Wallis
JR,
generalized
587
probability-weighted
588
http://dx.doi.org/doi:10.2307/1269706.
589
[14] Houghton JC. Birth of a parent:
by
Technometrics
PT
moments.
Estimation the
of
method
the of
1985;27(3):251-261
The Wakeby distribution
for modeling flood flows. Water Resour Res 1978;14(6):1105-1110
CE
591
EF.
distribution
ED
586
590
extreme-value
Wood
http://dx.doi.org/doi:10.1029/WR014i006p01105.
[15] Institute of Hydrology (IH): The Flood Estimation Handbook, Cen-
593
tre for Ecology and Hydrology, Wallingford, UK, 1999 ISBN:
594
9781906698003.
AC
592
595
[16] Katz
RW,
Parlange
MB,
Naveau 34
P.
Statistics
of
extremes
ACCEPTED MANUSCRIPT
596
in
597
http://dx.doi.org/doi:10.1016/S0309-1708(02)00056-8.
Water
Resour
2002;25(8-12):1287-1304
[17] Kysel´ y J. A Cautionary Note on the Use of Nonparamet-
599
ric
600
Models.
601
Adv
Bootstrap J
for
Estimating
Appl
Meteorol
CR IP T
598
hydrology.
Uncertainties Climatol
in
Extreme-Value
2007;47(12):3236-3251
http://dx.doi.org/doi:10.1175/2008JAMC1763.1.
[18] Lettenmaier DP, Wallis JR, Wood EF. Effect of regional heterogeneity
603
on flood frequency estimation. Water Resour Res 1987:23(2):313-323
604
http://dx.doi.org/doi:10.1029/WR023i002p00313.
AN US
602
[19] Madsen H, Pearson CP, Rosbjerg D. Comparison of annual maximum
606
series and partial duration series methods for modeling extreme hydro-
607
logic events, 2. Regional modeling. Water Resour Res 1997;33(4):759-769
608
http://dx.doi.org/doi:10.1029/96WR03849. [20] Makkonen
ED
609
M
605
L.
Bringing
the
611
http://dx.doi.org/doi:10.1080/03610920701653094.
Position
2008;37(3):460-467
CE
[21] Martins ES, Stedinger JR. Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resour Res 2000;36(3):737-744 http://dx.doi.org/doi:10.1029/1999WR900330.
AC
614
A-Theor
Plotting
PT
Controversy.
613
Stat
to
610
612
Commun
Closure
615
[22] McDonald N, Kjeldsen TR, Prosdocimi I, Sangster H. Reassessing flood
616
frequency for the Sussex Ouse, Lewes: the inclusion of historical flood
617
information since AD1650. Nat Hazards Earth Sys 2014;14(10):2817-
618
2828 http://dx.doi.org/doi:10.5194/nhess-14-2817-2014. 35
ACCEPTED MANUSCRIPT
[23] Morrison JE, Smith JA. Stochastic modeling of flood peaks using the
620
generalized extreme value distribution. Water Resour Res 2002;38(12):1-
621
12 http://dx.doi.org/doi:10.1029/2001WR000502.
CR IP T
619
622
[24] Obeysekera J, Salas JD. Quantifying the uncertainty of design floods
623
under nonstationary conditions. J Hydraul Eng 2014;19(7):1438-1446
624
http://dx.doi.org/doi:10.1061/(ASCE)HE.1943-5584.0000931.
[25] Park B, Demeritt D. Defining the hundred year flood: Bayesian
627
tainty in flood frequency estimates. J Hydrol 2016;540(9):1189-1208
628
http://dx.doi.org/10.1016/j.jhydrol.2016.07.025. [26] Payrastre
O,
for
Gaume
Andrieu
data
H.
cal
631
based
632
http://dx.doi.org/doi:10.1029/2010WR009812.
case
frequency
study.
Water
to
reduce
Usefulness
analyses:
Resour
Res
of
uncer-
histori-
Developments 2011;47(8):1-15
ED
a
flood
historic
630
on
for
E,
M
information
using
A
626
629
approach
AN US
625
[27] Rust HW, Kallache M, Schellnhuber HJ, Kropp JP. Confidence Intervals
634
for Flood Return Level Estimates Assuming Long-Range Dependence.
635
In: Kropp JP and Schellnhuber HJ, editors. In extremis. Disruptive
636
Events and Trends in Climate and Hydrology. Heidelberg: Springer;
CE
2011 http://dx.doi.org/doi:10.1007/978-3-642-14863-7 3.
AC
637
PT
633
638
639
640
[28] Sartor J, Zimmer KH, Busch N. Historische Hochwasserereignisse der deutschen Mosel. Wasser und Abfall 2010;10:46-51.
[29] Schendel T, Thongwichian R. Flood frequency analysis: Confidence in-
36
ACCEPTED MANUSCRIPT
terval estimation by test inversion bootstrapping. Adv Water Resour
642
2015;83:1-9 http://dx.doi.org/doi:10.1016/j.advwatres.2015.05.004.
643
[30] Stedinger JR, Lu LH. Appraisal of regional and index flood
644
quantile
645
http://dx.doi.org/doi:10.1007/BF01581758.
646
estimators.
[31] Strupczewski
WG,
Hydrol
Kochanek
K,
the
648
cal
649
http://dx.doi.org/doi:10.5194/nhess-14-1543-2014.
E.
largest
Flood
histori-
AN US
by
1995;9(1):49-75
Bogdanowicz
frequency
Nat
supported
Hydraul
647
flood.
analysis
Stoch
CR IP T
641
Hazards
Earth
Sys
2014;14(6):1543-1551
[32] Venzon DJ, Moolgavkar SH. A Method for Computing Profile-
651
Likelihood Based Confidence Intervals. Appl Stat 1988;37(1):87-94
652
http://dx.doi.org/doi:10.2307/2347496.
M
650
[33] Viglione A, Merz R, Salinas JL, Bl¨oschl G. Flood frequency hydrol-
654
ogy: 3. A Bayesian analysis. Water Resour Res 2013;49(2):675-692
655
http://dx.doi.org/doi:10.1029/2011WR010782.
657
PT
mators
In order to compare the performance of the maximum likelihood (MLE),
AC
658
Appendix A. Performance comparison of various likelihood esti-
CE
656
ED
653
659
penalized likelihood according to Eq. 6 by Coles and Dixon (1999) [6] (de-
660
noted as PLE1), and its modification presented in Eq. 7 (denoted as PLE2),
661
the mean bias, the median bias, and the relative root mean square error
662
(RRMSE) were determined. Therefore, for parent distributions (of the GEV)
37
ACCEPTED MANUSCRIPT
with various shape parameter a, 10,000 samples with sample size N = 50 were
664
drawn for estimating bias and RRMSE for return levels of different return
665
periods. The results are seen in Table A.4, A.5, A.6. PLE1 lowers the mean
666
bias significantly compared to MLE, even to the point that it introduces a
667
strong negative bias for smaller return periods. Similar, the already negative
668
median bias of MLE is further pronounced for PLE1. Therefore, it is also
669
not surprising that PLE1 has significant lower RMSE, since introducing neg-
670
ative bias for positively skewed distributions will to a certain extent lower
671
RMSE. In contrast, the median bias is only slightly lowered for PLE2 com-
672
pared to MLE. Regarding the mean bias, for return periods of 50 and 100
673
years, PLE2 also performs better than PLE1 and often improves the results
674
of MLE. For larger return periods, the mean bias is always smaller than for
675
MLE, especially for large shape parameter a. However, PLE1 yields typically
676
better results. Finally, PLE2 yields smaller RMSE values than MLE, espe-
677
cially for larger values of a. Nevertheless, the RMSE of PLE1 is smaller, but
678
this might be due the wide range of estimates being penalized - even those
679
with only slightly positive shape parameter a. To summarize, PLE2 seems to
680
substantially improve bias and RMSE compared to MLE, while avoiding the
681
heavy side effects of PLE1, namely the considerable larger negative median
682
bias and negative mean bias for smaller return periods.
AC
CE
PT
ED
M
AN US
CR IP T
663
38
ACCEPTED MANUSCRIPT
CR IP T
Table A.4: Mean bias in % for return levels (r.l.) of different return periods of the parent
GEV distribution with various shape parameter a. MLE denotes maximum likelihood estimation, PLE1 penalized likelihood estimation according to Eq. 6, and PLE2 penalized likelihood estimation according to Eq. 7.
method
50-yr. 100-yr 200-yr. 500-yr 1000-yr r.l.
r.l.
r.l.
r.l.
0.0 MLE
-0.9
-0.1
1.0
2.8
4.5
PLE1
-2.5
-2.2
-1.7
-0.7
0.3
PLE2
-1.1
-0.3
0.7
2.5
4.1
0.1 MLE
0.6
2.1
3.9
7.0
9.9
PLE1
-2.8
-2.3
-1.6
-0.3
1.1
PLE2
0.0
1.3
2.9
5.6
8.1
PT
ED
M
r.l.
0.2 MLE
2.6
4.8
7.6
12.5
17.1
PLE1
-3.4
-3.0
-2.2
-0.6
1.1
PLE2
0.9
2.5
4.6
8.1
11.5
0.3 MLE
4.8
7.9
12.0
19.0
25.8
PLE1
-5.1
-4.9
-4.3
-3.0
-1.5
PLE2
0.4
2.0
4.2
7.9
11.4
CE AC
AN US
a
39
ACCEPTED MANUSCRIPT
CR IP T
Table A.5: Median bias in % for return levels (r.l.) of different return periods of the parent GEV distribution with various shape parameter a. MLE denotes maximum likelihood estimation, PLE1 penalized likelihood estimation according to Eq. 6, and PLE2 penalized likelihood estimation according to Eq. 7.
method
50-yr. 100-yr 200-yr. 500-yr 1000-yr r.l.
r.l.
r.l.
r.l.
0.0 MLE
-2.9
-3.1
-3.5
-3.8
-3.9
PLE1
-3.9
-4.3
-4.8
-5.3
-5.6
PLE2
-2.9
-3.1
-3.6
-3.8
-4.0
0.1 MLE
-2.5
-2.7
-3.1
-3.3
-3.3
PLE1
-5.2
-5.9
-6.8
-7.9
-8.6
PLE2
-2.7
-2.9
-3.3
-3.5
-3.5
PT
ED
M
r.l.
0.2 MLE
-2.3
-2.2
-2.5
-2.4
-2.4
PLE1
-6.5
-7.5
-8.6
-9.8
-11.0
PLE2
-2.9
-2.8
-3.2
-3.4
-3.4
0.3 MLE
-2.0
-1.9
-2.0
-1.6
-1.7
PLE1
-8.8
-10.0
-11.3
-13.4
-14.7
PLE2
-3.9
-4.0
-4.5
-4.8
-4.9
CE AC
AN US
a
40
ACCEPTED MANUSCRIPT
CR IP T
Table A.6: Relative Root mean square error (RRMSE) in % for return levels (r.l.) of
different return periods of the parent GEV distribution with various shape parameter a. MLE denotes maximum likelihood estimation, PLE1 penalized likelihood estimation according to Eq. 6, and PLE2 penalized likelihood estimation according to Eq. 7.
50-yr. 100-yr 200-yr. 500-yr 1000-yr r.l.
r.l.
r.l.
r.l.
0.0 MLE
21.5
25.4
30.0
37.2
43.5
PLE1
19.9
22.9
26.4
31.7
36.3
PLE2
21.2
25.0
29.4
36.1
42.0
0.1 MLE
26.2
31.7
38.3
49.0
59.0
PLE1
23.2
27.1
31.6
38.6
44.8
PLE2
25.2
30.1
35.9
45.0
53.2
PT
ED
r.l.
0.2 MLE
31.7
39.3
48.6
64.2
79.5
PLE1
26.4
31.0
36.3
44.7
52.1
PLE2
28.8
34.7
41.7
52.9
63.1
0.3 MLE
38.0
48.1
60.8
83.0
105.6
PLE1
28.7
33.5
39.1
47.8
55.4
PLE2
31.2
37.4
44.8
56.5
67.2
CE AC
AN US
method
M
a
41