Journal of Hydrology, 52 (1981) 139-147
Elsevier Scientific Publishing Company, Amsterdam - Printed in The Netherlands
LOG PEARSON III DISTRIBUTION - IS IT APPLICABLE TO FLOOD FREQUENCY ANALYSIS OF AUSTRALIAN STREAMS?
T.A. McMAHON* and R. SRIKANTHAN
Department of Civil Engineering, Monash University, Clayton, Vic. 3168 (Australia)
(Received January 21, 1980; accepted for publication July 15, 1980)

ABSTRACT

McMahon, T.A. and Srikanthan, R., 1981. Log Pearson III distribution - is it applicable to flood frequency analysis of Australian streams? J. Hydrol., 52: 139-147.

An analysis using moment ratio diagrams of peak annual discharges of 172 Australian streams suggests that the log Pearson Type III distribution is appropriate for estimation of extreme flood discharges. The practice of setting to zero the coefficients of skewness of the logarithms of the peak discharges that are not statistically different from zero is questioned.
INTRODUCTION
With the adoption of the log Pearson Type III (LP-III) distribution in I.E.A. (1977) for flood frequency analysis, it is useful to review the appropriateness of the distribution and some of its special characteristics. In reading the section in I.E.A. (1977) on flood frequency analysis, one is left with a number of unanswered questions, namely:
(1) Is LP-III a suitable distribution for flood frequency analysis?
(2) If the peak annual discharges are not statistically independent, does it matter?
(3) How does sampling error affect the flood estimates?
(4) Is the recommended plotting position (Weibull's procedure) the most appropriate one? Recently, hydrologists have been using the following plotting position with a in the range 0.3-0.44 rather than a = 0 as in I.E.A. (1977):

$$ T = \frac{(N + 1) - 2a}{M - a} \qquad (1) $$
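As a rough illustration of eq. 1, the sketch below evaluates the plotting position for a ranked series. It assumes that the first symbol in the numerator is the number of years of record and that M is the rank of the flood with the largest flood ranked 1; the paper does not define these symbols at this point, so that reading, like the function names `return_period` and `plotting_positions`, is an assumption.

```python
def return_period(rank, n_years, a=0.0):
    """Return period T = (N + 1 - 2a) / (M - a) for the flood of a given rank.

    rank    : M, assumed to be 1 for the largest flood on record
    n_years : N, assumed to be the number of years of record
    a       : plotting position constant (a = 0 gives Weibull's formula)
    """
    if not 1 <= rank <= n_years:
        raise ValueError("rank must lie between 1 and n_years")
    if not 0.0 <= a < 1.0:
        raise ValueError("a is expected in the range [0, 1)")
    return (n_years + 1 - 2 * a) / (rank - a)


def plotting_positions(peaks, a=0.0):
    """Pair each annual peak with its return period, largest flood first."""
    ordered = sorted(peaks, reverse=True)
    n = len(ordered)
    return [(q, return_period(m, n, a)) for m, q in enumerate(ordered, start=1)]
```

With a = 0 the largest flood in a record of N years is assigned a return period of N + 1 years; a larger value of a assigns the same flood a longer return period.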
This paper deals only with the first question and presents evidence in support of using the LP-III distribution for flood frequency analysis, at least for Australian streams. A second paper (Srikanthan and McMahon, 1981a) deals with the effect of dependence, distribution parameters and sample size on peak annual flood estimates. A short communication (Srikanthan and McMahon, 1981b) examines specifically the question of plotting position in relation to the LP-III distribution.

* Present address: Agricultural Engineering Section, Department of Civil Engineering, University of Melbourne, Parkville, Vic. 3052, Australia.

0022-1694/81/0000-0000/$02.50 © 1981 Elsevier Scientific Publishing Company
BACKGROUND
After a comprehensive U.S.A. Federal inter-agency study by the U.S. Water Resources Council, the LP-III distribution was adopted in the U.S.A. as the base method for flood frequency estimation (Benson, 1968). In Australia, an extensive comparative study of Queensland data by Kopittke et al. (1976) and one by Conway (1971) for New South Wales coastal streams concluded that the LP-III distribution performed the best of a number of distributions examined. In 1977, Australian Rainfall and Runoff (I.E.A., 1977) was published, in which the LP-III distribution is the recommended distribution.
PRESENT PROCEDURE
The steps in applying the LP-III distribution to peak annual discharge data are set out clearly in I.E.A. (1977), but for completeness the salient points are included here:
(1) Transform peak annual discharges to logarithms (base 10).
(2) Calculate the mean (M), the standard deviation (S) and the coefficient of skewness (g) of the logarithms by the standard procedures.
(3) Check in terms of sampling error whether g is significantly different from zero at the 5% level of significance, and if not set g = 0 (that is, the data are assumed to be log-normally distributed).
(4) Use the following relationship to calculate the T-yr. peak flood discharge (Q_T):

$$ \log Q_T = M + K_T S \qquad (2) $$
where K_T = frequency factor for the selected recurrence interval T and is a function of g. Tables of K_T are given in Benson (1968) and I.E.A. (1977).

A number of points arise out of this procedure.
(1) The procedure implicitly assumes that the annual peak discharges are independent. This assumption is discussed in Srikanthan and McMahon (1981a) on LP-III, but it suffices to say at this stage that if autocorrelation is present, it can be neglected.
(2) The coefficient of skewness (g) of the logs of the peak discharges is calculated and, if statistically significant, the value of g is used in the analysis; otherwise g is set to zero, which implies the data are two-parameter log-normally (LN2) distributed. The effect of this procedure is considered later.
(3) Is LP-III an appropriate distribution anyway?

These latter two points are developed later. But first let us consider the characteristics of the LP-III distribution.
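Steps (1), (2) and (4) of the procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the frequency factor is approximated here by the Wilson-Hilferty transformation rather than read from the tables in Benson (1968) or I.E.A. (1977), the skewness estimator shown is the common bias-adjusted sample form (the paper says only "standard procedures"), and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def log_moments(peaks):
    """Mean M, standard deviation S and skewness g of log10 peak discharges."""
    logs = [math.log10(q) for q in peaks]
    n = len(logs)
    if n < 3:
        raise ValueError("need at least three annual peaks")
    m = mean(logs)
    s = stdev(logs)  # sample standard deviation (n - 1 divisor)
    if s == 0.0:
        raise ValueError("all peaks are identical; skewness is undefined")
    # Assumed estimator: bias-adjusted sample coefficient of skewness.
    g = n * sum((x - m) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)
    return m, s, g


def frequency_factor(g, t_years):
    """Approximate K_T with the Wilson-Hilferty transformation (not the tables)."""
    if t_years <= 1.0:
        raise ValueError("recurrence interval must exceed 1 year")
    z = NormalDist().inv_cdf(1.0 - 1.0 / t_years)  # standard normal variate
    if abs(g) < 1e-6:
        return z  # g = 0 reduces to the log-normal (LN2) case
    k = g / 6.0
    return (2.0 / g) * ((1.0 + k * z - k * k) ** 3 - 1.0)


def lp3_flood(peaks, t_years, g=None):
    """T-yr. flood from log Q_T = M + K_T S; pass g=0.0 to force LN2."""
    m, s, g_sample = log_moments(peaks)
    g_used = g_sample if g is None else g
    return 10.0 ** (m + frequency_factor(g_used, t_years) * s)
```

Calling `lp3_flood(peaks, 100)` and `lp3_flood(peaks, 100, g=0.0)` on the same record gives the two 100-yr. estimates whose difference is the subject of point (2) above.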
LOG PEARSON TYPE III DISTRIBUTION
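The probability density function of the LP-III distribution is defined (Landwehr et al., 1978) as follows:

$$ f(z) = \frac{1}{|a|\,\Gamma(b)} \left[\frac{z - c}{a}\right]^{b-1} \exp\left[-\frac{z - c}{a}\right]; \qquad |a| > 0 \qquad (3) $$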
where z = ln(x - v); x = random variable in the absolute domain; v = location parameter in the absolute domain (generally v = 0); c = location parameter in the log domain; b = shape parameter in the log domain; and a = scale parameter in the log domain.

The statistical properties of the LP-III distribution are given in Landwehr et al. (1978) and need not be repeated here. These properties are used to compute for the LP-III distribution the moment ratio diagrams Cs vs. Cv and β₁ vs. β₂, where Cs and Cv are the coefficients of skewness and variation respectively, both corrected for small sample bias. β₁ and β₂ (kurtosis) are respectively defined as:
$$ \beta_1 = \mu_3^2 / \mu_2^3 \qquad \text{and} \qquad \beta_2 = \mu_4 / \mu_2^2 \qquad (4), (5) $$

where

$$ \mu_i = \frac{1}{n} \sum (Q - \bar{Q})^i $$
and n = number of items of data.

The theoretical moment ratio curves for the LP-III distribution are plotted in Fig. 1A and B for v = 0, c = 0 and scale parameter a varying as shown. The shape parameter b is varied to construct each curve. In addition to the LP-III distribution, points and curves representing other distributions, including the exponential, gamma, Gumbel, LN2, Weibull and normal, are shown in Fig. 2A and B.
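The sample moment ratios of eqs. 4 and 5 can be computed directly from a discharge series, as in the minimal sketch below. The function names are illustrative. The small-sample bias corrections that the paper applies to Cs and Cv are not specified in the text, so they are not reproduced here.

```python
def central_moment(values, order):
    """mu_i = (1/n) * sum((Q - Qbar)**i), the i-th sample central moment."""
    n = len(values)
    q_bar = sum(values) / n
    return sum((q - q_bar) ** order for q in values) / n


def moment_ratios(values):
    """Return (beta1, beta2) as defined in eqs. 4 and 5."""
    mu2 = central_moment(values, 2)
    if mu2 == 0.0:
        raise ValueError("series has zero variance")
    mu3 = central_moment(values, 3)
    mu4 = central_moment(values, 4)
    beta1 = mu3 ** 2 / mu2 ** 3   # squared skewness
    beta2 = mu4 / mu2 ** 2        # kurtosis
    return beta1, beta2
```

Each stream then contributes one point (β₁, β₂) to a diagram of the kind shown in Fig. 1B and Fig. 2B.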
AUSTRALIAN FLOOD DATA AND LP-III
To test the acceptability of the LP-III distribution as a general flood frequency distribution for Australian streams, moment ratio values for 172 peak annual discharge series of Australian streams are plotted in Fig. 2A and B (also Fig. 1A and B). Hydrologic characteristics of these streams are not given here but full details are given in McMahon (1979).

In Fig. 2B, the β₁-β₂ diagram, it is observed that most of the points plot to the right of the gamma, log-normal, Weibull, normal, Gumbel and exponential distributions. In Fig. 2A, these distributions do not cover more than half the points. However, the LP-III distribution (Fig. 1A and B) is found to cover the data points satisfactorily in both the β₁-β₂ and Cs-Cv diagrams. This simple analysis suggests that, at least for these 172 streams, LP-III is the only suitable general distribution for flood-frequency analysis. In particular, it is noted that the LN2 distribution is generally unsatisfactory.

To check the consistency of the plotted points across both Fig. 1A and B with respect to the LP-III distribution, for each stream the appropriate scale parameter a is determined from the Cs-Cv plot (Fig. 1A). This value, in conjunction with the sample value of β₁, is used to determine from Fig. 1B a computed value of the kurtosis (β₂), assuming the data were LP-III distributed. The computed values of β₂ are plotted against the observed values of β₂ in Fig. 3. Consistency of the LP-III distribution across both Fig. 1A and B implies that the points in Fig. 3 lie approximately along the 45° line. In fact, the observed kurtosis values tend to be underestimated. However, if the sampling bias corrections for β₁ and β₂ were taken into account, the computed and observed values would tend to be closer. Nevertheless, the analysis as it stands suggests, at least for these Australian streams, that the LP-III distribution is a satisfactory base method for analyzing flood flow frequencies.

Fig. 1. Moment ratio diagrams of data and log Pearson III distribution: (A) Cs vs. Cv; and (B) β₁ vs. β₂.

Fig. 2. Moment ratio diagrams of data and theoretical distributions: (A) Cs vs. Cv; and (B) β₁ vs. β₂. Legend: data points are marked as non-significant (g = 0) or significant (g ≠ 0); E = exponential; PV = Pearson Type V; LN = 2-parameter log-normal; W = Weibull; G = gamma.
Fig. 3. Observed β₂ compared with computed β₂, assuming a log Pearson III distribution. (Horizontal axis: computed β₂; vertical axis: observed β₂.)
What value of g?

In flood frequency analysis, if g (the coefficient of skewness of the logarithms of peak annual discharges) is not statistically different from zero (at the 5% level of significance), current practice is to set g = 0 (I.E.A., 1977). Applying this criterion to the 172 streams studied herein, g would be set to zero in about 60% of the streams. These streams have been differentiated in Figs. 1-3. Thus the peak annual discharges for these streams are assumed to be distributed as LN2.

To examine the effect of this procedure, we show in Fig. 4 estimates of the 100-yr. peak discharges (expressed as ratios of the mean annual flood) computed by using the sample value of g and also by setting g to zero. As expected, the effect is large. For example, differences larger than 25% between the two estimates are observed for more than 40% of the streams. Generally, by setting g = 0, the 100-yr. flood is overestimated relative to using the sample value of g. But a significant number of floods are underestimated. Twelve percent differ by more than 10%, while 3% underestimate the 100-yr. flood by more than 25%. These differences are large enough for us to examine the wisdom of setting g = 0 for those values of g that are not significantly different from zero.

To examine this question further, the mean and standard deviation of the coefficients of skewness of logs of the peak annual discharges for each Australian drainage division or region are set out in Table I. These show clearly that, except for Tasmania, the division or regional averages are considerably different from zero, although not statistically so. In view of these values, it is suggested that g other than zero is appropriate and that a realistic estimate of the population value is the sample estimate corrected for sampling bias. Even if the above approach is not adopted, it would be prudent for engineering hydrologists not to set g = 0 for any records in which g is positive. In these cases setting g = 0 will result in underestimation of the peak flood relative to a value computed from the sample value of g.

Fig. 4. Comparison of 100-yr. peak discharge estimates (expressed as ratio of mean annual flood) calculated using the sample value of g (vertical axis) with estimates based on g = 0 (horizontal axis).

TABLE I

Skewness characteristics of logarithms of peak annual discharge of Australian streams

Drainage division or region                                   Number of streams   Mean of g   Standard deviation
                                                              analyzed                        of g
Northeast coast division - catchments north of 24°S           10                  -0.62       0.42
Northeast coast division - catchments between 24° and 29°S    15                  -0.59       0.52
Southeast coast division - catchments along east coast        22                  -0.42       0.60
Southeast coast division - catchments along south coast       28                  -0.55       1.08
Tasmanian division                                            11                  +0.11       0.43
Murray-Darling division - Darling-Murrumbidgee system         15                  -0.67       0.82
Murray-Darling division - Murray system                       22                  -0.54       0.70
South Australian division                                     i.s.                i.s.        i.s.
Southwest coast division                                      24                  -0.61       1.10
Timor Sea division                                            n.d.                n.d.        n.d.
Arid zone region                                              14                  -0.89       1.05

i.s. = insufficient data; n.d. = no data.
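The screening of g against zero that this section questions can be sketched as follows. The paper does not state which test I.E.A. (1977) prescribes, so the sketch assumes a two-sided normal approximation using the standard error of the sample skewness for a normal population; the function names are likewise illustrative.

```python
import math
from statistics import NormalDist


def skew_standard_error(n):
    """Standard error of the sample skewness for a normal population of size n."""
    if n < 3:
        raise ValueError("need at least three values")
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def skew_is_significant(g, n, level=0.05):
    """True if g differs from zero at the given level (assumed two-sided test)."""
    z_crit = NormalDist().inv_cdf(1.0 - level / 2.0)
    return abs(g) > z_crit * skew_standard_error(n)
```

Under these assumptions a 30-year record with g = -0.6, a value close to most of the regional means in Table I, would not be judged significantly different from zero, so g would be set to zero for that record.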
CONCLUSIONS
Analyses using moment ratio diagrams of peak annual discharges for 172 Australian streams led to the following conclusions:
(1) The log Pearson III distribution is recommended as a suitable general distribution for flood frequency analysis.
(2) The theoretical distributions - normal, 2-parameter log-normal, gamma, Gumbel, Weibull and exponential - fitted the data poorly.

The procedure of setting the coefficient of skewness (g) of the logarithms of peak annual discharges to zero, if the sample value of g is not statistically different from zero, is questioned. It is concluded that, at least for the case of g > 0, this practice should not be followed but rather the sample value of g should be used.
REFERENCES

Benson, M.A., 1968. Uniform flood-frequency estimating methods for Federal Agencies. Water Resour. Res., 4(5): 891-908.
Conway, K.M., 1971. Flood frequency analysis of some New South Wales coastal rivers. M. Eng. Sc. Thesis, University of New South Wales, Sydney, N.S.W.
I.E.A. (Institution of Engineers, Australia), 1977. Australian rainfall and runoff: Flood analysis and design. Inst. Eng., Aust., 149 pp.
Kopittke, R.A., Stewart, B.J. and Tickle, K.S., 1976. Frequency analysis of flood data in Queensland. Hydrol. Symp. 1976, Inst. Eng., Aust., Natl. Conf. Publ. No. 76/2, pp. 20-24.
Landwehr, J., Matalas, N.C. and Wallis, J.R., 1978. Some comparisons of flood statistics in real and log space. Water Resour. Res., 14(5): 902-920.
McMahon, T.A., 1979. Hydrologic characteristics of Australian streams. Monash Univ., Clayton, Vic., Civ. Eng. Res. Rep. No. 3/1979.
Srikanthan, R. and McMahon, T.A., 1981a. Log-Pearson III distribution: effect of dependence, distribution parameters and sample size on peak annual flood estimates. J. Hydrol., 52: 149-159 (this issue).
Srikanthan, R. and McMahon, T.A., 1981b. Log-Pearson III distribution: plotting position. J. Hydrol., 52: 161-163 (this issue).