ELSEVIER
Journal of Hydrology 176 (1996) 115-131
Modelling of the time-series of spatial coverages of British rainfall fields
Christian Onof*, Howard S. Wheater
Department of Civil Engineering, Imperial College, London SW7 2BU, UK
Received 8 December 1994; revision accepted 9 April 1995
Abstract
From an analysis of British radar data, it is concluded that the series of consecutive rainfall coverages for a given area (e.g. a general circulation model grid-square) is strongly correlated. Consequently, in application of a rainfall disaggregation procedure, the choice of rainfall coverage at each time-step should not be independent of previous coverage. The series of areal rainfall depths may be used to represent part of this correlation by a simple regression-based model. Improvements upon this representation based upon linear time-series models are examined here, but they are of limited use. A second model uses the strong correlation between logarithmic transforms of the coverage and of the mean areal depth. It provides a good reproduction of the autocorrelation structure of the observed time-series. The reproduction of the marginal distribution can be improved upon by adding a linear time-series component. Another model, which uses correlations between the series of ratios of consecutive coverages and those of consecutive areal rainfall depths, is more successful in reproducing the autocorrelations of the coverages while preserving the features of the marginal distribution of the coverage. It provides a method for the generation of successive coverages conditional upon the knowledge of the areal rainfall depths.
* Corresponding author.
Elsevier Science B.V. SSDI 0022-1694(95)02769-6
1. Introduction
1.1. Rainfall disaggregation in meteorological models
In a typical atmospheric meteorological model, a single value of the average rainfall is calculated for each grid-square. Grid-square sizes vary, depending on model application, ranging from 2.5° of arc N-S by 3.25° of arc E-W for a typical general circulation model (GCM), to as low as 10 km x 10 km for forecasting models. As this single, spatially uniform value does not represent sub-grid-scale variability, this variability is commonly introduced artificially, for example by considering that rainfall at a given point M is a random variable (exponentially distributed) which is independent of the rainfall depth at any neighbouring point N. As the assumption of a constant distribution over the whole of the grid-square is unable to reproduce observed dry areas, this distribution is taken to apply to a proportion $\epsilon$ of the grid-square. Therefore, conditional upon the average rainfall $\bar{R}$ over that grid-square and the wet proportion $\epsilon$ of the square, the cumulative distribution of depths is given by
$$F(R_0) = P\{R \le R_0\} = 1 - \epsilon \exp\left(-\frac{\epsilon R_0}{\bar{R}}\right) \qquad (R_0 > 0)$$
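The disaggregation described above can be illustrated with a minimal Python sketch. It assumes the usual mixed form of the point distribution, $F(R_0) = 1 - \epsilon\exp(-\epsilon R_0/\bar{R})$: a point is dry with probability $1 - \epsilon$ and otherwise exponentially distributed with mean $\bar{R}/\epsilon$, so that the areal mean remains $\bar{R}$. The function names and the numerical values (mean depth 2.0, coverage 0.3) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def subgrid_cdf(r0, mean_rain, coverage):
    """Assumed form F(R0) = 1 - eps * exp(-eps * R0 / Rbar), for R0 >= 0."""
    r0 = np.asarray(r0, dtype=float)
    return 1.0 - coverage * np.exp(-coverage * r0 / mean_rain)

def sample_subgrid(mean_rain, coverage, size, rng):
    """Draw independent point depths: dry with probability 1 - eps,
    otherwise exponential with mean Rbar / eps."""
    wet = rng.random(size) < coverage
    depths = rng.exponential(scale=mean_rain / coverage, size=size)
    return np.where(wet, depths, 0.0)

rng = np.random.default_rng(0)
# Hypothetical grid-square: areal mean 2.0 (arbitrary units), coverage 0.3
points = sample_subgrid(mean_rain=2.0, coverage=0.3, size=200_000, rng=rng)
print("areal mean of sampled points:", points.mean())
print("wet fraction of sampled points:", (points > 0).mean())
```

The sampled wet fraction estimates $\epsilon$ and the sample mean estimates $\bar{R}$, which is the sense in which the distribution is conditional upon both quantities.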
The assumption that $\epsilon$ can be chosen among a set of two or three fixed values such as 0.5 and 0.1 according to the type of rainfall (Gregory and Smith, 1990) was shown to be inadequate by Eltahir and Bras (1992) and Onof and Wheater (1996). The latter investigated the variation of the marginal distribution of $\epsilon$ as a function of the size of the area, elevation and the type of rainfall (through a distinction of seasons). A mean value of $\epsilon$ was first chosen by taking a monthly mean value of the coverage as an estimate of $E[\epsilon]$. Then, depending upon the season and the area under consideration (highland, rainshadow and lowland were distinguished), a modification of a power function distribution was used to model the distribution of $\epsilon$ conditional upon the average areal rainfall $\bar{R}$, so that the exceedance function is approximately:
PI{E> Eg}N 9(Eg)= 1 - QJm’(l-m)+%(es)
(1)
where Q is given in Onof and Wheater (1996). Note that in the first approximation, Y can be chosen as zero. However, the determination of the marginal distribution neglects temporal dependence between the values of $\epsilon$ for consecutive time-steps. This paper is concerned with an examination of the time-series of the coverages. The data used are two months (August 1985 and December 1990) of 15 min radar rainfall for 5 km grid-squares from the Clee Hill radar. These data have been calibrated using a multi-quadric interpolation method (Onof and Wheater, 1996).
2. Preliminary analyses
Examination of the autocorrelation functions of the 15 min time-series of rainfall coverage $\{\epsilon_t\}$ reveals significant autocorrelations larger than 0.2 for lags up to 30, as shown in Figs. 1(a) and 1(b). For December, the autocorrelation function is again larger than 0.2 for lags greater than 95 and smaller than 130. This corresponds to a duration larger than 24 h. The decrease thereafter is very slow. For August, there is a clear diurnal pattern which amounts to positive correlations for lags corresponding to a duration of 24-36 h and negative for lags corresponding to 12-24 h, and this is repeated for each day. This is assumed to reflect the influence of diurnal heating on the initiation of convective activity.
Fig. 1. Autocorrelation of the time-series of coverages and mean areal depths, for (a) August and (b) December. Curves shown: spatial coverage, mean areal depth and critical values (5 percent).
Fig. 2. Rainfall coverage Fourier transforms (amplitudes against frequencies), for (a) August and (b) December.
The significance was tested using the fact that for a large sample of n independent values picked from a given distribution, the sample autocorrelation of lag k is approximately normally distributed with mean $-1/(n-k)$ and standard deviation $(n-k+1)^{1/2}/(n-k)$. Therefore the bounds of the 95% interval of the distribution of the sample autocorrelation of lag k are $-1/(n-k) - 1.96\,(n-k+1)^{1/2}/(n-k)$ and $-1/(n-k) + 1.96\,(n-k+1)^{1/2}/(n-k)$.
The persistence of the autocorrelation is confirmed by a spectral analysis of the two time-series: the Fourier transform decreases below 25% of the maximal amplitude for frequencies greater than 200 cycles and above 400 cycles it is negligible (Figs. 2(a) and 2(b)), which indicates some long-term memory (at least 3 or 4 days long here); this appears to be meteorologically justified if one considers the fact that mid-latitude cyclones or depressions, which represent 60% of the rainfall in the British Isles, present a sequence of growth from the first perturbation of the frontal zone to the occlusion, which takes 3-4 days (Shaw, 1983).
An examination of the autocorrelation of the time-series of average areal depths $\{\bar{R}_t\}$ shows a quicker decrease of the autocorrelation function, but nevertheless values larger than 0.2 for lags up to 20 (Figs. 1(a) and 1(b)). The patterns observed above for larger lags are clearly also present here. The Fourier transforms are negligible beyond 200 cycles, which confirms the observations made for the coverage.
Finally, the evolutions of the time-series $\{\epsilon_t\}$ and $\{\bar{R}_t\}$ present some obvious visual similarities although there are also noticeable differences, in the peaks in particular: it is indeed possible to have widespread rainfall with a small value of the areal average (in particular in December) and, vice versa, to observe very localised rainfall peaks (in particular in August). Nevertheless, correlation coefficients of 0.79 and 0.84 were found for these two months (see Table 1).
Table 1
Areal rainfall and coverage statistics
Month      Correlation($\epsilon$, $\bar{R}$)   $E[\bar{R}]$   $\sigma[\bar{R}]$   $E[\epsilon]$   $\sigma[\epsilon]$
August     0.79                                 0.014          0.025               0.073           0.088
December   0.84                                 0.009          0.019               0.060           0.079
Because, in a meteorological model, the only information which we possess at each time-step is the average areal rainfall, this correlation will be the basis on which time-series models for the rainfall coverages will be constructed. The model will be considered to be adequate if we can reproduce the main features of the marginal distribution of $\epsilon$ and most of the autocorrelation of $\{\epsilon_t\}$. These tests will be carried out by graphical comparison of the cumulative distribution functions of the historical and simulated series $\{\epsilon_t\}$ and of the autocorrelation functions of these series.
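The significance test used in this section can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the critical values use the expressions quoted in the text (mean $-1/(n-k)$, standard deviation $(n-k+1)^{1/2}/(n-k)$), while the AR(1) test series, its coefficient of 0.9 and the series length (a 31-day month of 15 min steps) are assumptions chosen only to exercise the functions.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[: len(d) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])

def critical_values(n, k, z=1.96):
    """95% bounds for r_k under independence, as stated in the text:
    mean -1/(n-k), standard deviation sqrt(n-k+1)/(n-k)."""
    mean = -1.0 / (n - k)
    sd = np.sqrt(n - k + 1.0) / (n - k)
    return mean - z * sd, mean + z * sd

# Synthetic, strongly persistent series standing in for a coverage record
rng = np.random.default_rng(1)
n = 31 * 96                      # assumed: 31 days of 15 min time-steps
noise = rng.standard_normal(n)
series = np.empty(n)
series[0] = noise[0]
for t in range(1, n):
    series[t] = 0.9 * series[t - 1] + noise[t]

acf = sample_autocorrelation(series, max_lag=30)
lower, upper = critical_values(n, k=1)
print("lag-1 autocorrelation:", acf[1])
print("95% bounds under independence:", lower, upper)
```

An autocorrelation falling outside the two bounds at a given lag is treated as significant at the 5% level.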
3. A first time-series model
3.1. Simple regression
A first attempt to relate the two time-series simply uses a regression of $\epsilon_t$ upon $\bar{R}_t$:
$$\epsilon_t = a\bar{R}_t + b + z_t \tag{3}$$
where $z_t$ is a random variable of minimal variance. The model first sets $z_t = 0$. This attempt is motivated by the observation of the similarities between the evolutions of $\epsilon_t$ and $\bar{R}_t$ and by the model of Eltahir and Bras (1993), who used the fact that the coverage is linked to the mean areal depth by
$$\epsilon = \bar{R}/\mu$$
where $\mu$ is the mean of the conditional distribution of rainfall rates for non-zero rain. The coefficients characteristic of the regression are given in Table 2. Also shown in this table are the statistics of the simulated series of coverages.
Table 2
Regression of $\epsilon$ upon $\bar{R}$
Month      a       b       $R^2$   $E[\epsilon]$   $\sigma[\epsilon]$
August     2.163   0.036   0.62    0.073           0.069
December   3.514   0.029   0.71    0.060           0.067
Thus, between two-thirds and three-quarters of the variance of the series $\{\epsilon_t\}$ is explained by the regression (3). Moreover, the autocorrelation function of the series $\{a\bar{R}_t + b\}$ is of course the same as that of $\{\bar{R}_t\}$ and therefore, as Figs. 3(a) and 3(b) show, an extra component would be required to improve the model. However, this autocorrelation function is very well reproduced for lags larger than 60.
3.2. Addition of an ARIMA component
The series $\{z_t\}$ obtained as remainder from Eq. (3) is now modelled by an ARIMA process. The fact that it has non-negligible autocorrelations for lags up to 30 indicates that the white-noise representation is inadequate. This large value of the autocorrelation suggests non-stationarity and therefore we shall consider the first differences $\{\Delta z_t\}$. These show rapidly decreasing autocorrelations which can be modelled either by an autoregressive AR(1) or by a white-noise model for $\{\Delta z_t\}$. Therefore, $\{z_t\}$ is modelled either by an ARIMA(1,1,0) or by an ARIMA(0,1,0) model. The RARIMA(1,1,0) model (Regression + ARIMA) is therefore
$$\epsilon_t = a\bar{R}_t + b + z_t, \qquad z_t = z_{t-1} + (c\,\Delta z_{t-1} + d\,y_t) \quad \text{with } y_t \sim N(0, 1) \tag{4}$$
and the RARIMA(0,1,0) model is
$$\epsilon_t = a\bar{R}_t + b + z_t, \qquad z_t = z_{t-1} + e\,y_t \quad \text{with } y_t \sim N(0, 1) \tag{5}$$
where the coefficients are given in Table 3. It will be noted that the extra RARIMA components are practically independent of the month chosen.
Table 3
RARIMA model for $\epsilon$
Month      c       d                e
August     -0.17   2.22 x 10^{-2}   2.27 x 10^{-2}
December   -0.16   2.37 x 10^{-2}   2.41 x 10^{-2}
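The recursion of the regression + ARIMA models can be sketched in Python as below. This is a hedged illustration under stated assumptions: the function name, the zero initial remainder and zero initial difference, and the synthetic gamma-distributed areal depths are all hypothetical; only the August coefficients are taken from Tables 2 and 3. The raw output is not confined to [0, 1].

```python
import numpy as np

def simulate_rarima(rbar, a, b, d, c=0.0, rng=None):
    """Coverage series eps_t = a*Rbar_t + b + z_t, with remainder
    z_t = z_{t-1} + c*(z_{t-1} - z_{t-2}) + d*y_t and y_t ~ N(0, 1).
    Setting c = 0 gives the random-walk (0,1,0) variant with step size d."""
    rng = np.random.default_rng() if rng is None else rng
    rbar = np.asarray(rbar, dtype=float)
    z = np.zeros(len(rbar))      # assumption: z_0 = 0 and initial difference 0
    prev_diff = 0.0
    for t in range(1, len(rbar)):
        diff = c * prev_diff + d * rng.standard_normal()
        z[t] = z[t - 1] + diff
        prev_diff = diff
    return a * rbar + b + z

rng = np.random.default_rng(2)
# Synthetic areal depths (hypothetical stand-in for the radar series)
rbar = rng.gamma(shape=0.5, scale=0.03, size=2000)
# August coefficients from Tables 2 and 3
eps_110 = simulate_rarima(rbar, a=2.163, b=0.036, d=2.22e-2, c=-0.17, rng=rng)
eps_010 = simulate_rarima(rbar, a=2.163, b=0.036, d=2.27e-2, rng=rng)
print("RARIMA(1,1,0) mean and sd:", eps_110.mean(), eps_110.std())
print("RARIMA(0,1,0) mean and sd:", eps_010.mean(), eps_010.std())
```

Because the remainder is an integrated process, its variance grows with the length of the simulation, which is one reason the unconstrained output can leave the admissible range of a coverage.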
Fig. 3. Autocorrelations for the pure regression first model, for (a) August and (b) December. Curves shown: historical, simulated and confidence limits.
Table 4
Regression + ARIMA (RARIMA) based model
Month      Statistic            RARIMA(1,1,0)   RARIMA(0,1,0)
August     $E[\epsilon]$        0.088           0.093
August     $\sigma[\epsilon]$   0.102           0.112
December   $E[\epsilon]$        0.059           0.073
December   $\sigma[\epsilon]$   0.111           0.106
Simulations yielded the results for the two months shown in Table 4. As this table shows, there is a problem of bias in the estimation induced by the simulation of series of coverages: adding a modelled remainder to the regressed coverage may lead to values larger than unity or smaller than zero. These are then eliminated (by setting them to values within the range [0, 1]). However, the effect is to cause bias. The remainders were introduced to generate larger variances. The effect, however, is to produce variances which are on the whole too large for both sets of data. The RARIMA(0,1,0) allows for an improved reproduction of the autocorrelations of lags from one to 50 whereas the RARIMA(1,1,0) produces some autocorrelations which are too large. The problems encountered above suggest that using a relation of type (3) to determine the coverage is not a satisfactory solution. In their use of this model, Eltahir and Bras (1993) assumed that $\mu$ can be chosen constant for a month, but it is not clear on what grounds this assumption is warranted. Indeed, for the Clee Hill data, the mean of the conditional distribution of rainfall rates varies considerably from one time-step to the next throughout the month. The above analysis suggests a non-linear representation of the coverages is preferable.
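The bias introduced by forcing simulated coverages into [0, 1] can be seen in a small Python experiment. This is only an illustration under assumed values: the unconstrained coverages are drawn from a normal distribution with mean 0.07 and standard deviation 0.10 (hypothetical figures of the same order as those in Table 4), rather than from the RARIMA models themselves.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical unconstrained coverages: many values fall below zero
raw = rng.normal(loc=0.07, scale=0.10, size=100_000)
# Values outside the admissible range are set to the nearest bound
clipped = np.clip(raw, 0.0, 1.0)

print("mean before and after clipping:", raw.mean(), clipped.mean())
print("sd before and after clipping:  ", raw.std(), clipped.std())
print("fraction of values reset:      ", np.mean((raw < 0.0) | (raw > 1.0)))
```

When the mean coverage is close to zero, far more values are reset at the lower bound than at the upper bound, so the mean of the constrained series is shifted upwards.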
4. A second time-series model
Rather than look at the series $\{\epsilon_t\}$ and $\{\bar{R}_t\}$, we could therefore look at $\{\ln(\epsilon_t + \eta)\}$ and $\{\ln(\bar{R}_t + \eta)\}$, which series have the advantage of belonging to the semi-infinite interval $(-\infty, 0)$. $\eta$ is chosen non-zero so as to avoid problems with possible zero depths or coverages: $\eta = 10^{-4}$.
4.1. Simple regression
The correlation of the two series is high: coefficients of 0.92 and 0.85 are found for August and December respectively and therefore a simple regression model is proposed:
$$\ln(\epsilon_t + 10^{-4}) = a\ln(\bar{R}_t + 10^{-4}) + b + z_t \tag{6}$$
where $z_t$ is a random variable of minimal variance. The model first sets $z_t = 0$. The coefficients are given in Table 5. Interestingly, there is much less difference between the coefficients for the two months, in particular the slope of the regression line. In the case of August, this provides a clear improvement upon the first model. For December, the mean is not well reproduced so that the remainders will have to be modelled. Figs. 4(a) and 4(b) show the cumulative distribution functions for the two months. For both months, the autocorrelation function is reproduced more satisfactorily than with the first model, as seen from Figs. 4(c) and 4(d). The improvement is particularly clear for December (the mean absolute deviation is down from 7.12 x 10^{-2} to 3.4 x 10^{-2}).
Table 5
Regression of $\ln(\epsilon + 10^{-4})$ upon $\ln(\bar{R} + 10^{-4})$
Month      a      b      $R^2$   $E[\epsilon]$   $\sigma[\epsilon]$
August     0.66   0.38   0.84    0.068           0.075
December   0.63   0.31   0.72    0.053           0.064
4.2. Addition of an ARIMA component
As with the first model, the series $\{z_t\}$ obtained as remainder from Eq. (6) is now modelled by an ARIMA process, as it has non-negligible autocorrelations for lags up to 100 in the case of December. In the case of August, there is only significant correlation for lags up to ten. Therefore, $\{z_t\}$ is modelled either by an ARIMA(1,1,0) or by an ARIMA(0,1,0) model. The LRARIMA(1,1,0) model (log-regression + ARIMA) is therefore
$$\ln(\epsilon_t + 10^{-4}) = a\ln(\bar{R}_t + 10^{-4}) + b + z_t, \qquad z_t = z_{t-1} + (c\,\Delta z_{t-1} + d\,y_t) \quad \text{with } y_t \sim N(0, 1) \tag{7}$$
and the LRARIMA(0,1,0) model is
$$\ln(\epsilon_t + 10^{-4}) = a\ln(\bar{R}_t + 10^{-4}) + b + z_t, \qquad z_t = z_{t-1} + e\,y_t \quad \text{with } y_t \sim N(0, 1) \tag{8}$$
where the coefficients for December are c = -0.155, d = 0.25 and e = 0.25. The results for LRARIMA(1,1,0) for December are shown in Figs. 5(a) and 5(b). This shows a very good reproduction of the marginal distribution (the simulated mean is 0.058 and the simulated standard deviation 0.085) and a good reproduction of the autocorrelation function (mean absolute deviation smaller than 0.06). The LRARIMA(0,1,0) model provides an equally good reproduction of the marginal distribution but produces insufficient autocorrelations. For the month of August, the addition of an ARIMA component does not improve the reproduction of the marginal distribution and produces insufficient autocorrelation.
5. A third time-series model
Instead of looking at the correlation between the series $\{\epsilon_t\}$ and $\{\bar{R}_t\}$, we consider that between $\{\epsilon_t/\epsilon_{t-1}\}$ and $\{(\bar{R}_t/\bar{R}_{t-1})^p\}$ for $t > 1$ and $p > 0$. The advantage of such an approach is that it uses a correlation between the variation of coverage and that of areal rainfall; such a model is a priori likely to be more robust to the difference in rainfall types because, although different types of rainfall have different characteristic coverages for a given areal depth, the variation of rainfall depth from one time-step to the next is more likely to be independent of the type of rainfall. The model will therefore be a regression of the form
$$\epsilon_t/\epsilon_{t-1} = \alpha\,(\bar{R}_t/\bar{R}_{t-1})^p + \beta + z_t \quad \text{for } t > 1 \text{ and } p > 0 \tag{9}$$
where $z_t$ is a white noise of mean zero. This means that it is a non-stationary Markov model.
Fig. 4. Cumulative distribution for the pure regression second model, for (a) August and (b) December. Autocorrelation for the pure regression second model, for (c) August and (d) December.
Fig. 5. (a) December autocorrelations of LRARIMA(1,1,0). (b) December cumulative distribution for LRARIMA(1,1,0).
Table 6
Regression of $\{\epsilon_t/\epsilon_{t-1}\}$ upon $\{(\bar{R}_t/\bar{R}_{t-1})^p\}$
Month      $\alpha$   $\beta$   $R^2$   $E[\epsilon]$   $\sigma[\epsilon]$
August     1.02       -0.04     0.99    0.070           0.087
December   0.58       0.47      0.93    0.065           0.084
5.1. Determination of p
p is determined so that it provides the maximum correlation between $\{\epsilon_t/\epsilon_{t-1}\}$ and $\{(\bar{R}_t/\bar{R}_{t-1})^p\}$. Interestingly, for both months, the optimal value of p was found to be approximately equal to 0.7; this corroborates the remark made above about the independence from the rainfall type. Moreover, the correlations between the two series are very high: they are 0.995 and 0.963 for August and December, respectively.
5.2. Determination of $\alpha$ and $\beta$
The two coefficients $\alpha$ and $\beta$ are determined by regression of $\{\epsilon_t/\epsilon_{t-1}\}$ upon $\{(\bar{R}_t/\bar{R}_{t-1})^p\}$. The coefficients obtained are given in Table 6, and show that this method provides a very good reproduction of the first- and second-order features of the marginal distribution of $\epsilon_t$ (within 5% for August and 10% for December). Moreover, longer simulations of these series have shown that these can be reduced to 6% uncertainty for both months. What is observed here is the effect of the initial conditions upon the simulation of $\{\epsilon_t\}$. Other remarks about the simulation are made in the Appendix. The performance for the first two moments is confirmed by a very good reproduction of the whole of the cumulative distribution function, as shown in Figs. 6(a) and 6(b).
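The two estimation steps of this section, choosing p by maximum correlation and then regressing the coverage ratios on the powered depth ratios, can be sketched in Python. This is a minimal illustration under assumptions: the grid of candidate exponents, the function names and the synthetic data (built so that the true exponent is 0.7) are hypothetical, and time-steps with zero depth or coverage, for which the ratios are undefined, are assumed to have been excluded beforehand.

```python
import numpy as np

def best_exponent(eps, rbar, p_grid):
    """Return (p, correlation) maximising
    corr(eps_t / eps_{t-1}, (Rbar_t / Rbar_{t-1})**p) over the grid."""
    eps_ratio = eps[1:] / eps[:-1]
    r_ratio = rbar[1:] / rbar[:-1]
    corrs = [np.corrcoef(eps_ratio, r_ratio ** p)[0, 1] for p in p_grid]
    i = int(np.argmax(corrs))
    return p_grid[i], corrs[i]

def fit_ratio_regression(eps, rbar, p):
    """Least-squares alpha, beta in
    eps_t / eps_{t-1} = alpha * (Rbar_t / Rbar_{t-1})**p + beta."""
    x = (rbar[1:] / rbar[:-1]) ** p
    y = eps[1:] / eps[:-1]
    alpha, beta = np.polyfit(x, y, 1)
    return alpha, beta

rng = np.random.default_rng(5)
# Synthetic strictly positive depths and coverages (illustrative only)
rbar = rng.lognormal(mean=-4.0, sigma=0.5, size=500)
eps = 0.9 * rbar ** 0.7          # constructed so that the true exponent is 0.7

p_grid = np.arange(0.1, 2.01, 0.1)
p_opt, corr_opt = best_exponent(eps, rbar, p_grid)
alpha, beta = fit_ratio_regression(eps, rbar, p_opt)
print("optimal p and correlation:", p_opt, corr_opt)
print("alpha, beta:", alpha, beta)
```

A finer grid or a one-dimensional optimiser could replace the coarse grid search; the grid is used here only to keep the sketch short.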
5.3. Autocorrelations
Figs. 6(c) and 6(d) show the autocorrelation functions for the two months. These clearly provide a better fit to the historical autocorrelations than those obtained from the first model in the case of the month of December, and are of comparable quality to those obtained from the second model. They tend to be larger (by up to 10%) than the historical values for time-lags smaller than 100 but do not reproduce the general shape of the historical autocorrelation function for larger lags. For August, the autocorrelations obtained are similar to those produced by the first model with the simple regression (note that this does indicate an improvement in the autocovariance because the variance is much larger here than for the first model). The periodicity is not quite as well reproduced for lags larger than 300, but the historical autocorrelations are in any case small for those lags. The insufficient autocorrelation for small lags for August may be due to problems linked with the constraints imposed by the simulation of $\{\epsilon_t\}$ (Appendix).
Fig. 6. Cumulative distribution for the third model, for (a) August and (b) December. Autocorrelation for the third model, for (c) August and (d) December.
6. Conclusion
Two models for the generation of successive values of the rainfall coverage for GCMs were successful. The first one, which uses a simple regression of the logarithms of the coverages (with a correction factor) upon the logarithms of the areal depths, is easy to use and, in the case of most of the data examined, provides a good approximation which suffers from insufficient variability. Linear additions to this model provide an improvement. Another model using a very strong correlation between the ratio of consecutive coverages and that of consecutive areal rainfall depths provides a very good reproduction of the marginal distribution of the coverage and a good reproduction of the autocorrelations for one set of data. For the other, the autocorrelation is still insufficient. These show that a satisfactory representation of the temporal evolution of the areal rainfall coverage is provided by a non-linear relationship linking it to the evolution of the mean areal depth over the same period.
Linear representations such as that proposed by Eltahir and Bras (1993) do not appear to be valid for the data examined in this paper, in particular as no account is taken of the important variability, from one time-step to the next, of the mean rainfall rate conditional upon rain. Further research is being carried out to test the two successful models with other data sets so as to examine their applicability to a wider range of rainfall types and the robustness of the parameters.
Acknowledgements
Funding for the project was made available through the TIGER programme of the Natural Environment Research Council.
Appendix
Because the second of the successful models (the third time-series model of Section 5) only uses a representation of the ratio of consecutive coverages, $\epsilon_t$ is obtained as the product $(\epsilon_t/\epsilon_{t-1}) \times \epsilon_{t-1}$ and therefore the model suffers from two essential problems: it has a strong dependence upon initial conditions, i.e. the choice of $\epsilon_1$, and the simulated coverages are not necessarily numbers between zero and unity. Practically, the simulation algorithm deals with these problems in the following manner:
(1) The algorithm must run for a certain number of time-steps (500 in the case of the current data) before the chain can be considered to be independent from the initial conditions. The initial $\epsilon_1$ is chosen from the distribution presented in the Introduction.
(2) A lower bound $\eta$ is set so that if $\epsilon_t \le \eta$ then $\epsilon_{t+1}$ is picked using the marginal distribution function $F(\epsilon) = \int_0^{\epsilon} f(t)\,dt$ (see Introduction); we chose $\eta = 10^{-3}$.
(3) An upper bound $\epsilon_u$ may be set so that if $\epsilon_t \ge \epsilon_u$, then $\epsilon_{t+1}$ is either picked using F, or is set to its mean value $\bar{\epsilon}$; generally, the upper limit can be picked from the data (e.g. $\epsilon_u = 0.70$ for August and $\epsilon_u = 0.65$ for December).
This resetting of the simulation algorithm may contribute to a small decrease of the autocorrelation function.
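The three steps above can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the function name, the standard deviation of the white noise (not reported in the text), the beta distribution standing in for the marginal distribution F, and the synthetic areal depths are all assumptions. The sketch follows the steps literally, so a value outside the bounds is kept and only the following value is redrawn.

```python
import numpy as np

def simulate_ratio_model(rbar, alpha, beta, p, noise_sd, draw_marginal,
                         eps_low=1e-3, eps_up=0.70, burn_in=500, rng=None):
    """Simulate coverages from the ratio regression with the resetting rules:
    (1) start from the marginal distribution and discard a spin-up period,
    (2) redraw from the marginal after a value at or below eps_low,
    (3) redraw from the marginal after a value at or above eps_up."""
    rng = np.random.default_rng() if rng is None else rng
    rbar = np.asarray(rbar, dtype=float)
    eps = np.empty(len(rbar))
    eps[0] = draw_marginal(rng)                  # step (1): initial coverage
    for t in range(1, len(rbar)):
        prev = eps[t - 1]
        if prev <= eps_low or prev >= eps_up:    # steps (2) and (3)
            eps[t] = draw_marginal(rng)
            continue
        ratio = (alpha * (rbar[t] / rbar[t - 1]) ** p + beta
                 + noise_sd * rng.standard_normal())
        eps[t] = ratio * prev
    return eps[burn_in:]                         # step (1): drop the spin-up

rng = np.random.default_rng(7)
# Synthetic strictly positive areal depths (hypothetical stand-in for data)
rbar = rng.lognormal(mean=-4.0, sigma=0.5, size=3500)
# Hypothetical marginal distribution of the coverage (beta, mean about 0.08)
marginal = lambda g: g.beta(0.5, 6.0)
# August coefficients from Table 6, p = 0.7; noise_sd is an assumed value
eps_sim = simulate_ratio_model(rbar, alpha=1.02, beta=-0.04, p=0.7,
                               noise_sd=0.05, draw_marginal=marginal, rng=rng)
print("simulated mean and sd:", eps_sim.mean(), eps_sim.std())
```

The alternative in step (3), setting the next coverage to its mean value instead of redrawing it, would replace the call to `draw_marginal` in the upper-bound case by a constant.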
References
Eltahir, E.A.B. and Bras, R.L., 1992. Estimation of the fractional coverage of rainfall in climate models. 1992 Spring Meeting, American Geophysical Union and Canadian Geophysical Union, Montreal, Ont. Suppl. to EOS, Transactions, American Geophysical Union, Washington, DC.
Eltahir, E.A.B. and Bras, R.L., 1993. Estimation of the fractional coverage of rainfall in climate models. J. Climate, 6(4): 639-644.
Gregory, D. and Smith, R.N.B., 1990. Unified Model Documentation Paper 25: Canopy, Surface and Soil Hydrology, Version 1. Meteorological Office, Bracknell.
Onof, C. and Wheater, H.S., 1996. Analysis of the spatial coverage of British rainfall fields. J. Hydrol., 176: 97-113.
Shaw, E.M., 1983. Hydrology in Practice. Van Nostrand, Wokingham, UK.