COMPUTATIONAL STATISTICS &DATAANALYSIS ELSEVIER
Computational Statistics & Data Analysis 22 (1996) 159-175
Spatio-temporal prediction of snow water equivalent using the Kalman filter Hsin-Cheng
Huang, Noel Cressie*
Department of Statistics, Iowa State University, Ames, IA 50011, USA
Received February 1995; revised July 1995
Abstract Consider a spatio-temporal stochastic process {Z(s; t): s e D; t = 1, 2 .... } and suppose it is of interest to predict {Z(s; to): s e D} at some fixed time point t o. Purely spatial methods use data Z(sl;to) . . . . . Z(s,;to) to construct a spatial predictor (e.g., kriging). But, when data {Z(si;t): i = 1,..., n; t = 1, 2,..., t o } are available, it is advantageous to treat the problem as one of spatiotemporal prediction. The US National Weather Service now use current snow water equivalent (SWE) data and a purely spatial model to predict SWE at sites where no observations are available. To improve SWE predictions, we introduce a spatio-temporal model that incorporates the SWE data from the past, resulting in a Kalman-filter prediction algorithm. A simple procedure for estimating the parameters in the model is developed and an example is presented for the Animas River basin in southwest Colorado. Keywords: Cross-validation; Kriging; Second-order stationary; Spatio-temporal model
1. Introduction T h e i n c r e a s e d d e m a n d s o f w a t e r f o r i n d u s t r y , a g r i c u l t u r e , a n d u r b a n living m a k e t h e r e l i a b l e f o r e c a s t i n g o f w a t e r r e s o u r c e s a n e x t r e m e l y i m p o r t a n t p r o b l e m . As a result, t h e U S N a t i o n a l W e a t h e r S e r v i c e h a s d e v e l o p e d a set o f h y d r o l o g i c simulation models used to generate extended streamflow predictions, water supply
* Corresponding author. 0167-9473/96/$15.00 © 1996 Elsevier Science SSDI 0 1 6 7 - 9 4 7 3 ( 9 5 ) 0 0 0 4 7 - X
160
H.-C. Huang, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
outlook, and flood forecasts. These models are intended to answer the three questions: How much snow has fallen in a particular winter? How much water will that snow turn into? Where will that water go? One of the most important parts of the hydrologic simulation models is the snow accumulation and ablation model that uses observed temperature and precipitation data to simulate snow cover conditions. This model is updated during the snow season through estimates of water content in snow cover (called snow water equivalent, abbreviated here as SWE) so as to maintain accurate streamflow and water supply forecasts. Castruccio et al. (1980) state that the benefit of a 6% improvement in streamflow predictions could be as high as 10 million dollars for hydropower and 28 million dollars for irrigation annually in the western USA. Hence precise estimation of SWE becomes an attractive goal. To estimate SWE accurately, snow data from the snow course sites have been manually collected during the snow seasons for as many as 50-60 years. Also, snow data from the automated sensor S N O T E L sites and remotely sensed airborne estimates of SWE have recently become available. Carroll et al. (1995) have shown how, at a given date, to use current spatial data in a statistical model to predict SWE at sites where no observations are available. In this article, we shall construct a spatio-temporal model that can incorporate information from past dates in the same year in order to obtain more reliable SWE predictions. In Section 2, the proposed spatio-temporal model is described, a special case of which is the purely spatial model that is currently being used by the US National Weather Service. Section 3 considers parameter estimation for the models given in Section 2. Section 4 applies the prediction equations derived in Section 3 to SWE in the Animas River basin of Colorado and compares their performance through cross-validation. Discussion of the results is given in Section 5.
2. The proposed spatio-temporai model To illustrate our approach, we concentrate on the Animas River basin in southwest Colorado. SWE data used in this article are collected from 13 S N O T E L sites in the basin. The site locations are displayed in Fig. 1. Data from the 13 sites have been observed daily during the snow season, anywhere between three and fifteen years. Because the US National Weather Service is interested in updates at weekly intervals, we consider only the data observed at the six time points, each one week apart, from February 25 through April 1 (inclusive), for each year (1987-1993), and at each site in the study area.
2.1. Exploratory data analysis Fig. 2 shows the plot of average SWE as a function of weeks, for years 1987-1993, where the average is taken over the 13 S N O T E L sites. Fig. 3 shows the plot of average SWE as a function of weeks, for the 13 S N O T E L sites, where the average is taken over years. Fig. 4 shows the plot of average SWE as a function of years, for
H.-C. Huano, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
161
o.
I
)1(
SNOTELSitu I
=S m9
"7 m.
"6
"4
"8 ml
-11 "10
m p,. o~
"12
m13
r,. o)
-108.0
-107.8 Longitude
-107.4
-107.6
Fig. 1. Animas river basin and locations of SNOTEL sites in southwest Colorado.
Ur) O)
year 87 year 88 year 89 year 90 year 91 year 92 year 93
O
f
u~ 04
II
. . . . . :..:..-.Z.-°-'~- . . . . . .
. . . .
--_:o
O
UD
O
1
2
3
4
5
unit ,,, 1 week
Fig. 2. Average of SWE over 13 sites from February 25 to April 1 for 1987 to 1993.
6
162
H.-C. Huang, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159--175
J
0
. . . . . . . . . . . . . . . L-- ~-~2~'-.z--;~ S~.:._~-_~L~'-
c
m lo
m IIIBs S
~
= ~-'-'L=-- ""'" js~
| ~
~
m
.,, ~
._=.~,..._.~..~.v--~~-~=:'--
--- ~
. . . . .
o
t~
o
1
2
3
4
5
6
unit = 1 week
Fig. 3. Average of SWE over years 1987 to 1993 from February 25 to April 1 for 13 SNOTEL sites. Each line indicates one site.
fJ
0
II
o
p" i t p•l
I0 lm
0
0
V
1
I
2
3
4
5
6
u n i t - 1 year
Fig. 4. Average of SWE over six weeks (February 25 to April 1) from 1987 to 1993 for 13 SNOTEL sites. Each line indicates one site.
H.-C. Huang, N. Cressie/ Computational Statistics & Data Analysis 22 (1996) 159-175
163
the 13 S N O T E L sites, where the average is taken over weeks. From these plots, strong location effects, year effects, week effects and some possible interaction can be seen. Consequently, the large-scale variation among sites, among years, and among weeks should be considered. It should be noted that the National Weather Service currently uses spatial prediction (kriging) methods that account for this large-scale variation by kriging residuals from a mean (over years) map. Furthermore, the standard deviations of SWE tends to be larger for larger mean SWE values. This can be seen in Fig. 5, the plot of the standard deviation versus mean SWE, where the means and standard deviations are taken over years for each S N O T E L site and week. These sorts of relations are used to standardize the data to have zero mean and unit variance. We standardize the historical data at S N O T E L sites by subtracting the means and dividing by the standard deviations at those sites. For locations where no observations are available, and so kriging is required, mean maps produced by the National Weather Service on a weekly basis (Day, 1990) are used for mean SWE and the standard deviation is modeled as a function of the mean (e.g. Fig. 5). Details are given below. Let {si: i = 1,..., n} denote the locations of n S N O T E L sites. The standardized SWE Zj,(s), for week t in year j at location s, derived by the National Weather
Od
Y
=
0.95
X 0"69
.
•
.
""
UJ
8 -= E3
1. Od
0 5
10
15
20
25
Mean SWE
Fig. 5. Sample standard deviation versus sample mean (over years) of SWE. The super-imposedline shows the least-squares fit.
164
H.-C. Huano, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
Service is as follows: Z~t(s) = Y jr(S) -- ~, (S)
at(S)
'
(2.1)
where t = 1,..., to,/6(sl) is the sample mean (over years) of observed SWE Yjt(sl) on week t at site sl and at(s~) is the corresponding sample standard deviation. For locations where no observations are taken,/~,(s) is obtained from mean maps and at(e) is estimated by Ca(#t(s)) c2. The estimates of Ca and C2 obtained from nonlinear least squares are 0.95 and 0.69, respectively (see Fig. 5). Notice that the standardization (2.1) used by the National Weather Service does not consider the year effect, which could be important (see Fig. 2). To adjust for the year effect, we replace the standardized SWE Z jr (s) by the adjusted standardized SWE Z~ (s) as follows: (2.2)
z?, (s) = zjt(~) - 2j,
where
1
~ Zjt(si).
2i = .to
2.2. Spatio-temporal prediction For the purposes of SWE prediction (as distinct from estimation of model parameters), only 1993 data will be used for SWE predictions in this study; hence we replace subscriptjt with subscript t in the following to simplify the notation. To begin with, let us consider the purely spatial model currently used by the National Weather Service. Suppose (e.g. Cressie, 1993, Section 3.1)
Zt(s) = S,(s) + et(s);
s ~ D c_ R 2,
(2.3)
where St(') is a zero-mean, L2-continuous (i.e., E E S t ( s + ~ ) - S t ( s ) ] 2 -*0 as t~ ~ 0), second-order stationary process, and et(') is a zero-mean, white-noise 2 process, independent of S,('), with var[et(s)] = try. In (2.3), it is assumed that {St('): t = 1, 2, ... } and {et( "): t = 1, 2, ... } are all mutually independent. To obtain the estimate of St (So), at location So, we could use ordinary kriging. Let the best linear unbiased predictor (BLUP) of St(so), given {Zt(sl): i = 1,..., n}, be
gt(So) = ~ 2iZ,(s,), i=l
where 5~'= 1 2~ = 1. If So # s~; i = 1 .... , n, then 21, ..., 2, can be obtained from (e.g. Cressie, 1993, pp. 121-122) 20 = Ao X?o,
H.-C. Huang, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
165
where ~.o = (21 .... ,2n, m)', ro - (yz(So - Sl), ..., ~z(So - sn), 1)',
(Tz(si-sj); I
Ao-=ll; 0;
i= 1,...,n,
j=
1 .... ,n,
i=n+l,
j = 1 .... , n,
i = n + 1,
j=n+l,
~z(h) = ½var [Z,(s + h) - Z,(s)], and Ao is symmetric. The kriging (or prediction) variance is given by /~(So) = ~bAo 1~'o. If So e {st: i = 1, ..., n}, then 21, ..., 2n can be obtained from (Cressie, 1993, pp. 128) 2o = A o l r ~ ', where ~,~' is the same as ~o except the kth c o m p o n e n t is replaced by a~. The kriging (or prediction) variance is given by/Y(So) =/o"'*',.oA-1,../o - a~; So ¢ {st: i = 1 , ..., n}. Notice that for ordinary kriging, only data obtained in the current week t is used to predict the unknown SWE, S,(so), at location So. In what follows, we use a spatiotemporal model and a K a l m a n filter to incorporate past data as well as current data into the prediction of SWE. Recall Eq. (2.3); this yields a vector relation given by Zt = St + 8,;
t = 0, 1 , 2 , . . . ,
(2.4)
where all the vectors are n × 1 corresponding to the n S N O T E L sites. The unobserved state processes {St(s): s ~ D, t = 0, 1, 2 .... } are assumed to be m e a n zero, spatially and temporally stationary Gaussian processes. Suppose they evolve in a M a r k o v manner, which says that only recent weeks' SWE values influence the current week's value. This has proved to be a very effective assumption for modeling precipitation in general. Linearizing this temporal dependence leads to a pth order autoregressive process,
S,(s) = ~ l S , - l ( s ) + ~2S,-2(s) + .-- + ~pSt-~(s) + ~h(S), s e D,
(2.5)
where the coefficients 0q .... , ~p are chosen so that all (possibly complex) roots of ~.P - E~'= 1 ~ 2 p-~ = 0 are less than 1 in absolute value in order to achieve temporal stationarity. Then the spatio-temporal model we shall consider is given by (2.4) and (2.5). In (2.4) and (2.5), Z, ~- (Z, (Sl) .... , Z,(sn))' is the vector of the standardized SWE observations for the n S N O T E L sites at time t; St = (S,(Sl), ..., S,(s,))' is the corresponding vector of the state variable for the n S N O T E L sites; ~, ~ N(0, a~I) is a white-noise process, independent of S,; {rh(s): s ~ D, t = 0, 1, 2,... } is a zero mean, spatially, and temporally stationary Gaussian process such that r/,, (.) and rh2(') are independent for all t~ ~= t2; rh(') and St-1 (') are independent for all t; 8,, and rh2(" ) are independent for all tt, t2; and 0q, ..., ~p are constants.
H.-C. Huan#, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159--175
166
(sis I (o l o (sps) (i)
Notice that (2.5) can be written as the following first-order M a r k o v process,
=
s,
St(s)
" o
/
~
~p- ~
" 1
St-p+l(s)
~1
St- ~(s)
+
(2.6)
~h s)
Due to temporal stationarity, we see that u p o n post-multiplying (2.5) by St-j(r) and taking expectations, we have P
cJS)(s, r) -
Z ~mCJS-)m(S' r) = 6joC("'(s, r), j = O, 1, 2, ...,
(2.7)
m=l
where r, ts), r) - cov[S,(s), St+h(r)], for all h = ..., - 1,0, 1, ..., C(")(s,r) •-,hts, cov [rh(s), rh(r)]; and {~
&u =
ifi~j, if i = j .
Set
r; s' (s, r) - ( q:' (s' s) C; (s, r)
cJ%, s) Cl~)(r,r))
and
( C(")(s, s) F{")(s, r) =- \ C(,)(r ' s)
C(")(s, r) ) C(")(r, r) ,"
Then Fff)(s, r); h = ..., - 1 , O, 1, ..., can be determined by solving the (YuleWalker) equations, P
FJS)(s, r) -
~
otmFJS-)m(S , r) = 6joF(")(s, r), j = 0, 1, 2 , . . . ,
(2.8)
m=l
obtained from (2.7). Note that F(-Sh)(s,r) = [F(hS)(s, r)]'. Hence, the first (p + 1) of Eqs. (2.8) can be solved for F(oS)(s, r), ..., F~S)(s, r). The remaining equations then give r~S) Jt p+ l i~ [O~ r), l,,~s) p + 2 ,[$, r), . . . , recursively.
2.2.1. Derivation of the Kalman-filter procedure The K a l m a n filter is a recursive algorithm for inference about the state variables of a model written in a state space form (e.g., the S variables defined according to the model (2.6)). Since it can be shown that L,~t(s) - E [ s t , ( s ) l Z l
.... , zt]
is a linear function of Zt and Sc - ~lt - ] (s) for all t', the K a l m a n filter enables the state variable to be updated once new observations are available. In the derivation below, we c o m p u t e the optimal spatio-temporal predictor of St(so) (i.e., E [St (So)/Z1 .... , gt-I), not only for sites where observations are taken but also for locations where no observations are available.
H.-C. Huang. N. Cressie / Computational
Statistics & Data Analysis 22 (1996) 159-I 7.5
167
Assume the models (2.4) and (2.6) hold. We establish the following notation: &,,(s) = ECZ,(s)lZ,,
&,, = EIZ,,)Zl, c
....ztl.
. . . . Z,],
f,,t~~t(~,r)~covCSt,(~),SfZ(r)IZ1,...,Zfl,
& ft(s,f-1= covC&(4, St,(4 IZI , ... , Z,l , ~,~~,(~,r)~~~~C(~,~-,+~(s),...,~,~(~)~,(~,~-,+~(r),...,~,~(r)~IZ~,...,Z,l, Ft= var[ZtIZ1, . . . . Z,-l], G(4=ovCZ,,
(Sf-P+l(~),...,St(~))IIZ1,...,Zt-ll,
a-(ctp,cxp_~,...,
6)
All joint distributions are multivariate Gaussian. Therefore, the conditional on of (S, _ p+ 1(s), . . . , S,(s), S,_,+ 1(r), . . . , S,(r), Zi)‘, conditional distribution r, . . . , Z,_ 1), is multivariate Gaussian with mean (Z ~~I,-,(s),~~-p+Ilr-l(r),...,~~(t-I(r)r~;lr-l)’
(%,+I,,-,(s)V., and covariance matrix
%I,-&J) &-drd G (4
&,,-ds,r) G(s)) &~~-l(r,r) (GM’ 4 G (4
I .
Consequently, the distribution of (S, _,,+ I (s), . . . , S,(s), St_,+ I (r), . . . , S,(r))‘, conditional on (Z,, . . . , Z,), is multivariate Gaussian with mean (St- ,+,~l-~(s),...,~~~t-~(s),~~-p+~~l--l(r),...,~~~r-~(r)) + (G,(s), Gt(r)VY’(& and covariance matrix
--&-I),
H.-C. Huan0, N. Cressie/ ComputationalStatistics& Data Analysis22 (1996)159-175
168
(e.g. Searle, 1971, p. 46). Hence, we have
+ (G,(s))'F;-~(Z,- 2,._ 1). Stft (S)
~]t -1 (S)
,~t[,($, ¥) = a~tl' _ I(S, where
(2.9)
¥) __
A )I(
St-p + lit-1(s)
(Gt(s) ), F t
St-p+ l(S)
) ZI~ . . . , Z t - i
~E
S,I, - 1 (S) = E[A(S,_p(s),
(2.10)
I ~t(r),
St (S) .... S,_,(s))' + (0 .... , O, ,,(s))'lZ, . . . . . Z , _ , ] = A ( ~ , - p l , - ,(s), ..., ~,_ , r , - , (s))',
(2.11)
Ztlt- l(s) = E [Z,(s)[Zt, ..., Zt_ a] = E [ S , ( s ) I Z ~ , . . . , Z , _ ~] = E [at' (S,_ p(s) .... , St- 1 (S))' ] Z 1 , .. • , Z t - 1 ] = ~'(S,-~1,-~(s),
[ ( S,-,+,(s)
/ S,p+ ,(r)
~tIt- 1($, r) ~ COY S,(s)
+
A
o)
St- 1 (S )
tt,- i (s)
S,_p(r) )
0
+
A
S,_ ~(r)
= A~t-lft_l(s,r)A' + (~ Ft= v a r [ g t l g l , . . . , Z t - 1 ] = var[S, lZ1, . . . , Z , - I ] + a2I
Z1,...,Zt-I
S,(r)
S,_p(S) = COV
(2.12)
. . . , ~ , _ ~ l , - ~ (s))';
ZI~...,Zt- 1 ?~t- 1 (¥) 0
C~"~(s,r)}'
(2.13)
H.-C. Huang, N. Cressie/ Computational Statistics & Data Analysis 22 (1996) 159-l 75
169
(2.14)
(2.15)
=
(2.16) 1 a'&-Ilt-1(s1,s)e2... a'&-Ip&l,s)a
=
(2.17) .a. a'Z,_,I,_,(s,,s)a
Thus, we have established the following theorem. Theorem 2.1. Assume that models (2.4) a_nd(2.5) hold. Fot_any pointAs E D and for spatio-temporaZdataZ1,Z2,...,Z,,,let(S-,+l,O(sO),~..,S-l~O(sO),SOIO(sO))Ibethe initial predictor of (S-r+ l(s,-,), . . . , S- 1(so), So(so))‘, and &,(s, r) be the coua_riance matrix between (S-,+,(~),...,S-~(~)~~O(~))I-(~-~+~IO(~),..~SI--IJO(~), So,o(s)) and (S-,+,(r), . . . . L(r), S&9’ - (&+I&%..., ~-&9, So~o(r))l. Then the minimum mean squared error predictor (MMSEP) of (S,,-,+ 1(so), . . . , S,,(s,))’ and its matrix of mean squared prediction errors, XrOIr,(~O, so), can be obtained recursively from the updating equations, (2.9)-(2.15). Notice that Z,,,(s, r) does not depend on the observations, which is to be expected because of the Gaussian assumptions. Thus, for all tl , t2 = t - p + 1, . . . , t, we have
&,,r,lt(s,r) =covCs,,(s),S,,(r)lZ1,...,ztl = EC(S&) - &As)) C&,(r) - &,I~(WL = EC(S&) - %&))(S&) = covCS&) - S,Js),
- %&))I
S,(r) - S,~d~~l.
. . . Al
H.-C. Huan#, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
170
This means that Z,l,(s , r) is actually the matrix of covariances between the prediction errors at locations s and r. Recall that the process has mean zero and so a reasonable choice for the initial predictor is --P + ll0(S)' "'" , S - 1 1 o ( $ ) ,
Solo(S)) t = E [(S_p+ 1(s), ..., S_ l(s), So(S))'] = 0.
It follows that
2:010(s, r) = c o v [ ( S _ p + , ( s ) ,
..., S _ ~ (s), S o ( s ) ) ; ( S - p ÷ ~ (r) . . . . , S _ ~ (r), So(r))'],
which is a p x p matrix whose (i,j)th entry is C~S~j(s,r).
Theorem 2.2. The one-step M M S E P for S,o+1(So) is #iven by
Lo÷l~,o(SO)=~,Lo,o(~O)+~2S',o_,f,o(SO)+
... + %,Lo-~÷,,,o(So),
(2.18)
where S, ol,o(So). . . . . Sto -p + ltto(So) are obtained from Theorem 2.1. Further, its mean squared prediction error is 2:,0 + ~l,o(So, So) = ~'2:tol,o(So, So)~ + C~"~(So, So), where L',olto(So, So) is obtained from Theorem 2.1. Proof. Sto +1 ito (So) = E [S,o+ 1(So)IZl, ..., Z, o]
= E[~'(Sto_~÷ ~(So) .... , Sto(So))'+ ~,o÷l(So)lZl,...,Z, o]
= ~,g, ol,o(~O) + ~Lo-,jto(~O) + ... + %Lo-~÷,l,o(So). Eto + l jto(So, So) = var [Sto + 1 (So) IZ 1.... , Z,o ] = v a r [ ~ ! (Sto-p+ i(So),
...
,
Sto(So))' + tho+l(so)lZ1, . . . , Z j
= ~' var [(S,o _ p + 1(So),..., S, o(So))' IZl .... , Zto] • + var [tho + 1(So)]
= ~'Z,o,,o(So, So)~ + C~"~(So, So).
[]
For the case where the data are corrected by year effects (see (2.2)), the models and the procedures remain the same except the standardized SWE data {Zt} are replaced by the adjusted standardized SWE data {Z*}. The assumption of joint Gaussianity above can be easily relaxed without changing the result. Instead of looking for M M S E P s that are best in the class of all predictors, we look for the minimum mean squared error linear predictors (MMSELPs) that are best in the more restrictive class of linear predictors. Then the predictors given by (2.9) and (2.16) are M M S E L P s in non-Gaussian settings. They are also unbiased. More discussion of this can be found in Harvey (1989, Chapter 3) or D u n c a n and H o r n (1972).
H.-C. Huang, N. Cressiel Computational Statistics & Data Analysis 22 (1996) 159-175
171
3. Parameter estimation For the purely spatial model given by (2.3), let 2yz(hj
I
0
E
8,+&(1-exp(-&IIhI)))
if h = 0, if h#O.
This is a flexible three-parameter variogram that exhibits nugget effect (0,), sill (0, + 0,), and spatial dependence (0,). Since the data from the SNOTEL sites were measured electronically, the potential for measurement error is much less than for data from snow course sites. Based on an analysis of repeated measurements of instruments over short time periods, we found it is reasonable to assume that cr,”is zero. The parameters that must be estimated are 8 = (0,) &, OS)‘, where ei 2 0; i = 1,2,3. The empirical robust variogram estimator suggested by Cressie and Hawkins (1980) is given by 2y^(h(l))
~
Cave(lWd- Zt(sj)11’2: WE N@U)))14 0.457 + 0.494/IN(h(I))(
’
hETol(h(l)), i,j=l,..., n, t= l,..., to}, where N(h(I)) = {(i, j, t): si-Sj=h, Tol(h(I)) is some specified “tolerance” region around h(l); 1 = 1, . . . , k, and IN (h (I)) I is the number of distinct elements in N (h (I)); I = 1, . . . , k. The parameter 8 can then be estimated by fitting this empirical variogram to 2y,(h; 0). We use the weighted least-squares method (e.g. Cressie, 1993, pp. 96-97) to estimate 8, by minimizing
over all possible 8. Now we consider parameter estimation for the spatio-temporal model given by (2.4) and (2.5). Let the spatially stationary error process be described by the flexible three-parameter covariance function,
4~w(-4211~-~ll) c’%, 4 = & + &
I
if s#r if s=r,
and again assume 0,’ to be zero. From (2.4) and (2.5), the parameters that must be estimated are c(~,. . . , up, 41, 42, and &, where 4i 2 0; i = 1,2,3. From (2.7) we have
Now
@‘(s, s); k = 0, 1, . . . , p, can be estimated by a consistent estimator that averages over n locations, J years, and up to to weeks (Brockwell and Davis, 1991,
172
H.-C. Huang, N. CressielComputational
Statistics & Data Analysis 22 (1996) 159-175
p. 220): p
k
E &
$ Oi
i
‘kk
(zjt(Si)
-
2)
{zj,t+k(Si)
-
z},
lj=lt=i
where
p p I iI1
Therefore, (aI, . . . , a,)’ can be estimated by
-1
p 1
P
1
2 .
p 1 p 0
c
*
07 P
Notice that by solving the first (p + 1) of Eqs. (24, we have Ch”(s , r) =f
P
(ccl, .,.,
ap)
cCq)
(s,
r)
for a particular fp (aI, . . . , up) that is easy to calculate. Hence the variogram of Z,( *) between si and sj, given by var [Z,
(Si)
-
Z, (Sj)]
=
varlIst(Si) -
=
varC&(%)l + varCSt(sj)l - 2COV CSt(Si),
= Ch”(Si,
St) +
st(sj)l St(Sj)l
CA”(Sj, Sj) - C$?(si, Sj),
can be estimated by if i=j, I 2 { ChS’ -f,(&,
. . . ,~p)~1exp(-~211Si-SjII)}
if i#j.
(3.1)
Thus, 41 and 42 can be estimated by fitting an empirical variogram based on {Zjt(Si): i = 1, . . . , 0, j = 1, . . . , J, t = 1, . . . , to> to this two-parameter variogram model. The fitted values of CJ&and & are denoted by $1 and a2. Because CP(s, s) = (41 +
43)fph
9 . . .,
ap),
it follows that 41~can be estimated by
(3.2) For the case where the data are corrected by year effects (see (2.2)), the parameter estimation procedures remain the same except the standardized SWE data {Z, ] are replaced by the adjusted standardized SWE data {Z:).
H.-C. Huang, N. Cressie / Computational Statistics & Data Analysis 22 (1996) 159-175
173
4. Application m snow water equivalent data
In this section, the results of fitting the data to a purely spatial model and to the spatio-temporal model are illustrated. Six weekly observations from February 25 through April 1 are taken for each year (1987-1993) at each SNOTEL site in the Animas River basin. The data were first standardized by either (2.1) or (2.2). For the purely spatial model, we use only the standardized SWE data on 1 April 1993 to obtain SWE predictions at any location on 1 April 1993. For the spatio-temporal model, we use all standardized SWE data (adjusted or not adjusted for year effects) to estimate the parameters and use only the standardized SWE data (adjusted or not adjusted, respectively)from the three most recent dates, 18 March 1993 through 1 April 1993, to obtain SWE predictions on 1 April 1993. Three cases are considered in this study and the parameters estimated for these cases are given as follows. Case 1. For the purely spatial model (2.3) and the standardized SWE obtained from (2.1), the estimated variogram parameters are/~ = 0.2585, 02 = 1.4426, and 03 = 0.0010. Case 2. For AR(1) spatio-temporal model (2.4) and (2.5), and the standardized SWE obtained from (2.1), &~ = 0.82267, q~ = 0.2462, (~2 = 0.0141, q~3 = 0.0285. Case 3. For AR(2) spatio-temporal model (2.4) and (2.5), and the adjusted standardized SWE obtained from (2.2) (i.e., adjusted for year effect), &l = 0.8497, &: = -0.1090, (~ = 0.0067, q~2 = 0.0501, and (ks = 0.0419. Notice that it is not necessary to consider the year-effect adjustment in the purely spatial model, because ordinary kriging used therein already takes the year effect into account. We use cross-validation techniques to compare these three cases. Cross-validation is implemented by removing data {Z~t(si): t = 1, ..., to} for each /, and predicting Zjto(Si) from the remaining data. The resulting predicted value and prediction variance of the standardized SWE for April 1 on year j at site i are A(_. denoted by 2(-°ls.~ jto , ,, and Zjt o ')(st), respectively. Then the predicted value and the prediction variance of unstandardized SWE for April 1 on year j at site i are ') (s,) ~ , ( - i)
+ ^
"
For the case where the data are corrected by year effects (see (2.2)), the predicted value and the prediction variance of the adjusted standardized SWE for April 1 on year j at site i are denoted by 2~(o-i)(sl) and Z(z~°(sl), respectively. Then the predicted value and the prediction variance of unstan~tardized SWE for April 1 on year j at site i are + zj } - _
+
174
H.-C. Huano, N. Cressie/ Computational Statistics & Data Analysis 22 (1996) 159-175
The cross-validation statistics suggested by Carroll and Cressie (1996) are as follows: 1 ~ Yjto(sl)- ~(-i)(_,jtok3"iJ CRV1 = - ~ /~_O~s.~i/2 , n i= I ( yjt ° ~, 11) CRV2-
!
~ i=1
CRV3 ~ {!
~
~(--i)
[Yj,o(Sl)
2 1/2
Yjto (st)] YJto I. i! d -
E Y f l ° ( s i ) - Y~t°i)(si)]2}
where here we compute them f o r j = 1993. The quantity CRVI checks the unbiasedness of the predictor and should be approximately equal to zero; CRV2 checks the accuracy of the standard deviation of the prediction error and should be approximately equal to one; and CRV3, the quantity we focus on, is a measure of goodness of prediction. The results of the cross-validation statistics are shown in Table 1. Inspection of Table 1 indicates that the values of CRV1 and CRV2 are all reasonable, with case 3 showing worse bias than the others. However, case 3 performs better than the other two cases by having the smallest CRV3 value. O n average, there is a 20 % improvement in root mean squared prediction errors using the spatio-temporal approach with year-effect adjustment. When we looked at the individual values of { Yto(Si) -- Y~o-i)(si): i = 1,..., 13} for each case, we found that the predicted SWE of site 1 is always substantially different from the observed SWE of site 1 in each case. This is because the standardized SWE of site 1 is much smaller than the standardized SWE of the others. Therefore, we m a y consider site 1 as an outlier. The cross-validation statistics with site 1 removed are shown in Table 2. F r o m Table 2, we can see that the values of CRVt and CRV2 are still reasonable for each case and case 3 again has a smaller CRV 3 value than the other two cases. The improvement in root mean squared prediction error using case 3 increases to 29°/'0 when the outlying site 1 is removed. Table 1 Cross-validation statistics. Case 1: purely spatial model; case 2: AR(1) spatio-temporal model; case 3: AR(2) spatio-temporal model adjusted for year effects. Model
CRV~
Case 1 Case 2 Case 3
0.08818 -0.02104 0.26646
CRV2
CRV3
0.99209 1.03000 1.06048
2.51406 2.31370 2.08728
Table 2 Cross-validation statistics with site 1 removed Model
CRV1
Case 1 Case 2 Case 3
-0.07941 -0.13997 0.10928
CRV2
CRV3
0.83608 0.91231 0.91225
2.14515 1.93585 1.66593
H.-C. Huano, N. Cressie/ Computational Statistics & Data Analysis 22 (1996) 159-175
175
5. Discussion In this article, we have proposed a spatio-temporal model and showed how to use the Kalman filter algorithm to obtain SWE prediction for locations where no observations are taken. For SWE data from the Animas River basin of southwest Colorado, the spatio-temporal model performs better than the purely spatial model, even when the standardized SWE are not adjusted for year effects. A further reduction of the CRV3 value is observed when there is an adjustment for year effects. Although the spatio-temporal model presented has been developed for the analysis of SWE data, this model should also prove useful for many other spatiotemporal prediction problems.
Acknowledgements We would like to thank Dr. Steven S. Carroll of Arizona State University for his assistance in obtaining and interpreting the data and for many helpful discussions on snow water equivalent. This research was supported by the National Weather Service, the National Science Foundation (DMS-9204521) and the Environmental Protection Agency (CR 822919-01-0).
References Brockwell, P.J. and R.A. Davis, Time Series: Theory and Methods, 2nd edn. (Springer, New York, 1991). Carroll, S.S and N. Cressie, Spatial modeling of snow water equivalent using covariances estimated from spatial and geomorphic attributes, Journal of Hydrology, tentatively accepted (1996). Carroll, S.S., G.N. Day, N. Cressie and T.R. Carroll, Spatial modeling of snow water equivalent using airborne and ground-based snow data, Environmetrics, 6 (1995) 127-139. Castruccio, P.A., H. Loats, Jr., D. Lloyd and P.A.B. Newman, Cost/benefit analysis for the operational applications of satellite snowcover observations (OASSO), Proc. Final Workshop on Operational Applications of Satellite Snowcover Observations, Washington, DC NASA CP-2116 (1980) 239-254. Cressie, N., Statistics for Spatial Data, revised edn. (Wiley, New York, 1993). Cressie, N. and D.M. Hawkins, Robust estimation of the variogram, I. J. Int. Assoc. Math. Geology, 12 (1980) 115-125. Day, G.N., A methodology for updating a conceptual snow model with snow measurements, NOAA Technical Report, NWS 43 (Silver Spring, MD, 1990). Duncan, D.B. and S.D. Horn, Linear dynamic regression from the viewpoint of regression anaysis, J. Amer. Statist. Assoc. 67 (1972) 815-821. Harvey, A.C., Forecasting, Structural Time Series Models and the Kalman Filter (Cambridge University Press, Cambridge, 1989). Searle, S.R., Linear Models (Wiley, New York, 1971).