Journal of Hydrology, 140 (1992) 209-233
Elsevier Science Publishers B.V., Amsterdam
Generation of multivariate autoregressive sequences with emphasis on initial values

Taylan A. Ula
Department of Statistics, Middle East Technical University, 06531 Ankara, Turkey

(Received 18 July 1991; revision accepted 29 April 1992)
ABSTRACT

Ula, T.A., 1992. Generation of multivariate autoregressive sequences with emphasis on initial values. J. Hydrol., 140: 209-233.

Certain aspects of data generation are studied through multivariate autoregressive (AR) models. The main emphasis is on the preservation of certain desired moments and the effect of initial values on these moments. The problem of preservation of moments is approached in a nontraditional way by starting with the initial values. For this purpose, general AR processes with a random start and with time-varying parameters are introduced to lay a foundation for the analysis of all types of AR processes, including the periodic cases. It is shown that an AR process with a random start and with parameters obtained from the moment equations is capable of generating jointly multivariate normal vectors with any specified means and covariance matrices, and with any specified autocovariance matrices up to a given lag. With a random start, there is no transition period involved for achieving these moments. A simple solution is proposed for matrix equations of the form BBᵀ = A which appear in the moment equations. The aggregation properties of general AR processes are also studied. A more detailed analysis is given for the two-period first-order periodic autoregressive model, PAR_2(1). For the PAR_2(1) process with a random start and with parameters obtained from the moment equations, it is shown that the autocovariance function depends only on the period and the lag, and therefore the process is periodic (covariance) stationary. The PAR_2(1) process with a fixed start is also studied. It is shown that the moments of this process depend on the absolute time, in addition to the period and the lag, and therefore the process is not periodic stationary. This dependence diminishes with time, and periodic stationarity is realized if the AR parameters satisfy certain conditions. In that case, the PAR_2(1) process with a fixed start converges to that with a random start, but only after a certain transition period. This proves the superiority of a random start over a fixed start.
INTRODUCTION
Simulation studies are being widely used in the design and operation of water resources systems. Data generation is an integral part of such studies. Hydrologic data (mainly streamflow data) generation techniques have been in use and under constant development over the last 30 years, under names such as 'operational' and 'synthetic' hydrology. In the modelling and simulation of univariate, multivariate, nonseasonal and seasonal hydrologic processes, autoregressive-moving average (ARMA) type models and their seasonally varying parameter versions, referred to as periodic ARMA (PARMA) models, have commonly been used. The ARMA models include the pure autoregressive (AR) and the pure moving average (MA) models as special cases. The univariate and multivariate AR models and their periodic versions (PAR) have received the most attention. Some early studies in this area are by Thomas and Fiering (1962), Yevdjevich (1964), Matalas (1967), Pegram and James (1972) and Salas and Pegram (1979) for the univariate PAR(1) and AR(p), and multivariate AR(1), AR(p) and PAR(p) models, respectively. Yagil (1963) introduced a univariate PAR type model with a periodically varying order in which the autocorrelations do not extend beyond a water year. This model is therefore not a truly autoregressive model but rather a multiple regression model. Multivariate PAR models, because of their multidimensionality and periodicity, are general enough to investigate certain general aspects of data generation. We use the simple case of a two-period first-order PAR model, PAR_2(1), in most of our analysis.

A commonly pursued objective in data generation has been the preservation of the distribution and some moments (mostly up to the third order) of the observed data. In our analysis we assume joint multivariate normality of the observations. This distribution is completely specified by its first- and second-order moments. The preservation of moments has commonly been accomplished by using the method of moments. In this method, the parameter-moment relations (which will simply be referred to here as the moment equations) are obtained from the recursive equation of the model under (covariance) stationarity. The method of moments estimates of the parameters can then be obtained by using the sample moments. We will not cover the estimation phase here, which we intend to discuss in a future paper.

The analysis of periodic models requires periodic (covariance) stationarity, which is the dependence of the autocovariance function on the period and the lag but not on the absolute time. The determination of periodic stationarity conditions for univariate and multivariate PARMA models has been discussed by Vecchia (1985) and Ula (1990), respectively, and explicit results have been obtained for the PAR(1) case. The moment equations for the univariate PAR(p), and the multivariate AR(1), AR(p) and PAR(p) models have been obtained by Salas (1972), Matalas (1967), Pegram and James (1972), and Salas and Pegram (1979), respectively. The moment equations for the univariate AR(p) model are available as the well-known Yule-Walker equations (Box and Jenkins, 1976).
The use of the moment equations for the parameters in data generation guarantees, at least in the long run, the preservation of moments up to the order to which they are included in the moment equations. As data generation is to be started with some initial values, the preservation of moments is in general a long-term property which can be achieved after the effect of initial values has diminished. In practice, the effect of initial values has been removed, at some cost, by disregarding a considerable amount of the initially generated data. Although it is known by virtue of stationarity that the initial values have a transitory effect, the nature and the extent of this effect have not been studied analytically. For periodic stationary models, which are not stationary in the ordinary sense, the nature of this effect is more questionable, and therefore is to be studied. This is the main objective of this paper.

We approach the problem of preservation of moments in a nontraditional way by starting with the initial values. For this purpose, we first introduce general AR processes with a random start and with time-varying parameters. We develop the moment equations for these processes, giving due attention to some fine points, and investigate their capability for reproduction of the desired first- and second-order moments, specifically the mean and the autocovariance function. We also investigate certain aggregation properties of general AR processes. The general AR process lays the foundation for the analyses of all types of AR processes, including the periodic cases. For a more detailed analysis, we adopt the PAR_2(1) model. For this model, we first study the random start case in some detail, and then compare it with the fixed start case, showing the superiority of the former.

GENERAL AUTOREGRESSIVE PROCESSES
Let {X_t; t = 1, 2, …} be an m-dimensional process, whose second-order moments exist, with the m × 1 mean vector μ(t) = E(X_t) and the m × m autocovariance matrix C(s, t) = E{[X_s − μ(s)][X_t − μ(t)]ᵀ} for all positive integers s and t, where T is the transpose operator. The vector X_t can always be redefined as X_t⁰ = X_t − μ(t) with a zero mean vector and with the same autocovariance matrix as X_t, and X_t can be recovered from X_t⁰ as X_t = X_t⁰ + μ(t). Therefore it may be assumed, without loss of generality, that μ(t) = 0 and C(s, t) = E(X_sX_tᵀ). The lag-zero autocovariance matrix C(t, t) is the variance-covariance matrix of X_t (which will simply be referred to as the covariance matrix of X_t), which is always symmetric and non-negative, i.e. either positive semidefinite (p.s.d.) or positive definite (p.d.), and p.d. iff (if and only if) nonsingular. In the context of, say, streamflow modelling, X_t is the streamflow vector at time t representing the mean-subtracted streamflow values at m stations.
The process {X_t} is called periodic (covariance) stationary of period w if, for some integer w > 1 and for all positive integers s and t, μ(t) = μ(t + w) and C(s, t) = C(s + w, t + w). For w = 1, this reduces to (covariance) stationarity.

We will first introduce general AR processes with a random start and with time-varying parameters. The first- and second-order processes will be analysed in some detail. The extensions to higher orders will then readily follow. The analysis of these processes with a random start does not require any type of stationarity. We will treat the PAR processes later as a special case, and study some implications of stationarity there when we consider the fixed start case.

First-order AR process

We define an AR(1) process for {X_t} with a random start and with time-varying parameters as

$$ X_1 = B_1 e_1 \tag{1} $$

$$ X_t = \phi_t X_{t-1} + B_t e_t, \quad t = 2, 3, \ldots \tag{2} $$
where {e_t; t = 1, 2, …} are independently and identically distributed m-dimensional noise vectors having a multivariate normal distribution with zero mean vector and identity covariance matrix, denoted as N_m(0, I), {φ_t; t = 2, 3, …} are m × m time-varying AR coefficient matrices, and {B_t; t = 1, 2, …} are m × m matrices required to produce the desired time-varying covariance matrices for the error terms a_t = B_te_t, t = 1, 2, …, which are independently distributed as N_m(0, B_tB_tᵀ). Denoting the vector e_t as N_m(0, I) is a compact way of saying that its m elements are independently distributed standard normal variables. Therefore independently distributed {e_t} vectors can be generated by using standard normal random numbers. The assumptions of independence and multivariate normality of {e_t} are required only for the joint multivariate normality of {X_t}, as will be discussed later. The noise vectors {e_t} being uncorrelated with zero mean vectors and identity covariance matrices, i.e. E(e_t) = 0, E(e_te_tᵀ) = I, and E(e_se_tᵀ) = 0 for s ≠ t, suffice for the derivation of the moment equations.

By successive application of eqn. (2) together with eqn. (1), it can be shown that X_t is a linear combination of e_1, e_2, …, e_t as

$$ X_t = \sum_{i=1}^{t-1} \phi_t \phi_{t-1} \cdots \phi_{i+1} B_i e_i + B_t e_t, \quad t = 2, 3, \ldots \tag{3} $$

It then follows that E(e_tX_tᵀ) = B_tᵀ, and that e_t is uncorrelated with X_{t−j} for j = 1, 2, …, i.e. E(e_tX_{t−j}ᵀ) = 0 for j = 1, 2, ….
It follows from eqn. (1), and then from successive application of eqn. (2) or directly from eqn. (3), that μ(t) = E(X_t) = 0, t = 1, 2, …. From eqns. (1) and (2),

$$ C(1, 1) = E(X_1X_1^T) = B_1B_1^T \tag{4} $$

$$ C(t, t-1) = E(X_tX_{t-1}^T) = \phi_t C(t-1, t-1), \quad t = 2, 3, \ldots \tag{5} $$

$$ C(t, t) = E(X_tX_t^T) = \phi_t C^T(t, t-1) + B_tB_t^T, \quad t = 2, 3, \ldots \tag{6} $$
Assuming nonsingularity, i.e. positive definiteness, of all the covariance matrices C(t, t), t = 1, 2, …, it follows from eqn. (5) that

$$ \phi_t = C(t, t-1)\,C^{-1}(t-1, t-1), \quad t = 2, 3, \ldots \tag{7} $$
If the covariance matrix C(t−1, t−1) in eqn. (5) is singular, then a solution for φ_t can still be obtained, but it is not unique (Graybill, 1983, p. 416). From eqns. (6) and (7), we have

$$ B_tB_t^T = C(t, t) - C(t, t-1)\,C^{-1}(t-1, t-1)\,C^T(t, t-1), \quad t = 2, 3, \ldots \tag{8} $$
and from eqn. (4)

$$ B_1B_1^T = C(1, 1) \tag{9} $$
Therefore eqns. (1) and (2) with parameters as specified by eqns. (7)-(9) are capable of generating {X_t} vectors with zero means, and with any specified covariance matrices C(t, t) and lag-one autocovariance matrices C(t, t−1). Any nonzero mean μ(t) for any X_t can also be specified by replacing X_t in eqns. (1) and (2) by X_t − μ(t). It should be noted that the random start as specified by eqn. (1) provides an appropriate initial condition, eqn. (4), for the solution of the recursive moment eqns. (5) and (6). The specified mean vectors and covariance matrices are reproduced starting from t = 1, and the lag-one autocovariance matrices are reproduced starting from t = 2.

A proposed solution for BBᵀ = A

The solution of eqns. (8) and (9) for B_t requires in general the solution of the equation BBᵀ = A for B. Matalas (1967) pointed out the nonuniqueness of B and proposed a principal components solution. Young (1968) and Young and Pisano (1968) proposed the simpler square root method, for the case of a p.d. A matrix, which yields a lower triangular B matrix. For a p.d. A matrix, there is a unique (except for sign) lower triangular B matrix (Graybill, 1983, p. 208). The square root method can be modified for a p.s.d. A matrix, and a lower triangular B matrix (not unique) can still be obtained (Graybill, 1969, pp. 298-304).
The matrix A = BBᵀ is symmetric and non-negative (Graybill, 1983, p. 397). Using these properties of matrix A, we propose a simpler solution for matrix B. Let A be an m × m real symmetric matrix. Its eigenvalues λ_1, …, λ_m are all real, and a set of real orthonormal m × 1 eigenvectors ε_1, …, ε_m exists. For distinct eigenvalues of A, the corresponding eigenvectors are always orthogonal. For identical eigenvalues of A, the corresponding eigenvectors are not necessarily orthogonal, but an orthogonal set always exists. As a nonzero scalar multiple of an eigenvector is still an eigenvector, eigenvectors can always be normalized to have unit lengths. Let Λ = Diag.(λ_1, …, λ_m), where 'Diag.' indicates a diagonal matrix with the specified elements, and E = (ε_1, …, ε_m), which is an m × m real orthogonal matrix, i.e. EᵀE = I. It follows from EᵀE = I that Eᵀ = E⁻¹. The matrix A can be expressed as

$$ A = E \Lambda E^T \tag{10} $$

which is known as the spectral decomposition of A (Rao, 1973, pp. 39-40). This simply follows from AE = EΛ, which is the matrix representation of Aε_i = λ_iε_i, i = 1, …, m. If, in addition, A is a non-negative matrix, then its eigenvalues are all non-negative. In that case, let Λ^{1/2} = Diag.(λ_1^{1/2}, …, λ_m^{1/2}), and define

$$ A^{1/2} = E \Lambda^{1/2} E^T \tag{11} $$

where A^{1/2} satisfies the relation A^{1/2}A^{1/2} = A and hence is called the square root of the A matrix. The matrix A^{1/2} is symmetric and satisfies the relation BBᵀ = A with B = B_s = A^{1/2}. This symmetric solution for B is different from the nonsymmetric solution (being lower triangular) obtained from the square root method. It can be seen that B = EΛ^{1/2} (in general nonsymmetric) also satisfies BBᵀ = A, as used by Pereira et al. (1984). The moment equations are invariant with respect to the particular solution used for B in BBᵀ = A.
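As an illustration, the proposed symmetric solution can be computed directly from an eigendecomposition. The following minimal sketch (Python with NumPy; the language and the function name are our own choices, not part of the paper) forms B_s = A^{1/2} = EΛ^{1/2}Eᵀ per eqns. (10) and (11) and checks BBᵀ = A:

```python
import numpy as np

def sym_sqrt(A, tol=1e-10):
    """Symmetric solution B = A**(1/2) of B B^T = A for a symmetric
    non-negative (p.s.d. or p.d.) matrix A, via eqns. (10) and (11)."""
    lam, E = np.linalg.eigh(A)            # A = E Diag(lam) E^T, lam real
    lam = np.where(lam < tol, 0.0, lam)   # clip tiny negatives from round-off
    return E @ np.diag(np.sqrt(lam)) @ E.T

# assumed example: a 2 x 2 p.d. matrix
A = np.array([[4.0, 1.0],
              [1.0, 2.0]])
B = sym_sqrt(A)
assert np.allclose(B @ B.T, A)            # B B^T = A
assert np.allclose(B, B.T)                # the solution is symmetric
```

For a p.d. A, np.linalg.cholesky(A) would instead give the lower triangular solution of the square root method; as noted above, the moment equations do not depend on which solution is used.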
In eqn. (9), C(1, 1), being a covariance matrix, is symmetric and non-negative, and is p.d. if nonsingular, an assumption which we already made in the moment eqns. (7) and (8). The covariance matrix of (X_t, X_{t−1}), together, is

$$ E\begin{bmatrix} X_t \\ X_{t-1} \end{bmatrix} [X_t^T, X_{t-1}^T] = \begin{bmatrix} C(t, t) & C(t, t-1) \\ C^T(t, t-1) & C(t-1, t-1) \end{bmatrix}, \quad t = 2, 3, \ldots \tag{12} $$

The positive definiteness of this covariance matrix guarantees the positive definiteness of the covariance matrices on its diagonal and the positive definiteness of the symmetric matrix on the right-hand side of eqn. (8) (Graybill, 1983, pp. 183-186, 415-418). The positive definiteness of covariance matrices
is not an unreasonable assumption, as singularity occurs only in the case of perfect linear dependence between some of the variables, which is a very unlikely situation, especially in hydrologic applications. We will assume positive definiteness of covariance matrices for all elements and combinations of elements of {X_t}. Given the positive definiteness of the symmetric matrices on the right-hand sides of eqns. (8) and (9), the matrices B_t and B_1 in these equations can be taken as the square roots of those matrices.
Normality of {X_t}

The joint multivariate normality of {X_t} follows from that of {e_t}: by eqns. (1) and (3), {X_t} are linear combinations of {e_t}, which are independently and multivariate normally distributed, so the joint distribution of {X_t} is also multivariate normal. For example, X_1 and X_2 can be expressed as linear combinations of e_1 and e_2 as

$$ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} B_1 & 0 \\ \phi_2 B_1 & B_2 \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} \tag{13} $$

As the joint distribution of independent multivariate normal vectors is also multivariate normal, the joint distribution of e_1 and e_2 is N_{2m}(0, I). As multivariate normality is preserved under a linear transformation (Kshirsagar, 1972, p. 27), the joint distribution of X_1 and X_2 is also 2m-dimensional multivariate normal. Under multivariate normality, e_t being uncorrelated with X_{t−j} for j = 1, 2, … implies also their independence, as e_t and X_{t−j} also have a multivariate normal joint distribution, and zero correlation means independence in that case.
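To make the construction concrete, here is a minimal generator sketch (Python/NumPy; all names are our own and sym_sqrt is the symmetric square root of eqn. (11)) for an AR(1) sequence with a random start, given target matrices C(t, t) and C(t, t−1), using eqns. (1), (2) and (7)-(9):

```python
import numpy as np

def sym_sqrt(A):
    # symmetric square root of a symmetric non-negative matrix (eqn. 11)
    lam, E = np.linalg.eigh(A)
    return E @ np.diag(np.sqrt(np.clip(lam, 0.0, None))) @ E.T

def generate_ar1(C_tt, C_lag1, rng):
    """Generate X_1, ..., X_n with a random start.
    C_tt:   list of n target covariance matrices C(t, t)
    C_lag1: list of n-1 target lag-one matrices C(t, t-1), t = 2, ..., n
    """
    m, n = C_tt[0].shape[0], len(C_tt)
    X = np.empty((n, m))
    X[0] = sym_sqrt(C_tt[0]) @ rng.standard_normal(m)    # eqns. (1), (9)
    for t in range(1, n):
        phi = C_lag1[t - 1] @ np.linalg.inv(C_tt[t - 1])  # eqn. (7)
        BBt = C_tt[t] - phi @ C_lag1[t - 1].T             # eqn. (8)
        X[t] = phi @ X[t - 1] + sym_sqrt(BBt) @ rng.standard_normal(m)  # (2)
    return X
```

A Monte Carlo check with rng = np.random.default_rng(0) over many replications should reproduce C(t, t) from t = 1 and C(t, t−1) from t = 2, with no warm-up data discarded.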
Higher-order AR processes

We define a second-order autoregressive AR(2) process for {X_t} with a random start as

$$ X_1 = B_1 e_1 \tag{14} $$

$$ X_2 = \phi_2 X_1 + B_2 e_2 \tag{15} $$

$$ X_t = \phi_{t,1} X_{t-1} + \phi_{t,2} X_{t-2} + B_t e_t, \quad t = 3, 4, \ldots \tag{16} $$

where φ_2, φ_{t,1} and φ_{t,2} are m × m autoregressive coefficient matrices. Again, it can be shown that X_t is a linear combination of e_1, e_2, …, e_t, E(e_tX_tᵀ) = B_tᵀ, and E(e_tX_{t−j}ᵀ) = 0 for j = 1, 2, …. The joint multivariate normality of {X_t} again follows from that of {e_t}. It follows from eqns. (14)-(16) that μ(t) = 0, t = 1, 2, …, and

$$ C(1, 1) = B_1B_1^T \tag{17} $$
$$ C(2, 1) = \phi_2 C(1, 1) \tag{18} $$

$$ C(2, 2) = \phi_2 C^T(2, 1) + B_2B_2^T \tag{19} $$

$$ C(t, t-1) = \phi_{t,1} C(t-1, t-1) + \phi_{t,2} C^T(t-1, t-2), \quad t = 3, 4, \ldots \tag{20} $$

$$ C(t, t-2) = \phi_{t,1} C(t-1, t-2) + \phi_{t,2} C(t-2, t-2), \quad t = 3, 4, \ldots \tag{21} $$

$$ C(t, t) = \phi_{t,1} C^T(t, t-1) + \phi_{t,2} C^T(t, t-2) + B_tB_t^T, \quad t = 3, 4, \ldots \tag{22} $$
From eqn. (18), we obtain

$$ \phi_2 = C(2, 1)\,C^{-1}(1, 1) \tag{23} $$
and from eqns. (20) and (21)

$$ [\phi_{t,1}, \phi_{t,2}] = [C(t, t-1), C(t, t-2)] \begin{bmatrix} C(t-1, t-1) & C(t-1, t-2) \\ C^T(t-1, t-2) & C(t-2, t-2) \end{bmatrix}^{-1}, \quad t = 3, 4, \ldots \tag{24} $$
where the matrix under the inverse is the covariance matrix of (X_{t−1}, X_{t−2}). By using results for inverses of partitioned matrices (Graybill, 1983, p. 184), we obtain

$$ \begin{bmatrix} C(t-1, t-1) & C(t-1, t-2) \\ C^T(t-1, t-2) & C(t-2, t-2) \end{bmatrix}^{-1} = \begin{bmatrix} K_1^{-1} & -K_1^{-1}\,C(t-1, t-2)\,C^{-1}(t-2, t-2) \\ -C^{-1}(t-2, t-2)\,C^T(t-1, t-2)\,K_1^{-1} & K_2^{-1} \end{bmatrix}, \quad t = 3, 4, \ldots \tag{25} $$
where

$$ K_1 = C(t-1, t-1) - C(t-1, t-2)\,C^{-1}(t-2, t-2)\,C^T(t-1, t-2) \tag{26} $$

$$ K_2 = C(t-2, t-2) - C^T(t-1, t-2)\,C^{-1}(t-1, t-1)\,C(t-1, t-2) \tag{27} $$
If a matrix is p.d., then its inverse is also p.d., and the submatrices on the diagonal of a partitioned p.d. matrix are also p.d. (Graybill, 1983, pp. 396-397).
Therefore positive definiteness of the covariance matrix of (X_{t−1}, X_{t−2}) guarantees the positive definiteness of K_1 and K_2, and hence the existence of their inverses. It then follows from eqn. (24) that, for t = 3, 4, …,

$$ \phi_{t,1} = [C(t, t-1) - C(t, t-2)\,C^{-1}(t-2, t-2)\,C^T(t-1, t-2)]\,K_1^{-1} \tag{28} $$

$$ \phi_{t,2} = [C(t, t-2) - C(t, t-1)\,C^{-1}(t-1, t-1)\,C(t-1, t-2)]\,K_2^{-1} \tag{29} $$
From eqn. (17), we have

$$ B_1B_1^T = C(1, 1) \tag{30} $$

from eqns. (19) and (23)

$$ B_2B_2^T = C(2, 2) - C(2, 1)\,C^{-1}(1, 1)\,C^T(2, 1) \tag{31} $$

and from eqns. (22) and (24)

$$ B_tB_t^T = C(t, t) - [C(t, t-1), C(t, t-2)] \begin{bmatrix} C(t-1, t-1) & C(t-1, t-2) \\ C^T(t-1, t-2) & C(t-2, t-2) \end{bmatrix}^{-1} \begin{bmatrix} C^T(t, t-1) \\ C^T(t, t-2) \end{bmatrix}, \quad t = 3, 4, \ldots \tag{32} $$
The symmetric matrix on the right-hand side of eqn. (31) is the same as K_1 for t = 3, the positive definiteness of which is guaranteed by that of the covariance matrix of (X_2, X_1), as discussed above. This also provides a simple way of obtaining the matrix in eqn. (31) from the inverse of the covariance matrix of (X_2, X_1), in which the upper-left m × m submatrix is K_1⁻¹, as seen from eqn. (25). It can be seen that the matrix in eqn. (8) is the same as K_1 with t replaced by t + 1. The positive definiteness of this particular K_1 is guaranteed by that of the covariance matrix of (X_t, X_{t−1}), which we already stated after eqn. (12) but without giving the details. It can also be shown that the symmetric matrix on the right-hand side of eqn. (32) is the inverse of the upper-left m × m submatrix in the partitioned inverse of the covariance matrix of (X_t, X_{t−1}, X_{t−2}), the positive definiteness of which guarantees that of the matrix in eqn. (32). Given the positive definiteness of the symmetric matrices on the right-hand sides of eqns. (30)-(32) for B_tB_tᵀ, the matrices B_t can be taken as the square roots of those matrices.

Equations (14)-(16) with parameters as specified by eqns. (23), (24), (30), (31) and (32) are capable of generating jointly multivariate normal {X_t} vectors with zero means, and with any specified covariance matrices C(t, t), lag-one autocovariance matrices C(t, t−1), t = 2, 3, …, and lag-two autocovariance matrices C(t, t−2), t = 3, 4, …. The random start as specified by eqn. (14), and the first-order eqn. (15) for the transition period t = 2, provide the appropriate initial conditions, eqns. (17)-(19), for the solution of the recursive moment eqns. (20)-(22).

The analyses of first- and second-order AR processes with a random start can easily be generalized to the pth-order case, AR(p), where p = 1, 2, …:
$$ X_1 = B_1 e_1 \tag{33} $$

$$ X_t = \sum_{i=1}^{t-1} \phi_{t,i} X_{t-i} + B_t e_t, \quad t = 2, 3, \ldots, p; \; p = 2, 3, \ldots \tag{34} $$

$$ X_t = \sum_{i=1}^{p} \phi_{t,i} X_{t-i} + B_t e_t, \quad t = p+1, p+2, \ldots \tag{35} $$
Equations (24) and (32) for the AR(2) case for t = 3, 4, … can easily be generalized to the AR(p) case for t = p + 1, p + 2, … by replacing the matrix [φ_{t,1}, φ_{t,2}] by [φ_{t,1}, …, φ_{t,p}], the matrix [C(t, t−1), C(t, t−2)] by [C(t, t−1), …, C(t, t−p)], and the covariance matrix of (X_{t−1}, X_{t−2}), which is the matrix under the inverse, by that of (X_{t−1}, …, X_{t−p}). It may be noted that eqns. (7) and (8) for the AR(1) case for t = 2, 3, … can be obtained from this generalization for p = 1. The only moment equation for t = 1 is that given by eqn. (17). The moment equations for the transition periods t = 2, 3, …, p for p = 2, 3, …, like eqns. (23) and (31) for the AR(2) case for t = 2, can also be obtained from this generalization by noting that for any given t in this range, the moment equations are those of the AR(t−1) case. The AR(p) model is capable of reproducing joint multivariate normality, any specified mean and covariance function, and any specified autocovariance function up to lag p. A further generalization of the AR(p) model above is that with time-varying orders p_t, for which the moment equations can be obtained in a similar manner. Such a model is capable of reproducing joint multivariate normality, any specified mean and covariance function, and any specified autocovariance function up to backward lag p_t at time t. A sketch of the generalized parameter computation is given below.
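The following sketch (Python/NumPy, our own illustrative code) assembles the generalized moment equations for t > p: it stacks [C(t, t−1), …, C(t, t−p)], builds the covariance matrix of (X_{t−1}, …, X_{t−p}) blockwise, and solves for [φ_{t,1}, …, φ_{t,p}] and B_tB_tᵀ. The helper C(i, j), returning the m × m target autocovariance matrix E(X_iX_jᵀ) for i ≥ j, is an assumed user-supplied function:

```python
import numpy as np

def arp_parameters(C, t, p, m):
    """phi_{t,1..p} and B_t B_t^T from the generalized eqns. (24) and (32)."""
    # G = [C(t, t-1), ..., C(t, t-p)], an m x (p*m) block row
    G = np.hstack([C(t, t - i) for i in range(1, p + 1)])
    # S = covariance matrix of (X_{t-1}, ..., X_{t-p}), built blockwise;
    # block (i, j) is E(X_{t-i} X_{t-j}^T) = C(t-i, t-j) for i <= j,
    # and C(t-j, t-i)^T otherwise
    S = np.block([[C(t - i, t - j) if i <= j else C(t - j, t - i).T
                   for j in range(1, p + 1)] for i in range(1, p + 1)])
    Phi = G @ np.linalg.inv(S)            # [phi_{t,1}, ..., phi_{t,p}]
    BBt = C(t, t) - Phi @ G.T             # eqn. (32), generalized
    phis = [Phi[:, i * m:(i + 1) * m] for i in range(p)]
    return phis, BBt
```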
Aggregation

We now consider certain aggregates (sums) of {X_t}, where {X_t} follows a general AR(p) process with a random start. Let us consider the qth-order aggregation:

$$ Z_v = \sum_{t=q(v-1)+1}^{qv} X_t, \quad v = 1, 2, \ldots; \; q = 1, 2, \ldots \tag{36} $$

for which Z_1 = X_1 + ⋯ + X_q, Z_2 = X_{q+1} + ⋯ + X_{2q}, etc. For monthly
streamflow modelling, for example, the 12 monthly streamflow vectors in each year v will aggregate to the annual streamflow vector Z_v for that year, for which q = 12. It follows from eqn. (36) that E(Z_v) = 0, and that the autocovariance function of {Z_v} for lag k is

$$ E(Z_v Z_{v-k}^T) = E\Big[\Big(\sum_{t=q(v-1)+1}^{qv} X_t\Big)\Big(\sum_{t=q(v-k-1)+1}^{q(v-k)} X_t\Big)^T\Big] = \sum_{i=q(v-1)+1}^{qv}\ \sum_{j=q(v-k-1)+1}^{q(v-k)} C(i, j), \quad v = k+1, k+2, \ldots; \; k = 0, 1, 2, \ldots \tag{37} $$
For nonzero means for {X_t}, each X_t in eqn. (36) is to be replaced by X_t − μ(t). In that case, Z_v is the mean-subtracted aggregated process. The mean function Σ_{t=q(v−1)+1}^{qv} μ(t) of the aggregated process is reproducible by an AR process of any order, as any AR process is capable of reproducing the desired mean function {μ(t)} for {X_t}. As seen from eqn. (37), the autocovariance function of {Z_v} is a sum of the autocovariance matrices C(i, j) of {X_t}, and the highest lag involved in C(i, j) is i_max − j_min = qv − [q(v − k − 1) + 1] = q(k + 1) − 1. Therefore the autocovariance function of {Z_v} for lag k is reproducible only by AR(p) processes of order p ≥ q(k + 1) − 1. In particular, we require p ≥ q − 1 for reproducing the covariance function of {Z_v}.

As the joint distribution of {X_t} is multivariate normal for any AR process, that of {Z_v} is also so, which follows from preservation of multivariate normality under linear transformations. For example, let us consider the second-order aggregation for which Z_1 = X_1 + X_2 and Z_2 = X_3 + X_4, which can be expressed as

$$ \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} = \begin{bmatrix} I & I & 0 & 0 \\ 0 & 0 & I & I \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} \tag{38} $$
The first matrix on the right-hand side of eqn. (38) is of dimension 2m × 4m, and of rank 2m, i.e. Z_1 and Z_2 are linearly independent linear combinations of X_1, …, X_4. The 4m × 1 vector on the right-hand side of eqn. (38) has a nonsingular (referring to the covariance matrix) multivariate normal distribution, assuming positive definiteness of its covariance matrix. These results guarantee the joint nonsingular multivariate normality of (Z_1, Z_2) and hence the nonsingular multivariate normality of both Z_1 and Z_2 (Kshirsagar, 1972, p. 27). Therefore the positive definiteness of the covariance matrices for all subsets of {X_t} guarantees that of the covariance matrices for all subsets of {Z_v}.
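As a quick illustration of eqn. (37), the following sketch (Python/NumPy, our own code) computes the lag-k autocovariance matrix of the qth-order aggregates from an assumed user-supplied target autocovariance function C(i, j):

```python
import numpy as np

def aggregate_autocov(C, m, q, v, k):
    """E(Z_v Z_{v-k}^T) by eqn. (37): sum of C(i, j) over the two blocks.
    C(i, j) must return the m x m matrix E(X_i X_j^T) for any i, j."""
    acc = np.zeros((m, m))
    for i in range(q * (v - 1) + 1, q * v + 1):
        for j in range(q * (v - k - 1) + 1, q * (v - k) + 1):
            acc += C(i, j)
    return acc
```

The highest lag of C(i, j) touched by the double sum is q(k + 1) − 1, which is the reproducibility bound stated above.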
PERIODIC AUTOREGRESSIVE PROCESSES

The general AR processes with a random start can be simplified by imposing some restrictions on the time-varying parameters. The simplest case is that of constant parameters. In that case, the parameters B_t and φ_{t,i} in eqn. (35) for the AR(p) process will be B and φ_i, respectively. Equations (33) and (34) for the transition period of the random start will remain as they are. This is the case of ordinary AR processes. A more general case is that of periodic parameters of period w. In that case, the parameters in eqn. (35) will satisfy the conditions B_t = B_{t+w} and φ_{t,i} = φ_{t+w,i} for some integer w > 1. This is the case of periodic autoregressive (PAR) processes, which will be considered in the following sections. We use the simple case of a two-period first-order PAR model, PAR_2(1), in most of our analysis. We also consider briefly its extensions to the arbitrary w-period case, PAR_w(1), and then to the pth-order case, PAR_w(p).

PAR_2(1) process

Random start

Using the established terminology of seasonal streamflow modelling, with v as the year and w as the number of seasons in the year (for example, w = 12 for monthly streamflow modelling), the w-period periodic process {X_t} can be expressed as {X_{v,τ}}, v = 1, 2, …, τ = 1, …, w. Let us consider the following two-period first-order autoregressive, PAR_2(1), process {X_{v,τ}}, τ = 1, 2, with a random start:
$$ X_{1,1} = B_0 e_{1,1} \tag{39} $$

$$ X_{v,1} = \phi_1 X_{v-1,2} + B_1 e_{v,1}, \quad v = 2, 3, \ldots \tag{40} $$

$$ X_{v,2} = \phi_2 X_{v,1} + B_2 e_{v,2}, \quad v = 1, 2, \ldots \tag{41} $$
We again assume a zero-mean process, without loss of generality. For nonzero means, X_{v,τ} is to be replaced by X_{v,τ} − μ_τ, where μ_τ is the mean for period τ.
By successive applications of eqns. (40) and (41) together with eqn. (39), X_{v,τ} can be expressed as a linear combination of e_{1,1}, e_{1,2}, …, e_{v,τ} as

$$ X_{v,1} = (\phi_1\phi_2)^{v-1} B_0 e_{1,1} + \sum_{i=2}^{v} (\phi_1\phi_2)^{v-i} [\phi_1 B_2 e_{i-1,2} + B_1 e_{i,1}], \quad v = 2, 3, \ldots \tag{42a} $$

$$ X_{v,2} = (\phi_2\phi_1)^{v-1} [\phi_2 B_0 e_{1,1} + B_2 e_{1,2}] + \sum_{i=2}^{v} (\phi_2\phi_1)^{v-i} [\phi_2 B_1 e_{i,1} + B_2 e_{i,2}], \quad v = 2, 3, \ldots \tag{42b} $$

$$ X_{1,2} = \phi_2 B_0 e_{1,1} + B_2 e_{1,2} \tag{42c} $$
which constitute a special case of eqn. (3). For the autocovariance matrices of this process, we can employ the notation C_τ(k), where k is the backward lag considered at period τ. For example,

$$ C_\tau(0) = E(X_{v,\tau} X_{v,\tau}^T), \quad \tau = 1, 2 \tag{43} $$

$$ C_1(1) = E(X_{v,1} X_{v-1,2}^T) \tag{44} $$

$$ C_2(1) = E(X_{v,2} X_{v,1}^T) \tag{45} $$
It then follows from eqns. (7)-(9) with the new notation, or directly from eqns. (39)-(41), that

$$ \phi_1 = C_1(1)\,C_2^{-1}(0) \tag{46} $$

$$ \phi_2 = C_2(1)\,C_1^{-1}(0) \tag{47} $$

$$ B_0B_0^T = C_1(0) \tag{48} $$

$$ B_1B_1^T = C_1(0) - C_1(1)\,C_2^{-1}(0)\,C_1^T(1) \tag{49} $$

$$ B_2B_2^T = C_2(0) - C_2(1)\,C_1^{-1}(0)\,C_2^T(1) \tag{50} $$
The covariance matrices of (X_{v,1}, X_{v−1,2}) and (X_{v,2}, X_{v,1}), respectively, are

$$ E\begin{bmatrix} X_{v,1} \\ X_{v-1,2} \end{bmatrix} [X_{v,1}^T, X_{v-1,2}^T] = \begin{bmatrix} C_1(0) & C_1(1) \\ C_1^T(1) & C_2(0) \end{bmatrix}, \quad v = 2, 3, \ldots \tag{51a} $$

$$ E\begin{bmatrix} X_{v,2} \\ X_{v,1} \end{bmatrix} [X_{v,2}^T, X_{v,1}^T] = \begin{bmatrix} C_2(0) & C_2(1) \\ C_2^T(1) & C_1(0) \end{bmatrix}, \quad v = 1, 2, \ldots \tag{51b} $$

It follows from our previous discussion that the positive definiteness of the above covariance matrices guarantees the positive definiteness of the matrices on the right-hand sides of eqns. (48)-(50) and also the existence of the inverses of the covariance matrices C_τ(0). We can take B_0, B_1 and B_2 as the square roots of the matrices on the right-hand sides of eqns. (48)-(50), respectively. Equations (39)-(41) with parameters as specified by eqns. (46)-(50) are capable of generating jointly multivariate normal {X_{v,τ}} vectors with zero means, and with any specified covariance matrices C_τ(0) and lag-one autocovariance matrices C_τ(1). This follows from our previous discussion of the general AR(1) process, and can of course be verified again directly from eqns. (39)-(41) together with eqns. (46)-(50) by obtaining the moments recursively. In this way the autocovariance matrices C_τ(k) for higher lags k = 2, 3, … can also be obtained in terms of C_τ(0) and C_τ(1). We give below all the first-
and second-order moments of {X_{v,τ}}:

$$ E(X_{v,\tau}) = 0, \quad \tau = 1, 2; \; v = 1, 2, \ldots \tag{52} $$

$$ E(X_{v,\tau} X_{v,\tau}^T) = C_\tau(0), \quad \tau = 1, 2; \; v = 1, 2, \ldots \tag{53} $$

$$ E(X_{v,1} X_{v-1,2}^T) = C_1(1), \quad v = 2, 3, \ldots \tag{54} $$

$$ E(X_{v,2} X_{v,1}^T) = C_2(1), \quad v = 1, 2, \ldots \tag{55} $$

$$ E(X_{v,1} X_{v-k,1}^T) = C_1(2k) = (\phi_1\phi_2)^k C_1(0), \quad v = 2, 3, \ldots; \; k = 1, 2, \ldots, v-1 \tag{56} $$

$$ E(X_{v,2} X_{v-k,2}^T) = C_2(2k) = (\phi_2\phi_1)^k C_2(0), \quad v = 2, 3, \ldots; \; k = 1, 2, \ldots, v-1 \tag{57} $$

$$ E(X_{v,1} X_{v-k,2}^T) = C_1(2k-1) = (\phi_1\phi_2)^{k-1} C_1(1), \quad v = 2, 3, \ldots; \; k = 1, 2, \ldots, v-1 \tag{58} $$

$$ E(X_{v,2} X_{v-k,1}^T) = C_2(2k+1) = (\phi_2\phi_1)^k C_2(1), \quad v = 1, 2, \ldots; \; k = 0, 1, 2, \ldots, v-1 \tag{59} $$
where φ_1 and φ_2 are as given by eqns. (46) and (47). As all the first- and second-order moments of the process {X_{v,τ}} depend only on the period τ and the lag k, but not on the absolute time v, the process is periodic stationary.
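A minimal generator sketch for this random-start PAR_2(1) construction (Python/NumPy; all names are our own, and sym_sqrt is the symmetric square root of eqn. (11)):

```python
import numpy as np

def sym_sqrt(A):
    lam, E = np.linalg.eigh(A)
    return E @ np.diag(np.sqrt(np.clip(lam, 0.0, None))) @ E.T

def generate_par2(C10, C20, C11, C21, n_years, rng):
    """PAR_2(1) with a random start, eqns. (39)-(41) and (46)-(50).
    C10, C20: target C_1(0), C_2(0); C11, C21: target C_1(1), C_2(1)."""
    m = C10.shape[0]
    phi1 = C11 @ np.linalg.inv(C20)          # eqn. (46)
    phi2 = C21 @ np.linalg.inv(C10)          # eqn. (47)
    B0 = sym_sqrt(C10)                       # eqn. (48)
    B1 = sym_sqrt(C10 - phi1 @ C11.T)        # eqn. (49)
    B2 = sym_sqrt(C20 - phi2 @ C21.T)        # eqn. (50)
    X = np.empty((n_years, 2, m))
    X[0, 0] = B0 @ rng.standard_normal(m)    # random start, eqn. (39)
    for v in range(n_years):
        if v > 0:
            X[v, 0] = phi1 @ X[v - 1, 1] + B1 @ rng.standard_normal(m)  # (40)
        X[v, 1] = phi2 @ X[v, 0] + B2 @ rng.standard_normal(m)          # (41)
    return X
```

No warm-up period is needed: the targets C_τ(0) and C_τ(1) are met from the first year, per eqns. (52)-(55).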
Limiting behaviour of autocovariance matrices. Equations (56)-(59) contain the non-negative integer powers of the m × m matrices φ_1φ_2 or φ_2φ_1, which are in general nonsymmetric. The limiting behaviours of these matrices as the power tends to plus infinity determine those of the corresponding autocovariance matrices. We will study this limiting behaviour in some detail. We let B be an m × m nonsymmetric matrix. Some of its eigenvalues, and the corresponding eigenvectors, may be complex. We let B^k, k = 1, 2, …, be the kth power of B. If the eigenvalues of B are all distinct, then B^k can be expressed in terms of the eigenvectors and the kth powers of the eigenvalues of B through its spectral decomposition (Rao, 1973, pp. 43-44). In the case of repeated eigenvalues, a similar but more complicated decomposition can be obtained through the Jordan canonical form (Rao, 1973, p. 45). It can be shown through these decompositions that the dependence of B^k on k is only through the eigenvalues of B. For the special case of a symmetric matrix, for example, the spectral decomposition of B was given by eqn. (10), which is applicable even for the case of repeated eigenvalues. Multiplying B in that equation by itself k times gives B^k = EΛ^kEᵀ, k = 1, 2, …, where Λ^k = Diag.(λ_1^k, …, λ_m^k). Using some theorems on matrix sequences (Cullen, 1972, p. 256), the limiting behaviour of the matrix sequence {B^k} as k tends to plus infinity can be determined from the nature of the eigenvalues of B. It can be shown that for any matrix B, the sequence {B^k} converges to the null matrix iff the eigenvalues of B are all less than one in absolute value (Cullen, 1972, p. 256). Otherwise, the following limiting behaviours may be observed: (a) the sequence {B^k} converges to a non-null matrix; (b) the sequence diverges by oscillation, i.e. the sequence oscillates in the limit with a certain period between two or more limiting matrices; (c) the sequence diverges to infinity, i.e. at least one element of B^k diverges to plus or minus infinity (this can happen iff there is at least one eigenvalue with absolute value greater than one); (d) the sequence diverges by drifting, i.e. the sequence changes continuously, without diverging to infinity or diverging by oscillation. In certain cases, the sequence {B^k} may exhibit a finite behaviour, either remaining unchanged for all k (the case of an idempotent matrix) or oscillating with a certain period for all k (e.g. the case of a tripotent matrix, with period two).

It follows from the preceding discussion that the autocovariance function of the PAR_2(1) process will decay, i.e. converge to a null matrix, iff the eigenvalues of φ_1φ_2 and φ_2φ_1 are all less than one in absolute value. It can be shown that the matrices φ_1φ_2 and φ_2φ_1 have the same eigenvalues (Morrison, 1976, p. 65). Therefore we can restrict our attention to the matrix φ_1φ_2 alone, where

$$ \phi_1\phi_2 = C_1(1)\,C_2^{-1}(0)\,C_2(1)\,C_1^{-1}(0) \tag{60} $$
As the covariance matrices of the process are fixed by eqn. (53), a decaying autocovariance function will mean a decaying autocorrelation function, which is consistent with the autocorrelation properties of most hydrologic processes. If the eigenvalues of φ_1φ_2 are not all less than one in absolute value, then the autocorrelation function will exhibit a persistent behaviour. The possibility of an explosive autocorrelation function, i.e. the sequence {(φ_1φ_2)^k} diverging to infinity, does not exist, as the autocorrelations are limited to the range (−1, 1), as will be shown below. Equation (60) assumes positive definiteness of the covariance matrices C_τ(0) of X_{v,τ}. Furthermore, in eqn. (60) each of the lag-one autocovariance matrices C_τ(1) relates to C_1(0) and C_2(0) through the covariance matrices of (X_{v,1}, X_{v−1,2}) and (X_{v,2}, X_{v,1}) in eqns. (51a) and (51b), which are also assumed to be p.d. in our analysis. These positive definiteness assumptions certainly place some restrictions on the φ_1φ_2 matrix and its eigenvalues. The author is currently investigating the nature of these restrictions. To throw some light upon this issue, we will consider the simple univariate case.
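In practice, the decay condition can be checked numerically by forming φ_1φ_2 from eqn. (60) and inspecting its spectral radius; a small sketch (our own illustrative code):

```python
import numpy as np

def par2_decays(C10, C20, C11, C21):
    """True iff all eigenvalues of phi_1 phi_2 (eqn. 60) are < 1 in modulus,
    i.e. the PAR_2(1) autocovariance function decays."""
    phi1phi2 = C11 @ np.linalg.inv(C20) @ C21 @ np.linalg.inv(C10)
    return np.max(np.abs(np.linalg.eigvals(phi1phi2))) < 1.0
```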
Univariate case. For the univariate case, we define the lag-one autocorrelations ρ_1(1) = c_1(1)/(σ_1σ_2) and ρ_2(1) = c_2(1)/(σ_1σ_2) for periods one and two, respectively, where σ_τ² = c_τ(0) is the variance for period τ, which is assumed to be positive (the univariate version of the positive definiteness assumption on the covariance matrices of X_{v,τ}). For the univariate case, we replace C in C_τ(0) and C_τ(1) by c. It then follows from eqns. (46)-(50) that

$$ \phi_1 = \rho_1(1)(\sigma_1/\sigma_2) \tag{61} $$

$$ \phi_2 = \rho_2(1)(\sigma_2/\sigma_1) \tag{62} $$

$$ \phi_1\phi_2 = \rho_1(1)\rho_2(1) \tag{63} $$

$$ B_0^2 = \sigma_1^2 \tag{64} $$

$$ B_1^2 = \sigma_1^2[1 - \rho_1^2(1)] \tag{65} $$

$$ B_2^2 = \sigma_2^2[1 - \rho_2^2(1)] \tag{66} $$

The determinants of the covariance matrices of (X_{v,1}, X_{v−1,2}) and (X_{v,2}, X_{v,1}) in eqns. (51a) and (51b) are σ_1²σ_2²[1 − ρ_1²(1)] and σ_1²σ_2²[1 − ρ_2²(1)], respectively. These determinants are always non-negative by the non-negativity property of covariance matrices, and therefore |ρ_1(1)| ≤ 1 and |ρ_2(1)| ≤ 1. This can be generalized to the multivariate case and to autocorrelations of all lags. It then follows from eqn. (63) that |φ_1φ_2| ≤ 1, and from eqns. (65) and (66) that B_τ² ≥ 0 (the univariate version of the non-negativity property of B_τB_τᵀ). We can then take B_1 = σ_1[1 − ρ_1²(1)]^{1/2} and B_2 = σ_2[1 − ρ_2²(1)]^{1/2}. It should also be noted that |φ_1φ_2| = 1 and B_τ² = 0 for τ = 1, 2 iff |ρ_1(1)| = |ρ_2(1)| = 1, practically an almost impossible situation. In that case, B_τ = 0 and, as can be verified from eqns. (40) and (41), this is the case of perfect linear dependence between X_{v,2}, X_{v,1} and X_{v−1,2}. If we impose the positive definiteness assumption on the above-mentioned covariance matrices, then their determinants are positive, and therefore |ρ_1(1)| < 1 and |ρ_2(1)| < 1. In that case, |φ_1φ_2| < 1 and B_τ² > 0 (the univariate version of the positive definiteness of B_τB_τᵀ in that case). It should be noted that B_0² is always positive under the assumption σ_1² > 0. A necessary and sufficient condition for the positive definiteness of a 2 × 2 symmetric matrix A = (a_ij) is that a_11 > 0 and det(A) > 0 (Graybill, 1983, p. 397). Therefore, when σ_τ² > 0, the above-mentioned covariance matrices are p.d. iff their determinants are positive. Therefore, when σ_τ² > 0, we have |ρ_1(1)| < 1, |ρ_2(1)| < 1, |φ_1φ_2| < 1 and B_τ² > 0 iff the covariance matrices of (X_{v,1}, X_{v−1,2}) and (X_{v,2}, X_{v,1}) are p.d. For
the univariate case, the eigenvalue of the scalar φ_1φ_2 is the same as itself, and its absolute value is always less than or equal to one, and less than one iff the above-mentioned covariance matrices are p.d. Whether or not this can be generalized to the multivariate case should be investigated.
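The univariate formulas (61)-(66) are easy to check numerically; a tiny sketch with assumed example values (our own, purely illustrative):

```python
import math

sigma1, sigma2 = 1.5, 2.0   # assumed period standard deviations
rho1, rho2 = 0.6, 0.5       # assumed lag-one autocorrelations

phi1 = rho1 * sigma1 / sigma2            # eqn. (61)
phi2 = rho2 * sigma2 / sigma1            # eqn. (62)
B0 = sigma1                              # eqn. (64)
B1 = sigma1 * math.sqrt(1 - rho1**2)     # eqn. (65)
B2 = sigma2 * math.sqrt(1 - rho2**2)     # eqn. (66)

assert abs(phi1 * phi2 - rho1 * rho2) < 1e-12   # eqn. (63)
assert abs(phi1 * phi2) < 1                     # decaying autocorrelations
```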
Fixed start

We now consider a PAR_2(1) process with a fixed start. In eqns. (39)-(41), we keep eqns. (40) and (41) as they are, and change eqn. (39) to

$$ X_{1,1} = a \tag{67} $$
where a is an m × 1 arbitrary constant vector. It follows from eqn. (67) that E(X_{1,1}) = a and E[(X_{1,1} − a)(X_{1,1} − a)ᵀ] = 0. A zero mean vector for X_{1,1} can be obtained by taking a = 0, but this will not affect the covariance matrix of X_{1,1}, which will still remain zero. Therefore with a fixed start, it is not possible to have a proper nonsingular covariance matrix for X_{1,1}. In eqns. (40) and (41), let us again employ the moment equations for the parameters φ_1, φ_2, B_1 and B_2, as given by eqns. (46), (47), (49) and (50), respectively. It follows from our previous discussion that eqns. (40) and (41) with these parameters and with a random start are capable of generating {X_{v,τ}} vectors with any specified covariance matrices C_τ(0), starting with X_{1,1}, and lag-one autocovariance matrices C_τ(1), starting with X_{1,2}. However, this is not the case with a fixed start, as a fixed start does not provide the appropriate initial condition, which is a nonsingular covariance matrix for X_{1,1}, for the solution of the recursive moment equations. This can be seen from an inspection of the moment eqns. (4)-(9) for the general AR(1) process with a random start, which require as the initial condition for solution a nonsingular covariance matrix C(1, 1) for X_1, which cannot be accomplished by a fixed start as C(1, 1) = 0 in that case. The mean vectors, and the autocovariance matrices up to lag two, for the PAR_2(1) process with a fixed start, as defined by eqns. (40), (41) and (67), with parameters as given by eqns. (46), (47), (49) and (50), are given below:
(~ptgp2) ~' ~a
E(X~,2)
=
(42(o~)v-'(~2a
c(x~,.,,
L,,)
=
c,(o)
v = 1, 2 . . . .
(68)
v = 1, 2 . . . .
-
(69)
(0,¢2) ~ 'c,(o)[(~,¢2)~-'1T
v =
l, 2 . . . . (70)
c(x~.~, xv.~)
=
c2(0)
-
(~2¢,) ~ '~2c,(0)¢~[(¢2~,)v-']
r
v = 1, 2 . . . . c(£.,,
x~
,.~)
v = 2,3,...
(71) =
c,(1)
-
( ¢ , ~ ) - ( ov, C,e ( l
)q~w[ ( ~ : ~ , ) v
qw
(72)
226
T, a. ULA
C(X~,2, X~,,) =
C2(1) -
(~b2q),)~ 'C2(1)[(qS~b2) v L]~
v = i, 2 . . . . (73)
c(xv.,, Xv ,,) v =
= 4),0,.c,(o)
-
v 'c,(o)[(O,
qL
Y-2] w
2, 3 . . . .
(74)
C(X~..2, X~, ,.2) = 4,20,G(0) - (4,~,/,,) ~ ' 4,~C,(0)0~[(4~24,,) ~ ~_]~r v = 2,3,...
(75)
where C(·, ·) is the covariance operator. As we have, in general, nonzero means in this case, C(X_{v,2}, X_{v,1}), for example, is to be defined as E{[X_{v,2} − E(X_{v,2})][X_{v,1} − E(X_{v,1})]ᵀ}. In obtaining eqns. (70)-(75), the use of the following general result simplifies the derivations:

$$ C(a_1X_1 + a_2X_2,\; a_3X_3 + a_4X_4) = a_1C(X_1, X_3)a_3^T + a_1C(X_1, X_4)a_4^T + a_2C(X_2, X_3)a_3^T + a_2C(X_2, X_4)a_4^T $$

where a_i, i = 1, …, 4, is an m × m constant matrix, and X_i is an m × 1 random vector. In eqns. (70)-(75), we also make use of the following relations:
=
C,(I)
(76)
~b2C,(0 )
=
C2(1)
(77)
C,(1)C;'(0)C((1)
=
~b,Cz(0)~bT
(78)
C2(1)Cff' (0)C2r(1)
=
~b2C, (0)q~T
(79)
which follow from eqns. (46) and (47). In eqns. (78) and (79), the left-hand side terms appear in eqns. (49) and (50), respectively. It can be seen from eqns. (68)-(75) that the moments of the PAR_2(1) process with a fixed start depend on the absolute time v, in addition to the period and the lag, and therefore the process is not periodic stationary. We will now show that this dependence diminishes with time, and periodic stationarity is realized iff the eigenvalues of φ_1φ_2 are all less than one in absolute value. We will refer to this as the periodic stationarity condition. It follows from eqns. (68) and (69) that E(X_{v,τ}) = 0 if a = 0. If a ≠ 0, omitting the trivial case φ_1 = 0 and/or φ_2 = 0, then E(X_{v,τ}) converges to zero as v → ∞ iff (φ_1φ_2)^{v−1} and (φ_2φ_1)^{v−1} converge to zero, which can happen iff the eigenvalues of φ_1φ_2 are all less than one in absolute value. Therefore the mean vectors for periods one and two, E(X_{v,τ}), τ = 1, 2, converge to zero in the limit with a nonzero fixed start iff the periodic stationarity condition is satisfied. In eqn. (70), the second term on the right-hand side is an m × m matrix whose diagonal elements are the quadratic forms q_iC_1(0)q_iᵀ, i = 1,
…, m, where q_i is the ith row of (φ_1φ_2)^{v−1}. As C_1(0) is p.d., these quadratic forms are all zero iff q_i = 0, i = 1, …, m, i.e. (φ_1φ_2)^{v−1} = 0. The matrix (φ_1φ_2)^{v−1} is, in general, nonzero for finite v but converges to zero in the limit iff the periodic stationarity condition is satisfied. Therefore the covariance matrices for the first period, C(X_{v,1}, X_{v,1}), converge to C_1(0) iff the periodic stationarity condition is satisfied. By a similar argument, it can be shown by eqn. (71) that the covariance matrices for the second period, C(X_{v,2}, X_{v,2}), converge to C_2(0) iff the periodic stationarity condition is satisfied. It follows from eqns. (72) and (73) that if the periodic stationarity condition is satisfied, then the lag-one autocovariance matrices for periods one and two, C(X_{v,1}, X_{v−1,2}) and C(X_{v,2}, X_{v,1}), converge to C_1(1) and C_2(1), respectively. It also follows from eqns. (74) and (75) that if the periodic stationarity condition is satisfied, then the lag-two autocovariance matrices for periods one and two, C(X_{v,1}, X_{v−1,1}) and C(X_{v,2}, X_{v−1,2}), converge to φ_1φ_2C_1(0) and φ_2φ_1C_2(0), respectively, which are the same as the results that can be obtained for k = 1 from eqns. (56) and (57), respectively, for the random-start case. In eqns. (70)-(75), we included the autocovariance matrices for periods one and two only up to lag two. The higher lag autocovariance matrices can be obtained similarly, and it can be verified that if the periodic stationarity condition is satisfied, then all these autocovariance matrices converge to those given by eqns. (56)-(59) for the random-start case. Therefore the PAR_2(1) process with a fixed start is not periodic stationary as is that with a random start, and periodic stationarity as in the random-start case, with the same mean and autocovariance function, can be achieved iff the eigenvalues of φ_1φ_2 are all less than one in absolute value, and then only after a certain transition period. This proves the superiority of a random start over a fixed start.

It follows from the preceding discussion that the PAR_2(1) process with a fixed start which satisfies the periodic stationarity condition is capable of generating {X_{v,τ}} vectors with any specified covariance matrices C_τ(0) and lag-one autocovariance matrices C_τ(1), but only after a certain transition period. The length of this period depends in general on the rate of convergence of {(φ_1φ_2)^v} and {(φ_2φ_1)^v} to zero as v → ∞ or, more specifically, on the convergence times of the second terms on the right-hand sides of eqns. (70)-(73). The effect of a fixed start on the generated values can also be seen from eqns. (42a)-(42c) by replacing the random starting value B_0e_{1,1} in these equations by the nonzero fixed starting value a. The terms (φ_1φ_2)^{v−1}a and (φ_2φ_1)^{v−1}φ_2a, which appear respectively in eqns. (42a) and (42b) in this case, converge to zero, i.e. the effect of the starting value a diminishes with time, iff the periodic stationarity condition is satisfied.

The joint multivariate normality of {X_{v,τ}} vectors for the fixed-start case, with the exception of the degenerate value X_{1,1} = a, also follows from eqns. (42a)-(42c) with B_0e_{1,1} replaced by a. The vectors X_{1,2}, X_{2,1}, X_{2,2}, …, X_{v,1}, X_{v,2}, which are linear combinations of the independently and multivariate normally distributed e_{1,2}, e_{2,1}, e_{2,2}, …, e_{v,1}, e_{v,2} vectors, have a multivariate normal joint distribution.
An independent verification of the periodic stationarity condition for the PAR_2(1) process with a fixed start was given by Ula (1990). The PAR_2(1) process considered there is based on eqns. (40) and (41) for v = 0, ±1, ±2, …, and there is no specific mention of initial values, as the process is assumed to be fully developed and freed from the effect of initial values. The periodic stationarity of the PAR_2(1) process {X_{v,1}, X_{v,2}} is related, as a necessary and sufficient condition, to the (covariance) stationarity of the 2m-dimensional AR(1) process {Y_v}, where Y_v = (X_{v,1}ᵀ, X_{v,2}ᵀ)ᵀ. The stationary multivariate AR(1) process (Fuller, 1976) has properties similar to those of the well-known stationary univariate AR(1) process (Box and Jenkins, 1976). Two of these properties are the transitory effect of initial values and the decaying nature of the autocorrelation function, both of which follow from the infinite moving average representation of the process with absolutely converging weights. The periodic stationary PAR_2(1) process exhibits these properties also because of its convertibility to a stationary AR(1) process.

For the PAR_2(1) process with a random start, the first- and second-order moments of the process as given by eqns. (52)-(59) guarantee periodic stationarity of the process, as these moments depend only on the period τ and the lag k. The derivation of these moments did not require the periodic stationarity condition. This is totally due to the random start. In our analysis of the PAR_2(1) process with a random start, the only point at which we had to resort to the periodic stationarity condition was to obtain a decaying autocorrelation function, which is also a very desirable feature, especially from a practical point of view. It may be noted that when the periodic stationarity condition is not satisfied, the weights (φ_1φ_2)^{v−i} and (φ_2φ_1)^{v−i} in eqns. (42a) and (42b), respectively, do not decay for large values of v − i, i.e. the process is affected by its remote past. This explains the persistent autocorrelations in that case.
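The speed at which a fixed start is 'forgotten' can be read off eqn. (70) directly. A short sketch (our own illustrative code, with assumed example matrices chosen so that the decay condition holds) tracks how far C(X_{v,1}, X_{v,1}) still is from its target C_1(0) as v grows:

```python
import numpy as np

# assumed example targets (m = 2), purely illustrative
C10 = np.array([[1.0, 0.3], [0.3, 1.0]])   # C_1(0)
C20 = np.array([[2.0, 0.4], [0.4, 1.5]])   # C_2(0)
C11 = np.array([[0.5, 0.1], [0.2, 0.4]])   # C_1(1)
C21 = np.array([[0.6, 0.2], [0.1, 0.5]])   # C_2(1)

phi1 = C11 @ np.linalg.inv(C20)            # eqn. (46)
phi2 = C21 @ np.linalg.inv(C10)            # eqn. (47)

for v in (1, 2, 5, 10, 20):
    P = np.linalg.matrix_power(phi1 @ phi2, v - 1)
    gap = P @ C10 @ P.T                    # second term of eqn. (70)
    print(v, np.abs(gap).max())            # shrinks iff eigenvalues < 1
```

With these values the gap decays geometrically, illustrating the transition period that a random start avoids entirely.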
Aggregation

For a periodic process {X_{v,τ}} of period w, we consider the wth-order aggregation:

$$ Z_v = \sum_{\tau=1}^{w} X_{v,\tau}, \quad v = 1, 2, \ldots \tag{80} $$

For the special case of a two-period process {X_{v,1}, X_{v,2}}, we have Z_v = X_{v,1} + X_{v,2}, and if {X_{v,1}, X_{v,2}} follows a PAR_2(1) process with a random start,
then E(Z_v) = 0, and the autocovariance function of {Z_v} for lag k is

$$ E(Z_v Z_{v-k}^T) = E[(X_{v,1} + X_{v,2})(X_{v-k,1} + X_{v-k,2})^T] = C_1(2k) + C_2(2k) + C_1(2k-1) + C_2(2k+1), \quad k = 1, 2, \ldots; \; v = k+1, k+2, \ldots \tag{81a} $$

$$ E(Z_v Z_v^T) = C_1(0) + C_2(0) + C_2(1) + C_2^T(1), \quad v = 1, 2, \ldots \; (k = 0) \tag{81b} $$

where the autocovariance matrices on the right-hand side of eqn. (81a) can be obtained from eqns. (56)-(59). As the autocovariance function of {Z_v} depends only on the lag k, the aggregated process {Z_v} is stationary.

It follows from our previous discussion on aggregation that for second-order aggregation, the mean vector, the covariance matrix and the joint multivariate normality of {Z_v} can be reproduced by the PAR_2(1) process, but the lag-k autocovariance matrix of {Z_v} for k = 1, 2, … is reproducible only by higher-order PAR_2(p) processes with p ≥ 2k + 1. The results for covariance and autocovariance matrices of {Z_v} can readily be verified from eqns. (81a) and (81b). In eqn. (81b), the covariance matrix of Z_v is a function of the autocovariance matrices of {X_{v,τ}} up to lag one, which are reproducible by the PAR_2(1) process. In eqn. (81a), the autocovariance matrix of {Z_v} for lag k = 1, 2, … is a function of the autocovariance matrices of {X_{v,τ}} up to lag 2k + 1, which are reproducible only by PAR_2(p) processes with p ≥ 2k + 1. For example, if we wish to reproduce the lag-one autocovariance matrix of {Z_v}, in addition to its covariance matrix and mean vector, then we should use at least a PAR_2(3) process. It also follows from our previous discussion on aggregation that the positive definiteness of the covariance matrix of (X_{v,2}, X_{v,1}) in eqn. (51b) guarantees the positive definiteness of the covariance matrix of Z_v.

It follows from our previous discussion of the fixed-start case that for a PAR_2(1) process with a fixed start, eqns. (81a) and (81b) for the autocovariance function of {Z_v} are applicable iff the periodic stationarity condition is satisfied, and then only after a certain transition period. Therefore the aggregated process for the fixed-start case, with the periodic stationarity condition satisfied, converges to that for the random-start case. Within the transition period, the autocovariance function of {Z_v} depends on the absolute time v, in addition to the lag k, and therefore the aggregated process {Z_v} is not stationary. Within the transition period, the autocovariance function of {Z_v} can be obtained from the right-hand sides of eqns. (81a) and
(81b) by replacing the autocovariance matrices C_τ(·) by their counterparts for the fixed-start case, such as eqns. (70)-(75) for the first two lags. A computational sketch of eqns. (81a) and (81b) follows.
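A minimal sketch (our own code) of the stationary aggregated autocovariances (81a) and (81b), building C_τ(k) from eqns. (56)-(59):

```python
import numpy as np

def Z_autocov(C10, C20, C11, C21, phi1, phi2, k):
    """E(Z_v Z_{v-k}^T) for the random-start PAR_2(1), eqns. (81a)/(81b)."""
    if k == 0:
        return C10 + C20 + C21 + C21.T                 # eqn. (81b)
    P12 = np.linalg.matrix_power(phi1 @ phi2, k)
    P21 = np.linalg.matrix_power(phi2 @ phi1, k)
    C1_2k = P12 @ C10                                  # eqn. (56)
    C2_2k = P21 @ C20                                  # eqn. (57)
    C1_2km1 = np.linalg.matrix_power(phi1 @ phi2, k - 1) @ C11  # eqn. (58)
    C2_2kp1 = P21 @ C21                                # eqn. (59)
    return C1_2k + C2_2k + C1_2km1 + C2_2kp1           # eqn. (81a)
```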
Generalizations to PAR_w(1) and PAR_w(p) processes
For an arbitrary period w = 2, 3, …, the PAR_w(1) process with a random start is defined as

$$ X_{1,1} = B_0 e_{1,1} \tag{82} $$

$$ X_{v,1} = \phi_1 X_{v-1,w} + B_1 e_{v,1}, \quad v = 2, 3, \ldots \tag{83} $$

$$ X_{v,\tau} = \phi_\tau X_{v,\tau-1} + B_\tau e_{v,\tau}, \quad v = 1, 2, \ldots; \; \tau = 2, 3, \ldots, w \tag{84} $$
The analysis of this process is similar in many ways to that of the PAR_2(1) process. The PAR_w(1) process is capable of generating jointly multivariate normal {X_{v,τ}} vectors with zero means, and with any specified covariance matrices C_τ(0) and lag-one autocovariance matrices C_τ(1), τ = 1, …, w. The fixed-start case, X_{1,1} = a, converges again to the random-start case provided that the eigenvalues of φ_wφ_{w−1} ⋯ φ_2φ_1 are all less than one in absolute value. This again is the same as the periodic stationarity condition for PAR_w(1) processes as obtained by Ula (1990).

The w-period AR(p) process, PAR_w(p), where p = 1, 2, …, with a random start can be defined as a generalization of the PAR_w(1) process, following our previous discussion of the general AR(p) process. We will not present a formulation of this process here because of the notational complexities involved. For the formulation and the moment equations of the PAR_w(p) process, the reader is referred to Salas and Pegram (1979). The results given there should be coupled with those for the transition period of the random start, such as eqns. (33) and (34) for the general AR(p) case. The PAR_w(p) process is capable of generating jointly multivariate normal {X_{v,τ}} vectors with zero means, with any specified covariance matrices C_τ(0), τ = 1, …, w, and with any specified autocovariance matrices C_τ(k) up to lag k = p. A further generalization of the PAR_w(p) process is the PAR_w(p_τ) process with periodically varying orders p_τ, τ = 1, …, w. The moment equations for this process can again be obtained following our discussion of the general AR(p) process. The PAR_w(p_τ) process is capable of reproducing joint multivariate normality, any specified periodic mean and covariance function, and any specified periodic autocovariance function up to backward lag p_τ at period τ. The determination of periodic stationarity conditions for PAR_w processes with fixed or periodically varying orders was discussed by Ula (1990).

If we consider the wth-order aggregation Z_v = X_{v,1} + ⋯ + X_{v,w} for the PAR_w(p) process, it follows from our previous discussion on general aggregation that the joint multivariate normality and the mean vector of {Z_v} are
reproducible by a PAR_w(p) process of any order, but the autocovariance matrix of {Z_v} for lag k = 0, 1, 2, … is reproducible only by PAR_w(p) processes of order p ≥ w(k + 1) − 1. For a random start, the aggregated process is stationary. For a fixed start, stationarity is achieved in the limit provided that the periodic stationarity condition for the original process is satisfied.

SUMMARY AND CONCLUSIONS
The present study investigates some aspects of data generation through multivariate autoregressive (AR) models. The main emphasis is on the preservation of the first- and second-order moments in the generation and the effect of initial values on these moments. The main conclusion of the study is the optimality of a random start for preservation of moments and for achieving stationarity.

The problem of preservation of moments is approached in a nontraditional way by starting with the initial values. For this purpose, general AR processes with a random start and with time-varying parameters are introduced to lay a foundation for the analysis of all types of AR processes. The moment equations for general AR processes are obtained, paying attention to some fine points such as the existence of certain inverses and the positive definiteness of some matrices. A simple symmetric solution is proposed for matrix B in matrix equations of the form BBᵀ = A which appear in the moment equations. It is shown that an AR process with a random start and with parameters obtained from the moment equations is capable of generating jointly multivariate normal vectors with any specified means and covariance matrices, and with any specified autocovariance matrices up to lag p, where p is the order of the process. With a random start, there is no transition period involved for achieving these moments, and the analysis does not require any type of stationarity.

The aggregation properties of general AR processes are also studied. The qth-order aggregation is defined as the sum of the original vectors over consecutive sets of size q. It is shown that the joint multivariate normality and the mean function of the aggregated process can be reproduced by an AR process of any order, but the autocovariance function of the aggregated process for lag k = 0, 1, 2, … is reproducible only by AR processes of order p ≥ q(k + 1) − 1.

As a special case of general AR processes, the periodic autoregressive (PAR) processes are considered in the second half of the paper. The two-period first-order PAR model, PAR_2(1), is studied in some detail. It is shown that the PAR_2(1) process with a random start is capable of generating jointly
multivariate normal vectors with any specified periodic mean, covariance and lag-one autocovariance function. Again, with a random start, no transition period is involved in the achievement of these moments. The periodic autocovariance function of this process is obtained for all lags and is shown to depend only on the period and the lag. Therefore the PAR_2(1) process with a random start is always periodic (covariance) stationary.

The PAR_2(1) process with a fixed start is also studied, to obtain its mean function and the autocovariance function up to lag two. It is shown that the moments of this process depend on the absolute time, in addition to the period and the lag, and therefore the process is not periodic stationary. This dependence diminishes with time, and periodic stationarity is realized if the AR parameters satisfy certain conditions, which we refer to as the periodic stationarity condition. In that case, the PAR_2(1) process with a fixed start converges to that with a random start, with the same mean and autocovariance function, but only after a certain transition period. This proves the superiority of a random start over a fixed start.

The aggregation of the PAR_2(1) process is also considered. For a random start, the aggregated process is stationary. For a fixed start, stationarity is achieved in the limit provided that the periodic stationarity condition for the PAR_2(1) process is satisfied. In that case, the aggregated process with a fixed start converges to that with a random start with the same mean and autocovariance function.

The extensions of the PAR_2(1) process to the arbitrary w-period case, PAR_w(1), the pth-order case, PAR_w(p), and the periodically varying order case, PAR_w(p_τ), are also briefly discussed.

REFERENCES

Box, G.E.P. and Jenkins, G.M., 1976. Time Series Analysis, Forecasting and Control, 2nd edn. Holden-Day, San Francisco, CA, 575 pp.
Cullen, C.G., 1972. Matrices and Linear Transformations, 2nd edn. Addison-Wesley, Reading, MA, 330 pp.
Fuller, W.A., 1976. Introduction to Statistical Time Series. Wiley, New York, 480 pp.
Graybill, F.A., 1969. Introduction to Matrices with Applications in Statistics. Wadsworth, Belmont, CA, 382 pp.
Graybill, F.A., 1983. Matrices with Applications in Statistics, 2nd edn. Wadsworth, Belmont, CA, 473 pp.
Kshirsagar, A.M., 1972. Multivariate Analysis. Marcel Dekker, New York, 552 pp.
Matalas, N.C., 1967. Mathematical assessment of synthetic hydrology. Water Resour. Res., 3(4): 937-945.
Morrison, D.F., 1976. Multivariate Statistical Methods, 2nd edn. McGraw-Hill, New York, 431 pp.
Pegram, G.G.S. and James, W., 1972. Multilag multivariate autoregressive model for the generation of operational hydrology. Water Resour. Res., 8(4): 1074-1076.
Pereira, M.V.F., Oliveira, G.C., Costa, C.C.G. and Kelman, J., 1984. Stochastic streamflow models for hydroelectric systems. Water Resour. Res., 20(3): 379-390.
Rao, C.R., 1973. Linear Statistical Inference and its Applications, 2nd edn. Wiley, New York, 647 pp.
Salas, J.D., 1972. Range analysis for storage problems of periodic-stochastic processes. Colo. State Univ., Fort Collins, Hydrol. Paper 57, 89 pp.
Salas, J.D. and Pegram, G.G.S., 1979. A seasonal multivariate multilag autoregressive model in hydrology. In: H.J. Morel-Seytoux, J.D. Salas, T.G. Sanders and R.E. Smith (Editors), Hydrologic Processes. Proceedings of Third International Hydrology Symposium on Theoretical and Applied Hydrology, Colorado State University, Fort Collins, CO, July 1977. Water Resources Publications, Littleton, CO, pp. 125-145.
Thomas, H.A. and Fiering, M.B., 1962. Mathematical synthesis of streamflow sequences for the analysis of river basins by simulation. In: A. Maass, M.M. Hufschmidt, R. Dorfman, H.A. Thomas, Jr., S.A. Marglin and G.M. Fair (Editors), Design of Water-Resource Systems. Harvard University Press, Cambridge, MA, pp. 459-493.
Ula, T.A., 1990. Periodic covariance stationarity of multivariate periodic autoregressive moving average processes. Water Resour. Res., 26(5): 855-861.
Vecchia, A.V., 1985. Periodic autoregressive-moving average (PARMA) modeling with applications to water resources. Water Resour. Bull., 21(5): 721-730.
Yagil, S., 1963. Generation of input data for simulations. IBM Syst. J., 2: 288-296.
Yevdjevich, V.M., 1964. Fluctuations of wet and dry years, Part II, Analysis by serial correlation. Colo. State Univ., Fort Collins, Hydrol. Paper 4, 60 pp.
Young, G.K., 1968. Discussion of mathematical assessment of synthetic hydrology. Water Resour. Res., 4(3): 681-682.
Young, G.K. and Pisano, W.C., 1968. Operational hydrology using residuals. J. Hydraul. Div. ASCE, 94(HY4): 909-923.