Mechanical Systems and Signal Processing (1998) 12(6), 825–838 Article No. pg980179
MODAL PARAMETER ESTIMATION AND MODEL ORDER SELECTION OF A RANDOMLY VIBRATING SYSTEM

J. L

University of Franche-Comté, L.M.A.R.C., Rue de L'Epitaphe, 25030 Besançon, France
(Received December 1997, accepted after revisions June 1998)

A vibrating system was excited by a random force and only the multi-output responses were measured. A state–space model of the sensor outputs was then considered, consisting of the state equation and the observation equation. The modal parameters were obtained from the transition matrix of the vibrating system. Three methods for determining the transition matrix are presented, using a shift property of a block Hankel matrix of the covariances calculated from the data, and shift properties of the observability matrix and of the controllability matrix. The order of the state–space system is obtained from a test based on the ratio of the determinants of the innovations covariance matrices of the process, from models of two different sizes. This test exploits the distributional properties of the canonical correlation coefficients. A numerical example illustrates the procedure for identifying the order and parameters of a vibrating system.
© 1998 Academic Press
1. INTRODUCTION
The identification of the natural frequencies of vibration and the damping in various modes, from empirical test data only, is of fundamental engineering importance and has been under practical and research consideration by structural engineers for 30 years [1–5]. That effort was motivated by the fact that the damage done to particular structures under natural excitation or under an unknown external force might be identified from the relative damping and the natural frequencies of the vibrating system. Structural systems are very conveniently and frequently represented by a system of linear ordinary constant coefficient differential equations, characteristic of lumped mass–spring–damper systems excited by a random forcing function. Other research, motivated by the large deflections that structures are subject to under severe excitation, has considered non-linear springs and damping. This paper's attention is directed to the linear system situation in which the random excitation is approximated by a white noise and a multivariate vibration record is observed. It is assumed that the system under consideration can be described accurately in terms of a discrete n' degree of freedom lumped mass–spring–damper model excited by a white noise, and only the purely stochastic case is considered here, in which there are no measured inputs: the modal parameters are extracted from output data only. The resulting multivariate discrete time series is represented in a state–space form and the aim is to estimate the transition matrix (or state matrix) and then to transform this matrix into the damping and natural frequency parameters of the structural system. The crucial problem of order estimation is also considered. This paper is organised as follows. A model of a linear discrete vibrating system and its innovation state–space formulation is given in Section 2. It is shown in Section 3
that the transition matrix is obtained as the product of a shifted matrix (the block Hankel matrix or the observability matrix or the controllability matrix) with a pseudo-inverse. A method useful for determining the order (the number of states) of the state–space representation is described in Section 4. This method is based on the ratio of the determinants of the innovations covariance matrices, from models of two different sizes, and considers distributional properties of the canonical correlation coefficients. To illustrate the effectiveness of the procedure a numerical example is presented in Section 5. This paper is briefly concluded in Section 6.

2. MODELLING A VIBRATING SYSTEM AND STATE–SPACE INNOVATIONS REPRESENTATION
In this section it will be shown briefly how a continuous time structural system, described by a second-order multivariate differential equation, can be converted to an equivalent state–space realisation. Consider a multivariate continuous time structural dynamic system described by a positive definite diagonal mass matrix M_0, a symmetric semi-definite viscous damping matrix C_0 and a symmetric positive definite stiffness matrix K_0. These matrices are (n' × n'). The system is represented by the following linear differential equation

M_0 \ddot{\xi}(t) + C_0 \dot{\xi}(t) + K_0 \xi(t) = \eta(t)    (1)

and the observation equation

y(t) = L \xi(t) + v(t)    (2)

where ξ(t) is the (n' × 1) vector of displacements of the degrees of freedom; η(t) is the unmeasured excitation vector, simulated by a Gaussian white noise with zero mean and covariance matrix N; y(t) is the (m × 1) output observation vector; the matrix L (m × n') specifies which points of the system are observed, namely where the sensors are located; and v(t) is an additive noise disturbance, a vector of measurement errors simulated by a Gaussian white noise. The modal characteristics (μ, Ψ_μ) of the vibrating system are solutions of

|M_0 \mu^2 + C_0 \mu + K_0| = 0; \qquad (M_0 \mu^2 + C_0 \mu + K_0)\Psi_\mu = 0    (3)
where | · | represents a determinant. The system described by equations (1) and (2) is equivalent to the following continuous time state–space model:

\dot{x}(t) = \bar{A} x(t) + \bar{B} \eta(t)    (4)

y(t) = C x(t) + v(t)    (5)

where x(t) is the state vector of dimension 2n' = n,

x(t) = \begin{bmatrix} \xi(t) \\ \dot{\xi}(t) \end{bmatrix}    (6)

and \bar{A}, \bar{B}, C are the (2n' × 2n'), (2n' × n'), (m × 2n') matrices

\bar{A} = \begin{bmatrix} 0 & I \\ -M_0^{-1} K_0 & -M_0^{-1} C_0 \end{bmatrix}; \qquad \bar{B} = \begin{bmatrix} 0 \\ M_0^{-1} \end{bmatrix}; \qquad C = [L \quad 0].    (7)
After sampling with constant period Δt and transformation of the 2n' first-order differential equations (4) into a discrete time equation, one obtains the following discrete time state–space model (see Appendix A):

x_{k+1} = A x_k + u_k    (8)

and the discrete time observation equation

y_k = C x_k + v_k    (9)

where x_k represents the unobserved state vector of dimension n; A = e^{\bar{A}\Delta t} is the (n × n) discrete time state–space matrix, or discrete time transition matrix, and u_k is given by

u_k = \int_0^{\Delta t} e^{\bar{A}s} \bar{B}\, \eta(t - s)\, ds.    (10)
The excitation η(t) is a random process and is not constant over the sampling period Δt, so u_k cannot be obtained directly by integration of equation (10). Under the assumption that η(t) is a zero-mean white noise process, the sequence u_k of n-dimensional random vectors is also zero-mean white noise with covariance matrix

U = \int_0^{\Delta t} e^{\bar{A}s} \bar{N} e^{\bar{A}'s}\, ds    (11)

where \bar{N} = \bar{B} N \bar{B}'. The generation of the equivalent white noise sequence u_k given the covariance matrix U is described in Appendix A. Note that u_k and v_k are mutually uncorrelated, the (m × n) matrix C is called the observation matrix and y_k is an (m × 1) vector of centred observations. The eigenvalues λ and eigenvectors Φ_λ of the discrete time transition matrix A are related to the modal characteristics (3) by

\lambda = e^{\mu \Delta t} \qquad \text{and} \qquad C \Phi_\lambda = L \Psi_\mu.    (12)
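Given the eigenvalues λ_i of an estimated transition matrix, the relations (13) and (14) below convert them to natural frequencies and damping ratios. A minimal sketch of this conversion in Python with NumPy (the function name and interface are illustrative, not from the paper):

```python
import numpy as np

def modal_parameters(eigvals, dt):
    """Convert eigenvalues of the discrete transition matrix A into
    natural frequencies (Hz) and damping ratios.  Complex eigenvalues
    come in conjugate pairs; one member of each pair suffices."""
    freqs, dampings = [], []
    for lam in eigvals:
        if lam.imag <= 0:                 # keep one member of each conjugate pair
            continue
        a = np.log(lam * lam.conjugate()).real            # = 2 Re(mu) dt
        b = np.arccos((lam + lam.conjugate()).real / (2.0 * abs(lam)))
        wdt = np.sqrt(a ** 2 / 4.0 + b ** 2)              # = |mu| dt
        freqs.append(wdt / (2.0 * np.pi * dt))
        dampings.append(-a / np.sqrt(a ** 2 + 4.0 * b ** 2))
    return np.array(freqs), np.array(dampings)
```

For λ = e^{μΔt} with μ = −ζω + iω√(1−ζ²), the quantities `a` and `b` recover −2ζωΔt and ωΔt√(1−ζ²) respectively, so the returned values are f = ω/2π and ζ.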
The global modal parameters, the natural frequencies f_i and damping ratios ζ_i of the vibrating system, can be determined by [2]:

f_i = \frac{1}{2\pi\Delta t} \left[ \frac{[\ln(\lambda_i \lambda_i^*)]^2}{4} + \left[ \cos^{-1}\left( \frac{\lambda_i + \lambda_i^*}{2\sqrt{\lambda_i \lambda_i^*}} \right) \right]^2 \right]^{1/2}    (13)

\zeta_i = \frac{-\ln(\lambda_i \lambda_i^*)}{\left[ [\ln(\lambda_i \lambda_i^*)]^2 + 4 \left[ \cos^{-1}\left( \frac{\lambda_i + \lambda_i^*}{2\sqrt{\lambda_i \lambda_i^*}} \right) \right]^2 \right]^{1/2}}    (14)

for i = 1, 2, ..., n'. Since the state vector x_k of dimension n = 2n' is not observable, one must estimate it and transform equations (8) and (9) into a state–space innovation form before the discrete time transition matrix A can be obtained. One estimates x_k by its orthogonal projection onto the subspace spanned by the stacked vector of past data vectors, noted y_{k-1}^- = [y'_{k-1}, y'_{k-2}, ...]'. Here, the superscript ' denotes the transpose operation and the notation z_k = E[x_k | y_{k-1}^-] denotes this orthogonal
projection. To obtain z_{k+1}, the time index is advanced by 1 in z_k and properties of the orthogonal projection [6] are used:

z_{k+1} = E[x_{k+1} | y_k^-] = E[x_{k+1} | y_{k-1}^-, e_k] = E[x_{k+1} | y_{k-1}^-] + E[x_{k+1} | e_k] = A E[x_k | y_{k-1}^-] + B e_k = A z_k + B e_k    (15)

where e_k = y_k − E[y_k | y_{k-1}^-] is called the innovation component in the data vector y_k. It is an (m × 1) stochastic vector with zero mean and covariance matrix Q = E[e_k e'_k]. Note that e_k is uncorrelated with y_{k-1}^-, and hence with z_k by construction, and that the elements of {e_k} are serially uncorrelated. z_k is an (n × 1) vector of unobservable states that are minimal sufficient statistics (optimal carriers of information) for the history of the process y_k. The (n × m) matrix B is called the Kalman gain. Since

E[x_{k+1} | e_k] = E[x_{k+1} e'_k](E[e_k e'_k])^{-1} e_k    (16)

one has

B = E[x_{k+1} e'_k](E[e_k e'_k])^{-1}.    (17)

Note that by assumption, E[u_k | y_{k-1}^-] = 0 and E[v_k | y_{k-1}^-] = 0. Using the definition of e_k and equation (9), the discrete time observation equation is obtained, which contains the innovation vector:

y_k = E[C x_k + v_k | y_{k-1}^-] + e_k = C E[x_k | y_{k-1}^-] + e_k = C z_k + e_k.    (18)
The set of equations (8) and (9) is now replaced by

z_{k+1} = A z_k + B e_k,    (19)

y_k = C z_k + e_k.    (20)

Such a representation is called the innovation representation of the data generating process (8)–(9). The state–space innovation model in equations (19) and (20) is a suitable representation for the synthesis of samples of a regularly sampled random vibration system. Given a sequence of observations {y_k}, this paper now indicates how to determine the discrete time transition matrix A, which contains all the modal information of the vibrating system, and the order n of the system (the number of states in the model).

3. PARAMETER ESTIMATION
It is assumed in this section that the order n of the multivariate stochastic time series {y_t} is given. There are three theoretical covariance matrices associated with the states and outputs of an innovation model: the state covariance matrix,

P = E[z_k z'_k];    (21)

the cross-covariance matrix,

G = E[z_{k+1} y'_k];    (22)

and the auto covariance matrix,

R_i = E[y_{k+i} y'_k].    (23)

The (m × m) auto covariance matrix R_i can be written in terms of the innovations model parameters and G by substituting equations (19) and (20) into the definition of R_i:

R_i = E[y_{k+i} y'_k] = C A^{i-1} G \qquad \text{and} \qquad R_{-i} = R'_i    (24)
with i > 0. R_0 is determined separately. It will be convenient to stack the observations temporally via the (mf × 1) future and the (mp × 1) past data vectors defined as y_k^+ = [y'_k, y'_{k+1}, ..., y'_{k+f-1}]' and y_k^- = [y'_k, y'_{k-1}, ..., y'_{k-p+1}]'. Define the block auto covariance matrix H as

H = E(y_k^+ y_{k-1}^{-\prime}) = \begin{bmatrix} R_1 & R_2 & \cdots & R_p \\ R_2 & R_3 & \cdots & R_{p+1} \\ \vdots & & & \vdots \\ R_f & R_{f+1} & \cdots & R_{p+f-1} \end{bmatrix}.    (25)
H is a block Hankel matrix (mf × mp) (a block-band, counter-diagonal matrix). This block matrix may also be represented in terms of the model parameters A, C and the cross-covariance matrix G by substituting relation (24) for R_i into the block Hankel matrix (25):

H = \begin{bmatrix} CG & CAG & \cdots & CA^{p-1}G \\ CAG & CA^2G & \cdots & CA^pG \\ \vdots & & & \vdots \\ CA^{f-1}G & CA^fG & \cdots & CA^{p+f-2}G \end{bmatrix}.    (26)

This block Hankel matrix can be expressed as the outer product of two block vectors

H = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{f-1} \end{bmatrix} [G \quad AG \quad \cdots \quad A^{p-1}G] = O K    (27)
with O the (mf × n) observability matrix and K the (n × mp) controllability matrix associated with the triplet (A, B, G). The auto covariance matrices are estimated from T data points and computed by the maximum likelihood relation

\hat{R}_i = T^{-1} \sum_{k=1}^{T-i} y_{k+i} y'_k; \qquad i = 0, 1, \ldots, p + f.    (28)
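The estimation of the sample covariances (28) and the assembly of the block Hankel matrix (25) can be sketched as follows (Python with NumPy; the function name `block_hankel` is illustrative; the data matrix Y is assumed zero-mean with T > p + f):

```python
import numpy as np

def block_hankel(Y, f, p):
    """Build the (m*f x m*p) block Hankel matrix of sample covariances.
    Y is (m x T), one row per output channel, assumed zero-mean."""
    m, T = Y.shape
    # sample auto covariances R_i = (1/T) sum_k y_{k+i} y_k'
    R = [(Y[:, i:T] @ Y[:, 0:T - i].T) / T for i in range(p + f + 1)]
    H = np.zeros((m * f, m * p))
    for r in range(f):
        for c in range(p):
            H[r * m:(r + 1) * m, c * m:(c + 1) * m] = R[r + c + 1]  # block (r,c) = R_{r+c+1}
    return H
```

The left-shifted Hankel matrix used later simply places R_{r+c+2} in block (r, c) instead.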
The estimated block Hankel matrix is noted Ĥ (the hat indicates an estimate). In order to estimate the discrete time transition matrix A, two matrix decompositions of Ĥ are employed: the singular value decomposition and the factorisation of Ĥ into its observability and controllability matrices,

\hat{H} = \hat{U} \hat{S} \hat{V}' = \hat{O} \hat{K}    (29)

with \hat{U}'\hat{U} and \hat{V}'\hat{V} identity matrices and \hat{S} a diagonal matrix of singular values ordered in decreasing order. When estimating a model with n states, only the n largest singular values are considered, setting the smallest singular values to zero. The approximation to the block Hankel matrix Ĥ, based on the n largest singular values, is denoted by Ĥ_n. Then one obtains

\hat{H}_n = \hat{U}_n \hat{S}_n \hat{V}'_n = (\hat{U}_n \hat{S}_n^{1/2})(\hat{S}_n^{1/2} \hat{V}'_n) = \hat{O}_n \hat{K}_n    (30)
where the subscripts indicate that only the n largest singular values and associated singular vectors are included in the approximation. The two factors are the approximate observability matrix and the approximate controllability matrix:

\hat{O}_n = \hat{U}_n \hat{S}_n^{1/2} \qquad \text{and} \qquad \hat{K}_n = \hat{S}_n^{1/2} \hat{V}'_n    (31)

with pseudo-inverses

\hat{O}_n^{+} = \hat{S}_n^{-1/2} \hat{U}'_n \qquad \text{and} \qquad \hat{K}_n^{+} = \hat{V}_n \hat{S}_n^{-1/2}    (32)

where the superscript + denotes a pseudo-inverse matrix. To estimate A it is necessary to introduce a left-shifted version of the block Hankel matrix, defined as
H^{\uparrow} = \begin{bmatrix} R_2 & R_3 & \cdots & R_{p+1} \\ R_3 & R_4 & \cdots & R_{p+2} \\ \vdots & & & \vdots \\ R_{f+1} & R_{f+2} & \cdots & R_{p+f} \end{bmatrix} = \begin{bmatrix} CAG & CA^2G & \cdots & CA^pG \\ CA^2G & CA^3G & \cdots & CA^{p+1}G \\ \vdots & & & \vdots \\ CA^fG & CA^{f+1}G & \cdots & CA^{p+f-1}G \end{bmatrix}.    (33)

The same left-shifted matrix can be obtained by multiplying O by A and K:

H^{\uparrow} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{f-1} \end{bmatrix} A \, [G \quad AG \quad \cdots \quad A^{p-1}G] = O A K.    (34)
The estimate of A is obtained by applying the pseudo-inverses of \hat{K}_n and \hat{O}_n, yielding

\hat{A} = \hat{O}_n^{+} \hat{H}^{\uparrow} \hat{K}_n^{+} = \hat{S}_n^{-1/2} \hat{U}'_n \hat{H}^{\uparrow} \hat{V}_n \hat{S}_n^{-1/2}.    (35)
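Equation (35) can be sketched numerically as follows (Python with NumPy; the function name `transition_from_hankel` is illustrative; H and H_shift are the estimated Hankel matrix and its left-shifted version, built from the covariances of eq (28)):

```python
import numpy as np

def transition_from_hankel(H, H_shift, n):
    """Estimate the transition matrix A from the block Hankel matrix H and
    its left-shifted version H_shift, following eq (35):
    A = S_n^{-1/2} U_n' H_shift V_n S_n^{-1/2}, retaining only the n
    largest singular values of H."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Un, sn, Vn = U[:, :n], s[:n], Vt[:n, :].T
    s_inv_sqrt = np.diag(1.0 / np.sqrt(sn))
    return s_inv_sqrt @ Un.T @ H_shift @ Vn @ s_inv_sqrt
```

Applied to noise-free covariances R_i = C A^{i-1} G, the estimate is similar (in the matrix sense) to the true A and therefore reproduces its eigenvalues, hence the modal parameters.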
Another method to obtain an estimate of the discrete time transition matrix is derived from the following property, deduced from the structure of the observability matrix O. Let O^{\uparrow} be the m(f−1) × n matrix obtained from O by deleting the last block row of O and let O^{\downarrow} be the m(f−1) × n matrix obtained from O by deleting the first block row of O. Therefore

O^{\uparrow} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{f-2} \end{bmatrix} \qquad \text{and} \qquad O^{\downarrow} = \begin{bmatrix} CA \\ CA^2 \\ \vdots \\ CA^{f-1} \end{bmatrix}    (36)

and one obtains

O^{\uparrow} A = O^{\downarrow}.    (37)

The estimate of A is then

\hat{A} = (\hat{O}_n^{\uparrow})^{+} \hat{O}_n^{\downarrow}.    (38)

Using equation (31) and the properties of the shift operator (see Appendix B),

\hat{O}_n^{\uparrow} = (\hat{U}_n^{\uparrow}) \hat{S}_n^{1/2} \qquad \text{and} \qquad \hat{O}_n^{\downarrow} = (\hat{U}_n^{\downarrow}) \hat{S}_n^{1/2}    (39)

and thus equation (38) becomes

\hat{A} = [(\hat{U}_n^{\uparrow}) \hat{S}_n^{1/2}]^{+} (\hat{U}_n^{\downarrow}) \hat{S}_n^{1/2} = \hat{S}_n^{-1/2} (\hat{U}_n^{\uparrow})^{+} (\hat{U}_n^{\downarrow}) \hat{S}_n^{1/2}    (40)
where \hat{U}_n^{\uparrow} denotes the matrix obtained by deleting the last m rows of \hat{U}_n, and \hat{U}_n^{\downarrow} the matrix obtained by deleting the first m rows of \hat{U}_n. Consider now the structure of the controllability matrix K. Let K^{\leftarrow} be the n × m(p−1) matrix obtained from K by deleting the last block column of K and let K^{\rightarrow} be the n × m(p−1) matrix obtained from K by deleting the first block column of K. Hence

K^{\leftarrow} = [G \quad AG \quad \cdots \quad A^{p-2}G] \qquad \text{and} \qquad K^{\rightarrow} = [AG \quad A^2G \quad \cdots \quad A^{p-1}G]    (41)

and

A K^{\leftarrow} = K^{\rightarrow}    (42)

can be formed. Then one obtains an estimate of A:

\hat{A} = \hat{K}_n^{\rightarrow} (\hat{K}_n^{\leftarrow})^{+}.    (43)

Using (31),

\hat{K}_n^{\leftarrow} = \hat{S}_n^{1/2} (\hat{V}'_n)^{\leftarrow} \qquad \text{and} \qquad \hat{K}_n^{\rightarrow} = \hat{S}_n^{1/2} (\hat{V}'_n)^{\rightarrow}    (44)

and thus equation (43) becomes

\hat{A} = \hat{S}_n^{1/2} (\hat{V}'_n)^{\rightarrow} [\hat{S}_n^{1/2} (\hat{V}'_n)^{\leftarrow}]^{+} = \hat{S}_n^{1/2} (\hat{V}'_n)^{\rightarrow} [(\hat{V}'_n)^{\leftarrow}]^{+} \hat{S}_n^{-1/2}    (45)

where (\hat{V}'_n)^{\leftarrow} denotes the matrix obtained by deleting the last m columns of \hat{V}'_n and (\hat{V}'_n)^{\rightarrow} the matrix obtained by deleting the first m columns of \hat{V}'_n. Using the properties of the shift operator (see Appendix B) one also obtains

\hat{A} = \hat{S}_n^{1/2} (\hat{V}_n^{\downarrow})' [(\hat{V}_n^{\uparrow})']^{+} \hat{S}_n^{-1/2}    (46)
where \hat{V}_n^{\downarrow} denotes the matrix obtained by deleting the first m rows of \hat{V}_n and \hat{V}_n^{\uparrow} the matrix obtained by deleting the last m rows of \hat{V}_n. With estimates of the discrete time transition matrix A in hand, an estimate of the modal parameters of the vibrating system can be obtained from equations (12), (13) and (14). Before the matrix A in the model can be estimated, one must address the issue of order estimation. A procedure using the covariance matrix Q of the innovations from models of different sizes is presented in the next section.
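The two shift-based estimates, equations (38) and (43), might be sketched as follows (Python with NumPy; the function names are illustrative; H is the estimated block Hankel matrix and m the number of output channels):

```python
import numpy as np

def transition_from_observability(H, n, m):
    """Observability shift, eqs (37)-(38): O_up A = O_down, with
    O_n = U_n S_n^{1/2} from the SVD of H.  O_up drops the last m rows
    (one block row) of O_n, O_down drops the first m rows."""
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    O = U[:, :n] * np.sqrt(s[:n])              # O_n = U_n S_n^{1/2}
    return np.linalg.pinv(O[:-m, :]) @ O[m:, :]

def transition_from_controllability(H, n, m):
    """Controllability shift, eqs (42)-(43): A K_left = K_right, with
    K_n = S_n^{1/2} V_n' from the SVD of H.  K_left drops the last m
    columns (one block column) of K_n, K_right drops the first m."""
    _, s, Vt = np.linalg.svd(H, full_matrices=False)
    K = np.sqrt(s[:n])[:, None] * Vt[:n, :]    # K_n = S_n^{1/2} V_n'
    return K[:, m:] @ np.linalg.pinv(K[:, :-m])
```

Both estimates are similar to the true transition matrix on noise-free covariances, so either can be transformed to the modal parameters via equations (13) and (14).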
4. ORDER SELECTION
Based on the ratio of the determinants of the covariance matrices of the innovations from models of two different sizes, a test is developed to determine the order n of the state–space model (the number of states in the model) from output data only. The first step in the development of the procedure is to write the (mf × 1) stacked vector of future observations as

y_k^+ = O z_k + F e_k^+    (47)

where O is the observability matrix of equation (27), e_k^+ is the corresponding (mf × 1) stacked vector of innovations e_k^+ = [e'_k, e'_{k+1}, ..., e'_{k+f-1}]' and F is the (mf × mf) lower block triangular matrix given by

F = \begin{bmatrix} I_m & 0_m & \cdots & 0_m \\ CB & I_m & \cdots & 0_m \\ \vdots & & \ddots & \vdots \\ CA^{f-2}B & \cdots & CB & I_m \end{bmatrix}.    (48)

Next R_+, the (mf × mf) covariance matrix of the stacked vector of future observations, is defined and the determinant of the innovations covariance matrix found:

R_+ = E[y_k^+ y_k^{+\prime}] = O E[z_k z'_k] O' + F E[e_k^+ e_k^{+\prime}] F' = O P O' + F (I_f \otimes Q) F'    (49)

|R_+ - O P O'| = |F (I_f \otimes Q) F'| = |F|^2 |I_f \otimes Q| = |F|^2 |Q|^f.    (50)

Since |F| = 1, one has the determinant of the innovations covariance matrix raised to the power f:

|Q|^f = |R_+ - O P O'|.    (51)
It is shown in Appendix C that the state covariance matrix P takes the form

P = E[z_k z'_k] = K R_-^{-1} K'    (52)

and then

|Q|^f = |R_+ - O K R_-^{-1} K' O'| = |R_+ - H R_-^{-1} H'|.    (53)

In the next step, let V_j be the ratio of the determinants of the innovation covariances of a model with mf states (the largest possible number of assumed states) and a model with j states (mf > j), raised to the power f. This can be written as

V_j = (|Q| / |Q_j|)^f = |R_+ - H R_-^{-1} H'| \,/\, |R_+ - H_j R_-^{-1} H'_j|    (54)

or, as shown in Appendix D,

V_j = \frac{|I - (R_+^{-1/2} H R_-^{-1/2})(R_+^{-1/2} H R_-^{-1/2})'|}{|I - (R_+^{-1/2} H_j R_-^{-1/2})(R_+^{-1/2} H_j R_-^{-1/2})'|}.    (55)

Let the singular value decomposition of the scaled block Hankel matrix R_+^{-1/2} H R_-^{-1/2} be

R_+^{-1/2} H R_-^{-1/2} = \bar{P} \Gamma \bar{Q}'    (56)

where \bar{P} and \bar{Q} are orthonormal matrices and Γ is the (mf × mf) diagonal matrix which contains the singular values of R_+^{-1/2} H R_-^{-1/2}, or the canonical correlation coefficients between the past y_{k-1}^- and the future y_k^+ [6]:

\Gamma = \text{diag}(\gamma_1, \gamma_2, \ldots, \gamma_j, \gamma_{j+1}, \ldots, \gamma_{mf}) = (\Gamma_1, \Gamma_2)    (57)

with

\Gamma_1 = \text{diag}(\gamma_1, \gamma_2, \ldots, \gamma_j) \qquad \text{and} \qquad \Gamma_2 = \text{diag}(\gamma_{j+1}, \gamma_{j+2}, \ldots, \gamma_{mf}).    (58)

It is understood that the canonical correlation coefficients γ_i are arranged in decreasing order.
As shown in Appendix D,

V_j = \frac{|I - \Gamma^2|}{|I - \Gamma_1^2|} = |I - \Gamma_2^2|    (59)

where Γ_2 is the (mf − j) diagonal matrix of canonical correlation coefficients left out of the approximate model with j states. V_j can be interpreted as a test for the joint significance of the excluded canonical correlation coefficients. Aoki [6], Kshirsagar [7], and Dorfman and Havenner [8] showed that a certain constant multiplied by the logarithm of the determinant involving the excluded canonical correlations is asymptotically distributed as a chi-squared random variable, which can be used to test sequentially the significance of the choice of state–space dimension. The test statistic for the (mf − j) smallest canonical correlation coefficients is

C_j = -[T - j - (p + f)m/2 - 1/2] \ln(V_j) = -[T - j - (p + f)m/2 - 1/2] \ln |I - \text{diag}(\gamma_{j+1}, \gamma_{j+2}, \ldots, \gamma_{mf})^2|, \qquad j = 1, 2, \ldots, mf - 1    (60)

and is asymptotically chi-squared with (mf − j)(mp − j) degrees of freedom. In other words, the null hypothesis γ_{j+1} ≈ 0, γ_{j+2} ≈ 0, ..., γ_{mf} ≈ 0 is tested by the statistic C_j, which is distributed asymptotically as χ² with (mf − j)(mp − j) degrees of freedom. The model order estimation procedure using C_j involves the following steps: (1) begin with the largest possible model (mf > n); (2) compute C_j for each j = mf − 1, mf − 2, ...; (3) the procedure terminates when the test statistic C_j is large enough to reject the null hypothesis (in general when C_j ≫ C_{j+1}); (4) the order of the model is then j + 1 = n. When C_j causes the rejection of the null hypothesis, the order of the model j + 1 = n should be specified. Indeed, the order of the model is equal to (j + 1) because, when the null hypothesis is rejected, the (j + 1)th canonical correlation coefficient, the largest of the (mf − j) smallest canonical correlation coefficients excluded, is statistically significant [see equation (60)]. Note that the order determination can be computed without estimating all the various models and obtaining their innovation covariance matrices. Indeed, consider equation (54) and Ĥ_j, the approximate Hankel matrix formed by setting all the singular values smaller than σ_j equal to zero. The estimate of C_j is then

\hat{C}_j = -[T - j - (p + f)m/2 - 1/2] \ln \left( \frac{|\hat{R}_+ - (\hat{U}\hat{S}\hat{V}')\hat{R}_-^{-1}(\hat{V}\hat{S}\hat{U}')|}{|\hat{R}_+ - (\hat{U}_j\hat{S}_j\hat{V}'_j)\hat{R}_-^{-1}(\hat{V}_j\hat{S}_j\hat{U}'_j)|} \right)    (61)

where \hat{U}, \hat{V} and \hat{S} are defined in equation (29) and \hat{U}_j, \hat{V}_j and \hat{S}_j are defined by equation (30).

5. NUMERICAL EXAMPLE
To illustrate the procedure for order determination and modal parameter estimation, an elementary three degree of freedom system is considered. The natural frequencies and damping ratios of the simulated system are f_i = {0.957; 1.564; 2.531} Hz and ζ_i = {0.0384; 0.0542; 0.0826}. The vibrating system is excited by a random force, and corresponds to the situation where the inputs cannot be measured. However, they can be approximated as a white-noise sequence. Only the responses, obtained from three sensors (three channels: m = 3), are used. Consider the case where f = p = 3 and the number of data points per channel is T = 1000. The order is estimated using the C_j test for nine different orders (mf = 9). The results are shown in Table 1.

Table 1. C_j test for order selection

j     1        2        3        4        5        6      7      8
C_j   5.8e3    3.9e3    2.1e3    1.1e3    2.9e2    5.5    0.51   0.026

It is clear that C_5 ≫ C_6, and the null hypothesis is rejected for C_5, so the order of the state–space model is 6. The estimated natural frequencies and damping ratios of the vibrating system, excited by a random force, are obtained from equations (13) and (14) using the eigenvalues of the estimated transition matrix Â (35). One obtains f̂_i = {0.955; 1.576; 2.501} Hz and ζ̂_i = {0.0379; 0.0584; 0.0853}. Note that these results are obtained from only one simulation run; improved results can be obtained by ensemble averaging over several independent realisations.
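The canonical correlations and the C_j statistics of equation (60) underlying Table 1 can be sketched as follows (Python with NumPy; the function names are illustrative; Y holds the zero-mean multi-output record, one row per channel):

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cj_statistics(Y, f, p):
    """Canonical correlations between past and future, and the C_j test
    statistics of eq (60).  Y is (m x T), assumed zero-mean."""
    m, T = Y.shape
    ks = range(p, T - f + 1)   # time indices with full past and future windows
    Yf = np.vstack([np.hstack([Y[:, k + i] for i in range(f)]) for k in ks]).T
    Yp = np.vstack([np.hstack([Y[:, k - 1 - i] for i in range(p)]) for k in ks]).T
    N = Yf.shape[1]
    Rff, Rpp, Rfp = Yf @ Yf.T / N, Yp @ Yp.T / N, Yf @ Yp.T / N
    # singular values of the scaled Hankel matrix = canonical correlations, eq (56)
    g = np.linalg.svd(inv_sqrt(Rff) @ Rfp @ inv_sqrt(Rpp), compute_uv=False)
    g = np.clip(g, 0.0, 1.0 - 1e-12)
    Cj = np.array([-(T - j - (p + f) * m / 2 - 0.5)
                   * np.sum(np.log(1.0 - g[j:] ** 2))
                   for j in range(1, m * f)])
    return g, Cj
```

Each C_j would then be compared with a chi-squared quantile with (mf − j)(mp − j) degrees of freedom, stopping at the first j where the null hypothesis is rejected.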
6. CONCLUSION
This paper has considered a discrete time state–space approach for the identification of a vibrating system driven by a random force. The proposed procedure is applied to multi-output data and gives satisfactory results in the case where the input forces are not available, but can be approximated as white noise. The discrete time transition matrix, which contains all modal information, has been obtained using three methods. These methods exploit properties of a shift operator applied to the block Hankel matrix, to the observability matrix, and to the controllability matrix. The crucial problem of order estimation has been considered using a test on the ratio of the determinants of the innovation covariance matrices, from models of two different sizes. This test exploits distributional properties of the canonical correlation coefficients and is asymptotically chi-squared. The effectiveness of the methods in modal parameter estimation and in model order selection has been demonstrated using a numerical example of three degrees of freedom. The application of these methods to a continuous system using experimental data is under investigation.
REFERENCES

1. W. Gersch, N. N. Nielsen and H. Akaike 1973 Journal of Sound and Vibration 31, 295–308. Maximum likelihood estimation of structural parameters from random vibration data.
2. S. H, Y. B. C and S. M. W 1989 Proceedings of the 12th Biennial Conference on Mechanical Vibration and Noise, 259–265. Multi-output modal parameter identification by vector time series modeling.
3. M. Basseville, A. Benveniste, B. Gach-Devauchelle, M. Goursat, D. Bonnecase, P. Dorey, M. Prevosto and M. Olagnon 1993 Mechanical Systems and Signal Processing 7, 401–423. Damage monitoring in vibration mechanics: issues in diagnostics and predictive maintenance.
4. J. N. Juang 1994 Applied System Identification. Prentice-Hall.
5. P. H. Kirkegaard and P. Andersen 1997 Proceedings of the 15th IMAC, 889–895. State–space identification of civil engineering structures from output measurements.
6. M. Aoki 1990 State Space Modeling of Time Series. Berlin: Springer-Verlag.
7. A. M. Kshirsagar 1972 Multivariate Analysis. Marcel Dekker.
8. J. H. Dorfman and A. Havenner 1995 Communications in Statistics: Theory and Methods 24, 97–119. Model specification tests for balanced representation state–space models.
9. J. R. Magnus and H. Neudecker 1994 Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley.
APPENDIX A

Let x(t) be the stationary stochastic process defined by

\dot{x}(t) = \bar{A} x(t) + \bar{B} \eta(t)

with

\bar{A} = \begin{bmatrix} 0 & I \\ -M_0^{-1} K_0 & -M_0^{-1} C_0 \end{bmatrix} \qquad \text{and} \qquad \bar{B} = \begin{bmatrix} 0 \\ M_0^{-1} \end{bmatrix}.

The solution of this equation, starting at time t_0, can be written as

x(t) = e^{\bar{A}(t - t_0)} x(t_0) + \int_{t_0}^{t} e^{\bar{A}(t - \tau)} \bar{B} \eta(\tau)\, d\tau.

This expression can be written at discrete time intervals equally spaced at times 0, Δt, 2Δt, ..., kΔt, ..., by making the substitutions t_0 = kΔt and t = (k + 1)Δt, where Δt is the sampling interval and k is the time index. This yields

x((k + 1)\Delta t) = e^{\bar{A}\Delta t} x(k\Delta t) + \int_{k\Delta t}^{(k+1)\Delta t} e^{\bar{A}[(k+1)\Delta t - \tau]} \bar{B} \eta(\tau)\, d\tau.

Now x_{k+1} is used to denote x[(k + 1)Δt], and so on, and the integration variable is changed to s = (k + 1)Δt − τ, so the following discrete time expression is obtained:

x_{k+1} = e^{\bar{A}\Delta t} x_k + \int_0^{\Delta t} e^{\bar{A}s} \bar{B} \eta(t - s)\, ds,

that is, x_{k+1} = A x_k + u_k, where A = e^{\bar{A}\Delta t} and u_k = \int_0^{\Delta t} e^{\bar{A}s} \bar{B} \eta(t - s)\, ds. Because the excitation η(t) is a random process, it cannot be assumed to be constant over any time interval Δt. In other words, the quantity η(t) is unknown and uncomputable, and the integration giving u_k cannot be performed directly. A general method to determine u_k is now presented. It is assumed that η(t) is a zero-mean, white-noise process with covariance matrix E[η(t)η(r)'] = N δ(t − r), where δ(t − r) is the Dirac delta function with unit value at t = r and zero elsewhere. To compute u_k its covariance matrix is required, which is given by

U = E[u_k u'_k] = E\left[ \int_0^{\Delta t} \int_0^{\Delta t} e^{\bar{A}s_1} \bar{B} \eta(t - s_1) \eta'(t - s_2) \bar{B}' e^{\bar{A}'s_2}\, ds_1\, ds_2 \right] = \int_0^{\Delta t} e^{\bar{A}s} \bar{N} e^{\bar{A}'s}\, ds \qquad \text{where} \qquad \bar{N} = \bar{B} N \bar{B}'.

Let the integrand in the above equation be called J(s). Then

\frac{dJ(s)}{ds} = \bar{A} J(s) + J(s) \bar{A}'.
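Using the vec/Kronecker solution for U derived in the remainder of this appendix, the computation of U and of its Cholesky factor T may be sketched as follows (Python; the function name `discrete_noise_factor` is illustrative; SciPy's `expm` supplies the matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

def discrete_noise_factor(Abar, Nbar, dt):
    """Solve Abar U + U Abar' = e^{Abar dt} Nbar e^{Abar' dt} - Nbar for the
    covariance U of the discrete noise u_k via the vec/Kronecker identity,
    then return a Cholesky factor T with U = T T' (so that u_k = T o_k)."""
    n = Abar.shape[0]
    E = expm(Abar * dt)
    rhs = E @ Nbar @ E.T - Nbar
    I = np.eye(n)
    # vec(Abar U) + vec(U Abar') = [(I x Abar) + (Abar x I)] vec(U), column-major vec
    vecU = np.linalg.solve(np.kron(I, Abar) + np.kron(Abar, I),
                           rhs.flatten(order='F'))
    U = vecU.reshape((n, n), order='F')
    U = 0.5 * (U + U.T)          # symmetrise against round-off
    return np.linalg.cholesky(U)
```

Apart from round-off, this reproduces the exact discrete covariance, in line with the remark below that the discretisation involves no truncation error.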
Integration of the differential equation for J(s) over 0 ≤ s ≤ Δt gives

e^{\bar{A}\Delta t} \bar{N} e^{\bar{A}'\Delta t} - \bar{N} = \bar{A} U + U \bar{A}'.

To obtain the matrix U, this expression is first vectorised, using properties of the operator vec [9]: vec(A + B) = vec(A) + vec(B); vec(AB) = (I ⊗ A) vec(B) = (B' ⊗ I) vec(A), where ⊗ is the Kronecker product. Therefore

\text{vec}(\bar{A} U) + \text{vec}(U \bar{A}') = \text{vec}(e^{\bar{A}\Delta t} \bar{N} e^{\bar{A}'\Delta t} - \bar{N})

(I \otimes \bar{A}) \text{vec}(U) + (\bar{A} \otimes I) \text{vec}(U) = \text{vec}(e^{\bar{A}\Delta t} \bar{N} e^{\bar{A}'\Delta t} - \bar{N})

[(I \otimes \bar{A}) + (\bar{A} \otimes I)] \text{vec}(U) = \text{vec}(e^{\bar{A}\Delta t} \bar{N} e^{\bar{A}'\Delta t} - \bar{N})

\text{vec}(U) = [(I \otimes \bar{A}) + (\bar{A} \otimes I)]^{-1} \text{vec}(e^{\bar{A}\Delta t} \bar{N} e^{\bar{A}'\Delta t} - \bar{N}).

Once vec(U) has been computed, the inverse of the vec operator is used to obtain the matrix U. With U found, it is factored by the Cholesky factorisation U = TT' and one computes u_k = T o_k, where o_k is a random vector with zero mean and an identity covariance matrix. These vectors u_k are the required vectors; indeed

E[u_k u'_k] = E[T o_k o'_k T'] = T E[o_k o'_k] T' = TT' = U.

Finally, the required discrete time stochastic process used in the simulation is x_{k+1} = A x_k + T o_k. It should be emphasised that, apart from round-off error, the expression above is exact; there is no truncation error, as appears in methods based upon numerical integration of differential equations (the Runge–Kutta, Houbolt and Wilson methods, for example).

APPENDIX B: PROPERTIES OF SHIFT OPERATORS
If A and B are two matrices of dimensions (m × n) and (n × q), then

(AB)^{\downarrow} = A^{\downarrow} B; \quad (AB)^{\uparrow} = A^{\uparrow} B; \quad (AB)^{\rightarrow} = A B^{\rightarrow}; \quad (AB)^{\leftarrow} = A B^{\leftarrow}

(A^{\leftarrow})' = (A')^{\uparrow}; \quad (A^{\rightarrow})' = (A')^{\downarrow}; \quad (A')^{\rightarrow} = (A^{\downarrow})'; \quad (A')^{\leftarrow} = (A^{\uparrow})'.

APPENDIX C
To obtain the state covariance matrix P = E[z_k z'_k], consider the orthogonal projection of y_k^+ on the manifold spanned by y_{k-1}^-:

E[y_k^+ | y_{k-1}^-] = E[y_k^+ y_{k-1}^{-\prime}](E[y_{k-1}^- y_{k-1}^{-\prime}])^{-1} y_{k-1}^- = H R_-^{-1} y_{k-1}^- = O K R_-^{-1} y_{k-1}^-

where H = E[y_k^+ y_{k-1}^{-\prime}] and R_- = E[y_{k-1}^- y_{k-1}^{-\prime}]. Considering y_k^+ = O z_k + F e_k^+, then

E[y_k^+ | y_{k-1}^-] = E[(O z_k + F e_k^+) | y_{k-1}^-] = O E[z_k | y_{k-1}^-] = O E[z_k y_{k-1}^{-\prime}](E[y_{k-1}^- y_{k-1}^{-\prime}])^{-1} y_{k-1}^-.

Now it is demonstrated that E[z_k y_{k-1}^{-\prime}] = E[x_k y_{k-1}^{-\prime}]. Consider the definition of z_k:

z_k = E[x_k | y_{k-1}^-] = E[x_k y_{k-1}^{-\prime}](E[y_{k-1}^- y_{k-1}^{-\prime}])^{-1} y_{k-1}^- = E[x_k y_{k-1}^{-\prime}] R_-^{-1} y_{k-1}^-    (C1)

and form

E[z_k y_{k-1}^{-\prime}] = E[x_k y_{k-1}^{-\prime}] R_-^{-1} E[y_{k-1}^- y_{k-1}^{-\prime}] = E[x_k y_{k-1}^{-\prime}] R_-^{-1} R_- = E[x_k y_{k-1}^{-\prime}].

Then one has

E[y_k^+ | y_{k-1}^-] = O E[x_k y_{k-1}^{-\prime}](E[y_{k-1}^- y_{k-1}^{-\prime}])^{-1} y_{k-1}^- = O E[x_k | y_{k-1}^-] = O z_k.    (C2)

Comparing equations (C1) and (C2),

z_k = K R_-^{-1} y_{k-1}^-

and

P = E[z_k z'_k] = E[K R_-^{-1} y_{k-1}^- y_{k-1}^{-\prime} R_-^{-1\prime} K'] = K R_-^{-1} E[y_{k-1}^- y_{k-1}^{-\prime}] R_-^{-1\prime} K' = K R_-^{-1} K'.
APPENDIX D
Let V_j be the ratio of the determinants of the innovation covariances of two models, raised to the power f. Then, from equation (54),

V_j = \frac{|R_+ - H R_-^{-1} H'|}{|R_+ - H_j R_-^{-1} H'_j|} = \frac{|R_+|^{-1} |R_+ - H R_-^{-1} H'|}{|R_+|^{-1} |R_+ - H_j R_-^{-1} H'_j|}.

Since R_+^{-1} = R_+^{-1/2} R_+^{-1/2\prime} and |A||B| = |AB|, one obtains

V_j = \frac{|I - (R_+^{-1/2} H R_-^{-1/2})(R_+^{-1/2} H R_-^{-1/2})'|}{|I - (R_+^{-1/2} H_j R_-^{-1/2})(R_+^{-1/2} H_j R_-^{-1/2})'|}.

Consider the singular value decomposition of the scaled block Hankel matrix R_+^{-1/2} H R_-^{-1/2}. Then

R_+^{-1/2} H R_-^{-1/2} = \bar{P} \Gamma \bar{Q}'

where \bar{P} and \bar{Q} are orthonormal: \bar{P}'\bar{P} = \bar{P}\bar{P}' = I and \bar{Q}'\bar{Q} = \bar{Q}\bar{Q}' = I; Γ is the (mf × mf) diagonal matrix which contains the singular values, or canonical correlation coefficients (the cosines of the angles between the past and the future of the process {y_k}), arranged in decreasing order:

\Gamma = \text{diag}(\gamma_1, \gamma_2, \ldots, \gamma_j, \ldots, \gamma_{mf}) = (\text{diag}(\gamma_1, \gamma_2, \ldots, \gamma_j), \Gamma_2) = (\Gamma_1, \Gamma_2)

with γ_i the ith canonical correlation coefficient between y_k^+ and y_{k-1}^-. Note that Γ_2 is the diagonal matrix of the canonical correlation coefficients neglected in the approximate model with j states. Partitioning the matrices \bar{P} and \bar{Q} conformably into (\bar{P}_1, \bar{P}_2) and (\bar{Q}_1, \bar{Q}_2) gives

V_j = \frac{|I - (\bar{P} \Gamma \bar{Q}')(\bar{P} \Gamma \bar{Q}')'|}{|I - (\bar{P}_1 \Gamma_1 \bar{Q}'_1)(\bar{P}_1 \Gamma_1 \bar{Q}'_1)'|} = \frac{|I - \bar{P} \Gamma^2 \bar{P}'|}{|I - \bar{P}_1 \Gamma_1^2 \bar{P}'_1|} = \frac{|\bar{P}(I - \Gamma^2)\bar{P}'|}{|\bar{P}_1 (I - \Gamma_1^2) \bar{P}'_1|} = \frac{|\bar{P}||\bar{P}'||I - \Gamma^2|}{|\bar{P}_1||\bar{P}'_1||I - \Gamma_1^2|} = \frac{|I - \Gamma^2|}{|I - \Gamma_1^2|}.
Since the determinant of a diagonal matrix is just the product of its diagonal entries,

|I - \Gamma^2| = (1 - \gamma_1^2)(1 - \gamma_2^2) \cdots (1 - \gamma_j^2) \cdots (1 - \gamma_{mf}^2) = |I - \Gamma_1^2||I - \Gamma_2^2|

and

V_j = \frac{|I - \Gamma^2|}{|I - \Gamma_1^2|} = |I - \Gamma_2^2|.