Copyright © IFAC System Identification, Kitakyushu, Fukuoka, Japan, 1997
AN APPROACH TO REALIZATION OF STOCHASTIC SYSTEMS WITH EXOGENOUS INPUT

Tohru Katayama† and Giorgio Picci‡

† Department of Applied Mathematics and Physics, Kyoto University, Kyoto 606-01, Japan; e-mail: [email protected]
‡ Department of Electronics and Informatics, University of Padova, 35131 Padova, Italy; e-mail: [email protected]
Abstract. This paper solves the stochastic realization problem for a linear discrete-time system with an exogenous input. The oblique projection of the future outputs on the space of the past observations along the space of the future inputs is factorized as a product of the extended observability matrix and the state vector. The state vector is chosen by using the canonical correlation analysis (CCA) of past and future. We then derive the state equations of the optimal predictor of the future outputs in terms of the state vector and the future inputs. These equations lead to a forward innovation model for a stochastic system with exogenous inputs.

Keywords. Stochastic realization; Exogenous inputs; Canonical correlation analysis; Oblique projection; Subspace method
1 INTRODUCTION
Subspace identification methods have received much interest in the past years (Moonen and Vandewalle, 1990; Van Overschee and De Moor, 1993, 1994, 1996; Verhaegen and Dewilde, 1992; Verhaegen, 1994; Viberg, 1995), and relations among existing subspace identification methods have been discussed by Van Overschee and De Moor (1996). Subspace methods involve geometric operations on subspaces spanned by the column or row vectors of certain block Hankel matrices formed by the input-output data. These operations are performed numerically in a reliable way based on the singular value decomposition (SVD) and QR decomposition (Golub and Van Loan, 1989); a main advantage being that the difficult model selection problem inherent in the classical parameter optimization approaches to MIMO identification (Ljung, 1987) is completely avoided.
Recently Lindquist and Picci (1996a, 1996b) have interpreted subspace identification of state-space models from the point of view of realization theory. Picci and Katayama (1996) have shown the relevance of stochastic realization to subspace identification of state-space systems with exogenous inputs. In his pioneering work, Akaike (1974, 1975, 1976) developed a minimal stochastic realization theory based on the canonical correlation analysis (CCA) between the future and the past of a stochastic process, in which the state-space is defined as a linear space spanned by the best predictors of the future based on the past. The computation associated with the CCA technique is performed by the SVD of a normalized Hankel matrix formed by covariance matrices. He also gave a stochastic interpretation of the basic Ho-Kalman realization method (Ho and Kalman, 1966) and other related realization algorithms. By using the CCA technique, a balanced order-reduction algorithm for stochastic state-space models has been developed by Desai et al. (1985). Also, Aoki (1990) has extensively studied a stochastic realization problem with applications to economic time-series analysis, and Lindquist and Picci (1991) have developed a geometric procedure for solving the stochastic realization problem.

The stochastic realization for time-series is now well understood, but there are only a few papers dealing with the realization of stochastic systems excited by an exogenous input. Larimore (1983, 1990) has presented a reduced-order identification technique called the CVA (canonical variate analysis) method for MIMO linear state-space models so that arbitrary control inputs can be included in the model; but he has not considered the identification problem from the point of view of stochastic realization. Van Schuppen (1994) has considered a realization of stochastic control systems under the assumption that the control input is a white noise. Recently, we have also studied a stochastic realization of stationary processes with an exogenous input under the assumption of absence of feedback, and derived a minimal state-space model with a natural block structure (Picci and Katayama, 1996).

In this paper, we consider the realization problem for linear stochastic systems excited by an exogenous input, using a different approach than Picci and Katayama (1996). The approach to stochastic realization used here is based on oblique projection and is meant to shed some light on the procedures of some well-known papers on subspace identification (Van Overschee and De Moor, 1994, 1996; Verhaegen and Dewilde, 1992; Verhaegen, 1994) and to allow comparisons.

The CCA technique (Akaike, 1974; Lindquist and Picci, 1996) leads to a representation of the oblique projection of the future outputs onto the space of past observations along the space of future inputs (Van Overschee and De Moor, 1996) as a product of an extended observability matrix and the state vector. From this we derive a forward innovation representation of a stochastic system with exogenous input under the feedback-free condition.

The organization of the paper is as follows. In Section 2, we present some preliminary mathematical facts about oblique projection in Hilbert spaces. In Section 3, we consider the linear LS problem for predicting the future outputs based on the projection theorem and define the state vector based on the factorization of the block Hankel matrix using the CCA technique. Section 4 derives a forward innovation model for the stochastic system with exogenous input. The conclusion is given in Section 5.

2 PRELIMINARIES

Consider a discrete-time stochastic linear system with the m x 1 input vector u(t) and the p x 1 output vector y(t). It is assumed that {u(t), y(t), t = 0, ±1, ...} are jointly wide sense stationary processes with zero mean and finite covariance matrices. Let t be the present time and k a positive integer. We then define the stacked vectors of past and future inputs and the stacked vectors of past and future outputs as
u_(t) := [u(t-1), u(t-2), ...]^T,   u+(t) := [u(t), u(t+1), ..., u(t+k-1)]^T

and

y_(t) := [y(t-1), y(t-2), ...]^T,   y+(t) := [y(t), y(t+1), ..., y(t+k-1)]^T

where the present is conventionally included in the future.

As suggested in Larimore (1990), the effect of the future inputs is to be removed from the future outputs in order to apply the stochastic realization results to construct state-space systems with an exogenous input. The realization problem is initially formulated as state-space modeling of the least-squares (LS) predictor of the future output based on the future inputs and the past observations, and is solved by using representation results for the corresponding projection operators. It can be shown that the optimal operator from the future inputs to the future outputs is causal under the assumption that there exists no feedback from the output to the input (Caines and Chan, 1976; Gevers and Anderson, 1982). We also show how the state vector is constructed by the CCA technique (Akaike, 1974; Lindquist and Picci, 1996).

For notational simplicity, we also define
p(t) := [y_(t)^T, u_(t)^T]^T   [∞ x 1 past observations]

f(t) := y+(t)   [kp x 1 future outputs]

u(t) := u+(t)   [km x 1 future inputs]

It should be noted that the dimension of p(t) is infinite, since we consider a steady-state prediction problem in Section 3, whereas those of f(t) and u(t) are finite.
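For a finite data record, the stacked vectors above are typically collected column-wise into block Hankel data matrices, one column per time instant. A minimal numpy sketch (the function name and toy dimensions are ours, not from the paper):

```python
import numpy as np

def block_hankel(w, k, start, N):
    """Stack k-long windows of the signal w (shape: dim x T) as columns:
    column j holds [w(start+j); w(start+j+1); ...; w(start+j+k-1)]."""
    dim = w.shape[0]
    H = np.empty((k * dim, N))
    for i in range(k):
        H[i * dim:(i + 1) * dim, :] = w[:, start + i:start + i + N]
    return H

# toy data: m = 1 input, p = 1 output, T samples
rng = np.random.default_rng(0)
T, k, d = 200, 3, 5              # future horizon k; past lag d truncates the infinite past
u = rng.standard_normal((1, T))
y = rng.standard_normal((1, T))

N = T - d - k + 1                # number of columns (time instants)
U_minus = block_hankel(u, d, 0, N)   # truncated past inputs
Y_minus = block_hankel(y, d, 0, N)   # truncated past outputs
P = np.vstack([Y_minus, U_minus])    # p(t): past observations, d(p+m) rows
F = block_hankel(y, k, d, N)         # f(t) = y+(t): future outputs, kp rows
U_plus = block_hankel(u, k, d, N)    # u(t) = u+(t): future inputs, km rows
```

Each column of `P`, `F` and `U_plus` is a sample of p(t), f(t) and u(t) with the infinite past truncated to d lags (the row ordering within the past block is immaterial for the spanned space).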
Let P_t-, Y_t- and U_t+ be the linear spaces spanned by the past p(t), the past outputs y_(t) and the future inputs u(t), respectively. It is assumed that the norm in these spaces is given by the root-mean-square, and that the spaces are closed with respect to this norm. Thus P_t-, Y_t- and U_t+ are Hilbert subspaces of an ambient Hilbert space H. The Hilbert space P_t- is also written as

P_t- = span{y(t-1), y(t-2), ...; u(t-1), u(t-2), ...}

In the following, ∨ denotes the vector sum of subspaces. Suppose that A is a subspace of a Hilbert space H. Then the orthogonal projection of b ∈ H onto the subspace A is denoted by (b/A). Let A⊥ be the orthogonal complement of A ⊂ H. Then the orthogonal projection of b onto A⊥ is defined by (b/A⊥) := b - (b/A). Suppose that A is generated by a random vector a with zero mean and finite covariance matrix. Then the orthogonal projection of b ∈ H onto A is given by

(b/A) = E(b a^T) E(a a^T)^† a

where E(·) denotes the mathematical expectation, and † is the pseudo-inverse (Golub and Van Loan, 1989). Also, the orthogonal projection of b onto A⊥ is expressed as (b/a⊥).

Orthogonal and oblique projections

The following is a basic technical result used in this paper.

Lemma 1. Let y, a and b be random vectors with components in H, let A = span{a}, B = span{b} and assume that A ∩ B = 0. Then the vectors Π(y)a, Σ(y)b in the orthogonal projection

(y/A ∨ B) = Π(y)a + Σ(y)b   (1)

are the oblique projections of y onto A along B and of y onto B along A, respectively. Define the conditional covariance matrices

Σ_aa/b := E{(a/b⊥)(a/b⊥)^T},   Σ_ya/b := E{(y/b⊥)(a/b⊥)^T}
Σ_yb/a := E{(y/a⊥)(b/a⊥)^T},   Σ_bb/a := E{(b/a⊥)(b/a⊥)^T}

Then the matrices Π(y) and Σ(y) satisfy the discrete Wiener-Hopf type equations

Π(y) Σ_aa/b = Σ_ya/b,   Σ(y) Σ_bb/a = Σ_yb/a   (2)

where Σ_aa/b, Σ_bb/a are nonsingular if so are Σ_aa, Σ_bb.

Proof: The formulae for Π and Σ are easily proved by using the block matrix inversion formula

[A B; C D]^{-1} = [Δ^{-1}, -Δ^{-1} B D^{-1}; -D^{-1} C Δ^{-1}, D^{-1} + D^{-1} C Δ^{-1} B D^{-1}]

where Δ := A - B D^{-1} C is the Schur complement. Since A = E(a a^T), B = C^T = E(a b^T), D = E(b b^T), we have Δ = E{(a/b⊥)(a/b⊥)^T} = Σ_aa/b. Thus it can be shown that

Π(y) = E(y a^T) Δ^{-1} - E(y b^T) E(b b^T)^{-1} E(b a^T) Δ^{-1}
     = E{(y/b⊥)(a/b⊥)^T} E{(a/b⊥)(a/b⊥)^T}^{-1} = Σ_ya/b Σ_aa/b^{-1}

Similarly, we have Σ(y) = Σ_yb/a Σ_bb/a^{-1}. Since Π(y)a = Σ_ya/b Σ_aa/b^{-1} a, we see that, for v = αa,

Π(v)a = v,   Π(b)a = 0

Thus Π(y)a is the oblique projection of y onto A along B. A dual argument shows that Σ(y)b is the complementary oblique projection. □

The optimal predictor

In order to extend the stochastic realization results to a system with exogenous inputs, the effect of the future inputs u(t) should be removed from the future outputs f(t) (Larimore, 1990). To this end, we consider the linear LS prediction of the future outputs f(t) based on the infinite past observations p(t) and the future inputs u(t). By a generalization of Lemma 1 we obtain the following representation result.

Theorem 1. Suppose that P_t- ∩ U_t+ = 0. Then the optimal LS predictor f^(t) of the future output vector f(t), given P_t- ∨ U_t+, the space spanned by p(t) and u(t), is given by

f^(t) = (f(t)/P_t- ∨ U_t+) = Π p(t) + Σ u(t)   (3)

where Π and Σ satisfy the discrete Wiener-Hopf type equations

Π Σ_pp/u = Σ_fp/u,   Σ Σ_uu/p = Σ_fu/p   (4)

and Σ_pp/u, Σ_uu/p are the conditional covariance operators of the past vector p(t) given u(t) and of the future input u(t) given the past p(t). (Π is represented by a matrix with kp rows and an infinite number of columns, and Σ is a kp x km matrix.)

It should be noted that Π p(t) is the oblique projection of the future f(t) onto the past P_t- along the future inputs U_t+.
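Since Lemma 1 is a Hilbert-space identity, it can be checked numerically by replacing expectations with sample covariances, under which it holds exactly up to rounding. A small self-contained sketch (all names and dimensions are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500                                  # number of samples defining the inner product
a = rng.standard_normal((3, N))          # spans A
b = rng.standard_normal((2, N))          # spans B; A and B intersect trivially a.s.
Fa = rng.standard_normal((4, 3))
Fb = rng.standard_normal((4, 2))
y = Fa @ a + Fb @ b                      # y lies in A ∨ B by construction

cov = lambda x, z: (x @ z.T) / N         # sample covariance E(x z^T)

def perp(x, b):
    """Orthogonal projection (x/b⊥) = x - E(x b^T) E(b b^T)^{-1} b."""
    return x - cov(x, b) @ np.linalg.solve(cov(b, b), b)

a_pb = perp(a, b)                                       # (a/b⊥)
Pi = cov(y, a_pb) @ np.linalg.inv(cov(a_pb, a_pb))      # Π(y) = Σ_ya/b Σ_aa/b^{-1}
b_pa = perp(b, a)                                       # (b/a⊥)
Sg = cov(y, b_pa) @ np.linalg.inv(cov(b_pa, b_pa))      # Σ(y) = Σ_yb/a Σ_bb/a^{-1}

# the oblique projections recover the true coefficients and decompose y
assert np.allclose(Pi, Fa) and np.allclose(Sg, Fb)
assert np.allclose(Pi @ a + Sg @ b, y)
```

The assertions pass because, with the sample inner product, y/b⊥ = Fa (a/b⊥), so Σ_ya/b = Fa Σ_aa/b and the Wiener-Hopf equations (2) return Π(y) = Fa exactly.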
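The CCA used to select the state vector reduces numerically to an SVD of a normalized covariance matrix: the canonical correlations of two random vectors are the singular values of Σ_ff^{-1/2} Σ_fp Σ_pp^{-1/2}. An illustrative sketch with synthetic vectors (names and dimensions are ours, not the paper's data):

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

rng = np.random.default_rng(2)
N = 2000
z = rng.standard_normal((2, N))                   # shared latent variables
p = np.vstack([z, rng.standard_normal((2, N))])   # "past" vector, 4-dim
f = np.vstack([z + 0.5 * rng.standard_normal((2, N)),
               rng.standard_normal((1, N))])      # "future" vector, 3-dim

cov = lambda x, y: (x @ y.T) / N
L = inv_sqrt(cov(f, f)) @ cov(f, p) @ inv_sqrt(cov(p, p))  # normalized covariance
U, s, Vt = np.linalg.svd(L)
# s holds the canonical correlations in decreasing order: two sizable ones
# (the shared latent z) and one near zero
print(np.round(s, 2))
```

The number of non-negligible canonical correlations indicates the order n of the state vector; in subspace methods the same SVD also supplies the factorization into observability and state components.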
The state vector x(t) of (8) is essentially the same as that of Larimore (1990) and Van Overschee and De Moor (1994, 1996). Thus, in terms of x(t), the optimal predictor f^(t) of (3) has the form

f^(t) = O x(t) + Σ u(t)   (9)

where Σ is a causal operator (see Theorem 2).

Remark 1. From the point of view of CCA, a seemingly more natural definition of the state vector is

x^(t) = C Σ_pp/u^{-1} (p/u⊥)(t)

which, using (4), (7), leads to

Π (p/u⊥)(t) = O C Σ_pp/u^{-1} (p/u⊥)(t) = O x^(t)   (10)

It follows from (10) that another representation of the optimal predictor is given by

f^(t) = O x^(t) + H u(t)

where H u(t) := (f/u)(t) = E(f u^T) E(u u^T)^{-1} u(t). It should be noted, however, that this is not a suitable expression for the derivation of a causal state-space model, since H is noncausal unless u(t) is a white noise. If the input u(t) is a white noise, then (p/u⊥)(t) = p(t), and in this case x^(t) agrees with x(t). Also, if there is no exogenous input, then the state vector defined above coincides with the state vector for stochastic processes treated in Akaike (1975), Desai et al. (1985) and Lindquist and Picci (1996). □

4 INNOVATION MODEL

We now derive a forward innovation model by using the state vector x(t) of (8) and the optimal predictor f^(t) of (9). Since, from Theorem 2, Σ is block lower-triangular, the first p rows of (9) involve just the one-step predictor of y(t) based on P_t- ∨ U_t, where U_t is the subspace spanned by u(t):

y^(t) := (y(t)/P_t- ∨ U_t) = C x(t) + D u(t)

where C and D are p x n and p x m constant matrices, respectively. Define the prediction error by

e(t) := y(t) - (y(t)/P_t- ∨ U_t)   (11)

From this definition, we get the output equation

y(t) = C x(t) + D u(t) + e(t)   (12)

It should be noted that the prediction error e(t) is stationary, since y(t) is stationary and the estimate y^(t) is based on observations from the infinite past.

Lemma 2. The prediction error e(t) is an innovation process uncorrelated with the past outputs {y(t-1), y(t-2), ...} and the present and past inputs {u(t), u(t-1), ...}, namely

e(t) ⊥ Y_t- ∨ U_(t+1)-

Proof: Note that P_t- ∨ U_t = Y_t- ∨ U_(t+1)-. Then the result follows from the definition (11). □

We now consider the dynamics of the state vector x(t). Let X_t be the subspace generated by x(t). Then there exist n x n and n x m constant matrices A and B such that

(x(t+1)/X_t ∨ U_t) = A x(t) + B u(t)

where the right-hand side is a sum of the oblique projections. Thus we have the state equation

x(t+1) = A x(t) + B u(t) + w(t)

where

w(t) := x(t+1) - (x(t+1)/X_t ∨ U_t)

Lemma 3. The prediction error w(t) is a function of e(t), and hence there exists K such that

w(t) = K e(t)

Proof: We see that X_t ∨ U_t = P_t- ∨ U_t = Y_t- ∨ U_(t+1)-, since x(t) is a sufficient statistic of p(t). It should also be noted that x(t+1) is a function of p(t+1), and

P_(t+1)- := span{y(t), y(t-1), ...; u(t), u(t-1), ...} = span{e(t), y(t-1), ...; u(t), u(t-1), ...}   (13)

where e(t) is orthogonal to Y_t- ∨ U_(t+1)-. Since w(t) is the projection of x(t+1) onto the complement of Y_t- ∨ U_(t+1)-, we see that w(t) is a function of e(t). □

Theorem 3. Suppose that there is no feedback from the output y(t) to the input u(t), and assume that rank Σ_fp/u = n. Then, in terms of the state vector x(t) of (8), we have a stochastic realization of the form

x(t+1) = A x(t) + B u(t) + K e(t)   (14)
y(t) = C x(t) + D u(t) + e(t)   (15)

We note that the system of (14), (15) is the innovation form derived from the Kalman filter with the exogenous input u(t).
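The innovation model (14), (15) can be simulated forward once the system matrices and the innovation sequence are given. A short sketch with illustrative matrices (the numerical values are ours, for demonstration only, not identified from data):

```python
import numpy as np

# illustrative system matrices: n = 2 states, m = 1 input, p = 1 output
A = np.array([[0.8, 0.2], [0.0, 0.5]])   # stable state matrix
B = np.array([[1.0], [0.5]])
K = np.array([[0.3], [0.1]])             # innovation (Kalman) gain
C = np.array([[1.0, 0.0]])
D = np.array([[0.2]])

rng = np.random.default_rng(3)
T = 100
u = rng.standard_normal((1, T))          # exogenous input
e = 0.1 * rng.standard_normal((1, T))    # innovation sequence

x = np.zeros((2, 1))                     # initial state
Y = np.empty((1, T))
for t in range(T):
    Y[:, [t]] = C @ x + D @ u[:, [t]] + e[:, [t]]    # output equation (15)
    x = A @ x + B @ u[:, [t]] + K @ e[:, [t]]        # state equation (14)
```

Because e(t) enters both equations through the gain K, the same recursion doubles as the one-step-ahead predictor when e(t) is replaced by the measured residual y(t) - C x(t) - D u(t).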
Remark 2. For the computation based on finite input-output data, the past stacked vectors u_(t) and y_(t) must be truncated to a finite length d, namely

u_(t) := [u(t-1), ..., u(t-d)]^T,   y_(t) := [y(t-1), ..., y(t-d)]^T

Thus the past observation p(t) must be replaced by the d(p+m) x 1 stacked vector. By taking a sufficiently large number of lags d, the statistically significant behavior in the data will be retrieved by the CCA (Akaike, 1976; Larimore, 1990). Thus, in the case of finite lag, we will obtain an approximate innovation representation like the system of (14) and (15). □

5 CONCLUSIONS

In this paper we have developed a realization theory for stochastic systems with exogenous inputs under the assumption that there is no feedback from the output to the control input. A generalized CCA technique is employed for defining the state vector, and a forward innovation representation of a stochastic system with the exogenous input has been derived. The present approach sheds some light on the procedures of some well-known papers on subspace identification (Larimore, 1990; Verhaegen and Dewilde, 1992; Verhaegen, 1994; Van Overschee and De Moor, 1994, 1996).

REFERENCES

Akaike, H. (1974). Stochastic Theory of Minimal Realization. IEEE Trans. Automat. Contr., AC-19, 667-674.
Akaike, H. (1975). Markovian Representation of Stochastic Processes by Canonical Variables. SIAM J. Control, 13, 162-173.
Akaike, H. (1976). Canonical Correlation Analysis of Time Series and the Use of an Information Criterion. In System Identification: Advances and Case Studies (R. Mehra and D. Lainiotis, Eds.), 27-96, Academic.
Aoki, M. (1990). State Space Modeling of Time Series (2nd ed.). Springer.
Caines, P. E. and C. W. Chan (1976). Estimation, Identification and Feedback. In System Identification: Advances and Case Studies (R. Mehra and D. Lainiotis, Eds.), 349-405, Academic.
Desai, U. B., D. Pal and R. D. Kirkpatrick (1985). A Realization Approach to Stochastic Model Reduction. Int. J. Control, 42, 821-838.
Gevers, M. R. and B. D. O. Anderson (1982). On Jointly Stationary Feedback-free Stochastic Processes. IEEE Trans. Automat. Contr., AC-27, 431-436.
Golub, G. H. and C. F. Van Loan (1989). Matrix Computations (2nd ed.). The Johns Hopkins Univ. Press.
Ho, B. L. and R. E. Kalman (1966). Effective Construction of Linear State-Variable Models from Input/Output Functions. Regelungstechnik, 14, 545-548.
Larimore, W. E. (1983). System Identification, Reduced-Order Filtering and Modeling via Canonical Variate Analysis. In Proc. American Control Conference, San Francisco, 445-451.
Larimore, W. E. (1990). Canonical Variate Analysis in Identification, Filtering, and Adaptive Control. In Proc. 29th IEEE Conf. Decision & Control, Honolulu, 596-604.
Lindquist, A. and G. Picci (1991). A Geometric Approach to Modelling and Estimation of Linear Stochastic Systems. J. Math. Systems, Estimation & Control, 1, 241-333.
Lindquist, A. and G. Picci (1996a). Geometric Methods for State Space Identification. In Identification, Adaptation, Learning (Lectures given at the NATO-ASI School, From Identification to Learning, Como, Italy, August 1994), Springer.
Lindquist, A. and G. Picci (1996b). Canonical Correlation Analysis, Approximate Covariance Extension, and Identification of Stationary Time Series. Automatica, 32, 709-733.
Ljung, L. (1987). System Identification - Theory for the User. Prentice-Hall.
Moonen, M. and J. Vandewalle (1990). QSVD Approach to On- and Off-Line State-Space Identification. Int. J. Control, 51, 1133-1146.
Picci, G. and T. Katayama (1996). Stochastic Realization with Exogenous Inputs and "Subspace Methods" Identification. Signal Processing, 52, 145-160.
Rozanov, Yu. A. (1963). Stationary Random Processes. Holden-Day.
Van Overschee, P. and B. De Moor (1993). Subspace Algorithms for the Stochastic Identification Problem. Automatica, 29, 649-660.
Van Overschee, P. and B. De Moor (1994). N4SID - Subspace Algorithms for the Identification of Combined Deterministic-Stochastic Systems. Automatica, 30, 75-93.
Van Overschee, P. and B. De Moor (1996). Subspace Identification for Linear Systems. Kluwer Academic Publishers.
Van Schuppen, J. H. (1994). Stochastic Realization of a Gaussian Stochastic Control System. Acta Applicandae Mathematicae, 35, 193-212.
Verhaegen, M. and P. Dewilde (1992). Subspace Model Identification, Part 1. The Output-Error State-Space Model Identification Class of Algorithms; Part 2. Analysis of the Elementary Output-Error State-Space Model Identification Algorithm. Int. J. Control, 56, 1187-1210 & 1211-1241.
Verhaegen, M. (1994). Identification of the Deterministic Part of MIMO State Space Models given in Innovations Form from Input-Output Data. Automatica, 30, 61-74.
Viberg, M. (1995). Subspace-based Methods for the Identification of Linear Time-invariant Systems. Automatica, 31, 1835-1851.