Copyright © IFAC System Identification, Copenhagen, Denmark, 1994
AN IV-SCHEME FOR ESTIMATING CONTINUOUS-TIME STOCHASTIC MODELS FROM DISCRETE-TIME DATA*

S. BIGI, T. SÖDERSTRÖM and B. CARLSSON

Uppsala University, Systems and Control Group, Box 27, S-751 03 Uppsala, Sweden

* This work has been supported by the Swedish National Board for Technical Development, under contract 91-00019. The work of S. Bigi has been supported by scholarships of the University of Padova and Fondazione Gini, Padova, Italy.
Abstract. The problem of estimating the autoregressive parameters of continuous-time stochastic models from discrete-time data is addressed. The presented approach is based on discrete-time models parameterized with the continuous-time parameters, using approximations of the differentiation operator. It is shown that the parameters in these models, in general, cannot be estimated with the standard least squares method (in contrast to the use of deterministic models). Instead an instrumental variable method is suggested and analyzed. The approach is numerically illustrated for uniformly sampled data.

Key Words. Continuous-time stochastic models, uneven sampling, IV method, derivative approximation, asymptotic consistency
1. INTRODUCTION

The problem of modeling a continuous-time stochastic process from discrete-time data has gained a certain interest in the statistical literature, see, for example, Bergstrom (1976), Parzen (1984), Kulchitskii and Skrobotov (1990), Sinha and Rao (1991). One possible approach for estimating the continuous-time parameters is to first fit a discrete-time model to the data and then, as a second step, convert this model to continuous time. This is known as an indirect method. Algorithms for such transformations can be found in Keviczky (1977), Sinha and Lastman (1982), Feliu et al. (1988), Ježek (1990), Söderström (1991). A drawback of the indirect method is that the two-step procedure does not seem feasible when the sampling interval is not constant over the data set. Observe that in this case the discrete-time parameters will be time-varying. Moreover, a statistically optimal prediction error approach becomes computationally very complex, as a repeated nonlinear optimization must be done (Söderström and Angeby, 1990).

Another approach is to approximate the continuous-time derivatives by differences, or alternatively integrals by sums (Young, 1981; Unbehauen and Rao, 1987; Middleton et al., 1990; Sagara et al., 1991). This approach results in the possibility of estimating the parameters of the continuous-time system without any intermediate step, and can thus be called a direct method. This method may have good numerical properties, especially for fast sampling, and it is computationally efficient. In particular, it is well suited for estimating the continuous-time parameters in the presence of a nonuniform sampling interval. In the literature, the above approach seems limited to deterministic systems, where a measurable input signal is available and the possible presence of a stochastic disturbance is neglected (Young, 1981; Unbehauen and Rao, 1987; Middleton et al., 1990; Sagara et al., 1991). There are hence reasons to examine its potential for dealing with the pure stochastic case (the time series problem), i.e. when the "input" or driving function is an unknown (and immeasurable) white noise sequence.

In this article this case is analyzed. It is shown that the approach of replacing derivatives by differences, combined with an instrumental variable (IV) scheme, generally works in the pure stochastic case as well. However, the use of a least-squares scheme mostly leads to a considerable bias in the estimates; this is in contrast to the deterministic case, where least squares is a common approach (Unbehauen and Rao, 1987; Sagara et al., 1991).

The paper is organized as follows. In Section 2 the problem to be studied is defined. In Section 3, a least-squares scheme is analyzed by means of simple examples. A more detailed analysis of the consistency properties of the IV estimate is given in Section 4. Section 5 contains a numerical illustration of the proposed IV scheme. Conclusions are given in Section 6.
2. STATEMENT OF THE PROBLEM
Consider the continuous-time stochastic (ARMA) process y(t) given as

(p^n + a_1 p^(n-1) + ... + a_n) y(t) = (b_1 p^(n-1) + ... + b_n) e(t)   (1)

where p is the differential operator and e(t) is continuous-time white noise with unit incremental variance (i.e. E e(t)e(s) = δ(t − s)). The system description will be written for short as

A(p) y(t) = B(p) e(t)

The polynomials A(p) and B(p) are assumed to be coprime, with A(s) having all zeros in the open left half plane. The samples y(t_1), y(t_2), ..., y(t_N) of the process are available. For ease of notation, in the following a uniform sampling will be considered, with the sampling interval equal to h, the data sequence being y(h), y(2h), ..., y(Nh). The results can, however, be easily extended to the nonuniform sampling case as well.

The problem to be discussed is the estimation of the autoregressive parameters of (1) based on the available data. We will consider methods based on discrete-time models parameterized with the continuous-time parameters, via a discrete-time approximation of the derivative operator p. In particular, the derivative of order k, p^k, will be approximated by the general difference operator D^k defined as

D^k y(t) = (1/h^k) Σ_j β_{k,j} q^j y(t)

where q represents the shift operator, i.e. q^j y(t) = y(t + jh). Note that the summation can possibly be over Z, the set of positive and negative integers, and the indices may depend on k. The difference operator will be written for short as

D^k y(t) = (1/h^k) B_k(q) y(t)   (2)

Assuming y(t) to be an arbitrarily differentiable function, it easily follows from (2) that in order to get an exact differentiation as h → 0, i.e.

D^k y(t) → p^k y(t),   h → 0

the following condition must hold:

B_k(1) = B_k'(1) = ... = B_k^(k-1)(1) = 0,   B_k^(k)(1) = k!

where B_k^(i) denotes the i-th derivative of B_k; equivalently, Σ_j β_{k,j} j^i = 0 for i = 0, ..., k−1 and Σ_j β_{k,j} j^k = k!.

Introduce further the notations (the sampling interval h is omitted in the time argument)

r̂(τ) = (1/N) Σ_{k=1}^{N} y(k) y(k + τ),   r(τ) = E y(k) y(k + τ)   (3)

for the sample and ensemble covariance functions, respectively. Under weak conditions, the sample covariance elements converge to the ensemble elements, i.e. r̂(τ) → r(τ) as N → ∞; see, for example, (Söderström and Stoica, 1989) for details.

The analysis of the parameter estimates will be based on the concept of asymptotic consistency. An estimate θ̂ is said to be asymptotically consistent if

lim_{h→0} lim_{N→∞} θ̂ = θ*   (4)

where θ* denotes the true value. Asymptotic consistency appears to be a reasonable property to strive for when examining procedures based on various discrete-time approximations of the derivatives.
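To make the role of the condition on B_k concrete, the following small numerical sketch (not part of the original paper; it assumes Python with NumPy) represents a difference operator by its coefficients {β_{k,j}}, checks the moment conditions above, and verifies that D^1 y → p y as h → 0 for a smooth test signal.

```python
import numpy as np

def apply_D(y, coeffs, h, k):
    """D^k y(t) = h^{-k} * sum_j beta_{k,j} y(t + j*h) on sampled data.
    coeffs is a dict {j: beta_{k,j}} defining B_k(q).
    Returns (values, indices) on the range where all shifts exist."""
    jmin, jmax = min(coeffs), max(coeffs)
    idx = np.arange(max(0, -jmin), len(y) - max(0, jmax))
    out = np.zeros(len(idx))
    for j, beta in coeffs.items():
        out += beta * y[idx + j]
    return out / h**k, idx

def moments(coeffs, k):
    """Exactness condition: sum_j beta_j j^i = 0 for i < k, and = k! for i = k."""
    js = np.array(sorted(coeffs), dtype=float)
    betas = np.array([coeffs[j] for j in sorted(coeffs)])
    return [float(betas @ js**i) for i in range(k + 1)]

forward  = {0: -1.0, 1: 1.0}     # Euler's method,      B_1(q) = q - 1
backward = {-1: -1.0, 0: 1.0}    # backward difference, B_1(q) = 1 - q^{-1}
central  = {-1: -0.5, 1: 0.5}    # central difference,  B_1(q) = (q - q^{-1})/2

for name, c in [("forward", forward), ("backward", backward), ("central", central)]:
    print(name, moments(c, 1))   # each prints [0.0, 1.0]: the condition holds

for h in [0.1, 0.01, 0.001]:
    t = np.arange(0.0, 2 * np.pi, h)
    dy, idx = apply_D(np.sin(t), central, h, 1)
    print(h, np.max(np.abs(dy - np.cos(t[idx]))))   # error shrinks as h -> 0
```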
3. PRELIMINARY EXAMPLES

In this section, some properties of least squares estimates are illustrated using first and second order models. It is shown that a least squares method will not, in general, give asymptotically consistent estimates. Due to this fact, an IV approach is proposed in the next section.

Consider the first order model

p y(t) + a y(t) = b e(t)   (5)

which has the covariance function

r(τ) = (b²/(2a)) e^(−a|τ|)

Example 3.1. (A least-squares method based on Euler's method.) The differential operator p is approximated by Euler's method (forward difference), obtaining the model

[y(t + h) − y(t)]/h + a y(t) = ε(t)

The least-squares estimate â is found by minimizing the sum of squared residuals ε(t), i.e. the criterion Σ_t ε²(t). The minimizing element is given by

â = −(1/h) [r̂(h) − r̂(0)] / r̂(0)

Hence,

lim_{N→∞} â = −(1/h) [r(h) − r(0)] / r(0) = −(1/h) (e^(−ah) − 1) = a + O(a²h),   h → 0
The estimate is thus asymptotically consistent. □

Example 3.2. (A least-squares estimate based on the backward difference.) Now the differential operator is approximated by a backward difference, giving the model

[y(t) − y(t − h)]/h + a y(t) = ε(t)

Proceeding similarly to Example 3.1, the estimate results in

lim_{h→0} lim_{N→∞} â = −a

A completely erroneous estimate is hence obtained in this case. Note that it could hardly be anticipated that there would be such a significant difference in using a forward and a backward difference for approximating p. If the approximations were instead applied to a deterministic model of the form

p y(t) + a y(t) = b u(t)

with a known input signal u(t), the parameter estimates would be asymptotically consistent in both cases. □
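The contrast between Examples 3.1 and 3.2 is easy to reproduce numerically. The sketch below (not from the paper; Python with NumPy assumed) simulates the first order process (5) exactly at the sampling instants, since y is then an Ornstein-Uhlenbeck process with r(τ) = (b²/2a)e^(−a|τ|), and computes the two least-squares estimates. For small h the forward-difference estimate is close to a, while the backward-difference estimate is close to −a.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, h, N = 1.0, 1.0, 0.01, 200_000

# Exact discretization of p y + a y = b e: y((k+1)h) = phi*y(kh) + w_k,
# with phi = exp(-a h) and Var(w_k) = b^2 (1 - phi^2) / (2 a).
phi = np.exp(-a * h)
sigma_w = np.sqrt(b**2 * (1.0 - phi**2) / (2.0 * a))
y = np.zeros(N)
y[0] = rng.normal(scale=np.sqrt(b**2 / (2.0 * a)))   # stationary initial value
for k in range(N - 1):
    y[k + 1] = phi * y[k] + sigma_w * rng.normal()

# Example 3.1: forward (Euler) difference, residual [y(t+h)-y(t)]/h + a y(t)
a_fwd = -np.sum(y[:-1] * (y[1:] - y[:-1])) / (h * np.sum(y[:-1]**2))

# Example 3.2: backward difference, residual [y(t)-y(t-h)]/h + a y(t)
a_bwd = -np.sum(y[1:] * (y[1:] - y[:-1])) / (h * np.sum(y[1:]**2))

print("forward LS :", a_fwd)    # close to  a = 1
print("backward LS:", a_bwd)    # close to -a = -1
```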
One might conjecture that, in a more general situation, if a least-squares approach is to be used it should be combined with a forward difference approximation of p. This conjecture is not true for higher order models, as the following example shows.

Example 3.3. (A least-squares estimate for a second order model.) Consider now the second order process

p² y(t) + (a + b) p y(t) + a b y(t) = a b e(t)   (6)

The derivatives are approximated by forward differences, obtaining the model

((q − 1)/h)² y(t) + a_1 ((q − 1)/h) y(t) + a_2 y(t) = ε(t)

A least-squares estimate of (a_1 a_2)^T is then found by minimizing Σ_t ε²(t) with respect to a_1 and a_2. Straightforward calculations then give, see (Söderström et al., 1992) for details,

lim_{h→0} lim_{N→∞} â_1 = (2/3)(a + b),   lim_{h→0} lim_{N→∞} â_2 = a b

The true values are, cf. (6), a_1 = a + b, a_2 = a b. Hence, the estimates are not asymptotically consistent. □

As a consequence of the examples in this section, the least-squares approach is ruled out as a general and pertinent estimation method for this framework.
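The bias in Example 3.3 can also be checked by simulation. The sketch below is not from the paper; it approximates the continuous-time process (6) by a fine-step Euler-Maruyama recursion, subsamples with interval h, and solves the forward-difference least-squares problem. With a = 1, b = 3 the estimate â_1 comes out near (2/3)(a + b) ≈ 2.67 rather than a + b = 4, while â_2 stays near ab = 3.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 3.0                      # true AR parameters: a1 = a+b = 4, a2 = ab = 3
h, N, sub = 0.02, 100_000, 20        # sampling interval, samples, fine steps per sample
dt = h / sub                         # internal Euler-Maruyama step

# State-space form of (6): x1 = y, x2 = p y;  p x2 = -(a+b) x2 - a b x1 + a b e
x1, x2 = 0.0, 0.0
y = np.empty(N)
for k in range(N):
    for _ in range(sub):
        de = rng.normal(scale=np.sqrt(dt))          # increment of the Wiener process
        x1, x2 = x1 + x2 * dt, x2 + (-(a + b) * x2 - a * b * x1) * dt + a * b * de
    y[k] = x1

# Forward differences: D y(t) = [y(t+h)-y(t)]/h, D^2 y(t) = [y(t+2h)-2y(t+h)+y(t)]/h^2
Dy  = (y[1:-1] - y[:-2]) / h
D2y = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2
Y   = y[:-2]

# Least squares for D^2 y + a1 D y + a2 y = eps
Phi = np.column_stack([Dy, Y])
a1_hat, a2_hat = np.linalg.lstsq(Phi, -D2y, rcond=None)[0]
print(a1_hat, "vs 2/3*(a+b) =", 2 * (a + b) / 3, "and a+b =", a + b)
print(a2_hat, "vs a*b       =", a * b)
```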
4. ASYMPTOTICALLY CONSISTENT INSTRUMENTAL VARIABLE ESTIMATES

In this section, an instrumental variable (IV) approach for estimating the autoregressive parameters {a_i}_{i=1}^n in the model (1) is considered. The derivative operator p^k is approximated by the general difference operator D^k defined in Section 2. Introducing the instrumental variables z(t) by using delayed outputs,

z(t) = ( y(t − Lh − h)  ...  y(t − Lh − nh) )^T

with L ≥ 0, the estimates {â_i}_{i=1}^n are found from the matrix equation

R_N ( â_1 ... â_n )^T = −(1/N) Σ_t z(t) [D^n y(t)]   (7)

where

R_N = (1/N) Σ_t z(t) [ D^(n−1) y(t)  ...  y(t) ]

To show the feasibility of this approach, it is necessary to prove that the IV estimates are asymptotically consistent and that the IV equations in (7) are "well-behaved", i.e. the system matrix R_N is nonsingular.
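For concreteness, the following sketch (not from the paper) implements the IV estimate (7) for a second order model (n = 2), with the central difference for D, with D applied twice for D² (cf. Remark 4 in Section 5), uniform sampling, and delay L. Given samples y, it forms the instruments z(t), the regressor [D y(t)  y(t)], the left-hand side D² y(t), and solves the resulting 2x2 system.

```python
import numpy as np

def iv_estimate_ar2(y, h, L=1):
    """IV estimate (7) of (a1, a2) in p^2 y + a1 p y + a2 y = B(p) e,
    using the central difference D y(t) = [y(t+h) - y(t-h)]/(2h)
    and D^2 y(t) = D(D y(t)) = [y(t+2h) - 2 y(t) + y(t-2h)]/(4 h^2).
    Instruments: z(t) = [y(t - (L+1)h), y(t - (L+2)h)]^T."""
    N = len(y)
    # valid t-indices: t-2h ... t+2h and t-(L+2)h must lie inside the record
    t = np.arange(max(2, L + 2), N - 2)

    Dy  = (y[t + 1] - y[t - 1]) / (2.0 * h)
    D2y = (y[t + 2] - 2.0 * y[t] + y[t - 2]) / (4.0 * h**2)
    phi = np.column_stack([Dy, y[t]])                  # [D^{n-1} y(t) ... y(t)]
    Z   = np.column_stack([y[t - (L + 1)], y[t - (L + 2)]])

    R_N = Z.T @ phi / len(t)
    rhs = -Z.T @ D2y / len(t)
    return np.linalg.solve(R_N, rhs)                   # (a1_hat, a2_hat)

# Hypothetical usage: y contains equidistant samples of a CT AR(2) process
# (e.g. generated as in the sketch after Example 3.3), h is the sampling interval.
# a1_hat, a2_hat = iv_estimate_ar2(y, h, L=1)
```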
Asymptotic consistency

Set â_0 = 1, and let N → ∞. The IV equations (7) can then be equivalently phrased as (j = 1, ..., n)

E y(t − Lh − jh) Σ_{ℓ=0}^{n} â_ℓ (1/h^(n−ℓ)) Σ_μ β_{n−ℓ,μ} y(t + μh) = 0   (8)

which can be rewritten repeatedly as

0 = Σ_{ℓ=0}^{n} â_ℓ (1/h^(n−ℓ)) Σ_μ β_{n−ℓ,μ} q^μ r([L + j]h) = Σ_{ℓ=0}^{n} â_ℓ D^(n−ℓ) r(t)|_{t=(L+j)h}   (9)

The covariance function r(t) is in general not continuously differentiable at t = 0, but arbitrarily differentiable for t ≠ 0. In order to calculate the limit of (9) for h → 0, let us assume that L is chosen so that in (8)

L + j + μ ≥ 0   (10)

This guarantees the argument t in (9) to be strictly positive when letting h → 0+, which implies the derivatives of r(t) to exist of every order. Set ā_k = lim_{h→0+} â_k. Letting h → 0+ in (9) then gives

Σ_{ℓ=0}^{n} ā_ℓ p^(n−ℓ) r(t)|_{t=0+} = 0   (11)

Note that for t > 0 the covariance function r(t) satisfies the Yule-Walker equation

(p^n + a_1 p^(n−1) + ... + a_n) r(t) = Σ_{i=0}^{n} a_i p^(n−i) r(t) = 0   (12)

which is actually equation (11). Since both the true values {a_ℓ} and the limits {ā_ℓ} satisfy the same equation, the asymptotic consistency immediately follows.

Remark. Let μ_min denote the minimal value of μ that appears in any of the sums in (8), i.e.

μ_min = min_ℓ [ min_μ { μ : β_{n−ℓ,μ} ≠ 0 } ]   (13)

Then the condition (10) corresponds to

L ≥ −1 − μ_min   (14)

Note, though, that the condition is not needed at all when r(t) is arbitrarily differentiable, and that it can be relaxed when deg A(p) − deg B(p) = k > 1. In this case, in fact, it can be proved that r(t) has continuous derivatives at t = 0 up to (and including) order 2(k − 1). Hence, it is sufficient to restrict the minimization in (13) to those ℓ for which n − ℓ ≥ k. □
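Conditions (13)-(14) are easy to evaluate for a given set of difference approximations. The sketch below (not from the paper; plain Python) computes μ_min and the smallest admissible delay L for n = 2 when D is the central difference and D² = D(D·); the resulting L ≥ 1 matches the delay that works well in the numerical study of Section 5.

```python
def convolve_ops(c1, c2):
    """Compose two difference operators given as {shift: coefficient} dicts."""
    out = {}
    for j1, b1 in c1.items():
        for j2, b2 in c2.items():
            out[j1 + j2] = out.get(j1 + j2, 0.0) + b1 * b2
    return out

def minimal_L(ops):
    """mu_min over all operators appearing in (8), and L >= -1 - mu_min from (14)."""
    mu_min = min(min(op) for op in ops)
    return mu_min, -1 - mu_min

identity = {0: 1.0}                 # D^0
central  = {-1: -0.5, 1: 0.5}       # D, central difference
forward  = {0: -1.0, 1: 1.0}        # D, Euler's method

for name, D1 in [("central", central), ("forward", forward)]:
    D2 = convolve_ops(D1, D1)       # D^2 = D(D .), cf. Remark 4 in Section 5
    mu_min, L_min = minimal_L([identity, D1, D2])
    print(name, "mu_min =", mu_min, " smallest admissible L =", max(0, L_min))
```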
Nonsingularity of the system matrix R_N

It is sufficient to prove that the matrix R = lim_{N→∞} R_N is not singular (Söderström and Stoica, 1989). Introduce the linear nonsingular transformation T(h), whose j-th row forms the scaled backward difference (−1)^(j−1) h^(−(j−1)) (1 − q^(−1))^(j−1) of the first instrument; for n = 3, for instance,

T(h) = [ 1, 0, 0 ; −1/h, 1/h, 0 ; 1/h², −2/h², 1/h² ]   (rows separated by semicolons)

and set

S(N, h) = T(h) R_N

The transformation does not affect the solution of (7), but only the analysis of its asymptotic properties. The elements of the coefficient matrix S(N, h) are then (j, k = 1, ..., n)

[S(N, h)]_{j,k} = (1/N) Σ_t (1/h^(j−1)) (−1)^(j−1) [(1 − q^(−1))^(j−1) y(t − Lh − h)] D^(n−k) y(t)   (15)

Let A(q^(−1)) = 1 + a_1 q^(−1) + ... + a_n q^(−n); then it can easily be shown that

E [A(q^(−1)) x(t)] y(t) = E x(t) [A(q) y(t)] = A(q) E x(t) y(t)   (16)

Hence, from (2), (3), (15), (16),

lim_{N→∞} [S(N, h)]_{j,k} = (−1)^(j−1) h^(−(n−k)−(j−1)) (−1)^(j−1) (q − 1)^(j−1) B_{n−k}(q) E y(t − Lh − h) y(t) = D^(n−k+j−1) r(Lh + h)

Since the argument of r(t) remains strictly positive, independently of j, k,

lim_{h→0+} [S(∞, h)]_{j,k} = p^(n−k+j−1) r(t)|_{t=0+}

Define the matrix S by

[S]_{j,k} = p^(n−k+j−1) r(t)|_{t=0+},   j, k = 1, ..., n   (17)

The nonsingularity of the matrix S is shown in the following lemma.

Lemma 4.1. The matrix S is nonsingular.

Proof. Define x(t) = ( r(t)  p r(t)  ...  p^(n−1) r(t) )^T. By the Yule-Walker equation (12), (t > 0)

p x(t) = F x(t),   F = [ 0, 1, 0, ..., 0 ; 0, 0, 1, ..., 0 ; ... ; −a_n, −a_{n−1}, ..., −a_1 ]   (18)

and

S = ( F^(n−1) x(0+)   ...   F x(0+)   x(0+) )

Hence rank S equals the McMillan degree of the process, which is n, cf. (Brockett, 1970). □

This consideration concludes the examination of the IV approach. It has been found that the IV approach gives asymptotically consistent estimates of the autoregressive parameters of an ARMA model under mild conditions, using delayed outputs as instrumental variables.

5. NUMERICAL ILLUSTRATION

In this section, the performance of the instrumental variable estimates outlined in Section 4 is briefly illustrated by means of a numerical example. A more detailed illustration can be found in Söderström et al. (1992).

Estimations were performed using data from a uniformly sampled second order AR system

p² y(t) + 2 p y(t) + 2 y(t) = e(t)   (19)

The sampling period was taken as h = 0.2 s, which was found appropriate for the system under study. The types of derivative approximations used in the estimations are summarized in Table 1. The values of the instrumental time delay L were chosen as L = 0, 1 and 15, respectively. The data length was N = 5000. The IV method based on the derivative approximation is compared to an indirect method based on a discrete-time ARMA(2,2) model (Söderström and Angeby, 1990). In the latter case, a prediction error method is first used to estimate the ARMA parameters; the continuous-time AR parameters are then obtained by transforming the discrete-time poles to continuous time using the relation z = e^(ph), where z and p are the discrete-time and continuous-time poles, respectively.
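The pole-transformation step of the indirect method is straightforward once a discrete-time model is available. The sketch below is not from the paper and does not include the prediction error estimation itself; it only shows the mapping z = e^(ph) applied in reverse to (hypothetical) estimated discrete-time AR poles for a second order model.

```python
import numpy as np

def ct_ar2_from_dt_poles(z_poles, h):
    """Map discrete-time poles z to continuous-time poles p = ln(z)/h and
    return the CT AR coefficients (a1, a2) of p^2 + a1 p + a2."""
    p = np.log(np.asarray(z_poles, dtype=complex)) / h
    a1 = -(p[0] + p[1])
    a2 = p[0] * p[1]
    return a1.real, a2.real        # imaginary parts cancel for a complex pair

# Hypothetical example: the exact discrete-time poles of system (19), whose
# continuous-time poles are p = -1 +/- i, sampled with h = 0.2.
h = 0.2
p_true = np.array([-1 + 1j, -1 - 1j])
z = np.exp(p_true * h)
print(ct_ar2_from_dt_poles(z, h))   # recovers approximately (2.0, 2.0)
```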
Table 1: Derivative approximations used in the simulation study.

Name   Description                                Expression
B      Backward difference                        [y(t) − y(t−h)]/h
E      Euler's method                             [y(t+h) − y(t)]/h
R      IIR filter (see Remark 1)                  (1/h) R(q^(−1)) y(t)
T      Tustin's approximation                     (2/h) [(q−1)/(q+1)] y(t)
IM     Interpolating polynomial (see Remark 2)    (1/h) P(q) y(t)
Remarks:

1. The derivative approximation R in Table 1 is based on Rabiner and Steiglitz (1970), where the IIR filter is

R(q^(−1)) = (1.15 − 0.478 q^(−1) − 0.771 q^(−2)) / (1 + 0.86 q^(−1) + 0.1 q^(−2))

2. The derivative approximation IM in Table 1 is obtained by interpolating polynomials (e.g. see Carlsson, 1991), resulting in the expression

D y(t) = (1/h) Σ_{k=1}^{M} c_k [ y(t + kh) − y(t − kh) ],   c_k = [(−1)^(k+1)/k] · M! M! / [(M − k)! (M + k)!]

In the simulations the values M = 1, ..., 8 were considered (a small numerical sketch of these coefficients is given after these remarks).

3. A direct use of the Tustin approximation in a second order AR model gives

[(2/h) (q − 1)/(q + 1)]² y(t) + a_1 (2/h) [(q − 1)/(q + 1)] y(t) + a_2 y(t) = ε(t)

This direct implementation is not feasible since the gain of the Tustin approximation is infinite at ω = π. However, it can be rewritten as

(4/h²) (q − 1)² y(t) + a_1 (2/h) (q² − 1) y(t) + a_2 (q + 1)² y(t) = ε(t)

This form was used in the estimation.

4. The second order derivative is approximated according to D² y(t) = D(D y(t)). □
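The coefficients c_k in Remark 2 are simple to generate. The sketch below (not from the paper; plain Python) computes them for a given M and applies the resulting smoothed central difference; for M = 1 it reduces to the ordinary central difference, c_1 = 1/2.

```python
from math import factorial

def im_coefficients(M):
    """Coefficients c_k of the interpolating-polynomial differentiator (Remark 2):
    c_k = [(-1)^(k+1) / k] * M! M! / [(M - k)! (M + k)!],  k = 1, ..., M."""
    return [(-1) ** (k + 1) / k * factorial(M) ** 2
            / (factorial(M - k) * factorial(M + k)) for k in range(1, M + 1)]

def im_derivative(y, h, M):
    """D y(t) = (1/h) * sum_k c_k [y(t + k h) - y(t - k h)] on sampled data y."""
    c = im_coefficients(M)
    n = len(y)
    return [sum(ck * (y[t + k] - y[t - k]) for k, ck in enumerate(c, start=1)) / h
            for t in range(M, n - M)]

print(im_coefficients(1))   # [0.5]            -> central difference (I1)
print(im_coefficients(2))   # [2/3, -1/12]     -> fourth-order central difference
```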
The sample means and standard deviations of the parameter estimates from 150 realizations of the process are shown in Table 2. The conclusions from the simulation session are the following:

1. A high value of L leads to very inaccurate results. This follows the common experience that when an IV method is used in identification, a low value of the instrumental variable time delay L should be chosen.

2. Derivative approximation:
   • For the IM cases, the bias is in most cases lowest for L = 1 and low values of M. The standard deviation increases slightly when M is increased. A reasonable choice of derivative approximation appears to be I1.
   • The cases B, E and R give quite bad accuracy.
   • The Tustin approximation (T) has the same bias for L = 1 and L = 0, but a lower standard deviation for L = 0.

3. When the IV method is appropriately used, its performance is comparable to the indirect ARMA method.

In summary, the central difference with L = 1 and the Tustin approximation with L = 0 seem to be the best choices of derivative approximation in the IV method. In Söderström et al. (1992) this conjecture is also supported by a theoretical analysis limited to second order systems.

6. CONCLUSION

The problem of estimating the autoregressive parameters of continuous-time stochastic models from discrete-time data has been studied. We have considered methods based on discrete-time models parameterized with the continuous-time parameters via discrete-time approximations of the derivative operator. It is shown in some simple examples that, in general, the least squares method does not work for such parameterizations. Instead an IV method is suggested and analyzed. For large data sets and short sampling intervals, it is proved that, under weak assumptions, the IV method gives a good estimate of the autoregressive parameters of a continuous-time stochastic system. The method is numerically illustrated for a second order autoregressive stochastic system and compared with an indirect scheme based on a prediction error method. In summary, it seems that the suggested IV method is an interesting alternative for modeling continuous-time stochastic systems.
Table 2: Mean and standard deviation for the estimated AR(2) model over 150 realizations. Number of data points N = 5000.

Parameter a1 (true value 2):

         L = 1            L = 0            L = 15
       mean    std      mean    std      mean    std
I1     1.972   0.128    1.710   0.076    1.026   2.036
I2     2.087   0.145    2.176   0.101    1.008   1.971
I3     1.978   0.142    2.317   0.109    0.951   2.082
I4     1.994   0.138    2.364   0.113    0.957   2.070
I5     1.820   0.135    2.378   0.114    0.956   2.078
I6     1.778   0.133    2.378   0.115    0.954   2.091
I7     1.761   0.132    2.373   0.115    0.951   2.104
I8     1.734   0.131    2.365   0.115    0.948   2.117
B      1.974   0.134    1.090   0.063    1.253   1.663
E      1.971   0.149    1.973   0.100    1.030   1.841
R      1.831   0.133    2.350   0.121    1.220   1.994
T      2.010   0.201    2.011   0.139    0.992   1.845
ARMA   1.999   0.136

Parameter a2 (true value 2):

         L = 1            L = 0            L = 15
       mean    std      mean    std      mean    std
I1     2.001   0.099    1.850   0.078    1.965   2.208
I2     2.071   0.111    2.123   0.093    1.858   2.145
I3     1.979   0.108    2.180   0.097    1.871   2.146
I4     1.909   0.105    2.194   0.098    1.872   2.156
I5     1.864   0.104    2.195   0.098    1.872   2.163
I6     1.836   0.102    2.192   0.098    1.871   2.168
I7     1.818   0.102    2.187   0.098    1.870   2.172
I8     1.807   0.101    2.181   0.098    1.869   2.175
B      2.447   0.118    2.011   0.085    2.016   2.560
E      1.636   0.104    1.638   0.076    1.691   1.523
R      2.206   0.114    2.481   0.111    1.991   2.558
T      1.999   0.155    2.000   0.106    1.992   1.775
ARMA   2.003   0.105

7. REFERENCES

Bergstrom, A.R. (1976). Statistical Inference in Continuous Time Economic Models. North-Holland, Amsterdam.
Brockett, R.W. (1970). Finite Dimensional Linear Systems. John Wiley and Sons Inc., New York, NY.
Carlsson, B. (1991). Maximum flat digital differentiator. Electronics Letters, 27, 675-677.
Feliu, V., J.A. Cerrada and C. Cerrada (1988). An algorithm to compute the continuous state space model from its equivalent discrete model. Control Theory and Advanced Technology, 4, 231-241.
Ježek, J. (1990). Conversion of the polynomial continuous-time model to the delta discrete-time and vice versa. 11th IFAC World Congress, Tallinn, Estonia, August 13-17.
Keviczky, L. (1977). On the equivalence of discrete and continuous transfer functions. Problems of Control and Information Theory, 6, 111-128.
Kulchitskii, O.Y. and S.V. Skrobotov (1990). Identification of continuous-time linear dynamic stochastic systems. Automation and Remote Control, 51, 1095-1104. (Originally published in Russian in Avtomatika i Telemekhanika, 8, 106-118, August 1990.)
Middleton, R.H. and G.C. Goodwin (1990). Digital Control and Estimation: A Unified Approach. Prentice-Hall, Englewood Cliffs, NJ.
Parzen, E. (1984). Time Series Analysis of Irregularly Observed Data. Springer-Verlag, New York, NY.
Rabiner, L.R. and K. Steiglitz (1970). The design of wide-band recursive and nonrecursive differentiators. IEEE Transactions on Audio and Electroacoustics, 18, 204-209.
Sagara, S., Z.J. Yang and K. Wada (1991). Identification of continuous systems using digital low-pass filters. International Journal of Systems Science, 22, 1159-1176.
Sinha, N.K. and G.J. Lastman (1982). Identification of continuous-time multivariable systems from sampled data. International Journal of Control, 35, 117-126.
Sinha, N.K. and G.P. Rao (1991). Identification of Continuous-Time Systems. Kluwer Academic Publishers, Dordrecht, The Netherlands.
Söderström, T. and P. Stoica (1989). System Identification. Prentice Hall International, Hemel Hempstead, UK.
Söderström, T. and J. Angeby (1990). On the prediction error method for estimating continuous time models from unevenly sampled data. Report UPTEC 90084R, Department of Technology, Uppsala University, October 1990.
Söderström, T. (1991). Computing stochastic continuous-time models from ARMA models. International Journal of Control, 53, 1311-1326.
Söderström, T., B. Carlsson and S. Bigi (1992). On estimating continuous-time stochastic models from discrete-time data. Report UPTEC 92104R, Department of Technology, Uppsala University, July 1992.
Unbehauen, H. and G.P. Rao (1987). Identification of Continuous Systems. North-Holland, Amsterdam.
Young, P. (1981). Parameter estimation for continuous-time models: a survey. Automatica, 17, 23-29.