Automatica 35 (1999) 1073–1080
Brief Paper
Parameter identification with derivative shift operator parametrization

Lin Guo*, Masayoshi Tomizuka

Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA 94710, USA

Received 25 August 1997; revised 1 April 1998; received in final form 8 December 1998

* Correspondence address: Advanced Technology Group, Maxtor Corporation, 510 Cottonwood Drive, Milpitas, CA 95035, USA. Tel.: 001-408-432-4529; fax: 001-408-432-4424; e-mail: lin_guo@maxtor.com. This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor S. Sagara under the direction of Editor T. Söderström.
Abstract

A way of parametrizing linear systems using an operator which combines the two commonly used operators, the shift (q) and the delta (δ) operators, is proposed. It is shown that the proposed parametrization improves the numerical properties of the parameter identification process. By using the proposed operator, the condition number of the information matrix in the identification algorithm is significantly lower than that obtained with the q- or δ-operator. It is also shown that accurate and fast-converging parameter identification can be achieved by using the new operator at fast sampling rates, especially for systems with relatively wide bandwidth. The numerical superiority of the new parametrization is demonstrated by simulation results. © 1999 Elsevier Science Ltd. All rights reserved.
1. Introduction

Over the years, a lot of research has been done in analyzing and improving the properties of recursive parameter estimation algorithms. Among others, Ljung and Ljung (1985) studied error propagation properties of recursive least-squares adaptive algorithms, and Verhaegen (1989) analyzed the round-off error propagation in a particular class of least-squares estimation schemes. However, not much attention was paid to better parametrizations for discrete-time models until the mid-1980s, when ideas of utilizing new operators to reparametrize the input–output relationship were developed to address numerical issues in parameter identification. Middleton and Goodwin (1990) suggested the use of the δ-operator instead of the traditional q-operator to achieve better accuracy. It is claimed that the reparametrized transfer function using the δ-operator has a close connection with the corresponding transfer function in the continuous-time domain. The δ-operator model also has the advantage of better numerical properties at high sampling rates. Li and Fan (1997) studied the advantages of using the δ-operator in adaptive signal processing. Wahlberg (1991) studied the use of the Laguerre operator in system identification. He showed that models parametrized using the Laguerre operator are insensitive to the sampling rate and are able to give low-order approximations. Johansson (1994) studied the problem of continuous-time system estimation in the presence of colored noise and proposed an operator transformation which retains continuous-time parameters while allowing estimation to be carried out in the discrete-time domain. Li and Gevers (1992) proposed the c-operator and showed its advantages in giving more accurate parameter estimation and faster convergence under certain conditions.

In this paper, we propose to use a combined operator, the d-operator, which is a generalized form of the two commonly used operators, the q- and δ-operators, in parameter identification. It is shown that some improvements are obtained at fast sampling rates, especially for systems with comparatively wide bandwidths. One objective of this study is to provide mathematical support to intuitive observations about parameter identification based on the q- and δ-parametrizations, and to devise a new scheme which takes advantage of both parametrizations.

In Section 2, we study numerical problems associated with the q- and δ-realizations in parameter identification. In Section 3, the parametrization using the d-operator is proposed. The relationship between the coefficients of the d-realization and the δ-realization is established. Numerical properties of the d-realization
are studied. The procedure of parametrizing the transfer function using the d-operator is also given in Section 3. Some simulation results are given in Section 4 to show the advantages of using the d-operator parametrization.
2. Numerical issues associated with parameter estimation

In this section, we study the numerical properties of the q- and δ-operator parametrizations in parameter identification. Consider the input–output relationship of a linear system in the transfer function form

  G(q) = \frac{b_0 q^m + b_1 q^{m-1} + \cdots + b_m}{q^n + \bar{a}_1 q^{n-1} + \cdots + \bar{a}_n},   (1)

where the parameters \bar{a}_i and b_i are unknown. In parameter identification, the following model for the input–output relationship of the linear system is often used:

  y(k) = -\sum_{i=1}^{n} \bar{a}_i y(k-i) + \sum_{i=0}^{m} b_i u(k+m-n-i) = \theta_q^{T} \phi_q(k),   (2)

where

  \phi_q(k) = [\,-y(k-1) \; \cdots \; -y(k-n) \;\; u(k+m-n) \; \cdots \; u(k-n)\,]^{T}   (3)
is the regression vector and

  \theta_q = (\bar{a}_1 \; \cdots \; \bar{a}_n \;\; b_0 \; \cdots \; b_m)^{T}   (4)

is the unknown parameter vector. To estimate the unknown parameters, the least-squares method [11] is used. Assume we have N pairs of observations \{(y(k), \phi_q(k)),\; k = 1, 2, \ldots, N\}. The residual of the estimation is defined as

  e(k) = y(k) - \hat{y}(k) = y(k) - \phi_q^{T}(k)\,\hat{\theta}_q,  k = 1, 2, \ldots, N,   (5)

where \hat{\theta}_q is the estimated parameter vector. The loss function is defined as

  V = \frac{1}{2} \sum_{k=1}^{N} e^2(k).   (6)

The problem is to find the estimate of \theta_q such that V is minimized.

2.1. The information matrix and its condition number

The most commonly used least-squares method in the controls area is called the normal equations method [2]. By using this method, \hat{\theta}_q is given by

  \hat{\theta}_q = R_q^{-1} \sum_{k=1}^{N} \phi_q(k) y(k),   (7)
where R_q = \sum_{k=1}^{N} \phi_q(k)\phi_q^{T}(k) is called the information matrix. This method has the advantage of being computationally simple and of being computable recursively (Middleton and Goodwin, 1986). It is clear from Eq. (7) that the numerical accuracy of the estimates depends on the condition number of the information matrix, which is defined as

  \kappa(R_q) = \frac{\sigma_{\max}(R_q)}{\sigma_{\min}(R_q)},   (8)

where \sigma_{\max}(R_q) and \sigma_{\min}(R_q) denote the largest and smallest singular values of R_q, respectively (Golub and Van Loan, 1989).

The information matrix is obtained from the regression vector, which is constructed from signals. From Eq. (7) it is clear that R_q has to be non-singular for \hat{\theta}_q to be uniquely determined. This can be achieved by the "persistent excitation" condition, which is satisfied by "sufficiently rich" signals (Astrom and Wittenmark, 1995). In parameter estimation, we generally assume that this condition can be satisfied by proper design of the input signal (Ljung, 1987). However, even if the "persistent excitation" condition is satisfied, the estimated parameters can still have significant errors. It is known that the relative error of the solution \hat{\theta}_q, which is the norm of the parameter estimation error normalized by the norm of the actual parameter, is proportional to the condition number of the information matrix. It is also known that the condition number influences the convergence rate (Gevers and Li, 1993). Therefore, we want the information matrix to be well conditioned in order to obtain good estimates.

2.2. Study of q- and δ-operator parametrizations

We first study the q-operator parametrization. The transfer function written in the q-operator form is given by Eq. (1). It is restated as

  G(q) = \frac{\sum_{i=0}^{m} b_i q^{m-i}}{q^n + \sum_{i=1}^{n} \bar{a}_i q^{n-i}},  m \le n.   (9)

The regression vector for filtered signals can be written as
  \bar{\phi}_q^{T}(k) = \left[ -\frac{q^{n-1} y(k)}{F(q)} \; \cdots \; -\frac{y(k)}{F(q)} \;\; \frac{q^{m} u(k)}{F(q)} \; \cdots \; \frac{u(k)}{F(q)} \right]
                      = \left[ -\frac{y(k+n-1)}{F(q)} \; \cdots \; -\frac{y(k)}{F(q)} \;\; \frac{u(k+m)}{F(q)} \; \cdots \; \frac{u(k)}{F(q)} \right],   (10)

where F(q) is a prefilter, which is a standard tool in system identification used to reduce the bias and to avoid pure differentiation in the case where the δ-operator is used (Wahlberg, 1991).
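As an illustration of Eqs. (3), (7) and (8), the following sketch forms the regression vectors of a simple ARX model, builds the information matrix R_q and solves the normal equations. It is not taken from the paper: the second-order plant, the noise level and the data length are made-up values, and the prefilter of Eq. (10) is omitted for brevity.

```python
# Illustrative sketch (not from the paper): least-squares estimation via the
# normal equations for a q-operator ARX model, together with the condition
# number of the information matrix, Eq. (8).  All plant values are made up.
import numpy as np

rng = np.random.default_rng(0)

# made-up second-order plant  y(k) = -a1*y(k-1) - a2*y(k-2) + b0*u(k-1) + b1*u(k-2)
a1, a2, b0, b1 = -1.5, 0.7, 1.0, 0.5
N = 500
u = rng.standard_normal(N)                 # persistently exciting input
y = np.zeros(N)
for k in range(2, N):
    y[k] = -a1 * y[k-1] - a2 * y[k-2] + b0 * u[k-1] + b1 * u[k-2] \
           + 0.01 * rng.standard_normal()  # small measurement noise

# rows of Phi are phi_q^T(k), Eq. (3)
Phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
Y = y[2:]

# information matrix R_q = sum_k phi(k) phi(k)^T and its condition number, Eq. (8)
R_q = Phi.T @ Phi
sigma = np.linalg.svd(R_q, compute_uv=False)
print("condition number kappa(R_q) =", sigma[0] / sigma[-1])

# normal-equations estimate, Eq. (7)
theta_hat = np.linalg.solve(R_q, Phi.T @ Y)
print("estimated [a1, a2, b0, b1] =", theta_hat)
```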
As has been pointed out by many researchers (Gevers and Li, 1993; Ninness and Goodwin, 1991), the q-operator parametrization has serious numerical problems in parameter estimation at fast sampling rates. The reason is that the difference between consecutive sampled data points is very small, which results in the elements of \phi_q(k) becoming very close to each other. This in turn results in rows of the information matrix that are very close to each other. As a result, the information matrix becomes ill conditioned. It is therefore desirable to alter the signals in the regression vector to make the information matrix better conditioned.

Middleton and Goodwin (1986) proposed to use the δ-operator, \delta = (q-1)/\Delta, where \Delta is a positive number. By using the δ-operator, some sense of "differentiation" is introduced. Rather than the direct use of the input and output sequences as in the q-operator parametrization, their rates of change are used in the regression vector. In this way, the regression vector is made "richer" to alleviate the numerical problems of the q-operator parametrization. Note that in this case \Delta in the δ-operator is the sampling time T. The transfer function written in the δ-operator form is given as

  G(\delta) = \frac{\sum_{i=0}^{m} b_i \delta^{m-i}}{\delta^n + \sum_{i=1}^{n} a_i \delta^{n-i}},  m \le n.   (11)

The corresponding regression vector is

  \phi_\delta^{T}(k) = \left[ -\frac{\delta^{n-1} y(k)}{F(\delta)} \; \cdots \; -\frac{\delta y(k)}{F(\delta)} \; -\frac{y(k)}{F(\delta)} \;\; \frac{\delta^{m} u(k)}{F(\delta)} \; \cdots \; \frac{u(k)}{F(\delta)} \right],   (12)

where F(\delta) is a prefilter in the δ-operator. The unknown parameter vector is

  \theta_\delta = (a_1 \; \cdots \; a_n \;\; b_0 \; \cdots \; b_m)^{T}.   (13)

The conversion between \hat{\theta}_q and \hat{\theta}_\delta is shown by Young et al. (1991). Although the δ-operator parametrization reduces some of the numerical problems associated with the q-operator parametrization [9], it also has its own weakness. Notice that the magnitude of the δ-operator is

  |\delta| = \frac{|e^{j\omega T} - 1|}{T} = \frac{1}{T}\,|\cos\omega T + j\sin\omega T - 1|.   (14)
For fast sampling rates (say, sampling frequencies of 10 times the bandwidth or more), we have |\delta| \approx |j\omega| = \omega at frequencies below the bandwidth of the prefilter, which is denoted BW. As a consequence, for a comparatively high BW, as the order of the system increases, the difference between the magnitude of the term with the highest order of δ and that of the term with the lowest order increases dramatically. The last term dominates at low frequencies, while the first term dominates at high frequencies. As a result, the end terms in the regression vector provide little new information, and the information matrix R_\delta for the δ-operator parametrization becomes ill conditioned.

In some special cases, for example when we assume a narrow BW and have knowledge about the variances of both the input and output signals, the diagonal elements of the information matrix can be made identical by proper scaling (Gevers and Li, 1993). Unfortunately, this is not true in general, and we expect R_\delta to be ill conditioned when the system order is high and a prefilter with a large bandwidth is used.

Example 1 in Section 4 shows simulation results that illustrate the condition numbers of R_q and R_\delta for different sampling rates and BW. The results in Fig. 1 show that as the sampling rate increases, R_q becomes more and more ill conditioned, while the condition number of R_\delta first drops and then remains roughly constant as the sampling rate increases further. On the other hand, for any given sampling rate, the condition number of R_q drops as the BW increases. This is due to richer input and output signals, which cause the elements of R_q to differ from each other. At the same time, the condition number of R_\delta increases with BW, because of the larger magnitude differences among the elements of the regression vector.

Fig. 1. Condition numbers for q- and δ-operator parametrizations with different prefilter bandwidths (prefilter BW).
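The magnitude argument behind Eq. (14) is easy to check numerically. The sketch below (illustrative only; the sampling rate, the highest power of δ and the bandwidth values are made up) evaluates |δ(e^{jωT})| over the prefilter passband and prints the spread between the highest- and lowest-order δ terms, which is the magnitude imbalance that makes R_δ ill conditioned for wide BW.

```python
# Illustrative check of Eq. (14): the spread between the highest- and
# lowest-order delta terms over the prefilter passband grows quickly with
# the bandwidth.  Sampling rate, power p and bandwidth values are made up.
import numpy as np

T = 1.0 / 250.0                     # sampling time (250 Hz sampling, made up)
p = 3                               # highest power of delta in the regressor (made up)
for bw_hz in (2.0, 6.0, 8.0):       # candidate prefilter bandwidths
    w = np.linspace(0.5, 2.0 * np.pi * bw_hz, 400)       # passband frequencies [rad/s]
    delta_mag = np.abs(np.exp(1j * w * T) - 1.0) / T     # |delta(e^{jwT})|, Eq. (14)
    spread = delta_mag.max() ** p                        # ratio to the delta^0 term (= 1)
    print(f"BW = {bw_hz:3.0f} Hz: largest |delta|^{p} term / smallest term = {spread:10.1f}")
```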
3. Parameter identification using the d-operator parametrization

In the previous section we discussed the numerical problems associated with the q- and δ-operator parametrizations. Here we propose to use the combined operator, the d-operator. We will show how the d-realization may be used to improve parameter identification. Again, our emphasis is on the condition number of the information matrix.
3.1. The derivative shift operator parametrization

The d-operator is a combination of the q- and δ-operators. It is defined as

  d_k^l \triangleq q^l \delta^k,   (15)

where l is the number of q-operations and k is the number of δ-operations. The transfer function of a discrete-time system in terms of the d-operator can be written in the following form:

  G_d(d) = \frac{\sum_{i=0}^{p} \sum_{j=0}^{r} b_{ij}\, d_i^j}{\sum_{k=0}^{s} \sum_{l=0}^{g} a_{kl}\, d_k^l},   (16)

where p + r = m and s + g = n are the orders of the numerator and the denominator, respectively, r and g are the highest orders of the q-operations in the numerator and the denominator, respectively, and p and s are the highest orders of the δ-operations in the numerator and the denominator, respectively. It can be seen that the q- and δ-operator parametrizations are two special cases of the d-operator parametrization: by setting p = s = 0 or r = g = 0, the d-parametrization is reduced to the q-operator parametrization or the δ-operator parametrization, respectively.

Eq. (16) is the most general form of a transfer function in the d-operator. In practice, fewer terms may be used in the numerator and the denominator. A procedure for
mixing the q- and δ-operators for system identification is as follows:

• Pick the highest order of the δ-operation to be used in the parametrization, k, using the guidelines detailed in Section 4. Note that the order of the system is assumed to be known, and further research is needed for better guidelines.

• Assume we have the following δ-operator parametrization:

  (a_n \delta^n + a_{n-1} \delta^{n-1} + \cdots + a_0)\, y(t) = (b_m \delta^m + \cdots + b_0)\, u(t).   (17)

• Rewrite Eq. (17) as

  [\,\delta^k (a_n \delta^{n-k} + \cdots + a_k) + a_{k-1} \delta^{k-1} + \cdots + a_0\,]\, y(t) = [\,\delta^k (b_m \delta^{m-k} + \cdots + b_k) + b_{k-1} \delta^{k-1} + \cdots + b_0\,]\, u(t).   (18)

• Change the δ's inside the parentheses to (q-1)/\Delta, make certain rearrangements (a numerical sketch of this step is given after the list), and obtain

  (a_n^* d_k^{n-k} + \cdots + a_{k+1}^* d_k^1 + a_k^* \delta^k + a_{k-1}^* \delta^{k-1} + \cdots + a_1^* \delta + a_0^*)\, y(t) = (b_m^* d_k^{m-k} + \cdots + b_{k+1}^* d_k^1 + b_k^* \delta^k + b_{k-1}^* \delta^{k-1} + \cdots + b_1^* \delta + b_0^*)\, u(t).   (19)

The coefficients of the transfer functions in the δ- and d-operators have the linear relationship given in Eqs. (20) and (21), where C_i^k = k!/(i!\,(k-i)!).
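The substitution step in Eqs. (17)-(19) can be carried out numerically. The sketch below is one reading of that step (not code from the paper): it expands δ = (q−1)/Δ inside the parentheses of Eq. (18) with numpy's polynomial tools, mapping the δ-coefficients of order k and above to the starred coefficients of Eq. (19). The numerator of Eq. (41) is used as the example polynomial, while the values of Δ and k are assumed here for illustration.

```python
# Sketch of the coefficient conversion implied by Eqs. (17)-(19): delta terms
# of order >= k are rewritten as powers of q times delta^k by expanding
# delta = (q - 1)/Delta.  Delta and k below are assumed values.
import numpy as np
from numpy.polynomial import polynomial as P

def delta_to_mixed(coeffs, k, Delta):
    """coeffs[i] multiplies delta^i in Eq. (17) (i = 0..n).
    Returns (low, high):
      low[i]  multiplies delta^i        for i = 0..k-1  (left untouched here),
      high[l] multiplies q^l * delta^k  for l = 0..n-k  (starred coefficients)."""
    coeffs = np.asarray(coeffs, dtype=float)
    low, inner = coeffs[:k], coeffs[k:]          # split as in Eq. (18)
    q_minus_1 = np.array([-1.0, 1.0])            # polynomial (q - 1), ascending powers of q
    high = np.zeros(len(inner))
    for j, aj in enumerate(inner):               # substitute delta = (q - 1)/Delta
        term = aj / Delta**j * P.polypow(q_minus_1, j)   # a_{k+j} * ((q-1)/Delta)^j
        high[:len(term)] += term
    return low, high

# numerator of Eq. (41), with k = 2 delta-operations and an assumed Delta = 0.01 s
low, high = delta_to_mixed([1.0, 0.75, 1.2, 3.0], k=2, Delta=0.01)
print("coefficients of delta^0 .. delta^(k-1):       ", low)
print("starred coefficients of q^l*delta^k, l=0..n-k:", high)
```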
3.2. Recursive computation of the regression vector

From Eq. (19), the regression vector of the d-operator parametrization is written as

  \phi_d^{T}(k) = \left[ -\frac{d_k^{n-k} y(k)}{F(d)} \; \cdots \; -\frac{\delta^{k} y(k)}{F(d)} \; -\frac{\delta^{k-1} y(k)}{F(d)} \; \cdots \; -\frac{y(k)}{F(d)} \;\; \frac{d_k^{m-k} u(k)}{F(d)} \; \cdots \; \frac{\delta^{k} u(k)}{F(d)} \; \frac{\delta^{k-1} u(k)}{F(d)} \; \cdots \; \frac{u(k)}{F(d)} \right],   (22)

where F(d) is the prefilter defined as

  F(d) = f_n^* d_k^{n-k} + f_{n-1}^* d_k^{n-k-1} + \cdots + f_k^* \delta^{k} + f_{k-1}^* \delta^{k-1} + \cdots + f_0^*.   (23)

The elements of the regression vector are calculated in the following recursive way. First, note that

  F(d)\, y(k) = f_n^* d_k^{n-k} y(k) + f_{n-1}^* d_k^{n-k-1} y(k) + \cdots + f_k^* \delta^{k} y(k) + f_{k-1}^* \delta^{k-1} y(k) + \cdots + f_0^* y(k).   (24)

From Eq. (24), we have

  \frac{d_k^{l} y(k)}{F(d)} = -\frac{f_{n-1}^*}{f_n^*} \frac{d_k^{l-1} y(k)}{F(d)} - \cdots - \frac{f_k^*}{f_n^*} \frac{\delta^{k} y(k)}{F(d)} - \frac{f_{k-1}^*}{f_n^*} \frac{\delta^{k-1} y(k)}{F(d)} - \cdots - \frac{f_0^*}{f_n^*} \frac{y(k)}{F(d)} + \frac{y(k)}{f_n^*},   (25)

where l = n - k. Hence, we have

  \frac{d_k^{l} y(k)}{F(d)} = -\Lambda^{T} \phi_{dy}(k) + \frac{y(k)}{f_n^*},   (26)

where

  \Lambda^{T} = \left[ \frac{f_0^*}{f_n^*} \; \cdots \; \frac{f_{k-1}^*}{f_n^*} \; \frac{f_k^*}{f_n^*} \; \cdots \; \frac{f_{n-1}^*}{f_n^*} \right]   (27)

and

  \phi_{dy}^{T}(k) = \left[ \frac{y(k)}{F(d)} \; \cdots \; \frac{\delta^{k-1} y(k)}{F(d)} \; \cdots \; \frac{d_k^{l-1} y(k)}{F(d)} \right].   (28)

Similarly,

  \frac{d_k^{p} u(k)}{F(d)} = -\Lambda^{T} \phi_{du}(k) + \frac{u(k)}{f_n^*},   (29)

where

  \phi_{du}^{T}(k) = \left[ \frac{u(k)}{F(d)} \; \cdots \; \frac{\delta^{k-1} u(k)}{F(d)} \; \cdots \; \frac{d_k^{p-1} u(k)}{F(d)} \right]   (30)

and p = m - k, where m is the order of the numerator.

By defining the state-transition matrix Φ in Eq. (31) and

  \Gamma_y^{T} = \left[ 0 \; \cdots \; 0 \; \frac{1}{f_n^*} \right],   (32)

we have

  \phi_{dy}(k+1) = \Phi \phi_{dy}(k) + \Gamma_y y(k).   (33)

Similarly, we have

  \phi_{du}(k+1) = \Phi \phi_{du}(k) + \Gamma_u u(k),   (34)

where

  \Gamma_u^{T} = \left[ 0 \; \cdots \; 0 \; \frac{1}{f_n^*} \right].   (35)

It is clear that the order of δ in the vectors \phi_{dy} and \phi_{du} can be chosen lower than the order of the system n. This reduces the numerical problems of the δ-operator parametrization. On the other hand, since derivatives are introduced into \phi_{dy}(k) and \phi_{du}(k) to a certain degree, the numerical problems of the q-operator parametrization can be alleviated.
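Since the matrix of Eq. (31) is not reproduced above, the following sketch reconstructs one consistent choice of Φ for the recursion in Eq. (33): the filtered signals of Eq. (28) are updated through δx_j = x_{j+1} for the δ-chain, qx_j = x_{j+1} for the q-chain, and Eq. (26) for the top element. The prefilter, the order of δ-operation k and the sampling time are example values, so this is an illustration rather than the paper's exact realization.

```python
# Sketch of the recursive regressor computation of Eqs. (26) and (32)-(34).
# Phi is reconstructed from the q- and delta-update relations between the
# filtered signals of Eq. (28); prefilter, k and Delta are example values.
import numpy as np

def build_recursion(f_star, k, Delta):
    """f_star[j] = f*_j, j = 0..n, the prefilter coefficients of Eq. (23)."""
    n = len(f_star) - 1
    Phi = np.zeros((n, n))
    for j in range(k):                 # delta-chain: x_j(t+1) = x_j(t) + Delta*x_{j+1}(t)
        Phi[j, j] = 1.0
        Phi[j, j + 1] = Delta
    for j in range(k, n - 1):          # q-chain: x_j(t+1) = x_{j+1}(t)
        Phi[j, j + 1] = 1.0
    Phi[n - 1, :] = -np.asarray(f_star[:-1]) / f_star[-1]   # top element, Eq. (26)
    Gamma = np.zeros(n)
    Gamma[-1] = 1.0 / f_star[-1]       # Eq. (32)
    return Phi, Gamma

def filtered_regressor(signal, f_star, k, Delta):
    """Run Eq. (33): phi(t+1) = Phi*phi(t) + Gamma*signal(t); returns all phi(t)."""
    Phi, Gamma = build_recursion(f_star, k, Delta)
    x = np.zeros(len(Gamma))
    out = []
    for s in signal:
        out.append(x.copy())
        x = Phi @ x + Gamma * s
    return np.array(out)

# example: prefilter F(delta) = (delta + 2)^3 = delta^3 + 6 delta^2 + 12 delta + 8,
# rewritten with k = 2 as F = (1/Delta) q delta^2 + (6 - 1/Delta) delta^2 + 12 delta + 8
Delta = 0.01
f_star = [8.0, 12.0, 6.0 - 1.0 / Delta, 1.0 / Delta]   # f*_0 .. f*_3
phi_y = filtered_regressor(np.ones(400), f_star, k=2, Delta=Delta)
print(np.round(phi_y[-1], 4))   # for a constant input, tends to [1/F(0), 0, 0] = [0.125, 0, 0]
```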
4. Simulations

Example 1. In this example, we compare the condition numbers of the δ- and the q-operator parametrizations of a third-order FIR model; i.e., the model transfer function does not contain denominator parameters. The filters used are of the form

  F(\delta) = (\delta + a)(\delta + b)

and

  F(q) = [\,q - (1 - aT)\,][\,q - (1 - bT)\,].   (36)

By using different values for a and b, we can change the BW of the prefilter.
The regression vector is

  \phi_\delta^{T}(k) = \left[ \frac{\delta^{3} u(k)}{F(\delta)} \; \cdots \; \frac{\delta u(k)}{F(\delta)} \; \frac{u(k)}{F(\delta)} \right]   (37)

for the δ-parametrization, or

  \phi_q^{T}(k) = \left[ \frac{q^{3} u(k)}{F(q)} \; \cdots \; \frac{q u(k)}{F(q)} \; \frac{u(k)}{F(q)} \right]   (38)

for the q-parametrization.
The input is a square wave with a frequency of 1 Hz. The simulation results are shown in Fig. 1. Some comments on this simulation have been given in Section 2.2.

Example 2. In this example, we show that the information matrix of the d-operator parametrization does have a lower condition number than those of the q- and the δ-operator parametrizations. The numerator has an order of 3, and the filter used is of the form

  F(\delta) = (\delta + a)(\delta + b).   (39)
Fig. 2. Condition numbers for the d-operator parametrization of different orders with BW = 3 Hz (the number denotes the number of δ-operations).
The regression vectors in d-operator form with different orders of δ-operation are

  \phi_{d_k}^{T}(k) = \left[ \frac{d_k^{3-k} u(k)}{F_k(d)} \; \cdots \; \frac{u(k)}{F_k(d)} \right],  k = 1, 2, 3.   (40)

The input is a square wave with a frequency of 1 Hz. Fig. 2 shows the condition numbers of the d-operator parametrizations with different orders of δ-operation. Fig. 3 shows the comparison of the parametrizations using the three operators; in this case, the order of δ-operation in the d-operator parametrization is two. As we can see in the figure, the d-operator parametrization has the lowest condition number.

Fig. 3. Condition numbers for q-, δ- and d-operator parametrizations with BW = 3 Hz.
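In the same spirit as Examples 1 and 2, but not with the paper's exact settings (the sampling rate and the prefilter constants a and b below are made up), the following sketch prefilters a 1 Hz square wave, builds the q-, δ- and mixed d-operator (k = 2) regressors of a third-order FIR model, and prints the condition numbers of the corresponding information matrices.

```python
# Illustrative comparison of the condition numbers in Examples 1-2 (made-up
# sampling rate and prefilter constants; the resulting numbers therefore
# differ from the paper's figures).
import numpy as np

fs, Delta = 250.0, 1.0 / 250.0                 # sampling rate (made up)
a = b = 2.0 * np.pi * 3.0                      # prefilter constants, roughly a 3 Hz BW (made up)
t = np.arange(0.0, 20.0, Delta)
u = np.sign(np.sin(2.0 * np.pi * 1.0 * t))     # 1 Hz square-wave input

def one_pole(sig, pole):                       # 1/(delta + pole): x(k+1) = (1 - pole*Delta)x(k) + Delta*sig(k)
    out = np.zeros_like(sig)
    for k in range(len(sig) - 1):
        out[k + 1] = (1.0 - pole * Delta) * out[k] + Delta * sig[k]
    return out

uf = one_pole(one_pole(u, a), b)               # u(k)/F(delta) with F(delta) = (delta+a)(delta+b)

def shift(x, j):                               # q^j x(k) = x(k + j)
    return x[j:]

def delta_op(x, j):                            # delta^j x(k) via repeated forward differences
    for _ in range(j):
        x = np.diff(x) / Delta
    return x

n = len(uf) - 4                                # common length after shifts/differences
def info_matrix(columns):
    Phi = np.column_stack([c[:n] for c in columns])
    return Phi.T @ Phi

R_q     = info_matrix([shift(uf, 3), shift(uf, 2), shift(uf, 1), uf])
R_delta = info_matrix([delta_op(uf, 3), delta_op(uf, 2), delta_op(uf, 1), uf])
R_d2    = info_matrix([shift(delta_op(uf, 2), 1), delta_op(uf, 2), delta_op(uf, 1), uf])  # k = 2

for name, R in [("q", R_q), ("delta", R_delta), ("d (k=2)", R_d2)]:
    s = np.linalg.svd(R, compute_uv=False)
    print(f"{name:8s} cond(R) = {s[0] / s[-1]:.3e}")
```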
Example 3 (Parameter estimation). The numerical advantages of the d-operator parametrization are further demonstrated in this example. The plant used in the simulation is

  G(\delta) = \frac{1.0 + 0.75\,\delta + 1.2\,\delta^2 + 3.0\,\delta^3}{F(\delta)},   (41)

where F(\delta) is the prefilter, whose parameters are assumed to be known, while the coefficients in the numerator are to be identified using a standard recursive least-squares algorithm. In the simulation, parameter estimates in the q- and δ-parametrizations are converted to the corresponding values in the d-parametrization. Note that the order of the δ-operation in the d-parametrization is chosen to be 2. This is based on the order of the system and the BW of the prefilter: from the results given in Example 2, it is clear that the d-parametrization with k = 2 has the lowest overall condition number in the frequency range of interest.
The input signal is white noise. The estimated numerator coefficients are shown in Figs. 4–7. When the sampling rate is comparatively low and the prefilter has a narrow bandwidth, all three parametrizations give good estimates (Fig. 4). As the sampling rate increases, the q-operator parametrization becomes ill conditioned and the identification process eventually blows up (Fig. 5). On the other hand, when the prefilter bandwidth increases, the identification process using the δ-operator parametrization becomes very sluggish (Figs. 6 and 7). In all cases, the d-operator parametrization gives good identification results.
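The "standard recursive least-squares algorithm" used in Example 3 is the usual gain/covariance recursion. The sketch below is a generic textbook implementation, not the authors' simulation code; the data-generating regression is made up (white-noise regressor entries), with the numerator coefficients of Eq. (41) taken as the true parameters.

```python
# Generic recursive least-squares (RLS) update of the kind used in Example 3.
# Textbook implementation; the regressor construction and noise level are made up.
import numpy as np

def rls_step(theta, P, phi, y, lam=1.0):
    """One RLS update with forgetting factor lam."""
    Pphi = P @ phi
    gain = Pphi / (lam + phi @ Pphi)
    err = y - phi @ theta
    theta = theta + gain * err
    P = (P - np.outer(gain, Pphi)) / lam
    return theta, P

rng = np.random.default_rng(1)
theta_true = np.array([1.0, 0.75, 1.2, 3.0])    # numerator coefficients of Eq. (41)
theta = np.zeros(4)
P = 1.0e4 * np.eye(4)                           # large initial covariance
for k in range(2000):
    phi = rng.standard_normal(4)                # made-up regressor (white-noise entries)
    y = theta_true @ phi + 0.01 * rng.standard_normal()
    theta, P = rls_step(theta, P, phi, y)
print("estimated coefficients:", np.round(theta, 3))
```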
Fig. 4. Parameter estimates with sampling frequency = 50 Hz and prefilter BW = 2 Hz.

Fig. 5. Parameter estimates with sampling frequency = 100 Hz and prefilter BW = 2 Hz.

Fig. 6. Parameter estimates with sampling frequency = 250 Hz and prefilter BW = 6 Hz (estimates using the q-operator are not shown since they are divergent).

Fig. 7. Parameter estimates with sampling frequency = 250 Hz and prefilter BW = 8 Hz (estimates using the q-operator are not shown since they are divergent).
We summarize the observations from Examples 1–3 in the following remarks.

Remark 1. There are two reasons for the information matrix to become ill conditioned, which leads to poor identification.

• Diagonal elements of the information matrix play a very important role. When the relative magnitudes among those elements become large, the identification process becomes unbalanced and the parameters corresponding to the small elements "die". This is clearly shown in the simulation of the δ-operator parametrization with a comparatively wide bandwidth.

• When the differences between elements of the information matrix become small, the matrix is close to singular (ill conditioned), which causes the identification process to blow up. This happens to the q-operator parametrization when the sampling rate is high.

Remark 2. The d-operator parametrization has superior numerical properties in parameter estimation. Proper selection of the order of δ-operation in the d-operator parametrization is important and needs further investigation. Meanwhile, we give the following guidelines (a small numerical check of the first guideline is sketched after the list):

• Given the prefilter bandwidth, choose the order of δ-operation so that the relative magnitude among the diagonal elements of the information matrix stays less than a factor of 10 000.

• Under the assumption that the above condition is satisfied, choose the order of the δ-operation as high as possible.
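A minimal sketch of the first guideline (an illustration, not from the paper): given an information matrix, check whether the spread of its diagonal elements stays below the suggested factor of 10 000. The example matrix is made up.

```python
# Check of the diagonal-spread guideline in Remark 2: the ratio between the
# largest and smallest diagonal elements of the information matrix should
# stay below about 1e4.  The example matrix is made up.
import numpy as np

def diagonal_spread_ok(R, max_ratio=1.0e4):
    d = np.abs(np.diag(R))
    return d.max() / d.min() <= max_ratio

R_example = np.diag([2.0e3, 4.0e1, 1.0])                      # made-up diagonal
print("guideline satisfied:", diagonal_spread_ok(R_example))  # 2000/1 < 1e4 -> True
```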
5. Conclusions

Some numerical properties of the q- and δ-operator parametrizations in parameter identification have been studied. To improve the quality of parameter identification, the d-operator parametrization is suggested. It has been shown that the d-realization gives a smaller condition number of the information matrix, which may lead to better identification of the parameters. Simulation results have shown the superior numerical properties of the d-realization.
References

Astrom, K. J., & Wittenmark, B. (1990). Computer controlled systems: theory and design (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Astrom, K. J., & Wittenmark, B. (1995). Adaptive control (2nd ed.). Reading, MA: Addison-Wesley.
Gevers, M., & Li, G. (1993). Parametrizations in control, estimation and filtering problems. Berlin: Springer.
Golub, G. H., & Van Loan, C. F. (1989). Matrix computations. Baltimore, MD: The Johns Hopkins University Press.
Johansson, R. (1994). Identification of continuous-time models. IEEE Transactions on Signal Processing, 42(4), 887–897.
Li, G., & Gevers, M. (1992). Data filtering, reparameterization, and numerical accuracy of parameter estimators. In Proceedings of the 31st IEEE Conference on Decision and Control (pp. 3692–3697). Tucson, Arizona.
Li, Q., & Fan, H. (1997). On properties of identification matrices of delta-operator based adaptive signal processing algorithms. IEEE Transactions on Signal Processing, 45(10), 2454–2467.
Ljung, L. (1987). System identification. Englewood Cliffs, NJ: Prentice-Hall.
Ljung, S., & Ljung, L. (1985). Propagation properties of recursive least-squares adaptive algorithms. Automatica, 21(2), 157–167.
Middleton, R. H., & Goodwin, G. C. (1986). Improved finite word length characteristics in digital control using delta operators. IEEE Transactions on Automatic Control, 31(11), 1015–1021.
Middleton, R. H., & Goodwin, G. C. (1990). Digital control and estimation: a unified approach. Englewood Cliffs, NJ: Prentice-Hall.
Ninness, B. M., & Goodwin, G. C. (1991). The relationship between discrete time and continuous time linear estimation. In Identification of continuous-time systems (pp. 79–122). The Netherlands: Kluwer Academic Publishers.
Verhaegen, M. (1989). Roundoff error propagation in four generally applicable recursive least-squares estimation schemes. Automatica, 25(3), 437–444.
Wahlberg, B. (1991). System identification using Laguerre models. IEEE Transactions on Automatic Control, 36(5), 551–562.
Young, P. C., Chotai, A., & Tych, W. (1991). Identification, estimation and control of continuous-time systems described by delta operator models. In Identification of continuous-time systems (pp. 363–418). The Netherlands: Kluwer Academic Publishers.
Lin Guo received the B.Eng. degree in mechanical engineering from Shanghai Jiao Tong University, China, in 1984, the M.Eng. degree in mechanical engineering from McGill University, Montreal, Canada, in 1987, and the Ph.D. degree in mechanical engineering from the University of California at Berkeley, Berkeley, California, in 1996. Between 1988 and 1993 he was an experimental scientist at the Commonwealth Scientific and Industrial Research Organization, Australia. Since 1996, he has been with the Advanced Technology Group, Maxtor Corporation, California, where he heads the Micro-actuator Development Project. His research interests include digital motion control, optimal control, system identification, mechatronics and disk drive dynamics. Dr. Guo is a member of IEEE.

Masayoshi Tomizuka was born in Tokyo, Japan in 1946. He received his B.S. and M.S. degrees in Mechanical Engineering from Keio University, Tokyo, Japan and his Ph.D. degree in Mechanical Engineering from the Massachusetts Institute of Technology in February 1974. In 1974, he joined the faculty of the Department of Mechanical Engineering at the University of California at Berkeley, where he currently holds the Cheryl and John Neerhout, Jr., Distinguished Professorship Chair. At UC Berkeley, he teaches courses in dynamic systems and controls. His current research interests are optimal and adaptive control, digital control, signal processing, motion control, and control problems related to robotics, machining, manufacturing, information storage devices and vehicles. He has served as a consultant to various organizations, including Lawrence Berkeley Laboratory, General Electric, General Motors and United Technologies. He served as Technical Editor of the ASME Journal of Dynamic Systems, Measurement and Control, J-DSMC (1988–1993). He currently serves as an Associate Editor of the Journal of the International Federation of Automatic Control, Automatica, and the European Journal of Control, and is Editor-in-Chief of the IEEE/ASME Transactions on Mechatronics. He was General Chairman of the 1995 American Control Conference, and currently serves as President of the American Automatic Control Council. He is a Fellow of the ASME, the Institute of Electrical and Electronics Engineers (IEEE) and the Society of Manufacturing Engineers. He is the recipient of the J-DSMC Best Paper Award (1995), the DSCD Outstanding Investigator Award (1996) and the Charles Russ Richards Memorial Award (ASME, 1997). The Charles Russ Richards Memorial Award, established in 1944, was named in honor of a founder of Pi Tau Sigma. It is given to an engineering graduate who demonstrates outstanding achievement in mechanical engineering 20 years or more following graduation.