Copyright © 1996 IFAC, 13th Triennial World Congress, San Francisco, USA
A ROBUST AND RECURSIVE IDENTIFICATION METHOD FOR THE HAMMERSTEIN MODEL

M. Boutayeb, H. Rafaralahy and M. Darouach

CRAN CNRS URA 821, University Henri Poincaré - Nancy I, 186 rue de Lorraine, 54400 Cosnes et Romain, FRANCE. Email: [email protected]. Tel: (33) - 82 - 25 - 91 - 49. Fax: (33) - 82 - 23 - 97 - 93.

Abstract: In this note, a robust identification method for nonlinear systems described by the Hammerstein model is presented. The main property of the proposed approach is that the parameters of the linear and nonlinear parts are estimated recursively while global convergence is guaranteed. The proposed method consists first in transforming the nonlinear representation into an input-output model that is linear in the parameters; a regular transformation based on the pseudo-inverse technique then allows us to estimate, in the least squares sense, the parameters vector of the original realization. For the case of correlated noise used to model the measurement disturbances, we propose a simple technique based on the pseudo-linear regression method and we investigate all parameter dependencies to obtain a consistent solution. Sufficient conditions for global convergence are derived. Two numerical examples with different correlation structures and different Signal to Noise Ratio values are provided.

Keywords: Robust estimation - Recursive algorithms - Nonlinear models - Coloured noise - Convergence analysis.
1 - INTRODUCTION

Analysis and design of nonlinear systems have been the subject of many research activities in the last decade. Indeed, as most physical processes have a nonlinear behaviour outside a limited linear range, the accuracy and performance of controllers depend very closely on the mathematical model considered. One of the nonlinear approximations frequently used is the Hammerstein model. It has been investigated in many practical situations such as chemical processes, electric heat exchangers, distillation columns, etc. (Eskinat et al., 1991; Haber and Zierfuss, 1988; Korenberg, 1973). This kind of model is composed of a static nonlinearity in series with a linear dynamic system. It is not intended to give here a complete overview of the different identification algorithms established up till now; we refer the reader to (Billings and Fakhouri, 1982; Boutayeb and Darouach, 1994; Eskinat et al., 1991; Haber and Unbehauen, 1990; Kortmann and Unbehauen, 1987; Stoica and Söderström, 1982) and the references therein. However, we distinguish three parametric approaches for the identification of the Hammerstein model. The first one consists in estimating the parameters of the linear and nonlinear parts by an iterative method (Narendra and Gallman, 1966). Unfortunately, stability and convergence of this approach are not guaranteed, as was shown analytically by means of a counterexample in Stoica, 1981. The second approach is based on a non-iterative technique (Chang and Luus, 1971) and consists in transforming the nonlinear parameter representation into an equivalent form which is linear in the parameters. Classical identification methods are then used, as in the linear case, to determine the parameters of the obtained model. However, there is a certain amount of redundancy, since for each parameter of the numerator we have several estimates. In the case of correlated noise, Stoica and Söderström developed instrumental variable techniques to obtain an unbiased estimator; however, the considered models are 'linear' in the parameters. The third approach, which is based on the Extended Recursive Prediction Error Method (Kortmann and Unbehauen, 1987), has the advantage of estimating the parameters of the linear and nonlinear parts without redundancy. However, only local convergence is ensured, while the computational requirements, in particular the computation of the gradient vector, increase significantly for large scale systems.

In this paper, we propose a robust and recursive technique to estimate the parameters of the linear and nonlinear parts of the Hammerstein model without redundancy while global convergence is guaranteed. The method consists first in transforming the nonlinear representation into the classical input-output model linear in the parameters; a regular transformation based on the pseudo-inverse technique then allows us to estimate the unique and unbiased solution. Next, for two cases, a white noise sequence and an ARMA correlated noise to model the measurement disturbances, the parameter identification problem is discussed in detail. Sufficient conditions for global convergence are derived. The efficiency of this approach is shown through numerical examples.
2 - PROBLEM FORMULATION

Let us consider the Hammerstein model described by the following nonlinear representation:

y_k = \frac{B(q^{-1})}{A(q^{-1})} (g_1 u_k + \ldots + g_s u_k^s)    (1)

with

A(q^{-1}) = 1 + a_1 q^{-1} + \ldots + a_n q^{-n}    (2)

and

B(q^{-1}) = q^{-d} + b_1 q^{-(d+1)} + \ldots + b_m q^{-(d+m)}    (3)

q^{-1} is the delay operator, y_k and u_k are the output and input signals at time instant k respectively, and d is the time delay length. The orders n, m and s are supposed to be known; they may be determined from the input-output data by several methods, as detailed in the survey paper of Haber and Unbehauen, 1990. In difference form, and by making use of (1), y_k becomes:

y_k = - \sum_{i=1}^{n} a_i y_{k-i} + \sum_{j=0}^{m} b_j w_{k-d-j}    (4)

with

b_0 = 1    (5)

and

w_{k-d-j} = g_1 u_{k-d-j} + \ldots + g_s u_{k-d-j}^s    (6)

Our aim here is to estimate the parameters of the linear part (a_1 ... a_n  b_1 ... b_m) and of the nonlinear part (g_1 ... g_s) from the input-output data while guaranteeing global convergence.
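For illustration only, the following sketch simulates the representation (1)-(6) in Python; the function name, the polynomial coefficients and the orders used below are our own choices, not taken from the paper.

```python
import numpy as np

def simulate_hammerstein(u, g, a, b, d=1):
    """Simulate y_k = B(q^-1)/A(q^-1) (g_1 u_k + ... + g_s u_k^s), cf. (1)-(6).

    a = [a_1, ..., a_n], b = [b_1, ..., b_m] (b_0 = 1), g = [g_1, ..., g_s]."""
    w = sum(gi * u ** (i + 1) for i, gi in enumerate(g))   # static nonlinearity (6)
    n, m = len(a), len(b)
    y = np.zeros(len(u))
    for k in range(len(u)):
        # autoregressive part of (4)
        y[k] = -sum(a[i] * y[k - 1 - i] for i in range(n) if k - 1 - i >= 0)
        # moving-average part of (4), with b_0 = 1
        for j in range(m + 1):
            bj = 1.0 if j == 0 else b[j - 1]
            if k - d - j >= 0:
                y[k] += bj * w[k - d - j]
    return y, w

# arbitrary illustrative values, not those of the paper's examples
u = np.random.normal(0.0, 0.5, 1000)
y, w = simulate_hammerstein(u, g=[1.0, -0.5, 0.3], a=[-0.8, 0.2], b=[0.6, 0.1])
```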
3 - MAIN RESULTS

3.1 - Stochastic case : white noise model

The proposed method consists first in transforming (4) into a model linear in the parameters. By introducing the signal vector, we may write (4) in the noisy form as:

y_k = \varphi_k^T \bar{\theta} + v_k    (7)

where v_k is assumed to be a zero mean white noise sequence acting as a disturbance. \varphi_k and \bar{\theta} are the [n + (m+1)s] vectors defined as:

\varphi_k = [-y_{k-1} \ldots -y_{k-n}   u_{k-d} \ldots u_{k-d}^s  \ldots  u_{k-d-m} \ldots u_{k-d-m}^s]^T    (8)

and

\bar{\theta} = [a_1 \ldots a_n   g_1 \ldots g_s   (b_1 g_1) \ldots (b_1 g_s)  \ldots  (b_m g_1) \ldots (b_m g_s)]^T    (9)

The least squares estimator leads to the following unbiased parameter estimate:

\hat{\bar{\theta}} = (\Phi_N^T R^{-1} \Phi_N)^{-1} \Phi_N^T R^{-1} Y_N    (10)

with \Phi_N = [\varphi_k \ldots \varphi_{k+N}]^T and Y_N = [y_k \ldots y_{k+N}]^T, where R is a symmetric and positive weighting matrix and N is the length of data. We notice that the parameters (a_1, ..., a_n, g_1, ..., g_s) are directly obtained from the estimate \hat{\bar{\theta}}.
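A hedged sketch of (7)-(10) follows (function and variable names are ours; R is taken as the identity so that (10) reduces to ordinary least squares):

```python
import numpy as np

def build_regressor(y, u, n, m, s, d=1):
    """Stack phi_k of (8): delayed outputs and powers of delayed inputs."""
    k0 = max(n, d + m)                       # first sample with a complete regressor
    Phi, Y = [], []
    for k in range(k0, len(y)):
        row = [-y[k - i] for i in range(1, n + 1)]
        for j in range(m + 1):
            row += [u[k - d - j] ** p for p in range(1, s + 1)]
        Phi.append(row)
        Y.append(y[k])
    return np.array(Phi), np.array(Y)

# y, u as generated in the simulation sketch of Section 2 (n = 2, m = 2, s = 3)
Phi_N, Y_N = build_regressor(y, u, n=2, m=2, s=3)
theta_bar, *_ = np.linalg.lstsq(Phi_N, Y_N, rcond=None)  # estimate (10) with R = I
a_hat = theta_bar[:2]          # a_1 ... a_n
g_hat = theta_bar[2:5]         # g_1 ... g_s (the j = 0 block, since b_0 = 1)
bg_hat = theta_bar[5:]         # (b_j g_i) block, j = 1..m, i = 1..s
```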
The transformation (4) to (7) avoids the nonlinearity in the parameters and ensures global convergence under the persistently exciting condition. However, there is a certain amount of redundancy in the parameters, since for each b_j we have s different estimates of the form:

\widehat{b_j g_i} / \hat{g}_i,   i = 1, ..., s,  j = 1, ..., m    (11)

obtained from \hat{\bar{\theta}}. There exist two techniques to establish an estimated value of each b_j. The first one consists in computing the mean value of the set \{ \widehat{b_j g_1}/\hat{g}_1, ..., \widehat{b_j g_s}/\hat{g}_s \} for j = 1, ..., m. We notice that a recursive form of this technique may involve numerical instabilities in the transient behaviour or for low values of \hat{g}_i. The second technique consists in determining the set of b_j which minimizes the root mean-square error (rms) between the system and model outputs for each value of i (i = 1, ..., s) and j (j = 1, ..., m). However, the computational requirements to determine the set of b_j are considerable, particularly for high orders; thus, a recursive version of this approach cannot be applied in real-time applications.

Hereafter, a simple method based on the pseudo-inverse technique is proposed to establish a consistent estimator of \theta = [a_1 ... a_n  g_1 ... g_s  b_1 ... b_m]^T. We notice that \theta, of dimension n+s+m, represents the parameters vector of the initial representation (1). First let us define the sub-vectors \bar{\theta}_a, \bar{\theta}_g and \bar{\theta}_{bg} as:

\bar{\theta}_a = [a_1 \ldots a_n]^T,   \bar{\theta}_g = [g_1 \ldots g_s]^T,   \bar{\theta}_{bg} = [(b_1 g_1) \ldots (b_1 g_s) \ldots (b_m g_1) \ldots (b_m g_s)]^T    (12)

and let \Phi_a, \Phi_g and \Phi_{bg} denote the corresponding partition of the regressor, i.e. matrices of n, s and m.s rows respectively.    (13)

Theorem 1
The consistent estimator of \theta is given by:

\hat{\theta} = [\hat{\bar{\theta}}_a^T   \hat{\bar{\theta}}_g^T   (M \hat{\bar{\theta}}_{bg})^T]^T    (14)

where \hat{\bar{\theta}}_a, \hat{\bar{\theta}}_g and \hat{\bar{\theta}}_{bg} are the sub-vectors of the estimate (10), and M is an m x (m.s) block-diagonal matrix defined as:

M = diag(\hat{\theta}_g^+, ..., \hat{\theta}_g^+)    (15)

with \hat{\theta}_g^+ the right pseudo-inverse (row) vector of \hat{\theta}_g = \hat{\bar{\theta}}_g, i.e. \hat{\theta}_g^+ \hat{\theta}_g = 1.
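A minimal sketch of the recovery step behind (14)-(15), assuming the estimates \hat{g} and the (b_j g_i) block are available from the least squares sketch above (all names are ours):

```python
import numpy as np

def recover_b(g_hat, bg_hat, m):
    """Recover b_1 ... b_m from the redundant (b_j g_i) estimates, cf. (14)-(15):
    g^+ = g^T / (g^T g) is the right pseudo-inverse of g, so that g^+ g = 1."""
    g_hat = np.asarray(g_hat)
    g_pinv = g_hat / np.dot(g_hat, g_hat)        # right pseudo-inverse (row vector)
    blocks = np.asarray(bg_hat).reshape(m, -1)   # row j holds (b_j g_1, ..., b_j g_s)
    return blocks @ g_pinv                       # b_hat_j = g^+ (b_j g)

# g_hat, bg_hat from the least squares sketch above, with m = 2
b_hat = recover_b(g_hat, bg_hat, m=2)
theta_hat = np.concatenate([a_hat, g_hat, b_hat])   # estimate of (14)
```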
Proof
The estimates \hat{\bar{\theta}}_g and \hat{\bar{\theta}}_{bg} are directly obtained from (10) as the corresponding sub-vectors of \hat{\bar{\theta}} (16). On the other hand, the sub-parameters vector \bar{\theta}_{bg} may be written as:

\bar{\theta}_{bg} = [(\bar{\theta}_g b_1)^T \ldots (\bar{\theta}_g b_m)^T]^T    (17)

so that its estimate takes the form

\hat{\bar{\theta}}_{bg} = [((\bar{\theta}_g + \epsilon_g) b_1)^T \ldots ((\bar{\theta}_g + \epsilon_g) b_m)^T]^T + \epsilon_{bg}    (18)

where \epsilon_g and \epsilon_{bg} are the estimation errors, and the least squares estimator of \theta_b = [b_1 \ldots b_m]^T is given by:

\hat{\theta}_b = M \hat{\bar{\theta}}_{bg}    (19)-(20)

Since \epsilon_g and \epsilon_{bg} are zero mean white noise sequences, and since \hat{\theta}_g^+ \hat{\theta}_g = 1, we can easily verify that (14) is also a consistent estimator:

E(\hat{\theta}) = \theta    (21)

End of proof

For real-time applications, and in order to avoid the inversion of the data matrix (\Phi_N^T R^{-1} \Phi_N), in particular for large scale systems, it is of interest to develop a recursive form of the identification algorithm (14). The classical recursive form of (10) is given by:

\hat{\bar{\theta}}_{k+1} = \hat{\bar{\theta}}_k + K_{k+1} (y_{k+1} - \varphi_{k+1}^T \hat{\bar{\theta}}_k)    (23)

K_{k+1} = P_k \varphi_{k+1} (\varphi_{k+1}^T P_k \varphi_{k+1} + \lambda_{k+1})^{-1}    (24)

P_{k+1} = (I - K_{k+1} \varphi_{k+1}^T) P_k    (25)

where \hat{\bar{\theta}}_{k+1}, K_{k+1} and P_{k+1} represent the parameters vector estimate, the gain vector and the parameter error covariance matrix at time instant k+1 respectively, and \lambda_{k+1} is a weighting factor. Now, let us decompose K_{k+1} into three sub-vectors K_{a,k+1}, K_{g,k+1} and K_{bg,k+1}, of dimensions n, s and m.s respectively, as follows:

K_{k+1} = [K_{a,k+1}^T   K_{g,k+1}^T   K_{bg,k+1}^T]^T    (26)

The decomposition of (23) into three parts then gives:

\hat{\bar{\theta}}_{a,k+1} = \hat{\bar{\theta}}_{a,k} + K_{a,k+1} e_{k+1}    (27)
\hat{\bar{\theta}}_{g,k+1} = \hat{\bar{\theta}}_{g,k} + K_{g,k+1} e_{k+1}
\hat{\bar{\theta}}_{bg,k+1} = \hat{\bar{\theta}}_{bg,k} + K_{bg,k+1} e_{k+1}    (28)

with e_{k+1} = y_{k+1} - \varphi_{k+1}^T \hat{\bar{\theta}}_k. Using the same technique as above, i.e. (18) to (20), we easily deduce from (27)-(28) a recursive identification algorithm of \hat{\theta}. The final algorithm form may then be summarized as follows:

\hat{\theta}_{a,k+1} = \hat{\bar{\theta}}_{a,k+1}    (29)
\hat{\theta}_{g,k+1} = \hat{\bar{\theta}}_{g,k+1}    (30)
M_{k+1} = diag(\hat{\theta}_{g,k+1}^+, ..., \hat{\theta}_{g,k+1}^+)    (31)
\hat{\theta}_{b,k+1} = M_{k+1} \hat{\bar{\theta}}_{bg,k+1}    (32)
\hat{\theta}_{k+1} = [\hat{\theta}_{a,k+1}^T   \hat{\theta}_{g,k+1}^T   \hat{\theta}_{b,k+1}^T]^T    (33)

where \hat{\theta}_{g,k+1}^+ is the right pseudo-inverse vector of \hat{\theta}_{g,k+1}, obtained from the second block of (28) at time instant k+1.
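A compact sketch of the recursion (23)-(25) together with the correction step (29)-(33); the constant forgetting factor and all names are our own choices:

```python
import numpy as np

def rls_step(theta_bar, P, phi, y_new, lam=1.0):
    """One step of the recursion (23)-(25)."""
    Pphi = P @ phi
    K = Pphi / (phi @ Pphi + lam)                           # gain vector (24)
    theta_bar = theta_bar + K * (y_new - phi @ theta_bar)   # update (23)
    P = (np.eye(len(phi)) - np.outer(K, phi)) @ P           # covariance update (25)
    return theta_bar, P

def correct(theta_bar, n, s, m):
    """Correction step (29)-(33): map the redundant estimate onto theta = [a, g, b]."""
    a_hat, g_hat, bg_hat = theta_bar[:n], theta_bar[n:n + s], theta_bar[n + s:]
    g_pinv = g_hat / np.dot(g_hat, g_hat)                   # right pseudo-inverse of g_hat
    b_hat = bg_hat.reshape(m, s) @ g_pinv
    return np.concatenate([a_hat, g_hat, b_hat])

# at each sample: theta_bar, P = rls_step(theta_bar, P, phi, y_new)
#                 theta_hat = correct(theta_bar, n, s, m)
```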
In the following, we address the problem of parameter identification of the Hammerstein model in the presence of correlated measurement noise.

3.2 - Stochastic case : ARMA noise model
Using a white noise sequence to model the measurement disturbances is not, in general, a realistic assumption. Indeed, in many practical situations this hypothesis is seldom verified, since measurement errors have a limited bandwidth. In order to avoid this conservative assumption, a reasonable hypothesis, usually adopted in the literature, consists in writing the disturbances as an ARMA noise model.
For linear systems, several techniques have been developed to deal with this problem. The most popular ones are the prediction error methods and the generalized least squares methods, in particular the pseudo-linear regression technique; those approaches have, in general, similar performances (Ljung and Söderström, 1983). The output signal is then written as:

y_k = \frac{B(q^{-1})}{A(q^{-1})} (g_1 u_k + \ldots + g_s u_k^s) + \frac{C(q^{-1})}{D(q^{-1})} v_k    (34)

The problem to solve here is to estimate g_1, ..., g_s and the parameters of A(q^{-1}), B(q^{-1}), C(q^{-1}) and D(q^{-1}); v_k is assumed to be a zero mean white noise sequence.
In this paper, the proposed method is based on a modified pseudo-linear regression algorithm and the pseudo-inverse technique, as used in the first section, to estimate the parameters of the realization (34). First, we transform (34) into the equivalent linear form:

A_e(q^{-1}) y_k = B_e(q^{-1}) (g_1 u_k + \ldots + g_s u_k^s) + C(q^{-1}) v_k    (35)

with

C(q^{-1}) = 1 + c_1 q^{-1} + \ldots + c_p q^{-p}    (36)

D(q^{-1}) = 1 + d_1 q^{-1} + \ldots + d_f q^{-f}    (37)

A_e(q^{-1}) = A(q^{-1}) D(q^{-1}) = 1 + a_{e1} q^{-1} + \ldots + a_{e n_T} q^{-n_T}    (38)

B_e(q^{-1}) = B(q^{-1}) D(q^{-1}) = q^{-d} + b_{e1} q^{-(d+1)} + \ldots + b_{e m_T} q^{-(d+m_T)}    (39)

where n_T = n + f and m_T = m + f.
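Since (38)-(39) are plain polynomial products, they can be sketched with a discrete convolution; the coefficients below are illustrative, not the paper's:

```python
import numpy as np

# illustrative coefficients, not the paper's examples
A = np.array([1.0, -1.5, 0.7])    # A(q^-1) = 1 - 1.5 q^-1 + 0.7 q^-2
B = np.array([1.0, 1.5, 0.4])     # B(q^-1) without the common delay q^-d
D = np.array([1.0, -0.9, 0.85])   # D(q^-1) = 1 - 0.9 q^-1 + 0.85 q^-2

A_e = np.convolve(A, D)           # coefficients 1, a_e1, ..., a_e(n+f) of (38)
B_e = np.convolve(B, D)           # coefficients 1, b_e1, ..., b_e(m+f) of (39)
```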
The extended representation (35) may then be written in the linear regression form

y_k = \psi_k^T \bar{\theta}_e + v_k    (40)

where the signal vector of the extended representation is

\psi_k = [-y_{k-1} \ldots -y_{k-n_T}   u_{k-d} \ldots u_{k-d}^s  \ldots  u_{k-d-m_T} \ldots u_{k-d-m_T}^s   v_{k-1} \ldots v_{k-p}]^T    (41)

and the extended parameters vector is

\bar{\theta}_e = [a_{e1} \ldots a_{e n_T}   g_1 \ldots g_s   (b_{e1} g_1) \ldots (b_{e1} g_s)  \ldots  (b_{e m_T} g_1) \ldots (b_{e m_T} g_s)   c_1 \ldots c_p]^T    (42)

The Recursive Pseudo-Linear Regression algorithm is summarized as follows:

\hat{\bar{\theta}}_{e,k+1} = \hat{\bar{\theta}}_{e,k} + K_{k+1} (y_{k+1} - \psi_{k+1}^T \hat{\bar{\theta}}_{e,k})    (43)

K_{k+1} = P_k \psi_{k+1} (\psi_{k+1}^T P_k \psi_{k+1} + \lambda_{k+1})^{-1}    (44)

P_{k+1} = (I - K_{k+1} \psi_{k+1}^T) P_k    (45)

where v_{k-1}, ..., v_{k-p} in (41) are replaced by their estimates \hat{v}_{k-1}, ..., \hat{v}_{k-p}, i.e.:

\hat{v}_{k+1} = y_{k+1} - \psi_{k+1}^T \hat{\bar{\theta}}_{e,k+1}    (46)
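A sketch of the pseudo-linear regression bookkeeping in (41)-(46): the regressor is extended with the most recent residual estimates, which are refreshed after each update. The data structures and names are ours, and only the residual handling is shown:

```python
import numpy as np
from collections import deque

def rplr_step(theta_bar, P, base_reg, v_hist, y_new, lam=1.0):
    """One step of the pseudo-linear regression recursion (43)-(46).

    base_reg : output/input part of the extended regressor (41)
    v_hist   : deque of the last p residual estimates, newest first."""
    psi = np.concatenate([base_reg, np.array(v_hist)])      # extended regressor
    Ppsi = P @ psi
    K = Ppsi / (psi @ Ppsi + lam)                           # gain (44)
    theta_bar = theta_bar + K * (y_new - psi @ theta_bar)   # update (43)
    P = (np.eye(len(psi)) - np.outer(K, psi)) @ P           # covariance (45)
    v_hist.appendleft(y_new - psi @ theta_bar)              # residual estimate (46)
    v_hist.pop()                                            # keep only p past residuals
    return theta_bar, P, v_hist
```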
Using the same technique as above, i.e. (30)-(33), we obtain an estimate of the reduced parameters vector:

\theta_e = [a_{e1} \ldots a_{e n_T}   g_1 \ldots g_s   b_{e1} \ldots b_{e m_T}   c_1 \ldots c_p]^T    (47)

Once an estimate of (a_{e1} ... a_{e n_T}  b_{e1} ... b_{e m_T}) is obtained, we use the following technique to extract the original parameters (a_1 ... a_n  b_1 ... b_m  d_1 ... d_f). From (38) and (39) we notice that:

A_e(q^{-1}) B(q^{-1}) = B_e(q^{-1}) A(q^{-1})    (48)

Therefore, after some linear manipulations of both sides of (48), we have:

a_{e1} + b_1 = b_{e1} + a_1    (49)
a_{e2} + a_{e1} b_1 + b_2 = b_{e2} + b_{e1} a_1 + a_2    (50)
a_{e3} + a_{e2} b_1 + a_{e1} b_2 + b_3 = b_{e3} + b_{e2} a_1 + b_{e1} a_2 + a_3    (51)

The general form may be written as:

a_{ej} + a_{e(j-1)} b_1 + a_{e(j-2)} b_2 + \ldots + a_{e1} b_{j-1} + b_j = b_{ej} + b_{e(j-1)} a_1 + b_{e(j-2)} a_2 + \ldots + b_{e1} a_{j-1} + a_j    (52)

for j = 1, ..., (f+m+n), with a_j = 0 (or b_j = 0) if j > n (or j > m), and a_{ej} = 0 (or b_{ej} = 0) if j > n+f (or j > m+f). Or, equivalently:

Y_1 = H_1 \theta_1    (53)

with \theta_1 = [a_1 \ldots a_n  b_1 \ldots b_m]^T, where H_1 is built from the estimated coefficients b_{ej} and -a_{ej} and Y_1 collects the terms (a_{ej} - b_{ej}). (53) is a linear system with (f+m+n) equations and (m+n) unknown parameters; hence the minimum least squares estimate of \theta_1 is:

\hat{\theta}_{1,k} = H_{1,k}^+ Y_{1,k}    (54)

where a_{ej} and b_{ej} are replaced by their estimates at time instant k in Y_{1,k} and H_{1,k}.

On the other hand, we have A_e(q^{-1}) = A(q^{-1}) D(q^{-1}) and B_e(q^{-1}) = B(q^{-1}) D(q^{-1}); using the same technique as above, we obtain the following linear systems:

a_{ej} = a_j + a_{j-1} d_1 + a_{j-2} d_2 + \ldots + a_1 d_{j-1} + d_j    (55)
b_{ej} = b_j + b_{j-1} d_1 + b_{j-2} d_2 + \ldots + b_1 d_{j-1} + d_j    (56)

for j = 1, ..., (n+f) and j = 1, ..., (m+f) respectively, or equivalently:

Y_2 = H_2 \theta_2    (57)

with \theta_2 = [d_1 \ldots d_f]^T. The minimum least squares estimate of \theta_2 is:

\hat{\theta}_{2,k} = H_{2,k}^+ Y_{2,k}    (58)

where a_j and b_j, obtained from (54), are replaced by their estimates at time instant k in Y_{2,k} and H_{2,k}.
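A sketch of the extraction step (48)-(54): rather than forming H_1 and Y_1 explicitly as in (53), the version below writes the convolution identity (48) coefficient by coefficient and solves it in the least squares sense; the sign arrangement therefore differs from (53) but the solution is the same (names are ours).

```python
import numpy as np

def extract_ab(a_e, b_e, n, m):
    """Solve A_e(q^-1) B(q^-1) = B_e(q^-1) A(q^-1), cf. (48)-(54), in the least
    squares sense for a_1..a_n and b_1..b_m (leading coefficients are all 1)."""
    Ae = np.concatenate([[1.0], a_e])           # 1, a_e1, ..., a_e(n+f)
    Be = np.concatenate([[1.0], b_e])           # 1, b_e1, ..., b_e(m+f)
    f = len(a_e) - n                            # order of D(q^-1)
    H, Y = [], []
    for j in range(1, n + m + f + 1):           # coefficient of q^-j in (48)
        row = np.zeros(n + m)
        for i in range(1, n + 1):               # unknown a_i enters with -b_e(j-i)
            if 0 <= j - i < len(Be):
                row[i - 1] -= Be[j - i]
        for i in range(1, m + 1):               # unknown b_i enters with +a_e(j-i)
            if 0 <= j - i < len(Ae):
                row[n + i - 1] += Ae[j - i]
        H.append(row)
        Y.append((Be[j] if j < len(Be) else 0.0) - (Ae[j] if j < len(Ae) else 0.0))
    x, *_ = np.linalg.lstsq(np.array(H), np.array(Y), rcond=None)
    return x[:n], x[n:]                         # a_hat, b_hat
```

The coefficients of D(q^{-1}) then follow from (55)-(58) by the same construction, with A and B now known.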
4 - CONVERGENCE ANALYSIS

For the obtained linear system (40)-(42), it was shown that, for the Pseudo-Linear Regression (PLR) method, the key condition for global convergence is the positive realness of the filter 1/C(q^{-1}) - 1/2. Our aim now is to determine conditions under which the parameters vector \hat{\theta} converges to the actual one with probability one, with:

\theta = [a_1 \ldots a_n   b_1 \ldots b_m   g_1 \ldots g_s   c_1 \ldots c_p   d_1 \ldots d_f]^T

Theorem 2
If the following assumptions:
a) - the input signal u_k is strongly persistently exciting,
b) - the filter 1/C(q^{-1}) - 1/2 is strictly positive real,
hold, then the parameters vector \hat{\theta} converges to the actual one \theta with probability one.

Proof
In the theory of linear systems, it was already shown that under assumptions a) and b) the extended parameters vector \hat{\bar{\theta}}_e, obtained from (43), converges to the actual one \bar{\theta}_e with probability one. Therefore, the sub-vectors \hat{\theta}_c = [c_1 \ldots c_p]^T and \hat{\theta}_g = [g_1 \ldots g_s]^T of \hat{\bar{\theta}}_e converge to \theta_c and \theta_g with probability one (assumption a) means that the input signal and its powers are jointly persistently exciting; for more details see the work of Stoica and Söderström, 1982). On the other hand, from (30)-(33) and Theorem 1, the reduced parameters vector \hat{\theta}_e converges to \theta_e with probability one.

Now, (53) may be written in the equivalent form:

Y_1 + E_Y(\hat{\theta}_e) = H_1 \theta_1 + E_H(\hat{\theta}_e) \theta_1

where E_Y(\hat{\theta}_e) and E_H(\hat{\theta}_e) collect the estimation errors on the coefficients a_{ej} and b_{ej}. As E_Y(\hat{\theta}_e) and E_H(\hat{\theta}_e) are linear in \hat{\theta}_e, as long as \hat{\theta}_e converges to \theta_e with probability one, \hat{\theta}_1 also converges to \theta_1 with probability one. In the same manner, we prove that \hat{\theta}_2 also converges to \theta_2 with probability one.
End of proof
Numerical examples (on a Macintosh Centris 650)

To illustrate the robustness of the proposed algorithm with respect to correlated disturbances, we consider two numerical examples with a very bad initialization, different correlation structures and different Signal to Noise Ratio (SNR) values. The latter is defined as:

SNR = var(y_{0k}) / var(x_k)

where y_{0k} = \frac{B(q^{-1})}{A(q^{-1})} w_k is the undisturbed output signal and x_k = \frac{C(q^{-1})}{A(q^{-1}) D(q^{-1})} v_k is the correlated noise. For both examples, two values are considered: SNR = 2 and SNR = 10.

The initial conditions are taken as P_0 equal to a large multiple of the identity matrix and \hat{\theta}_0 = 1000, i.e. each parameter is initialized at 1000. The input signal u_k is a zero mean white noise sequence with standard deviation \sigma = 0.5; the length of data is N = 1000.

Example 1. The nonlinear static polynomial is

w_k = 1.35 u_k - 2.4 u_k^2 + 1.7 u_k^3

in series with

B(q^{-1}) = q^{-1} + 1.5 q^{-2} + 0.4 q^{-3}
A(q^{-1}) = 1 - 1.9 q^{-1} + 1.3 q^{-2} - 0.28 q^{-3}

The pulse transfer function of the correlated noise is:

C(q^{-1}) = 1 + 0.2 q^{-1} + 0.1 q^{-2}
D(q^{-1}) = 1 - 0.9 q^{-1} + 0.85 q^{-2}

Example 2. The nonlinear static polynomial is

w_k = 2.8 u_k - 3.4 u_k^2 + 2.7 u_k^3

in series with

B(q^{-1}) = q^{-1} - 1.2 q^{-2} + 0.8 q^{-3}
A(q^{-1}) = 1 - 1.3 q^{-1} + 1.1 q^{-2} - 0.55 q^{-3}

The pulse transfer function of the correlated noise is:

C(q^{-1}) = 1 + 0.7 q^{-1}
D(q^{-1}) = 1 - 0.65 q^{-1} + 0.35 q^{-2}

Numerical results of the proposed technique are summarized in Table 1 and Table 2 for the two examples respectively.
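As a sketch, the Example 1 data can be regenerated as follows; the way the noise is scaled to reach a prescribed SNR is our assumption, since the paper only gives the SNR values themselves:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N = 1000
u = rng.normal(0.0, 0.5, N)                  # zero mean white input, sigma = 0.5
w = 1.35 * u - 2.4 * u**2 + 1.7 * u**3       # static nonlinearity of Example 1

A = [1.0, -1.9, 1.3, -0.28]                  # A(q^-1)
B = [0.0, 1.0, 1.5, 0.4]                     # B(q^-1) = q^-1 + 1.5 q^-2 + 0.4 q^-3
C = [1.0, 0.2, 0.1]                          # C(q^-1)
D = [1.0, -0.9, 0.85]                        # D(q^-1)

y0 = lfilter(B, A, w)                        # undisturbed output y_0k
v = rng.normal(0.0, 1.0, N)                  # white noise v_k
x = lfilter(C, np.convolve(A, D), v)         # correlated noise x_k
x *= np.sqrt(np.var(y0) / (2.0 * np.var(x))) # scale the noise so that SNR = 2
y = y0 + x                                   # measured output
```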
Table 1 : Numerical results (parameter estimates and standard deviations) of the proposed identification algorithm, example 1.

Parameter   True value   SNR = 2              SNR = 10
a1          -1.9000      -1.8720 ± 0.0527     -1.9002 ± 0.0070
a2           1.3000       1.2890 ± 0.0682      1.3055 ± 0.0060
a3          -0.2800      -0.2959 ± 0.0187     -0.2850 ± 0.0065
b1           1.5000       1.4935 ± 0.0174      1.4924 ± 0.0063
b2           0.4000       0.4073 ± 0.0439      0.3985 ± 0.0028
g1           1.3500       1.4079 ± 0.1051      1.3182 ± 0.0650
g2          -2.4000      -2.3651 ± 0.0672     -2.4283 ± 0.0259
g3           1.7000       1.7767 ± 0.1218      1.7121 ± 0.0258
c1           0.2000       0.1417 ± 0.0396      0.2528 ± 0.0569
c2           0.1000       0.0371 ± 0.0328      0.1529 ± 0.1130
d1          -0.9000      -0.9705 ± 0.0667     -0.8869 ± 0.0066
d2           0.8500       0.8773 ± 0.0421      0.8479 ± 0.0039

Table 2 : Numerical results (parameter estimates and standard deviations) of the proposed identification algorithm, example 2.

Parameter   True value   SNR = 2              SNR = 10
a1          -1.3000      -1.3054 ± 0.0261     -1.2982 ± 0.0017
a2           1.1000       1.1051 ± 0.0231      1.1017 ± 0.0020
a3          -0.5500      -0.5579 ± 0.0185     -0.5508 ± 0.0006
b1          -1.2000      -1.2026 ± 0.0383     -1.2014 ± 0.0030
b2           0.8000       0.8039 ± 0.0235      0.8055 ± 0.0042
g1           2.8000       2.7618 ± 0.0293      2.7866 ± 0.0146
g2          -3.4000      -3.4396 ± 0.0268     -3.3902 ± 0.0112
g3           2.7000       2.7066 ± 0.0351      2.7055 ± 0.0087
c1           0.7000       0.6126 ± 0.1168      0.6474 ± 0.0665
d1          -0.6500      -0.6970 ± 0.0420     -0.6765 ± 0.0325
d2           0.3500       0.3896 ± 0.0435      0.3358 ± 0.0070
We notice from Tables 1 and 2 that the proposed algorithm gives consistent estimates of the parameters in spite of low SNR values and in spite of the bad initialization. On the other hand, the convergence rate is quite fast, since all the parameters are approximately reached after about 300 samples. Of course, we cannot make definitive statements on the convergence rate on the basis of a few numerical examples alone.

5 - CONCLUSION

In this note, the problem of recursive identification of the Hammerstein model with correlated noise has been addressed. Based on the special structure of the Hammerstein model, we have proposed a simple technique to estimate consistent parameters of the initial representation in the least squares sense. It has been shown, under mild conditions, that consistency and global convergence are guaranteed while the output model is nonlinear in the parameters. Two numerical examples with different correlation structures and different Signal to Noise Ratio values have been presented. The obtained numerical results show the high performance and accuracy of the proposed algorithm.

REFERENCES

Billings S. A. and Fakhouri S. Y., 1978, "Identification of a class of non linear systems using correlation analysis," Proc. IEE, 125, pp. 691-697.
Billings S. A. and Fakhouri S. Y., 1982, "Identification of systems containing linear dynamic and static non-linear elements," Automatica, Vol. 18, No. 1, pp. 15-26.
Boutayeb M. and Darouach M., 1994, "Recursive identification method for Hammerstein model. Extension to the nonlinear MISO systems," Control Theory and Advanced Technology, Vol. 10, No. 1, pp. 57-72.
Boutayeb M. and Darouach M., 1995, "Recursive identification method for MISO Wiener-Hammerstein model," IEEE Trans. on Automatic Control, Vol. 40, No. 2, pp. 287-291.
Chang F. H. and Luus R., 1971, "A non-iterative method for identification using the Hammerstein model," IEEE Trans. on Automatic Control, AC-16, pp. 464-468.
Eskinat E., Johnson S. H. and Luyben W. L., 1991, "Use of Hammerstein models in identification of nonlinear systems," AIChE Journal, Vol. 37, No. 2, pp. 255-268.
Haber R. and Unbehauen H., 1990, "Structure identification of nonlinear dynamic systems - A survey on Input/Output approaches," Automatica, Vol. 26, No. 4, pp. 651-677.
Haber R. and Zierfuss R., 1988, "Identification of an electrically heated heat exchanger by several nonlinear models using different structure and parameter estimation methods," Report, Technical University of Vienna, Austria.
Gelb A., 1974, Applied Optimal Estimation, MIT Press: Cambridge, MA.
Korenberg M. J., 1973, "Identification of biological cascades of linear and static nonlinear systems," Proc. 16th Midwest Symp. Circuit Theory.
Kortmann M. and Unbehauen H., 1987, "Identification methods for nonlinear MISO systems," IFAC 10th Triennial World Congress, Munich, pp. 233-238.
Ljung L. and Söderström T., 1983, Theory and Practice of Recursive Identification, MIT Press.
Narendra K. S. and Gallman P. G., 1966, "An iterative method for the identification of nonlinear systems using a Hammerstein model," IEEE Trans. on Automatic Control, AC-11, pp. 546-550.
Stoica P., 1981, "On the convergence of an iterative algorithm used for Hammerstein system identification," IEEE Trans. on Automatic Control, AC-26, No. 4, pp. 967-969.
Stoica P. and Söderström T., 1982, "Instrumental variable methods for identification of Hammerstein systems," Int. J. Control, 35, pp. 459-476.