Identification of MIMO Hammerstein systems using cardinal spline functions

Identification of MIMO Hammerstein systems using cardinal spline functions

Journal of Process Control 16 (2006) 659–670 www.elsevier.com/locate/jprocont Identification of MIMO Hammerstein systems using cardinal spline functio...

1MB Sizes 3 Downloads 186 Views

Journal of Process Control 16 (2006) 659–670 www.elsevier.com/locate/jprocont

Identification of MIMO Hammerstein systems using cardinal spline functions Kwong Ho Chan a, Jie Bao a

a,*

, William J. Whiten

b

School of Chemical Engineering and Industrial Chemistry, The University of New South Wales, Sydney NSW 2052, Australia Julius Kruttschnitt Mineral Research Center, The University of Queensland, Isles Road, Indooroopilly QLD 4068, Australia

b

Received 18 May 2005; received in revised form 24 January 2006; accepted 25 January 2006

Abstract A new approach to identify multivariable Hammerstein systems is proposed in this paper. By using cardinal cubic spline functions to model the static nonlinearities, the proposed method is effective in modelling processes with hard and/or coupled nonlinearities. With an appropriate transformation, the nonlinear models are parameterized such that the nonlinear identification problem is converted into a linear one. The persistently exciting condition for the transformed input is derived to ensure the estimates are consistent with the true system. A simulation study is performed to demonstrate the effectiveness of the proposed method compared with the existing approaches based on polynomials.  2006 Elsevier Ltd. All rights reserved. Keywords: System identification; Hammerstein systems; Spline functions

1. Introduction Recently many research activities focus on identification and control for nonlinear systems. Most chemical processes can be better represented by nonlinear models, which are able to describe the global behaviour of the system over the whole operating range, rather than by linear ones that are only able to approximate the system around a given operating point. Even simple nonlinear models can often provide much better approximations to process dynamics than linear models [1]. One of the most frequently used nonlinear model structure is the Hammerstein system, which consists of a static part (f(Æ)) which contains all the nonlinearity followed by a linear part (H(z)) which contains all the dynamics of the process (see Fig. 1). Many identification methods of Hammerstein systems have been reported in the literature. However, most of the methods

*

Corresponding author. Tel.: +61 2 9385 6755; fax: +61 2 9385 5966. E-mail address: [email protected] (J. Bao).

0959-1524/$ - see front matter  2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.jprocont.2006.01.004

focus on Single-Input–Single-Output (SISO) processes [2–10]. Although there are some methods that can handle Multi-Input–Multi-Output (MIMO) processes, many of them assume the structure of the nonlinearity to be separate [5,11], i.e., the ith output of the nonlinear function is only affected by the ith input. Neural networks or fuzzy logic [8,9,12,13], and polynomial with cross-terms [14,15] have often been used to deal with more general nonlinearities. The major disadvantage of using neural networks is that a large amount of process data are required to train the network in order to give satisfactory results. The use of fuzzy logic approach leads to non-smooth nonlinear models (i.e., the first derivative of the nonlinearity is not continuous), which are often difficult to use in control design and process optimization. This is the same problem for the piecewise-linear approximation of the nonlinearity in [6]. The polynomial based methods, while do not suffer from the above problem, often lead to excessively oscillatory models due to the use of high-order polynomials when they are implemented to model highly nonlinear processes. The model accuracy tends to degenerate at the edges of the

660

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

Section 8 to illustrate the application of the proposed method and provide a comparison with existing methods based on polynomials. Static nonlinearity

Linear dynamics

Fig. 1. Hammerstein model.

data set so that extrapolation beyond the limits, and even those near the limits, usually gives inaccurate results. Furthermore, the parameter estimation is often numerically ill conditioned [2–4]. Finally, and most significantly, the coefficients describing the polynomial do not act locally. As a result, changing one coefficient of the polynomial changes the shape of the entire nonlinearity. Fitting a polynomial to a sharp corner often results in ripples, especially in areas where the data is sparse. Because of these characteristics, it is unlikely that polynomials are the ideal representation for hard nonlinearities [4,16]. Another major problem of existing methods is the use of iteration in the identification procedures. Many methods [2–4,6,13,14] divide the parameters into two sets, the linear dynamic part and the static nonlinear part. The optimal values for one set is calculated while the other set is fixed. The optimization proceeds iteratively with the two sets switched over repeatedly. The biggest problem of this approach in general is that the iterative methods do not always guarantee convergence of the estimated parameters. Recently, Sung [10] proposed an identification method by using a special test signal that enables the decoupling of the identification of the linear dynamic part from that of static nonlinear function. It only applies to SISO systems and polynomial nonlinearities. Lee et al. [17] extended Sung’s decoupling method to MIMO systems, which however requires nonlinear optimization. Motivated by the above problems in existing identification methods, a new approach based on multivariable cardinal spline functions is developed in this paper. Spline functions are much more flexible in accommodating varieties of nonlinear functionalities as their coefficients act locally. Therefore spline functions are suitable for modelling hard nonlinearities. It is also relatively easy to develop control strategies based on spline function models as spline functions are smooth. Unlike the existing methods based on spline functions [4,11], which are limited to SISO nonlinear functions, the proposed method can handle highly coupled multivariable nonlinearities. This paper is organized as follows. In Section 2, the problem is formulated and cardinal cubic spline functions are introduced in Section 3, followed by a new framework for representing nonlinear systems presented in Section 4. The problem of identifying both the nonlinear and linear parts is transformed into a linear identification problem in Section 5. The persistent excitation condition is derived in Section 6. The complete identification procedure is summarized in Section 7. A simulation study is presented in

2. Problem formulation A Hammerstein system is schematically represented in Fig. 1. As mentioned previously, the system consists of a static part f(Æ) which contains all the nonlinearity followed by a linear time invariant (LTI) system H(z) which contains all the dynamics of the process. Assume that the LTI system has the ARX structure (but other structure like ARMAX, state-space etc. are also possible). The input– output relationship is then given by AðqÞyðtÞ ¼ BðqÞvðtÞ þ eðtÞ

ð1Þ

vðtÞ ¼ f ðuðtÞÞ

ð2Þ

where uðtÞ; vðtÞ 2 Rru , y(t) and eðtÞ 2 Rry are the system input, intermediate input (unobserved), output and noise respectively. If v(t) is known, then standard linear identification methods can be applied to identify A(q) and B(q). In order to achieve this, f(u(t)) is transformed such that v(t) = Ns(u(t)) where N is an unknown constant matrix and s(Æ) is a vector of known functions. The transformation is achieved using Cubic Cardinal Spline Functions, which will be explained in the following section. 3. Cubic cardinal spline functions 3.1. Natural cubic spline function A SISO natural cubic spline function is defined as follows [18]: SðxÞ ¼ Sðfk i g; fni g; xÞ

for i ¼ 1; . . . ; p with k i < k iþ1 ð3Þ

where p is the number of knots, {ki} is the set knots, and {ni} is the set of the corresponding data points. The spline function consists of cubic polynomials in each of the intervals [ki, ki+1] and straight lines in (1, k1) and (kp, 1). These polynomials are chosen so that the first and second derivatives are continuous at each point (ki, ni). Cubic spline functions may be evaluated by calculating the second derivatives at the knots ki from p  2 tridiagonal linear equations ðk i1  k i ÞS 00 ðk i1 Þ þ 2ðk i1  k iþ1 ÞS 00 ðk i Þ þ ðk i  k iþ1 ÞS 00 ðk iþ1 Þ   ni1  ni ni  niþ1  ¼6 ð4Þ k i1  k i k i  k iþ1 where also S00 (k1) = S00 (kp) = 0 for natural cubic splines. Eq. (4) is obtained by equating the first derivative at that point (ki, ni). From here onwards, the term ‘‘spline function’’ will refer to the natural cubic spline function as defined above.

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

3.2. Cardinal spline functions

661

4 2

The set of p natural cardinal splines on the knots {ki} can be defined as follows [18]:

−2

S m ðfk i g; xÞ ¼ Sðfk i g; fdim g; xÞ

−4

ð5Þ

0

−6

where

−8

dim ¼ 1; dim ¼ 0;

if i ¼ m otherwise

−10 −5

Obviously the above cardinal spline functions are determined only by the knots. Therefore, the overall spline function can be expressed as follows: Sðfk i g; fni g; xÞ ¼

p X

nm S m ðfk i g; xÞ

ð6Þ

−4

−3

−2

−1

0

1

2

3

4

5

−4

−3

−2

−1

0

1

2

3

4

5

4 2 0 −2 −4 −6

m¼1

so that the spline function is linear in {ni} which are the function values at the knots {ki}. The above features make the natural cardinal spline function a perfect choice for the transformation that converts the nonlinear identification problem into a linear one. From the natural cardinal splines on five equally spaced knots shown in Fig. 2, it can be seen that the behaviour of a spline function at a point is effectively determined by those cardinal splines with a non-zero knot close to the point. This feature makes for stable computational procedures and shows that a spline function fit is only sensitive to the local behaviour of the data. In fact for SISO systems, if the model accuracy is not crucial, the coefficients of the cardinal spline func-

1 0

−8 −10 −5

Fig. 3. Fitting the cardinal spline by inspection.

tions can be obtained simply by inspecting the data because the coefficients are function values. Consider the upper plot in Fig. 3, it is relatively nonlinear on the left while it is relatively linear on the right. If the knots are chosen to be {5, 4, 3, 0, 4} by inspection, then the corresponding coefficients can be chosen to be {7, 3.5, 1.8, 0, 1} without doing any optimizations. This gives a spline function depicted in the lower plot in Fig. 3, a reasonable fit to the scattered data with both the knots and coefficients chosen by inspection. This way of parameterizing the nonlinearity is very powerful as minimal effort is required to achieve reasonable results. 3.3. Multi-dimensional spline functions

1

2

3

4

5

1 0 1

2

3

4

5

Based on the one-dimensional spline functions described previously, a Multi-Input–Single-Output (MISO) cardinal spline function may be defined [18] from the points   k 1d 1 ; k 2d 2 ; . . . ; k rduru ; nd 1 d 2 d ru

1

where 0

d 1 ¼ 1; . . . ; p1 1

2

3

4

5

1

d 2 ¼ 1; . . . ; p2 .. .

0 1

2

3

4

5

d ru ¼ 1; . . . ; pru as n o n o  n o S k 1d 1  k 2d 2      k rduru ; fnd 1 d 2 d ru g; X

1 0 1

2

3

4

5

Fig. 2. Natural cardinal splines on five equally spaced knots.

¼

p1 X p2 X d 1 ¼1 d 2 ¼1



pru X d ru ¼1

nd 1 d 2 d ru

ru Y m¼1

Sdm



 k mdm ; xm

ð7Þ

662

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

where X is a ru-dimensional vector with elements xm, and  is the combination operator that create the rectangular grid on all of the dimensions used. Denote the set of knots as K. Notice that this MISO spline function is also a linear function of its values on a rectangular grid of points in the space over which it is defined. Since K is pre-defined, all the univariable natural cubic spline functions ðS d m Þ are known nonlinear functions. Hence, for a general nonlinearity v ¼ f ðuÞ

ð8Þ

where u; v 2 Rru , it can be approximated using the multidimensional spline function as follows: 2 3 v1 6v 7 6 27 7 ð9Þ v¼6 6 .. 7 4 . 5 2

vr u p1 p2 P P

pru P

n1d 1 d 2 d ru

ru Q



k mdm





3

 Sdm ; um 7 6 7 6 d 1 ¼1 d 2 ¼1 d ru ¼1 m¼1 7 6 pru p1 p 7 6P r u 2    P 2 P Q m 7 6 ; u    n S k dm m 7 6 dm d 1 d 2 d ru 7 6 m¼1 ¼ 6 d 1 ¼1 d 2 ¼1 d ru ¼1 7 7 6 .. 7 6 . 7 6 6P p p1 p r r 2  m  7 Pu ru P Qu 5 4  nd 1 d 2 d ru S d m k d m ; um

ð10Þ

¼ Ns

ð11Þ

d 1 ¼1 d 2 ¼1

d ru ¼1

m¼1

4. Parameterization of Hammerstein systems As discussed in the previous section, the use of cardinal cubic spline functions transforms the actual inputs into the virtual inputs. Thus the identification of the Hammerstein system becomes the identification of an augmented linear system. This allows the Hammerstein system to be parameterized in a new way by the following parameters: Ai 2 Rry ry Bj 2 Rry ru

for i ¼ 1; 2; . . . ; ny for j ¼ 1; 2; . . . ; nu

N 2 Rru ða1Þ   K ¼ ru sets of knot vectors k mdm Notice that a different representation of the linear subsystem may be used, for example the state-space representation (A, B, C, D). The main emphasis here is that different nonlinear systems can now be represented numerically using a set of coefficient matrices and knot vectors which has a very attractive appeal because all the nonlinearity is embedded in the transformation from the actual inputs into the virtual inputs. This parameterized structure is very beneficial to both numerical simulations and control design. Furthermore, the nonlinearity can be visualized roughly from the matrices N and K, since the elements of matrix N are the function values of the nonlinearity at the corresponding knots from K. This is not possible for the existing parameterizations by using polynomials or other types of spline functions.

where 5. Estimation of linear dynamics

s ¼ ½ s1 s2    sa  ru Y   S d m k mdm ; um si ¼

With the nonlinear transformation in Section 3.3, the identification problem using the Hammerstein model can be formulated using the output y and transformed input ~s. For example, if the linear component of the Hammerstein model is described as the ARX structure below

m¼1

½N mi ¼ nmd1 d 2 d ru i ¼ d 1 þ ðd 2  1Þp1 þ    þ ðd ru  1Þ

rY u 1

pm

m¼1

m ¼ 1; . . . ; ru ru Y a¼ pi i¼1

pi ¼ number of knots in the ith input From the nonlinear transformation in Eq. (11), each element in s is linearly independent of each other. However, s has a special property a X si ¼ 1 ð12Þ i¼1

This property is actually undesirable in the identification problem. Hence, the basic spline function that corresponds to the knots at the origin is removed from s such that the new vector, ~s, is of dimension a  1, which implies N is of dimension ru · (a  1). This step is indeed very important, which will be illustrated in the proof in Section 6.

AðqÞyðtÞ ¼ BðqÞvðtÞ þ eðtÞ

ð13Þ

where vðtÞ 2 Rru ; yðtÞ; eðtÞ 2 Rry and q is the delay-operator. The above equation can be re-written as follows: yðtÞ ¼ A1 yðt  1Þ  A2 yðt  2Þ      Any yðt  ny Þ þ B1 vðt  1Þ þ B2 vðt  2Þ þ    þ Bnu vðt  nu Þ þ eðtÞ ð14Þ ¼ A1 yðt  1Þ  A2 yðt  2Þ      Any ðt  ny Þ þ B1 N~sðt  1Þ þ B2 N~sðt  2Þ þ    þ Bnu N~sðt  nu Þ þ eðtÞ ð15Þ ¼ A1 yðt  1Þ  A2 yðt  2Þ      Any ðt  ny Þ þ B1~sðt  1Þ þ B2~sðt  2Þ þ    þ Bnu ~sðt  nu Þ þ eðtÞ ð16Þ

where the order of the input (nu) and the order of the output (ny) are assumed to be known, with N and ~s defined in Section 3.3. Then write the model structure in Eq. (16) as a linear regression

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

^y ðtjhÞ ¼ hT uðtÞ

ð17Þ

where

h ¼ A1

T A2    Any B1 B2    Bnu uT ðtÞ ¼ ½ yðt  1Þ yðt  2Þ    yðt  ny Þ ~sðt  1Þ ~sðt  2Þ

there exists many different representation of the Hammerstein system. If ry = ru, then there are several ways to make the representation unique, e.g., either the gain of H or f can be set to be unity. In this paper, it is assumed that H has the unity steady-state gain, i.e., I þ A1 þ A2 þ    þ An y ¼ B1 þ B2 þ    þ Bn u

T    ~sðt  nu Þ 

Hence any standard MIMO linear identification algorithms can be used to identify A(q) and BðqÞ. One good approach is the four-step instrumental variable (IV) algorithm described in [19]. Suppose that the actual model is described as yðtÞ ¼ hT0 uðtÞ þ e0 ðtÞ

663

ð18Þ

  Then least-squares (LS) estimate of h ^ hLS would converge N to h0 only if

ð25Þ

The advantage of this choice is that the steady-state gain of the resulting Hammerstein model is simply equal to the static nonlinearity f(Æ). Then U 1 þ U 2 þ    þ U nu ¼ B1 T u þ B2 T u þ    þ Bnu T u

ð26Þ

¼ ½B1 þ B2 þ    þ Bnu T u

ð27Þ

¼ ½I þ A1 þ A2 þ    þ Any T u

ð28Þ

Hence, 1. {e0(t)} is a sequence of independent random variables with zero mean values (white noise). 2. The intermediate input sequence {v(t)} is independent of the zero mean sequence {e0(t)} and ny = 0 in Eq. (16). The LS estimate ^ hLS N does not tend to the true parameter vector h0 in typical cases due to the correlation between u(t) and e0(t). Therefore the IV method has been used which would eliminate the correlation effect. The optimal IV method was developed in [20], but it requires some rather sophisticated iterative procedure. Therefore, the four-step IV method in [19], which uses consistent but not necessarily efficient estimates of the dynamics and of the noise, is used in this paper to retain the appealing simplicity of the IV method. For completeness, the details of the method is included in Appendix A.1. The IV-estimate ^ hIV N in Eq. (69) in Appendix A.1 is used to determine A(q) and BðqÞ. Once BðqÞ is obtained, B(q) can be calculated by singular value decomposition (SVD), similar to [15] 3 2 B1 6B 7 6 27 T 7 6 ð19Þ 6 .. 7 ¼ USV 4 . 5 Bn u There exists a non-singular input transformation T u 2 Rru ru , such that U 1 ¼ B1 T u ¼ U ð1 : ry ; 1 : ru Þ

ð20Þ

U 2 ¼ B2 T u ¼ U ðry þ 1 : 2ry ; 1 : ru Þ

ð21Þ

.. .

ð22Þ

U nu ¼ Bnu T u ¼ U ððnu  1Þry þ 1 : nu ry ; 1 : ru Þ

ð23Þ

T 1 u N

ð24Þ

¼ SV

T

Because the gains of the static nonlinearity and linear dynamics are actually not unique due to the product, any  pair af ; 1a H ; a 6¼ 0, will produce identical input and output measurements. Furthermore, ry 5 ru in general. Hence,

1

T u ¼ ½I þ A1 þ A2 þ    þ Any  ½U 1 þ U 2 þ    þ U nu  ð29Þ Very often, the number of non-zero singular values in S is more than ru. Those extra singular values are significantly smaller compared to the largest ru singular values. Hence, only the largest ru singular values (hence, the first ru columns of V) are used in the above calculations. Remark 1. The above method can be applied to other parameterizations of static nonlinearities. For example, if the nonlinear function is polynomial rðuÞ ¼ n1 u1 þ n2 u2 þ n3 u21 þ n4 u22 þ n5 u1 u2 ¼ Ns where N ¼ ½ n1 n 2 n3 n4 n5 

T s ¼ u1 u2 u21 u22 u1 u2 provided that the transformed input s is ‘‘persistently excited’’—see Section 6. 6. Persistence of excitation The important feature of the proposed method is the nonlinear transformation of the inputs. Once the inputs have been transformed, virtually any linear identification method could be used to identified the augmented linear system. The crucial requirement in linear identification algorithms (e.g., multi-variable output error state space (MOESP), instrumental variables (IV) etc.) is the persistency of excitation of the input vector ~sðtÞ. It had been proven polynomial based transformation satisfies the persistently excited condition [16]. Here we prove the persistently excited condition for the MIMO cardinal spline functions. Definition 1 [16]. A multi-dimensional signal {v(t)} is said to be persistently exciting of order n if the expectation matrix

664

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

R ¼ E½vvT 

ð30Þ

is positive-definite, where

v ¼ vT ðt  1Þ vT ðt  2Þ    Lemma 1. Let 2 m 1 6 6 1 m 6 M ¼6 6 . 6 .. 4 1



..



.

1

vT ðt  nÞ

~s ¼ ½ 1 0 T or ½ 0 0 T or ½ 0 1 T (the center component is removed)

T

1 .. 7 . 7 7 7 7 1 7 5

ð31Þ

T

x Mx ¼ ½ x1 x2

¼ ½ x1 x2

T

m

mx1  x2      xm

¼

þ    þ x2m Þ  2ðx1 x2

ð33Þ

ð34Þ ¼ þ    þ x2m Þ þ ½ðx1  x2 Þ2 þ ðx1  x3 Þ2 2 2 2 þðx1  xm Þ  þ ½ðx2  x3 Þ þ ðx2  x4 Þ þ    þ x22

þ   :

þ ðx4  xm Þ  þ    þ ½ðxm1  xm Þ  ðthere are m  1 terms of

in all the square bracketsÞ

ð35Þ ð36Þ

P0

It is obvious that the above equality holds if and only if xi = 0 "i = 1, 2, . . . , m. Therefore, M is positive definite. h Theorem 1. Assume a ru-dimensional input signal u(t) is chosen such that each component ui(t) is a random (uniformly distributed) sequence of the knots in the ith component. Then the (a  1)-dimensional vector ~sðtÞ constructed in Section 3.3 is persistently excited of order n, i.e., u(t) is strongly persistently excited of orders n, a  1 [denoted as SPE(n, a  1)]. Proof. We first illustrate the idea of our approach to the proof using a SISO case. Consider a SISO spline function with the knot sequence {1, 0, 1}. From the construction of ~s in Section 3.3, all the possible s are ½1

0

0

T

or ½ 0

1

0

T

or ½ 0

0

1

" T

E½~sðt  iÞ~s ðt  jÞ ¼



1 9 1 9

1 9 1 9

ð41Þ

# ð42Þ

T

2

3 ~sðt  1Þ~sT ðt  1Þ ~sðt  1Þ~sT ðt  2Þ    ~sðt  1Þ~sT ðt  nÞ 6 7 .. 6 ~sðt  2Þ~sT ðt  1Þ ~sðt  2Þ~sT ðt  2Þ 7 . 6 7 R ¼ E6 7 .. .. .. 6 7 4 5 . . . ~sðt  nÞ~sT ðt  1Þ 2

2

x2i

0 1

for i, j = 1, 2, . . . , n and j 5 i. Then

þ x1 x3 þ    þ x1 xm Þ

2

T

for i = 1, 2, . . . , n. Therefore, " # 1 0 3 T E½~sðt  iÞ~s ðt  iÞ ¼ 0 13 Similarly,

 2ðx2 x3 þ x2 x4 þ    þ x2 xm Þ      2ðxm1 xm Þ ðx21

T

ð40Þ

ð32Þ

3

6 x þ mx      x 7 2 m7 6 1 7    xm 6 .. 6 7 4 5 .

þ x22

   ~sðt  nÞ~sT ðt  nÞ



Since ~s ¼ ½ 1 0  or ½ 0 0  or ½ 0 1  , then      1 0 0 0 0 T ~sðt  iÞ~s ðt  iÞ ¼ or or 0 0 0 0 0

x1  x2     þ mxm mðx21

ð38Þ

3 ~sðt  1Þ~sT ðt  2Þ    ~sðt  1Þ~sT ðt  nÞ 7 .. 7 ~sðt  2Þ~sT ðt  2Þ . 7 7 .. .. 7 5 . .

ð39Þ

3 1    1 2 x1 3 6 .. 7 6x 7 6 1 m 27 . 7 76 6 7    xm 6 . 76 . 6 . 74 .. 7 6 . . 5 . 1 5 4 . xm 1    1 m 2

R ¼ E½ssT  2 ~sðt  1Þ~sT ðt  1Þ 6 6 ~sðt  2Þ~sT ðt  1Þ 6 ¼ E6 .. 6 4 . ~sðt  nÞ~sT ðt  1Þ

mm

where m 2 Zþ . Then M is positive definite. Proof. Let x 2 Rm . Then 2

ð37Þ

Hence from Definition 1,

3

m

when the inputs are on the knots. Therefore

1 3

1 9 1 9 1 3

1 9 1 9

0 6 6 0 13 6 6 1 1 0 6 9 9 6 1 1 6 9 9 0 13 ¼6 6 . 6 . 6 . 6 6 6 1 1 4 9 9  1 1 2 19 9 I 3 22 6 61 6 9 122 ¼6 6 . 6 . 4 . 1 1 9 22

1 1 9 22

1 9 1 9



.. 1 9 1 9

1 9 1 9



ð43Þ

3

1 1 9 22

ð44Þ

3

7 7 7 7 7 7 1 1 5 22 9 1 I 3 22 .. .

1 I 3 22

.. 

1 9 1 9

7 7 7 .. 7 . 7 7 7 7 1 1 7 9 9 7 7 1 1 7 9 9 7 7 1 05 3 0 13

.

   ~sðt  nÞ~sT ðt  nÞ



.

1 1 9 22

ð45Þ

where  I 22 ¼

1

0

0

1



 and

122 ¼

1

1

1

1



ð46Þ

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

Now we extend the result to the MIMO case. Matrix R takes similar form as in the SISO case. For the ru-dimensional input signal u(t) such that each component ui(t) a random sequence of the knots in the ith component, si ¼ 1

when i ¼ d 1 þ ðd 2  1Þp1 þ    þ ðd ru  1Þ

rY u 1

pm

m¼1

ð47Þ ð48Þ

sj ¼ 0 when j 6¼ i ~s ðwith the length of a  1Þ ¼ s ðwith the length of aÞ with the origin component removed ð49Þ d i ¼ ith index of ui ðtÞ corresponding to the knots in the ith component Then

2

aI ða1Þða1Þ

6 16 6 1ða1Þða1Þ R¼ 26 .. a 6 4 . 1ða1Þða1Þ



1ða1Þða1Þ aI ða1Þða1Þ

.. 

.

1ða1Þða1Þ

ð50Þ

3 1ða1Þða1Þ 7 .. 7 . 7 7 7 1ða1Þða1Þ 5 aI ða1Þða1Þ ð51Þ

Let 2

a1

6 6 1 G¼6 6 .. 4. 1

1



a1 .. 

.

1

1 .. . 1 a1

3 7 7 7 7 5

ð52Þ

ða1Þða1Þ

Then

82 3 >  0 > > G 0 > > .. 7 >6 . 7 1 <6 60 G 7 R ¼ 2 6. 7 .. >6. 7 a > . . 0 > 4 5 > > > : 0  0 G 2 1ða1Þða1Þ 1ða1Þða1Þ 6 6 6 1ða1Þða1Þ 1ða1Þða1Þ þ6 6 .. 6 . 4 1ða1Þða1Þ 

1 kmin ðX þ Y Þ a2 1 P 2 ½kmin ðX Þ þ kmin ðY Þ a

kmin ðRÞ ¼

665

ð56Þ ð57Þ

Obviously, 0 is one of the eigenvalues of Y. Furthermore, !2 n X xT 1nn x ¼ xi P 0 8n 2 Zþ ; x 2 Rn ð58Þ i¼1

which implies kmin(Y) = 0. In addition, according to Lemma 1, G is positive definite, which implies kmin(X) > 0. Therefore, from Eq. (57) kmin ðRÞ > 0

ð59Þ

which means that the expectation matrix R is positive definite. Hence ~sðtÞ is persistently excited, or u(t) is SPE(n, a  1). h Remark 2. In Section 3.3, it has been mentioned that the removal of the basic spline component corresponding to the origin in the transformed vector ~s is essential in the identification step. Lemma 1 shows why it is important. If the diagonal value above is m  1, the matrix M will be positive semi-definite, which implies that the matrix G in the proof of Theorem 1 will be positive semi-definite as well. Lemma 2. If a ru-dimensional input signal u(t) is chosen such that each component ui(t) at least contains a random (uniformly distributed) sequence of the knots in the ith component, then u(t) is SPE(n, a  1). Proof. Since all the knots are included in the signal u(t), it is obvious that this larger set of input (compared with the one in Lemma 1) will be SPE(n, a  1). h



..

.

1ða1Þða1Þ

39 1ða1Þða1Þ > > > 7> .. > 7> 7= . 7 7> > 1ða1Þða1Þ 7 > 5> > > ; 1ða1Þða1Þ

Remark 3. u(t) does not need to be chosen such that each component ui(t) is a random sequence of the knots in the ith component. As long as the input is uniformly distributed and the knots ar chosen among the input values, the expectation matrix R will remain positive definite, i.e., u(t) will remain SPE(n, a  1). 7. Identification procedure

ð53Þ

One way to show that R is positive is to show that the minimum eigenvalue is positive. Let 3 2 G 0  0 6 .. 7 60 G . 7 7 6 X ¼ 6. ð54Þ 7 . 7 6. . . 05 4. 0  0 G ½nða1Þ½nða1Þ Y ¼ 1½nða1Þ½nða1Þ

ð55Þ

Since X and Y are both Hermitian, according to Weyl inequality [21],

From the results in Section 6, the knots must be chosen amongst the test input values. This actually helps restrict the number of choices of knots to a finite number. A complete procedure for identification of the Hammerstein system can be summarized as follows: 1. The amplitude distribution will influence the accuracy of the model. Make sure the test input amplitude should be similar to, but contain more different levels of amplitude than, the input signal during typical process operations. In addition, use higher density in the regions where high model accuracy is desired. If there is not enough a priori

666

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670 Input 1 Input 2

2

Input Step

1

0

3. 4.

−1

5.

−2 0

20

40

60

80 100 Time (sec)

120

140

160

180

6.

Fig. 4. Multi-step input: u1 = {1, 0, 1} and u2 = {2, 0, 2}, with 20 s for each step.

7.

½ 1

2 T ;

½0

T

0 ;

½0

½ 1 T

2 ;

0 T ; ½1

½ 1 2 T ; T

2  ;

½1

½0 T

0 ;

2 T ; T

½1 2

where each step lasts for the same amount of time (see Fig. 4). The sequence of such input signal vectors can be randomized to ensure the model is not identified based on a particular input sequence. 2. If the knots are not given, determine the set of knot vectors K by inspection of the input–output data set. Generally, the region where the nonlinearity is high should be fitted with more knots. When it is not possible to determine the knots by inspection, the following two systematic approaches may be possible: (a) Use all input steps in each direction as knots. The advantage is that the accuracy increases due to increase number of knots. However, the model could become too complicated with too many knots, and the size of the transformed factor may increase exponentially (see Section 3.3 ‘‘Multi-Dimensional Spline Functions’’). This problem will get even worse when the dimension of the model increases. (b) Use a subset of the input steps in each direction as knots. For example, let mi be the desired number of knots and ni be the number of steps in the ith-direction. ni Hence, there are possible choices of knots in mi the ith-direction, which means

that the total number Qr u ni . Perform the identificaof possible K is i¼1 mi tion procedure for each set of K. The ‘‘optimal’’

8. Illustrative example Consider a nonlinear process described by the Hammerstein system as follows: v ¼ f ðuÞ "

3 þ 8=ð1 þ e3u1 Þ  7

¼ 3u1 10½3 þ 8=ð1 þ e Þ  7 ð10u2 þ 50Þ=ð10u2 þ 51Þ  50 51

#

ð60Þ



   1:55 0 0:6 0 yðtÞ þ yðt  1Þ þ yðt  2Þ 0 1:72 0 0:738     0:162 0:038 0:112 0:038 vðt  1Þ þ ¼ 0:1408 0:2408 0:1408 0:2228  vðt  2Þ þ eðtÞ 5

ð61Þ Input 1 Input 2

4 3 2

Input Step

knowledge about the process, it is recommended to use uniformly distributed test signals. One possible input sequence is the multivariable multi-step signal. For example, u1 = {1, 0,1} and u2 = {2, 0, 2}, then the step values are

choice is the one that gives the least loss function (determinant of the estimated covariance matrix of the noise source). Even though both methods are computationally intensive, parallel computing techniques can be used to speed up the computations, especially for the second approach. Approach (b) is the preferred option because the resulting model will be less complicated than that from Approach (a). Determine the orders (nu and ny) for the ARX model. Transform the input using cardinal spline functions in Section 2. This is a very computer intensive step. More then one computer can be used in parallel to obtain the transformation. Use the four-step IV algorithm in Appendix A.1 to obtain the augmented MIMO linear model (A(q) and BðqÞ). Repeat steps 2–4 using a different set of orders (nu and ny) for the ARX model until the residue is smaller than required. Calculate the parameters B(q) in the linear part, and N in the nonlinear part as in Section 5.

1 0

−1 −2 −3 −4 −5

0

100

200

300

400

500 600 Time (sec)

700

800

900

1000

Fig. 5. Multi-step input: u1 = {5: 5} and u2 = {5: 5}, with 20 s for each step (up to 1000 s for illustration purpose).

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670 50

assume that the knots are  known to be K ¼  f4; 2; 1; 0:5; 0; 0:5; 1; 2; 4g , and nu = ny = 2. The f5; 4:5; 4; 3; 0; 4g input is a multivariable multi-step signal sequence which covers all the combination of the following values:

Output 1 Output 2

40 30

Output Response

20

u1 ¼ f5 : 0:5 : 5g

10

u2 ¼ f5 : 0:5 : 5g

0

−10 −20 −30 −40 −50

667

0

100

200

300

400

500 600 Time (sec)

700

800

900

1000

Fig. 6. Output response for multi-step input: u1 = {5: 5} and u2 = {5: 5} (up to 1000 s for illustration purpose).

where similar nonlinearities can often be found in mineral processing plants where continuous reagents are added for activation and pH control. For simplicity,

i.e., a total of 21 · 21 = 441 steps. Each step lasts for 20 s, while data is taken every second. The sequence is then randomly scrambled (as shown in Fig. 5). Random white noise is added to the output, with je1(t)j 6 0.5 and je2(t)j 6 0.5 for all t P 0. The output response is shown in Fig. 6. By using the proposed method (with the use of Matlab), the results have been obtained (see Appendix A.2). The identified nonlinearity and the original nonlinearity are shown in Figs. 7 and 8 respectively. The multivariable cubic polynomials with cross-terms are also used for comparison studies. The linear parameterization becomes 2 3 n1;1 u1 þ    þ n1;3 u31 þ n1;4 u2 þ    þ n1;7 u1 u2 6 þ    þ n u3 u3 7 1;15 1 2 6 7 v¼6 7 ¼ Ns 4 n2;1 u1 þ    þ n2;3 u31 þ n2;4 u2 þ    þ n2;7 u1 u2 5 þ    þ n2;15 u31 u32

40

5

y2

y1

20 0

0 −20 −40 −5

−5 −5

−5

0

–5

0

0 u2

5

5

0 u2

u1

5

u1

5

4

40

2

20

0

y2

y1

Fig. 7. Nonlinearity estimated using the cardinal cubic spline functions: nonlinear output 1 (left); nonlinear output 2 (right).

0

−2

−20

−4 −5

−40 −5

−5

0

−5

0

0 u2

5

5

u1

0 u2

5

5

u1

Fig. 8. Original nonlinearity: nonlinear output 1 (left); nonlinear output 2 (right).

668

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

y2

y1

5

0

−5 −5

30 20 10 0 −10 −20 −30 −5

−5

0

−5

0

0 u2

5

0 u2

u1

5

5

u1

5

Fig. 9. Nonlinearity estimated using polynomials with cross-terms: nonlinear output 1 (left); nonlinear output 2 (right).

where

s ¼ u1    u31

u2

   u1 u2



u31 u32



5 Input 1 Input 2

4 3 2

0 −1 −2 −3 −4

0

50

100

150 Time (sec)

200

250

300

Fig. 10. Triangular pulse.

6

5

True System (with noise) Spline Estimation Polynomial Estimation

0

5 4

Pulse Response (y1)

−5

Pulse Response (y1)

Input

1

The results were obtained using the same linear identification method described in the paper (see Appendix A.2). The identified nonlinear functions are plotted in Fig. 9. Comparing the nonlinearities in Figs. 7 and 9, it can be seen that the spline function approach is significantly better in identifying the nonlinearities. The original Hammerstein system (with noise), the identified model using cardinal spline functions and the one using polynomials are simulated with triangular pulse shown in Fig. 10. The pulse responses are shown in Fig. 11. The pulse responses of the original system and the identified system using spline functions are very similar. This also shows that the proposed method is not very sensitive to output noise. The identified system using polynomials on the other hand differs noticeably from the true system because of the error in the static nonlinearity. From the results presented in Appendix A.2, one may see that the reason why the proposed method provides better results is that there are more parameters in the spline functions compared with the cross-term polynomials. However, implementation of higher order polynomial terms will make little improvement in the accuracy of the nonlinear model. On the contrary, the use of higher order polynomials leads to more oscillatory nonlinearities and makes the linear identification problem ill-conditioned.

3 2 1

−10 −15 −20 −25

0 −1

−5

−30

0

100

200 Time (sec)

300

−35

True System (with noise) Spline Estimation Polynomial Estimation

0

100

200 Time (sec)

Fig. 11. Pulse responses: output 1 (left); output 2 (right).

300

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

3. Let

9. Conclusion

ð2Þ

In conclusion, a new method using cardinal spline functions has been proposed to model MIMO Hammerstein systems. The main contribution is the use of cardinal spline functions. Besides being able to model MIMO nonlinearities that are highly coupled together, spline functions are much more flexible in modelling varieties of nonlinear functionalities as they act locally (e.g., the coefficients of cardinal spline functions determine the behaviour of the nonlinearities around the corresponding knots, but only have minor effect on the knots elsewhere). After the input signals have been transformed using cardinal spline functions, the original nonlinear identification problem simply becomes a linear one. The persistently excited condition, under mild assumptions, for the transformed inputs is also proven. Hence, any linear identification algorithm can be used, in particular the four-step IV algorithm for ARX models, which gives consistent estimates under quite general conditions. The identified nonlinear system can be parameterized by a set of matrices and knot vectors, which makes the numerical simulation and control design easier. Simulation studies, one of which is presented in Section 8, have shown that the proposed method is superior to existing identification methods based on polynomials. Acknowledgement The first author of this paper would like to acknowledge the financial support from the The Australian Postgraduate Awards. Appendix A

ð2Þ

ð2Þ

ð2Þ

LðqÞ^ wN ðtÞ ¼ eðtÞ

^N ðqÞ, the final Using these instruments and a prefilter L IV estimate of h is " #1 N N X X ð2Þ ^hIV ¼ f ðtÞuTF ðtÞ fð2Þ ðtÞy TF ðtÞ ð69Þ N t¼1

^N ðqÞuðtÞ uF ðtÞ ¼ L ^N ðqÞyðtÞ y F ðtÞ ¼ L A.2. Identified models A. Proposed  0:391 A1 ¼ 1:285  0:167 B1 ¼ 0:134  n11 n12 N¼ n21 n22

The IV estimate of h is " #1 N N X X ð1Þ IV T ^ h ¼ f ðtÞu ðtÞ fð1Þ ðtÞy T ðtÞ t¼1

ð63Þ ð64Þ

n13 ¼ ½ 3:912 2:544 n14 ¼ ½ 3:825

3:818

3:641

n21 ¼ ½ 38:60 n22 ¼ ½ 6:651

4:070 

3:353 2:389 3:566 2:576

3:648 4:001

38:63

35:94

34:87 38:49 the corresponding ARX

0:102 2:657

3:956 3:951  3:857

2:536

0:012

4:179 

3:467 2:580

3:643 3:974

0:097

4:046 

3:257 2:438

3:797 4:058

0:188

4:155 

3:136 1:908

3:792 3:930

3:803

2:544

3:476 2:124

3:683 4:053

3:881

n16 ¼ ½ 3:831

ð65Þ

   n26

3:947

n15 ¼ ½ 3:897

t¼1

ð2Þ Denote this estimate by ^ hN and ^ ð2Þ ^ ð2Þ model by A N ðqÞ and BN ðqÞ.

n12 ¼ ½ 3:975 2:725

^ Nð1Þ ðqÞxð1Þ ðtÞ ¼ B ^ Nð1Þ ðqÞuðtÞ A

fð1Þ ðtÞ ¼ xð1Þ ðt  1Þ     xð1Þ ðt  ny Þ uðt  1Þ    T uðt  nu Þ

method:    0:150 0:344 0:151 ; A2 ¼ 1:887 1:047 0:906    0:045 0:098 0:043 ; B2 ¼ 0:2248 0:372 0:229     n16

4:078

2:598

ð1Þ Denote the estimate by ^ hN and the corresponding ARX ^ ð1Þ ^ ð1Þ model by A N ðqÞ and BN ðqÞ. 2. Generate the instruments

N

t¼1

where

n11 ¼ ½ 4:139 ð62Þ

ð67Þ

Estimate L(q) using the LS method and denote the result ^N ðqÞ. by L 4. Let x(2)(t) be defined analogously to Eq. (63), and let

^N ðqÞ xð2Þ ðt  1Þ     xð2Þ ðt  ny Þ uðt  1Þ    fð2Þ ðtÞ ¼ L T ð68Þ uðt  nu Þ

1. Estimate h by the least-square method " #1 N N X X T ^hLS ¼ uðtÞu ðtÞ uðtÞy T ðtÞ t¼1

ð66Þ

^ N ðtÞ and postulate and AR model of order ny + nu for w

where

N

ð2Þ

^ N ðqÞyðtÞ  B ^ N ðqÞuðtÞ ^ N ðtÞ ¼ A w

A.1. Four-step IV algorithm [19]

t¼1

669

6:369

7:636

4:938 6:461

0:068

3:943 

26:62

0:614

24:76

38:49  6:592 6:190 

0:379 3:328

670

K.H. Chan et al. / Journal of Process Control 16 (2006) 659–670

n23 ¼ ½ 2:198

2:181

3:096

0:792 1:241 n24 ¼ ½ 2:274

2:391

0:500

0:259

2:084

n26 ¼ ½ 0:415

0:225

0:714 " K¼

0:679

0:442

0:524

0:946

0:569

0:090 

0:247

0:495

1:111

1:184 

0:825

0:003

0:005

1:051 

1:089 1:382 n25 ¼ ½ 0:462

1:245

0:522

0:364

0:182

0:270 

f4; 2; 1; 0:5; 0; 0:5; 1; 2; 4g

#

f5; 4:5; 4; 3; 0; 4g

B. Cubic polynomials with cross-terms " # " # 0:449 0:020 0:305 0:052 A1 ¼ ; A2 ¼ 1:160 1:764 0:950 0:811 " # " # 0:172 0:035 0:075 0:002 B1 ¼ ; B2 ¼ 0:127 0:256 0:337 0:208 3 2 1:960u1  0:004u21  0:052u31 þ 0:057u2  0:004u22 7 6 6  0:004u32  0:053u1 u2  0:005u21 u2 þ 0:002u31 u2 7 7 6 7 6 7 6  0:003u1 u2 þ 0:0001u2 u2  6:2  105 u3 u2 2 1 2 1 2 7 6 7 6 7 6 þ 0:004u1 u3 þ 0:0002u2 u3  0:0002u3 u3 2 1 2 1 2 7 6 f ðuÞ ¼ 6 7 6 0:941u1 þ 0:029u2  0:028u3  0:421u2 þ 0:009u2 7 1 1 2 7 6 7 6 6 þ 0:016u3  0:716u1 u2 þ 0:018u2 u2 þ 0:018u3 u2 7 2 1 1 7 6 7 6 7 6  0:245u1 u2  0:0001u2 u2 þ 0:007u3 u2 2 1 2 1 2 5 4 þ 0:082u1 u32  0:0005u21 u32  0:002u31 u32

References [1] R.K. Pearson, M. Pottmann, Gray-box identification of blockoriented nonlinear models, Journal of Process Control 10 (4) (2000) 301–315. [2] Y. Zhu, Identification of Hammerstein models for control using ASYM, International Journal of Control 73 (18) (2000) 1692–1702. [3] Y. Zhu, Estimation of an N–L–N Hammerstein–Wiener model, Automatica 38 (9) (2002) 1607–1614.

[4] E.J. Dempsey, D.T. Westwick, Identification of Hammerstein models with cubic spline nonlinearities, IEEE Transactions on Biomedical Engineering 51 (2) (2004) 237–245. [5] M. Boutayeb, M. Darouach, Identification of the Hammerstein model in the presence of bounded disturbances, in: Proceedings of the IEEE International Conference on Industrial Technology, vol. 1, 2000, pp. 590–594. [6] G. Dolanc, S. Strmcˇnik, Identification of nonlinear systems using a piecewise-linear Hammerstein model, Systems & Control Letters 54 (2) (2005) 145–158. [7] E.W. Bai, Frequency domain identification of Hammerstein models, IEEE Transactions on Automatic Control 48 (4) (2003) 530–542. [8] L. Jia, M.S. Chiu, S.S. Ge, Iterative identification of neuro-fuzzy based Hammerstein model with global convergence, Industrial and Engineering Chemistry Research 44 (6) (2005) 1823–1831. [9] L. Jia, M.S. Chiu, S.S. Ge, A noniterative neuro-fuzzy based identification method for Hammerstein processes, Journal of Process Control 15 (7) (2005) 749–761. [10] S.W. Sung, System identification method for Hammerstein processes, Industrial and Engineering Chemistry Research 41 (17) (2002) 4295– 4302. [11] J.A. Ramos, J.F. Durand, Identification of nonlinear systems using a B-Splines parametric subspace approach, in: Proceedings of the American Control Conference, vol. 6, 1999, pp. 3955–3959. [12] H. Al-Duwaish, M.N. Karim, A new method for the identification of Hammerstein model, Automatica 33 (10) (1997) 1871–1875. [13] J. Abonyi, R. Babusˇka, M.A. Botto, F. Szeifert, L. Nagy, Identification and control of nonlinear systems using fuzzy Hammerstein models, Industrial and Engineering Chemistry Research 39 (11) (2000) 4302–4314. [14] S. Lakshminarayanan, S.L. Shah, K. Nandakumar, Identification of Hammerstein models using multivariate statistical tools, Chemical Engineering Science 50 (22) (1995) 3599–3613. [15] M. Verhaegen, D. Westwick, Identifying MIMO Hammerstein systems in the context of sub-space model identification methods, International Journal of Control 63 (2) (1996) 331–349. [16] P. Stoica, T. So¨derstro¨m, Instrumental variable methods for identification of Hammerstein systems, International Journal of Control 35 (3) (1982) 459–476. [17] Y.J. Lee, S.W. Sung, S. Park, S. Park, Input test signal design and parameter estimation method for the Hammerstein–Wiener processes, Industrial and Engineering Chemistry Research 43 (23) (2005) 7521–7530. [18] W.J. Whiten, The use of multi-dimensional cubic spline functions for regression and smoothing, The Australian Computer Journal 3 (2) (1971) 81–88. [19] L. Ljung, System Identification: Theory for the User, Prentice Hall, New Jersey, 1999. [20] P. Stoica, T. So¨derstro¨m, Optimal instrumental-variable methods for identification of multivariable linear systems, Automatica 19 (4) (1983) 425–429. [21] A. Amir-Moez, Extreme properties of a Hermitian transformation and singular values of the sum and product of linear transformation, Duke Math. J. 23 (1956) 463–476.