Available online at www.sciencedirect.com
Automatica 40 (2004) 815 – 822
www.elsevier.com/locate/automatica
Brief Paper
Optimal expansions of discrete-time Volterra models using Laguerre functions Ricardo J.G.B. Campelloa;1 , G+erard Favierb;∗ , Wagner C. do Amaralc a COPOP/UNISANTOS,
R. Dr. Carvalho de Mendonca 144, CEP 11070-906, Santos-SP, Brazil B.P. 121, 06903 Sophia Antipolis C+edex, France c DCA/FEEC/UNICAMP, CEP 13083-970, Campinas-SP, Brazil
b I3S/CNRS/UNSA,
Received 20 December 2002; received in revised form 24 June 2003; accepted 21 November 2003
Abstract This work is concerned with the optimization of Laguerre bases for the orthonormal series expansion of discrete-time Volterra models. The aim is to minimize the number of Laguerre functions associated with a given series truncation error, thus reducing the complexity of the resulting 8nite-dimensional representation. Fu and Dumont (IEEE Trans. Automatic Control 38(6) (1993) 934) indirectly approached this problem in the context of linear systems by minimizing an upper bound for the error resulting from the truncated Laguerre expansion of impulse response models, which are equivalent to 8rst-order Volterra models. A generalization of the work mentioned above focusing on Volterra models of any order is presented in this paper. The main result is the derivation of analytic strict global solutions for the optimal expansion of the Volterra kernels either using an independent Laguerre basis for each kernel or using a common basis for all the kernels. ? 2003 Elsevier Ltd. All rights reserved. Keywords: Nonlinear systems; Volterra series; Laguerre functions; Optimization; Model reduction
1. Introduction A discrete-time Volterra model is generically described as ∞ ∞ m ∞ y(k) = ··· hm (k1 ; k2 ; : : : ; km ) u(k − kj ); (1) m=1 k1 =0
km =0
j=1
where u, y and hm are the input, the output and the mth-order kernel, respectively. It is a non-linear generalization of the impulse response model as well as a speci8c realization of the following system representation: y(k) = H({u( )}k =−∞ );
(2)
where H is a generic non-linear operator. Indeed, model (1) can be seen as a multidimensional Taylor development A reduced version of this paper was presented at the IFAC Symposium on System Identi8cation (SYSID), Rotterdam, The Netherlands, August 2003. This paper was recommended for publication in revised form by Associate Editor Johan Schoukens under the direction of Editor Torsten SCoderstrCom. ∗ Corresponding author. E-mail addresses:
[email protected] (R.J.G.B. Campello),
[email protected] (G. Favier),
[email protected] (W.C. do Amaral). 1 Also for correspondence.
0005-1098/$ - see front matter ? 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2003.11.016
of H if this operator is analytic (EykhoE, 1974). The Taylor series converges if the input signal u is bounded (and normalized so that |u| ¡ 1). An immediate consequence is that the higher-order terms can be neglected in such a way that only the 8rst M Volterra kernels need to be taken into account, where M is the order of the resulting model. In addition, if the desired representation is stable, then the elements hm (k1 ; : : : ; km ) with kl ¿ m (∀l ∈ {1; : : : ; m}) can be ignored and the Volterra model can be simpli8ed as y(k) =
m M m=1 k1 =0
···
m km =0
hm (k1 ; k2 ; : : : ; km )
m
u(k − kj ): (3)
j=1
Boyd and Chua (1985) showed that this model can approximate to desired accuracy any non-linear system in (2) that meets the following requirements: (i) The operator H is causal, continuous, time-invariant and has fading memory; (ii) The input u is (upper and lower) bounded. These requirements comprise a wide-class of real-world systems. The main problem in practice is how to select the Volterra kernels that provide an adequate representation of the system to be modeled. The estimation of Volterra kernels for modeling of non-linear systems has been investigated for decades
816
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
(EykhoE, 1974; Billings, 1980; Schetzen, 1980). The main drawback is that the kernels are, in principle, non-parameterized functions whose measurement is possible only if their individual contributions can be separated from the total system response (Schetzen, 1980). A straightforward approach can be derived if the elements of the Volterra kernels, i.e., the coeIcients hm (k1 ; : : : ; km ), are treated as individual parameters to be estimated. In this case, the Volterra model (3) is linear in these parameters and classical estimation algorithms like the recursive least squares (RLS) can be applied. This approach, however, usually makes the model over-parameterized. Hence, it is important to reduce its parametric complexity before estimation to improve numerical conditioning while decreasing the variance of the estimated parameters. An interesting approach to dealing with this problem, 8rst suggested by Wiener (1958), is the expansion of the Volterra kernels using orthonormal basis functions, such as the Laguerre and Kautz functions (Broome, 1965; Wahlberg, 1991; Wahlberg, 1994; Wahlberg & MCakilCa, 1996) or generalized orthonormal basis functions (GOBFs) (Heuberger, Van den Hof, & Bosgra, 1995; Ninness & Gustafsson, 1997). The Laguerre basis is a special case of the Kautz basis, which in turn is a particular realization of GOBFs. Although a truncated Laguerre series usually requires more functions than a truncated Kautz or GOBF series when representing oscillatory dynamics or systems with multiple representative modes, the Laguerre functions have the desirable property that they are completely determined by a single parameter. Mostly for this reason, Laguerre function based models have been widely used in the context of system identi8cation and control (see for instance Marmarelis, 1993; Dumont & Fu, 1993; Oliveira, Amaral, Favier, & Dumont, 2000, and references therein). In practice, the parsimony of these models depends strongly on the choice of the Laguerre pole. When this pole is set close to the dominant dynamic of the system to be modeled a signi8cant reduction in the number of functions and parameters needed to approximate the system with arbitrary accuracy can be achieved. This is the classic approach to modeling and model-based control of stable linear systems using orthonormal functions (Zervos & Dumont, 1988; Wahlberg & Ljung, 1992; Finn, Wahlberg, & Ydstie, 1993; Elshafei, Dumont, & Elnaggar, 1994). This approach is, however, of a heuristic nature. A systematic optimal selection of the Laguerre pole was 8rst proposed by Clowes (1965) in the case of linear continuous-time systems. Later, Fu and Dumont (1993) approached this problem in the context of linear discrete-time systems by deriving an analytic optimal solution which forces a fast convergence of the function series by linearly increasing the cost assigned to each additional Laguerre coeIcient. Tanguy, Vilb+e, and Calvez (1995) and Tanguy, Morvan, Vilb+e, and Calvez (2000) showed that this analytic solution minimizes the upper bound of the squared norm of the error resulting from the truncation of the series into a 8nite number of functions. In the context of non-linear sys-
tems, Campello, Amaral, and Favier (2001) generalized Fu and Dumont’s approach to second-order Volterra models by proposing the use of two independent Laguerre bases for the expansion of the 8rst- and second-order Volterra kernels. Such a strategy allows the application of Fu and Dumont’s solution to accomplishing the optimal expansion of the 8rst-order kernel, which is equivalent to the impulse response of a linear system. In addition, an analogous solution for the optimal expansion of the second-order kernel was derived. The present paper extends the above-mentioned works by deriving an analytic global strict solution for the optimal Laguerre series expansion of Volterra kernels of any order. This solution minimizes an upper bound for the series truncation error and relies on the use of an independent Laguerre basis for the expansion of each kernel. For the sake of computational simplicity, a solution to the problem of expanding all the kernels using a common basis is also provided. In the next section the Laguerre series expansion of Volterra kernels is brieOy reviewed. The optimal selection of Laguerre bases for the expansion of Volterra models is discussed in Section 3. Two examples are provided in Section 4 and, 8nally, the conclusions are addressed in Section 5. 2. Laguerre expansion of Volterra kernels Assuming that the kernels hm in (1) are absolutely summable on [0; ∞[, which means that the corresponding Volterra model is stable, then they can be represented by means of orthonormal bases of functions. Considering the expansion of these kernels using Laguerre functions (Dumont & Fu, 1993), but using an independent basis of functions for each kernel, yields hm (k1 ; : : : ; km ) =
∞
···
i1 =1
∞
i1 ;:::;im
m
im =1
m; ij (kj );
(4)
j=1
where m; l is the lth function of the mth Laguerre basis and (·) are the corresponding expansion coeIcients, given by i1 ;:::;im =
∞
···
k1 =0
∞
hm (k1 ; : : : ; km )
m
m; ij (kj ):
(5)
j=1
km =0
From Eqs. (1) and (4), a Volterra model of order M can easily be rewritten as y(k) =
M ∞ m=1 i1 =1
···
∞ im =1
i1 ;:::;im
m
Q m; ij (k);
(6)
j=1
where Q m; l is the outputof the lth Laguerre 8lter of the ∞ mth basis, i.e., Q m; l (k)= =0 m; l ( )u(k − ). The transfer functions of the corresponding Laguerre 8lters are given by l−1 2 1 − pm 1 − pm z ; (7) m; l (z) = Z{m; l ( )} = z − pm z − pm
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
where Z denotes the unilateral z-transform and pm ∈ ] − 1; 1[ is the real stable pole which parameterizes the mth basis.
with j = 1; 2 and ∞ ∞ 1 ··· kl h2m (k1 ; : : : ; km ); S1; l =
hm 2 k1 =0
3. Optimization of the Laguerre poles For computational reasons, the in8nite-dimensional expansion of the mth-order kernel in (4) is, in practice, truncated into a 8nite number of functions. The underlying problem considered here is how to select the Laguerre pole pm in (7) so as to minimize the error resulting from this truncation. Particularly, the pole p1 which parameterizes the Laguerre expansion of the 8rst-order kernel h1 can be obtained by solving the following optimization problem (Fu & Dumont, 1993): ∞ i1 i21 ; (8) min J1 ,
−1¡p1 ¡1
i1 =1
whose solution forces a fast convergence of the function series by linearly increasing the cost assigned to each additional Laguerre coeIcient (·) . In the work cited above, the authors showed that problem (8) has an analytic strict global solution if the corresponding linear model is BIBO ∞ stable ( k1 =0 |h1 (k1 )| ¡ ∞) and strictly proper (h1 (0)=0). This solution has been shown to globally minimize the upper bound of the squared norm of the error resulting from the truncation of the Laguerre series into a 8nite number of functions (Tanguy et al., 1995, 2000). In the sequel, these results are extended to the optimal expansion of Volterra models of any order. Theorem 1. Let the mth-order Volterra kernel satisfy the stability and unit delay conditions, i.e. ∞ ∞ (1) k1 =0 · · · km =0 |hm (k1 ; : : : ; km )| ¡ ∞. (2) hm (k1 ; : : : ; km ) = 0 if ∃l ∈ {1; : : : ; m}: kl = 0. Then, de@ning the cost function ∞ ∞ Jm , ··· (i1 + · · · + im )i21 ;:::;im i1 =1
(9)
im =1
the following equality holds: Jm =
S2; l =
k1 =0
(Q1; m −
+ (Q2; m − 2Q1; m + 1)pm + Q1; m m hm 2 ; 2 1 − pm (10)
where Q1; m , Q2; m and hm are constant terms which depend exclusively on the mth-order Volterra kernel, as follows: ∞ ∞ ··· h2m (k1 ; : : : ; km ); (11)
hm 2 = k1 =0 m
km =0
1 Sj; l ; Qj; m = m l=1
(12)
(13)
km =0
∞ ∞ 1 · · · kl [Tl hm (k1 ; : : : ; km )]2 ;
hm 2
(14)
km =0
where l = 1; : : : ; m and the operator Tl is de@ned as Tl hm (k1 ; : : : ; km ) = hm (k1 ; : : : ; kl + 1; : : : ; km ) − hm (k1 ; : : : ; kl ; : : : ; km ). Proof. The proof is based on a strategy along the lines of (Fu & Dumont, 1993) generalized to the mth-dimensional kernel expansion (4). The main steps are summarized in Appendix A and the complete demonstration can be found in Campello (2002). Remark 2. Although the 8rst condition of Theorem 1 implies hm 2 ¡ ∞ (Desoer & Vidyasagar, 1975), the same does not hold with respect to the terms S1; l and S2; l in (13) and (14), respectively. The convergence of these terms is necessary so that the cost function Jm in (10) be well-de8ned and can be optimized. The convergence of the terms associated with the 8rst-order kernel is guaranteed in the modeling of sampled linear systems, but not necessarily in the present context. In the general case, the convergence depends on the rates of decay of the kernel hm (k1 ; : : : ; km ), i.e., the rates with which the kernel tends to zero as k1 ; : : : ; km go to in8nity. A suIcient condition for convergence is to assume that the kernel is non-null ( hm 2 = 0) and has 8nite memory (EykhoE, 1974), i.e., hm (k1 ; : : : ; km ) = 0 if kl ¿ m (∀l ∈ {1; : : : ; m}), where m ¡ ∞ (no matter how large it is). This non-restrictive assumption results in the usual approximate model representation given by Eq. (3). Bearing in mind the above considerations and assuming that the terms S1; l and S2; l as de8ned in (13) and (14) are 8nite, it is possible to select the pole pm which parameterizes the Laguerre expansion of the mth-order Volterra kernel hm by solving the following optimization problem: min JQm (pm ) ,
2 1)pm
817
pm ∈P
Jm ; m hm 2
(15)
where P , {pm ∈ R : |pm | ¡ 1} and Jm is given by Eq. (10). Before proceeding to solve this problem, the following two lemmas are necessary (please, refer to Appendix A for the corresponding proofs): Lemma 3. If the kernel hm satis@es the second condition stated in Theorem 1, then Q1; m − 1 ¿ 0. Lemma 4. If the kernel hm satis@es the conditions stated in Theorem 1, then 4Q1; m Q2; m − Q2;2 m − 2Q2; m ¿ 0.
818
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
Theorem 5. Problem (15) has a strict global solution given by ∗ pm =
2Q1; m − 1 − Q2; m 2Q1; m − 1 + 4Q1; m Q2; m − Q2;2 m − 2Q2; m
(2Q1; m − 1)2pm + 1 = 0: (Q2; m − 2Q1; m + 1)
pm; 1 =
2Q1; m − 1 −
4Q1; m Q2; m −
− 2Q2; m
2Q1; m − 1 − Q2; m
pm; 2 =
4Q1; m Q2; m − Q2;2 m − 2Q2; m
2Q1; m − 1 − Q2; m
:
4Q1; m Q2; m − Q2;2 m − 2Q2; m ¿ 0
2Q1; m − 1 −
4Q1; m Q2; m −
Q2;2 m
− 2Q2; m ¿ 0
m; ij (kj ): (22)
j=1
Using orthonormality property of the Laguerre functions, the ∞ i.e., k=0 m; q (k)m; r (k) = qr , where qr is the Kronecker delta, it can be shown that ∞ ∞ ··· em2 (k1 ; : : : ; km )
em 2 , k1 =0
=
km =0
∞
∞
···
i1 =Nm +1
i21 ;:::;im :
(23)
im =Nm +1
(18)
(19)
3.1. Optimal model expansion using a single Laguerre basis
(20)
which implies
i1 ;:::;im
im =Nm +1
Therefore, since the optimal pole given by Eq. (16) globally minimizes the cost function Jm , then it also minimizes the upper bound of the sum of the squared errors resulting from the truncation of the series into Nm functions. Moreover, it is also clear from (24) that increasing the number Nm of functions causes this upper bound to decrease.
Since pm; 1 pm; 2 = 1, then pm; 1 and pm; 2 have the same sign. In addition, as 2Q1; m − 1 ¿ 0 (from Lemma 3) and 4Q1; m Q2; m − Q2;2 m − 2Q2; m ¿ 0 (Lemma 4), it follows that 2Q1; m − 1 +
···
m
(17)
and 2Q1; m − 1 +
=
∞
Then, using the above equation well as the de8nition as ∞ ∞ of Jm in (9) and the inequality i1 =1 · · · im =1 (i1 + · · · + ∞ ∞ im )i21 ;:::;im ¿ m(Nm +1) i1 =Nm +1 · · · im =Nm +1 i21 ;:::;im yields Jm : (24)
em 2 6 m (Nm + 1)
The solutions of Eq. (17) are Q2;2 m
∞ i1 =Nm +1
Proof. Since the cost function JQ m is a pseudo-convex function inside the domain P (see Appendix B), then any solution to d JQm =dpm = 0 in P is a global solution to problem (15) (Bazaraa, Sherali, & Shetty, 1993). Assuming that Q2; m − 2Q1; m + 1 = 0 (otherwise the solution will clearly ∗ = 0), the derivative d JQ m =dpm will be equal to zero if be pm and only if
em (k1 ; : : : ; km ) , hm (k1 ; : : : ; km ) − hQm (k1 ; : : : ; km )
(16)
if the kernel hm satis@es the conditions stated in Theorem 1.
2 pm +
truncation error is given by
(21)
and, accordingly, |pm; 1 | ¡ |pm; 2 |. Finally, since pm; 1 pm; 2 = 1, then |pm; 1 | ¡ 1 (pm; 1 ∈ P) and |pm; 2 | ¿ 1 (pm; 2 ∈ P). ∗ Therefore, the strict global solution to problem (15) is pm = pm; 1 = 1=pm; 2 , that is, Eq. (16). Theorem 6. The optimal Laguerre pole given by Eq. (16) minimizes the upper bound of the squared norm of the error resulting from the truncation of the series expansion into a @nite number Nm of functions. Proof. Truncating the kernel expansion given by Eq. (4) functions yields hQm (k1 ; : : : ; km ) = Nminto N mNmLaguerre m j=1 m; ij (kj ). The corresponding i1 =1 · · · im =1 i1 ;:::;im
So far, the Laguerre expansion of an M th-order Volterra model has been approached by adopting M independent bases of functions for the development of the kernels. For the sake of computational simplicity, but with the price of the corresponding series truncation error being larger, it is possible to expand all the kernels using a single Laguerre basis. This is equivalent to setting p1 =· · ·=pM , p in (7). In this case, the optimization problem can be rewritten as min
−1¡ p ¡1
J ,
M Jm m=1
m
;
(25)
where the cost function J is represented using the above de8nitions as well as Eq. (10) as (QQ 1 − 1)p2 + (QQ 2 − 2QQ 1 + 1)p + QQ 1 (26) J= 1 − p2 M M 2 Q with = m=1 hm 2 , QQ 1 = 1 m=1 Q1; m hm , Q 2 = M 1 2 m=1 Q2; m hm , and Qj; m de8ned in (12). Theorem 7. Problem (25) has a strict global solution given by 2QQ 1 − 1 − QQ 2 (27) p∗ = 2QQ 1 − 1 + 4QQ 1 QQ 2 − QQ 2 − 2QQ 2 2
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
if the kernels hm (m = 1; : : : ; M ) satisfy the conditions stated in Theorem 1. Proof. The proof is the same as that of Theorem 5 (mutatis mutandis) since expressions (26) and (10) are equivalent in form and the terms QQ 1 and QQ 2 have the same relevant properties of Q1; m and Q2; m presented in Lemmas 3 and 4. It is important to notice from the de8nitions of QQ 1 and QQ 2 that if the squared norm of a speci8c kernel tends to be much larger than those of the others, then solution (27) tends to the respective individual solution given by Eq. (16). This result is in conformity with the relative importance of the corresponding kernel to the overall model behavior. Theorem 8. The optimal Laguerre pole given by Eq. (27) minimizes the upper bound of the squared norm of the error resulting from the truncation of the series expansion into a @nite number N of functions. Proof. Theorem 6 states that em 2 6 Jm =(Nm + 1)m, where Nm is the number of functions used for expanding the mth-order kernel hm . When using a single Laguerre basis for expanding all the M kernels of the model, with N1 = · · · = NM , MN , it is straightforward to conclude that M 2
e
6 + 1)m and, from the de8m m=1 m=1 Jm =(N M 2 nition of J in (25), that m=1 em 6 J=(N + 1). It is clear from this relation that since the optimal pole given by Eq. (27) globally minimizes the cost function J , then it also minimizes the upper bound of the sum of the squared errors resulting from the truncation of the series into N functions. 4. Examples 4.1. Example 1 Consider a second-order Volterra model with kernels h1 and h2 given by h1 (k1 ) = e− k1 and h2 (k1 ; k2 ) = e− 1 k1 e− 2 k2 , respectively, where k1 ; k2 = 1; 2; : : : and ; 1 ; 2 ¿ 0. The 8rst-order kernel h1 is equivalent to the normalized impulse response of a stable 8rst-order continuous system with transfer function !=(s+!) sampled by means of a zero-order-hold circuit with sampling period T , where = !T . It is known that the optimal Laguerre pole—with respect to (8)—for the orthonormal series expansion of that kernel is exactly the pole of the resulting discrete-time system, i.e., p1∗ = e− (Fu & Dumont, 1993). The second-order kernel h2 is clearly a two-dimensional generalization of the 8rst-order one. In this case, analytic expressions for Q1; 2 and Q2; 2 can be derived from equations (11) to (14) (Campello et al., 2001). These expressions can be used to analyze the behavior of the optimal Laguerre pole in (16) for several values of 1 and 2 . It can be shown that if 1 and/or 2 tend to zero, then
819
p2∗ → 1, i.e., the stability bound. The reason is that the kernel tends to be nearly constant (with a very slow dynamic) along the direction ki as i → 0. In the speci8c case when ∗ − 1 = e− 2 , which 1 = 2 it can also be shown that p2 = e is a generalization of the solution concerning the 8rst-order kernel. This result means that if 1 ; 2 → ∞, then p2∗ → 0, i.e., the (discrete) optimal pole follows the faster and faster dynamics of the kernel. 4.2. Example 2 The identi8cation of a simulated polymerization CSTR (continuous stirred tank reactor) is considered. A four state model of this non-linear process is given by (Maner, Doyle III, Ogunnaike, & Pearson, 1996; Doyle III, Pearson, & Ogunnaike, 2002): x˙1 (t) = 60 − x1 (t)(2:4568 x2 (t) + 10); x˙2 (t) = 80u(t) − 10:1022x2 (t); x˙3 (t) = 0:002412x1 (t) x2 (t) + 0:11218x2 (t) − 10x3 (t); x˙4 (t) = 245:9811x1 (t) x2 (t) − 10x4 (t); y(t) = x4 (t)=x3 (t);
(28)
where y is the number average molecular weight (NAMW) of the substance derived from the reaction (kg/kmol) and u is the inlet reaction initiator Oow rate (m3 =h). The process is simulated from t = 0 up to 24 h with the input u being a sequence of steps with period of 1 h and random amplitude uniformly distributed within the operational interval [0:002; 0:02]. The data set is sampled (T = 0:2 h) and split into two parts: One half intended for the estimation of a Volterra model of this process and the other half intended for the validation of the resulting model. These data are normalized within the interval [ − 1; 1] and the 8rst 8ve samples (1 h) are used exclusively to recover the right Laguerre states which are unknown (and set equal to zero) at t = 0. A Volterra model given by Eq. (3) plus an additive constant term h0 is estimated in such a way that the elements of the kernels are treated as individual parameters. For practical reasons, a second-order model (M = 2) has been adopted, as usual in the literature (Billings, 1980; Dumont & Fu, 1993). The memory of the model has been selected as 1 = 2 = 4 based upon preliminary experiments, which showed that the modeling error cannot be signi8cantly reduced by increasing 1 or 2 beyond this value. By assuming the non-restrictive hypothesis of symmetry of the second-order kernel (Dumont & Fu, 1993), the model has 15 free-design parameters to be estimated using the RLS algorithm. The modeling performance throughout the validation phase is shown in Fig. 1 (top). The corresponding mean squared error is MSE = 2:09 × 105 . The Laguerre pole for expanding the Volterra model using a single Laguerre basis is computed from Eq. (27) as p = 0:456. A truncated Laguerre–Volterra model implemented
820
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
[ Kg/Kmol ]
4
Appendix A. Proofs
x 104
Proof of Theorem 1. Using the following equality satis8ed by any two discrete-time sequences a(k) and b(k): ∞ a(k + 1)Tb(k) = b(∞)a(∞) − b(0)a(0)
3.5 3 2.5 2 12
14
16
4
4
18 time [h]
20
22
24
k=0
∞ b(k)Ta(k); −
x 10
(A.1)
[ Kg/Kmol ]
k=0 3.5 3 2.5 2 12
14
16
18 time [h]
20
22
24
Fig. 1. Process (line) and model (circles) outputs: Volterra model (above) and Laguerre–Volterra model (below).
where the operator T is de8ned as Tb(k) = b(k + 1) − b(k), and noticing that conditions 1 and 2 of the theorem imply limkl →∞ hm (k1 ; : : : ; kl ; : : : ; km ) = 0 and hm (k1 ; : : : ; kl−1 ; 0; kl+1 ; : : : ; km ) = 0, respectively, then it can be shown that ∞ ∞ 1 · · · hm (k1 ; : : : ; kl + 1; : : : ; km ) S2; l = −
hm 2 k1 =0
km =0
× Tl [kl Tl hm (k1 ; : : : ; km )]; with this pole and just one Laguerre 8lter is considered. The corresponding Laguerre coeIcients, namely, 1 and 1; 1 in (6), plus an additive constant term 0 , are estimated using the RLS algorithm. The performance of the model throughout the validation phase is shown in Fig. 1 (bottom). The corresponding error is MSE = 2:18 × 105 . This result shows that the Laguerre–Volterra model performs similarly to the original Volterra model, while being endowed with 20% of the number of parameters.
5. Conclusion In this paper, the optimization of Laguerre bases for the orthonormal series expansion of discrete-time Volterra models of any order has been addressed. Analytic strict global solutions for the optimal expansion of the Volterra kernels have been derived either using an independent Laguerre basis for each kernel or using a common basis for all the kernels. It has been shown that these solutions minimize the upper bound of the squared norm of the error resulting from the truncation of the series into a 8nite number of functions, thus optimizing the performance of the resultant 8nite dimensional representation. In future works the authors intend to investigate extensions of these results with respect to generalized orthonormal bases of functions ( Heuberger et al., 1995; Ninness & Gustafsson, 1997).
where the term S2; l is de8ned in (14). By means of elementary algebraic manipulations the above equation can be rewritten as ∞ ∞ 1 S2; l = − · · · hm (k1 ; : : : ; kl + 1; : : : ; km )
hm 2 k1 =0
km =0
× [(kl + 1)hm (k1 ; : : : ; kl + 2; : : : ; km ) − (2kl + 1)hm (k1 ; : : : ; kl + 1; : : : ; km ) + kl hm (k1 ; : : : ; km )]:
(A.3)
In addition, using the recursive relation satis8ed by the Laguerre functions m; il (kl ) (Fu & Dumont, 1993), i.e., 2 )kl m; il (kl )+pm (kl −1)m; il (kl − pm kl m; il (kl +1)−(1+pm 2 2 1) = (pm il − pm − il )m; il (kl ), as well as the orthonormality property of these functions and the mth-order kernel expansion given by Eq. (4), the following equality can be derived: ∞ ∞ ··· [pm (kl + 1)hm (k1 ; : : : ; kl + 2; : : : ; km ) k1 =0
km =0
2 )(kl + 1)hm (k1 ; : : : ; kl + 1; : : : ; km ) −(1 + pm
+pm kl hm (k1 ; : : : ; kl ; : : : ; km )]hm (k1 ; : : : ; kl + 1; : : : ; km ) ∞ ∞ 2 2 = ··· i21 ;:::;im (pm il − pm − il ): i1 =1
(A.4)
im =1
Eqs. (A.3) and (A.4) yield ∞ ∞ 1 2 2 · · · i21 ;:::;im [il (pm − 1) − pm ] S2; l = − pm hm 2 i1 =1
Acknowledgements The authors acknowledge the anonymous referees for their constructive suggestions. The 8rst and third authors also acknowledge CAPES (for fellowship BEX0467/02-2) and CNPq (for fellowship 301345184), respectively.
(A.2)
+
∞ k1 =0
···
∞ km =0 2
im =1
hm (k1 ; : : : ; kl + 1; : : : ; km )2
× [(1 − pm ) (kl + 1) + pm ] :
(A.5)
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
Moreover, using Eq. (4) and the orthonormality property of the Laguerre functions, it is straightforward to rewrite the squared norm hm 2 de8ned in (11) as ∞ ∞
hm 2 = ··· i21 ;:::;im : (A.6) i1 =1
i1 =1
−
im =1
(1 − pm )2 S1; l ; pm
(A.7)
where S1; l is de8ned in (13). Summing S2; l in (A.7) for l=1; : : : ; m and using the de8nition of Q1; m and Q2; m in (12) yields ∞ ∞ 1 Q2; m = · · · i21 ;:::;im mpm hm 2 i1 =1
×
2 mpm
im =1
− mpm + (1 −
2 pm )
m
∞ ∞ ∞ 1 · · · · · · kl (hm (k1 ; : : : ; kl ; : : : ; km )
hm 2 k1 =0
(1 − pm )2 Q1; m − pm or equivalently
+hm (k1 ; : : : ; kl + 1; : : : ; km ))2 = 4S1; l − S2; l − 2: (A.12)
(A.9)
which in turn directly implies the equivalence between Eqs. (9) and (10). Proof of Lemma 3. From the de8nition of Q1; m in (12) it follows that m m 1 1 S1; l − 1 = (S1; l − 1): (A.10) Q1; m − 1 = m m l=1
The term S1; l − 1 can be written using Eq. (13) and the second condition stated in Theorem 1 as ∞ ∞ 1 S1; l − 1 = · · · (kl − 1)
hm 2 k1 =1
km =1
× h2m (k1 ; : : : ; kl ; : : : ; km )
4Q1; m − Q2; m − 2 ¿ 0:
(A.13)
In addition, Q2; m is non-negative by de8nition. Moreover, it is always positive since hm is non-null and satis8es the 8rst condition of Theorem 1. Hence, inequality (A.13) results in 4Q1; m Q2; m − Q2;2 m − 2Q2; m ¿ 0:
f(x) ,
!(x) ; *(x)
(B.1)
• ! is convex, diCerentiable and non-negative for all x ∈ X. • * is concave, diCerentiable and positive for all x ∈ X.
l=1
l=1
Since the 8rst condition of Theorem 1 implies limkl →∞ hm (k1 ; : : : ; km ) = 0; ∀l ∈ {1; : : : ; m}; and provided that the kernel hm is non-null by assumption, then Eq. (A.12) implies 4S1; l − S2; l − 2 ¿ 0. From this inequality as well as the de8nition of Q1; m and Q2; m in (12) it is straightforward to conclude that
where X is an open convex set such that X ⊂ Rn and
pm mQ2; m hm 2 + (1 − pm )2 mQ1; m hm 2
im =1
km =0
Theorem B.1. Let a function f : X → R be de@ned as (A.8)
2 2 = m(pm − pm ) hm 2 + (1 − pm ) ∞ ∞ m × ··· il i21 ;:::;im
kl =0
Appendix B. Auxiliary results
il
l=1
i1 =1
to show that
im =1
Then, taking the second condition of the theorem into account, Eqs. (A.5) and (A.6) result in ∞ ∞ 1 2 2 S2; l = ··· [(pm + (1 − pm )il − pm )i21 ;:::;im ] pm hm 2
821
(A.11)
which cannot be negative, i.e., S1; l − 1 ¿ 0. Hence, it is straightforward to infer Q1; m − 1 ¿ 0 from (A.10). Proof of Lemma 4. Using elementary algebraic operations and the second condition stated in Theorem 1 it is possible
Then, f is a pseudo-convex function in X, i.e., ∇f(x1 )T (x2 − x1 ) ¿ 0 implies f(x2 ) ¿ f(x1 ) for any pair of elements x1 ; x2 ∈ X (see Bazaraa et al., 1993, Chapter 3). Proposition B.2. Function JQm (pm ) , Jm =m hm 2 , with Jm given by Eq. (10), is pseudo-convex in P = {pm ∈ R: |pm | ¡ 1}. Proof. Provided that (1) P is an open convex set; 2 (2) *(pm ) , 1 − pm is concave, diEerentiable and positive for all pm ∈ P; 2 (3) !(pm ) , (Q1; m −1)pm +(Q2; m −2Q1; m +1)pm +Q1; m is convex since Q1; m − 1 ¿ 0 (Lemma 3), diEerentiable and non-negative 2 for all pm ∈ P; then Proposition B.2 results directly from the application of Theorem B.1. Otherwise JQm (and Jm ) would be negative, which is not possible by De8nition (9). 2
822
R.J.G.B. Campello et al. / Automatica 40 (2004) 815 – 822
References Bazaraa, M. S., Sherali, H. D., & Shetty, C. M. (1993). Nonlinear programming: Theory and algorithms (2nd ed.). New York: Wiley. Billings, S. A. (1980). Identi8cation of nonlinear systems—a survey. IEE Proceedings Part D, 127(6), 272–285. Boyd, S., & Chua, L. O. (1985). Fading memory and the problem of approximating nonlinear operators with Volterra series. IEEE Transactions on Circuits and Systems, 32(11), 1150–1161. Broome, P. W. (1965). Discrete orthonormal sequences. Journal of the Association for Computing Machinery, 12(2), 151–168. Campello, R. J. G. B. (2002). New architectures and methodologies for modeling and control of complex systems combining classical and modern tools. Ph.D. thesis, School of Electrical and Computer Engineering of the State University of Campinas (FEEC/UNICAMP), Campinas/SP, Brazil (in Portuguese). Campello, R. J. G. B., Amaral, W. C., & Favier, G. (2001). Optimal Laguerre series expansion of discrete Volterra models. In Proceedings of the European Control Conference, Porto/Portugal (pp. 372–377). Clowes, G. J. (1965). Choice of the time-scaling factor for linear system approximation using orthonormal Laguerre functions. IEEE Transactions on Automatic Control, 10, 487–489. Desoer, C. A., & Vidyasagar, M. (1975). Feedback systems: Input– output properties. New York: Academic Press. Doyle III, F. J., Pearson, R. K., & Ogunnaike, B. A. (2002). Identi@cation and control using Volterra models. Berlin: Springer. Dumont, G. A., & Fu, Y. (1993). Non-linear adaptive control via Laguerre expansion of Volterra kernels. International Journal of Adaptive Control and Signal Processing, 7, 367–382. Elshafei, A.-L., Dumont, G. A., & Elnaggar, A. (1994). Adaptive GPC based on Laguerre 8lters modelling. Automatica, 30(12), 1913–1920. EykhoE, P. (1974). System identi@cation: Parameter and state estimation. New York: Wiley. Finn, C., Wahlberg, B., & Ydstie, B. E. (1993). Constrained predictive control using orthogonal expansions. A.I.Ch.E. Journal, 39(11), 1810–1826. Fu, Y., & Dumont, G. A. (1993). An optimum time scale for discrete Laguerre network. IEEE Transactions on Automatic Control, 38(6), 934–938. Heuberger, P. S. C., Van den Hof, P. M. J., & Bosgra, O. H. (1995). A generalized orthonormal basis for linear dynamical systems. IEEE Transactions on Automatic Control, 40, 451–465. Maner, B. R., Doyle III, F. J., Ogunnaike, B. A., & Pearson, R. K. (1996). Nonlinear model predictive control of a simulated multivariable polymerization reactor using second-order Volterra models. Automatica, 32(9), 1285–1301. Marmarelis, V. Z. (1993). Identi8cation of nonlinear biological systems using Laguerre expansions of kernels. Annals of Biomedical Engineering, 21, 573–589. Ninness, B., & Gustafsson, F. (1997). A unifying construction of orthonormal bases for system identi8cation. IEEE Transactions on Automatic Control, 42, 515–521. Oliveira, G. H. C., Amaral, W. C., Favier, G., & Dumont, G. A. (2000). Constrained robust predictive controller for uncertain processes modeled by orthonormal series functions. Automatica, 36(4), 563–571. Schetzen, M. (1980). The Volterra and Wiener theories of nonlinear systems. New York: Wiley. Tanguy, N., Morvan, R., Vilb+e, P., & Calvez, L. C. (2000). Online optimization of the time scale in adaptive Laguerre-based 8lters. IEEE Transactions on Signal Processing, 48, 1184–1187. Tanguy, N., Vilb+e, P., & Calvez, L. C. (1995). Optimum choice of free parameter in orthonormal approximations. IEEE Transactions on Automatic Control, 40, 1811–1813.
Wahlberg, B. (1991). System identi8cation using Laguerre models. IEEE Transactions on Automatic Control, 36(5), 551–562. Wahlberg, B. (1994). System identi8cation using Kautz models. IEEE Transactions on Automatic Control, 39(6), 1276–1282. Wahlberg, B., & Ljung, L. (1992). Hard frequency-domain model error bounds from least-squares like identi8cation techniques. IEEE Transactions on Automatic Control, 37(7), 900–912. Wahlberg, B., & MCakilCa, P. M. (1996). Approximation of stable linear dynamical systems using Laguerre and Kautz functions. Automatica, 32(5), 693–708. Wiener, N. (1958). Nonlinear problems in random theory. New York: Wiley. Zervos, C. C., & Dumont, G. A. (1988). Deterministic adaptive control based on Laguerre series representation. International Journal of Control, 48(6), 2333–2359. Ricardo J.G.B. Campello was born in Recife - PE, Brazil. He received the BSc degree in Electronics Engineering from UNESP (Universidade Estadual Paulista), Ilha Solteira - SP, in 1994 and the MSc and Ph.D. degrees in Electrical Engineering from the School of Electrical and Computer Engineering of UNICAMP (Universidade Estadual de Campinas), Campinas - SP, in 1997 and 2002, respectively. In 2002 he was a visiting scholar at the Laboratoire D’Informatique, Signaux et SystZemes de Sophia Antipolis (I3S), Universit+e de Nice - Sophia Antipolis, France. He is currently an assistant professor in the Graduate Program in Informatics at UNISANTOS (Universidade Cat+olica de Santos), Santos - SP. His research interests include fuzzy systems, arti8cial neural networks, evolutionary computation, arti8cial intelligence, data mining and dynamic systems identi8cation and control. G.erard Favier was born in Avignon, France, in 1949. He received the engineering diplomae from ENSCM (Ecole Nationale Sup+erieure de Chronom+etrie et de Microm+ecanique), Besan[con, and ENSAE (Ecole Nationale Sup+erieure de l’A+eronautique et de l’Espace), Toulouse, the Engineering Doctorate and State Doctorate degrees from the University of Nice– Sophia Antipolis, in 1973, 1974, 1977 and 1981, respectively. In 1976, he joined the CNRS (Centre National de la Recherche Scienti8que), and now he works as a Research Director of CNRS at the I3S laboratory, in Sophia Antipolis. From 1995 to 1999, he was the Director of the I3S laboratory. His present research interests include nonlinear process modelling and identi8cation, nonlinear and blind equalization, and predictive control. Wagner Caradori do Amaral was born in Campinas, SP, Brazil on March 25, 1952. He received the BS, MS and doctorate degrees in Electrical Engineering from State University of Campinas, UNICAMP, in 1974, 1976 and 1981, respectively. Since 1991 he has been a Full Professor at the Department of Computer Engineering and Industrial Automation at UNICAMP. From 1995 to 1999 he was the Director of the School of Electrical and Computer Engineering at UNICAMP. His research interests are in the areas of modeling, identi8cation and predictive control.