ARTICLE IN PRESS
Engineering Applications of Artificial Intelligence 21 (2008) 591–603 www.elsevier.com/locate/engappai
Output tracking with constrained inputs via inverse optimal adaptive recurrent neural control

Luis J. Ricalde^a, Edgar N. Sanchez^b

^a Facultad de Ingenieria de la Universidad Autonoma de Yucatan, Av. Industrias no Contaminantes por Periferico Norte, Apdo. Postal 150 Cordemex, Merida, Yucatan, Mexico
^b CINVESTAV, Unidad Guadalajara, Apartado Postal 31-438, Plaza La Luna, Guadalajara, Jalisco, C.P. 45091, Mexico

Received 23 November 2005; received in revised form 23 June 2007; accepted 11 July 2007. Available online 14 September 2007.
Abstract

This paper extends previous results to the output tracking problem of nonlinear systems with unmodelled dynamics and constrained inputs. A recurrent high order neural network is used to identify the unknown system dynamics, and a learning law is obtained using the Lyapunov methodology. A stabilizing control law for the output tracking error dynamics is developed using the Lyapunov methodology and the Sontag control law for nonlinear systems with constrained inputs. © 2007 Elsevier Ltd. All rights reserved.

Keywords: Recurrent neural networks; Output trajectory tracking; Adaptive control; Constrained inputs; Inverse optimal control
Corresponding author. Tel.: +52 333 134 5570; fax: +52 333 144 5579. E-mail addresses: [email protected] (L.J. Ricalde), [email protected] (E.N. Sanchez). doi:10.1016/j.engappai.2007.07.007

1. Introduction

In many control applications, the process exhibits highly nonlinear behavior, uncertainties, unknown disturbances and bounded inputs. All these phenomena must be considered for control analysis and synthesis. The problem of designing robust controllers for nonlinear systems with uncertainties, which guarantee stability and trajectory tracking, has received increasing attention lately. The presence of constrained inputs limits the ability to compensate for the effects of unmodelled dynamics and external disturbances; these effects are reflected in loss of stability, undesired oscillations and other adverse behavior. Regarding previous results on control of systems with constrained inputs, there are some results for linear systems, as in Kapoor and Daoutidis (1995) and Qiu and Miller (2000): the former deals with the synthesis of controllers for linear and nonlinear feedback linearizable systems with input constraints, and the latter, based on predictive control, develops a state feedback
control law for constrained systems where the initial states lie in a prespecified compact subset. On the other hand, Low and Zhuang (2003) present a model predictive controller, with experimental results, for position control of a servo drive. In the field of nonlinear systems, Ren and Beard (2003) apply a control Lyapunov function (CLF) approach to an unmanned air vehicle with velocity input constraints. For nonlinear systems, control with constrained inputs is usually restricted by the requirement of knowing the system model. Some algorithms allow the presence of uncertainties satisfying the matching condition (El-Farra and Christofides, 2001). In El-Farra and Christofides (2001), a control law based on the Sontag formula (Lin and Sontag, 1991) with constrained inputs is developed and applied to a chemical reactor. To relax the requirement of knowing the system model, identification via recurrent neural networks arises as a potential solution (Hokimyan et al., 2001; Sanchez et al., 2003). On the other hand, it is worth mentioning that the $H_\infty$ control approach (Basar and Bernhard, 1995) minimizes the control effort and achieves robust stabilization. One major difficulty with this approach, alongside its possible system structural instability, is the requirement of
solving the resulting partial differential equations. To alleviate this computational problem, the so-called inverse optimal control technique was developed, based on the input-to-state stability (ISS) concept (Krstic and Deng, 1998). This approach has the advantage of not requiring the solution of a Hamilton-Jacobi-Bellman (HJB) equation. In Sanchez et al. (2002), the inverse optimal control approach is applied to control robotic manipulators with unmodelled dynamics.

Since the seminal paper of Narendra and Parthasarathy (1990), there has been a continuously increasing interest in applying neural networks to identification and control of nonlinear systems. Lately, the use of recurrent neural networks has been developed, which allows more efficient modelling of the underlying dynamical systems. Recent books, such as Rovitahkis and Christodoulou (2000), have reviewed the application of this kind of neural network to nonlinear system identification and control. In Rovitahkis and Christodoulou (2000), adaptive identification and control by means of on-line learning are analyzed; the stability of the closed-loop system is established with the Lyapunov function method. In Sanchez and Ricalde (2003), the problem of trajectory tracking for nonlinear systems in the presence of constrained inputs and uncertainties, with application to chaos control and synchronization, is considered.

In this paper, we extend our previous results (Sanchez and Ricalde, 2003) to nonlinear systems with fewer inputs than states. The output trajectory tracking problem with constrained inputs is solved with an adaptive control scheme composed of a recurrent neural identifier, which builds an on-line model of the unknown plant, and a control law which forces the unknown plant to track the output reference trajectory. An update law for the recurrent high order neural network is derived via the Lyapunov methodology. A robust learning law that avoids parameter drift in the presence of modelling error is also proposed. The control law is synthesized using the Lyapunov methodology and a modification of the Sontag control law for stabilizing systems with constrained inputs and uncertain terms; it depends explicitly on the input constraints. Boundedness of the tracking error is proven, and an estimate of the closed-loop stability region is given in order to determine the admissible bounds of the uncertainties and the desired tracking error. The proposed scheme is validated on the output trajectory tracking of a nonlinear oscillator.
2. Mathematical preliminaries

2.1. Recurrent higher-order neural networks

Artificial recurrent neural networks are mostly based on the Hopfield model (Hopfield, 1984). They are considered good candidates for modelling nonlinear systems in the presence of uncertainties; they are also attractive due to their easy implementation, robustness and capacity to adjust their parameters on line. In Kosmatopoulos et al. (1997), recurrent higher-order neural networks (RHONN) are defined as

$\dot{x}_i = -a_i x_i + \sum_{k=1}^{L} w_{ik} \prod_{j \in I_k} y_j^{d_j(k)}, \quad i = 1, \ldots, n,$  (1)

where $x_i$ is the $i$th neuron state, $L$ is the number of higher-order connections, $\{I_1, I_2, \ldots, I_L\}$ is a collection of non-ordered subsets of $\{1, 2, \ldots, m+n\}$, $a_i > 0$, $w_{ik}$ are the adjustable weights of the neural network, $d_j(k)$ are non-negative integers, and $y$ is a vector defined by

$y = [y_1, \ldots, y_n, y_{n+1}, \ldots, y_{n+m}]^\top = [S(x_1), \ldots, S(x_n), S(u_1), \ldots, S(u_m)]^\top,$

with $u = [u_1, u_2, \ldots, u_m]^\top$ the input to the neural network and $S(\cdot)$ a smooth sigmoid function given by $S(x) = 1/(1 + \exp(-\beta x)) + \varepsilon$, where $\beta$ is a positive constant and $\varepsilon$ is a small positive real number. Hence $S(x) \in [\varepsilon, \varepsilon + 1]$. As can be seen, (1) allows the inclusion of higher-order terms. By defining a vector

$z(x, u) = [z_1(x, u), \ldots, z_L(x, u)]^\top = \Big[\prod_{j \in I_1} y_j^{d_j(1)}, \ldots, \prod_{j \in I_L} y_j^{d_j(L)}\Big]^\top,$  (2)

(1) can be rewritten as

$\dot{x}_i = -a_i x_i + w_i^\top z_i(x, u), \quad i = 1, \ldots, n,$  (3)

where $w_i = [w_{i,1} \cdots w_{i,L}]^\top$. In this paper, we consider the following RHONN:

$\dot{x}_i = -a_i x_i + w_i^\top z_i(x) + w_{g_i} z_{g_i}(x) u_i, \quad i = 1, \ldots, n.$  (4)

Reformulating (4) in matrix form yields

$\dot{x} = A x + W z(x) + W_g z_g(x) u,$  (5)

where $x \in \mathbb{R}^n$, $W \in \mathbb{R}^{n \times L}$, $W_g \in \mathbb{R}^{n \times L}$, $z(x) \in \mathbb{R}^L$, $z_g(x) \in \mathbb{R}^{L \times m}$, $u \in \mathbb{R}^m$, and $A = -\lambda I$, $\lambda > 0$. For nonlinear identification applications, the term $y_j$ in (2) can be either an external input or the state of a neuron passed through the sigmoid function. Depending on the sigmoid function input, the RHONN is classified as a series-parallel structure if $z(\cdot) = z(\nu)$, where $\nu$ is an external input, or a parallel one if $z(\cdot) = z(x)$, where $x$ is the neural network state (Rovitahkis and Christodoulou, 2000). This terminology is standard in adaptive identification and control (Ioannou and Sun, 1996; Narendra and Annaswamy, 1989).
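As a concrete numerical illustration of the RHONN dynamics (5), the following minimal sketch simulates the parallel model with a forward Euler step. The two-neuron system, the gain values, and the restriction to first-order terms only (so that $z(x) = z_g(x) = S(x)$ elementwise) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sigmoid(x, beta=1.0, eps=0.01):
    # S(x) = 1/(1 + exp(-beta*x)) + eps, as defined in Section 2.1
    return 1.0 / (1.0 + np.exp(-beta * x)) + eps

def rhonn_step(x, u, W, Wg, lam=1.0, dt=1e-3):
    """One Euler step of the parallel RHONN (5):
    x_dot = -lam*x + W z(x) + Wg zg(x) u, with z = zg = S(x) here."""
    z = sigmoid(x)                         # first-order terms only
    x_dot = -lam * x + W @ z + (Wg @ z) * u
    return x + dt * x_dot

# usage: two neurons driven by a constant scalar input
x = np.zeros(2)
W = 0.5 * np.eye(2)      # illustrative weight values
Wg = 0.1 * np.eye(2)
for _ in range(1000):
    x = rhonn_step(x, u=1.0, W=W, Wg=Wg)
```

Since $A = -\lambda I$ with $\lambda > 0$ and the sigmoid terms are bounded, the simulated state remains bounded, consistent with Assumption 1 below.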
2.2. Inverse optimal control

Many control applications have to deal with nonlinear processes in the presence of uncertainties and disturbances. All these phenomena must be considered in the controller design in order to obtain the desired closed-loop performance. This section closely follows Krstic and Deng (1998) and Sepulchre et al. (1997). As stated in Sepulchre et al. (1997), optimal stabilization guarantees several desirable properties for the closed-loop system, including stability margins. In a direct approach, for a system

$\dot{x} = f(x) + g(x) u,$

we would have to solve the associated HJB equation

$\frac{\partial V}{\partial x} f(x) + l(x) - \frac{1}{4} \frac{\partial V}{\partial x} g(x) R^{-1}(x) g^\top(x) \Big(\frac{\partial V}{\partial x}\Big)^\top = 0,$

where $V(x)$ is a continuously differentiable positive definite function, $l(x)$ is a positive semidefinite function, and $R(x)$ is a nonsingular matrix. Solving this equation is not an easy task; besides, the robustness achieved is largely independent of the particular choice of the functions $l(x) > 0$ and $R(x) > 0$ appearing in the cost functional. This fact motivates the development of design methods which solve the inverse problem of optimal stabilization. The inverse optimal control approach is based on the concept of equivalence between ISS and the solution of the $H_\infty$ nonlinear control problem (Krstic and Deng, 1998). Using CLFs, a stabilizing feedback is designed first and then shown to be optimal with respect to a meaningful cost functional.

Consider the nonlinear system with the state space description

$\dot{x} = f(x) + g(x) u, \qquad y = h(x),$  (6)

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ and $y$ denotes the system output. Without loss of generality, let us assume that $x_{eq} = 0$ is an equilibrium point of (6). Suppose there exists a positive definite, radially unbounded $C^1$ scalar function $V$ such that

$\inf_{u \in \mathbb{R}^m} \{L_f V(x) + L_g V(x) u\} < 0, \quad \forall x \neq 0.$

Let $l(x)$ and $R(x)$ be two continuous scalar functions such that $l(x) \geq 0$ and $R(x) > 0$ for all $x \in \mathbb{R}^n$, and consider the cost functional

$J = \int_0^\infty (l(x) + u^\top R(x) u)\, dt.$  (7)

The Hamilton-Jacobi-Isaacs (HJI) equation associated with the system (6) and the cost functional (7) is

$\inf_{u \in \mathbb{R}^m} \{l(x) + u^\top R(x) u + L_f V(x) + L_g V(x) u\} = 0.$

A stabilizing control law $u(x)$ solves an inverse optimal problem for the system (6) if it can be expressed as

$u = -k(x) = -\tfrac{1}{2} R^{-1}(x) (L_g V(x))^\top, \qquad R(x) = R^\top(x) > 0,$  (8)

where $V(x)$ is a positive semidefinite function such that the negative semidefiniteness of $\dot{V}$ is achieved with the control $u^* = -\tfrac{1}{2} k(x)$, that is,

$\dot{V} = L_f V(x) - \tfrac{1}{2} L_g V(x) k(x) \leq 0.$  (9)

When the function $l(x)$ is set to be the right-hand side of (10),

$l(x) = -L_f V(x) + \tfrac{1}{2} L_g V(x) k(x) \geq 0,$  (10)

then $V(x)$ is a solution of the HJB equation

$l(x) + L_f V(x) - \tfrac{1}{4} (L_g V(x)) R^{-1}(x) (L_g V(x))^\top = 0.$  (11)

In the inverse approach, a stabilizing feedback is designed first and then shown to minimize a cost functional of the form

$J = \int_0^\infty (l(x) + u^\top R(x) u)\, dt.$  (12)

The problem is inverse because the functions $l(x)$ and $R(x)$ are determined by the stabilizing feedback a posteriori, rather than chosen a priori by the designer. This approach provides robust controllers, where robustness is obtained as a result of the optimality of the control law (Sepulchre et al., 1997).

3. Control problem formulation

We consider the single-input single-output nonlinear system

$\dot{x}_p = f_p(x_p) + g_p(x_p)\,\mathrm{sat}(u), \qquad y = h(x_p),$  (13)

where $x_p \in \mathbb{R}^n$ and $\mathrm{sat}(u) \in \mathbb{R}$ is the standard saturation nonlinearity defined as

$\mathrm{sat}(u) = \begin{cases} u_{max} & \text{if } u > u_{max}, \\ u & \text{if } u_{min} \leq u \leq u_{max}, \\ u_{min} & \text{if } u < u_{min}. \end{cases}$

The vector functions $f_p$, $g_p$ are assumed to be unknown, and we assume full state measurement. The system (13) is modelled by a RHONN in order to estimate its dynamical model. The control goal is to force (13) to track the reference system

$\dot{x}_r = f_r(x_r, u_r), \qquad x_r \in \mathbb{R}.$  (14)
The control law will be a function of the identification error, the tracking error and the system dynamics estimated by the RHONN. This scheme is displayed in Fig. 1.
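For reference, the saturation nonlinearity $\mathrm{sat}(u)$ in (13) is simply a clipping operation; the limits used below are illustrative, not values from the paper.

```python
def sat(u, u_min=-1.0, u_max=1.0):
    # standard saturation from (13): clip the scalar control to [u_min, u_max]
    return max(u_min, min(u, u_max))

print(sat(2.5), sat(-3.0), sat(0.4))  # -> 1.0 -1.0 0.4
```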
Fig. 1. Recurrent neural control scheme. (The diagram shows the reference system generating $x_r(t)$, the unknown plant with input $u_p(t)$ and state $x_p(t)$, the NN identifier with state $x(t)$, the tracking error $e_t(t)$, the identification error $e_i(t)$, and the control law block.)

4. Plant identification

Consider the unknown nonlinear plant

$\dot{x}_p = F_p(x_p, u) \triangleq f_p(x_p) + g_p(x_p)\,\mathrm{sat}(u),$  (15)

where $x_p, f_p \in \mathbb{R}^n$, $g_p \in \mathbb{R}^n$, $u \in \mathbb{R}$. Taking into account that $f_p$ is unknown, we propose to model (15) by a recurrent neural network as in (5). Hence, we propose the following recurrent neural model for the unknown plant:

$\dot{w} = -\lambda w + W z(x_p) + w_{per} + W_g z_g(x_p)\,\mathrm{sat}(u),$  (16)

where $\lambda > 0$, $w \in \mathbb{R}^n$, $u \in \mathbb{R}$, $W \in \mathbb{R}^{n \times p}$, $W_g \in \mathbb{R}^{n \times 1}$, $x_p$ is the state to be identified, and $w$ is the neural network state. $W$, $W_g$ are the on-line estimated network weights, $z(x_p)$, $z_g(x_p)$ are sigmoid functions, and the term $w_{per} \in \mathbb{R}^n$ represents the modelling error, which is assumed to be bounded.

Remark 1. The modelling error is defined as the mismatch between the system and the RHONN model with its optimal weight values. This mismatch is caused by an insufficient number of high order terms in the RHONN model. Although $\lambda$ can be freely selected as long as it satisfies $\lambda > 0$, values that are large relative to $x_p$ can reduce the convergence time of the neural network output; if $\lambda$ is too large, however, it can produce undesirable oscillations. It is recommended to adjust the value of $\lambda$ via simulations.

Assumption 1. For every $w_{ij} \in W$, the system (16) is bounded for every bounded state $x_p$.

To derive the adaptation laws for the neural network weight matrices $W$ and $W_g$, which minimize the identification error, we first consider the case of no modelling error.

Assumption 2. There exist unknown but constant weight matrices $W^*$, $W_g^*$ such that the plant is completely described by the neural network. Then, the state $x_p$ of the unknown dynamic system (15) satisfies

$\dot{x}_p = -\lambda x_p + W^* z(x_p) + W_g^* z_g(x_p)\,\mathrm{sat}(u),$  (17)

where all the elements are as described earlier.

Now, we proceed to analyze the error between the identifier and the plant,

$e_i = w - x_p.$  (18)

The identification error dynamics is given by

$\dot{e}_i = \dot{w} - \dot{x}_p.$  (19)

Since, by Assumption 2, the plant can be described completely by the neural network with optimal constant weights, we insert (17) in (19):

$\dot{e}_i = -\lambda (w - x_p) + (W - W^*) z(x_p) + (W_g - W_g^*) z_g(x_p)\,\mathrm{sat}(u),$

$\dot{e}_i = -\lambda e_i + \widetilde{W} z(x_p) + \widetilde{W}_g z_g(x_p) u,$  (20)

where

$\widetilde{W} = W - W^*, \qquad \widetilde{W}_g = W_g - W_g^*$  (21)

are the error matrices between the optimal weight matrices $W^*$, $W_g^*$ and the on-line estimated weight matrices $W$, $W_g$. To perform the stability analysis of (20), we consider the Lyapunov function candidate

$V = \tfrac{1}{2} \|e_i\|^2 + \frac{1}{2\gamma} \mathrm{tr}\{\widetilde{W}^\top \widetilde{W}\} + \frac{1}{2\gamma_g} \mathrm{tr}\{\widetilde{W}_g^\top \widetilde{W}_g\},$  (22)

where $\gamma, \gamma_g > 0$. Differentiating (22) along the solutions of (20), we obtain

$\dot{V} = -\lambda \|e_i\|^2 + e_i^\top \widetilde{W} z(x_p) + e_i^\top \widetilde{W}_g z_g(x_p) u + \frac{1}{\gamma} \mathrm{tr}\{\dot{\widetilde{W}}^\top \widetilde{W}\} + \frac{1}{\gamma_g} \mathrm{tr}\{\dot{\widetilde{W}}_g^\top \widetilde{W}_g\}.$  (23)

The asymptotic stability of the identification error is achieved if we select the weight adaptation laws

$\mathrm{tr}\{\dot{\widetilde{W}}^\top \widetilde{W}\} = -\gamma e_i^\top \widetilde{W} z(x_p), \qquad \mathrm{tr}\{\dot{\widetilde{W}}_g^\top \widetilde{W}_g\} = -\gamma_g e_i^\top \widetilde{W}_g z_g(x_p) u,$  (24)

which result in

$\dot{w}_{i,j} = -\gamma e_i z(x_{p_j}), \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, L,$

$\dot{w}_{g_i} = -\gamma_g e_i z(x_{p_j}) u, \quad i = 1, 2, \ldots, n.$  (25)

Substituting the adaptation laws in (23) gives

$\dot{V} = -\lambda \|e_i\|^2 \leq 0,$  (26)

which is negative semidefinite. We now apply Barbalat's lemma (Khalil, 1996). Since $V(t) > 0$ for all $e_i, \widetilde{W}, \widetilde{W}_g \neq 0$
and $\dot{V}(t) \leq 0$, $V(t)$ is bounded. Hence $\|e_i\|$ is bounded on $[0, T]$, the maximal interval of existence of the solution for any given initial state. $V(t)$ is nonincreasing and bounded from below by zero, and therefore converges as $t \to \infty$. From (26), $e_i$ and $\widetilde{W}$ are bounded on $[0, T]$; this implies that $T = \infty$. We conclude that $e_i \to 0$ as $t \to \infty$. Then $W \to W_1$ and $W_g \to W_{g_1}$ as $t \to \infty$, where $W_1$ and $W_{g_1}$ are constant matrices.

The assumption of no modelling error is seldom satisfied, and the adjusted weight parameters could then drift to infinity. To avoid parameter drift, the following robust learning law, known as the $\sigma$-modification, is proposed for the neural network weights, as in Rovitahkis and Christodoulou (2000):

$\dot{w}_{i,j} = \begin{cases} -\gamma e_i z(x_{p_j}) & \text{if } |w_{ij}| < w_m, \\ -\gamma e_i z(x_{p_j}) - \sigma \gamma w_{ij} & \text{if } |w_{ij}| \geq w_m, \end{cases}$

$\dot{w}_{g_i} = \begin{cases} -\gamma_g e_i z(x_{p_j}) u_i & \text{if } |w_{g_i}| < w_{g_m}, \\ -\gamma_g e_i z(x_{p_j}) u_i - \sigma \gamma_g w_{g_i} & \text{if } |w_{g_i}| \geq w_{g_m}, \end{cases}$  (27)

where $\sigma$ is a positive constant and $w_m$ and $w_{g_m}$ are upper bounds for the neural network weights. Both parameters are selected by experimentation, and $\sigma$ should be selected large relative to $\gamma$. The learning law (27) coincides with (25) when $|w_{ij}| < w_m$ and $|w_{g_i}| < w_{g_m}$. If the weights leave this region, the leakage terms $-\sigma \gamma w_{ij}$, $-\sigma \gamma_g w_{g_i}$ prevent them from drifting to infinity. The parameters $\gamma$, $\gamma_g$ should be selected via simulations in order to reduce the convergence time between the neural network and the unknown system, taking into account that excessive values of $\gamma$, $\gamma_g$ can result in excessive control effort, since they increase the entries of $W$, $W_g$. It can be shown that the robust learning law does not affect the stability of the identification error but improves it, making the Lyapunov function time derivative more negative; for a detailed demonstration, see Rovitahkis and Christodoulou (2000).

5. Trajectory tracking analysis

Consider the nonlinear system with constrained input (15), which we model by the neural network

$\dot{w} = -\lambda w + W z(x_p) + w_{per} + W_g z_g(x_p)\,\mathrm{sat}(u), \qquad y = h(w),$  (28)

where the modelling error is assumed to be bounded. In the following, for simplicity, we write $u$ instead of $\mathrm{sat}(u)$. We will design a robust controller which satisfies $|u| \leq u_{max}$ and guarantees boundedness of the tracking error between the plant and the reference signal generated by

$\dot{x}_{ref} = f_{ref}(x_{ref}), \qquad x_{ref} \in \mathbb{R}.$  (29)
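The $\sigma$-modification above can be sketched numerically as follows. This is a minimal illustration of the elementwise law (25) with the leakage term of (27); the gain values, weight bound, and two-neuron dimensions are illustrative assumptions.

```python
import numpy as np

def weight_update(W, e_i, z, gamma=10.0, sigma=0.1, w_max=5.0):
    """sigma-modification learning law (27): the nominal gradient term of
    (25), plus leakage -sigma*gamma*w applied only to entries whose
    magnitude exceeds the bound w_max (prevents parameter drift)."""
    dW = -gamma * np.outer(e_i, z)            # nominal law (25)
    leak = np.abs(W) >= w_max                 # entries outside the safe region
    dW[leak] -= sigma * gamma * W[leak]       # leakage term
    return dW

# usage: one Euler step of the weight dynamics; W[0,0] exceeds w_max,
# so its leakage term pulls it back toward zero
W = np.array([[6.0, 0.0], [0.0, 1.0]])
e_i = np.array([0.2, -0.1])
z = np.array([0.5, 0.3])
W = W + 1e-3 * weight_update(W, e_i, z)
```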
For the output tracking analysis, the system (16) is converted into a partially linear system by the change of coordinates

$T(w) = \begin{bmatrix} x \\ \beta \end{bmatrix} = [x_1 \; x_2 \; \cdots \; x_r \; \beta_1 \; \beta_2 \; \cdots \; \beta_{n-r}]^\top = [h(w) \; L_f h(w) \; \cdots \; L_f^{r-1} h(w) \; \psi_1(w) \; \psi_2(w) \; \cdots \; \psi_{n-r}(w)]^\top,$  (30)

where $L_f h(w) = (\partial h(w)/\partial w) \dot{w}$ is the Lie derivative of the scalar field $h(w)$, $L_f^{r-k} h(w)$ denotes the $(r-k)$th order Lie derivative, and $r$ is the relative degree of (15). Then, the system (28) is converted to

$\dot{x}_1 = x_2$
$\vdots$
$\dot{x}_{r-1} = x_r$
$\dot{x}_r = L_f^r h(\psi^{-1}(x, \beta)) + L_w L_f^{r-1}(T^{-1}(x, \beta)) w_{per} + L_g L_f^{r-1} h(\psi^{-1}(x, \beta)) u$
$\dot{\beta}_1 = \Psi_1(x, \beta)$
$\vdots$
$\dot{\beta}_{n-r} = \Psi_{n-r}(x, \beta)$
$y = x_1.$  (31)

Let us consider the tracking error defined as

$e_t := x_1 - x_{ref}.$  (32)

The time derivative of the tracking error is

$\dot{e}_t = \dot{x}_1 - \dot{x}_{ref} = -\lambda w + W z(x_p) + w_{per} + W_g z_g(x_p) u - f_r(x_r, u_r).$  (33)

Now, let us define

$e_t^\top = [e_{t_1}, e_{t_2}, \ldots, e_{t_r}],$
$e_{t_1} = x_1 - x_{ref},$
$e_{t_2} = \dot{x}_1 - \dot{x}_{ref},$
$\vdots$
$e_{t_r} = x_1^{(r-1)} - x_{ref}^{(r-1)}.$  (34)

From (31) and (34), we obtain the tracking error dynamic system

$\dot{e}_{t_1} = e_{t_2}$
$\dot{e}_{t_2} = e_{t_3}$
$\vdots$
$\dot{e}_{t_r} = L_f^r h(\psi^{-1}(x, \beta)) + L_w L_f^{r-1}(T^{-1}(x, \beta)) w_{per} - x_{ref}^{(r)} + L_g L_f^{r-1} h(\psi^{-1}(x, \beta)) u,$  (35)
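The Lie derivatives appearing in the coordinate change (30) can be computed symbolically. The sketch below uses a hypothetical pendulum-like drift vector field, chosen only for illustration; it is not a system from the paper.

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')
w = sp.Matrix([w1, w2])
f = sp.Matrix([w2, -sp.sin(w1)])   # illustrative drift vector field
h = w1                             # output y = h(w)

def lie_derivative(h_expr, f_vec, vars_vec):
    # L_f h = (dh/dw) f(w)
    grad = sp.Matrix([h_expr]).jacobian(vars_vec)
    return (grad * f_vec)[0, 0]

Lfh = lie_derivative(h, f, w)       # first order: gives w2
Lf2h = lie_derivative(Lfh, f, w)    # second order: gives -sin(w1)
print(Lfh, Lf2h)
```

For this toy system the relative degree is $r = 2$, since the input would first appear in the second derivative of the output.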
which can be written compactly as

$\dot{e}_t = f_e(e_t) + w(e_t) w_{per} + g_e(e_t) u,$  (36)

where

$f_e(e_t) = L_f^r h(\psi^{-1}(x, \beta)) - x_{ref}^{(r)},$
$w(e_t) = L_w L_f^{r-1}(T^{-1}(x, \beta)),$
$g_e(e_t) = L_g L_f^{r-1} h(\psi^{-1}(x, \beta)).$

This tracking problem can be analyzed as a stabilization problem for the error dynamics (36).

5.1. Tracking error stabilization

Once (36) is obtained, we proceed to study its stabilization. In order to perform the stability analysis for the complete system, including identification and tracking, the following Lyapunov function is formulated:

$V = \tfrac{1}{2} \|e_i\|^2 + \tfrac{1}{2} e_t^\top P e_t + \frac{1}{2\gamma} \mathrm{tr}\{\widetilde{W}^\top \widetilde{W}\} + \frac{1}{2\gamma_g} \mathrm{tr}\{\widetilde{W}_g^\top \widetilde{W}_g\},$  (37)

where $P$ is a positive definite matrix which satisfies the Riccati inequality

$A^\top P + P A - P b b^\top P < 0,$

with

$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \in \mathbb{R}^{r \times r}, \qquad b = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \in \mathbb{R}^r,$

and, for the case $r = 2$, $P = \begin{bmatrix} 1 & c \\ c & 1 \end{bmatrix}$, $c \in \mathbb{R}^+$.

The time derivative of (37), along the trajectories of (36), is

$\dot{V} = -\lambda \|e_i\|^2 + e_i^\top \widetilde{W} z(x_p) + e_i^\top \widetilde{W}_g z_g(x_p) u + L_{f_e} V + L_{w_e} V w_{per} + L_{g_e} V u + \frac{1}{\gamma} \mathrm{tr}\{\dot{\widetilde{W}}^\top \widetilde{W}\} + \frac{1}{\gamma_g} \mathrm{tr}\{\dot{\widetilde{W}}_g^\top \widetilde{W}_g\},$  (38)

where we define

$L_{f_e} V = \frac{\partial V}{\partial e_t} f_e(e_t), \qquad L_{w_e} V = \frac{\partial V}{\partial e_t} w(e_t), \qquad L_{g_e} V = \frac{\partial V}{\partial e_t} g_e(e_t).$

Inserting the learning law (24) in (38), we obtain

$\dot{V} = -\lambda (\|e_i\|^2 + \|e_t\|^2) + L_{f_e} V + L_{w_e} V w_{per} + L_{g_e} V u.$  (39)

Furthermore, similar to El-Farra and Christofides (2001), we assume that the uncertain term $L_{w_e} V w_{per}$ is bounded from above as

$L_{w_e} V w_{per} \leq |L_{w_e} V| w_b.$  (40)

In order to stabilize the tracking error dynamics, let us consider the following modification of the Sontag control law (El-Farra and Christofides, 2001; Sanchez and Ricalde, 2003):

$u = -R^{-1}(e_t, \widehat{W}) L_{g_e} V,$

$R^{-1}(e_t, \widehat{W}) = \frac{L_{f_e} V + \eta w_b |L_w L_f^{r-1} h(w)| \dfrac{|b^\top P e_t|^2}{|b^\top P e_t| + \phi} + \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4}}{(L_{g_e} V)^2 \Big(1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}\Big)},$  (41)

where $\eta$, $\phi$ are adjustable parameters. Inserting the control law (41) in (39) and taking into account the bound (40), we obtain

$\dot{V} \leq -\lambda \|e_t\|^2 + \frac{(L_{f_e} V + \eta w_b |L_{w_e} V|) \sqrt{1 + (u_{max} L_{g_e} V)^2} - \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}} + \frac{w_b |L_w L_f^{r-1} h(w)| \, |b^\top P e_t| \, (\phi - (\eta - 1) |b^\top P e_t|)}{(|b^\top P e_t| + \phi)\Big(1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}\Big)}.$  (42)

Now, to determine the sign of the second and last terms, we consider two cases.

Case 1: $L_{f_e} V + \eta w_b |L_{w_e} V| \leq 0$. Substituting this inequality in (42) yields

$\dot{V} \leq -\lambda \|e_t\|^2 - \frac{|L_{f_e} V + \eta w_b |L_{w_e} V|| \sqrt{1 + (u_{max} L_{g_e} V)^2} + \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}} + \frac{w_b |L_w L_f^{r-1} h(w)| \, |b^\top P e_t| \, (\phi - (\eta - 1) |b^\top P e_t|)}{(|b^\top P e_t| + \phi)\Big(1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}\Big)}.$  (43)

It is easy to verify that for $|b^\top P e_t| > \phi/(\eta - 1)$ the last term is strictly negative; hence, we proceed to study the case $|b^\top P e_t| \leq \phi/(\eta - 1)$. First, we consider that the uncertain term $w(e_t) = L_w L_f^{r-1} h(x)$ is a disturbance which satisfies a growth bound of the form

$|w(e_t)| \leq \delta |b^\top P e_t| + \mu,$  (44)

where $\mu$ and $\delta$ are positive constants.
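A minimal numerical sketch of the control law (41) is given below, under the simplifying assumption of no uncertainty ($w_b = 0$), in which case it reduces to a bounded Sontag-type formula in the spirit of El-Farra and Christofides (2001). The scalar arguments and numeric values are illustrative, not taken from the paper.

```python
import numpy as np

def bounded_sontag(LfV, LgV, u_max):
    """Bounded Sontag-type law: u = -R_inv * LgV, with the uncertainty
    terms of (41) dropped (w_b = 0). Within the region where
    LfV <= u_max*|LgV| (the Case 2 condition), |u| <= u_max."""
    if abs(LgV) < 1e-12:
        return 0.0                      # V has no sensitivity to u here
    num = LfV + np.sqrt(LfV**2 + (u_max * LgV)**4)
    den = LgV**2 * (1.0 + np.sqrt(1.0 + (u_max * LgV)**2))
    return -(num / den) * LgV

# usage: LfV <= u_max*|LgV| holds, so the computed control respects the bound
u = bounded_sontag(LfV=1.0, LgV=1.5, u_max=1.0)
```

Note that the sign of $u$ opposes $L_{g_e}V$, so the control term $L_{g_e}V u$ in (39) is nonpositive, which is what drives $\dot{V}$ negative.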
Then, we can obtain the following bound by a procedure analogous to El-Farra and Christofides (2001):

$\frac{w_b |L_w L_f^{r-1} h(x)| \, |b^\top P e_t| \, (\phi - (\eta - 1)|b^\top P e_t|)}{(|b^\top P e_t| + \phi)\Big(1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}\Big)} \leq \frac{w_b |L_{w_e} V| \, (\phi - (\eta - 1)|b^\top P e_t|)}{(|b^\top P e_t| + \phi)\Big(1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}\Big)} \leq \delta |b^\top P e_t| w_b + \mu w_b \leq \delta w_b \frac{\phi}{\eta - 1} + \mu w_b \triangleq b(\phi), \quad \forall |e_t| \leq \frac{\phi}{\eta - 1}.$

Substituting $b(\phi)$ in (43), we obtain

$\dot{V} \leq -\lambda \|e_t\|^2 + b(\phi) - \frac{|L_{f_e} V + \eta w_b |L_{w_e} V|| \sqrt{1 + (u_{max} L_{g_e} V)^2} + \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}}.$  (45)

From (45), it is possible to verify that there exists a class $\mathcal{K}$ function $\alpha_1$ such that

$\dot{V} \leq -\alpha_1(|e_t|) + b(\phi),$  (46)

$\alpha_1(|e_t|) = \lambda \|e_t\|^2 + \frac{|L_{f_e} V + \eta w_b |L_{w_e} V|| \sqrt{1 + (u_{max} L_{g_e} V)^2} + \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}}.$

Therefore, the trajectories of (36) will approach an ultimate bound if

$\alpha_1(|e_t|) > b(\phi).$  (47)

Then, by an appropriate selection of $\phi$, we can force $b(\phi)$ to be small enough that (47) is satisfied and

$\dot{V} \leq -(\alpha_1(|e_t|) - b(\phi)) \leq 0, \quad \forall e_t, \widetilde{W} \neq 0.$

Case 2: $0 < L_{f_e} V + \eta w_b |L_{w_e} V| \leq u_{max} |L_{g_e} V|$. For this case, we consider the inequality

$(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 \leq (u_{max} L_{g_e} V)^2.$

Then

$(L_{f_e} V + \eta w_b |L_{w_e} V|) \sqrt{1 + (u_{max} L_{g_e} V)^2} \leq \sqrt{(L_{f_e} V + \eta w_b |L_{w_e} V|)^2 + (u_{max} L_{g_e} V)^4},$  (48)

and substituting in (42), taking into account the bound $b(\phi)$, yields

$\dot{V} \leq -\lambda (\|e_i\|^2 + \|e_t\|^2) + b(\phi) + \frac{(1 - \eta) w_b |L_{w_e} V| \sqrt{1 + (u_{max} L_{g_e} V)^2}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}},$

where we select $\eta > 1$ in order to ensure that the last term is strictly negative. Hence, there exists a class $\mathcal{K}$ function $\alpha_2$ such that

$\dot{V} \leq -\alpha_2(|e_t|) + b(\phi),$

$\alpha_2(|e_t|) = \lambda \|e_t\|^2 + \frac{(\eta - 1) w_b |L_{w_e} V| \sqrt{1 + (u_{max} L_{g_e} V)^2}}{1 + \sqrt{1 + (u_{max} L_{g_e} V)^2}}.$  (49)

Therefore, whenever the inequality $L_{f_e} V + \eta w_b |L_{w_e} V| \leq u_{max} |L_{g_e} V|$ holds, the trajectories of (36) will approach an ultimate bound if

$\alpha_2(|e_t|) > b(\phi).$  (50)

Then, by an appropriate selection of $\phi$, we can force $b(\phi)$ to be small enough that (50) is fulfilled and, for both Cases 1 and 2, the following inequality is satisfied:

$\dot{V} \leq -(\alpha_i(|e_t|) - b(\phi)) \leq 0, \quad \forall e_t, \widetilde{W} \neq 0.$  (51)

The designer can choose the parameters $\phi$ and $\eta$ in order to obtain a suitable tracking error; the design parameter $\phi$ should be selected small relative to the desired ultimate bound for the tracking error. Since $\dot{V}$ in (51) is negative semidefinite, by Barbalat's lemma and Theorem 4.4 in Khalil (1996) we have

$\dot{V} \leq -(\alpha_i(|e_t|) - b(\phi)), \quad i = 1, 2,$
$\alpha_i(|e_t|) - b(\phi) \to 0 \text{ as } t \to \infty,$
$\alpha_i(|e_t|) \to b(\phi) \text{ as } t \to \infty,$
$|e_t| \to \alpha_i^{-1}(b(\phi)).$  (52)

Thus, the tracking error approaches a ball of radius $|e_t| = \alpha_i^{-1}(b(\phi))$, which can be forced to be sufficiently small by the selection of the parameter $\phi$.

Remark 2. The inequality $L_{f_e} V + \eta w_b |L_{w_e} V| \leq u_{max} |L_{g_e} V|$ can be rewritten as

$(b^\top P e_t) f_e(e_t) + \eta w_b (b^\top P e_t) w(e_t) \leq u_{max} |(b^\top P e_t) g_e(e_t)|,$
f e ðet Þ þ Zwb jwðet Þjpumax jge ðet Þj. Since the control law explicitly depends on umax , this relation can be used to reduce the available maximum input as eTt decreases in order to smooth the applied control. Then, umax can be expressed as ( umax jet j if 0pjet joent ; umax ¼ (53) umax if jet jXent ; where ent is an admissible tracking error value which can be obtained via simulations. 5.2. Inverse optimality analysis Once the problem of finding the control law (41), which stabilizes (36), is solved, we can proceed to demonstrate that this control law minimizes a cost functional defined by Z t JðuÞ ¼ lim 2V þ ðlðet ; W ; W g Þ t!1 0
þuRðet ; W ; W g ÞuÞ dt ,
ð54Þ
where the Lyapunov function solves the associated HJB partial derivative equations lðet ; W ; W g Þ þ Lf t V 14Lg VRðet ; W ; W g Þ1 Lg V > þ jLw V jwb ¼ 0.
ð55Þ
In Krstic and Deng (1998), $l(e_t,W,W_g)$ is required to be positive definite and radially unbounded with respect to $e_t$. Here, from (55) we have
$l(e_t,W,W_g) = -L_{f_t}V + \tfrac14 L_gV^{T}R(e_t,W,W_g)^{-1}L_gV - |L_wV|\,w_b.$
Substituting $R(e_t,W,W_g)^{-1}$ into (55), we obtain after some algebraic manipulations
$l(e_t,W,W_g) \ge \dfrac{-(L_{f_e}V + \eta w_b|L_{w_e}V|)\big(\tfrac12+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{\tfrac12\sqrt{(L_{f_e}V + \eta w_b|L_{w_e}V|)^2 + (u_{\max}L_{g_e}V)^4}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$  (56)
For Case 1, as in the previous section, where $L_{f_e}V + \eta w_b|L_{w_e}V| \le 0$, (56) is rewritten as
$l(e_t,W,W_g) \ge \dfrac{|L_{f_e}V + \eta w_b|L_{w_e}V||\big(\tfrac12+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{\tfrac12\sqrt{(L_{f_e}V + \eta w_b|L_{w_e}V|)^2 + (u_{\max}L_{g_e}V)^4}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$  (57)
Considering the relation $|b^{T}Pe_t| > \phi/(\tfrac{\eta}{2}-1)$, which renders the last term strictly positive, we obtain
$l(e_t,\hat W) \ge \alpha_{11}(|e_t|) + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)} \ge \alpha_{11}(|e_t|).$  (58)
For analyzing the case when $|b^{T}Pe_t| \le \phi/(\tfrac{\eta}{2}-1)$, we consider the relation
$l(e,\hat W) \ge \dfrac{|L_{f_e}V + \eta w_b|L_{w_e}V||\big(\tfrac12+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{\tfrac12\sqrt{(L_{f_e}V + \eta w_b|L_{w_e}V|)^2 + (u_{\max}L_{g_e}V)^4}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|(\tfrac{\eta}{2}-1)|b^{T}Pe_t|}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)} - \dfrac{w_b|L_{w_e}V|\,\phi}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}$
$\ge \alpha_{12}(|e_t|) - \dfrac{w_b|L_{w_e}V|\,\phi}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$
From (40) we can obtain the following bound, using $|b^{T}Pe_t| \le \phi/(\tfrac{\eta}{2}-1)$:
$w_b|L_{w_e}V| \le w_b\big(d\,|b^{T}Pe_t| + \mu\big)\dfrac{\phi}{\tfrac{\eta}{2}-1} \le w_b\Big(d\,\dfrac{\phi}{\tfrac{\eta}{2}-1} + \mu\Big)\dfrac{\phi}{\tfrac{\eta}{2}-1} = \dfrac{w_b d\phi^2}{(\tfrac{\eta}{2}-1)^2} + \dfrac{\mu w_b\phi}{\tfrac{\eta}{2}-1}.$
Selecting $\eta > 2$ and $\phi$ sufficiently small, $l(e_t,W,W_g)$ satisfies the condition of being positive definite and radially unbounded if the trajectories satisfy
$\alpha_{12}(|e_t|) > \dfrac{w_b d\phi^2}{(\tfrac{\eta}{2}-1)^2} + \dfrac{\mu w_b\phi}{\tfrac{\eta}{2}-1} > \beta_0(\phi).$  (59)
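To see how the design parameters trade off, the residual term on the right of (59) can be evaluated numerically; here $d$ and $\mu$ are the constants from the bound (40), and the sample values below are purely illustrative:

```python
def drift_bound(wb, d, mu, eta, phi):
    """Residual term of condition (59): the quantity that
    alpha_12(|e_t|) must dominate. With eta > 2 it vanishes as
    phi -> 0, so the designer can always shrink it by choosing
    phi small."""
    k = eta / 2.0 - 1.0          # (eta/2 - 1), positive for eta > 2
    return wb * d * phi**2 / k**2 + mu * wb * phi / k
```

For instance, with $w_b=0.2$, $d=\mu=1$, $\eta=4$ and $\phi=0.02$ (the values used later in the example), the residual is about $4\times10^{-3}$, and halving $\phi$ reduces it further.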
Hence, (54) is a suitable cost functional. For Case 2, since $0 < L_{f_t}V + \eta w_b|e_t| \le u_{\max}|L_{g_e}V|$, we have
$l(e_t,W,W_g) \ge \dfrac{-(L_{f_e}V + w_b|L_{w_e}V|)\big(\tfrac12+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{\tfrac12\sqrt{(L_{f_e}V + \eta w_b|L_{w_e}V|)^2 + (u_{\max}L_{g_e}V)^4}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)},$  (60)
where the first term is strictly negative. Now, we proceed to rewrite (60) as
$l(e_t,W,W_g) \ge \dfrac{-L_{f_e}V\sqrt{1+(u_{\max}L_{g_e}V)^2} + \tfrac12 w_b|L_{w_e}V| - w_b|L_{w_e}V|\sqrt{1+(u_{\max}L_{g_e}V)^2} + \tfrac12\eta w_b|L_{w_e}V|\sqrt{1+(u_{\max}L_{g_e}V)^2}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$  (61)
Then, the designer should select $\eta > 4$ such that, from (61), it follows that
$l(e_t,W,W_g) \ge \dfrac{\tfrac12(\eta-4)\,w_b|L_{w_e}V|\sqrt{1+(u_{\max}L_{g_e}V)^2}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$  (62)
Then, (62) is simplified as
$l(e_t,W,W_g) \ge \alpha_{21}(|e_t|) + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)}.$  (63)
Considering the relation $|b^{T}Pe_t| > \phi/(\tfrac{\eta}{2}-1)$, which makes the second term strictly positive, we obtain
$l(e_t,W,W_g) \ge \dfrac{\tfrac12(\eta-4)\,w_b|L_{w_e}V|\sqrt{1+(u_{\max}L_{g_e}V)^2}}{1+\sqrt{1+(u_{\max}L_{g_e}V)^2}} + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)} \ge 0.$
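The structure of these bounds comes from the Sontag-type formula for bounded controls on which the control law (41) is based. A hedged sketch of such a bounded stabilizer, following the universal formula of Lin and Sontag (1991) rather than the authors' exact expression (41), is:

```python
import numpy as np

def bounded_sontag(LfV, LgV, u_max, eta=4.5, wb=0.2, LwV=0.0):
    """Sketch of a Lin-Sontag-type bounded stabilizer (illustrative,
    not the paper's exact law (41)). a collects the drift plus the
    disturbance-attenuation term, b is the saturation-scaled input
    gain. For a <= 0 the returned control satisfies |u| < u_max."""
    a = LfV + eta * wb * abs(LwV)
    b = u_max * LgV
    if b == 0.0:
        return 0.0               # no control authority along g
    return -u_max * (a + np.sqrt(a**2 + b**4)) / (b * (1.0 + np.sqrt(1.0 + b**2)))
```

Note that the nonlinearities of the drift are not cancelled; the formula only injects as much effort as the CLF decrease condition requires, which is why the saturation bound is respected.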
To analyze the case when $|b^{T}Pe_t| \le \phi/(\tfrac{\eta}{2}-1)$, we follow a procedure analogous to the second part of Case 1:
$l(e_t,W,W_g) \ge \alpha_{21}(|e_t|) + \dfrac{w_b|L_{w_e}V|\big((\tfrac{\eta}{2}-1)|b^{T}Pe_t| - \phi\big)}{(|b^{T}Pe_t|+\phi)\big(1+\sqrt{1+(u_{\max}L_{g_e}V)^2}\big)} \ge \alpha_{21}(|e_t|) - \dfrac{w_b d\phi^2}{(\tfrac{\eta}{2}-1)^2} - \dfrac{\mu w_b\phi}{\tfrac{\eta}{2}-1}.$
Selecting $\phi$ sufficiently small such that
$\alpha_{21}(|e_t|) > \dfrac{w_b d\phi^2}{(\tfrac{\eta}{2}-1)^2} + \dfrac{\mu w_b\phi}{\tfrac{\eta}{2}-1} > \beta(\phi)$  (64)
is satisfied, $l(e_t,W,W_g)$ is positive definite and radially unbounded. Then, by an appropriate selection of $\phi$, we can force $\beta(\phi)$ to be small enough that (59) and (64) are both satisfied. Hence, (54) is a suitable cost functional for both Cases 1 and 2.

Remark 3. For each reference trajectory, the designer has to consider the value of $\eta$ and the maximum input $u_{\max}$ in order to satisfy the restriction of Case 2. The design term $\phi$ should be selected small compared with the bound of the modelling error $w_b$, which can be obtained via simulations at the identification stage.

Now we proceed to prove that the control law (41) minimizes (54). Inserting $R(e_t,\hat W)$ in (54), we obtain
$J(u) = \lim_{t\to\infty}\Big[V + \int_0^t \big(l(e_t,W,W_g) + u^{T}R(e_t,W,W_g)u\big)\,d\tau\Big]$
$\;= \lim_{t\to\infty}\Big[V - \int_0^t \big(L_fV - \tfrac12 L_gV^{T}R^{-1}(e_t,W,W_g)L_gV + |L_wV|\,w_b\big)\,d\tau\Big]$
$\;= \lim_{t\to\infty}\Big[V - \int_0^t \dfrac{\partial V}{\partial e}\,(f + gu + w)\,d\tau\Big]$
$\;= \lim_{t\to\infty}\Big[V - \int_0^t \dot V\,d\tau\Big] = \lim_{t\to\infty} V(0) = V(0).$
It is clear that the optimal value is $J^{*}(u) = V(0)$. The optimality conditions described in this section also satisfy the stability conditions given by Cases 1 and 2. Thus, the control law (41) ensures the stability of the closed-loop system and is optimal with respect to the cost functional (54). The obtained results can be summarized by the following theorem.

Theorem 1. Consider the unknown nonlinear system with constrained inputs (13), which is modelled by the recurrent high order neural network (16), with the on-line learning law (24). Assume that the control law is given by (41), with parameters defined in (39). Then:
(a) For $L_{f_t}V + \eta w_b|e_t| \le u_{\max}|L_{g_e}V|$, the control law (41) guarantees an ultimate bound for the tracking error.
(b) For the design terms $\eta,\phi$ selected such that $\eta > 4$ and (59) and (64) are satisfied, the control law (41) minimizes the cost functional defined by (54).

6. Simulation example

In order to demonstrate the applicability of the proposed adaptive control scheme with constrained inputs, the following example is included. We consider tracking of a sinusoidal reference. The unknown system to control is the forced Van der Pol oscillator
$\dot x_{p1} = x_{p2},$
$\dot x_{p2} = (0.5 - x_{p1}^2)\,x_{p2} - x_{p1} + 0.5\cos(1.1t) + \mathrm{sat}(u)$  (65)
with $y_p = x_{p1}$ and $x_p(0) = (1.5,\ 0)$. The control objective is to force this system output to track the reference signal $x_r = 1.5 + \sin(t/4)$ for a maximum input $u_{\max} = 20$. For the simulations, the following recurrent neural network is used to model (65):
$\dot w = -\lambda w + Wz(x_p) + w_{per} + b\,\mathrm{sat}(u),\qquad y(w) = w_1$  (66)
with $w = [w_1\ w_2]^{T}$, $W \in \mathbb{R}^{2\times 8}$, $\lambda = 15I$, $b = [0\ 1]^{T}$ and the high order sigmoid vector defined as
$z(x_p) = [\tanh(x_{p1}),\ \tanh(x_{p2}),\ \tanh(x_{p1})\tanh(x_{p2}),\ \tanh^2(x_{p1}),\ \tanh^2(x_{p2}),\ \tanh^2(x_{p1})\tanh^2(x_{p2}),\ \tanh^3(x_{p1}),\ \tanh^3(x_{p2})]^{T}.$  (67)

To avoid parameter drift, we use the robust adaptation law (27) with $\gamma = 250$, $w_{\max} = 100$, $\sigma = 50$. The coordinate change is set as
$x_1 = w_1,\qquad x_2 = -\lambda w_1 + \sum_{i=1}^{8} w_{1i}\,z_i(x_p).$
The tracking error is defined as $e_1 = x_1 - x_{r1}$, and the error system as
$\dot e_1 = e_2,$
$\dot e_2 = \dot x_2 - \ddot x_{r1} = \lambda^2 w_1 - \lambda w_{per} - \lambda\sum_{i=1}^{8} w_{1i}z_i(x_p) + \sum_{i=1}^{8}\dot w_{1i}z_i(x_p) + \sum_{i=1}^{8} w_{1i}\dot z_i(x_p) - \ddot x_{r1}$
$\;= \lambda^2(e_1 + x_{r1}) - \lambda\sum_{i=1}^{8} w_{1i}z_i(x_p) + \sum_{i=1}^{8}\dot w_{1i}z_i(x_p) + \sum_{i=1}^{8} w_{1i}\dot z_i(x_p) - \ddot x_{r1},$
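The identification stage of (65)–(67) can be sketched as follows. This is a hedged illustration, not the authors' code: it uses plain Euler integration, sets $u = 0$ during identification, and replaces the learning laws (24)/(27) by a gradient-type update with a switching sigma-modification and a smaller illustrative gain ($\gamma = 50$ here, versus the paper's 250):

```python
import numpy as np

def z_vec(x1, x2):
    """High order sigmoid regressor of Eq. (67) (8 components)."""
    t1, t2 = np.tanh(x1), np.tanh(x2)
    return np.array([t1, t2, t1 * t2, t1**2, t2**2,
                     (t1**2) * (t2**2), t1**3, t2**3])

def identify(T=10.0, dt=1e-3, gamma=50.0, lam=15.0, w_max=100.0, sigma=50.0):
    """RHONN (66) identifying the forced Van der Pol plant (65) with
    u = 0. Returns the final identification error norm |w - x_p|."""
    xp = np.array([1.5, 0.0])        # plant state, x_p(0) = (1.5, 0)
    w = np.zeros(2)                  # RHONN state
    W = np.zeros((2, 8))             # adaptive weight matrix
    for k in range(int(T / dt)):
        t = k * dt
        z = z_vec(xp[0], xp[1])
        e = w - xp                   # identification error
        # gradient-type update with switching sigma-modification
        W_dot = -gamma * np.outer(e, z)
        if np.linalg.norm(W) > w_max:
            W_dot -= sigma * W       # leakage only past the bound
        # forced Van der Pol plant (65), u = 0 during identification
        xp_dot = np.array([xp[1],
                           (0.5 - xp[0]**2) * xp[1] - xp[0]
                           + 0.5 * np.cos(1.1 * t)])
        w_dot = -lam * w + W @ z     # RHONN model (66), u = 0
        xp, w, W = xp + dt * xp_dot, w + dt * w_dot, W + dt * W_dot
    return np.linalg.norm(w - xp)
```

With these (illustrative) gains the identification error settles to a small residual governed by the representation error of the regressor, in line with the bounded-error analysis above.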
where $\dot w_{1i}$ is obtained from (24). For the Lyapunov function, we select $c = 0.9$. For the control law, we select $w_b = 0.2$, $\eta = 4$, $\phi = 0.02$, where
$L_{f_e}V = (e_1 + 0.9e_2)\,e_2 + (0.9e_1 + e_2)\,\dot e_2,$
$L_{g_e}V = (0.9e_1 + e_2)\,\dfrac{\partial}{\partial u}\sum_{i=1}^{8} w_{1i}\,\dfrac{\partial z_i(x_{p1},x_{p2})}{\partial x_{p2}}\,\dot x_{p2},$
$L_{w_e}V = (0.9e_1 + e_2)\,\dfrac{\partial}{\partial w_{per}}\sum_{i=1}^{8} w_{1i}\,\dfrac{\partial z_i(x_{p1},x_{p2})}{\partial x_{p2}}\,\dot x_{p2}.$

Fig. 2 displays the time evolution of the plant output, and Fig. 3 displays the time evolution of state $x_{p2}$. The control law is applied at $t = 15$ s. In order to improve the closed-loop performance, we modify the input bounds according to (53) with $e_{nt} = 0.15$. As can be seen, the control law achieves the desired tracking performance even in the presence of uncertainties due to the modelling error and the input constraints. The applied input is displayed in Fig. 4. As can be seen in Fig. 5, most of the neural network weights converge, while some of them keep oscillating in a region due to the modelling error. The results are satisfactory; the recurrent high order adaptive neural control scheme features the following advantages:

(1) The neural identifier relaxes the condition of requiring knowledge of the system model, which is a common constraint in adaptive control. The recurrent neural network states converge to the real ones in a short time. Furthermore, its simple structure reduces the complexity of the trajectory tracking analysis. The on-line

Fig. 2. State evolution of the system output, reference signal and neural network output.
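The Lie derivative terms defined above can be evaluated numerically; in this sketch the chain-rule factors entering $L_{g_e}V$ and $L_{w_e}V$ are passed in as precomputed placeholders (the argument names are hypothetical, not from the paper):

```python
def lie_derivatives(e1, e2, e2_dot, dxdot_du, dxdot_dw, c=0.9):
    """Lie derivatives of the example's quadratic V (off-diagonal
    weight c = 0.9) along the error dynamics. dxdot_du and dxdot_dw
    stand for the precomputed chain-rule factors
    sum_i w_1i * dz_i/dx_p2 * d(x_p2_dot)/d(u or w_per)."""
    LfV = (e1 + c * e2) * e2 + (c * e1 + e2) * e2_dot
    LgV = (c * e1 + e2) * dxdot_du
    LwV = (c * e1 + e2) * dxdot_dw
    return LfV, LgV, LwV
```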
Fig. 3. Output trajectory tracking: time evolution of state $x_{p2}$ (Van der Pol system and neural network).
Fig. 4. Applied input.
Fig. 5. Output trajectory tracking: time evolution of the neural network weights.
adaptability of the neural identifier provides the control scheme with robustness to disturbances and parametric uncertainties. The main advantage is that the system model is not required to be known; it is, however, necessary to know at least its structure and its relative degree.
(3) The modification of Sontag's law considers, and depends explicitly on, the input constraints. Since it does not cancel nonlinearities, as feedback linearizing approaches do, no control effort is wasted cancelling nonlinearities which may be beneficial to the process. The Lyapunov analysis renders the stability region, which depends on the input constraint and the design parameters. Furthermore, the control law takes into account the modelling error and provides some degree of attenuation on the basis of the design parameters.
The potential applications of the present algorithm arise mainly in electromechanical systems, where the relative degree and the structure of the system can be known, but where the process exhibits uncertainties, unmeasurable parameters and limited actuator energy.

7. Conclusions

An adaptive control structure based on a recurrent neural network for output tracking of unknown nonlinear systems with constrained inputs was developed. This structure is composed of a neural network identifier, which builds an on-line model of the unknown plant and is the basis for computing the time derivatives of the output, and a control law for trajectory tracking with constrained inputs, developed
using the Sontag law and the Lyapunov methodology. Stability of the identification and tracking errors, as well as the optimality analysis, is established via the Lyapunov methodology. The applicability of the proposed structure was tested via simulations of output tracking for the forced Van der Pol oscillator. Further work will consider the use of nonlinear observers for the case when the full state is not available for measurement.

Acknowledgements

The authors thank CONACyT, Mexico, Project 39866Y, for supporting this research.

References

Basar, T., Bernhard, P., 1995. H-Infinity Optimal Control and Related Minimax Design Problems. Birkhauser, Boston, USA.
El-Farra, N.H., Christofides, P.D., 2001. Integrating robustness, optimality and constraints in control of nonlinear processes. Chemical Engineering Science 56, 1841–1868.
Hokimyan, N., Rysdyk, R., Calise, A., 2001. Dynamic neural networks for output feedback control. International Journal of Robust and Nonlinear Control 11, 23–39.
Hopfield, J., 1984. Neurons with graded responses have collective computational properties like those of two state neurons. In: Proceedings of the National Academy of Sciences, USA, pp. 3088–3092.
Ioannou, P., Sun, J., 1996. Robust Adaptive Control. Prentice-Hall, Upper Saddle River, NJ, USA.
Kapoor, N., Daoutidis, P., 1995. Stabilization of unstable systems with input constraints. In: Proceedings of the American Control Conference, vol. 5, pp. 3192–3196.
Khalil, H., 1996. Nonlinear Systems, second ed. Prentice-Hall, Upper Saddle River, NJ, USA.
Kosmatopoulos, E.B., Christodoulou, M.A., Ioannou, P.A., 1997. Dynamical neural networks that ensure exponential identification error convergence. Neural Networks 10 (2), 299–314.
Krstic, M., Deng, H., 1998. Stabilization of Nonlinear Uncertain Systems. Springer, New York, USA.
Lin, Y., Sontag, E., 1991. A universal formula for stabilization with bounded controls. Systems and Control Letters 16, 393–397.
Low, K.S., Zhuang, H., 2003. Robust model predictive control of a motor drive with control input constraints. In: The Fifth International Conference on Power Electronics and Drive Systems, PEDS 2003, vol. 2, pp. 1224–1228.
Narendra, K.S., Annaswamy, A.M., 1989. Stable Adaptive Systems. Prentice-Hall, Englewood Cliffs, NJ, USA.
Narendra, K.S., Parthasarathy, K., 1990. Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks 1 (1), 4–27.
Qiu, L., Miller, D.E., 2000. Stabilization of linear systems with input constraints. In: Proceedings of the 39th IEEE Conference on Decision and Control, vol. 4, pp. 3272–3277.
Ren, W., Beard, R.W., 2003. CLF-based tracking control for UAV kinematic models with saturation constraints. In: Proceedings of the 42nd IEEE Conference on Decision and Control, vol. 4, pp. 3924–3929.
Rovithakis, G.A., Christodoulou, M.A., 2000. Adaptive Control with Recurrent High-Order Neural Networks. Springer, New York, USA.
Sanchez, E.N., Ricalde, L.J., 2003. Chaos control and synchronization, with input saturation, via recurrent neural networks. Neural Networks 16, 711–717.
Sanchez, E.N., Perez, J.P., Ricalde, L., 2002. Recurrent neural control for robot trajectory tracking. In: Proceedings of the 15th World Congress, International Federation of Automatic Control, Barcelona, Spain.
Sanchez, E.N., Perez, J.P., Ricalde, L., 2003. Neural network control design for chaos control. In: Chen, G., Yu, X. (Eds.), Chaos Control: Theory and Applications. Springer, Berlin.
Sepulchre, R., Jankovic, M., Kokotovic, P.V., 1997. Constructive Nonlinear Control. Springer, New York, USA.