STABLE ADAPTIVE CONTROL UNDER UNMEASURABLE PLANT STATES




Duc T. Nguyen and Mietek A. Brdys

Department of Electronic, Electrical and Computer Engineering, The University of Birmingham, Edgbaston, Birmingham B15 2TT, UK

dtn@bham.ac.uk and mbrdys@bham.ac.uk

Abstract: A dynamic neural network based algorithm for learning control of an unknown nonlinear continuous-time multiple-input multiple-output plant with unmeasurable states is proposed in this paper. A new dynamic neural network structure is utilised to model the unknown plant dynamics by modelling its input-output mapping. Feedback linearization is applied to design a controller for the neural model, and the neural states are used as a source of valuable information about the current plant dynamics. A gradient based update of the weights is performed at discrete time instants over a moving measurement window in order to reduce the mismatch between the model output and the real output. The learning controller is applied to a double link robot arm. Stability of the system is analysed through ultimate boundedness of all signals.

Keywords: Dynamic neural networks, robotic manipulators, on-line learning, nonlinear control systems, feedback linearization, ultimate boundedness.

1. INTRODUCTION

Controlling multiple-input multiple-output (MIMO) nonlinear systems is still a major problem. The problem becomes more difficult if the plant dynamics are not known exactly and state measurements are not available. A common approach is to build a parametric model of the unknown plant dynamics and to synthesize the control law based on this model. During control, the model parameters need to be updated so that they converge to the true parameter values; a control law that is good for the model then becomes good for the plant as well. Neural networks have been proven suitable for parameterizing the plant dynamics. As the plant states are not measurable, they cannot be used in a state feedback control law. Although the dynamic neural network states do not correspond to the plant states, they can be used as a source of useful information about the plant dynamics. In (Brdys et al. 1998), a Hopfield dynamic diagonal neural network was trained by a gradient descent algorithm to model the plant input-output mapping. Under the assumption that perfect model parameters exist, so that the model and the plant realise the same input-output mapping, the feedback linearization control algorithm with online model parameter adaptation guaranteed tracking error convergence and stability of the control system. The controller was successfully applied to induction

This work is supported by the Vietnamese Ministry of Education and Training under project 322 (2000).

motor control (Brdys and Kulawski 1999). Another identification and control method was introduced in (Poznyak et al. 1999), where feeding all states back to the input layer creates the dynamics of the neural network. As opposed to (Brdys et al. 1998), the number of plant states is assumed to be known and equal to the number of model states. The plant states were measured and used in network training. The method was successfully applied to control a double link robot manipulator (Poznyak et al. 2001). A similar dynamic neural model is used by (Ren et al. 2003) to control a single link manipulator. A different method was introduced in (Harris et al. 1998), where two separate feed-forward neural networks were used to model the manipulator, and the controller was then built based on the neural model with the manipulator states known. In this paper, we introduce a new dynamic neural network which is an improvement of the model introduced in (Poznyak et al. 2001). The proposed control algorithm is a further development of the results obtained in (Brdys et al. 1998). The main extension consists of using a non-diagonal neural network structure and improving the network approximation properties. The new neural network is applied to a double link robot manipulator. The paper is structured as follows. In section 2, the proposed dynamic state space neural network and the training algorithm are given. The feedback linearizing control technique and stability of the closed loop system are presented in section 3. In section 4, simulation results of controlling the double link manipulator are given. Section 5 concludes the paper.

2. MODEL TRAINING

Recurrent neural network model architecture

Fig. 1. Proposed dynamic state space neural network architecture

The proposed recurrent state space neural network (RSSNN), shown in Fig. 1, is composed of two subnets. The first subnet is a one hidden layer feed-forward network with m nonlinear neurons; the second is a two hidden layer feed-forward network with κ nonlinear neurons in its first hidden layer. The dynamics of the neural network are created by the feedback from all output states. The inputs of both subnets are the states of the proposed neural network, whose number is in general different from the number of plant states. Activation functions of the nodes in the hidden layers are sigmoids, described by

  σi(z) = a1/(1 + e^(−b1 z)) − c1   and   φi(z) = a2/(1 + e^(−b2 z)) − c2

where a1, a2, b1, b2, c1, c2 are positive constants, 0 ≤ c1 ≤ a1 and 0 ≤ c2 ≤ a2.

The number of nonlinear nodes in both subnets depends on the dynamics of the plant to be modelled and on the required approximation accuracy. The second hidden layer of the second subnet has q linear neurons, where q is the number of plant inputs. Mathematically, the neural network is described by the matrix equations

  dx̂/dt = A x̂ + Ŵ σ(V̂ x̂) + Ŝ diag(T̂ φ(R̂ x̂)) u
  ŷ = C x̂      (1)

where x̂ ∈ ℜ^n is the state vector, u ∈ ℜ^q is the input vector, Ŵ ∈ ℜ^(n×m) is the matrix of output weights of the first subnet, T̂ ∈ ℜ^(q×κ) is the matrix of weights in the second subnet, V̂ ∈ ℜ^(m×n) and R̂ ∈ ℜ^(κ×n) are the weight matrices describing the hidden layer connections, Ŝ = [ŝ1, ..., ŝq] with ŝi ∈ ℜ^n is the matrix of output weights of the second subnet, A ∈ ℜ^(n×n) is a stable (Hurwitz) matrix, C ∈ ℜ^(p×n) is the output matrix of the network, and σ(V̂ x̂) ∈ ℜ^m and φ(R̂ x̂) ∈ ℜ^κ are vectors of sigmoid functions with the i-th elements

  σi = σi( Σ_{j=1..n} v̂ij x̂j ), i = 1, ..., m   and   φi = φi( Σ_{j=1..n} r̂ij x̂j ), i = 1, ..., κ.

Neural network weight updating

Assumption 1: The plant is input-output stable.

This assumption ensures that the neural network is trained with data from a stable object. The following assumption is required to guarantee input-to-state stability of the RSSNN (Brdys and Nguyen, 2006):

Assumption 2: All of the trained matrices are bounded and the parameters Ŵ satisfy

  ‖Ŵ‖^2 < (γ − ‖P‖) / (Lσ^2 ‖P‖),  ‖P‖ < γ,  γ > 0

where Lσ is the Lipschitz constant of σ(·) and P = P^T > 0 satisfies the Lyapunov equation A^T P + P A = −γ I.

In our work, since the updating is based only on the tracking error, a gradient descent algorithm is used for training the model. Convergence can be guaranteed only locally, so the initial weight values need to be close enough to the optimal ones. Following the algorithm, the k-th update is in the direction opposite to the gradient of the error criterion

  E = (1/2) ∫_{kT}^{kT+T} e_m^T e_m dτ      (2)

where T is the length of the measurement window. The network weights are updated every T and are kept constant over each interval of length T. Define a parameter vector b̂ whose elements are all the updated parameters of the neural network,

  b̂ = [ŵ11 .. ŵnm, v̂11 .. v̂mn, ŝ11 .. ŝnq, t̂11 .. t̂qκ, r̂11 .. r̂κn] ∈ ℜ^(2nm+nq+qκ+κn).

Each element of b̂ is computed from the previous step as

  b̂i(kT + T) = b̂i(kT) − η ∂E/∂b̂i      (3)

where η > 0 is the learning rate. The discrete-time instants at which the updates take place are indexed by the positive integer k. Let us define the set

  Ψ_Ŵ = { Ŵ ∈ ℜ^(n×m) : ‖Ŵ‖^2 ≤ (γ − ‖P‖)/(Lσ^2 ‖P‖) − ε }

where ε > 0 is a small positive number. It is sufficient for network stability that Ŵ ∈ Ψ_Ŵ. Hence, the weights produced by (3) are modified as

  ŵij(kT + T) = ŵij(kT) − η ∂E/∂ŵij     if ‖Ŵ(kT + T)‖^2 ≤ (γ − ‖P‖)/(Lσ^2 ‖P‖) − ε
  ŵij(kT + T) = π_Ψ( ŵij(kT + T) )      if ‖Ŵ(kT + T)‖^2 > (γ − ‖P‖)/(Lσ^2 ‖P‖) − ε

where π_Ψ(·) denotes a projection operator.
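As a concrete illustration, the model (1) and the windowed weight update (3) can be sketched numerically as below. All dimensions, constants and names are our own illustrative assumptions, not values from the paper; for simplicity the gradient over the window is obtained by finite differences (a stand-in for the sensitivity equations used in the paper), and the projection is onto a plain norm ball standing in for the set Ψ_Ŵ.

```python
import numpy as np

# Hypothetical dimensions and sigmoid constants for illustration only;
# the paper's simulation uses m = kappa = 20 neurons and 4 states.
n, m, kappa, q, p = 4, 6, 6, 1, 1
a1, b1, c1 = 2.0, 1.0, 1.0
a2, b2, c2 = 2.0, 1.0, 0.5

def sig(z):
    return a1 / (1.0 + np.exp(-b1 * z)) - c1

def phi(z):
    return a2 / (1.0 + np.exp(-b2 * z)) - c2

A = -2.0 * np.eye(n)                  # stable diagonal Hurwitz matrix
C = np.zeros((p, n)); C[0, 0] = 1.0   # output matrix

class RSSNN:
    """State space model (1): x' = A x + W sig(V x) + S diag(T phi(R x)) u."""
    def __init__(self, rng):
        self.W = 0.1 * rng.standard_normal((n, m))
        self.V = 0.1 * rng.standard_normal((m, n))
        self.S = 0.1 * rng.standard_normal((n, q))
        self.T = 0.1 * rng.standard_normal((q, kappa))
        self.R = 0.1 * rng.standard_normal((kappa, n))

    def f(self, x, u):
        return (A @ x + self.W @ sig(self.V @ x)
                + self.S @ (np.diag(self.T @ phi(self.R @ x)) @ u))

def window_error(net, u_seq, y_seq, dt):
    """E = 1/2 * integral of ||y_hat - y||^2 over one window, as in (2)."""
    x = np.zeros(n)
    E = 0.0
    for u, y in zip(u_seq, y_seq):
        x = x + dt * net.f(x, u)      # forward Euler integration
        e = C @ x - y
        E += 0.5 * dt * float(e @ e)
    return E

def update_W(net, u_seq, y_seq, dt, eta=0.1, w_max=5.0, h=1e-6):
    """One gradient step (3) on W, then projection onto ||W|| <= w_max."""
    E0 = window_error(net, u_seq, y_seq, dt)
    grad = np.zeros_like(net.W)
    for i in range(n):
        for j in range(m):
            wij = net.W[i, j]
            net.W[i, j] = wij + h     # finite-difference gradient estimate
            grad[i, j] = (window_error(net, u_seq, y_seq, dt) - E0) / h
            net.W[i, j] = wij
    net.W -= eta * grad
    norm = np.linalg.norm(net.W)
    if norm > w_max:
        net.W *= w_max / norm         # projection operator pi_Psi
    return E0

# Usage sketch: identify a second, independently drawn network from its data.
rng = np.random.default_rng(0)
net, plant = RSSNN(rng), RSSNN(rng)
dt = 0.01
u_seq = [np.array([np.sin(0.2 * k * dt)]) for k in range(200)]
xp = np.zeros(n); y_seq = []
for u in u_seq:
    xp = xp + dt * plant.f(xp, u)
    y_seq.append(C @ xp)
E0 = update_W(net, u_seq, y_seq, dt)
E1 = window_error(net, u_seq, y_seq, dt)
```

In the paper the gradient (4) is computed by integrating the sensitivity equations (5)-(6) alongside the model, and the projection bound comes from Assumption 2 rather than an arbitrary `w_max`.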

Fig. 2. Control diagram

The partial derivatives of the error function (2) with respect to the parameters are calculated as

  ∂E/∂b̂i = ∫_{kT}^{kT+T} (ŷ − y)^T ∂ŷ/∂b̂i dτ      (4)

(4) is implemented by integrating over T the derivative

  d/dt ( ∂E/∂b̂i ) = (ŷ − y)^T ∂ŷ/∂b̂i      (5)

with

  ∂ŷ/∂b̂i = C ∂x̂/∂b̂i      (6)

starting from the zero initial conditions ∂E/∂b̂i (kT) = 0 and ∂x̂/∂b̂i (kT) = 0. Once the parameter updates have been performed, the initial conditions in equations (5) and (6) are reset to zero and the cycle is repeated.

3. FEEDBACK LINEARIZATION CONTROL SYSTEM

Control synthesis

The RSSNN based control system is designed as in (Brdys et al. 1998) and its structure is shown in Fig. 2. As mentioned earlier, the RSSNN is in a feedback linearisable form and all states of the neural network are available, so the controller is designed based on the RSSNN model. During control, the outputs and states of the model are fed back to the controller to generate the control signal u. The control signal u is applied to both the plant and the model, and the mismatch between the plant output and the model output is used to train the neural network. The evolution of the model output can be expressed as

  dŷ/dt = C dx̂/dt = C A x̂ + C Ŵ σ(V̂ x̂) + C Ŝ diag(T̂ φ(R̂ x̂)) u      (7)

For the algorithm to proceed, the following assumption needs to be satisfied:

Assumption 3: C Ŝ diag(T̂ φ(R̂ x̂)) is invertible and bounded away from zero, that is, |C Ŝ diag(T̂ φ(R̂ x̂))| ≥ d > 0 for all x̂ ∈ ℜ^n.

Therefore, the feedback linearizing control can be calculated as

  u = ( C Ŝ diag(T̂ φ(R̂ x̂)) )^(−1) ( −C A x̂ − C Ŵ σ(V̂ x̂) + v )      (8)

under which the feedback linearized dynamics can be written as dŷ/dt = v. A simple linear controller produces the auxiliary input v:

  v = ẏ_ref − α ( ŷ − y_ref ),  α > 0      (9)

where α > 0 is defined by a pole placement technique. The error between the model output and the reference output, ê = ŷ − y_ref, obeys dê/dt + α ê = 0 and converges exponentially to the origin with rate α. Hence, after some initial transients, the tracking error e_t = y_ref − y becomes the modelling error e_m = ŷ − y.

Ultimate boundedness of all signals

To proceed with the stability analysis of the system, for simplicity of exposition a single-input single-output (SISO) system is considered. The model can then be written in the simpler form

  dx̂/dt = A x̂ + Ŵ σ(V̂ x̂) + Ŝ φ(R̂ x̂) u
  ŷ = C x̂      (10)

The parameters Ŵ are updated during control. It is assumed that there exist ideal parameters W such that with these parameters the RSSNN is a perfect plant model. That is, the dynamic system

  dx/dt = A x + W σ(V x) + S φ(R x) u
  y = C x      (11)

is a realisation of the same input-output mapping u(t) → y(t) as that of the plant. Therefore, (11) will be considered as a valid description of the controlled plant. From now on, x will denote the state of the plant.

Assumption 4: The reference output signal y_ref and its derivative ẏ_ref are bounded.

The main technical result of the paper can now be formulated.

Theorem 1: If the initial parameter mismatch ‖Ŵ − W‖ is small enough and α is suitably chosen, then the control law (8) achieves ultimate boundedness of all signals in both the plant and the model.

Proof.

Part 1: Boundedness of the model states.

The dynamics of the closed loop system under control (8) for the SISO plant are

  dx̂/dt = A x̂ + Ŵ σ(V̂ x̂) + ( Ŝ φ(R̂ x̂) / (C Ŝ φ(R̂ x̂)) ) ( −α C x̂ + ẏ_ref + α y_ref − C A x̂ − C Ŵ σ(V̂ x̂) )      (12)

Let us choose the Lyapunov function for system (12) as V̂ = x̂^T P x̂, where P is obtained from Assumption 2. Taking the time derivative of this Lyapunov function along the trajectory of system (12) yields

  dV̂/dt = (dx̂/dt)^T P x̂ + x̂^T P (dx̂/dt)      (13)

Defining

  Q ≜ Ŝ φ(R̂ x̂) / ( C Ŝ φ(R̂ x̂) ),

taking the transpose of dx̂/dt and substituting into (13) yields

  dV̂/dt = x̂^T ( A^T P + P A − 2 P Q (C A + α C) ) x̂ + 2 x̂^T P (I − Q C) Ŵ σ(V̂ x̂) + 2 x̂^T P Q ( ẏ_ref + α y_ref )      (14)

Defining

  A1 ≜ A^T P + P A − 2 P Q (C A + α C) = −γ I − 2 P Q (C A + α C)      (15)

Since A is chosen as a diagonal matrix with element value a < 0, P is also a diagonal matrix, with element value p = γ/(−2a). By simple calculation, the second term on the right hand side of (15) can be written as

  2 P Q (C A + α C) = [ 2p(a + α)       0  ...  0
                        2p s̄2 (a + α)   0  ...  0
                        ...
                        2p s̄n (a + α)   0  ...  0 ]

where s̄i := ( Σ_{j=1..κ} ŝij φj ) / ( Σ_{j=1..κ} ŝ1j φj ), i = 2, ..., n. Therefore

  A1 = [ −2pα             0   ...  0
         −2p s̄2 (a + α)   −γ  ...  0
         ...
         −2p s̄n (a + α)   0   ...  −γ ]

One can see that A1 is a lower triangular matrix with all diagonal entries negative. Let us define

  A2 ≜ −(A1 + A1^T)/2 = [ 2pα            p s̄2 (a + α)  ...  p s̄n (a + α)
                          p s̄2 (a + α)   γ              ...  0
                          ...
                          p s̄n (a + α)   0              ...  γ ]

A2 is symmetric with positive diagonal entries. As |φj| = |φj( Σ_{k=1..n} r̂jk x̂k )| ≤ a2 − c2 for all x̂, and since by Assumption 3 Σ_{j=1..κ} ŝ1j φj ≥ d > 0, there exist constants di > 0, i = 2, ..., n, such that |s̄i| ≤ di for all x̂ ∈ ℜ^n. Hence, by choosing a suitable value of α, the elements satisfy the inequalities

  p |s̄i| (a + α) ≤ γ, i = 2, ..., n   and   Σ_{i=2..n} p |s̄i| (a + α) ≤ Σ_{i=2..n} p di (a + α) ≤ 2pα.

Thus a2,ii ≥ Σ_{j≠i} |a2,ij|, where a2,ij is the ij-th element of A2. This means that A2 is a real diagonally dominant matrix with positive diagonal entries; therefore A2 is a symmetric positive definite matrix (Horn and Johnson, 1985). Notice that

  x̂^T A1 x̂ = x̂^T ( (A1 + A1^T)/2 ) x̂ + x̂^T ( (A1 − A1^T)/2 ) x̂   and   x̂^T ( (A1 − A1^T)/2 ) x̂ = 0.

Thus x̂^T A1 x̂ = −x̂^T A2 x̂, and due to the symmetry and positive definiteness of A2 the following holds:

  x̂^T A1 x̂ = −x̂^T A2 x̂ ≤ −λmin(A2) ‖x̂‖^2      (16)

where λmin(A2) is the minimum eigenvalue of A2. Combining (15) and (16), the right hand side of (14) can be bounded as

  x̂^T ( A^T P + P A − 2 P Q (C A + α C) ) x̂ + 2 x̂^T P (I − Q C) Ŵ σ(V̂ x̂) + 2 x̂^T P Q ( ẏ_ref + α y_ref )
    ≤ −λmin(A2) ‖x̂‖^2 + 2 ‖P‖ ( M ‖Ŵ‖ Kσ + N ȳ ) ‖x̂‖

where Kσ is a bound on ‖σ(V̂ x̂)‖, and M, N and ȳ are positive constants such that

  ‖Q‖ ≤ N < ∞ for all x̂ ∈ ℜ^n,   ‖I − Q C‖ ≤ M < ∞,   ȳ = max_t | ẏ_ref + α y_ref | < ∞      (17)

where ȳ is finite because ẏ_ref ∈ L∞ and y_ref ∈ L∞ by Assumption 4. Therefore, the derivative of the Lyapunov function is bounded as

  dV̂/dt ≤ −λmin(A2) ‖x̂‖^2 + 2 ‖P‖ ( M ‖Ŵ‖ Kσ + N ȳ ) ‖x̂‖

Defining m1 = inf_{x̂ ∈ ℜ^n} λmin(A2) and m2 = max 2 ‖P‖ ( M ‖Ŵ‖ Kσ + N ȳ ), the bound on the derivative of the Lyapunov function becomes

  dV̂/dt ≤ −m1 ‖x̂‖^2 + m2 ‖x̂‖      (18)

Rewriting (18) as dV̂/dt ≤ −m1 (1 − θ) ‖x̂‖^2 − m1 θ ‖x̂‖^2 + m2 ‖x̂‖ with 0 < θ < 1 gives

  dV̂/dt ≤ −m1 (1 − θ) ‖x̂‖^2   for all ‖x̂‖ ≥ m2/(m1 θ)      (19)

From the above and the fact that the chosen Lyapunov function V̂ satisfies λmin(P) ‖x̂‖^2 ≤ V̂(x̂) ≤ λmax(P) ‖x̂‖^2, we can conclude, based on Theorem 4.18 in (Khalil 2002), that there exists a finite time t1 ≥ 0 such that

  ‖x̂(t)‖ ≤ β( ‖x̂(t0)‖, t − t0 ),  for all t0 ≤ t < t0 + t1      (20)

  ‖x̂(t)‖ ≤ m2/(m1 θ),  for all t ≥ t0 + t1      (21)

In other words, x̂(t) is uniformly ultimately bounded with ultimate bound b_x̂ = m2/(m1 θ). Naturally, boundedness of x̂ implies boundedness of ŷ because of the relation ŷ = C x̂.

Part 2: Boundedness of the control signal.

The control signal for (10) is calculated as

  u( x̂, Ŵ ) = ( 1/(C Ŝ φ(R̂ x̂)) ) ( −C A x̂ − α C x̂ − C Ŵ σ(V̂ x̂) + ẏ_ref + α y_ref )

The right hand side can be bounded by

  | u( x̂, Ŵ ) | ≤ (1/d) ( ‖C A + α C‖ ‖x̂‖ + ‖C Ŵ‖ Kσ + | ẏ_ref + α y_ref | )      (22)

Since the updated parameter Ŵ remains bounded, the model states are ultimately bounded by (21), and the reference satisfies (17), the right hand side of inequality (22) is bounded. Hence, the input signal applied to both model and plant converges to a bounded value in the finite time t1.

Part 3: Boundedness of the plant states.

Choosing the Lyapunov function for the plant (11) as V = (1/2) x^T x, its time derivative along the trajectory of (11) is

  dV/dt = x^T dx/dt = x^T A x + x^T W σ(V x) + x^T S φ(R x) u
        ≤ −‖A‖ ‖x‖^2 + ‖x‖ ‖W σ(V x) + S φ(R x) u‖

Since u is bounded for t ≥ t1 and σ(·), φ(·), W, S are bounded, it holds that

  x^T W σ(V x) + x^T S φ(R x) u ≤ ‖x‖ ‖W σ(V x) + S φ(R x) u‖ ≤ m3 ‖x‖,  for all t ≥ t1

where m3 is a positive constant satisfying ‖W σ(V x) + S φ(R x) u‖ ≤ m3 for all x ∈ ℜ^n, t ≥ t1. Therefore, the derivative of the Lyapunov function obeys

  dV/dt ≤ −‖A‖ ‖x‖^2 + m3 ‖x‖ = −(1 − θ) ‖A‖ ‖x‖^2 − θ ‖A‖ ‖x‖^2 + m3 ‖x‖
        ≤ −(1 − θ) ‖A‖ ‖x‖^2 = −k3 ‖x‖^2,
  for all ‖x‖ ≥ m3/(θ ‖A‖), t ≥ t1, where 0 < θ < 1, k3 = (1 − θ) ‖A‖ > 0      (23)

Similarly to the convergence of x̂, it can be concluded that x decreases exponentially and in finite time enters a ball of radius b_x > 0: ‖x‖ ≤ b_x for all t ≥ t2, t2 > t1. This implies that y is bounded for t ≥ t2 > t1. Thus, all the signals x̂, x, ŷ, y and u are proved to be uniformly ultimately bounded for t ≥ t2. ■
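Before moving to the application, the SISO control law (8) with auxiliary input (9) can be illustrated in simulation. Everything below (dimensions, weights, sigmoid constants) is an illustrative assumption, and the loop is closed around the neural model alone (the model serves as its own plant), so the linearization is exact and the model output tracks y_ref = sin t at the rate α.

```python
import numpy as np

# SISO sketch of the feedback linearizing law (8) and auxiliary input (9).
n, m, kappa = 4, 6, 6
rng = np.random.default_rng(1)

A = -2.0 * np.eye(n)                  # stable diagonal Hurwitz matrix
C = np.zeros((1, n)); C[0, 0] = 1.0
W = 0.1 * rng.standard_normal((n, m))
V = 0.1 * rng.standard_normal((m, n))
S = np.ones((n, 1))
T = np.ones((1, kappa))
R = 0.1 * rng.standard_normal((kappa, n))

def sig(z):
    return 2.0 / (1.0 + np.exp(-z)) - 1.0

def phi(z):
    # shifted so that C S diag(T phi(R x)) stays away from zero (Assumption 3)
    return 2.0 / (1.0 + np.exp(-z)) + 0.5

def control(x, y_ref, dy_ref, alpha=5.0):
    g = (C @ S).item() * (T @ phi(R @ x)).item()   # scalar C S diag(T phi(R x)), q = 1
    v = dy_ref - alpha * ((C @ x).item() - y_ref)  # auxiliary input (9)
    return (-(C @ (A @ x)).item() - (C @ (W @ sig(V @ x))).item() + v) / g  # law (8)

def model_step(x, u, dt):
    # model (10) integrated with forward Euler
    return x + dt * (A @ x + W @ sig(V @ x)
                     + S[:, 0] * (T @ phi(R @ x)).item() * u)

dt = 1e-3
x = np.zeros(n)
for k in range(2000):                              # simulate 2 seconds
    t = k * dt
    u = control(x, np.sin(t), np.cos(t))
    x = model_step(x, u, dt)
y_hat = (C @ x).item()                             # should be close to sin(2.0)
```

With the controller computed from the model's own weights, dŷ/dt = v holds exactly and the remaining tracking error is only the Euler discretization error, which is how the model-side error ê behaves in the analysis above.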

4. APPLICATION AND CONTROL SIMULATION

Double link robot arm

The control algorithm is applied to a double link rigid body manipulator whose inverse dynamics in the joint space are

  M(θ) θ̈ + C(θ, θ̇) θ̇ + G(θ) + F(θ̇) = τ      (24)

where θ, θ̇, θ̈ denote the angle, angular velocity and angular acceleration vectors of the joints; M(θ) is the manipulator and actuator inertia matrix; C(θ, θ̇) is the matrix of Coriolis and centrifugal torques; G(θ) is the vector of gravitational torques; F(θ̇) is the vector of friction torques, and τ is the vector of actuator torques. The lengths of the links are denoted by l1 and l2. We assume that the mass of each link is concentrated as a point mass at the middle of that link. The masses of the first and the second link are denoted by m1 and m2. The matrices appearing in (24) are

  M(θ) = [ (m1 + m2) l1^2 + m2 l2^2 + 2 m2 l1 l2 cos θ2    m2 l2^2 + m2 l1 l2 cos θ2
           m2 l2^2 + m2 l1 l2 cos θ2                       m2 l2^2 ]

  C(θ, θ̇) = [ −2 l1 l2 m2 θ̇2 sin θ2    −l1 l2 m2 θ̇2 sin θ2
               l1 l2 m2 θ̇1 sin θ2       0 ]

  G(θ) = [ g ( m2 l2 cos(θ1 + θ2) + (m1 + m2) l1 cos θ1 )
           g m2 l2 cos(θ1 + θ2) ]

  F(θ̇) = [ f1 ; f2 ] = [ f11 θ̇1 + f12 sign(θ̇1) ; f21 θ̇2 + f22 sign(θ̇2) ]

To collect input-output data for simulation, the state space form of (24) is used:

  θ̈ = M(θ)^(−1) [ τ − C(θ, θ̇) θ̇ − G(θ) − F(θ̇) ]      (25)

The parameters of the manipulator are chosen as m1 = … kg, m2 = … kg, l1 = … m, l2 = … m, f11 = …, f12 = …, f21 = …, f22 = …, g = … m/s^2.

Simulation results

Training. The identification input signals are chosen as sine waves. The output data are computed from (25). 20 nonlinear neurons are set in each subnet and the number of neural network states is 4. Simulation results show that the weights settle and the plant and model outputs track each other after about 400 sec of running. The tracking accuracy during the last 50 sec of simulation is presented in Fig. 3b and Fig. 3c. The length of the time window is set to T = 1 sec and the learning rate η = 0.1 is kept constant.

Fig. 3. Identification with m = κ = 20 nonlinear neurons: a) identification signals; b) identification responses of link 1; c) identification responses of link 2; d) selected trained parameters
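A minimal simulation of the manipulator model (24)-(25), of the kind used to generate the identification data, can be sketched as below. The parameter values are illustrative placeholders only, since the paper's exact values are not reproduced in this copy.

```python
import numpy as np

# Illustrative manipulator parameters (not the paper's values)
m1, m2 = 1.0, 1.0                        # link masses [kg]
l1, l2 = 1.0, 1.0                        # link lengths [m]
f11, f12, f21, f22 = 0.1, 0.1, 0.1, 0.1  # friction coefficients
g = 9.8                                  # gravity [m/s^2]

def dynamics(theta, dtheta, tau):
    """theta'' = M(theta)^-1 [tau - C(theta,theta') theta' - G(theta) - F(theta')], eq (25)."""
    t1, t2 = theta
    dt1, dt2 = dtheta
    M = np.array([[(m1 + m2) * l1**2 + m2 * l2**2 + 2 * m2 * l1 * l2 * np.cos(t2),
                   m2 * l2**2 + m2 * l1 * l2 * np.cos(t2)],
                  [m2 * l2**2 + m2 * l1 * l2 * np.cos(t2),
                   m2 * l2**2]])
    Cmat = np.array([[-2 * l1 * l2 * m2 * dt2 * np.sin(t2),
                      -l1 * l2 * m2 * dt2 * np.sin(t2)],
                     [l1 * l2 * m2 * dt1 * np.sin(t2), 0.0]])
    G = np.array([g * (m2 * l2 * np.cos(t1 + t2) + (m1 + m2) * l1 * np.cos(t1)),
                  g * m2 * l2 * np.cos(t1 + t2)])
    F = np.array([f11 * dt1 + f12 * np.sign(dt1),
                  f21 * dt2 + f22 * np.sign(dt2)])
    return np.linalg.solve(M, tau - Cmat @ dtheta - G - F)

def simulate(tau_fn, n_steps=1000, dt=1e-3):
    """Forward-Euler rollout collecting (torque, angle) pairs for identification."""
    theta = np.zeros(2)
    dtheta = np.zeros(2)
    log = []
    for k in range(n_steps):
        ddtheta = dynamics(theta, dtheta, tau_fn(k * dt))
        dtheta = dtheta + dt * ddtheta
        theta = theta + dt * dtheta
        log.append((tau_fn(k * dt), theta.copy()))
    return log

# usage: sine-wave identification torques, as in the training experiment
log = simulate(lambda t: np.array([np.sin(t), np.sin(2.0 * t)]))
```

Note that M(θ) is invertible here for any θ2 (its determinant stays positive for these parameters), so `np.linalg.solve` is safe.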

Control. During control, the link masses are suddenly changed by 30% at t = 40 sec for the first link and at t = 70 sec for the second link, as shown in Fig. 4a. The system behaviour is shown in Fig. 4 and Fig. 5. It can be seen that the tracking errors are quickly reduced and the outputs accurately follow the constant references. When the references are sine waves, the system needs a longer time to adapt to the disturbances; it starts to accurately follow the reference at t = 400 sec. In Fig. 4b, 4c, 5b and 5c, dotted lines are references and solid lines are robot positions. During control, the model parameters are updated with the constant learning rate η = 0.03.

Fig. 4. Control with constant references: a) mass changes; b) link 1 position; c) link 2 position; d) weight update

Fig. 5. Control with sine references: b) link 1 position; c) link 2 position; d) update of weight

5. CONCLUSIONS

A dynamic neural network based algorithm for learning control of an unknown nonlinear continuous-time multiple-input multiple-output plant with unmeasurable states has been proposed in this paper. A novel dynamic neural network structure is utilised to model the unknown plant dynamics through modelling the input-output mapping. Feedback linearization with the network states has been applied to design a controller for the neural model. The network has been trained in the closed loop by a gradient based update of the weights over a moving measurement window at discrete time instants. Stability of the system has been analysed through ultimate boundedness of all signals. The learning controller has been applied to a double link robot arm. Simulation results have shown that disturbances can be rejected by on-line adaptation of the model weights during control, and that boundedness of the tracking error can be achieved in spite of the modelling error.

REFERENCES

Brdys, M. A., and Kulawski, G. J. (1999). "Dynamic neural controllers for induction motor." IEEE Transactions on Neural Networks, 10(2), 340-355.
Brdys, M. A., Kulawski, G. J., and Quevedo, J. (1998). "Recurrent networks for nonlinear adaptive control." IEE Proceedings - Control Theory and Applications, 177-188.
Brdys, M. A., and Nguyen, D. T. (2006). "Dynamic neural network identification and control under unmeasurable plant states." Glasgow International Conference Control 2006.
Harris, C. J., Lee, T. H., and Ge, S. S. (1998). Adaptive neural network control of robotic manipulators, World Scientific.
Horn, R. A., and Johnson, C. R. (1985). Matrix analysis, Cambridge University Press.
Khalil, H. K. (2002). Nonlinear systems, Prentice Hall.
Poznyak, A. S., Yu, W., Sanchez, E. N., and Perez, J. P. (1999). "Nonlinear adaptive trajectory tracking using dynamic neural networks." IEEE Transactions on Neural Networks, 10(6), 1402-1411.
Poznyak, A. S., Sanchez, E. N., and Yu, W. (2001). Differential neural networks for robust nonlinear control - Identification, state estimation and trajectory tracking, World Scientific.
Ren, X. M., Rad, A. B., Chan, P. T., and Lo, W. L. (2003). "Identification and control of continuous-time nonlinear systems via dynamic neural networks." IEEE Transactions on Industrial Electronics, 50(3), 478-486.