PII: S0005–1098(98)00012–0
Automatica, Vol. 34, No. 5, pp. 641-650, 1998. © 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. 0005-1098/98 $19.00+0.00
Brief Paper
Stable Neural Controllers for Nonlinear Dynamic Systems*

Ssu-Hsin Yu† and Anuradha M. Annaswamy‡

Key Words—Neural networks; optimization; nonlinear control; nonlinear filters; stability analysis; control system design.

Abstract—In this paper, a stability-based approach is introduced to design neural controllers for nonlinear systems. The requisite control input is generated as the output of a neural network, which is trained off-line such that the time derivative of a positive definite function of the state variables becomes negative at all points. By using the successfully trained networks as controllers, the closed-loop system can be made stable. The stability framework introduced permits the generation of more efficient algorithms that lead to a larger region of stability for a wide class of nonlinear systems. © 1998 Elsevier Science Ltd. All rights reserved.
*Received 8 May 1996; revised 11 October 1997; received in final form 5 December 1997. This paper was recommended for publication by Associate Editor K. W. Lim under the direction of Editor C. C. Hang. Corresponding author Professor Anuradha M. Annaswamy. Tel. +1 617 253 0860; Fax +1 617 258 9346; E-mail [email protected].
†Scientific Systems Company, Inc., 500 West Cummings Park, Suite 3000, Woburn, MA 01801, U.S.A.
‡Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A.

1. INTRODUCTION

In recent years, there has been considerable research activity in the use of neural networks for identification and control of nonlinear systems. An increasing demand on performance specifications and the concomitant complexity of dynamic systems mandate the use of sophisticated information processing and control in almost all branches of engineering. The promise of fast computation, the versatile representational ability for nonlinear maps, fault tolerance, and the capability to generate quick, robust, sub-optimal solutions make neural networks an ideal candidate for carrying out such sophisticated identification and control tasks. Problems in nonlinear identification and control can be viewed as the determination of a nonlinear map between two quantities in the system. For instance, the function of a controller is a nonlinear map between system errors and the control input; the nonlinear relation may correspond to that of an identifier if the quantities are system inputs and system outputs. Neural networks have recently been applied to such problems by a number of researchers, such as Narendra and Parthasarathy (1990), Sanner and Slotine (1992), Chen and Khalil (1995) and Sjöberg et al. (1994).

In most of the above results, the role of the neural network is usually that of a model that can mimic a nonlinear input-output relation. In Narendra and Parthasarathy (1990), a neural network is trained to approximate various nonlinear functions in nonlinear systems. The information is then used by a controller to feedback-linearize the system. Since a neural network is only an approximation to the underlying nonlinear system, there is always a residual error between the true system and the network model. Stability issues arise when the network is trained on-line and simultaneously used to control the system. To overcome this problem, the deadzone modification, similar to that in adaptive control, has been introduced (Chen and Liu, 1994) to guarantee closed-loop stability in the presence of small residual error. Furthermore, to update the weights of the neural network on-line, gradient-like algorithms are commonly used. However, it is well known in adaptive control that a brute-force correction of controller parameters based on the gradients of output errors can result in instability even for some classes of linear systems (Parks, 1966; Osburn et al., 1961). Hence, to avoid the possibility of instability during on-line adaptation, some researchers proposed using networks such as radial basis functions, where the variable network parameters occur linearly in the network outputs, so that a stable updating rule can be obtained (Sanner and Slotine, 1992). Up to this point, the development of nonlinear control using neural networks parallels that of linear adaptive control, and many ideas carry over directly. Unfortunately, unlike linear adaptive control, where a general controller structure to stabilize a system can be obtained with only the knowledge of relative degrees, stabilizing controllers for nonlinear systems are hard to determine. As a consequence, most research on neural-network-based controllers focuses on nonlinear systems whose stabilizing controllers are readily available once some unknown nonlinear parts are identified, such as $x^{(n)} = f(x^{(n-1)}, \ldots, x) + bu$ with full state feedback, where $f$ is to be estimated by a neural network. Even though some approaches have been suggested for utilizing neural networks in the context of a general controller structure (Jordan and Rumelhart, 1992; Levin and Narendra, 1993), the stability implications are unknown. Furthermore, since a neural network controller can have many weights, whether the network can converge fast enough to achieve good performance is questionable.

In the absence of general controller structures for nonlinear systems, using neural networks to construct stabilizing controllers also appears in the literature, especially in the discrete-time case. A straightforward approach is to create a neural network map that inverts the dynamics of the nonlinear system. Once the network is trained successfully, given a desired trajectory, the network sends appropriate input signals to drive the system to follow that trajectory (Jordan and Rumelhart, 1992). Since inverting the dynamics may be too restrictive and may sometimes result in huge control signals that are not practical for some systems, several researchers proposed using a series of neural networks to drive a dynamic system to the desired target gradually (Nguyen and Widrow, 1990; Antony and Acar, 1994; Levin and Narendra, 1993). Each network in the series corresponds to a map from the measurement to the input signals of the system at a particular time. At different time steps, the networks are used alternately to generate the control input. In general, the number of steps is chosen arbitrarily or by trial and error. Not surprisingly, the complexity of training increases dramatically as the number of steps increases.

A similar approach to controller design is to formulate it as an optimization problem. The task of obtaining the optimal controller is equivalent to solving a set of partial differential equations describing the optimality conditions.
When the dynamic system under consideration is nonlinear, the analytical solution of the optimality conditions often becomes extremely difficult to obtain. Due to the function-representation ability of neural networks, some researchers limit the search for the optimal controllers to particular neural network structures so that simpler but sub-optimal solutions can be obtained (Nguyen and Widrow, 1990; Parisini and Zoppoli, 1994; Moran and Nagai, 1994; Antony and Acar, 1994). This consists of training N neural networks to solve the N-stage optimal control problem, with each network supplying the optimal input at a different stage, and applying them sequentially when they are implemented. As in the previous approach, the complexity grows quickly as the number of stages increases. To address this, a modification has been made that uses
the same neural network for every stage (Parisini and Zoppoli, 1994). Training can be done similarly to the previous approach. In all of these optimization approaches, the number of stages is chosen arbitrarily and is finite. Moreover, they can be applied successfully only to discrete-time dynamic systems. Since only sub-optimal solutions can be found, whether these solutions can still ensure stability is unclear.

In this paper, we propose a stability-based approach for designing neural controllers. We consider the case where the linearized system is assumed to be stabilizable using static output feedback. By training the neural controller appropriately, we show that the closed-loop system can be made stable. The framework of such an off-line training procedure permits us to develop more efficient algorithms based on nonlinear programming. We discuss one such algorithm in this paper as well. The paper is outlined as follows. The problem under consideration is described in Section 2. The neural-network-based method for controller design is explained in Section 3, in which the training procedure is described in Section 3.1, and the existence of a controller and the stability property of the closed-loop system are shown in Section 3.2. Simulation results of the proposed controller are shown in Section 4, where the resulting performance is compared with that of a linear controller.

2. STATEMENT OF THE PROBLEM
Consider the following nonlinear dynamic system

$\dot x = f(x, u), \quad y = h(x),$  (1)
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ and $y \in \mathbb{R}^l$. The determination of a nonlinear controller $u = c(y, t)$ to stabilize (1) for general $f$ and $h$ is a difficult task even if $f$ and $h$ are known. For systems which are feedback linearizable, although such a $c$ exists, closed-form solutions for $c$ cannot always be obtained. Our goal in this paper is to construct a neural controller as

$u = N(y; W),$  (2)

where $N$ is a neural network with weights $W$, and establish the conditions under which the closed-loop system is stable. The nonlinear system in equation (1) is expressed as a combination of a linear part and a higher-order nonlinear part as

$\dot x = Ax + Bu + R_1(x, u), \quad y = Cx + R_2(x),$  (3)

where $f(0, 0) = 0$, $h(0) = 0$, $A = \partial f(x,u)/\partial x|_{(x,u)=(0,0)}$, $B = \partial f(x,u)/\partial u|_{(x,u)=(0,0)}$ and $C = \mathrm{d}h(x)/\mathrm{d}x|_{x=0}$.
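When closed-form Jacobians are inconvenient, the linearization in equation (3) can be computed numerically. A minimal sketch (our code, not from the paper) using central finite differences about the equilibrium; any automatic differentiation tool would serve equally well:

```python
import numpy as np

def linearize(f, h, n, m, eps=1e-6):
    """Estimate A, B, C of eq. (3) by central differences at (x, u) = (0, 0)."""
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    x0, u0 = np.zeros(n), np.zeros(m)
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        A[:, j] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x0, u0 + du) - f(x0, u0 - du)) / (2 * eps)
    l = len(h(x0))
    C = np.zeros((l, n))
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        C[:, j] = (h(x0 + dx) - h(x0 - dx)) / (2 * eps)
    return A, B, C

# Hypothetical example system, full-state output
A_hat, B_hat, C_hat = linearize(
    lambda x, u: np.array([x[1], -2.0*x[0] - 3.0*x[1] + u[0]]),
    lambda x: x.copy(), n=2, m=1)
```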
We make the following assumptions:

(A1) $f$, $h$ are twice continuously differentiable and are completely known.
(A2) There exists a $K$ such that $(A - BKC)$ is Hurwitz.

The problem is therefore to determine a controller as in equation (2), the structure of $N$, and a procedure for training the weights of $N$ so that the closed-loop system is stable. Condition (A2) implies that a static compensator is sufficient to stabilize equation (1). Extension to the case when an observer-based dynamic compensator is used instead can be found in Yu and Annaswamy (1997). In order for the neural networks to approximate any continuous function, the class of neural networks used in this paper must satisfy the property of being a "universal approximator" (Hornik et al., 1989). The following definition describes such a class of neural networks $\mathcal{N}$. Let $\Gamma$ be the set of all continuous mappings of $\mathbb{R}^n$ into $\mathbb{R}^m$, and $\mathcal{N}$ be a subset of $\Gamma$.

Definition 1. Given $\epsilon > 0$ and a compact set $\Omega \subset \mathbb{R}^n$, for every $q \in \Gamma$ there exists an $N \in \mathcal{N}$ such that $|N(x) - q(x)| < \epsilon$ for every $x \in \Omega$.

Definition 1 implies that for any continuous function $q \in \Gamma$, we can find an $N \in \mathcal{N}$ to approximate the function uniformly to the desired degree of accuracy on a compact set. For example, the multilayered neural networks (MNN) (Hornik et al., 1989) and the radial basis functions (RBF) (Park and Sandberg, 1991) qualify as such a class of neural networks. Without loss of generality, we assume that all neural networks considered in this paper have this property.
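The universal-approximator property of Definition 1 is easy to illustrate numerically. The sketch below (our illustration, not from the paper) fits a Gaussian RBF network to a continuous function on a compact interval by least squares and checks the sup-error on a sampled grid:

```python
import numpy as np

def rbf_features(x, centers, width):
    # Gaussian bases exp(-(x - c)^2 / (2 width^2))
    return np.exp(-(x[:, None] - centers[None, :])**2 / (2 * width**2))

# Approximate a continuous q (here tanh, chosen arbitrarily) on [-1, 1]
q = np.tanh
x = np.linspace(-1.0, 1.0, 200)
centers = np.linspace(-1.0, 1.0, 25)
Phi = rbf_features(x, centers, width=0.15)
w, *_ = np.linalg.lstsq(Phi, q(x), rcond=None)
err = np.max(np.abs(Phi @ w - q(x)))   # sup-error on the sampled grid
```

More centers (or smaller widths near regions of rapid variation) shrink the achievable error, as Definition 1 requires.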
3. THE NEURAL CONTROLLER

As mentioned in Section 2, the structure of the neural network is chosen so that it has the property of a universal approximator. With an input $y$ in equation (2), the neural controller is therefore completely determined once the weights of the network are selected. Since this selection should be such that the closed loop is stable, the following approach is adopted. It is noted that while the controller and the training procedure are described for the continuous system in equation (1), a similar result can be derived for nonlinear discrete-time systems as well.

In order for the nonlinear controller in equation (2) to result in an asymptotically stable closed-loop system, it is sufficient to establish that a continuous positive definite function of the state variables decreases monotonically through output feedback. In other words, if we can find a scalar positive definite function with a negative definite derivative at all points in the state space, we can guarantee stability of the overall system. Here, the choices of the Lyapunov function candidates are limited to the quadratic form, i.e. $V = x^T P x$, where $P$ is positive definite, and the goal is to choose the controller so that $\dot V < 0$, where

$\dot V = 2 x^T P f(x, N(h(x); W)).$  (4)

It is known from Artstein (1983) that if there is a smooth Lyapunov function of the closed-loop system in equation (1), then there must exist a corresponding continuous controller. Sontag (1989) further proposed a way to construct the controller explicitly. However, that method can only be applied to systems which are affine in the control and with full-state feedback. The approach suggested here, on the other hand, does not require such assumptions. Furthermore, it is much easier to incorporate other requirements into the approach of constructing the neural controller, which is essential in improving performance of the closed-loop systems. To ensure $\dot V < 0$, we define a desired time derivative $\dot V_d$ as

$\dot V_d = -x^T Q x$, where $Q = Q^T > 0$.  (5)

We choose the $P$ and $Q$ matrices as follows. First, a matrix $K$ making $(A - BKC)$ asymptotically stable exists according to assumption (A2). In general, such a $K$ can be easily obtained from linear controller design methods such as the LQ scheme or pole placement. A $(P, Q)$ pair can subsequently be found by choosing an arbitrary positive definite matrix $Q$ and solving the following Lyapunov equation to obtain the positive definite $P$ (Narendra and Annaswamy, 1989):

$(A - BKC)^T P + P(A - BKC) = -Q.$  (6)
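The Lyapunov equation (6) can be solved numerically. A minimal sketch (our code; library routines such as SciPy's Lyapunov solver could equally be used) via Kronecker-product vectorization:

```python
import numpy as np

def solve_lyapunov(Acl, Q):
    """Solve Acl^T P + P Acl = -Q for P, eq. (6) with Acl = A - B K C.
    Uses vec(Acl^T P + P Acl) = (I (x) Acl^T + Acl^T (x) I) vec(P),
    with column-major vectorization."""
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    p = np.linalg.solve(M, (-Q).flatten(order='F'))
    P = p.reshape((n, n), order='F')
    return 0.5 * (P + P.T)   # symmetrize against round-off

# A Hurwitz example matrix (illustrative, not from the paper)
Acl = np.array([[0.0, 1.0], [-2.0, -3.0]])
P = solve_lyapunov(Acl, np.eye(2))
```

Since $Acl$ is Hurwitz and $Q > 0$, the resulting $P$ is symmetric positive definite, as required for the candidate $V = x^T P x$.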
Based on the linearization principle (Sontag, 1990, pp. 170-173), it follows that there is at least a linear controller $u = -Ky$ which can make the derivative of $V = x^T P x$ as close to $-x^T Q x$ as possible in a neighborhood of the origin with the above choices of $P$ and $Q$. Our aim is to use a controller as in equation (2) so that, by allowing nonlinearity, the neural controller can achieve a larger basin of convergence. With the controller of the form of equation (2), the goal is to find $W$ in equation (2) which yields

$\dot V \le \dot V_d$  (7)

along the trajectories in a compact set $X \subset \mathbb{R}^n$ containing the origin in the state space. Let $x_i$ denote the value of a sample point of $x \in X$ in the state space, where $i$ is an index. To establish equation (7), it is necessary that for every $x_i$ in a neighborhood $X \subset \mathbb{R}^n$ of the origin, $\dot V_i \le \dot V_{di}$, where
$\dot V_i = 2 x_i^T P f(x_i, N(h(x_i); W))$ and $\dot V_{di} = -x_i^T Q x_i$. That is, the goal is to find a $W$ such that the inequality constraints

$\Delta V_{ei}(W) \le 0, \quad i = 1, \ldots, M,$  (8)

are satisfied, where $\Delta V_{ei} = \dot V_i - \dot V_{di}$ and $M$ denotes the total number of sample points in $X$. Alternatively, this can be posed as an optimization problem with no cost function, subject to the inequality constraints in equation (8). As is common practice in optimization, the inequality constraints can be converted into equality constraints by adding additional variables $z = [z_1, \ldots, z_M]^T$:

$\Delta V_{ei}(W) + z_i^2 = 0, \quad i = 1, \ldots, M.$

Under the above equality constraints, if the quadratic penalty function method (Bertsekas, 1995) is used, we can formulate the problem as finding a $W$ to minimize the following augmented Lagrangian for various $\lambda$:

$L_c(W, z, \lambda) = \sum_{i=1}^{M} \left\{ \lambda_i (\Delta V_{ei}(W) + z_i^2) + \frac{c}{2} |\Delta V_{ei}(W) + z_i^2|^2 \right\},$  (9)

where $\lambda = [\lambda_1, \ldots, \lambda_M]^T$ is the Lagrange multiplier and $c$ is a positive constant. It can be observed from equation (9) that for large $c$, $L_c$ can be large if the constraints are not satisfied. Hence, as $c$ increases, the solution tends to be near the constraints. As a matter of fact, if $\lambda$ is equal to the optimal value of the Lagrange multiplier, there is a finite $\bar c$ such that when $c > \bar c$, the $W$ that minimizes the augmented Lagrangian is also the solution of the original optimization problem (Bertsekas, 1995). Minimization of $L_c$ with respect to $z$ can first be carried out analytically, as shown in Bertsekas (1995). The resulting problem becomes finding a $W$ which minimizes

$\frac{1}{2c} \sum_{i=1}^{M} \left\{ (\max\{0, \lambda_i + c\,\Delta V_{ei}(W)\})^2 - \lambda_i^2 \right\}.$  (10)

For general optimization problems, the steps that follow consist of a sequence of minimizations for recursively varying $\lambda$ and steadily increasing $c$. However, for this special case involving only inequality constraints, only one minimization is required. It can be observed from equation (10) that for any $\lambda$, as long as $\Delta V_{ei} \le 0$ for $1 \le i \le M$, the minimal cost $-\sum_{i=1}^{M} \lambda_i^2/2c$ as well as the inequality constraints can always be achieved with a large enough $c$. Hence, equation (10) can be further reduced to the following problem:

$\min_W J \triangleq \min_W \frac{1}{2} \sum_{i=1}^{M} (\max\{0, \Delta V_{ei}\})^2.$  (11)
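The reduced cost of equation (11) is a hinge-squared penalty on the constraint violations; a minimal sketch (our code, with array arguments as assumed inputs):

```python
import numpy as np

def cost_J(dV, dV_d):
    """J = 1/2 * sum_i max(0, dV_i - dV_d,i)^2, eq. (11).
    dV and dV_d hold the sampled values of V-dot and the desired V-dot."""
    dV_e = dV - dV_d                              # constraint values, eq. (8)
    return 0.5 * np.sum(np.maximum(0.0, dV_e)**2)

# Only samples violating dV <= dV_d contribute to the cost.
J = cost_J(np.array([-1.0, 0.5]), np.array([0.0, 0.0]))   # -> 0.125
```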
It is worth noting that, though we focus on satisfying the constraints in equation (7) in the above derivation, $\dot V_i \le 0$ is all that is needed to assure nominal stability. There are several other ways to achieve this. For example, training the network to minimize $\sum_{i=1}^{M} \dot V_i$ subject to the constraint $\dot V_i < 0$ for $1 \le i \le M$ suffices to ensure the condition. If the same penalty function method is used, the problem involves finding a sequence of weights, $W^{(k)}$, which minimizes the following augmented Lagrangian:

$L_{c^{(k)}}(W, \lambda^{(k)}) = \sum_{j=1}^{M} \dot V_j(W) + \frac{1}{2c^{(k)}} \sum_{j=1}^{M} \left\{ (\max\{0, \lambda_j^{(k)} + c^{(k)} \dot V_j(W)\})^2 - \lambda_j^{(k)\,2} \right\},$

where the Lagrange multipliers $\lambda_j^{(k)}$ and $c^{(k)}$ are updated recursively as $\lambda_j^{(k+1)} = \max\{0, \lambda_j^{(k)} + c^{(k)} \dot V_j(W^{(k)})\}$, $c^{(k+1)} = \beta c^{(k)}$ with $\beta > 1$. Since the process of finding the weights is equivalent to solving a sequence of unconstrained problems due to the presence of $\sum_{i=1}^{M} \dot V_i$ in the original cost function, it takes a longer time to train the network. Nevertheless, this approach can potentially result in a larger stable region, since less stringent constraints are imposed.

Since the network $N$ is trained only at sample points, a question that arises is whether $N$ would result in $\dot V < 0$ between samples. To ensure this, we can resort to the smoothness property of the system in equation (1) and of $N$ itself. If we differentiate equation (4) with respect to $x$, the result depends on $f$, $\partial f/\partial x$, $\partial f/\partial u$, $\mathrm{d}h/\mathrm{d}x$ and $\partial N/\partial y$. For continuously differentiable $N$, such as MNN and RBF, and functions $f$ and $h$, these terms are bounded in a compact region as well. $\mathrm{d}\dot V/\mathrm{d}x$ is thus bounded in the compact region. Therefore, to guarantee that $\dot V < 0$ at sample points implies $\dot V < 0$ between samples, the sample points have to be close enough and $\dot V < -\epsilon_v$ for some $\epsilon_v > 0$, so that $\dot V + \|\Delta x\| \sup_{x \in X} \|\mathrm{d}\dot V/\mathrm{d}x\|$ remains negative at the sample points, where $X$ is the compact region of interest and $\Delta x$ denotes the interval between samples.
In practice, this requires that there be enough sample data in the testing set and that $\dot V$ be small enough at these samples.

3.1. Training of the neural controller

To minimize $J$ in equation (11), a training set, which includes samples of the states as well as the data required to evaluate $J$, is first formulated. For a state vector $x_s$ in the training set, the corresponding output of the system is $y_s$. In the forward path of the neural network, $N$ has as its input vector $y_s$, and its output is $u_s$. By using this preliminary $u_s$, we
can calculate $\dot V_s$ from equation (4). $\dot V_s$ is then compared with $\dot V_{ds}$ to obtain the error signal $\Delta V_{es}$. The pattern is used to update the weights $W$ in the network only if $\Delta V_{es} > 0$. This procedure is shown graphically in Fig. 1. In this formulation, as opposed to common neural network training, the outputs of the network do not have a direct target to compare with. Instead, after the outputs pass through another function, there is a distal target $\dot V_{ds}$. This kind of training is also known as 'training with a distal teacher' (Jordan and Rumelhart, 1992). Minimization of $J$ can be performed using methods such as the gradient algorithm and the Levenberg-Marquardt algorithm (Marquardt, 1963). To apply either method, the gradients of the cost function with respect to the weights need to be evaluated first. They are as follows:

$\frac{\partial J}{\partial w_j} = \sum_{i \in A} \frac{\partial J}{\partial N(y_i; W)} \frac{\partial N(y_i; W)}{\partial w_j} = 2 \sum_{i \in A} \Delta V_{ei}\, x_i^T P\, \frac{\partial f(x_i, u_i)}{\partial u_i} \frac{\partial N(y_i; W)}{\partial w_j},$  (12)

where $A$ denotes the set of patterns where $\Delta V_{ei} > 0$, $w_j$ denotes the $j$th element of $W$, and the term $\partial N(y_i; W)/\partial w_j = [\partial N_1(y_i; W)/\partial w_j, \ldots, \partial N_m(y_i; W)/\partial w_j]^T$ represents the gradient of the outputs of the neural network with respect to $w_j$.

The procedure of training a neural network in equation (2) to stabilize equation (1) is delineated in the following steps, where we assume that the Levenberg-Marquardt method is used.

Step 1. Find the linear approximation matrices $A$, $B$ and $C$ of the system in equation (1). Choose a matrix $K$ to make $(A - BKC)$ asymptotically stable.

Step 2. Define a negative definite function $\dot V_d = -x^T Q x$, which is our desired change of the Lyapunov function. The corresponding Lyapunov function candidate $V = x^T P x$ is obtained by solving the Lyapunov equation (6) to find the positive definite matrix $P$.

Step 3. Sample $x$ in a subspace $X \subset \mathbb{R}^n$. Its value is, say, $x_1$. The corresponding output vector $y_1$ and $\dot V_{d1}$ at the particular point $x_1$ can be evaluated as $y_1 = h(x_1)$ and $\dot V_{d1} = -x_1^T Q x_1$, respectively.
A typical data element is of the form $(x_1, y_1, \dot V_{d1})$. By repeating the above procedure for different $x_i \in X$,
Fig. 1. Block diagram of the network training.
where the $x_i$ can be either chosen randomly or evenly spaced, we can then formulate the training set as $T_{\mathrm{train}} = \{(x_i, y_i, \dot V_{di}) \mid 1 \le i \le M\}$. Similarly, a testing set $T_{\mathrm{test}}$ can also be formed for cross-validation. In order to ensure the existence of a map $N(\cdot)$, the training set should initially be composed of samples in a smaller region around the origin, and can be gradually enlarged as training succeeds.

Step 4. For each element $x_i$ in the training set, the outputs of the network can be obtained as $u_i = N(y_i; W)$. Based on this preliminary $u_i$, we can calculate $\dot V_i = 2 x_i^T P f(x_i, u_i)$. Subsequently, $\partial J/\partial w_j$ can be obtained for every $w_j$ using equation (12).

Step 5. Solve the following $n_w$ simultaneous linear equations for $\Delta w_j$:

$\sum_{j=1}^{n_w} a'_{kj} \Delta w_j = b_k, \quad k = 1, \ldots, n_w,$

where $n_w$ is the total number of weights, $a'_{jj} = a_{jj}(1 + \lambda)$, $a'_{kj} = a_{kj}$ for $k \ne j$, $\lambda > 0$ and

$a_{kj} = \frac{1}{2} \frac{\partial J}{\partial w_k} \frac{\partial J}{\partial w_j}, \quad b_k = -\frac{1}{2} \frac{\partial J}{\partial w_k}.$

If the new set of weights, $w_j^{\mathrm{new}} = w_j + \Delta w_j$, results in a smaller $J$, then the $w_j$ are updated accordingly and $\lambda$ is decreased. Otherwise, the $w_j$ remain the same and $\lambda$ is increased.

Step 6. Repeat Steps 4 and 5 until $J$ stops improving.
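Step 5 can be sketched in a few lines. In the code below (our sketch; the small `ridge` regularizer is our addition for numerical safety and is not part of the paper's procedure), the coefficient matrix is formed from the gradient outer product with the diagonal inflated by $(1 + \lambda)$:

```python
import numpy as np

def lm_step(grad_J, lam, ridge=1e-10):
    """One damped step as in Step 5: a_kj = (1/2) dJ/dw_k dJ/dw_j,
    diagonal entries scaled by (1 + lam), b_k = -(1/2) dJ/dw_k.
    Returns the weight increment Delta w."""
    g = np.asarray(grad_J, dtype=float)
    A = 0.5 * np.outer(g, g)
    A = A + lam * np.diag(np.diag(A)) + ridge * np.eye(g.size)
    b = -0.5 * g
    return np.linalg.solve(A, b)

dw = lm_step([1.0, 2.0], lam=1.0)   # approximately [-1/3, -1/6]
```

Large $\lambda$ shrinks the step toward a scaled gradient descent; small $\lambda$ recovers the undamped Gauss-Newton-like step.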
After the training is finished, it is expected that $\dot V_i - \dot V_{di} \le \epsilon$ for a small $\epsilon > 0$ at all the sample points $x_i$. This implies that $\dot V_i < 0$ except possibly at some points where $\dot V_{di}$ is small, which are in a small neighborhood of the origin. This ensures that the states of the closed-loop system converge to that neighborhood when the neural controller in equation (2) is implemented, as we shall see in the next section.

The procedure discussed in this section can also be applied to discrete-time systems with some minor modifications. In particular, the discrete version of the Lyapunov equation

$(A - BKC)^T P (A - BKC) - P = -Q$  (13)

replaces equation (6) to determine the matrix $P$. In addition, $\dot V$ in equation (4) is substituted by the difference of the Lyapunov function candidate:

$\Delta V \triangleq x_{t+1}^T P x_{t+1} - x_t^T P x_t = f^T(x_t, N(h(x_t); W))\, P\, f(x_t, N(h(x_t); W)) - x_t^T P x_t.$

Otherwise, the rest of the training procedure follows the same lines as its continuous-time counterpart. It can be seen from the construction of the training set that, for the same intervals between samples,
the number of samples increases with the number of states of the nonlinear system. Nevertheless, if the system and the underlying controller are smooth, the intervals between samples can be large and, hence, fewer samples are required in the training set.

3.2. Stability of the closed-loop system

In this section, we show that the neural controller proposed in Section 3, with the training procedure, leads to closed-loop stability. First, in order to guarantee that the training is successful, it is necessary that there exist a smooth function which, when used as a controller, can make $|\dot V - \dot V_d|$ as small as possible in a neighborhood of the origin. Theorem 2 establishes this by making use of the linearization principle for stability (e.g. Sontag, 1990). When this smooth function is approximated by a neural network, the universal approximator property in Definition 1 ensures stability of the closed-loop system using the neural controller. For systems where a more complex controller exists to achieve $\dot V - \dot V_d \le 0$ in a larger region, Theorem 3 says that the states of the closed-loop system, with a well-trained neural controller, converge to a neighborhood of the origin.

In the following theorem, $x(t_0) = x_0$ corresponds to the initial conditions of the system in equation (1).

Theorem 2. Consider the closed-loop system in equations (1) and (2), which satisfies the assumptions (A1) and (A2).

(i) For every $\epsilon > 0$, there exist $c_1 > 0$, $0 < c_2 < c_1$ and a neural network $N \in \mathcal{N}$ such that $\dot V - \dot V_d < \epsilon$ for all $x \in X$, where $\dot V$ and $\dot V_d$ are given by equations (4) and (5), and $X = \{x \in \mathbb{R}^n \mid c_2 \le \|x\| \le c_1\}$.

(ii) Given a neighborhood $O$, there exist a neural network $N \in \mathcal{N}$ and an open set $U$ containing the origin such that the solutions converge to $O$ for every $x_0 \in U$.

Proof. From (A2) it follows that $K$ exists such that equation (6) is satisfied. Since $X$ is a closed annulus region in $\mathbb{R}^n$ and $h$ is continuous on $X$, the set $Y = \{y \mid y = h(x),\ x \in X\}$ is compact.
Therefore, from Definition 1 it follows that we can find an $N \in \mathcal{N}$ such that $|N(y) - (-Ky)|$ can be made as small as desired for every $y \in Y$. Let $N(y) = -Ky + E(y)$. Using equation (3), we obtain

$u = -KCx - KR_2(x) + E(y)$  (14)

and the closed-loop system becomes $\dot x = (A - BKC)x + R_4(x)$, where $R_4(x) = R_1(x, u(x)) - BKR_2 + BE$. Since $V = x^T P x$ and $\dot V = -x^T Q x + 2 x^T P R_4(x)$, it follows that $\dot V - \dot V_d = 2 x^T P R_4(x)$. We need to establish, therefore, that an annulus region $X$ exists such that for every $x \in X$, $|2 x^T P R_4(x)| < \epsilon$.

Since $f \in C^{(2)}$, according to Taylor's theorem (Rudin, 1976), $\lim_{\|(x,u)\| \to 0} [\|R_1(x,u)\|/\|(x,u)\|] = 0$. Hence, for every $\epsilon_1 > 0$, there exists $\delta_1 > 0$ such that

$\|R_1(x, u)\| < \epsilon_1 \|(x, u)\|$ if $\|(x, u)\| < \delta_1$.  (15)

Similarly, since $h \in C^{(2)}$, $\lim_{\|x\| \to 0} [\|R_2(x)\|/\|x\|] = 0$ and thus, for every $\epsilon_2 > 0$, there exists $\delta_2 > 0$ such that

$\|R_2(x)\| < \epsilon_2 \|x\|$ if $\|x\| < \delta_2$.  (16)

From equations (14) and (16), it follows that, if $\|x\| < \delta_2$,

$\|(x, u)\| < (1 + \|KC\| + \epsilon_2 \|K\|) \|x\| + \|E\| = c_3 \|x\| + \|E\|.$  (17)

By using equations (15) and (17), we obtain $\|R_1(x, u)\| < \epsilon_1 (c_3 \|x\| + \|E\|)$ if $\|x\| < \delta_2$ and $\|(x, u)\| < \delta_1$.

For a given $\bar c > 0$, let $\epsilon_2 = \bar c/3\|BK\|$, $\epsilon_1 = \bar c/3c_3$, $c_1 = \min(\delta_1/2c_3, \delta_2/2)$ and $0 < c_2 < c_1$. For these choices of $\epsilon_1$, $\epsilon_2$, $c_1$ and $c_2$, if $x \in X$, then $\|x\| < \delta_2$ and $\delta_1 > c_3\|x\|$, which implies that $\|(x, u)\| < \delta_1$ can indeed be satisfied for a small enough $\|E\|$. From Definition 1, and since $Y$ is a compact set, for every $\epsilon_3 > 0$ we can find $N \in \mathcal{N}$ such that $\|E(y)\| < \epsilon_3$. If $\epsilon_3 = \min(\bar c c_2/3(\epsilon_1 + \|B\|),\ \delta_1 - c_1 c_3)$, then $\|(x, u)\| < \delta_1$ and furthermore

$\|R_4(x)\| \le \|R_1(x, u)\| + \|BK R_2(x)\| + \|B E(y)\| < (c_3 \epsilon_1 + \|BK\| \epsilon_2)\|x\| + (\epsilon_1 + \|B\|)\epsilon_3 \le \bar c \|x\|$

for every $x \in X$. Since $\bar c$ is arbitrary, we can choose $\bar c$ so that $|2 x^T P R_4(x)| < \epsilon$ for every $x \in X$, which establishes (i).

If we choose $\bar c < \lambda_{\min}(Q)/2\lambda_{\max}(P)$, then $\dot V < -(\lambda_{\min}(Q) - 2\bar c\,\lambda_{\max}(P))\|x\|^2 < 0$ for every $x \in X$. Define $U_1 = \{x \in \mathbb{R}^n \mid \|x\| = c_1\}$, $U_2 = \{x \in \mathbb{R}^n \mid \|x\| = c_2\}$, $\alpha_1 = \min_{x \in U_1} V(x)$, $\alpha_2 = \max_{x \in U_2} V(x)$, $U = \{x \in \mathbb{R}^n \mid V(x) < \alpha_1\}$ and $O_2 = \{x \in \mathbb{R}^n \mid V(x) < \alpha_2\}$, where $\alpha_1 > 0$ and $\alpha_2 > 0$ exist since $U_1$ and $U_2$ are compact. Choose $c_2$ to be small enough so that $\alpha_2 < \alpha_1$ and $O_2 \subset O$. Such a $c_2$ can always be found since $V(x) \to 0$ as $x \to 0$. Furthermore, $\dot V < 0$ if $x \in U \setminus O_2$. Therefore, according to LaSalle (1962), the solutions of equations (1) and (2) converge to $O_2$, and thus to $O$, for every $x_0 \in U$.
□

The above theorem establishes the existence of a stabilizing neural network controller around the origin such that $\dot V - \dot V_d < \epsilon$ for every $x \in X$. The
proof was given essentially along the same lines as the proof of the linearization principle for stability (e.g. Sontag, 1990). The important modification is the introduction of a neural network which replaces the stabilizing linear controller. For systems where a continuous but unknown function $\kappa(y)$ exists such that for $V = x^T P x$ the control input $u = \kappa(y)$ yields $\dot V \le -x^T Q x$, we can find a neural network $N(y)$ that approximates $\kappa(y)$ arbitrarily closely in a compact set, leading to closed-loop stability. In general, if a continuous positive definite function $V(x)$, not necessarily quadratic, is chosen and there exists a nonlinear control input resulting in a negative definite $\dot V(x)$, then we can train a neural network to approximate such a controller. This is summarized in Theorem 3.

Theorem 3. Let there be a continuous function $\kappa : h(X) \to \mathbb{R}^m$ such that $(\mathrm{d}V/\mathrm{d}x) f(x, \kappa(h(x))) + M(x) \le 0$ for every $x \in X$, where $X \subset \mathbb{R}^n$ is an open neighborhood of the origin, $V(x) : X \to \mathbb{R}$ is positive definite and continuously differentiable in $X$, and $M(x)$ is positive definite for every $x \in X$. Given a neighborhood $O$ of the origin, there exist a neural controller $u = N(h(x); W)$ and an open neighborhood $Y \subset X$ of the origin such that the solutions of $\dot x = f(x, N(h(x); W))$ converge to $O$ for every $x_0 \in Y$.

Proof. We first find a compact set $X_1 \subset X$ containing neighborhoods of the origin. Choose $c$ to be small enough so that the level set $L_v(c) = \{x \in X \mid V(x) \le c\}$ is contained in $X_1$. Then we can follow a similar procedure as in the last theorem to show that the solutions remain in $L_v(c)$ and eventually converge to the neighborhood $O$. □

It can be seen from Theorems 2 and 3 that, as opposed to most analytically expressed controllers, where zero steady-state error can be achieved under the disturbance-free condition, the neural controllers generally result in some small steady-state error due to their approximating nature.
To ensure the states converge as close as possible to the origin, small $c_2$ and, hence, $\epsilon_3$ are required in Theorem 2. To achieve this for an RBF network, for example, more centers need to be placed near the origin so that the approximation error can be small. For an MNN, this implies that more neurons are required. However, it inevitably takes a longer time to train the networks. Hence, there is a tradeoff between network complexity and accuracy of the neural controller. An alternative is to switch to a linear controller once the states converge to a neighborhood of the origin where the linear controller is applicable.
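The switching alternative above can be sketched in a few lines (our sketch; the function name, the radius parameter `r`, and the full-state feedback assumption $y = x$ are ours, not the paper's):

```python
import numpy as np

def hybrid_control(x, neural_net, K, r):
    """Use the neural controller away from the origin and fall back to
    the linear law u = -K x inside the ball ||x|| < r, where the linear
    design is known to be valid."""
    if np.linalg.norm(x) < r:
        return -K @ x            # linear controller near the origin
    return neural_net(x)         # trained neural controller elsewhere
```

In practice the switching radius would be chosen inside the region where the linear controller's $\dot V$ is verified negative, so that the switch cannot destabilize the loop.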
4. SIMULATION RESULTS
The stable neural controller presented in this paper is demonstrated in this section through a simulation example. The neural controller is also compared with a linear controller to illustrate the difference. This example concerns a second-order nonlinear system $x_t = f(x_{t-1}, u_{t-1})$, where

$f(x_{t-1}, u_{t-1}) = \begin{bmatrix} x_{1,t-1}(1 + x_{2,t-1}) + x_{2,t-1}(1 - u_{t-1} + u_{t-1}^2) \\ x_{1,t-1}^2 + 2 x_{2,t-1} + u_{t-1}(1 + x_{2,t-1}) \end{bmatrix},$
and $u_t$ and $x = [x_1, x_2]^T$ denote the input and state vector, respectively. It is assumed that $x$ is measurable, and we wish to stabilize the system around the origin. It can be verified that the above system is not feedback-linearizable, and no other stabilizing nonlinear controllers can be found by inspection. A neural controller of the form $u_t = N(x_{1,t}, x_{2,t})$ is used to stabilize the above system, where $N$ is an RBF network with Gaussian bases, inputs $x_{1,t}$ and $x_{2,t}$, output $u_t$, 289 centers, and variances of the bases equal to 0.0120417. The selection process of the centers and variances is given in Section 3. The linearized system in equation (3) is given by

$A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}, \quad B = [0, 1]^T \quad \text{and} \quad C = I_{2 \times 2}.$

Controllability and observability of $A$, $B$ and $C$ can be easily verified. A choice of $K = [0.1429, 1.7752]$, obtained by using the LQ scheme, results in the eigenvalues of $(A - BK)$ being within the unit circle. By choosing $Q$ as an identity matrix, we can calculate the matrix $P$ from the discrete-time Lyapunov equation. The training set is constructed by sampling both $x_1$ and $x_2$ between $\pm 0.2$, with a total of 961 points. We also form a testing set composed of 121 points distributed in the same range for cross-validation purposes. The network is then trained using the Levenberg-Marquardt method to make $\Delta V \le -x^T Q x$ at all points in the training set, where $\Delta V = f^T(x, N(x; W))\, P\, f(x, N(x; W)) - x^T P x$. In Fig. 2, we show how $\Delta V$ evolved during training after 1, 3, 5 and 11 epochs. It can be seen that after only 11 epochs using the Levenberg-Marquardt method, $\Delta V$ becomes negative in most of the region except around the origin. After the training is finished, the actual changes of the function, $\Delta V$, using the linear controller $u = -Kx$ and the neural controller are plotted in Fig. 3a and b, respectively.
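The linear-controller side of this example can be reproduced in outline. The sketch below (our code, not the authors'; the trained RBF controller is omitted) solves the discrete Lyapunov equation (13) by vectorization and evaluates $\Delta V$ along one step of the linear law $u = -Kx$:

```python
import numpy as np

# Example system from the text
def f(x, u):
    return np.array([x[0]*(1 + x[1]) + x[1]*(1 - u + u**2),
                     x[0]**2 + 2*x[1] + u*(1 + x[1])])

A = np.array([[1.0, 1.0], [0.0, 2.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.1429, 1.7752]])
Acl = A - B @ K
Q = np.eye(2)

# Discrete Lyapunov equation (13): Acl^T P Acl - P = -Q, by vectorization
M = np.kron(Acl.T, Acl.T) - np.eye(4)
P = np.linalg.solve(M, (-Q).flatten(order='F')).reshape(2, 2, order='F')
P = 0.5 * (P + P.T)   # symmetrize against round-off

def dV(x):
    u = (-K @ x).item()               # linear controller u = -K x
    x_next = f(x, u)
    return x_next @ P @ x_next - x @ P @ x   # Delta V over one step
```

Evaluating `dV` over the $\pm 0.2$ sampling grid reproduces the qualitative picture of Fig. 3a: negative near the origin, but positive in parts of the region away from it.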
It can be observed from the two figures that if the neural controller is used, ΔV is negative definite except in a small neighborhood of the origin, which assures that the closed-loop system converges to a vicinity of the origin (LaSalle, 1962); whereas, if the linear controller is used, ΔV becomes positive in some regions away from the origin, which implies that the system may become unstable for some initial conditions. The
Fig. 2. Evolution of ΔV during training: (a) 1 epoch, (b) 3 epochs, (c) 5 epochs, and (d) 11 epochs.
Fig. 3. Comparison of ΔV with linear and neural controllers: (a) ΔV, linear controller; (b) ΔV, neural controller.
reason for the larger region of stability of the neural controller can be seen from Fig. 4. As shown there, the linear control function is restricted to a plane in the state-input space, whereas the neural controller allows the control input to be a nonlinear function of the state. As a result, the neural controller can stabilize regions where the linear controller fails. Figure 4b shows the difference in the input signal between the linear controller and the neural controller with respect to the state vector x. It can be seen that the nonlinearity has the largest effect in the region where the linear controller fails to stabilize the system. Simulation results for an initial state located in the region where ΔV of the linear controller is positive are shown in Fig. 5a and b for the linear and the neural controllers, respectively. The instability of the closed-loop system using the linear controller confirms our observation.
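The ΔV map of Fig. 3a can be sketched numerically for the linear controller u = -Kx (the trained RBF network is not reproduced here, so only the linear case is evaluated). The dynamics f, the gain K, and the ±0.2 sampling range with 961 grid points follow the text; the NumPy/SciPy calls are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def f(x, u):
    """Dynamics of the example system (from the text)."""
    x1, x2 = x
    return np.array([x1 * (1 + x2) + x2 * (1 - u + u**2),
                     x1**2 + 2 * x2 + u * (1 + x2)])

A = np.array([[1.0, 1.0], [0.0, 2.0]])   # linearization at the origin
B = np.array([0.0, 1.0])
K = np.array([0.1429, 1.7752])           # LQ gain from the text
Q = np.eye(2)

# P solves (A - BK)^T P (A - BK) - P = -Q (discrete-time Lyapunov equation).
P = solve_discrete_lyapunov((A - np.outer(B, K)).T, Q)

def delta_V(x, u):
    """Change of the Lyapunov function: f(x,u)^T P f(x,u) - x^T P x."""
    fx = f(x, u)
    return fx @ P @ fx - x @ P @ x

# Evaluate delta_V for u = -Kx on a 31 x 31 grid over [-0.2, 0.2]^2
# (961 points, matching the training range in the text).
grid = np.linspace(-0.2, 0.2, 31)
dv = np.array([[delta_V(np.array([x1, x2]), -K @ np.array([x1, x2]))
                for x2 in grid] for x1 in grid])
print("fraction of grid points with delta_V < 0:", (dv < 0).mean())
```

On this grid, ΔV is negative over most of the range but turns positive at some points away from the origin (for instance near x = (0.2, 0.2)), consistent with the behavior described for Fig. 3a.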
Fig. 4. Comparison of linear and neural controllers: (a) u (= -Kx) and (b) Δu (= N(x) - (-Kx)).
Fig. 5. Closed-loop system responses for linear and neural controllers: (a) x_t (u = -Kx) and (b) x_t (u = N(x)).
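The instability seen in Fig. 5a can be reproduced by iterating the closed-loop map under u = -Kx from an initial state where ΔV of the linear controller is positive. The dynamics and gain follow the text; the particular initial state [0.2, 0.2] is a hypothetical choice for illustration.

```python
import numpy as np

K = np.array([0.1429, 1.7752])   # LQ gain from the text

def f(x, u):
    """Dynamics of the example system (from the text)."""
    x1, x2 = x
    return np.array([x1 * (1 + x2) + x2 * (1 - u + u**2),
                     x1**2 + 2 * x2 + u * (1 + x2)])

def simulate(x0, controller, steps):
    """Iterate x_t = f(x_{t-1}, u_{t-1}) under a state-feedback law."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for _ in range(steps):
        x = f(x, controller(x))
        traj.append(x)
    return np.array(traj)

# Hypothetical initial state in the region where delta_V of the linear
# controller is positive; the closed-loop state grows without bound.
traj = simulate([0.2, 0.2], lambda x: float(-K @ x), steps=8)
print("||x_8|| =", np.linalg.norm(traj[-1]))
```

The state norm grows rapidly within a few steps, matching the divergent response of the linear controller in Fig. 5a; the trained neural controller N, which stabilizes this initial state in Fig. 5b, is not reproduced here.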
5. CONCLUDING REMARKS
In this paper, a stability-based approach is taken to design a neural controller. Given a quadratic Lyapunov function candidate V of the state variables, the neural controller is chosen such that a negative-definite time derivative V̇ results along trajectories of the closed-loop system. A stability proof is given, which establishes the conditions under which such a neural controller can be found. The proposed procedure allows us to design stabilizing controllers for systems where conventional methods may be inadequate. The chosen framework naturally lends itself to a mathematically more tractable problem formulation and can be applied to more general nonlinear systems. Training the neural controller off-line permits the use of more efficient nonlinear programming algorithms. Furthermore, the nonlinear representational ability of neural networks allows the controller to achieve a larger stable region than a linear controller. This is confirmed by the simulation results provided in the paper. There are several classes of systems for which the existence of a nonlinear controller can be guaranteed, while actually computing such a controller remains difficult. For example, the feedback-linearization technique establishes a constructive procedure for finding nonlinear controllers for systems affine in the control, under some conditions (Isidori, 1989). However, obtaining such controllers through that constructive procedure requires solving simultaneous nonlinear partial differential equations, which is notoriously difficult. Hence, even for those classes of problems, the proposed neural controller offers an alternative solution.
Acknowledgements—This work was supported in part by the Electric Power Research Institute under contract No. 8060-13 and in part by the National Science Foundation under grant No. ECS-9296070.
REFERENCES
Antony, J. K. and L. Acar (1994). Real-time nonlinear optimal control using neural networks. In Proc. 1994 American Control Conf., Vol. 3, pp. 2926-2930.
Artstein, Z. (1983). Stabilization with relaxed controls. Nonlinear Anal., 7, 1163-1173.
Bertsekas, D. P. (1995). Nonlinear Programming. Athena Scientific, Belmont, MA.
Chen, F.-C. and H. K. Khalil (1995). Adaptive control of a class of nonlinear discrete-time systems using neural networks. IEEE Trans. Automat. Control, 40(5), 791-801.
Chen, F.-C. and C. C. Liu (1994). Adaptively controlling nonlinear continuous-time systems using multilayer neural networks. IEEE Trans. Automat. Control, 39(6), 1306-1310.
Hornik, K., M. Stinchcombe and H. White (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359-366.
Isidori, A. (1989). Nonlinear Control Systems, 2nd ed. Springer, New York, NY.
Jordan, M. I. and D. E. Rumelhart (1992). Forward models: supervised learning with a distal teacher. Cognitive Science, 16, 307-354.
LaSalle, J. P. (1962). Asymptotic stability criteria. In Proc. Symp. on Appl. Math., Providence, RI, Vol. 13, pp. 299-307.
Levin, A. U. and K. S. Narendra (1993). Control of nonlinear dynamical systems using neural networks: controllability and stabilization. IEEE Trans. Neural Networks, 4(2), 192-206.
Marquardt, D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Indust. Appl. Math., 11(2), 431-441.
Moran, A. and M. Nagai (1994). Optimal active control of nonlinear vehicle suspensions using neural networks. JSME Int. J., Series C: Dynamics, Control, Robotics, Design and Manufacturing, 37(4), 707-718.
Narendra, K. S. and A. M. Annaswamy (1989). Stable Adaptive Systems. Prentice-Hall, Englewood Cliffs, NJ.
Narendra, K. S. and K. Parthasarathy (1990). Identification and control of dynamical systems using neural networks. IEEE Trans. Neural Networks, 1(1), 4-26.
Nguyen, D. H. and B. Widrow (1990). Neural networks for self-learning control systems. IEEE Control Systems Mag., 10, 18-23.
Osburn, P. V., H. P. Whitaker and A. Kezer (1961). New developments in the design of model reference adaptive control systems. In Proc. IAS 29th Annual Meeting, New York.
Parisini, T. and R. Zoppoli (1994). Neural networks for feedback feedforward nonlinear control systems. IEEE Trans. Neural Networks, 5(3), 436-449.
Park, J. and I. W. Sandberg (1991). Universal approximation using radial-basis function networks. Neural Comput., 3, 246-257.
Parks, P. C. (1966). Liapunov redesign of model reference adaptive control systems. IEEE Trans. Automat. Control, 11, 362-367.
Rudin, W. (1976). Principles of Mathematical Analysis. McGraw-Hill, New York.
Sanner, R. M. and J.-J. E. Slotine (1992). Gaussian networks for direct adaptive control. IEEE Trans. Neural Networks, 3(6), 837-863.
Sjöberg, J., H. Hjalmarsson and L. Ljung (1994). Neural networks in system identification. In 10th IFAC Symp. on System Identification, Vol. 2, pp. 49-72.
Sontag, E. D. (1990). Mathematical Control Theory. Springer, New York.
Sontag, E. D. (1989). A 'universal' construction of Artstein's theorem on nonlinear stabilization. Systems Control Lett., 13, 117-123.
Yu, S. H. and A. M. Annaswamy (1997). Stable neural dynamic compensators for nonlinear dynamic systems. Technical report, Adaptive Control Laboratory, Department of Mechanical Engineering, MIT.