Applied Soft Computing 8 (2008) 530–542 www.elsevier.com/locate/asoc
A real-time neuro-adaptive controller with guaranteed stability
Ali Reza Mehrabian a,b,*, Mohammad B. Menhaj c
a Advanced Dynamic and Control Systems Lab., School of Mechanical Engineering, University of Tehran, Tehran, Iran
b Control & Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
c Amir-Kabir University of Technology, Tehran, Iran
Received 20 March 2006; received in revised form 28 February 2007; accepted 22 March 2007 Available online 4 April 2007
Abstract
This paper presents a new model reference adaptive neuro-control scheme using feedforward neural networks trained with the momentum back-propagation (MBP) learning algorithm. Training is done on-line to tune the parameters of the neuro-controller that provides the control signal. Note that no pre-learning is required, and the structure of the overall system is very simple and straightforward. No additional controller or robustifying signal is required. Tracking performance is guaranteed via Lyapunov stability analysis: both the tracking error and the neural network weights remain bounded. An interesting fact about the proposed approach is that it does not require a NN capable of globally reconstructing the model non-linearities.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Intelligent control; Neuro-controllers; Adaptive control; Multilayer perceptrons
1. Introduction
In recent years, artificial neural networks (NNs) have attracted considerable attention as candidates for novel computational systems. These types of dynamical systems, in analogy to biological structures, take advantage of distributed information processing and their inherent potential for parallel computation. The field of neural networks covers a very broad area. For example, multilayer NNs have been used for pattern recognition [1–4], identification of non-linear systems [5–7], and control and decision-making [5,8–11], while recurrent neural networks have been used for identification [12–14], control [15–17], and the solution of optimization problems [17–19,21]. The problem of control is a kind of decision-making: given the observation of the state of the process, decide from encoded knowledge what action to take. Neural networks, with their massive parallelism and ability to learn any kind of non-linearity, are favorable schemes for this task. A neuro-controller, in general, performs a specific form of adaptive control, with the controller being in the form of a multilayer neural network and
* Corresponding author at: P.O. Box 14875-347, Tehran, Iran. Tel.: +98 21 4431 1321; fax: +98 21 4433 6073.
E-mail addresses: [email protected] (A.R. Mehrabian), [email protected] (M.B. Menhaj).
1568-4946/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.asoc.2007.03.005
the adaptable parameters being defined as the adjustable weights [9]. There are many different methods to train a NN to control a dynamic system, depending on the type of NN and the computational complexity; they share the common fact that they rely strongly on some limited set of learning patterns or experimental data. Hence, there is no guarantee of the extent to which the results remain valid, and the designer is responsible for ensuring that all the important cases (training data) are sufficiently represented to the learning algorithm. On the other hand, in some applications, reliability of the results is much more important than their abstract qualities; in such cases many designers may prefer classical methods. This fact invites us to introduce more theoretical insight into neuro-controllers and to establish some proven properties for them. Among the many aspects of a controller, its ability to stabilize the overall system is the most important one in real-world problems, and it is worthwhile to have a neural network controller with guaranteed stability. Recently, there has been some research on this subject. In fact, there is a rich and seminal body of theoretical work on the dynamic behavior of neural networks, e.g. Cohen and Grossberg [20], Hopfield [21], Kosko [22], and Guez et al. [23], as well as on the stability and the number of equilibrium points of recurrent neural networks, e.g. Kelly [24], Vidyasagar [25], and Jim et al. [26]. In contrast,
there is only limited work on the stability of neural networks as controllers for dynamic systems. Tzirkel-Hancock and Fallside [27] developed two neural-based estimators to estimate output derivatives with respect to inputs and states for feedback-linearizable non-linear systems. They showed that, under some specified conditions, such as exact estimation of the derivatives, the tracking errors approach zero. Among the pioneering papers are the studies by Lewis et al. [28,29]. They applied a neural network, accompanied by a PD controller or a robustifying term, to a robot system, proposing the neural network to cancel out the non-linear part of the robot dynamics. Using special properties of the inertia and friction matrices, they showed that a modified BP-like algorithm can train the neural network so that the tracking error goes to zero. This work was extended and modified in Kwan et al. [30] for electrically driven robots. Kuntanapreeda and Fullmer [31], and Park and Park [32], applied a single-hidden-layer neural network controller directly to a general, locally stable non-linear system. They gave some implicitly defined conditions on the weight matrix of the neural network under which the overall system remains locally stable. Sadegh [33] introduced a method to design an adaptive neural network controller via local stability analysis. Polycarpou [34] pointed out that some of the conditions commonly assumed in adaptive neural network control design can be relaxed in special cases. In Chen and Khalil [35,36] and Chen and Liu [37], two neural networks are used to estimate the non-linearity of a feedback-linearizable system; the stability results are only locally valid in the parameter space, so the neural networks require off-line training. Fabri and Kadirkamanathan [38], and Sun et al. [39], used GRBF neural networks together with a sliding-mode controller to adaptively stabilize a non-linear model. The work of Jagannathan et al. [40,41] on stable adaptive neural controllers for a special class of discrete non-linear systems is very interesting. In addition to these studies, we should point out the work by Suykens and Vandewalle [42]. Their results concern conditions on the weight matrix of a neural network under which the overall system, consisting of a neural network controller and a neural network model of the plant, becomes stable; their design procedures are mainly based on some complicated matrix equalities or inequalities. In this paper, we present a new model reference adaptive control scheme in which NNs are trained with the MBP learning rule to adapt the parameters of the neural network controller. Training is done on-line while the neural network provides the control signal. No pre-learning is required. The structure of the overall system is very
simple and straightforward. The proposed neuro-control scheme has the following advantages over previously studied neuro-control algorithms. First of all, no additional controller or robustifying signal is required, while the tracking performance of the system is guaranteed via a Lyapunov stability analysis. It is shown in the paper that when the two adjustable parameters of the control system are tuned properly, both the tracking error and the neural network weights remain bounded, while the closed-loop system remains stable and shows a satisfactory response as well. The small number of tuning parameters required to set up the control scheme is another benefit of the proposed neuro-control system. An interesting fact about the proposed approach is that it does not require a NN capable of globally reconstructing the model non-linearities. We begin with a description of an indirect control strategy, the NN structure, and the identification and control scheme using neural networks in Section 2. In Section 3, we present the stability analysis of the proposed method. Section 4 presents an implementation of the proposed control approach using SIMULINK® software and fully introduces the structures and blocks of the system. Section 5 is devoted to presenting some illustrative examples and discussing the simulation results of the proposed scheme for controlling some non-linear dynamical systems. Finally, Section 6 concludes the paper.

2. The control strategy

2.1. Indirect adaptive neuro-controller

Neural networks can be used to adaptively control a non-linear system in the presence of uncertainty. Generally speaking, direct and indirect adaptive control schemes represent two distinct methods for the design of adaptive controllers. Using neural computation to design adaptive controllers, we naturally end up with direct adaptive neuro-control (DANC) and indirect adaptive neuro-control (IANC) schemes.
In the DANC scheme, the parameters of the controller are directly adjusted to minimize the tracking error, while in the IANC scheme, the parameters of the plant under study are estimated using a neural network, called the identifier, and based on these estimates the controller parameters are then adjusted [5]. The latter scheme is used in this paper. Fig. 1 shows a schematic diagram of a modified indirect series–parallel adaptive neuro-controller connected to a plant. It basically consists of an MLP-based neural network (NNC) used as a neuro-controller, along with a second neural network (NNI) used for identifying the plant.
Fig. 1. An IANC scheme.
Fig. 2. Schematic of two-layer feedforward NN.
This IANC scheme incorporates a stable reference model that reflects the desired closed-loop dynamics of the plant. The identifier is designed to track the plant response on-line by minimizing the identification error (ei), while the controller is designed to generate control variables such that the output of the plant tracks that of the reference model for a given bounded reference input. The latter is achieved by adjusting the parameters of the controller via an adjustment mechanism so as to minimize the error between the outputs of the reference model and the plant (e).

2.2. Neural network scheme

A two-layer feedforward neural network with m tangent-sigmoid neurons and n inputs is shown in full detail in Fig. 2. A network can have several layers; each layer has a weight matrix W, a bias vector b, and an output vector. Feedforward networks often have one or more hidden layers of sigmoid or tangent-sigmoid neurons followed by an output layer of linear neurons. Multiple layers of neurons with non-linear transfer functions allow the network to learn non-linear and linear relationships between input and output vectors, and the linear output layer lets the network produce values outside the range -1 to +1. For multiple-layer networks, the layer number is used as the superscript on the weight matrices and biases; for a two-layer network, the input-layer weights are W1 and the output-layer weights are W2.

2.3. Control-based on-line identification method

The first step in the control design process is the development of the plant model. The performance of the neural network controller (NNC) is highly dependent on the accuracy of the plant identification. In this section we discuss some of the procedures that can be used for identification using NNs.
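To make the notation concrete, the following is a minimal sketch (not the authors' code) of the two-layer network of Fig. 2: a tangent-sigmoid hidden layer followed by a linear output layer. The layer sizes and initialization scale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_network(n, m, q):
    """Two-layer MLP: n inputs, m tangent-sigmoid hidden neurons, q linear outputs."""
    return {
        "W1": 0.5 * rng.standard_normal((m, n)), "b1": np.zeros(m),
        "W2": 0.5 * rng.standard_normal((q, m)), "b2": np.zeros(q),
    }

def forward(net, p):
    """Hidden output a1 = tanh(W1 p + b1); network output a2 = W2 a1 + b2."""
    a1 = np.tanh(net["W1"] @ p + net["b1"])
    a2 = net["W2"] @ a1 + net["b2"]
    return a1, a2

net = init_network(n=3, m=5, q=1)
a1, y = forward(net, np.array([0.1, -0.2, 0.3]))
```

Because the output layer is linear, the network output is not confined to (-1, +1), while each hidden activation in a1 is.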
There are generally two methods for developing a neural model of the system. The first, which is computationally easier, is off-line training, where the weights of the neural network identifier (NNI) are adjusted off-line using pre-existing data from the plant's input and output response. Using this data set, the NNI can be trained with the elementary back-propagation (BP) algorithm (good for pattern-by-pattern training) or with batch training methods such as the Marquardt algorithm [43]. One drawback of this method is its dependency on the training data. Training data collected off-line should have some specific properties; for example, we need to ensure that the plant input is sufficiently exciting [44,45], and we further need to be sure that the system inputs and outputs cover the operating range [45]. Even if we take care of these points and assume that the identification is performed well, the question remains whether the non-linear system dynamics stays unchanged through time. It is very well known that the identification is never perfect; besides, the characteristics of the dynamic system vary over time, leaving an off-line identification method unprotected against these changes. This is the main disadvantage of the off-line method. In return, if the identification is performed on-line, as the second method, we face a massive volume of on-line computation (a lengthy, time-consuming task). The control strategy designer must find a trade-off between the increased robustness and adaptability of on-line training and the low computational cost and simplicity of the off-line methodology.

2.4. Mathematical description of the control scheme

Consider a plant governed by the following non-linear difference equation:

y(k+1) = f(Y^{k,l}, U^{k,m}),   y(0) = y_0,   for all k in N   (1)
where, for example, for a vector x, X^{k,l} = [x(k), x(k-1), ..., x(k-l+1)], y ∈ R^n is the output vector, u ∈ R^r is the control vector, f is a smooth non-linear function, y_0 is the initial output vector, k is the time index, N is the set of natural numbers, and m and l are the numbers of delayed values of the plant input and plant output, respectively. Consider a stable reference model generating the desired response, governed by:

y_d(k+1) = f_m(Y_d^{k,g}, R^{k,s}),   y_d(0) = y_d0   (2)
where y_d ∈ R^n is the reference model output vector, r ∈ R^r is the reference input, f_m is usually a linear function, g ≤ l, and y_d0 is a given initial output vector for the reference model. The control strategy is to find a feasible control input with the following general structure that establishes a stable closed-loop system whose response tracks the desired trajectory:

u(k) = NN_c(Y_P^{k,l}, U^{k,m}, R^{k,r}; W_c)   (3)
where NN_c is an MLP-based neural network parameterized by the weights W_c given below:

W_c = [W_c^1  b_c^1  W_c^2  b_c^2]^T
such that the corresponding plant output is nearest, in some norm, to the augmented output y_m(k+1) defined below. This signal, which is a modified version of the desired signal, is to be specified a priori as a function of the plant output and the reference model input. An appropriate model to generate y_m(k+1) is:

y_m(k+1) = y_d(k+1) - A(y_d(k) - y(k)) - B(y_d(k-1) - y(k-1))   (4)

where A and B are stable matrices. The ''Augmented Output'' block in Fig. 1 implements (4). This signal, instead of the original desired signal, will be used to tune the parameters of the neuro-controller. In order to see why we use the augmented signal, (4) is rewritten as:

y_m(k+1) = y_d(k+1) - A e(k) - B e(k-1)   (5)

The desired signal y_d is, in general, obtained from a reference model, preferably a linear time-invariant system such as:

x_d(k+1) = A_d x(k) + B_d u(k),   y_d(k) = C_d x(k)   (6)

in which A_d is an asymptotically stable matrix. In short, this choice imposes a stronger condition that makes the overall system more stable and the controller more robust. Now consider the following index of performance:

F̂_c(x) = (y_m(k+1) - y(k+1))^T (y_m(k+1) - y(k+1)) = ē^T(k+1) ē(k+1)   (7)

A suitable control output u(k) is generated by adjusting the parameters W_c of the neuro-controller NN_c(·) through minimization of (7). Knowing that the plant Jacobian matrix J(k) = ∂y(k+1)/∂u(k) cannot be assumed to be available a priori, we use its estimate ∂ŷ(k+1)/∂u(k), obtained by back-propagation through the NN plant identifier (NNI).

2.5. Training the multilayer perceptron (MLP)

Now that the control strategy is known, the next step is to determine a procedure for selecting the NNC network parameters (weights and biases). The networks used in this paper are multilayer perceptron (MLP) neural networks with a single hidden layer of tangent-sigmoid neurons, and the momentum back-propagation algorithm is used for training. It has been shown that two-layer networks, with tangent-sigmoid transfer functions in the hidden layer and linear transfer functions in the output layer, can approximate virtually any function of interest to any degree of accuracy, provided that there are sufficiently many hidden units [46–48]. In [47,48], different variations of the back-propagation algorithm are discussed in detail. In this paper, we employ the momentum back-propagation algorithm; the momentum term acts as a low-pass filter and decreases the amount of oscillation in the system trajectory while tracking its average value. When the momentum filter is added to the parameter update, we obtain the following equations:

ΔW^m(k) = γ ΔW^m(k-1) - (1-γ) α s^m (a^{m-1})^T
Δb^m(k) = γ Δb^m(k-1) - (1-γ) α s^m   (8)

where Δ represents the update (correction term), γ is the momentum term, and α is the learning rate. s^m is the sensitivity of layer m, defined as:

s_i^m = ∂F̂/∂n_i^m   (9)

which measures the sensitivity of the cost function F̂ to changes in the ith element of the net input of layer m (see [48] for more details). It is shown in [47,48] that the derivative of the cost function with respect to the network weights can be found as:

∂F̂/∂w_{i,j}^m = s_i^m a_j^{m-1},   ∂F̂/∂b_i^m = s_i^m   (10)

where w_{i,j}^m represents the element in the ith row and jth column of the mth layer's weight matrix, and b_i^m the ith element of its bias.

2.6. Back-propagation through the model

In this step, we use the NN model (NNI) to back-propagate the controller error (e) to the NN controller (NNC); the NNC then learns the desired control command through the back-propagation algorithm. This can be formulated as follows:

∂F̂_c(k+1)/∂W_c(k) = (∂F̂_c(k+1)/∂u(k)) (∂u(k)/∂W_c(k))   (11)

3. Stability analysis

This section presents a detailed proof of the asymptotic behavior of the neuro-controller that makes the overall closed-loop system track the desired trajectory. To do so, we first rewrite the tracking error dynamics in terms of ē and then propose a Lyapunov function. Knowing that ē(k) = y_m(k) - y(k), (5) becomes:

e(k+1) = ē(k+1) + A e(k) + B e(k-1)   (12)

where e(k) = y_d(k) - y(k). Let us use the following frequently used function as a candidate Lyapunov function:

V(e(k)) = e^T(k) e(k)   (13)

Now we proceed as follows:

(1) The function V is trivially positive definite.
(2) We need to show that the function is non-increasing. This can be done as follows:

ΔV(e(k)) = V(e(k+1)) - V(e(k)) = e^2(k+1) - e^2(k)
= [ē(k+1) + A e(k) + B e(k-1)]^T [ē(k+1) + A e(k) + B e(k-1)] - e^2(k)
= ē^2(k+1) + ē^T(k+1) A e(k) + ē^T(k+1) B e(k-1) + e^T(k) A^T ē(k+1) + e^2(k) A^2 + e^T(k) A^T B e(k-1) + e^T(k-1) B^T ē(k+1) + e^T(k-1) B^T A e(k) + e^2(k-1) B^2 - e^2(k)
= e^T(k) (A^2 - I) e(k) + 2 ē^T(k+1) A e(k) + 2 ē^T(k+1) B e(k-1) + 2 e^T(k) A^T B e(k-1) + e^T(k-1) B^T B e(k-1) + ē^2(k+1)   (14)

If

e^T(k) Q e(k) > 2 ē^T(k+1) A e(k) + 2 ē^T(k+1) B e(k-1) + 2 e^T(k) A^T B e(k-1) + e^2(k-1) B^2 + ē^2(k+1)   (15)

where Q = (I - A^2), then

ΔV(e(k)) < 0   (16)

Thus, e(k) → 0 if ē(k+1) satisfies inequality (15). By proper selection of A and B, we can make ē decay faster than e; this in turn guarantees that inequality (16) holds. Based on our extensive simulation studies and experiments, we arrived at the following guideline for selecting A and B, which mainly depends on the dynamics of the plant. For a system with fast dynamics and a low degree of stability, or in the presence of severe noise and disturbances, it is recommended to set relatively large values for the A and B parameters, while for a system with slower dynamics, small values of A and B work better. The reason is that larger values of A and B make the controller more sensitive to changes in the error signal; thus the controller responds more quickly to the changes and captures the rapid dynamics of the system. In general, after a little trial and error, one can arrive at proper values of A and B within the unit interval. Another point to consider is that the adjustment of the controller's weights depends on the signal ē; hence, as this signal decays (which it should), the boundedness of the controller's weights is guaranteed.

4. Implementing the proposed control method

The proposed control algorithm was implemented using MATLAB® and SIMULINK®. The control loop block diagram in SIMULINK® is shown in Fig. 3. The model consists mainly of six blocks. The first is the reference model block, which contains the reference model. The second is the plant block, which contains the dynamics of the plant; in reality, this block is replaced by an unknown plant. The third and fourth blocks are the NN identifier (model) and the controller, respectively. The next block is the augmented output builder, which implements (4), and the last block is the controller error sensitivity feedback block, which is used to find ∂F̂_c(k+1)/∂u(k).

4.1. NN identifier block

The NN identifier block is shown in Fig. 4. As discussed before, the identifier consists of two parts: the first represents the NN structure and the second is the updating mechanism. The NN model of the identifier is shown in Fig. 5, and the block diagram of the updating mechanism is given in Fig. 6. The updating mechanism implements the momentum back-propagation (MBP) algorithm for adaptation.
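As an illustration of how such an updating mechanism works, the sketch below trains a small two-layer identifier on-line with the momentum update (8) while a probing signal drives a stand-in scalar plant. The plant, layer size, learning rate and momentum value here are illustrative assumptions, not the settings of the SIMULINK® model.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, gamma = 0.05, 0.1                    # learning rate and momentum term (assumed)
W1, b1 = 0.5 * rng.standard_normal((5, 2)), np.zeros(5)
W2, b2 = 0.5 * rng.standard_normal((1, 5)), np.zeros(1)
params = [W1, b1, W2, b2]
vel = [np.zeros_like(p) for p in params]    # the Delta terms of update (8)

def plant(y, u):
    """Stand-in 'unknown' scalar plant (illustrative only)."""
    return 0.5 * y + np.tanh(u)

y, errs = 0.0, []
for k in range(500):
    u = 0.5 * np.sin(0.05 * k)              # probing control signal (cf. Remark 1)
    x = np.array([y, u])
    a1 = np.tanh(W1 @ x + b1)               # hidden layer
    y_hat = (W2 @ a1 + b2)[0]               # identifier's one-step prediction
    y = plant(y, u)                          # true plant response
    e_i = y_hat - y                          # identification error
    errs.append(e_i)
    s2 = np.array([e_i])                     # output-layer sensitivity, cf. (9)
    s1 = (1.0 - a1 ** 2) * (W2.T @ s2).ravel()  # hidden-layer sensitivity
    grads = [np.outer(s1, x), s1, np.outer(s2, a1), s2]  # cf. (10)
    for i, (p, g) in enumerate(zip(params, grads)):
        vel[i] = gamma * vel[i] - (1 - gamma) * alpha * g  # momentum update (8)
        p += vel[i]
```

The momentum term filters the raw gradient steps, so consecutive updates partially average out, which is the low-pass behavior described in Section 2.5.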
Fig. 3. SIMULINK® model of the implemented control algorithm.
Fig. 4. SIMULINK® model of the NN identifier.
4.2. NN controller block

The controller block is shown in Fig. 7. Its updating mechanism is the same as the identifier's; in addition, it can change the learning rate in proportion to the tracking or control error (e or ē). The switches in the block put the controller into the loop after a few seconds, once the identifier has learned the dynamics of the unknown plant.

4.3. Controller error sensitivity feedback block

The structure of the controller error sensitivity feedback block is shown in Fig. 8. As mentioned before, this block is used to find the sensitivity of the modified tracking error to plant input variations. To do so, the identifier weights and the error are fed into this block, whose output measures the sensitivity.

5. Simulation studies

In this section, we perform some simulations to investigate the adaptation and robustness of the proposed neural network-based control scheme. Before we proceed with the simulations, the following remarks are worth mentioning:

Remark 1. As mentioned in Section 2, the control scheme consists of an NNI and an NNC, and the identification process is necessary for finding the proper control signal. The NNI therefore first learns the unknown dynamic system while random control signals are applied by the controller; in the cases considered in this paper this takes only a short time (about 0.2 s), after which the neuro-controller is put into the loop.

Remark 2. To avoid saturation of the NNI and the NNC, data normalization is performed at the input of the control system.

5.1. Example 1: a non-linear system with a second-order difference equation and a variable reference model

In the first example, we want to show the system's ability in real-time identification and control of an unknown plant. We consider a plant of the following form, taken from [5]:

y(k+1) = f[y(k), y(k-1)] + u(k)   (17)

Fig. 5. NN model of the identifier.
Fig. 6. Updating mechanism of the NN identifier.
where the function

f[y(k), y(k-1)] = y(k) y(k-1) [y(k) + 2.5] / (1 + y^2(k) + y^2(k-1))   (18)

is assumed to be unknown. The first reference model is described by a second-order difference equation defined as

y_d(k+1) = 0.2 y_d(k) + 0.2 y_d(k-1) + r(k)   (19)

and the second one is a first-order difference equation given below:

y_d(k+1) = 0.5 y_d(k) + 0.5 r(k)   (20)
where (19) is applied for t < 8 s and (20) for t > 8 s. In practice, switching between reference models may occur when the closed-loop response of the system needs to be altered in order to meet the performance requirements.
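For reference, the Example 1 plant (17)-(18) and the two reference models (19)-(20) can be sketched as below; the sampling interval and switching index used in the open-loop check are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def f(yk, ykm1):
    """The 'unknown' plant non-linearity (18)."""
    return yk * ykm1 * (yk + 2.5) / (1.0 + yk ** 2 + ykm1 ** 2)

def plant_step(yk, ykm1, u):
    """Plant difference equation (17)."""
    return f(yk, ykm1) + u

def reference_step(k, ydk, ydkm1, r, k_switch):
    """Second-order model (19) before the switch, first-order model (20) after."""
    if k < k_switch:
        return 0.2 * ydk + 0.2 * ydkm1 + r
    return 0.5 * ydk + 0.5 * r

# open-loop check of the reference models with r(t) = sin(t), 0.01 s sampling
yd = [0.0, 0.0]
for k in range(1000):
    yd.append(reference_step(k, yd[-1], yd[-2], np.sin(0.01 * k), 800))
```

Both reference models are stable difference equations (their poles lie inside the unit circle), so the generated y_d stays bounded for the bounded input r.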
Fig. 7. SIMULINK® model of the NN controller.
The augmented desired trajectory is obtained through:

y_m(k+1) = y_d(k+1) - 0.01(y_d(k) - y(k)) - 0.001(y_d(k-1) - y(k-1))   (21)

The NNI has five neurons in the hidden layer, with three plant-output and two plant-input time delays. The learning rate of the identifier is α = 0.002 and the momentum term is γ = 0.01. The NNC has eight hidden neurons, with two time delays each for the plant output, plant input and reference input (a 6-8-1 network). The learning rate of the controller is α_c = 3×10^-4 exp(-0.3|e|) and the momentum term is γ_c = 0.01. The reference input is

r(t) = sin(t)   (22)

The sampling times for the identifier and the controller are chosen as 0.001 s and 0.0001 s, respectively. Fig. 9 shows the response of the controlled plant and the control signal.

5.2. Example 2: a non-linear plant with a non-linear input to the difference equation

In Example 1, the input occurs linearly in the difference equation describing the plant. In this example, the plant has the form [5]:

y(k+1) = y(k) / (1 + y^2(k)) + (0.2 u(k))^3   (23)
The neuro-identifier has a 5-5-1 structure, with three plant and two input time delays, and with learning rate and momentum term equal to α = 0.002 and γ = 0.01, respectively. The augmentation scheme is:

y_m(k+1) = y_d(k+1) - 0.1(y_d(k) - y(k)) - 0.005(y_d(k-1) - y(k-1))   (24)

The neuro-controller has a 6-5-1 structure, with the learning rate set as α_c = 3×10^-4 exp(-0.3|e|) and the momentum term γ_c = 0.01. The sampling time for both the identifier and the controller is set to 0.001 s. Fig. 10 shows the controlled system's response for 15 s; the ability of the controller to track the reference signal is evident. In the next example, the robustness of the proposed control method to plant uncertainty and perturbations is investigated. The performance of a control system in the face of plant uncertainty is an important issue: no mathematical model can exactly represent a physical system, and as a result it is necessary to be aware of how modeling errors due to the plant uncertainties
Fig. 8. SIMULINK® model of the controller error sensitivity feedback block.
Fig. 10. Time response of the controlled plant in Example 2.
Fig. 9. Time response of the controlled plant in Example 1.
affect the performance of the control system. Typical sources of uncertainty include unmodeled (high-frequency) dynamics, neglected non-linearities, and plant parameter (dynamic) perturbations. If the control system performs well under these types of variations in the system dynamics and the stability of the closed-loop system is maintained, then the scheme is said to be robust.
5.3. Example 3: a non-linear plant subjected to uncertainty

To demonstrate the robustness of the proposed control scheme, we consider the following plant [5]:

y(k+1) = f[y(k), y(k-1), y(k-2), u(k), u(k-1)] + D   (25)

where the function f is

f[x_1, x_2, x_3, x_4, x_5] = (x_1 x_2 x_3 x_5 (x_3 - 1) + x_4) / (1 + x_2^2 + x_3^2)   (26)

and the perturbation (D) is a random number between -0.3 and 0.3, redrawn with a sampling time of 3 s. The reference signal is generated by:

r(t) = 0.5 (0.5 sin(t) + cos(0.2t))   (27)

while the reference model's signal is generated by:

y_d(t) = sin(r(t))   (28)

The augmentation scheme is given as:

y_m(k+1) = y_d(k+1) - 0.1(y_d(k) - y(k)) - 0.01(y_d(k-1) - y(k-1))   (29)

The structure of the NNI is the same as in the previous example, but the NNC is 8-6-1 (two delays for the system output, four delays for the controller command and two delays for the reference signal), with α_c = 5×10^-3 and γ_c = 0.01. Fig. 11 shows both the controlled plant's response and the controller's command. It is clear that the reference signal is tracked and the disturbance is rejected, at the cost of visible activity in the controller's command.

5.4. Example 4: a continuous unstable system

In the next example, we demonstrate the ability of the neuro-controller to stabilize and control continuous,
Fig. 11. Time response of the controlled plant in Example 3.
Fig. 13. The two-tank system setup [49].
Fig. 12. Time response of the controlled plant in Example 4.
unstable dynamical systems. For example, consider the following linear system:

G(s) = 2s / (s^2 - 1)   (30)

The reference signal is generated by:

r(t) = 0.5 sin(t) + cos(0.2t)   (31)

and the reference model's signal is given as:

y_d(t) = 5 sin(r(t))   (32)

The augmentation scheme is:

y_m(k+1) = y_d(k+1) - 0.1(y_d(k) - y(k)) - 0.05(y_d(k-1) - y(k-1))   (33)

The NNI is a 5-5-1 network and the NNC is 8-5-1. The results obtained with this setup are shown in Fig. 12. It can be seen clearly that the desired trajectory is followed while the system is stabilized.

5.5. Example 5: a two-tank process control experiment

The last example is taken from [49], in which the authors implemented recently developed Sugeno-based adaptive fuzzy controllers for the control of a process control experiment. This example is chosen to compare the performance of the proposed neural-control scheme with the fuzzy prediction and control technique developed in [49]. It is a very good benchmark, since the authors explain the experimental setup in detail and introduce an approximate mathematical discrete-time model for the system that captures its non-linearity very well. In addition, several studies are carried out in that article, which makes the comparison easier.

5.5.1. The experimental setup

The process control experiment, schematically shown in Fig. 13, consists of two tanks [49]. The first tank, called the ''fill tank'', contains a liquid whose volume is to be controlled. The second tank, called the ''reservoir tank'', contains the liquid that the controller pumps into and out of the fill tank in order to bring the liquid volume of the fill tank to a desired volume denoted by L_d. The actual liquid volume of the fill tank is denoted by L_f and is measured in gallons. There are two pumps that serve as system actuators. The first is a variable-rate direct current (DC) pump, denoted by P_r, which pumps liquid from the reservoir tank into the fill tank. The second is an alternating current (AC) pump, denoted by P_f, which can only be turned on or off, and is used to remove liquid from the fill tank [49]. The control input to the system is a single voltage u, where a sufficiently large positive value (of at most 10 V) causes the DC pump P_r to transfer liquid into the fill tank, and any negative value (of at least -10 V) causes the AC pump P_f to turn on and remove liquid from the fill tank. Note that there is an asymmetry due to the different operation of the pumps. The DC pump has a dead zone, above which the liquid flow is approximately a linear function of u; the AC pump, on the other hand, is turned on to maximum power by any negative u, regardless of its magnitude. The combined behavior of the pumps when u is close to zero in magnitude makes it very challenging to maintain the volume at a steady value with small tracking error: the dead-band of P_r and the all-or-nothing
Fig. 14. Combined effect of pumps.
functioning of P_f conspire to make the closed-loop system oscillate around the desired volume L_d [49].

5.5.2. The mathematical model

An approximate model for the experiment is developed in this section. The process control experiment may be represented by the following first-order non-linear difference equation:

L_f(k + 1) = L_f(k) + f_p(u(k))   (34)
where k is a time index, u is the voltage input (ranging between -10 and 10 V) which drives the pumps P_r and P_f, and f_p(u(k)) represents the combined effect of the pumps P_r and P_f. Experimentally, it is found that f_p(u) can be approximated by [49]:

f_p(u) = \begin{cases}
0.0333 \tanh[8.5(u + 0.045)] - 0.0056, & u < -0.05 \\
-(1.68u - 1.42\times 10^{-4})^2 + 6.63\times 10^{-5}, & -0.05 \le u < 0 \\
0.0488[\tanh(3.6u - 3.3)/2 + 0.5], & u \ge 0
\end{cases}   (35)

This function is depicted in Fig. 14. It was derived from the experimentally determined characteristics of the combined DC and AC pumps (see [50] for details, as well as a continuous-time treatment of this plant). Indeed, for negative values of u the AC pump is activated in an approximately on-off manner. Similarly, the DC pump is turned on when u is positive, but only after a dead-band of approximately 1 V. The experimental characteristic is clearly piecewise continuous and non-invertible due to the several constant-valued sections it has (e.g. for u < 0). To get around this issue, the approximation (35) is defined. Notice that (35) is invertible and differentiable everywhere.

5.5.3. The control results

Two simulation studies have been carried out to demonstrate the ability of the proposed control system to stabilize and control the non-linear two-tank process system. In the first study, (34) and (35) are modeled in SIMULINK and no noise or disturbance signal is added to the system. In the second study, several noise and disturbance signals are added to the model of
the system to illustrate the performance of the neuro-controller in laboratory conditions when there are different disturbing signals and noise, e.g. electrical noise of the pumps, sensor noise and measurement noise. The reference signal in this
Fig. 15. Time response of the two-tank process.
Fig. 16. Time response of the two-tank process in presence of noise.
example is set as a series of step commands for the tank volume, and no reference model is used (y_d(t) = r(t)). The neuro-identifier has a 5-6-1 structure, with three plant and three input time delays. The learning rate and the momentum term are set to α = 0.001 and γ = 0.1, respectively. The augmentation block has the following dynamics:

y_m(k + 1) = y_d(k + 1) - 0.8[y_d(k) - y(k)] - 0.1[y_d(k - 1) - y(k - 1)]   (36)
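A minimal sketch of the augmentation block (36) may clarify its role; the function name is ours, and the minus signs follow the error-feedback reading of the typeset equation:

```python
def augmented_reference(yd_next, yd_k, y_k, yd_prev, y_prev):
    """Augmentation block of Eq. (36): the reference handed to the
    controller is the desired output corrected by a weighted sum of
    the two most recent tracking errors y_d - y."""
    return yd_next - 0.8 * (yd_k - y_k) - 0.1 * (yd_prev - y_prev)
```

With zero tracking error the block reduces to y_m(k + 1) = y_d(k + 1), so the augmentation acts only while the plant output deviates from the desired trajectory.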
The neuro-controller has an 8-5-1 structure, with a learning rate of α_c = 5 × 10^{-4} and a momentum term of γ_c = 0.01. The sampling times of the identifier and the controller are identical and equal to 0.001 s. Fig. 15 shows the time response of the closed-loop system, L_f, versus the desired volume, L_d. One should observe that the response of the closed-loop system is very desirable. The closed-loop response shows some overshoot at the beginning, resulting in actuator saturation, then quickly settles into a very smooth response and follows the desired volume L_d very closely. For the two-tank process system, a dynamic model with noise is considered as well, given as follows:

L_f(k + 1) = L_f(k) + f_p[u(k)(1 + D_1)] + D_2   (37)

Fig. 17. The flow rate and the noise signals.
where D_1 represents the electrical noise of the pumps and is computed as:

D_1 = \begin{cases} 0, & u = 0 \\ \text{band-limited white noise}, & u \ne 0 \end{cases}   (38)
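The noisy plant model (37)-(38) can be exercised in a short simulation sketch. The pump characteristic below follows (35) with the typeset minus signs restored; band-limited white noise is approximated here by small Gaussian samples, which is an assumption of this sketch (as is the noise amplitude):

```python
import math
import random

def f_p(u):
    """Combined pump characteristic of Eq. (35), in gallons per step."""
    if u < -0.05:
        return 0.0333 * math.tanh(8.5 * (u + 0.045)) - 0.0056
    if u < 0.0:
        return -(1.68 * u - 1.42e-4) ** 2 + 6.63e-5
    return 0.0488 * (math.tanh(3.6 * u - 3.3) / 2 + 0.5)

def step(Lf, u, noise=0.0):
    """One step of the noisy plant, Eqs. (37)-(38): D1 is the pumps'
    electrical noise (zero when the pumps are idle), D2 lumps sensor
    and measurement noise; both default to zero."""
    D1 = 0.0 if u == 0.0 else random.gauss(0.0, noise)
    D2 = random.gauss(0.0, noise)
    return Lf + f_p(u * (1.0 + D1)) + D2
```

Driving the model with a large positive u fills the tank at up to 0.0488 gal per step, while any appreciably negative u drains it at roughly 0.039 gal per step, reproducing the pump asymmetry discussed above; the three branches of f_p join continuously and smoothly at u = -0.05 and u = 0, consistent with the invertibility and differentiability claimed for (35).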
where D_2 represents both the sensor and the measurement noises, and is set to be a band-limited white noise. Band-limited white noise is chosen in this study since it is well known that white noise is a useful theoretical approximation when the noise disturbance has a correlation time that is very small relative to the natural bandwidth of the system. The response of the system in the presence of the different noise signals is depicted in Fig. 16. It is clear that the controller is able to stabilize the system and track the desired volume. The pump flow, f_p, as well as the noise signals, are shown in Fig. 17, which clearly shows the magnitude of the noise. The presented example clearly shows the merits of the proposed control scheme in dealing with different disturbance and noise signals.

6. Conclusion

This paper introduced a new real-time neural-adaptive MLP-based controller with guaranteed stability for a rather wide class of non-linear dynamical systems. The learning algorithm used in the proposed scheme to adjust the parameters of the networks was MBP. The proposed scheme belongs to a general class of indirect model-reference adaptive controllers. This scheme has the following advantages: (1) it does not require any pre-training data or robustifying terms; this in turn helps the proposed neural-control method compensate for abrupt changes in the system, which may happen due to environmental and unforeseen conditions, as well as for measurement noise; (2) as shown in the simulation section of the paper, the proposed neuro-controller has the remarkable ability to make the closed-loop system stable with a satisfactory response, especially when the two adjustable parameters of the control system have been tuned well; (3) it needs only a small number of tuning parameters; however, some trial and error is required to select them properly, which may be considered a disadvantage of the proposed scheme.
Another disadvantage of the proposed scheme is that it is not suitable for systems with time delays, because the output of such a plant is not influenced by the controller's command at the present time. Consequently, the back-propagation through the plant that is used to update the weights of the controller fails to propagate the influence of the input signal applied to the plant to its output. This will indeed result in destabilization of the overall system. The simulation results help one better judge the merits of the proposed neuro-control scheme. It has been shown that the outputs of the unknown plants to be controlled closely track the reference signals, despite the fact that the structure of the plant is varied or subjected to noise. Furthermore, it is shown that the control scheme is capable of effectively dealing with stable reference models with time-varying dynamics.
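For completeness, the MBP weight update used throughout the paper can be sketched as a generic momentum-backpropagation step; the function below is our notation, with alpha the learning rate and gamma the momentum term as in Section 5:

```python
def mbp_step(w, grad, prev_dw, alpha=0.001, gamma=0.1):
    """One momentum backpropagation (MBP) update, element-wise:
    dw(k) = -alpha * dE/dw + gamma * dw(k-1);  w <- w + dw(k)."""
    dw = [-alpha * g + gamma * d for g, d in zip(grad, prev_dw)]
    return [wi + di for wi, di in zip(w, dw)], dw
```

The momentum term low-pass filters the gradient sequence, which helps the on-line update tolerate the noisy gradients produced by back-propagating the tracking error through the identified plant model.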
References

[1] D.J. Burr, Experiments on neural net recognition of spoken and written text, IEEE Trans. Acoust., Speech, Signal Process. 36 (7) (1988) 1162-1168.
[2] R.P. Gorman, T.J. Sejnowski, Learned classification of sonar targets using a massively parallel network, IEEE Trans. Acoust., Speech, Signal Process. 36 (7) (1988) 1135-1140.
[3] B. Widrow, R.G. Winter, R.A. Baxter, Layered neural nets for pattern recognition, IEEE Trans. Acoust., Speech, Signal Process. 36 (7) (1988) 1109-1118.
[4] T.J. Sejnowski, C.R. Rosenberg, Parallel networks that learn to pronounce English text, Complex Syst. 1 (1987) 145-168.
[5] K.S. Narendra, K. Parthasarathy, Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks 1 (March (1)) (1990) 4-27.
[6] Y.-J. Wang, C.-T. Lin, Runge-Kutta neural network for identification of dynamical systems in high accuracy, IEEE Trans. Neural Networks 9 (2) (1998) 294-307.
[7] A. Yazdizadeh, K. Khorasani, Adaptive time delay neural network structures for nonlinear system identification, Neurocomputing 47 (2002) 207-240.
[8] M.T. Hagan, H.B. Demuth, Neural networks for control, in: Proceedings of American Control Conference (ACC), San Diego, 1999, pp. 1642-1656.
[9] P.K. Dash, S.K. Panda, T.H. Lee, J.X. Xu, A. Routray, Fuzzy and neural controllers for dynamic systems: an overview, in: Proceedings of 1997 IEEE International Conference on Power Electronics and Drive Systems, vol. 2, 1997, pp. 810-816.
[10] K.S. Narendra, S. Mukhopadhyay, Adaptive control using neural networks and approximate models, IEEE Trans. Neural Networks 8 (May (3)) (1997).
[11] L. Chen, K.S. Narendra, Nonlinear adaptive control using neural networks and multiple models, in: Proceedings of American Control Conference, 2000, pp. 4199-4203.
[12] A.S. Poznyak, L. Ljung, On-line identification and adaptive trajectory tracking for nonlinear stochastic continuous time systems using differential neural networks, Automatica 37 (2001) 1257-1268.
[13] J.-P.S. Draye, D.A. Pavisic, G.A. Cheron, G.A. Libert, Dynamic recurrent neural networks: a dynamical analysis, IEEE Trans. Syst., Man, and Cybern. 26 (October (5)) (1996) 692-706.
[14] A.S. Poznyak, W. Yu, E.N. Sanchez, J.P. Perez, Nonlinear adaptive trajectory tracking using dynamic neural networks, IEEE Trans. Neural Networks 10 (November (6)) (1999) 1402-1411.
[15] G.A. Rovithakis, M.A. Christodoulou, Adaptive control of unknown plants using dynamical neural networks, IEEE Trans. Syst., Man, and Cybern. 24 (March (3)) (1994) 400-412.
[16] H.V. Kamat, D.H. Rao, Direct adaptive control of nonlinear systems using a dynamic neural network, in: IEEE/IAS International Conference on Industrial Automation and Control, 1995, pp. 269-274.
[17] Y. Becerikli, A.F. Konar, T. Samad, Intelligent optimal control with dynamic neural networks, Neural Networks 16 (2003) 251-259.
[18] J.J. Hopfield, D.W. Tank, Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Trans. Circuits Syst. CAS-33 (5) (1986) 533-541.
[19] J.J. Hopfield, D.W. Tank, Neural computation of decisions in optimization problems, Biol. Cybern. 52 (1985) 141-152.
[20] M.A. Cohen, S. Grossberg, Absolute stability of global pattern formation and parallel memory storage by competitive neural networks, IEEE Trans. Syst., Man, and Cybern. 13 (5) (1983) 815-825.
[21] J.J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. U.S.A. 79 (1982) 2554-2558.
[22] B. Kosko, Structural stability of unsupervised learning in feedback neural networks, IEEE Trans. Autom. Control 36 (7) (1991) 785-792.
[23] A. Guez, V. Protopopescu, J. Barhen, On the stability, storage capacity and design of non-linear continuous neural networks, IEEE Trans. Syst., Man, and Cybern. 18 (1) (1988) 80-87.
[24] D.G. Kelly, Stability in contractive non-linear neural networks, IEEE Trans. Biomed. Eng. 37 (3) (1990) 231-242.
[25] M. Vidyasagar, Location and stability of the high gain equilibria of nonlinear neural networks, IEEE Trans. Neural Networks 4 (4) (1993) 660-672.
[26] L. Jin, P.N. Nikiforuk, M.M. Gupta, Absolute stability conditions for discrete-time recurrent neural networks, IEEE Trans. Neural Networks 5 (6) (1994) 954-964.
[27] E. Tzirkel-Hancock, F. Fallside, A stability based neural network control method for a class of non-linear systems, in: Proceedings of the International Joint Conference on Neural Networks, vol. 2, 1991, pp. 1047-1052.
[28] F.L. Lewis, A. Yesildirek, K. Liu, Neural net robot controller with guaranteed stability, in: Proceedings of the 3rd International Conference on Industrial Fuzzy Control, 1993, pp. 103-108.
[29] F.L. Lewis, A. Yesildirek, K. Liu, Multilayer neural net robot controller with guaranteed tracking performance, IEEE Trans. Neural Networks 7 (2) (1996) 388-399.
[30] C. Kwan, F.L. Lewis, D.M. Dawson, Robust neural-network control of rigid-link electrically driven robots, IEEE Trans. Neural Networks 9 (4) (1998) 581-588.
[31] S. Kuntanapreeda, R.R. Fullmer, A training rule which guarantees finite-region stability for a class of closed-loop neural network control systems, IEEE Trans. Neural Networks 7 (3) (1996) 629-642.
[32] S. Park, C.H. Park, Comments on a training rule which guarantees finite-region stability for a class of closed-loop neural-network control systems, IEEE Trans. Neural Networks 8 (5) (1997) 1217-1218.
[33] N. Sadegh, A perceptron network for functional identification and control of non-linear systems, IEEE Trans. Neural Networks 4 (6) (1993) 982-988.
[34] M.M. Polycarpou, Stable adaptive neural control scheme for non-linear systems, IEEE Trans. Autom. Control 41 (3) (1996) 447-451.
[35] F.-C. Chen, H.K. Khalil, Adaptive control of non-linear systems using neural networks, Int. J. Control 55 (1992) 1299-1317.
[36] F.-C. Chen, H.K. Khalil, Adaptive control of a class of non-linear discrete-time systems using neural networks, IEEE Trans. Autom. Control 40 (5) (1995) 791-801.
[37] F.-C. Chen, C.-C. Liu, Adaptively controlling non-linear continuous-time systems using multilayer neural networks, IEEE Trans. Autom. Control 39 (6) (1994) 1306-1310.
[38] S. Fabri, V. Kadirkamanathan, Dynamic structure neural networks for stable adaptive control of non-linear systems, IEEE Trans. Neural Networks 7 (5) (1996) 1151-1167.
[39] F. Sun, Z. Sun, P.-Y. Woo, Stable neural-network-based adaptive control for sampled-data non-linear systems, IEEE Trans. Neural Networks 9 (5) (1998) 956-968.
[40] S. Jagannathan, F.L. Lewis, Multilayer discrete-time neural-net controller with guaranteed performance, IEEE Trans. Neural Networks 7 (1) (1996) 107-130.
[41] S. Jagannathan, F.L. Lewis, O. Pastravanu, Discrete-time model reference adaptive control of nonlinear dynamical systems using neural networks, Int. J. Control 64 (2) (1996) 217-239.
[42] J.A.K. Suykens, J. Vandewalle, Global asymptotic stability for multilayer recurrent neural networks with application to modelling and control, in: Proceedings of the International Conference on Neural Networks, vol. 2, 1995, pp. 1065-1069.
[43] M.T. Hagan, M. Menhaj, Training feedforward networks with the Marquardt algorithm, IEEE Trans. Neural Networks 5 (6) (1994) 989-993.
[44] L. Ljung, System Identification: Theory for the User, 2nd ed., Prentice-Hall, Englewood Cliffs, NJ, 1999.
[45] M. Hagan, H. Demuth, O. De Jesus, An introduction to the use of neural networks in control systems, Int. J. Robust Nonlinear Control 12 (September (11)) (2002) 959-985.
[46] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (5) (1989) 359-366.
[47] M.B. Menhaj, Application of Computational Intelligence in Control, Professor Hesabi Center of Publishing, Tehran, Iran, 1998.
[48] M. Hagan, H. Demuth, M. Beale, Neural Network Design, PWS Publishing, Boston, MA, 1996.
[49] R. Ordóñez, J.T. Spooner, K.M. Passino, Experimental studies in nonlinear discrete-time adaptive prediction and control, IEEE Trans. Fuzzy Syst. 14 (2) (2006) 275-286.
[50] R. Ordóñez, J. Zumberge, J.T. Spooner, K.M. Passino, Experiments and comparative analyses in adaptive fuzzy control, IEEE Trans. Fuzzy Syst. 5 (2) (1997) 167-188.