Artificial Intelligence in Engineering 13 (1999) 309–320 www.elsevier.com/locate/aieng
Identification of plant inverse dynamics using neural networks D.T. Pham*, S.J. Oh Intelligent Systems Laboratory, Systems Division, School of Engineering, University of Wales Cardiff, PO Box 688, Newport Road, Cardiff CF2 3TE, UK Received 1 October 1998; received in revised form 25 January 1999; accepted 4 February 1999
Abstract
This article investigates the approximation of the inverse dynamics of unknown plants using a new type of recurrent backpropagation neural network. The network has two input elements when modelling a single-output plant, one to receive the plant output and the other, an error input to compensate for modelling uncertainties. The network has feedback connections from its output, hidden, and input layers to its "state" layer and self-connections within the "state" layer. The essential point of the proposed approach is to make use of the direct inverse learning scheme to achieve simple and accurate inverse system identification even in the presence of noise. This approach can easily be extended to the area of on-line adaptive control which is briefly introduced. Simulation results are given to illustrate the usefulness of the method for the simpler case of controlling time-invariant plants. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Recurrent network; Identification; Inverse control
Nomenclature
$\mathbf{A}, \mathbf{B}, \mathbf{P}, \mathbf{S}, \mathbf{V}, \mathbf{W}, \mathbf{Z}$: weight matrices in the proposed network
$\mathbf{c}, \mathbf{h}$: outputs of state layer and hidden layer of the network
$\mathbf{e}, \hat{\mathbf{u}}, \mathbf{y}_n$: additional error input to the network, network output and network input (vector)
$e, \hat{u}, y_n$: additional error input to the network, network output and network input (scalar)
$f_c, f_h, f_o$: activation functions of the state, hidden and output layer neurons
FN, IN, IN_c: forward modelling, inverse modelling and control networks
$k$: discrete-time instant
$m, q$: dimensions of plant output and input
MSE: mean squared error for training phase
$N, T$: numbers of training patterns and iterations
$n$: number of hidden/state layer neurons
NMSE_U, NMSE_Y: normalised mean squared error for recall and control phases
$\mathbf{u}, \mathbf{x}, \mathbf{y}, \mathbf{y}_d$: input, state, output and desired output vectors of a plant
$u, w, y, y_d$: input, unknown noise, output and desired output of a plant
$z^{-d}$: delay operator for a delay of $d$ sampling periods
$\alpha, \beta, \gamma$: connection gains for $\mathbf{P}$, $\mathbf{V}$ and $\mathbf{S}$
$\boldsymbol{\Gamma}, \boldsymbol{\Lambda}, \boldsymbol{\Pi}, \mathbf{J}$: coefficient matrices of the plant
$\tilde{\boldsymbol{\Gamma}}, \tilde{\boldsymbol{\Lambda}}, \tilde{\boldsymbol{\Pi}}, \tilde{\mathbf{J}}$: coefficient matrices of the network
$\eta, \mu$: learning rate and momentum term

* Corresponding author. Tel.: +44-1222-874429; fax: +44-1222-874003. E-mail address: [email protected] (D.T. Pham)
0954-1810/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0954-1810(99)00003-5

1. Introduction
Inverse dynamics identification is defined as finding the inverse mapping of a system, as illustrated in Fig. 1. It is useful to know the inverse dynamics of a plant in order to control it. Then, in an ideal situation, the dynamics of the controller could simply be made equal to the plant inverse dynamics. Neural networks have been employed to identify or extract inverse dynamics models of unknown plants through learning [1–8].

Fig. 1. Plant dynamics and inverse dynamics. (a) Forward dynamics; (b) Inverse dynamics; (c) Inverse dynamics identification.

There are two main approaches to the neural-network-based identification of the inverse dynamics of plants. The first approach, adopted by Psaltis for example [1], can be called direct or generalised inverse learning. A neural network is fed outputs from the plant and directly taught to generate the plant inputs that produced those outputs [1]. Errors between the desired and actual outputs of the network are used to adjust the network weights. The second approach, described by Psaltis [1], Saerens and Soquet [2], Hoskins [3], and Brown [4], for instance, can be called indirect or specialised inverse learning. The learning is indirect because it is achieved only by training the neural network to act as a controller to the plant. Errors between the desired and actual outputs of the plant (and not those of the network) have to be backpropagated through the plant or a model of it to adjust the weights of the network.

Clearly, direct inverse learning is in theory simpler as it does not require this additional backpropagation procedure. Direct inverse learning also has the advantage of resulting in an inverse dynamics model that can be tested independently without having to involve the plant. However, current schemes for direct inverse learning are not goal directed and therefore inefficient [9]. Furthermore, they are prone to giving incorrect inverse models if the plant has nonlinearities.

This article proposes a new type of recurrent backpropagation neural network for improved direct inverse learning. The proposed scheme possesses the advantages inherent in direct inverse learning without suffering from the drawbacks of existing techniques.

The article is organised as follows. Section 2 derives the equations describing the operation of the proposed neural network and shows that it can represent the inverse dynamics of a plant. Section 3 describes the architecture adopted for training the network for inverse dynamics identification. Section 4 presents two ways of using the network for control. Section 5 gives the results of simulation experiments where the proposed neural network was employed successfully to identify and control a variety of unknown plants.

2. Theory
A model of the inverse dynamics of a linear plant is first defined in this section. The equations for the so-called "state" layer in the proposed network are then derived. It will be shown that the output of the state layer in a network with linear neurons can represent the state of the inverse of a linear system.

2.1. Inverse dynamics of a linear plant
The inverse dynamics of a linear plant can be described in the form of a discrete state space model:

$$\mathbf{x}(k+1) = \boldsymbol{\Gamma}\,\mathbf{x}(k) + \boldsymbol{\Lambda}\,\mathbf{y}(k+1) \tag{1a}$$

$$\mathbf{u}(k) = \mathbf{J}\,\mathbf{x}(k) + \boldsymbol{\Pi}\,\mathbf{y}(k+1) \tag{1b}$$

where $\mathbf{x} \in \mathbb{R}^{n}$, $\mathbf{y} \in \mathbb{R}^{m}$ and $\mathbf{u} \in \mathbb{R}^{q}$ represent the state vector, input vector (plant output) and output vector (input to the plant) respectively, $\boldsymbol{\Gamma} \in \mathbb{R}^{n \times n}$, $\boldsymbol{\Lambda} \in \mathbb{R}^{n \times m}$, $\mathbf{J} \in \mathbb{R}^{q \times n}$ and $\boldsymbol{\Pi} \in \mathbb{R}^{q \times m}$ are coefficient matrices, and $k$ is the time instant.

2.2. Proposed network
The proposed network comprises an input layer, a hidden layer, an output layer and a state layer, as shown in Fig. 2(a) and (b). This structure is the same as that of the network described by the authors in [10] for dynamics modelling. However, the new inverse dynamics modelling network has two input elements rather than one. One of the input elements is to receive the plant output (assuming a single-output plant is being inverse modelled) and the other is an error input to compensate for modelling uncertainties. In addition to the feedback connections used in the network presented in [10], namely, connections from the output layer to the state layer ($\mathbf{P}$), from the hidden layer to the state layer ($\mathbf{S}$), and from the state layer to itself ($\mathbf{V}$), the new network also has trainable connections from the input layer to the state layer ($\mathbf{Z}$) to provide redundancies for plants with nonlinear dynamics involving multiplicative input–output terms (see, for example, the plant of type 4 in [11]). In both the new network and the network in [10], delay elements are included in the feedback paths to give the networks a dynamic memory capacity.

Fig. 2. Architecture of the proposed network. (a) Schematic diagram. (b) Connection details.

Let $\hat{\mathbf{u}} \in \mathbb{R}^{q}$ and $\mathbf{y}_n \in \mathbb{R}^{2m}$ be the network output and input vectors respectively, where the input vector $\mathbf{y}_n$ is defined as $\mathbf{y}_n = [\mathbf{y}\ \mathbf{e}]^{T}$ with $\mathbf{y} \in \mathbb{R}^{m}$ being an external input vector (the plant output) and $\mathbf{e} \in \mathbb{R}^{m}$, the additional error input vector. As mentioned earlier, $\mathbf{e}$ is applied in compensation for the error between the desired and actual plant output due to modelling and environmental uncertainties. The need for $\mathbf{e}$ will become apparent when the network is used to control the plant. With all neurons in the network having linear activation, the following relations can be obtained:

$$\hat{\mathbf{u}}(k) = \mathbf{W}\,\mathbf{h}(k) \tag{2}$$

$$\mathbf{h}(k) = \mathbf{A}\,\mathbf{c}(k) + \mathbf{B}\,\mathbf{y}_n(k+1) \tag{3}$$

$$\mathbf{c}(k) = \mathbf{P}\,\hat{\mathbf{u}}(k-1) + \mathbf{S}\,\mathbf{h}(k-1) + \mathbf{V}\,\mathbf{c}(k-1) + \mathbf{Z}\,\mathbf{y}_n(k) \tag{4}$$
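The recursion of Eqs. (2)–(4) and the state space form the paper derives from it (Eqs. (5)–(7)) can be compared numerically. The sketch below is only an illustration under stated assumptions: the trainable weights are random stand-ins, the fixed feedback matrices are assumed to be an all-ones $\mathbf{P}$, an identity $\mathbf{S}$ and $\mathbf{V} = \beta\mathbf{I}$, and all helper names (`matmul`, `matvec`, `add`, `max_output_gap`) are hypothetical rather than taken from the article.

```python
import random

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    """Product of a matrix (list of rows) and a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def add(*terms):
    """Element-wise sum of equally shaped vectors or matrices."""
    if isinstance(terms[0][0], list):
        return [add(*rows) for rows in zip(*terms)]
    return [sum(vals) for vals in zip(*terms)]

def rand_matrix(rows, cols, rng, scale=0.3):
    """Random stand-in for a trained weight matrix."""
    return [[scale * rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def scaled_identity(size, gain):
    return [[gain if i == j else 0.0 for j in range(size)] for i in range(size)]

def max_output_gap(n=3, m=1, q=1, steps=20, beta=0.8, seed=0):
    """Largest relative gap between the layer recursion and the state space form."""
    rng = random.Random(seed)
    # Trainable feedforward weights (random values, not trained ones).
    W, A = rand_matrix(q, n, rng), rand_matrix(n, n, rng)
    B, Z = rand_matrix(n, 2 * m, rng), rand_matrix(n, 2 * m, rng)
    # Fixed feedback weights (assumed layout): alpha = gamma = 1.0, self-gain beta.
    P = [[1.0] * q for _ in range(n)]
    S, V = scaled_identity(n, 1.0), scaled_identity(n, beta)

    # State space matrices of Eqs. (5)-(7).
    G = add(matmul(matmul(P, W), A), matmul(S, A), V)
    L = add(matmul(matmul(P, W), B), matmul(S, B), Z)
    J, Pi = matmul(W, A), matmul(W, B)

    # yn[k] stands for the network input y_n(k).
    yn = [[rng.gauss(0.0, 1.0) for _ in range(2 * m)] for _ in range(steps + 2)]
    c = [0.0] * n                                # c(0)
    h = add(matvec(A, c), matvec(B, yn[1]))      # h(0), Eq. (3)
    u = matvec(W, h)                             # u_hat(0), Eq. (2)
    c_ss, gap = list(c), 0.0
    for k in range(1, steps + 1):
        # Layer-by-layer recursion: Eq. (4), then Eq. (3), then Eq. (2).
        c = add(matvec(P, u), matvec(S, h), matvec(V, c), matvec(Z, yn[k]))
        h = add(matvec(A, c), matvec(B, yn[k + 1]))
        u = matvec(W, h)
        # Collapsed state space form: Eq. (6), then Eq. (7).
        c_ss = add(matvec(G, c_ss), matvec(L, yn[k]))
        u_ss = add(matvec(J, c_ss), matvec(Pi, yn[k + 1]))
        gap = max(gap, max(abs(a - b) / (1.0 + abs(a)) for a, b in zip(u, u_ss)))
    return gap

print(max_output_gap())
```

The printed gap should be at the level of floating-point rounding, which is what the algebraic substitution of Eqs. (2) and (3) into Eq. (4) predicts for linear neurons. The check says nothing about networks with nonlinear activation functions.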
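In Eqs. (2)–(4), $\mathbf{W} \in \mathbb{R}^{q \times n}$, $\mathbf{A} \in \mathbb{R}^{n \times n}$, $\mathbf{B} \in \mathbb{R}^{n \times 2m}$, $\mathbf{P} \in \mathbb{R}^{n \times q}$, $\mathbf{S} \in \mathbb{R}^{n \times n}$, $\mathbf{V} \in \mathbb{R}^{n \times n}$ and $\mathbf{Z} \in \mathbb{R}^{n \times 2m}$ are weight matrices, and $\mathbf{h}$ and $\mathbf{c}$ represent the outputs of the hidden layer and state layer, respectively. The feedforward connection weights in $\mathbf{W}$, $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{Z}$ are trainable using the generalised delta rule [12]. The feedback connection weights in $\mathbf{P}$, $\mathbf{S}$ and $\mathbf{V}$ are fixed. The connection weights $\alpha$ and $\gamma$ in $\mathbf{P}$ and $\mathbf{S}$ all have the same value 1.0 and those in $\mathbf{V}$ have the same value $\beta$. From Eqs. (2)–(4) the output of the state layer can be rewritten as:

$$\mathbf{c}(k) = (\mathbf{P}\mathbf{W}\mathbf{A} + \mathbf{S}\mathbf{A} + \mathbf{V})\,\mathbf{c}(k-1) + (\mathbf{P}\mathbf{W}\mathbf{B} + \mathbf{S}\mathbf{B} + \mathbf{Z})\,\mathbf{y}_n(k). \tag{5}$$

Eq. (5) is of the form:

$$\mathbf{c}(k+1) = \tilde{\boldsymbol{\Gamma}}\,\mathbf{c}(k) + \tilde{\boldsymbol{\Lambda}}\,\mathbf{y}_n(k+1) \tag{6}$$

where $\tilde{\boldsymbol{\Gamma}} = \mathbf{P}\mathbf{W}\mathbf{A} + \mathbf{S}\mathbf{A} + \mathbf{V}$ and $\tilde{\boldsymbol{\Lambda}} = \mathbf{P}\mathbf{W}\mathbf{B} + \mathbf{S}\mathbf{B} + \mathbf{Z}$ are $n \times n$ and $n \times 2m$ matrices, respectively. From Eqs. (2) and (3), the network output is given by:

$$\hat{\mathbf{u}}(k) = \tilde{\mathbf{J}}\,\mathbf{c}(k) + \tilde{\boldsymbol{\Pi}}\,\mathbf{y}_n(k+1) \tag{7}$$

with $\tilde{\mathbf{J}} = \mathbf{W}\mathbf{A}$ and $\tilde{\boldsymbol{\Pi}} = \mathbf{W}\mathbf{B}$. Eq. (6) has the same form as Eq. (1a) and is a state space equation with state vector $\mathbf{c}$. Eq. (7) has the same form as Eq. (1b) with state vector $\mathbf{c}$ and input vector $\mathbf{y}_n$. Therefore, according to Section 2.1, the proposed network topology can represent the inverse dynamics mapping from $\mathbf{y}_n$ to $\mathbf{u}$.

3. Inverse dynamics identification

Fig. 3. Inverse dynamics identification scheme.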
The network described in the previous section was used with the direct inverse learning scheme for inverse system identification as shown in Fig. 3. The forward plant model (FN) required for the scheme can readily be obtained by forward system identification [10]. FN is used to generate the error signal $e$ for inputting to the inverse modelling network. This error signal will be needed to close the control loop when the network is subsequently employed as a controller. The delay term ($z^{-d}$) for the target plant input is selectable, with $d$ usually given a value in the range 1–3 for a plant without time delays. This term is provided to allow for time lags in the neural identifier. At the beginning of the process of learning a plant inverse dynamics or when a sudden change is applied at the plant input, the network output may oscillate or even diverge. When the network is used as a controller of the plant, this problem must be strictly avoided. The time lags can help to stabilise the inverse learning process [13,14].

The training set must be chosen properly to contain sufficient frequency components to achieve accurate identification of the given plant [15,16]. In neural network identifications, most researchers have used pseudo random signals or sinusoidal signals [6,7,13]. When the identification is performed on-line while the plant is in normal operation, some other techniques should be considered for obtaining rich training signals (for example by adding noise to the input) or a control performance assessment mechanism should be employed [14,16]. However, detailed discussion of this issue is out of the scope of this article.

During training, the network takes as inputs the plant output and the additional error term $e$ generated through FN from the error between the delayed target input to the plant, $z^{-d}u$, and the network output, $\hat{u}$. The network learns to produce $\hat{u}$ as an approximation of $z^{-d}u$ by adjusting its trainable weights based on the gradient of the error between $z^{-d}u$ and $\hat{u}$ with respect to these weights, so as to drive the error to zero. In the input–output model, $\hat{u}$ can be written as:

$$\begin{aligned}
\hat{u}(k) &= f_o(\mathbf{W}\mathbf{h}(k)) = f_o\bigl(\mathbf{W} f_h(\mathbf{B}\mathbf{y}_n(k+1) + \mathbf{A}\mathbf{c}(k))\bigr) \\
&= f_o\Bigl(\mathbf{W} f_h\bigl(\mathbf{B}\mathbf{y}_n(k+1) + \mathbf{A} f_c(\mathbf{Z}\mathbf{y}_n(k) + \mathbf{h}(k-1) + \beta\,\mathbf{c}(k-1) + \hat{u}(k-1))\bigr)\Bigr)
\end{aligned} \tag{8}$$

where $f_o$, $f_h$, and $f_c$ are the activation functions of the output, hidden, and state layer elements in the network.

4. Control schemes
Fig. 4(a) illustrates how, after training, the network might be used in a recall mode to control the plant, the inverse dynamics of which it has learnt to model. This scheme is applicable when the plant is time invariant, as the inverse dynamics model is fixed after learning is completed. If the plant does not vary abruptly, the scheme can be modified for on-line training and adaptive control as shown in the architecture of Fig. 4(b). Here, the desired output is used as an input to the control network, IN_c, which is a copy of the inverse modelling network, IN. IN is continuously trained on-line to capture the non-stationary inverse dynamic characteristics of the plant. The initial weights of IN and IN_c are obtained by first applying the off-line inverse identification scheme depicted in Fig. 3. For further details of the use of inverse dynamics models based on neural networks to control nonlinear time-varying systems, see [14,17]. From Fig. 4(a) and (b), it is apparent that the control loop is closed by feeding the error signal $e$ between the desired and actual plant outputs to the neural controllers (IN in Fig. 4(a) and IN_c in Fig. 4(b)). This explains the use of $e$ as an input to IN during identification (Figs. 3 and 4(b)).

Fig. 4. Control scheme. (a) Control of time-invariant plant. (b) On-line control of time-varying plant (with continuous inverse identification).

5. Simulation
The ability of the proposed neural network to identify the inverse dynamics of various plants has been tested in simulation. Although the modelling capability of the proposed network has been theoretically demonstrated only for linear systems, simulations have been carried out to explore its use to represent nonlinear systems also. This idea is feasible as nonlinear activation functions can be assigned to the neurons in the network to enable it to perform nonlinear mappings [7,16,18–21]. The performance indices used are:

1. mean square error for the training phase

$$\mathrm{MSE} = \frac{1}{N}\sum_{k=1}^{N}\{u(k) - \hat{u}(k)\}^{2} \tag{9}$$

2. normalised mean square error for the recall phase

$$\mathrm{NMSE\_U} = \frac{(1/N)\sum_{k=1}^{N}\{u(k) - \hat{u}(k)\}^{2}}{(1/N)\sum_{k=1}^{N}\{u(k)\}^{2}} \tag{10}$$

3. normalised mean square error for the control phase

$$\mathrm{NMSE\_Y} = \frac{(1/N)\sum_{k=1}^{N}\{y_d(k+1) - y(k+1)\}^{2}}{(1/N)\sum_{k=1}^{N}\{y_d(k+1)\}^{2}} \tag{11}$$

In the previous equations, $N$ represents the number of data pairs $(y, u)$ in the training set. $N$ was taken as 400 in all the simulations carried out.

In the simulations, the pattern-based training method was used [22]. Each training iteration consisted of presenting an input data item, $u(k)$, to the plant at time $k$ and feeding the measured output, $y(k+1)$, to the inverse modelling network to produce its output $\hat{u}(k)$. The difference between $z^{-d}u(k)$ and $\hat{u}(k)$ was used to modify the weights of the network as shown in Fig. 3.

Three sets of simulation experiments were conducted. The first was on a linearised model of a typical electrical servomechanism. The second and third sets of experiments involved non-linear models of processes such as fermentation, distillation and chemical reactions.

Table 1
Structural and training parameters for neural networks

Parameter      η        μ      β      n   T         Activation (hidden layer neurons)
Simulation 1   0.015    0.05   0.80   6   300 000   Linear
Simulation 2   0.0015   0.05   0.80   7   300 000   Hyperbolic tangent
Simulation 3   0.010    0.05   0.80   5   300 000   Hyperbolic tangent
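The three performance indices of Eqs. (9)–(11) share one structure: a mean squared error, optionally divided by the mean square of the reference signal. A minimal sketch follows, with hypothetical function names and made-up sample values that are not taken from the article's experiments.

```python
def mean_squared_error(target, estimate):
    """Eq. (9): mean of the squared differences between two equal-length sequences."""
    if not target or len(target) != len(estimate):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((t - e) ** 2 for t, e in zip(target, estimate)) / len(target)

def normalised_mse(target, estimate):
    """Eqs. (10) and (11): MSE divided by the mean square of the target signal."""
    signal_power = sum(t * t for t in target) / len(target)
    if signal_power == 0.0:
        raise ValueError("target signal has zero power; NMSE is undefined")
    return mean_squared_error(target, estimate) / signal_power

# Illustrative values only (not data from the article).
u_true = [1.0, 2.0, 3.0]
u_hat = [1.0, 2.0, 5.0]
print(mean_squared_error(u_true, u_hat))
print(normalised_mse(u_true, u_hat))
```

For NMSE_U the arguments would be the target plant input and the network output during recall; for NMSE_Y they would be the desired and actual plant outputs during control.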
Table 2
Performance indices obtained in different simulations

                        Simulation 1          Simulation 2          Simulation 3
                        Step      Sinusoid    Step      Sinusoid    Step      Sinusoid
MSE (before training)   0.26473               0.26776               0.58454
MSE (after training)    0.00060               0.00192               0.00324
NMSE_U (recall)         0.00224   0.00165     0.01540   0.00489     0.00223   0.00822
NMSE_Y (control)        0.00378   0.01520     0.00190   0.00145     0.01640   0.01807

Fig. 5. Responses of inverse dynamics network during recall (Simulation 1).
Fig. 6. Responses of plant under control of inverse dynamics network (Simulation 1).
Fig. 7. Responses of inverse dynamics network during recall (Simulation 2).

Simulation 1. The linear dynamic system considered behaved according to the following discrete input–output equation (a sampling period of 0.01 s was assumed):

$$y(k+1) = 1.72357\,y(k) - 0.74082\,y(k-1) + 0.00904\,u(k) + 0.0082\,u(k-1) \tag{12}$$

The optional delay term ($z^{-d}$) was set to $z^{-1}$. For the training phase, the parameters of the network were chosen as shown in Table 1, where $\eta$, $\mu$, $\beta$ and $n$, respectively, denote the learning coefficient, momentum term, self-feedback connection and number of neurons in the hidden layer and state layer of the network, and $T$ represents the number of training iterations. All neurons in the network had linear activation. For the training signals, first a sinusoidal signal, $u(k) = 0.1\sin(2\pi k \times 0.1) + 0.4\sin(2\pi k \times 0.03) + 0.6\sin(2\pi k \times 0.01)$, was applied to the plant to obtain its response (according to [15], pp. 142–148, this training signal should be sufficiently rich for the given plant). White Gaussian noise with variance equal to 0.014 (approximately 5% of the output) was then added to the plant output to yield the network input $y$. The other network input $e$ was produced on-line by feeding the error between the network output $\hat{u}$ and the delayed plant input $z^{-d}u$ to a pre-identified forward model of the plant.

The values of the performance indices are given in Table 2. Fig. 5(a) and (b) show the responses of the network during recall for the case where the target plant input $u(k)$ was, respectively, a step function ($u(k) = 1.0,\ k > 0$) and a sinusoidal function ($u(k) = 0.5\sin(2\pi k \times 0.03) + 0.5\sin(2\pi k \times 0.01)$). The trained network was then employed to control the plant as illustrated in Fig. 4(a). Fig. 6(a) and (b) give the results obtained when the desired plant output was a step, $y_d(k) = 0.5$, and a sinusoid, $y_d(k) = 0.5\sin(2\pi k \times 0.03) + 0.5\sin(2\pi k \times 0.01)$, respectively.
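The training data described for Simulation 1 can be regenerated along the following lines. This is a sketch under assumptions: zero initial conditions, a seeded pseudo-random generator and the helper names are choices made here, not details given in the article, and the unit step is applied from k = 0 for simplicity.

```python
import math
import random

def plant_step(y_k, y_km1, u_k, u_km1):
    """Eq. (12): next output of the linearised servomechanism model."""
    return 1.72357 * y_k - 0.74082 * y_km1 + 0.00904 * u_k + 0.0082 * u_km1

def simulate_plant(u):
    """Return [y(0), ..., y(N)] for inputs [u(0), ..., u(N-1)], zero initial conditions."""
    y = [0.0]
    for k in range(len(u)):
        y_km1 = y[k - 1] if k > 0 else 0.0
        u_km1 = u[k - 1] if k > 0 else 0.0
        y.append(plant_step(y[k], y_km1, u[k], u_km1))
    return y

def training_input(k):
    """Three-component sinusoidal training signal quoted in the text."""
    return (0.1 * math.sin(2 * math.pi * k * 0.1)
            + 0.4 * math.sin(2 * math.pi * k * 0.03)
            + 0.6 * math.sin(2 * math.pi * k * 0.01))

def noisy_training_data(n_samples=400, noise_variance=0.014, seed=0):
    """Plant input and noise-corrupted plant output (the network input y)."""
    rng = random.Random(seed)
    u = [training_input(k) for k in range(n_samples)]
    sigma = math.sqrt(noise_variance)
    y_noisy = [y + rng.gauss(0.0, sigma) for y in simulate_plant(u)]
    return u, y_noisy

# Open-loop check: the unit-step response should settle at the DC gain of Eq. (12).
step_response = simulate_plant([1.0] * 400)
dc_gain = (0.00904 + 0.0082) / (1.0 - 1.72357 + 0.74082)
print(step_response[-1], dc_gain)
```

The DC gain of Eq. (12) is close to one, so in open loop the plant output settles near the level of a constant input. The sketch covers only data generation; it does not reproduce the network training or the forward model FN.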
The responses shown in Fig. 6(a) and (b) could have been achieved using a conventional PID controller as the plant is a simple linear plant. However, this set of simulation experiments has demonstrated that the proposed network can be trained to control a system without requiring explicit information about it.

Simulation 2. A nonlinear plant with the following input–output equation was to be identified and controlled:

$$y(k+1) = \frac{y(k)\,y(k-1)\,\bigl(y(k) + 2.5\bigr)}{1.0 + y(k)\,y(k) + y(k-1)\,y(k-1)} + u(k) + w(k+1). \tag{13}$$

The neural network structural and training parameters are shown in Table 1. Hyperbolic tangent activation functions were used for the hidden layer neurons, and linear activation functions for the other elements in the network. A combined sinusoidal signal (the same as in Simulation 1) was applied as the training input signal. White Gaussian noise with variance 0.014 (again, approximately 5% of the output) was added to the output of the plant. Table 2 gives the values obtained for the performance indices. The responses of the network during the recall phase for the case of step and sinusoidal plant inputs are shown in Fig. 7(a) and (b). Fig. 8(a) and (b) present the results when the trained network was used to control the plant according to the scheme depicted in Fig. 4(a). It can be noted that good accuracy was achieved in spite of the high degree of non-linearity of the plant, which would have caused problems for conventional linear controllers.

Fig. 8. Responses of plant under control of inverse dynamics network (Simulation 2).
Fig. 9. Responses of inverse dynamics network during recall (Simulation 3).
Fig. 10. Responses of plant under control of inverse dynamics network (Simulation 3).

Simulation 3. This simulation was conducted for a nonlinear plant with the following input–output equation:

$$y(k+1) = \frac{y(k) - 0.3\,y(k-1)}{1.5 + y(k)\,y(k)} + 0.5\,u(k) + w(k+1). \tag{14}$$

The structural and training parameters for the neural network are given in Table 1. Hyperbolic tangent activation functions were also adopted for the hidden layer neurons, and linear activation functions for all the other elements. Apart from the small differences in the structural and training parameters adopted (Table 1), the training conditions were the same as for Simulations 1 and 2. The results obtained are illustrated in Figs. 9 and 10. Again, these show the effectiveness of the proposed scheme in modelling and controlling a difficult plant.
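For readers who wish to experiment with the two nonlinear plants, the sketch below simulates them in open loop. It assumes that Eq. (13) reads y(k+1) = y(k)y(k-1)(y(k)+2.5)/(1.0 + y(k)² + y(k-1)²) + u(k) + w(k+1) and that Eq. (14) reads y(k+1) = (y(k) - 0.3y(k-1))/(1.5 + y(k)²) + 0.5u(k) + w(k+1); the function names, the zero initial conditions and the omission of the noise term w by default are assumptions made here.

```python
import math

def plant_sim2(y_k, y_km1, u_k, w_next=0.0):
    """Assumed reading of Eq. (13); w_next is the unknown noise term w(k+1)."""
    nonlinear = (y_k * y_km1 * (y_k + 2.5)) / (1.0 + y_k * y_k + y_km1 * y_km1)
    return nonlinear + u_k + w_next

def plant_sim3(y_k, y_km1, u_k, w_next=0.0):
    """Assumed reading of Eq. (14); w_next is the unknown noise term w(k+1)."""
    return (y_k - 0.3 * y_km1) / (1.5 + y_k * y_k) + 0.5 * u_k + w_next

def open_loop(plant, u):
    """Return [y(0), ..., y(N)] for inputs [u(0), ..., u(N-1)], zero initial conditions."""
    y = [0.0]
    for k, u_k in enumerate(u):
        y_km1 = y[k - 1] if k > 0 else 0.0
        y.append(plant(y[k], y_km1, u_k))
    return y

# Sinusoidal training input quoted for Simulation 1 and reused in Simulation 2.
u_train = [0.1 * math.sin(2 * math.pi * k * 0.1)
           + 0.4 * math.sin(2 * math.pi * k * 0.03)
           + 0.6 * math.sin(2 * math.pi * k * 0.01) for k in range(400)]

y_sim2 = open_loop(plant_sim2, u_train)
y_sim3 = open_loop(plant_sim3, u_train)
print(max(abs(v) for v in y_sim2), max(abs(v) for v in y_sim3))
```

These open-loop runs only exercise the plant equations; they do not reproduce the recall or control results reported in Table 2.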
6. Conclusion
A practical scheme for inverse dynamics identification of plants using a new type of recurrent backpropagation neural network has been proposed in this article. Owing to the simple nature of the neural network and the on-line training architecture adopted, the scheme possesses the advantages of direct inverse learning without suffering from the drawbacks of existing inverse learning techniques. The results of applying the scheme to three different plants, including two with high degrees of non-linearity, have demonstrated the good performance of the proposed scheme. Although this article has focused on time-invariant plants, it has also indicated how the scheme described here can be used for on-line adaptive control of time-variant systems.

Acknowledgements
The authors wish to thank the European Commission and the Welsh Office for supporting the work described in this article under the European Regional Development Fund projects "Knowledge-Based Manufacturing Centre", "Innovation in Manufacturing Centre" and "Innovative Technologies for Effective Enterprises". The projects are administered by the Welsh European Programme Executive. The article has been improved as a result of suggestions by a referee whose help is appreciated by the authors.

References
[1] Psaltis D, Sideris A, Yamamura A. A multilayered neural network controller. IEEE Control Systems Magazine 1989;April:17–21.
[2] Saerens M, Soquet A. Neural controller based on back-propagation algorithm. Proc IEE: Part F 1991;138(1):55–62.
[3] Hoskins D, Hwang JN, Vagners J. Iterative inversion of neural networks and its application to adaptive control. IEEE Trans on Neural Net 1992;3(2):292–301.
[4] Brown M, Lightbody G, Irwin G. Nonlinear internal model control using local model networks. IEE Proc Control Theory and Appl 1997;144(6):505–524.
[5] Hunt KJ, Sbarbaro D, Zbikowski R, Gawthrop PJ. Neural networks for control systems: a survey. Automatica 1992;28(6):1083–1112.
[6] Diron J, Casassud M, Le Lann M, Casamatta G. Design of a neural controller by inverse modelling. Computers and Chem Engng 1995;19:S797–802.
[7] Deshpande N, Gupta M. Inverse kinematic neuro-control of robotic systems. Engng Appl AI 1998;11(1):55–66.
[8] Lee J-W, Oh H-H. Inversion control of nonlinear systems with neural network modelling. IEE Proc Control Theory and Appl 1997;144(5):481–487.
[9] Colombano SP, Compton M, Bualat M. Goal directed model inversion: adaptation to unexpected model changes, in: Proc. 4th International Conference on Neural Networks and Their Applications (NEURO-NIMES'91), Nimes, France, 1991, pp. 271–278.
[10] Pham DT, Oh SJ. A recurrent backpropagation neural network for dynamic system identification. J Systems Engng 1992;2(4):213–223.
[11] Narendra K, Parthasarathy K. Identification and control of dynamical systems using neural networks. IEEE Trans Neural Net 1990;1(1):4–27.
[12] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. Parallel distributed processing, 1. Cambridge, MA: MIT Press, 1988.
[13] Sira-Ramirez H, Zak SH. The adaptation of perceptrons with applications to inverse dynamics identification of unknown dynamic systems. IEEE Trans Sys Man Cybernetics 1991;21(3):634–643.
[14] Pham DT, Oh SJ. Adaptive control of dynamic systems using neural networks, in: Proc IEEE-SMC Conference, Systems Engineering in the Service of Humans, Vol. 4, Le Touquet, France, 1993, pp. 97–102.
[15] Landau I. System identification and control design. Englewood Cliffs, NJ: Prentice-Hall, 1990, pp. 142–148.
[16] Lu S, Basar T. Robust nonlinear system identification using neural networks models. IEEE Trans Neural Net 1998;9(3):407–429.
[17] Zeman V, Patel R, Khorasani K. Control of a flexible-joint robot using neural networks. IEEE Trans Control Systems Technol 1997;5(4):453–462.
[18] Song Q. Robust training algorithm of multilayered neural networks for identification of nonlinear dynamic systems. IEE Proc Control Theory Appl 1998;145(1):41–46.
[19] Lightbody G, Irwin G. Direct neural model reference adaptive control. IEE Proc Control Appl 1995;142(1):31–43.
[20] Ni X, Verhaegen M, Krijgsman A. A new method for identification and control of nonlinear dynamic systems. Engng Appl AI 1996;9(3):231–243.
[21] Chen T, Chen H. Universal approximation to nonlinear operators by NNs with arbitrary activation functions and its application to dynamical systems. IEEE Trans Neural Net 1995;6:911–917.
[22] McClelland JL, Rumelhart DE. Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Cambridge, MA: MIT Press, 1988.