Copyright © IFAC System Identification, Kitakyushu, Fukuoka, Japan, 1997
STRUCTURAL OPTIMIZATION OF NEURAL NETWORKS FOR SYSTEMS MODELLING

D. Popovic and D. Xiaosong
Institute of Automation Technology, University of Bremen, Germany
Abstract
A new, optimal structure for backpropagation networks is proposed that enables the modelling and on-line identification of dynamic systems. The activation function commonly used for individual neurons is sigmoid-like, which permits the modelling of nonlinear dynamic systems. However, because the required condition of uniformly distributed on-line training examples is usually not satisfied, backpropagation networks are not suitable for on-line training. To solve this problem and to adapt the neural network to the modelling and on-line identification of real-time systems, the sigmoid functions have been modified, as described below, so that the behaviour of each neuron is specified within a limited range of the training data. In this way, the strong couplings among the network parameters are largely released, and the training data for backpropagation networks need not be chosen randomly from the identified input range. It is shown that the modified network exhibits a real-time memory capability and can be trained on-line.
Keywords: neural networks, system modelling, on-line training, structural optimization, real-time systems identification.

1. Introduction
The application of backpropagation networks in dynamic system modelling generally requires, in the learning phase, a large number of selected training data, from which the inherent system behaviour is learnt. If the trained network can represent this behaviour with the desired accuracy, it has been successfully trained. However, it is still not guaranteed that the trained network can really be used as the system model. This must be tested on as many fresh examples as possible. Only if the trained network can accurately predict the system behaviour can it be accepted as a valid system model. Industrial applications of neural networks [7] face the difficulty of obtaining a large, representative set of learning examples from the process to be modelled over the entire range of interest, particularly for on-line network training. The issue is especially crucial when time-varying systems have to be modelled: here, repeated on-line system identification is required, and thus recurrent network training.
Because the speed of off-line training of networks is relatively slow, efforts have been made to accelerate the convergence of the off-line learning process in order to resolve the above difficulties, as reported in [1], [2], [3], and [5]. In the following, an on-line learning approach is proposed for this purpose. It is well known that on-line learning with backpropagation networks cannot be achieved by simply improving the learning algorithm [6]. It is much more important to develop a suitable network structure and/or to modify the input-output behaviour of individual neurons. In the approach presented below, this behaviour is modified for the neurons in the hidden layer using an output limiter, i.e. a function that limits the values of the signals at the neuron outputs. It is shown that the backpropagation network, modified in this way, becomes content-addressable, so that it has a very strong ability to remember the information of training examples from preceding ranges and training times.
2. Enhanced Capability of the Network Memory
A troublesome feature of backpropagation networks with sigmoid-like activation functions is that a network trained in one range of training data has to be trained anew before it can be applied in another range, because the network can forget information once learnt. In the following, a procedure is described that enables the neural network to learn new patterns in a new domain without losing the knowledge saved in another domain. Such network behaviour is closer to the behaviour of the human brain.
Let us consider a backpropagation network with two inputs and one output, where the activation function of every neuron in the hidden layer is the sigmoid

y = 1 / (1 + exp(-(w1 u1 + w2 u2 + b)))

If w1 = -7.5, w2 = -7.5, b = 10.5 and the network inputs are within the range (0,1), the output of the network will be the weighted sum of p nonlinear surfaces, as shown in Fig. 1.

Fig. 1: The output of neurons in the hidden layer
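The response of a single hidden neuron of this kind can be sketched in a few lines of Python (an illustrative sketch, not code from the paper; only the sigmoid form and the parameter values w1 = w2 = -7.5, b = 10.5 are taken from the text above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_neuron(u1, u2, w1=-7.5, w2=-7.5, b=10.5):
    """Standard sigmoid response of one hidden neuron with two inputs."""
    return sigmoid(w1 * u1 + w2 * u2 + b)

# Evaluate the neuron over the identified input range (0, 1) x (0, 1);
# this yields one of the p nonlinear surfaces whose weighted sum forms
# the network output (cf. Fig. 1).
u1, u2 = np.meshgrid(np.linspace(0.0, 1.0, 21), np.linspace(0.0, 1.0, 21))
surface = hidden_neuron(u1, u2)
print(surface.min(), surface.max())   # the surface spans nearly (0, 1)
```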
To reduce the strong coupling between the individual neurons, the sigmoid function of each neuron is multiplied by a function that limits its output and restrains its behaviour within a small dynamic range. In this way, the memory of the network is split up into a number of smoothly interconnected, independent storage areas. The modified activation function proposed for the backpropagation network is expressed as

y = l(u1; α1, β1) · l(u2; α2, β2) · f(w1 u1 + w2 u2 + b)

where f is the sigmoid and l(u; α, β) denotes a limiting function with centre β and steepness parameter α.
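A minimal sketch of the modified activation follows, assuming a Gaussian-shaped limiting function (the Conclusions note that continuous probability density functions can serve as limiting functions; the exact functional form chosen here is an illustrative assumption):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def limiter(u, alpha, beta):
    """Assumed limiting function: a Gaussian-shaped bump centred at beta.
    A larger alpha gives a narrower active range (a sharper memory area)."""
    return np.exp(-alpha * (u - beta) ** 2)

def modified_neuron(u1, u2, w1=-7.5, w2=-7.5, b=10.5,
                    a1=15.0, b1=0.5, a2=10.0, b2=0.5):
    """Sigmoid response multiplied by the limiting functions of both inputs,
    so the neuron is only active near the centre (b1, b2) = (0.5, 0.5)."""
    return (limiter(u1, a1, b1) * limiter(u2, a2, b2)
            * sigmoid(w1 * u1 + w2 * u2 + b))

print(modified_neuron(0.5, 0.5))    # strong response near the centre
print(modified_neuron(0.05, 0.95))  # practically zero far from the centre
```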
For w1 = -7.5, w2 = -7.5, b = 10.5, α1 = 15, β1 = 0.5, α2 = 10 and β2 = 0.5, the output of the neurons in the hidden layer will have the form shown in Fig. 2.

Fig. 2: The response of neurons in the hidden layer with the new activation function

In Fig. 2, the centre of the limiting range for the neuron considered is (0.5, 0.5). The connecting weights to this neuron are only responsible for memorizing the behaviour of the identified process in a small range around (0.5, 0.5); the weights of other neurons cover other ranges of the training data. If the current training data are far away from the centre (0.5, 0.5), the connecting weights to this neuron will remain almost unchanged, i.e. the memory of the network in this small range will not be lost.
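This locality of the memory can be checked numerically: in a gradient-descent update, the weight change of a neuron is scaled by its limiting-function factor, so training examples far from the neuron's centre leave its weights practically untouched. A hypothetical single-neuron update step (using the same assumed Gaussian limiter as above) illustrates this:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def limiter(u, alpha, beta):
    return np.exp(-alpha * (u - beta) ** 2)   # assumed Gaussian-shaped limiter

def weight_update(u1, u2, target, w1=-7.5, w2=-7.5, b=10.5, lr=0.1):
    """One gradient-descent step on w1 for a single modified neuron
    (illustrative; centres fixed at (0.5, 0.5), alphas at 15 and 10)."""
    lim = limiter(u1, 15.0, 0.5) * limiter(u2, 10.0, 0.5)
    s = sigmoid(w1 * u1 + w2 * u2 + b)
    y = lim * s
    # dE/dw1 for E = 0.5*(y - target)^2; the limiter scales the whole gradient.
    grad_w1 = (y - target) * lim * s * (1.0 - s) * u1
    return -lr * grad_w1

print(weight_update(0.5, 0.5, target=1.0))    # update near the centre
print(weight_update(0.05, 0.95, target=1.0))  # far away: ~100x smaller update
```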
3. Selection of the Limiting Functions
For each neuron, the selection of the proposed limiting functions can rely on fixed or on adaptive parameters. In the first case, the assigned parameters do not take part in the training process; in the second, the free parameters of the limiting function are also optimally tuned during the training phase of the network, which requires the application of special learning algorithms.
Both selections of limiting functions still guarantee that the modified network can be seen as a good approximator for nonlinear functions, based on the theorem of Stinchcombe and White [4]. However, as regards network training with arbitrarily distributed examples, the first strategy is preferable. Therefore, in the following, mainly limiting functions with fixed parameters will be considered.
Once the identified input range of the network has been decided on, the limiting functions should be distributed homogeneously within it. For a network with two inputs, the distribution of the limiting functions is illustrated in Fig. 3, where the identified input range has been split into 25 small memory areas. The circles in the figure denote the centres of the limiting functions.
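Such a homogeneous placement can be made concrete as follows (a sketch; the 5 x 5 grid over the range (0, 1) matches the 25 memory areas mentioned above, while the exact spacing is an assumption):

```python
import numpy as np

# Split the identified input range (0, 1) x (0, 1) into 5 x 5 memory areas
# and place one limiting-function centre in the middle of each area.
n = 5
centres_1d = (np.arange(n) + 0.5) / n            # 0.1, 0.3, 0.5, 0.7, 0.9
b1, b2 = np.meshgrid(centres_1d, centres_1d)
centres = np.column_stack([b1.ravel(), b2.ravel()])  # 25 pairs (beta1, beta2)
print(centres.shape)   # (25, 2), one centre per memory area of Fig. 3
```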
5. Conclusions
It was shown that the proposed new-structured backpropagation network, which splits the training range into a number of small memory areas by means of the limiting functions, is applicable to on-line training. As limiting functions, continuous probability density functions can be used. The new-structured network, however, requires more neurons in the hidden layer than the conventional one. Generally, the more memory areas are to be established, the more hidden-layer neurons will be required. The choice of the number of limiting functions and of the number of neurons assigned under each limiting function depends on the complexity of the identified system.
6. References
[1] C.M. Bishop (1992). "Curvature-Driven Smoothing in Backpropagation Neural Networks," in J.G. Taylor and C.L.T. Mannion (Eds.), Theory and Applications of Neural Networks, pp. 139-148.
[2] D. Xiaosong, D. Popovic and G. Schulz-Ekloff (1995). "Oscillation-Resisting in the Learning of Backpropagation Networks," 3rd IFAC/IFIP Workshop on Algorithms and Architectures for Real-Time Control, Ostende, Belgium, 1995.
[3] N.B. Karayiannis and A.N. Venetsanopoulos (1993). "Efficient Learning Algorithms for Neural Networks," IEEE Trans. on Systems, Man, and Cybernetics, Vol. 23, No. 5, pp. 1372-1383.
[4] M. Stinchcombe and H. White (1989). "Universal Approximation Using Feedforward Networks with Non-Sigmoid Hidden Layer Activation Functions," Intnl. Joint Conf. on Neural Networks, June 18-22, 1989, pp. I-613.
[5] D. Popovic and D. Xiaosong (1994). "The Approach of On-line Modelling of Dynamic Systems with Neural Networks," ASCC, Tokyo, Japan, July 1994.
[6] D. Popovic and P.V. Bhatkar (1994). "Methods and Tools of Applied Artificial Intelligence," Marcel Dekker Inc., New York.
[7] D. Xiaosong, D. Popovic and G. Schulz-Ekloff (1995). "Real-Time Identification and Control of a Continuous Stirred Tank Reactor with Neural Networks," IEEE/IAS International Conference on Industrial Automation and Control, Hyderabad, India, January 1995.