Analysis of chaotic population dynamics using artificial neural networks


Chaos, Solitons & Fractals, Vol. 1, No. 5, pp. 413-421 (1991). Printed in Great Britain. Pergamon Press plc.

E. B. BARTLETT
Department of Mechanical Engineering, Iowa State University, Ames, IA 50011-2230, U.S.A.

Abstract-This paper describes a method of modeling the complex population oscillations defined by a chaotic Verhulst animal population simulation. A predictive artificial neural network model is developed and tested. Computer simulation results are given. Results show that the artificial neural network model predicts chaotic Verhulst dynamics regardless of the initial condition within a limited growth rate interval.

1. INTRODUCTION

Many natural phenomena exhibit chaotic behavior. In the wild, for example, certain animal populations are known to be chaotic. Numerous situations arise where it is necessary to attempt to model these phenomena for the purposes of strategic planning or other reasons. Modeling of this type can be quite difficult. This degree of difficulty, however, does not preclude the need to understand or predict these natural phenomena. Often the information is needed for the proper management of a natural resource or another similarly noble cause. This paper describes an attempt at the task of chaotic time series prediction using the analysis technique known as artificial neural networks [1-4]. The next section of this paper gives a brief description of the chaotic model used in the analysis. Section 3 contains a description of the artificial neural networks used. Section 4 contains the computer simulation descriptions and results. Conclusions are given in Section 5.

2. A SIMPLE MODEL OF ANIMAL POPULATION DYNAMICS

There are many models which can be used to predict animal population dynamics [5-9]. Many of these systems can be shown to exhibit chaotic behavior. One of the more widely known of these models is the simple Verhulst logistic function [10],

x_{t+1} = f(x_t) = x_t + r(1 - x_t)x_t,   (1)

where x_t is the normalized population of a group of animals at time step t, r is the growth parameter such that the growth rate is r(1 - x_t), and 1 is the normalized stable population. This is of course just the discrete form of dx/dt = r(1 - x)x. It should be noted that this continuous function has a simple analytical solution [11]. A graph of the logistic function, equation (1), with x_0 = 0.0001 and r = 2.7 is given in Fig. 1. It can be shown that this function exhibits chaotic dynamics for values of the growth parameter above 2.570 [10].

Fig. 1. A chaotic time sequence with growth parameter, r, equal to 2.7 and an initial condition x_0 equal to 0.0001.

Thus, the process exhibits sensitive dependence on initial conditions and is aperiodic. Given this chaotic behavior and the frequent need to analyze complex natural phenomena, the question arises as to whether it is plausible or even possible to model or predict a chaotic system such as the one described by equation (1). It will be shown in the sections which follow that it is in fact possible to obtain an approximate predictive model of this logistic function using computer-simulated artificial neural networks.
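For readers who wish to reproduce the sequence of Fig. 1, the following is a minimal sketch of iterating equation (1); the function name and step count are choices made for this illustration, not part of the paper.

# Iterate the discrete Verhulst logistic map of equation (1):
# x_{t+1} = x_t + r*(1 - x_t)*x_t
def logistic_series(x0, r, steps):
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + r * (1.0 - x) * x)
    return xs

# Example: the chaotic sequence of Fig. 1 (r = 2.7, x0 = 0.0001).
series = logistic_series(x0=0.0001, r=2.7, steps=130)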

3. ARTIFICIAL NEURAL NETWORKS

3.1 Introduction to artificial neural networks

Among the many interesting capabilities of artificial neural networks (ANNs) is their ability to learn functional mappings from an input space to an output space [2]. ANNs achieve this ability by utilizing a large number of simple processing elements called nodes, which are interconnected with each other. These networks of nodes and interconnections are computer implementations of what is roughly analogous to biological neural networks, or living brains. ANNs are also very useful at tasks which take advantage of their high degree of parallelism and connectivity, such as pattern recognition and classification. Function mapping of unknown or difficult functions, for example in the case where a relation is known to exist but the function which relates two sets of data is elusive or difficult to evaluate, can be taught to an ANN without regard to the actual functional relationship. Thus, the need for functionality assumptions during model creation is eliminated. The input-output relation learned by an ANN is stored in the distributed memory of the ANN for future recall. Once the mapping is learned by the ANN, the recall examples need not be similar to the training data and good results are still obtained as long as the input data is within the domain of the learned mapping. In this way, the network learns the mapping between the data sets, not the data itself. This characteristic is sometimes referred to as network generalization [12]. ANNs are distinguished by their network architecture or topology, training or learning rules, and type of artificial neuron or node.


The nodes used in ANNs are usually nonlinear and are typically analog in nature. Most nodes output a limited, weighted sum of their inputs. The interesting characteristics of ANNs come more from the complexity of the interconnection scheme or network architecture, and the values of the interconnection weights, than from the rather limited abilities of the individual nodes. For this reason the study of ANNs is often called connectionism. Unlike the familiar sequential von Neumann computer, each node in an ANN has the capability of interdependent and simultaneous computation. This fact makes ANN computing very fast once the network has been trained. Training of an ANN can, however, be computationally expensive [13]. New methods for increasing learning rates are being investigated with vigor for this reason. ANN training consists of presenting a set of known examples, from training sets of associated input-output data, to the network and adjusting the interconnection weights between the individual nodes until the network learns the mapping associated with the given data. The algorithms for these adjustments can be very complex [14]. However, a relatively simple training scheme known as backpropagation is widely applied, very useful, and relatively straightforward [15]. In this work, however, a somewhat more complex stochastic learning paradigm is used for the modeling illustrations in Section 4 [16, 17].
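As a generic illustration of the idea of adjusting interconnection weights from input-output examples, a single delta-rule update for one linear node is sketched below. This is an illustration only, not the stochastic paradigm used in this paper, and the learning-rate value is arbitrary.

# One example-driven weight adjustment for a single linear node.
def update_weights(weights, inputs, target, eta=0.1):
    output = sum(w * x for w, x in zip(weights, inputs))   # node output
    error = target - output                                # error on this example
    return [w + eta * error * x for w, x in zip(weights, inputs)]  # delta-rule step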

3.2 Network and nodal architectures

The artificial neural networks utilized in this paper can be described as follows. A mapping M, which may be continuous or discrete, such that

M(X_{1,n}) = X_{I+1,n},

is modeled by a network of layered nodes as shown in Fig. 2, where

X_{1,n} = (x_{1,1,n}, x_{1,2,n}, ..., x_{1,J(1),n})^T

is the input vector and

X_{I+1,n} = (x_{I+1,1,n}, x_{I+1,2,n}, ..., x_{I+1,J(I+1),n})^T

is the output vector, corresponding to the output of the Ith layer of nodes; J(1) and J(I+1) are the dimensions of the input and output vectors respectively, and n is the training set exemplar number.

Fig. 2. An example network showing input, hidden and output nodes, as well as the indexing notation for the nodes, activations and weights. Note that I = 2 in this example.

Each node has the following input-output relation, as shown in Fig. 3,

x_{i,j,n} = k1_{i,j} {arctan(u_{i,j,n})/π + 1/2} + b1_{i,j},
u_{i,j,n} = g1_{i,j} Σ_{k=1}^{J(i-1)} (w_{i,j,k} x_{i-1,k,n}).   (2)

The trainable parameter sets are {b1_{i,j}}, {g1_{i,j}}, {k1_{i,j}} and {w_{i,j,k}}. The cost (error) function, used to measure the network performance during training, has the form

c(W) = { [1/(N J(I+1))] Σ_{n=1}^{N} Σ_{j=1}^{J(I+1)} (Ω_{I+1,j,n} - x_{I+1,j,n})^2 }^{1/2},   (3)

Fig. 3. An enlarged generalized node showing the signal flow path as well as each trainable parameter.


where N is the number of training exemplars in the training set {Ω_1, Ω_{I+1}}. The problem is to reconstruct or approximate the desired mapping Z such that

X^D_{I+1,n} = Z(Ω_{1,n}),

from {Ω_1, Ω_{I+1}}, where X^D is the desired output vector. There are, however, many solutions M which satisfy the training set,

Ω_{I+1,n} = M(Ω_{1,n}),

none of which are necessarily the desired solution

Ω_{I+1,n} = Z(Ω_{1,n}).
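As a concrete illustration of the nodal relation (2) and the RMS cost (3), a minimal sketch follows. The function and variable names are invented for this illustration; it is not the paper's implementation.

import math

# One node of equation (2): gained, weighted sum of previous-layer outputs
# passed through an arctan transfer function, scaled by k1 and offset by b1.
def node_output(x_prev, w, g1, k1, b1):
    u = g1 * sum(wk * xk for wk, xk in zip(w, x_prev))
    return k1 * (math.atan(u) / math.pi + 0.5) + b1

# Equation (3): RMS difference over all exemplars and output nodes.
def rms_cost(desired, actual):
    terms = [(d - a) ** 2 for drow, arow in zip(desired, actual)
             for d, a in zip(drow, arow)]
    return (sum(terms) / len(terms)) ** 0.5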

An outline of the training algorithm is shown in Fig. 4. The challenge is to determine the best way to adapt the trainable parameter selection criteria so that the result is an increased probability of a successful selection at future trials.

1. Make two initial random parameter set guesses and evaluate c(W) for each;
2. Store the best parameter sets, discard the sets with the largest c(W);
3. Make a small random change to each member of a parameter set and evaluate the cost function at this new time step, c^{t+1}(W). If c^{t+1}(W) < c^t(W) continue to 4; if not, go to 2;
4. Change the parameter selection criteria, f(W), based on information gained during step 3;
5. Apply the same, successful, parameter changes again. If c^{t+1}(W) < c^t(W), repeat step 5 a fixed number of times; if not, go to 2;
6. Cycle through steps 1 through 5 for each adaptable parameter set, {b1_{i,j}}, {g1_{i,j}}, {k1_{i,j}} and {w_{i,j,k}};
7. If the network learning is slow, expand the network architecture by adding a node to the most important layer;
8. If the total cost is acceptable then reduce the network size by deleting the least important node;
9. If the network structure oscillates about a fixed architecture, stop; otherwise go to 2.

Fig. 4. An outline of the stochastic paradigm used in the computer simulations.


The stochastic evaluation of integrals procedure, along with the theory of Monte Carlo importance function biasing, provides an estimate from which to select future changes in the trainable parameters. Thus, learning is adapted by the algorithm through the use of internal learning parameters which control the system dynamics by continually updating the system estimate of the optimal probability density function supplied by the theory. The recall process, after the ANN has been trained, is a simple feedforward process through the network from the input nodes to the output nodes, as illustrated in Fig. 2.
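A much-simplified sketch of the random-search idea behind steps 1-5 of Fig. 4 is given below: perturb the trainable parameters and keep a change only if the cost drops. Here params (a flat list of weights, gains, biases and output constants) and cost (for example, equation (3) evaluated over the training set) are stand-ins; the paper's paradigm additionally adapts the selection criteria with Monte Carlo importance biasing and grows or shrinks the architecture, which is omitted here.

import random

def random_search(params, cost, steps=1000, scale=0.05):
    best, best_cost = list(params), cost(params)
    for _ in range(steps):
        trial = [p + random.gauss(0.0, scale) for p in best]  # small random change
        trial_cost = cost(trial)
        if trial_cost < best_cost:                            # accept only improvements
            best, best_cost = trial, trial_cost
    return best, best_cost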

4. COMPUTER SIMULATION RESULTS

This section describes the procedure by which a network is trained to predict the logistic function, equation (1). As input the ANN is given the function's values at 5 previous time steps. The ANN is then trained to output the value of the function at the next time step. Thus, using the ANN notation presented above,

X_{1,n} = (x_{1,t+4,n}, x_{1,t+3,n}, ..., x_{1,t,n})

is the input vector and

X_{I+1,n} = x_{I+1,t+5,n}

is the output 'vector'. In this way the ANN is trained to learn the mapping

x_{I+1,t+5,n} = M(x_{1,t+4,n}, x_{1,t+3,n}, ..., x_{1,t,n})

at each time step in the training set. The training set consisted of 100 five-input, one-output patterns, so in this case 0 ≤ n ≤ 99 and 15 ≤ t ≤ 114. The growth parameter, r, was set to 2.7 and the initial condition of this sequence was x_0 = 0.0001. A graph of this sequence is shown in Fig. 1. The training set was chosen in sequence such that

x_{I+1,20,0} = M(x_{1,19,0}, x_{1,18,0}, ..., x_{1,15,0})
x_{I+1,21,1} = M(x_{1,20,1}, x_{1,19,1}, ..., x_{1,16,1})
...
x_{I+1,n+20,n} = M(x_{1,n+19,n}, x_{1,n+18,n}, ..., x_{1,n+15,n})
...
x_{I+1,119,99} = M(x_{1,118,99}, x_{1,117,99}, ..., x_{1,114,99}).
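The following sketch shows how the 100 five-input, one-output patterns described above could be assembled from the r = 2.7, x_0 = 0.0001 sequence; the helper name and the plain-list representation are choices made for this illustration.

def make_patterns(series, first_t=15, n_patterns=100, window=5):
    inputs, targets = [], []
    for n in range(n_patterns):
        t = first_t + n
        inputs.append(series[t:t + window][::-1])  # x_{t+4}, x_{t+3}, ..., x_t
        targets.append(series[t + window])         # x_{t+5}, the value to predict
    return inputs, targets

With the series sketched earlier (r = 2.7, x_0 = 0.0001, 130 steps) this yields exemplars n = 0, ..., 99 with t running from 15 to 114, as in the text.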

In each case the network is, in effect, trained to predict the value of the population at time t + 1 given only the values of the population at times t, t - 1, ..., t - 4. The network was started as a 1 x 1 x 1 (one input node, one output node and one node in one hidden layer) network. The dynamic node architecture option of the stochastic paradigm used [11] then builds its own network architecture, with the result being 5 x 3 x 1. Figure 5 shows the training set results. This graph shows the true data, the ANN output and the network error, which is simply the difference between the network output and the true value. The root-mean-square (RMS) error, see equation (3), for the training set is 0.01764. Figure 6 shows the results for the logistic function which was started with an initial condition of x_0 = 0.01 and the same growth parameter as the training set, r = 2.7. The RMS error for this recall case is 0.02371. Figures 7 and 8 show the results for the logistic function which was started with an initial condition of x_0 = 0.01 but with a growth parameter of r = 2.6 and 2.8 respectively. These results were totally unexpected. The fact that the ANN can actually follow the series produced by these different equations was a surprise.

Fig. 5. A chaotic time sequence and the corresponding artificial neural network output and error. The growth parameter for the series, r, is equal to 2.7 and the initial condition x_0 is 0.0001.

Fig. 6. A chaotic time sequence and the corresponding artificial neural network output and error. The growth parameter for the series, r, is equal to 2.7 and the initial condition x_0 is 0.01.

The RMS errors for these cases are 0.04128 and 0.07837 respectively. Table 1 summarizes the computer simulation results for the cases shown. It is important to note that the ANN was only trained on the time series shown in Fig. 1, and that it had never been trained on the data from the other sequences. Thus, the data in Fig. 1 is the only training case (set) and all the other cases were simply recalled by the ANN by a feedforward operation through the network. Again, the ANN was not trained on these recall data sets; the fact that the ANN can predict them is an illustration of the generalization capabilities of the ANN. The ANN does not memorize the data presented to it in the training set but actually models the underlying functional relation.
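A sketch of the recall test just described follows: generate a sequence with a different growth parameter, slide the trained network over it one step at a time, and compare its predictions with the true values using the RMS measure. Here predict stands for a single feedforward pass through the trained network and is a placeholder, not the paper's code.

def recall_rms(series, predict, first_t=15, n_patterns=100, window=5):
    sq_errors = []
    for n in range(n_patterns):
        t = first_t + n
        prediction = predict(series[t:t + window][::-1])   # x_{t+4}, ..., x_t as inputs
        sq_errors.append((prediction - series[t + window]) ** 2)  # compare with true x_{t+5}
    return (sum(sq_errors) / len(sq_errors)) ** 0.5

# e.g. recall_rms(logistic_series(0.01, 2.6, 130), predict) for the Fig. 7 case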

Fig. 7. A chaotic time sequence and the corresponding artificial neural network output and error. The growth parameter for the series, r, is equal to 2.6 and the initial condition x_0 is 0.01.

Fig. 8. A chaotic time sequence and the corresponding artificial neural network output and error. The growth parameter for the series, r, is equal to 2.8 and the initial condition x_0 is 0.01.

Table 1. Summary of computer simulation results

Series number   Figure number   Growth parameter (r)   Initial condition (x0)   RMS error
1               5               2.7                    0.0001                   0.01764*
2                               2.7                    0.001                    0.02408
3               6               2.7                    0.01                     0.02371
4                               2.7                    0.1                      0.02063
5               7               2.6                    0.01                     0.04128
6               8               2.8                    0.01                     0.07837
7                               2.75                   0.01                     0.04307
8                               2.65                   0.01                     0.02281

* Training set


5. CONCLUSIONS

An artificial neural network model with a dynamic node architecture learning paradigm is used to obtain an approximate solution to the chaotic time series prediction problem. Computer simulation results show that the method can obtain an approximate solution for any initial condition over a limited range of the growth parameter in a systematic way. The method therefore eliminates the need to make functional assumptions prior to modeling.

REFERENCES

1. J. A. Anderson and E. Rosenfeld (Editors), Neurocomputing: Foundations of Research. MIT Press, Cambridge, MA (1988).
2. R. P. Lippmann, An introduction to computing with neural nets, IEEE Acoustics, Speech and Signal Processing Magazine 4, 4-22 (1987).
3. D. E. Rumelhart, J. L. McClelland and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vols. 1 and 2. MIT Press, Cambridge, MA (1986).
4. P. J. Werbos, Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research, IEEE Trans. Systems, Man, and Cybernetics, January/February, 7-20 (1987).
5. J. M. Cushing, Integrodifferential Equations and Delay Models in Population Dynamics, Lecture Notes in Biomathematics, Vol. 20. Springer, New York (1977).
6. C. Grebogi, E. Ott and J. A. Yorke, Chaos, strange attractors, and fractal basin boundaries in nonlinear dynamics, Science 238, 632-638 (1987).
7. S. Kirkpatrick, C. D. Gelatt Jr. and M. P. Vecchi, Optimization by simulated annealing, Science 220, 671-680 (1983).
8. F. M. Scudo and J. R. Ziegler, The Golden Age of Theoretical Ecology: 1923-1940: A Collection of Works by V. Volterra, V. A. Kostitzin, A. J. Lotka and A. N. Kolmogoroff, Lecture Notes in Biomathematics, Vol. 22. Springer, New York (1978).
9. J. Vandermeer, Elementary Mathematical Ecology. Wiley, New York (1981).
10. H. O. Peitgen and P. H. Richter, The Beauty of Fractals: Images of Complex Dynamical Systems. Springer, New York (1986).
11. R. L. Devaney, An Introduction to Chaotic Dynamical Systems. Addison-Wesley, Redwood City, CA (1987).
12. N. Tishby and E. Levin, Consistent inference of probabilities in layered networks: predictions and generalization, IJCNN Int. Conf. on Neural Networks, Vol. 2, pp. 403-409 (1989).
13. S. Judd, Learning in networks is hard, IEEE First Int. Conf. on Neural Networks, Vol. 2, pp. 685-692 (1987).
14. G. A. Carpenter and S. Grossberg, The ART of adaptive pattern recognition by a self-organizing neural network, IEEE Computer, 77-88 (1988).
15. R. Hecht-Nielsen, Theory of the backpropagation neural network, IJCNN Int. Conf. on Neural Networks, Vol. 1, pp. 593-605 (1989).
16. E. B. Bartlett, Nuclear power plant status diagnostics using simulated condensation: an auto-adaptive computer learning technique, Ph.D. Dissertation, The University of Tennessee, Knoxville (1990).
17. M. H. Kalos and P. A. Whitlock, Monte Carlo Methods, Vol. I: Basics. Wiley, New York (1986).