Information Processing Letters 63 (1997) 137-141
ELSEVIER
Finite automata-models for the investigation of dynamical systems

Christian Schittenkopf a,b,*, Gustavo Deco a, Wilfried Brauer b

a Siemens AG, Corporate Technology, Dept. ZT IK 4, Otto-Hahn-Ring 6, 81730 Munich, Germany
b Technische Universität München, Arcisstraße 21, 80290 Munich, Germany
Received 7 February 1997; revised 15 May 1997
Communicated by T. Lengauer
Abstract

We describe a method to measure the complexity of a dynamical system. By complexity we mean the intrinsic information processing abilities which we believe to be visible only on an infinitesimal scale. The complexity measure is based on concepts from information theory and from the theory of formal languages. © 1997 Elsevier Science B.V.

Keywords: Formal languages; Finite automata; Dynamical systems; Complexity
* Corresponding author.
0020-0190/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0020-0190(97)00110-5

1. Introduction

The interest in dynamical systems, especially in nonlinear and chaotic dynamics, has been growing steadily for several years. One of the main reasons is that chaos combines deterministic and stochastic aspects, which brings two formerly distinct points of view together. Chaos has been studied from the perspective of different scientific fields including thermodynamics, information theory, turbulence theory, mathematical statistics, and theoretical computer science [3,6-8]. The usefulness of methods from this last field stems indirectly from a particular property of chaotic systems: the extreme sensitivity to initial conditions. Since any dynamics can only be measured with finite precision, and since the resulting measurement error increases exponentially for chaotic processes, the resolution of the measuring device or, equivalently, the partitioning of the phase space heavily influences the obtained results. This partitioning is a division of the attractor $A$ into a set of disjoint boxes $B_i$ (not necessarily of equal size), namely
$$\beta = \{B_i\}_{i=1}^{m}, \qquad \bigcup_{i=1}^{m} B_i = A \tag{1}$$

and

$$B_i \cap B_j = \emptyset, \qquad i \neq j.$$

$\beta$ is called a partition of $A$. By labeling each box $B_i$ with a symbol $i$, the original, real-valued process $\{x_t\}$ is reduced to a sequence $\{s_t\}$, $t = 1, 2, \ldots$, of $m$ different symbols ($s_t = i$ whenever $x_t \in B_i$), and thereby a symbolic dynamics is defined [4]. Consequently, concepts from the theory of formal languages can be applied, with the following peculiarity: changing the partition $\beta$ can be interpreted as changing the resolution of the measuring device, which leads to more or fewer symbols.
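The labeling step can be illustrated with a minimal Python sketch for a one-dimensional observable. It assumes the boxes are half-open intervals given by a sorted list of interior boundaries; the function name `symbolize`, the boundaries, and the sample values are hypothetical and not taken from the paper.

```python
from bisect import bisect_right

def symbolize(trajectory, edges):
    """Map each real value to the index of the box that contains it.

    `edges` are the sorted interior box boundaries (hypothetical input);
    m - 1 edges define m half-open boxes B_1, ..., B_m.
    """
    return [bisect_right(edges, x) + 1 for x in trajectory]

# Illustrative data, not taken from the paper.
trajectory = [0.10, 0.55, 0.93, 0.26, 0.74]
coarse = symbolize(trajectory, [0.5])              # m = 2 boxes
fine = symbolize(trajectory, [0.25, 0.5, 0.75])    # m = 4 boxes
print(coarse)
print(fine)
```

Refining the list of boundaries plays the role of increasing the resolution of the measuring device: the same trajectory yields a symbol sequence over a larger alphabet.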
In fact, several authors have constructed finite (and infinite) automata (FA) using a binary alphabet to define the complexity of a dynamical system [3,6-8]. However, for chaos it is well known that the intrinsic characteristics of the dynamics can only be determined if the partition is very fine [5,10]. In this letter we describe a procedure to construct deterministic finite automata (DFA) for a class of deterministic chaotic systems, not only for binary alphabets but also for alphabets corresponding to increasingly fine measuring resolutions. These DFA are then used to quantify the complexity of the underlying dynamics. By complexity we mean the intrinsic information processing abilities of a dynamical system, which we believe to be visible only on an extremely fine scale.
2. Single-humped maps

One of the most studied classes of chaotic processes is that of single-humped maps. They have a critical point $x_c$ and map the interval $[a; b]$ onto itself with $a = f(f(x_c))$ and $b = f(x_c)$. They are monotonically increasing for $a < x < x_c$, decreasing for $x_c < x < b$, and differentiable on $[a; b]$. The classical generating partition, which is important for calculating characteristic quantities of chaotic systems, is defined by dividing $[a; b]$ at $x_c$ into two parts, i.e.

$$x_t \in [a; x_c[ \;\rightarrow\; s_t = 0, \qquad x_t \in [x_c; b] \;\rightarrow\; s_t = 1, \tag{2}$$
where $s_t$ denotes the symbol corresponding to the observed value $x_t$. The Ruelle intervals or cylinder sets of order $k$ [4] are defined by refining this binary partition in the following way,

$$I_j^k = I_{s_1}^1 \cap f^{(-1)} I_{s_2}^1 \cap \ldots \cap f^{(-k+1)} I_{s_k}^1, \tag{3}$$

where

$$j = \sum_{i=1}^{k} s_i 2^{i-1}, \qquad I_0^1 = [a; x_c[, \qquad I_1^1 = [x_c; b], \tag{4}$$

and $f^{(-m)} I_s^1$ is the set of points mapped to $I_s^1$ by $f^{(m)} = f \circ f \circ \ldots \circ f$ ($m$ times). For any $k$ the number of intervals is bounded by $2^k$, i.e. $j = 0, 1, \ldots, j(k)$, $j(k) < 2^k$. If the intervals $I_j^k$ are now interpreted as symbols, one obtains different symbolic dynamics of the single-humped map corresponding to the degree of refinement (indicated by $k$). In other words, for each $k$ one possibly observes a different symbol sequence. Our aim is to study the symbol sequences of the partition consisting of the intervals $I_j^k$, $j = 0, \ldots, j(k)$, and finally, to formulate a measure of complexity which is independent of the resolution.
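A point lies in a given Ruelle interval of order $k$ exactly when its next $k$ binary symbols form the corresponding word, so the refined symbol sequence can be obtained by sliding a window of length $k$ over the binary sequence of Eq. (2). The following Python sketch illustrates this under stated assumptions: the logistic map $f(x) = r x (1 - x)$ with $x_c = 0.5$ serves as an illustrative single-humped map, the initial value, transient length and function names are hypothetical, and the integer label follows one reading of Eq. (4), which may differ from the numbering used in the paper's figures.

```python
def logistic(x, r):
    # Illustrative single-humped map with critical point x_c = 0.5.
    return r * x * (1.0 - x)

def binary_symbols(r, length, x0=0.3, transient=2000):
    """Binary symbols of Eq. (2): 0 if x_t < x_c, 1 otherwise."""
    x = x0
    for _ in range(transient):      # discard the transient phase
        x = logistic(x, r)
    bits = []
    for _ in range(length):
        bits.append(0 if x < 0.5 else 1)
        x = logistic(x, r)
    return bits

def ruelle_symbols(bits, k):
    """Label each window (s_1, ..., s_k) by j = sum_i s_i * 2**(i-1)."""
    return [
        sum(s << i for i, s in enumerate(bits[t:t + k]))
        for t in range(len(bits) - k + 1)
    ]

bits = binary_symbols(r=3.54, length=40)
symbols = ruelle_symbols(bits, k=3)
print(symbols[:8])
```

For a periodic regime such as $r = 3.54$ the refined sequence repeats with the period of the orbit; only the set of distinct labels, not their particular numbering, matters for the automata built later.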
3. Construction and experiments

Let $S_n^k$ denote the set of symbolic subsequences of length not larger than $n$ which are produced by a single-humped map where the partitioning is given by the Ruelle intervals of $k$th order. We define the language generated by the symbolic dynamics by $S^k = \lim_{n \to \infty} S_n^k$. If the analysis is restricted to a binary alphabet ($k = 1$), there is a very efficient procedure for constructing a DFA whose accepted language approximates $S^k$ (for a similar approach cf. [7]): From the kneading sequence [2] of the map one obtains analytically the (possibly infinite) set $\bar{S}^k$ of forbidden words (FWs). A FW is a sequence which is never generated and which does not contain any shorter FW ($\bar{S}_n^k$ denotes the restriction of $\bar{S}^k$ to sequences of length not larger than $n$). Since there is at most one FW for each length $n$, the size of the sets $\bar{S}_n^k$ grows rather slowly. Consequently, the DFA accepting $\hat{S}_n^k = (\{0,1\}^* \bar{S}_n^k \{0,1\}^*)^*$ (where $*$ denotes the transitive closure), which is the set of all sequences containing at least one FW, is quite small, and also the complementary DFA accepting $\{0,1\}^* \setminus \hat{S}_n^k$ can be constructed easily. For increasing $n$, this procedure results in better and better approximations of $S^k$ by regular languages.

However, this approach is infeasible for fine partitions. First, there is no obvious generalization of the concept of kneading sequences to large alphabets. Second, for deterministic chaos the number of FWs grows exponentially fast: for the map given by Eq. (7) with $r = 4$ and $k = 4$ the number of different words of length 10 is only 8192 out of some $10^{12}$ possible sequences (the alphabet has $2^4 = 16$ symbols, so there are $16^{10} \approx 1.1 \times 10^{12}$ candidate words). Consequently, one must choose a positive approach, meaning that the DFA is constructed from the sequences which do occur. Of course one has to restrict observations to words of finite length $n$, but if the recorded symbol sequence is long enough, one can assume that the set $S_n^k$ is known exactly. Even if
there are sequences which are generated with some very small probability and which therefore have not been observed yet, the error which results from using only a subset of the true set $S_n^k$ is not disastrous, since our measure of complexity (described below) also takes probabilistic aspects into account.

The only question concerning the DFA which remains to be answered is the following: how is $S^k$ approximated knowing $S_n^k$ for some finite $n$? Besides the fact that the ansatz $S^k \approx (S_n^k)^*$ may be too restrictive, it mainly suffers from its computational cost: constructing a DFA which accepts the transitive closure of the accepted language of another DFA is in principle the same task as making a non-deterministic FA deterministic, which is of exponential complexity in the number $N_n^k$ of nodes of the FA [9]. We thus chose the least restrictive alternative $S^k \approx (S_n^k)(A^k)^*$, where $A^k$ denotes the alphabet corresponding to the Ruelle partition of order $k$. The advantage of this approach is that a DFA is obtained immediately. The minimization of this DFA only requires $O(N_n^k \log N_n^k)$ operations [9], where $N_n^k$ is bounded by $N_n^k \le 2 + 2^k(2^n - 1)$ (worst case).

To measure the complexity of the resulting minimal DFA $\mathcal{F}_n^k$ we used the set complexity defined in [6], which is the Shannon entropy of the probability distribution induced on the nodes of $\mathcal{F}_n^k$. More precisely, sequences of length $n$ (not $\le n$) are generated from the corresponding single-humped map and presented to $\mathcal{F}_n^k$ according to the frequency of their occurrence. In general, all of these words will be accepted by the DFA due to its construction. However, if any rare sequence is not accepted since it was not available in the estimated set $S_n^k$ during the construction of the DFA, the DFA falls into an error state and remains there until the next sequence is generated (in fact, we never encountered this situation during our numerical experiments).
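The word count and the node bound quoted above can be checked with a small Python sketch. It rests on assumptions that are not spelled out in the text: every binary word is taken to occur (the $r = 4$ case), only words of exactly length $n$ are inserted, the unminimized automaton is modelled as a prefix tree, and a single extra accepting node stands for the $(A^k)^*$ continuation. All function names are hypothetical.

```python
from itertools import product

def k_block_words(k, n):
    """All words of length n over k-block symbols, assuming every binary
    word occurs (the r = 4 case described in the text)."""
    words = set()
    for bits in product((0, 1), repeat=n + k - 1):
        words.add(tuple(bits[t:t + k] for t in range(n)))
    return words

def prefix_tree(words):
    """Prefix tree of the words; each node is a dict of outgoing edges."""
    root = {}
    for word in words:
        node = root
        for symbol in word:
            node = node.setdefault(symbol, {})
    return root

def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.values())

k, n = 4, 10
words = k_block_words(k, n)
tree = prefix_tree(words)
# One extra accepting node absorbs every continuation, i.e. the (A^k)* part.
num_nodes = count_nodes(tree) + 1
print(len(words), (2 ** k) ** n, num_nodes, 2 + 2 ** k * (2 ** n - 1))
```

Under these assumptions the number of occurring words is $2^{n+k-1}$, because a word of $n$ overlapping $k$-blocks is fixed by $n + k - 1$ binary symbols, and the prefix tree with its extra accepting node attains the worst-case node bound.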
Consequently, a probability distribution is induced on the nodes of $\mathcal{F}_n^k$ and the set complexity is defined by

$$C_n^k = -\sum_{i=1}^{N_n^k} p_i \log_2 p_i, \tag{5}$$

where $p_i$ denotes the probability of visiting node $i$. Our complexity measure is given by the asymptotic growth rate of the set complexity for infinitesimal measuring resolutions and for sequences of arbitrary length, i.e.

$$C = \lim_{k \to \infty} \lim_{n \to \infty} \left( C_n^{k+1} - C_n^k \right). \tag{6}$$
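Since the set complexity is a Shannon entropy over node-visit probabilities, it can be computed directly from visit counts. The Python sketch below is only an illustration: the function name `set_complexity` and the toy automaton (a hypothetical one with $p$ disjoint, equally visited branches of $n - 1$ nodes each that merge into a single final node) are assumptions made for the example.

```python
from math import log2

def set_complexity(visit_counts):
    """Shannon entropy (in bits) of the node-visit distribution."""
    total = sum(visit_counts)
    return -sum(c / total * log2(c / total) for c in visit_counts if c > 0)

# Hypothetical automaton: p disjoint branches of n - 1 nodes each,
# all visited equally often, merging into one final node.
p, n = 4, 10
counts = [1] * (p * (n - 1)) + [p]
print(set_complexity(counts))
```

For this toy distribution each branch node has probability $1/(pn)$ and the final node has probability $1/n$, so the entropy evaluates to $\frac{n-1}{n} \log_2 p + \log_2 n$.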
This definition is partition-independent because the size of the partition elements shrinks to zero as $k \to \infty$. It is well known [1] that this property is crucial to the determination of intrinsic characteristics, and therefore $C$ quantifies the complexity of the underlying, real-valued map. This measure can be generalized to define the complexity of any dynamical system: one just has to guarantee that the size $\varepsilon$ of the partition elements becomes infinitesimal ($\lim_{\varepsilon \to 0}$ instead of $\lim_{k \to \infty}$ in Eq. (6)). Probably the most famous single-humped map is the logistic map

$$x_{t+1} = f(x_t) = r x_t (1 - x_t), \qquad x_c = 0.5, \tag{7}$$
where the parameter $r \in [0; 4]$ allows different dynamics ranging from periodic to chaotic behavior. For special values of $r$ the complexity $C$ can be calculated analytically. If $r < r_\infty = 3.5699456\ldots$ (the period doubling accumulation point), the symbol sequence generated by the logistic map is periodic (possibly after a transient phase). For $r = 3.54$, for instance, the period-four sequence $\ldots \to 2 \to 5 \to 4 \to 6 \to 2 \to \ldots$ is obtained. The corresponding DFA is depicted in Fig. 1(a). In general, the obtained DFA is very simple if the partition is fine enough to resolve the periodicity $p$ of the observed sequence. There are $p$ "equiprobable branches" which do not interfere except in the "final" node (see Fig. 1(a)), and consequently, one obtains $C_n^k = \frac{n-1}{n} \log_2 p + \log_2 n$ (independent of $k$ for large $k$) and $C = 0$. The complexity of a periodic sequence is therefore equal to zero.

For $r = 4$ (fully developed chaos) the observed symbol sequence is also very structured. In particular, any of the $2^k$ symbols can only be followed by one of two possible symbols. To illustrate this important property the corresponding DFA is shown in Fig. 1(b). Since all generated symbol sequences are equiprobable for $r = 4$, some short calculations give $C_n^k = \frac{n-1}{n}(k-1) + \log_2 n$ and $C = 1$. We emphasize that for the binary alphabet, which was used in previous definitions of complexity [3,6-8], all possible ($2^n$) words of length $n$ are generated. This means that the corresponding DFA has only one node ($S^1 = \{0,1\}^*$), resulting in a vanishing set complexity. Therefore the fully developed logistic map cannot be distinguished from a random process (coin toss) if the classical bipartition is used (see (2)). Only by using fine resolutions (large alphabets) can the true complexity ($C = 1$) be determined, since for any random system which generates all possible sequences, $C = 0$ holds.

Fig. 1. DFA accepting periodic and chaotic symbol sequences generated by logistic maps (see text for detailed explanations).

These analytical results are possible because the obtained Ruelle intervals form a so-called Markov partition, which is characterized in the one-dimensional case by the fact that the edges of the intervals are mapped again onto edges. For a given Markov partition (fixed $k$)

$$C_n^k = \frac{n-1}{n} D(k) + \log_2 n \tag{8}$$
holds, where $D(k)$ depends only on $k$ (for arbitrary length $n$). Interestingly, $D(k)$ equals the set complexity of the DFA accepting $S^k$ in the cases of periodic sequences and of fully developed chaos. These DFA would be the result of the (most general but in general infeasible) approach using FWs (described at the beginning of this section). For $k = 3$ the corresponding DFA are depicted in Figs. 1(c) and 1(d). In fact, our definition of complexity $C$ is just the asymptotic growth rate of $D(k)$, meaning that at least for special parameter values $r$ the measure $C$ is an appropriate generalization of the set complexity to DFA constructed from non-binary symbol sequences. In general (any $r$), the partition given by the Ruelle intervals of order $k$ is not Markovian. However, in our numerical experiments the increase of $C_n^k$ as a function of $n$ and $k$ was always close to the scaling law given by Eq. (8). We thus assume that our computational results are a good approximation of the complexity measure $C$.

We performed calculations for all logistic maps with $r \in [3.4; 4]$ using the step size $\Delta r = 0.001$. The approximated complexity measures are depicted in Fig. 2, where we used $C \approx \frac{10}{9}(C_{10}^{4} - C_{10}^{3})$ with respect to Eqs. (6) and (8) ($n = 10$, $k = 3$). For periodic sequences $C$ is zero except for parameter values slightly below $r_\infty$, where the estimation of $C$ is unreliable due to large periodicities. The big periodic window starting with period three at $r \approx 3.83$ is also clearly visible. The maximal value $C = 1$ is reached for $r = 4$ (fully developed chaos).

Fig. 2. Approximated complexity C of logistic maps ranging from periodic processes to the fully developed chaotic case.
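The finite-size estimate based on Eqs. (6) and (8) can be reproduced exactly for the two analytically tractable cases with a short Python sketch. It is not the authors' implementation and rests on several assumptions: words are supplied as equally likely binary words instead of being sampled from a trajectory, only words of exactly length $n$ are used, prefixes are merged when they admit the same set of continuations (for exact-length words this identifies the same nodes as DFA minimization), and visits are counted for the nodes reached after each of the $n$ symbols, with the last one being the shared final node. The names `block_word`, `set_complexity` and `estimate_C` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

def block_word(bits, k, n):
    """Recode n + k - 1 binary symbols as a word of n overlapping k-blocks."""
    return tuple(bits[t:t + k] for t in range(n))

def set_complexity(weighted_words, n):
    """Entropy of node visits after merging prefixes with equal futures."""
    futures = {}
    for word in weighted_words:
        for t in range(1, n):
            futures.setdefault(word[:t], set()).add(word[t:])
    visits = Counter()
    for word, weight in weighted_words.items():
        for t in range(1, n):
            # A node is identified by its depth and its admissible futures.
            visits[(t, frozenset(futures[word[:t]]))] += weight
        visits["final"] += weight
    total = sum(visits.values())
    return -sum(v / total * log2(v / total) for v in visits.values())

def estimate_C(binary_words, k, n):
    """C ~ n/(n-1) * (C_n^{k+1} - C_n^k), following Eqs. (6) and (8)."""
    def c(order):
        words = Counter(
            block_word(w, order, n) for w in binary_words(n + order - 1)
        )
        return set_complexity(words, n)
    return n / (n - 1) * (c(k + 1) - c(k))

def chaotic(length):
    # r = 4: every binary word occurs with equal probability.
    return product((0, 1), repeat=length)

def periodic(length, cycle=(1, 0, 1, 1)):
    # One word per phase of an assumed period-four binary cycle.
    p = len(cycle)
    return [tuple(cycle[(s + i) % p] for i in range(length)) for s in range(p)]

print(estimate_C(chaotic, k=3, n=4), estimate_C(periodic, k=3, n=4))
```

With small illustrative values ($n = 4$, $k = 3$) the estimate equals 1 for the equiprobable case and 0 for the periodic case up to rounding, in line with the analytical values stated above; replacing the two word generators by word frequencies measured on a simulated trajectory would give a numerical estimate for other values of $r$.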
4. Conclusion

A partition-independent measure of complexity was defined for dynamical systems. This measure should be regarded as another piece of the bridge which is built from theoretical computer science to the theory of dynamical systems. These two fields are naturally related via the concept of symbolic dynamics.

References

[1] C. Beck and F. Schlögl, Thermodynamics of Chaotic Systems (Cambridge University Press, Cambridge, 1993).
[2] P. Collet and J.-P. Eckmann, Iterated Maps on the Interval as Dynamical Systems (Birkhäuser, Boston, MA, 1980).
[3] J.P. Crutchfield and K. Young, Inferring statistical complexity, Phys. Rev. Lett. 63 (1989) 105-108.
[4] A. Csordás, G. Györgyi, P. Szépfalusy and T. Tél, Statistical properties of chaos demonstrated in a class of one-dimensional maps, Chaos 3 (1993) 31-49.
[5] G. Deco, C. Schittenkopf and B. Schürmann, Internat. J. Bifur. Chaos, in press.
[6] P. Grassberger, Toward a quantitative theory of self-generated complexity, Internat. J. Theoret. Phys. 25 (1986) 907-938.
[7] P. Grassberger, On symbolic dynamics of one-humped maps of the interval, Z. Naturforsch. 43a (1988) 671-680.
[8] R. Günther, B. Schapiro and P. Wagner, Complex systems, complexity measures, grammars and model-inferring, Chaos, Solitons & Fractals 4 (1994) 635-651.
[9] J.E. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages and Computation (Addison-Wesley, Reading, MA, 1979).
[10] C. Schittenkopf and G. Deco, Exploring the intrinsic information loss in single-humped maps by refining multi-symbol partitions, Phys. D 94 (1996) 57-64.