Chaos, Solitons and Fractals 13 (2002) 461–469
www.elsevier.com/locate/chaos
Information and dynamical systems: a concrete measurement on sporadic dynamics

Fiorella Argenti a, Vieri Benci b,c, Paola Cerrai a,c, Alessandro Cordelli c, Stefano Galatolo a,c,*, Giulia Menconi c

a Dipartimento di Matematica, Università di Pisa, Via Buonarroti 2/a, 56127 Pisa, Italy
b Dipartimento di Matematica Applicata, Università di Pisa, Via Bonanno Pisano 25/b, 56126 Pisa, Italy
c Centro Interdisciplinare per lo Studio dei Sistemi Complessi (CISSC), Università di Pisa, Via Bonanno Pisano 25/b, 56126 Pisa, Italy
Abstract

We present a method for the study of dynamical systems based on the notion of quantity of information. By measuring the quantity of information of a string with data compression algorithms, it is possible to define a notion of orbit complexity for dynamical systems. In compact ergodic dynamical systems, entropy is almost everywhere equal to orbit complexity. We have introduced a new compression algorithm, called CASToRe, which allows a direct estimation of the information content of the orbits in the 0-entropy case. The method is applied to a sporadic dynamical system (the Manneville map). © 2001 Elsevier Science Ltd. All rights reserved.
1. Introduction

In this article we present a method for the study of dynamical systems based on the notion of the quantity of information contained in a string. From a mathematical point of view, the most powerful way to define this quantity is the Kolmogorov complexity, or algorithmic information content (AIC) [3,6]. The AIC of a string is the length of the shortest program (written in a fixed universal programming language) that outputs the string.

AIC was first used in the context of dynamical systems by Brudno [2]. He defined a notion of orbit complexity which measures the quantity of information necessary to describe the orbit. The orbit is translated into a set of strings by the methods of symbolic dynamics, and its complexity is defined as the AIC of these strings. One of the main features of Brudno's orbit complexity is that, in compact ergodic dynamical systems, its value equals the Kolmogorov entropy of the system for almost every initial condition.

The complexity of an orbit is a real number, and it is defined independently of the choice of an invariant measure. Therefore it gives information on the dynamics even if the dynamical system does not have any nontrivial invariant measure, or if all the invariant measures give a null value for the Kolmogorov entropy (0-entropy dynamics). In this case, all the traditional indicators of the complexity of the dynamics (such as Kolmogorov entropy) are not useful. Unfortunately, it is not possible to make concrete use of the AIC, since it is not a computable function; in particular, it cannot be computed by any algorithm [3].
* Corresponding author.
E-mail addresses: [email protected] (F. Argenti), [email protected] (V. Benci), [email protected] (P. Cerrai), alessandro.[email protected] (A. Cordelli), galatolo@runi.dm.unipi.it, [email protected] (S. Galatolo), [email protected] (G. Menconi).
In our approach, the AIC of a string is replaced by another measure of information. For example, we can take as an approximate measure of the quantity of information contained in a string the length of the string after it has been compressed by some compression algorithm (like those existing on almost every personal computer). The formalization of this concept leads to the study of universal coding algorithms. Lempel and Ziv [7] proved that the universal coding scheme they proposed compresses strings coming from an information source up to the best possible compression ratio: the entropy of the source. By these results, it is possible to give a definition of orbit complexity using universal coding schemes. The new definition of orbit complexity is related to the Kolmogorov entropy of the system, as is the Brudno complexity of an orbit, but it is based on a computable measure of information. Our method can be used to give information on 0-entropy dynamics.

In Section 2 of this paper, we present some general results about the relation between the entropy of dynamical systems and universal coding schemes. The Lempel–Ziv '78 (LZ78) coding scheme satisfies the hypotheses for the results stated in this section. However, it is not a good tool for studying 0-entropy dynamics. The reason can be understood by observing the behaviour of the LZ78 algorithm while compressing a constant string of length $n$. Let us consider a binary constant string of length $n$:

11111111111111111111111111111...

It is reasonable to expect that the information contained in this string is about $\mathrm{const} \cdot \log(n)$; indeed, the string can be reconstructed by specifying the digit '1' and the number of times the digit has to be repeated (with a cost of about $\log(n)$ bits). This is the AIC of the string. On the other hand, the LZ78 compression algorithm compresses the string to a string of length about $\mathrm{const} \cdot n^{1/2}$ bits (see Section 3). For this reason we cannot expect the LZ78 algorithm to distinguish a process where the information grows like $n^\alpha$ with $\alpha \le 1/2$ from a constant or a periodic one.

This is the main motivation which led us to introduce a new coding scheme called CASToRe (Compression Algorithm, Sensitive To Regularity). This new coding scheme is described in Section 3. It is able to compress constant or periodic strings of length $n$ to strings about $\log(n)$ digits long, and we think it is a very sensitive device to measure the information coming from sporadic dynamics or processes with multifractal behaviour (this is supported by heuristic motivations and by the following experimental results).

We have implemented our compression algorithm and used it to study a classic case of sporadic dynamics: the Manneville map [8]. The results (shown in Section 4) agree with the theoretical predictions. When the exponent of the Manneville map is lower than 2, we have positive entropy and the information increases linearly with time. When the exponent is greater than 2, the information increases as a power law with an exponent similar to the theoretically predicted one.

2. Theoretical results
Let $\Sigma = \{0,1\}^*$ be the space of finite (possibly empty) binary sequences, and let $A^*$ be the space of finite sequences over the finite alphabet $A$. An ideal coding scheme (ICS) is a one-parameter family $E = (E_\ell)_{\ell \in \mathbb{N}}$, $E_\ell : A^* \to \Sigma$, of recursive functions¹ encoding strings over $A$ into binary strings, satisfying the following properties ICSp 1 and ICSp 2. The parameter $\ell$ is a sort of accuracy parameter. Each ICS gives a definition of orbit complexity for which the following Theorems 1 and 2 hold.

Before stating the properties, let us fix the notation that will be used later. We define $K^E(s^n, \ell)$, the $E_\ell$-information content of a finite string $s^n$, as the length of the string after it has been coded by $E_\ell$; that is,

$$K^E(s^n, \ell) = |E_\ell(s^n)|.$$

¹ Functions that can be calculated by an algorithm.
Let $s \in A^{\mathbb{N}}$ be an infinite string, and consider $K^E(s^n, \ell)$, where $s^n$ is the string containing only the first $n$ symbols of $s$. Let

$$\rho^E(s, \ell) = \limsup_{n \to \infty} \frac{K^E(s^n, \ell)}{n}$$

be the compression ratio of $s$ with respect to $E_\ell$, and let

$$\rho^E(s) = \limsup_{\ell \to \infty} \rho^E(s, \ell)$$

be the asymptotic ICS compression ratio. Now we are ready to state:

ICSp 1. If $s$ is drawn from an ergodic stationary source with entropy $H$, then

$$\Pr\{\rho^E(s) = H\} = 1.$$

ICSp 2. If $A$, $B$ are two alphabets and $f : B \to A$ is surjective, $(s_i)_{i \in \mathbb{N}}$ is an infinite string in $B^{\mathbb{N}}$ and $(s'_i)_{i \in \mathbb{N}}$, with $s'_i = f(s_i)$, is the string in $A^{\mathbb{N}}$ obtained by identifying each element of $B$ with some element of $A$, then

$$\rho^E(s') \le \rho^E(s).$$

The set of ICS is not empty: the Lempel–Ziv '78 algorithm satisfies ICSp 1 and ICSp 2 [4].
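For concreteness (this illustration is ours, not part of the original paper), the finite-$n$ quantity $K^E(s^n, \ell)/n$ is directly computable once a coder is fixed. A minimal Python sketch, assuming a generic `coder` callable that maps a string to its binary encoding (the accuracy parameter $\ell$ is taken to be fixed inside `coder`):

```python
from typing import Callable, Iterable

def information_content(s: str, coder: Callable[[str], str]) -> int:
    """K^E(s^n, l): the length of the string after it has been coded."""
    return len(coder(s))

def compression_ratio(s: str, coder: Callable[[str], str]) -> float:
    """Finite-n estimate of rho^E(s, l) = limsup_n K^E(s^n, l) / n."""
    return information_content(s, coder) / len(s)

def ratios_along_prefixes(s: str, coder: Callable[[str], str],
                          lengths: Iterable[int]) -> list[float]:
    """K^E(s^n, l)/n on growing prefixes; by ICSp 1, for a string drawn
    from an ergodic stationary source this tends to the entropy H."""
    return [compression_ratio(s[:n], coder) for n in lengths]
```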
Now we see how to apply these results to dynamical systems. Let $(X, T)$ be a topological dynamical system, where $X$ is a metric space and $T : X \to X$ is a continuous onto mapping. Let us consider a finite cover $U = \{B_0, \ldots, B_{N-1}\}$ of $X$: the sets $B_i$ are Borel sets whose union is $X$, and they may have nonempty intersections. We will use this cover to code the orbits of $(X, T)$ into a set of infinite strings. If $x \in X$, we set

$$\Sigma_U(x) = \{\omega \in \{0, \ldots, N-1\}^{\mathbb{N}} : \forall n \in \mathbb{N},\ T^n(x) \in B_{\omega(n)}\};$$

this is the set of all possible symbolic orbits of $x$ under the map $T$ with respect to the cover $U$. Let us consider an ICS $(E_\ell)_{\ell \in \mathbb{N}}$. The ICS-complexity of the orbit of $x \in X$ with respect to the cover $U$ is defined as

$$K^E(x, T|U) = \inf_{\omega \in \Sigma_U(x)} \rho^E(\omega).$$

Since $(E_\ell)_{\ell \in \mathbb{N}}$ satisfies ICSp 2, if we have two covers $U \le W$ then $K^E(x, T|U) \le K^E(x, T|W)$ [4]. We can therefore state the following definition of the ICS-complexity of an orbit, not depending on the chosen cover:

$$K^E(x, T) = \sup_U K^E(x, T|U),$$
where the supremum is taken over all the finite open covers $U$ of $X$. This definition associates to a point of $X$ a real number which is a measure of the complexity of its orbit. For example, if a point is periodic, or its orbit converges to a fixed point, then its orbit complexity is 0. ICS orbit complexity is defined independently of the choice of an invariant measure or of the knowledge of other global features of the system considered. We have the following general results [4].

Theorem 1. If $(X, \mu, T)$ is an ergodic dynamical system and $\alpha$ is a measurable partition of $X$, then

$$K^E(x, T|\alpha) = h_\mu(T|\alpha) \quad \mu\text{-almost everywhere},$$

where $h_\mu(T|\alpha)$ is the metric entropy of $(X, \mu, T)$ with respect to the measurable partition $\alpha$.
We remark that the previous result holds even if the coding scheme does not satisfy ICSp 2. The following is the analogue of Brudno's main theorem [2, p. 139].

Theorem 2. If $(X, \mu, T)$ is ergodic and $X$ is compact, then

$$K^E(x, T) = h_\mu(T) \quad \mu\text{-almost everywhere},$$

where $h_\mu(T)$ is the metric entropy of $(X, \mu, T)$.
3. Algorithms

3.1. The Lempel–Ziv '78 algorithm

We now present the Lempel–Ziv coding scheme; we describe how it runs, and in the next section we explain why we decided to modify this algorithm to obtain a coding scheme useful for the study of 0-entropy dynamics.

The Lempel–Ziv '78 code LZ78 [7] encodes a string that is segmented into blocks of length $\ell$, to be encoded sequentially. When the parsing parameter $\ell$ is assigned, we call the algorithm $\mathrm{LZ}_\ell$. Upon completion of an $\ell$-block, the machine resets, thus forgetting all past history before starting to encode the next input $\ell$-block. According to a so-called incremental parsing procedure, the code divides the $\ell$-block into words that are all different from each other. The dictionary is built in the following inductive way. Initially, the dictionary is empty and the $\ell$-block is unparsed. At every step, starting from the first character of the still unparsed part of the block, the code recognises the shortest substring not yet belonging to the dictionary; the unparsed part of the block is then divided by adding this substring to the dictionary. This new word, without its last character, is a word that already belongs to the dictionary. The new word is therefore encoded by the pair (number, symbol), where the number is the pointer to the dictionary word just described and the symbol is the last character, which acts as a suffix to the 'old' word.

Let us consider an example. Let $s$ be the string

aaaaaaabbabaaaabaaaabba...

Let $\ell = 12$; the algorithm encodes the first block aaaaaaabbaba as follows:

1: (0, a)  →  a
2: (1, a)  →  aa
3: (2, a)  →  aaa
4: (1, b)  →  ab
5: (0, b)  →  b
6: (4, a)  →  aba
Step 1: the algorithm emits (0, a): the first occurrence of the symbol 'a' becomes the first word in the dictionary, as shown in the last column (0 is the pointer to the empty word). Step 2: the algorithm processes the second symbol, which is 'a' again (already belonging to the dictionary), so it goes on to the next symbol, which together with the previous character forms the substring 'aa'; this is new and becomes the next parsed word. The algorithm emits (1, a), where 1 is the pointer to the first word in the dictionary and 'a' is the suffix, and so on. At the $n$th step, the algorithm emits $(A_n, b_n)$, where $A_n$ is the pointer to the appropriate old word and the symbol $b_n$ is the suffix forming the new encoded word.

In [4] it is proved that $\mathrm{LZ} = (\mathrm{LZ}_\ell)_{\ell \in \mathbb{N}}$ satisfies ICSp 1 and 2, so it is an ICS and Theorem 2 holds for LZ as well.
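As a concrete illustration (ours, not taken from the paper), the incremental parsing step can be sketched in a few lines of Python; the function below parses a single block, so it plays the role of $\mathrm{LZ}_\ell$ with $\ell$ equal to the block length, and the name `lz78_parse` is our own.

```python
def lz78_parse(block: str) -> list[tuple[int, str]]:
    """Incremental (LZ78) parsing of one block.

    Returns the sequence of (pointer, suffix) pairs: `pointer` indexes a
    previously parsed word (0 is the empty word) and `suffix` is the last
    character of the new word.
    """
    dictionary = {"": 0}          # word -> its index in the dictionary
    pairs: list[tuple[int, str]] = []
    word = ""
    for ch in block:
        if word + ch in dictionary:
            word += ch            # extend until the candidate word is new
        else:
            pairs.append((dictionary[word], ch))
            dictionary[word + ch] = len(dictionary)
            word = ""
    if word:                      # trailing match at the end of the block:
        pairs.append((dictionary[word[:-1]], word[-1]))  # re-emit, no new word
    return pairs
```

Running `lz78_parse("aaaaaaabbaba")` reproduces the parsing above: [(0, 'a'), (1, 'a'), (2, 'a'), (1, 'b'), (0, 'b'), (4, 'a')]. On a constant block the parsed words have lengths 1, 2, 3, ..., so about $\sqrt{2n}$ pairs are produced for a block of length $n$; this is the $O(n^{1/2})$ behaviour analysed in the next subsection.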
3.2. CASToRe: a new algorithm

Exponential separation of the trajectories of nearby initial conditions (sensitive dependence on initial conditions) is a very general feature shared by a large variety of dissipative and conservative dynamical systems. It is known that the Kolmogorov algorithmic complexity $K$ provides a connection between dynamical instability and the randomness of trajectories. The Brudno–White theorem asserts that random orbits prevail in the so-called strongly chaotic systems, where the Kolmogorov entropy is positive.

We now briefly analyse how well the LZ78 algorithm measures the information content of strings with different randomness properties. In the following, we consider each string as a single block, so the block length $\ell$ is the length of the string.

Let us consider a dynamical system $(X, \mu, T)$ and look at strings as symbolic representations of orbits in $X$. A periodic string can be constructed by specifying only the length $n$ of the string and the pattern of one period, so that

$$K(s^n) \sim \log_2(n),$$

where $K(s^n)$ is the AIC of the string $s^n$. We know from the Brudno–White theorem [9] that random trajectories prevail in systems with positive entropy; in fact, if $\mu$ is ergodic, for $\mu$-almost all orbits the limit $\lim_{n \to \infty} K(s^n)/n$ exists and equals the metric entropy $h_\mu(T)$. Then, for $\mu$-almost all initial conditions, we have

$$K(s^n) \sim n.$$

The question arises whether intermediate dynamical behaviours could exist between the periodic and the chaotic ones, in the sense that the complexity increases asymptotically as a power law:

$$K(s^n) \sim n^\alpha, \quad 0 < \alpha < 1.$$
Gaspard and Wang [5] called such a dynamical system sporadic.

First of all, let us analyse how LZ78 faces these different behaviours. Let us fix a code $\mathrm{LZ}_n$ and consider a constant string

aaaaaaaaaaaaaaaaaaaaaaaaa...

After $n$ steps the encoded string looks as follows:

1: (0, a)
2: (1, a)
3: (2, a)
4: (3, a)
...
n: (n − 1, a)
Again, the first number in the brackets is the pointer to the position (the number outside the brackets) in the dictionary of the word to which the suffix (the second symbol) has to be added to obtain the new encoded word. So, if the codified string has roughly $n$ symbols, the uncodified string has $O\left(\sum_{i=0}^{n} i\right) = O(n^2/2)$ symbols; equivalently, if a string $s$ has length $n$, its codification $\mathrm{LZ}_\ell(s)$ has length $O(\sqrt{n})$. Then we cannot expect this algorithm to distinguish between the two cases: the periodic (logarithmic) one and the power-law one with exponent $\alpha \le 1/2$.

In order to distinguish these two different behaviours, while keeping the lossless structure of the Lempel–Ziv algorithm, we have modified LZ78 and created a new algorithm that we think can recognise the difference between the periodic case and the sporadic one. Let us briefly describe our code. Again, the idea is to add a new suffix to an old word at each step, but now the suffix is itself a pointer to a word already belonging to the dictionary: the new encoded word is the longest word that can be constructed by using two words already present in the dictionary.
Consider the following example:

aaabbabaabb...

The encoded string will be:

1: (0, a)  →  a
2: (1, 1)  →  aa
3: (0, b)  →  b
4: (3, 1)  →  ba
5: (4, 1)  →  baa
6: (3, 3)  →  bb
where the first symbol in round brackets is the pointer to the position in the dictionary of the word forming the first part of the new composed word, while the second pointer refers to the second part of the new word. In the last column there is the corresponding segment of the string.

Working as before, let us consider a constant string:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...

Its codification will be:

1: (0, a)  →  a
2: (1, 1)  →  aa
3: (2, 2)  →  aaaa
4: (3, 3)  →  aaaaaaaa
5: (4, 4)  →  aaaaaaaaaaaaaaaa
...
n: (n − 1, n − 1)  →  a...a ($2^{n-1}$ symbols)
When the encoded string has length roughly $n$, the uncodified one has length $O\left(\sum_{i=0}^{n-1} 2^i\right) = O(2^n)$. Then, if we codify a string of length $n$, we obtain a coded string of length $O(\log_2(n))$, as wanted. So we think that, in the case of regular strings, this new algorithm is a sensitive measure of the information content of the string. That is why we called it CASToRe.
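To make the parsing rule concrete, here is a minimal Python sketch (ours; the paper does not reproduce the actual CASToRe implementation, and the greedy longest-prefix choice below is our reading of the rule). The first part of each new word is the longest dictionary word matching the input, the second part is the longest dictionary word matching what follows, and a raw symbol is emitted, as in the pair (0, a), when no dictionary word fits.

```python
def castore_parse(s: str) -> list[tuple[int, object]]:
    """Greedy sketch of CASToRe parsing: each new word is the concatenation
    of two words already in the dictionary (indices are 1-based; 0 stands
    for the empty word / a raw symbol)."""
    words: dict[str, int] = {}
    pairs: list[tuple[int, object]] = []
    pos = 0
    while pos < len(s):
        first = longest_prefix(s, pos, words)
        if first is None:
            new_word = s[pos]              # unseen symbol: pair (0, symbol)
            pairs.append((0, new_word))
        else:
            second = longest_prefix(s, pos + len(first), words)
            if second is None:             # tail of the input: reuse a word
                new_word = first
                pairs.append((words[first], 0))
            else:
                new_word = first + second
                pairs.append((words[first], words[second]))
        words.setdefault(new_word, len(words) + 1)
        pos += len(new_word)
    return pairs

def longest_prefix(s: str, pos: int, words: dict[str, int]):
    """Longest dictionary word starting at s[pos], or None if none fits."""
    best = None
    for w in words:
        if s.startswith(w, pos) and (best is None or len(w) > len(best)):
            best = w
    return best
```

With these definitions, `castore_parse("aaabbabaabb")` reproduces the example above, and `castore_parse("a" * n)` yields about $\log_2(n)$ pairs, against the roughly $\sqrt{2n}$ pairs of LZ78 on the same input.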
4. Results

In this section we apply our algorithm to study the complexity of the orbits of a well-known sporadic dynamical system: the Manneville map. The study of this map, and of sporadic dynamics in general, is of interest also for nonextensive thermodynamical problems [1]. We will see that the information content of the orbits of the Manneville map, as measured by our algorithm, accords with the theoretical predictions of Gaspard and Wang [5].

We have applied CASToRe to a deterministic sporadic map, first studied by Manneville:

$$f(x) = x + x^z \pmod 1, \quad z \ge 1.$$

As can be seen in Fig. 1, $f$ maps the segment $[0,1]$ twice onto itself. The origin $x = 0$ is always a fixed point. Let us define $\tilde x$ by $f(\tilde x) = 1$, assume an initial condition $x_0$ close to the origin, and let the iteration proceed. The $n$th iterate $x_n = f^n(x_0)$ first increases gently, but then more and more rapidly, up to a point where it becomes larger than $\tilde x$. During this phase of the motion the iterate drifts regularly away from the origin, so we call it the laminar phase. Due to the expansivity of $f$ at points far from the origin, this monotonous variation is suddenly interrupted: a turbulent burst occurs, starting a chaotic dynamics which finally reinjects the iterate into a region very close to the origin, where a new laminar phase begins. Hence we have an intermittent signal.
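The intermittent mechanism is easy to observe numerically; a minimal sketch (ours, for illustration) of the map and its iteration:

```python
def manneville(x: float, z: float) -> float:
    """One step of the Manneville map f(x) = x + x**z (mod 1)."""
    return (x + x ** z) % 1.0

def orbit(x0: float, z: float, n: int) -> list[float]:
    """The first n iterates x_0, x_1, ..., x_{n-1}."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(manneville(xs[-1], z))
    return xs

# With z well above 2 and x0 near the origin, the iterates creep away
# from 0 through a long laminar phase before the first turbulent burst.
xs = orbit(1e-3, 4.0, 2000)
```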
Fig. 1. (a) Graph of the Manneville map with z = 2. (b) Lyapunov exponent of the Manneville map as a function of the parameter z.
In [5] it is stated that the intermittent systems of Manneville, $x_{n+1} = f(x_n)$, are sporadic when $z \ge 2$, in the sense given in Section 3.2. In [5] it is also stated that the expectation of the information content $K(s^n)$ of the symbolic string of length $n$ coming from the orbits of the Manneville map behaves like

$$E(K(s^n)) \sim n \quad \text{if } 1 \le z < 2,$$
$$E(K(s^n)) \sim n^{1/(z-1)} \quad \text{if } z > 2.$$

For instance, for $z = 2.3$ the predicted exponent is $1/(z-1) = 10/13 \approx 0.77$, and for $z = 4$ it is $1/3$. Finally, Gaspard and Wang [5] assert that the entropy is positive for $z < 2$, vanishes near $z = 2$, and is zero when $z \ge 2$. So intermittent systems are chaotic when $z < 2$ but neither chaotic nor periodic when $z \ge 2$.
Fig. 2. The graphs show the length of the compressed symbolic orbit as a function of the number of iterations, with six different initial points: (a) the graph with z = 1.5, where the mean compression rate is 0.6; (b) the graph refers to z = 1.7, where the mean compression rate is 0.75.
Fig. 3. The graphs show the length of the compressed symbolic orbit as a function of the number of iterations (plotted in bilog scale), with six different initial points: (a) the graph with z = 2.3; (b) the graph refers to z = 4.
Fig. 1 shows a numerical estimation of the Lyapunov exponent of the Manneville map as a function of the parameter $z$.

We have constructed a set of symbolic orbits of the Manneville map by dividing the interval $[0,1]$ into two segments $B_0 = [0, \tilde x]$, $B_1 = (\tilde x, 1]$, obtaining binary symbolic orbits. We have measured the information content of the orbits of the Manneville map as a function of the number of iterations; the information content was measured by compressing the symbolic strings with CASToRe. The results are shown in the following figures.

When $z < 2$ (as in the examples plotted in Fig. 2) we expect the information to increase linearly with the number of iterations. Moreover, from Theorem 2, we expect the proportionality factor to be the entropy of the system. As we can see, the numerical results (mean compression rates 0.6 and 0.75 for $z = 1.7$ and 1.5, respectively) agree with the theoretical predictions.

In Fig. 3 the numerical results obtained with exponents $z = 2.3$ and $z = 4$ are shown: the graphs are plotted in bilog scale, so the expected power-law behaviour of the information becomes linear. The slopes of these graphs are the exponents of the power laws.
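The measurement pipeline of this section can be reproduced in outline. The sketch below (ours) reuses the `manneville` and `castore_parse` helpers from the earlier sketches; `x_tilde` is found by bisection, since $f(\tilde x) = 1$ has no closed form for general $z$, and the number of parsed phrases is used as a rough proxy for the compressed length (a real coder would also charge the bits needed to write each pair of pointers).

```python
def x_tilde(z: float) -> float:
    """Solve x + x**z = 1 by bisection (the left branch of f is increasing)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mid + mid ** z < 1.0:
            lo = mid
        else:
            hi = mid
    return lo

def symbolic_orbit(x0: float, z: float, n: int) -> str:
    """Binary coding of the orbit: '0' if x_k is in B0 = [0, x~], else '1'."""
    xt = x_tilde(z)
    symbols, x = [], x0
    for _ in range(n):
        symbols.append('0' if x <= xt else '1')
        x = manneville(x, z)
    return ''.join(symbols)

# For z = 2.3 the phrase count on growing prefixes should follow a power
# law with exponent near 1/(z - 1) = 10/13; the initial point is arbitrary.
s = symbolic_orbit(0.3172, 2.3, 10_000)
growth = [(n, len(castore_parse(s[:n]))) for n in (100, 1_000, 10_000)]
```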
Fig. 4. The figure shows the average over 12 different orbits of the information content (in bilog scale), compared with a line whose slope is the theoretically predicted exponent: (a) the graph refers to the exponent z = 2.3 and the slope is 10/13; (b) the exponent is z = 4 and the slope is 1/3.
Fig. 4 shows (in bilog scale, again) the average behaviour of the length of the compressed orbits, compared with a line whose slope is the theoretically predicted one.

References

[1] Buiatti M, Grigolini P, Palatella L. Nonextensive approach to the entropy of symbolic sequences. Physica A 1999;268:214.
[2] Brudno AA. Entropy and the trajectories of a dynamical system. Trans Moscow Math Soc 1983;2:127–51.
[3] Chaitin GJ. Information, randomness and incompleteness: papers on algorithmic information theory. Singapore: World Scientific; 1987.
[4] Galatolo S. Orbit complexity and data compression. Preprint, University of Pisa; to appear in Discrete Continuous Dynam Syst.
[5] Gaspard P, Wang X-J. Sporadicity: between periodic and chaotic dynamical behaviours. Proc Natl Acad Sci USA 1988;85:4591–5.
[6] Kolmogorov AN. Combinatorial foundations of information theory and the calculus of probabilities. Russian Math Surv 1983;38:29–40.
[7] Lempel A, Ziv J. Compression of individual sequences via variable-rate coding. IEEE Trans Inform Theory 1978;IT-24:530–6.
[8] Manneville P. Intermittency, self-similarity and 1/f spectrum in dissipative dynamical systems. J Physique 1980;41:1235–43.
[9] White H. Algorithmic complexity of points in dynamical systems. Ergodic Theory Dynam Syst 1993;13:807–30.