Physica D 231 (2007) 137–142 www.elsevier.com/locate/physd
Topological permutation entropy
José M. Amigó a,∗, Matthew B. Kennel b
a Centro de Investigación Operativa, Universidad Miguel Hernández, 03202 Elche, Spain
b Institute for Nonlinear Science, University of California, San Diego, La Jolla, CA 92093-0402, USA
Received 7 December 2006; received in revised form 27 April 2007; accepted 30 April 2007 Available online 22 May 2007 Communicated by J. Stark
Abstract

Permutation entropy quantifies the diversity of possible orderings of the successively observed values a random or deterministic system can take, just as Shannon entropy quantifies the diversity of the values themselves. When the observable or state variable has a natural order relation, making permutation entropy possible to compute, the asymptotic rate of growth of permutation entropy with word length forms an alternative means of describing the intrinsic entropy rate of a source. Herein, extending a previous result on the metric entropy rate, we show that the topological permutation entropy rate for expansive maps equals the conventional topological entropy rate familiar from symbolic dynamics. This result is not limited to one-dimensional maps.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Topological entropy; Order patterns; Permutation entropy
1. Introduction

Metric and topological permutation entropy were first introduced by Bandt et al. in [3,4]. In [1] the authors introduced a slightly different definition of the metric permutation entropy of a map, differing from the original one basically in the order of an iterated limit (first the length of the orbit, then the precision of the measurement, as in the definition of topological entropy). This technical change allowed the authors to generalize one of the main results of [3], namely the equality of metric entropy and metric permutation entropy for piecewise monotone maps on one-dimensional intervals, to higher dimensions, at the expense of requiring ergodicity (Theorem 1 below). The possibility of applying the same approach to the topological entropies was not explored at that time. In this paper we fill the gap with the parallel result that the equality of topological entropy and topological permutation entropy for piecewise monotone maps on one-dimensional
∗ Corresponding address: Universidad Miguel Hernández, Statistics and Applied Mathematics, Avda. de la Universidad s/n, 03202 Elche, Alicante, Spain. Fax: +34 96 665 87 15. E-mail addresses:
[email protected] (J.M. Amig´o),
[email protected] (M.B. Kennel).
0167-2789/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.physd.2007.04.010
intervals (the other main result of [3]) can also be generalized to higher dimensions, this time requiring the (continuous) map to be expansive (Theorem 2 below), a property very common in applications. The possibility of going to higher dimensions is an advantage of our definitions of metric and topological permutation entropies.

Our presentation proceeds as follows. We begin by considering discrete-time, finite-alphabet information sources, since our definitions of metric and topological permutation entropies of maps require doing so. Since this paper is a continuation of [1], we use the same notation. Also, for the reader's convenience, we have tried to present all the material needed for our main result, Theorem 2, in a self-contained way. This requires us to summarize some basic facts and results from [1]. After developing the theoretical part in Sections 2–4, the numerical simulations in Section 5 address the difficulties and feasibility of using order patterns to estimate the topological entropy in practice. Finally, in Section 6 we discuss the role of order patterns in discriminating between random and deterministic dynamics.

2. Topological permutation entropy of information sources

Let (Ω, F, µ) be a probability space, i.e., Ω is a non-empty set, F is a sigma-algebra of subsets of Ω (the possible 'events')
and µ is a positive measure on (Ω, F) such that µ(Ω) = 1. Furthermore, let S = (S_n), where n ∈ N, N_0 := {0} ∪ N or Z, be a sequence of random variables on Ω taking values in a discrete set A = {a_1, a_2, ..., a_{|A|}} called the state space. We say that S is a discrete-time, finite-state stochastic (or random) process and call the totality of its joint distribution functions,

Pr(s_0, ..., s_{L−1}) := µ({ω ∈ Ω : S_0(ω) = s_0, ..., S_{L−1}(ω) = s_{L−1}}),   (1)

L ≥ 1, its probability law. Moreover, we say that the stochastic process S is stationary if its probability law is invariant under 'time translation', i.e., if

Pr(s_0, ..., s_{L−1}) = Pr(s_k, ..., s_{L−1+k})   for any k ∈ N.

Henceforth we call a discrete-time, finite-state stationary stochastic process indexed by the set N_0, S = (S_n)_{n∈N_0}, an information source. Such random processes provide models for physical information sources that must be turned on at some time. In this context, we call the state space A the alphabet of the source and thus speak of finite-alphabet information sources. (Discrete-time, arbitrary-alphabet information sources were considered in [1].) Observe that the possible outputs of the process S are points in the sequence space A^{N_0} = {ξ = (ξ_n)_{n∈N_0} : ξ_n ∈ A}, and that the probability of the 'message' of length L, (S_n(ω))_{n=0}^{L−1} = s_0, ..., s_{L−1}, is Pr(s_0, ..., s_{L−1}). The stationary probability functions (1) then define a measure m on the so-called cylinder sets of A^{N_0} (the generators of the product topology and the product sigma-algebra Z of A^{N_0}),

m({ξ ∈ A^{N_0} : ξ_0 = s_0, ..., ξ_{L−1} = s_{L−1}}) := Pr(s_0, ..., s_{L−1}),   L ≥ 1,

that is invariant under the one-sided shift T : (ξ_n) ↦ (ξ_{n+1}). We say that S is ergodic, mixing, etc., if T has the corresponding property with respect to the measure m. In this framework, we define the topological entropy of order L of S as

H_top(S_0^{L−1}) := (1/L) log N(S, L),   (2)

where S_0^{L−1} is shorthand for the block of random variables S_0, ..., S_{L−1} and N(S, L) is the number of sequences (words, blocks, ...) of length L, s_0^{L−1} := s_0, ..., s_{L−1}, that S can output, i.e., N(S, L) is the number of words of length L, built by consecutive letters, that can be observed in the messages of S (since S is stationary, we may restrict ourselves to the initial segment). The topological entropy of S is then

h_top(S) := lim sup_{L→∞} H_top(S_0^{L−1}).   (3)

If, furthermore,

H_m(S_0^{L−1}) := −(1/L) Σ m(s_0, ..., s_{L−1}) log m(s_0, ..., s_{L−1})   (4)

is the Shannon (or metric) entropy of order L of S, then clearly H_m(S_0^{L−1}) ≤ H_top(S_0^{L−1}) (for any logarithm base > 1) and, therefore,

h_m(S) := lim_{L→∞} H_m(S_0^{L−1}) ≤ h_top(S),   (5)

where h_m(S) is the Shannon (or metric) entropy of S. Also,

h_m(S) = h_top(S) ⇔ m(s_0, ..., s_{L−1}) = 1/N(S, L)   ∀L ≥ 1.
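As an illustration of (2), the following minimal sketch (Python, chosen here purely for illustration; the example system and function names are ours, not from the paper) counts the allowed words of the golden-mean shift, the binary source whose messages never contain the block 11. Here N(S, L) follows the Fibonacci recursion, and H_top(S_0^{L−1}) = (1/L) log2 N(S, L) decreases toward log2 of the golden ratio ≈ 0.694 bits:

```python
import math
from itertools import product

def allowed_words(L):
    """All binary words of length L with no '11' block (golden-mean shift)."""
    return [w for w in product((0, 1), repeat=L)
            if all(not (w[i] == 1 and w[i + 1] == 1) for i in range(L - 1))]

def H_top(L):
    """Topological entropy of order L: (1/L) * log2 N(S, L)."""
    return math.log2(len(allowed_words(L))) / L

# N(S, L) = 2, 3, 5, 8, ... (Fibonacci numbers); H_top(L) decreases with L
for L in (1, 2, 3, 4, 8):
    print(L, len(allowed_words(L)), round(H_top(L), 4))
```

Brute-force enumeration is exponential in L, of course; it serves only to make the definition concrete.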
Observe that H_m(S_0^{L−1}), H_top(S_0^{L−1}) and, consequently, h_m(S) and h_top(S) are actually entropy rates, intensive quantities which are invariants of the dynamical system. From here on we adopt conventional nomenclature, omitting the qualifier "rate" from intensive entropy rates. Suppose now that the alphabet A of the information source S is endowed with a total ordering ≤, so that one can also define the corresponding permutation entropies via the order patterns defined by the words of finite length. Let σ_L denote the set of permutations of {0, 1, ..., L−1}. If π ∈ σ_L and 0 ↦ π(0), ..., L−1 ↦ π(L−1), then we write π = [π(0), ..., π(L−1)]. Given the output (s_n)_{n∈N_0} of S, we say that a length-L word s_k^{k+L−1} = s_k, s_{k+1}, ..., s_{k+L−1} defines the permutation or order pattern π ∈ σ_L if

s_{k+π(0)} ≺ s_{k+π(1)} ≺ ··· ≺ s_{k+π(L−1)},

where, for definiteness, given s_i, s_j ∈ A and i, j ∈ N_0 with i ≠ j,

s_i ≺ s_j ⇔ s_i < s_j, or i < j if s_i = s_j.

The metric and topological permutation entropies of an information source are defined analogously to the metric entropy and topological entropy, using rank variables. Given a finite-alphabet source S = (S_n)_{n∈N_0} with sequence space (A^{N_0}, Z, m), each possible order pattern defined by a block of length L, e.g., s_0^{L−1} = s_0, ..., s_{L−1}, can be indexed as a word of ranks, each an integer in successively larger alphabets. In particular, define for n ≥ 0 the rank variable

R_n = |{S_i, 0 ≤ i ≤ n : S_i ≤ S_n}| = Σ_{i=0}^{n} δ(S_i ≤ S_n),

where, as usual, |·| denotes cardinality and the δ-function of a proposition is 1 if the proposition holds and 0 otherwise. By definition, R_n is a discrete random variable on Ω with alphabet {1, ..., n+1}, and the sequence R = (R_n)_{n∈N_0} is a discrete-time non-stationary process. The order pattern π ∈ σ_L defined by s_0^{L−1} can then also be viewed as the word r_0^{L−1} = r_0, ..., r_L−1, the relation between the two being one-to-one.
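These definitions are easy to state in code. A minimal sketch (Python, for illustration only; the function names are ours) of the rank word and the order pattern of a finite word, using the tie-breaking rule ≺ above:

```python
def ranks(s):
    """Rank variables: r_n = #{ i <= n : s_i <= s_n }, each in {1, ..., n+1}."""
    return [sum(1 for i in range(n + 1) if s[i] <= s[n]) for n in range(len(s))]

def order_pattern(s):
    """Order pattern [pi(0), ..., pi(L-1)]: indices sorted by value,
    ties broken by position, per the relation 'precedes' above."""
    return sorted(range(len(s)), key=lambda i: (s[i], i))
```

For the word 3, 2, 2 used in the example below, `ranks` gives 1, 1, 2 and `order_pattern` gives [1, 2, 0], and distinct words such as 2, 1, 1 share the same rank word, illustrating that the map ϕ is many-to-one on words while ranks and order patterns determine each other uniquely.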
The many-to-one relation between S_0^{L−1} and R_0^{L−1} is written as R_0^{L−1} = ϕ(S_0^{L−1}). For example, consider a source S over the alphabet A = {1, 2, 3}. Suppose we observe the word s_0^2 = 3, 2, 2. Then r_0^2 = ϕ(s_0^2) = 1, 1, 2 (of course other words, e.g., 2, 1, 1 or 3, 1, 1, also map to r_0^2 = 1, 1, 2), and π = [1, 2, 0] is the order pattern defined by s_0^2. The word s_0^2 could be counted as
matching both the ordering s_1 ≤ s_2 ≤ s_0 and the ordering s_2 ≤ s_1 ≤ s_0. By using ranks, by contrast, the measure associated with each word is unambiguously associated with exactly one permutation: the order pattern defined by the word. Thus, the metric permutation entropy of S, h*_m(S), is defined as

h*_m(S) = lim sup_{L→∞} H*_m(S_0^{L−1})

with

H*_m(S_0^{L−1}) := H_m(R_0^{L−1}) := −(1/(L−1)) Σ m(r_0, ..., r_{L−1}) log m(r_0, ..., r_{L−1})   (6)

(see (4)), and the topological permutation entropy of S, h*_top(S), is defined as

h*_top(S) = lim sup_{L→∞} H*_top(S_0^{L−1})   (7)

with

H*_top(S_0^{L−1}) := H_top(R_0^{L−1}) := (1/(L−1)) log N(R, L)   (8)

(see (2)). The normalization factor 1/(L−1) in (6) and (8), instead of 1/L as in (4) and (2), is due to the fact that single letters do not define any order pattern (of course, the choice 1/L leads to the same limit when L → ∞). As in (5), the topological permutation entropy is an upper bound of the metric permutation entropy,

h*_m(S) ≤ h*_top(S),   (9)

and

h*_m(S) = h*_top(S) ⇔ m(r_0, ..., r_{L−1}) = 1/N(R, L)   ∀L ≥ 2.

From these definitions and N(R, L) ≤ N(S, L), since several finite symbol sequences may produce the same sequence of rank variables (i.e., ϕ(·) is many-to-one), it follows that

h*_top(S) ≤ h_top(S).   (10)

3. Topological permutation entropy of maps

Let I be a compact interval of R^q endowed with the sigma-algebra B|_I = B ∩ I, the restriction of the Borel sigma-algebra of R^q to I, and let f : I → I be a µ-preserving transformation, with µ being a probability measure on (I, B|_I). Let us first of all recall the definition of metric (also called measure-theoretic and Kolmogorov–Sinai) entropy, tailored to our particular setting. Given a finite partition α = {A_1, ..., A_{|α|}} ⊂ B|_I of I, the entropy of f with respect to α is defined as

h_µ(f, α) := lim_{L→∞} (1/L) H_µ(∨_{i=0}^{L−1} f^{−i} α),   (11)

where ∨_{i=0}^{L−1} f^{−i} α = {∩_{i=0}^{L−1} f^{−i} A_{j_i}} is the least common refinement of the partitions α, f^{−1}α, ..., f^{−L+1}α, and

H_µ(β) := −Σ_{j=1}^{|β|} µ(B_j) log µ(B_j)

for any finite partition β = {B_1, ..., B_{|β|}} ⊂ B|_I; usually, logarithms are taken to base 2 or e and, by convention, 0 · log 0 := lim_{x→0+} x log x = 0. It can be shown that the limit in (11) always exists [7]. The metric entropy rate of the map f is then defined as

h_µ(f) := sup_α h_µ(f, α).   (12)

Assuming logarithms to base 2, h_µ(f) has units of bits per symbol, or per time unit if L is interpreted as discrete time. In an information-theoretical setting, h_µ(f, α) represents the long-term average of the information gained per unit time with respect to a certain partition, and h_µ(f) the maximum information per unit time available from any stationary process generated by the source. In order to define next the permutation entropy of f, we consider first product partitions

ι = ∏_{k=1}^{q} {I_{1,k}, ..., I_{N_k,k}}

of I into N := N_1 · ... · N_q subintervals of lengths ∆_{j,k}, 1 ≤ j ≤ N_k, in each coordinate k, defining the norm ‖ι‖ = max_{j,k} ∆_{j,k} of the partition ι (other definitions are also possible). For definiteness, the intervals are lexicographically ordered in each dimension, i.e., points in I_{j,k} are smaller than points in I_{j+1,k}, and for the multiple dimensions a lexicographic order is defined, I_{j,k} < I_{j,k+1}, so there is an order relation between all the N partition elements, and we can enumerate them with a single index i ∈ {1, ..., N}:

ι = {I_i : 1 ≤ i ≤ N},   I_i < I_{i+1}.

Next define a collection of simple observations S^ι = (S^ι_n)_{n∈N_0} with respect to f with precision ‖ι‖:

S^ι_n(x) = i   if f^n(x) ∈ I_i,   n = 0, 1, ....

Then S^ι is a stationary N-state random process or, equivalently, an information source on (I, B|_I, µ) with finite alphabet A_ι = {1, ..., N} and output probability distribution m = µ ∘ φ^{−1}, with φ(x) = (S^ι_0(x), S^ι_1(x), ...) ∈ A_ι^{N_0}. As in [1], we define the metric permutation entropy of f with respect to the invariant measure µ by

h*_µ(f) := lim_{‖ι‖→0} h*_m(S^ι).   (13)

The limit exists (since h*_m(S^ι) is larger the finer ι is) and does not depend on the product partition ι, so that we can take uniform partitions (i.e., "box partitions") whenever convenient. In this framework the present authors and Kocarev [1] previously proved the equality of the conventional metric entropy and the permutation entropy:

Theorem 1 ([1]). If f : I → I is ergodic with respect to the invariant measure µ, then h*_µ(f) = h_µ(f), where h_µ(f) denotes the metric entropy of f with respect to µ. In words, the permutation entropy of ergodic maps equals the measure-theoretic entropy.
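A uniform box partition and the induced simple observations S^ι of (13) can be sketched as follows (an illustrative one-dimensional version in Python; the map, the initial condition, and the function name are assumptions of the example, not taken from the paper):

```python
def box_symbols(f, x0, n, N, lo=0.0, hi=1.0):
    """Simple observations S^iota: the index of the partition box visited by
    f^k(x0), for a uniform partition of [lo, hi) into N boxes (boxes 1..N)."""
    symbols, x = [], x0
    for _ in range(n):
        i = min(int((x - lo) / (hi - lo) * N), N - 1) + 1
        symbols.append(i)
        x = f(x)
    return symbols

# example: coarse-grain a tent-map orbit with N = 4 boxes
tent = lambda x: 2 * x if x < 0.5 else 2 * (1 - x)
```

Refining the partition (increasing N) sharpens the precision ‖ι‖ of the observations, which is exactly the limit taken in (13) and (14).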
We now define, analogously to (13), the topological permutation entropy of f as

h*_top(f) := lim_{‖ι‖→0} h*_top(S^ι).   (14)

Again, the limit in (14) does not depend on the specifics of the product partition ι, so that it can be taken to be a box partition without loss of generality. From (9), we have

h*_µ(f) ≤ h*_top(f).   (15)

4. Relation between topological entropy and topological permutation entropy

The topological entropy of a map can be defined in different ways: via open covers (Adler, Konheim and McAndrew) or via separated and spanning sets (Dinaburg and Bowen); see e.g. [7]. Next we briefly recall the definition that is most useful for our purposes. Let (X, d) be a compact metric space, d denoting a metric on the set X, and f : X → X a continuous map on X. (The compactness assumption can be dropped if f is uniformly continuous, but we do not need this additional generality.) If n ∈ N, we define a new metric d_n on X by

d_n(x, y) = max_{0≤i≤n−1} d(f^i(x), f^i(y)).

Then f is said to be (positively) expansive if there exists δ > 0 such that d_n(x, y) ≤ δ for all n ≥ 0 implies x = y. We will call δ the expansiveness constant of f. On the other hand, given ε > 0, a subset Σ ⊂ X is said to be (n, ε)-separated with respect to f if x, y ∈ Σ, x ≠ y, implies d_n(x, y) > ε. Thus, an (n, ε)-separated subset of X is a kind of microscope that allows us to distinguish orbits of length n up to a precision ε. Let c_n(ε, X) denote the largest cardinality of any (n, ε)-separated subset of X with respect to f. The topological entropy rate of f is then defined as

h_top(f) = lim_{ε→0} lim sup_{n→∞} (1/n) log c_n(ε, X).

Moreover, if E(X, f) denotes the set of f-invariant and ergodic measures on X (E(X, f) ≠ ∅ because of the Krylov–Bogolioubov Theorem [7]), then h_top(f) can also be obtained from the metric entropies of f with respect to µ ∈ E(X, f) by means of the following variational principle [7]:

h_top(f) = sup_{µ∈E(X,f)} h_µ(f).   (16)

Lemma 1. Let I ⊂ R^q be a compact interval and f : I → I an expansive map. Then

lim_{‖ι‖→0} h_top(S^ι) = h_top(f).

Proof. Because of our definition of the partition norm, we will use the l∞-distance in R^q, namely, d(x, y) = max_{1≤i≤q} |x_i − y_i|, with x = (x_i)_{i=1}^q and y = (y_i)_{i=1}^q. Let Σ be an (n, ε)-separated subset of I and lay on I a product partition ι = {I_i}_{i=1}^N of norm ‖ι‖ = ε. Suppose that there are points x, y ∈ Σ ∩ I_{i_0}, 1 ≤ i_0 ≤ N. Then

d_n(x, y) > ε ⇔ d(f^i(x), f^i(y)) > ε for some 0 ≤ i ≤ n−1 ⇒ (S^ι)_0^{n−1}(x) ≠ (S^ι)_0^{n−1}(y),

where (S^ι)_0^{n−1}(x) denotes the word S^ι_0(x), ..., S^ι_{n−1}(x). Thus, every point x ∈ Σ ∩ I_{i_0} generates a different sequence (S^ι)_0^{n−1}(x) = i_0, ... of length n. Of course, there can be points x′ ∈ I_{i_0}, x′ ∉ Σ, such that (S^ι)_0^{n−1}(x′) = i_0, ... ≠ (S^ι)_0^{n−1}(x) for all x ∈ Σ ∩ I_{i_0}, but the number of such points will vanish when n → ∞ if ε ≤ δ, δ being the expansiveness constant of f. In this limit (and for ε ≤ δ) we also have Σ ∩ I_i ≠ ∅ for all i, 1 ≤ i ≤ N, so that there is a one-to-one relation between points in Σ and outputs (s^ι)_0^∞ of S^ι. It follows that

lim sup_{n→∞} (1/n) log N(S^ι, n) = lim sup_{n→∞} (1/n) log c_n(ε, I)

for ε ≤ δ, and hence

lim_{‖ι‖→0} h_top(S^ι) = lim_{‖ι‖→0} lim sup_{n→∞} (1/n) log N(S^ι, n) = lim_{ε→0} lim sup_{n→∞} (1/n) log c_n(ε, I) = h_top(f).

Theorem 2. Let I be a compact q-dimensional interval and f : I → I a continuous map. If, moreover, f is expansive, then h*_top(f) = h_top(f).

Proof. From Theorem 1, h_µ(f) = h*_µ(f) holds for all µ ∈ E(I, f), and thus (see (16))

h_top(f) = sup_{µ∈E(I,f)} h*_µ(f) ≤ h*_top(f),   (17)

where the last inequality follows from (15). On the other hand, we can use (10) and Lemma 1 to prove the reverse inequality:

h*_top(f) = lim_{‖ι‖→0} h*_top(S^ι) ≤ lim_{‖ι‖→0} h_top(S^ι) = h_top(f).   (18)

A last comment. The definition given by Bandt, Keller and Pompe in [3] of the topological permutation entropy of a map f on a closed interval I ⊂ R is

h_top^{BKP*}(f) := lim_{L→∞} (1/(L−1)) log N(f, L),

where N(f, L) is the number of order patterns defined by the orbits of length L, (f^n(x))_{n=0}^{L−1} with x ∈ I. They then proved h_top^{BKP*}(f) = h_top(f) for f piecewise monotone (i.e., there is a finite partition of I into intervals such that on each of those intervals f is continuous and monotone). In turn, Misiurewicz proved that this result does not hold if the map is not piecewise monotone [6]. His counterexample is a continuous map with infinitely many monotonicity segments. Theorem 2 shows that the definition (14) of topological permutation entropy also has advantages in the one-dimensional case, provided the map is expansive. Qualitatively, what we are doing is first assuming
a finite discretization of the space, then defining entropies and rates on the easier-to-handle discrete space, and lastly taking the limit as the discretization goes to zero. This parallels the conventional way of defining the Kolmogorov–Sinai entropy of dynamical systems and appears to lessen the incidence of pathologies and unusual cases such as the one exemplified by [6]. We advance as a conjecture that most physically reasonable dynamical systems will give the same topological permutation entropy with either scheme.

5. Numerical simulations

Estimation of topological entropies from naive numerical simulation of long orbits is notoriously difficult. Metric entropy by itself can be quite tricky, requiring very long data sets for increasing L, but topological entropy is worse yet, because it weights each pattern equally. This means that patterns which are exceptionally infrequent with respect to the natural measure of the attractor can still have a significant influence on the result. Attempting to estimate the same quantities using empirical occurrences of order patterns is even more difficult, requiring more data than would a good, low-alphabet generating partition for ordinary symbolic dynamics. For the present purpose, we desire a continuous system in more than one dimension, with a natural chaotic attractor, whose topological entropy can be found by independent rigorous means, and which has parameter values with reasonably low entropy. The Lozi map,

x_{i+1} = y_i,
y_{i+1} = 1 + b x_i − a |y_i|,

for parameters a, b, is the only map we know of that satisfies all these criteria. In particular, we find that a = 6/5, b = −2/15 yield a low-entropy chaotic attractor (roughly 0.3 bits/iteration); for those parameters, the topological entropy has been bounded rigorously with computer-assisted analytical computations [8,9], and we use their results. We found that the best numerical procedure was to look at the "outgrowth ratio" of order patterns of a given length L.
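Iterating the Lozi map at these parameters is straightforward; a minimal sketch (Python for illustration; the initial condition, transient length, and function name are our assumptions, not the authors' code):

```python
def lozi_orbit(n, a=6/5, b=-2/15, x0=0.1, y0=0.1, transient=1000):
    """Generate n points (x_i, y_i) of the Lozi map after discarding a transient."""
    x, y = x0, y0
    for _ in range(transient):
        x, y = y, 1 + b * x - a * abs(y)
    orbit = []
    for _ in range(n):
        x, y = y, 1 + b * x - a * abs(y)
        orbit.append((x, y))
    return orbit
```

A scalar time series for order-pattern analysis can then be taken as, e.g., the y-coordinate of the orbit.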
The outgrowth ratio for some pattern of length L is the cardinality of the set of distinct order patterns of length L+1 which have the given length-L pattern as a prefix. More concretely, we find vectors of length L+1 from an orbit of the map. The order pattern on the first L points is the prefix pattern. Regardless of the dynamics, there can be at most L+1 order patterns of length L+1 conditioned on the length-L order pattern, since the single new element is a rank variable in the alphabet {1, ..., L+1}. Indeed, according to the definitions (14), (7) and (8), the topological permutation entropy h*_top(f) is the scaling rate with L of the logarithm of the number of patterns of the 'coarse-grained' dynamics S ≡ S^ι for ι sufficiently fine, i.e., log N(R, L) ≈ (L−1) h*_top(S), so that

log [N(R, L+1) / N(R, L)] ≈ h*_top(S).
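The bookkeeping behind the outgrowth ratio can be sketched as follows (an illustrative reimplementation in Python, not the authors' code; `series` stands for any scalar time series, e.g. the y-coordinate of a Lozi orbit):

```python
import math
from collections import defaultdict

def order_pattern(window):
    """Order pattern of a window: indices sorted by value, ties broken by position."""
    return tuple(sorted(range(len(window)), key=lambda i: (window[i], i)))

def mean_log_outgrowth(series, L):
    """Average log2 outgrowth ratio over length-L prefix patterns that have
    at least two distinct length-(L+1) successor patterns."""
    successors = defaultdict(set)
    for k in range(len(series) - L):
        w = series[k:k + L + 1]
        successors[order_pattern(w[:L])].add(order_pattern(w))
    logs = [math.log2(len(s)) for s in successors.values() if len(s) >= 2]
    return sum(logs) / len(logs) if logs else 0.0
```

For independent continuous noise every length-L prefix eventually acquires all L+1 successors, so the estimate approaches log2(L+1); for a monotone (trivially deterministic) series no prefix has two successors and the estimate is zero.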
Fig. 1. Logarithmic outgrowth ratios for the Lozi map versus L. The dotted lines represent rigorous bounds on the topological entropy rate, computed by computer-assisted analytical methods. The outgrowth ratio approximates the topological permutation entropy rate, is practical to compute, and can scale to significant L.
Therefore, a reasonable estimator for h*_top(f) is the logarithm of the outgrowth ratio averaged uniformly over all extant prefix patterns. This value, for sufficiently large L and sufficiently large simulation sets, ought to be h*_top(f) on average. Note that independent white noise would give an estimate of log(L+1), i.e., one not converging with L. Fig. 1 shows the numerical result of estimating h*_top(f) on long orbits of the Lozi map using two specific instantiations of the outgrowth method. The dotted lines are the bounds on the true topological entropy. The first strategy involves computing N_1 = 50 · 10^6 order patterns of length L+1 and their length-L prefixes. For every element in the prefix set we accumulate the number of distinct elements in the conditioning set and average the logarithm of the number of distinct occurrences over the observed length-L order patterns, as long as each of those order patterns had at least two successors. This method will typically have a downward bias for large L on account of undersampling the space. The second strategy starts by computing N_2 = 10^6 order patterns of length L+1 from orbits of the map. The set of distinct length-L prefixes forms the "conditioning" set. The N_2 length-(L+1) order patterns from these are accumulated, and then the map is iterated and order patterns computed, until there have been (K−1)N_2 more observations of length-L order patterns which were in the prefix set, so that there are K N_2 = N_1 observations with K = 50, all of whose length-L prefixes are in the conditioning set. Then, similarly, the logarithm of the outgrowth ratio is estimated over the conditioning set for all conditioning patterns with at least two observations. This method has both positive and negative biases due to the finiteness of observations. Firstly, because of finite K there is a downward bias, as the number of observed outgrowths is a strict lower bound on the number of allowed outgrowths in the dynamical system.
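The qualitative contrast between noise and deterministic dynamics is easy to reproduce at small L; a hypothetical side-by-side sketch (Python, our own illustration with our own initial condition and helper names, averaging over all observed prefixes rather than only those with two or more successors):

```python
import math
import random
from collections import defaultdict

def order_pattern(w):
    # order pattern: indices sorted by value, ties broken by position
    return tuple(sorted(range(len(w)), key=lambda i: (w[i], i)))

def log_outgrowth(series, L):
    """log2 of the outgrowth ratio, averaged over all observed length-L prefixes."""
    succ = defaultdict(set)
    for k in range(len(series) - L):
        w = series[k:k + L + 1]
        succ[order_pattern(w[:L])].add(order_pattern(w))
    vals = [math.log2(len(s)) for s in succ.values()]
    return sum(vals) / len(vals)

# y-series of the Lozi map at the paper's parameters, after a transient
a, b = 6/5, -2/15
x, y, lozi = 0.1, 0.1, []
for n in range(101000):
    x, y = y, 1 + b * x - a * abs(y)
    if n >= 1000:
        lozi.append(y)

random.seed(0)
noise = [random.random() for _ in range(100000)]

for L in (2, 3, 4, 5):
    print(L, round(log_outgrowth(lozi, L), 3), round(log_outgrowth(noise, L), 3))
```

The noise column grows like log2(L+1), while the deterministic column stays well below it and remains bounded, in line with the discussion above.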
There is a more subtle upwards bias, which changes with L as well. This is because the order patterns which were selected as conditioning states came from an ergodic
sample on the natural measure, which does not sample the support uniformly. More frequently occurring patterns are more likely to occur in the conditioning set, and we have observed heuristically that in chaotic systems the outgrowth ratio tends to be roughly correlated in the same direction as the frequency of the conditioning pattern. The measure on the allowable patterns varies very widely, and hence it can take very long simulations to find more of the allowable conditioning patterns, even though their total number is far smaller than the number of samples from the map. This effect is also present in the first method, but there it appears to be dominated by the downward bias. We comment that the feasibility of finding such a long scaling region in L, with good numerical agreement, depends strongly on our having used a low-entropy map to begin with, and on the ability to employ very large simulation lengths compared to the number of observed distinct patterns. In other circumstances, with higher-entropy maps, we found it quite difficult to control all the biases well enough to see the true asymptotic scaling regime. For higher entropies there are far more patterns, making long simulations and the memory requirements unfeasible for the first strategy. The second strategy was designed to constrain the memory requirements, but there are nevertheless statistical biases from multiple sources which make it difficult to obtain quantitative correspondence with known values using reasonable computational resources. Nevertheless, we typically find that the outgrowth ratio does remain bounded for deterministic dynamics, as opposed to noise, so that practical finiteness of the outgrowth ratio may still be a useful statistic for distinguishing between determinism and noise. See also [5] for a similar conclusion in the analysis of time series using order patterns ("ordinal analysis of time series").

6. Discussion

After reading this paper and the results in [1,3], the reader may be tempted to dismiss permutation-based analysis of dynamics as an uninteresting equivalent of well-known symbolic dynamics. In fact, order patterns of dynamical systems do reproduce results of symbolic dynamics, such as the metric and topological entropies that we discussed, but in other ways there are major distinctions, which are just starting to be explored for permutations. For instance, the canonical tent map and the Bernoulli shift (f(x) = 2x mod 1) are isomorphic under a conventional analysis, and in symbolic dynamics both are equivalent to an i.i.d. source of white bits. However, under permutation-based analysis, once the state is imbued with a total ordering, the class of permutation isomorphisms is different [2]. Both conventional symbolic dynamics, assuming a generating partition of a map, and permutation analysis are useful discrete representations of what would otherwise be a dynamical system in continuous space. However, the symbolic dynamics which results from a conventional partitioning is not
fundamentally distinguishable from a noisy system; both result in conventional information sources on a discrete alphabet with positive Shannon entropy. By contrast, the permutation-based analysis does show a fundamental distinction between deterministic chaos and noisy systems. With chaos there is a rich structure of "forbidden patterns" among the order patterns of different lengths, and a hierarchy of consequent derived forbidden patterns [2], the nature of which is not shared with conventional symbolic dynamics. More directly relevant to the present work, the number of allowed permutations can scale super-exponentially, e.g. as the size of the pattern space, L!, which is fundamentally faster than the exponential scaling which must eventually set in for a noise-free deterministic chaotic system. For practical applications, the numerical tools of the type discussed here may serve as a way of distinguishing chaos-like dynamics from noise, at least in simulations. This may be useful in the detection of emergent "coherent structures" similar to low-dimensional chaos in what might otherwise be a high-degree-of-freedom, rather noise-like system. We also note the unique property of permutations having a discrete "algebraic" nature, permitting some rapid computational methods without requiring the estimation of a generating partition for each dynamics. We feel that the appropriate tools for the analysis of typically short observed time series will require still more sophisticated statistical thinking and methods, just as high-quality estimation of entropies from low-alphabet information sources can be a difficult problem despite the apparent simplicity of the definitions themselves.

Acknowledgments

The authors are very grateful to the reviewers for their valuable comments and suggestions. This work was financially supported by the Spanish Ministry of Education and Science, grant MTM2005-04948, and European FEDER Funds.

References

[1] J.M. Amigó, M.B. Kennel, L. Kocarev, The permutation entropy rate equals the metric entropy rate for ergodic information sources and ergodic dynamical systems, Physica D 210 (2005) 77–95.
[2] J.M. Amigó, L. Kocarev, J. Szczepanski, Order patterns and chaos, Phys. Lett. A 355 (2006) 27–31.
[3] C. Bandt, G. Keller, B. Pompe, Entropy of interval maps via permutations, Nonlinearity 15 (2002) 1595–1602.
[4] C. Bandt, B. Pompe, Permutation entropy: A natural complexity measure for time series, Phys. Rev. Lett. 88 (2002) 174102.
[5] K. Keller, M. Sinn, Ordinal analysis of time series, Physica A 356 (2005) 114–120.
[6] M. Misiurewicz, Permutations and topological entropy for interval maps, Nonlinearity 16 (2003) 971–976.
[7] P. Walters, An Introduction to Ergodic Theory, Springer-Verlag, New York, 1982.
[8] Y. Ishii, D. Sands, Monotonicity of the Lozi family near the tent-maps, Comm. Math. Phys. 198 (1998) 397–406.
[9] http://topo.math.u-psud.fr/~sands/Programs/Lozi/index.html.