Chaos, Solitons and Fractals 42 (2009) 322–334
Contents lists available at ScienceDirect
Chaos, Solitons and Fractals journal homepage: www.elsevier.com/locate/chaos
Consistency in approximate entropy given by a volumetric estimate B.T. Santos a, R.A. Martins b, J.E.S. Natali a, V.H. Rodrigues a, F.S. Marques a, J.G. Chauí-Berlinck a,* a
Laboratory of Theoretical Physiology, Department of Physiology, Biosciences Institute, University of São Paulo, Rua do Matão tr14, 321, 05508-090 São Paulo, Brazil Laboratory of Automation and Control, Electrical Engineering, Polytechnic School, University of São Paulo, Brazil
b
a r t i c l e
i n f o
Article history: Accepted 10 December 2008
a b s t r a c t Non-linear methods for estimating variability in time-series are currently of widespread use. Among such methods are approximate entropy (ApEn) and sample approximate entropy (SampEn). The applicability of ApEn and SampEn in analyzing data is evident and their use is increasing. However, consistency is a point of concern in these tools, i.e., the classification of the temporal organization of a data set might indicate a relative less ordered series in relation to another when the opposite is true. As highlighted by their proponents themselves, ApEn and SampEn might present incorrect results due to this lack of consistency. In this study, we present a method which gains consistency by using ApEn repeatedly in a wide range of combinations of window lengths and matching error tolerance. The tool is called volumetric approximate entropy, vApEn. We analyze nine artificially generated prototypical time-series with different degrees of temporal order (combinations of sine waves, logistic maps with different control parameter values, random noises). While ApEn/SampEn clearly fail to consistently identify the temporal order of the sequences, vApEn correctly do. In order to validate the tool we performed shuffled and surrogate data analysis. Statistical analysis confirmed the consistency of the method. 2008 Elsevier Ltd. All rights reserved.
1. Introduction The use of non-linear methods for estimating variability in a given class of phenomena is very important to access the underlying nature of these processes. On the one hand, the variability in a time-series might entail relevant information on the underlying dynamics of the phenomenon, and, on the other hand, changes in the variability might inform us about deviations experienced by a system [1–4]. Therefore, the usefulness of non-linear methods ranges from basic to applied sciences, and it is an active field of research. However, each method has its own drawbacks. For instance, a very important point to be addressed is whether the tool has consistency, i.e., if the method gives non-discrepant results depending on the way it is applied. One of such methods currently at hand is approximate entropy (ApEn), a practical but non-consistent tool. In this study, we develop an ApEn-based method by what we call a ‘‘volumetric estimation”, and we show that this new tool has consistency. Approximate entropy was first proposed by Pincus, in 1991 [5], as a feasible way to approximate the correlation dimension as proposed by Grassberger and Procaccia [6]. Ultimately, ApEn estimates the degree of order in a time-series. The method is based in a counting procedure using sub-vectors of length m from the original time-series; and the counts are computed as positive whenever a distance between sub-vectors is equal or less than a certain error tolerance r (details are presented in a next section). Here we use the notation ApEn(a, b, c)d to indicate the ApEn computation for window m = a, tolerance r = b, vector length N = c, and time-series d under analysis.
* Corresponding author. E-mail address:
[email protected] (J.G. Chauí-Berlinck). 0960-0779/$ - see front matter 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.chaos.2008.12.002
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
323
ApEn became a much employed tool despite the fact that, as pointed out by Pincus himself, it is a biased estimator and sensitive to the length of the time-series vector [5,7]. Moreover, and in our opinion, most relevant, is the fact that ApEn lacks consistency: even when a given time-series S1 is more organized than another S2, the resulting ApEn values might point to the opposite. A few years later, Richman and Moorman [8] evolved the ApEn tool by means of removing self-matches from the counting process (i.e., a given sub-vector xi is not compared to itself), and termed the new tool as Sample ApEn (SampEn). Accordingly to the authors, the most important point in SampEn is that the method is less biased and has more consistency than ApEn. However, since there are no prescriptions for the use of the parameters r and m in both ApEn and SampEn, consistency is still a major concern in using these tools. In other words, an experimenter choosing a certain pair (m1, r1) to analyze her/his data could, potentially, obtain the opposite result that another experimenter computes with a different pair (m2, r2). For instance, Fig. 1 shows nine prototypical time-series with different degrees of organization: a single sine-wave (1-sine; 0.25 Hz); the sum of two sine-waves (2-sine; 0.25 and 0.50 Hz); the sum of three sine-waves (3-sine; 0.25, 0.50 and 1 Hz); the sum of four sine-waves (4-sine; 0.05, 0.25, 0.50 and 1 Hz); logistic maps with control parameter of 3.6, 3.8 and 3.9 (L_3.6, L_3.8 and L_3.9, respectively); uniformly distributed random noise (u.d.r.n.); and normally distributed random noise (n.d.r.n.). In Table 1A and B, we present some ApEn and SampEn results for these nine time-series. As it can be verified in both tables, there are many incongruent results in relation to what we know beforehand about the temporal organization of these series. As an extension of the preceding discussion, consider the Chaitin sequences analyzed by Pincus and Kalman [9]: (A) 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 (B) 0 1 1 0 1 1 0 0 1 1 0 1 1 1 1 0 0 0 1 0 The authors refer that ApEn(1, 100, 20)A = 0 while ApEn(1, 100, 20)B = 0.6774. Of concern is the fact that such a referred ApEn value for (A) is the same value that one would obtain for a completely ordered series represented by a single-valued data set (a horizontal straight line). Therefore, one should conclude that the sequence in (A) and a straight horizontal line have the same degree of order. In fact, when we compute ApEn for sequence (A) we found 0.0014 instead of 0. In addition, if one chooses, for example, m = 7, then ApEn(7, 100, 20)A = 0.0030 and ApEn(7, 100, 20)B = 0.0741. How would one interpret these results remain to be discussed and we will address this point shortly. ApEn is intended to be applied in analyzing stationary data (e.g., see [4]). However, the presence of deterministic drifts in the time-series would lead to unexpected results. For example, adding a linear drift in a random series might decrease the ApEn value obtained, even though such a data would span a wider volume in its phase-space and thus would be associated with a lower order than before. The same type of ‘‘weird” and unwanted results occurs when drift is added to a sine wave, and, once again, the results depend on the choice of m and r, as shown in Table 2. Therefore, the aim of this study is to present a numerical tool, derived from approximate entropy, which gains consistency. This goal is achieved by employing all the possible values of m and the widest meaningful range of values of r. We named the tool as volumetric approximate entropy, vApEn, since the values obtained would resemble the volume between the ApEn function and the m r plane.
Fig. 1. Prototypical time-series analyzed. (a) Single sine-wave (1-sine; 0.25 Hz). (b) Sum of two sine-waves (2-sine; 0.25 and 0.50 Hz). (c) Sum of three sinewaves (3-sine; 0.25, 0.50 and 1 Hz). (d) Sum of four sine-waves (4-sine; 0.05, 0.25, 0.50 and 1 Hz). (e)–(g) Logistic maps with control parameter of 3.6, 3.8 and 3.9 (L_3.6, L_3.8 and L_3.9, respectively). (h) Uniformly distributed random noise (u.d.r.n.). (i) Normally distributed random noise (n.d.r.n.). Series generated in Matlab 7.0. First 100 points in each series are shown.
324
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
Table 1 (A) ApEn values and (B) SampEn values. The analyses are for the nine prototypical time-series of various degrees of temporal organization (Fig. 1) for different pairs ðm; rÞ. Notice the lack of consistency. N = 300; simulated sampling rate: 1 Hz. Simulations in Matlab 7.0. m
r
Time-series – ApEn results 1-sine
2-sine
3-sine
4-sine
L_3.6
L_3.8
L_3.9
u.d.r.n
n.d.r.n.
10 20 30
0.15 0.17 0.21
0.23 0.23 0.27
0.28 0.30 0.36
0.52 0.86 0.73
0.23 0.21 0.20
0.41 0.42 0.41
0.50 0.49 0.46
0.56 1.20 1.44
0.49 1.04 1.25
3
10 20 30
0.10 0.11 0.11
0.12 0.12 0.15
0.14 0.15 0.19
0.12 0.39 0.45
0.22 0.21 0.20
0.37 0.40 0.41
0.46 0.46 0.45
0.05 0.25 0.57
0.02 0.25 0.54
4
10 20 30
0.07 0.08 0.08
0.07 0.05 0.09
0.13 0.13 0.14
0.07 0.22 0.30
0.19 0.18 0.18
0.33 0.35 0.36
0.41 0.44 0.46
0.00 0.05 0.16
0.00 0.03 0.14
10 20 30
0.63 0.47 0.33
1.12 0.92 0.77
1.60 1.31 1.15
2.55 1.95 1.55
0.27 0.08 0.11
0.48 0.42 0.37
0.55 0.43 0.37
1.67 1.00 0.67
2.94 2.14 1.74
3
10 20 30
0.23 0.25 0.28
0.27 0.29 0.32
0.50 0.46 0.45
1.44 1.20 1.03
0.17 0.06 0.02
0.40 0.38 0.36
0.47 0.43 0.35
1.60 0.99 0.68
2.86 2.35 1.84
4
10 20 30
0.14 0.14 0.14
0.11 0.15 0.17
0.24 0.19 0.20
0.87 0.75 0.63
0.19 0.06 0.00
0.39 0.35 0.33
0.45 0.44 0.36
1.72 0.98 0.68
2.08 2.99 1.76
(A) 2
(B) 2
Table 2 The effects of added drift to a sine-wave of 0.25 Hz with sampling rate of 1 Hz. Small drift corresponds to a slope of 0.05 s1; large drift corresponds to a slope of 0.10 s1. N = 100 points. Simulations in Matlab 7.0. See text for details. m//r
Pure sine-wave
Small drift
Large drift
2//25 3//10 2//50
0.2020 0.0560 0.2839
0.3801 0.2349 0.1682
0.1163 0.1641 0.0415
2. Volumetric_ApEn 2.1. Outline of the tool For a given real-valued time-series So containing finite N points we obtain the total volumetric ApEn, TOT_vApEn, value as
TOT vApEnðSo Þ ¼
Z
1
er
"
N1 X
#
ApEnðm; r; NÞ dr;
ð1Þ
m¼1
where the inferior limit er is a value of r slight greater than zero (see below). Putting it simple, the TOT_vApEn is the sum of ApEn for all possible sub-vectors of length m for each value of r ranging from er to infinite. This improper integral converges and is composed by two parts, one positive and one negative, as it will be shown in the next sections. Fig. 2 illustrates the concept of the method. 2.2. Strict delineation of the tool 2.2.1. Approximate entropy Approximate entropy is defined as follows (see [5,7] for further details). For a sub-vector i of length m, all the matches along the time-series are obtained by the comparison with a moving window j of the same length. A match is considered positive when the distance between the sub-vector i and the window j is equal or less than a certain error tolerance r, and distance is given by the absolute difference in a pairwise ordered way of the elements of i and j. This results in a count C for each sub-vector i:
Cm i ðrÞ ¼
#m i ; Nmþ1
ð2Þ
where we use # to indicate the number of events where dði; jÞ 6 r sðSo Þ, and s(So) is the standard deviation of the sequence. From these counts, rm(r) is obtained:
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
325
Fig. 2. Illustration of the surface obtained for ApEnðm; r; 200Þ. m in the x-axis, r in the y-axis, ApEn values in the z-axis. Grey surface is ApEn for a normally distributed random noise (n.d.r.n.); black surface is ApEn for a sum of two sine-waves (2-sine). The arrows point to combinations of ðm; rÞ which result in higher ApEn values for the 2-sine than for the random noise, in the positive part of both. Notice that there are vast negative regions where ApEn for the sinewave is less negative than for the n.d.r.n. data. Simulations in Matlab 7.0.
Um ðrÞ ¼
Nmþ1 X 1 lnðC m i Þ: Nmþ1 1
ð3Þ
Then, the estimator ApEn is given by
ApEnðm; r; NÞ ¼ Um ðrÞ Umþ1 ðrÞ:
ð4Þ
The parameter ApEn is the ApEn value obtained when N ? 1. Henceforth, for the sake of simplicity, we will use ‘‘r” instead of ‘‘r sðSo Þ” where no confusion is expected. Also, ApEn is to be understood as the estimator unless otherwise stated. S2 where S1 has a higher degree of temporal The parameter ApEn has consistency. For any two stationary processes S1 and correlation, ApEn(m, r, S1) < ApEn(m, r, S2), where Si is a realization of the i process. This is shown in the Appendix A2. However, as stated before, the estimator ApEn lacks consistency. The best one can say is that the probability of finding ApEn values for a sequence S1 lower than ApEn values for a sequence S2 is greater than the probability of the opposite result whenever the sequence S1 has a higher temporal order than the sequence S2:
Prob½ApEnðÞS1 < ApEnðÞS2 > Prob½ApEnðÞS1 > ApEnðÞS2 : Obviously, the number of points N should be equal in both series. In fact, as it is shown in the next section, the statement for the estimator ApEn must be
Prob½jApEnðÞS1 j < jApEnðÞS2 j > Prob½jApEnðÞS1 j > jApEnðÞS2 j; where —ApEn()— stands for the absolute value of ApEn. 2.2.2. The meaning of negative ApEn Consider the parameter ApEn (i.e., N ? 1). Then, from Eq. (3) and setting Eq. (4) to a value lower than zero:
Nmþ1 Q
ApEnðm; r; N ! 1Þ ¼ ln
Nm þ ln Nmþ1 Nm Q
#ðmÞ
#
1 Nmþ1
ðmþ1Þ
<0 1 Nm
ð5aÞ
() Nmþ1 Y
#ðmÞ <
!Nmþ1 Nmþ1 Nm Nm Y Nmþ1 #ðmþ1Þ : Nm
ð5bÞ
326
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
And taking N ? 1: 1 Y
#ðmÞ < e
1 Y
#ðmþ1Þ :
ð5cÞ
Therefore, whenever the product of counts at window m is less than e times the product of counts at window m þ 1, ApEn is lower than zero. This simply means that when the decrease in matches from window m to window m þ 1 is small, negative ApEn values are expected. The prototypical case for this section is a straight line with non-zero slope. In Appendix A3, we show analytically that such a time-series result in negative ApEn values. Considering the time-series as the output of a dynamical system, ApEn < 0 would then represent an asymptotically unstable equilibrium point, with the system never returning to a previous delimited region in its m-dimensional phase-space. The points in this m-dimensional space would then never be close to each other in a range less then the r tolerance level. In face of this, negative ApEn values should be as informative as their positive counterparts are considered to be: for a given set of windows of size m; m þ 1; m þ 2; m þ 3; . . . ; m þ k, less ordered sequences should result in negative ApEn values more often than highly ordered sequences. We can now understand the results presented in Table 2, as well as those that we reported before for added drifts. For a certain time-series and pairs ðm; rÞ, a small drift represents an increase in temporal disorganization of the series, and ApEn increases, whereas a large drift acts as an unstable sub-space of the system, pulling ApEn to values near to zero, and below. 2.3. Convergence for m and r Changes in the window length m are a discrete process with a minimum step-size Dm of 1 unit. Therefore, the sum in Eq. (1) represents the area obtained as a sum of rectangles of base 1 and height ApEnðm; r; NÞ for a given value of r. The net result from this operation is a telescoping sum of rm for a given r:
uðrÞ ¼
N1 X
ApEnðm; r; NÞ ¼
m¼1
N 1 X
½Um Umþ1 ¼ U1 UN :
ð6Þ
m¼1
Because a sub-vector of size m ¼ N has C Ni ðrÞ ¼ 1=1 ¼ 1 for all r (see Eq. (2)), then rN = 0 "r, and the area u(r) reduces to
uðrÞ ¼ U1 ðrÞ:
ð7Þ 1
There would be cases where r (r) may not converge as N ? 1. This is particularly true for stochastic processes from a continuous space (see Appendix A1). Then, there should be a minimal value of r, denoted here by er, for which C 1i ðer Þ converges to a fixed proportion of points lying within the range xi ± er (see Appendix A1). Thus, for r P er, u(r) attains a definite value as N increases. When the stochastic process is from a discrete countable space, then r1(r) always converge as N ? 1 (see Appendix A1). The next point to be addressed is whether the integral in Eq. (1) converges. This is obvious from the scratch. Since we have a finite N in a given realization So, it is always possible to find an r satisfying the inequality:
r>
maxðSo Þ minðSo Þ ¼ rþ : sðSo Þ
ð8Þ
In other words, there exists a value, denoted by r+, such that for r > r+ all points in the sequence are considered to be equal under the metric used in ApEn. This is true for stationary sequences, whether weak or strict, since we are concerned with finite N. Inequality (8) says that ApEn(1, r P r+, N) = 0. Because r+ is defined in relation to m = 1, then its value also guarantee that ApEnðm; r P rþ ; NÞ ¼ 0 for other window lengths. Then, for r > r+:
Z
1
rþ
"
N1 X
#
ApEnðm; r; NÞ dr ¼ 0:
ð9Þ
m¼1
Hence, for a given N, the integral in Eq. (1) has a definite value, and so has the quantity TOT_vApEn(So). The total volume is obtained by computing the value of r1(r) for a finite set of r in a range of [er, r+], and then numerically integrating these values. Fig. 3 shows the numerically evaluated TOT_vApEn for time-series with different degrees of order. Plain observation of Fig. 3 reveals that, as a general rule, TOT_vApEn is not able to recognize time-series with different temporal organization. This is because r1 is related to the distance between any two points in the time-series. Therefore, their temporal arrangement is irrelevant to the eyes of r1. Nevertheless, we still can elaborate a tool from this ‘‘volume” for inferring order in a given sequence, as it will be shown next. 2.4. The composite volume As we have developed above, there are pairs ðm; rÞ which result in positive and others which result in negative ApEn values. All these negative and positive values are enclosed in the telescoping sum in Eq. (1). Rewriting this equation, we have
TOTv_ApEn
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
327
-170
-190
-210 1-sine
2-sine 3-sine
4-sine
L_3.6
L_3.8
L_3.9 u.d.r.n. n.d.r.n.
time-series
Fig. 3. TOT_vApEn for 9 time-series with different degrees of temporal organization for N = 50 (dotted), N = 100 (dashed) and N = 200 (solid). Notice that TOT_vApEn is insensitive to the temporal organization of the signals and, also, to the series length N (see text for discussion). Simulations in Matlab 7.0.
TOT vApEnðSo Þ ¼
Z
rþ
uþ ðrÞdr þ
er
Z
rþ
u ðrÞdr ¼ vApEnþ ðSo Þ þ vApEn ðSo Þ;
ð10Þ
er
where we use u(r) and u(r) for the positive and negative areas along the values of r. These areas define, thus, a positive volume, vApEn+, and a negative volume, vApEn. Differently from the simple u(r), such positive and negative areas cannot be ascribed to a pre-determined analytical computation. They are, ultimately, the sum of positive and negative values of ApEn obtained at the various values of r. As shown in the preceding section, u(r) is unable to discern among different time-series in respect to their temporal organization. At the same time, u(r) represents the sum of positive and negative ApEn values for all possible windows m for a given N and in the range of r for which ApEn converges. Therefore, while a single computation of ApEn does not guarantee consistency, the comparison of vApEn+ (or vApEn) for different time-series is associated with the result of the intersection of probabilities of the type Prob[jApEnS1j < jApEnS2j] for ðm; rÞ pairs. This means that in the long run, i.e., for a huge number of computations of ApEn values, it is likely that a higher ordered series would attain a lower jvApEnj value than the one obtained for a less ordered series. 2.4.1. Non-convergence of vApEn+ (vApEn) as N ! 1 The last topic to be addressed before we pass to the applications of the tool is related to the absence of convergence of the positive and negative parts of TOT_vApEn(So) as the number of data points grows. Because ApEnðm; r; NÞ itself converges as N ? 1 (see above) and because for m ? N, ApEnðm; r; NÞ comes from a limited set of possible values,1 then the u+(u) area can be written as the sum of three parts:
uþ ðrÞ ¼
n X m¼1
~þm ðm; rÞ þ a
M X m¼nþ1
ApEnþ ðm; rÞ þ
N1 X
~þm ðm; rÞ: a
ð11Þ
m¼Mþ1
~þ In Eq. (11), windows of length m 6 n N contain the fix-valued (converged) ApEn results (indicated by a m ) and windows of length M + 1 6 m 6 N 1 contain the values from the limited set at the m ? N boundary. These two sums are constant values and add nothing to vApEn+ (vApEn) as N increases. In the in between windows of length n + 1 6 m 6 M, the increasing N adds up because these values are not yet fixed and there will be a new set of additional values for each datum appended in the time-series. We postpone to the Discussion the potential relevance of this non-convergence of vApEn+ (vApEn), and, to the Appendix, the convergence of u(r) as N ? 1. Next, we present the application of the vApEn tool. 3. Applications The vApEn tool requires that a set of values of ApEn is computed from m = 1 to m ¼ N 1, and for a set of values of r from
er to a certain r+. This r+ is a value which has to be established for the particular time-series So under analysis in order to obtain ApEnð1; r; NÞSo ¼ 0 "r P r+ (see Eq. (8)). The er is easily obtained as well: it is the minimum value of the set of the distances between all pairs (xi, xj) in the time-series So. Then, a partition of the set of r values must be created and the ApEn values integrated numerically over this partition. Obviously, the mesh of this partitioning becomes relevant to the results. Our numerical tests suggest that a partition containing more than 100 of equally spaced values of r is sufficient for convergence of vApEn+ (vApEn). As presented previously, only the positive or the negative component of vApEn is necessary to be evaluated in order to have an estimate of the temporal organization of the time-series under analysis. From now, we simply call the positive volumetric estimate as vApEn unless otherwise stated.
1 The most obvious of these values is the one for m ¼ N 1 where, independently of the organization in the time-series, there will be only two possible outcomes: either the two vectors of length N 1 are equal ðApEnðN 1; r; NÞ ¼ 0Þ or different from each other ðApEnðN 1; r; NÞ ¼ ln 2Þ.
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
vApEn+
328
1500
200
1200
100 50
900 600 300 0 1-sine 2-sine 3-sine 4-sine
L_3.6
L_3.8
L_3.9 u.d.r.n. n.d.r.n.
time-series
Fig. 4. vApEn+ for the nine prototypical time-series. N = 50 (dotted), N = 100 (dashed) and N = 200 (solid). Notice how the method is capable of distinguishing the different degrees of organization for N as low as 50.
Fig. 4 presents vApEn computed for the sequences from Table 1. As developed in the preceding sections, the tool is able to delineate the different degrees of temporal organization in the series, even for N as low as 50 points. To further check the robustness of the tool, we performed a series of analysis: A. Shuffling data applied to the sine waves and the logistic maps (Table 3). B. New realizations of the stochastic processes of uniform and of normally distributed random noises (Table 3). C. Surrogate data by varying initial conditions for the logistic maps (Table 4). Table 5 presents some statistical tests for the results from Tables 3 and 4. A note on banded chaos. The results concerning the logistic map might draw attention of the reader. Particularly, the map with l = 3.9 has a lower vApEn than that of l = 3.8; and l = 3.6 (within the chaotic region of the map) has vApEn lower than the 3-sine and 4-sine series (Tables 3 and 4). However, the logistic map (and, potentially, other unidimensional maps, see [10]) presents banded (periodic) chaos for some parameter values, {lB}, within the chaotic interval. This means that for some values of the control parameter the time-series generated has a periodic transition among definite regions. For example, for lB = 3.856800652. . ., the iterations jump among three regions (for a detailed discussion, see [10,11]). In banded chaos, each region is chaotic within, but only a limited countable number of such regions are visited after a initial transient of the system. Within an open interval around a given lB, the control parameters retain part of the periodic feature of lB, i.e., bands are still somewhat recognized in the time-series.
Table 3 Robustness of vApEn. The values in the line ‘‘original data” are those from Fig. 3 for N = 100. The entries in ‘‘shuffled data” represent values (mean, standard deviation, minimum and maximal) of vApEn obtained by shuffling each of the original vectors of the sine waves and the logistic maps, and then re-computing vApEn for the newly generated sequence. Mean for 10 events of shuffled data. The entries in ‘‘new data” represent new realizations of the noises (uniform and normal). Mean for 20 events of newly generated data. Simulations in Matlab 7.0. Original data
Shuffled data
Time-series – vApEn results
Mean Std Min Max
1-sine 207
2-sine 423
3-sine 525
4-sine 606
L_3.6 383
L_3.8 1016
L_3.9 952
u.d.r.n 1011
n.d.r.n. 1141
980 10.8 964 999
1138 29.9 1098 1179
1082 39.8 1040 1143
1121 23.9 1079 1160
908 10.3 896 931
1049 13.6 1027 1067
993 12.8 976 1017
1053 32.5 987 1167
1209 59.4 1098 1296
New data
Table 4 Robustness of vApEn. The values in the line ‘‘original data” are those from Table III for the logistic maps (N = 100). The entries in ‘‘varying initial condition” were obtained by iterations beginning at different initial values randomly chosen. 10 new sequences were generated for each control parameter. Simulations in Matlab 7.0. Original data
Time-series L_3.6 383
L_3.8 1016
L_3.9 952
355 36.7 280 386
1033 30.1 995 1081
951 28.3 915 989
Varying initial condition Mean Std Min Max
329
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
Table 5 Statistical tests for vApEn. ‘‘Comparison” columns are the series for which the means are being compared. The ‘‘equal variance” column presents whether data can be considered homocedastic. When unequal variance was detected the test was performed under Welch-correction. The three last tests represent comparisons against a given value (those from the original data). P values for Student t-test. Comparison a
b
L_3.6 L_3.6 L_3.6 L_3.6 L_3.8 L_3.8 L_3.8 L_3.8 u.d.r.n. L_3.8 L_3.6 L_3.8 L_3.9
L_3.8 L_3.9 1-sine 2-sine L_3.9 4-sine u.d.r.n. n.d.r.n. n.d.r.n. L_3.9 383 1016 952
P value
Equal variance
Source table
<0.0001 <0.0001 <0.0001 <0.0001 <0.0001 <0.0001 0.64 <0.0001 <0.0001 <0.0001 0.04 0.11 0.91
Yes Yes Yes No Yes Yes No No No Yes – – –
3 3 3 3 3 3 3 3 3 4 4 4 4
This is the case for the data presented here. l = 3.6 lies close to lB = 3.592572184. . ., that generates sub-harmonic cycles of period 4. These four regions are readily identifiable in the L_3.6 series. Therefore, such a series ends up as a more organized sequence than the 3-sine and 4-sine series, which, by their turn occupy a larger range of values at the sampling rate employed. The 3.9 value of l is in the vicinities of lB generating 2 cycles of period 10, while l = 3.8 is near a limiting value generating infinite-period cycles. Therefore, L_3.9 results in a more organized series than L_3.8. Finally, just for the sake of curiosity, we computed vApEn for the Chaitin sequences presented in Section 1. For the (A) sequence, the value obtained was 26, while for (B), the value was 374. These results are much more pertinent and unequivocal.
4. Discussion In the present study, we develop a method to employ approximate entropy with consistency. The method uses all the available lengths m for sub-vectors that a time-series can generate combined with the entire range of values of error tolerance r that gives ApEnðm; r; NÞ–0. We named the tool as volumetric approximate entropy, vApEn. R1 It was first shown that there is a ‘‘total volume” TOT vApEn ¼ er U1 ðrÞdr. Such a total volume is of a definite value because the improper integral converges. Also, it was shown that this integral is of little use to discern among different degrees of temporal organization. Even thought TOT_vApEn is of limited use, it is composed by two elements: a positive (vApEn+) and negative (vApEn) part. As shown, both of these parts have absolute values related to the temporal organization of the time-series. Because R 1 r (r)dr attains a limiting value despite the increase in the length of the time-series, vApEn+ and vApEn must match each other. This implies that it is of no relevance which part of the volume is computed in order to address the temporal organization of the sequence, as long as one computes always the positive or the negative part in different occasions. Just for the sake of simplicity, we choose vApEn+ as the estimator vApEn. It is important to note that the positive and negative parts of volumetric ApEn increase with the increasing number of points in a time-series. Therefore, when comparing different time-series, the length N of the sequences should have a common value. Nine time-series, with different degrees of temporal organization, were analyzed with the vApEn tool. It was verified that vApEn is able to classify correctly their temporal organizations. Validation criteria by means of data shuffling and surrogate series (new realizations of random sequences with prescribed mean and variance; and new realizations of iterated maps with different initial conditions) show that the method has adequate consistency (Tables 3 and 4). vApEn clearly segregates the normally from the uniformly distributed random noises, normally distributed noise from deterministic maps, different control parameters in iterated maps, etc. (Table 5; also, see A note on banded chaos in Section 3). Now, let us return for a while to the Chaitin sequences presented in Section 1. Our concern in relation to the sequence (A) is that an ApEn value of 0 should indicate that there was only one region occupied in the phase-space. As stated in Section 1, we computed a value different from zero (i.e., 0.0014) for the sequence (A). If we try to force ApEn to zero, we should use r = 200. However, this value of r renders ApEn(1, 200, 20)B = 0 as well. Therefore, one should conclude not only that both series have the same degree of order, but also, that such a degree is similar to the one of a constant-value function (a horizontal straight line). On the other hand, vApEn analysis clearly distinguishes both (A) and (B) series (26 and 374, respectively) and, moreover, neither of them is confused with a single-valued sequence.
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
internal energy (a.u.)
330
vApEn (x10 3 )
1.5
n.d.r.n.
20
0.75
10
0
1.00
0
250
500
750
N
L_3.8 2-sine
1.0
0.5
0.0 0
25
50
75
100
125
150
N +
Fig. 5. vApEn as a function of vector length N. The plot shows the results for normally distributed random noise (squares), logistic map with control parameter = 3.8 (triangles), and the 2-sine wave (circles); for N = [25 50 75 100 150]. Notice the non-linear increase of vApEn+ with N and, also, its relation with the organization in the series (see text). Inset: Just for a visual comparison between the phenomena, we show the increase in internal energy per particle, f ðNÞ in arbitrary units, with the number of particles in systems with long-range correlation given by the ratio a/d (0.75 upper line; 1.0 lower line), which represents the ratio of force interaction distance and the dimension of the system – see [16]: f(N) = [N1a/d a/d]/[1 a/d].
Finally, let us return to the non-convergence of vApEn as N ? 1 cited earlier. On the one hand, in terms of the application of the tool, this non-convergence obliges one to fix a certain N for analyzing different time-series comparatively. On the other hand, the non-convergence reveals a reliable quality of the quantity being computed. This is because the thermodynamical entropy of Boltzmann–Gibbs is an extensive quantity [12,13] and, thus, vApEn rescues such an attribute. Considering that an entropy S is extensive if lim SðNÞ=N is finite (and non-vanishing) [14], then the simple ApEn is not extensive since it conN!1
verges as N increases and makes the limit to vanish. This is not the case for vApEn, even though its increase is not linear with N. Fig. 5 illustrates a few examples. Obviously, the Boltzmann–Gibbs entropic functional is not adequate to address the rate of increase in vApEn along with N. This is (probably) due to the presence of some degree of memory within the time-series, something that can be interpreted as long-range correlations in time (instead of space). Therefore, entropic functionals that capture this kind of feature might be of help in the analysis: instead of simply computing vApEn, one might compute its increase with N and try to obtain a characteristic index of such a relationship. In this sense, the Tsallis’ formulation of Sq entropy is a first-line candidate for a series of reasons; among others: Sq generalizes the Boltzmann–Gibbs entropy [15]; it also generalizes the Kolmogorov–Sinai entropy [15] that gave birth to ApEn; it can be directly related to long-range correlations within a system [15–17]. Therefore, finding a q-index for the rate of increase in vApEn along with N may result in a general analytic tool which might be useful even in very small data sets. 5. Conclusion This study presents a tool, derived from approximate entropy, for analysis of variability in time-series. The method, named as volumetric approximate entropy (vApEn), meets the criteria of consistency and correctly identifies the level of organization in time-series. Besides the relevant consistency issue, the new tool seems to open the route for the application of other robust approaches, such as non-extensive entropic indexes, to this kind of analysis. Acknowledgements This study is part of a research project supported by FAPESP (06/02120-0), and FAPESP masters’ fellowships to BTS (06/ 02377-0), JESN (06/02176-5), RAM (06/05490-2), FSM (06/02171-3) and VHR (06/01955-0). Appendix A A.1. Convergence of
r1 ðrÞ as N ! 1
By assumption, the sequence So = {x1, x2, . . . , xN} is a realization of a stationary process. For the m ¼ 1 case, such a sequence would maintain the finite-dimensional distributions of the process (e.g. [18,19]). Consider a process with a probability denR sity function fx in the set X: 1 ¼ X fx dx. For our purposes, the set X 2 R and is either a closed interval or the entire real line. If X is a closed interval, the problem offers no difficulty, as it will be shown. Therefore, we will analyze the real line case, from which the other follows easily (in Remark 3, below, we present the non-continuous case). The mathematical expectation of
331
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
R þ1 the process is EðXÞ ¼ 1 x fx dx, which is defined because the process is stationary by assumption. For EðXÞ to converge, it is 2 obvious that fx < x from a certain jx0j to j1j (without loss of generality, we are assuming symmetry in the distribution). From Eq. (3), we have, for a given r:
U1 ðrÞ ¼
1 N 1 X #i ln : N 1 N
ðA:1Þ
The #i 1/N term ends up as the relative frequency of points lying in a range ±rr from a given xi (notice that we put r for the standard deviation straightway, as we analyze N ? 1). r1 resembles the mean of the natural logarithm of these relative frequencies. Notice that the r1 function maps the real line to an open interval 1; B, where the value B ðB < 0Þ depends on the tolerance error r. This mapping generates another probability distribution, the probabilities of the logarithm terms, that we name as fL. Because fL emerges from fx, it is also in a probability space and it follows that:
1¼
Z
B
1
EðU1 Þ ¼
fL dðlnðÞÞ; Z
ðA:2:aÞ
B
lnðÞfL dðlnðÞÞ:
ðA:2:bÞ
1
For r1 to have definite expectation, the terms in the logarithms must be finite and to grow (absolutely) ‘‘slowly”. For a given interval [xi rr, xi + rr] we have, as N ? 1, the probabilities:
#1i ! P rr ðxi Þ ¼ N
Z
xi þr r
fx dx:
ðA:3Þ
xi r r
Let us take a limiting case where rr = d, d ? 0, because there the probabilities Prr(xi) tend to zero, and, thus, the log terms would be high (absolutely). By the mean-value theorem, at a given point xi, Eq. (A.3) can be written as
Pd ðxi Þ ¼
Z
xi þd
fx dx ¼ 2d fx ðxi Þ:
ðA:4Þ
xi d
For given p.d.f. fx and d, each probability Pd(xi) value is associated to a finite countable set XP {xj, xk 2 XJjPd(xj) = Pd(xK) "j, k}. In other words, a given value of the function fx(x) have a unique set of values (points) xi 2 X associated to such a value. Due to the stationarity of the process, it follows that fL 6 fx from a certain jx1j to j1j, since values xi far away from the expected EðXÞ have low frequencies. Thus, let us take a limiting case putting fL = fx and rewrite Eq. (A.2.b) using (A.4):
EðU1 Þ ¼
Z
B
lnð2d fx Þfx dx ¼
1
Z
B
lnðfx Þfx dx þ lnð2dÞ
1
Z
B
fx dx:
ðA:5Þ
1
The last term in Eq. (A.5) converges as long as d > 0 (see Remark 2 below). As we stated before, fx < x2 from a certain jx0j to j1j, and, thus
Z
B
1
Z lnðfx Þfx dx < 2
B
1
lnðjxjÞ dx: 2 x
ðA:6Þ
Because jln(x)j < x for x > 1, the right-hand integral in (A.6) converges. Therefore, E(r1) is defined and r1 converges as N ? 1, as it had to be shown. Remark 1. The X 2 closed interval case offers no problem since the values of ln(Prr(xi)) become limited as well. Remark 2. The imposition d > 0 translates to the vApEn analysis as the condition for convergence in r1. This is the reason for the r inferior limit, er. Remark 3. Considering that X is a set of disjoint points, then there is a minimal distance dmin > 0 in the set of the distances between all pairs (xi, xj) 2 X. This represents the inferior limit er. However, if X is a countable set, then a realization of the stochastic process can contain a certain value xi 2 X more than once and, then, in the realization, there would occur matches even if r = 0. In these cases (i.e., X forming a countable set of disjoint values), r1(r) converges always, in spite of er = 0. A.2. Consistency of the parameter ApEn We will analyze m ¼ 1. The m > 1 cases follow immediately. To begin with, let us define three auxiliary functions:
#1 #1i þ ln iþ1 ; N N 2 # i;iþ1 F 2 ði; i þ 1Þ ¼ ln ; N1 1 apemði; i þ 1Þ ¼ F 1 ði; i þ 1Þ F 2 ði; i þ 1Þ: 2 F 1 ði; i þ 1Þ ¼ ln
ðA:7:aÞ ðA:7:bÞ ðA:7:cÞ
332
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
We urge the reader to note that these auxiliary functions are not computing ApEn: they are computations done on a pairwise (xi, xi+1) basis for a given i. Let N ? 1 (this is the parameter ApEn), let x stands for the xi element and y for the xi+1 one. Using Eq. (A.3):
apemðx; yÞ ¼ ln
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi PðxÞ PðyÞ : Pðx; yÞ
ðA:8Þ
Notice that we dropped the rr from P, for simplicity. Firstly, consider the case of an uniformly distributed p.d.f.. Without loss of generality, let X = [0, 1] and then, fx = 1 "x 2 X. It follows that P(x) = P(y) = fx 2rr. This represents the one-dimensional case, or m = 1. Going to the two-dimensional case, we observe that the vector ðx; yÞ has positive matches in a volume delimited by a parallelepiped with basis (2rr)2 and height fx. Fig. 6 is a graphical representation of the concept. This volume of positive matches represent the joint probability of x and y to encounter another vector ðw; zÞ whose elements are, respectively, within the tolerance error simultaneously. At the same time, the element x may form two-dimensional vectors with all other elements in the X set, and the same is true for y. Therefore, x and y have, each one, a ‘‘total R volume of combinations” (TVC) given by X 2rr fx dx, which reduces to fx 2rr. Notice that, despite the fact that fx = 1, we are carrying fx along the computations, for the sake of comparisons later. Then, the vector ðx; yÞ sweeps a TVC of
Fig. 6. (a) Pictorial view of the procedure for obtaining the probability Pðx; yÞ in m = 2, for an uniform distribution. The figure depicts the m = 1 probabilities PðxÞ and PðyÞ (shaded areas), the mutual volume and the total volume of comparisons for xi and xi+1. (b) Two-dimensional histogram of relative frequency of pairs (xi, xi+1) obtained from a sine-wave (106 points). Notice that the sine-wave have a contracted TVC. See text for further discussion.
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
333
fx[4rr (2rr)2], representing the sum of the volumes (combinations) swept by x along the y-axis and by y along the x-axis, minus their mutual volume. From the total volume and the positive matches volume, we obtain the probability Pðx; yÞ of the positive matches in m = 2:
Pðx; yÞ ¼
fx 4ðr rÞ2 fx ½4rr 4ðr rÞ2
¼
rr 1 rr
ðA:9Þ
and, from (A.8):
apemðx; yÞ ¼ ln
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ðfx 2r rÞ2 rr 1r r
¼ ln ð2f x ð1 r rÞÞ:
ðA:10Þ
Notice that fx = 1. Because apem(x, y) given by Eq. (A.10) is valid for any pair of consecutive elements (xi, xi+1), then the parameter ApEn is a positive value. Remark 4. For the estimative given in (A.10) makes sense, rr 6 0.5 is a condition. The reason for this is obvious: we are not taking into account the cut-off at the extremities of the distribution and, therefore, rr = 0.5 represents the entire range of the X set for the central elements x = 0.5. In other words, the estimative given in (A.10) results in apem(x, y) = 0 when the tolerance error r allows for the entire X set to be considered as ‘‘equal to x” "x 2 X. This is in complete accordance with the ApEn estimator. Next, we proceed to the normally distributed random noise. At the m = 1 level, the probabilities for xi and for xi+1 are given by Eq. (A.3). For the sake of analytical results, but without loss of generality, let us suppose rr = d, as in Eq. (A.4). Then, we have
PðxÞ ¼ 2d fx ðxÞ;
ðA:11:aÞ
PðyÞ ¼ 2d fx ðyÞ:
ðA:11:bÞ
We urge the reader to not get confused by the use of fx(): the fx stands for a p.d.f. of the probability space while fx() is the value of the p.d.f. at (). This notation was not employed in the uniform case only because fx(x) = fx(y) = 1 "x, y. In the two-dimensional case, for obtaining the total volume of combinations for the x element (or y), we must integrate the probability PðxÞ over the y-axis (and vice versa for y, as we have done for the uniform distribution). However, since x and y are independently generated, we obtain
Z
þ1
2d fx ðyÞ fx dx ¼ 2d fx ðyÞ
1
Z
þ1
fx dx ¼ 2d fx ðyÞ 1:
1
And the same occurs for x. Following the steps done for the uniform case, the positive matches volume is the joint probability of x and y, and the total volume of comparisons (i.e., the probability of forming two-dimensional vectors with other elements in the X set) is the sum of their volumes minus their mutual volume. Therefore, the probability Pðx; yÞ at the two-dimensional level is
Pðx; yÞ ¼
4d2 fx ðxÞ fx ðyÞ 2d ½fx ðxÞ þ fx ðyÞ 2d fx ðxÞ fx ðyÞ
ðA:12Þ
and
apemðx; yÞ ¼ ln
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2d fx ðxÞ fx ðyÞ 4d2 fx ðxÞfx ðyÞ 2d½fx ðxÞþfx ðyÞ2dfx ðxÞfx ðyÞ
¼ ln
fx ðxÞ þ fx ðyÞ 2d fx ðxÞ fx ðyÞ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi : fx ðxÞ fx ðyÞ
ðA:13Þ
Notice that fx(x) < 1 in the normally distributed noise. As in Eq. (A.10), apem(x, y) here is valid for all the combinations (xi, xi+1). Thus, not only the parameter ApEn is positive but, moreover, the value from (A.13) is always higher than the value from (A.10). This results in the consistency of the parameter ApEn, as expected to be shown. Remark 5. Consider rr = d in Eq. (A.10) and the particular cases when fx(x) = fx(y) in Eq. (A.13). Then, apem(x, y) = ln(2 [1 d fx(x)]) in the latter and apem(x, y) = ln(2 [1 d]) in the former, exemplifying easily the higher values in (A.13). Finally,, let us examine what happens to a time-series with high temporal organization, a sine-wave for instance. In such sequence, a given value xi is close to its neighborhoods and/or to some other sets of values in the X set. More important, however, is the fact that there are sub-sets of X to which a given xi is not close. Let xj represent an element of one such a sub-set not related to a given xi. Therefore, there will never be a two-dimensional vector of the type (xi, xj) or (xj, xi). The net result is that what we called as ‘‘total volume of combinations”, for a two-dimensional vector, shrinks (see Fig. 6). As becomes clear, then, the decrease in the parameter ApEn is related to the low (even zero) probability of having some com-
334
B.T. Santos et al. / Chaos, Solitons and Fractals 42 (2009) 322–334
binations of elements of X in a two-dimensional pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi level. This causes a decrease in the TVC and a corresponding increase in Pðx; yÞ, approximating it to the value PðxÞPðyÞ (see Eq. (A.8)). Remark 6. The approach we developed above to examine the case of ApEn for m = 1 can be easily extended to any m-dimension, m > 1. Even thought it cannot be illustrated graphically (it would require more than three dimensions: xi, xi+1 . . . , xm plus an axis of fx), the concept is plain: a pair of vectors Xi = (xi, xi+1, . . . , xi+m) and Xi+1 = (xi+1, xi+2 . . . , xi+1+m) have probabilities P(Xi) and P(Xi+1), respectively, in m-dimension, to present positive matches; while the probability P(Xi, Xi+1) in ðm þ 1Þ-dimension is given by the ratio of their joint probability and the TVC generated by each element. A.3. ApEn values for lines with non-zero slope Consider a line y = a + bt, b – 0. Obviously, yi+1 yi = bDt. Then, for a certain value of the tolerance level r, there is a fixed ðmÞ ðmþ1Þ , number of matches for each y in the time-series, irrespectively to the size of the window m. This implies that #i ¼ #i except for one vector in the series. Let a be the number of matches for the points in the series (in the following, a < m N). Then
#ðmÞ ¼ fa þ 1; a þ 2; a þ 3; . . . ; 2a; 2a þ 1; 2a þ 1 . . . 2a þ 1; 2a; . . . ; a þ 3; a þ 2; a þ 1g: And there are N m a vectors with 2a þ 1 matches. For the m þ 1 window, there are N m a 1 vectors with 2a þ 1 matches. With a little algebra, we obtain
ApEnðm; a; NÞ ¼ ln
a X c 2 1 2a þ 1 lnða þ iÞ þ ; ln c þ 1 cðc þ 1Þ i¼1 c þ 1 ð2a þ 1Þc2a c
ðA:14Þ
where c ¼ N m and we put a instead of r as the argument for ApEn just to highlight that the important thing is the number of matches. The only positive factor in Eq. (A.14) is the last term. However, this term has always a lower value than the first one. Therefore, ApEnðm; r; NÞline 6 0 as stated in the main body of the text. References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19]
Eckmann J-P, Ruelle D. Ergodic theory of chaos and strange attractors. Rev Mod Phys 1985;57:617–56. Pincus SM, Goldberger AL. Physiological time-series analysis: what does regularity quantify? Am J Physiol 1994;266:H1643–56. Goldberger AL, Peng C-K, Lipsitz LA. What is physiologic complexity and how does it change with aging and disease? Neurobiol Aging 2002;23:23–6. Seely AJE, Macklem PT. Complex systems and the technology of variability analysis. Critical Care 2004;8:R367–84. Pincus SM. Approximate entropy as a measure of system complexity. Proc Natl Acad Sci USA 1991;88:2297–301. Grassberger P, Procaccia I. Estimation of the Kolmogorov entropy from a chaotic signal. Phys Rev A 1983;28:2591–3. Pincus SM. Approximate entropy (ApEn) as a complexity measure. Chaos 1995;5:110–7. Richman JS, Moorman JR. Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol 2000;278:H2039–49. Pincus S, Kalman RE. Not all (possibly) ‘‘random” sequences are created equal. Proc Natl Acad Sci USA 1997;94:3513–18. Thomae S, Grossmann S. Correlations and spectra of periodic chaos generated by the logistic parabola. J Statist Phys 1981;26:485–504. Makey MC. Time’s arrow: the origins of thermodynamic behavior. Mineola: Dover; 1992 [chapter 6]. Fermi E. Thermodynamics. New York: Dover; 1936. Abe S, Nakada Y. Temporal extensivity of Tsallis’ entropy and the bound on entropy production rate. Phys Rev E 2006;74:021120. Tsallis C. Occupancy of phase space, extensivity of Sq, and q-generalized central limit theorem. Physica A 2006;365:7–16. Tsallis C. Nonextensive statistical mechanics: a brief review of its present status. Ann Braz Acad Sci 2002;74:393–414. Tsallis C. Entropic nonextensivity: a possible measure of complexity. Chaos, Solitons and Fractals 2002;13:371–91. Tsallis C, Gell-Mann M, Sato Y. Extensivity and entropy production. Europhys News 2005;(Nov/Dec):186–9. Brockwell PJ, Davis RA. Time series: theory and methods. New York: Springer; 1991. Morettin PA. Ondas e Ondoletas. 1999; Editora da Universidade de São Paulo, São Paulo.