9 January 1995 PHYSICS LETTERS A
El .$EVIER
Physics Letters A 197 (1995) 31-39
Statistical estimation of the correlation dimension D . V . P i s a r e n k o a,b, V.F. P i s a r e n k o b a Institut de Physique du Globe de Paris, Dgpartement de Sismologie, 4 Place Jussieu, 75252 Paris Cedex 05, France b International Institute of Earthquake Prediction Theo~ and Mathematical Geophysics, Russian Academy of Sciences, Warshavskoe Sh. 79 k. 2, 113556 Moscou,; Russian Federation
Received 16 October 1992; accepted for publication 14 November 1994 Communicated by A.R. Bishop
Abstract We introduce a statistical method of estimation of the correlation dimension d2 of fractal sets. The method is based on the maximum likelihood approach and provides an efficient estimation of the correlation dimension even for very limiled data sets. We derive an explicit expression for the asymptotic variance of the estimated dimension, which can be used as a physically meaningful error bar. Both the estimate and its variance depend on the upper cutoff ro of the presumed power-law scaling range. We propose to choose the cutoff scale r0 by comparing, with the help of the standard .¥Ltest, the observed distribution within this scale to a power-law having the estimated exponent. The estimate of the correlation dimension based on the value of ro which maximizes the associated X2-probability, is then chosen as the optimal one tor the given data set. A numerical example illustrates the application of the method to binomial multifractal measures with known theoretical values of fractal dimensions.
I. Introduction In the present Letter we consider the problem of estimation o f the correlation dimension d2 from limited data sets. In this problem one may deal with a geometric set of points representing physical objects (e.g. microfractures in a solid specimen [ 1 ], epicenters of earthquakes [2] etc.), as well as with the reconstruction o f dynamics of a nonlinear system from its time series by the method of delays [3]. One o f the methods frequently used to calculate fractal dimensions is the box-counting algorithm [4]. Being easy to implement, this method has however several important shortcomings. In particular, for small data sets when each box contains few data points, the probabilities associated with the boxes are poorly estimated, which may considerably reduce the range of the power-law scaling. Another, more robust method of calculation of fractal dimensions is based on the so-called correlation integral C ( r ) introduced by Grassberger and Procaccia [ 5 ] , N
C(r) =
H(r-lxi-xjl),
(1)
i,j=l
where xt . . . . . XN are data points, and H ( given by the limit Elsevier Science B.V. SSDI 0375-9601 (94)00923-6
) is the Heaviside function. The correlation dimension d2 is then
32
D.V. Pisarenko, V.F. Pisarenko /Physics Letters A 197 (1995) 31-39
d2 = lira lira
r---+0 N - - - ~
lnC(r) In r
(2)
In practice to calculate d2 one determines the slope of C ( r ) on a log-log graph, usually with the help of the least-square fit. The standard deviation of the fit is then often interpreted as the error bar of the obtained dimension value. The error bar defined in such a way does not reflect the uncertainty of the calculated value of d2 due to the finiteness of the data set, but is rather related to the choice of the fitting range. In this approach, the question of whether a given data set is self-similar at all is overlooked. It is clear that the higher the dimension of an object, the greater the number of points needed in order to correctly evaluate its fractal dimension. Several empirical criteria have been suggested in the literature, which define the minimal number of data points N required for a "reliable" estimation of a given correlation dimension d2 [6,7]. They have the form N >> 10 a2/2,
(3)
N/> 42 J2 ~ lO 1"6[d21 ,
(4)
and
where [de] is the integer part of d2 (d2 > 1). These limitations are in some sense subjective, the underlying question being: how many data points are needed in order to obtain a "correct" estimate of a given dimension? We suggest the following alternative statement of the problem: given N points representing a presumably fractal object, what is the statistical error bar of its estimated fractal dimension? The method for estimation of the correlation dimension d2 introduced below is based on the maximum-likelihood principle which has been originally applied to this problem in Ref. [9]. We derive the explicit expressions for the maximum-likelihood estimate of d2 and its asymptotic variance and show how the X2-test can be used to find the optimal upper cutoff of the scaling range.
2. The maximum likelihood estimation of the correlation dimension
Let us consider the correlation integral with a slightly modified normalization, 2 C(r) - N(N-1)
Z
H(r-
]xi- xjl).
(5)
t>j
{n this form C ( r ) can be regarded as a sample estimate of the probability "P{Ixi - xjt < r} that two randomly chosen points xi and xj lie within a sphere of radius r. Let F be the distribution function for the distances between two points xi and xj randomly picked from our data set, F ( r ) = ~ { [ x i - xjl < r}.
(6)
It follows from (2) that d2 = lim r~o
lnF(r) In r
(7)
Considering this limit formula we assume that the following relation holds in some limited range of scales, F(r) =
,
0 < r <~ ro.
(8)
D. V. Pisarenko, V.F. Pisarenko / Physics Letters A 197 (1995) 31-39
33
Let us first assume that we have a set of i n d e p e n d e n t distances rl . . . . . rM within M pairs of randomly chosen points. In this case we can directly apply the standard maximum-likelihood method [ 10] in order to estimate the parameter d2. Since we know the probability density corresponding to the distribution function (6) d f(x)
=
F(x)
d2 x d2 - 1 77~o2 ,
=
~9)
O < x <~ ro,
the expression for the likelihood logarithm is straightforward, M
lnL(d2) = ln[f(rl)
× . . . × f(ra4) ] = M l n d 2
- d2Mlnro
+ (d2 - 1) Z l n r j .
(10)
j=l
The maximum-likelihood estimate ~ maximizing (10) is then '
d'2 =
In
r0
,
rj<~ro.
(tl)
j=l This estimate is asymptotically efficient, unbiased and normal when M --+ oo. The variance of the limit normal law can be expressed through the Fisher information ( 0 lnL'~ 2 IF(d2) = E \ - - ~ 2 J "
(i2)
This quantity represents the information contents about the parameter d2 in the sample rl . . . . . rM (for more details on the Fisher information see for instance Ref. [ 10], p. 283). The asymptotic variance of d2 is then As. var.
d'2- 1 _ d22 IF(d2) M
(13)
Expressions (11) and (13) give the explicit statistical estimation of the parameter d2 and its asymptotic variance. Note that the value of r0 should be taken small enough in order to ensure the validity of (8). At the same time, a too low r0 would reduce the statistics in ( 11 ) and consequently make the estimate less precise. A procedure for choosing r0 based on the verification of the power-law distribution hypothesis (8) is discussed below. Let us now consider the case when all the distances between N points x l . . . . , X N are used in order to estimate the correlation dimension. Similarly to (11) we can make the estimate of d2 involving the distaeces rij = Ixi - x j l between all pairs of points, d2 =
wij In
ro
,
(14)
where wij = H ( r o - r i j ) and L = y'~,iNi Wij. Of course in this case the distances rij can no longer be considered as independent. However when N --+ :x~ the estimate (14) is asymptotically unbiased. Its distribution tends to a normal law with the variance given by As. var.
d'2-
2d2 pN(N-
1)
[2q(N-
2) + I I ,
(15)
where p = 79{rij < r0} and q is the correlation coefficient between 1 - d 2 1 n ( r o / r i j ) and 1 - d z l n ( r o / r i k ) , under the condition that r U < ro and rik < ro (expression (15) is derived in the Appendix). This correlation coefficient can be estimated as
34
D. V. Pisarenko, V.F. Pisarenko / Physics Letters A 197 (1995) 31-39
= ~iNl ~i
(16)
V / ~ [ I - d2 ln(ro/r~i) ]2 ~ [ 1 - d2 ln(ro/rik) ]2 where both sums in denominator are taken over all i and over those j and k for which wij wik = 1.
3. Verification of the power law distribution hypothesis Attributing to an object a scaling exponent or a fractal dimension is meaningful only when one or more of its characteristics follow a power law over some range of scales. Of course in real data a pure power law can be perturbed by the presence of noise and lacunarity effects, while the range of scaling is always limited by the physical constraints of the problem. Thus it is relevant to have a quantitative measure of validity of the power law distribution hypothesis. To verify whether the distances between the points follow the distribution (8) we propose to use the standard X2-criterion. Suppose for a given r0 we have R non-overlapping independent pairs of points, the distance within each pair being less than r0. To implement the X2-test we divide the interval of presumed scaling [0, r0] into k subintervals [a0, a l ) , [al, a2) . . . . . [ak-l, ak], where a0 = 0, ak = r0. According to (8) the probability Pi that a distance between two randomly chosen points lies within the given interval is a!
Pl =
r0--/
'
pk= l_ P2 = \ r 0 /
--
~0
(ak-l) \r0
Let mj denote the number of distances lying within the interval [ a j - l , a j ] , ,¥2-statistics is
k (m.i _ Rpj) 2
~2. /
SO
(17)
that ~jk=l mj = R. Then the
(18)
j=l
In the limit N ---+ oo this statistics has the xe-distribution with k - 2 degrees of freedom, since the number of fitting parameters is one. The number of subintervals k should be not too large, especially for small R. In the numerical example discussed below, we found that for R of the order of several hundreds reasonable values of k are 5 to 7. Choosing the significance level A and the corresponding threshold X2(A) we accept the hypothesis (8) when g 2 ~< g 2 ( a ) , and reject it otherwise. Note that this test involves both of the parameters of the distribution (8), namely d2 and r0. This means that the estimation of d2, for example using (14), must precede this test. On the other hand, the X2-statistics and the probability associated with it can be used as the indicator in the choice of the upper cutoff of the scaling range r0. The choice of r0 can be made in the following way. We start with some initial value rl which is small enough, but at the same time the total number of distances N within the radius rl must be sufficient to calculate tlhe statistics (numerical examples show that typically N should be at least a few hundreds). For this value c,f the upper cutoff of the scaling range rl we calculate aTz(rl) and x 2 ( r l ) . Note that only non-overlapping (independent) pairs of points should be used in the calculation of x2(rl ). Then, gradually increasing the cutoff scale rl < r 2 . . . < rm we verify each time the power law distribution hypothesis with the help of the X2-test. Finally, we choose as r0 the maximal cutoff r i for which the hypothesis of the distribution (8) is still accepted.
D. V. Pisarenko, KF. Pisarenko / Physics Letters A 197 (1995) 31-39
1.1
,
i
,
,
i
,
,
35
i
1
IIITIITTT~
....
0.9
0.8 A
de
0.7
llIIIlllIIIII] ItiliillllIlliiil t
0.6 • TTTITTITIiTTTT TT~I~TTT~_ O'Sl 0.4 0.30.0
I
I
I
I
I
I
I
I
0.I
0.15
0.2
0.25
0.3
0,35
0.4
0.45
0.5
%
1.1
b
o li 0.8
(/2 0.7
0.6
°.It 0.05
....
TTT
. . . . .
,~ITT~_
I
I
i
0.1
0.15
02
0.25
I
I
I
I
I
0.3
0.35
0.4
0.45
0.5
% Fig. 1. Maximum-likelihood estimates of the correlation dimension of random realizations on binomial measures with the partitJon probabilities p = 0.6, 0.7 and 0.8 (from top to bottom). The horizontal solid lines correspond to the theoretical values. The error bars represent square roots of the asymptotic variances given by (15). The arrows indicate the values of ro which produce the maximal X2-probabilities in the X2-test. (a) Realizations of 200 points; (b) realizations of 500 points. 4. N u m e r i c a l e x a m p l e To illustrate the use o f the m a x i m u m - l i k e l i h o o d estimator ( 1 4 ) , ( 1 5 ) w e applied it to r a n d o m i z e d b i n o m i a l measures on a unit interval [ 11 ]. These multifractal measures are easy to construct on binary-rational intervals with the help o f a s i m p l e "cascade" generator, and their fractal d i m e n s i o n s dq can be expressed analytically. In particular
D. Iz. Pisarenko, W..F.Pisarenko / Physics Letters A 197 (1995) 31-39
36
1I
i --
a
b
0.8
0.8
a.6
0.6
P
P 0.4
D,4
I 0.2
0'2 i
~.o5
0.1
0.15
0.2
0.25
0.3
0.35
0,4
0.45
0i
0.5
0.05
0.1
0.15
0.2
ro
0.25
0.3
0.35
0.4
0.45
0,~
ro
0.2
Ilfll ~.05
0.1
0,15
0.2
0.3
0.25
0.35
J 0.4
0.45
0.5
ro ]Fig. 2. X2-probabilities as a function of the cutoff scale ro for the realizations of 200 points shown in Fig. 1. The highest probabilities correspond to the values of r0 indicated by arrows in Fig. la, (a) p = 0.6; (b) p = 0.7; (c) p = 0.8.
d2 = - log2[p 2 + (1 - p ) 2 ] ,
(19)
where p is the partition probability of the generator. We applied the maximum-likelihood estimator to realizations of 200 and 500 points on binomial measures with p = 0.6, 0.7 and 0.8 which have the theoretical values of the correlation dimension d2 = 0.94, 0.79 and 0.56 respectively. The estimates of d2 as a function of the scaling range cutoff ro are shown in Fig. 1. The error bars represent square roots of the asymptotic variances of d2 !given by (15). The solid lines represent the corresponding theoretical values of d2. The arrows indicate the values of r0 suggested by the X2-test described above. To illustrate the use of this test, we plot in Fig. 2 the k,2-probabilities as a function of the cutoff scale r0 for the realizations of 200 points from Fig. 1. The highest probabilities correspond to the r0 values on which the final estimates ,~2 are based. To verify whether the formula for the asymptotic variance (15) gives reasonable error estimates for relatively .,;mall number of data points, we calculated d2 for 20 different realizations of 200 points each. The cutoff scales r0 were fixed and were equal to those indicated by arrows in Fig. 1. The result is shown in Fig. 3. With few
D. K Ptsarenko, V.F. Pisarenko / Physics Letters A 197 (1995,) 31-39
1~2
37
0.7
0. 4
0.3
i iO
i 15
) 20
Realization number
Fig. 3. Maximum-likelihood estimates ~2 for 20 different random realizations of 200 points each on the same binomial measures (p = 0.6, 0.7 and 0.8 from top to bottom). The asymptotic error bars cover the true dimension values (horizontal lines) in 50 realizations of 60. exceptions the error bars cover the true values of the correlation dimension, which means that in this case the asymptotic formula is still applicable to small data sets. Note that the standard deviation error bars correspond to a 68%-confidence level in the case o f the normal distribution. Thus, the true values o f d2 should be covered by the error bars on average in 41 of 60 realizations shown in Fig. 3. In our numerical example there are 50 realizations for which the true value lies within the error bar. We showed that the maximum-likelihood method allows us to obtain satisfactory estimates o f the correlation dimension even from a relatively small number o f data points. The important condition to satisfy is to correctly choose the scaling range upper cutoff. For small values o f cutoff r0 the number of points involved in the computation o f the statistics is reduced, and thus the estimate becomes less reliable. Higher values o f r0 produce more stable results but the estimate is biased by the deviation of distance distribution from the power law (8) due to finite-size effects. A compromise between the two constraints must be found in each particular case. The choice o f the upper cutoff of the scaling range can be made with the help o f the X2-test which compares the observed distribution o f distances to the presumed power-law distribution.
Acknowledgement We thank J.-P. Rivet for stimulating discussions. One of the authors (D.P.) was supported by the Fondation des Treilles.
Appendix Let us derive expression (15) for the asymptotic variance of the maximum likelihood estimate ( 1 4 ) . We denote a
=
1 N ( N - 1) i-..'w~i' i#j
and
(AI)
D.V. Pisarenko, V.F. Pisarenko / Physics Letters A 197 (1995) 31-39
38
E WijIn
b=N(N_l) where as above
r0
i*]
wij = H(r(j - ro).
(A.2) The maximum-likelihood estimate of d2 is then
a
(A.3)
It follows then that
Ea = Ewij = p = 79{r(i <
(A.4)
r0},
and
AS, since under the assumption of the power law distribution of rij, the conditional probability density of ln(r0/rij) under the condition wij = 1 equals d2 exp(-d2x). Considering that lim
a =
N---~cx3
Ea,
lim
b =
N----*oo
Eb,
(A.6)
we have
a - Ea N~ Ea lim
-
= 0
(A.7)
lim
-
b - Eb Eb
-
(A.8)
and
U~
We can now expand
d2 a b
O.
a/b
with respect to these small parameters keeping only first-order terms:
Ea+a-Ea_ Eb + b - Eb
Ea[l+(a-Ea)/Ea] ~ - - E a ( l + a - E a ) Eb[1 + ( b - Eb)Eb] Eb Ea
E a ( 1+ a - E a b - E b )Eb ~E---g
=d2
l+--
(/v--d2 1) Ewij[1-d21n(r~ij) 1' = d2 -4-pN.-:7--
p
(1
bEbb )
p/d (A.9)
i4~j
Then
Ewij [1- d21n( rij r° /~ I
=0,
(A.IO)
and (A.11) The asymptotic variance of the estimate d2 is then
D. V. Pisarenko, V.F. Pisarenko / Physics Letters A 197 (1995) 31-39
As. var.
39
d2 )2E{Zwo[l_d21n(rO)]}2 pN( N - 1) i/neqj \ rij /I j
ds=(
1))
ZEw~i i4=j
I' -
[l - d 2 1 n ( r O ) ] kr~t/J
\rijJJ k*t
als,
All terms of the external sum in (A.12) multiplied by the internal sum give the same expectation value for all (i,j). There are N ( N - 1) terms in the external sum, therefore d'2 - p 2 N ( Nd22 _ 1) E W l , 2 [ l _ d s l n ( r~O ) ) ~
As. var.
wkt
[1 - d s l n ( r~o ) ] .
(A.13)
In the remaining sum there are two terms involving wl,2 and w2,1 which equal exactly to the coefficient in square brackets preceding the sum. Besides, there are 4(N - 2) terms of the form Wl,j, Wi,l, ws,j and wi,2 which have only one common index with this coefficient. Taking into account that
Ewe,2 1
-
d2 In
r0
=p
(A.14)
and
Ew, ,2[1 - dsln(rr~,2)l Wl,31[ - d 2 1 n (rr--~.3)] =pq,
(A.~5)
where q is the following correlation coefficient, EWl,2 [ 1 - d2 ln(ro/rj,2) ] wl,3[ 1 - d2 ln(ro/rl,3) ]
q=
Ew2,2[ 1 - d2 ln(ro/rl,2) ]2
'
(A.16)
we finally obtain As.
var.
=
p 2 N ( N - 1) [2p + 4 ( N - 2)pq]
_
pN(N - 1) [ 2 q ( N - 2) + 1],
which completes the demonstration. References
[ 1] [2] [3] [4] [5] [6] [71 [8] [9] [ 10] [11]
T. Hirata, T. Satoh and K. Ito, Geophys. J. R. Astr, Soc. 90 (1987) 369. M.B. Geilikman, T.V. Golubeva and V.E Pisarenko, Dokl. Akad. Nauk SSSR 310 (1990) 1335. F. Takens, Lecture notes on mathematics, Vol. 898 (Springer, Berlin, 1981) p. 366. J.D. Farmer, E. Ott and J.A. Yorke, Physica D 7 (1983) 153. P. Grassberger and 1. Procaccia, Phys. Rev. Lett. 50 (1983) 346; Physica D 9 (1983) 189. D. Ruelle, Proc. R. Soc. A 427 (1990) 241. L. Smith, Phys. Lett. A 133 (1988) 283. E Takens, Lecture notes on mathematics, Vol. 1125 (Springer, Berlin, 1985) p, 99. S. Ellner, Phys. Lett. A 133 (1988) 128. C.R. Rao, Linear statistical inference and its applications (Wiley, New York, 1965). V.E Pisarenko and D.V. Pisarenko, Phys. Lett. A 153 (1991) 169.
(A. 17)