Physics Letters A 381 (2017) 1659–1664
Minimal partition coverings and generalized dimensions of a complex network

Eric Rosenberg, AT&T Labs, Middletown, NJ 07748, United States

Article history: Received 4 January 2017; Received in revised form 3 March 2017; Accepted 4 March 2017; Available online 21 March 2017. Communicated by C.R. Doering.

Keywords: Complex networks; Generalized dimensions; Multifractals; Box counting; Partition function

Abstract: Computing the generalized dimensions $D_q$ of a complex network requires covering the network by a minimal number of "boxes" of size s. We show that the current definition of $D_q$ is ambiguous, since there are in general multiple minimal coverings of size s. We resolve the ambiguity by first computing, for each s, the minimal covering that is summarized by the lexicographically minimal vector x(s). We show that x(s) is unique and easily obtained from any box counting method. The x(s) vectors can then be used to unambiguously compute $D_q$. Moreover, x(s) is related to the partition function, and the first component of x(s) can be used to compute $D_\infty$ without any partition function evaluations. We compare the box counting dimension and $D_\infty$ for three networks.

1. Introduction

A network $G = (\mathcal{N}, \mathcal{A})$ is a set $\mathcal{N}$ of nodes connected by a set $\mathcal{A}$ of arcs. For example, in a friendship social network [21], a node might represent a person and an arc indicates that two people are friends. In a co-authorship network, a node represents an author, and an arc connecting two authors means that they co-authored (possibly with other authors) at least one paper. In a communications network [14], a node might represent a router, and an arc might represent a physical connection between two routers. Many applications of network models are discussed in [2]. We use the term "complex network" to mean an arbitrary network without special structure (as opposed to, e.g., a regular lattice), for which all arcs have unit cost (so the length of a shortest path between two nodes is the number of arcs in that path), and all arcs are undirected (so the arc between nodes i and j can be traversed in either direction).

There are many measures used to characterize complex networks. The degree of a node is the number of arcs having that node as one of their endpoints, and one of the most studied measures is the average node degree [11]. The clustering coefficient quantifies, in social networking terms, the extent to which my friends are friends with each other. The diameter $\Delta$ is defined by $\Delta \equiv \max\{ dist(x, y) \mid x, y \in \mathcal{N} \}$, where $dist(x, y)$ is the length of the shortest path between nodes x and y. (We use "≡" to denote


a definition.) Other network measures include the average path length [1], the box counting dimension d B ([8,19]), the information dimension d I ([17,22]), and the correlation dimension dC ([9, 16,18]). In [17], Rosenberg showed that the definition proposed in [22] of the information dimension d I of a complex network G is ambiguous, since d I is computed from a minimal covering of G by “boxes” of size s, and there are in general different minimal coverings of G by boxes of size s, yielding different values of d I . Using the maximal entropy principle of Jaynes [7], the ambiguity is resolved for each s by maximizing the entropy over the set of minimal coverings by boxes of size s. We face the same ambiguity when using box counting to compute D q (as in the method proposed in [20]), since the different minimal coverings by boxes of size s can yield different values of D q . We illustrate this indeterminacy, for a very simple network, in Section 3. Thus different researchers applying different methods for computing a minimal covering of the same network might compute very different values of D q . The solution to this indeterminacy is to select, for each s, the minimal covering of G satisfying some appropriate criterion that guarantees uniqueness. Moreover, the method used to select the unique minimal covering should require only negligible additional computation beyond what is required to compute a minimal covering. A natural way to obtain a unique minimal covering for a given s and q is to compute a minimal covering that minimizes the partition function; we call such a covering an “(s, q) minimal covering”. We show that for q > 1 an (s, q) minimal covering will try to


equalize the number of nodes over all boxes in a minimal covering. An (s, q) minimal covering can be computed by a minor modification of whatever method is used to compute a minimal covering. Although the new notion of (s, q) minimal coverings removes the ambiguity in the calculation of $D_q$, it is chiefly of theoretical interest, since we do not want to compute an (s, q) minimal covering for each s and q. Rather, we want to compute $D_q$ using only a single minimal covering for each s. To this end, we introduce the new notion of a lexico (short for lexicographically) minimal summary vector x(s), which summarizes a minimal covering of size s. The value $x_j(s)$ is the number of nodes in box $B_j$ of a minimal covering, and $x_j(s)$ is non-increasing in j. We prove that x(s) is unique for each s and that x(s) summarizes an (s, q) minimal covering for all sufficiently large q. Computing x(s) requires essentially no extra computation beyond what is required to compute a minimal covering. Since for each s there is a unique lexico minimal vector x(s), and x(s) summarizes a minimal covering, we can use the x(s) vectors to unambiguously compute $D_q$. We also show that $D_\infty \equiv \lim_{q \to \infty} D_q$ can be computed from the $x_1(s)$ values, where $x_1(s)$ is the first component of x(s), without any partition function evaluations. We illustrate this by computing $D_\infty$ for three networks, and comparing $D_\infty$ to the box counting dimension $d_B$. For two of the three networks $d_B > D_\infty$ and for the third network $d_B \approx D_\infty$.

We emphasize that this paper does not propose a new box counting method for computing the generalized dimensions $D_q$ of a network. Nor is our goal to compare $D_q$ with other network dimensions such as $d_B$, $d_C$, or $d_I$. Rather, our intent is to introduce the x(s) summary vectors, describe their interesting properties, and show how any box counting method can easily be modified to compute the x(s) vectors, which can then be used to unambiguously compute $D_q$.

2. Preliminary definitions

Throughout this paper, $G$ will refer to a complex network with node set $\mathcal{N}$ and arc set $\mathcal{A}$. We assume that $G$ is connected, meaning there is a path of arcs in $\mathcal{A}$ connecting any two nodes. Let $N \equiv |\mathcal{N}|$ be the number of nodes. The network B is a subnetwork of $G$ if B is connected and B can be obtained from $G$ by deleting nodes and arcs. For each positive integer s such that $s \geq 2$, let $\mathcal{B}(s)$ be a collection of subnetworks (called boxes) of $G$ satisfying two conditions: (i) each node in $\mathcal{N}$ belongs to exactly one subnetwork (i.e., to one box) in $\mathcal{B}(s)$, and (ii) the diameter of each box in $\mathcal{B}(s)$ is at most $s - 1$. We call $\mathcal{B}(s)$ a covering of $G$ of size s, or more simply, an s-covering. We do not consider $\mathcal{B}(s)$ for $s = 1$, since a box of diameter 0 contains only a single node. Define $B(s) = |\mathcal{B}(s)|$, so $B(s)$ is the number of boxes in $\mathcal{B}(s)$. The s-covering $\mathcal{B}(s)$ is minimal if $B(s)$ is less than or equal to the number of boxes in any other s-covering. For $s > \Delta$, the minimal s-covering consists of a single box, which is $G$ itself. The term "box counting" refers to computing a minimal s-covering of $G$ for a range of values of s. In general, we cannot easily compute a minimal s-covering, but good heuristics are known (e.g., [3,8,15,19,23]).
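To make conditions (i) and (ii) concrete, the following minimal sketch (in Python) checks whether a proposed set of boxes is a valid s-covering; the 5-node path graph and the helper names are illustrative assumptions, not taken from the paper.

```python
from collections import deque

def bfs_dist(adj, src, allowed):
    """Hop distances from src, restricted to the node set 'allowed'."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_s_covering(adj, boxes, s):
    """Condition (i): the boxes partition the node set.
    Condition (ii): each box, as a subnetwork, is connected with diameter <= s - 1."""
    covered = sorted(n for box in boxes for n in box)
    if covered != sorted(adj):
        return False
    for box in boxes:
        box = set(box)
        for u in box:
            d = bfs_dist(adj, u, box)
            if set(d) != box or max(d.values()) > s - 1:
                return False
    return True

# Hypothetical 5-node path (not the paper's chair network), adjacency as a dict
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(is_s_covering(adj, [{1, 2}, {3, 4}, {5}], s=2))  # True: box diameters are 1, 1, 0
print(is_s_covering(adj, [{1, 2, 3}, {4, 5}], s=3))    # True: box diameters are 2, 1
```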
The next set of definitions concern the generalized dimensions of a geometric object. Consider a dynamical system in which motion is confined to some bounded set $\Omega \subset \mathbb{R}^E$ (E-dimensional Euclidean space) equipped with a natural invariant measure $\sigma$. Define a "box" to be a neighborhood (centered at some point) of $\Omega$. We cover $\Omega$ with a set $\mathcal{B}(s)$ of boxes of diameter s such that $\sigma(B_j) > 0$ for each box $B_j \in \mathcal{B}(s)$ and such that for any two boxes $B_i, B_j \in \mathcal{B}(s)$ we have $\sigma(B_i \cap B_j) = 0$ (i.e., boxes may overlap, but the intersection of each pair of boxes has measure zero). Define the probability $p_j(s)$ of $B_j$ by $p_j(s) \equiv \sigma(B_j)/\sigma(\Omega)$. In practice, $p_j(s)$ is approximated by $N_j(s)/N$, where N is the total number of observed points and $N_j(s)$ is the number of points in box $B_j$ [13].

Table 1. Symbols and their definitions.

$\Delta$ : network diameter
$\mathcal{B}(s)$ : covering of $G$ by boxes of size s
$B(s)$ : cardinality of $\mathcal{B}(s)$
$B_j$ : box in $\mathcal{B}(s)$
$d_B$ : box counting dimension
$D_q$ : generalized dimension
$d_I$ : information dimension
$G$ : complex network
$N$ : number of nodes in $G$
$N_j(s)$ : number of nodes in box $B_j \in \mathcal{B}(s)$
$p_j(s)$ : probability of box $B_j \in \mathcal{B}(s)$
$\mathbb{R}^E$ : E-dimensional Euclidean space
$x(s)$ : vector summarizing the covering $\mathcal{B}(s)$
$Z_q(\mathcal{B}(s))$ : partition function value for the covering $\mathcal{B}(s)$
$Z(x, q)$ : partition function value for the summary vector x
For $q \in \mathbb{R}$, define

$$Z_q(\mathcal{B}(s)) \equiv \sum_{B_j \in \mathcal{B}(s)} [p_j(s)]^q . \tag{1}$$

For $q > 0$ and $q \neq 1$, the generalized dimension $D_q$ was defined in 1983 by Grassberger [5] and by Hentschel and Procaccia [6] as

$$D_q \equiv \frac{1}{q - 1} \lim_{s \to 0} \frac{\log Z_q(\mathcal{B}(s))}{\log s} . \tag{2}$$

Since definition (1) was presented only in the context of a geometric object, we extend the definition to a complex network. Let $\mathcal{B}(s)$ be an s-covering of $G$. For $B_j \in \mathcal{B}(s)$, define $p_j(s) \equiv N_j(s)/N$, where $N_j(s)$ is the number of nodes in $B_j$. For $q \in \mathbb{R}$, we use (1) to define $Z_q(\mathcal{B}(s))$, and we call $Z_q(\mathcal{B}(s))$ the partition function value for $\mathcal{B}(s)$. For convenience, the symbols used in this paper are summarized in Table 1.

3. Minimizing the partition function

The method of [20] for computing $D_q$ for $G$ is the following. For each s, compute a minimal s-covering $\mathcal{B}(s)$ and $Z_q(\mathcal{B}(s))$. (In practice, if using a randomized box counting heuristic, $Z_q(\mathcal{B}(s))$ is the average partition function value, averaged over some number of executions of the heuristic.) Then $G$ has the generalized dimension $D_q$ (for $q \neq 1$) if over some range of s and for some constant c

$$\log Z_q(\mathcal{B}(s)) \approx (q - 1) D_q \log(s/\Delta) + c . \tag{3}$$

This definition is ambiguous, since different minimal s-coverings can yield different values of $Z_q(\mathcal{B}(s))$. In particular, [17] showed that the value of $d_I$ for $G$ depends on the particular minimal s-coverings of $G$ selected, and proposed the notion of a maximal entropy minimal covering for use in computing $d_I$. A similar ambiguity arises in defining $D_q$ for $G$, since different minimal s-coverings can yield different box probabilities $p_j(s)$ and hence different values of $D_q$.

Example 1. Consider the "chair" network of Fig. 1, which shows two minimal 3-coverings and a minimal 2-covering. Choosing $q = 2$, for the covering $\mathcal{B}(3)$ from (1) we have $Z_2(\mathcal{B}(3)) = (3/5)^2 + (2/5)^2 = 13/25$, while for $\widetilde{\mathcal{B}}(3)$ we have $Z_2(\widetilde{\mathcal{B}}(3)) = (4/5)^2 + (1/5)^2 = 17/25$. For $\mathcal{B}(2)$ we have $Z_2(\mathcal{B}(2)) = 2(2/5)^2 + (1/5)^2 = 9/25$. If we use $\mathcal{B}(3)$ then from (3) and the range $s \in [2, 3]$ we obtain $D_2 = [\log(13/25) - \log(9/25)]/(\log 3 - \log 2) = 0.907$. If instead we use $\widetilde{\mathcal{B}}(3)$ and the same range of s we obtain $D_2 = [\log(17/25) - \log(9/25)]/(\log 3 - \log 2) = 1.569$. Thus the method of [20] can yield different values of $D_2$ depending on the minimal covering selected. □

Fig. 1. Two minimal 3-coverings and a minimal 2-covering for the chair network.

To devise a computationally efficient method for selecting a unique minimal covering, first consider the maximal entropy criterion used in [17]. A maximal entropy minimal s-covering is a minimal s-covering for which the entropy $-\sum_j p_j(s) \log p_j(s)$ is largest. It is well known that entropy is maximized when all the probabilities are equal. Theorem 1 below, concerning a continuous optimization problem, proves that the partition function is minimized when the probabilities are equal. To formalize this idea, for integer $J \geq 2$, let $P(q)$ denote the continuous optimization problem

$$\text{minimize } \sum_{j=1}^{J} p_j^{q} \quad \text{subject to} \quad \sum_{j=1}^{J} p_j = 1 \text{ and } p_j \geq 0 \text{ for each } j .$$

Theorem 1. For $q > 1$, the solution of $P(q)$ is $p_j = 1/J$ for each j, and the optimal objective function value is $J^{1-q}$.

Proof. If $p_j = 1$ for some j, then for each other j we have $p_j = 0$ and the objective function value is 1. Now suppose each $p_j > 0$ in the solution of $P(q)$. Let $\lambda$ be the Lagrange multiplier associated with the constraint $\sum_{j=1}^{J} p_j = 1$. The first order optimality condition for $p_j$ is $q p_j^{q-1} - \lambda = 0$. Thus each $p_j$ has the same value, which is $1/J$. The corresponding objective function value is $\sum_{j=1}^{J} (1/J)^q = J^{1-q}$. This value is less than the value 1 (obtained when exactly one $p_j = 1$) when $J^{1-q} < 1$, which holds for $q > 1$. For $q > 1$ and $p > 0$, the function $p^q$ is a strictly convex function of p, so $p_j = 1/J$ is the unique global (as opposed to only a local) solution of $P(q)$. □

Applying Theorem 1 to $G$, minimizing the partition function $Z_q(\mathcal{B}(s))$ over all minimal s-coverings of $G$ yields a minimal s-covering for which all the probabilities $p_j(s)$ are approximately equal. Since $p_j(s) = N_j(s)/N$, having the box probabilities almost equal means that all boxes in the minimal s-covering have approximately the same number of nodes. The following definition of an (s, q) minimal covering, for use in computing $D_q$, is analogous to the definition in [17] of a maximal entropy minimal covering, for use in computing $d_I$.

Definition 1. For $q \in \mathbb{R}$, the covering $\mathcal{B}(s)$ of $G$ is an (s, q) minimal covering if (i) $\mathcal{B}(s)$ is a minimal s-covering and (ii) for any other minimal s-covering $\widetilde{\mathcal{B}}(s)$ we have $Z_q(\mathcal{B}(s)) \leq Z_q(\widetilde{\mathcal{B}}(s))$. □

Procedure 1 below shows how an (s, q) minimal covering can be computed, for a given s and q, by a simple modification of whatever box counting method is used to compute a minimal s-covering.

Procedure 1. Let $\mathcal{B}_{min}(s)$ be the best s-covering obtained over all executions of whatever box counting method is utilized. Suppose we have executed box counting some number of times, and stored $\mathcal{B}_{min}(s)$ and $Z_q(s) \equiv Z_q(\mathcal{B}_{min}(s))$, so $Z_q(s)$ is the current best estimate of the minimal value of the partition function. Now suppose we execute box counting again, and generate a new s-covering $\mathcal{B}(s)$ using $B(s)$ boxes. Let $Z_q \equiv Z_q(\mathcal{B}(s))$ be the partition function value associated with $\mathcal{B}(s)$. If $B(s) < B_{min}(s)$ holds, or if $B(s) = B_{min}(s)$ and $Z_q < Z_q(s)$ hold, set $\mathcal{B}_{min}(s) = \mathcal{B}(s)$ and $Z_q(s) = Z_q$. □

Let s be fixed. Procedure 1 above shows that we can easily modify any box counting method to compute the (s, q) minimal covering for a given q. However, this approach to eliminating ambiguity in the computation of $D_q$ is not particularly attractive, since it requires computing an (s, q) minimal covering for each value of q for which we wish to compute $D_q$. We can modify Procedure 1 to deal with multiple q values by implementing multiple tests, e.g., testing if $Z_{q_1} < Z_{q_1}(s)$, and testing if $Z_{q_2} < Z_{q_2}(s)$, but this approach is not elegant. Moreover, since there are only a finite number of minimal s-coverings, as q varies we will eventually rediscover some (s, q) minimal coverings, that is, for some $q_1$ and $q_2$ we will find that the $(s, q_1)$ minimal covering is identical to the $(s, q_2)$ minimal covering. The next section offers an elegant alternative to computing an (s, q) minimal covering for each s and q, while still achieving our goal of unambiguously computing $D_q$.
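As a sketch of Procedure 1, the code below applies its update rule to coverings that are represented only by their box node counts (a simplification); the 10-node example data and the function names are hypothetical.

```python
def Zq(counts, q, N):
    """Partition function (1) from the box node counts of a covering."""
    return sum((n / N) ** q for n in counts)

def procedure1_update(best, candidate, q, N):
    """One Procedure 1 step: keep the candidate covering if it uses fewer boxes,
    or the same number of boxes with a smaller partition function value."""
    if best is None:
        return candidate
    if len(candidate) < len(best):
        return candidate
    if len(candidate) == len(best) and Zq(candidate, q, N) < Zq(best, q, N):
        return candidate
    return best

# Hypothetical coverings of a 10-node network from repeated box counting runs
best = None
for counts in [[4, 3, 3], [5, 4, 1], [4, 4, 2]]:
    best = procedure1_update(best, counts, q=2, N=10)
print(best)   # [4, 3, 3]: three boxes, the most even split among the runs above
```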

4. Lexico minimal summaries

In this section we provide a simple way to modify any box counting method to compute a single minimal s-covering, with desirable properties, that can be used to unambiguously compute $D_q$ for all q. To begin, pick a box size s. We summarize an s-covering $\mathcal{B}(s)$ by the point $x \in \mathbb{R}^J$, where $J \equiv B(s)$, where $x_j = N_j(s)$ for $1 \leq j \leq J$, and where $x_1 \geq x_2 \geq \cdots \geq x_J$. We say "summarize" since x does not specify all the information in $\mathcal{B}(s)$; in particular, $\mathcal{B}(s)$ specifies exactly which nodes belong to each box, while x specifies only the number of nodes in each box. We use the notation $x \sim \mathcal{B}(s)$ to mean that x summarizes the s-covering $\mathcal{B}(s)$ and that $x_1 \geq x_2 \geq \cdots \geq x_J$. For example, if $N = 37$, $s = 3$, and $B(3) = 5$ then we might have $x \sim \mathcal{B}(3)$ for $x = (18, 7, 5, 5, 2)$. However, we cannot have $x \sim \mathcal{B}(3)$ for $x = (7, 18, 5, 5, 2)$ since the components of x are not ordered correctly. If $x \sim \mathcal{B}(s)$ then each $x_j$ is positive, since $x_j$ is the number of nodes in box $B_j$. If $x \sim \mathcal{B}(s)$, we call x a summary of $\mathcal{B}(s)$. By "x is a summary" we mean x is a summary of $\mathcal{B}(s)$ for some $\mathcal{B}(s)$.

Let $x \in \mathbb{R}^K$ for some positive integer K. Let $right(x) \in \mathbb{R}^{K-1}$ be the point obtained by deleting the first component of x. For example, if $x = (18, 7, 5, 5, 2)$ then $right(x) = (7, 5, 5, 2)$. Similarly, we define $right^2(x) \equiv right(right(x))$, so $right^2(7, 7, 5, 2) = (5, 2)$. Let $u \in \mathbb{R}$ and $v \in \mathbb{R}$ be numbers. We say that $u \succeq v$ (in words, u is lexico greater than or equal to v) if ordinary inequality holds, that is, $u \succeq v$ if $u \geq v$. (We use lexico instead of the longer lexicographically.) Thus $6 \succeq 3$ and $3 \succeq 3$. Now let $x \in \mathbb{R}^K$ and $y \in \mathbb{R}^K$. We define lexico inequality recursively. We say that $y \succeq x$ if either (i) $y_1 > x_1$ or (ii) $y_1 = x_1$ and $right(y) \succeq right(x)$. For example, for $x = (9, 6, 5, 5, 2)$, $y = (9, 6, 4, 6, 2)$, and $z = (8, 7, 5, 5, 2)$, we have $x \succeq y$ and $x \succeq z$ and $y \succeq z$.
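Because Python compares equal-length tuples lexicographically, the lexico comparisons above can be checked directly; the summary helper below, which sorts box node counts in nonincreasing order, is an illustrative assumption.

```python
def summary(box_node_counts):
    """Summary of a covering: box node counts sorted in nonincreasing order."""
    return tuple(sorted(box_node_counts, reverse=True))

# Python compares equal-length tuples lexicographically, matching the text
x = (9, 6, 5, 5, 2)
y = (9, 6, 4, 6, 2)
z = (8, 7, 5, 5, 2)
print(x >= y, x >= z, y >= z)      # True True True
print(summary([7, 18, 5, 5, 2]))   # (18, 7, 5, 5, 2): the correctly ordered summary
```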



Definition 2. Let $x \sim \mathcal{B}(s)$. Then x is lexico minimal if (i) $\mathcal{B}(s)$ is a minimal s-covering and (ii) if $\widetilde{\mathcal{B}}(s)$ is a minimal s-covering distinct from $\mathcal{B}(s)$ and $y \sim \widetilde{\mathcal{B}}(s)$ then $y \succeq x$. □

Theorem 2. For each s there is a unique lexico minimal summary.

Proof. Pick any s. Suppose that $x \sim \mathcal{B}(s)$ and $y \sim \widetilde{\mathcal{B}}(s)$ are both lexico minimal. Define $J \equiv B(s)$, so J is the number of boxes in both $\mathcal{B}(s)$ and in $\widetilde{\mathcal{B}}(s)$.


If $J = 1$ then a single box covers $G$, and that box must be $G$ itself, so the theorem holds. So assume $J > 1$. If $x_1 > y_1$ or $x_1 < y_1$ then x and y cannot both be lexico minimal, so we must have $x_1 = y_1$. We apply the same reasoning to $right(x)$ and $right(y)$ to conclude that $x_2 = y_2$. If $J = 2$ we are done; the theorem is proved. If $J > 2$, we apply the same reasoning to $right^2(x)$ and $right^2(y)$ to conclude that $x_3 = y_3$. Continuing in this manner, we finally obtain $x_J = y_J$, so $x = y$. □

For $x \sim \mathcal{B}(s)$ and $q \in \mathbb{R}$, define

$$Z(x, q) \equiv \sum_{j=1}^{B(s)} \left( \frac{x_j}{N} \right)^{q} . \tag{4}$$

From (1), for $x \sim \mathcal{B}(s)$ we have $Z(x, q) = Z_q(\mathcal{B}(s))$.

Theorem 3. Let $x \sim \mathcal{B}(s)$. If x is lexico minimal then $\mathcal{B}(s)$ is (s, q) minimal for all sufficiently large q.

Proof. Pick s and let $x \sim \mathcal{B}(s)$ be lexico minimal. Let $\widetilde{\mathcal{B}}(s)$ be a minimal s-covering distinct from $\mathcal{B}(s)$, and let $y \sim \widetilde{\mathcal{B}}(s)$. By Theorem 2, $y \neq x$. Define $J \equiv B(s)$. Consider the fraction

$$F \equiv \frac{Z(y, q)}{Z(x, q)} = \frac{\sum_{j=1}^{J} (y_j/N)^q}{\sum_{j=1}^{J} (x_j/N)^q} .$$

Let k be the smallest index j such that $y_j \neq x_j$. For example, if $x = (8, 8, 7, 3, 2)$ and $y = (8, 8, 7, 4, 1)$ then $k = 4$. Since $x_j \leq x_k$ for $k \leq j \leq J$, for $q > 1$ we have

$$\sum_{j=k}^{J} (x_j/x_k)^q < \sum_{j=k}^{J} (x_j/x_k) \leq J - k + 1 . \tag{5}$$

Lemma. Let $\alpha$, $\beta$, and $\gamma$ be positive numbers. Then $(\alpha+\beta)/(\alpha+\gamma) > 1$ if and only if $\beta > \gamma$.

By the Lemma (applied with $\alpha = \sum_{j=1}^{k-1} (x_j/N)^q = \sum_{j=1}^{k-1} (y_j/N)^q$, the common contribution of the first $k-1$ boxes), $F > 1$ if and only if $\sum_{j=k}^{J} (y_j/N)^q > \sum_{j=k}^{J} (x_j/N)^q$. Since $\mathcal{B}(s)$ and $\widetilde{\mathcal{B}}(s)$ are minimal s-coverings, and since x is lexico minimal, then $y_k > x_k$. We have

$$\frac{\sum_{j=k}^{J} (y_j/N)^q}{\sum_{j=k}^{J} (x_j/N)^q} = \frac{N^{-q} x_k^q \sum_{j=k}^{J} (y_j/x_k)^q}{N^{-q} x_k^q \sum_{j=k}^{J} (x_j/x_k)^q} \geq \frac{(y_k/x_k)^q}{\sum_{j=k}^{J} (x_j/x_k)^q} > \frac{(y_k/x_k)^q}{J - k + 1} ,$$

where the final inequality holds by (5). Since $y_k > x_k$ then for all sufficiently large q we have $F > 1$. □

Analogous to Procedure 1, Procedure 2 below shows how, for a given s, the lexico minimal x(s) can be computed by a simple modification of whatever box counting method is used to compute a minimal s-covering.

Procedure 2. Let $\mathcal{B}_{min}(s)$ be the best s-covering obtained over all executions of whatever box counting method is utilized. Suppose we have executed box counting some number of times, and stored $\mathcal{B}_{min}(s)$ and $x_{min}(s) \sim \mathcal{B}_{min}(s)$, so $x_{min}(s)$ is the current best estimate of a lexico minimal summary vector. Now suppose we execute box counting again, and generate a new s-covering $\mathcal{B}(s)$ using $B(s)$ boxes. Let $x \sim \mathcal{B}(s)$. If $B(s) < B_{min}(s)$ holds, or if $B(s) = B_{min}(s)$ and $x_{min}(s) \succeq x$ hold, set $\mathcal{B}_{min}(s) = \mathcal{B}(s)$ and $x_{min}(s) = x$. □

Procedure 2 shows that the only additional steps, beyond the box counting method itself, needed to compute x(s) are lexicographic comparisons, and no evaluations of the partition function $Z_q(\mathcal{B}(s))$ are required. Theorem 2 proved that x(s) is unique. Theorem 3 proved that x(s) is "optimal" (i.e., (s, q) minimal) for all sufficiently large q. Thus an attractive way to unambiguously compute $D_q$ is to compute x(s) for a range of s and use the x(s) vectors to compute $D_q$, using Definition 3 below.
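A sketch of the update rule of Procedure 2, again with coverings represented only by their summaries; nonincreasing tuples are used so that the built-in tuple comparison plays the role of the lexico comparison. The example summaries are hypothetical.

```python
def procedure2_update(x_min, x):
    """One Procedure 2 step: x_min and x are summary vectors (nonincreasing tuples
    of box node counts). Keep x if it uses fewer boxes, or the same number of
    boxes with a lexicographically smaller summary."""
    if x_min is None:
        return x
    if len(x) < len(x_min):
        return x
    if len(x) == len(x_min) and x_min >= x:   # tuple comparison is the lexico test
        return x
    return x_min

# Hypothetical summaries produced by repeated box counting runs at a fixed s
best = None
for x in [(4, 3, 3), (5, 4, 1), (4, 4, 2)]:
    best = procedure2_update(best, x)
print(best)   # (4, 3, 3): lexico minimal among the 3-box summaries seen
```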



Definition 3. For $q \neq 1$, the complex network $G$ has the generalized dimension $D_q$ if for some constant c, for some positive integers L and U satisfying $2 \leq L < U \leq \Delta$, and for each integer $s \in [L, U]$,

$$\log Z(x(s), q) \approx (q - 1) D_q \log(s/\Delta) + c , \tag{6}$$

where $Z(x(s), q)$ is defined by (4) and where $x(s) \sim \mathcal{B}(s)$ is lexico minimal. □

Since (6) is identical to (3), Definition 3 modifies the definition of $D_q$ in [20] to ensure uniqueness of $D_q$. Since x(s) summarizes a minimal s-covering, using x(s) to compute $D_q$, even when q is not sufficiently large as required by Theorem 3, introduces no error in the computation of $D_q$. By using the summary vectors x(s) we can unambiguously compute $D_q$, with negligible extra computational burden.

Example 1 (continued). Consider again the chair network of Fig. 1. Choose $q = 2$. For $s = 2$ we have $x(2) = (2, 2, 1)$ and $Z(x(2), 2) = 9/25$. For $s = 3$ we have $x(3) = (3, 2)$ and $Z(x(3), 2) = 13/25$. Choose $L = 2$ and $U = 3$. From Definition 3 we have $D_2 = \log(13/9)/\log(3/2) = 0.907$. □
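The calculation in Example 1 (continued) can be reproduced from the summary vectors alone, as in the sketch below; the function names are assumptions, and since only a slope is fitted, the value supplied for the diameter only shifts the intercept and does not change the estimate of D_q.

```python
import math

def Z_summary(x, q, N):
    """Z(x, q) of (4) for a summary vector x of a covering of an N-node network."""
    return sum((xj / N) ** q for xj in x)

def Dq_from_summaries(summaries, q, N, diameter):
    """Estimate D_q from (6): least-squares slope of log Z(x(s), q) against
    (q - 1) log(s / diameter), over the box sizes s in 'summaries'."""
    pts = [((q - 1) * math.log(s / diameter), math.log(Z_summary(x, q, N)))
           for s, x in summaries.items()]
    ma = sum(a for a, _ in pts) / len(pts)
    mb = sum(b for _, b in pts) / len(pts)
    return (sum((a - ma) * (b - mb) for a, b in pts)
            / sum((a - ma) ** 2 for a, _ in pts))

# Lexico minimal summaries of the chair network (N = 5); the diameter value
# only shifts the intercept of the fit, not the slope that defines D_q
summaries = {2: (2, 2, 1), 3: (3, 2)}
print(round(Dq_from_summaries(summaries, q=2, N=5, diameter=3), 3))   # 0.907
```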

5. The densest part of a complex network

In this section, we show how to use the x(s) summary vectors to compute $D_\infty \equiv \lim_{q \to \infty} D_q$. Let $x(s) \sim \mathcal{B}(s)$ be lexico minimal and let $x_1(s)$ be the first element of x(s). Let m(s) be the multiplicity of $x_1(s)$, defined to be the number of occurrences of $x_1(s)$ in x(s). (For example, if $x(s) = (7, 7, 6, 5, 4)$ then $x_1(s) = 7$ and $m(s) = 2$ since there are two occurrences of 7.) Then $x_1(s) > x_j(s)$ for $j > m(s)$. Using (4), we have

$$\lim_{q \to \infty} \frac{\log Z(x(s), q)}{q - 1}
= \lim_{q \to \infty} \frac{1}{q - 1} \log\left( m(s) \left(\frac{x_1(s)}{N}\right)^{q} + \sum_{j=m(s)+1}^{B(s)} \left(\frac{x_j(s)}{N}\right)^{q} \right)
= \lim_{q \to \infty} \frac{1}{q - 1} \log\left( m(s) \left(\frac{x_1(s)}{N}\right)^{q} \right)
= \lim_{q \to \infty} \frac{\log m(s) + q \log\left(x_1(s)/N\right)}{q - 1}
= \log\left(\frac{x_1(s)}{N}\right) . \tag{7}$$

Rewriting (6) yields

$$\frac{\log Z(x(s), q)}{q - 1} \approx D_q \log\frac{s}{\Delta} + \frac{c}{q - 1} . \tag{8}$$

Taking the limit of both sides of (8) as q → ∞, and using (7), we obtain

$$\log\left(x_1(s)/N\right) \approx D_\infty \log\left(s/\Delta\right) . \tag{9}$$

We can use (9) to compute $D_\infty$ without having to compute any partition function values.
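A sketch of this computation: fit (9) by least squares to the x_1(s) values. Applied to the chair network's two summaries, x_1(2) = 2 and x_1(3) = 3 with N = 5, the fitted slope is 1.0; the function name and the diameter value passed in are assumptions (the slope does not depend on the diameter).

```python
import math

def D_infinity(x1_by_s, N, diameter):
    """Estimate D_infinity from (9): least-squares slope of log(x_1(s)/N)
    against log(s/diameter), where x1_by_s maps box size s to x_1(s)."""
    pts = [(math.log(s / diameter), math.log(x1 / N)) for s, x1 in x1_by_s.items()]
    ma = sum(a for a, _ in pts) / len(pts)
    mb = sum(b for _, b in pts) / len(pts)
    return (sum((a - ma) * (b - mb) for a, b in pts)
            / sum((a - ma) ** 2 for a, _ in pts))

# First components of the chair network summaries: x_1(2) = 2 and x_1(3) = 3, N = 5.
# The diameter value only shifts the intercept, not the fitted slope.
print(round(D_infinity({2: 2, 3: 3}, N=5, diameter=3), 3))   # 1.0 for this two-point fit
```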


Fig. 2. Jazz network.

Fig. 3. Dolphins network.

It is well known [13] that, for geometric multifractals, $D_\infty$ corresponds to the densest part of the fractal. Similarly, (9) shows that, for a complex network, $D_\infty$ is the factor that relates the box size s to $x_1(s)$, the number of nodes in the box with the highest probability.

We now present computational results comparing the box counting dimension $d_B$ with $D_\infty$ for three networks. We utilized Procedure 2, which can be used with any box counting method; we used the greedy heuristic of [19], which is very similar to the method used in [23]. Other box counting methods are described in [3,8,15,23].

The greedy heuristic of [19] first transforms the covering problem into a graph coloring problem. For a given box size $s \geq 2$, create the auxiliary graph $\widetilde{G}_s = (\mathcal{N}, \widetilde{\mathcal{A}}_s)$. The node set of $\widetilde{G}_s$ is $\mathcal{N}$, and it is independent of s. The arc set of $\widetilde{G}_s$ is $\widetilde{\mathcal{A}}_s$, and it depends on s: there is an undirected arc (i, j) in $\widetilde{\mathcal{A}}_s$ if $dist(i, j) \geq s$, where the distance is in the original graph $G$. The graph $\widetilde{G}_s$ may be disconnected: if for some node i we have $dist(i, j) < s$ for each node j, then i will be an isolated node of $\widetilde{G}_s$.

Having constructed $\widetilde{G}_s$, the task is to color the nodes of $\widetilde{G}_s$, using the minimal number of colors, such that no arc in $\widetilde{\mathcal{A}}_s$ connects nodes assigned the same color. That is, if $(i, j) \in \widetilde{\mathcal{A}}_s$, then i and j must be assigned different colors. The minimal number of colors required is called the chromatic number of $\widetilde{G}_s$ and is traditionally denoted by $\chi(\widetilde{G}_s)$. The key observation ([8,19]) is that $\chi(\widetilde{G}_s) = B(s)$.

The heuristic we use to color $\widetilde{G}_s$ is the following. Suppose we have ordered the $N = |\mathcal{N}|$ nodes in some order $n_1, n_2, \cdots, n_N$. Define $\mathbb{N} \equiv \{1, 2, \cdots, N\}$. For $i \in \mathbb{N}$, let $\widetilde{N}(n_i)$ be the set of nodes in $\widetilde{G}_s$ that are adjacent to node $n_i$. (This set is empty if $n_i$ is an isolated node.) Let $c(n)$ be the color assigned to node n. We initialize $c(n_1) = 1$ and $c(n) = 0$ for all other nodes. For $i = 2, 3, \cdots, N$, the color $c(n_i)$ assigned to $n_i$ is the smallest color not assigned to any neighbor of $n_i$:

$$c(n_i) = \min\{ k \in \mathbb{N} \mid k \neq c(n) \text{ for } n \in \widetilde{N}(n_i) \} .$$

Then the estimate of the chromatic number $\chi(\widetilde{G}_s)$, which is also the estimate of $B(s)$, is $\max\{c(n) \mid n \in \mathcal{N}\}$, the number of colors used to color the auxiliary graph. All nodes with the same color belong to the same box.
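The following compact sketch implements this coloring heuristic on a toy 6-node path graph with a handful of random orderings (both assumptions for illustration): it computes BFS hop distances, builds the auxiliary graph, colors it greedily for each ordering, and keeps the covering that uses the fewest colors.

```python
import random
from collections import deque

def hop_distances(adj):
    """BFS hop distances between all node pairs of a connected graph."""
    dist = {}
    for src in adj:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist

def greedy_boxes(adj, s, orderings):
    """Estimate a minimal s-covering: color the auxiliary graph (arc iff
    dist >= s) greedily for each node ordering, keep the fewest colors."""
    dist = hop_distances(adj)
    aux = {u: [v for v in adj if v != u and dist[u][v] >= s] for u in adj}
    best = None
    for order in orderings:
        color = {}
        for u in order:
            used = {color[v] for v in aux[u] if v in color}
            color[u] = min(k for k in range(1, len(adj) + 2) if k not in used)
        boxes = {}
        for u, k in color.items():
            boxes.setdefault(k, set()).add(u)
        if best is None or len(boxes) < len(best):
            best = list(boxes.values())
    return best

# Toy 6-node path graph and 20 random orderings (illustrative assumptions)
adj = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 6] for i in range(1, 7)}
nodes = list(adj)
orderings = [random.sample(nodes, len(nodes)) for _ in range(20)]
covering = greedy_boxes(adj, s=3, orderings=orderings)
print(len(covering), sorted((len(b) for b in covering), reverse=True))
# typically: 2 [3, 3], i.e. two boxes such as {1, 2, 3} and {4, 5, 6}
```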

This node coloring heuristic assumes some given ordering of the nodes. We generated many random orderings of the nodes: for the dolphins and jazz networks we generated 10N random orderings of the nodes, and for the c. elegans network we generated N random orderings. We executed the node coloring heuristic for each random ordering. Random orderings were obtained by starting with the permutation array $\pi(n) = n$, picking two random integers $i, j \in \mathbb{N}$, and swapping the contents of $\pi$ in positions i and j. For the dolphins and jazz networks we performed 10N swaps to generate each random ordering, and for the c. elegans network we performed N swaps.

In the plots for these three networks, the horizontal axis is $\log(s/\Delta)$, the red line plots $\log(x_1(s)/N)$ vs. $\log(s/\Delta)$, and the blue line plots $-\log B(s)$ vs. $\log(s/\Delta)$. We compare $d_B$ to $D_\infty$ since for geometric multifractals we have $d_B = D_0 \geq D_\infty$ [5].

Example 2. Fig. 2 provides results for the jazz network, with 198 nodes, 2742 arcs, and $\Delta = 6$. This is a collaboration network of jazz musicians [4]. The red and blue plots are over the range $2 \leq s \leq 6$. Over this range, a linear fit to the red plot yields, from (9), $D_\infty = 1.55$, and a linear fit to the blue plot yields $d_B = 1.87$. Since $d_B = D_0$, we have $D_0 > D_\infty$.

Example 3. Fig. 3 provides results for the dolphins network, with 62 nodes, 159 arcs, and $\Delta = 8$. This is a social network describing frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand. The data was compiled


Fig. 4. c. elegans network.

by Lusseau et al. [10]. The red and blue plots are over the range $2 \leq s \leq 8$. Using (9), a linear fit to the red plot over $2 \leq s \leq 6$ (the portion of the plots that appears roughly linear) yields $D_\infty = 2.29$, and a linear fit to the blue plot over $2 \leq s \leq 6$ yields $d_B = 2.27$, so $D_\infty$ very slightly exceeds $d_B$.

Example 4. Fig. 4 provides results for the c. elegans network, with 453 nodes, 2040 arcs, and $\Delta = 7$. This is the unweighted, undirected version of the neural network of c. elegans [12]. The red and blue plots are over the range $2 \leq s \leq 7$. Using (9), a linear fit to the red plot over $3 \leq s \leq 7$ (the portion of the plots that appears roughly linear) yields $D_\infty = 1.52$, and a linear fit to the blue plot over $3 \leq s \leq 7$ yields $d_B = 1.73$, so $D_0 > D_\infty$.

In each of the three figures, the two lines roughly begin to intersect as $s/\Delta$ increases. This occurs since $x_1(s)$ is non-decreasing in s and $x_1(s)/N = 1$ for $s > \Delta$; similarly $B(s)$ is non-increasing in s and $B(s) = 1$ for $s > \Delta$.

6. Concluding remarks

We have shown that the definition in [20] of $D_q$ for a complex network $G$ is ambiguous, since there are in general multiple minimal coverings of $G$ by boxes of size s, and these different coverings can yield different values of $D_q$. Just as [17], in calculating the information dimension $d_I$ of $G$, used the criterion of maximal entropy to select among the various minimal covers of size s, here we introduced the new concept of a lexicographically minimal summary vector x(s). We proved that for each s there is a unique x(s), and that for all sufficiently large q, x(s) summarizes a minimal s-covering that also minimizes the partition function $Z_q$. The x(s) vectors can be computed by a simple modification of whatever box counting method is utilized, and the calculation of x(s) does not require any partition function evaluations. By using the x(s) summary vectors we can unambiguously compute $D_q$ with negligible extra computation effort beyond what is required to compute a minimal covering.

The x(s) vectors have an additional nice property. We showed that the limiting value $D_\infty$ and $x_1(s)$ (the first coordinate of x(s)) are related by $\log(x_1(s)/N) \approx D_\infty \log(s/\Delta)$, where $\Delta$ is the diameter of $G$. We compared $D_\infty$ and the box counting dimension $d_B$ for three networks.

Acknowledgements

Many thanks to Robert Murray, Curtis Provost, and the reviewers for their comments and suggestions.

References

[1] L. da F. Costa, F.A. Rodrigues, G. Travieso, P.R. Villas Boas, Characterization of complex networks: a survey of measurements, Adv. Phys. 56 (2007) 167–242.
[2] L. da F. Costa, O.N. Oliveira Jr., G. Travieso, F.A. Rodrigues, P.R.V. Boas, L. Antiqueira, M.P. Viana, L.E.C. Rocha, Analyzing and modeling real-world phenomena with complex networks: a survey of applications, Adv. Phys. 60 (2011) 329–412.
[3] L.K. Gallos, C. Song, H.A. Makse, A review of fractality and self-similarity in complex networks, Physica A 386 (2007) 686–691.
[4] P.M. Gleiser, L. Danon, Community structure in jazz, Adv. Complex Syst. 6 (2003) 565.
[5] P. Grassberger, Generalized dimensions of strange attractors, Phys. Lett. A 97 (1983) 227–230.
[6] H.G.E. Hentschel, I. Procaccia, The infinite number of generalized dimensions of fractals and strange attractors, Physica D 8 (1983) 435–444.
[7] E.T. Jaynes, Information theory and statistical mechanics, Phys. Rev. 106 (1957) 620–630.
[8] J.S. Kim, K.-I. Goh, B. Kahng, D. Kim, A box-covering algorithm for fractal scaling in scale-free networks, Chaos 17 (2007) 026116.
[9] L. Lacasa, J. Gómez-Gardeñes, Correlation dimension of complex networks, Phys. Rev. Lett. 110 (2013) 168703.
[10] D. Lusseau, K. Schneider, O.J. Boisseau, P. Haase, E. Slooten, S.M. Dawson, The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations, Behav. Ecol. Sociobiol. 54 (2003) 396–405.
[11] M.E.J. Newman, The structure and function of complex networks, SIAM Rev. 45 (2003) 167–256.
[12] M.E.J. Newman, Network data, http://www-personal.umich.edu/~mejn/netdata/.
[13] H.O. Peitgen, H. Jürgens, D. Saupe, Chaos and Fractals, Springer-Verlag, New York, 1992.
[14] E. Rosenberg, Capacity requirements for node and arc survivable networks, Telecommun. Syst. 20 (2002) 107–131.
[15] E. Rosenberg, Lower bounds on box counting for complex networks, J. Interconnect. Netw. 14 (2013) 1350019.
[16] E. Rosenberg, The correlation dimension of a rectilinear grid, J. Interconnect. Netw. 16 (2016) 1550010.
[17] E. Rosenberg, Maximal entropy coverings and the information dimension of a complex network, Phys. Lett. A 381 (2017) 574–580.
[18] C. Song, S. Havlin, H.A. Makse, Self-similarity of complex networks, Nature 433 (2005) 392–395.
[19] C. Song, L.K. Gallos, S. Havlin, H.A. Makse, How to calculate the fractal dimension of a complex network: the box covering algorithm, J. Stat. Mech. Theory Exp. (2007) P03006.
[20] D.-L. Wang, Z.-G. Yu, V. Anh, Multifractal analysis of complex networks, Chin. Phys. B 21 (2012) 080504.
[21] D.J. Watts, Networks, dynamics, and the small-world phenomenon, Am. J. Sociol. 105 (1999) 493–527.
[22] D. Wei, B. Wei, Y. Hu, H. Zhang, Y. Deng, A new information dimension of complex networks, Phys. Lett. A 378 (2014) 1091–1094.
[23] X.-Y. Zhao, B. Huang, M. Tang, H.-F. Zhang, D.-B. Chen, Identifying effective multiple spreaders by coloring complex networks, Europhys. Lett. 108 (2014) 68005.