Exploring community structure in networks by consensus dynamics



Physica A xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Physica A journal homepage: www.elsevier.com/locate/physa


He He a, Bo Yang a,∗, Xiaoming Hu b

a School of Automation, Wuhan University of Technology, Wuhan 430070, China

b Optimization and Systems Theory, KTH Royal Institute of Technology, Stockholm 10044, Sweden

Highlights

• We establish connections between community structure and consensus dynamics.
• We measure the node similarity and then trace the clustering by consensus process.
• We extract the information of communities via the evolution of dynamic matrices.
• The proposed algorithms uncover the hierarchical modular organization of networks.

Article info

Article history: Received 12 July 2015; received in revised form 17 November 2015; available online xxxx

Keywords: Complex network; Consensus; Community structure; Dynamic matrix

Abstract

This paper investigates the relationship between community structure and consensus dynamics in complex networks. We analyze the dynamical process towards consensus and show that sets of densely interconnected nodes, corresponding to well-defined communities, emerge at different time scales. To reveal these topological scales, we propose two algorithms built around the idea of visualizing the evolution of different measured quantities. We then test the algorithms on several benchmark graphs whose community structures are already known. Numerical simulations demonstrate the effectiveness and reliability of our methods. © 2016 Elsevier B.V. All rights reserved.

1. Introduction

The past decade has witnessed a great deal of interest in the science of complex networks [1–4]. This is partly due to the growing interest in understanding intriguing complex systems in the real world, and partly due to its broad applications in areas as diverse as the internet, neural networks, social networks, electricity grids and biological organizations [5–9]. As the study of complex networks has deepened, researchers have focused particularly on statistical properties that a large number of networks have in common. One of the most distinctive properties present in many complex systems is ‘‘community structure’’ [10–12]: nodes are divided into groups such that nodes within the same group are densely connected, while connections between groups are very sparse. Community detection seeks an appropriate partition that identifies such groups in a network, which helps to reveal the deeper structure and the functional modules of the whole network. Because detecting communities is so important in the science of complex networks, scientists have devoted a large effort to it and have already proposed a number of methods, such as graph partitioning [13], spectral algorithms [14], modularity-based methods [15,16] and hierarchical clustering algorithms [17], which can be classified into agglomerative algorithms and divisive algorithms.

∗ Corresponding author. E-mail address: [email protected] (B. Yang).

http://dx.doi.org/10.1016/j.physa.2015.12.140 0378-4371/© 2016 Elsevier B.V. All rights reserved.


H. He et al. / Physica A xx (xxxx) xxx–xxx


On the other hand, an important goal in the study of complex networks is to understand how topological structure influences dynamical processes. Dynamic algorithms for detecting communities have therefore attracted attention; for instance, many physicists have focused on spin models [18], random walks [19], and synchronization [20]. Santo Fortunato [21] has presented a review of most of the methods developed. Consensus is another important process running on a graph and has been studied intensively. For example, Yao Chen et al. [22] investigated the cluster consensus problem based on Markov chains and nonnegative matrix analysis. In this paper we explore the organization of communities by extracting quantitative information from the dynamical process of consensus, drawing on tools from algebraic graph theory, matrix theory and control theory. This work aims to answer the following questions about the community structure of a network: (1) does the network exhibit community structure; if it does, (2) how many communities exist; and (3) is the community structure strong or weak? Further, we would like to provide topological information on the hierarchical community organization: (a) the number of hierarchical levels; (b) the number of communities at each level; (c) the strength of the community structure at each level. Two algorithms are presented to achieve these aims. The rest of the paper is organized as follows. Some preliminaries are provided in Section 2. Section 3 presents two algorithms associated with consensus dynamics to explore the community structure of a network. In Section 4, we test our algorithms on several computer-generated and real-world graphs whose community structures are already known. A detailed demonstration of the proposed algorithms is given in Section 5. Finally, some conclusions are drawn in Section 6.


2. Preliminaries


Let G = (V, E) be an undirected graph of order n with node set V = {v1, v2, ..., vn} and edge set E ⊆ V × V. The node indices belong to the finite index set ℓ = {1, 2, ..., n}. An edge between node vi and node vj is denoted by eij = (vi, vj). We say that node vi is a neighbor of node vj if eij ∈ E, and the set of neighbors of node vi is denoted by Ni = {j ∈ ℓ : (vi, vj) ∈ E}. The adjacency matrix A is a basic representation of graph G in graph theory and computer science, and is defined by

\[
A_{ij} =
\begin{cases}
1, & e_{ij} \in E \\
0, & \text{otherwise.}
\end{cases}
\tag{1}
\]

Additionally, we assume Aii = 0 for all i ∈ ℓ. Suppose each node of the graph is a dynamic agent with linear dynamics

\[
\dot{x}_i(t) = u_i(t)
\tag{2}
\]

where xi ∈ R denotes the value of node vi. The value of a node might represent any physical state of the corresponding agent. Accordingly, X = (x1, x2, ..., xn)T denotes the value of the network. In a distributed control system, each agent updates its current state based on the information received from its neighbors. A state feedback

\[
u_i = k_i(x_{j_1}, \ldots, x_{j_m})
\tag{3}
\]

is said to be a protocol if the cluster Ji = {vj1, ..., vjm} satisfies Ji ⊆ {vi} ∪ Ni. We say that node vi and node vj agree if and only if xi = xj, and that the nodes of a network have reached a consensus if and only if xi = xj for all i, j ∈ ℓ. Protocol (3) is said to be a consensus protocol if the distributed system (2) reaches a consensus under it. The following protocol is frequently used to solve the consensus problem:

\[
u_i = \sum_{j \in N_i} A_{ij}\,(x_j - x_i).
\tag{4}
\]

Then the value of the network evolves according to the system

\[
\dot{X}(t) = -L X(t)
\tag{5}
\]

where the matrix L is called the graph Laplacian and is defined by

\[
L_{ij} =
\begin{cases}
\sum_{k=1,\,k \neq i}^{n} A_{ik}, & i = j \\
-A_{ij}, & i \neq j.
\end{cases}
\tag{6}
\]
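The step from protocol (4) to system (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: the 4-node path graph is a hypothetical example (it is not claimed to be the topology of any figure in this paper), the function names are invented here, and the state at time t is obtained from the eigendecomposition of the symmetric Laplacian rather than by numerical integration.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian of Eq. (6): degrees on the diagonal, -A_ij elsewhere."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def protocol_input(A, x):
    """Control input of protocol (4): u_i = sum_j A_ij (x_j - x_i)."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    return A @ x - A.sum(axis=1) * x

def consensus_state(L, x0, t):
    """Solution X(t) = exp(-L t) X(0) of system (5) for a symmetric L."""
    w, V = np.linalg.eigh(L)              # L = V diag(w) V^T
    return V @ (np.exp(-w * t) * (V.T @ x0))

# Hypothetical example graph: a path v1 - v2 - v3 - v4.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x0 = np.array([3.0, 1.0, 5.0, 7.0])

L = laplacian(A)
u = protocol_input(A, x0)                 # equals -L @ x0
x_late = consensus_state(L, x0, t=50.0)   # close to the average of x0
```

For an undirected connected graph the states converge to the average of the initial values, which is why `x_late` ends up near 4 for this choice of X(0).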

The location of the eigenvalues of the graph Laplacian L plays an important role in the stability properties of system (5). In order to study the relationship between the topology of a graph and the eigenvalue spectrum of L, Fiedler defined the second-smallest eigenvalue λ2 of L as the algebraic connectivity of the graph; it is also called the Fiedler eigenvalue. A well-known observation regarding the Fiedler eigenvalue of undirected graphs is that λ2 is relatively large for dense graphs and relatively small for sparse graphs.
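As a small illustration of this observation, the following sketch compares λ2 for a dense and a sparse graph on the same number of nodes. The two graphs (a complete graph and a path on 6 nodes) and the helper name `algebraic_connectivity` are choices made here for illustration only.

```python
import numpy as np

def algebraic_connectivity(A):
    """Second-smallest eigenvalue (lambda_2) of the Laplacian of A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

n = 6

# Dense example: complete graph on n nodes.
complete = np.ones((n, n)) - np.eye(n)

# Sparse example: path graph on n nodes.
path = np.zeros((n, n))
for i in range(n - 1):
    path[i, i + 1] = path[i + 1, i] = 1.0

lam_dense = algebraic_connectivity(complete)   # equals n for a complete graph
lam_sparse = algebraic_connectivity(path)      # equals 2 - 2*cos(pi/n) for a path
```

Since the slowest decaying mode of system (5) decays like exp(−λ2 t), the dense graph reaches agreement much faster than the sparse one.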


3. Detecting community structure


We now investigate the relation of community structure to consensus dynamics. Olfati-Saber et al. [23] established a positive connection between the algebraic connectivity of a graph and the negotiation speed of the consensus protocol, which shows that sets of densely interconnected nodes solve an agreement problem faster than sets with sparse connections. In particular, for a network with community structure, the subnetworks corresponding to well-defined communities reach internal agreement first, and only after a relatively long time do these communities converge to the final consensus value. It is therefore possible to identify the community structure of a graph from the order in which agreements are reached. We propose two approaches that extract information about how nodes reach agreement sequentially by visualizing different measured quantities.


3.1. Algorithm 1


3.1.1. Identification mechanism

We first define a module and then explain how the evolution of modules indicates the likely community structure of a network. During the consensus process, a group of nodes is said to form a module if it fulfills two conditions: (1) the nodes have reached an agreement with each other; (2) the nodes agree with the same external nodes. In general, sets of nodes with dense internal connections and sparse external connections become modules in a short time. Larger and larger groups then follow, according to the topological structure, until the entire graph becomes a single module. The order in which agreements are reached thus traces a dynamical route that reveals different topological structures, presumably those that represent communities. The number of modules at different time scales therefore indicates the potential number of communities.

3.1.2. Visualizing of measured quantity

Since we can study the community structure by observing the number of modules, it is important to extract quantitative information about the evolution of modules from the consensus process. A dynamic matrix D that depends on the consensus dynamics is defined as follows: Dij = 1 when node vi and node vj have reached an agreement, and Dij = 0 when they have not. In particular, Dii = 1 holds for all i ∈ ℓ. Formally,

\[
D_{ij}(t) =
\begin{cases}
1, & \delta_{ij}(t) \le T \\
0, & \delta_{ij}(t) > T
\end{cases}
\tag{7}
\]

where δij(t) = |xj(t) − xi(t)| and T is a threshold that determines the boundaries between different groups. Let m be the number of modules and N1 the number of nonzero eigenvalues of matrix D. When a set of nodes becomes a module, the rows of matrix D corresponding to the nodes in this module are identical. It follows from matrix theory that the number of modules is equal to the number of nonzero eigenvalues of matrix D, i.e. m = N1. Furthermore, suppose that a module emerges at t1 and vanishes at t2, where t1 depends on the density of internal edges and t2 depends on the density of external connections. Then the time scale (t2 − t1) indicates the relative stability of the corresponding partition of the network. We can therefore study the community structure of a network by plotting the number of nonzero eigenvalues of matrix D as a function of time.

3.2. Algorithm 2

3.2.1. Identification mechanism

All nodes of the network are initially assumed to be isolated, so there are n disconnected components at the beginning. We then analyze the system by applying protocol (4) and connect node vi and node vj by a dynamical edge when vi agrees with vj. As time goes on, groups of nodes become connected sequentially according to the topological structure. During this procedure we obtain a series of disconnected graphs, and this dynamical route unveils different topological structures, presumably those that represent communities. The disconnected graphs represent potential partitions of the network at different time scales, and the number of disconnected components in each graph indicates the number of communities in the corresponding partition. Similarly, if we plot the number of disconnected components as a function of time, the time scale of each plateau region indicates the relative stability of the corresponding partition.

Fig. 1. (a) Topology of the example network; (b) state trajectories of all nodes.

Fig. 2. Evolution of dynamical edges in a consensus process.

3.2.2. Visualizing of measured quantity

As the number of disconnected components indicates the presumable number of communities, we extract information about the number of disconnected components from the consensus process. To this end we first define a dynamic adjacency matrix Â by

\[
\hat{A}_{ij}(t) =
\begin{cases}
1, & i \neq j \text{ and } \delta_{ij}(t) \le T \\
0, & \text{otherwise}
\end{cases}
\tag{8}
\]

where δij(t) = |xj(t) − xi(t)| and T is a threshold that determines the boundaries between different groups. During the evolution of matrix Â we obtain a series of graphs G(Â), and the corresponding Laplacian matrix L̂ of each graph is given by

\[
\hat{L}_{ij}(t) =
\begin{cases}
\sum_{k=1,\,k \neq i}^{n} \hat{A}_{ik}(t), & i = j \\
-\hat{A}_{ij}(t), & i \neq j.
\end{cases}
\tag{9}
\]
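Definitions (8) and (9) translate directly into a few lines of array code. The following is a minimal sketch, not the authors' implementation: the function names, the state vector `x` and the threshold value are hypothetical, and the number of connected components is obtained by counting the (numerically) zero eigenvalues of the dynamic Laplacian with a small tolerance.

```python
import numpy as np

def dynamic_adjacency(x, T):
    """Dynamic adjacency matrix of Eq. (8): 1 where i != j and |x_j - x_i| <= T."""
    x = np.asarray(x, dtype=float)
    delta = np.abs(x[:, None] - x[None, :])   # delta_ij = |x_j - x_i|
    A_hat = (delta <= T).astype(float)
    np.fill_diagonal(A_hat, 0.0)              # no self-loops
    return A_hat

def count_components(A_hat, tol=1e-9):
    """Number of zero eigenvalues of the dynamic Laplacian of Eq. (9)."""
    L_hat = np.diag(A_hat.sum(axis=1)) - A_hat
    eigenvalues = np.linalg.eigvalsh(L_hat)   # L_hat is symmetric
    return int(np.sum(np.abs(eigenvalues) < tol))

# Hypothetical snapshot of node values during a consensus process.
x = np.array([1.00, 1.01, 1.02, 5.00, 5.01])
A_hat = dynamic_adjacency(x, T=0.015)
n_groups = count_components(A_hat)            # two connected components
```

In this snapshot the first and third nodes do not agree directly, yet they fall in the same component because both agree with the second node; this is the kind of grouping that algorithm 2 recognizes through connectivity.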

The number of disconnected components in graph G(Â) is equal to the number of zero eigenvalues of the corresponding L̂. We can therefore plot the number of zero eigenvalues of matrix L̂ as a function of time and observe the relative stability of different partitions of the network.

3.3. Discussion

Although algorithm 1 and algorithm 2 are both built around the idea of visualizing the evolution of a measured quantity during a consensus process, their results may not always coincide. The measured quantities of the two algorithms differ, which leads to different routes for unveiling community structures: (1) In algorithm 1 a group of nodes is recognized when its members agree with each other and agree with the same external nodes, while in algorithm 2 a group of nodes is recognized when each member agrees with some other node in the group, so that the group forms a connected component of G(Â). (2) In algorithm 1 we observe the number of nonzero eigenvalues of matrix D, while in algorithm 2 we observe the number of zero eigenvalues of matrix L̂. In addition, the diagonal elements of matrix D differ from those of matrix Â: for any i ∈ ℓ, we define Dii = 1 while Âii = 0.

To compare how algorithm 1 and algorithm 2 reflect the community structure of graphs, we now show the evolution of the measured quantities on a simple example. For the connected graph of 4 nodes shown in Fig. 1(a), suppose X(0) = [3, 1, 5, 7]T is the initial value and we analyze the system under protocol (4). Fig. 1(b) shows the state trajectories of all nodes. A dynamical edge is drawn as a dotted line connecting a pair of nodes once they reach an agreement. Fig. 2 shows the procedure of reaching a consensus. Table 1 shows the evolution of matrix D corresponding to the graphs in Fig. 2, where N1 denotes the number of nonzero eigenvalues of matrix D (equivalent to the number of modules). Table 2 shows the evolution of matrix L̂ corresponding to the graphs in Fig. 2, where N2 denotes the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components).


Table 1 Evolution of matrix D corresponding to the graphs in Fig. 2. Each matrix is written row by row, with rows separated by semicolons.

Graph   D                                        N1
(a)     [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1]     4
(b)     [1 1 0 0; 1 1 0 0; 0 0 1 0; 0 0 0 1]     3
(c)     [1 1 1 0; 1 1 0 0; 1 0 1 0; 0 0 0 1]     4
(d)     [1 1 1 0; 1 1 1 0; 1 1 1 0; 0 0 0 1]     2
(e)     [1 1 1 1; 1 1 1 1; 1 1 1 1; 1 1 1 1]     1

Table 2 Evolution of Â and L̂ corresponding to the graphs in Fig. 2. Each matrix is written row by row, with rows separated by semicolons.

Graph   Â                                        L̂                                                   N2
(a)     [0 0 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0]     [0 0 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0]                4
(b)     [0 1 0 0; 1 0 0 0; 0 0 0 0; 0 0 0 0]     [1 −1 0 0; −1 1 0 0; 0 0 0 0; 0 0 0 0]              3
(c)     [0 1 1 0; 1 0 0 0; 1 0 0 0; 0 0 0 0]     [2 −1 −1 0; −1 1 0 0; −1 0 1 0; 0 0 0 0]            2
(d)     [0 1 1 0; 1 0 1 0; 1 1 0 0; 0 0 0 0]     [2 −1 −1 0; −1 2 −1 0; −1 −1 2 0; 0 0 0 0]          2
(e)     [0 1 1 1; 1 0 1 1; 1 1 0 1; 1 1 1 0]     [3 −1 −1 −1; −1 3 −1 −1; −1 −1 3 −1; −1 −1 −1 3]    1
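The counts N1 and N2 for the five stages of the example can be reproduced from the agreement pairs alone. The sketch below is an illustrative check rather than part of the original method description: the helper names are hypothetical, nodes are indexed from 0, and each stage is described by the list of node pairs that have reached an agreement, using the fact that D equals the dynamic adjacency matrix plus the identity.

```python
import numpy as np

def count_nonzero_eigs(M, tol=1e-9):
    """N1: number of nonzero eigenvalues of a symmetric matrix."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(M)) > tol))

def count_zero_laplacian_eigs(A_hat, tol=1e-9):
    """N2: number of zero eigenvalues of the Laplacian built from A_hat."""
    L_hat = np.diag(A_hat.sum(axis=1)) - A_hat
    return int(np.sum(np.abs(np.linalg.eigvalsh(L_hat)) < tol))

def agreement_graph(n, pairs):
    """Dynamic adjacency matrix with a 1 for every agreeing pair."""
    A_hat = np.zeros((n, n))
    for i, j in pairs:
        A_hat[i, j] = A_hat[j, i] = 1.0
    return A_hat

# Agreeing node pairs at stages (a)-(e) of the example (0-based indices).
stages = {
    "a": [],
    "b": [(0, 1)],
    "c": [(0, 1), (0, 2)],
    "d": [(0, 1), (0, 2), (1, 2)],
    "e": [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)],
}

N1, N2 = {}, {}
for name, pairs in stages.items():
    A_hat = agreement_graph(4, pairs)
    D = A_hat + np.eye(4)                 # D_ii = 1, off-diagonal as in A_hat
    N1[name] = count_nonzero_eigs(D)
    N2[name] = count_zero_laplacian_eigs(A_hat)

# N1 -> {'a': 4, 'b': 3, 'c': 4, 'd': 2, 'e': 1}
# N2 -> {'a': 4, 'b': 3, 'c': 2, 'd': 2, 'e': 1}
```

Stage (c) is where the two quantities part ways: the three rows of D for the first three nodes are all different, so N1 returns to 4, while the same three nodes already form one connected component and N2 drops to 2.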

Fig. 3. Test on the example network. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

We then plot the number of nonzero eigenvalues of matrix D and the number of zero eigenvalues of matrix L̂ as functions of time. Fig. 3 shows that both algorithms detect the hierarchical organization of the example network: the first hierarchical level is composed of three groups, and two communities represent the second level. Note that in algorithm 1 a module may split into several modules when one of its nodes agrees with external nodes while the other nodes in the module do not. For instance, Fig. 2(b) and (c) show a module consisting of node v1 and node v2 splitting into two modules. That is why N1 increases in Fig. 3(a).

3.4. Design of key parameter


We detect the community structure of a network from the order in which agreements are reached. However, the initial value X(0) and the threshold T affect the agreement process by determining where it begins and where it ends, respectively. Therefore, besides the topology of the graph, X(0) and T may also influence the route to agreement and distort the observed results. It is thus important to choose X(0) and T so as to minimize their impact. We relate X(0) and T through a series of experiments on benchmark graphs. For each graph we initialize X(0) with different sets of random numbers and then adjust T so that the evolution of the measured quantities reveals the community structure. These trials reveal a connection between DX and T, from which we propose the following heuristic rule:


\[
T = \varepsilon \cdot \sqrt{2 D_X}
\tag{10}
\]

where ε is an adjustment parameter and DX is the variance of the initial value X(0).
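A short sketch of how the threshold could be computed in practice is given below. It takes rule (10) in the form T = ε · √(2DX), the same form quoted in Section 5.1, and it assumes that DX is the population variance of the entries of X(0) (NumPy's default); the text does not say whether the population or the sample variance is meant. The function name is hypothetical.

```python
import numpy as np

def agreement_threshold(x0, eps):
    """Threshold T = eps * sqrt(2 * D_X), with D_X the variance of x0.

    Assumption: D_X is the population variance (ddof=0).
    """
    x0 = np.asarray(x0, dtype=float)
    return eps * np.sqrt(2.0 * np.var(x0))

x0 = np.array([3.0, 1.0, 5.0, 7.0])       # initial value of the 4-node example
T_coarse = agreement_threshold(x0, eps=0.1)
T_fine = agreement_threshold(x0, eps=0.01)

# Rescaling X(0) rescales T by the same factor.
T_scaled = agreement_threshold(10.0 * x0, eps=0.1)
```

Because T is proportional to the spread of X(0), multiplying all initial values by a constant changes T by the same constant, so the set of node pairs regarded as agreeing at a given time is unchanged. This is one way to see why such a rule limits the influence of X(0).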


Fig. 4. Test on a random network. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.


We then apply the algorithms to several benchmark graphs and initialize each graph with different X(0). The experimental results show that the heuristic rule (10) effectively limits the influence of X(0) on N1 and N2. Parameter ε affects the number of communities through the boundaries between different groups, which means that ε determines the resolution of our algorithms:
(1) If we raise ε, the resolution of the algorithms decreases and nodes belonging to different communities may be assigned to one community. The measured number of communities then decreases.
(2) If we reduce ε, the resolution of the algorithms increases and nodes reach agreement only with difficulty. The measured number of communities may then fluctuate frequently and show no plateau region in the experimental results.

4. Tests of the method


Testing a method for detecting community structure means applying it to a specific network whose community structure is already known and comparing the known structure with the one delivered by the method. In this section we introduce such networks, both computer-generated and real-world, which are called benchmark graphs.


4.1. Computer-generated networks


We first generate random and regular networks that show no community structure to test the performance of the algorithms proposed in Section 3.

4.1.1. Random network

Fig. 4 shows the application of the two algorithms to a 50-node random network in which each pair of nodes is connected with a fixed probability p = 0.1. No communities are identified if the curve representing the number of nonzero eigenvalues of matrix D, or the number of zero eigenvalues of matrix L̂, falls to 1 without a plateau region. Fig. 4(a) and (b) show that neither algorithm 1 nor algorithm 2 identifies communities in the random network.

4.1.2. Regular network

Fig. 5 shows the application of the two algorithms to a common regular network, a nearest-neighbor coupled network that consists of 20 nodes, each with 4 coupled neighbors. The results again indicate that no communities exist in the regular network.

4.1.3. Planted l-partition model

We then test our algorithms on networks with community structure. The paradigmatic network with a well-defined community structure is the planted l-partition model [24], which has become very popular for testing community detection algorithms in recent years. This model partitions a graph into l groups with g nodes each. Zin and Zout denote the expected internal and external degree of a node, respectively. We follow this idea and consider the model constructed by Girvan and Newman [25], where n = l · g = 4 · 32 = 128 and the average number of links per node is fixed at 16. Here we generate a network with Zin = 15 based on the Newman–Girvan model. Fig. 6 shows the results of applying the two algorithms to network Zin = 15. We can easily observe a plateau region of N1 = 4 in Fig. 6(a) and a plateau region of N2 = 4 in Fig. 6(b), which implies that both algorithms detect the four-community structure of network Zin = 15.
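A benchmark of this kind could be generated as sketched below. This is an illustration under assumptions not spelled out in the text: edges are drawn independently, with probability Zin/(g − 1) inside a group and Zout/(n − g) between groups, where Zout = 16 − Zin, so that the expected degrees match the stated values. The function name, its parameters and the random seed are hypothetical.

```python
import numpy as np

def planted_partition(l=4, g=32, z_in=15.0, z_total=16.0, seed=0):
    """Random graph with l groups of g nodes and expected degrees z_in / z_out.

    Assumption: independent edges with probability z_in / (g - 1) inside a
    group and z_out / (n - g) between groups, where z_out = z_total - z_in.
    """
    n = l * g
    z_out = z_total - z_in
    p_in = z_in / (g - 1)
    p_out = z_out / (n - g)

    labels = np.repeat(np.arange(l), g)               # group label of each node
    same_group = labels[:, None] == labels[None, :]
    P = np.where(same_group, p_in, p_out)             # edge probability per pair

    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < P, k=1)      # sample each pair once
    A = (upper | upper.T).astype(float)               # symmetric, zero diagonal
    return A, labels

A, labels = planted_partition()
mean_degree = A.sum() / A.shape[0]                    # fluctuates around 16
```

The adjacency matrix returned here can be fed to either algorithm; `labels` holds the planted groups against which the detected communities can be compared.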


Fig. 5. Test on a regular network. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

Fig. 6. Test on network Zin = 15. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

4.1.4. Hierarchical scale-free networks

Many real networks in nature and society share a hierarchical structure: they display several levels of clustering of the nodes, with small clusters included within large clusters, which are in turn included in larger clusters. Ravasz and Barabási [26] proposed a popular model of hierarchical scale-free networks. This class of networks has some hub nodes and shows the scale-free property. Alex Arenas et al. [27] presented a simple example of this model for the case of two levels. We adopt this example and test our algorithms on it. Fig. 7(a) shows the topology of the hierarchical scale-free network, and Fig. 7(b) and (c) show the results of algorithm 1 and algorithm 2, respectively. One can observe a plateau region of N1 = 5 and a plateau region of N1 = 3 in Fig. 7(b). This implies that the network includes two hierarchical levels of communities: five compartments represent the first community organization level and three compartments represent the second. Fig. 7(c) delivers the same message as Fig. 7(b).

4.1.5. Hierarchical planted l-partition model

We then consider a variant of the planted l-partition model to test the performance of our algorithms on hierarchical computer-generated networks. These networks are composed of two hierarchical levels of communities: four groups represent the first community organization level, and the second level is defined by two larger groups, each containing two different groups of the first level. The edge densities in this model are given by three parameters Zin1, Zin2 and Zout: Zin1 is the expected internal degree of a node within its first-level group; Zin2 is the expected internal degree of a node at the second level; Zout is the expected number of edges connecting a node to nodes of the other larger group. The average degree ⟨k⟩ of a vertex is fixed to 30, with ⟨k⟩ = Zin1 + Zin2 + Zout.

We first generate network 25-4 with Zin1 = 25 and Zin2 = 4 to test the performance of the algorithms on hierarchical networks, and then apply our algorithms to network 28-1 with Zin1 = 28 and Zin2 = 1 for comparison. As an example, we analyze the plateau regions shown in Fig. 8(a). We can easily observe a plateau region of N1 = 4 and a plateau region of N1 = 2, which implies that network 25-4 includes two hierarchical levels of communities: four compartments represent the first community organization level and two compartments represent the second level. Further, the plateau range of N1 = 2 is larger than the plateau range of N1 = 4, which suggests that the community structure at the second level is stronger. By a similar analysis of Figs. 8 and 9 we obtain the topological structures of the two networks: both networks include two hierarchical levels of communities; for network 25-4 the second level is stronger, while for network 28-1 the first level is stronger.


Fig. 7. Test on hierarchical scale-free network. (a) The topology of a hierarchical scale-free network; (b) algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (c) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

Fig. 8. Test on network 25-4. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

Fig. 9. Test on network 28-1. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.


Fig. 10. Test on Zachary’s karate club network. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

Fig. 11. Test on dolphin social network. (a) Algorithm 1: the number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time; (b) algorithm 2: the number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time.

4.2. Real world networks


4.2.1. Zachary’s karate club network

Researchers have also constructed a number of benchmark graphs by observing complex systems in the real world, and we apply our algorithms to some of them. The first of these benchmark graphs is the well-known Zachary’s karate club [28]. In this study, Zachary observed 34 members of a karate club over two years. During the observation, the club’s instructor left and founded a new club with about half of the members because of a conflict with the club’s administrator. Zachary’s karate club is one of the most investigated networks, and a number of algorithms have detected its community structure. In Fig. 10(a) we plot, for the karate club network, the number of nonzero eigenvalues of matrix D as a function of time, and in Fig. 10(b) the number of zero eigenvalues of matrix L̂ as a function of time. As there is an obvious plateau region of N1 = 2 in Fig. 10(a) and of N2 = 2 in Fig. 10(b), we conclude that the karate club network can be divided into two communities, which is in line with the general knowledge.


4.2.2. Dolphin social network

The dolphin social network [29] is another real-world benchmark frequently cited in community detection. It was constructed from the observation of 62 dolphins over a period of 7 years. During the observation the group split into two groups, and a few members on the boundary between them disappeared. Fig. 11 shows the application of the two algorithms to the dolphin social network. Fig. 11(a) and (b) show that both algorithms recognize a hierarchical organization of communities: four communities form the first community organization level, and the second level is composed of two communities.


5. A detailed demonstration

In this section we present a detailed demonstration of our methods on the planted l-partition network Zin = 15, which was used in Section 4.1.3 (see Fig. 12).


Fig. 12. Topology of network Zin = 15.

Fig. 13. The number of nonzero eigenvalues of matrix D (equivalent to the number of modules) as a function of time: (a) when ε = 0.001; (b) when ε = 0.01.

5.1. Demonstration of algorithm 1

(1) Initialize the network with a random X(0) and then analyze the system by applying consensus protocol (4).
(2) Extract information from the consensus process through a dynamical matrix D, which is defined by

\[
D_{ij}(t) =
\begin{cases}
1, & |x_j(t) - x_i(t)| \le T \\
0, & |x_j(t) - x_i(t)| > T
\end{cases}
\tag{11}
\]

where T = ε · √(2DX).
(3) Adjust ε and plot the number of nonzero eigenvalues of matrix D as a function of time. (a) At first we give ε a small value, and the curve fluctuates frequently, as shown in Fig. 13(a). (b) Then we increase ε gradually until the curve shows plateau regions, as in Fig. 13(b).
(4) Select a time point t1 corresponding to the plateau region of N1 = 4 in Fig. 13(b). For example, let t1 = 2. Then obtain the matrix D(t1), and give each group of nodes corresponding to identical rows of D(t1) its own color. As seen in Fig. 14, the four groups of nodes with respective colors represent the four communities detected by algorithm 1.
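Steps (1) to (3) above can be sketched end to end as follows. This is a minimal illustration under several assumptions rather than the authors' code: the toy graph (two 4-node cliques joined by a single edge) stands in for the 128-node benchmark, X(0) is a random permutation of 0, ..., n − 1 rather than arbitrary random numbers, DX is taken as the population variance, the trajectory is obtained from the eigendecomposition of the Laplacian instead of numerical integration, and all function names are hypothetical.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian of Eq. (6)."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def module_count_curve(A, x0, eps, times, tol=1e-9):
    """N1(t): number of nonzero eigenvalues of D(t) at each sampled time."""
    x0 = np.asarray(x0, dtype=float)
    T = eps * np.sqrt(2.0 * np.var(x0))          # threshold, population variance
    w, V = np.linalg.eigh(laplacian(A))          # symmetric Laplacian
    c = V.T @ x0
    counts = []
    for t in times:
        x = V @ (np.exp(-w * t) * c)             # X(t) = exp(-L t) X(0)
        D = (np.abs(x[:, None] - x[None, :]) <= T).astype(float)   # Eq. (11)
        eigenvalues = np.linalg.eigvalsh(D)      # D is symmetric
        counts.append(int(np.sum(np.abs(eigenvalues) > tol)))
    return counts

# Hypothetical toy graph: two 4-node cliques joined by one edge.
n = 8
A = np.zeros((n, n))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 1.0

rng = np.random.default_rng(0)
x0 = rng.permutation(n).astype(float)            # distinct random initial values
times = np.linspace(0.0, 200.0, 2001)
curve = module_count_curve(A, x0, eps=0.05, times=times)
```

Plotting `curve` against `times` gives the kind of picture described in step (3): the count starts at n, ends at 1, and any plateau in between marks a candidate partition. Whether a plateau appears, and how long it is, depends on ε and on the initial values, which is why step (3) adjusts ε by inspection.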


5.2. Demonstration of algorithm 2


(1) Initialize the network with a random X(0) and analyze the system by applying consensus protocol (4).
(2) Define the dynamical adjacency matrix Â as

\[
\hat{A}_{ij}(t) =
\begin{cases}
1, & i \neq j \text{ and } |x_j(t) - x_i(t)| \le T \\
0, & \text{otherwise}
\end{cases}
\tag{12}
\]

and then obtain the dynamical Laplacian matrix by

\[
\hat{L}_{ij}(t) =
\begin{cases}
\sum_{k=1,\,k \neq i}^{n} \hat{A}_{ik}(t), & i = j \\
-\hat{A}_{ij}(t), & i \neq j.
\end{cases}
\tag{13}
\]


Fig. 14. Separated structure of network Zin = 15 at t1 = 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 15. Application of algorithm 2. (a) The number of zero eigenvalues of matrix L̂ (equivalent to the number of disconnected components) as a function of time when ε = 0.01; (b) the separated structure of network Zin = 15 at t2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

(3) Similarly, we assign ε a small value at first and then adjust ε according to the evolution of the number of zero eigenvalues of matrix L̂.
(4) Select a time point t2 corresponding to the plateau region of N2 = 4 in Fig. 15(a). Then obtain the matrix Â(t2), and give each group of nodes corresponding to a connected component of graph G(Â(t2)) its own color. The result is shown in Fig. 15(b), where the four groups of nodes with respective colors represent the four communities detected by algorithm 2.

6. Conclusion

In this paper we established a connection between community structure and consensus dynamics in complex networks. We analyzed the dynamical process towards consensus and found that sets of densely interconnected nodes, corresponding to well-defined communities, emerge at different time scales if a clear community structure exists. We extracted information about the community structure of networks by observing two different dynamic matrices associated with the topology and the consensus dynamics, which shows how the consensus process can be used to discover the hierarchical organization of a given network. Two algorithms for detecting community structure at different scales were then proposed, based on visualizing the respective measured quantities. Finally, simulations on several benchmark networks demonstrated the reliability of our methods. Detecting the community structure of networks by other coordinated dynamics will be a topic of future research.


Acknowledgments

The authors would like to thank the anonymous reviewers and the Editor for their invaluable remarks and suggestions. This work was supported by the National Natural Science Foundation of China (Grant No. 61203032), the Natural Science Foundation of Hubei Province (Grant No. 2012FFB05007), the Fundamental Research Funds for the Central Universities, the China Scholarship Council, and the Swedish Research Council.

References

[1] S.H. Strogatz, Exploring complex networks, Nature 410 (2001) 268–276. http://dx.doi.org/10.1038/35065725.
[2] M. Newman, A.-L. Barabási, D.J. Watts, The Structure and Dynamics of Networks, Princeton University Press, 2006.
[3] R. Albert, A.-L. Barabási, Statistical mechanics of complex networks, Rev. Modern Phys. 74 (2002) 47–97. http://dx.doi.org/10.1103/RevModPhys.74.47.
[4] Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási, Controllability of complex networks, Nature 473 (2011) 167–173. http://dx.doi.org/10.1038/nature10011.
[5] R. Albert, H. Jeong, A.-L. Barabási, Internet: Diameter of the world-wide web, Nature 401 (1999) 130–131. http://dx.doi.org/10.1038/43601.
[6] O. Sporns, Networks analysis, complexity, and brain function, Complexity 8 (2002) 56–60. http://dx.doi.org/10.1002/cplx.10047.
[7] J. Scott, Social Network Analysis: A Handbook, third ed., SAGE Publications Ltd., 2012.
[8] R.J. Williams, N.D. Martinez, Simple rules yield complex food webs, Nature 404 (2000) 180–183. http://dx.doi.org/10.1038/35004572.
[9] M.E.J. Newman, The structure and function of complex networks, SIAM Rev. 45 (2003) 167–256. http://dx.doi.org/10.1137/S003614450342480.
[10] G.W. Flake, S. Lawrence, C.L. Giles, Self-organization and identification of web communities, Computer (2002) 66–70. http://dx.doi.org/10.1109/2.989932.
[11] M. Girvan, M.E.J. Newman, Community structure in social and biological networks, Proc. Natl. Acad. Sci. USA 99 (2002) 7821–7826. http://dx.doi.org/10.1073/pnas.122653799.
[12] L.A. Adamic, E. Adar, Friends and neighbors on the web, Social Networks 25 (2003) 211–230. http://dx.doi.org/10.1016/S0378-8733(03)00009-1.
[13] A. Pothen, Graph Partitioning Algorithms with Applications to Scientific Computing, Springer, Netherlands, 1997, pp. 323–368. http://dx.doi.org/10.1007/978-94-011-5412-3_12.
[14] M. Marija, T. Bosiljka, Spectral and dynamical properties in classes of sparse networks with mesoscopic inhomogeneities, Phys. Rev. E 80 (2009) 026123. http://dx.doi.org/10.1103/PhysRevE.80.026123.
[15] M.E.J. Newman, Fast algorithm for detecting community structure in networks, Phys. Rev. E 69 (2004) 066133. http://dx.doi.org/10.1103/PhysRevE.69.066133.
[16] H. Du, M.W. Feldman, S. Li, X. Jin, An algorithm for detecting community structure of social networks based on prior knowledge and modularity, Complexity 12 (2007) 23–60. http://dx.doi.org/10.1002/cplx.20166.
[17] Hastie, Tibshirani, Friedman, The Elements of Statistical Learning, Springer, 2001.
[18] J. Reichardt, S. Bornholdt, Detecting fuzzy community structures in complex networks with a Potts model, Phys. Rev. Lett. 93 (2004) 218701. http://dx.doi.org/10.1103/PhysRevLett.93.218701.
[19] B.D. Hughes, Random Walks and Random Environments: Random Walks, Vol. 1, Clarendon Press, 1995.
[20] A. Arenas, A. Díaz-Guilera, C.J. Pérez-Vicente, Synchronization reveals topological scales in complex networks, Phys. Rev. Lett. 96 (2006) 114102. http://dx.doi.org/10.1103/PhysRevLett.96.114102.
[21] S. Fortunato, Community detection in graphs, Phys. Rep. 486 (2010) 75–174. http://dx.doi.org/10.1016/j.physrep.2009.11.002.
[22] Y. Chen, J. Lu, F. Han, X. Yu, On the cluster consensus of discrete-time multi-agent systems, Systems Control Lett. 60 (7) (2011) 517–523. http://dx.doi.org/10.1016/j.sysconle.2011.04.009.
[23] R. Olfati-Saber, R.M. Murray, Consensus problems in networks of agents with switching topology and time-delays, IEEE Trans. Automat. Control 49 (2004) 1520–1533. http://dx.doi.org/10.1109/TAC.2004.834113.
[24] A. Condon, R.M. Karp, Algorithms for graph partitioning on the planted partition model, Random Struct. Algorithms (2001) 221–232. http://dx.doi.org/10.1007/978-3-540-48413-4_23.
[25] M. Girvan, M.E.J. Newman, Community structure in social and biological networks, Proc. Natl. Acad. Sci. USA 99 (2002) 7821–7826. http://dx.doi.org/10.1073/pnas.122653799.
[26] E. Ravasz, A.-L. Barabási, Hierarchical organization in complex networks, Phys. Rev. E 67 (2003) 026112. http://dx.doi.org/10.1103/PhysRevE.67.026112.
[27] A. Arenas, A. Díaz-Guilera, C.J. Pérez-Vicente, Synchronization processes in complex networks, Physica D 224 (2006) 27–34. http://dx.doi.org/10.1016/j.physd.2006.09.029.
[28] W. Zachary, An information flow model for conflict and fission in small groups, J. Anthropol. Res. 33 (1977) 452–473.
[29] D. Lusseau, M. Newman, Identifying the role that individual animals play in their social network, Proc. R. Soc. Lond. Ser. B 271 (2004) 477–481.
