Communicated by Dr. Muhammet Uzuntarla
Accepted Manuscript
Constrained Common Cluster based Model for Community Detection in Temporal and Multiplex Networks
Pengfei Jiao, Wenjun Wang, Di Jin
PII: S0925-2312(17)31507-2
DOI: 10.1016/j.neucom.2017.09.013
Reference: NEUCOM 18870
To appear in: Neurocomputing
Received date: 4 April 2017
Revised date: 11 June 2017
Accepted date: 5 September 2017
Please cite this article as: Pengfei Jiao, Wenjun Wang, Di Jin, Constrained Common Cluster based Model for Community Detection in Temporal and Multiplex Networks, Neurocomputing (2017), doi: 10.1016/j.neucom.2017.09.013
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Constrained Common Cluster based Model for Community Detection in Temporal and Multiplex Networks
Pengfei Jiao^{a,b}, Wenjun Wang^{a,b,∗}, Di Jin^{a}
a Tianjin University, School of Computer Science and Technology, Tianjin, 300350, China
b Tianjin Engineering Center of SmartSafety & Bigdata Technology, Tianjin Key Laboratory of Advanced Networking (TANK)
Abstract
The detection of tightly connected groups, known as community detection in complex networks, is a prominent problem in network analysis and mining. At the same time, many social, biological, bibliographic, communication and computer systems are modeled as temporal networks, whose topological structures evolve with time, or as multiplex networks, in which each pair of nodes can be linked by multiple relations. Current methods of community detection for temporal networks are based on incremental, independent or evolutionary clustering, and those for multiplex networks are based on fusion of the multiple links. However, all these methods ignore the common structure hidden in the networks, which is denoted as the common cluster here. In this paper, we propose a constrained common cluster based model (C³ model) to analyze temporal and multiplex networks, which can not only detect the community structure but also identify the importance of each node based on the common cluster structure of both classes of networks. The intrinsic assumption of the proposed model is that there are common or coincident clusters hidden in these networks. In detail, we first construct the Markov steady-state matrices of each snapshot of the temporal network or each slice of the multiplex network. Next, we propose the objective function of the C³ model by combining the Markov steady-state matrices and similarity matrices with the community membership matrices of each snapshot or slice of the network. Last, a gradient descent algorithm based on non-negative matrix factorization is proposed for the objective function. Experiments on both synthetic datasets and real-world networks demonstrate that the proposed C³ model has competitive performance based on the evaluation indexes NMI and error of community detection; moreover, the proposed model can identify the importance of nodes of the temporal or multiplex networks.

Keywords: temporal or multiplex networks, constraint common structure, Markov steady-state matrices, non-negative matrix factorization

∗ Corresponding author
Email addresses: [email protected] (Pengfei Jiao), [email protected] (Wenjun Wang), [email protected] (Di Jin)
Preprint submitted to Journal of LaTeX Templates, September 11, 2017
1. Introduction
Community structure is a remarkable feature observed in complex networks, and community detection has a great impact on complex network analysis and mining [1, 2]. Communities are usually composed of tightly connected subgraphs. For example, in a protein-protein interaction (PPI) network, nodes and edges correspond to proteins and the interactions among them, respectively, and each community may represent a biological organism with a specific function [3]. A variety of methods have been proposed for community detection from different fields, such as traditional clustering methods [3], spectral methods [4], modularity optimization [5], stochastic block models [6, 7], non-negative matrix factorization [8, 9, 10] and so on. Detailed and comprehensive introductions to community detection can be found in [11, 12, 13].

Furthermore, many social, biological, bibliographic, communication and computer systems can be modeled as temporal networks [14], i.e. networks whose topological structures evolve with time, or as multiplex networks [15], i.e. networks in which each pair of nodes has multiple relations. On one side, in a mobile communication network, the links between individuals change over time and the time-interval distribution of these interactions follows a power-law distribution [16]. An ecological network, which is constructed from species or other categories of organisms and the interactions among them, may change with the seasons as organisms go through different phases of their life cycles [17]. On the other side, in a social network with a multiplex structure, different slices may represent different social relationships (including Facebook, Twitter, etc.); in a transportation network, nodes and different types of links correspond to airports and different airline flights, respectively [18]. In temporal and multiplex networks the communities have their own characteristics, so finding the community structure for both types of network makes the problem of community detection more interesting and challenging. In addition, detecting anomalies in temporal networks has also received great attention [19, 20].
In fact, as the topological structure varies with time, the community structure of temporal networks has its own characteristics, including community growth, contraction and so on [21]. For a multiplex network, each single layer carries a piece of meaningful information from its own perspective, and hence how to exploit and fuse the information of the multiple slices to detect the community structure and analyze each layer of the network is an essential problem [22]. Some pioneering methods have been proposed for either temporal networks or multiplex networks, such as methods based on independent clustering [21], incremental clustering [23, 24] and evolutionary clustering [25, 26] for temporal networks, and matrix factorization [27] and pattern mining [28] for multiplex networks. However, the methods of community detection for temporal networks are not suitable for multiplex networks, because they do not make the best of the multi-slice structure of the networks. Conversely, the methods for multiplex networks ignore the temporal information and cannot analyze the evolution of temporal networks.
In addition, how can we detect the community structures in each snapshot of a temporal network or each slice of a multiplex network and analyze their latent properties? Is there any internal consistency among these communities? How are each slice of the multiplex network and each snapshot of the temporal network that we observe generated jointly from the common cluster structure and the temporal or multiplex communities?

Aiming at these challenges, in this paper we present an intuitive, principled, interpretable and optimizable model, called the constrained common cluster model (C³ model), to analyze and explore the community structure and common cluster structure hidden in temporal or multiplex networks. We assume that every snapshot or slice of the network has the same nodes and a fixed number of communities, with a varying link structure in the temporal network and different types of links at each slice in the multiplex network. The intrinsic assumption of the proposed model is that there are common or coincident structures in the snapshots of temporal networks or the slices of multiplex networks, which are called common clusters. We also assume that the number of common clusters is equal to the number of communities across the temporal or multiplex network, and that nodes are divided into different communities at each snapshot or slice. However, each cluster contains all the nodes of the network in the form of probabilities, which enables our model to analyze the importance of each node in the network. In detail, we first construct the Markov steady-state matrices of each snapshot of the temporal network or each slice of the multiplex network. Next, we propose the objective function of the C³ model by combining the Markov steady-state matrices, similarity matrices and community membership matrices of the network in a principled way. Finally, a gradient descent algorithm based on non-negative matrix factorization is proposed for optimizing the objective function. Experiments on both synthetic datasets and real-world networks demonstrate that our C³ model has competitive performance based on different evaluation indexes of community detection and can reveal interesting results in some networks.
The main contributions of this paper are summarized as follows:

• By assuming that there is a common or coincident cluster structure in the temporal or multiplex network, we propose an intuitive, principled and interpretable statistical model, C³, to detect the community structure, which is suitable for both types of networks.

• Based on the common or coincident cluster structure hidden in the network, the proposed model can identify the importance of each node in each cluster of the network.

• We propose an iterative algorithm to optimize the objective function, which is constructed by combining the Markov steady-state matrices, similarity matrices and community membership matrices of the network.

• We test the proposed model and compare it with some widely used methods for community detection on a variety of temporal and multiplex networks, and the experimental results show the effectiveness and competitive performance of the proposed model.
The rest of this paper is organized as follows. First, we review some classic and widely used methods for community detection in temporal or multiplex networks in Section 2. Second, we give the detailed construction of our model and the algorithm in Sections 3 and 4, respectively. Third, experiments on both artificially generated networks and real-world datasets are shown in Section 5. Finally, we conclude our work and give some future extensions of the proposed model in Section 6.
2. Related work
Community detection, as an important and meaningful research direction in complex networks, has attracted much attention, and a number of methods have been proposed. Surveys of community detection in static networks can be found in [11, 12, 13, 29].

Moreover, there is also a growing body of literature on community detection in temporal networks and in multiplex networks, treated separately.
On one hand, a temporal network can be modeled as a series of snapshot networks, and methods of community detection for temporal networks can in general be divided into three categories. The first class is called independent clustering, the basic idea of which is to apply static methods of community detection to each snapshot of the temporal network and then analyze the community evolution among them [21]. The second is called incremental clustering, which continuously updates the community structures of the network by treating the changes of the network as stream data [23, 24]. Similar methods include GraphScope [30], incremental spectral clustering [31], and dynamic modularity optimization methods [32, 33]. The last and most important one is called evolutionary clustering, which assumes that temporal networks vary slowly, so that the previous snapshot or community structure can be used as a penalty term when analyzing the current snapshot network. The authors of [34] extended the classic k-means and hierarchical clustering to analyze dynamic datasets. Chi et al. [25] extended spectral clustering to the dynamic setting and proposed two clustering frameworks, i.e. Preserving Cluster Quality (PCQ) and Preserving Cluster Membership (PCM). Liu et al. proposed a label propagation based evolutionary clustering method for detecting overlapping and non-overlapping communities in dynamic networks [35]. A classic and widely used method, FaceNet [36], generalizes symmetric non-negative matrix factorization for community detection in temporal networks, and is also used as a comparison method in our work. Folino et al. [26] proposed DYNMOGA, which treats evolutionary clustering as a multi-objective clustering problem. Gauvin et al. [37] proposed a method based on non-negative tensor factorization to detect the community structure and activity patterns in temporal networks. In addition, the Stochastic Block Model (SBM) [38], a classical statistical model for network analysis, has also been extended to dynamic SBMs for modeling temporal networks [39, 40, 41].

On the other hand, matrix factorization and pattern mining are the two classes of methods for community detection in multiplex networks. Based on matrix factorization, Tang et al. [27] and Gao et al. [42] proposed different graph clustering algorithms for this problem. The basic idea of both algorithms is to fuse different information by extracting common factors from the multiple slices of the network. Zeng et al. [28] proposed a subgraph mining algorithm that aims to find cross-graph quasi-cliques in a multiplex network, which is a classic and effective method based on pattern mining. Similar works can be found in [43], and a detailed review in [44].
Also of note, a generalized network quality function for both time-dependent (temporal) and multiplex networks is proposed in [45], where the authors propose a heuristic method, Multislice, which detects the temporal and multiplex community structure by a greedy optimization algorithm. The key idea of this method is to couple the adjacency matrices of the multiple slices or snapshots of the network, and hence the returned communities lack a principled theoretical meaning. Therefore, we propose the constrained common cluster model (C³ model), which can detect the community structure in both temporal and multiplex networks and analyze the importance of each node in the network.
3. Constrained Common Cluster based Model
In this section, we introduce the basic notations and the proposed model for temporal and multiplex networks.

3.1. Notations and Model
A temporal or multiplex network can usually be represented as $G_t = (V, E_t)$, where $t = 1, 2, \cdots, S$. $V$ and $E_t$ denote the nodes of the network and the edges at snapshot or slice $t$, respectively. $N = |V|$ and $M_t = |E_t|$ are the number of nodes and the number of edges of the network at each $t$, respectively. $S$ is the number of snapshots of the temporal network or the number of slices of the multiplex network, and $K$ is the number of communities. In other words, we treat the number of nodes and the number of communities in the temporal or multiplex network as constants. We also assume that the network is unweighted and undirected, with $A_t$ denoting the adjacency matrix of the network at snapshot or slice $t$: $A_{t,ij} = 1$ if there is an edge between nodes $i$ and $j$ at snapshot or slice $t$, and $A_{t,ij} = 0$ otherwise.
Figure 1: The plate representation of the constrained common cluster based model. We present the community membership matrix $H_t$ at snapshot or slice $t$ and the common cluster $W$ for fitting the temporal or multiplex network $G$ with the Markov steady-state matrix $P_t$ and adjacency matrix $A_t$ at each snapshot or slice $t$.
It is worth noting that, at each snapshot or slice $t$ of the network, the communities are denoted as a set $C = \{C_1, C_2, \cdots, C_K\}$, where each $C_k$, $k = 1, 2, \cdots, K$, is a set of nodes at snapshot or slice $t$. They are usually constrained by $\cup_{k=1}^{K} C_k = V$ and $C_k \cap C_l = \emptyset$ when $k \neq l$. This is called a non-overlapping community structure; if $C_k \cap C_l \neq \emptyset$ for some $k \neq l$, it is called an overlapping community structure. On the other hand, we denote a common cluster structure on all the nodes $V$ of the network as $W = \{W_1, W_2, \cdots, W_K\}$, where each $W_k \in (0, 1)^{N \times 1}$ is a probability vector with $\sum_{i=1}^{N} W_{ik} = 1$ for $k = 1, 2, \cdots, K$.

Figure 1 shows the plate representation of our proposed model. $H_t$ is the community membership matrix of the network at snapshot or slice $t$; each $H_{t,jk}$ represents the propensity of node $j$ to belong to community $k$ at $t$, where $j = 1, 2, \cdots, N$ and $k = 1, 2, \cdots, K$. $K$ is the number of communities of the network, which we assume to be a constant with the same value for different snapshots of the temporal network and for different slices of the multiplex network. $P_t$ is the Markov steady-state matrix of the network at snapshot or slice $t$; each $P_{t,ij}$ represents the probability of a random walker arriving at node $j$ from node $i$, which can also be considered the probability that we find node $j$ from node $i$.
As denoted before, $W$ is the probabilistic matrix of the common cluster structure. Each $W_{ik}$ represents the importance of node $i$ in cluster $k$, which can also be regarded as a prior probability of cluster $k$ containing node $i$, or the probability that we find node $i$ in cluster $k$, with $\sum_{i=1}^{N} W_{ik} = 1$. So $W_{ik} H_{t,jk}$ is the expected probability that we find node $i$ from node $j$ in cluster $k$ at snapshot or slice $t$, and $\sum_{k=1}^{K} W_{ik} H_{t,jk}$ is the expected probability that we find node $i$ from node $j$ in the network at $t$. Hence $P_{t,ij} \approx \sum_{k=1}^{K} W_{ik} H_{t,jk}$, which we write in matrix form as $P_t \approx \hat{P}_t = W H_t^T$. On the other hand, $H_{t,ik} H_{t,jk}$ can be regarded as the expected number of links between nodes $i$ and $j$ in community $k$ at snapshot or slice $t$, so $\sum_{k=1}^{K} H_{t,ik} H_{t,jk}$ is the expected number of links between nodes $i$ and $j$ at snapshot or slice $t$ of the network. We thus have $A_t \approx \hat{A}_t = H_t H_t^T$, where $H_t^T$ is the transpose of matrix $H_t$.

As denoted above, $P_t - W H_t^T$ and $A_t - H_t H_t^T$ for every snapshot or slice $t$ are the errors that we need to minimize, so we optimize the following objective function of the proposed C³ model:

$$O_1 = \lambda \sum_{t=1}^{S} D_t^{(1)}(P_t \,|\, W H_t^T) + (1 - \lambda) \sum_{t=1}^{S} D_t^{(2)}(A_t \,|\, H_t H_t^T) \qquad (1)$$

where $D_t^{(1)}$ and $D_t^{(2)}$, $t = 1, 2, \cdots, S$, may be any cost functions that quantify the quality of the approximation, such as the squared loss (the square of the Frobenius norm, i.e. the squared Euclidean distance between two matrices $X$ and $Y$ [46]) or the Kullback-Leibler divergence between the two matrices. The squared loss and the Kullback-Leibler divergence are defined as $D_F(X|Y) = \|X - Y\|_F^2 = \sum_{ij} (X_{ij} - Y_{ij})^2$ and $D_{KL}(X|Y) = \sum_{ij} \left( X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij} \right)$, which are equivalent to assuming Normally distributed and Poisson-distributed errors between $X_{ij}$ and $Y_{ij}$ for all pairs $i, j$, respectively. $\lambda \in [0, 1]$ is a balance parameter: if $\lambda$ is close to 1, the community membership matrices $H_t$ are more similar to each other because of the common cluster structure $W$; if $\lambda$ is close to 0, the model degenerates to independent clustering at each snapshot. A suitable value of $\lambda$ therefore achieves the best tradeoff, which not only detects a good community structure but also evaluates the importance of each node in every cluster.

For simplicity, in this paper we set all the $D_t^{(1)}$ and $D_t^{(2)}$, $t = 1, 2, \cdots, S$, to the Euclidean distance, so the objective function becomes a minimization problem that can be rewritten as

$$\min_{W, H_t} O_1 = \lambda \sum_{t=1}^{S} \| P_t - W H_t^T \|_F^2 + (1 - \lambda) \sum_{t=1}^{S} \| A_t - H_t H_t^T \|_F^2 \qquad (2)$$

$$\text{s.t.} \quad H_t \ge 0, \; t = 1, 2, \ldots, S; \quad W \ge 0; \quad \sum_{i=1}^{N} W_{ik} = 1, \; k = 1, 2, \ldots, K. \qquad (3)$$
For simplicity, here we set $\lambda = 0.5$ (a better value of $\lambda$ may exist, but how to select the best value is beyond the scope of this paper). The optimization of objective function (2) can be solved effectively by a gradient descent method based on non-negative matrix factorization [47]; a detailed description is given in Section 4.
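As a minimal sketch of how objective function (2) can be evaluated for given factors, the NumPy snippet below computes the two Frobenius terms and their weighted sum. The function name `c3_objective` and the list-of-arrays layout for the snapshots are illustrative assumptions, not part of the original model description.

```python
import numpy as np

def c3_objective(P, A, W, H, lam=0.5):
    """Evaluate objective (2) with the squared Frobenius loss.

    P, A, H are lists with one array per snapshot or slice (hypothetical layout):
    P[t], A[t] have shape (N, N); H[t] and W have shape (N, K).
    """
    # Fit of the Markov steady-state matrices by the common clusters W.
    fit_p = sum(np.linalg.norm(P_t - W @ H_t.T, "fro") ** 2 for P_t, H_t in zip(P, H))
    # Fit of the adjacency matrices by the community memberships H_t.
    fit_a = sum(np.linalg.norm(A_t - H_t @ H_t.T, "fro") ** 2 for A_t, H_t in zip(A, H))
    return lam * fit_p + (1.0 - lam) * fit_a
```

With `lam=0` only the adjacency term remains (independent clustering of each snapshot), and with `lam=1` only the common-cluster term remains, which matches the role of λ described above.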
4. Parameter Learning for the C³ Model

It is difficult to obtain the optimal solution by optimizing objective function (2) directly because it is non-convex. We therefore obtain $W$ and $H_t$ as a local minimum of the objective function under the Majorization-Minimization framework [48], in which we iteratively update $H_t$, $t = 1, 2, \cdots, S$, given $W$, and update $W$ given $H_t$, $t = 1, 2, \cdots, S$. The update rules of the parameters are derived as follows.

We update each $H_t$ while keeping $W$ constant. A Lagrange multiplier matrix $\Theta_t$ is introduced for each $H_t$, and the corresponding objective function is

$$O_{H_t} = \| P_t - W H_t^T \|_F^2 + \| A_t - H_t H_t^T \|_F^2 + \mathrm{tr}(\Theta_t H_t^T) = \mathrm{tr}[(P_t - W H_t^T)^T (P_t - W H_t^T)] + \mathrm{tr}[(A_t - H_t H_t^T)^T (A_t - H_t H_t^T)] + \mathrm{tr}(\Theta_t H_t^T) \qquad (4)$$

Here we omit the weights $\lambda$ and $(1 - \lambda)$ for brevity; they are equal under the setting $\lambda = 0.5$ adopted here, so this does not affect the model or the optimization. We then set the derivative of $O_{H_t}$ with respect to $H_t$ to 0 and get

$$2 H_t W^T W - 2 P_t W + 4 H_t H_t^T H_t - 4 A_t H_t + \Theta_t = 0 \qquad (5)$$

Together with the complementary slackness Karush-Kuhn-Tucker (KKT) condition for each $H_t$, we get

$$(2 P_t W - 2 H_t W^T W + 4 A_t H_t - 4 H_t H_t^T H_t)_{ik} H_{t,ik} = 0 \qquad (6)$$

Following the updates in [8, 49, 50], we get the update of $H_t$ [51, 47, 52] as

$$H_{t,ik} \leftarrow H_{t,ik} \left( \frac{(P_t W + 2 A_t H_t)_{ik}}{(H_t W^T W + 2 H_t H_t^T H_t)_{ik}} \right)^{\frac{1}{4}} \qquad (7)$$

This update rule for $H_t$ guarantees that the objective function is non-increasing and converges.
We then update $W$ given each $H_t$. The objective function with respect to $W$ can be rewritten as

$$O_W = \sum_{t=1}^{S} \| P_t - W H_t^T \|_F^2 \qquad (8)$$

with the constraints $W \ge 0$ and $\sum_{i=1}^{N} W_{ik} = 1$, $k = 1, 2, \ldots, K$.

We also introduce a Lagrange multiplier matrix $\Phi$ for the non-negativity constraints on $W$, and the objective function $O_W$ is written as

$$O_W = \sum_{t=1}^{S} \| P_t - W H_t^T \|_F^2 + \mathrm{tr}(\Phi W^T) \qquad (9)$$

The derivative of $O_W$ with respect to $W$ is

$$\frac{\partial O_W}{\partial W_{ik}} = \sum_{t=1}^{S} (2 W H_t^T H_t - 2 P_t H_t)_{ik} + \Phi_{ik} \qquad (10)$$

Then, similarly to [42], we obtain the update of $W$ as

$$W_{ik} \leftarrow W_{ik} \frac{\left( \sum_{t=1}^{S} P_t H_t \right)_{ik}}{\left( \sum_{t=1}^{S} W H_t^T H_t \right)_{ik}} \qquad (11)$$

and we enforce the constraint $\sum_{i=1}^{N} W_{ik} = 1$, $k = 1, 2, \ldots, K$, after each iteration by column normalization.

As discussed above, given the initial values, we iteratively update $W$ and each $H_t$ and calculate the objective function until convergence. The procedure is summarized in Algorithm 1, and its convergence and correctness can be proved following [51, 47, 52].
Algorithm 1 Algorithm for the Constrained Common Cluster based Model
Input: A temporal or multiplex network $G = \{A_1, A_2, \cdots, A_S\}$; the number of communities $K$; and the maximum number of iterations $n_{iter}$
Output: The common structure $W$ and the community membership $H_t$ for each snapshot or slice of the network
1: initialize $W$ with $W_{ik} \ge 0$ and $\sum_{i=1}^{N} W_{ik} = 1$, $k = 1, 2, \ldots, K$
2: for $t = 1 : S$ do
3:     initialize $H_t$ with $H_{t,ik} \ge 0$
4: for $n = 1 : n_{iter}$ do
5:     $W_{ik} \leftarrow W_{ik} \frac{(\sum_{t=1}^{S} P_t H_t)_{ik}}{(\sum_{t=1}^{S} W H_t^T H_t)_{ik}}$
6:     $W_{ik} \leftarrow \frac{W_{ik}}{\sum_{i} W_{ik}}$
7:     for $t = 1, 2, \cdots, S$ do
8:         $H_{t,ik} \leftarrow H_{t,ik} \left( \frac{(P_t W + 2 A_t H_t)_{ik}}{(H_t W^T W + 2 H_t H_t^T H_t)_{ik}} \right)^{\frac{1}{4}}$
9:     calculate objective function (2)
10: return $W$ and $H_t$
The most time-consuming parts of the algorithm are the updates of $W_{ik}$ and $H_{t,ik}$ for each $t$. The time cost at the $n$-th iteration is $O(S[(N + 1)K + d_i])$ for each $W_{ik}$ and $O((N + 1)K + d_i)$ for each $H_{t,ik}$, where $d_i$ denotes the degree of node $i$ at $t$. So the overall time cost of the algorithm is $O(NKS[(N + 1)K] + MK)$, where $N$, $M$, $K$ and $S$ are the number of nodes, the mean number of edges over all snapshots or slices, the number of communities and the number of snapshots or slices of the network, respectively.
5. Experimental results

In this section, we introduce the algorithms to be compared with the proposed model and the evaluation metrics, and present experiments on both synthetic and real-world networks, including temporal networks and multiplex networks.
5.1. Comparison Algorithms
Here, we introduce the comparison algorithms used in our experiments:

• Static symmetric non-negative matrix factorization [8] (StaNMF), which is the special case of our model obtained by setting $\lambda = 0$ in objective function (2).

• DYNMOGA [26], a multiobjective approach that is also based on evolutionary clustering and is formalized as a multiobjective optimization problem solved by a genetic algorithm.

• FaceNet [36], which combines symmetric non-negative matrix factorization and evolutionary clustering. We set the penalty coefficient of smoothness to 0.8, which returns the best performance of the method.

• Multislice [45], which optimizes the temporal, multiscale and multiplex modularity with a greedy heuristic method. We set the resolution parameter $\gamma = 1$ and the coupling parameter $\omega = 0.2$, which are the usual parameter settings in related works.

• Our proposed model (C³) and the common constraint non-negative matrix factorization (ComNMF), which is the special case of the proposed model obtained by setting $\lambda = 1$ in objective function (2).
5.2. Evaluation Metrics
Two widely used evaluation metrics for community detection, the Normalized Mutual Information (NMI) [53] and the error [54], are introduced to evaluate the performance of the methods. A larger NMI value means a better performance of the algorithm, and a larger error indicates a relatively poor result of the corresponding method. We use both metrics on each snapshot of the temporal networks and each slice of the multiplex networks.

The Normalized Mutual Information (NMI) measures how similar the community detection result delivered by the algorithm is to the ground truth, and is defined as

$$NMI(X, Y) = \frac{2 I(X, Y)}{H(X) + H(Y)} \qquad (12)$$

where $X$ and $Y$ are the two partitions of the community structure, $I(X, Y) = \sum_{x} \sum_{y} P(x, y) \log \frac{P(x, y)}{P(x) P(y)}$ is the mutual information between $X$ and $Y$, and $H(X)$ and $H(Y)$ denote the entropies of $X$ and $Y$, respectively.

The error measures the difference between two partitions and is defined as

$$error = \| Z Z^T - C C^T \|_F^2 \qquad (13)$$

where $Z$ and $C$ denote the partition of the ground truth and the partition returned by the algorithm, respectively, and $\| V \|_F$ is the Frobenius norm of matrix $V$.

5.3. Experiments on the synthetic networks

In this subsection, we introduce one class of multiplex benchmark and three classes of temporal synthetic networks to present the performance of our proposed model and the baseline methods introduced before.

Multiple Girvan-Newman benchmark

Here, we first generate the Girvan-Newman (GN) benchmark, a class of networks with 128 nodes and 4 equal-size communities; each community has 32 nodes and the expected degree of every node is 16. An extremely important parameter, $z_{in}$, denotes the average number of edges connecting each node to its own community and reflects the significance of the community structure of the network; a detailed description of this data and its generation procedure is given in [53]. Based on each generated GN network, we generate the multiplex network as follows: we randomly select edges of the network $S$ times while keeping all the nodes, using a fraction parameter $r$ to control the number of selected edges, and thereby construct a multiplex network with $S$ slices. The ground truth of each slice of the network is the same, and we ensure that each slice network is connected.
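The multiplex construction described above (sampling a fraction $r$ of the edges of one GN network $S$ times and keeping only connected slices) can be sketched as follows. The GN generator here is an assumption: it uses a simple planted-partition draw with intra-community probability $z_{in}/31$ and inter-community probability $(16 - z_{in})/96$, which may differ from the exact procedure of [53]. The function names and the resampling limit `max_tries` are likewise hypothetical.

```python
import numpy as np

def gn_network(z_in, n_comm=4, size=32, degree=16, rng=None):
    """Planted-partition stand-in for the GN benchmark (assumed generator)."""
    rng = np.random.default_rng() if rng is None else rng
    n = n_comm * size
    labels = np.repeat(np.arange(n_comm), size)
    same = labels[:, None] == labels[None, :]
    prob = np.where(same, z_in / (size - 1), (degree - z_in) / (n - size))
    upper = np.triu(rng.random((n, n)) < prob, k=1)   # draw each node pair once
    return (upper | upper.T).astype(int), labels

def is_connected(A):
    """Graph search from node 0 on a dense adjacency matrix."""
    seen = np.zeros(A.shape[0], dtype=bool)
    seen[0] = True
    frontier = [0]
    while frontier:
        node = frontier.pop()
        for nb in np.flatnonzero(A[node]):
            if not seen[nb]:
                seen[nb] = True
                frontier.append(nb)
    return bool(seen.all())

def multiplex_slices(A, S, r, rng=None, max_tries=1000):
    """Build S slices, each keeping a random fraction r of the edges of A."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = np.nonzero(np.triu(A, k=1))
    m_keep = int(round(r * rows.size))
    slices = []
    for _ in range(S):
        for _attempt in range(max_tries):
            keep = rng.choice(rows.size, size=m_keep, replace=False)
            A_s = np.zeros_like(A)
            A_s[rows[keep], cols[keep]] = 1
            A_s[cols[keep], rows[keep]] = 1
            if is_connected(A_s):              # resample until the slice is connected
                slices.append(A_s)
                break
        else:
            raise RuntimeError("could not sample a connected slice")
    return slices
```

All slices share the label vector returned by `gn_network`, which plays the role of the constant ground truth.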
As presented in Figure 2, we show the experimental performance in four settings, with the parameters set to $z_{in} = 7, r = 0.7$; $z_{in} = 7, r = 0.8$; $z_{in} = 8, r = 0.7$; and $z_{in} = 8, r = 0.8$, respectively. For each setting, we run all the algorithms 20 times, and Figure 2 shows the average and standard deviation of the results at each slice over these runs. We find that the performance of our proposed model is relatively better than that of the other methods, and that all the methods have relatively poor results with $z_{in} = 7$. As for the error, the Multislice method has lower values; however, it automatically selects 13 communities in each slice, which is seriously wrong. We tested all the settings with the parameter $z_{in}$ varying from 8 to 15, which is the range most static methods use, and our proposed method is nearly always better than the baseline algorithms; we only give the setting with $z_{in} = 8$ for conciseness. We also observe that the performance of all the methods is better with $r = 0.8$ than with $r = 0.7$. This is easy to understand, since the latter setting makes each slice network sparser.
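The NMI and error values discussed above follow definitions (12) and (13), which can be sketched as below. The sketch assumes that partitions are given as integer label vectors and that $Z$ and $C$ in (13) are the corresponding 0/1 community indicator matrices; the function names are illustrative.

```python
import numpy as np

def nmi(x, y):
    """Normalized mutual information 2 I(X, Y) / (H(X) + H(Y)) of two label vectors."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)            # contingency table of the two partitions
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(px, py)[nz]))
    hx = -np.sum(px * np.log(px))
    hy = -np.sum(py * np.log(py))
    # Two trivial single-community partitions are treated as identical.
    return 2.0 * mi / (hx + hy) if hx + hy > 0 else 1.0

def partition_error(z, c):
    """Squared Frobenius distance between co-membership matrices Z Z^T and C C^T."""
    z, c = np.asarray(z).ravel(), np.asarray(c).ravel()
    Z = (z[:, None] == np.unique(z)[None, :]).astype(float)   # indicator matrix (assumed form)
    C = (c[:, None] == np.unique(c)[None, :]).astype(float)
    return np.linalg.norm(Z @ Z.T - C @ C.T, "fro") ** 2
```

Both measures are invariant to relabeling of the communities, so the detected labels do not need to be matched to the ground-truth labels beforehand.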
Temporal benchmark network 1

This class of temporal synthetic networks was first proposed by Lin et al. in [54] and is the dynamic generalization of the Girvan-Newman benchmark. We generate temporal networks with 10 snapshots; at each snapshot there are 128 nodes and 4 communities of equal size, and the average degree of the nodes is 20. A mixing parameter or noise level $z$ is used to control the significance of the communities. Here, we set $z = 5$ and $z = 6$, respectively, which are the usual settings in previous works. The dynamics of the networks are as follows: at each snapshot $t \ge 1$, some nodes in each community leave their own community and join other communities randomly. We use the parameter $n_c$ to represent the dynamics of each community, which means that $4 n_c$ nodes switch their communities at each snapshot.

To test the performance of the proposed model, we set the mixing parameter $z = 5, 6$ and the dynamic level parameter $n_c = 3, 9$, respectively. Figure 3 gives the results for the different parameter settings on both evaluation metrics. Here, we compare six different methods: StaNMF, FaceNet, DYNMOGA, Multislice, ComNMF and C³. Firstly, the performance of all the methods is better with $z = 5$ than with $z = 6$ on both NMI and error, which is intuitive and easy to interpret. Secondly, our proposed C³ has nearly the best performance on both NMI and error, since the Multislice method overfits the network; in other words, it finds a larger number of communities than the ground truth. Thirdly, although the proposed ComNMF method has relatively poor performance, the C³ model works well, from which we see that the common structure is a good hypothesis for analyzing temporal networks. Lastly, we run all the algorithms 20 times, and our proposed method C³ has relatively stable performance on both NMI and error.
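The community dynamics of this benchmark, in which $n_c$ nodes of each community move to other communities at each snapshot, can be sketched for the label vector alone as follows. The function name `switch_labels` and the uniform choice of the destination community are assumptions made for illustration; the edge generation of [54] is not reproduced here.

```python
import numpy as np

def switch_labels(labels, n_c, rng=None):
    """Move n_c randomly chosen nodes of every community to a different community."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    comms = np.unique(labels)
    new_labels = labels.copy()
    for c in comms:
        # Movers are drawn from the membership at the previous snapshot.
        movers = rng.choice(np.flatnonzero(labels == c), size=n_c, replace=False)
        others = comms[comms != c]
        new_labels[movers] = rng.choice(others, size=n_c)   # assumed uniform destination
    return new_labels
```

Applying the function repeatedly, starting from four communities of 32 nodes, yields a sequence of ground-truth partitions in which exactly $4 n_c$ nodes change community between consecutive snapshots.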
Temporal benchmark network 2
The second class of temporal benchmark networks was proposed in [55]; these networks are generated by taking into account events that characterize the dynamics of the networks. To match our model's assumption, we select one class of synthetic networks, called switch, and generate temporal networks with 10 snapshots. At each snapshot, the network has 1000 nodes and 40 communities, the average node degree is 15 and the degree distribution follows a power law. Two parameters, mu and p, control the community structure and the temporal dynamics of the networks. The first is the mixing parameter: the larger the value of mu, the fuzzier the community structure. The second is the temporal switch parameter: p represents the probability that a node changes its community between time steps.
In our experiments, we set mu = 0.1, 0.2, ..., 0.8 and p = 0.1, 0.8, respectively. In Figure 4, we show the experimental results on the generated temporal networks with the parameter settings mu = 0.7, p = 0.1; mu = 0.7, p = 0.8; mu = 0.8, p = 0.1; and mu = 0.8, p = 0.8. All the methods perform relatively well when mu < 0.7, so we omit those results. From the results,
we can see that, firstly, the proposed C^3 model and FaceNet perform better than the other methods in general on NMI, and C^3 gives more competitive results on error, which shows that the C^3 model is more appropriate for temporal network analysis. Secondly, the ComNMF method gives poor results for larger p, which matches our intuition. Lastly, and most importantly, adding the common structure works better than adding a penalty term based on the previous community structure, as in FaceNet; meanwhile, the balance parameter is easy to set.
Temporal benchmark network 3
The dynamic Grow-shrink benchmark proposed in [56] is mainly used for testing models and algorithms for temporal community detection. The benchmark uses a triangular waveform function to model the dynamics of the network, and each snapshot network is generated with the classic SBM. Here, we generate temporal networks with 21 snapshots; at each snapshot, we set the number of communities to 8, with each community having 32 or 64 nodes, respectively. Two other parameters, pin and pout, represent the edge density within communities and between different communities, respectively. In our experiments, we set pin = 0.5 and vary pout from 0.1 to 0.5.
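To make the role of pin and pout concrete, the sketch below samples a single snapshot from a classic SBM with 8 equal-sized communities. It covers only the per-snapshot SBM step; the triangular waveform that drives the grow-shrink dynamics in [56] is not reproduced here. The function name `sbm_snapshot` and its arguments are assumptions made for this example.

```python
import random


def sbm_snapshot(sizes, p_in, p_out, seed=None):
    """Sample an undirected SBM graph as a dense 0/1 adjacency matrix.

    sizes : list with the number of nodes in each community
    p_in  : edge probability inside a community (pin in the text)
    p_out : edge probability between communities (pout in the text)
    """
    rng = random.Random(seed)
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


# One snapshot with 8 communities of 32 nodes (256 nodes in total).
adj, labels = sbm_snapshot([32] * 8, p_in=0.5, p_out=0.24, seed=1)
```

As pout approaches pin, the intra- and inter-community densities become indistinguishable, which is why larger pout corresponds to a fuzzier community structure.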
As shown in Figure 5, the proposed C^3 model has better and more stable performance on NMI and error at each snapshot, especially for the networks with pout = 0.28, which means that our model is suitable for fuzzy temporal networks. The Multislice method has lower error values because it overfits the networks: in our results, this method finds more than 20 communities at each snapshot, whereas the number of communities at each snapshot is only 8, as specified.
To further test the performance of all these methods on the temporal networks, we give detailed results on the networks with varying parameter pout, as shown in Figure 6. We summarize the results by computing the average NMI and error over all the snapshots of each temporal network. Our proposed C^3 model has competitive performance on both NMI and error, especially for larger or fuzzier temporal networks.
5.4. Experiments on the real world networks
An MIT Reality Multiplex network
The multiplex network of MIT Reality Mining [57] used in this paper was collected and analyzed by the MIT Human Dynamics Lab, with 87 users involved, in October 2008 on campus. The network has 3 layers, which represent different social relationships: Calls, SMS, and Proximity. We construct the networks as follows: each user is denoted as a node and there are 3 slices in this network; at each slice t, we set the entry of the similarity matrix Aij = 1 if nodes i and j have more than 3 interactions, and Aij = 0 otherwise. The ground truth is given by individual attributes based on survey data. We consider two classes of annotated data: "year-school", the year of study the user was in, and "floor", the sector of the dorm building in which the user's apartment was located.
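The thresholding rule for each slice can be sketched as follows. This is only an illustration: the function name `threshold_layer`, the dictionary of per-pair interaction counts and the assumption that interactions are treated as undirected are not taken from the original data pipeline.

```python
def threshold_layer(counts, n_nodes, min_interactions=3):
    """Build the 0/1 matrix A of one slice from interaction counts.

    counts : dict mapping a node pair (i, j) to its number of interactions
             (hypothetical input format, pairs treated as undirected)
    A[i][j] = 1 if the pair has MORE than min_interactions interactions.
    """
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for (i, j), c in counts.items():
        if i != j and c > min_interactions:
            A[i][j] = A[j][i] = 1
    return A


# Toy slice with three users: only pairs with more than 3 interactions are linked.
calls = {(0, 1): 5, (1, 2): 3, (0, 2): 4}
A_calls = threshold_layer(calls, n_nodes=3)
```

Applying the same function to the Calls, SMS and Proximity counts would give the three slices of the multiplex network.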
As shown in Figure 7, we give the results of StaNMF, ComNMF, C^3 and Multislice on this multiplex network for the two ground truths, based on NMI and error. It is clear that our C^3 model and Multislice perform better in general.
A KIT email temporal network
This temporal network is constructed from email senders, recipients and their interactions over time (data available at www.iti.kit.edu/projects/spp1307/). Each email sender or recipient is a node, and interactions are the edges. Senders or recipients belong to the same community if they are guided by the same supervisor. We select the data with time stamps ranging from July 2007 to December 2009; furthermore, we divide these data into temporal networks at four resolutions: 2, 3, 4 and 6 months. So we have 4 temporal networks with 24, 16, 12, and 8 snapshots, respectively. To keep the number of nodes and communities the same across snapshots, we select only the nodes and communities that occur in all the snapshots of each of the 4 temporal networks.
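The construction above, binning time-stamped emails into fixed-length windows and keeping only the nodes present in every snapshot, can be sketched as follows. The event format (year, month, sender, recipient) and the function names `build_snapshots` and `restrict_to_common_nodes` are hypothetical and chosen only for this example.

```python
def build_snapshots(events, start, window_months):
    """Group email events into consecutive windows of window_months months.

    events : iterable of (year, month, sender, recipient) tuples
    start  : (year, month) of the first month of the first window
    Returns a list of edge sets, one per window (possibly empty).
    """
    binned = {}
    for year, month, sender, recipient in events:
        offset = (year - start[0]) * 12 + (month - start[1])
        if offset < 0 or sender == recipient:
            continue
        edge = (min(sender, recipient), max(sender, recipient))
        binned.setdefault(offset // window_months, set()).add(edge)
    if not binned:
        return []
    return [binned.get(k, set()) for k in range(max(binned) + 1)]


def restrict_to_common_nodes(snapshots):
    """Keep only the nodes, and the edges among them, seen in every snapshot."""
    node_sets = [{node for edge in snap for node in edge} for snap in snapshots]
    common = set.intersection(*node_sets) if node_sets else set()
    filtered = [{(u, v) for u, v in snap if u in common and v in common}
                for snap in snapshots]
    return common, filtered


# Toy example with a 2-month resolution.
events = [(2007, 7, "a", "b"), (2007, 8, "c", "b"),
          (2007, 9, "b", "a"), (2007, 10, "a", "d")]
snaps = build_snapshots(events, start=(2007, 7), window_months=2)
common, filtered = restrict_to_common_nodes(snaps)
```

Changing `window_months` to 3, 4 or 6 gives the other resolutions; coarser windows yield fewer snapshots and usually more common nodes.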
As shown in Figure 8, the ComNMF and C^3 methods achieve better performance than the other methods in general.
A DBLP example
To exhibit the community evolution of temporal networks and verify the rationality of the assumption of our proposed model, namely the common cluster structure, we analyze a DBLP dataset (http://dblp.uni-trier.de/db/) as our case study. The data contain the co-authorship relations among the papers in 28 conferences over 5 years, which come from three main research areas in computer science: data mining (DM), databases (DB) and artificial intelligence (AI). We construct this temporal network as follows: each author is denoted as one node, and if two authors collaborate on one or more papers, there is an edge between the corresponding nodes. We set the number of snapshots of this network to S = 5, one per year, and we keep the authors who appear in all the snapshots, which gives this temporal network a constant number of nodes. The numbers of nodes and communities of this network are thus 632 and 3, respectively.
Here we give the performance of our proposed method, as shown in Figures 9 and 10. As we have denoted, W represents the common cluster structure of the network, and each entry Wik represents the importance of node i in community k. We sort each column of W and select the top 20 authors in each community. As shown in Figure 9, each circle represents one author, with the author's name inside it; the navy blue, medium blue and light blue circles represent the top 20 authors in the three communities, respectively. The size of a circle represents the author's importance. It is easy to see that the 3 communities correspond to artificial intelligence, data mining and databases, respectively. Michael I. Jordan has the largest circle in the artificial intelligence community, and Jiawei Han and Philip S. Yu have the highest and second-highest importance in data mining. Based on the community structure found by the proposed method, we analyze the evolution of the three communities, as shown in Figure 10. The left subfigure shows the transition probabilities among the three communities between two consecutive snapshots, and the right subfigure shows the evolution of the communities, plotted with the tool in [58].
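The ranking step used for Figure 9, sorting each column of W and keeping the highest-scoring nodes, can be sketched as follows. The function name `top_nodes_per_community`, the list-of-rows representation of W and the toy values are assumptions made for this example and do not come from the DBLP results.

```python
def top_nodes_per_community(W, names, top=20):
    """Return, for every community k, the `top` nodes with the largest W[i][k].

    W     : list of rows, W[i][k] is the importance of node i in community k
    names : list of node names, names[i] belongs to row i of W
    """
    n_comm = len(W[0])
    ranking = []
    for k in range(n_comm):
        # Sort node indices by decreasing importance in community k.
        order = sorted(range(len(W)), key=lambda i: W[i][k], reverse=True)
        ranking.append([(names[i], W[i][k]) for i in order[:top]])
    return ranking


# Toy example with three nodes and two communities.
W_toy = [[0.9, 0.1],
         [0.2, 0.7],
         [0.5, 0.6]]
ranking = top_nodes_per_community(W_toy, ["a", "b", "c"], top=2)
```

In a plot such as Figure 9, the returned scores would determine the circle sizes and the community index would determine the color.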
6. Conclusion and discussion
In this paper, we have introduced the Constrained Common Cluster based model (C^3 model) to detect the community structure in temporal and multiplex networks. The key assumption of the proposed model is that there are common cluster structures in the networks. We have also derived an iterative algorithm to optimize the proposed model, which converges to a stable local optimum. We have tested the model on a variety of generated and real world networks and compared it with some widely used methods, showing its competitive performance in community detection.
In detail, in line with our intuition, the common cluster is a useful constraint for community detection in temporal or multiplex networks. Compared with widely used methods such as static nonnegative matrix factorization, Multislice, FaceNet and DYNMOGA, both the ComNMF and the C^3 model not only show competitive performance on the computer generated data and real world networks, but also have a principled, generative and theoretical interpretation. Based on the assumption, the proposed methods can identify the importance of each node in the networks, as shown in the DBLP example. Also of note, we can analyze the evolution of the temporal communities with the help of other methods, as in the experiments.
In fact, our proposed method is designed only to detect communities at each snapshot of temporal networks or each slice of multiplex networks based on the common cluster structure; it ignores the temporal order of the snapshots of temporal networks and the differences between the slices of multiplex networks. Instead, we analyze the evolution of the temporal communities with other methods and validate the multiplex communities against different ground truths. We hope to address this problem better in the future.
In addition, there are a number of directions in which the proposed model could be extended. Firstly, we assume that there is a fixed number of nodes in each slice or snapshot of the network, which is often unrealistic in real networks; how can this assumption be relaxed, especially for temporal networks? Secondly, the balance parameters on each slice or snapshot are the same, which makes them easy to set; how to set these parameters automatically based on the observed networks is an interesting direction for the future. Lastly and most importantly, how should the number of communities be set, and how can the model be extended to deal with networks that have different numbers of communities in different snapshots or slices? Many methods for determining the number of communities in static networks have been proposed, and extending these methods to multiplex and temporal networks in a principled way will be our next work.
Acknowledgment
This work was supported by the Major Project of National Social Science Fund (14ZDB153) and the major research plan of the National Natural Science Foundation (91224009, 51438009).
References
[1] M. E. J. Newman, Communities, modules and large-scale structure in networks, Nature Physics 8 (1) (2011) 25–31.
[2] Y.-Y. Ahn, J. P. Bagrow, S. Lehmann, Link communities reveal multiscale complexity in networks, Nature 466 (7307) (2010) 761–764.
[3] M. Girvan, M. E. J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences (12) (2001) 7821–7826. arXiv:PMC122977.
[4] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborova, P. Zhang, Spectral redemption in clustering sparse networks, Proceedings of the National Academy of Sciences 110 (52) (2013) 20935–20940.
[5] M. E. J. Newman, Modularity and community structure in networks, Proceedings of the National Academy of Sciences 103 (23) (2006) 8577–8582.
[6] B. Ball, B. Karrer, M. E. J. Newman, Efficient and principled method for detecting communities in networks, Physical Review E 84 (3) (2011) 036103–13.
[7] B. Karrer, M. E. J. Newman, Stochastic blockmodels and community structure in networks, Phys. Rev. E 83 (2011) 016107. doi:10.1103/PhysRevE.83.016107. URL https://link.aps.org/doi/10.1103/PhysRevE.83.016107
[8] F. Wang, T. Li, X. Wang, S. Zhu, C. Ding, Community discovery using nonnegative matrix factorization, Data Mining and Knowledge Discovery 22 (3) (2010) 493–521.
[9] J. Yang, J. Leskovec, Overlapping community detection at scale: a nonnegative matrix factorization approach, WSDM (2013) 587–596.
[10] I. Psorakis, S. Roberts, M. Ebden, B. Sheldon, Overlapping community detection using Bayesian non-negative matrix factorization, Physical Review E 83 (6) (2011) 066114–9.
[11] S. Fortunato, Community detection in graphs, Physics Reports 486 (3-5) (2010) 75–174.
[12] S. Fortunato, D. Hric, Community detection in networks: A user guide, Physics Reports 659 (2016) 1–44.
[13] F. D. Malliaros, M. Vazirgiannis, Clustering and community detection in directed networks: A survey, Physics Reports (2013) 1–48.
[14] P. Holme, J. Saramäki, Temporal networks, Physics Reports 519 (3) (2012) 97–125.
[15] S. Boccaletti, G. Bianconi, R. Criado, C. I. del Genio, J. Gómez-Gardeñes, M. Romance, I. Sendiña-Nadal, Z. Wang, M. Zanin, The structure and dynamics of multilayer networks, Physics Reports 544 (1) (2014) 1–122.
[16] M. X. Li, Z. Q. Jiang, W. J. Xie, S. Miccichè, M. Tumminello, A comparative analysis of the statistical properties of large mobile phone calling networks, Scientific Reports (2014) 5132.
[17] C. Aggarwal, K. Subbian, Evolutionary Network Analysis, ACM Computing Surveys 47 (1) (2014) 1–36.
[18] M. Kivela, A. Arenas, M. Barthelemy, J. P. Gleeson, Y. Moreno, M. A. Porter, Multilayer networks, Journal of Complex Networks 2 (3) (2014) 203–271.
[19] A. Sapienza, A. Panisson, J. Wu, L. Gauvin, C. Cattuto, Detecting Anomalies in Time-Varying Networks Using Tensor Decomposition, in: 2015 IEEE International Conference on Data Mining Workshop (ICDMW), IEEE, 2015, pp. 516–523.
[20] A. Sapienza, A. Panisson, J. Wu, L. Gauvin, C. Cattuto, Anomaly Detection in Temporal Graph Data - An Iterative Tensor Decomposition and Masking Approach, AALTD@PKDD/ECML.
[21] G. Palla, A.-L. Barabási, T. Vicsek, Quantifying social group evolution, Nature 446 (7136) (2007) 664–667.
[22] V. Nicosia, G. Bianconi, V. Latora, M. Barthelemy, Growing Multiplex Networks, Physical Review Letters 111 (5) (2013) 058701–17.
[23] H. Tong, S. Papadimitriou, J. Sun, P. S. Yu, C. Faloutsos, Colibri: fast mining of large static and dynamic graphs, KDD (2008) 686–694.
[24] C. Tantipathananandh, T. Y. Berger-Wolf, Constant-factor approximation algorithms for identifying dynamic communities, KDD (2009) 827–836.
[25] Y. Chi, X. Song, D. Zhou, K. Hino, B. L. Tseng, Evolutionary spectral clustering by incorporating temporal smoothness, KDD (2007) 153–162.
[26] F. Folino, C. Pizzuti, An Evolutionary Multiobjective Approach for Community Discovery in Dynamic Networks, IEEE Transactions on Knowledge and Data Engineering 26 (8) (2014) 1838–1852.
[27] W. Tang, Z. Lu, I. S. Dhillon, Clustering with Multiple Graphs, in: 2009 Ninth IEEE International Conference on Data Mining (ICDM), IEEE, 2009, pp. 1016–1021.
[28] Z. Zeng, J. Wang, L. Zhou, G. Karypis, Out-of-core coherent closed quasi-clique mining from large dense graph databases, ACM Transactions on Database Systems 32 (2) (2007) 13.
[29] J. Xie, S. Kelley, B. K. Szymanski, Overlapping community detection in networks, ACM Computing Surveys 45 (4) (2013) 1–35.
[30] J. Sun, C. Faloutsos, S. Papadimitriou, P. S. Yu, GraphScope - parameter-free mining of large time-evolving graphs, KDD (2007) 687.
[31] H. Ning, W. Xu, Y. Chi, Y. Gong, T. S. Huang, Incremental spectral clustering by efficiently updating the eigen-system, Pattern Recognition 43 (1) (2010) 113–127.
[32] R. Aktunc, I. H. Toroslu, M. Ozer, H. Davulcu, A Dynamic Modularity Based Community Detection Algorithm for Large-scale Networks, in: the 2015 IEEE/ACM International Conference, ACM Press, New York, New York, USA, 2015, pp. 1177–1183.
[33] T. N. Dinh, N. P. Nguyen, M. T. Thai, An adaptive approximation algorithm for community detection in dynamic scale-free networks, in: INFOCOM, 2011 Proceedings IEEE, IEEE, 2013, pp. 55–59.
[34] D. Chakrabarti, R. Kumar, A. Tomkins, Evolutionary clustering, KDD (2006) 554–560.
[35] H. Sun, J. Huang, X. Zhang, J. Liu, D. Wang, H. Liu, J. Zou, Q. Song, IncOrder: Incremental density-based community detection in dynamic networks, Knowledge-Based Systems 72 (C) (2014) 1–12.
[36] Y.-R. Lin, Y. Chi, S. Zhu, H. Sundaram, B. L. Tseng, Facetnet: a framework for analyzing communities and their evolutions in dynamic networks, WWW (2008) 685–694.
[37] L. Gauvin, A. Panisson, C. Cattuto, Detecting the community structure and activity patterns of temporal networks: A non-negative tensor factorization approach, PLOS ONE 9 (1) (2014) 1–13. doi:10.1371/journal.pone.0086028. URL https://doi.org/10.1371/journal.pone.0086028
[38] K. Nowicki, T. A. B. Snijders, Estimation and Prediction for Stochastic Blockstructures, Journal of the American Statistical Association 96 (455) (2001) 1077–1087.
[39] T. Yang, Y. Chi, S. Zhu, Y. Gong, R. Jin, Detecting communities and their evolutions in dynamic social networks - a Bayesian approach, Machine Learning 82 (2) (2010) 157–189.
[40] K. S. Xu, A. O. Hero, Dynamic Stochastic Blockmodels for Time-Evolving Social Networks, IEEE Journal of Selected Topics in Signal Processing 8 (4) (2014) 552–562.
[41] C. Matias, V. Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, arXiv.org.
[42] J. Gao, J. Han, J. Liu, C. Wang, Multi-View Clustering via Joint Nonnegative Matrix Factorization, SDM (2013) 252–260.
[43] X. Dong, P. Frossard, P. Vandergheynst, N. Nefedov, Clustering With Multi-Layer Graphs - A Spectral Perspective, IEEE Trans. Signal Processing 60 (11) (2012) 5820–5831.
[44] J. Kim, J.-G. Lee, Community Detection in Multi-Layer Graphs: A Survey, SIGMOD 44 (3) (2015) 37–48.
[45] P. J. Mucha, Community structure in time-dependent, multiscale, and multiplex networks (vol 328, pg 876, 2010), Science, 2010.
[46] D. Cai, X. He, J. Han, T. S. Huang, Graph Regularized Nonnegative Matrix Factorization for Data Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (8) (2011) 1548–1560.
[47] D. D. Lee, H. S. Seung, Algorithms for Non-negative Matrix Factorization, NIPS (2000) 556–562.
[48] M. A. T. Figueiredo, J. M. Bioucas-Dias, R. D. Nowak, Majorization-minimization algorithms for wavelet-based image restoration, IEEE Transactions on Image Processing 16 (12) (2007) 2980–2991. doi:10.1109/TIP.2007.909318.
[49] D. Jin, Z. Chen, D. He, W. Zhang, Modeling with Node Degree Preservation Can Accurately Find Communities, AAAI.
[50] D. Jin, H. Wang, J. Dang, D. He, W. Zhang, Detect Overlapping Communities via Ranking Node Popularities, AAAI (2016) 172–178.
[51] C. Fevotte, J. Idier, Algorithms for nonnegative matrix factorization with the β-divergence, Neural Computation 23 (9) (2011) 2421–2456.
[52] W. Wang, P. Jiao, D. He, D. Jin, L. Pan, B. Gabrys, Autonomous overlapping community detection in temporal networks - A dynamic Bayesian nonnegative matrix factorization approach, Knowl.-Based Syst. 110 (2016) 121–134.
[53] M. E. J. Newman, M. Girvan, Finding and evaluating community structure in networks, Physical Review E 69 (2) (2004) 026113–15.
[54] Y.-R. Lin, Y. Chi, S. Zhu, H. Sundaram, B. L. Tseng, Analyzing communities and their evolutions in dynamic social networks, ACM Transactions on Knowledge Discovery from Data 3 (2) (2009) 1–31.
[55] D. Greene, D. Doyle, P. Cunningham, Tracking the Evolution of Communities in Dynamic Social Networks, in: 2010 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2010), IEEE, 2010, pp. 176–183.
[56] C. Granell, R. K. Darst, A. Arenas, S. Fortunato, S. Gómez, Benchmark model to assess community structure in evolving networks, Physical Review E 92 (1) (2015) 012805–8.
[57] W. Dong, B. Lepri, A. S. Pentland, Modeling the co-evolution of behaviors and social relationships using mobile phone data, in: the 10th International Conference, ACM Press, New York, New York, USA, 2011, pp. 134–143.
[58] R. Mall, R. Langone, J. A. K. Suykens, Netgram: Visualizing Communities in Evolving Networks, PLoS ONE 10 (9) (2015) e0137502–24.
Biography
Pengfei Jiao received the B.S. degree from Hainan University, Haikou, China, in 2012. He is currently pursuing the Ph.D. degree at the School of Computer Science and Technology, Tianjin University, Tianjin, China. His current research interests include dynamic complex network analysis, data mining, and machine learning.
Wenjun Wang received the Ph.D. degree from Peking University, Beijing, China, in 2004. He is currently a Professor with the School of Computer Science and Technology, Tianjin University, Tianjin, China. His research interests include computational social science, emergency management, large-scale data mining and network science.
Di Jin received the B.S., M.S., and Ph.D. degrees in computer science from Jilin University, Changchun, China, in 2005, 2008, and 2012, respectively. He was a Post-Doctoral Research Fellow at the School of Design, Engineering, and Computing, Bournemouth University, Poole, U.K., from 2013 to 2014. He is an Assistant Professor with the School of Computer Science and Technology, Tianjin University, Tianjin, China. He has published over 30 international journal articles and conference papers. His current research interests include data mining, complex network analysis, and machine learning.
Figure 2: The experimental results on the multiplex Girvan-Newman synthetic networks; each subfigure shows the performance based on NMI and error, respectively, plotted against t. The network size is 128 and the average degree is 16. (a): zin = 7, r = 0.7; (b): zin = 7, r = 0.8; (c): zin = 8, r = 0.7; (d): zin = 8, r = 0.8. The black lines represent the results of our ComNMF method, the red lines represent those of the C^3 model proposed in this paper, and the other two lines represent the StaNMF and Multislice methods, respectively. Error bars show the standard deviations estimated over 20 runs for all the methods.
Figure 3: The experimental results on the temporal Girvan-Newman synthetic networks; each subfigure shows the performance based on NMI and error, respectively, plotted against the snapshot t. The network size is 128 and the average degree is 20. (a): z = 5, nc = 3; (b): z = 5, nc = 9; (c): z = 6, nc = 3; (d): z = 6, nc = 9. The red lines represent the results of the C^3 model proposed in this paper, the black lines represent those of the ComNMF method, and the other four lines represent the StaNMF, FaceNet, DYNMOGA and Multislice methods, respectively. Error bars show the standard deviations estimated over 20 runs for all the methods.
Figure 4: The experimental results on the temporal switch synthetic networks; each subfigure shows the performance based on NMI and error, respectively, plotted against the snapshot t. The network size is 1000 and the average degree is 15. (a): mu = 0.7, p = 0.1; (b): mu = 0.7, p = 0.8; (c): mu = 0.8, p = 0.1; (d): mu = 0.8, p = 0.8. The red lines represent the results of the C^3 model proposed in this paper, the black lines represent those of the ComNMF method, and the other four lines represent the StaNMF, FaceNet, DYNMOGA and Multislice methods, respectively. Error bars show the standard deviations estimated over 20 runs for all the methods.
Figure 5: The experimental results on the temporal Grow-shrink synthetic networks; each subfigure shows the performance based on NMI and error, respectively, plotted against the snapshot t. (a): the size of each snapshot is 256, pout = 0.24; (b): the size of each snapshot is 256, pout = 0.28; (c): the size of each snapshot is 512, pout = 0.24; (d): the size of each snapshot is 512, pout = 0.28. The red lines represent the results of the C^3 model proposed in this paper, the black lines represent those of the ComNMF method, and the other four lines represent the StaNMF, FaceNet, DYNMOGA and Multislice methods, respectively. Error bars show the standard deviations estimated over 20 runs for all the methods.
Figure 6: The experimental results on the temporal Grow-shrink synthetic networks with varying pout; each subfigure shows the performance based on NMI and error, respectively. (a): the size of each snapshot is 256; (b): the size of each snapshot is 512. The red lines represent the results of the C^3 model proposed in this paper, the black lines represent those of the ComNMF method, and the other four lines represent the StaNMF, FaceNet, DYNMOGA and Multislice methods, respectively. Error bars show the standard deviations estimated over all the snapshots for all the methods.
Figure 7: The results on the MIT Reality multiplex network with two classes of ground truth for the StaNMF, ComNMF, C^3 and Multislice methods. The left panel shows the performance based on NMI, and the right panel gives the error results. The white and the light purple bars show the performance of the different methods based on "year-school" and "floor", respectively. Error bars show the standard deviations estimated over all the snapshots for all the methods.
Figure 8: The experimental results on the KIT email temporal networks based on NMI and error, plotted against the snapshot t. (a): the temporal network with 24 snapshots, where the numbers of nodes and communities are 138 and 23, respectively; (b): the temporal network with 16 snapshots, where the numbers of nodes and communities are 170 and 25, respectively; (c): the temporal network with 12 snapshots, where the numbers of nodes and communities are 195 and 25, respectively; (d): the temporal network with 8 snapshots, where the numbers of nodes and communities are 231 and 27, respectively. The red lines represent the results of the C^3 model proposed in this paper, the black lines represent those of the ComNMF method, and the other four lines represent the StaNMF, FaceNet, DYNMOGA and Multislice methods, respectively. Error bars show the standard deviations estimated over all the snapshots for all the methods.
Figure 9: The importance representation of the top 20 authors in 3 communities. The navy blue, medium blue, light blue circles represent the communities of AI, DM, and DB.
Figure 10: The evolution of communities in the DBLP data. (a): the transition probabilities among the three communities (C1, C2, C3) between two consecutive snapshots; (b): the evolution of the communities over the snapshots T1 to T5.