Defining quality metrics for graph clustering evaluation


Accepted Manuscript

Defining Quality Metrics for Graph Clustering Evaluation
Anupam Biswas, Bhaskar Biswas

PII: S0957-4174(16)30633-9
DOI: 10.1016/j.eswa.2016.11.011
Reference: ESWA 10982

To appear in: Expert Systems With Applications

Received date: 16 June 2016
Revised date: 24 September 2016
Accepted date: 6 November 2016

Please cite this article as: Anupam Biswas, Bhaskar Biswas, Defining Quality Metrics for Graph Clustering Evaluation, Expert Systems With Applications (2016), doi: 10.1016/j.eswa.2016.11.011

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Highlights

• Proposed a set of three quality metrics: AVI, AVU and ANUI.
• Roughly four kinds of analysis have been performed.
• AVI, AVU and ANUI together can give a better indication of accuracy.
• These metrics satisfy all six quality-related properties.
• Linearity in the characteristics of our metrics can also give an indication of accuracy.

Defining Quality Metrics for Graph Clustering Evaluation

Anupam Biswasᵃ,∗, Bhaskar Biswasᵃ

ᵃ Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, India

Abstract

Evaluation of clustering is important in many applications of expert and intelligent systems. Clusters are evaluated in terms of quality and accuracy. Measuring quality is an unsupervised approach that depends entirely on edges, whereas measuring accuracy is a supervised approach that measures the similarity between the real clustering and the predicted clustering. Accuracy cannot be measured for most real-world networks, since the real clustering is unavailable. Thus, from the viewpoint of expert systems, it is advantageous to develop a quality metric that can assure a certain level of accuracy along with the quality of clustering. In this paper we propose a set of three quality metrics for graph clustering that can ensure accuracy along with quality. The effectiveness of the metrics has been evaluated on benchmark graphs as well as on real-world networks and compared with existing metrics. The results indicate the competency of the suggested metrics in dealing with accuracy, which will improve decision-making in expert and intelligent systems. We also show that our metrics satisfy all six quality-related properties.

Keywords: Graph clustering, Community detection, Social network analysis, Quality and accuracy measures

1. Introduction

Graphical knowledge representation of a complex system makes the system easier to understand and allows its data to be analyzed intelligently. Identification of object patterns is a prominent intermediate step in the analysis and exploration of complex systems (Lorentz et al., 2016; Niu et al., 2014; Newman, 2003). These objects could be pixels in various applications of image processing and pattern recognition (Kim et al., 2014; Singh et al., 2016), persons in a social network (Xu et al., 2015; Han et al., 2016; Colladon & Remondi, 2016), genes in genetic analysis (Steinhaeuser & Chawla, 2010), proteins in a protein-protein interaction network (Wang et al., 2011; Hao et al., 2016), locations in a supply chain network (Yang et al., 2016), and so on. These objects, along with the connections between them, are often studied as a network of objects. Identifying the connectivity patterns of objects within the network has become a very challenging task as the complexity of networks grows. Clustering techniques ease the understanding of such complex systems by discovering patterns within the system (Kim et al., 2014; Kanawati, 2011; Xu et al., 2007; Steinhaeuser & Chawla, 2010; Wang et al., 2011). From the viewpoint of expert systems, the process of clustering can be treated as unsupervised learning of connectivity patterns in the network. Clustering has a significant role in understanding the relationships among the objects of the network and their functions in a complex system. The main objective of clustering is to maximize the similarity within clusters while minimizing the similarity between different clusters. Generally, dense connectivity represents high similarity. Clustering in networks is often referred to as graph clustering.

Evaluation of graph clustering techniques often confronts issues related to quality and accuracy (Traag et al., 2011; Almeida et al., 2011; Ali & Couillet, 2016; Creusefond et al., 2016). The quality of a clustering indicates the impact of assigning the objects of the network to different clusters. Quality metrics are designed to generate a score that depends both on the network structure and on the connectivity among the objects of clusters (Yang & Leskovec, 2015; Emmons et al., 2016; Leskovec et al., 2010). Accuracy, on the other hand, deals with the correctness of the clustering obtained with a graph clustering technique. Where ground truth is available, all objects of the network are checked one by one against the real clustering to measure the accuracy of the predicted clustering. Accuracy metrics generate a score through such inspections (Rand, 1971; Strehl & Ghosh, 2002; Manning et al., 2008; Kou et al., 2014). Both measures are similar to the internal and external validations performed in simple data clustering (clustering not in a graph or network) (Milligan, 1981; Wu et al., 2009; Liu et al., 2013; Jiang et al., 2013). However, unlike graph clustering evaluation, simple data clustering does not consider the connectivity among objects for internal validation (or measuring quality), since there are no defined links among the objects. Moreover, graph clustering evaluation is influenced by numerous factors, which include the graph clustering algorithm, the network structure, and the connectivity among the objects of different clusters. The impact of these factors emerges during evaluation in the quality and accuracy of the predicted clustering.

Most real-world networks obtained from the graphical representation of objects are very large, complex, and of unknown structure. Ground truths for these networks are not available and, as a result, accuracy cannot be measured directly. In such cases, accuracy has to be assured indirectly, without knowing the real clustering. Quality metrics are very helpful in such situations, as they can generate scores without knowing the real clustering. However, most quality metrics are biased towards good quality and fail to assure the required level of accuracy (Almeida et al., 2011; Orman et al., 2012). An example of such a trade-off is shown in Figure 1. Thus, an effective quality metric should be able to diminish the trade-off between the quality and accuracy of graph clustering.

∗ Corresponding author. Email addresses: [email protected] (Anupam Biswas), [email protected] (Bhaskar Biswas)
Preprint submitted to Expert Systems with Applications, November 12, 2016
Considering both these facts, a set of three quality metrics is proposed in this paper. Unlike most existing quality metrics, which are developed based on dense connectivity, the proposed metrics are designed based on two properties of social community formation: unification and isolation. The results on benchmark graphs and real-world networks show the effectiveness, competitiveness, and efficiency of the proposed metrics in indicating accuracy along with quality. Moreover, the proposed metrics satisfy all six standard properties suggested by Van Laarhoven & Marchiori (2014) for quality metrics.

The rest of the paper is organized as follows. Section 2 briefly presents preliminary definitions and notations. Section 3 discusses various quality and accuracy metrics along with their drawbacks. Section 4 presents the proposed quality metrics and derives some of their properties. Section 5 explains the six standard properties suggested by Van Laarhoven & Marchiori (2014) for quality metrics and shows that the proposed quality metrics satisfy all of them. Section 6 discusses the performance of the proposed quality metrics on synthetic and real-world graphs. Section 7 gives concluding remarks and future extensions.

Figure 1: Trade-off between quality and accuracy. (a) Predicted clustering: clustering obtained with the Leaders Identification for Community Detection (LICOD) algorithm (Kanawati, 2011) on the Karate network (σ = 0.7, δ = 0.28 and ε = 0.0). (b) Real clustering of the Karate network. Clearly, the clustering produced by LICOD is completely different from the real clustering. However, the quality of the predicted clustering seems to be higher.

2. Definitions and Notations

An undirected weighted graph G is a pair (V, E) of a finite set V of nodes and a finite set E ⊆ V × V of connections, where a function δE : V × V → ℝ≥0 defines the strengths or weights of connections such that δE(u, v) = δE(v, u) for all (u, v) ∈ E. If the connection set is not specified, the strength of connection (u, v) is written simply as δ(u, v). For unweighted graphs, the function is defined as δE : V × V → {0, 1}, where 0 represents the absence and 1 the presence of a connection. Self-loops are allowed in both weighted and unweighted graphs. Note that the only difference between weighted and unweighted graphs is that the presence of a connection is marked with 1 in an unweighted graph, whereas it can be any positive real value in a weighted graph.

A non-overlapping clustering C = {C1, ..., Ck} of graph G = (V, E) is a partitioning of G into a non-empty set of subgraphs, known as clusters, such that

• any Ci ∈ C is a graph Ci = (Vi, Ei) with nodes Vi ⊆ V and connections Ei ⊆ E,
• for all u, v ∈ Vi, if there exists a connection (u, v) ∈ E then (u, v) ∈ Ei,
• δEi(u, v) = δE(u, v) for all (u, v) ∈ Ei,
• V1 ∪ V2 ∪ ... ∪ Vk = V, and
• Vi ∩ Vj ≠ ∅ if and only if Vi = Vj, for all Ci, Cj ∈ C.

If a node u ∈ Vi of cluster Ci ∈ C, we write simply u ∈ Ci; otherwise u ∉ Ci. When two nodes u and v are in the same cluster Ci ∈ C, we write u ∈Ci v. Otherwise we write u ∉Ci v.
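As an illustration of the definitions above, the following minimal Python sketch represents an undirected weighted graph as a dictionary of connection strengths and checks the conditions of a non-overlapping clustering. The function names `is_valid_clustering` and `induced_connections`, the edge-dictionary representation, and the small example graph are assumptions made for illustration; they are not part of the original formulation.

```python
from itertools import combinations

def is_valid_clustering(nodes, clusters):
    """Check that clusters are non-empty, pairwise disjoint, and cover V."""
    if not clusters or any(len(c) == 0 for c in clusters):
        return False
    if any(a & b for a, b in combinations(clusters, 2)):
        return False
    return set().union(*clusters) == set(nodes)

def induced_connections(edges, cluster):
    """Return E_i: every connection of E with both endpoints in the cluster,
    keeping the original strengths (delta_Ei equals delta_E on E_i)."""
    return {(u, v): w for (u, v), w in edges.items()
            if u in cluster and v in cluster}

# Hypothetical example: each undirected connection is stored once
# as (u, v) -> strength.
nodes = {1, 2, 3, 4, 5}
edges = {(1, 2): 1.0, (2, 3): 2.0, (1, 3): 1.0, (3, 4): 0.5, (4, 5): 3.0}
clusters = [{1, 2, 3}, {4, 5}]

print(is_valid_clustering(nodes, clusters))
print(induced_connections(edges, clusters[1]))
```

Storing each undirected connection once keeps the symmetry condition δ(u, v) = δ(v, u) implicit; a symmetric adjacency matrix would be an equally reasonable choice.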
3. Related Work

Previous works on quality metrics have primarily focused on dense and sparse connectivity among the objects of a network (Yang & Leskovec, 2015; Lin et al., 2007; Casas-Roma et al., 2015; Jo & Lee, 2007; Aldecoa & Marín, 2013). The most widely used quality metric, Modularity, is measured on the basis of connections between nodes of the same cluster (Newman & Girvan, 2004). It measures the fraction of connections in the network that connect nodes in the same cluster, minus the expected value of that fraction if the connections were placed at random. Similarly, the Coverage of a clustering is the fraction of all connections in the network that connect two nodes of the same cluster (Brandes et al., 2003). Along the same line of work, another quality metric called Clustering Index (CI) is proposed in (Jo & Lee, 2007). Distance-based measures are also encouraging for evaluating clusters (Lin et al., 2007; Zaidi et al., 2010). The objectives of these metrics differ from the natural process of cluster formation. This is likely the reason for the inconsistent and inaccurate indications of these metrics (Good et al., 2010; Almeida et al., 2011; Traag et al., 2011). In contrast, the objective of the proposed metrics is defined based on natural community formation in social networks. In addition, the proposed approach also deals with connectivity and incorporates a strategy quite similar to that of Coverage, with different fractions of connectivity. The notion of balancing positive and negative components (Zaidi et al., 2010) is also used in the proposed quality metrics.

Several accuracy metrics have been proposed, ranging from information-theoretic approaches to mathematical indexing. Some popular accuracy measures, such as the Adjusted Rand Index (ARI) (Rand, 1971), Normalized Mutual Information (NMI) (Strehl & Ghosh, 2002), F-measure and Purity (Manning et al., 2008), are still at the forefront of measuring the accuracy of clustering. Apart from these, a handful of other accuracy metrics are addressed in the literature (Alper Selver & Akay, 2009; Foggia et al., 2009; Kou et al., 2014; Chan et al., 2014). As mentioned above, these accuracy metrics require ground truth in order to evaluate a predicted clustering. The proposed metrics do not require ground truth, since they are designed primarily as quality metrics. Accuracy is ensured through the mechanism incorporated in the design of the metrics.

Several researchers have studied the properties and shortcomings of existing metrics and examined the correlation of metrics with the clustering process. Several shortcomings of existing quality metrics are addressed in (Almeida et al., 2011; Duan et al., 2014). The authors showed that quality metrics are biased towards good-quality clustering but fail to assure the required level of accuracy. They also showed that quality metrics do not behave as usual when cluster structures differ from the traditional one. Sileshi & Gamback (2009) and Orman et al. (2012) discussed several quality metrics and noted that cluster distribution has a very important role in the overall cluster structure. Delling et al. (2006) discussed the centrality and duality of quality metrics. With these two properties, the authors showed that random graphs can be generated corresponding to a quality metric, or vice versa. They also showed that clustering algorithms, quality metrics, and random graphs are different aspects of the same problem. These studies pointed out that quality metrics are deeply related to cluster structure and the distribution of clusters. The proposed metrics are analyzed in this respect, and their correlation with accuracy is established in Section 6.4. Moreover, the design of the proposed metrics also resolves the issues associated with accuracy.

Along with the study of issues related to various metrics and clustering, researchers have tried to bring all existing and new metrics under a single umbrella by defining a standard set of properties. Four criteria required for a good quality metric, Minimum Total Distance (C1), Separated Clusters (C2), Object Positioning (C3) and Number of Objects Correctly Positioned (C4), are proposed in (Dubes & Jain, 1979; Raskutti & Leckie, 1999). The first three criteria are distance-based, whereas the last one is based on a similarity measure. The design of axiom sets and properties was led by the inspiring work in (Kleinberg, 2003). Ackerman & Ben-David (2009) defined three different axioms for quality metrics. Van Laarhoven & Marchiori (2014) extended these axioms further and suggested six axioms or properties for quality metrics, termed Permutation invariance, Scale invariance, Richness, Monotonicity, Locality and Continuity. Interestingly, they showed that the most commonly used and widely accepted quality metric, Modularity, does not satisfy all of these properties. An axiomatic analysis incorporating these axioms is carried out in Section 5, where it is proved theoretically that the proposed metrics meet the qualitative requirement.
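For reference, the two connectivity-based metrics described in this section, Coverage and Modularity, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the edge-dictionary representation, the function names, and the example graph are illustrative, and the Modularity expression used here is the commonly cited per-cluster form (internal fraction minus squared degree fraction), not a formula quoted from this paper.

```python
def _labels(clusters):
    """Map every node to the index of its cluster."""
    return {u: i for i, c in enumerate(clusters) for u in c}

def coverage(edges, clusters):
    """Fraction of total connection strength that lies inside clusters."""
    label = _labels(clusters)
    total = sum(edges.values())
    intra = sum(w for (u, v), w in edges.items() if label[u] == label[v])
    return intra / total

def modularity(edges, clusters):
    """Sum over clusters of (internal fraction) - (degree fraction)^2.
    Assumed standard form; each undirected connection is stored once."""
    label = _labels(clusters)
    k = len(clusters)
    m = sum(edges.values())
    internal = [0.0] * k
    degree = [0.0] * k
    for (u, v), w in edges.items():
        degree[label[u]] += w
        degree[label[v]] += w
        if label[u] == label[v]:
            internal[label[u]] += w
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2 for i in range(k))

# Hypothetical example: two triangles joined by a single connection.
edges = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 1.0,
         (4, 5): 1.0, (5, 6): 1.0, (4, 6): 1.0,
         (3, 4): 1.0}
clusters = [{1, 2, 3}, {4, 5, 6}]

print(coverage(edges, clusters))
print(modularity(edges, clusters))
```

Both functions depend only on how much connection strength falls inside clusters, which is the dense-connectivity viewpoint that the proposed metrics depart from.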

4. Proposed Quality Metrics

This section explains the formation of social communities through social interactions and then describes the design of the proposed quality metrics, which incorporates the notion of social community formation into graph clustering.

4.1. Social Community Formation

In a social network, personal interest is one of the stimulators of interaction and communication between two persons. Those who share common or interrelated personal interests have a greater tendency to communicate with each other (Newman, 2003). Social connectivity grows on the foundation of such social interaction (Jackson, 2008; Chen et al., 2014; Abufouda & Zweig, 2014). Engagement in interactions among different persons determines how strongly or loosely they will be connected or related to each other. This notion of relationship is referred to as the mutual interest of both persons, induced by their personal interests (Biswas & Biswas, 2015). Higher mutual interest yields stronger relationships or connections. The strengths of such connections among persons are represented graphically by assigning relative weights, while persons are represented as objects. Persons who are strongly connected with each other form a group, cluster or community¹ by following the notion of unification of persons with common personal interests. On the other hand, weakly connected persons, or persons having different personal interests, detach from other persons by following the notion of isolation. Similarly, a group of persons having a common personal interest also isolates itself from persons having different personal interests, which implies the fundamental requisite of clustering, i.e. similarity within the cluster and dissimilarity among the clusters. Thus, the formation of a community or cluster in a network has to follow both the unification and isolation properties. These two properties are stated in the context of communities as follows:

Unification: Unification is a property that unites the members of smaller communities into one community. Two communities are unified into a single community if their members are significantly connected.

Isolation: Isolation is a property that isolates the members of a community from the rest of the network and integrates the members of the community. This means that the connectivity of a community with the rest of the network should be low and the connectivity within the community should be high.

Extrapolating these two properties, a set of three quality metrics is defined to evaluate the clusters predicted with any graph clustering technique. The design of these metrics is detailed below.

¹ Community (in terms of social network) and cluster (in terms of graph clustering) are used interchangeably throughout the paper.

4.2. Unifiability

The unification property implies a tendency to unify multiple communities into a single community. Unifiability measures this tendency for a cluster as a whole rather than for a single node. It incorporates the notion of unification of a cluster with other clusters by considering the personal interests or connections of all members of the cluster as a whole. Intuitively, the members of a cluster are expected to have connections with the rest of the network. Thus, members of two different clusters will have connections at the individual level. The collective strength of such connections gives the strength of the relationship between the two clusters. To unify two clusters, the strength of the relationship between the two has to be higher than their total strength of relationships with all clusters. Therefore, the Unifiability of cluster Ci with respect to another cluster Cj is measured as follows:

\[
\mathrm{Unifiability}(C_i, C_j) = \frac{\sum_{u \in C_i,\, v \in C_j} \delta(u, v)}{\sum_{u \in C_i,\, v \notin C_i} \delta(u, v) + \sum_{u \in C_j,\, v \notin C_j} \delta(u, v) - \sum_{u \in C_i,\, v \in C_j} \delta(u, v)} \tag{1}
\]

For an unweighted graph, where all connection strengths are 1, Unifiability is defined simply in terms of the number of connections:

\[
\mathrm{Unifiability}(C_i, C_j) = \frac{\left|\{(u, v) \mid u \in C_i \,\&\, v \in C_j\}\right|}{\left|\{(u, x), (v, y) \mid u \in C_i,\ x \notin C_i,\ v \in C_j,\ y \notin C_i, C_j\}\right|} \tag{2}
\]

where u and v are any two nodes of clusters Ci and Cj respectively, x is any neighbor node of cluster Ci, y is any neighbor node of cluster Cj excluding the nodes belonging to cluster Ci, and the pairs (u, x) and (v, y) represent connections between nodes u and x and between v and y respectively. Unifiability of cluster Ci with respect to all k clusters of clustering C is obtained as:

\[
\mathrm{Unifiability}(C_i) = \sum_{j=1}^{k} \mathrm{Unifiability}(C_i, C_j) \tag{3}
\]

The Unifiability of all clusters together represents the Unifiability of the clustering. The overall effect of the Unifiability of each cluster is obtained by averaging. For any predicted clustering C of graph G that comprises k clusters, AVerage Unifiability (AVU) is computed as follows:

\[
Q_{AVU}(G, C) = \frac{1}{k} \sum_{i=1}^{k} \mathrm{Unifiability}(C_i) \tag{4}
\]

Theorem 4.1. The range of AVU is [0, 1] for any connected network.

Proof. Consider a connected network of n nodes. If every node is connected to all remaining nodes, then every node has (n − 1) connections. For the best case, the network is partitioned into n clusters, so all connections are external to the clusters and each cluster has (n − 1) external connections. For any pair of clusters, Unifiability is 1/((n − 1) + (n − 1) − 1) = 1/(2n − 3). Since each node represents a cluster, there are (n − 1) clusters with which any given cluster can unify. Hence, the total Unifiability of any cluster is (n − 1) × 1/(2n − 3), and the total Unifiability of the clustering is n × (n − 1)/(2n − 3). Since the network contains n clusters, AVU is (1/n) × n × (n − 1)/(2n − 3) = (n − 1)/(2n − 3). Since Unifiability measures the ability of a cluster to unify with other clusters, there have to be at least two clusters, so n ≥ 2. For n ≥ 2, (n − 1) ≥ 1 and (2n − 3) ≥ 1, and (2n − 3) − (n − 1) = n − 2 ≥ 0, so (2n − 3) ≥ (n − 1). Hence (n − 1)/(2n − 3) ≤ 1.

The worst case for AVU in a clustering of a connected network arises when there is no direct connection between two clusters. This can happen only when clusters are connected in the form of a star graph or a circular graph. Since internal connections play no role in Unifiability, each cluster can be treated as a node. Because the graph is connected, a cluster that has no direct connection with some cluster is still connected to it via other clusters. That means that even in the worst-case scenario, every cluster has at least one external connection, through which it maintains connectivity with the rest of the network. Hence, the Unifiability of such a cluster pair (A, B) is 0/(1 + 1 − 0) = 0. Let C be the cluster through which clusters A and B are connected. The Unifiability of A and C will be at least 1/(1 + 2 − 1) = 1/2 > 0. Hence, even in the worst case, A's total Unifiability with all clusters is greater than 0. This implies that AVU is always greater than 0 for a clustering with at least two clusters.

It is clear from the definition of Unifiability that at least two clusters are required to measure the AVU of a clustering. However, a clustering algorithm sometimes identifies a single cluster for the entire network. In that case, there is nothing left for the cluster to unify with, since the cluster itself includes the entire network. Thus, it is intuitive that the Unifiability of the cluster should be zero. Therefore, even though Unifiability cannot be measured by definition, we suggest using the value 0 for AVU when the clustering contains only one cluster.
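The Unifiability and AVU computations of this subsection can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the pairwise value is taken as the strength shared by two clusters divided by their combined external strength with the shared strength counted once (the form used in the proof of Theorem 4.1), the sum for a cluster skips the cluster itself, and a zero denominator yields 0. The function names, the edge-dictionary representation, and the helper `complete_graph` are hypothetical.

```python
def external_strength(edges, cluster):
    """Total strength of connections with exactly one endpoint in the cluster."""
    return sum(w for (u, v), w in edges.items()
               if (u in cluster) != (v in cluster))

def shared_strength(edges, ci, cj):
    """Total strength of connections joining a node of ci to a node of cj."""
    return sum(w for (u, v), w in edges.items()
               if (u in ci and v in cj) or (u in cj and v in ci))

def unifiability(edges, ci, cj):
    """Pairwise Unifiability; returns 0 when the denominator vanishes (assumption)."""
    shared = shared_strength(edges, ci, cj)
    denom = external_strength(edges, ci) + external_strength(edges, cj) - shared
    return shared / denom if denom > 0 else 0.0

def avu(edges, clusters):
    """AVerage Unifiability; 0 for a single cluster, as suggested in the text."""
    k = len(clusters)
    if k < 2:
        return 0.0
    total = sum(unifiability(edges, ci, cj)
                for i, ci in enumerate(clusters)
                for j, cj in enumerate(clusters) if i != j)
    return total / k

def complete_graph(n):
    """Hypothetical helper: unweighted complete graph on nodes 0..n-1."""
    return {(i, j): 1.0 for i in range(n) for j in range(i + 1, n)}

# Singleton clusters on a complete graph, as in the proof of Theorem 4.1.
n = 4
edges = complete_graph(n)
clusters = [{i} for i in range(n)]
print(avu(edges, clusters), (n - 1) / (2 * n - 3))
```

The final line compares the computed AVU with the closed form (n − 1)/(2n − 3) derived in the proof, which gives a quick sanity check of the implementation for small n.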

4.3. Isolability

Isolability of a cluster is defined by incorporating the notion of isolation and the connection strength among the nodes of the network. Nodes are examined by considering the strength of their connections, in order to measure the overall ability of a cluster to isolate itself from the rest of the network; hence the name Isolability. Isolability of any cluster Ci is defined as follows:

\[
\mathrm{Isolability}(C_i) = \frac{\sum_{u \in_{C_i} v} \delta(u, v)}{\sum_{u \in_{C_i} v} \delta(u, v) + \sum_{u \in C_i,\, v \notin C_i} \delta(u, v)} \tag{5}
\]

where u and v are any two nodes and δ(u, v) represents the strength of the connection between nodes u and v. Isolability of a cluster is thus the ratio of the connection strength within the cluster to the total strength of connections associated with the cluster. For an unweighted graph, all connection strengths are taken as 1. Hence, Isolability of a cluster in an unweighted graph is the ratio of the number of connections within the cluster to the total number of connections associated with the cluster. This instance is very similar to the Relative Density measure (Schaeffer, 2007), where the ratio of internal to total node degree is considered instead of connectivity. Simplified Relative Density gives the same ratio as Isolability when all connection strengths are taken as 1. Hence, Relative Density can be treated as a special case of Isolability. Isolability of any cluster Ci for an unweighted graph is defined simply in terms of connections as follows:

\[
\mathrm{Isolability}(C_i) = \frac{\left|\{(u, v) \mid u \in_{C_i} v\}\right|}{\left|\{(u, v); (u, w) \mid u \in_{C_i} v \,\&\, w \notin C_i\}\right|} \tag{6}
\]

where u and v are any two nodes belonging to cluster Ci, w is a node outside the cluster, and the pairs (u, v) and (u, w) represent connections between nodes u and v and between u and w respectively. Measuring Isolability ensures higher connectivity among all nodes within the cluster. Similarly, the Isolability of all clusters present in the predicted clustering can be measured relative to each other. Averaging the Isolability of all clusters yields the overall connectivity of nodes with respect to their clusters. For any predicted clustering C of graph G that comprises k clusters, AVerage Isolability (AVI) is computed as follows:

\[
Q_{AVI}(G, C) = \frac{1}{k} \sum_{i=1}^{k} \mathrm{Isolability}(C_i) \tag{7}
\]

Theorem 4.2. The range of AVI is [0, 1] for any connected network.

Proof. Consider a connected network of n nodes. If every node is connected to all remaining nodes in the network, then at most n(n − 1)/2 connections are possible (excluding self-loops and multiple connections between two nodes). For the worst case, the network is clustered into n clusters, i.e. each node represents a cluster. Isolability of each cluster is 0, and AVI is also 0. For the best case, the entire network is clustered into a single cluster. Since every node is connected to all remaining nodes, the total internal connections will be n(n − 1)/2 and the external connections will be 0. Therefore, Isolability of the cluster will be (n(n − 1)/2) / (n(n − 1)/2 + 0) = (n − 1)/(n − 1) = 1. If the network contains x self-loops and y parallel connections between two nodes, then Isolability of the cluster will be (n(n − 1)/2 + x + y) / (n(n − 1)/2 + x + y + 0) = 1. Since the network contains only one cluster, AVI will also be 1. Hence, for a connected network AVI always lies between 0 and 1.

Theorem 4.3. In a connected network, it is not possible to attain the value 1 for both AVI and AVU at the same time.

Proof. In a connected network, AVU can attain the value 1 when there are only two clusters. In that case, AVI is always less than 1 because the network is connected: there has to be at least one connection between the nodes of the two clusters, so the numerator is always less than the denominator in Isolability. Conversely, AVI can attain the value 1 when there is only one cluster. In that case, AVU cannot be measured, since measuring Unifiability requires at least two clusters. For a single cluster, the entire network is included in one cluster, so there is nothing left to unify with. Thus, measuring AVU does not arise for a single cluster; instead the value 0 is assigned, as explained above. Therefore, AVI and AVU cannot both attain the value 1 at the same time. Nevertheless, each can attain the value 1 in different cases.

4.4. Balanced Isolability and Unifiability

AVI should be high, as it indicates the ability of a cluster to isolate itself from the rest of the network. In contrast, AVU should be low, as it indicates the ability of a cluster to unify itself with other clusters. For good clustering, clusters should be able to isolate themselves from the rest of the network as much as possible. At the same time, their ability to unify with other clusters has to be as low as possible. Therefore, AVI and 1/AVU have to be maximized to indicate good clustering. We consider Unifiability and Isolability of a clustering to have equal importance. Thus, the two have to be combined with equal weight to obtain their effect in a single value. The harmonic mean of AVI and 1/AVU gives a balanced effect of both. The notion of balancing is inspired by the well-established accuracy metric F-measure, which combines the effects of precision and recall by taking their harmonic mean. Both precision and recall have to be maximal for accurate clusters, and they are balanced by the harmonic mean. We refer to this balancing of AVI and 1/AVU as Average of Unifiability and Isolability (AUI). The AUI for any clustering C of graph G can be expressed as follows:

=

2 1/QAVU (G,C) + 1

2 × QAV I (G, C) 1 + QAVU (G, C) × QAV I (G, C)

(11)

CR IP T

Theorem 4.5. Range of ANUI will be [0, 1] for any connected network. Proof. Since, ANUI is simply a division of AUI by 2 so with Theorem 4.4 we can show the range of ANUI will be [0, 1]. 5. Axiomatic Analysis

(8)

1 QAV I (G,C)

(10)

Axiomatic analysis is performed to prove theoretically that proposed metrics fulfill necessary criterion of being quality measure for clustering. For axiomatic analysis, six axioms or properties suggested by Laarhoven and Marchiori are considered. They have showed in their recent work (Van Laarhoven & Marchiori, 2014) that the aforementioned six properties are enough for being a good quality metric. Competency of AVI, AVU and ANUI in perspective of all six properties are proved theoretically. These properties are defined as follows: First property is about independence of node identity. Quality Metric has to be dependent only on strengths of connections, not on the identity of nodes. It has to remain constant for a clustering even if nodes of actual graph are permuted.

AN US

QAUI (G, C) =

QAUI (G, C) 2 QAV I (G, C) = 1 + QAVU (G, C) × QAV I (G, C)

QANUI (G, C) =

(9)

ED

M

The concept of maximization AVI and minimization of AVU looks very similar to the positive and negative components of the metric defined by Zaidi et al. (2010). However, they combined both by subtracting negative component from positive component so it may result sometimes in negative values. In our case, the harmonic mean of AVI and 1/AVU gives the flexibility to equal contribution of both AVI and 1/AVU in maximizing the final value. Hence, AUI gives a balanced effect of both AVI and AVU.

PT

Definition 5.1 (Permutation invariance). A quality metric Q is permutation invariant if for all graphs G = (V, E) and all isomorphisms f : {u → u0 |u ∈ V, u0 ∈ V} of G, it satisfies Q(G, C) = Q( f (G), f (C)). I.e. for any clustering of a graph, quality metric should remain same for all isomorphic forms of the graph.

Theorem 4.4. Range of AUI will be [0, 2] for connected network.

AC

CE

Proof. In the Equation 9, numerator is 2× AV I can have value 0, since AVI can attain value 0 as shown in the Theorem 4.2. Thus, AUI can also attain value 0. The numerator can have maximum value 2 when AVI have value 1. When AVI have value we consider value 0 for AVU so denominator will have value 1. Therefore, AUI will have value 2. In all other cases, the value will be less than 2 and greater than 0.

Quality of clustering has to be unaltered when distances or connection strengths are scaled uniformly. This implies that if quality metric for a clustering indicates good quality it should retain that indication even if distances are scaled.

By Theorem 4.4, AUI values range from 0 to 2. However, metric values are generally normalized to 1. Since AUI has maximum value 2, it is normalized to 1 by simply dividing it by 2. This gives the Average Normalized Unifiability and Isolability (ANUI), i.e. $Q_{ANUI}(G, C) = Q_{AUI}(G, C) / 2$.

Definition 5.2 (Scale invariance). A quality metric $Q$ is scale invariant if, for all graphs $G = (V, E)$, all pairs of clusterings $C$ and $D$ of $G$ and all constants $\alpha > 0$, $Q(G, C) \le Q(G, D)$ if and only if $Q(\alpha G, C) \le Q(\alpha G, D)$. Here, $\alpha G$ denotes the graph obtained by the mapping $\delta(u, v) \mapsto \alpha\,\delta(u, v)$ for all $(u, v) \in E$; that is, all connection strengths of $G$ are scaled by the factor $\alpha$.
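The definition can be illustrated with a small numerical check: scaling every weight by the same factor should not change which of two clusterings scores higher. The sketch below again assumes Isolability = I / (I + E) averaged over clusters; the helpers `avi`, `scale` and `order_preserved` and the toy data are hypothetical.

```python
from collections import defaultdict


def avi(edges, clusters):
    """Average Isolability: mean over clusters of I / (I + E) (assumed form)."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    internal, external = defaultdict(float), defaultdict(float)
    for u, v, w in edges:
        if label[u] == label[v]:
            internal[label[u]] += w
        else:
            external[label[u]] += w
            external[label[v]] += w
    ratios = []
    for i in range(len(clusters)):
        total = internal[i] + external[i]
        ratios.append(internal[i] / total if total > 0 else 0.0)
    return sum(ratios) / len(clusters)


def scale(edges, alpha):
    """Return the edge list of alpha * G: every weight multiplied by alpha."""
    return [(u, v, alpha * w) for u, v, w in edges]


EDGES = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 0.5),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0)]
C = [{0, 1, 2}, {3, 4, 5}]   # splits the graph at the weak edge
D = [{0, 1}, {2, 3, 4, 5}]   # cuts through the first triangle


def order_preserved(alpha):
    """True if Q(G, C) <= Q(G, D) holds exactly when it holds on alpha * G."""
    before = avi(EDGES, C) <= avi(EDGES, D)
    scaled = scale(EDGES, alpha)
    after = avi(scaled, C) <= avi(scaled, D)
    return before == after


if __name__ == "__main__":
    print([order_preserved(a) for a in (0.1, 2.0, 10.0)])
```

Because each Isolability value is a ratio of weights, a common factor cancels, which is why the ordering of the two clusterings is unaffected in this sketch.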


A quality metric has to be non-decreasing under consistent improvement. Strengthening the internal connections of clusters or weakening their external connections is referred to as consistent improvement. For all such consistent improvements, the quality metric has to be non-decreasing.

Proof. For any graph $G = (V, E)$, the isomorphic graph $f(G)$ implies $f(u) = u'$ such that $u \in V$ and $u' \in V$. Thus, any connection $(u, v) \in E$ changes to $(u', v')$ such that $(f^{-1}(u'), f^{-1}(v')) \in E$. Since only the identity of the connection changes with the nodes in $f(G)$, not its strength, $\delta(u, v) = \delta(u', v')$. For any cluster $C_i \in C$ of $G$, if $u \in_{C_i} v$ then $u' \in_{C_i'} v'$ for the isomorphic graph $f(G)$, where $C_i' \in f(C)$, $u \mapsto u'$ and $v \mapsto v'$. Thus, an internal connection $(u, v)$ remains internal to $C_i'$ in $f(G)$. Hence, the total internal connection strength of $C_i' \in f(C)$ is the same as that of $C_i \in C$. Similarly, it can be shown that the total external strength of $C_i$ also remains the same. Isolability and Unifiability are computed independently for each cluster, and both depend only on connection strengths, which remain unaltered under isomorphism. Thus, for all isomorphisms of a graph, Isolability and Unifiability remain unaffected, and so do their averages. Therefore, $Q_{AVI}(G, C) = Q_{AVI}(f(G), f(C))$ and $Q_{AVU}(G, C) = Q_{AVU}(f(G), f(C))$, i.e. AVI and AVU are permutation invariant. Since ANUI is just the normalized harmonic mean of AVI and 1/AVU, it is also permutation invariant.


Definition 5.3 (Monotonicity). A quality metric $Q$ is monotonic if, for all graphs $G = (V, E)$, all clusterings $C$ of $G$, all $\beta \ge 0$ and all consistent improvements $G \pm \beta$ of $G$, it satisfies $Q(G \pm \beta, C) \ge Q(G, C)$. Here, $G \pm \beta$ means $\delta(u, v) + \beta \ge \delta(u, v)$ whenever $u \in_{C_i} v$, and $\delta(u, v) - \beta \le \delta(u, v)$ whenever $u$ and $v$ are not in the same cluster.
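A consistent improvement can be simulated directly. The sketch below adds β to every internal edge and subtracts β from every external edge, then compares AVI before and after. It assumes Isolability = I / (I + E) averaged over clusters, and it clips weights at 0 so that they stay non-negative; both the clipping and the helper names are assumptions made for illustration.

```python
from collections import defaultdict


def avi(edges, clusters):
    """Average Isolability: mean over clusters of I / (I + E) (assumed form)."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    internal, external = defaultdict(float), defaultdict(float)
    for u, v, w in edges:
        if label[u] == label[v]:
            internal[label[u]] += w
        else:
            external[label[u]] += w
            external[label[v]] += w
    ratios = []
    for i in range(len(clusters)):
        total = internal[i] + external[i]
        ratios.append(internal[i] / total if total > 0 else 0.0)
    return sum(ratios) / len(clusters)


def consistent_improvement(edges, clusters, beta):
    """Strengthen internal edges by beta and weaken external edges by beta."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    improved = []
    for u, v, w in edges:
        if label[u] == label[v]:
            improved.append((u, v, w + beta))
        else:
            # Clipping at 0 is an assumption; the text does not discuss it.
            improved.append((u, v, max(0.0, w - beta)))
    return improved


EDGES = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 0.5),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0)]
CLUSTERS = [{0, 1, 2}, {3, 4, 5}]

if __name__ == "__main__":
    for beta in (0.0, 0.1, 0.25, 1.0):
        better = consistent_improvement(EDGES, CLUSTERS, beta)
        print(beta, avi(EDGES, CLUSTERS), avi(better, CLUSTERS))
```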

Definition 5.4 (Locality). A quality metric $Q$ is local if, for all graphs $G = (V, E)$, all clusterings $C$ of $G$, all clusters $C_i \in C$ and all changes $G^*$ associated with $C_i$, it satisfies $Q(C_i, C) + Q(C \setminus C_i, C) = Q(G, C)$ if and only if $Q(C_i^*, C) + Q(C \setminus C_i, C) = Q(G^*, C)$. Here, $G^*$ denotes the graph after any kind of change associated with $C_i$, and $C_i^*$ denotes $C_i$ after those changes.


A quality metric should be able to make any clustering optimal when the connection strengths are revised. The main objective is to rule out trivial quality metrics, and this is ensured by richness.


Definition 5.5 (Richness). A quality metric $Q$ is rich if, for all graphs $G = (V, E)$ and all clusterings $C$, there exists a graph $G' = (V, E')$ such that $\arg\max_{C} Q(G, C) = Q(G', C)$ and $\delta(u, v) \mapsto \delta'(u, v)$, where $\delta(u, v)$ is the strength of $(u, v) \in E$ and $\delta'(u, v)$ is the strength of $(u, v) \in E'$.
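The idea behind richness can be sketched for AVI alone: given any target clustering, the weights can be revised so that this clustering reaches the largest AVI value. The snippet assumes Isolability = I / (I + E) averaged over clusters and assumes that every cluster of the target clustering has at least one internal edge; `reweight_for` is a hypothetical helper, and setting a weight to 0 stands in for removing the connection.

```python
from collections import defaultdict


def avi(edges, clusters):
    """Average Isolability: mean over clusters of I / (I + E) (assumed form)."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    internal, external = defaultdict(float), defaultdict(float)
    for u, v, w in edges:
        if label[u] == label[v]:
            internal[label[u]] += w
        else:
            external[label[u]] += w
            external[label[v]] += w
    ratios = []
    for i in range(len(clusters)):
        total = internal[i] + external[i]
        ratios.append(internal[i] / total if total > 0 else 0.0)
    return sum(ratios) / len(clusters)


def reweight_for(edges, clusters, strong=1.0, weak=0.0):
    """Revise weights so edges inside clusters are strong and cut edges weak."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    return [(u, v, strong if label[u] == label[v] else weak)
            for u, v, _ in edges]


EDGES = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 0.5),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0)]
TARGET = [{0, 1}, {2, 3, 4, 5}]  # a clustering that scores poorly on EDGES

if __name__ == "__main__":
    print(avi(EDGES, TARGET), avi(reweight_for(EDGES, TARGET), TARGET))
```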


The last property concerns continuity. A small change in the graph should cause only a small change in the quality metric; that is, the metric should not be affected significantly by any small modification of the graph.

Definition 5.6 (Continuity). A quality metric $Q$ is continuous if, for all graphs $G = (V, E)$ and $\beta > 0$, there exists a $\gamma > 0$ such that for all graphs $G' = (V, E')$, if $\delta(u, v) - \beta < \delta'(u, v) < \delta(u, v) + \beta$ for some connections $(u, v) \in E$, then $Q(G', C) - \gamma < Q(G, C) < Q(G', C) + \gamma$ for all clusterings $C$ of $G$. Here, $\delta(u, v)$ and $\delta'(u, v)$ are the strengths of connection $(u, v)$ in $E$ and $E'$ respectively.

Theorem 5.1. Quality metrics AVI, AVU and ANUI are permutation invariant.

Theorem 5.2. Quality metrics AVI, AVU and ANUI are scale invariant.

Proof. Consider any two clusterings $C$ and $D$ of $G = (V, E)$ and an arbitrary connection $(u, v) \in E$. Isolability is a ratio of connection strengths, and this connection contributes either to the numerator or to the denominator of the ratio. Depending on the contribution of $\delta(u, v)$ to Isolability for the clusterings $C$ and $D$, the following possibilities arise: (i) $\delta(u, v)$ contributes to the denominator for $C$ and the numerator for $D$, (ii) $\delta(u, v)$ contributes to the numerator for $C$ and the denominator for $D$, (iii) $\delta(u, v)$ contributes to the denominator for both $C$ and $D$, and (iv) $\delta(u, v)$ contributes to the numerator for both $C$ and $D$.

For case (i), the connection $(u, v)$ is an external connection in $C$ and internal to some cluster $D_i \in D$. Suppose $Q_{AVI}(G, C) = Q_{AVI}(G, D)$ before the addition of $\delta(u, v)$. Then the contribution of all such $\delta(u, v)$ results in $Q_{AVI}(G, C) < Q_{AVI}(G, D)$. For $\alpha > 0$, whether $\delta(u, v) > \alpha\delta(u, v)$ for all connections or $\delta(u, v) < \alpha\delta(u, v)$ for all connections, $Q_{AVI}(\alpha G, C) < Q_{AVI}(\alpha G, D)$ holds when $Q_{AVI}(\alpha G, C) = Q_{AVI}(\alpha G, D)$ before the contribution of all $\alpha\delta(u, v)$. Hence, for case (i), $\alpha > 0$ results in $Q_{AVI}(G, C) \le Q_{AVI}(G, D)$ if and only if $Q_{AVI}(\alpha G, C) \le Q_{AVI}(\alpha G, D)$. The cases (ii), (iii) and (iv) can be shown similarly.

Any connection $(u, v) \in E$ falls into one of these four cases for $C$ and $D$. Thus, $Q_{AVI}(G, C) \le Q_{AVI}(G, D)$ if and only if $Q_{AVI}(\alpha G, C) \le Q_{AVI}(\alpha G, D)$ for all $\alpha > 0$. The same can be shown for AVU. Thus, ANUI is also scale invariant.

Lemma 1. Consider a ratio $\frac{I}{I+E}$, where $I$, $E$ and $I + E$ are positive and $E < I$. If $I$ is incremented or decremented by an amount $x$, then $\frac{I}{I+E} < \frac{I+x}{I+E+x}$ or $\frac{I}{I+E} > \frac{I-x}{I+E-x}$, and if $x \ll I$, then $\frac{I}{I+E} \approx \frac{I \pm x}{I+E \pm x}$.

Proof. Clearly, $I < I + E$ since $I, E, I + E > 0$. Now, by the property of ratios, $\frac{I}{I+E} < \frac{I+x}{I+E+x}$ and $\frac{I}{I+E} > \frac{I-x}{I+E-x}$, since $I < I + E$. As $x \ll I$, this implies $\frac{x}{I} \approx 0$. With this we can write

$$\frac{I+x}{I+E+x} = \frac{1 + \frac{x}{I}}{\frac{I+E}{I} + \frac{x}{I}} \approx \frac{1 + 0}{\frac{I+E}{I} + 0} \approx \frac{1}{\frac{I+E}{I}} \approx \frac{I}{I+E}.$$

Similarly, we can show $\frac{I-x}{I+E-x} \approx \frac{I}{I+E}$ for $x \ll I$. This means that for a very small change in $I$, the ratio does not change much.

Lemma 2. Consider a ratio $\frac{I}{I+E}$, where $I$, $E$ and $I + E$ are positive and $E < I$. If $E$ is incremented or decremented by an amount $x$, then $\frac{I}{I+E} > \frac{I}{I+E+x}$ or $\frac{I}{I+E} < \frac{I}{I+E-x}$, and if $x \ll I$, then $\frac{I}{I+E} \approx \frac{I}{I+E \pm x}$.

Proof. By the property of ratios, $\frac{I}{I+E} > \frac{I}{I+E+x}$ and $\frac{I}{I+E} < \frac{I}{I+E-x}$, since $I, E, I + E > 0$. As $x \ll I$, this implies $\frac{x}{I} \approx 0$. With this we can write

$$\frac{I}{I+E+x} = \frac{1}{\frac{I+E}{I} + \frac{x}{I}} \approx \frac{1}{\frac{I+E}{I} + 0} \approx \frac{1}{\frac{I+E}{I}} \approx \frac{I}{I+E}.$$

Similarly, we can show $\frac{I}{I+E-x} \approx \frac{I}{I+E}$ for $x \ll I$. This means that for a very small change in $E$, the ratio does not change much.

Theorem 5.3. Quality metrics AVI, AVU and ANUI are monotonic.

Proof. Isolability is a ratio (Equation 6) of the form $\frac{I}{I+E}$, where $I$ and $E$ represent the total strength of internal and external connections. Consistent improvement can happen either when we strengthen the internal connections of a cluster or when we weaken its external connections. For any graph $G = (V, E)$, if an amount $\beta \ge 0$ is added to the connection strength $\delta(u, v)$ such that $u \in_{C_i} v$ for any cluster $C_i \in C$, then this is the case of Lemma 1 where $I$ is incremented. As shown in Lemma 1, in this situation the ratio increases (by the property of ratios), i.e. Isolability increases. Thus, $Q_{AVI}(G + \beta, C) \ge Q_{AVI}(G, C)$. Again, if an amount $\beta \ge 0$ is subtracted from the connection strength $\delta(u, v)$ such that $u$ and $v$ are not in the same cluster, then this is the case of Lemma 2 where $E$ is decremented, and the ratio will again increase. Thus, $Q_{AVI}(G - \beta, C) \ge Q_{AVI}(G, C)$. Hence, AVI is monotonic.

Unifiability does not depend on the internal connections of a cluster. Hence, strengthening internal connections has no impact on Unifiability, i.e. $Q_{AVU}(G + \beta, C) = Q_{AVU}(G, C)$. Weakening external connections results in a decrease of Unifiability. Unifiability has to be minimal for a good clustering; hence, a decrease in Unifiability implies an improvement in quality. Since both AVI and 1/AVU are non-decreasing under consistent improvements, ANUI reflects the same, i.e. $Q_{ANUI}(G \pm \beta, C) \ge Q_{ANUI}(G, C)$.

Theorem 5.4. Quality metrics AVI, AVU and ANUI are local.

Proof. AVI accumulates the individual Isolability contributions of all clusters. The Isolability of a cluster depends only on the nodes within the cluster and their neighbor nodes, and it is independent of the other clusters. Thus, for any graph $G = (V, E)$ and clustering $C$, the contribution of any cluster $C_i \in C$ can be separated as follows. Suppose the clustering $C$ consists of $k$ clusters. Then the quality metric AVI is expressed as

$$Q_{AVI}(G, C) = \frac{\text{Isolability}(C_i)}{k} + \sum_{j=1,\, j \ne i}^{k} \frac{\text{Isolability}(C_j)}{k} = Q_{AVI}(C_i, C) + Q_{AVI}(C \setminus C_i, C).$$

All kinds of changes $G^*$ associated with $C_i$ affect AVI. The resulting changed AVI is expressed as

$$\begin{aligned}
Q_{AVI}(G^*, C) &= \frac{\text{Isolability}(C_i^*)}{k} + \frac{\text{Isolability}(C \setminus C_i^*)}{k} \\
&= Q_{AVI}(C_i^*, C) + \frac{\text{Isolability}(C \setminus C_i)}{k} \quad \text{(all changes $C_i^*$ are local to $C_i$)} \\
&= Q_{AVI}(C_i^*, C) + Q_{AVI}(C \setminus C_i, C).
\end{aligned}$$

Hence, AVI satisfies locality. A quality metric is local if it is a sum over clusters and each summand depends only on the nodes and connections associated with a single cluster (Van Laarhoven & Marchiori, 2014). AVI is simply a sum over the individual Isolability of clusters. Thus, from its design alone one can say that AVI is local. Similarly, AVU is a sum over the Unifiability of each cluster with the other clusters. Hence, AVU is also local. The harmonic mean of AVI and 1/AVU is then local as well, so ANUI also satisfies locality.
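The decomposition used in the locality argument can be sketched in a few lines. The snippet splits AVI into one summand per cluster, Isolability(C_i) / k, and shows that changing a connection inside one cluster leaves the summands of the other clusters untouched. Isolability = I / (I + E) is assumed as above; the helper `contributions` and the three-cluster toy graph are illustrative.

```python
from collections import defaultdict


def contributions(edges, clusters):
    """Per-cluster summands Isolability(C_i) / k; their sum is AVI."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    internal, external = defaultdict(float), defaultdict(float)
    for u, v, w in edges:
        if label[u] == label[v]:
            internal[label[u]] += w
        else:
            external[label[u]] += w
            external[label[v]] += w
    k = len(clusters)
    parts = []
    for i in range(k):
        total = internal[i] + external[i]
        parts.append((internal[i] / total if total > 0 else 0.0) / k)
    return parts


EDGES = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 0.5),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (5, 6, 0.5), (6, 7, 1.0)]
CLUSTERS = [{0, 1, 2}, {3, 4, 5}, {6, 7}]

# A change that is local to the first cluster: strengthen edge (0, 1).
CHANGED = [(u, v, 3.0) if (u, v) == (0, 1) else (u, v, w) for u, v, w in EDGES]

BEFORE = contributions(EDGES, CLUSTERS)
AFTER = contributions(CHANGED, CLUSTERS)

if __name__ == "__main__":
    print(BEFORE, sum(BEFORE))
    print(AFTER, sum(AFTER))
```

Only the first entry differs between the two lists, which mirrors the step in the proof where the remaining clusters contribute the same amount before and after the change.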

Theorem 5.5. Quality metrics AVI, AVU and ANUI are rich.

Proof. As shown in Theorem 5.3, Isolability and Unifiability are ratios that can be represented in the form used in Lemma 1 and Lemma 2. It is also shown there that consistent improvement is possible with both AVI and AVU. This implies that adding connection strength within clusters or removing connection strength between clusters improves the AVI, AVU and ANUI values. With a sufficient amount of such changes, any clustering can be made an optimal quality clustering. Hence, AVI, AVU and ANUI are rich.

Theorem 5.6. Quality metrics AVI, AVU and ANUI are continuous.

Proof. As Isolability is a ratio of the form $\frac{I}{I+E}$, Lemma 1 and Lemma 2 show that for a small change in $I$ or $E$ the change in Isolability is negligible. Since AVI satisfies locality (shown in Theorem 5.4), this negligible effect does not spread over the quality of the entire graph. This implies that a small change in any connection within or outside a cluster has a small impact on AVI.

Consider a graph $G = (V, E)$ with $\delta(u, v) = x$ for all $(u, v) \in E$. Since AVI is local, we can separate the contribution of any cluster $C_i \in C$ of graph $G$, i.e. $Q_{AVI}(G, C) = Q_{AVI}(C_i, C) + Q_{AVI}(C \setminus C_i, C)$. The contribution of any cluster $C_i \in C$ to AVI is

$$Q_{AVI}(C_i, C) = \frac{1}{k} \times \frac{ax}{ax + bx},$$

where $a$ = number of internal connections, $b$ = number of external connections, $x$ = weight of each connection in $G$, and $k$ = number of clusters in clustering $C$.

Let $G' = (V, E')$ with $\delta'(u, v) = y$ for all $(u, v) \in E'$ be another graph such that $m$ external connections of cluster $C_i \in C$ that are associated only with $C_i$ satisfy $\delta(u, v) - \beta < \delta'(u, v) < \delta(u, v) + \beta$ for $\beta > 0$. The remaining $|E| - m$ connections have the same weights as in $G$, i.e. $y = x$. Now, for the $m$ connections, $\delta'(u, v) < \delta(u, v) + \beta \Rightarrow y < x + \beta$. The contribution of $C_i \in C$ to AVI with respect to $G'$ is written separately as $Q_{AVI}(G', C) = Q'_{AVI}(C_i, C) + Q_{AVI}(C \setminus C_i, C)$. The contribution of cluster $C_i \in C$ to AVI with respect to $G'$ is

$$\begin{aligned}
Q'_{AVI}(C_i, C) &= \frac{1}{k} \times \frac{ay}{ay + by} \\
&= \frac{1}{k} \times \frac{ax}{ax + (b - m)x + my} \quad \text{(for $m$ external connections, $x \ne y$)} \\
&= \frac{1}{k} \times \frac{ax}{ax + (b - m)x + m(x + \beta)} \quad \text{(replacing $y$ with $x + \beta$)} \\
&= \frac{1}{k} \times \frac{ax}{ax + bx + m\beta} \\
&< Q_{AVI}(C_i, C).
\end{aligned}$$

Thus, $Q_{AVI}(G', C) = Q'_{AVI}(C_i, C) + Q_{AVI}(C \setminus C_i, C) < Q_{AVI}(G, C)$. Suppose $Q_{AVI}(G', C) = Q_{AVI}(G, C) - z$, $z > 0$. Now, adding any $\gamma > z$ to $Q_{AVI}(G', C)$ implies $Q_{AVI}(G, C) < Q_{AVI}(G', C) + \gamma$. Similarly, we can show that there exists $\gamma > 0$ such that $Q_{AVI}(G', C) - \gamma < Q_{AVI}(G, C)$. Therefore, AVI is continuous. Unifiability can also be represented as a ratio of the form $\frac{I}{I+E}$ by considering $I$ as the number of common connections and $E$ as the combined external connections of two clusters. Hence, we can show that AVU as well as ANUI are also continuous.

6. Performance Analysis

We have analyzed the performance of the proposed metrics in two directions: accuracy and quality. First, we have inspected the competitiveness of the proposed metrics in terms of accuracy with respect to other metrics. Thereafter, we have analyzed the effectiveness of our metrics for indicating accuracy on the real clustering and further verified it on predicted clusterings obtained with different algorithms. Finally, we have analyzed the characteristics of the proposed metrics and established a mechanism to evaluate the accuracy of clusters of an unknown network. We have also analyzed quality-related issues during these analyses. Moreover, we have carried out a theoretical analysis of the proposed metrics solely for the qualitative requirements in Section 5.

6.1. Experimental Setup

All the accuracy-related analyses mentioned above are carried out first on benchmark graphs and afterward on real-world networks. The clusterings used in the analysis are generated by graph clustering or community detection algorithms. Several quality and accuracy related metrics are considered for the comparative analysis of the proposed metrics. The data sets, algorithms and metrics considered for the experimentation are detailed below.

6.1.1. Data Sets

Using the Lancichinetti-Fortunato-Radicchi (LFR) generator, we have generated three LFR benchmark graphs (Lancichinetti et al., 2008) of 5000, 6000 and 7000 nodes. These graphs are generated with mixing parameter 0.1. We have also considered three real-world network data sets. The first is the 2000 NCAA Football Bowl Subdivision (formerly Division 1-A) football schedule. The National Collegiate Athletic Association (NCAA) divides 115 schools into 12 conferences. The network contains 115 nodes and 613 edges. This data set was first used by Newman (Newman & Girvan, 2004). The second is Zachary's Karate club network (Zachary, 1977), which contains 34 nodes and 78 edges and is divided into 2 clusters. Both data sets are widely used to evaluate clustering algorithms, especially to measure the accuracy of clustering, and are available online². The third is the Arxiv GR-QC (General Relativity and Quantum Cosmology) collaboration network data set (Leskovec et al., 2007). This data covers papers in the period from January 1993 to April 2003 (124 months)³. The GR-QC network contains 5242 nodes and 14496 edges. An important point to note here is that the real clustering of GR-QC is unknown.

² http://vldao.fmf.uni-lj.sj/pub/networks/pajek
³ Available online: http://snap.stanford.edu/data/

6.1.2. Algorithm Selection

We have considered four graph clustering algorithms: HC-PIN (Wang et al., 2011), LICOD (Kanawati, 2011), Random Walk (RW) (Steinhaeuser & Chawla, 2010) and SCAN (Xu et al., 2007). Among these, LICOD and SCAN were designed in the context of social community detection, whereas HC-PIN and RW were designed to find clusters in protein-protein interaction networks. Nevertheless, these algorithms do the same thing, i.e. clustering of network data in the context of their respective domains. Since quality and accuracy measures have to be effective irrespective of domain, we have considered algorithms from different domains to maintain the diversity of clustering algorithms.

6.1.3. Metric Selection

Several quality and accuracy metrics have been designed to evaluate clustering (Section 3). Those metrics are designed solely to ensure either quality or accuracy, but not both at the same time. As our metrics ensure both at the same time, we have considered both kinds of metrics for the analysis. We have considered metrics that are popular and widely used for evaluating clusters: four accuracy metrics (NMI, ARI, Purity and F-measure) for accuracy comparison and two quality metrics (Coverage and Modularity) for quality comparison.

6.2. Metric Competitiveness Analysis

The main objective of this section is to show that our metrics are able to ensure quality as well as accuracy. Quality and accuracy are two different aspects of the same problem, and each needs to be handled separately without losing the significance of either. Verifying the indications of any new metric requires well-established quality and accuracy metrics. Only if the indications of both the well-established quality metrics and the well-established accuracy metrics agree with the indications of the proposed metrics can one say that the proposed metrics ensure both quality and accuracy. Hence, we have first noted the indications of the accuracy metrics and compared them with the indications of our metrics. Then we have compared them with the indications of the quality metrics. We have discussed the competitiveness of ANUI in terms of accuracy and have also shown that ANUI together with AVI and AVU can give more insight into the clustering.

6.2.1. LFR Benchmark Graphs

Metric values obtained on clusters predicted by different algorithms are verified against the metric values obtained for the real clustering. The correctness of any metric and its competitiveness with other metrics are examined through such verification. Assuming that the accuracy metrics NMI, ARI, Purity and F-measure correctly measure the accuracy of a clustering, we have examined the competitiveness of our metrics with respect to Modularity and Coverage in determining accuracy. Results obtained on the LFR benchmark graphs are presented in Fig. 2. All the accuracy metrics show their highest values for the real clustering on the LFR graphs, as these metrics are evaluated over the real clustering itself. As indicated by the NMI, ARI and Purity values, HC-PIN has predicted the most accurate clusters. F-measure values contradict the level of accuracy indicated by NMI, ARI and Purity on all algorithms. This trade-off is very clear in the case of RW, where F-measure as well as Purity show poor accuracy. Despite such trade-offs, Modularity and Coverage show very high values for the clusterings of both HC-PIN and RW. Specifically, for clusters produced by RW on the LFR-6K graph, Modularity and

Figure 2: Comparison of effectiveness of AVI, AVU and ANUI with other metrics on LFR benchmark graphs. Panels: (a) LFR-5K, (b) LFR-6K, (c) LFR-7K. NMI, ARI, Purity and F-measure are accuracy metrics. Higher values of these metrics indicate accuracy of predicted clustering with respect to real clustering. Note that for real clustering these metrics resulted highest values. Coverage and Modularity are quality metrics. Higher values of these metrics indicate better quality clustering. Comparisons are done on the basis of predictability of better algorithm correctly in terms of accuracy and quality. For example in (a), real clustering shows highest NMI value. Predicted clustering of HC-PIN and RW also show higher NMI values. This means NMI indicates HC-PIN and RW are better in terms of accuracy. Similarly, HC-PIN and RW have higher Modularity values and real clustering also shows higher Modularity value. This means Modularity indicates HC-PIN and RW are better in terms of quality.

Figure 3: Comparison of effectiveness of AVI, AVU and ANUI with other metrics on Real-world network. Panels: (a) Football, (b) Karate, (c) GR-QC. Networks with known ground truth are presented in (a) and (b). Similar kind of comparison as presented in Fig. 2. Network with unknown ground truth is presented in (c). Since, without real clustering NMI, ARI, Purity and F-measure cannot be measured so bars for these metrics and column real clustering are absent in (c). Measure has to rely only on quality metrics. I.e. higher quality metric values will indicate better quality clustering.
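Two of the baseline metrics compared in Figs. 2 and 3 are simple enough to sketch directly. The snippet below uses the standard textbook definitions of Purity (share of nodes that fall in the majority ground-truth class of their predicted cluster) and Coverage (share of total edge weight that lies inside clusters); it is assumed, not confirmed by this section, that these match the definitions the paper summarizes in Section 3. The function names and toy data are illustrative.

```python
def purity(predicted, truth):
    """Fraction of nodes assigned to the majority true class of their cluster."""
    n = sum(len(cluster) for cluster in predicted)
    matched = sum(max(len(cluster & t) for t in truth) for cluster in predicted)
    return matched / n


def coverage(edges, clusters):
    """Fraction of the total edge weight that lies inside clusters."""
    label = {n: i for i, members in enumerate(clusters) for n in members}
    inside = sum(w for u, v, w in edges if label[u] == label[v])
    total = sum(w for _, _, w in edges)
    return inside / total if total > 0 else 0.0


EDGES = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 0.5),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0)]
PREDICTED = [{0, 1, 2}, {3, 4, 5}]
TRUTH = [{0, 1, 2, 3}, {4, 5}]   # made-up ground truth for illustration

if __name__ == "__main__":
    print(purity(PREDICTED, TRUTH), coverage(EDGES, PREDICTED))
```

Purity needs the ground truth while Coverage needs only the edges, which is the distinction between accuracy and quality metrics that the comparison relies on.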


case of SCAN, Modularity and Coverage show comparatively higher values, whereas all accuracy metrics indicate a level of accuracy almost equal to zero. ANUI indicates a low accuracy level for SCAN on all three LFR graphs, and for LICOD it also shows quite reasonable values from the perspective of accuracy. To get more insight into the ability of ANUI to deal with accuracy, we have analyzed the Isolability and Unifiability of clusters. Considering the AVI values, one can notice higher values of AVI for the real clustering on all the LFR graphs. Like Modularity and Coverage, AVI is also very much dependent on the internal connections within the cluster. In summary, good clusterings are

Coverage show very high values, whereas Purity shows an almost zero accuracy level. This indicates that Modularity and Coverage are incapable of dealing with the accuracy level of a clustering. On the contrary, our metric ANUI can deal with such a trade-off very smoothly. Considering the specific case of RW, one can notice the lower values of ANUI. For HC-PIN as well, the ANUI values are lower than for the real clustering, which indicates that the predicted clustering is not exactly the same as the real clustering. For the

Figure 4: Contribution of individual original clusters to total AVI, AVU and ANUI values with respect to real clustering. Clusters of the LFR benchmark graphs of 5000, 6000 and 7000 nodes are presented in (a), (b) and (c) respectively. Original clusters of the Football and Karate networks are shown in (d) and (e) respectively. The right part of each diagram presents a graphical view of the network and its cluster structure in different colors. The left part of each diagram shows cluster sizes and their contributions to each of the three metrics, which also presents the cluster size distribution of the original network. The diagrams indicate the effectiveness of each metric on different clusters according to their sizes. For a good quality clustering, AVI values have to be high and AVU values have to be low (described in Section 4). LFR graphs follow this principle well, whereas the real networks do not seem to follow it; that is, their original clusters are more likely to unify with other clusters. ANUI counterbalances this unstable factor (Section 6.3). However, for both LFR graphs and real networks, the contribution to each metric increases with cluster size.

Figure 5: Contribution of individual cluster to total AVI, AVU and ANUI values for clusters predicted with four algorithms on LFR benchmark graphs. Results obtained on LFR graphs of 5000, 6000 and 7000 nodes are presented in (a), (b) and (c) respectively. Predicted cluster size distribution has to be compared with cluster distribution of real clustering. Cluster size distributions of original LFR graphs are presented in Fig. 4(a), (b) and (c). Similarly, contribution to total AVI, AVU and ANUI values for predicted clustering also have to be evaluated with respect to real clustering.

supposed to have higher connectivity inside the cluster. Hence, one can notice higher values of all three metrics. However, as mentioned above, Modularity and Coverage are unable to deal with accuracy. Clearly, AVI shows higher values for HC-PIN but much lower values than the real clustering for LICOD, SCAN and RW. On the contrary, Modularity and Coverage show comparatively higher values for HC-PIN, SCAN and RW. Clearly, the AVU values indicate that for all LFR graphs the clusters are quite likely to unify with other clusters. Interestingly, AVU shows high values for LICOD on all three LFR graphs. This happens because the clusters produced are very inaccurate and the AVI values are very low: internal connectivity within the clusters is low, so external connectivity becomes high, and more external connectivity results in a high AVU value. For any good quality clustering, internal connectivity has to be higher than external connectivity, so HC-PIN is far better than LICOD in terms of accuracy. Hence, both AVI and AVU have a significant role in dealing with accuracy and in better understanding the clustering. ANUI alone gives an overall value accumulating quality and accuracy that is enough for evaluating a clustering, whereas ANUI along with the AVI and AVU values provides a better understanding of the clustering along with the evaluation. The significance of the AVI, AVU and ANUI values is summarized in Table 1, and it will become clearer with the discussion of the results obtained on real-world networks.

6.2.2. Real-world Network

Results obtained on real-world networks are presented in Fig. 3. The real clusterings of both the Football and Karate networks show the highest accuracy metric values, as these metrics are evaluated over the real clustering itself. The quality of the real clustering of the Football network is not that high, but the real clustering of the Karate network shows very high quality. For the Football network, HC-PIN produces the most accurate clustering, as indicated by the accuracy metric values. The accuracy of LICOD and SCAN is also quite high, whereas RW produces the least accurate clustering. ANUI, Modularity and Coverage indicate the same. For the Karate network, RW produces the most accurate clustering. Accordingly, one can notice the highest ANUI value, which is almost the same as for the real clustering. The accuracy of HC-PIN and SCAN is also high but much lower than that of the real clustering. The accuracy metrics indicate that LICOD has produced the least accurate clustering. Modularity and Coverage fail to detect such an inaccurate clustering, but ANUI clearly indicates that LICOD has produced the least accurate clustering. The Modularity and Coverage values obtained for the clustering of LICOD show their inability to deal with accuracy.

For the GR-QC network, we do not have the real clustering, so we have to depend only on the values of quality metrics to ensure accuracy. On the LFR graphs as well as on the Football and Karate networks, we have seen the worst performance of LICOD in terms of both quality and accuracy. Moreover, we have already noted in the above discussion the inability of Modularity and Coverage to ensure accuracy (for example, the inaccuracy of LICOD). In that case, the indication by Modularity and Coverage that the clustering of LICOD is the best clustering is questionable. We have already noticed the inclination of ANUI towards accuracy in the above discussion. Hence, the indication by ANUI that the clustering of LICOD is the worst is quite believable. We have also noticed the higher accuracy of HC-PIN and RW above. Thus, the indication by ANUI that HC-PIN and RW are better than LICOD and SCAN is also believable to some extent. We can also easily assert that the clusterings of HC-PIN and RW are more accurate, without knowing the real clustering of GR-QC, simply by observing the ANUI values.

Table 1: Combination of AVI, AVU and ANUI with indication of clustering quality and accuracy.

| AVI | AVU | ANUI | Clustering |
|---|---|---|---|
| High | Low | Low | Bad |
| High | Moderate | High | Good |
| High | High | High | Good |
| Low | High | Low | Bad |
| Moderate | High | High | Good |
| Low | Low | | Not possible |
| Moderate | Moderate | Moderate | Good |

Figure 6: Contribution of individual cluster to total AVI, AVU and ANUI values for clusters predicted with four algorithms on real-world networks. Results obtained on the Football, Karate and GR-QC networks are presented in (a), (b) and (c) respectively. The predicted cluster size distribution and the contribution to total AVI, AVU and ANUI values for the Football and Karate networks have to be compared with the real clustering. Cluster size distributions and corresponding metric contributions of these two networks are presented in Fig. 4(d) and (e). Since the GR-QC network does not have ground truth, the size distribution of the clustering predicted by any algorithm has to be compared with the clusterings predicted by the other algorithms. Similarly, the contribution to total AVI, AVU and ANUI values for a predicted clustering also has to be evaluated with respect to the clusterings predicted by the other algorithms.

Table 2: Inclination of ANUI towards Real-world Network (RWN). Variations of AVI, AVU and ANUI with the size of cluster are noted. Proportionate balancing of ANUI towards RWN is indicated by variation of ANUI with size of cluster both on LFR graphs and RWNs. Columns give the individual cluster contribution with relative size variation.

| Networks | AVI value | AVI variation | AVU value | AVU variation | ANUI balanced towards | ANUI variation | Remarks |
|---|---|---|---|---|---|---|---|
| LFR-5K | High | Uniform | Low | Increased | AVU | Increased | Clusters are able to isolate, but ANUI variation is the same as for RWN |
| LFR-6K | High | Uniform | Low | Increased | AVU | Increased | Clusters are able to isolate, but ANUI variation is the same as for RWN |
| LFR-7K | High | Uniform | Low | Increased | AVU | Increased | Clusters are able to isolate, but ANUI variation is the same as for RWN |
| Football | Low | Increased | High | Uniform | AVI | Increased | Clusters are more likely to unify |
| Karate | Low | Increased | High | Uniform | AVI | Increased | Clusters are more likely to unify |

Figure 7: Characteristics of AVI, AVU and ANUI on LFR benchmark graphs. Characteristics of AVI, AVU and ANUI are shown in (a), (b) and (c) respectively.

Table 3: Distribution of predicted cluster sizes (DCS) and most of the cluster contributions to overall AVI, AVU and ANUI in comparison to real clustering.

| Algorithms | DCS | AVI | AVU | ANUI balanced towards | Remarks |
|---|---|---|---|---|---|
| HC-PIN | Similar | High | Low | AVU | Best |
| LICOD | Different | Low | High | AVI | Poor |
| SCAN | Different | High | Low | AVU | Poor |
| RW | Different | Mixed | Mixed | Mixed | Better than LICOD and SCAN |

6.3. Metric Effectiveness Analysis

We have mentioned earlier that ANUI reflects the combined effect of both AVI and AVU. Hence, we analyze the effectiveness of ANUI in terms of AVI and AVU to show the overall inclination of ANUI towards accuracy. AVI and AVU accumulate the Isolability and Unifiability of all clusters respectively. Therefore, we analyze the contribution of Isolability and Unifiability to AVI and AVU respectively for each cluster.

6.3.1. Inclination towards Real-world Network

The real clustering of the LFR graphs and the contribution of all clusters with respect to their sizes are presented in Fig. 4(a), (b) and (c). Clearly, LFR graphs are well structured. As mentioned earlier, for a good clustering the Isolability of a cluster should be high and its Unifiability should be low. For all LFR graphs, the AVI contribution of each cluster is high whereas the AVU contribution is low. The ANUI contributions of all clusters lie below AVI and above AVU. Unlike LFR graphs, real-world networks are not well structured, as shown in Fig. 4(d) and (e). The AVU contributions of most of the clusters indicate that the clusters are more likely to unify with other clusters. For example, clusters of size 5 and 7 in the Football network are more likely to unify with other clusters than to isolate themselves, as indicated by high AVU and low AVI values. However, in the original Football network, the clusters of size 5 and 7 are actually ground truth clusters. Hence, our metric ANUI should suppress the Unifiability of such clusters and lean towards reality in order to show its inclination towards accuracy. As indicated by the results, ANUI is able not only to balance the Unifiability and Isolability of clusters, but also to balance them in a proportionate manner. The sizes of clusters also play a very important role in the contribution to ANUI. For LFR graphs, the AVI contribution remains almost uniformly high, but the AVU contribution increases with size. The ANUI contribution also increases with cluster size. However, in real-world networks, AVU values remain high and AVI increases with cluster size. The AVI and AVU contribution patterns of clusters in real-world networks are almost opposite to those in the LFR graphs. Nevertheless, the ANUI contribution patterns of clusters in real-world networks are very similar to those in the LFR graphs. This gives a clear indication that ANUI is highly inclined towards the real-world network. The overall meaning of the AVI, AVU and ANUI contributions of each cluster with varying cluster size is summarized in Table 2.

6.3.2. Verification of Accuracy on Predicted Clustering

We have done a comparative analysis of the AVI, AVU and ANUI contributions of predicted clusterings with respect to the real clustering. This analysis validates the effectiveness of ANUI in dealing with the accuracy of individual clusters obtained with different algorithms. The contributions of clusters to each metric for the clusterings predicted by the four algorithms on LFR graphs are presented in Fig. 5. The distribution of cluster sizes for HC-PIN looks very similar to the real clustering of the LFR graphs (Fig. 4), except for some of the smaller clusters. All AVI contributions are high and AVU contributions are low for the clustering of HC-PIN. The distributions of cluster sizes for the other three algorithms look very different from the real clustering. The numbers of clusters produced by LICOD and SCAN are very high, and most of the cluster sizes are less than 50. All three metrics' contributions for LICOD are almost opposite to the real clustering. Moreover, a large portion of the clusters produced by SCAN and RW are of size 1. Nevertheless, RW produces almost the same number of larger clusters as in the real clustering. These observations give a clear indication of the superiority of HC-PIN in terms of accuracy and of the worst clusterings of LICOD and SCAN. However, the accuracy of RW is indicated as better than that of LICOD and SCAN. We have already noticed the same indications during the competitiveness analysis (Section 6.2.1). Key findings are noted in Table 3.

Figure 8: Characteristics of AVI, AVU and ANUI on real-world networks. Characteristics of AVI, AVU and ANUI are shown in (a), (b) and (c) respectively.

about similar sizes as in the real clustering. LICOD has produced very small number of clusters on GR-QC in comparison to other three algorithms. Clustering of LICOD actually has two very huge clusters of size almost 1000 and 1500. These two clusters include almost 50% of the network. Since AVI and AVU are influenced by internal and external connectivity of the cluster respectively, one can notice higher AVI and lower AVU values in Fig. 3 for LICOD on GR-QC network. We have already noticed above that the real clustering of Football and Karate have comparatively higher AVU values. We have also noticed balancing act of ANUI in such cases. Thus, small contributions of AVU and overall ANUI value (in Fig. 3(c)) indicate that accuracy of clustering produced by LICOD on GR-QC network is

Similar kind of indications can also be noticed in the results of real-world network presented in the Fig. 6. Specifically, for the LICOD on Karate network, one can notice numbers of clusters are high and all are of small sizes. Real clustering of Karate network have only two clusters. Clearly, RW has produced two clusters of 19

ACCEPTED MANUSCRIPT

low. On the contrary, other three algorithms produced evenly distributed smaller sized clusters with comparatively higher AVU contributions. This indicates accuracy of HC-PIN, SCAN and RW are better than LICOD. We have also observed such indications during competitiveness analysis (section 6.2.2).
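The size-based observations of Section 6.3.2 (number of clusters, share of clusters of size 1, share of clusters smaller than 50) can be tabulated with a few lines of code. The following is only a minimal sketch: the function name `size_profile`, its `small_threshold` argument and the toy clusterings are illustrative assumptions, not code or data from the experiments.

```python
from collections import Counter


def size_profile(clusters, small_threshold=50):
    """Summarize a clustering given as a list of node collections.

    `small_threshold` mirrors the "less than 50" remark in the text;
    it is an illustrative parameter, not part of the proposed metrics.
    """
    sizes = [len(c) for c in clusters]
    n = len(sizes)
    if n == 0:
        raise ValueError("clustering must contain at least one cluster")
    return {
        "num_clusters": n,
        "singleton_share": sum(1 for s in sizes if s == 1) / n,
        "small_share": sum(1 for s in sizes if s < small_threshold) / n,
        "size_histogram": dict(Counter(sizes)),
    }


# Toy clusterings over 200 nodes (hypothetical, for illustration only).
real = [set(range(0, 60)), set(range(60, 130)), set(range(130, 200))]
predicted = [set(range(0, 60)), set(range(60, 199)), {199}]

real_profile = size_profile(real)
pred_profile = size_profile(predicted)
print(real_profile)
print(pred_profile)
```

Comparing the two profiles side by side gives a quick, ground-truth-dependent view of how far a predicted size distribution is from the real one; it does not replace the per-cluster AVI, AVU and ANUI contributions discussed above.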

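Section 6.4 judges two things about the cumulative curves of each metric: whether the curve of a predicted clustering follows that of the real clustering, and whether it grows linearly with the percentage of clusters covered. The text assesses both visually. The sketch below shows one possible way to quantify them, under assumptions introduced here: contributions are non-negative and are accumulated in the order supplied by the caller, linearity is scored by the R² of a least-squares line, and "following" is scored by the largest vertical gap between two curves. All function names and the sample contributions are hypothetical.

```python
def cumulative_curve(contributions):
    """Return (xs, ys): fraction of clusters covered vs. cumulative share
    of the metric. Assumes non-negative contributions with a positive sum."""
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contributions must have a positive sum")
    n = len(contributions)
    xs, ys, running = [], [], 0.0
    for i, c in enumerate(contributions, start=1):
        running += c
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def linearity_r2(xs, ys):
    """R^2 of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need at least two distinct points")
    return (sxy * sxy) / (sxx * syy)


def value_at(xs, ys, x):
    """Linearly interpolate a curve at x, treating (0, 0) as its start."""
    px, py = 0.0, 0.0
    for cx, cy in zip(xs, ys):
        if x <= cx:
            return py + (cy - py) * (x - px) / (cx - px)
        px, py = cx, cy
    return ys[-1]


def max_gap(curve_a, curve_b, steps=100):
    """Largest vertical distance between two curves on a shared grid."""
    grid = [k / steps for k in range(steps + 1)]
    return max(abs(value_at(*curve_a, x) - value_at(*curve_b, x)) for x in grid)


# Hypothetical per-cluster contributions (not values from the paper).
real_curve = cumulative_curve([0.5] * 10)            # even contributions
skewed_curve = cumulative_curve([0.01] * 4 + [5.0])  # one dominant cluster

r2_real = linearity_r2(*real_curve)
r2_skewed = linearity_r2(*skewed_curve)
gap = max_gap(real_curve, skewed_curve)
print(round(r2_real, 3), round(r2_skewed, 3), round(gap, 3))
```

Interpolating on a shared grid lets curves with different numbers of clusters be compared, which matters because predicted and real clusterings rarely have the same number of clusters.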
6.4. Metric Characteristics

From the above discussions, the algorithms HC-PIN, LICOD, SCAN and RW can be ranked in terms of their clustering accuracy as 1, 4, 3 and 2 respectively. With this intuition, in this section we analyze the behavior of the proposed metrics. Characteristics on LFR graphs are presented in Fig. 7. Clearly, the Cumulative Distribution Functions (CDFs) of ANUI for most of the algorithms closely follow that of the real clustering. The CDF of ANUI for RW also follows that of the real clustering, but not closely. Hence, we can assert that the characteristics of ANUI for a clustering accurately predicted by any algorithm will be the same as the characteristics of ANUI for the real clustering. Now considering the CDFs of AVI, we can see that the CDFs of most algorithms follow that of the real clustering but, except for HC-PIN, they are far from it. Therefore, we consider that the characteristics of AVI behave like those of ANUI, i.e. AVI can also indicate the accuracy of a clustering. On the contrary, the characteristics of AVU may also be the same as those of the real clustering, but AVU alone cannot determine accuracy. For example, clusters produced by LICOD on LFR graphs show AVU characteristics similar to those of the real clustering, but that does not imply accuracy (see Fig. 2: the AVU values of LICOD are the same as those of the real clustering, but the accuracy metric values are very low). However, if we consider AVU along with AVI, it can indicate accuracy. ANUI reflects the combined effect of both AVI and AVU, so one can notice that the CDF of ANUI for LICOD does not follow the real clustering even though the CDF of AVU does.

The CDFs of all three metrics for the real clustering of LFR graphs increase almost linearly with the percentage of clusters covered. Such a linear increment can also be observed for the real clusterings of the Football and Karate networks, as shown in Fig. 8. We noted during the competitiveness analysis (Section 6.2.2) in Fig. 3(b) that RW shows very accurate clustering on the Karate network. Accordingly, one can notice that the CDFs of all three metrics are aligned with the real clustering of the Karate network. With this insight, we can obtain a clear indication of the accuracy of different clusterings of GR-QC by simply visualizing the linearity of the AVI, AVU and ANUI characteristics. Clearly, the characteristics of all three metrics for RW are more linear. This indicates that the accuracy of the clustering produced by RW on the GR-QC network is higher than that of the other three algorithms. Similarly, we can assert that the accuracy of LICOD on GR-QC is the lowest among all four algorithms. The overall significance of the characteristics analysis on LFR graphs and real-world networks is summarized in Table 4.

Table 4: Characteristics of AVI, AVU and ANUI. Characteristics for predicted clustering are inspected with respect to real clustering. Increment with cluster size and significance in terms of accuracy are briefed.

CDF of AVI for predicted vs real clustering
Algorithms   Following   Linear increment   Clustering accuracy
HC-PIN       Yes         Yes                High
LICOD        No          Some part          Low
SCAN         No          No                 Low
RW           Some part   Some part          Average

CDF of AVU for predicted vs real clustering
Algorithms   Following   Linear increment   Clustering accuracy
HC-PIN       Yes         Yes                High
LICOD        Yes         Yes                Low
SCAN         No          No                 Low
RW           Some part   Some part          Average

CDF of ANUI for predicted vs real clustering
Algorithms   Following   Linear increment   Clustering accuracy
HC-PIN       Yes         Yes                High
LICOD        No          No                 Low
SCAN         No          No                 Low
RW           Some part   Some part          Average

7. Conclusion and Future Directions

In this paper, a set of three quality metrics, AVI, AVU and ANUI, is proposed to evaluate the clustering predicted by any graph clustering technique. These metrics evaluate clusters in terms of both quality and accuracy. The metrics are designed on the basis of two properties of social community formation. First, a cluster should be able to isolate itself from the rest of the network, i.e. the isolation property. Second, a cluster should be unlikely to unify with other clusters, i.e. the unification property. We have designed two metrics from these two perspectives. The proposed metric AVI utilizes internal as well as external connections of a cluster to measure the Isolability of clusters, whereas AVU utilizes only external connections to measure the Unifiability of clusters. ANUI combines AVI and AVU to reflect the combined effect of the Isolability and Unifiability of clusters.

The analysis used to evaluate the performance of the proposed metrics can be divided roughly into four parts. Firstly, the competitiveness of AVI, AVU and ANUI in terms of accuracy is examined with respect to other popular and widely used metrics for evaluating clustering. Secondly, the effectiveness of AVI, AVU and ANUI in indicating accuracy along with quality is analyzed. Thirdly, the characteristics of AVI, AVU and ANUI are inspected. Lastly, an axiomatic analysis is performed on theoretical grounds to ensure that the proposed metrics meet qualitative requirements. All experiments related to accuracy have been carried out on three LFR benchmark graphs and three real-world networks. The clusters used for the analysis are generated with four algorithms from different domains. The experimental results illustrate the effectiveness, competitiveness, and efficiency of the proposed metrics in indicating accuracy along with quality.

The capability of each of the three metrics in dealing with accuracy is analyzed individually. Results demonstrate that both AVI and ANUI alone can indicate the level of accuracy of a clustering, but AVU alone cannot determine accuracy. However, ANUI along with AVI and AVU can give more insight into the clustering in the following ways. Good quality clustering can be identified in three ways by simply observing AVI and AVU values. Firstly, both AVI and AVU may show higher values. Secondly, AVI may show a higher value and AVU a moderate value. Lastly, AVI may show a moderate value and AVU a higher value. For bad clustering, AVU shows a comparatively higher value and AVI a very low value, or vice versa. However, AVI and AVU cannot both be very low at the same time, because if external connectivity increases then internal connectivity decreases and vice versa. Linearity in the characteristics of AVI, AVU and ANUI can also give an indication of accuracy. Such an indication will be very helpful for determining accuracy, especially for unknown networks, which would be very helpful in the decision making of expert and intelligent systems. Moreover, we have theoretically proved that AVI, AVU and ANUI satisfy all of the six quality-related properties. Most quality metrics, including the widely used modularity, are unable to satisfy all six properties.

Real-world networks obtained from the graphical representation of objects are diverse, huge, complex, and unknown. Realization of ground truth for such networks is a very challenging task and often remains unexplored. In such cases, assurance of the accuracy of any clustering is even more challenging, as ground truth is not available to validate the accuracy. Quality metrics do not require ground truth for evaluating a clustering. The proposed quality metrics have the ability to determine the accuracy along with the quality of the clustering, as evidenced by the characteristics analysis. Although it is not possible to ensure accuracy fully, we have achieved our goal to an extent despite numerous difficulties and constraints. The work can be further explored and tuned to increase the level of assurance of accuracy. This work studies ANUI by giving equal weight to Unifiability and Isolability. In future work, we will examine the impact of assigning different weights and develop clustering algorithms based on the ideas behind these metrics to ensure accuracy algorithmically.

Although the proposed metrics can be used to evaluate clusterings obtained for most connected networks, we suggest not using AVU and ANUI for networks that are densely connected. Most graph clustering algorithms are likely to produce a single cluster for densely connected networks. In such cases, AVU cannot be measured, as no external connection is left if the entire network is included in a single cluster. ANUI also cannot be measured in such cases, as it incorporates AVU. In addition, disconnected networks are not suitable for the proposed metrics.
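The two limitations stated in this section (a clustering that puts the entire network into a single cluster leaves no external connections for AVU, and disconnected networks are not suitable) can be checked before any metric is computed. The following is a minimal sketch of such a pre-check; the helper names `is_connected` and `applicability`, the message strings and the toy graph are assumptions made for illustration and are not part of the proposed metrics.

```python
from collections import deque


def is_connected(nodes, edges):
    """Breadth-first search over an undirected graph.

    Assumes every edge endpoint is listed in `nodes`.
    """
    nodes = set(nodes)
    if not nodes:
        return False
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)


def applicability(nodes, edges, clusters):
    """Return the reasons why AVU and ANUI should not be computed (empty if none)."""
    problems = []
    if not is_connected(nodes, edges):
        problems.append("network is disconnected")
    if len(clusters) <= 1:
        problems.append("single cluster: no external connections for AVU")
    return problems


# Toy path graph 0-1-2-3 (hypothetical example).
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]

ok = applicability(nodes, edges, [{0, 1}, {2, 3}])
single = applicability(nodes, edges, [{0, 1, 2, 3}])
disconnected = applicability(nodes + [4], edges, [{0, 1}, {2, 3}, {4}])
print(ok, single, disconnected)
```

Returning a list of reasons rather than a boolean keeps the check informative when both conditions fail at once.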
