Computers in Human Behavior 56 (2016) 295–305
Full length article
Introducing a procedure for developing a novel centrality measure (Sociability Centrality) for social networks using TOPSIS method and genetic algorithm
Mehrdad Agha Mohammad Ali Kermani*, Aghdas Badiee, Alireza Aliahmadi, Mahdi Ghazanfari, Hamed Kalantari
PhD Candidate of Industrial Engineering Department, Iran University of Science and Technology, Iran
Article info
Article history: Received 4 July 2015; received in revised form 4 November 2015; accepted 9 November 2015; available online 17 December 2015.
Abstract
Centrality is one of the most important fields of social network research. To date, several centrality measures based on topological features of nodes in social networks have been proposed, each of which investigates the importance of nodes from a certain point of view. Such measures are one-dimensional and thus not suitable for measuring sociological features of nodes. Given that the main basis of Social Network Analysis (SNA) is related to social issues and interactions, a novel procedure is hereby proposed for developing a new centrality measure, named Sociability Centrality, based on the TOPSIS method and Genetic Algorithm (GA). This new centrality is not only based on topological features of nodes, but is also a representation of their psychological and sociological features that is calculable for large size networks (e.g. online social networks) and has a high correlation with the nodes' social skill questionnaire scores. Finally, the efficiency of the proposed procedure for developing sociability centrality was tested via implementation on the Abrar Dataset. Our results show that this centrality measure outperforms its existing counterparts in terms of representing the social skills of nodes in a social network. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Sociability Centrality; Social network analysis; Mobile phone based social network; TOPSIS; Genetic algorithm
1. Introduction
Networks are all around us and we, as individuals, are ourselves the units of a network of social relationships of different kinds and, as biological systems, the delicate results of a network of biochemical reactions (Boccaletti, Latora, Moreno, Chavez, & Hwang, 2006). The study of networks was started by Leonhard Euler, whose work was published in response to the Königsberg bridge problem. Later, this field of study was named graph theory and focused on answering a number of specific practical questions such as defining the shortest path length between two nodes of a network or specifying the maximum flow between source and sink nodes. Efforts of scholars to extend graph theory into the social sciences have led to the development of Social Network Analysis (SNA), which was introduced in the 1920s. SNA focuses on the relationship analysis between social and economic entities with the aim of
* Corresponding author: M. Agha Mohammad Ali Kermani. http://dx.doi.org/10.1016/j.chb.2015.11.008. 0747-5632/© 2015 Elsevier Ltd. All rights reserved.
finding common features between them. For example, differences between social networks and other networks with respect to some common features of social networks, such as a high degree of clustering, were studied by Newman and Park (Newman & Park, 2003). Since SNA is an interdisciplinary field of research, many scholars are working on relating concepts of SNA from different disciplines such as economics, sociology and computer science. These research efforts deal with network analysis from a single point of view and try to respond to certain questions in SNA. For example, economists attempt to model why links are formed between nodes in a social network (Goyal & Vega-Redondo, 2005; Jackson, 2008a,b; Jackson & Wolinsky, 1996), while physicists and computer scientists work more on searching and diffusion in networks (Adamic & Adar, 2005; Adamic, Lukose, Puniyani, & Huberman, 2001; Leskovec, Adamic, & Huberman, 2007). Research in SNA can be categorized into five main classes as follows:
1) Modeling real phenomena such as the labor market (Calvó-Armengol & Jackson, 2004; Calvó-Armengol & Jackson,
2007), virtual friendships (Lewis, Kaufman, Gonzalez, Wimmer, & Christakis, 2008), supply chain management (Borgatti & Li, 2009), etc., which show the power of SNA. A comprehensive review of the application of SNA in modeling real life phenomena was conducted by Jackson (Jackson, 2010).
2) Studying local and structural features of social networks such as centrality measures and six degrees of separation, proposed by Milgram (Milgram, 1967). One of the central works in this class of research, with a focus on structural features of social networks from a global point of view, was authored by Newman and Park (Newman & Park, 2003).
3) Studying models of network formation (Barabási & Albert, 1999; Box-Steffensmeier & Christenson, 2014; Erdős & Rényi, 1959; Leskovec, Chakrabarti, Kleinberg, & Faloutsos, 2005). This class of research is subcategorized into random and strategic models of network formation by Jackson (2008a,b).
4) Studying processes such as navigation and diffusion in networks (Kleinberg, 2000; Louni & Subbalakshmi, 2014; Small, 2012; Zinoviev, 2011). Such work investigates the diffusion of phenomena in social networks from a macro point of view, in contrast to other research that deals with the problem from a micro point of view.
5) Social influence studies, which investigate processes from a specific point of view. This class of research focuses on models dealing with how a node in a social network is infected by others; these are known as influence models (Granovetter, 1978; Granovetter & Soong, 1986; Macy, 1991).
One of the well-known problems in SNA is the identification of key nodes in social networks (e.g. (Bonacich, 1987; Freeman, 1979)). For many years, the interconnectedness of actors in social networks has been a central issue of SNA (Friedl & Heidemann, 2010). In terms of the aforementioned classification of SNA studies, the present research is related to the local and structural features of social networks.
All scholars would agree that power is a fundamental property of social structures; however, there is no consensus on how power is defined and how it can be described and consequently analyzed (Hanneman & Riddle, 2005). The power and importance of actors (nodes) in social networks are quantified by centrality measures. The idea of centrality applied to human communication as a criterion for measuring relative power was proposed by Bavelas (Bavelas, 1948). A comprehensive review of centrality measures and their applications was conducted by Freeman (Freeman, 1979). Four basic concepts of centrality, namely degree centrality, betweenness centrality, closeness centrality and eigenvector centrality, are presented as follows (Friedl & Heidemann, 2010). Each centrality measure captures a different aspect of importance. Degree centrality should be considered when the direct connection of nodes is the only criterion for quantifying the power of actors. Betweenness centrality should be used in situations where the power of nodes is related to their intermediary role, for example, in the Medici network (Kent, 1975). Closeness centrality should be utilized where the power of an actor is related to ease of access to other nodes. Finally, eigenvector centrality should be considered in diffusion applications such as that of Banerjee et al. (Banerjee, Chandrasekhar, Duflo, & Jackson, 2013). The discussion is often limited to undirected and unweighted social networks in a manner that is rather simplified. However, even for these relatively simple graphs there is no uniform understanding of an actor's centrality in a social network (Borgatti & Everett, 2006). Besides the above-mentioned basic centrality measures, some other centrality measures have also been proposed over the years (see, e.g. (Agryzkov, Oliver, Tortosa, & Vicent, 2014; Banerjee et al., 2013; Yan, Zhai, & Fan, 2013)). These recently proposed centrality measures enable consideration of directed or
weighted networks (e.g. (Opsahl, Agneessens, & Skvoretz, 2010; Page, Brin, Motwani, & Winograd, 1999)) and some of them are developed for special applications (e.g. (Hines and Blumsack, 2008)).
2. Problem definition
Generally speaking, each of the proposed centrality measures investigates the importance of nodes in social networks from a particular point of view. “Social network” is defined as a “pattern of friendship, advice, communication or support between individual members or groups of members within a social system” (Valente, 1996). A number of studies support the presence of a significant relation between centrality measures and nodes' social activity in social networks. For example, extraverted individuals generally have a higher degree centrality in Facebook as a social network (Amichai-Hamburger & Vinitzky, 2010), and belong to more Facebook Groups (Ross et al., 2009) than introverted individuals. Moreover, Borgatti et al. investigated the measures of social capital and focused on the relation between well-known centrality measures and social capital as a part of their research (Borgatti, Jones, & Everett, 1998). Another work that also focused on the relation between centrality measures and social activity is presented in (Baldwin, Bedell, & Johnson, 1997). Here, the authors considered an advice network which reflects an individual's involvement in exchanging assistance with coworkers and also engaging in mutual problem-solving. They argued that an individual who is central in the advice network is, over time, able to accumulate knowledge about task-related problems and workable solutions. In addition, Sparrowe et al. showed that individuals who were central in their work group's advice network had higher levels of in-role and extra-role performance than individuals who were not central players in that network (Sparrowe, Liden, Wayne, & Kraimer, 2001).
Although the main basis of SNA is related to social issues and social interactions, it seems that all existing centrality measures are based merely on the topological features of networks. These measures are not directly relevant to psychological and sociological features of nodes and are thus not useful for comparing nodes based on their social skill performances. To cope with this problem and to compare the nodes according to their social skill performances, one of the relevant questionnaires has to be distributed to and filled in by individuals. As such, qualitative data that is not related to the nodes' position in the network is gathered. Moreover, this method is not practical in large size networks such as online social networks. In total, there exists no centrality measure that encompasses all the following features:
• Nodes' topological features
• Nodes' psychological and sociological features
• Applicability to large size networks
Therefore, it is reasonable to combine some of the existing measures (based on nodes' position in the network) using a given method to develop a new measure, here called “Sociability Centrality”, that incorporates the above-mentioned features. In this paper, the procedure for developing this new centrality measure is introduced. This procedure is based on the position of nodes in a network and would represent the social skill performance of every node. On the one hand, the present work aims at evaluating selected nodes in social networks based on certain criteria (e.g. In-Degree, Betweenness and Closeness) and assigning a score to each node. On the other hand, these
scores should be correlated with the nodes' psychological and sociological features. Given the features of sociability centrality, this measure could be applied in large scale social networks such as Online Social Networks. This suggests, from an applications point of view, that the proposed centrality measure and its development procedure could be considered a novel application to social networks (e.g. Facebook). Finally, “Sociability Centrality” could be calculated based on some of the already existing centrality measures. The proposed procedure is based on TOPSIS as a multi attribute decision making (MADM) method and on GA as a metaheuristic optimization method. The paper is organized as follows: in Sections 3 and 4, TOPSIS and GA are presented, respectively. The proposed procedure for developing Sociability Centrality is then introduced in Section 5 and, finally, experimental results of its implementation on the Abrar dataset are reported in Section 6.
3. TOPSIS as a MADM method
Problems that deal with the evaluation of m choices based on n criteria are called Multi Criteria Decision Making (MCDM) problems, for which solution methods such as AHP,¹ VIKOR² and ANP³ have been proposed. These methods are introduced in Saaty (1990), Duckstein and Opricovic (1980), and Saaty (2005), respectively. TOPSIS is the acronym of “Technique for Order of Preference by Similarity to Ideal Solution”. It is a decision making method for solving problems with multiple criteria proposed by Hwang and Yoon (Hwang & Yoon, 1981). The TOPSIS method minimizes the distance to the ideal solution while maximizing the distance to the nadir, and uses a compensatory accumulation method that evaluates several choices by considering the weighted criteria. This method is used in many applications in various fields such as social network analysis (Mesgari, Agha Mohammad Ali Kermani, Hanneman & Aliahmadi, 2015), supply chain management (Agha Mohammad Ali Kermani, Navidi, & Alborzi, 2012), HSE⁴ management (Sadoughi, Yarahmadi, Taghdisi & Mehrabi, 2012) and project management (Mahmoodzadeh, Shahrabi, Pariazar, & Zaeri, 2007), and its advantages are outlined by Kim, Park, and Yoon (Kim, Park, & Yoon, 1997) as follows:
• a sound logic that represents the rationale of human choice,
• a scalar value that accounts for both the best and worst alternatives simultaneously,
• a simple computation process that can be easily programmed into a spreadsheet, and
• the performance measures of all alternatives on attributes can be visualized on a polyhedron, at least for any two dimensions.
So, TOPSIS has a reasonable procedure for evaluating some choices (m) based on some criteria (n) with different levels of relative importance. The relative importance of criteria in the evaluation process is expressed by weights. It should be noted that in a given MCDM problem, the final score of each choice changes when the weights of the criteria are changed.
The basic principle of TOPSIS requires the chosen alternative to have the shortest distance from the positive ideal solution and the farthest distance from the negative ideal solution, as shown in Fig. 1. Details of the TOPSIS method are presented in (Jahanshahloo, Lotfi, & Izadikhah, 2006). The present work targets an MCDM problem that deals with the evaluation of social skills in selected nodes of a network based on some existing criteria in SNA. Here, the number of selected nodes and the number of centrality measures used for the evaluation of selected nodes are denoted by m and n, respectively. As described in the previous section, TOPSIS is used to allocate scores to selected nodes in a network such that these scores have the greatest correlation coefficient with the social skill questionnaire scores.
4. Genetic algorithm
Genetic Algorithms (GAs) were first proposed by John Holland (Holland, 1992). A GA searches an extensive and complex solution space based on guided random procedures to detect “good”, but not necessarily optimum, solutions in a given timeframe (Gómez-Skarmeta & Jiménez, 1999). The different solution values are coded as proposed initial solution values and are called the “initial population”, with each member of the population called a chromosome (Surmann & Maniadakis, 2001). Generally speaking, genetic algorithm operators such as parent selection, crossover and mutation are applied to the predecessor population in order to generate offspring for the next population. Selection refers to the stage in a GA in which individual chromosomes are chosen from a population for later breeding through well-known methods like roulette-wheel selection, tournament selection, reward-based selection, stochastic universal sampling, etc. Mutation is a genetic operator used to maintain genetic diversity from one generation of a population of genetic algorithm chromosomes to the next. It is analogous to biological mutation.
Mutation alters one or more gene values in a chromosome from its initial state. In mutation, the solution may change entirely from the one in the previous stage. Hence, GA can potentially arrive at a better solution via mutation. Mutation occurs during evolution based on a user-defined mutation probability value, which should be set low. If it is too high, the search will turn into a primitive random search. Different mutation types are suitable for different genome types, such as bit string mutation, flip bit, boundary, non-uniform, uniform and Gaussian mutation (Ahmed, 2010).
¹ Analytic Hierarchy Process.
² ViseKriterijumska Optimizacija I Kompromisno Rjesenje (a Serbian phrase that means: Multicriteria Optimization and Compromise Solution).
³ Analytic Network Process.
⁴ Health, Safety and Environment.
Fig. 1. Principle of TOPSIS.
Crossover is a genetic operator used to modify the programming of a chromosome or chromosomes from one generation to the next. It is analogous to reproduction and biological crossover, upon which genetic algorithms are based. Crossover is a process of taking more than one parent solution, from which child solutions may be produced. Many crossover techniques exist for organisms that use different data structures to store themselves. These include one-point crossover, two-point crossover, cut and splice, uniform crossover and half uniform crossover (Gwiazda, 2006). Then, based on the fitness function value of each chromosome in the population and the respective survivor selection, new populations are generated until the termination criterion is fulfilled. The schematic procedure of GA is depicted in Fig. 2.
5. Proposed procedure
The main objective of the present research is to quantify the sociability power of nodes in social networks via Sociability Centrality. To introduce this novel measure, the following procedure is proposed for both small and large size networks. The proposed procedure is based on TOPSIS and GA with the following steps:
a) Select a sample of nodes randomly in a social network; the sample size could be much smaller than the actual size of the network, but more accurate results are attained if the sample size is large enough based on the central limit theorem (Hazewinkel, 2001). The selected nodes are considered alternatives in the TOPSIS method.
b) Fill in the Social Skills Questionnaire. This is performed by the selected nodes in the network and yields the social skill score of each selected node. Given the purpose of this work, which is to introduce a new centrality measure that is representative of the social skill of nodes in the network, the questionnaire designed by Inderbitzen and Foster (Inderbitzen & Foster, 1992) is used.
c) Calculate well-known centrality measures for each selected node in the network. These include the Indegree, Outdegree, Eigenvector, Closeness, Betweenness and PageRank (Jackson, 2010) measures. Values of these centrality measures are considered as the criteria for TOPSIS.
d) Use GA as an optimization routine to find the best weights for the TOPSIS criteria. The main challenge here is to obtain the best weights for the criteria, and determination of these weights could thus be considered an optimization problem. The sum of these weights should be equal to 1, and they should be tuned in such a way that the scores of selected nodes obtained based on TOPSIS exhibit the greatest possible correlation with the social skills questionnaire scores. These weights can be determined by Genetic Algorithms (GA) as tools for approximating the solution within reasonable time.
e) Calculate the well-known centrality measures for all nodes in the social network under study. These include the Indegree, Outdegree, Eigenvector, Closeness, Betweenness and PageRank measures as before. Since these centrality measures are topology based, they can be calculated for all nodes within acceptable time.
f) Perform an additional TOPSIS in which all nodes in the social network under study are considered alternatives, the calculated values of the Indegree, Outdegree, Eigenvector, Closeness, Betweenness and PageRank measures are considered the criteria, and the weights resulting from GA in step (d) are considered the weights of the criteria. The scores obtained from this second TOPSIS give the Sociability Centrality of each node.
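The scoring used in steps (d) and (f) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the vector normalization, the treatment of every criterion as a benefit criterion (larger is better), the use of Pearson correlation, and the names `topsis_scores`, `pearson` and `fitness` are all choices made here for illustration, and the sample data are hypothetical.

```python
import math

def topsis_scores(matrix, weights):
    """Return one closeness score in [0, 1] per row (alternative).

    Assumptions (not taken from the paper): vector normalization and
    all criteria treated as benefit criteria.
    """
    n = len(weights)
    # Column-wise Euclidean norms; fall back to 1.0 for an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    # Weighted normalized decision matrix.
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(r[j] for r in v) for j in range(n)]    # positive ideal
    worst = [min(r[j] for r in v) for j in range(n)]   # negative ideal
    scores = []
    for r in v:
        d_plus = math.sqrt(sum((r[j] - best[j]) ** 2 for j in range(n)))
        d_minus = math.sqrt(sum((r[j] - worst[j]) ** 2 for j in range(n)))
        total = d_plus + d_minus
        scores.append(d_minus / total if total else 0.0)
    return scores

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant vector counts as uncorrelated
    return cov / math.sqrt(var_x * var_y)

def fitness(weights, sample_matrix, questionnaire):
    """Objective of step (d): correlation of TOPSIS scores with questionnaire scores."""
    return pearson(topsis_scores(sample_matrix, weights), questionnaire)

# Hypothetical sample: 4 nodes, 2 criteria, and their questionnaire scores.
sample = [[9.0, 1.0], [6.0, 3.0], [3.0, 7.0], [1.0, 9.0]]
questionnaire = [0.9, 0.7, 0.4, 0.2]
print(fitness([0.9, 0.1], sample, questionnaire))  # weights favouring criterion 1
print(fitness([0.1, 0.9], sample, questionnaire))  # weights favouring criterion 2
```

In this toy sample the first weight vector yields a strongly positive correlation and the second a strongly negative one, which is why the weights are worth optimizing. Once good weights are found on the sample, `topsis_scores` would be called once more on the full decision matrix, as in step (f).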
6. Experimental results
6.1. Introducing the dataset
To introduce Sociability Centrality and demonstrate its performance via implementation of the proposed procedure, the Abrar Dataset is used as a medium size social network. Abrar University is a single-sex university located in Tehran. More than eight hundred students are enrolled at the university in three different disciplines. For the implementation, 163 students enrolled in the 2010–2011 and 2011–2012 academic years in the fields of Computer Engineering and Industrial Engineering are considered as nodes of the social network. In this network, a directed link is drawn from individual i to individual j if i has the mobile phone number of j. In addition, forty-one randomly selected students (about 25% of the total number of students in the network) were asked to fill in a social skills questionnaire (Inderbitzen & Foster, 1992). The resulting network, which is introduced in Agha Mohammad Ali Kermani, Karimimajd, Mohammadi, and Aliahmadi (2014) and used in Mesgari et al. (2015), Agha Mohammad Ali Kermani, Aliahmadi, and Hanneman is depicted in Fig. 3.
6.2. Parameter tuning of GA
To generate an initial population, and to detect the best weight for each centrality measure so as to obtain the maximum correlation coefficient between the social skill questionnaire scores and the calculated scores based on TOPSIS, the input parameters related to the weights of the centrality measures could be selected randomly. The general scheme of the considered GA is as follows:
a) Representations
The assumed chromosome in each population consists of n genes which represent the weights of the various centrality measures in the TOPSIS method. The constraint on the aforementioned weights is as below:
$$ W_j \in [0, 1] \quad \text{and} \quad \sum_{j=1}^{n} W_j = 1, \qquad j = 1, 2, \ldots, n $$

Fig. 2. Schematic procedure of GA.
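As a small illustration of this representation, the sketch below generates chromosomes that satisfy the weight constraint. Drawing each gene uniformly at random and then dividing by the sum is an assumption made here; the source states only that the weights could be selected randomly. The helper names `random_chromosome` and `is_feasible` are hypothetical.

```python
import random

def random_chromosome(n, rng=random):
    """Draw n non-negative weights and rescale them so that they sum to 1."""
    genes = [rng.random() for _ in range(n)]
    total = sum(genes)
    if total == 0:  # practically impossible, but avoids division by zero
        return [1.0 / n] * n
    return [g / total for g in genes]

def is_feasible(weights, tol=1e-9):
    """Check the constraint: every W_j lies in [0, 1] and the W_j sum to 1."""
    in_range = all(0.0 <= w <= 1.0 for w in weights)
    return in_range and abs(sum(weights) - 1.0) <= tol

# n = 6 centrality measures and a population of 20 chromosomes.
population = [random_chromosome(6) for _ in range(20)]
print(all(is_feasible(c) for c in population))
```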
Fig. 3. Abrar student network (49, 63).

b) Fitness Function
Following calculation of each centrality measure for the selected nodes in the Abrar dataset and their respective weights, the input parameters of TOPSIS are provided. The correlation coefficient between the social skills questionnaire scores of the aforementioned nodes and their calculated scores based on TOPSIS is considered as the fitness function. Clearly, the target weights of the normalized centrality measures lead to the maximum correlation coefficient and consequently the maximum fitness function. Given that the best and worst values for each centrality measure among all selected nodes are denoted as $V_j^+$ and $V_j^-$, respectively, the distances between node i and the ideal positive and negative alternatives are calculated as follows:

$$ d_i^+ = \sqrt{\sum_{j=1}^{n} \left( V_{ij} - V_j^+ \right)^2}, \qquad i = 1, 2, \ldots, m $$

$$ d_i^- = \sqrt{\sum_{j=1}^{n} \left( V_{ij} - V_j^- \right)^2}, \qquad i = 1, 2, \ldots, m $$

The score of each node is directly related to its closeness to the ideal positive alternative. The scores obtained and the social skill questionnaire scores are named $s_{topsis}$ ($s_t$) and $s_{questionnaire}$ ($s_q$), respectively, and the fitness function is the correlation coefficient between them:

$$ s_{topsis} = \frac{d_i^-}{d_i^- + d_i^+} $$

$$ r_{s_t, s_q} = \frac{\sum_{i=1}^{m} \left( s_t - \bar{s}_t \right) \left( s_q - \bar{s}_q \right)}{m \, \sigma_{s_t} \sigma_{s_q}} $$

c) Population
A population is made up of a number of chromosomes, and these chromosomes consist of the weights used in the proposed centrality measure. In this work, each population is made up of 20 chromosomes, that is pop size = 20, and the maximum generation number of the population is considered 100, that is max generation number = 100.

d) Parent Selection Mechanism
Fitness proportionate selection, or roulette-wheel selection, is the selection method applied, with the following implementation:
The fitness function is evaluated for each individual, providing fitness values which are then normalized. Normalization is the process by which the fitness value of each individual is divided by the sum of all fitness values, so that the sum of all resulting fitness values equals 1. The population is sorted by descending fitness values. Accumulated normalized fitness values are computed (the accumulated fitness value of an individual is the sum of its own fitness value plus the fitness values of all the previous individuals). A random number R between 0 and 1 is chosen. The selected individual is the first one whose accumulated normalized value is greater than R.

e) Survivor Selection
The number of chromosomes in any new population generated from mutated, crossovered and migrated chromosomes is predetermined as follows:
Number of mutated chromosomes = probability of mutation * popsize
Number of crossovered chromosomes = probability of crossover * popsize
Number of migrated chromosomes = probability of migration * popsize
In this work, the probabilities of mutation, crossover and migration are considered 0.1, 0.6 and 0.3, respectively. Migration refers to the transfer of the best chromosomes to the next population without any change.

f) Mutation
In this paper, mutated chromosomes are generated by the one-point mutation method in which, after parent selection, a gene value in the parent chromosome is replaced by a new random value. The new gene value is drawn from a Uniform distribution between the lower and upper bounds, that is (0,1). The sum of the gene values in the mutated chromosome may then no longer be equal to one. To overcome this issue, each gene value in the chromosome is divided by the sum of all gene values in that chromosome and the result is placed in a new gene structure. As such, the sum of the newly computed gene values equals one.
g) Crossover
In this paper, one-point crossover is used as the crossover operator, in which a crossover point is selected randomly for every two selected parents (Fig. 4). All data beyond that point in either of the organism strings is swapped between the two parent organisms. The resulting organisms are the children. As mentioned above, given that the considered chromosome consists of seven genes, six situations for crossover are available, one of which is selected randomly in each new population generation process.
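The three operators described in this section (roulette-wheel selection, one-point mutation with renormalization, and one-point crossover) can be sketched as follows. This is an illustrative sketch rather than the authors' Matlab code. Two points are assumptions: the selection routine expects non-negative fitness values that are not all zero, and the children of crossover are renormalized so that their genes sum to one, which the source describes explicitly only for mutation. All function names are hypothetical.

```python
import random

def normalize(genes):
    """Rescale gene values so that they sum to 1."""
    total = sum(genes)
    if total == 0:
        return [1.0 / len(genes)] * len(genes)
    return [g / total for g in genes]

def roulette_select(population, fitnesses, rng=random):
    """Fitness-proportionate selection via accumulated normalized fitness.

    Assumes non-negative fitness values that are not all zero.
    """
    total = sum(fitnesses)
    ranked = sorted(zip(population, fitnesses), key=lambda p: p[1], reverse=True)
    r = rng.random()
    accumulated = 0.0
    for chromosome, fit in ranked:
        accumulated += fit / total
        if accumulated > r:
            return chromosome
    return ranked[-1][0]  # guard against floating-point round-off

def mutate(chromosome, rng=random):
    """Replace one randomly chosen gene by a Uniform(0, 1) draw, then renormalize."""
    child = list(chromosome)
    child[rng.randrange(len(child))] = rng.random()
    return normalize(child)

def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two parents after a random cut point."""
    cut = rng.randrange(1, len(parent_a))
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    # Renormalizing the children is an assumption made for this sketch.
    return normalize(child_a), normalize(child_b)
```

Because the fitness here is a correlation coefficient, it can be negative; a practical implementation would need to shift or clip such values before applying roulette-wheel selection, a detail the source does not discuss.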
h) Termination constraint
Various methods can be applied for termination of any iterative algorithm. In this paper, the genetic algorithm terminates and reports the best solution when the number of generated populations reaches the predetermined max generation value. The abovementioned parameters are shown in Table 1.
Fig. 4. Crossover.
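The termination rule can be seen in context in the following compact loop, which stops after a fixed number of generations and returns the best chromosome. It is a simplified sketch under several assumptions: parents are chosen uniformly at random rather than by roulette-wheel selection, each crossover contributes a single child, the objective is a toy function, and the name `run_ga` and its parameters are hypothetical.

```python
import random

def normalize(genes):
    total = sum(genes)
    return [g / total for g in genes] if total else [1.0 / len(genes)] * len(genes)

def run_ga(fitness, n_genes=6, pop_size=20, max_generation=100,
           n_mutated=2, n_crossovered=12, n_migrated=6, seed=0):
    """Generational GA that terminates after max_generation generations."""
    assert n_mutated + n_crossovered + n_migrated == pop_size
    rng = random.Random(seed)
    population = [normalize([rng.random() for _ in range(n_genes)])
                  for _ in range(pop_size)]
    for _ in range(max_generation):  # termination: fixed generation budget
        ranked = sorted(population, key=fitness, reverse=True)
        # Migration: the best chromosomes are carried over unchanged.
        next_population = [list(c) for c in ranked[:n_migrated]]
        # Crossover: one-point crossover of two randomly chosen parents
        # (uniform parent choice is a simplification for this sketch).
        while len(next_population) < n_migrated + n_crossovered:
            a, b = rng.sample(ranked, 2)
            cut = rng.randrange(1, n_genes)
            next_population.append(normalize(a[:cut] + b[cut:]))
        # Mutation: redraw one gene of a randomly chosen parent.
        while len(next_population) < pop_size:
            child = list(rng.choice(ranked))
            child[rng.randrange(n_genes)] = rng.random()
            next_population.append(normalize(child))
        population = next_population
    return max(population, key=fitness)

# Toy objective (hypothetical): prefer weight on the first gene.
best = run_ga(lambda w: w[0])
print(best, sum(best))
```

Because the migrated chromosomes are copied unchanged, the best fitness in the population never decreases from one generation to the next, so a fixed generation budget is a safe stopping rule.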
6.3. Representing the solutions
Table 1
Considered genetic algorithm parameters.
Parameters               Value
Popsize                  20
Maxgeneration number     100
Mutation probability     0.1
Crossover probability    0.6
Migration probability    0.3
n                        6
m (First TOPSIS)         41
m (Second TOPSIS)        163
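With the values in Table 1, the survivor selection rule gives the number of chromosomes that each operator contributes to a new population. The short sketch below works this out; rounding to the nearest integer is an assumption added here, since the source gives only the product of probability and population size.

```python
params = {
    "pop_size": 20,
    "max_generation": 100,
    "p_mutation": 0.1,
    "p_crossover": 0.6,
    "p_migration": 0.3,
}

def generation_composition(p):
    """Number of chromosomes each operator contributes to a new population."""
    return {
        "mutated": round(p["p_mutation"] * p["pop_size"]),
        "crossovered": round(p["p_crossover"] * p["pop_size"]),
        "migrated": round(p["p_migration"] * p["pop_size"]),
    }

counts = generation_composition(params)
print(counts)                # {'mutated': 2, 'crossovered': 12, 'migrated': 6}
print(sum(counts.values()))  # 20
```

The three counts add up to the population size, so each new population is fully determined by these three groups.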
The main contribution of the present research is the introduction of a new procedure for developing a novel centrality measure, based on a number of existing centrality measures, which can show the sociability skill of nodes. Based on the proposed procedure and the above-mentioned GA parameters, the decision matrix of the selected nodes in the first TOPSIS consists of 41 alternatives and 6 criteria, as shown in Table 2. The 41 nodes selected in the Abrar dataset are considered in the first TOPSIS to obtain the best weights of the criteria. Their sociability centrality values are shown in Table 3, together with the existing centrality measures and the social skill questionnaire scores.
Table 2
Decision Matrix of Selected Nodes (centrality values).
Node ID  In-Degree  Out-Degree  Eigenvector  Closeness  Betweenness  PageRank
32       45         46          0.955        1.84       1494.973     0.014391
26       36         39          0.804        1.864      1012.26      0.008256
122      28         40          0.382        1.994      1279.359     0.011642
107      31         36          0.423        2.111      892.432      0.010411
27       32         32          0.71         2.284      67.538       0.007271
47       26         36          0.575        2.247      149.728      0.006779
49       25         36          0.38         2.241      647.799      0.008845
118      32         29          0.426        2.204      389.984      0.006504
37       25         22          0.614        2.377      19.125       0.0062
2        24         23          0.534        2.383      58.817       0.005242
84       26         21          0.355        2.278      377.086      0.005659
70       21         20          0.222        2.623      120.307      0.005018
35       17         22          0.384        2.315      52.494       0.003804
10       17         20          0.39         2.457      57.969       0.005343
57       19         17          0.265        2.222      433.872      0.006137
135      14         22          0.056        2.457      94.702       0.015216
111      26         10          0.36         2.599      27.887       0.00513
82       18         15          0.27         2.593      19.501       0.003721
92       10         22          0.167        2.395      7.473        0.001948
142      13         18          0.088        2.315      513.106      0.016211
55       20         11          0.201        2.79       109.86       0.006886
77       26         5           0.308        2.815      26.769       0.004911
136      15         16          0.056        2.556      79.998       0.005348
143      15         16          0.067        2.58       98.679       0.003823
150      16         14          0.068        2.593      206.077      0.012143
61       13         17          0.12         2.63       50.985       0.003257
69       16         14          0.193        2.512      60.454       0.003584
54       13         16          0.151        2.414      72.7         0.003574
50       12         15          0.151        2.42       61.284       0.005109
147      11         16          0.05         2.574      35.439       0.008111
159      10         16          0.04         2.401      160.074      0.003447
13       13         11          0.245        2.648      101.128      0.003791
101      13         6           0.175        2.778      6.793        0.004124
3        10         9           0.213        2.765      2.105        0.002682
119      14         3           0.196        2.914      3.179        0.003722
100      7          7           0.1          2.611      13.633       0.00133
65       7          6           0.055        3.105      0.255        0.00245
108      7          5           0.104        2.735      5.122        0.002138
163      7          1           0.1          3.66       12.188       0.003671
7        25         20          0.548        2.389      50.883       0.005358
25       27         17          0.652        2.414      20.174       0.005807
Table 3 Comparison of the sample nodes (41 nodes) based on sociability centrality and some existing centrality measures. All measures are normalized; Rank is the node's rank by sociability centrality.

Node ID  Questionnaire  Sociability  Rank  In-degree  Out-degree  Eigenvector  Closeness  Betweenness  PageRank
32       0.888          0.848        9     0.849      0.901       0.955        0.502      0.784        0.605
26       1              0.331        111   0.679      0.764       0.804        0.509      0.531        0.347
122      0.881          0.693        27    0.528      0.784       0.382        0.544      0.671        0.490
107      0.918          0.851        8     0.584      0.705       0.423        0.576      0.468        0.438
27       0.837          0.685        30    0.603      0.627       0.71         0.624      0.035        0.306
47       0.903          0.645        36    0.490      0.705       0.575        0.613      0.078        0.285
49       0.888          1            1     0.471      0.705       0.38         0.612      0.340        0.372
118      0.822          0.396        92    0.603      0.568       0.426        0.602      0.204        0.273
37       0.762          0.349        107   0.471      0.431       0.614        0.649      0.010        0.261
2        0.851          0.477        72    0.452      0.450       0.534        0.651      0.0308       0.220
84       0.866          0.771        18    0.490      0.411       0.355        0.622      0.197        0.238
70       0.822          0.722        24    0.396      0.392       0.222        0.716      0.063        0.211
35       0.851          0.376        98    0.320      0.431       0.384        0.632      0.0275       0.160
10       0.896          0.374        99    0.320      0.392       0.39         0.671      0.0304       0.224
57       0.740          0.383        95    0.358      0.333       0.265        0.607      0.227        0.258
135      0.792          0.900        5     0.264      0.431       0.056        0.671      0.049        0.640
111      0.748          0.355        105   0.490      0.196       0.36         0.710      0.014        0.215
82       0.740          0.473        73    0.339      0.686       0.27         0.708      0.010        0.156
92       0.777          0.292        118   0.188      0.431       0.167        0.654      0.003        0.081
142      0.874          0.522        59    0.245      0.352       0.088        0.632      0.269        0.682
55       0.851          0.563        49    0.377      0.215       0.201        0.762      0.057        0.289
77       0.911          0.594        43    0.490      0.098       0.308        0.769      0.014        0.206
136      0.948          0.686        29    0.283      0.313       0.056        0.698      0.041        0.225
143      0.814          0.517        61    0.283      0.313       0.067        0.704      0.051        0.160
150      0.837          0.529        57    0.301      0.274       0.068        0.708      0.108        0.511
61       0.940          0.368        102   0.245      0.333       0.12         0.718      0.026        0.137
69       0.762          0.777        16    0.301      0.274       0.193        0.686      0.031        0.150
54       0.896          0.354        106   0.245      0.313       0.151        0.659      0.038        0.150
50       0.911          0.368        101   0.226      0.294       0.151        0.661      0.032        0.215
147      0.829          0.724        23    0.207      0.313       0.05         0.703      0.018        0.341
159      0.881          0.331        112   0.188      0.313       0.04         0.656      0.084        0.145
13       0.674          0.214        139   0.245      0.215       0.245        0.723      0.053        0.159
101      0.577          0.220        136   0.245      0.117       0.175        0.759      0.003        0.173
3        0.674          0.279        122   0.188      0.686       0.213        0.755      0.001        0.112
119      0.681          0.205        141   0.264      0.058       0.196        0.796      0.001        0.156
100      0.651          0.284        121   0.132      0.137       0.1          0.713      0.007        0.056
65       0.622          0.140        153   0.132      0.117       0.055        0.848      0.0001       0.103
108      0.607          0.239        130   0.132      0.686       0.104        0.747      0.002        0.090
163      0.585          0.047        161   0.132      0.0196      0.1          1          0.006        0.154
7        0.903          0.342        108   0.471      0.392       0.548        0.652      0.026        0.225
25       0.851          0.490        70    0.849      0.333       0.652        0.659      0.010        0.244
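As an illustration of how two columns of Table 3 can be compared, the following minimal Python sketch computes a correlation coefficient between the normalized questionnaire scores and the normalized sociability centrality values of the 41 sample nodes. It assumes the Pearson coefficient, since the type of coefficient is not stated in this section; the function name `pearson` and the variable names are illustrative only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Normalized questionnaire scores of the 41 sample nodes (Table 3).
questionnaire = [
    0.888, 1, 0.881, 0.918, 0.837, 0.903, 0.888, 0.822, 0.762, 0.851,
    0.866, 0.822, 0.851, 0.896, 0.740, 0.792, 0.748, 0.740, 0.777, 0.874,
    0.851, 0.911, 0.948, 0.814, 0.837, 0.940, 0.762, 0.896, 0.911, 0.829,
    0.881, 0.674, 0.577, 0.674, 0.681, 0.651, 0.622, 0.607, 0.585, 0.903,
    0.851,
]
# Normalized sociability centrality values of the same nodes (Table 3).
sociability = [
    0.848, 0.331, 0.693, 0.851, 0.685, 0.645, 1, 0.396, 0.349, 0.477,
    0.771, 0.722, 0.376, 0.374, 0.383, 0.900, 0.355, 0.473, 0.292, 0.522,
    0.563, 0.594, 0.686, 0.517, 0.529, 0.368, 0.777, 0.354, 0.368, 0.724,
    0.331, 0.214, 0.220, 0.279, 0.205, 0.284, 0.140, 0.239, 0.047, 0.342,
    0.490,
]

r = pearson(questionnaire, sociability)
print(f"Pearson r over the 41 sample nodes: {r:.4f}")
```

The printed value need not coincide with the coefficients reported elsewhere in the paper, which may have been computed on a different node set or from unrounded values. Swapping `sociability` for any other column of Table 3 gives the corresponding comparison for that measure.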
To find the best relative weights of the criteria, the proposed genetic algorithm was run in Matlab 7.8.0 (R2009a) on a personal computer with an Intel(R) Core™2 Duo 2.53 GHz CPU and 4 GB of RAM; the run took 9.80 s of CPU time. The resulting optimum solution is shown in Table 4, and Fig. 5 depicts, for each generated population, the maximum correlation coefficient between the social skill questionnaire scores of the selected nodes in the Abrar dataset and their scores from the first TOPSIS. The optimized weights are then applied as the criteria weights in the second TOPSIS, whose decision matrix covers all nodes of the studied social network (163 alternatives and 6 criteria). The proposed TOPSIS for developing the sociability centrality measure in the Abrar dataset is summarized in Fig. 6. The correlation coefficient (0.51) obtained between the second TOPSIS scores and the social skill questionnaire scores is higher than the correlation coefficient between each of the other well-known centrality measures and the questionnaire scores.
Therefore, it can be concluded that the novel sociability centrality measure represents the sociability power of each node in a social network better than the other measures from a sociological point of view (see Fig. 7). The measure can also be calculated for large networks using the proposed algorithm. To demonstrate its efficiency, the aforementioned correlation coefficients are listed in Table 5, and Fig. 7 shows scatter plots of the normalized social skill questionnaire scores against the normalized version of each centrality measure. Based on Fig. 7 and Table 5, the existing and proposed centrality measures can be compared by their correlation with the social skill questionnaire scores; this correlation indicates how well each measure represents the social skill of nodes. In the Abrar dataset, the proposed measure has the highest correlation with the questionnaire scores and is therefore the best of the compared measures for representing the nodes' social skill. It is therefore suggested that sociability centrality is a reasonable way to measure the social skill of individuals in other large networks (such as online social networks) in which distributing the questionnaire is not feasible.
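The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Matlab implementation: it assumes the standard TOPSIS formulation (vector normalization, Euclidean distances to the ideal and anti-ideal points) and treats all six criteria as benefit criteria, neither of which is confirmed in this section. It uses the optimized weights of Table 4 and, for brevity, only four rows of Table 3 (already normalized values); the function and variable names are hypothetical.

```python
import math

def topsis(matrix, weights):
    """Return one TOPSIS closeness score in [0, 1] per row (alternative)."""
    n_cols = len(weights)
    # Step 1: vector-normalize each criterion column (assumed normalization).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    # Step 2: apply the criterion weights.
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_cols)]
        for row in matrix
    ]
    # Step 3: ideal and anti-ideal points; every criterion is treated as a
    # benefit criterion (larger is better), which is an assumption.
    ideal = [max(row[j] for row in weighted) for j in range(n_cols)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n_cols)]
    # Step 4: relative closeness to the ideal point.
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)
        d_minus = math.dist(row, anti_ideal)
        total = d_plus + d_minus
        scores.append(d_minus / total if total else 0.0)
    return scores

# Criteria order: in-degree, out-degree, eigenvector, closeness,
# betweenness, PageRank. Weights are the optimized values of Table 4.
weights = [0.321927, 0.412728, 0.00952, 0.003089, 0.012475, 0.240262]

# Four rows of Table 3 (normalized centralities), keyed by node ID.
nodes = {
    32: [0.849, 0.901, 0.955, 0.502, 0.784, 0.605],
    49: [0.471, 0.705, 0.38, 0.612, 0.340, 0.372],
    65: [0.132, 0.117, 0.055, 0.848, 0.0001, 0.103],
    163: [0.132, 0.0196, 0.1, 1.0, 0.006, 0.154],
}

scores = dict(zip(nodes, topsis(list(nodes.values()), weights)))
for node_id, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(node_id, round(score, 3))
```

With only four alternatives, the ideal and anti-ideal points differ from those of the full 163-node decision matrix, so these scores will not reproduce the sociability values of Table 3; the sketch only shows the mechanics of the weighted ranking.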
Fig. 5. Maximum fitness function value across the generated populations.
Fig. 6. Proposed TOPSIS for introducing the sociability centrality in the Abrar dataset.
Table 4 Best solution obtained on the sample.

Best results                                   Value
Fitness function                               0.51956
Optimized weight, In-degree centrality         0.321927
Optimized weight, Out-degree centrality        0.412728
Optimized weight, Eigenvector centrality       0.00952
Optimized weight, Closeness centrality         0.003089
Optimized weight, Betweenness centrality       0.012475
Optimized weight, PageRank centrality          0.240262
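A short arithmetic check helps in reading the optimized weights of Table 4. The Python sketch below, with illustrative variable names, verifies that the six weights sum to one (up to rounding) and computes how much of the total weight falls on the three largest criteria.

```python
# Optimized criteria weights as reported in Table 4.
weights = {
    "In-degree": 0.321927,
    "Out-degree": 0.412728,
    "Eigenvector": 0.00952,
    "Closeness": 0.003089,
    "Betweenness": 0.012475,
    "PageRank": 0.240262,
}

total = sum(weights.values())  # close to 1, up to rounding of the reported values
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
top3_share = sum(w for _, w in ranked[:3]) / total

for name, w in ranked:
    print(f"{name:12s} {w:.6f}")
print(f"sum of weights      : {total:.6f}")
print(f"share of top 3      : {top3_share:.3f}")
```

Out-degree, in-degree and PageRank together carry about 97.5% of the total weight, so under these weights the eigenvector, closeness and betweenness criteria contribute very little to the resulting score.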
[Fig. 7 panels: scatter plots of the normalized social skill questionnaire scores (horizontal axis, 0.5 to 1) versus the normalized centrality value (vertical axis, 0 to 1), one panel each for In-degree, Out-degree, Eigenvector, Closeness, Betweenness, PageRank and the proposed (sociability) centrality.]
Fig. 7. Correlation between centrality measures and social skill questionnaire scores.
Table 5 Correlation between centrality measures and social skill questionnaire scores in the Abrar dataset.

Centrality measure    Correlation with questionnaire score
Sociability           0.6175
PageRank              0.347
Betweenness           0.2183
Closeness             0.4233
Eigenvector           0.4226
Out-degree            0.4174
In-degree             0.4446
7. Conclusion
One of the well-known problems in social network analysis (SNA) is identifying the importance of nodes in a network as a function of certain measures. Given the social theme of SNA, a centrality measure capable of measuring the social skill of nodes is needed. In this paper, such a measure (sociability centrality) is proposed. Sociability centrality is developed using TOPSIS (a multi-criteria decision making method) and a Genetic Algorithm (an optimization method), and it can be calculated for large networks. In the proposed procedure for identifying the importance of nodes from a social skills point of view, the nodes of the network are evaluated with the TOPSIS method. The proposed TOPSIS uses six centrality measures (In-Degree Centrality, Out-Degree Centrality, Closeness Centrality, Betweenness Centrality, Eigenvector Centrality and PageRank Centrality) as its criteria, and the weights of these criteria are obtained using the Genetic Algorithm. As a result, the TOPSIS score of each node (its sociability centrality) correlates well with the node's social skill questionnaire score. To show the efficiency of the proposed procedure, it was implemented on the Abrar Dataset. The centrality measure developed with this procedure outperforms the other centrality measures in terms of correlation with the social skill questionnaire scores (0.6175 versus at most 0.4446 for the existing measures; Table 5).