Available online at www.sciencedirect.com
Procedia Engineering 15 (2011) 3456 – 3460
$GYDQFHGLQ&RQWURO(QJLQHHULQJDQG,QIRUPDWLRQ6FLHQFH
Set Pair Community Mining and Situation Analysis Based on Web Social Network Zhang Chunying 1 , Liang Ruitao 1 ˈLiu Lu 1 ,Wang Jing2 1* 1 Hebei United University ,College of science, Hebei,Tangshan,China, 063009 2 Tangshan College, Hebei,Tangshan,China,063020
Abstract With regard to the uncertain and ambiguous problems in Web community, this paper uses the established set pair social network model and the integration community mining method to propose a set pair community mining algorithms with situation analysis and to study its trends. Firstly, the set is defined on the basis of the social network model, set pair community is defined according to the features of community; secondly, through calculating the set pair community links between the two degree of body to build relations contact matrix, to give the corresponding threshold value in order to cluster set pair social networks and to find the minimum and maximum community of set of social networks; finally, this paper designs a set pair community-mining algorithm, and conducts situation of minimum and maximum community. The analysis shows that the development trends of set pair community have a close relation with the strength of community relations contact situation.
© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [CEIS 2011] Keywords: Social networks; community; set pair social networks; set pair community;
1. Introduction Social network [1] is connected with each individual to individual of social structure, performance by the nodes and link network of relations. Community’s [2-7] social networks for some relationship between each other a group of individuals, social network is the most common one of the structure. Community mining techniques attracted increasing attention has been the data mining areas of problems. In real life, individual and individuals have multiple often, such as ˖ both relations with friends, colleagues, the relationship, etc. and in the existing network of relations of a particular of a relation. How many relationships from network to tap intellectual? Wang Jinglong [8] for heterogeneous in the community. Optimization theories of knowledge combined application of the * Corresponding author. Tel.:13733251086 E-mail address:
[email protected] ,
[email protected] ,
[email protected] The research reported in the article was funded by Important Basic Research Project of Applied basic research in Hebei Province,NO:10963527D
1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2011.08.647
3457
Zhang Chunying et al. / Procedia Engineering 15 (2011) 3456 – 3460
small segment of the extraction algorithm maximize the excavation of the community of common interest. Zhang Chunying [9] a temper for the random walk around a match to an algorithm, the concept of a property and give a genus of coarse match his reckoning. in actual social networks are complicated, not only individual relations between and among these relations and there may be ambiguous and agencies and uncertainty relations. Social networks in the existence of uncertainty relations, the author [10] set to combine theory with the analysis, that system, a comprehensive handling of the problem of uncertainty and rational to give a network of relations between the two individual close relationship between the assembly on social network model. the assembly on social network is made of, property and ties a network connection, social network is a set of social network is a unique network. In the assembly on social network, you can dig have much of the community relations structure is demanding prompt solution question, therefore, the first to set out in the community, the concept and established the nature of the assembly of the excavation of an algorithm, and to set about community development tendency was analysed. We can know set pair community development tendency and to the strength of the set pair community by situation analysed. Basic structure as follows ˖ 2. Set pair social network model In the social network relationship, the study of the properties of the two sets v k and v s , have N attributes (ie, the sum of the two objects’ attributes), including S which are the same attribtes in two object properties set, and P are not the same property in two object properties, and F are the uncertain properties in two object properties. Then order: S N
a,
F N
b,
P N
c
˄1˅
As for the complex nodes in social networks, as well as the uncertainty between nodes, we give the following definition: Definition 1.1: (set pair social network model) supposes all the study objects in social network are v k and v s has the attribute set: domain set U ^v1 , v 2 , v n ` , which A(v k ) {v k x1e, v k x 2e, , v k x nn } and A(v s ) {v s x1 , v s x 2 , , v s x n } , then the relation connection degree between k and s in the social network is: ˄2˅ U (vk , vs ) a bi cj Where v k U and v s U , a is the ration of the same properties and mutual properties in two objects; b is the ration of the uncertain properties and mutual properties in two objects; c is the ratio of the different properties and mutual properties in two objects; i is the different markers in the interval depending on the different circumstances on the values of [1,1] , j only acts as tag and valued 1 . Definition1.2: (set pair social network) set the set of vertices V {v1 , v 2 , , v n } , the vertex set of attributes V1 {vx1 , vx 2 , , vx f } , U (v k , v s ) to (v k , v s ) in the t times in the relationship on the set of attributes V1 connection degree, as the connection v k and v s for the edge. Definition 1.3: Set matrix R ( U (v k , v s )) k us called as a set of links between R matrix, where U (v k , v s ) is set on the contact between matrix elements, R ( U (v k , v s )) k us is v k and v s contact with relationship matrix, or a set of contact between matrix determine the set of social network Study the relationship between the connection degree. This matrix can be expressed as:
R(t )
U (v1 , v1 ) U (v1 , v 2 ) U (v1 , v s ) ½ ° U (v , v ) U (v , v ) U (v , v ) ° ° 2 1 2 2 2 s ° ¾ ® ° ° °¯U (vk , v1 ) U (v k , v 2 ) U (vk , v s )°¿
˄3˅
3458
Zhang Chunying et al. / Procedia Engineering 15 (2011) 3456 – 3460
Without considering the case of time t, R matrix is a static relationship matrix, otherwise the dynamic relationship matrix, over time, social network node is constantly dynamic, and thus get the relation matrix is continuously updated, by analyzing the relationship between the matrix at different times, we can find the trend of social networks. Definition 1.4: In the social network, Set vertex set V {v1 , v 2 , , v n } , the vertex set of attributes V1 {vx1 , vx 2 , , vx f } , set pair the contact between matrix R ( U (v k , v s )) k us is set to GR (V (VX ), R ) called set pair social networks. Property 1: Set any of the contact relationship matrix U (v k , v s ) ,there is U (v k , v s ) U (v s , v k ) , that v k and v s as same as v s and v k the contact relationship. Set pair contact relationship matrix is a symmetric matrix. Property 2: Diagonal is constant, there is U (v k , v s ) 1 , that the same individual is 1 itself, diversity and opposition of 0. Property 3: When U (v k , v s ) 1 , then there is no contact between two individuals. 3. Set pair community In the social network share a common nature of the so-called community is a group of individuals. The mining community of social networks is an important feature of the members of the community within the community is very close links between, and the links between communities are evacuated. Social network community structure is one of the most common features. Most communities in the mining algorithm and hierarchical clustering task of sociology and graph theory [11, 12] is closely related to graph partitioning task. In the set pair social networks model proposed a set of social networking concept, while in the set pair social networks, there are also the concept of community is equally entitled to the common nature of a group of individuals, the relationship between these individuals through the set of theoretical for analysis. More comprehensive evaluation of individual integrated relationship between the degree of relationship. Defined as follows: Definition 2.1: (Minimum set pair relationship sets) Setting a threshold a [0,1] , the set pair contact between the matrix, the relationship between any two individuals to contact degree U (v k , v s ) al bl i cl j , where l 1,2, , n . If the degree of U (v k , v s ) in the relationship between contact al t a , the relationship constitutes a minimum set of minrel {Vl U l a l bl i c l , a l t a} . Definition 2.2: (Maximum set pair relationship sets) Setting a threshold D [0,1] , the set pair contact between the matrix, the relationship between any two individuals to contact degree U (v k , v s ) al bl i cl j , where l 1,2, , n . If the degree of U (v k , v s ) in the relationship between contact al bl i t D , the relationship constitutes a minimum set of maxrel {Vl Ul al bl i cl , al bl i t D } . Definition 2.3: ( D minimum community relations) According to al t a , set to form a community relations for the minimum minrel , for the individual to V1 {v11 , v12 , , v1m } , then form a set of sub-networks of social networks, which is D minimum community relations . Definition 2.4: ( D maximum Community Relations) According to al bl i t D , set to form a relationship for the largest community maxrel , for the individual to V1 {v11 , v12 , , v1n } .The formation of a set of sub-networks of social networks, which is D maximum Community Relations. Definition 2.5: (tightness) Set pair social network is a variety of relations between individuals to form, and caused the tightness of the network by the entire network to evaluate the relationship between the degrees of connection, Which is networks,
U
tig
¦ U , Where tig is a close degree pair social n
is set pair the relationship between the individual network connection degree, n is set
pair number of individuals in the network. Property 1: Set pair closely connected to the community is composed of a group of individuals. Property 2: Set pair relationship community matrix is a symmetric matrix. Property 3: If minrel maxrel , is Set pair relationship between social network is a single social network
Zhang Chunying et al. / Procedia Engineering 15 (2011) 3456 – 3460
4. Set pair community mining algorithm For the Set pair social networks, and more analysis of the network relationship is the relationship between individuals an important method of contact between the network is given a reasonable degree of comprehensive relations between the individual level. Set ideas on community mining algorithm: Set pair community by calculating the relationship between the two body connection degree, build relations contact matrix, a given threshold corresponding Set of social networks pair Set pair clustering to identify Set pair Social networks set the minimum and maximum. And find out the minimum and maximum set of two is Set on the community constitutes the minimum and maximum of two communities. D community mining algorithm description Input: V (VX ) vertex set of attributes of a set pair social networks Output: D minimum community relations V1 {v11 , v12 , , v1m } ; D maximum Community Relations V1 {v11 , v12 , , v1n } . As follows: 1) For a given Set pair vertex attributes V (VX ) of social networks between scans for a collection. 2) Set pair network relationship between any two bodies was Set pair analysis of a collection of EA , obtained IDO˄identical different oppose˅Expression:
U (v k , v s )
al bl i cl j , l
1,2 , n
3) The network between any two bodies the relationship between degrees of integration into a matrix, which links to build a relationship matrix R . 4) In the set pair contact between the matrix, the relationship between any two individuals to contact degree U (v k , v s ) al bl i cl j , where l 1,2, , n . 5) Set the threshold value a [0,1] . 6) If you choose al t a , then the relationship constitutes a minimum set of minrel {Vl U l al bl i cl , a l t a} is the implementation of Step 7. If you choose al bl i t a , then the relationship constitutes the largest collection of maxrel {Vl U l al bl i cl , al bl i t a} . Then perform step 8. 7) The relationship between accesses to the minimum set of V1 {v11 , v12 , , v1m } , which is a minimum community relations; 8) The relationship between the maximum set of V1 {v11 , v12 , , v1n } , which is a maximum Community Relations. 9) The algorithm end. 5. Situation analysis of set pair community Situation is reflected in the number of the same degree of contact a , difference degree b , c degree of antagonism between the size of the order of the concept of momentum in the contact number of the major aspects can be divided into the same situation, the balance of power and anti-trend, while in the set of social networks can According to trend analysis of the relationship between the two bodies are graded into ĉ, Ċ and ċ grade three level. For the set pair community links between the individual degree of U (v k , v s ) al bl i cl j where l 1,2, , n , and the relationship between degrees of connection, if a / c ! 1 , then the extent of the relationship between the two bodies belong to grade ĉ. If a / c 1 , then the relationship between the level of the two bodies are Ċ. If a / c 1 , then the relationship between the level of the two bodies are ċ. To contact based on degree of relationship between the three situations, the relationship between the levels of the higher level of the weaker. Can build a relationship level table˖
Relationship level
Table 1 Relationship between the size of abc
Relationship strength
a>c,b=0
Associate Relations
a>c>b, bĮ0
Strong Relations
Level ĉ
3459
3460
Zhang Chunying et al. / Procedia Engineering 15 (2011) 3456 – 3460
a>b>c
Weak Relations
a>c,b>a
Micro-relations
a=c,b=0 Level Ċ
a=c,a>b>0 a=b=c a=c,b>a a
level ċ
a
b>0 aa a
Associate Relations Strong Relations Weak Relations Micro-relations Associate Relations Strong Relations Weak Relations Micro-relations
Through the relationship between scale of our community for the Set pair relations between individuals compare the size of the extent, and thus the relationship between a more in-depth understanding. 6. Conclusion
This paper presents a set pair social networks and the related concepts of set community and social networks. Through the use of Set model pair idea, this paper proposes a affiliated set pair community group of individual network digging from set pair social network, which provides a new idea for the uncertain problems in community. Through using situation, this paper analyzes the trend of set community relations between individuals. This paper only studies set pair social network of static state, but the Set pair social network will change over time. How to solve the problem of digging set pair community dynamically is the next step to study. References: [1]
Gou Hong-mei, Huang Bi-qing. A framework for virtual enterprise Operation management [J].
Computers in Industry, 2003, 50(3):333-352. [2] NEWMAN M E J, GIRVAN M. Finding and evaluating community structure in networks [J].Physical Review E, 2004, 69(2):026113. [3] CLAUSET A, NEWMAN M E J, MOORE C. Finding community structure in very large networks [J].Physical Review E, 2004, 69(2):026113. [4] NEWAN M E J. Modularity and community structure in networks [J].PNAS, 2006, 103(23): 8577-8582. [5] Sun.Y, Mao.B. Behavior of Members in Virtual Community Based on Data Mining [J]. JOURNAL OF COMPUTER APPLICATIONS, 2003, 23(1):50-53. [6] Yue.R, Yue.X. Analysis on topology structure of students' interpersonal transaction networks [J]. JOURNAL OF COMPUTER APPLICATIONS, 2006, 26(1):227-229. [7]
Zhang.J.Z, Fan.M. Mining Web community based on improved maximum flow algorithm [J]. JOURNAL OF
COMPUTER APPLICATIONS, 2009, 29(1):213-216. [8] Wang.J.L, Xu.C.F, Luo.G. J. Community mining with heterogeneous relation.[J]. JOURNAL OF COMPUTER APPLICATIONS, 2007,27(12):3016-3018. [9] Guo J.F, Zhang C.Y, Chen X. Attribute Graph and Its Structure [J].ICIC Express Letter, 2010(ICIC-EL10-1014). [10] Zhang C.Y, Liang R.T, Liu L. Set Pair social network analysis model and information mining [c]. International Conference on Electronic Commerce,Web Application and Communication[S.1]:[s.n.],2011:253-257. [11] Kernighan B W, Lin S. An efficient heuristic procedure for partitioning graphs [J] Bell System Technical Journal, 1970, 49(2), 291-307. [12] Liu.T, Hu.B.Q.Detecting Community in Complex Networks Using Cluster Analysis [J]COMPLEX SYSTEMS AND COMPLEXITY SCIENCE,2007,4(1):28-35.