A new recommender algorithm on signed networks


Physica A 520 (2019) 317–321


Peng Zhang a,∗, Xiaoyu Song a, Leyang Xue a, Ke Gu b

a School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, PR China
b School of Systems Science, Beijing Normal University, Beijing 100875, PR China

Highlights

• We define an index P that measures the likelihood that potential edges are positive ones.
• Regarding index P as a new rating, we propose a new algorithm on signed networks.
• The accuracy of recommending negative edges is greatly improved.
• Our method performs very well on recommendation diversity.

Article info

Article history:
Received 3 September 2018
Received in revised form 20 November 2018
Available online 22 January 2019

Keywords: Signed networks, Recommendations, Online rating systems, Motifs

Abstract

Many real-world systems display opposite relationships and can be studied as signed networks. On signed networks, positive and negative edges mean that users like or dislike objects, respectively. This information is valuable and should be taken into account in recommendations. In this paper, we study recommendation on signed networks that contain users’ purchase behaviors as well as attitude information, which makes it possible not only to validate the accuracy of recommendation algorithms but also to measure users’ satisfaction after purchasing. Accordingly, we propose a new recommender algorithm by defining an index P. We compare our method with four classical algorithms on three different datasets. The results show that, for recommending negative edges, the accuracy of our method is up to three times that of the classical algorithms. In addition, our method achieves better recommendation diversity than the heat conduction algorithm, which is generally recognized as effective in terms of diversity. For instance, on the Movielens dataset the value of Novelty drops from 19.74 for the heat conduction algorithm to 3.04 for our method. In short, our method can recommend objects that are novel to users while ensuring users’ satisfaction after purchasing.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail address: [email protected] (P. Zhang).
https://doi.org/10.1016/j.physa.2019.01.054

1. Introduction

In recent years, complex networks have attracted more and more attention [1–3]. Many real-world systems are depicted as complex networks in order to investigate their structures and functions. Examples include the WWW, the Internet, food webs, biochemical networks, social networks, etc. [4–9]. These works not only brought up new concepts and methods, but also helped us understand complex systems.

The signed network is an important kind of complex network. Edges in signed networks carry positive or negative signs, which represent positive and negative relationships, respectively. Many real-world systems display opposite relationships, especially in social, biological and information fields [10], so signed networks deserve further research.

Fig. 1. Seven motifs of signed bipartite network.

The study of signed networks originally focused on sociology. In the 1940s, Heider explored the structural balance theory, which forms the basis of signed networks [11]. Cartwright and Harary introduced the concept of the balanced signed graph, which is based on the structural balance theory [12]. Existing works mainly focus on the classification of nodes, spectral analysis, etc. [13–17]. Besides, recommendation on signed networks is also a problem that attracts our attention. For instance, users can directly give objects positive or negative ratings in some online rating systems such as Douban,¹ Dianping² and so on. A positive or negative rating means that a user likes or dislikes an object, which shows the user’s attitude towards the object [18]. This is valuable information and should be taken into account in recommendations. Doing so can make recommendations on signed networks both consistent with users’ historical behaviors and more likely to be preferred.

In this paper, we propose a new algorithm that recommends objects to users using the attitude information on the signed network. The attitude information is abstracted into the signs of edges: if a user likes or dislikes an object, the corresponding edge is positive or negative. The potential edges that are more likely to be positive are what we want to recommend. Therefore, we define an index P that measures the likelihood that the sign of a potential edge is positive.

The outline of the article is as follows. In Section 2, we describe index P in detail and, based on it, propose a new recommender algorithm on signed networks. In Section 3, we compare some classical recommender algorithms with our new algorithm. The results show that the accuracy of our algorithm indeed improves, especially in recommending negative edges.
The improvement can be up to three times that of classical algorithms. At the same time, another metric, which measures diversity, is also greatly improved. In Section 4, we give some concluding remarks.

¹ https://www.douban.com/.
² https://www.dianping.com/.

2. Algorithm

A bipartite network $G(U, O, E)$ consists of two different kinds of vertices. For example, one set of nodes $U = \{u_1, u_2, \ldots, u_m\}$ represents users and the other $O = \{o_1, o_2, \ldots, o_n\}$ represents objects in a system, so there are $m$ users and $n$ objects. An edge between a user and an object represents that the user commented on the object. The set of edges $E$ is described by the adjacency matrix $A$, where the element $a_{i\alpha} = 1$ if user $i$ praised object $\alpha$, $a_{i\alpha} = -1$ if user $i$ gave a bad comment on object $\alpha$, and $a_{i\alpha} = 0$ otherwise. The degree of a user, denoted $k_{u_i}$, is the number of objects connected to user $i$, and the degree of an object, denoted $k_{o_\alpha}$, is the number of users who have commented on the object.

The original edges $E$ are first conditionally divided into two parts: the probe set ($E^{P}$) and the training set ($E^{T}$). The ratio of the size of the training set ($E^{T}$) to that of the probe set ($E^{P}$) is 9:1. The recommender algorithms run on the training set. The probe set consists of the remaining edges and is used to test the accuracy of the recommendation algorithms.

In this paper, based on the quadrangle motif [18] (shown in Fig. 1), we propose a new recommender algorithm that recommends objects to users using the attitude information. An index P is defined to measure the likelihood that a potential edge is a positive one:

$$P_{i\alpha} = \frac{N^{+}_{i\alpha}}{N} \qquad (1)$$

where $N$ is the number of all potential quadrangles formed by the potential edge $l_{i\alpha}$ and three other existing edges. $N^{+}_{i\alpha}$ is defined like $N$, except that the signs of the three other existing edges are all positive. If the potential edge $l_{i\alpha}$ does not form any potential quadrangle, or forms only potential quadrangles in which the signs of the other three existing edges are not all positive, then $P_{i\alpha}$ equals 0.

In Ref. [18], Gu et al. show statistical results for the quadrangles of Fig. 1 on some real-world systems. The results reveal that the motif in which all edges are positive (Fig. 1(a)) is much more numerous than any of the other motifs (Fig. 1(b)–(g)). In addition, the stability of signed networks is a problem that deserves our attention. At present, discussions of stable structures on signed networks are inconsistent, but we think that the motif with four positive edges has strong stability, just like the stable triangle structure in one-mode networks. For these reasons, we count only the motif in which all edges are positive in the numerator. In this way, we obtain $P_{i\alpha}$ for each potential edge, and then recommend objects to users according to $P_{i\alpha}$, which is regarded as a new kind of rating.

Table 1
Basic information of three datasets.

| Network | $N_u$ | $N_o$ | $E$ | $E^{+}$ | $E^{-}$ |
|---|---|---|---|---|---|
| Movielens | 943 | 1682 | 72855 | 55375 | 17480 |
| Netflix | 800 | 1884 | 34650 | 27777 | 6873 |
| Douban | 5399 | 1835 | 14908 | 14530 | 378 |

Note: $N_u$, $N_o$, $E$, $E^{+}$ and $E^{-}$ denote the numbers of users, objects, edges, positive edges and negative edges.

To evaluate the performance of our recommender algorithm, we introduce some evaluation metrics. Accuracy is usually measured by the Ranking Score ($R$) [19], which takes into account the position of the probe edges in the recommendation lists. The definition is

$$R_{u_i} = \frac{1}{\left|\{u_{i\alpha} \in E^{P}\}\right|} \sum_{u_{i\alpha} \in E^{P}} \frac{m_{i\alpha}}{M_{u_i}} \qquad (2)$$

where $R_{u_i}$ denotes the ranking score of user $i$, $u_{i\alpha}$ denotes a relation between user $i$ and object $\alpha$ in the probe set, and $m_{i\alpha}$ is the rank of object $\alpha$ in the recommendation list of user $i$. $M_{u_i}$ is the number of objects uncollected by user $i$, and $|\{u_{i\alpha} \in E^{P}\}|$ is the number of objects of user $i$ in the probe set. The $R$ of the whole system is obtained by averaging $R_{u_i}$ over all users. Obviously, $R \in (0, 1)$. The smaller $R$ is, the higher the prediction accuracy of the recommender algorithm, and vice versa. Since we divide the probe set into positive and negative edges, the Ranking Score is redefined as the Ranking Score Positive ($R^{+}$)

$$R^{+}_{u_i} = \frac{1}{\left|\{u_{i\alpha} \in E^{P+}\}\right|} \sum_{u_{i\alpha} \in E^{P+}} \frac{m_{i\alpha}}{M_{u_i}} \qquad (3)$$

and the Ranking Score Negative ($R^{-}$)

$$R^{-}_{u_i} = \frac{1}{\left|\{u_{i\alpha} \in E^{P-}\}\right|} \sum_{u_{i\alpha} \in E^{P-}} \frac{m_{i\alpha}}{M_{u_i}} \qquad (4)$$

where $|\{u_{i\alpha} \in E^{P+}\}|$ and $|\{u_{i\alpha} \in E^{P-}\}|$ count the objects that receive positive and negative ratings from user $i$ in the positive and negative probe sets, respectively. For $R^{-}$, higher values are better: they mean that negative edges lie further back in the recommendation lists and are recommended after positive edges by our algorithm.

Besides accuracy, we also consider another index, Novelty ($N$), which measures the ability of an algorithm to generate surprising and unexpected recommendations [19]. The simplest way to calculate novelty is to use the average popularity of the recommended objects. Given a recommendation list $RL_i$ for user $i$ with $|RL_i| = L$, the novelty is defined as

$$N_i(L) = \frac{1}{L} \sum_{\alpha \in RL_i} k_{o_\alpha} \qquad (5)$$
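As a minimal sketch of Eqs. (2)–(5), the Python below computes the ranking score of one user and the novelty of one recommendation list. The function names, the toy numbers and the tie-breaking rule (tied objects keep their index order) are assumptions made for illustration and are not specified in the text. Passing only the positive or only the negative probe objects of a user gives $R^{+}_{u_i}$ or $R^{-}_{u_i}$.

```python
def ranking_score(scores, collected, probe):
    """Ranking score of Eq. (2) for one user.

    scores    : sequence with the predicted score of every object for this user
    collected : set of objects the user already holds in the training set
    probe     : objects of this user's probe edges (all of them uncollected)
    """
    candidates = [o for o in range(len(scores)) if o not in collected]
    # Higher score first; Python's sort is stable, so ties keep index order
    # (an assumption, the tie rule is not given in the text).
    ordered = sorted(candidates, key=lambda o: -scores[o])
    rank = {o: pos + 1 for pos, o in enumerate(ordered)}  # m_{i alpha}, 1-based
    M = len(candidates)                                   # M_{u_i}
    return sum(rank[o] / M for o in probe) / len(probe)


def novelty(recommended, object_degree):
    """Novelty of Eq. (5): mean degree of the L recommended objects."""
    return sum(object_degree[o] for o in recommended) / len(recommended)


# Toy example (hypothetical numbers): five objects, object 0 already collected.
scores = [0.9, 0.1, 0.5, 0.7, 0.0]
collected = {0}
r_pos = ranking_score(scores, collected, probe=[3])  # liked probe object
r_neg = ranking_score(scores, collected, probe=[4])  # disliked probe object
n_top2 = novelty(recommended=[3, 2], object_degree=[10, 2, 4, 6, 1])
```

In this toy case the liked object is ranked first and the disliked object last among the four uncollected objects, so `r_pos` is 0.25 and `r_neg` is 1.0.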

Lower $N_i(L)$ indicates higher novelty and surprisal. Averaging $N_i(L)$ over all users, we obtain the mean novelty $N(L)$ of the system.

3. Results

For our study, we collect three datasets from Movielens [20], Netflix [21] and Douban.³ The three datasets consist of integer ratings on a scale from 1 to 5. A high or low score usually shows the attitude of a user, namely whether the user likes or dislikes the object. When a user’s rating is higher (lower) than the intermediate value 3, there is a positive (negative) edge in the signed network, and ratings equal to 3 are discarded [18]. More information about the signed networks of the three datasets is shown in Table 1.

In order to test the performance of our algorithm, we compare its results with those of classical recommender algorithms: Collaborative Filtering (CF) [22–25], Mass Diffusion (MD) [26], Heat Conduction (HC) [27], and the Hybrid recommender algorithm combining HC with MD [28–30] with $\lambda = 0.4$ (Hyb_0.4) and $\lambda = 0.8$ (Hyb_0.8). Each result is averaged over 10 independent experiments. The results are shown in Tables 2–4, which report the Ranking Score Negative ($R^{-}$), the Ranking Score Positive ($R^{+}$) and Novelty ($N$), respectively.

³ https://www.douban.com/.
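To make the construction concrete, the following sketch builds a signed adjacency matrix from rating triples using the threshold rule above and then evaluates the index P of Eq. (1) for every absent edge. It is an illustration under stated assumptions, not the authors’ implementation: the function names `signed_adjacency` and `index_p`, the dense NumPy matrices and the toy ratings are all hypothetical. The counting relies on the fact that, for an absent edge between user $i$ and object $\alpha$, every path from $i$ to $\alpha$ over three existing edges closes exactly one potential quadrangle.

```python
import numpy as np


def signed_adjacency(ratings, m, n, middle=3):
    """Build the m x n signed matrix A from (user, object, rating) triples."""
    A = np.zeros((m, n), dtype=np.int64)
    for user, obj, rating in ratings:
        if rating > middle:
            A[user, obj] = 1       # positive edge: the user likes the object
        elif rating < middle:
            A[user, obj] = -1      # negative edge: the user dislikes the object
        # ratings equal to `middle` are discarded, so no edge is created
    return A


def index_p(A):
    """Return P[i, alpha] = N+ / N for every absent edge, and 0 elsewhere."""
    B = (A != 0).astype(np.int64)    # any existing edge
    Bp = (A == 1).astype(np.int64)   # positive edges only
    # Entry [i, alpha] counts paths i - beta - j - alpha over existing edges.
    # For an absent edge the terms with j == i or beta == alpha vanish,
    # because they would need the edge (i, alpha) itself.
    N = B @ B.T @ B
    N_pos = Bp @ Bp.T @ Bp
    P = np.zeros(A.shape, dtype=float)
    mask = (B == 0) & (N > 0)
    P[mask] = N_pos[mask] / N[mask]
    return P


# Toy data (hypothetical): 3 users, 2 objects; user 0 rated object 1 with a 3.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 1, 5), (2, 0, 1), (2, 1, 4)]
A = signed_adjacency(ratings, m=3, n=2)
P = index_p(A)
```

Here the only absent edge joins user 0 and object 1. It closes two potential quadrangles, one through user 1 (all three edges positive) and one through user 2 (one negative edge), so `P[0, 1]` is 0.5. The dense products cost memory quadratic in the number of users, so a real dataset would call for sparse matrices.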

Table 2
The values of evaluation metric $R^{-}$ obtained by some classical algorithms and our new algorithm.

| Algorithms | Movielens | Netflix | Douban |
|---|---|---|---|
| CF | 0.19 | 0.12 | 0.13 |
| HC | 0.18 | 0.24 | 0.26 |
| MD | 0.17 | 0.12 | 0.14 |
| Hyb_0.4 | 0.16 | 0.11 | 0.15 |
| Hyb_0.8 | 0.14 | 0.12 | 0.20 |
| P | 0.46 | 0.57 | 0.44 |
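Table 2 can also be read in relative terms. As a small arithmetic sketch, the snippet below divides the $R^{-}$ of P by that of each classical algorithm, using only the values printed in the table. The dictionary layout and the function name `ratio_to_p` are illustrative assumptions.

```python
# R- values copied from Table 2.
r_neg = {
    "Movielens": {"CF": 0.19, "HC": 0.18, "MD": 0.17,
                  "Hyb_0.4": 0.16, "Hyb_0.8": 0.14, "P": 0.46},
    "Netflix": {"CF": 0.12, "HC": 0.24, "MD": 0.12,
                "Hyb_0.4": 0.11, "Hyb_0.8": 0.12, "P": 0.57},
    "Douban": {"CF": 0.13, "HC": 0.26, "MD": 0.14,
               "Hyb_0.4": 0.15, "Hyb_0.8": 0.20, "P": 0.44},
}


def ratio_to_p(table):
    """For each dataset, return R-(P) divided by R- of every other algorithm."""
    return {
        dataset: {alg: values["P"] / value
                  for alg, value in values.items() if alg != "P"}
        for dataset, values in table.items()
    }


ratios = ratio_to_p(r_neg)
for dataset, row in ratios.items():
    print(dataset, {alg: round(r, 2) for alg, r in row.items()})
```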

Table 3
The values of evaluation metric $R^{+}$ obtained by some classical algorithms and our new algorithm.

| Algorithms | Movielens | Netflix | Douban |
|---|---|---|---|
| CF | 0.10 | 0.08 | 0.12 |
| HC | 0.12 | 0.18 | 0.22 |
| MD | 0.09 | 0.07 | 0.12 |
| Hyb_0.4 | 0.08 | 0.07 | 0.13 |
| Hyb_0.8 | 0.07 | 0.07 | 0.17 |
| P | 0.17 | 0.25 | 0.14 |

Table 4
The values of evaluation metric $N$ obtained by some classical algorithms and our new algorithm.

| Algorithms | Movielens | Netflix | Douban |
|---|---|---|---|
| CF | 263.26 | 239.54 | 179.16 |
| HC | 19.74 | 1.17 | 4.08 |
| MD | 251.66 | 237.75 | 169.34 |
| Hyb_0.4 | 234.15 | 234.71 | 108.73 |
| Hyb_0.8 | 183.35 | 199.10 | 6.00 |
| P | 3.04 | 1.15 | 3.88 |

In Table 2, we show the results of $R^{-}$, which measures the accuracy of recommending negative edges. For $R^{-}$, higher values are better, because edges with negative ratings mean that users dislike the objects, which should therefore not be recommended preferentially. From the results, we find that the $R^{-}$ of our algorithm is higher than that of the other algorithms. This means that negative edges lie further back in the recommendation lists and are recommended after positive edges by our algorithm, which is what we would like to obtain. In our results, the values of $R^{-}$ are at least twice those of the classical recommender algorithms, and even three times that of Hyb_0.4 on the Netflix dataset, where $R^{-}$ reaches 0.57.

In Table 3, we show the results of another metric, $R^{+}$, which measures the accuracy of recommending positive edges. We notice that the $R^{+}$ of our algorithm is higher than that of most other algorithms, but the increase is not pronounced. Moreover, on the Douban dataset the value of $R^{+}$ is close to the others. Compared with the improved ability to recommend negative edges after positive edges (the increase of $R^{-}$), the fact that some positive edges are mistakenly ranked further back (the increase of $R^{+}$) is acceptable.

In Table 4, we display the results for diversity ($N$). A smaller value of $N$ is better. As can be seen, our algorithm is clearly better than the other algorithms on diversity. Even compared with HC, which is generally recognized as an effective algorithm on diversity, the results of our algorithm are still better. Taking Movielens as an example, the value of $N$ drops from 19.74 to 3.04. From the results above, our algorithm improves not only diversity but also accuracy, especially the accuracy of recommending negative edges.

4. Conclusion

The signed network is an important class of complex networks. Many real-world systems display opposite relationships and can be studied as signed networks.
In signed networks, the attitude information is valuable and should be taken into account in recommendations, especially negative attitudes. The general view assumes that users are more interested in objects that appear in recommendation lists, or in recommended objects that look similar to what they are looking for. However, this does not take into account the situation in which users are dissatisfied or disappointed with recommended objects after buying them. Once users purchase objects they dislike by following a recommendation, it may have a significant negative impact on merchants and on users’ experience. Therefore, the accurate prediction of disliked objects is very important and cannot be ignored. Through our study, we find that some potential negative edges are recommended before potential positive edges. Thus, moving potential negative edges back in recommendation lists is what we are supposed to do, and it is also the motivation of our algorithm.

In this paper, an index P is defined to measure the likelihood that a potential edge is a positive one. Regarding index P as a new rating, we propose a new algorithm to recommend objects to users using attitude information. In order to test the performance of our algorithm, we introduce $R^{+}$/$R^{-}$ and $N$ to measure the accuracy of recommending positive/negative potential edges and the diversity, respectively. The results show that the accuracy of recommending potential negative edges is greatly improved: a three-fold increase is observed compared with classical recommender algorithms. Although the accuracy of recommending potential positive edges decreases, the decrease is not pronounced. Compared with the improved accuracy in avoiding the preferential recommendation of potential negative edges, mistakenly ranking some potential positive edges further back is acceptable. In addition, our method performs well on diversity. These results confirm that our algorithm improves not only the diversity of recommendations but also the accuracy, especially the accuracy of recommending negative edges.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant No. 61403037).

References

[1] R. Albert, A.-L. Barabási, Statistical mechanics of complex networks, Rev. Modern Phys. 74 (2002) 47–97, http://dx.doi.org/10.1103/RevModPhys.74.47.
[2] M.E. Newman, The structure and function of complex networks, SIAM Rev. 45 (2) (2003) 167–256.
[3] L.d.F. Costa, O.N. Oliveira Jr, G. Travieso, F.A. Rodrigues, P.R. Villas Boas, L. Antiqueira, M.P. Viana, L.E. Correa Rocha, Analyzing and modeling real-world phenomena with complex networks: a survey of applications, Adv. Phys. 60 (3) (2011) 329–412.
[4] Z.-K. Zhang, C. Liu, X.-X. Zhan, X. Lu, C.-X. Zhang, Y.-C. Zhang, Dynamics of information diffusion and its applications on complex networks, Phys. Rep. 651 (2016) 1–34.
[5] J. Gao, S.V. Buldyrev, H.E. Stanley, S. Havlin, Networks formed from interdependent networks, Nature Phys. 8 (1) (2012) 40.
[6] P. Holme, J. Saramäki, Temporal networks, Phys. Rep. 519 (3) (2012) 97–125, http://dx.doi.org/10.1016/j.physrep.2012.03.001.
[7] S. Boccaletti, G. Bianconi, R. Criado, C.I. Del Genio, J. Gómez-Gardenes, M. Romance, I. Sendina-Nadal, Z. Wang, M. Zanin, The structure and dynamics of multilayer networks, Phys. Rep. 544 (1) (2014) 1–122.
[8] S. Boccaletti, J. Almendral, S. Guan, I. Leyva, Z. Liu, I. Sendiña-Nadal, Z. Wang, Y. Zou, Explosive transitions in complex networks’ structure and dynamics: percolation and synchronization, Phys. Rep. 660 (2016) 1–94.
[9] L. Ermann, K.M. Frahm, D.L. Shepelyansky, Google matrix analysis of directed networks, Rev. Modern Phys. 87 (4) (2015) 1261.
[10] S.Q. Cheng, H.W. Shen, G.Q. Zhang, X.Q. Cheng, Survey of signed network research, J. Softw. (2014).
[11] F. Heider, Attitudes and cognitive organization, J. Psychol. 21 (1) (1946) 107–112, http://dx.doi.org/10.1080/00223980.1946.9917275.
[12] D. Cartwright, F. Harary, Structural balance: a generalization of Heider’s theory, Psychol. Rev. 63 (1956) 277–293.
[13] L. Akoglu, R. Chandy, C. Faloutsos, Opinion fraud detection in online reviews by network effects, in: Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013, 2013, pp. 2–11.
[14] S. Banerjee, K. Sarkar, S. Gokalp, A. Sen, H. Davulcu, Partitioning signed bipartite graphs for classification of individuals and organizations, in: S.J. Yang, A.M. Greenberg, M. Endsley (Eds.), Social Computing, Behavioral - Cultural Modeling and Prediction, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 196–204.
[15] A. Mrvar, P. Doreian, Partitioning signed two-mode networks, J. Math. Sociol. 33 (3) (2009) 196–221, http://dx.doi.org/10.1080/00222500902946210.
[16] J. Leskovec, D. Huttenlocher, J. Kleinberg, Predicting positive and negative links in online social networks, in: Proceedings of the 19th International Conference on World Wide Web, ACM, 2010, pp. 641–650.
[17] G. Beigi, J. Tang, H. Liu, Signed link analysis in social media networks, in: ICWSM, 2016, pp. 539–542.
[18] K. Gu, Y. Fan, A. Zeng, J. Zhou, Z. Di, Analysis on large-scale rating systems based on the signed network, Physica A 507 (2018) 99–109.
[19] Y. Zhou, L. Lü, W. Liu, J. Zhang, The power of ground user in recommender systems, PLOS ONE 8 (8) (2013) 1–11, http://dx.doi.org/10.1371/journal.pone.0070094.
[20] F.M. Harper, J.A. Konstan, The movielens datasets: history and context, ACM Trans. Interact. Intell. Syst. 5 (4) (2015) 19:1–19:19, http://dx.doi.org/10.1145/2827872.
[21] J. Bennett, C. Elkan, B. Liu, P. Smyth, D. Tikk, KDD cup and workshop 2007, SIGKDD Explor. Newsl. 9 (2) (2007) 51–52, http://dx.doi.org/10.1145/1345448.1345459.
[22] J.L. Herlocker, J.A. Konstan, A. Borchers, J. Riedl, An algorithmic framework for performing collaborative filtering, in: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’99, ACM, New York, NY, USA, 1999, pp. 230–237, http://doi.acm.org/10.1145/312624.312682.
[23] J.L. Herlocker, J.A. Konstan, J. Riedl, Explaining collaborative filtering recommendations, in: Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work, CSCW ’00, ACM, New York, NY, USA, 2000, pp. 241–250, http://doi.acm.org/10.1145/358916.358995.
[24] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl, Grouplens: an open architecture for collaborative filtering of netnews, in: Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, CSCW ’94, ACM, New York, NY, USA, 1994, pp. 175–186, http://doi.acm.org/10.1145/192844.192905.
[25] B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in: Proceedings of the 10th International Conference on World Wide Web, WWW ’01, ACM, New York, NY, USA, 2001, pp. 285–295, http://doi.acm.org/10.1145/371920.372071.
[26] T. Zhou, J. Ren, M. Medo, Y.-C. Zhang, Bipartite network projection and personal recommendation, Phys. Rev. E 76 (2007) 046115, http://dx.doi.org/10.1103/PhysRevE.76.046115.
[27] Y.-C. Zhang, M. Blattner, Y.-K. Yu, Heat conduction process on community networks as a recommendation model, Phys. Rev. Lett. 99 (2007) 154301, http://dx.doi.org/10.1103/PhysRevLett.99.154301.
[28] A. Zeng, A. Vidmer, M. Medo, Y.C. Zhang, Information filtering by similarity-preferential diffusion processes, Europhys. Lett. 105 (5) (2014) 58002, http://stacks.iop.org/0295-5075/105/i=5/a=58002.
[29] T. Zhou, Z. Kuscsik, J.G. Liu, M. Medo, J.R. Wakeling, Y.C. Zhang, Solving the apparent diversity-accuracy dilemma of recommender systems, Proc. Natl. Acad. Sci. 107 (10) (2010) 4511–4515, http://dx.doi.org/10.1073/pnas.1000488107.
[30] W. Zeng, A. Zeng, H. Liu, M.-S. Shang, T. Zhou, Uncovering the information core in recommender systems, Sci. Rep. 4 (2014).