ARTICLE IN PRESS
JID: NEUCOM
[m5G;June 9, 2017;14:11]
Neurocomputing 000 (2017) 1–12
Contents lists available at ScienceDirect
Neurocomputing journal homepage: www.elsevier.com/locate/neucom
An item orientated recommendation algorithm from the multi-view perspective
Qi-Ying Hu a,b,c,1, Zhi-Lin Zhao a,b,c,1, Chang-Dong Wang a,b,c,∗, Jian-Huang Lai a,b,c
a School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, Guangdong, China
b Guangdong Key Laboratory of Information Security Technology, Guangzhou, 510006, China
c The Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University), Ministry of Education, Guangzhou, 510006, China
Article info
Article history: Received 2 September 2016; Revised 8 December 2016; Accepted 8 December 2016; Available online xxx
Keywords: Recommendation algorithm; Item orientated; Multi-view learning

Abstract
In traditional recommendation algorithms, items are recommended to users on the basis of the users’ preferences in order to improve selling efficiency, which, however, cannot always raise revenues for the manufacturers of particular items. Assume that a manufacturer has a limited budget for advertising an item; with this budget, the item can be marketed only to a limited number of users. How should the most suitable users be selected so as to increase advertisement revenue? This appears to be an insurmountable problem for the existing recommendation algorithms. To address this issue, a new item orientated recommendation algorithm from the multi-view perspective is proposed in this paper. Different from the existing recommendation algorithms, this model provides the target items with the users who are the most likely to purchase them. The basic idea is to simultaneously calculate the relationships between items and the rating differences between users from a multi-view model in which the purchasing records of each user are regarded as a view and each record is seen as a node in a view. The experimental results show that the proposed method outperforms the state-of-the-art methods in the scenario of item orientated recommendation.
© 2017 Published by Elsevier B.V.
1. Introduction
In the era of information explosion, a great amount of data has sprung up on the Internet. It is therefore difficult to pick out useful information when a decision must be made among a large number of choices in a short time, and people have trouble choosing suitable items. Recommendation algorithms have emerged to deal with this problem. They have been widely used in a number of well-known e-commerce platforms such as Amazon [1], Netflix [2], YouTube [3] and Yahoo [4]. In a recommender system, appropriate items are recommended to users based on the users’ preferences or purchasing records, increasing their urge to shop. Generally speaking, traditional recommendation algorithms can be classified into three types: collaborative filtering (CF), content-based recommendation (CB) and hybrid recommendation [5]. Moreover, some other recommendation algorithms integrating knowledge transferred from other information such as the social network [6], the item’s
∗ Corresponding author at: School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, Guangdong, China. E-mail addresses:
[email protected] (Q.-Y. Hu),
[email protected]. cn (Z.-L. Zhao),
[email protected] (C.-D. Wang),
[email protected] (J.-H. Lai).
1 Authors contribute equally.
tag [7] or the context [8] have also been developed to meet the needs of different application tasks. In addition, many techniques can be incorporated into recommendation algorithms, such as matrix factorization [9], graph models [10], random walk [11] and Bayesian approaches [12]. However, all the existing algorithms take only the users’ benefits into consideration, and no algorithm is designed to perform recommendation on the basis of items so as to raise revenues for the manufacturers of particular items. Assume that a manufacturer has a limited budget for advertising an item. With this budget, the manufacturer can advertise to only a limited number of users. How should the most suitable users be selected so as to increase advertisement revenue? This appears to be an insurmountable problem for the existing recommendation algorithms. To the best of our knowledge, all the existing algorithms are designed to recommend items to users, and none recommends users to items. To address the above issue, we present an approach based on multi-view learning that performs item orientated recommendation. Nowadays a multitude of heterogeneous but related views of data have arisen in many fields and have been widely investigated in social network mining [13], clustering [14], classification [15], computer vision [16], domain adaptation [17] and transfer learning [18]. Combining information from these related views will improve
http://dx.doi.org/10.1016/j.neucom.2016.12.102 0925-2312/© 2017 Published by Elsevier B.V.
Please cite this article as: Q.-Y. Hu et al., An item orientated recommendation algorithm from the multi-view perspective, Neurocomputing (2017), http://dx.doi.org/10.1016/j.neucom.2016.12.102
the learning performance because more information is used to describe the objects. This leads to a challenging machine learning problem termed multi-view learning [19,20]. In multi-view learning, different views admit the same underlying class structure, and the generated model can not only better capture the data within individual views but also be consistent across different views [14,21]. However, up to now no recommendation algorithm has investigated multi-view learning.
In this paper, we propose a novel item orientated recommendation algorithm from the multi-view perspective (MVIR), which recommends users to an item so as to raise advertisement revenue for the manufacturer. A multi-view model is first established, in which the purchasing records of each user are represented by a view and the items purchased by this user correspond to the nodes in the view. The relationships between items and the differences between users are learned from the multi-view model by gradient descent. Both are used to predict the potential customers of items. To evaluate the effectiveness of the proposed approach, extensive experiments have been conducted on four real-world rating datasets. Experimental results show that the proposed MVIR algorithm recommends users to items well, which means that the manufacturer of a particular item can advertise to the recommended users to increase revenue because they are the most likely to buy the item. We compare our algorithm with some state-of-the-art recommendation algorithms, and the results show that MVIR is the most suitable one for the problem of recommending users to items.
The contributions are summarized as follows:
1. A novel item orientated recommendation algorithm is proposed to increase the advertisement revenues of the manufacturers of particular items. It is the first time that multi-view learning has been incorporated to perform recommendation.
2. The proposed model utilizes multi-view learning to learn the items’ relationships and the users’ rating differences, and provides the target items with the users who are the most likely to purchase them, so that manufacturers can make the best of their advertising budget.
3. Extensive experiments conducted on several real-world rating datasets show that our model outperforms other state-of-the-art recommendation algorithms and can accurately predict the users who will buy the target items.
The rest of this paper is organized as follows. Section 2 introduces the related work on recommendation algorithms and multi-view learning. In Section 3, we describe the proposed MVIR algorithm in detail. In Section 4, extensive experiments are conducted to illustrate the performance of the proposed algorithm. Section 5 concludes this paper.

2. Related work
In this section, we review several major approaches for recommender systems and multi-view learning. The collaborative filtering (CF) algorithm is one of the most successful methods in e-commerce [22–24]. The CF algorithm utilizes the scores in users’ rating records to predict the items that the target user will like. There are two kinds of CF algorithms: item-based collaborative filtering (IBCF) [25] and user-based collaborative filtering (UBCF) [26]. The IBCF algorithm computes the items’ similarities according to the users’ rating records and recommends to a user the items that are similar to what the user has purchased. In contrast, the UBCF algorithm measures the similarities between users and provides target users with the items that similar users have purchased [27]. The most widely used
similarities include cosine-based similarity, correlation-based similarity and adjusted-cosine similarity [28]. Apart from the CF algorithm, the content-based (CB) algorithm [29] also plays a significant role in industry. It finds features to represent an item and utilizes the profile of a user to learn the user’s preference; the recommendation is then made by matching the result of profile learning against the features of an item. The most common way to learn users’ preferences is to use as features the highly weighted keywords generated by TF-IDF [30] from the users’ documents. A hybrid recommendation algorithm combines the advantages of different recommendation algorithms to produce a better recommendation result. The simplest approach is to combine the results calculated by the CF and CB algorithms directly [31], while a more comprehensive approach is to integrate different algorithms into a single model, e.g. [32]. Some other recommendation algorithms have also been developed. Matrix factorization methods, such as probabilistic matrix factorization (PMF) [9], maximum-margin matrix factorization (MMMF) [33], nonnegative matrix factorization (NMF) [34] and tensor decomposition (TD) [35], are commonly used in recommendation. In [9], PMF was proposed to obtain the latent feature matrices of users and items by decomposing the user-item rating matrix, and to obtain the final rating matrix as the product of the two feature matrices. Tensor decomposition (TD) offers another perspective on matrix factorization, as it is viewed as a high-order extension of matrix factorization [35]. By decomposing the tensor, the algorithm discovers three latent feature matrices, for users, items and a third element such as time information [36] or tags [37]. Further information has also been combined with the existing CF algorithm or matrix factorization [6,38–41].
For instance, SocialMF [6], which incorporates social networks into PMF, forms a user-user trust matrix and a user-item rating matrix; the latent feature matrices of users and items are then acquired by trust propagation and PMF, yielding better recommendation and alleviating the cold-start problem. In [38], a generalized stochastic block model (GSBM) supposes that the relation between users and items is governed by the relation between their respective groups. All users and items are grouped according to the links between users and the ratings of the items. In [40], a tag system U-CF-TR, which infers users’ similarities according to items’ tags and users’ rating records, is combined with the CF algorithm to improve recommendation quality. Moreover, a model named FUSE was proposed in [41], which utilizes PARAFAC to extract knowledge from the auxiliary domains and makes use of this knowledge in the target domain. In addition, several techniques can be added into recommendation algorithms [11,12,42]. In [11], items are recommended to users by performing a query-centered random walk on a K-partite graph. The method utilizes multi-way clustering to group together highly related nodes, and the recommendation is made by traversing the subgraph induced by the clusters on the basis of a user’s preference. The model proposed in [12] applies a Bayesian treatment to matrix factorization, placing Gaussian-Wishart priors on the user and item hyperparameters of the corresponding feature vectors obtained by PMF, and is trained by Markov chain Monte Carlo methods. Restricted Boltzmann Machines (RBM) [42], one of the most important network structures in deep learning, have been utilized for movie recommendation on the Netflix dataset. Multi-view learning has many applications in different fields, such as networks, clustering, classification, computer vision, domain adaptation and transfer learning [13,16,43–46].
A generalized framework, PICA [13], was developed to find the relations between users from multislice networks. In [16], a model based on multi-view learning and deep convolutional neural networks was developed for face detection, which does not require annotation and is able to detect faces in a wide range of orientations. Multi-view
Fig. 1. The proposed multi-view model: within-view relationship and cross-view relationship.
clustering algorithms for text data were implemented in [43] by using k-means and EM, improving on their single-view counterparts. In [44], a nonparametric Bayesian probabilistic latent variable model was proposed for multi-view anomaly detection. Multi-view learning can be useful in semi-supervised learning by training a classifier on both the labeled and unlabeled data of each view [45]. Moreover, multi-view learning can be made more flexible by incorporating other techniques; for example, in [46] a tensor-based method with multi-view learning is applied to integrate multi-view information in heterogeneous environments. One serious drawback of the aforementioned recommendation algorithms is that they attend only to the users’ preferences and do not consider the budget of the manufacturers. Inspired by multi-view learning, an item orientated recommendation algorithm is proposed to address this shortcoming of the existing recommendation algorithms.

3. The proposed (MVIR) model
In this section, we describe the proposed MVIR algorithm in detail. By regarding the purchasing records of each user as an individual view, the relationships between items and the rating differences between users can be computed in the multi-view model. The recommendation is made by finding the users who are the most likely to purchase the target item, so that the manufacturer can deliver advertisements to these users only, achieving a better advertising effect and making the best of the advertising budget.

3.1. The multi-view model
The proposed multi-view learning model is illustrated in Fig. 1. The purchasing records of each user are taken as a view associated with the user, in which the items rated by the user are represented as nodes and there exists an edge between any pair of nodes, with the edge weight being the relationship between the two items w.r.t. the user.
The item relationship computed in an individual view is named the within-view relationship, which depicts the relationship between two items as revealed by a single user. Some pairs of items never appear in the same view, which means that no user has bought both items. The potential relationship between such items cannot be mined when only the within-view relationship is considered. Therefore, apart from the within-view computation, the cross-view item relationship, i.e. the edge between items across different views, is also modeled so as to simultaneously consider both the consistency and the diversity of item preferences among different users. Besides, some users tend to give high ratings to items while other users prefer giving low ratings to the same items, so the relationship between two items may vary greatly across different views on account of the users’ ratings. Therefore the rating differences
Fig. 2. Illustration of the relationship between items.
between users, which will be used to adjust the item relationship, should be found by using the differences between the records of different views.

3.1.1. Within-view and cross-view relationships
The input is a user-item rating matrix $R \in \mathbb{R}^{M \times N}$, whose two dimensions correspond to users and items with sizes $M$ and $N$, respectively. Each entry $r_{u,i}$ of $R$ in row $u$ and column $i$ denotes the rating of user $u$ for item $i$, within a numerical interval that varies across datasets. The higher the rating $r_{u,i}$, the more user $u$ likes item $i$; if user $u$ has not yet rated item $i$, $r_{u,i}$ is set to zero. From the rating matrix $R$, a multi-view model can be constructed, in which the within-view and cross-view relationships are computed as follows.
1. Within-view relationship: The within-view relationship of two different items $i$ and $j$ in the individual view corresponding to user $u$, denoted as $S_{i,j}^{u}$, is computed as follows,
$$S_{i,j}^{u} = \sqrt{r_{u,i}^{2} + r_{u,j}^{2}}. \tag{1}$$
2. Cross-view relationship: The cross-view relationship between item $i$ purchased by user $u$ and item $j$ purchased by user $v$, across the two different views corresponding to user $u$ and user $v$, denoted as $S_{i,j}^{u,v}$, is computed as follows,
$$S_{i,j}^{u,v} = \sqrt{r_{u,i}^{2} + r_{v,j}^{2}}. \tag{2}$$
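Eqs. (1) and (2) can be illustrated with a minimal sketch. The function names `within_view` and `cross_view` and the toy rating matrix are hypothetical and chosen only for illustration; ratings are assumed to lie in [0, 5] with 0 meaning "not rated".

```python
import math

def within_view(R, u, i, j):
    """Eq. (1): relationship of items i and j inside the view of user u."""
    return math.sqrt(R[u][i] ** 2 + R[u][j] ** 2)

def cross_view(R, u, v, i, j):
    """Eq. (2): item i rated by user u versus item j rated by user v."""
    return math.sqrt(R[u][i] ** 2 + R[v][j] ** 2)

# Hypothetical toy rating matrix: 3 users x 3 items, 0 = not rated.
R = [
    [5.0, 5.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 4.0],
]

high = within_view(R, 0, 0, 1)      # both ratings are high
low = within_view(R, 1, 0, 1)       # both ratings are equal but low
cross = cross_view(R, 0, 2, 0, 2)   # user 0 on item 0, user 2 on item 2
print(high, low, cross)
```

In this sketch the pair rated (5, 5) receives a larger relationship value than the pair rated (1, 1), although the two ratings agree in both cases.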
The computation of the cross-view relationship does not take the rating differences into consideration, because the rating difference is incorporated into the final objective function. The within-view and cross-view item relationships are computed by the Euclidean distance, which is argued below to be a suitable metric for our algorithm. Fig. 2 illustrates the relationship between items. We regard the ratings of item i and item j as the X-axis and the Y-axis, respectively. The value of the relationship is then the distance from the origin to the point whose abscissa and ordinate are the ratings of item i and item j, respectively. Clearly, the distance from the origin to the point (5,5) is the maximum value of the relationship, meaning that the two items are closest to each other. It is worth noting that two items are not strongly relevant if their ratings are both small, even when the ratings are the same; for example, the distance
of point (1,1) is smaller than that of point (1,2). In our model, the relationship value is high and the items are close only when the two items both have high ratings and the ratings are similar. The reason is that when two items i and j have high and similar ratings, users like both of them and are likely to buy the two items together. So, if a user has bought item i, there is a high chance that this user will buy item j, and the manufacturer of item j can advertise to this user. However, if two items have similar but low ratings, users like neither of them, even though the two items may be similar. So, when a user has bought one of them, we should not recommend the other one to the user. For these reasons, we do not use the distance between the item ratings to measure the item relationships, because a user may like neither of the items even if the two ratings are very close. This also shows that the Euclidean distance from the origin to the point formed by the two items’ ratings is a suitable metric.

3.1.2. Rating difference
There exist rating differences among different users. For example, suppose user u rates item i with 4.5, while item j receives scores of 5 and 2 from user v and user w, respectively. By Eq. (2), the cross-view relationship between items i and j across the view pair u and v is $S_{i,j}^{u,v} = \sqrt{4.5^{2} + 5^{2}}$, while the cross-view relationship between items i and j across the view pair u and w is $S_{i,j}^{u,w} = \sqrt{4.5^{2} + 2^{2}}$, which are clearly different. So the rating differences have an impact on the relationships between items. A smaller rating difference implies that two users are more relevant, and vice versa. Fig. 1 also demonstrates the relationships between users. Two users can be connected by an edge whose weight is the value of the rating difference between them. Disconnected users have preferences that differ significantly, so the rating difference between them is large. Any pair of users u and v connected by an edge is more relevant than a pair without a connection. If user u has purchased the target item while the connected user v has not, there is a strong possibility that user v will show a preference for this item, so user v can be recommended to this item.

3.1.3. The objective function
Based on the discussion above, the goal of the multi-view model is to learn the item relationship matrix $\alpha \in \mathbb{R}^{N \times N}$ and the user rating difference matrix $\beta \in \mathbb{R}^{M \times M}$ as follows,
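$$\alpha, \beta = \arg\min_{\alpha,\beta} \; \delta \sum_{u=1}^{M} \sum_{i,j \in N(u),\, i<j}^{N} \left(\alpha_{i,j} - S_{i,j}^{u}\right)^{2} + (1-\delta) \sum_{u=1,\, v=2,\, u<v}^{M} \; \sum_{i \in N(u),\, j \in N(v),\, i<j}^{N} \left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right)^{2}, \tag{3}$$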
where $\delta \in [0, 1)$ is a parameter that balances the within-view and cross-view item relationships, and $N(u)$ is the set of items purchased by user $u$. Note that $\delta$ cannot take the value 1 in Eq. (3). Many pairs of items do not appear together in any single view, and the differences between users can only be computed across views, so $\delta < 1$ ensures that the objective function takes the cross-view information into account and that all item relationships and all user differences are considered. By minimizing the error between the final relationship matrix and the item relationships in within-view and cross-view, each entry of $\alpha$ is fitted to all occurrences of the corresponding pair of items, and the entries of $\beta$, which adjust for the differences between views, are learned.

3.1.4. Model optimization
Our goal is to minimize the above objective function, and a local minimum can be found by performing gradient descent w.r.t. the relationship $\alpha_{i,j}$ and the rating difference $\beta_{u,v}$ iteratively:
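$$\nabla\alpha_{i,j} = \frac{\partial E}{\partial \alpha_{i,j}} = 2\delta \sum_{u=1}^{M} \left(\alpha_{i,j} - S_{i,j}^{u}\right) + 2(1-\delta) \sum_{u=1,\, v=2,\, u<v}^{M} \left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right). \tag{4}$$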
$$\nabla\beta_{u,v} = \frac{\partial E}{\partial \beta_{u,v}} = -2(1-\delta) \sum_{i \in N(u),\, j \in N(v),\, i<j}^{N} \left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right). \tag{5}$$
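As a check on the signs in Eqs. (4) and (5), a single squared term of Eq. (3) can be differentiated by the chain rule. This is a sketch under the assumption that $E$ denotes the objective in Eq. (3) and that each sum runs only over the views in which the pair $(i, j)$ actually occurs.

```latex
\begin{align*}
\frac{\partial}{\partial \alpha_{i,j}} \left(\alpha_{i,j} - S_{i,j}^{u}\right)^{2}
  &= 2\left(\alpha_{i,j} - S_{i,j}^{u}\right), \\
\frac{\partial}{\partial \alpha_{i,j}} \left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right)^{2}
  &= 2\left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right), \\
\frac{\partial}{\partial \beta_{u,v}} \left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right)^{2}
  &= -2\left(\alpha_{i,j} - S_{i,j}^{u,v} - \beta_{u,v}\right).
\end{align*}
```

Weighting the first line by $\delta$ and the second by $1-\delta$ and summing over the views gives the gradient w.r.t. $\alpha_{i,j}$; the third line, weighted by $1-\delta$ and summed over the item pairs, gives the gradient w.r.t. $\beta_{u,v}$ with its leading minus sign.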
After computing $\nabla\alpha_{i,j}$ and $\nabla\beta_{u,v}$, the entries in $\alpha$ and $\beta$ can be updated by the following equations repeatedly until convergence,
$$\alpha_{i,j} = \alpha_{i,j} - \gamma \nabla\alpha_{i,j}, \tag{6}$$
$$\beta_{u,v} = \beta_{u,v} - \gamma \nabla\beta_{u,v}, \tag{7}$$
where $\gamma$ is the learning rate, which affects the convergence of learning. If the learning rate is too large, the parameters will diverge. Conversely, if it is too small, convergence will take a long time. An appropriate learning rate should therefore be chosen from prior knowledge to obtain good solutions.

3.2. Recommend users to items
Having obtained the item relationship matrix $\alpha$ and the user rating difference matrix $\beta$, we can perform item orientated recommendation, i.e. recommend potential users to a target item i. Let $k_1$ be the number of chosen items that are the closest to i, and let $k_2$ be the number of users recommended to i, who are the most likely to buy the item. The recommendation proceeds as follows.
1. Sort the i-th row of $\alpha$ in descending order and take the top $k_1$ items, excluding item i.
2. Find two disjoint sets $h_1$ and $h_2$, where $h_1$ contains the users who have purchased item i and $h_2$ contains the users who have purchased at least one item in the list of $k_1$ items (which excludes item i), with $h_1 \cap h_2 = \emptyset$.
3. $\forall u \in h_2$, compute the mean of the differences between u and all users in $h_1$. The set $D_i$ records this mean difference for every u and can be viewed as the set of candidates for item i.
4. Sort $D_i$ in ascending order and find the $k_2$ users with the smallest values. Recommend these $k_2$ users to item i.
For the manufacturer of item i, $k_2$ can be viewed as the number of users that the limited advertising budget can reach. In order to make good use of the whole budget, the manufacturer should find the $k_2$ users who are the most likely to buy the item and advertise to them. In this way, the manufacturer can increase the revenue from advertisement through better targeting. The procedure of the proposed model is summarized in Algorithm 1.

4. Experiments
In this section, we conduct experiments to evaluate the effectiveness of our algorithm on four real-world datasets, namely BaiduMovie, EachMovie, Jester and MovieLens, and measure the results by three different evaluation metrics. First, the values of the three parameters $\delta$, $k_1$ and $k_2$ are adjusted to analyze their effects on our algorithm; the experimental results show that different values of the balance parameter $\delta$ lead to relatively stable results, while $k_1$ and $k_2$ have a larger effect. In addition, comparison experiments are conducted to compare with
Algorithm 1 The MVIR algorithm.
Input: User-item rating matrix R, learning rate γ, balance parameter δ, the number k1 of items close to the target item, the number k2 of users recommended to the target item
Initialize α_{i,j} = 0 and β_{u,v} = 0
Compute the within-view relationship S^{u}_{i,j} by Eq. (1)
Compute the cross-view relationship S^{u,v}_{i,j} by Eq. (2)
while not converged do
    Compute gradient ∇α_{i,j} by Eq. (4)
    Compute gradient ∇β_{u,v} by Eq. (5)
    Update α_{i,j} by Eq. (6)
    Update β_{u,v} by Eq. (7)
end while
for i = 1 to N do
    Sort α_{i,:} (descending order)
    Find the top k1 items
    Find set h1 = {users having purchased item i}
    Find set h2 = {users having purchased at least one item in the k1 items}
    for u ∈ h2 do
        Compute D_{i,u}, i.e. the mean of the differences between u and the users in h1
    end for
    Sort D_i (ascending order)
    Find the smallest k2 users
    Get the recommendation list of item i
end for
Output: recommendation lists of target items
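The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy rating matrix, the fixed iteration count used in place of a convergence test, storing α only for i < j and β only for u < v, and ranking candidates by the absolute value of β are all choices made here for illustration.

```python
import math
from itertools import combinations

def build_samples(R):
    """Collect within-view (Eq. 1) and cross-view (Eq. 2) relationships."""
    M, N = len(R), len(R[0])
    views = [[i for i in range(N) if R[u][i] > 0] for u in range(M)]
    within = [(i, j, math.hypot(R[u][i], R[u][j]))
              for u in range(M) for i, j in combinations(views[u], 2)]
    cross = [(u, v, i, j, math.hypot(R[u][i], R[v][j]))
             for u, v in combinations(range(M), 2)
             for i in views[u] for j in views[v] if i < j]
    return within, cross

def objective(alpha, beta, within, cross, delta):
    """Value of Eq. (3) for the current alpha and beta."""
    e_w = sum((alpha[i][j] - s) ** 2 for i, j, s in within)
    e_c = sum((alpha[i][j] - s - beta[u][v]) ** 2 for u, v, i, j, s in cross)
    return delta * e_w + (1 - delta) * e_c

def train(R, delta=0.5, gamma=0.01, n_iter=300):
    """Gradient descent with the updates of Eqs. (4)-(7)."""
    M, N = len(R), len(R[0])
    within, cross = build_samples(R)
    alpha = [[0.0] * N for _ in range(N)]
    beta = [[0.0] * M for _ in range(M)]
    for _ in range(n_iter):  # assumption: fixed number of iterations
        g_a = [[0.0] * N for _ in range(N)]
        g_b = [[0.0] * M for _ in range(M)]
        for i, j, s in within:
            g_a[i][j] += 2 * delta * (alpha[i][j] - s)
        for u, v, i, j, s in cross:
            err = alpha[i][j] - s - beta[u][v]
            g_a[i][j] += 2 * (1 - delta) * err
            g_b[u][v] -= 2 * (1 - delta) * err
        for i in range(N):
            for j in range(N):
                alpha[i][j] -= gamma * g_a[i][j]
        for u in range(M):
            for v in range(M):
                beta[u][v] -= gamma * g_b[u][v]
    return alpha, beta

def recommend(R, alpha, beta, item, k1, k2):
    """Return up to k2 users for the target item (steps 1-4 of Section 3.2)."""
    M, N = len(R), len(R[0])

    def rel(j):  # alpha is stored for i < j only
        return alpha[min(item, j)][max(item, j)]

    top_items = sorted((j for j in range(N) if j != item), key=rel, reverse=True)[:k1]
    h1 = [u for u in range(M) if R[u][item] > 0]
    h2 = [u for u in range(M)
          if R[u][item] == 0 and any(R[u][j] > 0 for j in top_items)]
    if not h1:
        return []

    def mean_diff(u):  # assumption: the magnitude of beta measures the difference
        return sum(abs(beta[min(u, w)][max(u, w)]) for w in h1) / len(h1)

    return sorted(h2, key=mean_diff)[:k2]

# Hypothetical toy data: 4 users x 4 items, 0 = not rated.
R = [[5, 4, 0, 0],
     [4, 5, 3, 0],
     [0, 4, 5, 2],
     [1, 0, 2, 5]]
within, cross = build_samples(R)
alpha, beta = train(R)
users = recommend(R, alpha, beta, item=0, k1=2, k2=2)
print(users)
```

On real data the cross-view loop grows quickly with the number of users and items, so a practical implementation would likely need sampling or vectorization; the sketch keeps the loops explicit to mirror the equations.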
The four datasets are split randomly, with 80% as the training set and 20% as the testing set.

4.2. Evaluation metrics
Traditional recommendation algorithms provide the target user with a personalized recommendation list, which is known as Top-N recommendation. The performance of those recommendation algorithms is measured by Precision, Recall and F1. Different from the traditional algorithms, we aim to recommend users to items. The goal of item orientated recommendation is to determine whether the users in the recommendation list of the target item will buy the item or not. Hence, by changing the target recommendation object to the items, we can also employ these three metrics to evaluate the performance of the proposed algorithm as follows:
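$$\mathrm{Precision} = \frac{\sum_{i}^{N} |R(i) \cap T(i)|}{\sum_{i}^{N} |R(i)|} \tag{8}$$
$$\mathrm{Recall} = \frac{\sum_{i}^{N} |R(i) \cap T(i)|}{\sum_{i}^{N} |T(i)|} \tag{9}$$
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{10}$$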
where $R(i)$ is the list of users recommended to the target item i, computed from the training dataset, and $T(i)$ is the behavior list of item i on the testing dataset. Obviously, larger values of these metrics indicate better recommendation results.

4.3. Parameters analysis
three state-of-the-art recommendation algorithms on the four datasets. Comparison results have confirmed the effectiveness of the proposed algorithm.
We analyze the effect of δ , k1 and k2 on the performance of the MVIR algorithm for item orientated recommendation. The analysis is conducted on the four datasets.
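The three metrics reported throughout this analysis, Eqs. (8)–(10), can be computed with a short sketch. The function name `evaluate` and the dictionary-of-sets representation of R(i) and T(i) are assumptions made here for illustration.

```python
def evaluate(recommended, actual):
    """Precision, Recall and F1 aggregated over items, following Eqs. (8)-(10).

    recommended: dict mapping item -> set of recommended users, R(i)
    actual:      dict mapping item -> set of users in the test set, T(i)
    """
    hits = sum(len(users & actual.get(i, set())) for i, users in recommended.items())
    n_rec = sum(len(users) for users in recommended.values())
    n_act = sum(len(users) for users in actual.values())
    precision = hits / n_rec if n_rec else 0.0
    recall = hits / n_act if n_act else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical example with two target items.
rec = {0: {1, 2}, 1: {3}}
act = {0: {2, 4}, 1: {3}}
p, r, f = evaluate(rec, act)
print(p, r, f)
```

Note that the sums are taken over all items before dividing, as in Eqs. (8) and (9), rather than averaging a per-item precision or recall.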
4.1. Dataset description
4.3.1. Parameter analysis on δ
The balance parameter δ is used to balance the contributions of within-view and cross-view in our model. To analyze the impact of this parameter, we vary the value of δ from 0 to 0.9 with k1 = 5 and k2 = 100. As shown in Fig. 3, due to the inherent differences in data properties, the algorithm has a different sensitivity to δ on each dataset, which is reflected in the best accuracy being reached at different values of δ. For BaiduMovie, the best recommendation is achieved on all three metrics when δ is 0.4, while the values of δ generating the best results on the EachMovie, Jester and MovieLens datasets are 0.1, 0.9 and 0.8, respectively. Although the recommendation performance changes slightly on the three metrics for EachMovie, varying the value of δ in the interval [0,1) does not cause significant changes. Overall, the performance of the MVIR algorithm is robust to the parameter δ on the four datasets, implying that changing the value of δ within a proper interval has no obvious effect on the final recommendation.
The four datasets used in our experiments are BaiduMovie,2 EachMovie,3 Jester4 and MovieLens.5
1. The BaiduMovie dataset comes from the Movie Recommendation Algorithm Contest of Baidu. The data covers only the active users, and each user has made enough records.
2. The EachMovie dataset consists of numeric ratings of users for different movies, collected by the DEC Systems Research Center. Each rating record contains a person ID, a movie ID and a score; the other information in each record is ignored because it is not useful for recommendation.
3. The Jester dataset collects millions of continuous ratings of jokes from April 1999 to May 2003, provided by the University of California. The dataset is dense because each user has rated enough items, so the jokes have a large number of rating records.
4. The MovieLens dataset contains ratings from the online movie recommender service MovieLens. All the users have rated at least 20 movies, but the user-item rating matrix is still quite sparse since the number of movies is far larger than 20.
In all the data files, taking the scale of the data and the unity of the ratings into consideration, we have removed the infrequent users and items and mapped the ratings to the range [0,5].
2 http://openresearch.baidu.com
3 http://grouplens.org/datasets/eachmovie
4 http://eigentaste.berkeley.edu/dataset
5 http://grouplens.org/datasets/movielens
4.3.2. Parameter analysis on k1 and k2
Based on the analysis in the above section, for each dataset we analyze k1 and k2 at the value of δ that achieves the best recommendation performance. We set k1 = 5, 10, 15 and k2 = 100, 150, 200. As shown in Fig. 4, as k1 decreases and k2 increases, the value of Precision shows no obvious change on BaiduMovie, Jester and MovieLens, while on EachMovie it changes noticeably. In Fig. 5, it is evident that the values of Recall improve with smaller k1 and larger k2. The same holds in Fig. 6, where there are marked changes in the values of F1. Overall, the recommendation performance is better when the value of k1 is smaller and the value of k2 is larger, and k2 has a greater impact than k1.
Fig. 3. Parameter analysis: the values of metrics with different δ values on the four datasets in MVIR.
Fig. 4. Parameter analysis: the values of Precision with different k1 and k2 values on the four datasets in MVIR.
Fig. 5. Parameter analysis: the values of Recall with different k1 and k2 values on the four datasets in MVIR.
Fig. 6. Parameter analysis: the values of F1 with different k1 and k2 values on the four datasets in MVIR.
Fig. 7. Comparison results: the values of Precision of the four recommendation algorithms on the four datasets.
4.4. Comparison experiments
4.4.1. Comparison methods
In this section, comparison experiments are conducted to compare the proposed MVIR algorithm with three state-of-the-art recommendation algorithms, namely Item-Based Collaborative Filtering (IBCF) [25], Slope One (SO) [47] and Probabilistic Matrix Factorization (PMF) [9].
1. IBCF [25]: It measures the similarity between any pair of items and provides the target users with the items that are the most similar to those that they have already purchased.
2. Slope One (SO) [47]: This algorithm is based on the idea of linear regression. The average differences between items are computed from their rating records; after that, the unknown rating of an item is predicted from the items the target user has already purchased and the average differences between those items and the item to be predicted.
3. PMF [9]: This algorithm finds the D-dimensional latent feature vector of each user and each item by decomposing the user-item rating matrix. The missing values of the rating matrix can be predicted from the corresponding latent feature vectors of the user and the item.
Notice that the three comparison methods recommend items to users in the traditional way. To compare them with the MVIR algorithm, their way of recommending has to be changed slightly. For IBCF and SO, users are recommended to the target item according to the item similarities and the item differences computed by the two algorithms, respectively. For PMF, we provide the target item with the users whose ratings of the item, as predicted from the latent feature vectors, are high.
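To make the adaptation concrete, the following Python sketch applies the basic (unweighted) Slope One scheme to the item orientated setting: the average rating differences between items are computed first, the rating of the target item is then predicted for every user who has not rated it, and the users with the highest predictions are returned. The function names, the nested-dictionary rating format and the top_n cut-off are assumptions made for illustration; they are not taken from [47] or from the experimental code.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Average rating difference dev[j][i] = mean(r_uj - r_ui).

    ratings: dict mapping user -> dict mapping item -> score.
    The mean is taken over the users who rated both item j and item i.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[j][i] += r_j - r_i
                    count[j][i] += 1
    return {
        j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
        for j in diff_sum
    }


def predict(user_ratings, target, dev):
    """Predict one user's rating of target from the items the user has rated."""
    target_dev = dev.get(target, {})
    estimates = [
        target_dev[i] + r_i for i, r_i in user_ratings.items() if i in target_dev
    ]
    return sum(estimates) / len(estimates) if estimates else None


def recommend_users(ratings, target, top_n):
    """Return up to top_n users with the highest predicted rating of target."""
    dev = slope_one_deviations(ratings)
    scored = []
    for user, user_ratings in ratings.items():
        if target in user_ratings:
            continue  # the user has already rated the target item
        estimate = predict(user_ratings, target, dev)
        if estimate is not None:
            scored.append((estimate, user))
    # Highest prediction first; ties are broken by user identifier.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in scored[:top_n]]
```

The predictions in this sketch are not clipped to the rating range, since only their ordering matters for ranking users.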
4.4.2. Comparison result
We set k1 = 5 and k2 = 100 on all datasets and, for the MVIR algorithm, take the best δ value on each dataset. For PMF, we set D = 5. The comparison algorithms recommend items to users based on the users' preferences or behaviors, so from the perspective of item orientated recommendation they are weaker than MVIR, as evidenced by Figs. 7, 8 and 9, which show the comparison results in terms of the three metrics. The percentage gains of the proposed MVIR algorithm over the four datasets are shown in Table 1. The recommendation performance of PMF is relatively poor on the four datasets: its values on the three metrics are quite small, and the MVIR algorithm outperforms PMF on the three metrics by 5.7% to 73.6%. The accuracies of IBCF and SO are almost the same, since both of them rely on the similarities/differences between items. On all four datasets except Jester, the MVIR algorithm is significantly superior to these two algorithms. This demonstrates that MVIR outperforms the others by as much as 64.0%. On the Jester dataset, although the performance of MVIR is almost the same as that of IBCF and SO in terms of Precision, MVIR achieves improvements of 16.3% over these two algorithms on the other two metrics. Overall, the recommendation results of MVIR outperform those of the other algorithms because MVIR considers both the item relationships and the user differences, computed simultaneously, to find the potential users for the target item. Different from our model, IBCF and SO only pay attention to the relationships between items, and PMF is designed to predict the missing rating of a user for an item. In addition, in order to make the comparison fairer, more experiments in which items are recommended to users are also conducted,
Fig. 8. Comparison results: the values of Recall of the four recommendation algorithms on the four datasets.
Fig. 9. Comparison results: the values of F1 of the four recommendation algorithms on the four datasets.
Fig. 10. The distribution of the α value on the four datasets in MVIR.

Table 1
The mean values of the percentage gains on item orientated recommendation.

Datasets     Methods   Precision (%)   Recall (%)   F1 (%)
BaiduMovie   IBCF      4.7             4.7          4.7
             SO        12.0            12.0         12.0
             PMF       26.8            26.8         26.8
EachMovie    IBCF      48.6            31.7         40.0
             SO        64.2            63.9         64.0
             PMF       74.2            73.0         73.6
MovieLens    IBCF      10.5            10.5         10.5
             SO        9.3             9.3          9.3
             PMF       5.7             5.7          5.7
Jester       IBCF      18.4            −0.05        16.2
             SO        18.6            −0.05        16.3
             PMF       7.4             7.4          7.4
Table 2
The mean values of the percentage gains on user orientated recommendation.

Datasets     Methods   Precision (%)   Recall (%)   F1 (%)
BaiduMovie   IBCF      −39.6           −39.6        −39.7
             SO        38.4            38.0         38.2
             PMF       47.3            48.5         47.9
EachMovie    IBCF      −87.2           47.8         −55.4
             SO        9.3             920.0        227.0
             PMF       −63.3           319.0        26.3
MovieLens    IBCF      −32.5           −32.8        −32.8
             SO        35.0            35.5         36.7
             PMF       −8.0            7.7          7.9
Jester       IBCF      −9.3            0.2          −3.5
             SO        −9.3            0.2          −3.5
             PMF       −15.5           47.3         22.5
the results of which are shown in Table 2. They show that our proposed model is slightly inferior to these algorithms when providing users with items, because MVIR is designed for item orientated recommendation whereas the other algorithms are designed to recommend items to the target users. Therefore, it can be safely concluded that our model is more suitable for item orientated recommendation.
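As a reading aid for Tables 1 and 2, the following Python sketch shows one plausible way to compute a mean percentage gain. It assumes that the gain is the relative improvement of MVIR over a baseline, (MVIR − baseline) / baseline × 100, averaged over several experimental settings; both the formula and the averaging are assumptions, since the exact definition is not given in this section.

```python
def percentage_gain(proposed, baseline):
    """Relative improvement of proposed over baseline, in percent (assumed form)."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (proposed - baseline) / baseline * 100.0


def mean_percentage_gain(proposed_scores, baseline_scores):
    """Average the per-setting percentage gains of two equally long score lists."""
    if len(proposed_scores) != len(baseline_scores) or not proposed_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    gains = [
        percentage_gain(p, b) for p, b in zip(proposed_scores, baseline_scores)
    ]
    return sum(gains) / len(gains)
```

Under this definition a negative entry means that the baseline scored higher than MVIR on that metric, and an entry above 100 means that MVIR scored more than twice the baseline value.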
4.5. Further analysis on the distribution of α and β
To make the underlying rationale of the proposed method clearer, in this subsection we analyze the item relationship α and the user rating difference β. The distributions of α and β on the four datasets after training are shown in Figs. 10 and 11, respectively. Owing to the inherent differences in data properties, the distribution on each dataset is different. However, none of the distributions is uniform, and the values of α and β are concentrated around 4 and 1, respectively. If an item is strongly related to another, the α value will be large (even larger than 6), and vice versa (even smaller than 2). If two users are likely to give ratings at the same level, e.g., five is a high score for both of them, the value of the user rating difference will be small. If the two users sharply disagree about ratings, the value will be large. The results show that the proposed method can obtain the item relationships and the user rating differences by training, which play the key role in recommending users to the manufacturer of the target item.
5. Conclusions and future works
In this paper, we have proposed a new item orientated recommendation algorithm from the multi-view perspective to raise revenue for the manufacturer of a particular item. By regarding the purchasing records of each user as an individual view, with the nodes in each view representing the items rated by this user, we can compute the relationship matrix between items and the rating difference matrix between users at the same time through multi-view learning. After obtaining the two matrices, we can predict the users that are the most likely to purchase the target item by finding the closest items and the most relevant users. The manufacturer of this item can then market the item only to these users, making the best of the limited budget to increase advertisement revenue. Extensive experiments have been conducted to confirm the effectiveness of the proposed method. Comparison results have shown that our proposed method outperforms the state-of-the-art methods. Item orientated recommendation is a new type of recommendation algorithm.
For our future work, we will design more suitable metrics to calculate the relationships between items and try to propose novel models for item orientated recommendation.
Acknowledgments
This work was supported by National Key Research and Development Program of China (2016YFB1001003), NSFC (61502543),
Fig. 11. The distribution of the β value on the four datasets in MVIR.
and Guangdong Natural Science Funds for Distinguished Young Scholar (2016A030306014).
References
[1] G. Linden, B. Smith, J. York, Amazon.com recommendations: item-to-item collaborative filtering, IEEE Internet Comput. 7 (1) (2003) 76–80.
[2] Y. Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 426–434.
[3] J. Davidson, B. Liebald, J. Liu, P. Nandy, T.V. Vleet, The YouTube video recommendation system, in: Proceedings of the ACM Conference on Recommender Systems, 2010, pp. 293–296.
[4] S. Amer-Yahia, Recommendation projects at Yahoo!, IEEE Data Eng. Bull. 34 (2) (2011) 69–77.
[5] D. Jannach, M. Zanker, A. Felfernig, G. Friedrich, Recommender Systems: An Introduction, Addison Wesley Publishing, 2013.
[6] M. Jamali, M. Ester, A matrix factorization technique with trust propagation for recommendation in social networks, in: Proceedings of the ACM Conference on Recommender Systems, 2010, pp. 135–142.
[7] C. Lu, X. Hu, J.R. Park, J. Huang, Post-based collaborative filtering for personalized tag recommendation, in: Proceedings of iConference, Inspiration, Integrity, and Intrepidity, 2011, pp. 561–568.
[8] L. Baltrunas, B. Ludwig, F. Ricci, Matrix factorization techniques for context aware recommendation, in: Proceedings of the ACM Conference on Recommender Systems, 2011, pp. 301–304.
[9] R. Salakhutdinov, A. Mnih, Probabilistic matrix factorization, in: Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, 2007, pp. 1257–1264.
[10] W. Yao, J. He, G. Huang, J. Cao, Y. Zhang, A graph-based model for context-aware recommendation using implicit feedback data, World Wide Web-internet Web Inf. Syst. 18 (5) (2014) 1351–1371.
[11] H. Cheng, P.N. Tan, J. Sticklen, W.F. Punch, Recommendation via query centered random walk on k-partite graph, in: Proceedings of the 7th IEEE International Conference on Data Mining, 2007, pp. 457–462.
[12] R. Salakhutdinov, A. Mnih, Bayesian probabilistic matrix factorization using Markov chain Monte Carlo, in: Proceedings of the Twenty-Fifth International Conference on Machine Learning, 2008, pp. 880–887.
[13] D. Greene, P. Cunningham, Multi-view clustering for mining heterogeneous social network data, Soc. Netw. Anal. 57 (2009) 27–57.
[14] E.H. Taralova, F.D. la Torre, M. Hebert, Source constrained clustering, in: Proceedings of IEEE International Conference on Computer Vision, 2011, pp. 1927–1934.
[15] M.-R. Amini, C. Goutte, A co-classification approach to learning from multilingual corpora, Mach. Learn. 79 (2010) 105–121.
[16] S.S. Farfade, M.J. Saberian, L. Li, Multi-view face detection using deep convolutional neural networks, in: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015, pp. 643–650.
[17] Y. Mansour, M. Mohri, A. Rostamizadeh, Domain adaptation with multiple sources, in: Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems, 2008, pp. 1041–1048.
[18] R. Luis, L.E. Sucar, E.F. Morales, Inductive transfer for learning Bayesian networks, Mach. Learn. 79 (2010) 227–255.
[19] Y.-M. Xu, C.-D. Wang, J.-H. Lai, Weighted multi-view clustering with feature selection, Pattern Recognit. 53 (2016) 25–35.
[20] C.-D. Wang, J.-H. Lai, P.S. Yu, Multi-view clustering based on belief propagation, IEEE Trans. Knowl. Data Eng. 28 (4) (2016) 1007–1021.
[21] A. Kumar, H. Daumé III, A co-training approach for multi-view spectral clustering, in: Proceedings of the 28th International Conference on Machine Learning, 2011, pp. 393–400.
[22] K.Y. Goldberg, T. Roeder, D. Gupta, C. Perkins, Eigentaste: a constant time collaborative filtering algorithm, Inf. Retr. 4 (2) (2001) 133–151.
[23] J.B. Schafer, D. Frankowski, J.L. Herlocker, S. Sen, Collaborative filtering recommender systems, in: Peter Brusilovsky, Alfred Kobsa, Wolfgang Nejdl (Eds.), The Adaptive Web, Methods and Strategies of Web Personalization, 4321, Springer, 2007, pp. 291–324.
[24] F. Ricci, L. Rokach, B. Shapira, Introduction to Recommender Systems Handbook, Springer, 2011.
[25] B.M. Sarwar, G. Karypis, J.A. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in: Proceedings of the Tenth International World Wide Web Conference, 2001, pp. 285–295.
[26] J. Wang, A.P. de Vries, M.J.T. Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, in: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006, pp. 501–508.
[27] Z.-L. Zhao, C.-D. Wang, J.-H. Lai, AUI&GIV: recommendation with asymmetric user influence and global importance value, PloS One 11 (2) (2016) e0147944.
[28] J. Wang, A.P. de Vries, M.J.T. Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, in: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006, pp. 501–508.
[29] M.J. Pazzani, D. Billsus, Content-based recommendation systems, The Adaptive Web, Methods and Strategies of Web Personalization, Springer, 2007, pp. 325–341.
[30] https://en.wikipedia.org/wiki/tf-idf, 2016.
[31] Y. Wu, Z. Huang, Z. Gong, Using hybrid algorithm for automatic recommendation service, in: Proceedings of the International Workshop on Knowledge Discovery and Data Mining, 2008, pp. 396–399.
[32] W. Chen, Z. Niu, X. Zhao, Y. Li, A hybrid recommendation algorithm adapted in e-learning environments, World Wide Web 17 (2) (2014) 1–14.
[33] N. Srebro, J.D.M. Rennie, T.S. Jaakkola, Maximum-margin matrix factorization, Advances in Neural Information Processing Systems, 37, Neural Information Processing Systems Foundation, Inc., 2004, pp. 1329–1336.
[34] K. Benzi, V. Kalofolias, X. Bresson, P. Vandergheynst, Song recommendation with non-negative matrix factorization and graph total variation, in: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 2016, pp. 2439–2443.
[35] T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Rev. 51 (3) (2009) 455–500.
[36] D.M. Dunlavy, T.G. Kolda, E. Acar, Temporal link prediction using matrix and tensor factorizations, ACM Trans. Knowl. Discov. Data 5 (2) (2011) 10.
[37] L. Hu, J. Cao, G. Xu, L. Cao, Z. Gu, C. Zhu, Personalized recommendation via cross-domain triadic factorization, in: Proceedings of the 22nd International World Wide Web Conference, 2013, pp. 595–606.
[38] M. Jamali, T. Huang, M. Ester, A generalized stochastic block model for recommendation in social rating networks, in: Proceedings of the ACM Conference on Recommender Systems, 2011, pp. 53–60.
[39] H. Ma, I. King, M.R. Lyu, Learning to recommend with social trust ensemble, in: Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009, pp. 203–210.
[40] J. Liu, W. Wang, Z. Chen, X. Du, Q. Qi, A novel user-based collaborative filtering method by inferring tag ratings, ACM SIGAPP Appl. Comput. Rev. 12 (2012) 48–57.
[41] W. Chen, W. Hsu, M. Lee, Making recommendations from multiple domains, in: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013, pp. 892–900.
[42] R. Salakhutdinov, A. Mnih, G. Hinton, Restricted Boltzmann machines for collaborative filtering, in: Proceedings of the Twenty-Fourth International Conference on Machine Learning, 2007, pp. 1–8.
[43] S. Bickel, T. Scheffer, Multi-view clustering, in: Proceedings of the 4th IEEE International Conference on Data Mining, 2004, pp. 19–26.
[44] T. Iwata, M. Yamada, Multi-view anomaly detection via probabilistic latent variable models, CoRR (2014).
[45] G. Li, K. Chang, S.C.H. Hoi, Multiview semi-supervised learning with consensus, IEEE Trans. Knowl. Data Eng. 24 (1) (2012).
[46] X. Liu, S. Ji, W. Glänzel, B. De Moor, Multiview partitioning via tensor methods, IEEE Trans. Knowl. Data Eng. 25 (5) (2013) 1056–1069.
[47] D. Lemire, A. Maclachlan, Slope one predictors for online rating-based collaborative filtering, in: Proceedings of the SIAM International Conference on Data Mining, 2005, pp. 471–475.

Qi-Ying Hu received the B.S. degree in computer science in 2016 from Sun Yat-sen University, China. Currently, she is pursuing her M.Sc. degree in computer science at Sun Yat-sen University. Her research interest is recommendation algorithms.
Zhi-Lin Zhao received the B.S. degree in computer science in 2016 from Sun Yat-sen University, China. Currently, he is pursuing his M.Sc. degree in computer science at Sun Yat-sen University. His research interest is recommendation algorithms.
Chang-Dong Wang received the Ph.D. degree in computer science in 2013 from Sun Yat-sen University, China. He is currently an associate professor in the School of Data and Computer Science, Sun Yat-sen University. His current research interests include machine learning and pattern recognition, especially focusing on data clustering and its applications. He has published over 50 scientific papers in international journals and conferences such as IEEE TPAMI, IEEE TKDE, IEEE TSMC-C, Pattern Recognition, Knowledge and Information Systems, Neurocomputing, ICDM and SDM. His ICDM 2010 paper won the Honorable Mention for Best Research Paper Awards. He won the 2012 Microsoft Research Fellowship Nomination Award. He was awarded the 2015 Chinese Association for Artificial Intelligence (CAAI) Outstanding Dissertation.
Jian-Huang Lai received his M.Sc. degree in applied mathematics in 1989 and his Ph.D. in mathematics in 1999 from Sun Yat-sen University, China. He joined Sun Yat-sen University in 1989 as an Assistant Professor, where currently, he is a Professor with the Department of Automation of School of Information Science and Technology and dean of School of Information Science and Technology. His current research interests are in the areas of digital image processing, pattern recognition, multimedia communication, wavelet and its applications. He has published over 100 scientific papers in the international journals and conferences on image processing and pattern recognition, e.g. IEEE TPAMI, IEEE TKDE, IEEE TNN, IEEE TIP, IEEE TSMC (Part B), Pattern Recognition, ICCV, CVPR and ICDM. Prof. Lai serves as a standing member of the Image and Graphics Association of China and also serves as a standing director of the Image and Graphics Association of Guangdong.