Movie Collaborative Filtering with Multiplex Implicit Feedbacks

Yutian Hu 1,2, Fei Xiong 1,*, Dongyuan Lu 3, Ximeng Wang 1, Xi Xiong 4, Hongshu Chen 5

1 Key Laboratory of Communication and Information Systems, Beijing Municipal Commission of Education, Beijing Jiaotong University, Beijing 100044, China
2 CETC Big Data Research Institute Co., Ltd., Guiyang 550081, China
3 School of Information Technology & Management, University of International Business and Economics, Beijing 100029, China
4 School of Cybersecurity, Chengdu University of Information Technology, Chengdu 610225, China
5 School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China

* Corresponding author: [email protected] (Fei Xiong)

To appear in: Neurocomputing. Received 5 December 2018; revised 28 February 2019; accepted 23 March 2019. https://doi.org/10.1016/j.neucom.2019.03.098
Abstract: Movie recommender systems have been widely used in a variety of online networking platforms to give users reasonable advice from a large number of choices. As a representative method of movie recommendation, collaborative filtering uses explicit and implicit feedbacks to mine users' preferences. The use of implicit features can help to improve the accuracy of movie collaborative filtering. However, multiplex implicit feedbacks have not been investigated and utilized comprehensively. In this paper, we analyze different kinds of implicit feedbacks in movie recommendation, including user similarities for movie tastes, rated records of each movie and positive attitude of users, and incorporate these feedbacks for collaborative filtering. User relationships are extracted according to user similarities. We propose a recommendation method with multiplex implicit feedbacks (RMIF), which factorizes both the explicit rating matrix and the implicit attitude matrix. To demonstrate the effectiveness of our method, we conduct extensive experiments on two real datasets. Experiment results prove that RMIF significantly outperforms state-of-the-art models in terms of accuracy. Among the different kinds of implicit feedbacks, positive attitude plays the most important role in movie collaborative filtering.

Keywords: Movie recommender system, Collaborative filtering, Multiplex implicit feedback, Matrix factorization

1. Introduction
With the development of multimedia technology, various types of movies and videos on social media are exploding, making it much more difficult for online users to find useful video information. Collaborative filtering is an effective solution to the problem of information overload, and helps users to discover the videos in which they are interested. Therefore, personalized recommender systems have become one of the most popular applications since the mid-90s [1]. Varian and Resnick gave the formal definition of recommender systems in 1997: systems that use e-commerce sites to provide product information and advice for customers, to help users decide which products to buy, and to help customers complete the purchase [2]. In recent years, recommender systems have received wide attention from researchers, and many recommendation models have been presented. The task of recommender systems is mainly divided into two categories: item recommendation and rating prediction. Item recommendation predicts a set of items that users have a great probability to adopt. Rating prediction estimates ratings on items that users have not rated, and is often used on movie sharing websites [3]. The goal of this paper is rating prediction.

Collaborative filtering (CF) is one of the most popular methods to implement a recommender system. The idea of CF is that users who have had similar tastes in the past tend to choose the same items [4,5]. Matrix factorization (MF) is a widely used model-based CF approach. MF uses the multiplication of two low-rank feature matrices to represent the rating matrix and fill in the missing values of the matrix [6]. However, most CF methods share a common shortcoming: when rating data are sparse, the accuracy of the predicted results may be greatly reduced [7]. Recommender systems address three objects: users, items, and explicit ratings on items. In fact, implicit information that can be explored from explicit data is also advantageous for recommendation. Therefore, a feasible way to alleviate the problem of data sparsity is to mine various kinds of information from the original data. Nowadays, recommender systems can also draw on advanced methods from other fields, such as hybrid feedback control [8], static output feedback stabilization [9], and robust adaptive neural control [10].
In general, the information we can obtain in recommender systems consists of explicit feedbacks and implicit feedbacks. Explicit feedbacks refer to data obtained through users' direct actions, including ratings provided by users, tags of movies, and attributes of users such as gender, age, and so on. In contrast to explicit feedbacks, implicit feedbacks are extracted from the traces of usage left by users after they have interacted with a website, and they include browsing records, user similarities, favorite collections, etc. In addition, we can use information retrieved from the images of items as a kind of implicit feedback [11-13]. There are many methods of image processing, such as image re-ranking [14]. From the perspective of data sparsity, it is obvious that traces of usage are much more abundant than direct actions. Therefore, implicit feedbacks may contribute to recommendation in addition to explicit feedbacks. Designing CF methods based on implicit feedbacks can alleviate the problem of data sparsity and improve the accuracy and stability of recommender systems.
There are many sources of implicit feedbacks, so it is essential to choose appropriate and effective feedbacks to design a CF algorithm. However, most previous studies used a certain kind of feedback separately, and did not combine different feedbacks together. In this paper, we present a novel recommendation method based on multiplex implicit feedbacks, called RMIF. RMIF integrates different kinds of implicit feedbacks into the MF framework. We mainly consider user similarities, rated records of items and positive attitude of users as implicit feedbacks. The rated records of each item refer to the set of users who have rated the item. After users observe others' ratings on an item, their decisions on the same item may be influenced. The rated records of each item therefore have an implicit influence on the users who will rate the item. Implicit feedbacks reveal additional information about users' actions, so users' preferences can be portrayed more realistically. Mining users' actions comprehensively and combining a variety of implicit feedbacks together may help to improve recommendation. To prove the reliability and effectiveness of RMIF, we conduct extensive experiments on two real-world datasets. Results prove that incorporating implicit feedbacks significantly improves the accuracy of recommendation compared with state-of-the-art CF methods. Our work has the following contributions:

(1) We introduce three kinds of implicit feedbacks extracted from the rating matrix, separately and in detail. We explore the feedbacks of user similarities and items' rated records to improve rating prediction. In addition, we use the positive attitude of users as an auxiliary feedback to make prediction more accurate.

(2) We propose a novel CF method based on multiplex implicit feedbacks. We incorporate these implicit feedbacks into the MF framework. The feedbacks of user similarities and positive attitude of users are used in the feature space of users. Rated records of items are used in the feature space of items. By constructing the feature spaces, we use MF to predict unknown ratings.

(3) We conduct extensive experiments on two datasets to verify the performance of RMIF. Compared with several state-of-the-art recommendation models, our method significantly improves the recommendation accuracy. In addition, we analyze the impact of each implicit feedback on the performance, and show that the feedbacks we use are reasonable.

The rest of this paper is organized as follows. Section 2 provides a brief overview of related work. Then, we describe the proposed method in detail in Section 3, and design experiments to evaluate the effectiveness of our method in Section 4. Finally, we give some concluding remarks and future directions in Section 5.

2. Related work
Matrix factorization is an effective model-based CF approach. From a mathematical view, the goal of MF is to approximate the rating matrix by constructing two low-rank matrices that interact in latent feature spaces [15]. We assume that there are $m$ users and $n$ items in a recommender system. Each user is indexed by $u \in \{1,2,\dots,m\}$ and each item by $j \in \{1,2,\dots,n\}$. Then, the user-item rating matrix $R_{m\times n}$ can be used to represent users' opinions about items. The element $r_{uj}$ in the rating matrix $R$ indicates the rating score of item $j$ given by user $u$. The dimension of the latent feature space $f$ (usually $f \ll \min(m,n)$) is an important parameter in MF [16]. The vector $p_u \in \mathbb{R}^{1\times f}$ denotes the latent user feature vector for user $u$, and $q_j \in \mathbb{R}^{1\times f}$ denotes the latent item feature vector for item $j$. The dimensions of both latent feature vectors are the same. Then, the predicted rating $r'_{uj}$ of user $u$ on item $j$ can be written as the inner product of $p_u$ and $q_j$,

$$r'_{uj} = p_u q_j^T \qquad (1)$$
A sum-of-squared-errors objective function with quadratic regularization terms can be applied to make the approximation:

$$\min_{p,q} \frac{1}{2}\sum_{(u,j)}\left(p_u q_j^T - r_{uj}\right)^2 + \frac{\lambda}{2}\|p_u\|^2 + \frac{\lambda}{2}\|q_j\|^2 \qquad (2)$$
where $\|\cdot\|$ represents the Frobenius norm and $\lambda$ is the weight of the regularization terms [17]. Stochastic gradient descent (SGD) is a popular optimization method to find a local minimum of the objective function and suitable latent feature vectors [18]. The main idea is to calculate the partial derivatives of the variables to be optimized in the objective function and to update them iteratively. According to the principle of SGD, the variables $p_u$ and $q_j$ are updated in the negative gradient direction of each variable. The optimized results are obtained after a limited number of iterations. Then, the missing ratings in $R$ can be predicted by $p_u q_j^T$.
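To make this concrete, the following minimal sketch (our own NumPy illustration, not code from the paper; the toy ratings are invented) trains the plain MF model of Eqs. (1) and (2) with per-rating SGD updates:

```python
import numpy as np

def mf_sgd(ratings, f=10, lr=0.01, lam=0.02, epochs=200, seed=0):
    """Plain matrix factorization trained with SGD on the objective of Eq. (2).

    ratings: list of (u, j, r_uj) triples with 0-based user/item ids.
    Returns latent factors P (m x f) and Q (n x f); a missing rating is
    predicted as P[u] @ Q[j], i.e. Eq. (1).
    """
    rng = np.random.default_rng(seed)
    m = max(u for u, _, _ in ratings) + 1
    n = max(j for _, j, _ in ratings) + 1
    P = 0.1 * rng.standard_normal((m, f))
    Q = 0.1 * rng.standard_normal((n, f))
    for _ in range(epochs):
        for u, j, r in ratings:
            err = P[u] @ Q[j] - r              # prediction error for this rating
            grad_p = err * Q[j] + lam * P[u]   # partial derivative w.r.t. p_u
            grad_q = err * P[u] + lam * Q[j]   # partial derivative w.r.t. q_j
            P[u] -= lr * grad_p                # step in the negative gradient direction
            Q[j] -= lr * grad_q
    return P, Q

# toy example: 3 users, 4 items, a handful of observed ratings
toy = [(0, 0, 5), (0, 1, 3), (1, 1, 4), (1, 2, 2), (2, 0, 4), (2, 3, 1)]
P, Q = mf_sgd(toy, f=4)
print(round(float(P[0] @ Q[2]), 2))  # predicted rating of user 0 on item 2
```

Each observed rating contributes one gradient step for $p_u$ and $q_j$; the extended models discussed below build on this basic update scheme.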
MF has achieved great success in the field of recommender systems, because it not only improves the accuracy of prediction results but also has strong compatibility [19]. Many extended models have been proposed to improve MF. Koren et al. proposed SVD++ [20], which takes into consideration the impact of rated items on rating prediction. Since users may be affected by time when making a choice, the same person may give different evaluations of the same item at different times. Therefore, a dynamic algorithm called TimeSVD++ [21] was proposed to integrate time information into recommendation tasks. Probabilistic matrix factorization (PMF), proposed by Salakhutdinov and Mnih [22], is a recommendation model based on probability. It assumes that the feature matrices of users U and items V obey a Gaussian distribution, and the
feature matrices are estimated from the known values of the rating matrix. Then, the matrices are used to predict the unknown values in the rating matrix. Bayesian probabilistic matrix factorization (BPMF) [23] is similar to PMF. The main difference between them is that BPMF obtains the feature vectors by Bayesian inference rather than traditional probability-based methods. Specifically, BPMF no longer regards the system parameters as fixed value estimates, but considers that these random variables obey a certain distribution. Recommendation algorithms extended from MF have made a great contribution to improving the accuracy of rating prediction. However, there are still some problems to be solved. New items and new users in a recommender system always have little information for use. This is usually called the cold-start problem [24,25], and it is one of the most important problems that lead to poor prediction. Another shortcoming of recommendation is data sparsity, which means that there are not enough explicit data for users and items [26]. Up to now, these two problems have not been completely solved.
After reviewing previous studies [27-30], we find it necessary to use implicit feedbacks for prediction. To make good use of implicit feedbacks, we propose a new method which integrates multiplex implicit feedbacks, including user similarities, rated records of each item and positive attitude of users, into traditional MF.
3. Recommendation with multiplex implicit feedbacks
In this section, we first introduce the problem addressed in this paper and the overall framework of RMIF. Then, we describe the three kinds of implicit feedbacks in RMIF and introduce their roles in recommendation. Finally, the algorithm of RMIF is given in detail.

3.1 Problem definition
In recommender systems, implicit feedbacks can well alleviate the sparsity problem of rating data. Implicit information is contained in the history of users' behaviors when users rate items on a website [31]. This information reflects users' preferences and the popularity of items. Therefore, implicit feedbacks have a strong effect on recommendation. We explore three kinds of implicit feedbacks in detail, including user similarities, rated records of each item, and positive attitude of users, and we incorporate them into the recommendation framework. The flow chart of our method is shown in Fig. 1. All of the implicit feedbacks that we use in RMIF are extracted from the rating matrix. In Fig. 1, the notation u denotes a user and i denotes an item. User similarities are obtained by comparing users' existing ratings.
The most important questions for our method are: how to build the feature vectors of implicit feedbacks, how to integrate these implicit feedbacks together to make predictions, and how to learn these feature vectors.

Fig. 1. Overview of multiplex implicit feedbacks
3.2 Multiplex implicit feedbacks
Our goal is to study the impact of implicit feedbacks on prediction results. In this section, we introduce in detail three kinds of implicit feedbacks which are closely related to users' preferences, and we address the problem of how to build the feature vectors of these implicit feedbacks.
3.2.1 Implicit user relationships
People are likely to adopt opinions from their friends when buying products, and they are influenced by reviews of these items on the Internet. In daily life, our evaluation of things is always influenced by the people around us. Especially when making choices, we are more convinced by those who are very similar to us. As a result, we tend to make choices that are close to our own ideas. Similar phenomena occur when users give ratings on websites. Users with similar existing ratings will give similar ratings to new items [32,33]. The higher the similarity is, the closer the scores are.
In the field of recommender systems, user similarities are calculated over their mutual ratings. When the similarity between two users is very high, we consider that they have an implicit neighbor relationship. Implicit neighbors are different from the friends that users directly choose to follow on online social media. A friend whom a user follows on social media is a kind of explicit information, but in fact this information is not absolutely reliable. Although some friends whom users follow on social media may also be friends that users know in real life, users and their friends do not necessarily have similar preferences. Users' implicit neighbors, which are detected from the rating data, are guaranteed to be similar in terms of the users' own preferences, so this implicit information is reliable. Each user can find a collection of his/her neighbors after performing a certain number of rating operations. The ratings of implicit neighbors for a given item are an important basis for predicting the target user's rating on the item. It should be noted that the implicit neighbor relation is not symmetric. Specifically, if user v is one of the implicit neighbors of user u, user u is not necessarily in the neighbor set of user v. For the selection of the neighbor set of each user, we choose the other users with the top N similarities as his/her neighbors.
We use user similarities to indicate whether two users are implicit neighbors. The values of user similarities are limited to the range [0, 1]. In this paper, we use two representative measures of user similarity, i.e., the Pearson Correlation Coefficient (PCC) and Vector Space Similarity (VSS).

(1) PCC

$$sim(u,v) = \frac{\sum_{j\in I_{u,v}}\left(r_{uj}-\bar{r}_u\right)\left(r_{vj}-\bar{r}_v\right)}{\sqrt{\sum_{j\in I_{u,v}}\left(r_{uj}-\bar{r}_u\right)^2}\sqrt{\sum_{j\in I_{u,v}}\left(r_{vj}-\bar{r}_v\right)^2}} \qquad (3)$$

where $I_{u,v}$ is the set of items that both user u and user v have rated, and $\bar{r}_u$ ($\bar{r}_v$) is the average of the ratings that user u (user v) has given.

(2) VSS

$$sim(u,v) = \frac{\sum_{j\in I_{u,v}} r_{uj}\, r_{vj}}{\sqrt{\sum_{j\in I_{u,v}} r_{uj}^2}\sqrt{\sum_{j\in I_{u,v}} r_{vj}^2}} \qquad (4)$$

The results obtained by Eq. (3) lie within the range [-1, 1]. However, since we only consider positive implicit relationships, we map them to the range [0, 1] using the function $f(x) = (x+1)/2$.
Then, we sort the similarities to find implicit neighbors. We set the number of neighbors for each user as 𝑁. The users whose similarities rank in the top 𝑁 for a given user u are selected as the neighbors of user u.
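A minimal sketch of how the mapped PCC of Eq. (3) and the top-N neighbor selection could be implemented is given below; the function names and the toy matrix are ours, not the paper's:

```python
import numpy as np

def pcc_similarity(R, u, v):
    """PCC of Eq. (3) over co-rated items, mapped to [0, 1] by f(x) = (x + 1) / 2.
    R is a dense m x n rating matrix with 0 marking missing ratings."""
    common = (R[u] > 0) & (R[v] > 0)
    if common.sum() < 2:
        return 0.0
    ru, rv = R[u, common], R[v, common]
    du, dv = ru - ru.mean(), rv - rv.mean()
    denom = np.sqrt((du ** 2).sum()) * np.sqrt((dv ** 2).sum())
    if denom == 0:
        return 0.5  # zero correlation after the [0, 1] mapping
    return ((du @ dv) / denom + 1.0) / 2.0

def top_n_neighbors(R, u, N=10):
    """Implicit neighbor set G_u: the N users most similar to u (excluding u)."""
    sims = [(v, pcc_similarity(R, u, v)) for v in range(R.shape[0]) if v != u]
    sims.sort(key=lambda x: x[1], reverse=True)
    return [v for v, _ in sims[:N]]

# toy rating matrix: rows are users, columns are items, 0 means "not rated"
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)
print(top_n_neighbors(R, u=0, N=2))
```

Because the top-N set is taken per user, v may appear in G_u while u is absent from G_v, which is exactly the asymmetry noted above.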
We define the feature vector of user u's implicit neighbors as follows:

$$imp_u = |G_u|^{-\frac{1}{2}} \sum_{v\in G_u} t_v \qquad (5)$$

where $G_u$ represents the implicit neighbor set of user u, $t_v$ indicates the impact vector of an implicit neighbor v, and the factor $|G_u|^{-\frac{1}{2}}$ is used to normalize the vector. Therefore, we use the combination of the implicit neighbor feature vector $imp_u$ and the latent user feature vector $p_u$ to represent the preference of user u.
3.2.2 Rated records of items
Recommender systems predict unknown ratings from existing records, so deeply mining implicit information from existing data may improve the recommendation performance. Users on online social media may observe others' ratings and then make their own decisions. Users' decisions on an item may be influenced by the users who have rated the item before. Therefore, implicit influence exists among these users for the same item. We take the rated records of each item as the second implicit feedback. Ratings on an item carry a lot of valid information. For example, the rated records of an item reflect its popularity and the kind of people who like it. This information helps to predict the results more accurately. We define the implicit feature vector $imq_j$ of item j as follows:

$$imq_j = |M_j|^{-\frac{1}{2}} \sum_{w\in M_j} s_w \qquad (6)$$

where $M_j$ represents the set of users who have rated the given item j, and $s_w$ represents the implicit influence of user w in the set $M_j$. The factor $|M_j|^{-\frac{1}{2}}$ is used to normalize the feature vector.
Therefore, we use the combination of the implicit feature vector $imq_j$ and the latent item feature vector $q_j$ to represent the feature of item j.

3.2.3 Positive attitude of users

Ratings on movie sharing websites are usually represented by five discrete numbers, 1-5. In addition, users' decisions on items can be converted into two categories: positive or negative attitude. Users' attitude is not directly given by users, so it is treated as an implicit feedback. Previous studies mostly used five-point ratings to analyze users' preferences and predict unknown ratings. However, users' attitude reflects their feelings about items, and may contribute to recommendation.
Therefore, we convert the rating matrix into a binary attitude matrix according to certain rules. Here, we treat the binary matrix as the auxiliary rating matrix. From the auxiliary matrix, we can find the items towards which each user holds positive attitude. When a user gives a rating of not less than 3 points to an item, the user probably has a positive attitude towards this item [29]. In contrast, if the rating is lower than three points, the user may have a negative attitude. In the auxiliary matrix, we use five points to represent positive attitude and one point to represent negative attitude. Items that were not rated in the rating matrix are still marked as zero.
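A small sketch of this conversion rule (our own illustration; the paper does not give code) is:

```python
import numpy as np

def attitude_matrix(R):
    """Build the auxiliary attitude matrix from the rating matrix R.
    Ratings >= 3 are treated as positive attitude (mapped to 5),
    ratings in (0, 3) as negative attitude (mapped to 1),
    and unrated entries (0) are left as 0."""
    A = np.zeros_like(R)
    A[R >= 3] = 5
    A[(R > 0) & (R < 3)] = 1
    return A

R = np.array([[5, 2, 0],
              [0, 3, 1]], dtype=float)
print(attitude_matrix(R))
# [[5. 1. 0.]
#  [0. 5. 1.]]
```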
We model users' attitude towards an item as an implicit feedback for recommendation. When items are divided into positive and negative categories for a user, we can utilize the hierarchical items to make a more accurate prediction. In addition, in actual situations, people tend to actively evaluate items about which they have a good feeling. If they do not care about or do not like an item, they tend to ignore it. Therefore, we only focus on the positive attitude of users. When we know accurately which items belong to the positive set, we can predict that users will give high ratings to items of a similar type.

In this paper, we choose users' positive attitude as the third implicit feedback. The feature vector for the positive attitude of user u is given in Eq. (7):

$$pos_u = |D_u|^{-\frac{1}{2}} \sum_{i\in D_u} l_i \qquad (7)$$

where $D_u$ represents the set of items towards which user u has positive attitude, and $l_i$ is the implicit feedback for positive attitude towards item i. We use $r'_{uj2}$ as an auxiliary evaluation of user u's attitude towards item j as follows:

$$r'_{uj2} = pos_u q_j^T \qquad (8)$$
3.3 Algorithm of RMIF
Reasonable and in-depth use of implicit feedbacks can improve the recommendation model. In this section, we introduce the algorithm of RMIF in detail. We mainly focus on two questions: how to integrate these implicit feedbacks together to make predictions and how to learn the implicit feature vectors.

The prediction of users' ratings consists of two parts. The first part is the bias terms of intrinsic attributes. Many of the variations in ratings are caused by the impact of users or items themselves, independently of any interaction. For instance, some users always give higher ratings than other users, and some items always get higher ratings than others. It is necessary to consider the global average rating, the user bias term and the item bias term when making predictions [35]. We define $b_{uj}$ to represent the bias terms as follows:

$$b_{uj} = \mu + b_u + b_j \qquad (9)$$

where $\mu$ denotes the global average of all ratings, and $b_u$ and $b_j$ represent the user bias term and the item bias term. The second part of the rating prediction is the product of the feature vectors of users and items. The feature vectors of users combine the latent feature vectors of users and implicit neighbors. The feature vectors of items combine the latent features and the rated records of items. The prediction of user u's rating on item j is expressed as follows:

$$r'_{uj1} = b_{uj} + (p_u + imp_u)(q_j + imq_j)^T \qquad (10)$$
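The following sketch shows how the two predictions of Eqs. (8) and (10) can be assembled from the implicit feature vectors of Eqs. (5)-(7). It is our own illustration: the variable names mirror the paper's notation, but the shapes, toy neighbor sets and random factors are assumptions.

```python
import numpy as np

def predict_rating(u, j, mu, b_u, b_j, P, Q, T, S, L, G, M, D):
    """Explicit prediction r'_uj1 of Eq. (10) and auxiliary attitude r'_uj2 of Eq. (8).

    P, Q: latent user/item factors; T, S, L: impact vectors t_v, s_w, l_i;
    G[u]: implicit neighbor set G_u; M[j]: users who rated item j (M_j);
    D[u]: items user u has positive attitude towards (D_u)."""
    f = P.shape[1]
    imp_u = np.sum(T[G[u]], axis=0) / np.sqrt(len(G[u])) if len(G[u]) else np.zeros(f)  # Eq. (5)
    imq_j = np.sum(S[M[j]], axis=0) / np.sqrt(len(M[j])) if len(M[j]) else np.zeros(f)  # Eq. (6)
    pos_u = np.sum(L[D[u]], axis=0) / np.sqrt(len(D[u])) if len(D[u]) else np.zeros(f)  # Eq. (7)
    r1 = mu + b_u[u] + b_j[j] + (P[u] + imp_u) @ (Q[j] + imq_j)  # Eq. (9) + Eq. (10)
    r2 = pos_u @ Q[j]                                            # Eq. (8)
    return r1, r2

# toy setup: 3 users, 4 items, latent dimension 2
rng = np.random.default_rng(0)
m, n, f = 3, 4, 2
P, Q, T, S, L = (0.1 * rng.standard_normal((k, f)) for k in (m, n, m, m, n))
G = {0: [1, 2], 1: [0], 2: [0]}          # implicit neighbors per user
M = {0: [0, 2], 1: [0], 2: [1], 3: [2]}  # users who rated each item
D = {0: [0, 1], 1: [1], 2: [3]}          # positively rated items per user
print(predict_rating(0, 2, mu=3.5, b_u=np.zeros(m), b_j=np.zeros(n),
                     P=P, Q=Q, T=T, S=S, L=L, G=G, M=M, D=D))
```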
In summary, we apply these three implicit feedbacks to Eq. (8) and Eq. (10) for prediction. Eq. (8) uses the implicit feedback of users' positive attitude, and Eq. (10) uses the implicit feedbacks of items' rated records and implicit user relationships. Considering the estimated explicit rating and implicit attitude in Eqs. (8) and (10), we obtain the objective function for recommendation. After estimating users' ratings, we can calculate the latent vectors by minimizing the sum-of-squared-errors objective function over $r'_{uj1}$ and $r'_{uj2}$ as follows:

$$C' = \frac{1}{2}\sum_{u=1}^{m}\sum_{j=1}^{n}\left\{\left(r'_{uj1} - r_{uj1}\right)^2 + \delta\left(r'_{uj2} - r_{uj2}\right)^2\right\} + \frac{\lambda}{2}\, k_{uj} \qquad (11)$$

where the first term in Eq. (11) is the error between the estimated explicit ratings and the real ratings, $\lambda k_{uj}/2$ is the regularization term, and $\delta$ is a parameter that controls the contribution of the implicit feedback of users' positive attitude. The regularization terms are

$$\frac{\lambda}{2}\, k_{uj} = \frac{\lambda}{2}\|b_u\|^2 + \frac{\lambda}{2}\|b_j\|^2 + \frac{\lambda}{2}\sum_{u}|G_u|^{-\frac{1}{2}}\|p_u\|^2 + \frac{\lambda}{2}\sum_{j}|M_j|^{-\frac{1}{2}}\|q_j\|^2 + \frac{\lambda}{2}\sum_{w}|G_w|^{-\frac{1}{2}}\|s_w\|^2 + \frac{\lambda}{2}\sum_{v}|G_v|^{-\frac{1}{2}}\|t_v\|^2 + \frac{\lambda}{2}\sum_{i}\|l_i\|^2 \qquad (12)$$
The adaptive regularization terms are used to prevent overfitting. $\lambda$ is the regularization coefficient. We can adjust the balance between the complexity of the recommendation model and the degree of fitting the training data by changing the value of $\lambda$. Penalties are added in the objective function such that active users and popular items receive a smaller penalty factor. Note that $\{b_u, b_j, p_u, q_j, s_w, t_v, l_i\}$ is the set of model parameters that need to be learned. The variables are obtained by SGD training of the objective function. Therefore, we need to calculate the partial derivative of the objective function with respect to each variable.
The gradients of the variables are listed as follows:

$$\frac{\partial C'}{\partial b_u} = \sum_{j} e_{uj1} + \lambda b_u$$

$$\frac{\partial C'}{\partial b_j} = \sum_{u} e_{uj1} + \lambda b_j$$

$$\frac{\partial C'}{\partial p_u} = \sum_{j} e_{uj1}\left(q_j + |M_j|^{-\frac{1}{2}}\sum_{w\in M_j} s_w\right) + \lambda |G_u|^{-\frac{1}{2}} p_u$$

$$\frac{\partial C'}{\partial q_j} = \sum_{u}\left(e_{uj1}\left(p_u + |G_u|^{-\frac{1}{2}}\sum_{v\in G_u} t_v\right) + e_{uj2}\,\delta\,|D_u|^{-\frac{1}{2}}\sum_{i\in D_u} l_i\right) + \lambda |M_j|^{-\frac{1}{2}} q_j$$

$$\forall v \in G_u,\quad \frac{\partial C'}{\partial t_v} = \sum_{j} e_{uj1}\,|G_u|^{-\frac{1}{2}}\left(q_j + |M_j|^{-\frac{1}{2}}\sum_{w\in M_j} s_w\right) + \lambda |G_u|^{-\frac{1}{2}} t_v$$

$$\forall w \in M_j,\quad \frac{\partial C'}{\partial s_w} = \sum_{u} e_{uj1}\,|M_j|^{-\frac{1}{2}}\left(p_u + |G_u|^{-\frac{1}{2}}\sum_{v\in G_u} t_v\right) + \lambda |M_j|^{-\frac{1}{2}} s_w$$

$$\forall i \in D_u,\quad \frac{\partial C'}{\partial l_i} = \sum_{j} e_{uj2}\,\delta\,|D_u|^{-\frac{1}{2}} q_j + \lambda l_i \qquad (13)$$

where $e_{uj1} = r'_{uj1} - r_{uj1}$ is the error of the estimated rating, and $e_{uj2} = r'_{uj2} - r_{uj2}$ is the error of the estimated auxiliary attitude. After calculating the partial derivatives of all variables, the variables are updated according to Eq. (14):

$$X = X - \alpha \frac{\partial C'}{\partial X} \qquad (14)$$

where $\alpha$ is the learning rate, and $X \in \{b_u, b_j, p_u, q_j, s_w, t_v, l_i\}$.
Algorithm 1 shows the pseudo code of model learning. The input of Algorithm 1 includes the rating matrix $R$, the number of similar users $N$, the regularization parameter $\lambda$, the learning rate $\alpha$, and the weight on the auxiliary evaluation $\delta$. First, we initialize the variables with random values within (0, 1) (line 1). Then, we perform SGD learning until convergence (lines 2-10). The key of our recommendation model is to compute the gradients of the variables (line 3). Finally, after the loop, the optimal variables are obtained and we can get the prediction results (line 11).
Algorithm 1. Learning in the RMIF Model
Input: R, N, λ, α, δ
Output: Rating predictions r'_uj1
1   Initialize b_u, b_j, p_u, q_j, s_w, t_v, l_i with small random values in (0, 1);
2   while not convergence do
3       compute gradients according to Eq. (13);
4       b_u = b_u - α ∂C'/∂b_u,  u = 1, 2, ..., m
5       b_j = b_j - α ∂C'/∂b_j,  j = 1, 2, ..., n
6       p_u = p_u - α ∂C'/∂p_u,  u = 1, 2, ..., m
7       q_j = q_j - α ∂C'/∂q_j,  j = 1, 2, ..., n
8       ∀v ∈ G_u, t_v = t_v - α ∂C'/∂t_v,  u = 1, 2, ..., m
9       ∀w ∈ M_j, s_w = s_w - α ∂C'/∂s_w,  j = 1, 2, ..., n
10      ∀i ∈ D_u, l_i = l_i - α ∂C'/∂l_i,  u = 1, 2, ..., m
11  return b_u, b_j, p_u, q_j, s_w, t_v, l_i
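A simplified training sketch in the spirit of Algorithm 1 is shown below. It is our own approximation, not the authors' implementation: it performs per-rating SGD steps instead of the full-batch gradients of Eq. (13), and it replaces the adaptive |G_u|^(-1/2) and |M_j|^(-1/2) regularization weights of Eq. (12) with a single λ for brevity.

```python
import numpy as np

def train_rmif(ratings, attitudes, G, M, D, m, n, f=10,
               lr=0.001, lam=0.001, delta=0.4, epochs=50, seed=0):
    """Simplified per-rating SGD training loop inspired by Algorithm 1.

    ratings / attitudes: lists of (u, j, value) triples from R and the auxiliary matrix.
    G[u]: implicit neighbors of user u; M[j]: users who rated item j;
    D[u]: items user u has positive attitude towards."""
    rng = np.random.default_rng(seed)
    b_u, b_j = np.zeros(m), np.zeros(n)
    P, Q = 0.1 * rng.random((m, f)), 0.1 * rng.random((n, f))
    T, S, L = 0.1 * rng.random((m, f)), 0.1 * rng.random((m, f)), 0.1 * rng.random((n, f))
    mu = np.mean([r for _, _, r in ratings])          # global average rating
    att = {(u, j): a for u, j, a in attitudes}        # auxiliary attitude lookup
    for _ in range(epochs):
        for u, j, r in ratings:
            Gu, Mj, Du = G[u], M[j], D.get(u, [])
            imp = T[Gu].sum(0) / np.sqrt(len(Gu)) if Gu else np.zeros(f)   # Eq. (5)
            imq = S[Mj].sum(0) / np.sqrt(len(Mj)) if Mj else np.zeros(f)   # Eq. (6)
            pos = L[Du].sum(0) / np.sqrt(len(Du)) if Du else np.zeros(f)   # Eq. (7)
            pu_imp, qj_imq = P[u] + imp, Q[j] + imq
            e1 = mu + b_u[u] + b_j[j] + pu_imp @ qj_imq - r                # explicit error
            e2 = pos @ Q[j] - att.get((u, j), 0.0)                         # attitude error
            b_u[u] -= lr * (e1 + lam * b_u[u])
            b_j[j] -= lr * (e1 + lam * b_j[j])
            P[u]   -= lr * (e1 * qj_imq + lam * P[u])
            Q[j]   -= lr * (e1 * pu_imp + delta * e2 * pos + lam * Q[j])
            if Gu:
                T[Gu] -= lr * (e1 * qj_imq / np.sqrt(len(Gu)) + lam * T[Gu])
            if Mj:
                S[Mj] -= lr * (e1 * pu_imp / np.sqrt(len(Mj)) + lam * S[Mj])
            if Du:
                L[Du] -= lr * (delta * e2 * Q[j] / np.sqrt(len(Du)) + lam * L[Du])
    return mu, b_u, b_j, P, Q, T, S, L
```

Here G, M and D would be precomputed dictionaries (neighbor sets, rated records and positive-attitude sets), and in practice one would monitor the validation error to decide convergence rather than fixing the number of epochs.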
4. Experiments

4.1 Datasets and evaluation metrics

In the experiments, we select the Movielens-100k (ML-100k) dataset released by the GroupLens lab. To verify the feasibility and robustness of the algorithm, we also extend our experiments to the larger Movielens-1M (ML-1m) dataset for retesting and verification [31]. Data in both datasets are recorded in the form of (u, j, $r_{uj}$), and all ratings are expressed as five-level integers (1-5). We construct the auxiliary binary matrix of users' attitude by converting ratings larger than or equal to 3 into positive attitude and the others into negative attitude. In the auxiliary attitude matrix, we use five points to represent positive attitude and one point to represent negative attitude. Before the experiments, each dataset is randomly split into a training set and a test set.

Movielens is a popular movie sharing website that records users' ratings on various movies. The ML-100k dataset is associated with 943 users and 3952 items. It contains a total of 100,000 ratings, and the density of the rating matrix is 2.7%. The ML-1m dataset is associated with 6040 users and 3952 items. It contains a total of 1,000,209 records, and the matrix density is 4.2%. Statistics of the datasets are given in Table 1. For both datasets, we choose 80% of the data as the training set, and the remaining data are regarded as the test set. We conduct the experiments 5 times independently, and the average values of the results are used as the final results.

Table 1 Overview of datasets

                                        Movielens-100k    Movielens-1M
users                                   943               6040
items                                   3952              3952
density                                 2.7%              4.2%
maximum number of ratings per user      586               2314
maximum number of ratings per item      47                3428
average number of ratings per user      106               165.6
average number of ratings per item      25.3              253.1
For the accuracy verification of the predicted results, we use two popular evaluation metrics. One is the Mean Absolute Error (MAE) given in Eq. (15). The other is the Root Mean Square Error (RMSE) given in Eq. (16). A smaller value of MAE or RMSE means that the model produces more accurate results.

$$\text{MAE} = \frac{\sum_{(u,j)\in T} \left|r_{uj} - r'_{uj}\right|}{|T|} \qquad (15)$$

$$\text{RMSE} = \sqrt{\frac{\sum_{(u,j)\in T} \left(r_{uj} - r'_{uj}\right)^2}{|T|}} \qquad (16)$$

where $T$ is the test set and $|T|$ is the number of ratings in $T$.
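The two metrics and the random 80/20 split used in the experiments are straightforward to implement; a short sketch (our own, with invented toy numbers) is:

```python
import numpy as np

def mae_rmse(y_true, y_pred):
    """MAE and RMSE of Eqs. (15) and (16) over the test set."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(err)), np.sqrt(np.mean(err ** 2))

def random_split(ratings, train_frac=0.8, seed=0):
    """Random 80/20 split of the (u, j, r) triples, as used in the experiments."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(ratings))
    cut = int(train_frac * len(ratings))
    return [ratings[i] for i in idx[:cut]], [ratings[i] for i in idx[cut:]]

print(mae_rmse([4, 3, 5], [3.5, 3.2, 4.1]))  # -> (0.533..., 0.605...)
```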
4.2 Baselines and parameter settings

To measure the effectiveness of our method, we compare it with several baseline and state-of-the-art recommendation models, including item-based collaborative filtering (ICF) [5], biasedMF [15], NMF [35], SVD++ [20], LLORMA [36], RBM [37] and BPoissMF [38].

ICF: this method uses the similarities between items to perform rating prediction.

biasedMF: this method decomposes a large rating matrix into two low-dimensional matrices to predict the missing values in the rating matrix. In addition, it integrates the biases of users and items into the model.

NMF: this method models user preferences and item attributes as non-negative vectors in a low-dimensional space.

SVD++: this method is based on MF. It integrates the implicit feedback of the items rated by each user.

LLORMA: this method originates from the real-world observation that a small set of users have an interest in only a small number of items, and these users form a submatrix of the whole rating matrix. The method obtains local low-rank submatrices and then combines them to predict the ratings.

RBM: this is the restricted Boltzmann machine model. It is a kind of generative stochastic neural network, and pays attention to the correlations between ratings on items.

BPoissMF: this method makes recommendations based on matrix factorization, using a Poisson distribution to model the feature vectors.

The recommendation accuracy is closely correlated with the values of the parameters. To balance the computation time, storage space, and recommendation performance, we choose 10 as the dimension of the latent feature vectors for all models. We set the parameters of these models either by studying relevant work [5,15,20,35-38] or by cross-validation. In ICF, we choose 20 as the neighborhood size. In biasedMF, NMF, and SVD++, we set the learning rate to 0.002 and the regularization parameter to 0.001. In LLORMA and RBM, we set the learning rate to 0.01 and the regularization parameter to 0.01. In BPoissMF, we set the learning rate to 0.01 and the regularization parameter to 0.001. We determine the optimal parameters for RMIF by cross-validation, as shown in Table 2. To study the influence of the parameters on the results, we conduct a series of experiments and analyze them in the following.
Table 2 Parameter settings of RMIF

Parameters                              Movielens-100k    Movielens-1M
Number of similar users N               10                10
Learning rate α                         0.0002            0.001
Regularization parameter λ              0.001             0.02
Weight on auxiliary evaluation δ        0.4               0.4
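The cross-validation used to pick these values is not specified in detail; one plausible sketch is a simple grid search such as the following, where train_rmif and evaluate_rmse are placeholders for the training and evaluation routines sketched earlier (the grid values are our own assumptions around the reported optima):

```python
import itertools

def select_parameters(train_set, valid_set, train_rmif, evaluate_rmse):
    """Hypothetical grid search over the RMIF hyper-parameters of Table 2."""
    grid = {
        "N":     [5, 10, 15, 20],        # number of similar users
        "alpha": [0.0002, 0.001, 0.01],  # learning rate
        "lam":   [0.001, 0.01, 0.02],    # regularization parameter
        "delta": [0.2, 0.4, 0.6],        # weight on auxiliary evaluation
    }
    best, best_rmse = None, float("inf")
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        model = train_rmif(train_set, **params)   # placeholder training call
        rmse = evaluate_rmse(model, valid_set)    # placeholder evaluation call
        if rmse < best_rmse:
            best, best_rmse = params, rmse
    return best, best_rmse
```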
Figure 2 shows the impact of the parameter $N$, the number of neighbors used for the implicit feedback of user similarities. The parameter $N$ affects both the complexity of the method and the accuracy of the results. If $N$ is small, combining only a few similar neighbors is not enough to express each user's preference, so the recommendation performance may degrade. If $N$ is too large, the neighbor set becomes too complex and the method has large computational complexity. Therefore, we should choose the best value of $N$. In Fig. 2, when $N$ increases, MAE and RMSE on both datasets initially decrease. However, when $N$ exceeds 10, MAE and RMSE level off and the changes are no longer obvious. The reason is that for $N \geq 10$, the information about similarities is sufficient and the feedback of user similarities works well in the recommendation process. Thus, we choose 10 users as similar users.
Fig. 2. Influence of the number of similar users N: (a) MAE, (b) RMSE, versus the number of similar users, on ML-100K and ML-1M.
Figure 3 shows the effect of the number of iterations on MAE and RMSE. Since the gradient calculation and parameter updates are carried out in each iteration, SGD based on iterative calculation is very time consuming. Too few iterations lead to inaccurate results, but excessive iterations may lead to overfitting [39]. In Fig. 3, it is observed that within 300 iterations, MAE and RMSE decrease as the number of iterations increases, for both ML-100k and ML-1m. More than 300 iterations result in higher computational complexity without a significant performance improvement in return.
Fig. 3. Influence of the number of iterations T: (a) MAE, (b) RMSE, versus the number of iterations, on ML-100k and ML-1M.
Figure 4 shows the results with different values of the regularization parameter $\lambda$. The role of regularization is to prevent overfitting during learning. If $\lambda$ is very small, it may not effectively prevent overfitting; however, if $\lambda$ is overly large, the results will be greatly biased and the contribution of existing ratings is excessively reduced. To find the optimal value of $\lambda$, we conduct experiments on both datasets. As illustrated in Fig. 4, for ML-100k the value of MAE is smallest when $\lambda = 0.001$, and for ML-1m the highest accuracy is achieved when $\lambda = 0.01$. Therefore, we set 0.001 as the optimal regularization parameter for ML-100k and 0.01 for ML-1m.
Fig. 4. Influence of the regularization parameter λ: (a) MAE, (b) RMSE, versus the regularization parameter, on ML-100K and ML-1M.
Fig. 5. Influence of the weight on auxiliary data δ: (a) MAE, (b) RMSE, versus the weight on auxiliary data, on ML-100K and ML-1M.
Figure 5 shows the effect of the weight on the auxiliary evaluation $\delta$. Users' positive attitude is extracted as auxiliary information in addition to explicit ratings. Users' attitude is associated with ratings by mapping them into the same low-dimensional latent space. We assume that the latent feature vector of each item in the prediction of attitude is the same as that in the prediction of ratings. In Fig. 5, when the value of $\delta$ is 0.4, the best performance is obtained. If the value of $\delta$ is lower than 0.4, the feedback of positive attitude does not play its full role. However, when the value of $\delta$ exceeds 0.4, the role of explicit ratings is underestimated in the objective function, so the errors of the estimated ratings increase. Especially when $\delta > 0.8$, the accuracy on ML-100K starts a precipitous decline as $\delta$ increases. Therefore, we choose 0.4 as the optimal value of $\delta$.

4.3 Comparison with other models

We compare the results of our method with other state-of-the-art recommendation models. The parameters of all models are set to their optimal values. Table 3 shows the results of the different models for all users. The baseline model ICF performs the worst among all the comparison models, since it is only a memory-based CF. BPoissMF obtains more accurate latent features from the Poisson distribution, and therefore performs better than biasedMF and NMF. SVD++ incorporates the implicit influence of the rating behaviors of each user, and shows a notable improvement in performance compared with traditional MF methods. LLORMA performs the best among all the comparison methods. LLORMA approximates the rating matrix by a weighted sum of local submatrices, so the problem of sparsity is alleviated. In addition, our approach outperforms all comparison models on both datasets. For instance, RMIF reduces RMSE by as much as 3% compared with LLORMA on the ML-100k dataset. Koren pointed out that even small changes in MAE and RMSE lead to large differences in results [31]. Overall, the results demonstrate the effectiveness of our method. Multiplex implicit feedbacks help to explore the preferences of users and the features of items, and thus improve the accuracy of recommendation.
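As a quick check of the 3% figure, using the ML-100k RMSE values reported in Table 3:

$$\frac{\text{RMSE}_{\text{LLORMA}} - \text{RMSE}_{\text{RMIF}}}{\text{RMSE}_{\text{LLORMA}}} = \frac{0.926 - 0.898}{0.926} \approx 3.0\%$$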
Table 3 Comparison of recommendation results on MAE and RMSE

Dataset    Metric   ICF            biasedMF       NMF            SVD++          LLORMA         RBM            BPoissMF       RMIF
ML-100k    MAE      0.827 ± 0.004  0.761 ± 0.004  0.753 ± 0.004  0.724 ± 0.005  0.719 ± 0.003  0.745 ± 0.004  0.731 ± 0.003  0.703 ± 0.002
ML-100k    RMSE     1.035 ± 0.004  0.970 ± 0.005  0.967 ± 0.004  0.928 ± 0.004  0.926 ± 0.003  0.943 ± 0.004  0.928 ± 0.002  0.898 ± 0.002
ML-1M      MAE      0.782 ± 0.003  0.732 ± 0.003  0.725 ± 0.003  0.692 ± 0.003  0.683 ± 0.004  0.713 ± 0.003  0.699 ± 0.003  0.670 ± 0.002
ML-1M      RMSE     0.988 ± 0.003  0.934 ± 0.003  0.928 ± 0.003  0.877 ± 0.002  0.867 ± 0.003  0.917 ± 0.002  0.894 ± 0.001  0.851 ± 0.001

Fig. 6. Performance comparison of different methods for cold-start users: (a) ML-100K, (b) ML-1M; bars show MAE and RMSE for ICF, biasedMF, NMF, SVD++, LLORMA, RBM, BPoissMF and RMIF.
The recommendation performance often degrades as a result of the cold-start problem [40]. Here, we design experiments to compare the recommendation accuracy of these methods for cold-start users. We choose the users who have given fewer than 10 ratings as cold-start users. As shown in Fig. 6, although all results for cold-start users are worse than those for normal users, the variations of MAE and RMSE across the different methods differ greatly. This implies that different methods have different capabilities of resisting the cold-start problem. ICF and biasedMF have relatively high values of MAE and RMSE; therefore, they are not well suited to the task of recommendation for cold-start users. SVD++ utilizes the implicit influence of rated items, and can better mitigate the cold-start effect. LLORMA has similar accuracy to SVD++ on both datasets. RMIF again performs the best for cold-start users. These results prove that RMIF can work in cold-start situations.
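A small sketch of how such a cold-start split could be produced is given below; it is our own illustration, and we assume the 10-rating threshold is counted over the training ratings:

```python
from collections import Counter

def split_cold_start(test_set, train_set, threshold=10):
    """Split test ratings into cold-start and normal users.
    A cold-start user has fewer than `threshold` ratings in the training set."""
    counts = Counter(u for u, _, _ in train_set)
    cold = [(u, j, r) for u, j, r in test_set if counts[u] < threshold]
    warm = [(u, j, r) for u, j, r in test_set if counts[u] >= threshold]
    return cold, warm

# usage sketch: evaluate MAE/RMSE separately on each group with mae_rmse() from Section 4.1
```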
Here, we evaluate the computational complexity of RMIF and some relevant methods. The computational time is spent on calculating the objective function $C'$ and all the gradients of the variables. We define $f$ as the dimension of the latent feature vectors and $N$ as the number of implicit neighbors. $m$ represents the number of users, and $n$ represents the number of items. $|R|$ and $|A|$ represent the total number of ratings and auxiliary evaluations, respectively. The computational complexity of calculating $C'$ is $O(2f|R| + 2f|A| + fmN)$. In RMIF, we need to compute the gradients of the variables $\{b_u, b_j, p_u, q_j, s_w, t_v, l_i\}$ in Eq. (13). Their computational complexities in an iteration are $O(f|R|)$, $O(f|R|)$, $O(f|R|)$, $O(f|R| + f|A|)$, $O(f|R|n_1)$, $O(f|R|N)$ and $O(f|A|n_2)$, where $n_1$ represents the average number of ratings received by an item, and $n_2$ represents the average number of items towards which a user has positive attitude. In fact, the auxiliary matrix is transformed from the rating matrix, so $|R|$ and $|A|$ are the same. Therefore, the computational complexity of RMIF in a single iteration is $O(f|R|n_{\max})$, where $n_{\max} = \max(n_1, n_2, N)$. Since $n_{\max} \ll |R|$, the computational complexity of our method is linear in the total number of ratings, and RMIF can be used on large datasets. The computational complexity of biasedMF in an iteration is $O(f|R|)$, and that of SVD++ is $O(f|R|n_3)$, where $n_3$ is the average number of ratings a user gives. Since both $n_{\max}$ and $n_3$ are much smaller than $|R|$, RMIF achieves a significant performance improvement while not adding much computation. Based on the above analysis, the proposed recommendation method is effective and scalable.
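As a rough, back-of-the-envelope illustration of this bound (our own arithmetic, using the ML-1M statistics in Table 1 with f = 10 and N = 10, so that n_max is dominated by the average number of ratings per item, n_1 ≈ 253):

$$f\,|R|\,n_{\max} \approx 10 \times 10^{6} \times 253 \approx 2.5 \times 10^{9}$$

basic operations per iteration, which is consistent with the claim that the per-iteration cost grows linearly with the total number of ratings.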
4.4 Effects of implicit feedbacks

In this paper, three kinds of implicit feedbacks are combined in the algorithm to provide recommendations. For further verification, we examine which implicit feedback plays the most significant role in the recommendation process, and we design the following experiments to verify their roles. By removing one of the implicit feedbacks at a time, RMIF uses only the effects of the other two implicit feedbacks. The results are shown in Table 4. We use the set {(user similarities, rated records of items, users' positive attitude) = (0,1,1), (1,0,1), (1,1,0), (1,1,1), (0,0,0)} to represent the implicit feedbacks we choose. The value 1 means that the corresponding implicit feedback is used, and the value 0 means that it is not used. The method with (1,1,1) is RMIF, while the method with (0,0,0) uses no implicit feedback and reduces to MF.

Table 4 Effects of different combinations of implicit feedbacks
Dataset           Implicit Feedback    MAE      RMSE
Movielens-100k    (1,1,1)              0.703    0.898
Movielens-100k    (0,1,1)              0.710    0.904
Movielens-100k    (1,0,1)              0.713    0.918
Movielens-100k    (1,1,0)              0.720    0.923
Movielens-100k    (0,0,0)              0.751    0.960
Movielens-1M      (1,1,1)              0.670    0.851
Movielens-1M      (0,1,1)              0.676    0.862
Movielens-1M      (1,0,1)              0.682    0.864
Movielens-1M      (1,1,0)              0.697    0.871
Movielens-1M      (0,0,0)              0.722    0.924
As shown in Table 4, RMIF, which incorporates all three kinds of implicit feedbacks, still achieves the best performance. The values of MAE and RMSE for the different feedback settings on both datasets satisfy (1,1,1) < (0,1,1) < (1,0,1) < (1,1,0) < (0,0,0). When we use only two of the implicit feedbacks, the recommendation accuracy decreases. However, compared with the method with (0,0,0), the accuracy with any combination of implicit feedbacks is greatly improved. This proves that the implicit feedbacks we introduce into the recommendation are effective and reasonable. In addition, when users' positive attitude is included, the recommendation accuracy increases the most. The feedback of items' rated records makes a slightly larger contribution to recommendation than that of user similarities.
5. Conclusion
In contrast to explicit ratings, implicit information can describe users' preferences more realistically, and helps to solve the problem of data sparsity in recommendation. In this paper, we explored three kinds of implicit feedbacks through our observations: user similarities, rated records of items and users' positive attitude. To perform rating prediction, we proposed a recommendation method which incorporates multiplex implicit feedbacks. We introduced the three implicit feedbacks into the matrix factorization framework. The feedbacks of user similarities and items' rated records contribute to predicting unknown ratings, and the feedback of users' positive attitude is used as an auxiliary feature. Experiments on two real-world datasets, i.e., ML-100k and ML-1m, showed that our method outperforms other state-of-the-art recommendation models. By verifying the role of each implicit feedback separately, we showed that the three implicit feedbacks we use are reasonable. Our recommendation method based on implicit feedbacks greatly improves accuracy. However, in real life, data on online social media are more random and explosive than those used in our experiments. A series of unintended or erroneous operations by users may result in mutations of implicit feedbacks. In the future, to make implicit feedbacks more reasonable and efficient, we will focus on how to maintain the stability of the method and eliminate noisy information.
Acknowledgments
This work has been supported by the National Natural Science Foundation of China under Grant 61872033, the Humanity and Social Science Youth Foundation of Ministry of Education of China under Grant 18YJCZH204 and 17YJCZH007, the Fundamental Research Funds for the Central Universities in UIBE under Grant CXTD10-05, the Research Funds for Excellent Young Scholars in UIBE under Grant 17YQ21, and the Beijing Natural Science Foundation under Grant 4184084.
References
[1] Wang, M., Lim, E.-P., Li, L., Orgun, M. [2016] "Behavior analysis in social networks: Challenges, technologies, and trends." Neurocomputing 210, pp. 1-2.
[2] Resnick, P., Varian, H. R. [1997] "Recommender systems." Commun. ACM 40, pp. 56-58.
[3] Guo, G., Zhang, J. and Yorke-Smith, N. [2016] "A novel recommendation model regularized with user trust and item ratings." IEEE Transactions on Knowledge and Data Engineering 28, pp. 1607-1620.
[4] Deshpande, M., Karypis, G. [2004] "Item-based top-n recommendation algorithms." ACM Transactions on Information Systems 22, pp. 143-177.
[5] Li, H., Cui, J., Shen, B., Ma, J. [2016] "An intelligent movie recommendation system through group-level sentiment analysis in microblogs." Neurocomputing 210, pp. 164-173.
[6] Koren, Y., Bell, R., Volinsky, C. [2009] "Matrix factorization techniques for recommender systems." Computer 42(8), pp. 30-37.
[7] Ji, K. and Shen, H. [2015] "Addressing cold-start: Scalable recommendation with tags and keywords." Knowledge-Based Systems 83, pp. 42-50.
[8] Li, H., Cao, J., Jiang, H. and Alsaedi, A. [2018] "Finite-time synchronization of fractional-order complex networks via hybrid feedback control." Neurocomputing 320, pp. 69-75.
[9] Hyoung, B. and Ho, J. [2018] "Sampled-data static output-feedback control for nonlinear systems in T-S form via descriptor redundancy." Neurocomputing 318, pp. 1-6.
[10] Wang, H., Shen, H., Xie, X., Hayat, T. and Alsaadi, F. [2018] "Robust adaptive neural control for pure-feedback stochastic nonlinear systems with Prandtl-Ishlinskii hysteresis." Neurocomputing 314, pp. 169-176.
[11] Yu, J., Yang, X. G., Gao, F. and Tao, D. C. [2017] "Deep multimodal distance metric learning using click constraints for image ranking." IEEE Transactions on Cybernetics 47(12), pp. 4014-4024.
[12] Yu, J., Tao, D. C., Rui, Y. and Wang, M. [2015] "Learning to rank using user clicks and visual features for image retrieval." IEEE Transactions on Cybernetics 45(4), pp. 767-779.
[13] Ji, K., Sun, R., Li, X., Shu, W. [2016] "Improving matrix approximation for recommendation via a clustering-based reconstructive method." Neurocomputing 173, pp. 912-920.
[13] Yu, J., Rui, Y. and Tao, D. C. [2014] "Click prediction for web image reranking using multimodal sparse coding." IEEE Transactions on Image Processing 23(5), pp. 2019-2032.
[14] Yu, J., Rui, Y. and Chen, B. [2014] "Exploiting click constraints and multiview features for image reranking." IEEE Transactions on Multimedia 16(1), pp. 159-168.
[15] Zheng, F., Huang, W. and Jia, M. [2015] "Matrix factorization recommendation algorithm based on Spark." Journal of Computer Applications 35, pp. 2781-2783.
[16] Loftin, R., Peng, B., MacGlashan, J., Littman, M. L., Taylor, M. E., et al. [2016] "Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning." Autonomous Agents and Multi-Agent Systems 30(1), pp. 1-30.
[17] Ma, H., Zhou, T., Lyu, M. and King, I. [2011] "Recommender systems with social regularization." in Proc. Fourth ACM Int. Conf. Web Search Data Mining, pp. 287-296.
[18] Bottou, L. [2012] "Stochastic gradient descent tricks." Neural Networks: Tricks of the Trade, Springer Berlin Heidelberg, pp. 421-436.
[19] Yu, L., Liu, C. and Zhang, Z. [2015] "Multi-linear interactive matrix factorization." Knowledge-Based Systems 85, pp. 307-315.
[20] Koren, Y. [2008] "Factorization meets the neighborhood: A multifaceted collaborative filtering model." in Proc. 14th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, pp. 426-434.
[21] Salakhutdinov, R. and Mnih, A. [2008] "Probabilistic matrix factorization." in Proc. Adv. Neural Inform. Process. Syst. 20, pp. 1257-1264.
[22] Pan, W., Zhong, H., Xu, C. and Ming, Z. [2015] "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks." Knowl.-Based Syst. 73, pp. 173-180.
[23] Polato, M., Aiolli, F. [2016] "Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation." Neurocomputing 268, pp. 17-26.
[24] Bok, K., Lim, J., Yang, H., Yoo, J. [2016] "Social group recommendation based on dynamic profiles and collaborative filtering." Neurocomputing 209, pp. 3-13.
[25] Koren, Y. [2010] "Factor in the neighbors: Scalable and accurate collaborative filtering." ACM Trans. Knowl. Discovery Data 4(1), pp. 1-24.
[26] Segrera, S. [2016] "Web mining based framework for solving usual problems in recommender systems. A case study for movies' recommendation." Neurocomputing 176, pp. 72-80.
[27] Li, Y., Wang, D., He, H., Jiao, L., Xue, Y. [2017] "Mining intrinsic information by matrix factorization-based approaches for collaborative filtering in recommender systems." Neurocomputing 249, pp. 48-63.
[28] Wang, Q., Liu, X., Zhang, S., Jiang, Y., Du, F., Yue, Y., Liang, Y. [2015] "A novel APP recommendation method based on SVD and social influence." in Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, pp. 269-281.
[29] Zhang, X., Yang, D. and Zhu, B. [2016] "Recommendation of user's interest based on multiple similar features." Journal of Xian Polytechnic University 30, pp. 97-101.
[30] Wu, Y., Dubois, C., Zheng, A. X., Ester, M. [2016] "Collaborative denoising auto-encoders for top-n recommender systems." in Proceedings of the ACM International Conference on Web Search & Data Mining, pp. 153-162.
[31] Meng, X., Liu, S., Zhang, Y., Hu, X. [2015] "Research on social recommender systems." Journal of Software 26, pp. 1356-1372.
[32] Yu, J., Kuang, Z. Z., Zhang, B. P., Zhang, W., Lin, D. and Fan, J. P. [2018] "Leveraging content sensitiveness and user trustworthiness to recommend fine-grained privacy settings for social image sharing." IEEE Transactions on Information Forensics and Security, DOI: 10.1109/TIFS.2017.2787986.
[33] Yao, W., He, J., Huang, G., Zhang, Y. [2014] "Modeling dual role preferences for trust-aware recommendation." in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inform. Retrieval, pp. 975-978.
[34] Baltrunas, L., Ludwig, B. and Ricci, F. [2011] "Matrix factorization techniques for context aware recommendation." in Proceedings of the Fifth ACM Conference on Recommender Systems, pp. 301-304.
[35] Costa, A. F. D., Manzato, M. G. [2016] "Exploiting multimodal interactions in recommender systems with ensemble algorithms." Information Systems 56, pp. 120-132.
[36] Lopes, R. and Santos, R. L. T. [2016] "Efficient Bayesian methods for graph-based recommendation." in Proceedings of the ACM Conference on Recommender Systems, pp. 333-340.
[37] Hosseini, S. A., Alizadeh, K., Khodadadi, A., Arabzadeh, A. [2017] "Recurrent Poisson factorization for temporal recommendation." in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 847-855.
[38] Liu, Y., Tong, Q., Du, Z., Hu, L. [2014] "Content-boosted restricted Boltzmann machine for recommendation." in Proceedings of the International Conference on Artificial Neural Networks, pp. 773-780.
[39] Zhang, W. Y., Zhang, Z. J., Chao, H. C. and Tseng, F. H. [2018] "Kernel mixture model for probability density estimation in Bayesian classifiers." Data Mining and Knowledge Discovery 32, pp. 675-707.
[40] Xiong, F., Wang, X. M., Pan, S. R., Yang, H., Wang, H. S. and Zhang, C. Q. [2019] "Social recommendation with evolutionary opinion dynamics." IEEE Transactions on Systems, Man, and Cybernetics: Systems, DOI: 10.1109/TSMC.2018.2854000.
Yutian Hu received the B.E. degree in communication and information systems from Beijing Jiaotong University, Beijing, China, in 2018. From 2016 to 2017, she was an exchange student at National Central University, Taiwan. She is currently a master student with the School of Electronic and Information Engineering, Beijing Jiaotong University. Her current research is mainly on recommender systems.

Fei Xiong received the B.E. degree and the Ph.D. degree in communication and information systems from Beijing Jiaotong University, Beijing, China, in 2007 and 2013. He is currently an Associate Professor with the School of Electronic and Information Engineering, Beijing Jiaotong University. From 2011 to 2012, he was a visiting scholar at Carnegie Mellon University. He has published over 60 papers in refereed journals and conference proceedings. He was a recipient of National Natural Science Foundation of China grants and several other research grants. His current research interests include web mining, complex networks and complex systems.

Dongyuan Lu received the B.S. degree from Beijing Normal University, Beijing, China, in 2007, and the Ph.D. degree from CASIA in 2012. She is currently an Associate Professor in the School of Information Technology & Management, University of International Business and Economics. Her research interests include social media analysis, information retrieval and data mining.

Ximeng Wang received the B.E. degree in communication engineering from the Nanjing University of Posts and Telecommunications, Nanjing, China, in 2011, and the M.E. degree in software engineering from Beijing Jiaotong University, Beijing, China, in 2013. He is currently pursuing the Ph.D. degree with Beijing Jiaotong University. Since 2017, he has been a joint Ph.D. student with the University of Technology Sydney. His current research interests include recommender systems, complex networks and data mining.

Xi Xiong received the B.S. and M.S. degrees from the Beijing Institute of Technology and the Ph.D. degree in information security from Sichuan University, Chengdu, in 2013. He is currently a lecturer with the School of Cybersecurity, Chengdu University of Information Technology, Chengdu, China. He has published over 20 papers in prestigious journals and conferences. His research interests include web mining, social computing and machine learning.

Hongshu Chen received dual Ph.D. degrees in Management Science and Engineering from Beijing Institute of Technology, Beijing, China, in 2015, and in Software Engineering from the University of Technology Sydney, NSW, Australia, in 2016. She is currently a Lecturer with the School of Management and Economics, Beijing Institute of Technology. Her research interests include bibliometrics, information systems and text analytics.