Future Generation Computer Systems 29 (2013) 262–270
Contents lists available at SciVerse ScienceDirect
Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs
Smoothing approach to alleviate the meager rating problem in collaborative recommender systems
M.K. Kavitha Devi a,∗, P. Venkatesh b
a Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, 625015, India
b Department of Electrical and Electronics Engineering, Thiagarajar College of Engineering, Madurai, Tamil Nadu, 625015, India
Article info
Article history: Received 28 May 2010; received in revised form 28 January 2011; accepted 28 May 2011; available online 18 July 2011.
Keywords: Collaborative filtering; Meager rating problem; Cold start problem; Fuzzy radial basis function network
Abstract
In this paper, an integrated recommendation approach using a Radial Basis Function Network (RBFN) and Collaborative Filtering (CF) is proposed. An RBFN is a neural network approximation method, used here to improve the accuracy of the recommendations. The proposed system, RBFN_KFCM, has offline and online phases. In the offline phase, the system uses the RBFN for smoothing and the kernel fuzzy c-means (KFCM) method for clustering. For online recommendation, a KFCM-based approach is proposed. At the end of each session the clusters are updated by replacing the smoothed ratings with the original ratings. A comparison is made with benchmark recommender systems such as item-based and user-based systems and singular value decomposition (SVD), and with popular machine learning techniques such as the Support Vector Machine (SVM) and the Multilayer Perceptron (MLP) trained with the backpropagation algorithm, in terms of accuracy, decision-support measures and computational time. Empirical evaluation is carried out using the Movielens dataset.
Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved.
∗ Corresponding author. E-mail addresses: [email protected] (M.K.K. Devi), [email protected] (P. Venkatesh).
0167-739X/$ – see front matter Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved. doi:10.1016/j.future.2011.05.011
M.K.K. Devi, P. Venkatesh / Future Generation Computer Systems 29 (2013) 262–270

1. Introduction
The number of commodities in the e-market grows rapidly, and it becomes more and more difficult for customers to find what they want. Personalization techniques [1] help customers to find the right products. These systems apply statistics and knowledge discovery techniques to make product recommendations during a live customer interaction, and they have achieved widespread success. They enhance e-commerce sales by turning browsers into buyers, improving cross-selling and building customer loyalty.
Different recommender systems have been proposed in the literature [2]. Recommender systems can be broadly categorized as content-based and collaborative filtering (CF) based systems. Content-based recommender systems match customer interest profiles (e.g., revealed by their highly rated products) with the product attributes (or features) when making recommendations. Bad choices of features result in recommender systems with either the shallow-analysis problem or the over-specialization problem [3]. CF-based recommender systems compute recommendations from similarities between customer preference ratings. Since this approach does not depend on the product contents, it is free from the two problems of the content-based approach and has thus been widely used for recommending products whose descriptions are either lacking or too specific to be useful.
Several CF recommendation technologies have been reported in the literature. Some popular techniques include correlation-based methods [4,5], latent semantic indexing (LSI) [6,7], and Bayesian learning [8,9]. These systems employ the Pearson correlation function and statistical machine learning techniques to find similar users or products. A CF recommender system provides accurate recommendations when a sufficiently large number of similar users have rated the products and their ratings overlap in product coverage. However, this may not be the case in reality, because users' ratings are lacking for a large pool of products (the meager rating problem). The meager rating problem leads to few correlated users, which results in poor recommendations [10].
In this paper, a Radial Basis Function Network (RBFN) is employed to address the meager rating problem in CF-based recommender systems. The proposed system has two phases, an offline phase and an online phase. In the offline phase, the meager rating matrix is smoothed using the RBFN and then the rating matrix is clustered. KFCM is used for clustering; it assigns a user proportionately to more than one cluster (to know the user's dependability in each cluster). In the online phase, KFCM is used to predict the unknown ratings of the active user. At the end of each session the fuzzy c-means clusters are updated by replacing the smoothed ratings with the original ratings. Experimental results based on the Movielens dataset show that the proposed model produces high quality recommendations. In terms of accuracy and relevance, the results show that the proposed model gives better recommendations than benchmark recommender systems such as item-based, user-based and Singular Value Decomposition (SVD) systems, and than popular machine learning techniques such as the Support Vector Machine (SVM) and the Multilayer Perceptron trained with the backpropagation algorithm.
The rest of the paper is organized as follows. In Section 2, a study of the existing systems is provided. In Section 3, the architectural framework of the overall system is proposed. Section 4 discusses the offline activities of the proposed system. In Section 5, the online activities of the proposed system are presented. Empirical results are presented in Section 6. Finally, Section 7 summarizes the work and suggests future research directions.

2. Related work
Extensive research is going on to overcome the meager rating problem. In [11], the authors used a coefficient factor to calculate the similarity between users. The coefficient is the ratio of the number of co-rated items to the total number of items. In [12], items are recommended to the active user based on an attraction index factor. In [13], an item-based clustering method is used instead of a user-based clustering method. The authors also noted the drawbacks of dimensionality reduction techniques in terms of accuracy and relevance. In [14], the system uses a classification approach for recommendation. In [15], a dimensionality reduction technique, Singular Value Decomposition (SVD), was used to reduce the dimensionality of meager rating matrices. Another approach, which also explores the liking and disliking among users using a latent class model, is proposed in [16].
Applying simple clustering or some statistical cluster models to the preference ratings has been shown to improve the local density of the ratings and is considered a promising remedy for the meager rating problem. In [17], radial basis smoothing is proposed. In [18], various activation functions for smoothing are compared; the paper showed that the Gaussian activation approach is better than the others. In [19], the authors replaced the similarity weight in prediction with a trust weight obtained by trust propagation over the trust network. The authors also proposed that trust decreases along propagation. A comparison between MoleTrust and DecTrust based on the Epinions.com dataset showed that a trust metric can improve the accuracy while keeping coverage. In [20], CF based on rough set theory was proposed to solve the meager rating problem; it predicts the values of the null ratings in the candidates and obtains the results using the user's neighbors. It maximizes the number of co-rated items of different users and thus improves the reliability of the similarity computation in CF. In [21], the authors argued that trust-awareness can solve some of the traditional problems of recommender systems and proposed a framework of Trust-Aware Recommender Systems. In [22], the authors select the users who have high similarity with the active user. The final prediction for the active user is not linearly correlated with the users' votes, which accords with the real world. This algorithm gives more accurate results than the existing algorithms. In [23], an a-priori score is used to infer user preference and to pre-fill the null ratings first. After pre-filling using the ontology method, CF on the community is executed based on a dense rating matrix. In [24], the authors extend the idea of analyzing user–item interactions as graphs and employ link prediction approaches proposed in the recent network modeling literature for making CF recommendations.
In [25], the authors proposed a novel graph propagation method which joins the domain category tree and the Citeulike database associate graph. In [26], the authors aggregated all of the intrinsic metadata at the user and item level and also used tags for computing recommendations. In [27], the authors propose a graph kernel-based recommendation framework. For each user–item pair, an Associative Interaction Graph (AIG) is inspected that contains the users, items, and interactions n steps away from the pair. This graph is used to predict the possible user–item interactions. The authors of [28] reformulate the memory-based CF problem in a generative probabilistic framework, treating individual user–item ratings as predictors of missing ratings. The final rating is estimated by fusing the predictions from three sources: predictions based on ratings of the same item by other users, predictions based on different item ratings made by the same user, and ratings predicted from data of other but similar users rating other but similar items. In [29], the authors extend the basic random walk model by calculating the product similarities (i.e. transition probabilities) through a weighted bipartite network and allowing the current shopping behaviors to influence the product ranking (i.e. importance) scores. In [30], a hybrid recommender that combines Matrix Factorization, k-nearest neighbor, SVD Cross-Linked Rating and Content Information is proposed to solve the meager rating problem.
Most of the existing online recommender systems use CF; these systems use additional information to overcome the meager rating problem. The Beehive, Grouplens, smart radio and tapestry systems use implicit information such as browsing history, time spent, and comments. In addition to this information, some online systems such as Amazon, Walmart, Barnes and Noble, CDNow, Fab, Movielens and Websell use content information of the items. The Orkut system obtains the trust placed in other users.
Moreover, these systems use correlation or cosine functions to find similar users or products.

3. Framework of the proposed system using RBFN
The framework of the proposed system is given in Fig. 1. As shown in Fig. 1, the smoothing part of the offline phase is the same for both approaches. For user modeling in the offline phase, the kernel fuzzy c-means (KFCM) algorithm is used in RBFN_KFCM. In the online phase, KFCM is used to predict the unknown ratings of the active user. In the model-based approach, clustering is the primary task. It is difficult to find correlated users in a highly meager rating matrix, so the meager user–item rating matrix is first smoothed into a complete matrix. The smoothing process is done using RBFN. The following details are available:
1. A set of X users $\{u_i \mid i = 1, 2, \ldots, X\}$.
2. A set of Y different items $\{p_i \mid i = 1, 2, \ldots, Y\}$.
3. A rating table $r_{i,j}$, an [X × Y] matrix that contains the rating of the ith user for the jth item, within the rating range. Non-rated items are represented by a value of zero.

4. Off-line activities of the proposed system
Two activities are carried out offline, namely the smoothing process (Phase-I) and the user modeling process (Phase-II).

4.1. Smoothing process (Phase-I)
In the smoothing process, the meager user–item rating matrix is converted to a complete rating matrix using RBFN.
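As a small illustration of the rating table described in Section 3, the sketch below stores the 5-user, 10-item matrix that the Appendix gives as Table 1 and measures how meager it is. Only the zero-for-unrated convention and the matrix values come from the paper; the helper name `meager_level` is an assumption made for this illustration.

```python
# Rating table r[i][j]: rows are users U1..U5, columns are items I1..I10.
# A value of 0 marks a non-rated item, as in the paper; ratings lie in 1..5.
RATINGS = [
    [3, 0, 0, 0, 2, 0, 0, 5, 0, 0],  # U1
    [0, 1, 0, 0, 4, 0, 0, 0, 0, 0],  # U2
    [1, 3, 0, 0, 0, 0, 1, 0, 2, 1],  # U3
    [0, 0, 4, 2, 0, 0, 1, 0, 0, 0],  # U4
    [4, 0, 0, 4, 0, 2, 0, 0, 0, 0],  # U5
]


def meager_level(ratings):
    """Fraction of user-item cells without a rating (hypothetical helper)."""
    cells = sum(len(row) for row in ratings)
    unrated = sum(1 for row in ratings for value in row if value == 0)
    return unrated / cells


if __name__ == "__main__":
    print(f"meager level: {meager_level(RATINGS):.0%}")
```

For this matrix, 34 of the 50 cells are unrated, which is the 68% sparsity quoted in the caption of Table 1.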
Fig. 1. Framework of the proposed system using RBFN.

4.1.1. Radial basis function network (RBFN)
RBFN is a family of artificial neural networks, originally developed by Hardy to fit irregular topographical contours of geographical data [31,32]. RBFNs enjoy the best approximation property among all feed-forward networks and have produced excellent fits to arbitrary contours of both deterministic and stochastic response functions. An RBFN has three layers: an input layer, a hidden layer and an output layer. The input layer contains X neurons, to which the user's rating vector is input. This layer is fully connected to all the neurons in the hidden layer. A nonlinear transformation (unsupervised learning) is applied from the input layer to the hidden layer. Each neuron in the hidden layer has an activation function. The hidden layer is also fully connected to the output layer. A linear transformation (supervised learning) is applied from the hidden layer to the output layer. The output layer contains X neurons, from which the user's smoothed rating vector is output. The output layer performs a simple summation function on users.

4.1.1.1. Activation function of RBFN
The radial basis function technique consists of choosing a function F as given in Eq. (1).
\[ F(X_i) = \sum_{j=1}^{k} w_j \, \varphi\left(\lVert X_i - C_j \rVert\right) \tag{1} \]
where $w_j$ denotes the weight vector from the hidden layer to the output layer, $X_i$ is the given set of points, $C_j$ is the center of the given set of points, $\lVert \cdot \rVert$ denotes the norm and $\varphi$ is the activation function. In [2] the authors suggested three activation functions, viz. the Gaussian, multiquadric and thin-plate spline functions, as defined in Eqs. (2)–(4).
1. Gaussian function:
\[ \varphi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right) \quad \text{for some } \sigma > 0. \tag{2} \]
2. Multiquadric function:
\[ \varphi(r) = r^{\beta} \quad \text{where } \beta > 0 \text{ is a positive odd number.} \tag{3} \]
3. Thin-plate spline function:
\[ \varphi(r) = r^{\beta} \log(r) \quad \text{where } \beta > 0 \tag{4} \]
where $r = \lVert X_i - C_j \rVert$, $\sigma$ is the positive-valued shaping parameter and $\beta$ is a positive-valued parameter. These activation functions are tried in [18], where the authors found the Gaussian to be the best in terms of accuracy. In this paper the Gaussian activation function is used in the RBF.

4.1.1.2. Unsupervised learning phase of RBFN
To find the centers of the RBF function, clustering based on Pearson correlation similarity is used. The algorithm clusters X objects, based on their attributes, into k partitions with k < X. The clustering process is described in Algorithm 1.

Algorithm 1. Clustering using correlation function.
Input: $r_{u,t}$ is the meager user–item rating matrix, $\bar{r}_i$ is the average rating of user i.
Output: k clusters.
Method:
1. For i = 1 to X /* similarity is calculated between X users */
1.1. sim_count(i) = 0
1.2. For j = 1 to X
1.2.1. Find the Pearson correlation similarity between user i and user j, using Eq. (5).
\[ sim_{a,u} = \frac{\sum_{t=1}^{Y} (r_{a,t} - \bar{r}_a)(r_{u,t} - \bar{r}_u)}{\sqrt{\sum_{t=1}^{Y} (r_{a,t} - \bar{r}_a)^2 \, \sum_{t=1}^{Y} (r_{u,t} - \bar{r}_u)^2}} \tag{5} \]
1.2.2. If $sim_{i,j} = 1$ then sim_count(i) = sim_count(i) + 1.
2. Identify the top k sim_count users as the cluster centers for the k clusters.
3. Assign the users whose similarity is 1 to the respective clusters.

4.1.1.3. Supervised learning phase of RBFN
Supervised learning can be viewed as approximating a mapping between the hidden layer and the output layer. This phase is a linear phase, in which the weights are refined. The initial weight is calculated in three different ways, viz. the pseudorandom weight function, the output weight function and the random weight function.
1. Pseudorandom weight function: The pseudorandom weight function is defined in Eq. (6)
\[ w_i = \frac{\dfrac{\varphi_I^{\max} - \varphi_I^{\min}}{\left(\varphi_I^{\max} - \varphi_I\right) r_{i,I}}}{\sum_{x=1}^{n} \dfrac{\varphi_x^{\max} - \varphi_x^{\min}}{\left(\varphi_{I,x}^{\max} - \varphi_{I,x}\right) r_{x,I}}} \tag{6} \]
where $\varphi_I^{\max}$ and $\varphi_I^{\min}$ are the maximum and minimum values of the activation function for the particular item I, $r_{i,I}$ is the rating for the item I by the user i and $\varphi_{I,x}^{\max}$ is the maximum activation function value for the item I.
2. Output weight function: The weight value is also calculated in terms of the input and output rating matrices (supervised learning) as defined in Eq. (7).
\[ w = O * I' \tag{7} \]
where O represents the output matrix and $I'$ refers to the inverse of the input matrix. Since the input matrix is highly meager, it is not possible to take the inverse of the matrix.
3. Random weight function: The weights can be assigned randomly with values in the range 0.5–1.
The initial weight is updated based on the difference between the calculated $F(X_i)$ (as defined in Eq. (1)) and the output rating matrix. In simulation, it is observed that the pseudorandom weight function gives more accurate results than the others.
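The building blocks of this section, namely the weighted sum of Eq. (1), the activation functions of Eqs. (2)–(4) and the Pearson similarity of Eq. (5), can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, σ = 1.1 is a value inferred here because it reproduces Gaussian entries of Table 4 in the Appendix, and β = 2 applied to the raw rating is inferred from the spline entries of Table 5.

```python
import math


def gaussian(r, sigma=1.1):
    """Eq. (2): exp(-r^2 / (2 sigma^2)); sigma = 1.1 is an assumed value."""
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def multiquadric(r, beta=1):
    """Eq. (3) as stated in the paper: r^beta, beta a positive odd number."""
    return r ** beta


def thin_plate_spline(r, beta=2):
    """Eq. (4): r^beta * log(r); taken as 0 at r = 0 (its limit value)."""
    return 0.0 if r == 0 else (r ** beta) * math.log(r)


def rbf_output(x, centers, weights, phi=gaussian):
    """Eq. (1): F(x) = sum_j w_j * phi(|x - C_j|) for a scalar input x."""
    return sum(w * phi(abs(x - c)) for w, c in zip(weights, centers))


def pearson(ratings_a, ratings_u):
    """Eq. (5): Pearson correlation of two users over all Y items."""
    mean_a = sum(ratings_a) / len(ratings_a)
    mean_u = sum(ratings_u) / len(ratings_u)
    dev_a = [r - mean_a for r in ratings_a]
    dev_u = [r - mean_u for r in ratings_u]
    denom = math.sqrt(sum(d * d for d in dev_a) * sum(d * d for d in dev_u))
    if denom == 0.0:  # a constant rating vector has no defined correlation
        return 0.0
    return sum(a * u for a, u in zip(dev_a, dev_u)) / denom


if __name__ == "__main__":
    # Item I1 of the Appendix: cluster centers 1/3 and 3.5.
    print(round(gaussian(abs(3 - 3.5)), 3))    # U1, rating 3, cluster 2
    print(round(gaussian(abs(0 - 1 / 3)), 3))  # U2, unrated, cluster 1
    print(round(thin_plate_spline(3), 3))      # spline value for rating 3
```

With these assumed parameters the three printed values are 0.902, 0.955 and 9.888, which are the entries listed for item I1 in Tables 4 and 5 of the Appendix.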
4.1.2. Smoothing algorithm
The smoothing algorithm converts the meager user–item rating matrix to a complete rating matrix. The smoothing algorithm is described in Algorithm 2.

Algorithm 2. Smoothing
Input: Meager user–item rating matrix $\langle r_{i,j} \mid 1 \le i \le X \text{ and } 1 \le j \le Y \rangle$; $\varphi_I^{\max}$ and $\varphi_I^{\min}$ are the maximum and minimum values of the activation function for the particular item I, $r_{i,I}$ is the rating for the item I by the user i and $\varphi_{I,x}^{\max}$ is the maximum activation function value for the item I.
Output: Complete (smoothed) user–item rating matrix $\langle sr_{u,t} \mid 1 \le u \le X \text{ and } 1 \le t \le Y \rangle$.
Method:
1. Let range = (max_rating − min_rating) + 1.
2. Find the number of clusters k, such that ⌊range/k⌋ ≤ 3.
3. For j = 1 to Y
3.1. Partition the users into k clusters using an unsupervised clustering technique. Let $x = r_{i,j}$ where 1 ≤ i ≤ X. Call the function defined in Algorithm 1.
3.2. Calculate the centers
\[ C_k = \frac{1}{k_1} \sum_{p=1}^{k_1} r_{i,p} \]
where $k_1$ denotes the number of users that belong to that cluster.
3.3. Find the Euclidean distance matrix $g_{i,p} = \lVert r_{i,p} - C_k \rVert$ where $1 \le p \le k_1$. Calculate the activation function $\varphi_{i,p}(g)$ using Eq. (2).
3.4. Calculate the weights using Eq. (6).
3.5. Calculate $\hat{r}_{i,j} = F(r_{i,j})$ using Eq. (1).
3.6. Calculate the complete rating matrix $sr_{u,t}$ using Eq. (8).
\[ sr_{u,t} = \begin{cases} r_{u,t} & \text{if user } u \text{ rated the item } t \\ \hat{r}_{u,t} & \text{otherwise} \end{cases} \tag{8} \]

4.2. User modeling (Phase-II)
To address the scalability problem, the resultant smoothed matrix is clustered before the recommendation process. This process is different for RBFN_KFCM.

4.2.1. RBFN_KFCM
In the recommendation phase, the number of ratings provided by the active user is highly meager. The existing clustering methods, such as k-means, similarity-based clustering or expectation maximization, map the data into a potentially much higher rating space. The drawback of these clustering algorithms is that the clustering prototypes lie in a high-dimensional rating space and hence lack clear and insightful descriptions (meager user–item matrix). So, Zhang and Chen [33] proposed the kernel fuzzy clustering algorithm, which is used in RBFN_KFCM for user modeling. The analysis shows that fuzzy clustering is robust to noise and outliers and also tolerates unequal-sized clusters better than other methods. The clustering process is explained in Algorithm 3.

Algorithm 3. Clustering using KFCM technique.
Input: $sr_{u,t}$ is the complete user–item rating matrix, $sr_i$ is the Y-dimensional rating vector of user i, k is the number of clusters.
Output: Cluster centers $C_j$ and degrees of membership $m_{i,j}$ for user i and cluster j (each user has a membership value in the range 0–1, which indicates the interest of user i in each cluster j).
Method:
1. Initialize the degree of membership $m_{i,j} = \mathrm{randperm}(j) / \sum \mathrm{randperm}(j)$ and set q = 1.
2. Calculate the cluster center $C_j$ (1 ≤ j ≤ k) defined in Eq. (9).
\[ C_j = \frac{\sum_{i=1}^{X} m_{i,j}^{q} \cdot sr_i}{\sum_{i=1}^{X} m_{i,j}^{q}} \tag{9} \]
3. Calculate the membership $m_{i,j}$ (1 ≤ i ≤ X, 1 ≤ j ≤ k) defined in Eq. (10).
\[ m_{i,j} = \frac{1}{\sum_{t=1}^{k} \left( \dfrac{\lVert sr_i - C_j \rVert}{\lVert sr_i - C_t \rVert} \right)^{\frac{2}{q-1}}} \tag{10} \]
4. Calculate Error = max(old $m_{i,j}$, new $m_{i,j}$).
5. If Error > 0.5, then q = q + 1; go to Step 3.

5. On-line activity: recommendation process of the proposed system
In the recommendation phase, the active user gets recommendations based on similar users from multiple clusters. The recommendation process is given in Algorithm 4.
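The alternating updates of Algorithm 3 (Eqs. (9) and (10)) and the prediction rule of Eq. (11), which Algorithm 4 applies online, can be sketched together as follows. This is a minimal sketch under several assumptions, not the authors' code: the function names are hypothetical, the fuzzifier is fixed at q = 2 (the conventional fuzzy c-means choice, since the exponent 2/(q − 1) is undefined for q = 1), the loop stops when the largest membership change falls below a tolerance, which swaps in a common stopping rule for the Error test of Algorithm 3, and σ = 1.1 is an assumed kernel width.

```python
import math


def distance(u, v):
    """Euclidean distance between two rating vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def update_centers(vectors, memberships, q=2.0):
    """Eq. (9): centers as membership-weighted means of the rating vectors."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** q for row in memberships]
        total = sum(weights)
        centers.append([
            sum(w * vec[t] for w, vec in zip(weights, vectors)) / total
            for t in range(len(vectors[0]))
        ])
    return centers


def memberships_for(vector, centers, q=2.0, eps=1e-12):
    """Eq. (10) for one user; eps avoids division by zero at a center."""
    exponent = 2.0 / (q - 1.0)
    dists = [max(distance(vector, c), eps) for c in centers]
    return [1.0 / sum((d_j / d_t) ** exponent for d_t in dists)
            for d_j in dists]


def fuzzy_cluster(vectors, memberships, q=2.0, tol=1e-4, max_iter=100):
    """Alternate Eqs. (9) and (10) until memberships change by less than tol."""
    for _ in range(max_iter):
        centers = update_centers(vectors, memberships, q)
        updated = [memberships_for(vec, centers, q) for vec in vectors]
        change = max(abs(a - b)
                     for old, new in zip(memberships, updated)
                     for a, b in zip(old, new))
        memberships = updated
        if change < tol:
            break
    return centers, memberships


def predict(active, centers, q=2.0, sigma=1.1):
    """Eq. (11): mean of the centers weighted by m_ai * phi(R_a, C_i)."""
    member = memberships_for(active, centers, q)
    weights = [m * math.exp(-distance(active, c) ** 2 / (2.0 * sigma ** 2))
               for m, c in zip(member, centers)]
    total = sum(weights)
    if total == 0.0:  # every kernel value underflowed; use memberships only
        weights, total = member, sum(member)
    return [sum(w * c[j] for w, c in zip(weights, centers)) / total
            for j in range(len(centers[0]))]


if __name__ == "__main__":
    users = [[1, 1], [1, 2], [5, 5], [5, 4]]          # toy smoothed ratings
    start = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
    centers, member = fuzzy_cluster(users, start)
    print([[round(c, 2) for c in center] for center in centers])
    print([round(p, 2) for p in predict([1, 1], centers)])
```

The toy data are two well-separated groups of users, so the memberships become nearly crisp and the prediction for an active user close to the first group stays near that group's center. Each user's memberships sum to 1 by construction of Eq. (10).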
Algorithm 4. Recommendation for RBFN_KFCM.
Input: Active user's rating vector $R_a$, N denotes the number of items to be recommended, cluster centers $C_j$, and $\varphi(\ldots)$ is the Gaussian activation function defined in Eq. (2).
Output: Item recommendation for the user.
Functional Specification:
1. If the active user is a new user and the rating vector is NULL (new user cold start), then send the N top rated items from each cluster to the recommender agent.
2. If the active user is a new user with some rating vector, then do Steps 2.1 and 2.2 and send the N top predicted rating items to the recommender agent.
2.1. Calculate the degree of membership $m_{a,i}$ of the active user a for all the clusters using Eq. (10).
2.2. Predict the unknown ratings of the active user a, using the function defined in Eq. (11).
\[ KFCM\_P_{a,j} = \frac{\sum_{i=1}^{k} m_{a,i} \, \varphi(R_a, C_i) \, C_{i,j}}{\sum_{i=1}^{k} m_{a,i} \, \varphi(R_a, C_i)} \tag{11} \]
3. If the active user is an existing user with no additional ratings, then send the N top rating items predicted in the training phase to the recommender agent.
4. If the active user is an existing user with additional ratings, then perform the actions of Step 2.

6. Experimental results and discussion
6.1. Test set
6.1.1. Data set preparation
In order to analyze the proposed system, experiments have been performed using the Movielens dataset. The dataset (www.grouplens.org/.../datasets/) was collected by the GroupLens Research Project at the University of Minnesota. This data set consists of 100,000 ratings (in the range 1–5) from 943 users on 1682 movies. In order to analyze the influence of meager ratings in the data set, the data is randomly extracted so as to obtain meager rating levels of 50%, 60%, 70%, 80%, 90% and 99%. To analyze the computation time and the error during the modeling and recommendation phases, the data set is split into training and test data (80%–20%).
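The data preparation can be illustrated with a short sketch. The paper does not specify how the ratings were extracted, so everything here is an assumption made for illustration: the helper names, the fixed random seeds, and the strategy of hiding randomly chosen ratings, which can only raise the meager level of a matrix and never lower it.

```python
import random


def make_meager(ratings, level, seed=0):
    """Hide rated cells at random until `level` of all cells are 0.

    Hypothetical helper: if the matrix is already more meager than
    `level`, it is returned unchanged (as a copy).
    """
    rng = random.Random(seed)
    rows, cols = len(ratings), len(ratings[0])
    result = [row[:] for row in ratings]
    rated = [(i, j) for i in range(rows) for j in range(cols)
             if result[i][j] != 0]
    keep = round((1.0 - level) * rows * cols)
    keep = max(0, min(keep, len(rated)))  # cannot keep more than exist
    rng.shuffle(rated)
    for i, j in rated[keep:]:
        result[i][j] = 0  # 0 marks a non-rated item
    return result


def split_ratings(ratings, train_fraction=0.8, seed=0):
    """Split the rated cells into (user, item, rating) train and test lists."""
    rng = random.Random(seed)
    rated = [(i, j, r) for i, row in enumerate(ratings)
             for j, r in enumerate(row) if r != 0]
    rng.shuffle(rated)
    cut = round(train_fraction * len(rated))
    return rated[:cut], rated[cut:]


if __name__ == "__main__":
    # A small, fully rated toy matrix (4 users x 5 items, ratings 1..5).
    dense = [[(i + j) % 5 + 1 for j in range(5)] for i in range(4)]
    for level in (0.5, 0.8):
        meager = make_meager(dense, level)
        zeros = sum(value == 0 for row in meager for value in row)
        print(level, zeros / 20)
    train, test = split_ratings(dense)
    print(len(train), len(test))
```

Fixing the seed keeps the extracted subsets reproducible across runs, which matters when several methods are compared on the same meager level.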
6.2. Performance measures
High-quality experiments are necessary in order to know the benefits and limitations of the proposed recommendation techniques. Herlocker et al. [34] stated that the performance evaluation of recommendation algorithms is usually done in terms of accuracy measures. Accuracy measures can be either statistical or decision-support measures. Statistical accuracy metrics mainly compare the predicted ratings against the actual ratings in the user–item matrix and include the Mean Absolute Error (MAE). The MAE measure is defined in Eq. (12).
\[ MAE = \frac{\sum_{i=1}^{X} \sum_{j=1}^{Y} \lvert P_{i,j} - r_{i,j} \rvert}{X * Y} \tag{12} \]
where $X$, $Y$, $P_{i,j}$ and $r_{i,j}$ represent the number of users, the number of items, the predicted rating of user i for item j and the true rating of user i for item j.
Decision-support measures determine how well a recommender system can make predictions of high-relevance items. They include the classical information retrieval measures of precision (the ratio of relevant items selected by the recommender to the number of recommended items) and recall (the ratio of relevant items selected to the number of relevant items). These measures are often conflicting in nature. For instance, increasing the number of recommended items N tends to increase recall but decrease precision. Precision and recall are defined in Eqs. (13) and (14) respectively.
\[ precision = \frac{\lvert \{relevant\_items\} \cap \{recommended\_items\} \rvert}{\lvert \{recommended\_items\} \rvert} \tag{13} \]
\[ recall = \frac{\lvert \{relevant\_items\} \cap \{recommended\_items\} \rvert}{\lvert \{relevant\_items\} \rvert} \tag{14} \]
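The three measures of Eqs. (12)–(14) can be computed as in the sketch below. The function names and the toy inputs are assumptions made for illustration; the MAE follows Eq. (12) literally and averages over all X × Y cells.

```python
def mean_absolute_error(predicted, actual):
    """Eq. (12): mean of |P_ij - r_ij| over all X * Y user-item cells."""
    cells = [(p, r)
             for p_row, r_row in zip(predicted, actual)
             for p, r in zip(p_row, r_row)]
    return sum(abs(p - r) for p, r in cells) / len(cells)


def precision_recall(relevant, recommended):
    """Eqs. (13) and (14) on two collections of item identifiers."""
    relevant, recommended = set(relevant), set(recommended)
    hits = len(relevant & recommended)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    predicted = [[3, 4], [2, 5]]
    actual = [[3, 2], [1, 5]]
    print(mean_absolute_error(predicted, actual))
    print(precision_recall({"I1", "I2", "I3"}, ["I2", "I3", "I4", "I5"]))
```

In the toy example the absolute errors are 0, 2, 1 and 0, giving an MAE of 0.75; two of the four recommended items are relevant, so precision is 2/4 and recall is 2/3. Recommending more items can only add hits, which is why recall tends to rise while precision falls as N grows.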
6.3. Results and discussion
6.3.1. To investigate the influence of meager rating in data set
To investigate the influence of meager ratings, the proposed system is compared with benchmark recommender systems such as the item-based, user-based and singular value decomposition (SVD) systems. The primary process of the proposed system is smoothing, so it is also compared with the average rating fill system ICRS [35], where the user's average rating is filled in for the non-rated items. The comparison result is shown in Fig. 2, where the x axis shows the meager level of the training and testing dataset and the y axis gives the mean absolute error. The observations from the comparison of MAE are as follows:
• SVD and user-based CF show a drastic increase in error as the meager level increases.
• ICRS [35] shows medium performance, even when the meager level increases.
• Item-based CF shows better performance than the other methods (except RBFN_KFCM).
• Even though the RBFN_KFCM error increases as the meager level increases, RBFN_KFCM shows better performance than all the other methods.
Thus, the experimental results show that the proposed system outperforms the other benchmark systems for various meager levels.
Fig. 2. Prediction error comparison of RBFN_KFCM with existing systems on different meager level for movielens dataset.
Fig. 3. Smoothing time and error comparison of RBFN system with existing systems for movielens dataset.
6.3.2. To investigate the efficiency of RBFN_KFCM with other popular algorithms
RBFN_KFCM is compared with a variety of popular recommender algorithms: singular value decomposition (SVD), the support vector machine (SVM) with a polynomial kernel function (SVM_poly), the SVM with a radial basis function kernel (SVM_RBF) and multilayer perceptrons (MLP) trained with the backpropagation algorithm using a sigmoid activation function. The SVM results are computed using the Gist support vector machine and kernel principal components analysis software toolkit (Version 2.0.9) [36]. The other methods are modeled in MatLab software [37]. The efficiency is measured in terms of computational time, mean absolute error, precision and recall (coverage). Fig. 3 depicts the smoothing time and error comparison of the RBFN with the other existing systems. The observations from the comparisons are as follows:
• SVD is a dimensionality reduction technique, so its modeling time outperforms the other systems, whereas its error is comparatively high.
• As item-based smoothing takes place in RBFN_KFCM, its modeling time is high, whereas its MAE is much lower than that of the other systems.
• The smoothing phase is the same for both RBFN_CF and RBFN_KFCM, so the performance is also the same.
• SVM and MLP have medium performance in terms of time and error.
The prediction time and error comparison of the RBFN system with the other existing systems is shown in Fig. 4. The observations reported are as follows:
• The accuracy is better than that of SVM_poly but worse than that of the other methods.
• For the SVM methods, the polynomial technique is better than RBF in terms of time, but it is worse on the accuracy measure.
• The time needed for MLP is the same as for SVD, but the accuracy of MLP is better than that of SVD.
• For the data smoothed using RBF, the recommendation time for an existing user (with no modified ratings) is less than that of SVD and MLP, with good accuracy, because the ratings are already predicted in the training phase itself.
• The RBFN_CF prediction time for a new user or an existing user (with additional ratings) is much poorer than that of SVD, MLP and RBFN_KFCM, but its accuracy is better than that of the others except RBFN_KFCM.
The decision support measure comparison is shown in Fig. 5. The observations reported are as follows:
• RBFN_KFCM provides better decision support measures than the other techniques.
• The precision and recall for RBFN_KFCM are 98%.
• Other systems have varying precision and recall; MLP has 98% recall (coverage) but not the same precision.
• SVM_RBF outperforms SVM_Poly, SVM and MLP.
• SVD has only 60% coverage and precision.

Fig. 4. Prediction time and error comparison of RBFN system with existing systems for movielens dataset.

Fig. 5. Decision support measure comparison of RBFN system with existing systems for movielens dataset.

7. Conclusion and future work
In this paper, an RBF smoothing technique is presented and experimentally evaluated to overcome the meager rating and cold start user problems in CF-based recommender systems. The results show that KFCM-based prediction provides better results than the other techniques, including the RBF-smoothed correlation-cluster-based prediction technique. The system is refreshed at the end of each session. Compared with existing online CF-based recommender systems such as Amazon, Movielens, Ringo, smart radio and Fab, the proposed system overcomes the meager rating problem without using any additional information such as purchase history, browsing time or features of the item. Experimental results show that the fuzzy c-means similarity calculation is better than standard similarity functions such as the cosine or Pearson correlation functions, in terms of time and accuracy. As a future enhancement, the computational time of the training phase may be reduced by smoothing the entire item set at once instead of smoothing item by item.

Acknowledgments
We thank the movielens organization for providing the high quality data set. We also thank the GIST software for the SVM model.

Appendix
In this section, the proposed method is detailed with an example. There are two phases: the offline phase (smoothing phase) and the online phase (recommendation phase).
OFFLINE PHASE.
Input: Meager Rating Matrix SR.
Intermediate output: Smoothed Rating matrix FR.
Output: User Modeling using KFCM.
Method: Consider the rating matrix shown in Table 1. There are 5 users {U1, U2, U3, U4, U5} and 10 items {I1, I2, . . . , I9, I10}. The ratings are in the range 1 (poor) to 5 (good), and the non-rated items are represented by 0. The first step in the training phase is the unsupervised learning phase, where the users are clustered based on the rating of each item. As the rating range is 1–5, the number of clusters is 2. Table 2 shows the cluster centers for each item. Since the original ratings for item 7 are 0's and 1's, the cluster center for both clusters is 1. Based on the cluster centers, the users are grouped into the clusters. The cluster to which each user belongs is shown in Table 3. For item I1, users U1 and U5 belong to cluster 2 and the remaining users are in cluster 1.

Table 1
Original sparse rating matrix (SR), 68% sparsity.
      I1  I2  I3  I4  I5  I6  I7  I8  I9  I10
U1     3   0   0   0   2   0   0   5   0   0
U2     0   1   0   0   4   0   0   0   0   0
U3     1   3   0   0   0   0   1   0   2   1
U4     0   0   4   2   0   0   1   0   0   0
U5     4   0   0   4   0   2   0   0   0   0

Table 2
Cluster center.
                   I1    I2    I3   I4   I5   I6  I7  I8  I9  I10
Cluster center 1   0.33  0.25  1    0.5  0.5  1   1   1   1   0.2
Cluster center 2   3.5   3     4    4    4    2   1   5   2   1

Table 3
User cluster index for each item.
      I1  I2  I3  I4  I5  I6  I7  I8  I9  I10
U1     2   1   1   1   1   1   1   2   1   1
U2     1   1   1   1   2   1   1   1   1   1
U3     1   2   1   1   1   1   2   1   2   1
U4     1   1   2   1   1   1   2   1   1   1
U5     2   1   1   2   1   2   1   1   1   1
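The first steps of the worked example can be traced with a short sketch: the choice of the number of clusters (steps 1 and 2 of Algorithm 2) and the cluster centers of item I1 (step 3.2). The function names are hypothetical, and taking the smallest k that satisfies ⌊range/k⌋ ≤ 3 is an assumption, since the paper does not say which admissible k is chosen. The sketch is checked against item I1 only and makes no claim about the other entries of Table 2.

```python
def number_of_clusters(min_rating, max_rating):
    """Smallest k with floor(range / k) <= 3 (assumed tie-breaking rule)."""
    rating_range = (max_rating - min_rating) + 1
    k = 1
    while rating_range // k > 3:
        k += 1
    return k


def cluster_centers(item_ratings, cluster_index, k):
    """Step 3.2: mean rating of the users assigned to each cluster."""
    centers = []
    for c in range(1, k + 1):
        members = [r for r, idx in zip(item_ratings, cluster_index)
                   if idx == c]
        # An empty cluster has no mean; None marks that case.
        centers.append(sum(members) / len(members) if members else None)
    return centers


if __name__ == "__main__":
    k = number_of_clusters(1, 5)
    i1_ratings = [3, 0, 1, 0, 4]   # column I1 of Table 1 (U1..U5)
    i1_clusters = [2, 1, 1, 1, 2]  # column I1 of Table 3
    print(k, cluster_centers(i1_ratings, i1_clusters, k))
```

For the rating range 1–5 this gives k = 2, and for item I1 the centers are 1/3 (users U2, U3, U4) and 3.5 (users U1, U5), in line with the I1 column of Table 2.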
268
M.K.K. Devi, P. Venkatesh / Future Generation Computer Systems 29 (2013) 262–270
Table 4 Gaussian activation function values.
U1 U2 U3 U4 U5
I1
I2
I3
I4
I5
I6
I7
I8
I9
I10
0.902 0.955 0.832 0.955 0.902
0.975 0.793 1 0.975 0.975
0.662 0.662 0.662 1 0.662
0.902 0.902 0.902 0.395 1
0.395 1 0.902 0.902 0.902
0.662 0.662 0.662 0.662 1
0.662 0.662 1 1 0.662
1 0.662 0.662 0.662 0.662
0.662 0.662 1 0.662 0.662
0.984 0.984 0.768 0.984 0.984
Table 5 Spline activation function values.
U1 U2 U3 U4 U5
I1
I2
I3
I4
I5
I6
I7
I8
I9
I10
9.888 0 0 0 22.181
0 0 9.888 0 0
0 0 0 22.181 0
0 0 0 2.773 22.181
2.773 22.181 0 0 0
0 0 0 0 2.773
0 0 0 0 0
40.236 0 0 0 0
0 0 2.773 0 0
0 0 0 0 0
Table 6 Weight matrix.
U1 U2 U3 U4 U5
I1
I2
I3
I4
I5
I6
I7
I8
I9
I10
0 0.525 0.686 0.525 0
0.469 3.814 0 0.469 0.469
1.024 1.024 1.024 0 1.024
0.408 0.408 0.408 2.514 0
2.763 0 0.448 0.448 0.448
1.038 1.038 1.038 1.038 0
1.321 1.321 0 0 1.321
0 0.964 0.964 0.964 0.964
0.937 0.937 0 0.937 0.937
4.041 4.041 0 4.041 4.041
Table 7
Weight matrix.

Item   U1      U2      U3      U4      U5
I1     0       0.525   0.686   0.525   0
I2     0.469   3.814   0       0.469   0.469
I3     1.024   1.024   1.024   0       1.024
I4     0.408   0.408   0.408   2.514   0
I5     2.763   0       0.448   0.448   0.448
I6     1.038   1.038   1.038   1.038   0
I7     1.321   1.321   0       0       1.321
I8     0       0.964   0.964   0.964   0.964
I9     0.937   0.937   0       0.937   0.937
I10    4.041   4.041   0       4.041   4.041
The last step in the unsupervised learning phase is applying the activation function. Tables 4 and 5 show the Gaussian and spline activation function values, respectively, for the original rating matrix.

Table 8
Smoothed full rating matrix (FR), Gaussian, 0% sparsity.

Item   U1   U2   U3   U4   U5
I1     3    2    1    2    4
I2     4    1    3    4    4
I3     3    3    3    4    3
I4     2    2    2    2    4
I5     2    4    2    2    2
I6     3    3    3    3    2
I7     3    3    1    1    3
I8     5    3    3    3    3
I9     2    2    2    2    2
I10    5    5    1    5    5

Table 9
Smoothed full rating matrix (FR), spline, 52% sparsity.

Item   U1   U2   U3   U4   U5
I1     3    1    1    1    4
I2     0    1    3    0    0
I3     0    0    0    4    0
I4     1    1    1    2    4
I5     2    4    1    1    1
I6     0    0    0    0    2
I7     0    0    1    1    0
I8     5    0    0    0    0
I9     0    0    2    0    0
I10    0    0    1    0    0

Table 10
KFCM user modeling.

            U1    U2    U3    U4    U5
Cluster 1   0.5   0.4   0.7   0.5   0.6
Cluster 2   0.5   0.6   0.3   0.5   0.4

Table 11
Prediction for Table 8 (Gaussian), error 0.056.

Item   U1   U2   U3   U4   U5
I1     3    4    1    4    4
I2     4    1    3    4    4
I3     3    3    3    4    3
I4     4    4    4    2    4
I5     2    4    2    2    2
I6     2    2    2    2    2
I7     3    3    1    1    3
I8     5    3    3    3    3
I9     2    2    2    2    2
I10    5    5    1    5    5

Table 12
Prediction for Table 9 (spline), error 0.125.

Item   U1   U2   U3   U4   U5
I1     3    4    1    4    4
I2     1    1    3    1    1
I3     1    1    1    4    1
I4     4    4    4    2    4
I5     2    4    1    1    1
I6     2    2    2    2    2
I7     1    1    1    1    1
I8     5    1    1    1    1
I9     1    1    2    1    1
I10    1    1    1    1    1

Table 13
Prediction for Table 1 (original).

Item   U1   U2   U3   U4   U5
I1     3    4    1    4    4
I2     1    1    3    1    1
I3     1    1    1    4    1
I4     4    4    4    2    4
I5     2    4    1    1    1
I6     2    2    2    2    2
I7     1    1    1    1    1
I8     5    1    1    1    1
I9     1    1    2    1    1
I10    1    1    1    1    1
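The 52% sparsity quoted for the spline-smoothed matrix (Table 9) can be checked by counting its zero entries. The Python sketch below assumes sparsity is defined as the fraction of zero (unfilled) cells; `sparsity` and `spline_fr` are hypothetical names.

```python
# Sparsity is taken here as the fraction of zero (unfilled) cells (assumption).

def sparsity(matrix):
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)

# Rows are items I1..I10, columns are users U1..U5 (values of Table 9).
spline_fr = [
    [3, 1, 1, 1, 4],
    [0, 1, 3, 0, 0],
    [0, 0, 0, 4, 0],
    [1, 1, 1, 2, 4],
    [2, 4, 1, 1, 1],
    [0, 0, 0, 0, 2],
    [0, 0, 1, 1, 0],
    [5, 0, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 1, 0, 0],
]

print(f"{sparsity(spline_fr):.0%}")
```

Of the 50 cells, 26 are zero, which gives the stated 52%.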
Tables 6 and 7 show the weights calculated from the Gaussian and spline activation functions, respectively, for the original rating matrix. Using the Gaussian activation function values in Table 4 and the weights in Table 6, the full matrix is calculated; it is shown in Table 8. The training error is 0.3 and the rating matrix is completely filled. Using the spline activation function values in Table 5 and the weights in Table 7, the full matrix is calculated; it is shown in Table 9. The training error is 0.35 and the rating matrix is not completely filled. Finally, the users are clustered using KFCM. Table 10 shows the user modeling using KFCM.

Table 14
Top-2 recommendations.

User                 Rating                           Prediction                       Recommendation
New user             No (cold start user)             No                               Item 10, 8
New user             [0, 1, 0, 0, 4, 0, 0, 0, 0, 0]   [4, 1, 3, 4, 4, 2, 3, 3, 2, 5]   Item 10, 4
Existing user (U2)   No new rating                    No                               Item 10, 3
Existing user (U2)   [0, 1, 0, 0, 4, 0, 0, 5, 0, 2]   [4, 1, 3, 4, 4, 2, 3, 5, 2, 2]   Item 1, 4

ONLINE PHASE.
Input: User rating matrix (Table 1).
Output: Recommendations for users.
Method: Table 1 is given as input. The KFCM predictions for the Gaussian-smoothed, spline-smoothed and original rating matrices are given in Tables 11, 12 and 13, respectively. The predictions made using the spline-smoothed and the original (meager) rating matrices are the same, so spline smoothing shows no improvement for the given rating matrix, and Gaussian smoothing performs comparatively better. Table 14 shows the user rating details and the top-2 recommendations for the Gaussian-smoothed rating matrix. The spline-smoothed and original rating matrices fail to provide recommendations for new users.
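The selection of top-2 recommendations can be sketched in Python using the last row of Table 14 (existing user U2 with new ratings). The sketch assumes that already-rated items are excluded and that the remaining items are ranked by predicted rating; breaking ties by the lower item number is an assumption and may not reproduce every row of the table. `top_n` is a hypothetical helper name.

```python
def top_n(ratings, predictions, n=2):
    """Return 1-based item numbers of the n highest-predicted unrated items."""
    unrated = [i for i, r in enumerate(ratings) if r == 0]
    # Rank by descending prediction; ties go to the lower item number (assumption).
    ranked = sorted(unrated, key=lambda i: (-predictions[i], i))
    return [i + 1 for i in ranked[:n]]

# Rating and prediction vectors from the last row of Table 14.
ratings = [0, 1, 0, 0, 4, 0, 0, 5, 0, 2]
predictions = [4, 1, 3, 4, 4, 2, 3, 5, 2, 2]

recommended = top_n(ratings, predictions)
print(recommended)
```

For this row the unrated items with the highest predicted rating are I1 and I4, matching the recommendation "Item 1, 4".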
M.K. Kavitha Devi received the Ph.D. degree in information and communication engineering from Anna University Chennai in 2011. She is an assistant professor in the Department of Information Technology at the Thiagarajar College of Engineering, Madurai, India. Dr. M.K. Kavitha Devi's research focuses on soft computing techniques, recommender systems, and data mining. She has published more than 10 refereed journal and conference papers in these areas.
P. Venkatesh received his Degree in Electrical and Electronics Engineering, his Masters degree in Power System Engineering with Distinction and Ph.D. in 1991, 1994 and 2003, respectively, from Madurai Kamaraj University, India. His area of interest is the application of evolutionary computation techniques to power system problems and power system restructuring. He has received the Boyscast Fellowship Award in 2006 from the Department of Science and Technology, India, for carrying out postdoctoral research work at the Pennsylvania State University, USA. Currently, he is an Associate Professor at the Department of Electrical and Electronics Engineering, Thiagarajar College of Engineering, Madurai, India.