Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 143 (2018) 387–394
www.elsevier.com/locate/procedia
8th International Conference on Advances in Computing and Communication (ICACC-2018)
Matrix factorization in Cross-domain Recommendations Framework by Shared Users Latent Factors
Ashish K. Sahu a,∗, Pragya Dwivedi a
a Motilal Nehru National Institute of Technology Allahabad, Allahabad-211004, India
Abstract

Matrix factorization is one of the most commonly used methods of collaborative filtering (CF) for generating personalized recommendations to users. A main limitation of CF is that it depends fully on observed ratings and may fail if these observed ratings are limited in amount, which is called the sparsity problem. To address this problem, cross-domain recommendations came into existence, in which a transfer learning mechanism is applied to mitigate the sparsity problem and increase the performance of the target domain using other related source domains.

In this paper, we propose a method for knowledge transfer from a source domain to a target domain through shared users' latent factors. Firstly, we apply the traditional matrix factorization (MF) method in the source domain to learn latent factors of users and items through the objective function of MF. After that, the learned latent factors of users are directly transferred to the target domain. We then modify the objective function of MF and learn users/items latent factors of the target domain. Finally, predictions on unobserved ratings in the target domain are made using the inner product of the respective user and item latent factors. Experimental results demonstrate that our proposed method performs substantially better than other methods, both without and with transfer learning, in terms of the MAE and RMSE metrics.

© 2018 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Selection and peer-review under responsibility of the scientific committee of the 8th International Conference on Advances in Computing and Communication (ICACC-2018).

Keywords: Collaborative filtering; Matrix factorization; Transfer learning; Cross-domain recommendations
∗ Corresponding author: Ashish K. Sahu. E-mail address: [email protected]
1877-0509 © 2018 The Authors. Published by Elsevier B.V. doi: 10.1016/j.procs.2018.10.410

1. Introduction

On many e-commerce websites, users face millions of items from which they want to purchase the relevant ones. Exploring such an enormous item space is time consuming and is not manageable for most users in many situations. To solve this problem, recommender systems have become a very useful tool for generating personalized recommendations to users in e-commerce applications. The main recommendation techniques [5, 1] are content-based, collaborative filtering and hybrid. Among them, collaborative filtering (CF) is one of the most promising techniques in recent years. It can be further categorized into two types: memory-based and model-based. Memory-based techniques are not suitable for large datasets because calculating similarities between users/items is too expensive; model-based techniques, on the other hand, are flexible enough to handle large datasets. Matrix factorization (MF) [11] is one of the model-based CF methods; it decomposes the user-item matrix into two sub-matrices and tries to explain the ratings with a small number of latent factors of users and items. Latent factors of items show the distribution of items over the latent factors, and latent factors of users show the distribution of users' tastes over those factors.

All CF methods depend fully on the density (available ratings) of the user-item rating matrix of the dataset. However, in real-world scenarios, the rating matrix is very sparse because users provide ratings on only a few items out of millions. To handle this problem, cross-domain recommendations [9, 2, 7] have been proposed, which exploit knowledge from other related domains to improve the recommendation quality of the target domain. Here, other domains means additional datasets, which are called source domains, and the dataset for which recommendations are generated is called the target domain. The main hypothesis in cross-domain recommendation is that some common properties should overlap between domains. Exploiting knowledge from one or more domains in another domain is called the transfer learning [15] mechanism, which is a new paradigm of machine learning, and the combination of recommender systems and transfer learning is called cross-domain recommendations.

Several authors [14, 3, 16, 18] have proposed methods to exploit knowledge from the source domain from their own perspectives. These methods can be categorized into four types [6] based on the overlap of users/items, i.e., users overlap, items overlap, both users and items overlap, and neither users nor items overlap.

Users overlap: A first paper was presented by Winoto and Tang [19] in 2008.
The authors analyzed the correlations between different domains based on the categories of items in order to provide cross-domain recommendations. Berkovsky et al. [2] presented similar work and found that importing user profile data from other domains yields more accurate predictions. Another type of work has been done by Pan and Yang [17], who treat like/dislike information as the source domain.

Items overlap, users and items overlap: In both categories, very little work has been done because it is too difficult to find shared items between the domains.

Neither users nor items overlap: Li et al. [13] proposed a method based on cluster-level rating patterns for knowledge transfer between domains. The authors extended their work by incorporating a probabilistic model [12]. Both methods rely on cluster-level rating patterns to establish a bridge between domains.

In this paper, we focus on the first category, in which users are shared between the domains. We propose a method to exploit knowledge from the source domain ($D^s$) to improve the accuracy of the target domain ($D^t$) using shared latent factors of users. The proposed method consists of three folds:
• Apply the MF method in the source domain to learn latent factors of users and items through the objective function of MF.
• Modify the objective function of MF to exploit the latent factors of users learned in the source domain, and learn users/items latent factors of the target domain.
• In the last fold, predict unobserved ratings in the target domain using the inner product of the respective user and item latent factors.
Nomenclature

$D^k$                                        Domain k
$R^k \in \mathbb{R}^{m^k \times n^k}$        User-item rating matrix of domain k
$I^k \in \{0, 1\}^{m^k \times n^k}$          Indicator user-item matrix of domain k
$U_i^k \in \mathbb{R}^{1 \times f}$          Vector of user i latent factors in domain k
$V_j^k \in \mathbb{R}^{1 \times f}$          Vector of item j latent factors in domain k
$m^k$                                        Number of users in domain k
$n^k$                                        Number of items in domain k
$\hat{r}_{i,j}$                              Prediction in the target domain on item j for user i
$f$                                          Dimension of the latent factor space
$E$                                          An objective function
2. Related work

Nowadays, recommender systems play a central role in many e-commerce companies. They provide recommendations to users based on historical ratings. There are various techniques [5] for generating recommendations, and collaborative filtering is one of the best known in recent years. It comes in two types: memory-based and model-based. The first type works entirely in memory, i.e., the whole dataset is loaded at one time and predictions are calculated using statistical tools. Memory-based collaborative filtering consists of three steps: 1) similarity computation, 2) neighborhood selection, and 3) prediction of unobserved ratings. Similarity can be calculated between users (CF-UU) or between items (CF-II). Here, we focus only on collaborative filtering with user-user similarity (CF-UU). Among model-based methods, matrix factorization is one of the best known; it factorizes the user-item rating matrix into two sub-matrices, one of user latent factors and the other of item latent factors. Both techniques depend fully on historical ratings, and if these ratings are limited in amount, they may fail to produce recommendations.

To solve this problem, cross-domain recommendations came into existence, applying a transfer learning mechanism from other related domains. They exploit knowledge from the source domain to improve the accuracy of the target domain. Berkovsky et al. [2] proposed a method for cross-domain recommendations that aggregates user rating vectors from different domains and applies traditional memory-based collaborative filtering. Similarly, Hu et al. [8] proposed a cross-domain version of matrix factorization, in which an augmented user-item rating matrix is constructed by horizontally concatenating all matrices, so that an MF model can be used to obtain the latent user factors and latent item factors.
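As a small illustration of the augmented-matrix idea, the sketch below horizontally concatenates two toy rating matrices that share the same users (rows), so that a single MF model could be fit on the result. The matrices, the helper name `hconcat` and the use of 0 for an unobserved rating are assumptions made for this example only.

```python
# Toy rating matrices over the same 3 users (rows); 0 marks an unobserved rating.
# Both matrices are illustrative assumptions, not data from the paper.
R_source = [[5, 0, 3],
            [0, 4, 0],
            [1, 0, 2]]
R_target = [[0, 4],
            [3, 0],
            [0, 0]]

def hconcat(a, b):
    """Append the item columns of b to a; rows (users) must be aligned."""
    if len(a) != len(b):
        raise ValueError("both matrices must have the same users (rows)")
    return [row_a + row_b for row_a, row_b in zip(a, b)]

R_aug = hconcat(R_source, R_target)
# Indicator matrix of observed entries in the augmented matrix.
I_aug = [[1 if r > 0 else 0 for r in row] for row in R_aug]
```

Each user keeps a single row, so one set of user latent factors is learned from the items of both domains at once.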
These methods use multi-task transfer learning, where both domains (source and target) are used simultaneously to increase the accuracy of the target domain. In adaptive transfer learning, by contrast, knowledge is first extracted from the source domain and then transferred to the target domain. The novelty of our proposed work is that we adopt adaptive transfer learning rather than multi-task learning. We extract knowledge from the source domain in terms of users' latent factors; these latent factors are then transferred directly into the target domain, and predictions on unobserved ratings are made using the inner product of the respective user and item latent factors.

3. Proposed method

In our proposed method, we use the learned latent factors of users $U^s$ as the knowledge extracted from the source domain $D^s$, and then transfer them directly to the target domain $D^t$. The intuition is that if users are shared between both domains, then the distribution of users' tastes over those factors may be the same in both domains, i.e., $U_i^s = U_i^t$. If they are not equal, an error $\xi_i$ must be present, and this error should be minimized using an optimization algorithm. The architecture of the proposed method is shown in Fig. 1, and its major steps are as follows:

Fold-1: Firstly, we apply traditional matrix factorization in the source domain to learn users/items latent factors. The objective function is shown in equation 1. The error $E_1(U^s, V^s)$ is minimized using the stochastic gradient descent optimization algorithm. The two parameters $(U^s, V^s)$ are optimized alternately through their partial derivatives using equation 2.

$$E_1(U^s, V^s) = \min_{U_*^s, V_*^s} \sum_{(i,j \in \mathbb{R}^{m^s \times n^s})} I_{i,j}^s \left(r_{i,j}^s - U_i^s {V_j^s}^T\right)^2 + \lambda\left(\|U_i^s\|^2 + \|V_j^s\|^2\right) \qquad (1)$$

$$\theta \leftarrow \theta - \alpha * \frac{\partial E}{\partial \theta} \qquad (2)$$
Fig. 1. Architecture of proposed method for cross-domain recommendations
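$$U_i^s \leftarrow U_i^s + \alpha\left(e_{i,j}^s V_j^s - \lambda U_i^s\right)$$

$$V_j^s \leftarrow V_j^s + \alpha\left(e_{i,j}^s U_i^s - \lambda V_j^s\right)$$

where $e_{i,j}^s = r_{i,j}^s - U_i^s {V_j^s}^T$ and the product operator in the update rules denotes the element-wise product.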
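The Fold-1 updates can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the Gaussian initialization, the number of epochs and the per-rating application of the regularizer are all choices made for illustration.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def mf_sgd(ratings, m, n, f=10, alpha=0.001, lam=0.001, epochs=100, seed=0):
    """Learn U (m x f) and V (n x f) from observed (i, j, r) triples by SGD."""
    rng = random.Random(seed)
    # Assumption: small random Gaussian initialization of both factor matrices.
    U = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(m)]
    V = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n)]
    ratings = list(ratings)
    for _ in range(epochs):
        rng.shuffle(ratings)
        for i, j, r in ratings:
            e = r - dot(U[i], V[j])          # e_ij = r_ij - U_i V_j^T
            for k in range(f):
                u, v = U[i][k], V[j][k]      # old values feed both updates
                U[i][k] = u + alpha * (e * v - lam * u)
                V[j][k] = v + alpha * (e * u - lam * v)
    return U, V

def squared_error(ratings, U, V):
    """Sum of squared prediction errors over the given ratings."""
    return sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)

# Toy rank-1 rating data for 3 users and 2 items (illustrative only).
toy = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0),
       (1, 1, 4.0), (2, 0, 3.0), (2, 1, 6.0)]
U_s, V_s = mf_sgd(toy, m=3, n=2, f=2, alpha=0.01, epochs=500)
```

Only observed ratings are visited, which plays the role of the indicator $I_{i,j}^s$ in equation 1.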
Fold-2: After that, the learned users' latent factors (the knowledge of the source domain) are directly transferred to the target domain. The knowledge is adopted by modifying the objective function of traditional matrix factorization as follows:

$$E_2(U^t, V^t) = \min_{U_*^t, V_*^t} \sum_{(i,j \in \mathbb{R}^{m^t \times n^t})} I_{i,j}^t \left(r_{i,j}^t - U_i^t {V_j^t}^T\right)^2 + \lambda\left(\|U_i^t\|^2 + \|V_j^t\|^2\right) + c * (\xi_i)^2 \qquad (3)$$

Similarly, the error $E_2$ is minimized using the stochastic gradient descent optimization algorithm.
Table 1. Statistics of the dataset after preprocessing

                        # users    # items    # ratings    density level
Source domain ($D^s$)   395        6,734      58,132       0.0219
Target domain ($D^t$)   395        26,415     100,061      0.0096
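$$U_i^t \leftarrow U_i^t + \alpha\left(e_{i,j}^t V_j^t - \lambda U_i^t + c\,\xi_i\right)$$

$$V_j^t \leftarrow V_j^t + \alpha\left(e_{i,j}^t U_i^t - \lambda V_j^t\right)$$

where $\xi_i = U_i^s - U_i^t$ and $e_{i,j}^t = r_{i,j}^t - U_i^t {V_j^t}^T$. Because $\partial (\xi_i)^2 / \partial U_i^t = -2\xi_i$, the descent step adds $c\,\xi_i$, which pulls $U_i^t$ toward $U_i^s$.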
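A minimal plain-Python sketch of the Fold-2 updates is given below. It applies the penalty as a descent step on $c * (\xi_i)^2$ with $\xi_i = U_i^s - U_i^t$, so each target user vector is pulled toward its source counterpart. The function name, the random initialization of $U^t$ and the toy values are assumptions for illustration; the paper does not specify them.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def transfer_mf_sgd(ratings_t, U_s, n_t, alpha=0.001, lam=0.005, c=0.01,
                    epochs=100, seed=0):
    """Learn target factors with a penalty c * (U_s[i] - U_t[i])^2 per user."""
    rng = random.Random(seed)
    f = len(U_s[0])
    # Assumption: target factors start from small random values.
    U_t = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in U_s]
    V_t = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n_t)]
    ratings_t = list(ratings_t)
    for _ in range(epochs):
        rng.shuffle(ratings_t)
        for i, j, r in ratings_t:
            e = r - dot(U_t[i], V_t[j])      # target-domain prediction error
            for k in range(f):
                u, v = U_t[i][k], V_t[j][k]
                xi = U_s[i][k] - u           # xi_i = U_i^s - U_i^t
                U_t[i][k] = u + alpha * (e * v - lam * u + c * xi)
                V_t[j][k] = v + alpha * (e * u - lam * v)
    return U_t, V_t

# Toy example: 2 shared users, 1 target item, f = 2 (illustrative values only).
U_source = [[1.0, -1.0], [0.5, 0.5]]
target_ratings = [(0, 0, 3.0), (1, 0, 4.0)]
U_target, V_target = transfer_mf_sgd(target_ratings, U_source, n_t=1,
                                     alpha=0.01, c=10.0, epochs=2000)
```

With a large $c$ the learned $U^t$ stays close to $U^s$; with $c = 0$ the routine reduces to ordinary MF on the target domain.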
Fold-3: After learning both parameters in the target domain, predictions are made using the inner product of the respective user and item latent factors as follows:

$$\hat{r}_{i,j} = U_i^t {V_j^t}^T \qquad (4)$$
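Equation 4 amounts to a single inner product per user-item pair. The sketch below uses made-up factor values with $f = 2$; the function name `predict` is hypothetical.

```python
def predict(U_t, V_t, i, j):
    """Predicted rating of user i on item j: inner product U_i^t (V_j^t)^T."""
    return sum(u * v for u, v in zip(U_t[i], V_t[j]))

# Toy target-domain factors with f = 2 (illustrative values only).
U_t = [[1.0, 0.5], [0.2, 1.5]]
V_t = [[2.0, 1.0], [0.5, 2.0]]
r_hat = predict(U_t, V_t, i=0, j=1)   # 1.0 * 0.5 + 0.5 * 2.0
```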
4. Experiment setup and results

In this section, we first describe the dataset preprocessing, then the experiment protocols and evaluation metrics used in the proposed work, and finally the comparison methods and a summary of the experimental results.

4.1. Dataset preprocessing

Our experiments are conducted on the Amazon product co-purchasing network metadata¹. In this dataset, we consider two groups (Book and DVD), one as the source domain and the other as the target domain. In this paper, we focus on users shared between the domains, so the following preprocessing steps are needed:
• We kept only items that have received at least 10 ratings and users who have rated at least 50 items in both groups (domains).
• We matched common users by userID across both domains, and 395 users are selected randomly. Table 1 shows the statistics of the filtered dataset.
• We consider the Book group as the source domain and the DVD group as the target domain. We observed that the density level of $D^s$ is higher than that of $D^t$.
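The preprocessing steps above could be sketched as follows. The helper names, the representation of ratings as (user, item, rating) triples and the single-pass order (items first, then users) are assumptions for illustration; the paper does not describe its filtering code. The `density` helper reproduces the density levels of Table 1 from the reported counts.

```python
from collections import Counter

def filter_ratings(ratings, min_item_ratings=10, min_user_ratings=50):
    """Keep (user, item, rating) triples of sufficiently rated items and active users."""
    item_counts = Counter(item for _, item, _ in ratings)
    kept = [t for t in ratings if item_counts[t[1]] >= min_item_ratings]
    user_counts = Counter(user for user, _, _ in kept)
    return [t for t in kept if user_counts[t[0]] >= min_user_ratings]

def shared_users(source, target):
    """User IDs that appear in both domains."""
    return {u for u, _, _ in source} & {u for u, _, _ in target}

def density(num_ratings, num_users, num_items):
    """Fraction of the user-item matrix that is observed."""
    return num_ratings / (num_users * num_items)

# Density levels recomputed from the counts reported in Table 1.
source_density = density(58132, 395, 6734)
target_density = density(100061, 395, 26415)
```

Rounded to four decimals, the two values agree with the density column of Table 1.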
4.2. Experiment protocols

We used 5-fold cross-validation to ensure unbiased results: in each fold, four parts of the target-domain ratings are used for training and the fifth part for testing. The average over all five test sets is reported as the overall evaluation of the method, together with a 95% confidence interval. Furthermore, to validate the effectiveness of cross-domain recommendations, we evaluate with three sizes of the training set (TR), using 50%, 75% and 100% of it: TR(50%), TR(75%) and TR(100%).

¹ https://snap.stanford.edu/data/amazon-meta.html
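The protocol can be sketched with the standard library as below. The function names are hypothetical, and two details are assumptions because the paper does not state them: the confidence interval uses the normal approximation (factor 1.96), and TR(50%) / TR(75%) are drawn by uniform random subsampling of the training part.

```python
import math
import random
import statistics

def k_fold_splits(ratings, k=5, seed=0):
    """Yield (train, test) pairs; each of the k parts is the test set once."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for t in range(k):
        train = [x for idx, fold in enumerate(folds) if idx != t for x in fold]
        yield train, folds[t]

def subsample(train, fraction, seed=0):
    """Random subset of the training part, e.g. fraction=0.5 for TR(50%)."""
    rng = random.Random(seed)
    return rng.sample(train, int(len(train) * fraction))

def mean_with_ci95(values):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    mean = statistics.mean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width

splits = list(k_fold_splits(range(100)))
```

The per-fold metric values would be passed to `mean_with_ci95` to obtain entries of the form mean ± half-width.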
4.3. Evaluation metrics

The proposed and other comparison methods are evaluated on the basis of mean absolute error (MAE) (5) and root mean square error (RMSE) (6) to determine rating prediction accuracy.

$$MAE = \frac{1}{|r_T|} \sum_{(i,j) \in r_T} \left|r_{i,j} - \hat{r}_{i,j}\right| \qquad (5)$$

$$RMSE = \sqrt{\frac{1}{|r_T|} \sum_{(i,j) \in r_T} \left(r_{i,j} - \hat{r}_{i,j}\right)^2} \qquad (6)$$
where $r_T$ denotes the set of predicted ratings in the target domain, so that $|r_T|$ is their total number, $r_{i,j}$ denotes the actual rating and $\hat{r}_{i,j}$ the predicted rating on item j for user i.

4.4. Comparison methods and parameter settings

We compared our proposed work with three methods: average filling, collaborative filtering with user-user similarity, and matrix factorization. For all three comparison methods, we adopted two scenarios, without transfer and with transfer. In the first, we use only the target domain to estimate rating predictions; in the with-transfer scenario, we augment the user-item rating matrix of the target domain with that of the source domain to estimate rating predictions. The second scenario is called multi-task transfer learning, where both domains are used simultaneously. Our proposed method is based on adaptive transfer learning [15], where knowledge is adopted from the source domain and directly transferred into the target domain.

Average filling (AF): This method takes very little time to predict the ratings. We simply fill in the average of the observed ratings that the user has given to items in the target domain. The prediction is as follows:

$$\hat{r}_{i,*} = \frac{\sum_{j=1}^{n^t} I_{i,j}\, r_{i,j}}{\sum_{j=1}^{n^t} I_{i,j}} \qquad (7)$$
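As an illustrative sketch, the AF baseline of equation 7 and the two metrics of equations 5 and 6 can be written in a few lines of plain Python. The function names, the (user, item, rating) triple format and the toy data are assumptions for this example.

```python
import math

def average_filling(train, user):
    """Eq. (7): mean of the observed ratings given by the user.

    Assumes the user has at least one rating in the training set.
    """
    rated = [r for u, _, r in train if u == user]
    return sum(rated) / len(rated)

def mae(pairs):
    """Eq. (5): mean absolute error over (actual, predicted) pairs."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

def rmse(pairs):
    """Eq. (6): root mean square error over (actual, predicted) pairs."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))

# Toy (user, item, rating) triples, illustrative only.
train = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0)]
test = [(0, 2, 5.0), (1, 1, 3.0)]
pairs = [(r, average_filling(train, u)) for u, _, r in test]
af_mae, af_rmse = mae(pairs), rmse(pairs)
```

Here user 0 is predicted at 3.0 and user 1 at 5.0, so both test errors have magnitude 2 and MAE and RMSE coincide.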
Collaborative filtering with user-user similarity (CF-UU): CF-UU [4] is a memory-based method that provides recommendations to users from their top-K most similar users (K nearest neighbors). We used the constrained Pearson correlation similarity and K = 50.

Matrix factorization (MF): MF [10] provides a lower-rank approximation of the user-item matrix. Predictions use two sets of latent factors: user-side and item-side. After both are learned, a prediction is made through their inner product. We fixed the trade-off parameter λ = 0.001, the latent factor size f = 10 and the learning rate α = 0.001.

Proposed method: The parameter values are λ = 0.005, α = 0.001 and c = 0.01. The latent factor size is fixed at f = 10.

4.5. Summary of the experimental results

We conducted three sets of experiments based on the percentage of the training set used. In each, we compared our proposed method with the three comparison methods under two scenarios (with and without transfer). MAE and RMSE are both used as evaluation metrics. The results are shown in Tables 2 and 3. The following observations can be made from the results:
• The AF method performs poorly compared to the other methods, because it blindly fills in ratings based on the average of the ratings given by each user. Interestingly, it gains little accuracy from the with-transfer scenario.
Table 2. Comparison results for cross-domain recommendations in terms of MAE

                                   % Training set
Methods                            TR(50%)            TR(75%)            TR(100%)
AF (without transfer)              0.8860 ± 0.0038    0.8848 ± 0.0021    0.8840 ± 0.0031
AF (with transfer)                 0.8800 ± 0.0027    0.8789 ± 0.0030    0.8786 ± 0.0027
CF-UU (without transfer)           0.8715 ± 0.0014    0.8688 ± 0.0028    0.8619 ± 0.0032
CF-UU (with transfer)              0.8702 ± 0.0028    0.8684 ± 0.0043    0.8602 ± 0.0026
MF (without transfer)              0.8872 ± 0.0039    0.8632 ± 0.0065    0.8550 ± 0.0062
MF (with transfer)                 0.8511 ± 0.0038    0.8232 ± 0.0021    0.7978 ± 0.0031
Proposed method (with transfer)    0.8312 ± 0.0042    0.8028 ± 0.0061    0.7641 ± 0.0015

Table 3. Comparison results for cross-domain recommendations in terms of RMSE

                                   % Training set
Methods                            TR(50%)            TR(75%)            TR(100%)
AF (without transfer)              1.2000 ± 0.0063    1.1751 ± 0.0031    1.1599 ± 0.0047
AF (with transfer)                 1.1791 ± 0.0028    1.1575 ± 0.0039    1.1556 ± 0.0031
CF-UU (without transfer)           1.1642 ± 0.0024    1.1543 ± 0.0039    1.1538 ± 0.0014
CF-UU (with transfer)              1.1604 ± 0.0042    1.1537 ± 0.0052    1.1485 ± 0.0062
MF (without transfer)              1.1603 ± 0.0041    1.1492 ± 0.0065    1.1456 ± 0.0064
MF (with transfer)                 1.1501 ± 0.0021    1.4442 ± 0.0031    1.1241 ± 0.0062
Proposed method (with transfer)    1.1384 ± 0.0065    1.1071 ± 0.0022    1.0902 ± 0.0020
• CF-UU gives better results than the AF method, but calculating the similarity between users is expensive.
• MF is one of the most widely used methods for recommendations. In most settings it gives better results than both AF and CF-UU, under both scenarios.
• The proposed method has the lowest MAE and RMSE among all with-transfer and without-transfer methods. Compared with the other transfer learning methods, it is more accurate on both metrics at every training-set size.
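To put the last observation in numbers, the short sketch below computes the relative error reduction of the proposed method over MF (with transfer) at TR(100%), using the mean values from Tables 2 and 3. The helper name `relative_reduction` is hypothetical, and the confidence intervals are ignored in this simple calculation.

```python
def relative_reduction(baseline, proposed):
    """Fraction by which the proposed error is lower than the baseline error."""
    return (baseline - proposed) / baseline

# TR(100%) mean values copied from Tables 2 and 3.
mae_gain = relative_reduction(0.7978, 0.7641)    # MF (with transfer) vs proposed
rmse_gain = relative_reduction(1.1241, 1.0902)

print(f"MAE reduction:  {mae_gain:.1%}")
print(f"RMSE reduction: {rmse_gain:.1%}")
```

This gives a reduction of about 4.2% in MAE and about 3.0% in RMSE relative to the strongest transfer baseline.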
5. Conclusion and Future Direction

In this paper, we have proposed a novel method for knowledge transfer from a source domain to a target domain where users are shared. We use shared latent factors of users to transfer knowledge between domains. Firstly, we learn latent factors of users/items in the source domain through the standard objective function of matrix factorization. After that, we learn users/items latent factors of the target domain through a modified objective function of matrix factorization. Finally, we predict ratings in the target domain using the inner product of the respective user and item latent factors. We conducted three experiments, using different percentages of the training set, to validate the proposed method. The experimental results show that our proposed method achieves lower MAE and RMSE than the non-transfer methods and the transfer methods compared.
In future work, we will extend our method to partially shared users to examine how effective the transfer learning framework remains for cross-domain recommendations.

References

[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.
[2] S. Berkovsky, T. Kuflik, and F. Ricci. Cross-Domain Mediation in Collaborative Filtering. User Modeling, 4511:355–359, 2007.
[3] S. Berkovsky and F. Ricci. Distributed Collaborative Filtering with Domain Specialization. pages 33–40, 2007.
[4] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez. Recommender systems survey. Knowledge-Based Systems, 46:109–132, 2013.
[5] L. Candillier, F. Meyer, and M. Boullé. Comparing state-of-the-art collaborative filtering systems. Lecture Notes in Computer Science, 4571:548, 2007.
[6] P. Cremonesi, A. Tripodi, and R. Turrin. Cross-domain recommender systems. ICDMW 2011: IEEE 11th International Conference on Data Mining Workshops, pages 496–503, 2011.
[7] I. Fernández-Tobías, I. Cantador, M. Kaminskas, and F. Ricci. Cross-domain recommender systems: A survey of the state of the art. In Spanish Conference on Information Retrieval, 2012.
[8] L. Hu, J. Cao, G. Xu, L. Cao, Z. Gu, and C. Zhu. Personalized recommendation via cross-domain triadic factorization. Proceedings of the 22nd International Conference on World Wide Web, WWW '13, pages 595–606, 2013.
[9] M. M. Khan, R. Ibrahim, and I. Ghani. Cross Domain Recommender Systems: A Systematic Literature Review. ACM Computing Surveys, 50(3):1–34, 2017.
[10] Y. Koren. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '08, pages 426–434, New York, NY, USA, 2008. ACM.
[11] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, Aug 2009.
[12] B. Li, Q. Yang, and X. Xue. Transfer learning for collaborative filtering via a rating-matrix generative model. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 617–624, New York, NY, USA, 2009. ACM.
[13] B. Li, Q. Yang, and X. Xue. Can movies and books collaborate?: Cross-domain collaborative filtering for sparsity reduction. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI '09, pages 2052–2057, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.
[14] H. Liu, Z. Hu, A. Mian, H. Tian, and X. Zhu. A new user similarity model to improve the accuracy of collaborative filtering. Knowledge-Based Systems, 56:156–166, 2014.
[15] S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, Oct 2010.
[16] W. Pan. A survey of transfer learning for collaborative recommendation with auxiliary data. Neurocomputing, 177:447–453, 2016.
[17] W. Pan and Q. Yang. Transfer learning in heterogeneous collaborative filtering domains. Artificial Intelligence, 197:39–55, 2013.
[18] A. K. Sahu, P. Dwivedi, and V. Kant. Tags and item features as a bridge for cross-domain recommender systems. Procedia Computer Science, 125:624–631, 2018. The 6th International Conference on Smart Computing and Communications.
[19] P. Winoto and T. Tang. If you like The Devil Wears Prada the book, will you also enjoy The Devil Wears Prada the movie? A study of cross-domain recommendations. New Generation Computing, 26(3):209–225, 2008.