Personalized Standard Deviations Improve the Baseline Estimation of Collaborative Filtering Recommendation
Abstract
Featured Application
1. Introduction
- (1) Observed that four kinds of users, each with personalized rating criteria, exist in recommender systems;
- (2) Proposed an improved and unified baseline estimation model based on each user's personalized rating distribution and the global distribution of system ratings; and
- (3) Proposed application instances of SDP on existing latent factor-based CF recommendation methods, which improve their predictive accuracy.
2. Related Work
3. Problem Statement
- (1) Normal user. The actual rating range of this kind of user usually covers the recommender system's full rating range. For example, if the system's rating range is [1,5], a normal user's rating range may also be [1,5].
- (2) Strict user. These users apply relatively strict criteria and give lower ratings than others. Their actual rating range rarely reaches the highest ratings.
- (3) Lenient user. These users habitually give higher ratings than others, and their actual rating range usually does not reach the lowest ratings.
- (4) Middle user. These users give neither very high nor very low ratings, and their actual rating range sits in the middle, e.g., [2,4] when the system's range is [1,5].
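The four user types can be illustrated with a minimal sketch. It assumes a user "covers" an end of the system range only when at least one rating reaches that end; this threshold, the function name `classify_user`, and the default range [1,5] are illustrative assumptions, not the criterion used in the paper.

```python
def classify_user(ratings, sys_min=1.0, sys_max=5.0):
    """Label a user by how their observed ratings cover the system range.

    Assumption: an end of the range counts as covered only if some
    rating reaches it exactly (hypothetical rule for illustration).
    """
    if not ratings:
        raise ValueError("user has no ratings")
    covers_low = min(ratings) <= sys_min
    covers_high = max(ratings) >= sys_max
    if covers_low and covers_high:
        return "normal"    # uses the full system range
    if covers_low:
        return "strict"    # never reaches the highest rating
    if covers_high:
        return "lenient"   # never reaches the lowest rating
    return "middle"        # stays away from both ends


print(classify_user([2, 3, 4]))
```

In practice a softer rule (for example, a quantile of the user's ratings) may be more robust to a single outlying rating.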
4. Proposed Model
4.1. Symbols Definition
4.2. Observation from Rating Distribution
4.3. Proposed Improved Baseline Estimation Model
5. Application Instances of Proposed SDP
5.1. Application Instance-1: SDPSVD++
Algorithm 1: Get Prediction in SDPSVD++ by Learning
Input: user-item rating matrix […], […], learning rate […] and […]
Output: Rating prediction […]
(1) Initialize […] with a very small positive value (e.g., 0.0001);
(2) Calculate global deviation […] under Equation (3);
(3) Calculate local standard deviation […] of user […] under Equation (5);
(4) if […] then
(5)   Begin learning with loss function according to Equation (14);
(6)   while […] do
(7)     […];
(8)     […];
(9)     […];
(10)    […];
(11)    […];
(12)    […];
(13)  end while
(14)  Calculate […] according to Equation (13) of SDPSVD++;
(15) else
(16)  Begin learning with loss function according to the algorithm in SVD++;
(17)  Calculate […] according to Equation (12) of SVD++;
(18) end if
(19) return […]
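Steps (2) and (3) of Algorithm 1 compute one global deviation over all ratings and one local standard deviation per user. The sketch below shows one way to obtain both, assuming the population form of the standard deviation; whether Equations (3) and (5) use the population or the sample form is not visible here, and the function name `rating_deviations` is hypothetical.

```python
from collections import defaultdict
from statistics import pstdev


def rating_deviations(ratings):
    """Return (global_std, {user: local_std}) for (user, item, rating) triples.

    Assumption: population standard deviation (statistics.pstdev).
    """
    by_user = defaultdict(list)
    all_values = []
    for user, _item, value in ratings:
        by_user[user].append(value)
        all_values.append(value)
    global_std = pstdev(all_values)
    local_std = {user: pstdev(values) for user, values in by_user.items()}
    return global_std, local_std


sample = [("a", "x", 1), ("a", "y", 5), ("b", "x", 3), ("b", "y", 3)]
global_std, local_std = rating_deviations(sample)
print(global_std, local_std)
```

User `b` in the sample rates every item identically, so the local standard deviation is zero; any model that divides by it needs a guard for that case.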
5.2. Application Instance-2: SDPTrustSVD
Algorithm 2: Get Prediction in SDPTrustSVD by Learning
Input: user-item rating matrix […], user-user Trust matrix […], regularization parameters […] and […], learning rate […] and […]
Output: Rating prediction […]
(1) Initialize […] with very small positive value;
(2) Calculate global deviation […];
(3) Calculate local standard deviation […];
(4) if […] then
(5)   Begin learning with loss function according to Equation (17);
(6)   while […] do
(7)     […];
(8)     […];
(9)     […];
(10)    […];
(11)    […];
(12)    […];
(13)    […];
(14)  end while
(15)  Calculate […] according to Equation (16);
(16) else
(17)  Begin learning with loss function according to algorithm in TrustSVD;
(18)  Calculate […] according to Equation (15) of TrustSVD;
(19) end if
(20) return […]
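Both algorithms share the same learning skeleton: initialize the parameters with small positive values, then repeat gradient updates inside a `while` loop. The sketch below illustrates that skeleton on a plain biased matrix factorization model. It is not the SDPSVD++ or SDPTrustSVD update rule: the loss, the fixed epoch count used in place of a convergence test, and all names and default values are assumptions for illustration.

```python
import random


def train_biased_mf(ratings, d=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Stochastic gradient descent for r_hat = mu + b_u + b_i + p_u . q_i."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    # Latent factors start from very small positive values.
    p = {u: [rng.uniform(0.0, 1e-4) for _ in range(d)] for u in users}
    q = {i: [rng.uniform(0.0, 1e-4) for _ in range(d)] for i in items}

    def predict(u, i):
        dot = sum(a * b for a, b in zip(p[u], q[i]))
        return mu + b_u[u] + b_i[i] + dot

    for _ in range(epochs):  # fixed epoch count stands in for a convergence test
        for u, i, r in ratings:
            err = r - predict(u, i)
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for k in range(d):
                pu, qi = p[u][k], q[i][k]
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return predict


data = [("a", "x", 5), ("a", "y", 4), ("b", "x", 2), ("b", "y", 1)]
predict = train_biased_mf(data)
print(round(predict("a", "x"), 2))
```

SVD++ and TrustSVD add implicit-feedback and trust terms to this prediction rule, and each extra parameter group receives its own update line inside the loop.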
6. Experiments and Analysis
- Experiment (1): The first experiment evaluates the baseline estimation accuracy of the proposed SDP on the five datasets, compared with the classical baseline estimation in Equation (2) and the PBEModel [14].
- Experiment (2): The second experiment evaluates the accuracy gains of SDPSVD++ and tests how effectively SDP improves existing CF recommendations.
- Experiment (3): The third experiment evaluates the performance of the proposed SDPTrustSVD on the Flixster and FilmTrust datasets, which include trust data.
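As a point of reference for Experiment (1), the classical baseline estimate is commonly written as the overall mean plus a user bias and an item bias. The sketch below assumes that form and estimates the biases as plain averages of residuals, without regularization; the estimator in Equation (2) may differ, and the function names are hypothetical.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Estimate mu, user biases and item biases from (user, item, rating) triples."""
    mu = sum(r for _, _, r in ratings) / len(ratings)
    item_res = defaultdict(list)
    for _u, i, r in ratings:
        item_res[i].append(r - mu)
    b_i = {i: sum(v) / len(v) for i, v in item_res.items()}
    user_res = defaultdict(list)
    for u, i, r in ratings:
        user_res[u].append(r - mu - b_i[i])
    b_u = {u: sum(v) / len(v) for u, v in user_res.items()}
    return mu, b_u, b_i


def predict_baseline(mu, b_u, b_i, user, item, lo=1.0, hi=5.0):
    """Baseline prediction, clipped to the rating scale; unseen IDs get zero bias."""
    raw = mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)
    return min(hi, max(lo, raw))


data = [("a", "x", 5), ("a", "y", 4), ("b", "x", 2), ("b", "y", 1)]
mu, b_u, b_i = fit_baseline(data)
print(predict_baseline(mu, b_u, b_i, "a", "x"))
```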
6.1. Performance of Proposed SDP
6.2. Performance of Proposed SDPSVD++
6.3. Performance of Proposed SDPTrustSVD
7. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Jalili, M.; Ahmadian, S.; Izadi, M.; Moradi, P.; Salehi, M. Evaluating Collaborative Filtering Recommender Algorithms: A Survey. IEEE Access 2018, 6, 74003–74024.
2. Tan, Z.; He, L. An Efficient Similarity Measure for User-Based Collaborative Filtering Recommender Systems Inspired by the Physical Resonance Principle. IEEE Access 2017, 5, 27211–27228.
3. Koren, Y.; Bell, R.; Volinsky, C. Matrix factorization techniques for recommender systems. Computer 2009, 42, 30–37.
4. Takács, G.; Pilászy, I.; Németh, B.; Tikk, D. Major components of the gravity recommendation system. ACM SIGKDD Explor. Newsl. 2007, 9, 80–83.
5. Koren, Y. Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model. Available online: https://www.cs.rochester.edu/twiki/pub/Main/HarpSeminar/Factorization_Meets_the_Neighborhood-_a_Multifaceted_Collaborative_Filtering_Model.pdf (accessed on 10 May 2016).
6. Hernando, A.; Bobadilla, J.S.; Ortega, F.; Tejedor, J. Incorporating reliability measurements into the predictions of a recommender system. Inf. Sci. 2013, 218, 1–16.
7. Moradi, P.; Ahmadian, S. A reliability-based recommendation method to improve trust-aware recommender systems. Expert Syst. Appl. 2015, 42, 7386–7398.
8. Koren, Y. Collaborative Filtering with Temporal Dynamics. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.379.1951&rep=rep1&type=pdf (accessed on 10 May 2016).
9. Kumar, R.; Verma, B.K.; Rastogi, S.S. Social popularity based SVD++ recommender system. Int. J. Comput. Appl. 2014, 87, 33–37.
10. Bao, Y.; Fang, H.; Zhang, J. TopicMF: Simultaneously Exploiting Ratings and Reviews for Recommendation. Available online: https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/viewFile/8273/8391 (accessed on 15 September 2015).
11. Guo, G.; Zhang, J.; Yorke-Smith, N. TrustSVD: Collaborative Filtering with Both the Explicit and Implicit Influence of User Trust and of Item Ratings. Available online: https://guoguibing.github.io/papers/guo2015trustsvd.pdf (accessed on 20 June 2016).
12. Guo, G.; Zhang, J.; Yorke-Smith, N. A novel recommendation model regularized with user trust and item ratings. IEEE Trans. Knowl. Data Eng. 2016, 28, 1607–1620.
13. Pan, W.; Yang, Q.; Duan, Y.; Ming, Z. Transfer learning for semisupervised collaborative recommendation. ACM TiiS 2016, 6, 1–21.
14. Tan, Z.; He, L.; Li, H.; Wang, X. Rating Personalization Improves Accuracy: A Proportion-Based Baseline Estimate Model for Collaborative Recommendation. In Computing: Networking, Applications and Worksharing: 12th International Conference, CollaborateCom 2016, Beijing, China, November 10–11, 2016, Proceedings; Springer: Cham, Switzerland, 2017.
15. Wang, D.; Liang, Y.; Xu, D.; Feng, X.; Guan, R. A content-based recommender system for computer science publications. Knowl. Based Syst. 2018, 157, 1–9.
16. Zhang, Y.; Koren, J. Efficient Bayesian hierarchical user modeling for recommendation system. Available online: https://users.soe.ucsc.edu/~yiz/papers/c10-sigir07.pdf (accessed on 10 May 2015).
17. Horváth, T. A model of user preference learning for content-based recommender systems. Comput. Inform. 2012, 28, 453–481.
18. Sharma, R.; Gopalani, D.; Meena, Y. Collaborative filtering-based recommender system: Approaches and research challenges. In Proceedings of the 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017.
19. Shih, Y.Y.; Liu, D.R. Product recommendation Approaches: Collaborative filtering via customer lifetime Value and customer demands. Expert Syst. Appl. 2008, 35, 350–360.
20. Badaro, G.; Hajj, H.; El-Hajj, W.; Nachman, L. A hybrid approach with collaborative filtering for recommender systems. In Proceedings of the 2013 9th International Wireless Communications and Mobile Computing Conference (IWCMC), Sardinia, Italy, 1–5 July 2013.
21. Parhi, P.; Pal, A.; Aggarwal, M. A survey of methods of collaborative filtering techniques. In Proceedings of the 2017 International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 19–20 January 2017.
22. Cao, G.; Kuang, L. Identifying Core Users based on Trust Relationships and Interest Similarity in Recommender System. In Proceedings of the 2016 IEEE International Conference on Web Services (ICWS), San Francisco, CA, USA, 27 June–2 July 2016.
23. Bartolini, I.; Zhang, Z.; Papadias, D. Collaborative Filtering with Personalized Skylines. IEEE Trans. Knowl. Data Eng. 2011, 23, 190–203.
24. Li, W.; Xu, H.; Ji, M.; Xu, Z.; Fang, H. A hierarchy weighting similarity measure to improve user-based collaborative filtering algorithm. In Proceedings of the 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016.
25. Qian, X.; Feng, H.; Zhao, G.; Mei, T. Personalized Recommendation Combining User Interest and Social Circle. IEEE Trans. Knowl. Data Eng. 2014, 26, 1763–1777.
26. Deerwester, S.; Dumais, S.T.; Furnas, G.W.; Landauer, T.K.; Harshman, R. Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci. 1990, 41, 391–407.
27. Wu, D.; Luo, X.; Shang, M.; He, Y.; Wang, G.; Zhou, M. A Deep Latent Factor Model for High-Dimensional and Sparse Matrices in Recommender Systems. IEEE Trans. Syst. Man Cybern. 2019.
28. Li, S.; Kawale, J.; Fu, Y. Deep collaborative filtering via marginalized denoising auto-encoder. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015.
29. Zhang, S.; Yao, L.; Sun, A.; Tay, Y. Deep learning based recommender system: A survey and new perspectives. ACM Comput. Surv. 2019, 52, 5.
30. Sofiane, B.B.; Bermak, A. Gaussian process for nonstationary time series prediction. Comput. Stat. Data Anal. 2004, 47, 705–712.
31. Ko, J.; Fox, D. GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Auton. Robot. 2009, 26, 75–90.
32. Tan, Z.; Wu, D.; He, L.; Chang, Q.; Zhang, B. SDP: An Improved Baseline Estimation Model Based on Standard Deviation Proportion. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, 8–12 July 2019.
Symbols | Description |
---|---|
u, v | user ID |
i, j | item ID |
[…] | overall average rating |
[…] | average rating given from u |
[…] | average rating of i |
[…] | observed rating of (u, i) |
[…] | predicted rating of (u, i) |
[…] | bias of u |
[…] | bias of i |
[…] | Local standard deviation of u |
[…] | Global standard deviation |
[…] | The number of all ratings in system |
[…] | The number of ratings given by user |
[…] | factor-item matrix |
[…] | latent feature vector of user u |
[…] | latent feature vector of item i |
User Type | Rating Range | #Related Ratings | Density | Statistical Parameters | Fitted Parameters |
---|---|---|---|---|---|
Normal | [0,1) | 141,610 | 0.0280 | 3.4506, 1.1529 | 3.452, 1.752 |
Normal | [1,2) | 322,168 | 0.0638 | | |
Normal | [2,3) | 663,944 | 0.1315 | | |
Normal | [3,4) | 1,719,123 | 0.3405 | | |
Normal | [4,5) | 1,366,972 | 0.2707 | | |
Normal | [5,6) | 835,220 | 0.1654 | | |
Strict | [0,1) | 12,135 | 0.1042 | 2.6286, 1.1321 | 2.788, 1.52 |
Strict | [1,2) | 15,200 | 0.1305 | | |
Strict | [2,3) | 24,156 | 0.2074 | | |
Strict | [3,4) | 47,324 | 0.4063 | | |
Strict | [4,5) | 17,649 | 0.1515 | | |
Strict | [5,6) | -- | 0.0000 | | |
Lenient | [0,1) | -- | 0.0000 | 3.9853, 0.8665 | 3.991, 1.689 |
Lenient | [1,2) | 38,980 | 0.0142 | | |
Lenient | [2,3) | 156,224 | 0.0570 | | |
Lenient | [3,4) | 836,237 | 0.3054 | | |
Lenient | [4,5) | 950,718 | 0.3472 | | |
Lenient | [5,6) | 756,329 | 0.2762 | | |
Middle | [0,1) | -- | 0.0000 | 3.3091, 0.8353 | 3.251, 1.154 |
Middle | [1,2) | 17,501 | 0.0599 | | |
Middle | [2,3) | 41,515 | 0.1421 | | |
Middle | [3,4) | 138,739 | 0.4750 | | |
Middle | [4,5) | 94,333 | 0.3230 | | |
Middle | [5,6) | -- | 0.0000 | | |
Datasets | #Users | #Items | #Ratings | Density | Rating Scale | #Trust |
---|---|---|---|---|---|---|
Flixster | 147,612 | 48,794 | 8,196,077 | 0.11% | [0.5,5] | 11,794,648 |
FilmTrust | 1508 | 2071 | 35,497 | 1.14% | [0.5,4] | 1853 |
MiniFilm | 55 | 334 | 1000 | 5.44% | [0.5,4] | 0 |
ml-10m | 71,567 | 10,681 | 10,000,054 | 1.31% | [1,5] | 0 |
ml-latest-small | 700 | 10,000 | 100,000 | 1.43% | [1,5] | 0 |
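The density column in the dataset table above is consistent with the usual definition: the number of observed ratings divided by the number of possible user-item pairs. A short check, with the hypothetical helper name `rating_density`:

```python
def rating_density(n_users, n_items, n_ratings):
    """Fraction of the user-item matrix that holds an observed rating."""
    return n_ratings / (n_users * n_items)


# FilmTrust row: 1508 users, 2071 items, 35,497 ratings
print(f"{rating_density(1508, 2071, 35497):.2%}")
```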
Dataset | Metrics | BE | PBE | Proposed SDP | Improved (vs. BE) |
---|---|---|---|---|---|
Flixster | RMSE | 0.98813 | 0.95974 | 0.95890 * | 2.96% |
Flixster | RE | 0.25145 | 0.24423 | 0.24402 * | 2.95% |
Flixster | MAE | 0.73589 | 0.70248 | 0.69719 * | 5.26% |
FilmTrust | RMSE | 0.84009 | 0.82623 * | 0.83027 ** | 1.17% |
FilmTrust | RE | 0.26753 | 0.26312 * | 0.26441 ** | 1.17% |
FilmTrust | MAE | 0.63667 | 0.62725 * | 0.62806 ** | 1.35% |
MiniFilm | RMSE | 1.01431 | 1.00715 | 0.99605 * | 1.80% |
MiniFilm | RE | 0.32314 | 0.32086 | 0.31733 * | 1.80% |
MiniFilm | MAE | 0.78123 | 0.77251 | 0.76472 * | 2.11% |
ml-10m | RMSE | 0.89921 | 0.89831 | 0.89162 * | 0.84% |
ml-10m | RE | 0.24329 | 0.24186 | 0.24039 * | 1.19% |
ml-10m | MAE | 0.68465 | 0.68451 | 0.67913 * | 0.81% |
ml-latest-small | RMSE | 0.91483 | 0.89657 | 0.89607 * | 2.05% |
ml-latest-small | RE | 0.24923 | 0.24426 | 0.24412 * | 2.05% |
ml-latest-small | MAE | 0.70022 | 0.68840 | 0.68515 * | 2.15% |
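Two of the metrics reported above, RMSE and MAE, have standard definitions, sketched below for paired lists of observed and predicted ratings. The RE metric is left out because its definition is not given in this part of the text.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over paired ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


observed = [4, 3, 5]
estimated = [3, 3, 3]
print(rmse(observed, estimated), mae(observed, estimated))
```

RMSE penalizes large errors more heavily than MAE, which is one reason the two can rank methods differently on the same dataset.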
Dataset | Metrics | SVD++ | PBESVD++ | SDPSVD++ | Improved (vs. SVD++) |
---|---|---|---|---|---|
Flixster (d = 5) | RMSE | 0.98347 | 0.93271 | 0.91550 * | 6.91% |
Flixster (d = 5) | RE | 0.24994 | 0.23587 | 0.23283 * | 6.85% |
Flixster (d = 5) | MAE | 0.72168 | 0.67795 | 0.66620 * | 7.69% |
Flixster (d = 10) | RMSE | 0.93490 | 0.93256 | 0.91498 * | 2.13% |
Flixster (d = 10) | RE | 0.23759 | 0.23583 | 0.23269 * | 2.06% |
Flixster (d = 10) | MAE | 0.68984 | 0.67808 | 0.66591 * | 3.47% |
FilmTrust (d = 5) | RMSE | 0.81488 | 0.80823 | 0.79726 * | 2.16% |
FilmTrust (d = 5) | RE | 0.25950 | 0.25739 | 0.25389 * | 2.16% |
FilmTrust (d = 5) | MAE | 0.63339 | 0.62120 | 0.61539 * | 2.84% |
FilmTrust (d = 10) | RMSE | 0.81474 | 0.80754 | 0.79735 * | 2.13% |
FilmTrust (d = 10) | RE | 0.25946 | 0.25717 | 0.25392 * | 2.14% |
FilmTrust (d = 10) | MAE | 0.63330 | 0.61536 * | 0.61550 | 2.81% |
MiniFilm (d = 5) | RMSE | 0.98790 | 0.98696 | 0.97232 * | 1.58% |
MiniFilm (d = 5) | RE | 0.31090 | 0.31060 | 0.30598 * | 1.58% |
MiniFilm (d = 5) | MAE | 0.76582 | 0.75788 | 0.75614 * | 1.26% |
MiniFilm (d = 10) | RMSE | 0.98845 | 0.98749 | 0.97200 * | 1.66% |
MiniFilm (d = 10) | RE | 0.31109 | 0.31078 | 0.30588 * | 1.67% |
MiniFilm (d = 10) | MAE | 0.76924 | 0.75806 | 0.75568 * | 1.76% |
ml-latest-small (d = 5) | RMSE | 0.87226 | 0.87089 | 0.86618 * | 0.70% |
ml-latest-small (d = 5) | RE | 0.23764 | 0.23727 | 0.23599 * | 0.69% |
ml-latest-small (d = 5) | MAE | 0.67298 | 0.67285 | 0.66847 * | 0.67% |
ml-latest-small (d = 10) | RMSE | 0.87353 | 0.87223 | 0.86813 * | 0.62% |
ml-latest-small (d = 10) | RE | 0.23810 | 0.23775 | 0.23663 * | 0.62% |
ml-latest-small (d = 10) | MAE | 0.67307 | 0.67284 | 0.66871 * | 0.65% |
Dataset | Metrics | TrustSVD | SDPTrustSVD | Improved (vs. TrustSVD) |
---|---|---|---|---|
Flixster (d = 5) | RMSE | 0.92323 | 0.90564 * | 1.91% |
Flixster (d = 5) | RE | 0.23451 | 0.23032 * | 1.79% |
Flixster (d = 5) | MAE | 0.68654 | 0.66460 * | 3.20% |
Flixster (d = 10) | RMSE | 0.92337 | 0.91548 * | 0.85% |
Flixster (d = 10) | RE | 0.23455 | 0.23254 * | 0.86% |
Flixster (d = 10) | MAE | 0.68669 | 0.67480 * | 1.73% |
FilmTrust (d = 5) | RMSE | 0.81201 | 0.79262 * | 2.39% |
FilmTrust (d = 5) | RE | 0.25856 | 0.25241 * | 2.38% |
FilmTrust (d = 5) | MAE | 0.63938 | 0.61474 * | 3.85% |
FilmTrust (d = 10) | RMSE | 0.81924 | 0.79148 * | 3.39% |
FilmTrust (d = 10) | RE | 0.26086 | 0.25205 * | 3.38% |
FilmTrust (d = 10) | MAE | 0.64495 | 0.61352 * | 4.87% |
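The "Improved" columns in the result tables match the relative error reduction with respect to the reference method, that is, (reference - proposed) / reference. The sketch below reproduces two table entries under that reading; the function name is hypothetical.

```python
def relative_improvement(reference_error, proposed_error):
    """Percentage by which the proposed method lowers the reference error."""
    return 100.0 * (reference_error - proposed_error) / reference_error


# Flixster (d = 5), RMSE: TrustSVD 0.92323 vs. SDPTrustSVD 0.90564
print(f"{relative_improvement(0.92323, 0.90564):.2f}%")
```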
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Tan, Z.; He, L.; Wu, D.; Chang, Q.; Zhang, B. Personalized Standard Deviations Improve the Baseline Estimation of Collaborative Filtering Recommendation. Appl. Sci. 2020, 10, 4756. https://doi.org/10.3390/app10144756