Research and Application of Edge Computing and Deep Learning in a Recommender System
Abstract
1. Introduction
2. Research Status of the Recommender System
3. Recommendation Models Based on Knowledge Distillation and Edge Computing
- In theory, the search space of a complex model is larger than that of a small network. If a network with a smaller search space converges to a solution that matches or approximates that of the complex model, the solution spaces of the two networks overlap. The essence of knowledge distillation can be described as follows: within a teacher–student framework, a structurally complex teacher model with strong learning ability is first trained on the labeled data; the knowledge it captures is then transferred to a student model with a simpler structure and weaker learning ability, enhancing the student's generalization ability. In this way, a compact model with far fewer parameters can achieve accuracy comparable to that of the complex model, which facilitates model compression. Based on knowledge distillation, the knowledge gained by the complex teacher model during training is used to guide the training of the smaller student model, enabling the student to emulate the teacher and thereby achieving knowledge transfer and model compression.
- The trained student model is composed of multiple stacked neural network layers. Layers at different depths exhibit distinct properties and computational complexities, and they differ significantly in the computing resources they demand and in the sizes of the data they exchange. The student model generated in step (1) was therefore segmented into two sections: the section with the heavier computational burden performed its calculations on an edge server, while the section with lighter computations was processed on the existing terminal device. The terminal device thus collaborates with the edge server, effectively reducing the inference latency of the corresponding deep model (a minimal device–edge split is sketched below).
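The following minimal sketch illustrates this device–edge co-inference idea. PyTorch is assumed, and the layer sizes, the split point, and the send_to_edge() transport helper are illustrative assumptions rather than the authors' implementation; in practice the partition point would be chosen from per-layer latency and data-size profiles.

```python
import torch
import torch.nn as nn

# Hypothetical student model: a small stack of fully connected layers.
student = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),   # lighter layers, run on the terminal device
    nn.Linear(256, 512), nn.ReLU(),  # heavier layers, run on the edge server
    nn.Linear(512, 1), nn.Sigmoid(),
)

SPLIT = 2  # number of leading modules executed on the terminal device (assumed)

device_part = student[:SPLIT]  # slice of the Sequential kept on the device
edge_part = student[SPLIT:]    # remainder deployed on the edge server

def send_to_edge(tensor: torch.Tensor) -> torch.Tensor:
    """Placeholder for the device-to-edge transfer (e.g., serialized over HTTP/gRPC)."""
    return tensor  # in a real deployment this call crosses the network

@torch.no_grad()
def co_inference(features: torch.Tensor) -> torch.Tensor:
    intermediate = device_part(features)       # light computation on the device
    intermediate = send_to_edge(intermediate)  # ship the intermediate activation
    return edge_part(intermediate)             # heavy computation on the edge server

print(co_inference(torch.randn(1, 64)))
```

Only the intermediate activation crosses the network, so the device never has to run the heavy layers and the edge server never needs the raw input features.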
3.1. Knowledge Distillation
- (1)
- A number n of different teacher models were trained, and the samples in the training set were fed to them to produce the corresponding soft labels q. To obtain the soft labels q, the logits z output by a teacher model were divided by the temperature parameter T, and the result was finally passed through a softmax, i.e., q_i = exp(z_i / T) / Σ_j exp(z_j / T). Here, T is a temperature coefficient: the higher T is, the smoother the corresponding probability distribution becomes.
- (2)
- The same training data were then input into the student model, which was operated in the same way as the teacher model to obtain an output of logits. The subsequent calculations were divided into two steps: First, the student's logits were divided by the same temperature parameter T as used for the teacher model and passed through a softmax, yielding soft predictions that were compared with the soft labels of the teacher model. Second, a standard softmax (with T = 1) was applied to the student's logits to obtain the predicted values, which were compared with the true labels.
- (3)
- The overall loss function L was formed by combining a relative-entropy loss with a cross-entropy loss [23]. Here, the Kullback–Leibler divergence measures the asymmetric difference between two probability distributions A and B, in this case the teacher's soft labels and the student's soft predictions, while the cross entropy measures the difference between the predicted sample labels and the true sample labels. Combining these two loss terms better reveals both the distributional differences and the value differences between the predicted samples and the true labels (a minimal code sketch of this combined loss follows this list).
- (4)
- When different teacher models are selected, knowledge of different levels of importance is provided to the student model, and a teacher model of low quality may even mislead the student model's learning. Here, w_k denotes the importance weight of the knowledge contributed by the k-th teacher model, and n denotes the number of teacher models. In particular, the weights need to satisfy Σ_{k=1}^{n} w_k = 1. To realize self-adaptation in the knowledge distillation learning framework with multiple teacher models, w_k is defined as follows:
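Independently of how the weights w_k are ultimately defined, the following minimal sketch illustrates the distillation loss described in steps (1) to (4): temperature-softened teacher labels, a KL-divergence term against the student's soft predictions, a cross-entropy term against the true labels, and per-teacher importance weights that sum to one. PyTorch is assumed, and the temperature T, the balancing coefficient alpha, and the uniform teacher weights are illustrative choices rather than values taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, true_labels,
                      teacher_weights, T=4.0, alpha=0.5):
    # Step (2): soft predictions of the student, logits / T then softmax (log form for KL).
    student_soft = F.log_softmax(student_logits / T, dim=1)

    # Steps (1) and (4): weighted KL divergence against each teacher's soft labels
    # q = softmax(z / T); the T*T factor is the usual gradient-scale correction.
    kd_loss = 0.0
    for w_k, teacher_logits in zip(teacher_weights, teacher_logits_list):
        teacher_soft = F.softmax(teacher_logits / T, dim=1)
        kd_loss = kd_loss + w_k * F.kl_div(student_soft, teacher_soft,
                                           reduction="batchmean") * (T * T)

    # Steps (2) and (3): ordinary cross entropy between predictions and true labels.
    ce_loss = F.cross_entropy(student_logits, true_labels)

    # Step (3): combined loss; alpha balances the two terms (an illustrative choice).
    return alpha * kd_loss + (1.0 - alpha) * ce_loss

# Example with n = 2 hypothetical teachers and weights satisfying sum(w_k) = 1.
n_teachers, batch, classes = 2, 8, 10
weights = torch.full((n_teachers,), 1.0 / n_teachers)
teachers = [torch.randn(batch, classes) for _ in range(n_teachers)]
loss = distillation_loss(torch.randn(batch, classes), teachers,
                         torch.randint(0, classes, (batch,)), weights)
print(loss.item())
```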
3.2. Model Segmentation
4. Experimental Process and Experimental Result Analysis
4.1. Experimental Datasets
4.2. Experimental Process
- (1)
- Training Stage
- (2)
- Real-Time Computing Stage
4.3. Experimental Results and Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge computing: Vision and challenges. IEEE Internet Things J. 2016, 3, 637–646. [Google Scholar] [CrossRef]
- Shi, S.; Zhang, M.; Lu, H.; Liu, Y.; Ma, S. Wide & deep learning in job recommendation: An empirical study. In Asia Information Retrieval Symposium; Springer: Cham, Switzerland, 2017; pp. 112–124. [Google Scholar]
- Su, X.; Khoshgoftaar, T.M. A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 2009, 421425. [Google Scholar] [CrossRef]
- Mooney, R.J.; Roy, L. Content-based book recommending using learning for text categorization. In DL ‘00: Proceedings of the Fifth ACM Conference on Digital Libraries; ACM Digital Library: New York, NY, USA, 2000; pp. 195–204. [Google Scholar]
- Breese, J.S.; Heckerman, D.; Kadie, C. Empirical analysis of predictive algorithms for collaborative filtering. arXiv 2013, arXiv:1301.7363. [Google Scholar]
- Balabanović, M.; Shoham, Y. Fab: Content-based, collaborative recommendation. Commun. ACM 1997, 40, 66–72. [Google Scholar] [CrossRef]
- Verbert, K.; Manouselis, N.; Ochoa, X.; Wolpers, M.; Drachsler, H.; Bosnic, I.; Duval, E. Context-Aware Recommender Systems for Learning: A Survey and Future Challenges. IEEE Trans. Learn. Technol. 2012, 5, 318–335. [Google Scholar] [CrossRef]
- Elkahky, A.M.; Song, Y.; He, X. A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 278–288. [Google Scholar]
- Yadav, N.; Singh, A.K.; Pal, S. Improved self-attentive Musical Instrument Digital Interface content-based music recommendation system. Comput. Intell. 2022, 38, 1232–1257. [Google Scholar] [CrossRef]
- Liu, K.; Xue, F.; Guo, D.; Wu, L.; Li, S.; Hong, R. MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation. ACM Trans. Inf. Syst. 2023, 41, 1–27. [Google Scholar] [CrossRef]
- Hussain, J.S.I.; Ghazali, R.; Javid, I.; Hassim, Y.M.M.; Khan, M.H. A Hybrid Solution for The Cold Start Problem in Recommendation. Comput. J. 2023, 8, bxad088. [Google Scholar] [CrossRef]
- Zhu, J.; Shan, Y.; Mao, J.C.; Yu, D.; Rahmanian, H.; Zhang, Y. Deep Embedding Forest: Forest-based Serving with Deep Embedding Features. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1703–1711. [Google Scholar]
- Covington, P.; Adams, J.; Sargin, E. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, 15–19 September 2016; pp. 191–198. [Google Scholar]
- Gou, J.; Yu, B.; Maybank, S.J.; Tao, D. Knowledge distillation: A survey. Int. J. Comput. Vis. 2021, 129, 1789–1819. [Google Scholar] [CrossRef]
- Zhao, B.; Cui, Q.; Song, R.; Qiu, Y.; Liang, J. Decoupled knowledge distillation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11953–11962. [Google Scholar]
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531. [Google Scholar]
- Guo, H.; Tang, R.; Ye, Y.; Li, Z.; He, X. DeepFM: A factorization-machine based neural network for CTR prediction. arXiv 2017, arXiv:1703.04247. [Google Scholar]
- Shen, X.; Dai, Q.; Mao, S.; Chung, F.-L.; Choi, K.-S. Network together: Node classification via cross-network deep network embedding. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 1935–1948. [Google Scholar] [CrossRef] [PubMed]
- Lian, J.; Zhou, X.; Zhang, F.; Chen, Z.; Xie, X.; Sun, G. xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1754–1763. [Google Scholar]
- Yu, L.; Li, Y.; Weng, S.; Tian, H.; Liu, J. Adaptive multi-teacher softened relational knowledge distillation framework for payload mismatch in image steganalysis. J. Vis. Commun. Image Represent. 2023, 95, 103900. [Google Scholar] [CrossRef]
- Jeon, E.S.; Choi, H.; Shukla, A.; Turaga, P. Leveraging angular distributions for improved knowledge distillation. Neurocomputing 2023, 518, 466–481. [Google Scholar] [CrossRef]
- Ding, H.; Chen, K.; Huo, Q. Improving Knowledge Distillation of CTC-Trained Acoustic Models with Alignment-Consistent Ensemble and Target Delay. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 2561–2571. [Google Scholar] [CrossRef]
- Hershey, J.R.; Olsen, P.A. Approximating the Kullback Leibler Divergence Between Gaussian Mixture Models. In Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP ’07, Honolulu, HI, USA, 15–20 April 2007; Volume 4, pp. IV–317–IV–320. [Google Scholar]
- Li, E.; Zhou, Z.; Chen, X. Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy. In Proceedings of the 2018 Workshop on Mobile Edge Communications, Budapest, Hungary, 20 August 2018; pp. 31–36. [Google Scholar]
- Rendle, S. Factorization Machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, Sydney, NSW, Australia, 13–17 December 2010; pp. 995–1000. [Google Scholar]
- Sun, J.; Zhao, L.; Liu, Z.; Li, Q.; Deng, X.; Wang, Q.; Jiang, Y. Practical differentially private online advertising. Comput. Secur. 2022, 112, 102504. [Google Scholar] [CrossRef]
- Da, F.; Peng, C.; Wang, H.; Li, T. A risk detection framework of Chinese high-tech firms using wide & deep learning model based on text disclosure. Procedia Comput. Sci. 2022, 199, 262–268. [Google Scholar]
- Xu, J.; Hu, Z.; Zou, J. Personalized Product Recommendation Method for Analyzing User Behavior Using DeepFM. Korea Inf. Process. Soc. 2021, 17, 369–384. [Google Scholar]
- Wang, R.; Shivanna, R.; Cheng, D.; Jain, S.; Lin, D.; Hong, L.; Chi, E. Dcn v2: Improved deep & cross network and practical lessons for web-scale learning to rank systems. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 1785–1797. [Google Scholar]
- Chen, C.; Zhou, J.; Zheng, L.; Wu, H.; Lyu, L.; Wu, J.; Wu, B.; Liu, Z.; Wang, L.; Zheng, X. Vertically federated graph neural network for privacy-preserving node classification. arXiv 2020, arXiv:2005.11903. [Google Scholar]
Dataset | Instances | Fields | Positive Ratio |
---|---|---|---|
Ticket purchase data | 240 M | 38 | 86% |
User info | 25 M | 40 | 92% |
Train info | 3000 | 20 | 95% |
Real-time interaction | 125 M | 18 | 74% |
Model | NDCG@4 | NDCG@8 | NDCG@12 | NDCG@16 | NDCG@20 |
---|---|---|---|---|---|
FM | 0.0379 | 0.0571 | 0.0728 | 0.0531 | 0.0312 |
FFM | 0.0631 | 0.8391 | 0.1257 | 0.0452 | 0.0429 |
Wide&Deep | 0.0931 | 0.1231 | 0.1498 | 0.0943 | 0.0782 |
DeepFM | 0.1432 | 0.1674 | 0.1982 | 0.1321 | 0.0912 |
DCN | 0.1627 | 0.2027 | 0.2219 | 0.1337 | 0.0937 |
KD | 0.1523 | 0.1841 | 0.2087 | 0.1186 | 0.0821 |