HDGFI: Hierarchical Dual-Level Graph Feature Interaction Model for Personalized Recommendation
Abstract
1. Introduction
- We analyze several shortcomings of existing feature interaction models and use a graph structure to model the interaction process between features, which improves both the interpretability of feature interactions and the recommendation quality of the model.
- We prune the edges of the feature graph hierarchically, preserving the feature interactions that are most useful to the target node. The interaction process is modeled from both local and global perspectives to obtain a high interaction gain.
- We conduct experiments on three public datasets. The results show that the proposed model outperforms comparable algorithms in terms of AUC and Logloss.
2. Related Works
2.1. Feature Interaction Recommendation Model
2.2. Graph Neural Networks and Recommendation
2.3. Graph Structure Learning
3. Proposed Model
3.1. Problem Definition
3.2. Overview
- (1) Constructing Feature Graph Module. This module maps each high-dimensional, sparse raw feature to a low-dimensional dense vector representation. Each feature field is treated as a node in the graph, and an edge between two nodes represents an interaction between the corresponding features. A metric-based method computes the edge weights, and only the important edges are kept.
- (2) Dual-Level Node and Graph Representation Generation Module. This module models feature interaction and fusion at two levels through two components: a local-level attention component that uses the edge weights to update node representations, and a SENet component that captures important features at the global level.
- (3) Prediction Module. This module combines the node representations obtained after each interaction layer to compute the final click probability.
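As a concrete illustration of the metric-based edge selection in module (1), the sketch below builds a feature graph with NumPy. The cosine-similarity metric, the top-k neighbour rule, and all function names here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def build_feature_graph(emb: np.ndarray, k: int) -> np.ndarray:
    """emb: (num_fields, dim) node embeddings, one per feature field.
    Returns a sparse weighted adjacency keeping the k strongest edges
    per node (cosine similarity assumed as the metric)."""
    norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = norm @ norm.T                  # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)       # exclude self-loops
    adj = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        keep = np.argsort(sim[i])[-k:]   # k most similar neighbours
        adj[i, keep] = sim[i, keep]      # keep only important edges
    return adj

rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 8))            # 6 feature fields, embedding dim 8
adj = build_feature_graph(emb, k=2)      # each node keeps k = 2 edges
```

Selecting a fixed number of neighbours per node corresponds to the neighbourhood sampled size studied in Section 4.3.1.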
3.3. Constructing Feature Graph Module
3.3.1. Feature Graph Node Embedding
3.3.2. Hierarchical Edge Selection Layer
3.4. Dual-Level Node and Graph Representation Generation Module
3.4.1. Local-Level Attention Messaging and Aggregation
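A minimal sketch of the local-level step: each node attends only over the neighbours retained by the edge-selection layer and aggregates their embeddings. The dot-product attention score is an assumption; the paper's exact scoring function may differ.

```python
import numpy as np

def attention_aggregate(emb: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """emb: (n, d) node embeddings; adj: (n, n) adjacency (0 = no edge).
    Returns updated node embeddings via masked softmax attention."""
    scores = np.where(adj != 0, emb @ emb.T, -np.inf)  # mask non-edges
    scores -= scores.max(axis=1, keepdims=True)        # numerically stable
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=1, keepdims=True)          # neighbour weights
    return alpha @ emb                                 # weighted aggregation

rng = np.random.default_rng(1)
emb = rng.normal(size=(4, 3))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
out = attention_aggregate(emb, adj)
```

Because the masked softmax weights sum to one, each updated node is a convex combination of its retained neighbours' embeddings.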
3.4.2. Global-Level Squeeze and Excitation
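The global-level component follows the squeeze-and-excitation pattern of SENet. A hedged NumPy sketch, with illustrative weight shapes tied to the reduction ratio r studied in Section 4.3.2, might look like:

```python
import numpy as np

def senet_reweight(emb: np.ndarray, W1: np.ndarray, W2: np.ndarray) -> np.ndarray:
    """emb: (n, d) field embeddings.
    W1: (n // r, n) reduction weights; W2: (n, n // r) restore weights.
    The sigmoid gate is an assumption of this sketch."""
    z = emb.mean(axis=1)                 # squeeze: one scalar per field
    a = np.maximum(W1 @ z, 0.0)          # excitation: reduction + ReLU
    s = 1.0 / (1.0 + np.exp(-(W2 @ a)))  # per-field importance gates in (0, 1)
    return emb * s[:, None]              # re-weight each field embedding

rng = np.random.default_rng(2)
n, d, r = 8, 4, 2
emb = rng.normal(size=(n, d))
W1 = rng.normal(size=(n // r, n))
W2 = rng.normal(size=(n, n // r))
out = senet_reweight(emb, W1, W2)
```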
3.4.3. Dual-Level Node Embedding Fusion
3.4.4. Graph Representation Readout
3.5. Prediction and Optimization
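Assuming a sigmoid head over the fused graph representation (the function names and the linear head are assumptions of this sketch, not the paper's exact architecture), prediction and the Logloss objective can be written as:

```python
import numpy as np

def predict_ctr(graph_repr: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """Click probability from the fused graph representation via a
    linear layer followed by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(graph_repr @ w + b)))

def logloss(y: np.ndarray, p: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy: the training objective and the Logloss
    metric reported in Section 4."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

p = predict_ctr(np.zeros((2, 3)), np.zeros(3), 0.0)  # zero weights -> 0.5 each
```

With uninformative (all-zero) weights every prediction is 0.5 and the Logloss equals ln 2 ≈ 0.693, a useful sanity check for any CTR implementation.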
4. Experiments
4.1. Experiment Setup
4.1.1. Datasets
4.1.2. Evaluation Metrics
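AUC and Logloss are the standard CTR evaluation metrics used here. AUC can be computed with the rank formulation shown below (ties are ignored for brevity); Logloss is the binary cross-entropy between labels and predicted probabilities.

```python
import numpy as np

def auc(y_true: np.ndarray, y_score: np.ndarray) -> float:
    """Rank-based AUC: the probability that a randomly chosen positive
    instance is scored above a randomly chosen negative one."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))

y = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
print(auc(y, scores))  # 0.75: one of the four pos/neg pairs is mis-ranked
```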
4.1.3. Baselines
4.1.4. Hyper-Parameter Settings
4.2. Overall Performance
| Model | KKBox AUC | KKBox Logloss | Frappe AUC | Frappe Logloss | MovieLens-1M AUC | MovieLens-1M Logloss |
|---|---|---|---|---|---|---|
| LR (A) | 0.76647 | 0.57593 | 0.93565 | 0.28721 | 0.86949 | 0.43775 |
| FM (B) | 0.78961 | 0.55487 | 0.96571 | 0.20912 | 0.89104 | 0.42229 |
| AFM (B) | 0.79868 | 0.54858 | 0.96534 | 0.21947 | 0.88224 | 0.42861 |
| FFM (B) | 0.79758 | 0.54323 | 0.96871 | 0.19901 | 0.89563 | 0.40881 |
| NFM (C) | 0.80979 | 0.53088 | 0.97283 | 0.20717 | 0.89975 | 0.40351 |
| DeepFM (C) | 0.81439 | 0.52556 | 0.97551 | 0.18532 | 0.90617 | 0.38856 |
| FiBiNet (C) | 0.81783 | 0.52207 | 0.97554 | 0.18061 | 0.90628 | 0.39021 |
| Fi-GNN (D) | 0.81831 | 0.52033 | 0.97541 | 0.18431 | 0.90668 | 0.38755 |
| GraphFM (D) | 0.82013 | 0.51872 | 0.97640 | 0.17824 | 0.90782 | 0.38378 |
| HDGFI (ours) | 0.82278 | 0.51555 | 0.97894 | 0.16495 | 0.91113 | 0.37871 |
| p-value | 2.82% | 2.82% | 0.9% | 0.9% | 0.9% | 1.62% |
4.3. Hyper-Parameter Study
4.3.1. Influence of Neighborhood Sampled Size
4.3.2. Influence of Reduction Ratio
4.4. Ablation Study
| Variant | KKBox AUC | KKBox Logloss | Frappe AUC | Frappe Logloss | MovieLens-1M AUC | MovieLens-1M Logloss |
|---|---|---|---|---|---|---|
| HDGFI_E | 0.82149 | 0.51716 | 0.97741 | 0.16437 | 0.90939 | 0.38051 |
| HDGFI_B | 0.82111 | 0.51738 | 0.97738 | 0.17088 | 0.90975 | 0.38025 |
| HDGFI | 0.82278 | 0.51555 | 0.97894 | 0.16495 | 0.91113 | 0.37871 |
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
| Dataset | Fields | Features | Instances |
|---|---|---|---|
| KKBox | 13 | 92,247 | 7,377,418 |
| Frappe | 10 | 5,382 | 288,609 |
| MovieLens-1M | 10 | 22,100 | 1,149,238 |
| Variant | KKBox AUC | KKBox Logloss | Frappe AUC | Frappe Logloss | MovieLens-1M AUC | MovieLens-1M Logloss |
|---|---|---|---|---|---|---|
| HDGFI_L | 0.80779 | 0.53411 | 0.97331 | 0.19246 | 0.90016 | 0.39976 |
| HDGFI_G | 0.81679 | 0.52293 | 0.97598 | 0.17051 | 0.90914 | 0.38133 |
| HDGFI | 0.82278 | 0.51555 | 0.97894 | 0.16495 | 0.91113 | 0.37871 |
Share and Cite
Ma, X.; Cui, Z. HDGFI: Hierarchical Dual-Level Graph Feature Interaction Model for Personalized Recommendation. Entropy 2022, 24, 1799. https://doi.org/10.3390/e24121799