Implicit-Causality-Exploration-Enabled Graph Neural Network for Stock Prediction
Abstract
1. Introduction
- We propose a unified GNN-based stock predictor with the incorporation of explicit and implicit relations with the aim of making a tradeoff between the interpretability of structural relations and the power of black-box deep learning models.
- We design an equivalent undirected graph of Granger causality as a feed of the proposed dynamic and static fusion GNN model, named DSF-GNN, which captures both static and dynamic characteristics of implicit and explicit relations by leveraging two proposed modules, namely the static-relation-based feature extractor and dynamic-relation-based feature extractor.
- Through experiments on more than three years of data from the Chinese stock market, our proposed model outperforms its counterparts, achieving an accuracy improvement of 2.63% to 6.76%.
2. Related Work
2.1. Euclidean-Input-Based Methods
2.2. Graph-Based Methods
3. Preliminaries
3.1. Granger Causality
3.2. Message Passing Based GNN
4. Methodology
4.1. Granger Causal Graph Modeling
- Most classical approaches to Granger causality detection assume linear dependencies and therefore cannot be reliably applied to time series with nonlinear features, resulting in an inconsistent estimation of Granger causal interactions.
- Granger causality yields directed relations between pairs of variables, whereas work on GNN models has so far focused primarily on undirected graphs [46]. This raises a further challenge: our stock predictor requires a GNN model that can handle a hybrid graph structure containing both directed and undirected edges. A minimal sketch of constructing such a directed Granger causal graph is given below.
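- The following is a minimal sketch (not the authors' implementation) of building a directed Granger causal graph over stocks with pairwise tests. The classical linear F-test from statsmodels is used purely for illustration, whereas the paper argues for detection that copes with nonlinear features; the DataFrame `returns`, the maximum lag, and the 5% significance level are all assumptions.

```python
import networkx as nx
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests


def granger_causal_graph(returns: pd.DataFrame, max_lag: int = 5,
                         alpha: float = 0.05) -> nx.DiGraph:
    """Directed graph with an edge u -> v if u Granger-causes v."""
    graph = nx.DiGraph()
    graph.add_nodes_from(returns.columns)
    for cause in returns.columns:
        for effect in returns.columns:
            if cause == effect:
                continue
            # grangercausalitytests checks whether the 2nd column helps
            # predict the 1st; keep the smallest p-value across the lags.
            tests = grangercausalitytests(returns[[effect, cause]],
                                          maxlag=max_lag, verbose=False)
            p_min = min(res[0]["ssr_ftest"][1] for res in tests.values())
            if p_min < alpha:
                graph.add_edge(cause, effect)
    return graph
```

- Each edge u → v then encodes the directed relation "u Granger-causes v" that motivates the moralization step in the next subsection.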
4.2. Moralization
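- The sketch below illustrates moralization under the assumption that `granger_dag` is the directed graph built in Section 4.1: the parents of every node are "married" (connected) and edge directions are dropped, producing the equivalent undirected graph fed to the GNN. This is a generic routine rather than the authors' code; networkx also ships a built-in `moral_graph` helper for directed graphs.

```python
import itertools

import networkx as nx


def moralize(granger_dag: nx.DiGraph) -> nx.Graph:
    """Equivalent undirected (moral) graph of a directed causal graph."""
    moral = granger_dag.to_undirected()
    for node in granger_dag.nodes:
        parents = list(granger_dag.predecessors(node))
        # Marry every pair of parents that share this common child.
        for u, v in itertools.combinations(parents, 2):
            moral.add_edge(u, v)
    return moral
```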
4.3. Model Architecture
- Temporal feature representation module (TFR): This module is devised to capture the dependence of future fluctuations on historical stock prices. We employ a classic time series analysis model, i.e., a GRU, to extract the historical dynamic features of stocks, which serve as input for subsequent modules.
- Relationship feature encoding module (RFE): This module is designed to capture relational dependencies among related stocks and contains two sub-modules: a dynamic-relation-based feature extractor (DRE) and a static-relation-based feature extractor (SRE). DRE aims to capture dynamic relational dependencies, whereas SRE focuses on capturing static relational dependencies among stocks.
- Relationship feature fusion module (RFF): To capture additional potential relational dependencies, RFF is designed to integrate both dynamic and static embeddings, generating a novel high-level stock representation.
- Prediction layer: Finally, after feature fusion, the fused embeddings are fed into the prediction layer, a fully connected network that generates the predicted movement trend for each stock. A minimal end-to-end sketch of this four-stage pipeline is given after this list.
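- The following PyTorch code is a minimal sketch of the four-stage pipeline, intended only to make the data flow concrete: the module names mirror the description above, but the hidden size, the dot-product attention used for DRE, the single propagation step used for SRE, and the concatenation-based fusion are illustrative assumptions rather than the authors' design choices. `adj_static` denotes the (row-normalized) moralized Granger graph.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DSFGNNSketch(nn.Module):
    def __init__(self, feat_dim: int, hid_dim: int = 64, num_classes: int = 2):
        super().__init__()
        self.tfr = nn.GRU(feat_dim, hid_dim, batch_first=True)  # TFR (GRU)
        self.sre_w = nn.Linear(hid_dim, hid_dim)                 # SRE weights
        self.dre_q = nn.Linear(hid_dim, hid_dim)                 # DRE query
        self.dre_k = nn.Linear(hid_dim, hid_dim)                 # DRE key
        self.fuse = nn.Linear(2 * hid_dim, hid_dim)              # RFF
        self.head = nn.Linear(hid_dim, num_classes)              # prediction

    def forward(self, prices: torch.Tensor, adj_static: torch.Tensor):
        # prices: [num_stocks, seq_len, feat_dim]; adj_static: [N, N]
        _, h = self.tfr(prices)          # temporal embedding of each stock
        h = h.squeeze(0)                 # [N, hid_dim]

        # SRE: one propagation step over the fixed moralized Granger graph.
        static_emb = F.relu(adj_static @ self.sre_w(h))

        # DRE: infer a dynamic relation graph from the current embeddings.
        attn = torch.softmax(self.dre_q(h) @ self.dre_k(h).T
                             / h.size(-1) ** 0.5, dim=-1)
        dynamic_emb = F.relu(attn @ h)

        # RFF: fuse dynamic and static views into a high-level representation.
        fused = F.relu(self.fuse(torch.cat([dynamic_emb, static_emb], dim=-1)))
        return self.head(fused)          # per-stock movement logits
```

- Concatenation followed by a nonlinear projection is only the simplest realization of RFF; the point of the sketch is the separation of temporal, static-relational, and dynamic-relational stages, not the exact fusion operator.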
4.3.1. Temporal Feature Representation
4.3.2. Relational Feature Encoding
4.4. Relational Feature Fusion
4.5. Prediction Layer
5. Experiments
5.1. Data
5.2. Baselines
5.2.1. Euclidean-Input-Based Methods
- Long short-term memory (LSTM) [47] is one of the most widely recognized recurrent neural network (RNN) models used for processing time series data.
- Gated recurrent unit (GRU) [48] is another model commonly used for processing time series data. In comparison to LSTM, GRU has a simpler structure, with only two gated units.
- The dual-stage attention-based recurrent neural network (ALSTM) [16] employs a recurrent neural network with a two-stage attention mechanism, consisting of an input attention layer and a temporal attention layer. A minimal sketch of such a recurrent baseline is given after this list.
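- As a concrete illustration of how these Euclidean-input baselines consume the data, the snippet below sketches a minimal LSTM movement classifier; the input shape, hidden size, and binary output are assumptions rather than the experimental settings used in the paper.

```python
import torch
import torch.nn as nn


class LSTMBaseline(nn.Module):
    def __init__(self, feat_dim: int, hid_dim: int = 64, num_classes: int = 2):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hid_dim, batch_first=True)
        self.head = nn.Linear(hid_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, feat_dim] window of historical price indicators
        _, (h, _) = self.rnn(x)          # last hidden state summarizes the window
        return self.head(h.squeeze(0))   # [batch, num_classes] movement logits
```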
5.2.2. Non-Euclidean-Input-Based Methods
- The graph convolutional network (GCN) [38] updates features by aggregating the information of neighboring nodes.
- Temporal graph convolution (TGC) [5] makes the prediction by considering the sequential embedding and relational embedding.
- The multi-graph convolutional gated recurrent unit (Multi-GCGRU) [6] emphasizes the diversity of relations by constructing three types of graphs to enhance the representations of cross effects.
- Shared information for stock trend forecasting (HIST) [8] makes the prediction by combining predefined concepts, hidden concepts, and individual information.
- Multi-relational graph attention ranking (MGAR) [39] employs a graph aggregation network that concurrently incorporates multiple stock relation graphs as input to examine the interactions between stocks, particularly focusing on similarity relationships.
5.3. Metrics
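- The experiments report accuracy (ACC) and the Matthews correlation coefficient (MCC). The snippet below is a minimal sketch of how both metrics can be computed with scikit-learn; the label arrays are hypothetical placeholders, not data from the study.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual movement directions (up = 1)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions

acc = accuracy_score(y_true, y_pred)      # fraction of correct predictions
mcc = matthews_corrcoef(y_true, y_pred)   # balanced measure in [-1, 1]
print(f"ACC = {acc:.2%}, MCC = {mcc:.4f}")
```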
5.4. Experimental Results
5.5. Ablation Experiments
- To assess the effectiveness of RFF, we remove it from DSF-GNN and instead concatenate the outputs of DRE and SRE for prediction. The results demonstrate that modeling the interaction between dynamic and static relational features, while preserving their independent characteristics, enhances the prediction outcomes.
- To assess the effectiveness of DRE and SRE, we conduct experiments in which each module is removed in turn. When either module is removed, the RFF module is not used. The results demonstrate that both DRE and SRE contribute to the improvement in prediction performance.
- Removing the TFR module decreases predictive performance, indicating that historical price indicators are crucial for stock trend forecasting.
5.6. Relations Comparison
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Menkveld, A.J. The economics of high-frequency trading: Taking stock. Annu. Rev. Financ. Econ. 2016, 8, 1–24. [Google Scholar] [CrossRef]
- Dash, R.; Dash, P.K. A hybrid stock trading framework integrating technical analysis with machine learning techniques. J. Financ. Data Sci. 2016, 2, 42–57. [Google Scholar] [CrossRef]
- Arya, A.N.; Xu, Y.L.; Stankovic, L.; Mandic, D.P. Hierarchical Graph Learning for Stock Market Prediction Via a Domain-Aware Graph Pooling Operator. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar] [CrossRef]
- Sawhney, R.; Agarwal, S.; Wadhwa, A.; Derr, T.; Shah, R.R. Stock selection via spatiotemporal hypergraph attention network: A learning to rank approach. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021. [Google Scholar] [CrossRef]
- Feng, F.; He, X.; Wang, X.; Luo, C.; Liu, Y.; Chua, T.S. Temporal relational ranking for stock prediction. ACM Trans. Inf. Syst. (TOIS) 2019, 37, 1–30. [Google Scholar] [CrossRef]
- Ye, J.; Zhao, J.; Ye, K.; Xu, C. Multi-graph convolutional network for relationship-driven stock movement prediction. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Milano, Italy, 10–15 January 2021; pp. 6702–6709. [Google Scholar] [CrossRef]
- Chen, Y.; Wei, Z.; Huang, X. Incorporating corporation relationship via graph convolutional neural networks for stock price prediction. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Turin, Italy, 22–26 October 2018; pp. 1655–1658. [Google Scholar] [CrossRef]
- Xu, W.; Liu, W.; Wang, L.; Xia, Y.; Bian, J.; Yin, J.; Liu, T.Y. HIST: A graph-based framework for stock trend forecasting via mining concept-oriented shared information. arXiv 2021, arXiv:2110.13716. [Google Scholar]
- Tian, H.; Zheng, X.; Zhao, K.; Liu, M.W.; Zeng, D.D. Inductive Representation Learning on Dynamic Stock Co-Movement Graphs for Stock Predictions. INFORMS J. Comput. 2022, 34, 1940–1957. [Google Scholar] [CrossRef]
- Cheng, R.; Li, Q. Modeling the momentum spillover effect for stock prediction via attribute-driven graph attention networks. In Proceedings of the AAAI Conference on Artificial Intelligence, virtually, 2–9 February 2021; Volume 1, pp. 55–62. [Google Scholar] [CrossRef]
- Li, W.; Bao, R.; Harimoto, K.; Chen, D.; Xu, J.; Su, Q. Modeling the stock relation with graph network for overnight stock movement prediction. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, Yokohama, Japan, 7–15 January 2021; pp. 4541–4547. [Google Scholar] [CrossRef]
- Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Financial networks based on Granger causality: A case study. Phys. A Stat. Mech. Appl. 2017, 482, 65–73. [Google Scholar] [CrossRef]
- Saha, S.; Gao, J.; Gerlach, R. A survey of the application of graph-based approaches in stock market analysis and prediction. Int. J. Data Sci. Anal. 2022, 14, 1–15. [Google Scholar] [CrossRef]
- Botunac, I.; Bosna, J.; Matetić, M. Optimization of Traditional Stock Market Strategies Using the LSTM Hybrid Approach. Information 2024, 15, 136. [Google Scholar] [CrossRef]
- Chen, K.; Zhou, Y.; Dai, F. A LSTM-based method for stock returns prediction: A case study of China stock market. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015. [Google Scholar] [CrossRef]
- Qin, Y.; Song, D.; Chen, H.; Cheng, W.; Jiang, G.; Cottrell, G. A dual-stage attention-based recurrent neural network for time series prediction. arXiv 2017, arXiv:1704.02971. [Google Scholar]
- Zhang, L.; Aggarwal, C.; Qi, G.J. Stock price prediction via discovering multi-frequency trading patterns. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 2141–2149. [Google Scholar] [CrossRef]
- Feng, F.; Chen, H.; He, X.; Ding, J.; Sun, M.; Chua, T.S. Enhancing stock movement prediction with adversarial training. arXiv 2019, arXiv:1810.09936. [Google Scholar] [CrossRef]
- Md, A.Q.; Kapoor, S.; Junni, A.V.C.; Sivaraman, A.K.; Tee, K.F.; Sabireen, H.; Janakiraman, N. Novel optimization approach for stock price forecasting using multi-layered sequential LSTM. Appl. Soft Comput. 2023, 134, 109830. [Google Scholar] [CrossRef]
- Smith, N.; Varadharajan, V.; Kalla, D.; Kumar, G.R.; Samaah, F. Stock Closing Price and Trend Prediction with LSTM-RNN. J. Artif. Intell. Big Data 2024, 4, 877. [Google Scholar]
- Wang, G.; Cao, L.; Zhao, H.; Liu, Q.; Chen, E. Coupling macro-sector-micro financial indicators for learning stock representations with less uncertainty. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 4418–4426. [Google Scholar] [CrossRef]
- Mittal, A.; Goel, A. Stock Prediction Using Twitter Sentiment Analysis; CS229; Stanford University: Stanford, CA, USA, 2021. [Google Scholar]
- Kraus, M.; Feuerriegel, S. Decision support from financial disclosures with deep neural networks and transfer learning. Decis. Support Syst. 2017, 104, 38–48. [Google Scholar] [CrossRef]
- Li, X.; Wu, P.; Wang, W. Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong. Inf. Process. Manag. 2020, 57, 102212. [Google Scholar] [CrossRef]
- Liapis, C.M.; Kotsiantis, S. Temporal Convolutional Networks and BERT-Based Multi-Label Emotion Analysis for Financial Forecasting. Information 2023, 14, 596. [Google Scholar] [CrossRef]
- Deng, S.; Zhu, Y.; Yu, Y.; Huang, X. An integrated approach of ensemble learning methods for stock index prediction using investor sentiments. Expert Syst. Appl. 2024, 238, 121710. [Google Scholar] [CrossRef]
- Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Deep learning for event-driven stock prediction. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015. [Google Scholar]
- Deng, S.; Mitsubuchi, T.; Shioda, K.; Shimada, T.; Sakurai, A. Combining technical analysis with sentiment analysis for stock price prediction. In Proceedings of the 2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing, Sydney, Australia, 12–14 December 2011; pp. 800–807. [Google Scholar] [CrossRef]
- Chen, W.; Yeo, C.K.; Lau, C.T.; Lee, B.S. Leveraging social media news to predict stock index movement using RNN-boost. Data Knowl. Eng. 2018, 118, 14–24. [Google Scholar] [CrossRef]
- Sawhney, R.; Agarwal, S.; Wadhwa, A.; Shah, R. Deep attentive learning for stock movement prediction from social media text and company correlations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 8415–8426. [Google Scholar]
- Kim, R.; So, C.H.; Jeong, M.; Lee, S.; Kim, J.; Kang, J. Hats: A hierarchical graph attention network for stock movement prediction. arXiv 2019, arXiv:1908.07999. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762. [Google Scholar] [CrossRef]
- Zhang, Q.; Zhang, Y.; Yao, X.; Li, S.; Zhang, C.; Liu, P. A dynamic attributes-driven graph attention network modeling on behavioral finance for stock prediction. ACM Trans. Knowl. Discov. Data 2023, 18, 1–29. [Google Scholar] [CrossRef]
- Shi, Y.; Wang, Y.; Qu, Y.; Chen, Z. Integrated gcn-lstm stock prices movement prediction based on knowledge-incorporated graphs construction. Int. J. Mach. Learn. Cybern. 2024, 15, 161–176. [Google Scholar] [CrossRef]
- Qian, H.; Zhou, H.; Zhao, Q.; Chen, H.; Yao, H.; Wang, J.; Liu, Z.; Yu, F.; Zhang, Z.; Zhou, J. MDGNN: Multi-Relational Dynamic Graph Neural Network for Comprehensive and Dynamic Stock Investment Prediction. arXiv 2024, arXiv:2402.06633. [Google Scholar] [CrossRef]
- Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. arXiv 2017, arXiv:1706.02216. [Google Scholar] [CrossRef]
- Xing, R.; Cheng, R.; Huang, J.; Li, Q.; Zhao, J. Learning to Understand the Vague Graph for Stock Prediction with Momentum Spillovers. IEEE Trans. Knowl. Data Eng. 2024, 36, 1698–1712. [Google Scholar] [CrossRef]
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar] [CrossRef]
- Song, G.; Zhao, T.; Wang, S.; Wang, H.; Li, X. Stock ranking prediction using a graph aggregation network based on stock price and stock relationship information. Inf. Sci. 2023, 643, 119236. [Google Scholar] [CrossRef]
- Wang, H.; Li, S.; Wang, T.; Zheng, J. Hierarchical Adaptive Temporal-Relational Modeling for Stock Trend Prediction. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 19–27 August 2021; pp. 3691–3698. [Google Scholar]
- Tank, A.; Covert, I.; Foti, N.; Shojaie, A.; Fox, E.B. Neural granger causality. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 4267–4279. [Google Scholar] [CrossRef]
- Tank, A.; Fox, E.B.; Shojaie, A. Granger causality networks for categorical time series. arXiv 2017, arXiv:1706.02781. [Google Scholar] [CrossRef]
- Arnold, A.; Liu, Y.; Abe, N. Temporal causal modeling with graphical granger methods. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, USA, 12–15 August 2007; pp. 66–75. [Google Scholar] [CrossRef]
- Damos, P. Using multivariate cross correlations, Granger causality and graphical models to quantify spatiotemporal synchronization and causality between pest populations. BMC Ecol. 2016, 16, 33. [Google Scholar] [CrossRef]
- Mainali, K.; Bewick, S.; Vecchio-Pagan, B.; Karig, D.; Fagan, W.F. Detecting interaction networks in the human microbiome with conditional Granger causality. PLoS Comput. Biol. 2019, 15, e1007037. [Google Scholar] [CrossRef]
- Zhang, X.; He, Y.; Brugnone, N.; Perlmutter, M.; Hirn, M. Magnet: A neural network for directed graphs. Adv. Neural Inf. Process. Syst. 2021, 34, 27003–27015. [Google Scholar] [PubMed]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Cho, K.; van Merrienboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar] [CrossRef]
- Li, S.; Liao, W.; Chen, Y.; Yan, R. PEN: Prediction-explanation network to forecast stock price movement with better explainability. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023. [Google Scholar] [CrossRef]
- Liao, S.; Xie, L.; Du, Y.; Chen, S.; Wan, H.; Xu, H. Stock trend prediction based on dynamic hypergraph spatio-temporal network. Appl. Soft Comput. 2024, 154, 111329. [Google Scholar] [CrossRef]
| Category | Reference | Relation Type | Pros | Cons |
|---|---|---|---|---|
| Euclidean-input-based | LSTM [14,15] | Dynamic | The time-based patterns in historical price indicators are effectively utilized. | Fails to recognize that the impacts of various historical trading days differ. |
| | MLS-LSTM [19] | Dynamic, implicit | Takes into account the varying impacts of different historical trading days. | Overlooks the aggregate effect of associated stocks. |
| | ALSTM [16] | Dynamic, implicit | | |
| | LSTM-RNN [20] | Dynamic, implicit | | |
| Graph-based | GCN [38] | Static, implicit | A stock graph efficiently reveals impacts from linked stocks. | A static graph cannot model the complex, dynamic relationships between stocks. |
| | TGC [5] | Static, explicit | Dynamically learns and captures interactions between stocks. | Using one graph for prediction limits integrating multiple stock graphs. |
| | ADGAT [10] | Dynamic, implicit | | |
| Graph-based | DGATS [33] | Static, explicit | Explores multifaceted stock relationships, capturing interactions single graphs may miss. | The main focus is on the correlation between related stocks, without fully exploring causal relationships. |
| | MGAR [39] | Dynamic, explicit | | |
| | MDGNN [35] | Dynamic, implicit | | |
| Category | Methods | ACC (%) | MCC |
|---|---|---|---|
| Euclidean-input-based methods | ALSTM | 52.55 | 0.0613 |
| | LSTM | 52.56 | 0.0596 |
| | GRU | 52.72 | 0.0620 |
| Non-Euclidean-input-based methods | GCN | 51.49 | 0.0295 |
| | Multi-GCGRU | 52.95 | 0.0582 |
| | TGC | 53.15 | 0.0596 |
| | HIST | 53.56 | 0.0618 |
| | MGAR | 53.34 | 0.0602 |
| | Proposed Model (DSF-GNN) | 54.97 | 0.0821 |
| TFR | DRE | SRE | RFF | ACC (%) | MCC |
|---|---|---|---|---|---|
| ✓ | ✓ | ✓ | × | 53.86 | 0.0659 |
| ✓ | ✓ | × | × | 53.58 | 0.0632 |
| ✓ | × | ✓ | × | 53.21 | 0.0564 |
| × | ✓ | ✓ | ✓ | 54.23 | 0.0712 |
| ✓ | ✓ | ✓ | ✓ | 54.97 | 0.0821 |
| Variant | ACC (%) | MCC |
|---|---|---|
| w/o -S | 53.41 | 0.0592 |
| w/o -C | 53.87 | 0.0656 |
| w/o -M | 54.85 | 0.0801 |
| w/o -I | 53.58 | 0.0632 |
| DSF-GNN | 54.97 | 0.0821 |