From Text Representation to Financial Market Prediction: A Literature Review
Abstract
1. Introduction
- We provide state-of-the-art use cases of fundamental analysis studies in financial market prediction, trading strategy recommendation, and correlation analysis, which distinguishes this study from the existing surveys.
- We systematically review methods for structuring heterogeneous data sources, such as knowledge graph mining and tensor decomposition techniques, and discuss lag analysis and significance indication methods, which are often neglected in other surveys.
- We organize our review into four main categories: heterogeneous data sources, text structuring, information analysis, and knowledge discovery methods.
- We present future research directions in each of these categories.
- We discuss various big data aspects, such as variety, veracity, volume, valence, and velocity, as well as the tools and challenges in this field.
2. Background and Methodology
2.1. Methodology
2.2. Information Sources
2.3. Text Representation
2.4. Predictive Models
3. Review Configuration
3.1. Textual Representation
3.1.1. Bag of Words
3.1.2. Concepts-Based Approaches
3.1.3. Word Embedding
3.1.4. Discussion
3.2. Information Retrieval
3.2.1. Discussion about Sentiment Analysis
3.2.2. Event Detection
3.3. Knowledge Extraction
3.3.1. Predictive Models Based on Heterogeneous Data Fusion
3.3.2. Statistical Analysis of Investors’ Behavior
3.3.3. Recommendation Trading Strategies
4. Big Data, Tools, and Challenges
5. Open Research and Future Directions
- What is the effect of famous authors and influential persons working for financial newsgroups on market fluctuations?
- Do investors respond to the news, social media posts, or other information resources in the same way?
- Do the representation of contextual information in news documents and the proximity between news items in a sequence help improve financial decision support?
- How do rumor and fake news diffusion patterns in financial social media correlate with market fluctuation?
5.1. Data
5.2. NLP Applications
5.3. Finance Applications
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Fama, E.F. The behavior of stock-market prices. J. Bus. 1965, 38, 34–105. [Google Scholar] [CrossRef]
- Shiller, R.J. From efficient markets theory to behavioral finance. J. Econ. Perspect. 2003, 17, 83–104. [Google Scholar] [CrossRef]
- Ramiah, V.; Xu, X.; Moosa, I.A. Neoclassical finance, behavioral finance and noise traders: A review and assessment of the literature. Int. Rev. Financ. Anal. 2015, 41, 89–100. [Google Scholar] [CrossRef]
- Le, Q.; Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1188–1196. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers); Association for Computational Linguistics: Minneapolis, Minnesota, 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
- Cutler, D.M.; Poterba, J.M.; Summers, L.H. What moves stock prices? Technical report; National Bureau of Economic Research: Cambridge, MA, USA, 1988. [Google Scholar]
- Barber, B.M.; Loeffler, D. The “Dartboard” Column: Second-Hand Information and Price Pressure. J. Financ. Quant. Anal. 1993, 28, 273–284. [Google Scholar] [CrossRef]
- Wuthrich, B.; Cho, V.; Leung, S.; Permunetilleke, D.; Sankaran, K.; Zhang, J. Daily stock market forecast from textual web data. In SMC’98 Conference Proceedings, Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 98CH36218), San Diego, CA, USA, 14 October 1998; IEEE: Piscataway, NJ, USA, 1998; Volume 3, pp. 2720–2725. [Google Scholar] [CrossRef]
- Daniel, M.; Neves, R.F.; Horta, N. Company event popularity for financial markets using Twitter and sentiment analysis. Expert Syst. Appl. 2017, 71, 111–124. [Google Scholar] [CrossRef]
- Sun, Y.; Fang, M.; Wang, X. A novel stock recommendation system using Guba sentiment analysis. Pers. Ubiquitous Comput. 2018, 22, 575–587. [Google Scholar] [CrossRef]
- Anbaee Farimani, S.; Vafaei Jahan, M.; Milani Fard, A.; Tabbakh, S.R.K. Investigating the informativeness of technical indicators and news sentiment in financial market price prediction. Knowl.-Based Syst. 2022, 247, 108742. [Google Scholar] [CrossRef]
- Passalis, N.; Avramelou, L.; Seficha, S.; Tsantekidis, A.; Doropoulos, S.; Makris, G.; Tefas, A. Multisource financial sentiment analysis for detecting Bitcoin price change indications using deep learning. Neural Comput. Appl. 2022, 1–12. [Google Scholar] [CrossRef]
- Krishnamoorthy, S. Sentiment analysis of financial news articles using performance indicators. Knowl. Inf. Syst. 2018, 56, 373–394. [Google Scholar] [CrossRef]
- Seifollahi, S.; Shajari, M. Word sense disambiguation application in sentiment analysis of news headlines: An applied approach to FOREX market prediction. J. Intell. Inf. Syst. 2019, 52, 57–83. [Google Scholar] [CrossRef]
- Anbaee Farimani, S.; Vafaei Jahan, M.; Milani Fard, A.; Haffari, G. Leveraging Latent Economic Concepts and Sentiments in the News for Market Prediction. In Proceedings of the 8th IEEE International Conference on Data Science and Advanced Analytics (DSAA), Porto, Portugal, 6–9 October 2021. [Google Scholar]
- Chen, X.; Ma, X.; Wang, H.; Li, X.; Zhang, C. A hierarchical attention network for stock prediction based on attentive multi-view news learning. Neurocomputing 2022, 504, 1–15. [Google Scholar] [CrossRef]
- Vargas, M.R.; dos Anjos, C.E.; Bichara, G.L.; Evsukoff, A.G. Deep leaming for stock market prediction using technical indicators and financial news articles. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8. [Google Scholar] [CrossRef]
- Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Knowledge-Driven Event Embedding for Stock Prediction. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics, Osaka, Japan, 11–16 December 2016; Technical Papers. The COLING 2016 Organizing Committee: Osaka, Japan, 2016; pp. 2133–2142. [Google Scholar]
- Long, J.; Chen, Z.; He, W.; Wu, T.; Ren, J. An integrated framework of deep learning and knowledge graph for prediction of stock price trend: An application in Chinese stock exchange market. Appl. Soft Comput. 2020, 91, 106205. [Google Scholar] [CrossRef]
- Chen, D.; Ma, S.; Harimoto, K.; Bao, R.; Su, Q.; Sun, X. Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction. In Proceedings of the Second Workshop on Economics and Natural Language Processing; Association for Computational Linguistics: Hong Kong, 2019; pp. 41–50. [Google Scholar] [CrossRef]
- Lutz, B.; Pröllochs, N.; Neumann, D. Predicting sentence-level polarity labels of financial news using abnormal stock returns. Expert Syst. Appl. 2020, 148, 113223. [Google Scholar] [CrossRef]
- Wang, H.; Lu, S.; Zhao, J. Aggregating multiple types of complex data in stock market prediction: A model-independent framework. Knowl.-Based Syst. 2019, 164, 193–204. [Google Scholar] [CrossRef]
- Ren, J.; Long, J.; Xu, Z. Financial news recommendation based on graph embeddings. Decis. Support Syst. 2019, 125, 113115. [Google Scholar] [CrossRef]
- Yang, L.; Xu, Y.; Ng, J.; Dong, R. Leveraging BERT to improve the FEARS index for stock forecasting. In Proceedings of the First Workshop on Financial Technology and Natural Language Processing, Macao, China, 12 August 2019; ACL: Stroudsburg, PA, USA, 2019. [Google Scholar]
- Zhang, Z.; Zohren, S.; Roberts, S. Deep reinforcement learning for trading. J. Financ. Data Sci. 2020, 2, 25–40. [Google Scholar] [CrossRef]
- Gao, Z.; Gao, Y.; Hu, Y.; Jiang, Z.; Su, J. Application of Deep Q-Network in Portfolio Management. In Proceedings of the 2020 5th IEEE International Conference on Big Data Analytics (ICBDA), Xiamen, China, 8–11 May 2020; pp. 268–275. [Google Scholar] [CrossRef]
- Li, X.; Wu, P.; Wang, W. Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong. Inf. Process. Manag. 2020, 57, 102212. [Google Scholar] [CrossRef]
- Moews, B.; Ibikunle, G. Predictive intraday correlations in stable and volatile market environments: Evidence from deep learning. Phys. A Stat. Mech. Appl. 2020, 547, 124392. [Google Scholar] [CrossRef]
- Teng, X.; Wang, T.; Zhang, X.; Lan, L.; Luo, Z. Enhancing Stock Price Trend Prediction via a Time-Sensitive Data Augmentation Method. Complexity 2020, 2020, 6737951. [Google Scholar] [CrossRef]
- Wong, S.Y.K.; Chan, J.S.K.; Azizi, L.; Xu, R.Y.D. Time-varying Neural Network for Stock Return Prediction. Int. J. Intell. Syst. Account. Financ. Manag. 2022, 29, 3–18. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Lu, Z.; Du, P.; Nie, J.Y. VGCN-BERT: Augmenting BERT with Graph Embedding for Text Classification. In Proceedings of the European Conference on Information Retrieval, Lisbon, Portugal, 14–17 April 2020; Springer: Berlin, Germany, 2020; pp. 369–382. [Google Scholar] [CrossRef]
- Wang, T.; Yuan, C.; Wang, C. Does Applying Deep Learning in Financial Sentiment Analysis Lead to Better Classification Performance? Econ. Bull. 2020, 40, 1091–1105. [Google Scholar]
- Liu, X.; Huang, H.; Zhang, Y.; Yuan, C. News-Driven Stock Prediction With Attention-Based Noisy Recurrent State Transition. arXiv 2020, arXiv:2004.01878. [Google Scholar]
- Liu, Q.; Cheng, X.; Su, S.; Zhu, S. Hierarchical complementary attention network for predicting stock price movements with news. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 1603–1606. [Google Scholar] [CrossRef]
- Liu, X.; Luo, Z.; Huang, H. Jointly multiple events extraction via attention-based graph information aggregation. arXiv 2018, arXiv:1809.09078. [Google Scholar]
- Zhao, R.; Deng, Y.; Dredze, M.; Verma, A.; Rosenberg, D.; Stent, A. Visual attention model for cross-sectional stock return prediction and end-to-end multimodal market representation learning. arXiv 2018, arXiv:1809.03684. [Google Scholar]
- Yang, L.; Zhang, Z.; Xiong, S.; Wei, L.; Ng, J.; Xu, L.; Dong, R. Explainable text-driven neural network for stock prediction. In Proceedings of the 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), Nanjing, China, 23–25 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 441–445. [Google Scholar] [CrossRef]
- Deng, S.; Zhang, N.; Zhang, W.; Chen, J.; Pan, J.Z.; Chen, H. Knowledge-Driven Stock Trend Prediction and Explanation via Temporal Convolutional Network. In Proceedings of the Companion Proceedings of The 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; Association for Computing Machinery: New York, NY, USA, 2019. WWW ’19. pp. 678–685. [Google Scholar] [CrossRef]
- Prokhorov, V.; Pilehvar, M.T.; Collier, N. Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models. arXiv 2019, arXiv:1904.02996. [Google Scholar]
- Cheng, D.; Yang, F.; Xiang, S.; Liu, J. Financial time series forecasting with multi-modality graph neural network. Pattern Recognit. 2022, 121, 108218. [Google Scholar] [CrossRef]
- Kim, R.; So, C.H.; Jeong, M.; Lee, S.; Kim, J.; Kang, J. HATS: A Hierarchical Graph Attention Network for Stock Movement Prediction. arXiv 2019, arXiv:1908.07999. [Google Scholar]
- Kumar, B.S.; Ravi, V. A survey of the applications of text mining in financial domain. Knowl.-Based Syst. 2016, 114, 128–147. [Google Scholar] [CrossRef]
- Li, Q.; Chen, Y.; Wang, J.; Chen, Y.; Chen, H. Web media and stock markets: A survey and future directions from a big data perspective. IEEE Trans. Knowl. Data Eng. 2017, 30, 381–399. [Google Scholar] [CrossRef]
- Man, X.; Luo, T.; Lin, J. Financial sentiment analysis (fsa): A survey. In Proceedings of the 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS), Taipei, 6–9 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 617–622. [Google Scholar] [CrossRef]
- Nassirtoussi, A.K.; Aghabozorgi, S.; Wah, T.Y.; Ngo, D.C.L. Text mining for market prediction: A systematic review. Expert Syst. Appl. 2014, 41, 7653–7670. [Google Scholar] [CrossRef]
- Jiang, W. Applications of deep learning in stock market prediction: Recent progress. Expert Syst. Appl. 2021, 184, 115537. [Google Scholar] [CrossRef]
- Rajendiran, P.; Priyadarsini, P. Survival study on stock market prediction techniques using sentimental analysis. Mater. Today Proc. 2021. [Google Scholar] [CrossRef]
- Saha, S.; Gao, J.; Gerlach, R. A survey of the application of graph-based approaches in stock market analysis and prediction. Int. J. Data Sci. Anal. 2022, 14, 1–15. [Google Scholar] [CrossRef]
- Cao, L.; Yang, Q.; Yu, P.S. Data science and AI in FinTech: An overview. Int. J. Data Sci. Anal. 2021, 12, 81–99. [Google Scholar] [CrossRef]
- Kumbure, M.M.; Lohrmann, C.; Luukka, P.; Porras, J. Machine learning techniques and data for stock market forecasting: A literature review. Expert Syst. Appl. 2022, 197, 116659. [Google Scholar] [CrossRef]
- Agarwal, S.; Kumar, S.; Goel, U. Stock market response to information diffusion through internet sources: A literature review. Int. J. Inf. Manag. 2019, 45, 118–131. [Google Scholar] [CrossRef]
- Xing, F.Z.; Cambria, E.; Welsch, R.E. Natural language based financial forecasting: A survey. Artif. Intell. Rev. 2018, 50, 49–73. [Google Scholar] [CrossRef]
- Bollen, J.; Mao, H.; Zeng, X. Twitter mood predicts the stock market. J. Comput. Sci. 2011, 2, 1–8. [Google Scholar] [CrossRef]
- Tetlock, P.C. Giving content to investor sentiment: The role of media in the stock market. J. Financ. 2007, 62, 1139–1168. [Google Scholar] [CrossRef]
- Tetlock, P.C.; Saar-Tsechansky, M.; Macskassy, S. More than words: Quantifying language to measure firms’ fundamentals. J. Financ. 2008, 63, 1437–1467. [Google Scholar] [CrossRef]
- Baker, M.; Wurgler, J. Investor sentiment and the cross-section of stock returns. J. Financ. 2006, 61, 1645–1680. [Google Scholar] [CrossRef]
- Weng, B.; Ahmed, M.A.; Megahed, F.M. Stock market one-day ahead movement prediction using disparate data sources. Expert Syst. Appl. 2017, 79, 153–163. [Google Scholar] [CrossRef]
- Nassirtoussi, A.K.; Aghabozorgi, S.; Wah, T.Y.; Ngo, D.C.L. Text mining of news-headlines for FOREX market prediction: A Multi-layer Dimension Reduction Algorithm with semantics and sentiment. Expert Syst. Appl. 2015, 42, 306–324. [Google Scholar] [CrossRef]
- Long, W.; Song, L.; Tian, Y. A new graphic kernel method of stock price trend prediction based on financial news semantic and structural similarity. Expert Syst. Appl. 2019, 118, 411–424. [Google Scholar] [CrossRef]
- Van de Kauter, M.; Breesch, D.; Hoste, V. Fine-grained analysis of explicit and implicit sentiment in financial news articles. Expert Syst. Appl. 2015, 42, 4999–5010. [Google Scholar] [CrossRef]
- Do, H.H.; Prasad, P.; Maag, A.; Alsadoon, A. Deep Learning for Aspect-Based Sentiment Analysis: A Comparative Review. Expert Syst. Appl. 2019, 118, 272–299. [Google Scholar] [CrossRef]
- Shavandi, A.; Khedmati, M. A multi-agent deep reinforcement learning framework for algorithmic trading in financial markets. Expert Syst. Appl. 2022, 208, 118124. [Google Scholar] [CrossRef]
- Chen, J.; Luo, C.; Pan, L.; Jia, Y. Trading strategy of structured mutual fund based on deep learning network. Expert Syst. Appl. 2021, 183, 115390. [Google Scholar] [CrossRef]
- Carta, S.; Ferreira, A.; Podda, A.S.; Reforgiato Recupero, D.; Sanna, A. Multi-DQN: An ensemble of Deep Q-learning agents for stock market forecasting. Expert Syst. Appl. 2021, 164, 113820. [Google Scholar] [CrossRef]
- Nam, K.; Seong, N. Financial news-based stock movement prediction using causality analysis of influence in the Korean stock market. Decis. Support Syst. 2019, 117, 100–112. [Google Scholar] [CrossRef]
- Shynkevich, Y.; McGinnity, T.M.; Coleman, S.A.; Belatreche, A. Forecasting movements of health-care stock prices based on different categories of news articles using multiple kernel learning. Decis. Support Syst. 2016, 85, 74–83. [Google Scholar] [CrossRef]
- Schumaker, R.P.; Zhang, Y.; Huang, C.N.; Chen, H. Evaluating sentiment in financial news articles. Decis. Support Syst. 2012, 53, 458–464. [Google Scholar] [CrossRef]
- Hagenau, M.; Liebmann, M.; Neumann, D. Automated news reading: Stock price prediction based on financial news using context-capturing features. Decis. Support Syst. 2013, 55, 685–697. [Google Scholar] [CrossRef]
- Ho, C.S.; Damien, P.; Gu, B.; Konana, P. The time-varying nature of social media sentiments in modeling stock returns. Decis. Support Syst. 2017, 101, 69–81. [Google Scholar] [CrossRef]
- Oliveira, N.; Cortez, P.; Areal, N. Stock market sentiment lexicon acquisition using microblogging data and statistical measures. Decis. Support Syst. 2016, 85, 62–73. [Google Scholar] [CrossRef]
- Geva, T.; Zahavi, J. Empirical evaluation of an automated intraday stock recommendation system incorporating both market data and textual news. Decis. Support Syst. 2014, 57, 212–223. [Google Scholar] [CrossRef]
- Feuerriegel, S.; Prendinger, H. News-based trading strategies. Decis. Support Syst. 2016, 90, 65–74. [Google Scholar] [CrossRef]
- Xiang, C.; Zhang, J.; Li, F.; Fei, H.; Ji, D. A semantic and syntactic enhanced neural model for financial sentiment analysis. Inf. Process. Manag. 2022, 59, 102943. [Google Scholar] [CrossRef]
- Yang, C.; Zhang, H.; Jiang, B.; Li, K. Aspect-based sentiment analysis with alternating coattention networks. Inf. Process. Manag. 2019, 56, 463–478. [Google Scholar] [CrossRef]
- Zhang, X.; Ghorbani, A.A. An overview of online fake news: Characterization, detection, and discussion. Inf. Process. Manag. 2020, 57, 102025. [Google Scholar] [CrossRef]
- Ghorbanali, A.; Sohrabi, M.K.; Yaghmaee, F. Ensemble transfer learning-based multimodal sentiment analysis using weighted convolutional neural networks. Inf. Process. Manag. 2022, 59, 102929. [Google Scholar] [CrossRef]
- Zhang, X.; Zhang, Y.; Wang, S.; Yao, Y.; Fang, B.; Philip, S.Y. Improving stock market prediction via heterogeneous information fusion. Knowl.-Based Syst. 2018, 143, 236–247. [Google Scholar] [CrossRef]
- Liang, B.; Su, H.; Gui, L.; Cambria, E.; Xu, R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl.-Based Syst. 2022, 235, 107643. [Google Scholar] [CrossRef]
- Consoli, S.; Barbaglia, L.; Manzan, S. Fine-grained, aspect-based sentiment analysis on economic and financial lexicon. Knowl.-Based Syst. 2022, 247, 108781. [Google Scholar] [CrossRef]
- Wu, H.; Zhang, Z.; Shi, S.; Wu, Q.; Song, H. Phrase dependency relational graph attention network for Aspect-based Sentiment Analysis. Knowl.-Based Syst. 2022, 236, 107736. [Google Scholar] [CrossRef]
- Kim, H.K.; Kim, H.; Cho, S. Bag-of-concepts: Comprehending document representation through clustering words in distributed representation. Neurocomputing 2017, 266, 336–352. [Google Scholar] [CrossRef] [Green Version]
- Song, Q.; Liu, A.; Yang, S.Y. Stock portfolio selection using learning-to-rank algorithms with news sentiment. Neurocomputing 2017, 264, 20–28. [Google Scholar] [CrossRef]
- Yang, S.Y.; Mo, S.Y.K.; Liu, A.; Kirilenko, A.A. Genetic programming optimization for a sentiment feedback strength based trading strategy. Neurocomputing 2017, 264, 29–41. [Google Scholar] [CrossRef]
- Atkins, A.; Niranjan, M.; Gerding, E. Financial news predicts stock market volatility better than close price. J. Financ. Data Sci. 2018, 4, 120–137. [Google Scholar] [CrossRef]
- Nisar, T.M.; Yeung, M. Twitter as a tool for forecasting stock market movements: A short-window event study. J. Financ. Data Sci. 2018, 4, 101–119. [Google Scholar] [CrossRef]
- Sun, A.; Lachanski, M.; Fabozzi, F.J. Trade the tweet: Social media text mining and sparse matrix factorization for stock market prediction. Int. Rev. Financ. Anal. 2016, 48, 272–281. [Google Scholar] [CrossRef]
- Li, Y.; Pan, Y. A novel ensemble deep learning model for stock prediction based on stock prices and news. Int. J. Data Sci. Anal. 2022, 13, 139–149. [Google Scholar] [CrossRef] [PubMed]
- Ito, T.; Sakaji, H.; Izumi, K.; Tsubouchi, K.; Yamashita, T. Ginn: Gradient interpretable neural networks for visualizing financial texts. Int. J. Data Sci. Anal. 2020, 9, 431–445. [Google Scholar] [CrossRef]
- Hájek, P. Combining bag-of-words and sentiment features of annual reports to predict abnormal stock returns. Neural Comput. Appl. 2018, 29, 343–358. [Google Scholar] [CrossRef]
- Li, X.; Xie, H.; Wang, R.; Cai, Y.; Cao, J.; Wang, F.; Min, H.; Deng, X. Empirical analysis: Stock market prediction via extreme learning machine. Neural Comput. Appl. 2016, 27, 67–78. [Google Scholar] [CrossRef]
- Jahan, M.V.; Akbarzadeh-T, M.R. Extremal optimization vs. learning automata: Strategies for spin selection in portfolio selection problems. Appl. Soft Comput. 2012, 12, 3276–3284. [Google Scholar] [CrossRef]
- Shi, Y.; Li, W.; Zhu, L.; Guo, K.; Cambria, E. Stock trading rule discovery with double deep Q-network. Appl. Soft Comput. 2021, 107, 107320. [Google Scholar] [CrossRef]
- Zhang, W.; Wang, P.; Li, X.; Shen, D. Quantifying the cross-correlations between online searches and Bitcoin market. Phys. A Stat. Mech. Appl. 2018, 509, 657–672. [Google Scholar] [CrossRef]
- Ochiai, T.; Nacher, J. A model for the dynamic behavior of financial assets affected by news: The case of Tohoku–Kanto earthquake. Phys. Lett. A 2011, 375, 3552–3556. [Google Scholar] [CrossRef]
- Rodrigues, F.B.; Giozza, W.F.; de Oliveira Albuquerque, R.; García Villalba, L.J. Natural Language Processing Applied to Forensics Information Extraction With Transformers and Graph Visualization. IEEE Trans. Comput. Soc. Syst. 2022, 1–17. [Google Scholar] [CrossRef]
- Jahan, M.V.; Akbarzadeh-T, M.R. From local search to global conclusions: Migrating spin glass-based distributed portfolio selection. IEEE Trans. Evol. Comput. 2010, 14, 591–601. [Google Scholar] [CrossRef]
- Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Deep learning for event-driven stock prediction. In Proceedings of the Twenty-fourth international joint conference on artificial intelligence, Buenos Aires, Argentina, 25–31 July 2015. [Google Scholar]
- Liu, W.; Zhou, P.; Zhao, Z.; Wang, Z.; Ju, Q.; Deng, H.; Wang, P. K-BERT: Enabling Language Representation with Knowledge Graph. In Proceedings of the AAAI, New York, NY, USA, 7–12 February 2020; pp. 2901–2908. [Google Scholar] [CrossRef]
- Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Using Structured Events to Predict Stock Price Movement: An Empirical Investigation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1415–1425. [Google Scholar] [CrossRef]
- Si, J.; Mukherjee, A.; Liu, B.; Pan, S.J.; Li, Q.; Li, H. Exploiting social relations and sentiment for stock prediction. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1139–1145. [Google Scholar]
- Papaluca, A.; Krefl, D.; Suominen, H.; Lenskiy, A. Pretrained Knowledge Base Embeddings for improved Sentential Relation Extraction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Dublin, Ireland, 22–27 May 2022; Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 373–382. [Google Scholar] [CrossRef]
- Xu, Y.; Cohen, S.B. Stock movement prediction from tweets and historical prices. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Stroudsburg, PA, USA, 15–20 July 2018; Association for Computational Linguistics: Melbourne, Australia, 2018; pp. 1970–1979. [Google Scholar] [CrossRef]
- Hajek, P.; Barushka, A. Integrating Sentiment Analysis and Topic Detection in Financial News for Stock Movement Prediction. In Proceedings of the 2nd International Conference on Business and Information Management, ICBIM ’18, Barcelona, Spain, 20–22 September 2018; Association for Computing Machinery: New York, NY, USA, 2018; pp. 158–162. [Google Scholar] [CrossRef]
- Jin, F.; Self, N.; Saraf, P.; Butler, P.; Wang, W.; Ramakrishnan, N. Forex-foreteller: Currency trend modeling using news articles. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 1470–1473. [Google Scholar] [CrossRef]
- Sontag, D.; Roy, D. Complexity of inference in latent dirichlet allocation. In Proceedings of the Advances in neural information processing systems, Granada, Spain, 12–15 December 2011; pp. 1008–1016. [Google Scholar]
- Zhang, X.; Fuehres, H.; Gloor, P.A. Predicting stock market indicators through twitter I hope it is not as bad as I fear. Procedia-Soc. Behav. Sci. 2011, 26, 55–62. [Google Scholar] [CrossRef]
- Wang, Q. Cryptocurrencies asset pricing via machine learning. Int. J. Data Sci. Anal. 2021, 12, 175–183. [Google Scholar] [CrossRef]
- Atzeni, M.; Dridi, A.; Recupero, D.R. Using frame-based resources for sentiment analysis within the financial domain. Prog. Artif. Intell. 2018, 7, 273–294. [Google Scholar] [CrossRef]
- Zhang, K.; Zi, J.; Wu, L.G. New event detection based on indexing-tree and named entity. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, The Netherlands, 23–27 July 2007; pp. 215–222. [Google Scholar] [CrossRef]
- Da, Z.; Engelberg, J.; Gao, P. The sum of all FEARS investor sentiment and asset prices. Rev. Financ. Stud. 2015, 28, 1–32. [Google Scholar] [CrossRef]
- Wu, J.; Xu, K.; Zhao, J. Online reviews can predict long-term returns of individual stocks. arXiv 2019, arXiv:1905.03189. [Google Scholar]
- Wu, G.G.R.; Hou, T.C.T.; Lin, J.L. Can economic news predict Taiwan stock market returns? Asia Pac. Manag. Rev. 2019, 24, 54–59. [Google Scholar] [CrossRef]
- Liu, Y. Fine-tune BERT for extractive summarization. arXiv 2019, arXiv:1903.10318. [Google Scholar]
- Harris, Z.S. Distributional Structure. WORD 1954, 10, 146–162. [Google Scholar] [CrossRef]
- Salton, G. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer; Addison-Wesley: Boston, MA, USA, 1989; Volume 169. [Google Scholar]
- Dumais, S.T. Latent semantic analysis. Annu. Rev. Inf. Sci. Technol. 2004, 38, 188–230. [Google Scholar] [CrossRef]
- Feuerriegel, S.; Gordon, J. News-based forecasts of macroeconomic indicators: A semantic path model for interpretable predictions. Eur. J. Oper. Res. 2019, 272, 162–175. [Google Scholar] [CrossRef] [Green Version]
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022. [Google Scholar]
- Anbaee Farimani, S.; Tabatabaee, H.; Kaffashan, M. An Investigation into the Process of Organizing and Retrieving Web Texts Based on the Integration of Semantic Concepts In order to organize knowledge. Iran. J. Inf. Process. Manag. 2019, 34, 1879–1904. [Google Scholar]
- Huynh, H.D.; Dang, L.M.; Duong, D. A new model for stock price movements prediction using deep neural network. In Proceedings of the Eighth International Symposium on Information and Communication Technology, Nha Trang City, Viet Nam, 7–8 December 2017; pp. 57–62. [Google Scholar] [CrossRef]
- Lutz, B.; Pröllochs, N.; Neumann, D. Sentence-Level Sentiment Analysis of Financial News Using Distributed Text Representations and Multi-Instance Learning. arXiv 2018, arXiv:1901.00400. [Google Scholar]
- Hiew, J.Z.G.; Huang, X.; Mou, H.; Li, D.; Wu, Q.; Xu, Y. BERT-based Financial Sentiment Index and LSTM-based Stock Return Predictability. arXiv 2019, arXiv:1906.09024. [Google Scholar]
- Jiang, C.; Liang, K.; Chen, H.; Ding, Y. Analyzing market performance via social media: A case study of a banking industry crisis. Sci. China Inf. Sci. 2014, 57, 1–18. [Google Scholar] [CrossRef]
- Hendershott, T.; Livdan, D.; Schürhoff, N. Are institutions informed about news? J. Financ. Econ. 2015, 117, 249–287. [Google Scholar] [CrossRef]
- Gupta, K.; Banerjee, R. Does OPEC news sentiment influence stock returns of energy firms in the United States? Energy Econ. 2019, 77, 34–45. [Google Scholar] [CrossRef]
- Medovikov, I. When does the stock market listen to economic news? New evidence from copulas and news wires. J. Bank. Financ. 2016, 65, 27–40. [Google Scholar] [CrossRef]
- Verma, I.; Dey, L.; Meisheri, H. Detecting, quantifying and accessing impact of news events on Indian stock indices. In Proceedings of the International Conference on Web Intelligence, Sogndal, Norway, 25–27 May 2011; pp. 550–557. [Google Scholar] [CrossRef]
- Dale, R. GPT-3: What’s it good for? Nat. Lang. Eng. 2021, 27, 113–118. [Google Scholar] [CrossRef]
- Tetlock, P.C. Does public financial news resolve asymmetric information? Rev. Financ. Stud. 2010, 23, 3520–3557. [Google Scholar] [CrossRef]
- Tausch, F.; Zumbuehl, M. Stability of risk attitudes and media coverage of economic news. J. Econ. Behav. Organ. 2018, 150, 295–310. [Google Scholar] [CrossRef]
- Lee, C.Y.; Soo, V.W. Predict Stock Price with Financial News Based on Recurrent Convolutional Neural Networks. In Proceedings of the 2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI), Taipei, 1–3 December 2017; pp. 160–165. [Google Scholar] [CrossRef]
- Rao, Y.; Zhong, X.; Lu, S. Research on News Topic-Driven Market Flucatuation and Predication. In Proceedings of the 2016 International Conference on Identification, Information and Knowledge in the Internet of Things (IIKI), Beijing, China, 20–21 October 2016; pp. 559–562. [Google Scholar] [CrossRef]
- Baccianella, S.; Esuli, A.; Sebastiani, F. SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), Paris, France, 17–23 May 2010; European Language Resources Association (ELRA): Valletta, Malta, 2010. [Google Scholar]
- LOUGHRAN, T.; MCDONALD, B. When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. J. Financ. 2011, 66, 35–65. [Google Scholar] [CrossRef]
- Zhang, H.; Li, Z.; Xie, H.; Lau, R.Y.; Cheng, G.; Li, Q.; Zhang, D. Leveraging statistical information in fine-grained financial sentiment analysis. World Wide Web 2022, 25, 513–531. [Google Scholar] [CrossRef]
- Zhao, M.; Yang, J.; Zhang, J.; Wang, S. Aggregated graph convolutional networks for aspect-based sentiment classification. Inf. Sci. 2022, 600, 73–93. [Google Scholar] [CrossRef]
- Li, B.; Chan, K.C.; Ou, C.; Ruifeng, S. Discovering public sentiment in social media for predicting stock movement of publicly listed companies. Inf. Syst. 2017, 69, 81–92. [Google Scholar] [CrossRef]
- Shi, Y.; Liu, W.M.; Ho, K.Y. Public news arrival and the idiosyncratic volatility puzzle. J. Empir. Financ. 2016, 37, 159–172. [Google Scholar] [CrossRef]
- Zhang, G.; Xu, L.; Xue, Y. Model and forecast stock market behavior integrating investor sentiment analysis and transaction data. Clust. Comput. 2017, 20, 789–803. [Google Scholar] [CrossRef]
- Bouktif, S.; Fiaz, A.; Awad, M. Augmented Textual Features-Based Stock Market Prediction. IEEE Access 2020, 8, 40269–40282. [Google Scholar] [CrossRef]
- Shi, Y.; Ho, K.Y.; Liu, W.M. Public information arrival and stock return volatility: Evidence from news sentiment and Markov Regime-Switching Approach. Int. Rev. Econ. Financ. 2016, 42, 291–312. [Google Scholar] [CrossRef]
- Araci, D. FinBERT: Financial sentiment analysis with pre-trained language models. arXiv 2019, arXiv:1908.10063. [Google Scholar]
- Vora, V.; Shah, M.; Chouhan, A.; Tawde, P. Stock Market Prices and Returns Forecasting Using Deep Learning Based on Technical and Fundamental Analysis. In Proceedings of the Information and Communication Technology for Competitive Strategies (ICTCS 2021); Kaiser, M.S., Xie, J., Rathore, V.S., Eds.; Springer Nature Singapore: Singapore, 2022; pp. 717–728. [Google Scholar] [CrossRef]
- Sun, C.; Huang, L.; Qiu, X. Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence. arXiv 2019, arXiv:1903.09588. [Google Scholar]
- Huang, J.; Xing, R.; Li, Q. Asset pricing via deep graph learning to incorporate heterogeneous predictors. Int. J. Intell. Syst. 2022, 37, 8462–8489. [CrossRef]
- Yadav, A.; Jha, C.; Sharan, A.; Vaish, V. Sentiment analysis of financial news using unsupervised approach. Procedia Comput. Sci. 2020, 167, 589–598. [Google Scholar] [CrossRef]
- Yadav, R.; Kumar, A.V.; Kumar, A. News-based supervised sentiment analysis for prediction of futures buying behaviour. IIMB Manag. Rev. 2019, 31, 157–166. [Google Scholar] [CrossRef]
- Vilas, A.F.; Redondo, R.P.D.; Crockett, K.; Owda, M.; Evans, L. Twitter permeability to financial events: An experiment towards a model for sensing irregularities. Multimed. Tools Appl. 2019, 78, 9217–9245. [Google Scholar] [CrossRef]
- Kruiper, R.; Vincent, J.F.; Chen-Burger, J.; Desmulliez, M.P.; Konstas, I. In Layman’s Terms: Semi-Open Relation Extraction from Scientific Texts. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020. [Google Scholar]
- Zhou, S.; Yu, B.; Sun, A.; Long, C.; Li, J.; Yu, H.; Sun, J.; Li, Y. A Survey on Neural Open Information Extraction: Current Status and Future Directions. arXiv 2022, arXiv:2205.11725. [Google Scholar]
- Gurin, Y.; Szymanski, T.; Keane, M.T. Discovering news events that move markets. In Proceedings of the 2017 Intelligent Systems Conference (IntelliSys), London, UK, 21–22 September 2016; IEEE: Piscataway, NJ, USA, 2017; pp. 452–461. [Google Scholar]
- Li, Q.; Chen, Y.; Jiang, L.L.; Li, P.; Chen, H. A tensor-based information framework for predicting the stock market. ACM Trans. Inf. Syst. (TOIS) 2016, 34, 1–30. [Google Scholar] [CrossRef]
- Checkley, M.; Higón, D.A.; Alles, H. The hasty wisdom of the mob: How market sentiment predicts stock market behavior. Expert Syst. Appl. 2017, 77, 256–263. [Google Scholar] [CrossRef]
- Granger, C. Testing for causality: A personal viewpoint. J. Econ. Dyn. Control 1980, 2, 329–352. [Google Scholar] [CrossRef]
- Wei, Y.C.; Lu, Y.C.; Chen, J.N.; Hsu, Y.J. Informativeness of the market news sentiment in the Taiwan stock market. North Am. J. Econ. Financ. 2017, 39, 158–181. [Google Scholar] [CrossRef]
- de Araújo, J.G.; Marinho, L.B. Using Online Economic News to Predict Trends in Brazilian Stock Market Sectors. In Proceedings of the 24th Brazilian Symposium on Multimedia and the Web, Salvador, Brazil, 16–19 October 2018; WebMedia ’18. Association for Computing Machinery: New York, NY, USA, 2018; pp. 37–44. [Google Scholar] [CrossRef]
- Romanov, V.; Naletova, O.; Panteleeva, E.; Federyakov, A. Fractal model of estimating news and insider influence on market volatility. Autom. Doc. Math. Linguist. 2007, 41, 141–149. [Google Scholar] [CrossRef]
- Zhou, W.X. Multifractal detrended cross-correlation analysis for two nonstationary signals. Phys. Rev. E 2008, 77, 066211. [Google Scholar] [CrossRef]
- Alamatian, Z.; Vafaei Jahan, M.; Milani Fard, A. Using Market Indicators to Eliminate Local Trends for Financial Time Series Cross-Correlation Analysis. In Proceedings of the 34th Canadian Conference on Artificial Intelligence (Canadian AI), Vancouver, BC, Canada, 25–28 May 2021. [Google Scholar] [CrossRef]
- Chen, K.; Luo, P.; Liu, L.; Zhang, W. News, search and stock co-movement: Investigating information diffusion in the financial market. Electron. Commer. Res. Appl. 2018, 28, 159–171. [Google Scholar] [CrossRef]
- Omrane, W.B.; Hussain, S.M. Foreign news and the structure of co-movement in European equity markets: An intraday analysis. Res. Int. Bus. Financ. 2016, 37, 572–582. [Google Scholar] [CrossRef]
- Omrane, W.B.; Tao, Y.; Welch, R. Scheduled macro-news effects on a Euro/US dollar limit order book around the 2008 financial crisis. Res. Int. Bus. Financ. 2017, 42, 9–30. [Google Scholar] [CrossRef]
- Birz, G. Stale economic news, media and the stock market. J. Econ. Psychol. 2017, 61, 87–102. [Google Scholar] [CrossRef]
- Fang, L.; Yu, H.; Huang, Y. The role of investor sentiment in the long-term correlation between US stock and bond markets. Int. Rev. Econ. Financ. 2018, 58, 127–139. [Google Scholar] [CrossRef]
- El Akraoui, B.; Daoui, C. Deep Reinforcement Learning for Bitcoin Trading. In Proceedings of the Business Intelligence; Fakir, M., Baslam, M., El Ayachi, R., Eds.; Springer International Publishing: Cham, Switzerland, 2022; pp. 82–93. [Google Scholar] [CrossRef]
- Oliveira, N.; Cortez, P.; Areal, N. The impact of microblogging data for stock market prediction: Using Twitter to predict returns, volatility, trading volume and survey sentiment indices. Expert Syst. Appl. 2017, 73, 125–144. [Google Scholar] [CrossRef]
- Ghasemaghaei, M. The role of positive and negative valence factors on the impact of bigness of data on big data analytics usage. Int. J. Inf. Manag. 2020, 50, 395–404. [Google Scholar] [CrossRef]
- Kiymaz, H. The effects of stock market rumors on stock prices: Evidence from an emerging market. J. Multinatl. Financ. Manag. 2001, 11, 105–115. [Google Scholar] [CrossRef]
- Ranjan, S.; Sood, S. Investor community sentiment analysis for predicting stock price trends. Int. J. Manag. Technol. Eng. 2019, 9, 6012–6020. [Google Scholar]
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692. [Google Scholar]
- Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. arXiv 2020, arXiv:2005.14165. [Google Scholar]
- Zhang, J.; Zhao, Y.; Saleh, M.; Liu, P.J. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. arXiv 2019, arXiv:1912.08777. [Google Scholar]
Source | Papers | No. of Papers | Citation |
---|---|---|---|
Journal of Finance | [56,57,58] | 3 | 6137 |
Journal of Computational Science | [55] | 1 | 2801 |
Expert Systems with Applications | [9,22,47,48,52,59,60,61,62,63,64,65,66] | 14 | 1229 |
Decision Support Systems | [24,67,68,69,70,71,72,73,74] | 9 | 785 |
Information Processing & Management | [28,75,76,77,78] | 5 | 406 |
Knowledge-Based Systems | [12,23,79,80,81,82] | 7 | 162 |
Neurocomputing | [17,83,84,85] | 4 | 158 |
The Journal of Finance and Data Science | [86,87] | 2 | 111 |
International Review of Financial Analysis | [3,88] | 2 | 100 |
International Journal of Data Science and Analytics | [50,51,89,90] | 4 | 28 |
Neural Computing and Applications | [13,91,92] | 3 | 99 |
Applied Soft Computing | [20,93,94] | 2 | 77 |
Physica A: Statistical Mechanics and its Applications | [95,96] | 3 | 50 |
IEEE Transactions | [45,97,98] | 3 | 45 |
AAAI Conference on Artificial Intelligence | [99,100] | 2 | 1080 |
Empirical Methods in Natural Language Processing (EMNLP) | [101,102] | 2 | 320 |
Association for Computational Linguistics (ACL) | [21,25,103,104] | 4 | 232 |
Category | Literature | Data | Market | Media Source | Duration | Feature Selection | Time Frame | Machine Learning |
---|---|---|---|---|---|---|---|---|
BOW | [69] | News | S&P500 | Yahoo Finance | October 2005–November 2005 | POS | 20 min | SVR |
BOW | [70] | News | DGAP | DGAP and EuroAdhoc | 1997–2011 | Two-word combination | Intraday | SVM |
BOW | [60] | News | Forex | MarketWatch.com | 2008–2011 | Ontology | 2 h | SVM |
BOW | [15] | News | Forex | MarketWatch.com | 2008–2011 | POS | Intraday | SVM |
BOW | [91] | News | NYSE and NASDAQ | MarketWatch.com | 2013 | BOW | Daily | NN |
Concept Based | [125] | Social media | 2008 America bank crisis | Yahoo! Finance | January 2008–December 2008 | LDA | 20 min | Regression |
Concept Based | [134] | News | CSI100 | Hexun | January 2015–December 2015 | LDA | Daily | Naive Bayes |
Concept Based | [86] | News | NASDAQ | Reuters | September 2011–September 2012 | LDA | Minutes | Naive Bayes |
Concept Based | [105] | News | Forex | Reuters | 2012–2016 | LDA | Daily | MLP |
Concept Based | [119] | News | CDAX | Website of the EQS Group | July 1996–April 2006 | Latent variable | Monthly | Lasso regression |
Word Embedding | [122] | News | S&P500 | Thomson Reuters, Bloomberg | 2006–2014 | word2vec | Day–week | LSTM |
Word Embedding | [18] | News | S&P500 | Reuters | 2006–2013 | word2vec | Daily | HCAN |
Word Embedding | [79] | News, social media | HK and CSI100 | Xueqiu, Guba, Sina, and Hexun | January 2015–December 2015 | word2vec | Daily | MFC |
Word Embedding | [133] | News | TWSE | TWSE Official Website | 2007–2017 | word2vec | Daily | CNN-LSTM |
Word Embedding | [22] | News | CDAX | DGAP | January 2001–September 2017 | Doc2vec | Daily | LSTM |
Word Embedding | [129] | News | Indian stock market | Indian news wires | January 2013–December 2016 | Paragraph2vec | Daily | LSTM |
Word Embedding | [35] | News | S&P 500 | Reuters and Bloomberg | October 2006–October 2013 | Word embedding | Daily | LSTM |
Word Embedding | [21] | News | Forex | Reuters | 2013–2017 | BERT word embedding | Daily | LSTM |
Word Embedding | [25] | News | S&P500 | Google Trends | January 2004–December 2015 | BERT word embedding | Weekly | NN |
Category | Literature | Media Source | Market | Duration | Feature Selection | Time Frame | Machine Learning |
---|---|---|---|---|---|---|---|
News / Lexicon Based | [62] | Belgian financial newspaper De Tijd | Dutch Company | May 2012 | BoW | Daily | SVM |
News / Lexicon Based | [72] | | | December 2012–October 2015 | LSA | Daily | Regression |
News / Lexicon Based | [58] | Wall Street Journal | DJIA | 1935–1961 | News count | Daily | Regression |
News / Machine Learning | [23] | Sina Weibo | Shanghai Stock SSEC | December 2014–April 2016 | BoW | Intraday | Logistic regression |
News / Machine Learning | [148] | Reuters and Moneycontrol | NIFTY | Five months | POS | Daily | SVM |
News / Machine Learning | [14] | LexisNexis | NIFTY | | BoW | Monthly | ARM |
News / Deep Learning | [142] | Yahoo Finance | NASDAQ-100 | 2008–2018 | BoW | Daily | MLP |
News / Deep Learning | [22] | Deutsche Gesellschaft (DGAP) | CDAX | January 2001–September 2017 | Doc2vec | Daily | LSTM |
News / Deep Learning | [12] | Fxstreet, NewsBTC, Cointelegraph | Forex, Cryptocurrency | October 2018–July 2021 | FinBERT | Hourly | LSTM |
Social Media / Lexicon Based | [149] | Indian Financial | NIFTY | January 2009–December 2009 | BoW | Hourly | SVM |
Social Media / Lexicon Based | [90] | Yahoo Finance, Reuters | Tokyo Stock Exchange | September 2015 and January 2007–December 2016 | Word embedding | Daily | MLP |
Social Media / Lexicon Based | [79] | Sina, Hexun websites | HK, CSI100 | January 2015–December 2015 | Word embedding | Daily | Tensor decomposition |
Social Media / Lexicon Based | [139] | | NYSE | October 2011–March 2012 | BoW | Daily | ARM |
Social Media / Machine Learning | [55] | | Dow Jones | February 2008–December 2008 | News count | Daily | Fuzzy neural network |
Social Media / Machine Learning | [110] | Reuters and Forbes | Amazon, Google | 2017 | BoW | Daily | Lasso regression, SVR |
Social Media / Machine Learning | [141] | Baidu news | Shanghai 50ETF | 2008–2015 | BoW | Daily | Naive Bayes, LSTM |
Social Media / Deep Learning | [124] | Tencent, Ping An, CCB, Weibo | Hong Kong Market | January 2016–December 2018 | Word embedding | Daily | LSTM, VAR |
Category | Literature | Media | Market | Duration | Feature Selection | Time Frame | Machine Learning |
---|---|---|---|---|---|---|---|
Anomalous changes | [10] | | Dow Jones, S&P500 | September 2013–September 2015 | Hashtag | Daily | SVR |
Anomalous changes | [150] | | Tesco, Booker | January 2017 | Hashtag | Minutes | Naive Bayes |
Anomalous changes | [153] | Irish Farmers Journal | Irish Beef Market | 2005–2015 | BOW | Day–week–month | SVM, log-likelihood |
OpenIE | [101] | Reuters, Bloomberg | S&P 500 | October 2006–November 2013 | OpenIE-based tuples | Day–week–month | MLP |
OpenIE | [99] | Reuters, Bloomberg | S&P500 | October 2006–November 2013 | OpenIE-based tuples | Day–week–month | MLP |
OpenIE | [19] | Reuters, Bloomberg | S&P500 | October 2006–November 2013 | Event embedding | Daily | SVM |
Knowledge Graph | [24] | GF Securities | Airbnb Market | May 2017–May 2018 | word2vec | News release time | Bi-LSTM |
Knowledge Graph | [39] | Reuters, Bloomberg | S&P 500 | October 2006–November 2013 | Word embedding | Daily | Bi-LSTM |
Knowledge Graph | [61] | ifeng.com (Financial China) | SZ002424 | September 2012–March 2017 | 2-gram | Daily | Kernel SVM |
Knowledge Graph | [40] | Reddit WorldNews Channel | DJIA | July 2008–January 2016 | Price vector and event embedding | Daily | Temporal CNN |
Knowledge Graph | [102] | | S&P500 | January 2013 | BoW | Hourly | Vector auto-regression model |
Literature | Media | Market | Duration | Feature Selection | Time Frame | Machine Learning |
---|---|---|---|---|---|---|
[92] | Caihua | HSI | 2001 | BoW | Daily | MLP |
[154] | Sina, eastmoney | CSI100 | 2011 | BoW | Minute | High-order tensor regression |
[23] | Sina Weibo | Shanghai Stock SSEC | September 2014–April 2016 | Tensor decomposition | Intraday | Logistic regression |
[21] | Thomson Reuters | Forex | 2013–2017 | BERT word embedding, MLP-based feature extraction | Intraday | LSTM |
[104] | | NASDAQ | 2014–2016 | Word embedding, latent variable extraction | Daily | Bi-GRU |
[20] | CITIC Securities, GF Securities, China Pingan | Chinese stock | March 2012–July 2018 | Graph embedding | Daily | Bi-LSTM |
Literature | Sentiment Analysis | Market | Analysis Method | Time Frame | Parameter | Finding |
---|---|---|---|---|---|---|
[96] | Yes | Forex | Mathematical model | 5 min | volatility | Their findings show that the model with multiplicative noise can reproduce the dynamics observed in the real financial market affected by the arrival of high-impact news. |
[158] | No | FBOVESPA | Kendall cross correlation | 15 min | average price, trading volume | Besides demonstrating that sectors are influenced differently by news, they show that the machine learning models used performed better than random and other less complex baselines for all sectors in FBOVESPA, which indicates that news carries a significant signal for understanding BM and FBOVESPA dynamics. |
[126] | Yes | NYSE | Regression | Daily | trading volume, volatility | Their results suggest that significant price discovery related to news stories occurs through institutional trading before the news announcement date. |
[128] | No | US | Copulas Statics | Daily | equity returns | The findings show that the market reacts strongly and negatively to the most unfavorable macroeconomic news but appears to largely discount good news. |
[165] | Yes | S&P 500 | Correlation | Week | volatility | Their findings show a statistically and economically significant relationship between stale news stories on unemployment and next week’s S&P 500 returns. This effect is then completely reversed during the following week. |
[95] | No | Bitcoin market | MF-DCCA | Daily | trading volume | By employing the Multifractal Detrended Cross-Correlation Analysis method, they find that the cross-correlation between the change in Google Trends (CGT) and the Bitcoin market (returns and changes in volume) shows a higher overall degree of multifractality in the long term and weak multifractality in the short term. |
[166] | Yes | S&P 500 | DCCA-MIDAS | Daily | return | The results show that the composite index of investor sentiment has a significantly positive influence on the long-term stock–bond correlation, and the shock of crises significantly decreases the average correlation, but the effect of sentiment does not change significantly. |
[87] | Yes | FTSE 100 | Correlation | Daily | close price | Their findings show there is evidence of causation between public sentiment and the stock market movements, in terms of the relationship between MOOD and the daily closing price, and the time-lag findings of MOOD and PRICE. |
[163] | No | DAX | Impulse response analysis | 5 min | volatility | They show that 50 percent of the total accumulated impact of US macroeconomic news on the DAX 30 and CAC 40 volatilities is attained after 90 min. |
[164] | No | Forex | Impulse response analysis | Intraday | volatility | News surprise moderates extreme pure news effects that have nearly all positive (negative) coefficients in both regimes for volatility and depth (spread). In addition, volatility and depth respond positively to good and bad unscheduled news in both states (with more intensity during the expansion), while spread decreases in both states. |
Literature | Media | Market | Duration | Feature Selection | Machine Learning | Evaluation Metrics |
---|---|---|---|---|---|---|
[84] | Thomson Reuters | S&P500 | 2006–2014 | BoW | ListNet, RankNet | Sharpe ratio, MDD |
[133] | TWSE Official Website | TWSE | 2007–2017 | word2vec | CNN, LSTM | RMSE, profit |
[85] | | NASDAQ, S&P 500 | 1 August 2012–30 January 2015 | BoW | Genetic programming | Sterling ratio, Sharpe ratio, profit |
[73] | Reuters | S&P500 | September 2006–August 2007 | BOW | NN, DT, SLR | Profit |
[74] | DGAP | CDAX | 2004–2011 | BOW | Reinforcement learning | Profit |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Farimani, S.A.; Jahan, M.V.; Milani Fard, A. From Text Representation to Financial Market Prediction: A Literature Review. Information 2022, 13, 466. https://doi.org/10.3390/info13100466