Article

Fin-ALICE: Artificial Linguistic Intelligence Causal Econometrics

Department of Computer Science and Engineering, University of Colorado Denver, Denver, CO 80204, USA
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
J. Risk Financial Manag. 2024, 17(12), 537; https://doi.org/10.3390/jrfm17120537
Submission received: 20 October 2024 / Revised: 20 November 2024 / Accepted: 25 November 2024 / Published: 26 November 2024
(This article belongs to the Section Financial Technology and Innovation)

Abstract

This study introduces Fin-ALICE (Artificial Linguistic Intelligence Causal Econometrics), a framework designed to forecast financial time series by integrating multiple analytical approaches including co-occurrence networks, supply chain analysis, and emotional sentiment analysis to provide a comprehensive understanding of market dynamics. In our co-occurrence analysis, we focus on companies that share the same emotion on the same day, using a much shorter horizon than our previous study of one month. This approach allows us to uncover short-term, emotion-driven correlations that traditional models might overlook. By analyzing these co-occurrence networks, Fin-ALICE identifies hidden connections between companies, sectors, and events. Supply chain analysis within Fin-ALICE will evaluate significant events in commodity-producing countries that impact their ability to supply key resources. This analysis captures the ripple effects of disruptions across industries and regions, offering a more nuanced prediction of market movements. Emotional sentiment analysis, powered by the Fin-Emotion library developed in our prior research, quantifies the emotional undertones in financial news through metrics like “emotion magnitude” and “emotion interaction”. These insights, when integrated with Temporal Convolutional Networks (TCNs), significantly enhance the accuracy of financial forecasts by capturing the emotional drivers of market sentiment. Key contributions of Fin-ALICE include its ability to perform month-by-month company correlation analysis, capturing short-term market fluctuations and seasonal patterns. We compare the performance of TCNs against advanced models such as LLMs and LSTMs, demonstrating that the Fin-ALICE model outperforms these models, particularly in sectors where emotional sentiment and supply chain dynamics are critical. 
Fin-ALICE provides decision-makers with predictive insights and a deeper understanding of the underlying emotional and supply chain factors that drive market behaviors.

1. Introduction

Financial market analysis has become increasingly complex, requiring the integration of diverse data sources and advanced analytical techniques to navigate the intricate landscape of global markets. The period from 2019 to 2021, characterized by geopolitical shifts, economic sanctions, pandemics, natural disasters, and wars, presented a particularly volatile and challenging environment for financial decision-making (McCarthy and Alaghband 2023a). In this context, the incorporation of emotional sentiment analysis into traditional market analysis has emerged as a promising approach to gain a richer understanding of market dynamics (McCarthy and Alaghband 2023b).
The assumption that sentiment analysis can enhance the capacities of stock market forecasting models has been a driving force behind recent research efforts (Liapis and Karanikola 2023). Numerous studies have explored the application of deep learning techniques, such as Temporal Convolutional Networks (TCNs) (Liapis and Kotsiantis 2023; Guo et al. 2023; Dai et al. 2022; Guo et al. 2022; Cen et al. 2022), and the integration of sentiment analysis (Correia et al. 2022; Chong et al. 2017) to improve stock market prediction. These approaches have shown promising results, outperforming traditional econometric models (Pothepalli 2021).
Moreover, the rapid advancements in large language models (LLMs) have sparked interest in their potential applications in the financial domain. Models such as FinBERT-LSTM (Halder 2022), Lag-Llama (Rasul et al. 2024), and FinGPT (Liu et al. 2023; Wang et al. 2023) have been developed to address the unique challenges of financial text data. Recent studies have explored the capacity of LLMs like ChatGPT in forecasting stock price movements (Lopez-Lira et al. 2023), highlighting both the potential and limitations of these models in financial prediction tasks. Despite the impressive capabilities of LLMs in natural language processing tasks, their performance in time series forecasting tasks has lagged behind specialized models like TCNs (Jin et al. 2023; Miller et al. 2024).
Recent studies have also highlighted the importance of considering supply chain dynamics in financial market analysis. McCarthy and Alaghband (2023b) propose a multidimensional data science framework that synthesizes data on world economies, stock metrics, significant market events, and knowledge concepts to construct a knowledge graph that captures the relationships between supply chain disruptions, corporations, and commodities.
Traditional approaches to analyzing company correlations often rely on longer-term averages, which can obscure important short-term dynamics and events that drive market behaviors. In this study, we shift to a month-by-month correlation analysis, to better capture the evolving nature of these relationships and align them with specific economic, political, or industry-specific events. This granular approach not only enhances our understanding of the mechanisms by which news and market movements propagate through the corporate ecosystem but also provides a more responsive and adaptable framework for risk management and portfolio optimization.
Building upon these developments, we introduced a novel quantitative metric called “emotion magnitude” that captures the emotional undercurrents of the market (McCarthy and Alaghband 2023b). By integrating this metric with traditional time series analysis using TCNs applied to stock market futures, we demonstrated a more holistic understanding of market dynamics. Our findings suggest that incorporating emotion magnitude as a feature leads to significantly better performance in predicting future market trends compared to traditional market-based risk measures. This granular approach enhances our understanding of how news and market movements propagate through the corporate ecosystem by focusing on the most recent events and interactions. For example, in the shorter one-month horizon, we identified a significant co-occurrence between Goldman Sachs and Amazon in March 2019. This relationship was not visible in a two-year aggregated view but becomes relevant when considering Goldman Sachs’ digital strategy, which included collaboration with Amazon Web Services (AWS) during this period. Similarly, in January 2020, we observed a correlation between Boeing and Honeywell. This is further validated by a news article in March 2020, where Boeing and Honeywell formally announced a partnership, reflecting the utility of short-term correlation analysis in capturing emerging relationships.
In this paper, we present a comparative study using Fin-ALICE (Artificial Linguistic Intelligence Causal Econometrics), a comprehensive framework for financial time series forecasting that integrates emotional analysis (i.e., artificial linguistic intelligence) of financial news, co-occurrence networks (i.e., causal econometrics), and supply chain analysis. By leveraging a shorter time horizon in our co-occurrence network, we enable more near-term analysis, while incorporating a feature selection process to dynamically identify the best-correlated features across sectors. We focus specifically on how the integration of emotional sentiment and supply chain analysis enhances short-term market trend predictions. Our goal is to demonstrate that Fin-ALICE not only outperforms LSTM- and LLM-based forecasting models but also offers more granular insights into market behaviors by considering emotional and supply chain factors. The outcomes of our research provide a robust analytical tool for financial risk strategy, empowering stakeholders to navigate the complexities of the ever-evolving global financial ecosystem with greater precision and understanding.
This paper is structured as follows: Section 2 presents related work. Section 3 describes the proposed work, including the emotion interaction feature. Section 4 presents the methods and materials used in our research for the development of the Fin-ALICE framework, including the integration of emotional analysis, supply chain data, and advanced deep learning techniques such as Temporal Convolutional Networks (TCNs) and large language models (LLMs). Section 5 provides a conclusion on the improved performance of the emotion interaction feature and the month-by-month correlation analysis, as well as findings related to the relationships between commodity-producing countries, supply chain events, top companies by sector, and their impact on market dynamics.

2. Related Work (Used for Comparative Study)

The intersection of deep learning and financial market analysis has given rise to innovative approaches that aim to enhance forecasting accuracy by integrating various data sources and techniques. This section reviews the related works that have contributed to the development of models combining emotional sentiment analysis, supply chain dynamics, and advanced deep learning techniques for financial market prediction.
For each related work presented, we detail the methodologies used, their contributions to the field, and how they are implemented or adapted in our study for direct comparison with our proposed model. This comparative analysis provides a clear understanding of where current methodologies excel, their limitations, and how our proposed approach addresses these gaps to advance the state of the art in financial time series forecasting.

2.1. LSTM + BERT

The integration of Long Short-Term Memory (LSTM) networks with Bidirectional Encoder Representations from Transformers (BERT) has been explored to improve the handling of sequential and textual data in financial forecasting. The model FinBERT-LSTM, introduced by Halder (2022), leverages the contextual understanding of BERT to process financial news and the temporal capabilities of LSTM for sequential data. This hybrid approach aims to capture complex patterns in financial time series data, demonstrating enhanced prediction accuracy compared to traditional models. The implementation details and further explorations are available in the code repository (xraptorgg 2024). In our work, we compare our TCN model with the FinBERT-LSTM model and also compare the FinBERT LLM model’s emotional analysis capabilities with our emotion interaction feature. FinBERT LLM (Araci 2019) is a large language model specifically fine-tuned on financial texts, distinct from LSTM in its architecture and ability to process and generate human-like text. FinBERT LLM can understand and generate complex financial language, potentially offering different insights into sentiment analysis.

2.2. Temporal Convolutional Networks (TCNs) + MultiLabel BERT

Temporal Convolutional Networks (TCNs) combined with MultiLabel BERT models have shown promise in extracting nuanced emotional sentiment from financial news to inform market predictions. Guo et al. (2023) describe a methodology where textual data are pre-processed and passed through sentiment and emotion classification modules. The study utilizes various sentiment analysis tools like TextBlob, Vader, and FinBERT to generate sentiment polarity time series, along with 28 distinct emotion time series. A feature selection procedure, employing the SelectKBest class from the scikit-learn library (Pedregosa et al. 2011), refines these inputs for the TCN model. Despite challenges in accessing the original code, the procedure outlined in their study offers valuable insights into the integration of sentiment and deep learning models for financial analysis.
In our study, we replicated this multilabel emotion classification process using 28 emotion categories based on the GoEmotions dataset (Monologg 2024; Demszky et al. 2020), and included all sentiment analysis tools mentioned in the paper. We applied the SelectKBest feature selection and a 7-day rolling mean to smooth the generated time series data. Leveraging our TCN network architecture, we evaluated the effectiveness of each feature in predicting financial trends. Additionally, we introduced our own emotional features, such as emotion interaction, to assess their comparative performance.
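The smoothing step mentioned above can be illustrated with a minimal sketch of a trailing 7-day rolling mean. The function name `rolling_mean` is hypothetical, and the treatment of the first six days (averaging only the values seen so far) is an assumption, since the text does not say how the start of the series is handled. The SelectKBest step is not reproduced here.

```python
from collections import deque
from typing import Iterable, List


def rolling_mean(values: Iterable[float], window: int = 7) -> List[float]:
    """Trailing rolling mean over `window` observations.

    Assumption: until `window` values are available, the mean is taken
    over the values seen so far (an expanding window at the start).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)
    running_sum = 0.0
    smoothed = []
    for v in values:
        if len(buf) == window:
            running_sum -= buf[0]  # oldest value is evicted by the append below
        buf.append(v)
        running_sum += v
        smoothed.append(running_sum / len(buf))
    return smoothed
```

Applied to a daily emotion-count series, each output value is the average of that day and up to six preceding days, which damps single-day spikes before the series is fed to a model.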
This replication enabled us to directly compare the impact of various emotion classification methods on the predictive power of the TCN model. Our results show that while traditional classification tools like TextBlob and Vader provide good baseline performance, the inclusion of multilabel emotion classification and our custom emotional features significantly enhance model accuracy by capturing more complex emotional dynamics within financial news.

2.3. FinGPT Forecaster

The FinGPT Forecaster, as detailed by Liu et al. (2023), explores the potential of fine-tuning large language models (LLMs) for financial forecasting. Although not primarily designed for time series, FinGPT demonstrates the adaptability of LLMs in the financial domain through prompt engineering and fine-tuning. The implementation focuses on the Dow Jones Industrial Average (DJIA) using the Llama2-7B model, which is employed for natural language processing tasks such as sentiment analysis and news interpretation, highlighting the versatility of LLMs in financial applications. The corresponding code repository provides further implementation details and potential modifications for time series analysis (AI4Finance-Foundation 2024). In this research, we give the FinGPT forecaster a 7-day forecasting task, and provide the forecaster with time series information from the previous 30 days to predict the next 7 days. We compare the performance of our TCN model with the FinGPT Forecaster model.
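The 30-day context and 7-day forecast setup can be sketched as a sliding-window split. This is an illustrative helper only: `make_windows` is a hypothetical name, and the one-day stride between consecutive windows is an assumption not stated in the text.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], lookback: int = 30, horizon: int = 7
) -> List[Tuple[List[float], List[float]]]:
    """Split a series into (context, target) pairs.

    Each context holds `lookback` consecutive values and each target holds
    the `horizon` values that immediately follow. Assumption: stride of 1.
    """
    samples = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        context = list(series[start:start + lookback])
        target = list(series[start + lookback:start + lookback + horizon])
        samples.append((context, target))
    return samples
```

A series shorter than `lookback + horizon` yields no samples, so the caller can detect when there is not enough history for a forecast.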

2.4. Time-LLM

Jin et al. (2024) introduce Time-LLM, a novel approach that reprograms large language models for time series forecasting. This methodology leverages the inherent capabilities of LLMs in processing sequential data, adapting them specifically for time series tasks. The model demonstrates significant improvements in forecasting accuracy, providing a robust framework for incorporating LLMs into financial market analysis. The official implementation offers comprehensive details on the model’s architecture and performance (KimMeen 2024). The Time-LLM model depends on the previous five closing prices for each forecast day. In our research, we compare the performance of our TCN model with the Time-LLM model.

2.5. Lag-Llama

The Lag-Llama model, presented by Rasul et al. (2024), addresses the challenges of time series forecasting by integrating large language models with lagged data inputs. This approach emphasizes the importance of temporal dependencies and lag structures in financial data, enhancing the model’s predictive capabilities. The implementation details and further exploration of this model are available in their code repository (time-series-foundation-models 2024). In this research, we give the Lag-Llama model a 7-day forecasting task, and provide the forecaster with time series information from the previous 30 days to predict the next 7 days. We compare the performance of our TCN model with this large language model.

2.6. Supply Chain Dynamics and Sentiment Analysis

In McCarthy and Alaghband (2023b), we proposed a multidimensional data science framework that integrates supply chain dynamics with sentiment analysis for long-term financial market prediction over a two-year period. In this study, we build upon that work by adapting the framework for a monthly analysis and introducing a new feature, emotion interaction. In addition, we refine several existing state-of-the-art models and perform a significant comparative study, which demonstrates that our enhanced model outperforms these models in predictive accuracy. In our approach, we synthesize data on world economies, stock metrics, significant market events, and knowledge concepts to construct a knowledge graph. This graph captures the relationships between supply chain disruptions, corporations, and commodities, providing a comprehensive understanding of market dynamics. This work highlights the potential of combining diverse data sources to enhance financial forecasting models.
We implemented a comprehensive data science framework that integrates diverse data sources to construct a knowledge graph for financial market analysis. The first step in this framework was to collect the necessary data, which include the following:
  • Macroeconomic Data: Macroeconomic data were sourced from the Federal Reserve Economic Data (FRED), providing insights into economic indicators and trends that influence broader market conditions.
  • Sector-specific Financial Market Data: Sector-related financial data were obtained from Yahoo Finance using sector-specific indices and exchange-traded funds (ETFs) for the top 10 companies in each sector. The sector indices used include the following:
    • Consumer Staples: XAP=F.
    • Utilities: XAU=F.
    • Materials: XAB=F.
    • Consumer Discretionary: XLY.
    • Healthcare: XAV=F.
    • Real Estate: XLRE.
    • Energy: XAE=F.
    • Industrials: XAI=F.
    • Financials: XAF=F.
    • Technology: XAK=F.
    • Telecommunications: XAZ=F.
    • Gold: QO=F.
    These indices were used as proxies to represent the overall performance of each sector and capture key market movements that could impact sentiment and company performance.
  • Financial News and General News Articles: Financial news articles were collected from a variety of sources, including well-known financial platforms such as Nasdaq, Barrons, TheStreet, Investing.com, Forbes, MarketWatch, and Bloomberg. To complement this, we also included news articles from general and international sources like The New York Times, The Washington Post, Reuters, Fox News, CNN, BBC, and CNBC to provide a broader context of events impacting market sentiment.
  • Emotion Analysis in News Data: The news data were processed through our emotion library and various sentiment analysis models, including TextBlob, Vader, FinBERT, and a custom multilabel emotion classification model based on the GoEmotions dataset to support our comparison. This allowed us to extract not only polarity (positive, neutral, negative) but also nuanced emotional categories that could indicate investor sentiment. These enhancements represent an improvement over our previous methodology and are necessary to compare the performance of our new features with those used in state-of-the-art models.
  • Knowledge Graph Construction and Feature Integration: Using the collected data, we constructed a knowledge graph that captures the relationships between supply chain disruptions, corporations, and commodities. The framework synthesizes the data to model interactions at multiple levels, including global economic indicators, sector-specific movements, and company-level activities. We incorporated a smaller horizon as a parameter to ensure the knowledge graph captures short-term market dynamics and evolving relationships between companies.
Building on this framework, we introduced a novel quantitative metric called “emotion magnitude” to capture the emotional undercurrents of the market. By integrating emotion magnitude with traditional time series analysis using TCNs, we demonstrated a significant improvement in predicting future market trends. Our findings indicate that incorporating emotional sentiment as a feature can lead to better performance compared to traditional market-based risk measures, particularly in sectors where emotional responses and supply chain dynamics play a critical role.
This combined approach underscores the value of synthesizing emotion metrics with supply chain data to provide a more nuanced understanding of market behaviors. It highlights the synergy between short-term emotional fluctuations and long-term supply chain effects, paving the way for more accurate and robust financial and market-based risk forecasting models.
In the following sections, we will detail our proposed work and methods, present our findings, and discuss their implications for the field of financial forecasting.

3. Proposed Work

The landscape of financial market analysis has evolved significantly, and the reviewed studies underscore the potential of integrating deep learning techniques, sentiment analysis, and supply chain dynamics. Models like FinBERT-LSTM, TCN with MultiLabel BERT, FinGPT Forecaster, Time-LLM, and Lag-Llama showcase advancements in this domain, each contributing unique methodologies and insights. These developments, along with the integration of emotional sentiment analysis, large language models (LLMs), and granular company correlation analysis, have led to more sophisticated forecasting models. Building on the landscape of financial market analysis described above, our study will introduce several key innovations:
  • Enhanced Emotional Sentiment Analysis: We expand on our previous work with the “emotion magnitude” metric and introduce a new feature called “emotion interaction”. The emotion interaction feature is derived from the company count (how often each company appears in news articles) and emotion sentiment data. This feature captures the interplay between emotional sentiment and company activity levels, providing a more refined view of market dynamics in a single metric. We integrate these features with Temporal Convolutional Networks (TCNs) with the goal of improving the prediction of future market trends.
  • Leveraging Large Language Models (LLMs): We explore the potential of LLMs in financial analysis by comparing their time series forecasting capabilities against temporal models such as LSTM and TCN. We enhanced the FinGPT model by incorporating supply chain analysis along with 30 days of time series data and the emotion magnitude feature, embedding knowledge of sentiment-driven supply chain events. In contrast, the Lag-Llama and Time-LLM architectures did not support the incorporation of additional data beyond the time series itself, limiting their ability to leverage external features. Our study evaluates whether these LLMs can perform as well as temporal models in the financial domain.
  • Multidimensional Data Science Framework: We adapted our comprehensive framework that synthesizes data on world economies, stock metrics, and market events to construct a knowledge graph by incorporating a shorter time horizon as a parameter. This modification allows the multidimensional framework to improve the accuracy and depth of financial forecasts compared to existing models, making it more adaptable to rapidly changing market conditions. These findings offer a more dynamic understanding of complex relationships and correlations between different companies and sectors in the economy.

Emotion Interaction Calculation

The emotion interaction is calculated by combining the scaled emotion scores with a scaled version of the company count (how often a company is mentioned). The company count is cubed to give more importance to companies that are frequently mentioned. The equation is defined as follows:
$$EI_i = w_e \cdot \frac{E_i - E_{\min}}{E_{\max} - E_{\min}} + w_c \cdot \frac{C_i^3 - C_{\min}^3}{C_{\max}^3 - C_{\min}^3}$$
  • $EI_i$ is the emotion interaction for the size of fear emotion data within the time series news articles at index $i$;
  • $E_i$ is the emotion score (count of fear articles) at index $i$;
  • $C_i$ is the company count (count of a company being mentioned in the news for a day) at index $i$;
  • $w_e = 0.9$ is the weight assigned to the emotion score;
  • $w_c = 0.1$ is the weight assigned to the company count;
  • $E_{\min}$, $E_{\max}$ are the minimum and maximum values of the emotion scores;
  • $C_{\min}$, $C_{\max}$ are the minimum and maximum values of the company count.
This equation balances two key factors: the emotional sentiment extracted from news data and the frequency with which a company is mentioned (company count). The sentiment is given more importance with a weight of $w_e = 0.9$, while the company count has a lower weight of $w_c = 0.1$. Cubing the company count ($C_i^3$) emphasizes companies that are mentioned frequently, as they are likely to have a more significant impact on market sentiment. The weights ($w_e = 0.9$ and $w_c = 0.1$) were selected after extensive experimentation to prioritize the stronger predictive influence of emotional sentiment while incorporating company count as a complementary feature. The cubic transformation of company count ($C^3$) emphasizes high values, effectively capturing the influence of frequently mentioned companies without overwhelming the emotional component, ensuring both features contribute meaningfully to the model’s predictive accuracy. The cubic transformation combined with weighted scaling provided the best results by effectively balancing the two features.
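As an illustration, the weighted combination can be written in a few lines of Python: min-max scale the emotion scores, min-max scale the cubed company counts, and add them with weights 0.9 and 0.1. This is a minimal sketch, not the authors' implementation. The function names are hypothetical, and mapping a constant series to 0 (to avoid division by zero) is an assumption.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]. Assumption: a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def emotion_interaction(
    emotion_scores: Sequence[float],
    company_counts: Sequence[float],
    w_e: float = 0.9,
    w_c: float = 0.1,
    power: int = 3,
) -> List[float]:
    """Weighted sum of scaled emotion scores and scaled cubed company counts."""
    if len(emotion_scores) != len(company_counts):
        raise ValueError("both series must have the same length")
    scaled_emotion = min_max_scale(emotion_scores)
    # Cube the counts first, then scale, so frequent mentions are emphasized.
    scaled_count = min_max_scale([c ** power for c in company_counts])
    return [w_e * e + w_c * c for e, c in zip(scaled_emotion, scaled_count)]
```

For example, with emotion scores `[0, 5, 10]` and company counts `[1, 2, 3]`, the middle entry is 0.9 · 0.5 + 0.1 · (8 − 1)/(27 − 1), so the cubed count contributes only a small correction on top of the emotion term.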
Our study investigated various methods for calculating emotion interaction, including direct multiplication, simple addition, logarithmic scaling, rank-based weighting, categorical weighting, normalized scaling without power transformation, exponential weighting, and emotion and count addition with exponential moving average. These approaches ranged from straightforward calculations to more complex weighted and normalized techniques, each aiming to capture the relationship between emotional sentiment and company mention frequency in financial news. By using the final best-performing equation of weighted combination, the model captures both the intensity of emotions and the prominence of companies in the news, which together provide a more nuanced understanding of investor sentiment and market dynamics.

4. Methods

In this section, we briefly describe the process we used to compare the performance of all the emotion- and sentiment-related features with the month-by-month correlation analysis. We also provide an overview of the data sources, the emotion classification model, LLMs, and the Temporal Convolutional Networks (TCNs) used in our research. We then present the results of our analysis and discuss the implications of our findings.
In this study, we compared various advanced neural network models to evaluate their effectiveness in predicting financial market trends based on a diverse set of features. Figure 1 presents an overview of the different network architectures and feature sets used in our analysis.

4.1. Data Sources and Processing

This study builds upon and enhances our Fin-SupplyChain dataset which integrates financial news data (e.g., Bloomberg, Reuters) from 2019 to 2021, focusing on significant supply chain events such as the U.S.–China trade war, economic sanctions on countries like Iran, natural disasters (e.g., hurricanes, wildfires), pandemics (e.g., COVID-19), and armed conflicts. Data were extracted from financial news articles aligned with commodities such as petroleum, natural gas, gold, and integrated circuits, which were cross-referenced with trade patterns from the Observatory of Economic Complexity (OEC 2023). Emotion annotations, particularly focused on fear, were added using the Fin-Emotion library (FinEmotion 2023).
For this study, we enhanced our dataset by incorporating additional sentiment and emotion analysis techniques necessary for the comparison study as shown in Figure 1. We developed a data processing pipeline that extends our previous work to perform the following steps for each sector:
  • Extracts Relevant Information: Identifies supply chain issues (e.g., pandemic, natural disaster), commodities (e.g., integrated circuits), and countries (e.g., Taiwan, China) mentioned in each article using a custom entity extraction function.
  • Performs General Sentiment Analysis: Evaluates the overall sentiment polarity of the article using general-purpose tools like TextBlob and VADER (e.g., detecting positive or negative sentiment in a news article related to the companies identified), for comparison with the features used in the TCN + MultiLabel BERT model.
  • Conducts Finance-specific Sentiment Analysis: Analyzes sentiments tailored to financial contexts using FinBERT for comparison with FinBERT-LSTM (e.g., identifying positive or negative sentiment in financial articles based on a model trained on financial terms).
  • Classifies Emotions: Categorizes the financial news text into multiple emotion categories for a more nuanced analysis using GoEmotions through multilabel emotional classification (e.g., detecting multiple emotions such as “optimism” and “fear” in financial news articles).
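The first pipeline step, entity extraction, can be sketched as a keyword lookup. The custom extraction function used in the study is not described in detail, so everything below is a hypothetical stand-in: the term lists, the function names, and the whole-word case-insensitive matching are all assumptions for illustration.

```python
import re
from typing import Dict, List, Sequence

# Hypothetical term lists, drawn from examples mentioned in the text.
SUPPLY_CHAIN_ISSUES = ["pandemic", "natural disaster", "trade war", "sanctions"]
COMMODITIES = ["integrated circuits", "petroleum", "natural gas", "gold"]
COUNTRIES = ["Taiwan", "China", "Iran"]


def find_terms(text: str, terms: Sequence[str]) -> List[str]:
    """Return the terms that occur in the text as whole words (case-insensitive)."""
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def extract_entities(article_text: str) -> Dict[str, List[str]]:
    """Tag an article with the issues, commodities, and countries it mentions."""
    return {
        "issues": find_terms(article_text, SUPPLY_CHAIN_ISSUES),
        "commodities": find_terms(article_text, COMMODITIES),
        "countries": find_terms(article_text, COUNTRIES),
    }
```

A production pipeline would likely need synonym handling and disambiguation (for instance, "gold" as a commodity versus "gold standard"), which this sketch does not attempt.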
This enhanced dataset, which we call ‘processed_commodity_sector_news_sentiment’, serves as input for our main analysis. Table 1 provides an example record from this combined dataset.
We further incorporated the top 10 companies by holding in each sector, analyzing significant market events where a company’s stock decreased by over 2%, to identify the time horizon of the supply chain event (how soon the stock was impacted based on the time from article publication). For the energy sector, the companies included Exxon, Chevron Corporation, Conoco, EOG Resources, Schlumberger, Marathon Petroleum, Pioneer Natural Resources, Phillips 66, Kinder Morgan, and Williams Companies. In the consumer discretionary sector, we analyzed Amazon, Tesla, Inc., The Home Depot, Nike, Inc., McDonald’s, Lowe’s, Starbucks, Target Corporation, Booking Holdings, and TJX Companies. The financial sector included Berkshire Hathaway, JPMorgan Chase, Bank of America, Wells Fargo, Citigroup, Morgan Stanley, Goldman Sachs, BlackRock, Charles Schwab Corporation, and American Express. The sectors and macro indexes analyzed included consumer discretionary, consumer staples, energy, financials, gold, healthcare, industrials, materials, real estate, technology, telecommunications, and utilities.
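One way to read the time-horizon measurement is: starting from an article's publication date, find the first trading day on which the stock fell by more than 2%, and report the number of days elapsed. The sketch below follows that reading. The function name, the use of close-to-close daily returns, and counting calendar days rather than trading days are assumptions, not details given in the text.

```python
from datetime import date
from typing import List, Optional, Tuple


def days_to_impact(
    published: date,
    closes: List[Tuple[date, float]],
    drop_threshold: float = -0.02,
) -> Optional[int]:
    """Calendar days from publication to the first daily drop beyond the threshold.

    `closes` is a date-sorted list of (trading day, closing price).
    Returns None if no such drop occurs on or after the publication date.
    """
    for (_, prev_close), (day, close) in zip(closes, closes[1:]):
        if day < published:
            continue
        daily_return = (close - prev_close) / prev_close
        if daily_return < drop_threshold:
            return (day - published).days
    return None
```

Aggregating this value over many articles about the same supply chain event would give a rough distribution of how quickly a sector's top holdings reacted.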
This enhanced dataset created by our pipeline opened avenues to pinpoint peaks of significant market events, offering a granular view into the sectors that bore the brunt of these supply chain disruptions. It also shed light on specific companies that found themselves impacted by these supply chain events. A unique aspect of our dataset is its ability to capture the emotional tone of articles, facilitating an analysis of emotion distributions over time. To rigorously evaluate our approach, we conducted comprehensive comparisons with state-of-the-art models in financial forecasting described in Section 2.
  • We benchmarked against the LSTM+BERT model (xraptorgg 2024), which combines the sequential learning capabilities of LSTM with the contextual understanding of BERT.
  • We compared our results with our implementation of the TCN+MultiLabel BERT approach (Liapis and Kotsiantis 2023), utilizing the following emotion classification model:
    • monologg/bert-base-cased-goemotions-original: A BERT-based model fine-tuned on the GoEmotions dataset, which classifies text into 27 distinct emotion categories plus a neutral category (28 in total) using a multilabel approach to capture multiple emotions in a single text.
  • We also evaluated our model against cutting-edge language models tailored for financial forecasting:
    • FinGPT Forecaster (AI4Finance-Foundation 2024), which leverages the power of large language models for financial prediction.
    • Time-LLM (KimMeen 2024), a novel approach that reprograms large language models for time series forecasting.
    • Lag-Llama (time-series-foundation-models 2024), a foundation model specifically designed for probabilistic time series forecasting.
Our choice of Temporal Convolutional Networks (TCNs) as the primary model for this study was driven by several key factors. TCNs have demonstrated superior performance in capturing long-range dependencies in time series data, a crucial aspect of financial forecasting, where historical patterns can significantly influence future trends. Unlike recurrent neural networks (RNNs) or Long Short-Term Memory (LSTM) networks, TCNs can process inputs in parallel, leading to more efficient training and inference. This efficiency is particularly useful when dealing with large-scale financial datasets, and the advantage is even larger relative to LLMs, given their size and time complexity. Additionally, TCNs are highly adaptable and can be easily customized to accommodate different input features and data structures, making them an ideal choice for our multidimensional financial analysis framework, which leverages emotion metrics and supply chain data. This comparison highlights the strengths of our TCN-based approach and provides valuable insights into the current limitations and potential future directions for LLMs in financial forecasting tasks.
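To make the parallelism and long-range-dependency argument concrete, the following minimal sketch shows the dilated causal convolution that TCNs are generally built on, together with the receptive field obtained by stacking such layers. It is illustrative only: the function names, kernel size, and dilation schedule are assumptions and do not describe the configuration used in this study.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_j weights[j] * x[t - j * dilation].

    Positions before the start of the series are treated as zero (left
    padding), so y[t] never depends on future values of x.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past time steps seen by a stack with one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical settings chosen for illustration, not taken from the paper.
example_output = causal_dilated_conv1d([1, 2, 3, 4], [1, 1], dilation=2)
example_span = receptive_field(3, [1, 2, 4, 8])
```

Each output position depends only on current and past inputs and is independent of the other outputs, which is why all positions can be computed in parallel, while exponentially growing dilations extend the history covered by a small number of layers.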

4.2. Financial Co-Occurrence Graph

To create the financial news co-occurrence network, we modified the analysis horizon from a broader period to a month-by-month basis. Each company pair that shared the same emotion within a given month was tracked and counted. Algorithm 1 (presented in Section 5) describes the monthly co-occurrence graph generation process.
For each company pair sharing emotions in a given month, the weight of the edge was determined by the number of unique emotions shared within that month, rather than the sum of all emotion pairs across two years. Edges were added to the graph if the sum of emotion counts for a company pair was greater than 75% of the maximum sum observed across all pairs. This approach ensures that only the most significant relationships, based on both the variety and frequency of shared emotions, are represented in the graph.
This month-by-month analysis allows for a more granular understanding of how these relationships evolve over time and in response to various market events and economic conditions. It captures seasonal patterns and short-term fluctuations that might be obscured in a longer-term analysis.
In the next section, we present a set of analyses with their corresponding visualizations that will corroborate the insights derived from our comparison of advanced neural network models and insights from the month-by-month analysis of knowledge graphs.

5. Results and Discussion

Our analysis of cross-sector correlations reveals distinct patterns in market behavior, particularly during periods marked by global supply chain disruptions. By implementing Algorithm 1, we significantly enhanced our ability to detect correlated companies. This month-by-month approach, compared to a two-year aggregate view, identifies short-term supply chain events and seasonal patterns that would otherwise be missed in long-term analyses.
Algorithm 1: Monthly co-occurrence graph generation
Input: Grouped monthly data frame (group data into monthly year data slices)
Output: A graph G representing monthly co-occurrences
1: Initialize Graph and Variables
    - Create an empty graph G
    - Extract unique emotions from df_group_month
    - Initialize edge_hash dictionary (keeps track of company pairs with same emotion in same day)
2: Process Emotions and Companies
    - For each emotion:
       - Group data by emotion
       - For each row in the group:
          - Generate all combinations of companies
          - Update edge_hash with emotion counts for each company pair
3: Add Nodes to Graph
    - Add all companies as nodes to G
4: Calculate Edge Weights
    - Find minimum and maximum sum of emotion counts
    - Sort edges by number of unique emotions
    - For each edge:
       - Calculate weight as sum of emotion counts
       - If weight > 75% of max sum, add edge to G
5: Post-processing
    - Remove isolated nodes
    - Assign colors to nodes
6: Visualize and Export
    - Draw the graph using spring layout
    - Export graph in GEXF format
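Steps 1 to 5 of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout (date, emotion, list of companies), the function name, and the toy records are hypothetical, and the graph is represented as a plain dictionary of weighted edges rather than a graph object. Visualization and GEXF export (step 6) are omitted.

```python
from collections import defaultdict
from itertools import combinations


def monthly_cooccurrence_edges(rows, threshold=0.75):
    """Return {(company_a, company_b): weight} for one month of data.

    rows: iterable of (date, emotion, companies) records, where `companies`
    lists the companies tagged with `emotion` on `date` (assumed layout).
    """
    # edge_hash[pair][emotion] = number of same-day, same-emotion co-mentions
    edge_hash = defaultdict(lambda: defaultdict(int))
    for _date, emotion, companies in rows:
        # Sort and de-duplicate so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(companies)), 2):
            edge_hash[pair][emotion] += 1

    if not edge_hash:
        return {}

    # Edge weight = sum of emotion counts for the pair (step 4 of Algorithm 1).
    totals = {pair: sum(counts.values()) for pair, counts in edge_hash.items()}
    cutoff = threshold * max(totals.values())
    # Keep only pairs above the cutoff; companies without any kept edge are
    # implicitly dropped, mirroring the "remove isolated nodes" step.
    return {pair: weight for pair, weight in totals.items() if weight > cutoff}


# Toy records, invented for illustration only (not data from this study).
toy_rows = [
    ("2019-08-05", "fear", ["Amazon", "GoldmanSachs", "Target"]),
    ("2019-08-06", "optimism", ["Amazon", "GoldmanSachs"]),
    ("2019-08-07", "fear", ["GoldmanSachs", "Amazon"]),
]
edges = monthly_cooccurrence_edges(toy_rows)
```

In the toy data, the Amazon and GoldmanSachs pair accumulates a weight of 3, while the two pairs involving Target have a weight of 1 and fall below the 75% cutoff, so only one edge survives.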
The month-by-month analysis highlighted several periods of particular interest:
  • August 2019 (Figure 2): Goldman Sachs and Amazon exhibited a notable correlation this month that was not identified in the long-term analysis (the news co-occurrence graph from the prior research showed no connection between these companies). This is significant because Goldman Sachs had been discussing its digital strategy, which included collaboration with Amazon Web Services (AWS).
  • January 2020 (Figure 3): Boeing and Honeywell showed a notable correlation that was not identified in the long-term analysis. Interestingly, in March 2020, Boeing and Honeywell formally announced a partnership, further validating this observed relationship.
  • March 2020 (Figure 4): Microsoft and Morgan Stanley also exhibited a notable connection that was not identified in the long-term analysis. During this period, Morgan Stanley had divested from MSCI but retained a material stake in the company, which announced a partnership with Microsoft in July 2020. The relationship was further solidified when Microsoft and Morgan Stanley announced a strategic partnership the following year, in June 2021.
These findings underscore the importance of monthly analysis in capturing evolving relationships between companies and sectors. By identifying specific connections and tracking their progression, we gain a deeper understanding of how strategic alliances and market conditions shape sector interdependencies over time.
In this section, we present the performance of our models based on the features used for training and validation. The models were trained using historical financial data, sector-specific indices, and emotion-related features derived from news articles. For validation, we used a separate set of financial data that were not included in the training process, ensuring that the models were tested on unseen data to evaluate their predictive accuracy. The differences in features across the tables stem from our exploration of various combinations of features, such as company count, emotion interaction, and added emotional features for comparison, to determine which set of features yields the best results for each sector. The tables present the top-performing model in each sector based on different emotion-related features tested during the experiments.
Analysis of the results presented in Table 2, Table 3 and Table 4 (each ordered by performance) reveals that our Temporal Convolutional Network (TCN) model demonstrates the best and most robust performance across sectors, and that our proposed features, such as company count, emotion interaction, and emotion, yield competitive results in most cases. In several instances, the TCN model’s performance is further enhanced when leveraging the GoEmotions multilabel emotion analysis of news articles, as seen with the disappointment_smooth and confusion_smooth features, indicating the value of nuanced sentiment information beyond fear in financial forecasting.
Within the group of large language models (LLMs) we evaluated (FinGPT, TimeLLM, and LagLlama), TimeLLM demonstrated the best performance, which we attribute to its innovative architecture incorporating a reprogramming layer. This layer, implemented as a custom attention mechanism, allows for a dynamic mapping between the input time series data and the LLM’s embedding space. However, our observations align with the findings of Tan et al. (2024), who conducted extensive studies on LLM-based time series forecasting methods. Their research highlighted several key insights relevant to our evaluation:
Tan et al. (2024) found that LLM-based methods often exhibit computational inefficiency, requiring significantly more compute time and resources compared to simpler models like our Temporal Convolutional Network (TCN), without yielding proportional improvements in forecasting performance. This inefficiency limits the practicality of using LLMs for time series forecasting in financial applications, especially in scenarios requiring rapid computation and low latency.
Additionally, their study demonstrated that simpler models, such as temporal architectures (TCNs and LSTMs in our study) with standard attention mechanisms, could achieve comparable or superior performance relative to complex LLM-based forecasters, reinforcing the efficacy of more traditional temporal models.
Our experiments with zero-shot (without seeing examples) applications of larger language models, including LagLlama and FinGPT, showed that these models struggled to effectively leverage the 30 days of time series data provided. This highlights the limitations of general-purpose language models in specialized forecasting tasks without domain-specific fine-tuning or architectural modifications. These findings are consistent with broader criticisms of LLMs’ capabilities in complex domains, as noted by Arkoudas (2023), who argues that current LLMs often struggle with tasks requiring genuine reasoning and domain-specific knowledge.
Similar conclusions were drawn in the FinanceBench study by Islam et al. (2023), which identified significant challenges for LLMs in handling financial queries, particularly those requiring numerical reasoning and time-sensitive analysis. Even advanced models like GPT-4-Turbo showed limitations, often providing incorrect answers or refusing to respond to complex financial questions. While augmentation techniques such as using longer context windows showed some improvement in both studies, they also introduced practical challenges such as increased latency and inability to efficiently handle larger financial documents.
Overall, these findings reinforce the value of using simpler models, such as TCNs, which demonstrated superior performance in capturing nuanced sentiment and emotion features with significantly lower computational overhead.
We measure the performance using the formula for Mean Absolute Error (MAE), which quantifies the average magnitude of errors in predictions, allowing for an intuitive assessment of model performance:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
Here, $y_i$ represents the actual values, $\hat{y}_i$ the predicted values, and $n$ the number of observations. The train MAE is derived from the model’s performance on the training dataset, measuring its ability to fit the data it was trained on. Similarly, the validation MAE is calculated using the validation dataset, offering an evaluation of the model’s predictive capability on unseen data. This distinction helps to assess the model’s generalization performance and the validity of the results.
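As a small worked example of the MAE definition, the sketch below computes the metric for a handful of values. The function name and the numbers are invented for illustration and are not results from this study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


# Toy values: absolute errors are 0.5, 0.0, and 1.0, so the mean is 0.5.
toy_mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

The same function would be applied once to the training split and once to the held-out split to obtain the train and validation MAE, respectively.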
We performed a correlation analysis between 63 features—including various sentiment and emotion metrics—and the target sector price for the consumer discretionary sector. The features were ranked by their correlation scores with the target price. As shown in Figure 5, the top-performing features for this sector include admiration_smooth, pride_smooth, and amusement_smooth, which highlight the importance of positive emotions in predicting consumer discretionary trends.
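The ranking step can be sketched as follows. This is an assumption-laden illustration: it uses the Pearson coefficient and orders features by absolute correlation, neither of which is specified in the text, and the feature names and values are placeholders rather than data from this study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_features(features, target):
    """Return (name, correlation) pairs sorted by absolute correlation, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Placeholder series, invented for illustration only.
toy_target = [1.0, 2.0, 3.0, 4.0]
toy_features = {
    "feature_a": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the target
    "feature_c": [1.0, 3.0, 2.0, 4.0],  # strongly correlated
    "feature_d": [2.0, 1.0, 2.0, 1.0],  # weakly, negatively correlated
}
ranking = rank_features(toy_features, toy_target)
```

Sorting by absolute value keeps strongly negative predictors near the top; sorting by the signed value instead would be an equally reasonable reading of the text.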
The TCN model consistently outperforms other approaches, with features like “disappointment_smooth” and “company_count” yielding the lowest validation MAE (0.0103 and 0.0138, respectively). This suggests that consumer sentiment, particularly negative emotions, plays a crucial role in the consumer discretionary sector performance. The emotion interaction feature is not among the top performers for this sector (validation MAE of 0.1146, behind the TimeLLM and LSTM baselines in Table 2), but it still far outperforms the zero-shot LagLlama and FinGPT forecasts, indicating its potential to capture complex market dynamics.
In the financial sector, the correlation analysis revealed the strongest features predicting the target sector price. Among the 63 features analyzed, emotion_interaction, emotion, and finbert_sentiment_smooth emerged as the most significant, as shown in Figure 6. This emphasizes the critical role of nuanced sentiment analysis and its interaction with emotion in the financial market, where market sentiment can strongly influence performance.
The financial sector shows a preference for more complex emotional features, with “confusion_smooth” and “company_count” yielding the best validation MAEs (0.0230 and 0.0244, respectively). This suggests that market uncertainty and the interconnectedness of financial institutions play crucial roles in this sector’s performance.
For the energy sector, correlation analysis across 63 features identified the most influential predictors of the target sector price. As visualized in Figure 7, emotion, emotion_interaction, finbert_sentiment, and vader_sentiment stand out as the top-performing features. This underscores the role of broad sentiment metrics and emotional interplay in the volatile energy market, reflecting its sensitivity to public perception and external events.
In the energy sector, sentiment-based features dominate the top performances. The “finbert_sentiment” feature with TCN yields the best validation MAE (0.0215), closely followed by “emotion_interaction” (0.0218). This underscores the importance of nuanced sentiment analysis in the volatile energy market, where public perception and global events can significantly impact stock prices. Figure 8 compares the LSTM, TimeLLM, and TCN models for the energy sector. The Mean Absolute Error (MAE) of each model is used to measure accuracy, with lower values indicating better performance. The TCN model achieves a validation MAE of 0.0218, which is substantially lower than the LSTM’s MAE of 0.0301 and the TimeLLM’s MAE of 0.2039. The TCN model not only shows a lower error rate but also exhibits more stable and consistent predictions, as observed in the visualizations. In contrast, the TimeLLM model displays higher volatility in its predictions, with larger fluctuations and spikes, indicating a less reliable forecast.
We conducted an empirical analysis of stock price movements during August 2019. The analysis focused on 12 identified company pairs across sectors such as technology, consumer discretionary, financials, and energy. Market events were defined as daily price changes exceeding 2%, and we examined whether these pairs moved in tandem or diverged during these events. The results, summarized in Table 5, reveal a high degree of correlation, with 71 out of 74 events (95.95%) showing the companies moving in the same direction. This underscores the interconnected nature of markets, where shared sentiment, often influenced by public perception and global events, drives synchronous behavior across sectors. These findings emphasize the critical role of sentiment analysis in understanding market dynamics in these co-occurring companies.
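The co-movement check can be sketched as below, using three rows taken from Table 5. The function name is hypothetical, and two details are assumptions inferred from the table rather than stated in the text: a pair counts as an event when at least one of the two companies moves by more than 2% in absolute terms, and "moved together" means both daily changes have the same sign.

```python
def co_movement_summary(events, threshold=2.0):
    """Return (moved_together, total_events) for a list of pair observations.

    events: (company_1, company_2, date, change_1_pct, change_2_pct) tuples.
    """
    flagged = [e for e in events if max(abs(e[3]), abs(e[4])) > threshold]
    together = [e for e in flagged if e[3] * e[4] > 0]  # same sign
    return len(together), len(flagged)


# Three rows copied from Table 5 (daily changes in percent).
sample_events = [
    ("Intel", "Apple", "2019-08-02", -1.66, -2.12),
    ("Amazon", "JPMorganChase", "2019-08-07", 0.31, -2.17),
    ("Microsoft", "Target", "2019-08-06", 1.88, 2.45),
]
moved_together, total_events = co_movement_summary(sample_events)

# Share reported in the text: 71 of 74 events moved in the same direction.
reported_share = round(71 / 74 * 100, 2)
```

For the three sample rows, two pairs move in the same direction and one diverges, which matches the "Moved Together" column of Table 5 for those rows.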
Across all sectors, we observe that our TCN model, when leveraging the proposed metrics (company count, emotion interaction, and specific emotions), consistently outperforms (based on MAE score) more complex models like TimeLLM, LagLlama, and FinGPT. This is particularly evident in the stark contrast between the TCN’s prediction performance and that of the LLMs in the study.
The superior performance of our approach can be attributed to the following:
  • The granularity of our emotion-based features, which capture nuanced market sentiments.
  • The ability of TCNs to effectively model temporal dependencies in financial time series.
  • The integration of supply chain dynamics and company relationships through our monthly co-occurrence graph analysis.
These findings underscore the effectiveness of Fin-ALICE in capturing complex market dynamics and sentiment patterns, offering a more robust framework for financial forecasting compared to general-purpose language models or traditional market indicators.

6. Conclusions

Fin-ALICE represents a significant advancement in financial market analysis, offering a more nuanced and comprehensive approach to predicting market trends. By leveraging emotional sentiment, supply chain dynamics, and advanced machine learning techniques, it provides a powerful tool for navigating the complexities of global financial markets. Our comprehensive comparison of advanced neural network models, including TCNs, LLMs, and specialized time series models, demonstrated the superior performance of our TCN model, which performed robustly across sectors, particularly when leveraging emotion-based features. Future work should focus on further refining these techniques, exploring the potential of hybrid models that combine the strengths of different approaches, and developing more robust methods for handling the unique challenges of financial time series data. One promising direction would be the creation of a two-stage model where LLMs are used for advanced feature extraction from textual financial news and reports, while TCNs handle the time series forecasting based on these extracted features along with traditional financial risk indicators. This approach could potentially combine the advanced language understanding capabilities of LLMs with the robust time series modeling of TCNs.

Author Contributions

Conceptualization, S.M. and G.A.; methodology, S.M.; software, S.M.; validation, S.M. and G.A.; formal analysis, S.M.; investigation, S.M.; resources, S.M.; data curation, S.M.; writing—original draft preparation, S.M.; writing—review and editing, G.A.; visualization, S.M.; supervision, G.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data and research project are made available for academic use only.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. AI4Finance-Foundation. 2024. “FinGPT/fingpt/FinGPT_Forecaster at Master”. GitHub. Available online: https://github.com/AI4Finance-Foundation/FinGPT (accessed on 21 March 2024).
  2. Araci, Dogu. 2019. FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. arXiv:1908.10063.
  3. Arkoudas, Konstantine. 2023. GPT-4 Can’t Reason. arXiv:2308.03762.
  4. Cen, Yuefeng, Mingxing Luo, Gang Cen, Cheng Zhao, and Zhigang Cheng. 2022. Financial Market Correlation Analysis and Stock Selection Application Based on TCN-Deep Clustering. Future Internet 14: 331.
  5. Chong, Eunsuk, Chulwoo Han, and Frank C. Park. 2017. Deep Learning Networks for Stock Market Analysis and Prediction: Methodology, Data Representations, and Case Studies. Expert Systems with Applications 83: 187–205.
  6. Correia, Filipe, Ana Maria Madureira, and Jorge Bernardino. 2022. Deep Neural Networks Applied to Stock Market Sentiment Analysis. Sensors 22: 4409.
  7. Dai, Wei, Yuan An, and Wen Long. 2022. Price Change Prediction of Ultra High Frequency Financial Data Based on Temporal Convolutional Network. Procedia Computer Science 199: 1177–83.
  8. Demszky, Dorottya, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Nemade, and Sujith Ravi. 2020. GoEmotions: A Dataset of Fine-Grained Emotions. Paper presented at the 58th Annual Meeting of the Association for Computational Linguistics, online, July 5–10; pp. 4040–54.
  9. Guo, Sing, Hui Ai, and Shanxin Li. 2023. Stock Movement Prediction via Temporal Convolutional Network and Interactive Attention Network. Paper presented at the 2023 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bengaluru, India, January 27–28; pp. 413–17.
  10. Guo, Wenchao, Zhigang Li, Chuang Gao, and Ying Yang. 2022. Stock Price Forecasting Based on Improved Time Convolution Network. Computational Intelligence 38: 1474–91.
  11. Halder, Shayan. 2022. FinBERT-LSTM: Deep Learning Based Stock Price Prediction Using News Sentiment Analysis. arXiv:2211.07392.
  12. Islam, Pranab, Anand Kannappan, Douwe Kiela, Rebecca Qian, Nino Scherrer, and Bertie Vidgen. 2023. FinanceBench: A New Benchmark for Financial Question Answering. arXiv:2311.11944.
  13. Jin, Ming, Qingsong Wen, Yuxuan Liang, Chaoli Zhang, Siqiao Xue, Xue Wang, James Zhang, Yi Wang, Haifeng Chen, Xiaoli Li, et al. 2023. Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook. arXiv:2310.10196.
  14. Jin, Ming, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan, et al. 2024. Time-LLM: Time Series Forecasting by Reprogramming Large Language Models. arXiv:2310.01728.
  15. KimMeen. 2024. Time-LLM: [ICLR 2024] Official implementation of “Time-LLM: Time Series Forecasting by Reprogramming Large Language Models”. GitHub. Available online: https://github.com/KimMeen/Time-LLM (accessed on 21 March 2024).
  16. Liapis, Charalampos M., Aikaterini Karanikola, and Sotiris Kotsiantis. 2023. Investigating Deep Stock Market Forecasting with Sentiment Analysis. Entropy 25: 219.
  17. Liapis, Charalampos M., and Sotiris Kotsiantis. 2023. Temporal Convolutional Networks and BERT-Based Multi-Label Emotion Analysis for Financial Forecasting. Information 14: 596.
  18. Liu, Xiao-Yang, Guoxuan Wang, Hongyang Yang, and Daochen Zha. 2023. FinGPT: Democratizing Internet-Scale Data for Financial Large Language Models. arXiv:2307.10485.
  19. Lopez-Lira, Alejandro, and Yuehua Tang. 2023. Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models. arXiv:2304.07619v5.
  20. McCarthy, Shawn, and Gita Alaghband. 2023. Fin-Emotion Library. Available online: https://github.com/AI4Finance-Foundation/FinEmotion (accessed on 23 July 2023).
  21. McCarthy, Shawn, and Gita Alaghband. 2023a. Enhancing Financial Market Analysis and Prediction with Emotion Corpora and News Co-Occurrence Network. Journal of Risk and Financial Management 16: 226.
  22. McCarthy, Shawn, and Gita Alaghband. 2023b. The Emotion Magnitude Effect: Navigating Market Dynamics Amidst Supply Chain Events. Journal of Risk and Financial Management 16: 490.
  23. Miller, John A., Mohammed Aldosari, Farah Saeed, Nasid Habib Barna, Subas Rana, I. Budak Arpinar, and Ninghao Liu. 2024. A Survey of Deep Learning and Foundation Models for Time Series Forecasting. arXiv:2401.13912.
  24. Monologg. 2024. BERT Base Cased GoEmotions Original. Hugging Face. Available online: https://huggingface.co/monologg/bert-base-cased-goemotions-original (accessed on 25 June 2024).
  25. Observatory of Economic Complexity (OEC). 2023. World Profile. Available online: https://oec.world/en/profile/world/wld (accessed on 13 March 2023).
  26. Pothepalli, Parichay. 2021. Stock Market Prediction: Using Econometric Models And Neural Networks. Global Journal For Research Analysis 10: 134–39.
  27. Rasul, Kashif, Arjun Ashok, Andrew Robert Williams, Hena Ghonia, Rishika Bhagwatkar, Arian Khorasani, Mohammad Javad Darvishi Bayazi, George Adamopoulos, Roland Riachi, Nadhir Hassen, et al. n.d. Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting. arXiv:2310.08278v3.
  28. Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12: 2825–30. Available online: https://scikit-learn.org (accessed on 21 March 2024).
  29. Tan, Mingtian, Mike A. Merrill, Vinayak Gupta, Tim Althoff, and Thomas Hartvigsen. 2024. Are Language Models Actually Useful for Time Series Forecasting? Available online: http://arxiv.org/abs/2406.16964 (accessed on 21 March 2024).
  30. time-series-foundation-models. 2024. lag-llama. GitHub. Available online: https://github.com/time-series-foundation-models/lag-llama (accessed on 21 March 2024).
  31. Wang, Neng, Hongyang Yang, and Christina Dan Wang. 2023. FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets. arXiv:2310.04793.
  32. xraptorgg. 2024. FinBERT-LSTM/7_lstm_model_bert.py at Main. GitHub. Available online: https://github.com/xraptorgg/FinBERT-LSTM (accessed on 21 March 2024).
Figure 1. Overview of network architectures and feature sets used in this study. The image illustrates our proposed TCN model with attention, alongside the FinBERT-LSTM, Time-LLM, FinGPT, and LagLlama models. The features analyzed include sentiment analysis outputs (Vader, TextBlob, FinBERT), multilabel emotion classification outputs (GoEmotions), and forecasting features such as lagged closing prices.
Figure 2. Company pair analysis for August 2019 with notable new connections such as Goldman Sachs and Amazon.
Figure 3. Company pair analysis for January 2020 with notable new connections such as Boeing and Honeywell.
Figure 4. Company pair analysis for March 2020 with notable new connections such as Microsoft and Morgan Stanley.
Figure 5. Correlation feature matrix for feature scores measured by correlation to target sector index for consumer discretionary sector.
Figure 6. Correlation feature matrix for feature scores measured by correlation to target sector index for the financial sector.
Figure 7. Correlation feature matrix for feature scores measured by correlation to target sector index for the energy sector.
Figure 8. Performance comparison of the TCN model against the TimeLLM and LSTM models for the energy sector. The TCN model outperforms both TimeLLM and LSTM, demonstrating lower error rates and better prediction stability.
Table 1. Example Row from the Enhanced Dataset—new features in bold.
| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Date | 1 January 2021 | Sector | Financials |
| Title | News headline | Sentiment | 0.2 |
| Emotion | Submission | Supply Chain Issue | Pandemic |
| Commodity | Financial market | Country | United States |
| TextBlob Sentiment | 0.0053 | VADER Sentiment | 0.6369 |
| FinBERT Sentiment | 0.0 | Admiration | 0.0 |
| Amusement | 0.0 | Disapproval | 0.0 |
| Disgust | 0.0 | Embarrassment | 0.0 |
| Excitement | 0.0 | Fear | 0.0 |
| Gratitude | 0.0 | Grief | 0.0 |
| Joy | 0.0 | Love | 0.0 |
| Nervousness | 0.0 | Anger | 0.0 |
| Optimism | 0.0 | Pride | 0.0 |
| Realization | 0.0 | Relief | 0.0 |
| Remorse | 0.0 | Sadness | 0.0 |
| Surprise | 0.0 | Neutral | 0.9999 |
| Annoyance | 0.0 | Approval | 0.0 |
| Caring | 0.0 | Confusion | 0.0 |
| Curiosity | 0.0 | Desire | 0.0 |
| Disappointment | 0.0 | | |
Table 2. Performance comparison of different models and features for the consumer discretionary sector. Rows are arranged in order of best performance (validation MAE) to lowest performance. Feature definitions are as follows: Company count is the number of times a company is mentioned in news articles. Emotion interaction is our feature that balances the emotion and frequency of the company being mentioned in news articles. Emotion represents the count of articles categorized under a specific emotion, such as “fear”. Vader and TextBlob are general sentiment analysis tools used to evaluate the polarity of news articles (positive, neutral, negative). FinBERT is a sentiment analysis tool specifically tailored for financial contexts. The remaining 27 emotion categories, along with the “Neutral” category, are derived from the GoEmotions dataset and capture a wide range of emotional expressions within the text.
| Feature | Train MAE | Validation MAE | Model | Complexity (# Params) |
| --- | --- | --- | --- | --- |
| disappointment_smooth | 0.0120 | 0.0103 | TCN | <70K |
| company_count | 0.0164 | 0.0138 | TCN | <70K |
| amusement_smooth | 0.0121 | 0.0148 | TCN | <70K |
| pride | 0.0160 | 0.0182 | TCN | <70K |
| pride_smooth | 0.0132 | 0.0203 | TCN | <70K |
| admiration_smooth | 0.0129 | 0.0220 | TCN | <70K |
| neutral | 0.0192 | 0.0231 | TCN | <70K |
| admiration | 0.0132 | 0.0287 | TCN | <70K |
| fear_smooth | 0.0137 | 0.0340 | TCN | <70K |
| emotion & emotion_interaction | 0.0131 | 0.0541 | TCN | <70K |
| textblob_sentiment | 0.0148 | 0.0617 | TCN | <70K |
| surprise | 0.0193 | 0.0665 | TCN | <70K |
| emotion | 0.0124 | 0.0769 | TCN | <70K |
| Consumer Discretionary (no sentiment) | 0.1225 | 0.0892 | TimeLLM | Llama-7B |
| finbert_sentiment | 0.0112 | 0.0969 | LSTM | <35K |
| emotion_interaction | 0.0173 | 0.1146 | TCN | <70K |
| forecast 7 days | - | 1.0500 | LagLlama | Llama-7B |
| forecast 7 days | - | 1.7900 | FinGPT | Llama-7B |
Table 3. Performance comparison of different models and features for the financial sector. Rows are arranged in order of best performance (validation MAE) to lowest performance. (Features are defined in the Table 2 caption.)
| Feature | Train MAE | Validation MAE | Model | Complexity (# Params) |
| --- | --- | --- | --- | --- |
| confusion_smooth | 0.0275 | 0.0230 | TCN | <70K |
| company_count | 0.0265 | 0.0244 | TCN | <70K |
| realization_smooth | 0.0319 | 0.0291 | TCN | <70K |
| approval_smooth | 0.0244 | 0.0304 | TCN | <70K |
| textblob_sentiment | 0.0332 | 0.0328 | TCN | <70K |
| finbert_sentiment | 0.0228 | 0.0331 | LSTM | <35K |
| vader_sentiment | 0.0287 | 0.0342 | TCN | <70K |
| neutral_smooth | 0.0256 | 0.0351 | TCN | <70K |
| emotion_interaction | 0.0307 | 0.0375 | TCN | <70K |
| emotion & emotion_interaction | 0.0221 | 0.0408 | TCN | <70K |
| emotion | 0.0243 | 0.0420 | TCN | <70K |
| Financials (no sentiment) | 0.3409 | 0.3076 | TimeLLM | Llama-7B |
| forecast 7 days | - | 4.8300 | FinGPT | Llama-7B |
| forecast 7 days | - | 11.5000 | LagLlama | Llama-7B |
Table 4. Performance comparison of different models and features for the energy sector. Rows are arranged in order of best performance (validation MAE) to lowest performance. (Features are defined in the Table 2 caption.)
| Feature | Train MAE | Validation MAE | Model | Complexity (# Params) |
|---|---|---|---|---|
| finbert_sentiment | 0.0246 | 0.0215 | TCN | <70K |
| emotion_interaction | 0.0272 | 0.0218 | TCN | <70K |
| emotion | 0.0290 | 0.0220 | TCN | <70K |
| emotion & emotion_interaction | 0.0184 | 0.0235 | TCN | <70K |
| approval_smooth | 0.0369 | 0.0266 | TCN | <70K |
| finbert_sentiment | 0.0203 | 0.0301 | LSTM | <35K |
| textblob_sentiment | 0.0285 | 0.0325 | TCN | <70K |
| vader_sentiment | 0.0200 | 0.0327 | TCN | <70K |
| neutral_smooth | 0.0230 | 0.0335 | TCN | <70K |
| finbert_sentiment_smooth | 0.0377 | 0.0531 | TCN | <70K |
| Energy (no sentiment) | 0.2200 | 0.2039 | TimeLLM | Llama-7B |
| forecast 7 days | - | 26.8500 | LagLlama | Llama-7B |
| forecast 7 days | - | 32.5100 | FinGPT | Llama-7B |
Table 5. Market event analysis for August 2019. This table highlights the stock price changes and co-movement patterns of identified company pairs.
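| Company 1 | Company 2 | Event Date | Company 1 Change (%) | Company 2 Change (%) | Moved Together |
|---|---|---|---|---|---|
| Intel | Apple | 2019-08-02 | −1.66 | −2.12 | Yes |
| Intel | Microsoft | 2019-08-05 | −3.51 | −3.43 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-05 | −2.98 | −3.67 | Yes |
| Chevron | Exxon | 2019-08-05 | −1.65 | −2.05 | Yes |
| Amazon | JPMorganChase | 2019-08-05 | −3.19 | −2.98 | Yes |
| Microsoft | Target | 2019-08-05 | −3.43 | −0.90 | Yes |
| Visa | Microsoft | 2019-08-05 | −4.82 | −3.43 | Yes |
| Target | Amazon | 2019-08-05 | −0.90 | −3.19 | Yes |
| GoldmanSachs | Citigroup | 2019-08-05 | −3.67 | −3.59 | Yes |
| Microsoft | Amazon | 2019-08-05 | −3.43 | −3.19 | Yes |
| Intel | Apple | 2019-08-05 | −3.51 | −5.23 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-06 | 0.78 | 2.15 | Yes |
| Target | Amazon | 2019-08-06 | 2.45 | 1.29 | Yes |
| Visa | Microsoft | 2019-08-06 | 2.14 | 1.88 | Yes |
| GoldmanSachs | Citigroup | 2019-08-06 | 2.15 | 1.64 | Yes |
| Microsoft | Target | 2019-08-06 | 1.88 | 2.45 | Yes |
| Amazon | JPMorganChase | 2019-08-07 | 0.31 | −2.17 | No |
| JPMorganChase | GoldmanSachs | 2019-08-07 | −2.17 | −0.13 | Yes |
| GoldmanSachs | Citigroup | 2019-08-08 | 0.61 | 2.46 | Yes |
| Intel | Microsoft | 2019-08-08 | 0.94 | 2.67 | Yes |
| Microsoft | Amazon | 2019-08-08 | 2.67 | 2.20 | Yes |
| Visa | Microsoft | 2019-08-08 | 2.61 | 2.67 | Yes |
| Amazon | JPMorganChase | 2019-08-08 | 2.20 | 1.69 | Yes |
| Chevron | Exxon | 2019-08-08 | 3.47 | 2.67 | Yes |
| Target | Amazon | 2019-08-08 | 0.95 | 2.20 | Yes |
| Intel | Apple | 2019-08-08 | 0.94 | 2.21 | Yes |
| Microsoft | Target | 2019-08-08 | 2.67 | 0.95 | Yes |
| Intel | Apple | 2019-08-09 | −2.52 | −0.82 | Yes |
| Intel | Microsoft | 2019-08-09 | −2.52 | −0.85 | Yes |
| Chevron | Exxon | 2019-08-09 | −0.66 | −2.13 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-12 | −1.88 | −2.60 | Yes |
| GoldmanSachs | Citigroup | 2019-08-12 | −2.60 | −2.74 | Yes |
| Amazon | JPMorganChase | 2019-08-13 | 2.21 | 1.54 | Yes |
| Intel | Microsoft | 2019-08-13 | 2.72 | 2.07 | Yes |
| Microsoft | Target | 2019-08-13 | 2.07 | 2.69 | Yes |
| Microsoft | Amazon | 2019-08-13 | 2.07 | 2.21 | Yes |
| Visa | Microsoft | 2019-08-13 | 1.29 | 2.07 | Yes |
| Target | Amazon | 2019-08-13 | 2.69 | 2.21 | Yes |
| Intel | Apple | 2019-08-13 | 2.72 | 4.23 | Yes |
| Visa | Microsoft | 2019-08-14 | −2.86 | −3.01 | Yes |
| Target | Amazon | 2019-08-14 | −2.79 | −3.36 | Yes |
| Chevron | Exxon | 2019-08-14 | −3.80 | −4.03 | Yes |
| Amazon | JPMorganChase | 2019-08-14 | −3.36 | −4.15 | Yes |
| Intel | Microsoft | 2019-08-14 | −2.07 | −3.01 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-14 | −4.15 | −4.19 | Yes |
| Microsoft | Amazon | 2019-08-14 | −3.01 | −3.36 | Yes |
| GoldmanSachs | Citigroup | 2019-08-14 | −4.19 | −5.28 | Yes |
| Intel | Apple | 2019-08-14 | −2.07 | −2.98 | Yes |
| Microsoft | Target | 2019-08-14 | −3.01 | −2.79 | Yes |
| Amazon | JPMorganChase | 2019-08-16 | 0.93 | 2.40 | Yes |
| Intel | Apple | 2019-08-16 | 1.75 | 2.36 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-16 | 2.40 | 1.65 | Yes |
| GoldmanSachs | Citigroup | 2019-08-16 | 1.65 | 3.52 | Yes |
| Microsoft | Target | 2019-08-19 | 1.67 | 2.81 | Yes |
| Target | Amazon | 2019-08-19 | 2.81 | 1.31 | Yes |
| Microsoft | Target | 2019-08-21 | 1.11 | 20.43 | Yes |
| Target | Amazon | 2019-08-21 | 20.43 | 1.23 | Yes |
| Target | Amazon | 2019-08-22 | 3.22 | −1.04 | No |
| Microsoft | Target | 2019-08-22 | −0.73 | 3.22 | No |
| Intel | Apple | 2019-08-23 | −3.89 | −4.62 | Yes |
| Microsoft | Target | 2019-08-23 | −3.19 | −2.66 | Yes |
| Visa | Microsoft | 2019-08-23 | −2.70 | −3.19 | Yes |
| Intel | Microsoft | 2019-08-23 | −3.89 | −3.19 | Yes |
| Target | Amazon | 2019-08-23 | −2.66 | −3.05 | Yes |
| Chevron | Exxon | 2019-08-23 | −2.17 | −2.99 | Yes |
| Amazon | JPMorganChase | 2019-08-23 | −3.05 | −2.48 | Yes |
| GoldmanSachs | Citigroup | 2019-08-23 | −3.07 | −3.07 | Yes |
| Microsoft | Amazon | 2019-08-23 | −3.19 | −3.05 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-23 | −2.48 | −3.07 | Yes |
| JPMorganChase | GoldmanSachs | 2019-08-29 | 2.27 | 2.14 | Yes |
| Intel | Apple | 2019-08-29 | 2.36 | 1.69 | Yes |
| Amazon | JPMorganChase | 2019-08-29 | 1.26 | 2.27 | Yes |
| Intel | Microsoft | 2019-08-29 | 2.36 | 1.89 | Yes |
| GoldmanSachs | Citigroup | 2019-08-29 | 2.14 | 2.47 | Yes |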
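In every row of Table 5, the "Moved Together" label is "Yes" when the two percentage changes share a sign and "No" when they differ. The sketch below encodes that reading as a sign-agreement test. It is an assumption inferred from the table rows, not a rule stated by the authors, and the name `moved_together` and the treatment of a zero change are hypothetical.

```python
def moved_together(change_1, change_2):
    """Assumed rule: two stocks co-move when their daily % changes share a sign.

    A zero change is treated as "no co-movement" here; Table 5 contains no
    zero changes, so this edge case is a hypothetical choice.
    """
    return change_1 * change_2 > 0


# Four rows copied from Table 5:
# (company 1, company 2, event date, company 1 change %, company 2 change %).
pairs = [
    ("Intel", "Apple", "2019-08-02", -1.66, -2.12),
    ("Amazon", "JPMorganChase", "2019-08-07", 0.31, -2.17),
    ("Microsoft", "Target", "2019-08-21", 1.11, 20.43),
    ("Target", "Amazon", "2019-08-22", 3.22, -1.04),
]

# Label each pair using only the two change columns.
labels = ["Yes" if moved_together(c1, c2) else "No" for *_, c1, c2 in pairs]

for (name_1, name_2, date, c1, c2), label in zip(pairs, labels):
    print(f"{date} {name_1}/{name_2}: {c1:+.2f}% vs {c2:+.2f}% -> {label}")
```

A sign test ignores magnitude, so the 2019-08-21 Microsoft/Target pair (1.11% versus 20.43%) still counts as co-movement, which matches the label in the table.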

McCarthy, S.; Alaghband, G. Fin-ALICE: Artificial Linguistic Intelligence Causal Econometrics. J. Risk Financial Manag. 2024, 17, 537. https://doi.org/10.3390/jrfm17120537
