1. Introduction
Financial forecasting aims to empower decision makers to anticipate market movements and asset prices. Price forecasts are crucial in economic decisions and serve as a cornerstone for portfolio construction, risk analysis, and investment strategy development. Accurate price predictions enhance the efficiency and transparency of financial markets. Methods for price prediction encompass quantitative financial models, statistical analyses, and technical analyses, each with varying degrees of success across different asset classes and market conditions. However, the inherent volatility of markets and the variability of economic indicators make precise price forecasting a formidable challenge. Consequently, the continuous development and refinement of forecasting models are imperative for adaptation and progress. Recent advancements in artificial intelligence (AI) have significantly improved price prediction capabilities. These technologies can extract intelligent patterns from large datasets, yielding more accurate predictions of future price movements. This progress is poised to drive further innovation and revolutionize financial forecasting operations [1].
Machine learning (ML) is a branch of AI dedicated to developing algorithms and mathematical models that enable computers to perform tasks through inductive inference, mimicking human learning processes. Unlike traditional programming, where explicit instructions dictate behavior, ML models learn from data, identifying patterns and making decisions based on their acquired knowledge. This process involves training models on vast datasets to recognize complex relationships and generalize from seen examples to unseen scenarios. Such models have led to significant advancements, enabling machines to perform complex tasks with remarkable accuracy and efficiency using data, rather than relying on explicit programming [2]. ML spans various fields, including healthcare [3,4], finance [5,6,7,8], retail [9,10], computer vision [11,12], autonomous vehicles [13,14], manufacturing [15,16], entertainment [17,18], and media [19,20]. The continuous evolution of ML solutions is propelled by the development of sophisticated algorithms, advancements in computational power, the availability of big data, and ongoing research in the field.
Recent advancements in the application of ML to finance have significantly broadened the scope of financial analysis and prediction, reflecting a growing trend toward integrating these methodologies into mainstream financial practices. For instance, the use of various ML algorithms is explored in [5] to assess financial inclusion, highlighting the potential of these techniques to complement traditional models in evaluating sociodemographic factors influencing financial access. Expanding on this, a comprehensive overview of ML applications is provided in [6], including supervised learning for both cross-sectional and time series data, with advanced material on Gaussian processes and reinforcement learning, illustrating their use in investment management, trading strategies, and derivative modeling. Additionally, a metadata-based systematic review [7] offers a meta-analysis of over 5000 documents, mapping the evolution and current state of ML in finance and revealing key trends that have shaped the field over the past two decades. Furthermore, a decade-long survey on stock market prediction [8] underscores the increasing accuracy of predictions through advanced ML approaches, such as text data analytics and ensemble methods, while also noting the ongoing challenges posed by the dynamic and erratic nature of market data. Collectively, these works underscore the pivotal role of ML in driving innovation and enhancing decision-making across various financial domains.
Rapid advancements in machine learning have laid the foundation for innovative signal processing techniques, such as empirical mode decomposition (EMD). EMD is a powerful method for analyzing non-linear and non-stationary signals by decomposing them into intrinsic mode functions (IMF). This adaptive technique breaks down complex signals into simpler components, allowing for a detailed examination of the data’s underlying structure and dynamics [21]. EMD’s ability to operate without predetermined basis functions makes it particularly effective for handling real-world data with inherent variability, a characteristic that complements the adaptive nature of ML techniques. Each IMF captures distinct oscillatory modes, providing insights into the signal’s local characteristics and temporal variations. This decomposition process is highly advantageous in financial forecasting, where market data often exhibit complex, non-linear patterns. Financial time series, such as exchange rates and stock prices, can be challenging to model due to their volatility. By applying EMD, significant trends, cyclic behaviors, and fluctuations within a financial time series can be isolated, thereby enhancing the predictive power of subsequent modeling techniques [22].
In addition to advanced signal processing techniques like EMD, financial forecasting also relies heavily on technical indicators (TI) [23]. TIs are quantitative tools derived from historical price, volume, or open interest data and are essential for analyzing market behavior and making informed trading decisions. Among the most widely used TIs are moving averages (MA) [24], which smooth price data to highlight trends; the relative strength index (RSI) [25], which assesses the speed and magnitude of price changes to identify oversold or overbought conditions; and Bollinger bands (BB) [26], which use standard deviations to measure market volatility and potential price levels. TIs are designed to provide actionable insights by highlighting patterns and trends that might not be immediately apparent from raw data alone. They help analysts understand market momentum, volatility, and trend direction, making it easier to identify potential trading opportunities and manage risk. For instance, moving averages can signal trend reversals or confirm ongoing trends, while the RSI can indicate when a market may be due for a correction. Meanwhile, BBs reveal periods of high or low volatility, which can be crucial for timing trades [27,28].
While technical indicators offer a straightforward and interpretable approach to market analysis, they are often used in conjunction with advanced machine learning models, such as long short-term memory (LSTM), to enhance predictive accuracy. This hybrid approach combines the practical insights of technical indicators with the sophisticated pattern recognition capabilities of LSTMs, resulting in more reliable and precise financial forecasts [29]. LSTM, a variant of recurrent neural networks (RNNs) [30] within the deep learning (DL) framework, excels in capturing temporal sequential interdependencies, making it highly effective for time series prediction. Unlike traditional RNNs, LSTMs address issues such as vanishing and exploding gradients through their architecture, including memory cells and gating mechanisms. These features allow LSTMs to retain and utilize information over long periods and effectively manage the long-term dependencies that are crucial for financial forecasting [31].
LSTM models are ideally suited for sequence prediction tasks where the order and context of elements are important, as in natural language processing (NLP) [32], time series analysis [33], signal processing [34], bioinformatics [35], video processing [36], gaming [37], and healthcare [38]. Their ability to remember past data points makes them valuable for forecasting applications like financial markets [39], energy load prediction [40], and weather forecasting [41], where predictions often depend significantly on historical patterns.
In the current study, by integrating the EMD, TI, and LSTM concepts, we capitalize on the strengths of all three methodologies, combining sophisticated signal decomposition, practical interpretability, and advanced predictive power to achieve superior financial forecasting performance. The key contributions of this research are as follows:
Introduction of a novel hybrid model: This paper proposes a new machine learning method, named EMD-TI-LSTM, which combines empirical mode decomposition (EMD), technical indicators (TI), and long short-term memory (LSTM) for financial forecasting. This hybrid approach is proposed for the first time in academic research.
Enhanced forecasting accuracy: Our proposed model significantly outperformed the conventional LSTM model on widely used financial datasets, including BTC, BIST, NASDAQ, and GOLD. It achieved average improvements of 39.56%, 36.86%, and 39.90% in prediction accuracy, as measured by the MAPE, RMSE, and MAE metrics, respectively. These enhancements are realized through advanced mathematical modeling and algorithmic refinement.
Better results than state-of-the-art studies: The EMD-TI-LSTM method achieved a mean absolute percentage error (MAPE) that is, on average, 42.91% lower than that of state-of-the-art methods, showcasing its outstanding predictive performance. This reduction highlights the mathematical effectiveness of the EMD-TI-LSTM model in improving forecasting precision.
Effective use of EMD: This paper demonstrates the effective application of EMD in the context of financial forecasting, emphasizing its value in decomposing time series data for improved prediction accuracy.
Technical indicator integration: By incorporating technical indicators in time series analysis, the model enhances its capability to capture market trends and patterns, contributing to more accurate financial forecasts. The integration involves mathematical analysis to improve the model’s predictive power.
Comprehensive evaluation: This study conducts thorough evaluations to measure the effectiveness of the EMD-TI-LSTM method, providing a rigorous comparison against traditional LSTM and state-of-the-art techniques. The evaluation process employs mathematical metrics to ensure accuracy.
Advancement in AI-based financial forecasting: This research underscores the potential of AI-based hybrid models to outperform traditional financial forecasting techniques. Mathematical advancements in AI are laying the groundwork for future advancements in this field.
The structure of this paper is as follows: Section 2 reviews related work; Section 3 details the proposed EMD-TI-LSTM method; Section 4 presents the experimental studies; finally, Section 6 concludes the paper with a summary and discusses future work.
2. Related Works
Research on price prediction has explored a variety of methods, and numerous studies have examined their effectiveness in financial forecasting. In the following, several notable related studies [42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57] are discussed to provide an in-depth review of progress in the field.
In [42], Gandhmal and Kumar provide a comprehensive review of various approaches used for stock market prediction. Their review covers techniques such as support vector machines (SVMs), artificial neural networks (ANNs), Bayesian models, and fuzzy classifiers. This paper distinguishes between prediction-based and clustering-based methods, showcasing the challenges and limitations of current techniques. It concludes that despite significant progress, forecasting stock markets remains a complex task that requires a multifaceted approach to achieve higher reliability and efficiency. In [43], a systematic review critically examines the application of artificial intelligence in stock market trading. This research encompasses topics such as portfolio optimization, financial sentiment analysis, stock market prediction, and hybrid models. The study explores the use of AI approaches, including deep learning and machine learning, to enhance trading strategies, forecast market behavior, and analyze financial sentiment. It also points out the advanced and specialized applications of AI in the stock market.
In [44], the authors report significant advancements in foreign exchange (Forex) and stock price prediction through the application of deep learning techniques. The paper reviews various systematic models based on deep learning, such as convolutional neural networks (CNN), LSTM, deep neural networks (DNN), RNN, and reinforcement learning. The study evaluates the efficacy of these models in predicting market movements using mathematical metrics such as RMSE, MAPE, and accuracy. The findings indicate an emerging trend in which LSTM is frequently used in conjunction with other models to achieve high prediction accuracy, underscoring the evolving landscape of deep learning in financial market forecasting. Singh et al. [45] developed an enhanced LSTM model that incorporates technical indicators, namely RSI and simple moving average (SMA), for stock price forecasting. Their study highlights the model’s superior accuracy and efficiency in predicting stock prices by leveraging these indicators, demonstrating significant improvements over traditional forecasting methods in responding to market trends and patterns more effectively.
In recent studies, several advancements in stock market prediction have been described. For instance, Mittal and Chauhan [46] proposed a model that integrates a range of technical indicators with advanced machine learning techniques, resulting in a significant reduction in error values and an enhancement in forecasting accuracy. Babu and Sathyanarayana [47] established a forecasting model that utilizes technical analysis tools, encompassing Bollinger bands, moving averages, and other relevant indicators, to boost the reliability of stock price estimates. Kaur et al. [48] demonstrated the effectiveness of combining various parameters and technical indicators within their model, showing a substantial improvement in prediction performance. Venikar et al. [49] investigated the application of a stacked model that takes advantage of extensive historical data and advanced computational techniques to achieve more accurate predictions. These studies collectively reflect the growing trend of employing sophisticated hybrid approaches and integrated methodologies to significantly improve forecasting accuracy in the stock market.
Yang et al. [50] presented an integrated approach for stock price prediction that combines LSTM with ensemble EMD, indicating its superior effectiveness and accuracy compared to other techniques. Their comprehensive study validates the predictive performance of this hybrid model and emphasizes its robustness in handling complex financial data. By utilizing the strengths of both LSTM and EMD, this joint methodology offers significant advantages in precisely predicting stock prices, making it a valuable tool for financial forecasting. Similarly, Ali et al. [51] developed an advanced hybrid model using a novel EMD variant and LSTM networks, incorporating Akima spline interpolation for the improved treatment of non-stationary and non-linear financial time series. The decomposed signals are filtered and used as inputs to an RNN, enhancing the modeling of long-term dependencies and improving predictions. Their model, tested on the Karachi Stock Exchange (KSE)-100 index of the Pakistan Stock Exchange (PSX), outperformed pure LSTM and alternative ensemble methods, emphasizing the potential of elaborate data decomposition approaches in deep learning to reinforce stock market predictions.
Xuan et al. [52] presented a novel method for short-term stock price prediction by integrating EMD, LSTM neural networks, and cubic spline interpolation (CSI). The model aims to enhance both the efficiency and accuracy of predicting short-term trends. It decomposes stock price data into intrinsic mode functions (IMF) and a residual component, classified based on gradient magnitude. High-gradient components use an LSTM model, while the rest employ a CSI model, combining their forecasts for the final prediction. Their approach outperformed conventional models, including the standalone LSTM, EMD-LSTM, and SVM models. Jin et al. [53] examined anticipating stock closing prices using sentiment analysis and LSTM, highlighting the crucial role of investor sentiment in enhancing model predictability. They developed a deep learning model that incorporated sentiment analysis, time series decomposition, and an LSTM neural network with an improved attention mechanism. This methodology, particularly its use of EMD for decomposing complex sequences, effectively addresses stock market volatility and noise. Their study underscores the productivity of combining sentiment analysis with advanced machine learning approaches in financial prediction.
Jiang et al. [54] introduced a distinctive two-phase ensemble method for forecasting stock prices, combining extreme learning machine (ELM), empirical or variational mode decomposition (VMD), and the improved harmony search (IHS) technique. In the first stage, stock data are segmented into different frequency elements using EMD or VMD. After that, ELM is applied to each component for future price prediction, with IHS optimizing the ELM parameters to enhance accuracy. The performance of the EMD-ELM-IHS and VMD-ELM-IHS models was compared with the autoregressive integrated moving average (ARIMA), multilayer perceptron (MLP), support vector regression (SVR), ELM, and LSTM models, showcasing superior accuracy and stability. Shu and Gao [55] developed a hybrid model integrating CNN, EMD, and LSTM for stock price forecasting. Their approach involves decomposing stock prices into IMFs using EMD, filtering each IMF with a CNN for feature extraction, and analyzing these features with an LSTM network to model temporal dependencies. The model, tested on the Shanghai Stock Exchange (SSE) composite index for one- and seven-day forecasts, demonstrated improved performance in capturing multifrequency trading patterns compared to counterpart models. Cao et al. [56] reported an advanced hybrid forecasting model that combines complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and LSTM approaches to boost the accuracy of stock market price predictions. The CEEMDAN technique decomposes financial time series data into multiple IMFs and a residual component, capturing various scale-time features of the market. Each IMF, along with the residual, serves as an input to individual LSTM models, which then forecast future stock prices. This hybrid approach leverages CEEMDAN’s ability to handle the non-linear and non-stationary characteristics of financial data and LSTM’s strength in modeling long-term dependencies. Validation using global stock market indices proved that this hybrid model outperforms traditional forecasting methods, including the single SVM, MLP, and LSTM algorithms, by delivering more accurate predictions. Underlying all these approaches is the original LSTM architecture proposed by Hochreiter and Schmidhuber [57], which addresses the vanishing gradient problem in standard RNNs. This issue complicates RNNs’ ability to learn and retain information over long sequences. LSTMs solve this by incorporating memory cells, enabling the network to maintain information over extended periods.
Unlike previous studies, this paper proposes a new approach that consists of empirical mode decomposition, technical indicators, and long short-term memory to benefit from their capabilities, which have yielded superior results in financial prediction. We capitalized on the strengths of all three methodologies to achieve successful results.
3. Materials and Methods
This study introduces a novel hybrid approach, EMD-TI-LSTM, which integrates empirical mode decomposition (EMD), technical indicators (TI), and long short-term memory (LSTM) to boost prediction precision over traditional methods. EMD is utilized to decompose complex market signals into more interpretable components, which are then processed by LSTM to detect long-range dependencies and patterns. Additionally, the incorporation of technical indicators such as the relative strength index (RSI), exponential moving average (EMA), and Bollinger bands (BB) enriches the model’s forecasting capabilities by integrating key market insights. This combination not only improves the model’s accuracy but also provides a more nuanced analysis of market behavior, making it a robust tool for financial forecasting. In this section, we will detail the methodologies that form the basis of EMD-TI-LSTM, and explain their theoretical foundations and practical applications to achieve our research objectives.
3.1. Methodologies
3.1.1. Long Short-Term Memory (LSTM)
LSTM networks are specialized architectures within a deep learning framework, classified as a type of recurrent neural network (RNN). In contrast to conventional feedforward neural networks, LSTMs incorporate feedback connections, making them particularly adept at handling sequential data. This architecture excels in applications such as speech recognition, time series forecasting, and natural language processing, where capturing temporal dependencies and sequential patterns is essential. LSTMs address several limitations of standard RNNs, notably the vanishing gradient problem, which can impede the learning and retention of information over long sequences. To overcome this challenge, LSTMs use memory cells, which are specialized units that preserve information over extended periods.
The LSTM architecture features several key mathematical elements, comprising the cell state, input gate, output gate, and forget gate, as illustrated in Figure 1. The cell state serves as a long-term memory unit that retains information over time. The input gate controls the addition of new information to the cell state, while the output gate regulates the transfer of information from the cell state to the network’s output. The forget gate determines the information that should be removed from the cell state. These components work in concert to dynamically manage information, allowing LSTMs to retain relevant data while discarding irrelevant information, thus enhancing their ability to effectively process and predict sequential data.
The operation of an LSTM cell can be understood through the interaction of its components, as defined by the following Equations (1)–(6). These mathematical equations describe how information is retained, discarded, and output at each time step:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1}$$

where
$f_t$: Forget gate;
$W_f$: Weight matrix for the forget gate;
$h_{t-1}$: Hidden state from the previous time step;
$x_t$: Input at the current time step;
$b_f$: Bias term for the forget gate.

The forget gate $f_t$ identifies which information from the cell state $C_{t-1}$ should be discarded or retained for the current time step. It uses a sigmoid activation function $\sigma$ that outputs values between 0 and 1. A value of 0 indicates that the related information in the cell state is entirely forgotten, while a value of 1 means it is completely retained. The output $f_t$ is then used to scale the previous cell state $C_{t-1}$, regulating the extent to which the old cell state is carried over to the new cell state $C_t$.
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2}$$

where
$i_t$: Input gate;
$W_i$: Weight matrix for the input gate;
$h_{t-1}$: Hidden state from the previous time step;
$x_t$: Input at the current time step;
$b_i$: Bias term for the input gate.

The input gate $i_t$ regulates the quantity of new information to be incorporated into the cell state $C_t$. It uses a sigmoid activation function $\sigma$ to output values between 0 and 1, where 0 indicates that no new information is added, and 1 indicates that the entire new input is stored in the cell state. This gating mechanism ensures that only the relevant information is updated in the cell state, boosting the network’s capability to capture and utilize important features over time.
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3}$$

where
$\tilde{C}_t$: Cell state candidate;
$W_C$: Weight matrix for the cell state candidate;
$h_{t-1}$: Hidden state from the previous time step;
$x_t$: Input at the current time step;
$b_C$: Bias term for the cell state candidate.

The cell state candidate $\tilde{C}_t$ represents a potential update to the cell state $C_t$. It is computed by applying a linear transformation to the concatenation of the prior hidden state $h_{t-1}$ and the current input $x_t$, followed by a bias term $b_C$. The result is passed through the hyperbolic tangent function $\tanh$, which squashes the produced values to fall within the range of −1 to 1. The $\tanh$ function allows the model to propose new information to be incorporated into the cell state, ensuring that this information is normalized. The range of −1 to 1 helps maintain the stability of the learning process and mitigate issues such as vanishing gradients. The proposed cell state candidate $\tilde{C}_t$ is then used, together with the input gate, to determine the final cell state modification, integrating new information with the existing cell state in a controlled manner.
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}$$

where
$C_t$: Updated cell state;
$f_t$: Forget gate output;
$C_{t-1}$: Previous cell state;
$i_t$: Input gate output;
$\tilde{C}_t$: Cell state candidate.

The cell state is updated by first multiplying the previous cell state $C_{t-1}$ by the forget gate output $f_t$, which determines how much of the old information should be retained or discarded. This process ensures that irrelevant data are filtered out. Next, the model incorporates new, relevant information by adding the product of the input gate output $i_t$ and the cell state candidate $\tilde{C}_t$. This addition updates the cell state with fresh information, which is regarded as crucial, allowing the LSTM to dynamically adjust its memory and improve its ability to process sequential data effectively.
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5}$$

where
$o_t$: Output gate;
$W_o$: Weight matrix for the output gate;
$h_{t-1}$: Hidden state from the previous time step;
$x_t$: Input at the current time step;
$b_o$: Bias term for the output gate.

The output gate $o_t$ specifies how much of the cell state $C_t$ should be exposed as the hidden state $h_t$ for the current time step. This gate is computed by applying a linear transformation to the concatenated earlier hidden state $h_{t-1}$ and the current input $x_t$, accompanied by a bias term $b_o$. The result of this linear combination is then passed through the sigmoid activation function $\sigma$, which outputs values between 0 and 1. The sigmoid function’s output controls the extent to which the cell state $C_t$ is considered for the hidden state. A value of 0 means that the cell state has minimal influence, while a value of 1 means it has full influence.
$$h_t = o_t \odot \tanh(C_t) \tag{6}$$

where the hidden state $h_t$ is derived by multiplying the output gate $o_t$ with the normalized cell state. The cell state $C_t$ is first normalized using the hyperbolic tangent function, which scales its values between −1 and 1 for stability. The output gate $o_t$ then scales this normalized cell state to produce $h_t$, which contains the relevant information from the cell state and is used in the next time step or as the network output. The result represents the output hidden state at time $t$, which contains information based on the input sequence up to that point.
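To make the gate interactions concrete, here is a minimal NumPy sketch of a single LSTM cell step following Equations (1)–(6); the function name and weight shapes are illustrative assumptions rather than a specific library’s API:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    """One LSTM time step; each W_* has shape (hidden_dim, hidden_dim + input_dim)."""
    z = np.concatenate([h_prev, x_t])      # concatenation [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)           # (1) forget gate
    i_t = sigmoid(W_i @ z + b_i)           # (2) input gate
    c_tilde = np.tanh(W_c @ z + b_c)       # (3) cell state candidate
    c_t = f_t * c_prev + i_t * c_tilde     # (4) cell state update
    o_t = sigmoid(W_o @ z + b_o)           # (5) output gate
    h_t = o_t * np.tanh(c_t)               # (6) hidden state
    return h_t, c_t
```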
With the detailed architecture and equations of the LSTM network established, we can now explore how these components contribute to its performance and the practical considerations involved in configuring an LSTM model effectively. An LSTM cell effectively manages which information to pass, store, and forget. This capability allows LSTM networks to handle long-term patterns within sequences, addressing challenges that typically affect standard RNNs, such as the vanishing gradient problem and difficulties in retaining information over extended sequences.
The quality of the LSTM models is predominantly determined by the chosen hyperparameters. Hyperparameters are model settings established before training begins and remain constant throughout the learning process. The key hyperparameters of LSTM include the following:
Number of units in an LSTM layer: This defines the dimension of the memory cell and influences the network’s ability to recognize complex patterns. More units can enable the network to capture more intricate features, but they also increase computational complexity and the risk of overfitting. The ideal number of units depends on the task’s complexity and the available training data.
Learning rate: This controls the magnitude of the updates made by the optimizer during the gradient descent. An elevated learning rate might lead the model toward converging to a suboptimal solution or even diverging, while a reduced learning rate can result in prolonged training times or getting stuck in local minima. Adaptive methods like ADAM, RMSprop, and ADAGRAD tune the learning rate depending on the update history of the weights, helping to mitigate these issues.
Batch size: This is the number of training examples utilized in each iteration. A smaller batch size results in more frequent updates per epoch and can lead to faster convergence, but it may introduce more noise in the gradient estimates. Conversely, a larger batch size yields steadier gradient estimates but uses more memory, potentially slowing convergence. The choice of batch size also affects how well the model generalizes from the training data.
Depth of the network: This is determined by the number of LSTM layers and influences the network’s capacity to recognize intricate patterns in the data. Deeper networks can model more intricate relationships but may encounter training difficulties due to issues like vanishing gradients. Techniques such as gradient clipping and the inherent gating mechanisms of LSTM help address these problems. Additionally, deeper networks require more computational resources and are more susceptible to overfitting, necessitating regularization methods like dropout.
Dropout rate: Dropout is a regularization method employed to avoid overfitting by randomly eliminating units and their connections during training. In LSTMs, dropout is typically applied between layers rather than within recurrent connections to avoid disrupting the flow of memory information. The dropout rate specifies the proportion of units to be removed, balancing the need for regularization with the risk of underfitting.
Length of input sequences: This hyperparameter affects the LSTM’s ability to learn long-term dependencies. Longer sequences allow the model to capture extended input patterns but increase computational complexity and memory demands. Techniques such as sequence shortening or attention mechanisms can help manage long sequences and enhance model performance.
In conclusion, selecting the appropriate hyperparameters for an LSTM model is crucial for optimizing its performance and training efficiency. The best hyperparameter settings are typically task specific and can be determined through a combination of expert knowledge, empirical testing, and automatic optimization methods, such as grid search or Bayesian optimization. Understanding the role and impact of each hyperparameter will assist practitioners in fine-tuning their LSTM models for diverse applications, ensuring optimal results.
3.1.2. Technical Indicators (TI)
Technical indicators are numerical computations based on the price, volume, or open interest of a security or contract and are widely used in technical analysis. These indicators help analyze past and present market behaviors to predict future price movements. Their primary applications include identifying trade opportunities by generating buy and sell signals, determining market trends, and evaluating the strength or weakness of a security. Technical indicators can be categorized into the following groups:
Trend indicators: These indicators identify the direction and magnitude of a trend. Examples include the moving average (MA), exponential moving average (EMA), moving average convergence divergence (MACD), and directional movement index (DMI).
Momentum indicators: Used to identify the velocity of price movements, these indicators help recognize overbought or oversold conditions. Examples include relative strength index (RSI), stochastics, and commodity channel index (CCI).
Volatility indicators: These indicators measure the extent of price changes, regardless of direction, and are used for risk assessment and gauging market sentiment. Examples include the Bollinger bands (BB), average true range (ATR), and volatility index (VIX).
Volume indicators: These indicators analyze trading volumes to verify trends or predict trend reversals. Examples include the volume oscillator, on-balance volume (OBV), and Chaikin money flow (CMF).
Investors and traders use technical indicators to recognize optimal entry and exit points, as these indicators can signal the best times to buy or sell a security. They also confirm price movements by validating the strength and consistency of price trends using multiple indicators. Additionally, technical indicators are crucial for risk management, helping to place stop-loss orders or adjust portfolio risk exposure based on market conditions. Understanding the assumptions and mathematical calculations behind these indicators is essential for their effective use in both academic and practical applications. By employing varied signals and incorporating other types of analysis, traders can develop a more sophisticated approach to market analysis. To delve deeper into specific technical indicators, we will explore the EMA, RSI, and BB, examining how each can be utilized to enhance market analysis and trading strategies.
Exponential Moving Average (EMA)
EMA is a financial market technical indicator that smooths price data over a certain period while emphasizing the importance of recent price movements. This weighting method makes the EMA more sensitive to recent price changes relative to the simple moving average (SMA), which assigns equal weights to all values within the period. EMA is crucial for identifying trend directions and potential reversals and is an important component of other technical indicators. Mathematically, EMA can be expressed through Equations (7) and (8):

$$\mathrm{EMA}_t = \alpha \cdot P_t + (1 - \alpha) \cdot \mathrm{EMA}_{t-1} \tag{7}$$

$$\alpha = \frac{2}{N + 1} \tag{8}$$

where
$\mathrm{EMA}_t$: EMA value for the current period;
$P_t$: Closing price for the current period;
$N$: Number of periods in the EMA;
$\alpha$: Smoothing factor in the EMA;
$\mathrm{EMA}_{t-1}$: EMA value from the previous period.

EMA utilizes a smoothing factor $\alpha$ to adjust the weight given to recent prices, with higher values of $\alpha$ increasing sensitivity to recent price changes. This dynamic weighting allows the EMA to reflect current market trends more accurately by emphasizing recent data while still considering historical prices. Updated daily using the closing price and the previous EMA value, the EMA adapts quickly to new information. Its ability to identify trends and potential reversals makes it a valuable tool when used along with other technical indicators to enhance financial strategies and gain deeper insights into market conditions.
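As a brief illustration, Equations (7) and (8) can be computed with pandas as follows; the seven-period default mirrors the window size later specified in Table 1, and the helper name is our own:

```python
import pandas as pd

def ema(close: pd.Series, n: int = 7) -> pd.Series:
    """EMA per Equations (7)-(8)."""
    alpha = 2.0 / (n + 1)                           # Equation (8)
    # adjust=False applies the recursion EMA_t = alpha*P_t + (1-alpha)*EMA_{t-1}
    return close.ewm(alpha=alpha, adjust=False).mean()
```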
Relative Strength Index (RSI)
RSI is a highly regarded momentum oscillator that is commonly exploited in technical analysis to evaluate the velocity and magnitude of recent price movements. By comparing the magnitude of recent gains to recent losses, the RSI serves as a crucial tool for identifying overbought or oversold conditions in a given asset. This comparative analysis allows market participants to gauge the momentum underlying price fluctuations, thereby providing a more nuanced comprehension of an asset’s price performance over a specified period. The RSI is mathematically derived using Equations (9)–(12) and is designed to quantify momentum in a way that reduces the impact of transient price anomalies. This smoothing effect enhances the reliability of the RSI, making it a more robust indicator for making informed trading and investment decisions.

$$\mathrm{RSI} = 100 - \frac{100}{1 + \mathrm{RS}} \tag{9}$$

$$\mathrm{RS} = \frac{\text{Average gain}}{\text{Average loss}} \tag{10}$$

$$\text{Average gain} = \frac{\text{Previous average gain} \times (n - 1) + \text{Current gain}}{n} \tag{11}$$

$$\text{Average loss} = \frac{\text{Previous average loss} \times (n - 1) + \text{Current loss}}{n} \tag{12}$$

where
RSI: Relative strength index as a momentum indicator with a value ranging between 0 and 100;
RS: Relative strength, representing the ratio of average gains to average losses over the specific period;
Average gain: Mean of all positive price changes over the period;
Previous average gain: Average gain calculated during the previous period;
Current gain: Gain in the current period representing a positive price change, or zero if the price did not increase;
Average loss: Mean of all negative price changes over the period;
Previous average loss: Average loss calculated during the previous period;
Current loss: Loss in the current period representing a negative price change, or zero if the price did not decrease;
$n$: Number of periods, commonly set to 14 days.
It is worth noting that an RSI value of 70 or above typically suggests that a security may be overbought, which could lead to a trend reversal or price pullback. Conversely, an RSI value of 30 or less indicates that the security might be oversold or undervalued, potentially setting the stage for a price rebound. A critical signal provided by the RSI is the divergence between the RSI and price action, which often precedes significant trend changes. A bullish divergence occurs when the price hits a new low but the RSI forms a higher low, suggesting that despite declining prices, the underlying momentum is strengthening. This divergence signals a potential upward reversal. On the other hand, a bearish divergence occurs when the price reaches a new high while the RSI shows a lower high, indicating weakening momentum and a possible downward shift. The RSI is an indispensable tool for traders and analysts, as it helps not only identify overbought and oversold conditions but also detect these crucial divergences. To maximize its effectiveness, the RSI should be integrated with other technical analysis tools to enhance the robustness of trading and investment decisions.
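A minimal pandas sketch of Equations (9)–(12) follows; note that ewm with alpha = 1/n and adjust=False reproduces exactly the Wilder-style smoothing recursion of Equations (11) and (12):

```python
import pandas as pd

def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    """RSI per Equations (9)-(12)."""
    delta = close.diff()
    gain = delta.clip(lower=0.0)     # current gain, zero if the price fell
    loss = -delta.clip(upper=0.0)    # current loss, zero if the price rose
    # Wilder smoothing: avg_t = (avg_{t-1} * (n - 1) + current) / n
    avg_gain = gain.ewm(alpha=1.0 / n, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1.0 / n, adjust=False).mean()
    rs = avg_gain / avg_loss         # (10) relative strength
    return 100.0 - 100.0 / (1.0 + rs)  # (9) RSI in [0, 100]
```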
Bollinger Bands (BB)
BB is a technical analysis tool comprising three lines plotted around a security’s price to assess volatility. The middle band is calculated as the simple moving average (SMA) of the closing prices over a specific period $n$. The upper band is determined by adding a multiple $k$ of the standard deviation of the closing prices over the same period to the middle band, while the lower band is determined by subtracting this value from the middle band. Bollinger bands adapt to market conditions, widening during times of high volatility and narrowing during times of low volatility. The mathematical formulas for these bands are presented in Equations (13)–(15):

$$\text{Middle band} = \mathrm{SMA}_n = \frac{1}{n} \sum_{i=1}^{n} P_i \tag{13}$$

$$\text{Upper band} = \text{Middle band} + k \cdot \sigma_n \tag{14}$$

$$\text{Lower band} = \text{Middle band} - k \cdot \sigma_n \tag{15}$$

where
$n$: Number of periods in the SMA;
$P_i$: Closing price of period $i$;
$k$: Number of standard deviations, typically set to 2;
$\sigma_n$: Standard deviation of the closing prices over the specific period.
Wider Bollinger bands indicate higher market volatility, while narrower bands suggest lower volatility. The middle band’s direction shows a trend—an upward tilt signals an uptrend and a downward tilt indicates a downtrend. The upper and lower bands function as adjustable support and resistance levels, with price movements often touching or breaching these bands to signal overbought or oversold conditions. Combining BB with other indicators like EMA and RSI can improve financial analysis, although it may increase the complexity and risk of overfitting. Effective feature selection and regularization are crucial for mitigating this risk. The indicators’ responsiveness varies with market conditions, so their use should be guided by empirical evidence and expertise. Ultimately, integrating these indicators with advanced models like LSTM can enhance predictive accuracy by leveraging detailed market data.
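A corresponding sketch of Equations (13)–(15) in pandas; note that pandas computes the sample standard deviation by default, a minor assumption relative to the paper’s formulas:

```python
import pandas as pd

def bollinger_bands(close: pd.Series, n: int = 7, k: float = 2.0):
    """Bollinger bands per Equations (13)-(15)."""
    middle = close.rolling(n).mean()   # (13) middle band (SMA over n periods)
    sd = close.rolling(n).std()        # rolling standard deviation (ddof=1)
    upper = middle + k * sd            # (14) upper band
    lower = middle - k * sd            # (15) lower band
    return middle, upper, lower
```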
3.1.3. Empirical Mode Decomposition (EMD)
EMD is a method intended to deconstruct a signal into intrinsic mode functions (IMF), which help reveal underlying trends and cycles. Unlike traditional methods that assume linearity and stationarity, EMD is particularly effective for analyzing non-linear and fluctuating data, making it valuable in fields such as finance, geophysics, mechanics, and biomedical engineering. This method relies on recognizing inherent oscillatory modes within a complex dataset and separating them according to the local attributes of the data. The decomposition process is entirely empirical, with IMFs derived directly from the data without any predetermined basis. The EMD process iteratively removes IMFs from the signal, where each IMF must satisfy two key conditions: the number of local maxima and zero crossings should be equal or differ by at most one, and the mean of the envelope between the maxima and minima must be zero at every location. The decomposition process involves the following steps, expressed through Equations (16)–(23):

1. Initialize the residue with the original signal $x(t)$:

$$r_0(t) = x(t) \tag{16}$$

2. Iteratively extract each IMF:

- Identify and interpolate the local extrema to form the lower and upper envelopes by incorporating local minima and maxima:

$$e_{\mathrm{low},i}(t) = f_{\mathrm{interp}}\left(\{t_k, r_{i-1}(t_k)\}\right), \quad t_k \in \text{local minima} \tag{17}$$

$$e_{\mathrm{up},i}(t) = f_{\mathrm{interp}}\left(\{t_k, r_{i-1}(t_k)\}\right), \quad t_k \in \text{local maxima} \tag{18}$$

where
$e_{\mathrm{low},i}(t)$: Lower envelope at iteration $i$, formed by interpolating through the local minima of the signal;
$e_{\mathrm{up},i}(t)$: Upper envelope at iteration $i$, formed by interpolating through the local maxima of the signal;
$t$: Time point at which to evaluate the variables;
$f_{\mathrm{interp}}$: Interpolation function used to create a smooth curve connecting the given points;
$t_k$: Time instances corresponding to the local extrema (maxima or minima);
$r_{i-1}(t_k)$: Residue value at time $t_k$, where the local extrema occur.

- Compute the mean envelope and subtract it from the residue to obtain the IMF:

$$m_i(t) = \frac{e_{\mathrm{low},i}(t) + e_{\mathrm{up},i}(t)}{2} \tag{19}$$

$$\mathrm{IMF}_i(t) = r_{i-1}(t) - m_i(t) \tag{20}$$

where
$\mathrm{IMF}_i(t)$: IMF at iteration $i$, representing the oscillatory component extracted from the residue after removing the trend;
$r_{i-1}(t)$: Residue from the previous iteration, including both the trend and the oscillatory components.

- Update the residue once the extracted IMF meets the stopping criteria:

$$r_i(t) = r_{i-1}(t) - \mathrm{IMF}_i(t) \tag{21}$$

where
$r_i(t)$: Residue at iteration $i$, the remaining signal after removing the IMF at the current iteration;
$r_{i-1}(t)$: Residue from the previous iteration $i-1$, the signal before removing the current IMF.

3. The process concludes when the residue evolves into a monotonic function or cannot be further decomposed. The original signal is represented as the sum of all extracted IMFs and the final residue:

$$S(t) = \sum_{i=1}^{N} \mathrm{IMF}_i(t) \tag{22}$$

$$x(t) = S(t) + r_N(t) \tag{23}$$

where
$N$: Total number of extracted IMFs;
$S(t)$: Sum of the extracted IMFs during the EMD process, capturing the oscillatory components of the signal;
$r_N(t)$: Final residue signal after completing the EMD process, which is either a monotonic function or a fixed constant, representing the portion of the original signal that cannot be further decomposed.
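In practice, this decomposition can be obtained with the PyEMD library used later in this study; the sketch below assumes a hypothetical input file of daily close prices and verifies the reconstruction property of Equation (23):

```python
import numpy as np
from PyEMD import EMD  # PyEMD 1.0.0, as listed in Section 3.4

# Hypothetical input: a one-column text file of daily close prices.
close = np.loadtxt("close_prices.txt")

emd = EMD()
emd.emd(close)                              # run the sifting process
imfs, residue = emd.get_imfs_and_residue()  # IMF 1..N and the final residue

# Equation (23): the IMFs plus the final residue reconstruct the signal.
assert np.allclose(imfs.sum(axis=0) + residue, close)
print(f"Extracted {len(imfs)} IMFs plus a final residue.")
```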
3.2. Proposed Model
In the current study, we developed and evaluated two distinct models to compare their performance using identical hyperparameters and datasets. The first model employs a standard long short-term memory (LSTM) network integrated with technical indicators (TI). The second model, introduced in this study as EMD-TI-LSTM, incorporates an innovative approach by integrating empirical mode decomposition (EMD) with technical indicators, which are then processed by an LSTM network. The workflows of both models are depicted in Figure 2, which illustrates the differences in their design and implementation.
The first model follows a conventional approach by utilizing the capabilities of the LSTM network to detect enduring patterns in time series data. Historical data, specifically past asset prices, are first imported. Technical indicators such as the exponential moving average (EMA), relative strength index (RSI), and Bollinger bands (BB) are then computed and used as additional inputs to strengthen the model’s prediction accuracy. After calculating the technical indicators, the data are normalized, and sequences are prepared for the LSTM model. The LSTM network, implemented using the TensorFlow 2.2 framework (Google Inc., Mountain View, CA, USA) and the Keras 3.4 open-source library (Google Inc., Mountain View, CA, USA), consists of two LSTM layers, a dropout layer to mitigate overfitting, and a fully connected output layer. The first model is configured with an appropriate loss function and optimizer. It is trained on the training set using early stopping to avoid overfitting. Following training, the model is applied to predict the test dataset, and these predictions are rescaled to the original price scale. The effectiveness of the model is measured using metrics such as the mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE) to determine the accuracy and reliability of the forecasts.
The EMD-TI-LSTM model introduces EMD into the prediction process. After importing historical data, EMD breaks down the initial signal into intrinsic mode functions (IMF), capturing various frequency components of the signal. These IMFs provide a richer representation of the data by breaking them down into multiple scales. The decomposed signals are then combined with technical indicators and processed through the LSTM network, following a training procedure similar to the first model. However, EMD-TI-LSTM includes an additional step of aggregating the forecasts derived from multiple IMFs before exporting the final results. In summary, while the first model processes technical indicators directly through the LSTM network, EMD-TI-LSTM first applies EMD to the signal, creating a more detailed feature set by combining IMFs with technical indicators before feeding them into the LSTM. A comparison between these models seeks to evaluate the effect of incorporating EMD on the predictive accuracy of the LSTM network to determine whether the enhanced feature representation provided by EMD offers a significant advantage over the standard LSTM approach.
As detailed in Figure 3, the EMD-TI-LSTM model employs an advanced multi-step methodology for time series forecasting, specifically designed to enhance predictive accuracy by integrating EMD with technical indicators and an LSTM network. The process unfolds through the following steps:
Import historical data: The model begins by importing historical asset data, which serve as the foundational input for the forecasting process. These data typically consist of time series information, such as past asset prices.
Apply EMD: The historical data undergo the EMD process, which breaks down the original signal into multiple IMFs. EMD is a powerful method that splits a signal into its underlying components, each of which represents various frequency bands. This decomposition allows the model to capture various oscillatory behaviors within the data.
Calculate TI for each IMF: For each IMF extracted through the EMD process, the model calculates a set of technical indicators, including EMA, BB, and RSI. These indicators are crucial for capturing different market dynamics and trends. The IMFs, enriched with their respective technical indicators, are then used as input features for the subsequent LSTM model. This combination of IMFs and technical indicators provides a more elaborate and complete presentation of the underlying data.
Apply LSTM: The model then constructs a sequential LSTM model for each IMF-TI combination, with carefully defined hyperparameters and dropout layers for regularization to prevent overfitting. LSTM networks are designed to capture long-term dependencies in the time series by learning from the enriched feature sets provided by each IMF. Each LSTM model is trained independently on its respective IMF and associated technical indicators.
Obtaining forecasts: Once the training is complete, each LSTM network generates forecasts for its corresponding IMF. These forecasts reflect the model predictions for each decomposed component of the original signal.
Calculate the forecast: The final step involves aggregating the individual forecasts produced by each LSTM network. This aggregation synthesizes the predictions from all IMFs into a single unified forecast of the asset’s future price. By combining the insights derived from each IMF, the EMD-TI-LSTM model delivers a more reliable and accurate prediction than a single LSTM model.
Model performance evaluation: The performance of the EMD-TI-LSTM model is evaluated using metrics such as MAPE, RMSE, and MAE. These metrics are calculated by comparing the aggregated forecasts against the actual asset prices, providing a clear measure of the model’s accuracy and reliability.
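As a concrete reference for the evaluation step above, the following minimal sketch computes the three metrics; the formulas are the standard definitions, and reporting MAPE as a percentage is our assumption about the paper’s convention:

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute MAPE, RMSE, and MAE between actual and forecasted prices."""
    err = y_true - y_pred
    return {
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),  # mean absolute percentage error
        "RMSE": float(np.sqrt(np.mean(err ** 2))),             # root mean square error
        "MAE": float(np.mean(np.abs(err))),                    # mean absolute error
    }
```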
The hyperparameters for the EMD-TI-LSTM model are summarized in Table 1. The model configuration includes a window size of seven for the technical indicators, including EMA, RSI, and BB, meaning each indicator is calculated using the most recent seven time steps of the time series data, which helps capture short-term trends. Mathematically, this approach integrates recent historical data into the forecasts. The model employs two LSTM layers, each with 512 units, leveraging a deep learning framework to effectively identify and remember sophisticated patterns and temporal relationships in the time series data. The training process is set for 500 epochs, where each epoch represents a full pass through the entire training dataset, and a learning rate of 0.0001, ensuring a stable learning curve. The sequence length and batch size are configured at 60 and 32, respectively. The sequence length defines the number of time steps used in each input segment, enabling the model to be trained on a predetermined window of historical data. The batch size specifies the number of samples processed before updating the model parameters, thus influencing the data processing efficiency and memory usage. A dropout rate of 0.1 is applied to reduce overfitting, and a train/test ratio of 0.95 is chosen to allocate the majority of the dataset for training, consequently refining the model’s forecasting ability in time series analysis.
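To make the Table 1 configuration concrete, the following is a minimal Keras sketch assuming these hyperparameters; the exact layer arrangement and the feature count (one IMF column plus its technical indicators) are our reconstructions, not the authors’ published code:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_lstm(seq_len: int = 60, num_features: int = 6) -> keras.Model:
    """Two stacked LSTM layers of 512 units with 0.1 dropout, per Table 1."""
    model = keras.Sequential([
        layers.Input(shape=(seq_len, num_features)),
        layers.LSTM(512, return_sequences=True),  # first LSTM layer
        layers.Dropout(0.1),
        layers.LSTM(512),                         # second LSTM layer
        layers.Dropout(0.1),
        layers.Dense(1),                          # one-step-ahead forecast
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
                  loss="mse")
    return model
```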
Table 2 presents the comprehensive algorithm for the EMD-TI-LSTM model, detailing each step from the initial setup to model evaluation. The process begins by recording the start time and ensuring that all necessary packages are installed and libraries are imported. Google Drive is mounted to facilitate file access. Configurations are defined, including assets, time intervals, and hyperparameter ranges. The algorithm then iterates over each asset and hyperparameter combination. For each configuration, it prints the current setup, sets the Adam optimizer with the specified learning rate, and loads the data from an Excel file. The data are preprocessed by converting dates, sorting, and extracting close values. The EMD technique is applied to these values, followed by the definition of functions for data preparation and LSTM model creation. Lists are initialized to collect predictions and actual values. Each IMF is processed sequentially, where indicators like EMA, BB, and RSI are calculated and combined into a single data frame. Data are then prepared for LSTM training, split into training and testing sets, and used to create and train the LSTM model. Predictions are made, inverse transformed, and stored. Forecasts are aggregated by summing them and aligning them with test close prices and dates, and then performance metrics, such as MAPE, RMSE, and MAE, are calculated. Results are printed, saved in an Excel file, and plotted. The process is concluded by printing a completion message and recording the end time to calculate the total execution duration.
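Condensing the algorithm of Table 2, the sketch below is a hypothetical end-to-end routine that reuses the ema, rsi, bollinger_bands, and build_lstm helpers sketched earlier; data normalization, the 0.95 train/test split, and result export are omitted for brevity:

```python
import numpy as np
import pandas as pd
from PyEMD import EMD

def make_sequences(values: np.ndarray, seq_len: int = 60):
    """Slide a seq_len window over the feature rows; the target is the
    next value of the first column (the IMF itself)."""
    X = np.stack([values[i:i + seq_len] for i in range(len(values) - seq_len)])
    y = values[seq_len:, 0]
    return X, y

def emd_ti_lstm_forecast(close: pd.Series) -> float:
    """Decompose, enrich each component with TIs, train one LSTM per
    component, and sum the per-component forecasts into the final prediction."""
    forecasts = []
    for imf in EMD().emd(close.to_numpy()):        # IMFs plus final residue
        s = pd.Series(imf, index=close.index)
        mid, up, low = bollinger_bands(s)          # TI helpers sketched earlier
        feats = pd.DataFrame({"imf": s, "ema": ema(s), "rsi": rsi(s),
                              "bb_mid": mid, "bb_up": up,
                              "bb_low": low}).dropna()
        X, y = make_sequences(feats.to_numpy())
        model = build_lstm(num_features=feats.shape[1])
        model.fit(X, y, epochs=500, batch_size=32, verbose=0)
        next_step = model.predict(X[-1:], verbose=0)
        forecasts.append(float(next_step.ravel()[0]))
    return sum(forecasts)                          # aggregate per-IMF forecasts
```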
The proposed model (EMD-TI-LSTM) has a number of advantages, which can be summarized as follows. First, it tends to enhance financial forecasting since it benefits from the strengths of three useful concepts (EMD, TI, and LSTM). The unique combination of these methodologies can contribute to both the theoretical and practical aspects of financial forecasting and can have broader implications for machine learning applications in the field. The relative advantage of this model is its ability to use EMD to break down complex market signals into more interpretable components, which can then be processed using LSTM. Furthermore, the incorporation of technical indicators (EMA, RSI, etc.) enhances the model’s forecasting capabilities by incorporating meaningful market information into the analysis. An important advantage of the EMD-TI-LSTM approach is that it can be applied to any historical asset data without any prior information about the given dataset. It is entirely agnostic to the asset type; in fact, it simply learns from the time series data samples. One of the main advantages of EMD-TI-LSTM is that it can be easily implemented using the existing PyEMD 1.0.0 library and LSTM networks by simply setting a few understandable parameters. Another advantage is that it can easily facilitate further research and can be adapted for advanced forecasting problems in various financial domains within the academic community and industry. Thus, the EMD-TI-LSTM method not only pushes the boundaries of conventional forecasting approaches but also fosters innovation in AI-driven financial analysis. The subsequent sections will substantiate these advantages by applying the EMD-TI-LSTM model to well-known real-world datasets, presenting experimental results, and conducting comprehensive comparisons to validate the model’s practical efficacy.
3.3. Dataset Description
In this study, four distinct financial assets were analyzed: the close price of BTC/USD, the close index of the BIST 100 Index, the close index of the NASDAQ-100 Index, and the close price of GOLD/USD. To clarify, the datasets will be referred to using the tickers BTC, BIST, NASDAQ, and GOLD, respectively. All datasets were sourced from the TradingView website [58] and are briefly presented in Table 3. Additionally, each dataset is described in detail in this table to provide an inclusive view of its characteristics and significance.
3.3.1. BTC/USD
BTC/USD represents the exchange rate between Bitcoin (BTC), the world’s most traded and widely adopted cryptocurrency, and the United States Dollar (USD). It indicates how much one Bitcoin is worth in U.S. dollars. Bitcoin, as the first digital currency, pioneered the cryptocurrency market and remains a dominant asset within this emerging class. The BTC/USD trading pair not only reflects Bitcoin’s value in terms of U.S. dollars but also captures the high volatility and dynamic nature of cryptocurrency markets, making it a critical indicator for analyzing market trends and investor behavior.
3.3.2. BIST 100 Index
The BIST 100 Index is the primary stock market index of Turkey, representing the outcomes of the leading and highly liquid enterprises listed on the Borsa Istanbul (BIST) stock exchange. This capitalization-weighted index comprises 100 companies with the highest trading volumes and market values, excluding investment trusts. The selection of constituents for the BIST 100 is based on predetermined criteria designed to ensure the inclusion of the most significant and stable companies in the Turkish market. As a key indicator of Turkey’s economic health, the BIST 100 Index provides meaningful perspectives on overall performance and trends within the Turkish equity market.
3.3.3. NASDAQ-100 Index
The NASDAQ-100 Index, where NASDAQ stands for the National Association of Securities Dealers Automated Quotations, encompasses 100 of the top companies listed on the NASDAQ stock market, the world’s second-largest by market capitalization. This modified capitalization-weighted index includes major technology and high-growth companies across various industries, such as technology, telecommunications, biotechnology, media, and services, while excluding financial services companies. As a key indicator, the NASDAQ-100 offers a broad view of the U.S. technology sector and serves as a valuable benchmark for investors assessing the performance of the stock market, particularly in sectors characterized by innovation and rapid growth.
3.3.4. GOLD/USD
GOLD/USD refers to the trading pair that represents the exchange rate between gold, measured in troy ounces, and the United States Dollar (USD). This trading pair indicates the value of one troy ounce of gold in terms of U.S. dollars. Gold has been a fundamental asset in financial markets and was the basis of economic capitalism until the abandonment of the gold standard, which led to the adoption of a fiat currency system. Gold continues to be a widely followed and critical asset in global finance. The GOLD/USD exchange rate serves as a key metric for evaluating gold’s value and trends within a broader economic context.
Daily closing values for these assets were obtained from the TradingView website (TradingView Inc., New York, NY, USA), covering a ten-year period from 15 November 2013 to 15 November 2023. The statistical information of these datasets, including the count, minimum, mean, maximum, and standard deviation (SD), is presented in Table 4. This table reveals distinct volatility patterns among the four financial assets: BTC, BIST, NASDAQ, and GOLD. BTC displays the highest volatility, reflecting the unpredictable nature and significant price fluctuations typical of cryptocurrency markets. Conversely, GOLD exhibits the lowest volatility, underscoring its role as a stable asset. The BIST and NASDAQ indices demonstrate moderate volatility, highlighting variations in risk and stability across different market sectors. This comparison underscores the diverse risk profiles of these assets, with BTC being the most volatile and GOLD being the most stable.
Additionally, the daily close values of BTC, BIST, NASDAQ, and GOLD over time from 1 January 2014 to 1 January 2024 are illustrated in Figure 4. The figure provides a clear visualization of the trends and fluctuations in the closing prices of these assets over the ten years. BTC shows a pattern of extreme volatility with sharp peaks and troughs, reflecting the high-risk nature of the cryptocurrency market. BIST exhibits a steady upward trend, particularly in later years, indicating growth in the stock market. In addition, NASDAQ shows a strong upward trajectory, consistent with the expansion of the technology sector. GOLD demonstrates relatively lower volatility compared to BTC, with gradual increases and a few significant dips, underscoring its stability.
The actual close values and their corresponding IMFs for BTC, BIST, NASDAQ, and GOLD are shown in Figure 5, Figure 6, Figure 7 and Figure 8, respectively. These figures highlight the underlying trends and cycles at various frequencies, which are crucial for forecasting and understanding market behavior. By examining these IMFs, we can identify the significant patterns that make them valuable features for more accurate market predictions. For example, Figure 5 illustrates the original close values of BTC along with their respective IMFs. The top subplot shows the BTC close value time series, which reflects the overall market trend. The subsequent plots showcase the decomposed IMFs, ranging from IMF 1 to IMF 9, each representing different frequency components of the original signal. The high-frequency intrinsic modes, including IMF 1 to IMF 3, capture rapid fluctuations and noise, providing insights into short-term market volatility. The mid-frequency IMFs, namely IMF 4 to IMF 6, reveal medium-term cyclical patterns that may correspond to periodic market behaviors, while the lower-frequency IMFs, involving IMF 7 to IMF 9, reveal long-term trends and slow-moving cycles, which are critical for understanding the broader market trajectory. This decomposition allows us to isolate and analyze the various components of the BTC time series, enabling a more thorough grasp of its fundamental dynamics and improving the precision of forecasting models by focusing on specific IMFs relevant to the forecasting horizon.
3.4. Tools and Software
The code used in this study was developed using the Python 3.12 programming language within the Google Colab environment. We used several key libraries and tools to support the data analysis, model development, and visualization tasks. Python was the primary programming language used due to its versatility, simplicity, and extensive library ecosystem, which supports a wide variety of applications from basic data manipulation to complex simulations. For deep learning model development and validation, we employed the TensorFlow 2.2 framework (Google Inc., Mountain View, CA, USA), a powerful open-source library for numerical computation and machine learning, along with the Keras 3.4 open-source library (Google Inc., Mountain View, CA, USA), which provides a high-level interface for building and training neural networks, simplifying the creation of intricate models. We also utilized the PyEMD 1.0.0 library for empirical mode decomposition and the scikit-learn 1.5.1 Python library (sklearn) for machine learning tasks, namely model training and evaluation. For financial analysis and charting, TradingView was used to provide advanced charting tools and comprehensive market data, aiding in visualizing and interpreting financial trends efficiently. Google Colab served as the cloud-based environment for coding and running the machine learning experiments. It offered access to powerful computational resources, including NVIDIA GPUs and Google TPUs, which significantly enhanced the efficiency of our model training and evaluation processes. Additionally, Google Colab’s seamless integration with Google Drive facilitated the management of datasets and the storage of code and results. These tools and resources collectively enhanced the effectiveness of our work, enabling robust data analysis, accurate modeling, and clear representation of results.
6. Conclusions and Future Works
In this paper, we introduced a new hybrid model, named EMD-TI-LSTM, designed to advance financial forecasting by integrating empirical mode decomposition (EMD), technical indicators (TI), and long short-term memory (LSTM). The consistent performance across various metrics and datasets highlights the scalability and applicability of EMD-TI-LSTM in different financial forecasting scenarios, making it a versatile tool for both researchers and practitioners. Our results clearly demonstrate that EMD-TI-LSTM significantly outperformed traditional LSTM and other state-of-the-art methods in predicting financial asset prices. The model achieves consistently lower MAPE, RMSE, and MAE values, indicating a substantial enhancement in prediction accuracy. Mathematically, the EMD-TI-LSTM model improved accuracy over the conventional LSTM model by an average of 39.56%, 36.86%, and 39.90% across the BTC, BIST, NASDAQ, and GOLD datasets, as measured by the MAPE, RMSE, and MAE metrics, respectively. Notably, the model achieved a MAPE of 1.69 for the BTC dataset, reflecting a remarkable 42.91% improvement compared to the average MAPE of 2.96 from other state-of-the-art methods. Additionally, the innovative integration of EMD and TI not only simplifies complex market signals but also enriches the model with valuable market insights, leading to superior predictive performance. The mathematical methodology and findings of this study offer a reliable framework for future research.
Future work could extend the EMD-TI-LSTM method by exploring several promising avenues. First, a web or mobile application could be implemented to offer a user-friendly interface for the EMD-TI-LSTM model, facilitating performance analyses. Second, expanding the application of the presented model to a diverse range of financial assets would help evaluate its generalizability and reveal its strengths and limitations under various market conditions. Another promising area for research is applying the EMD-TI-LSTM model to real-world challenges across various fields; domain-specific studies can uncover the practical benefits of this method in different areas. To conclude, the EMD-TI-LSTM method signifies meaningful progress in AI-powered financial forecasting, offering a potent alternative to traditional approaches and facilitating further innovative strides in the sector.