Article

Optimizing Multivariate Time Series Forecasting with Data Augmentation

by Seyed Sina Aria, Seyed Hossein Iranmanesh and Hossein Hassani *

School of Industrial Engineering, College of Engineering, University of Tehran, Tehran 19395-4697, Iran

* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2024, 17(11), 485; https://doi.org/10.3390/jrfm17110485
Submission received: 9 September 2024 / Revised: 17 October 2024 / Accepted: 17 October 2024 / Published: 28 October 2024
(This article belongs to the Section Financial Markets)

Abstract
The convergence of data mining and deep learning has become an invaluable tool for gaining insights into evolving events and trends. However, a persistent challenge in utilizing these techniques for forecasting lies in the limited access to comprehensive, error-free data. This challenge is particularly pronounced in financial time series datasets, which are known for their volatility. To address this issue, a novel approach to data augmentation has been introduced, specifically tailored for financial time series forecasting. This approach leverages the power of Generative Adversarial Networks to generate synthetic data that replicate the distribution of authentic data. By integrating synthetic data with real data, the proposed approach significantly improves forecasting accuracy. Tests with real datasets have proven that this method offers a marked improvement over models that rely only on real data.

1. Introduction

Understanding and analyzing economic indicators is a fundamental aspect of a country’s macroeconomic policymaking and decision-making process. In time series analysis, one of the major challenges is the scarcity of data. For example, in the field of earned value management, accurately estimating project timelines and finances depends on recorded economic data. While this information is often available from official sources, a primary issue is the infrequent reporting—typically on a monthly or annual basis. As a result, there are situations where sufficient, high-quality data on the factors influencing time series trends may not be readily accessible. This scarcity of data can significantly affect the quality of predictive models. To address the data scarcity problem in data science, various prototyping approaches have been introduced. However, these methods often fall short of improving predictive models due to their assumption of linearity in changes. As a result, the field of time series analysis has shifted toward more complex methodologies. These advanced techniques allow for a more accurate analysis of non-linear patterns within time trends.
These indicators exhibit fluctuations or trends over time, making them a type of time series data. To analyze these time series, advanced computational and algorithmic tools have been introduced, enabling the application of time series prediction methods. The fundamental concept of prediction-based models involves utilizing past data to train a model that can accurately predict the time series data that need to be monitored. Following this, the forecasted results are compared against the original time series data (Lei et al. 2023). Time series data often exhibit oscillatory patterns, which can range from simple to complex. Historically, conventional methods like the Autoregressive Moving Average (ARMA) and the Autoregressive Integrated Moving Average (ARIMA) were utilized to identify these patterns (Chang et al. 2023; Nazareth and Reddy 2023). However, these methods struggled to identify intricate patterns due to their simplicity. In contrast, deep learning methods have gained popularity in time series applications, as well as in fields like machine vision, owing to their versatility and effectiveness (Liu and Lin 2021; Liu et al. 2017; Ma et al. 2021; Nguyen et al. 2021). Various types of deep learning models have been introduced in the literature. By leveraging new computational tools and implementing more complex algorithms, particularly using Long Short-Term Memory (LSTM) computing units (Nguyen et al. 2021; Somu et al. 2021; Wang et al. 2020b; Xayasouk et al. 2020), it is now possible to extract more intricate patterns from time series data, incorporating additional features and thereby enhancing the accuracy of predictions.
Traditional Recurrent Neural Networks (RNNs) encounter challenges in capturing long-term dependencies since they have a tendency to forget information from previous time steps. Addressing this issue, LSTM networks employ gates to regulate information flow within the network, enabling selective retention of pertinent data from prior time steps, a vital characteristic for effectively capturing long-term dependencies within financial data. Moreover, LSTM networks demonstrate proficiency in handling data noise and outliers, as demonstrated in recent research (Fang et al. 2023). Additionally, Bidirectional LSTM (Bi-LSTM), a variant of Recurrent Neural Networks (RNNs), exhibits the unique ability to process data in both forward and reverse directions simultaneously, setting it apart from traditional LSTM models that operate solely in the forward direction (Wang et al. 2023). Bi-LSTM networks, owing to their bidirectional time series analysis, acquire a more comprehensive comprehension of temporal data during training, yielding fewer prediction errors in comparison to other network architectures (Liu and Lin 2021; Ma et al. 2021).
Despite the growing demand for data-driven approaches in various applications, the limited availability and high cost of acquiring large datasets remain significant challenges. This is particularly true for time series data, which often exhibit complex patterns and require extensive training data. Economic characteristic analysis, due to its complexity stemming from several effective factors, can be hindered by the limited availability and low frequency of financial datasets, especially when applying deep learning models like Bi-LSTM for forecasting. To address these limitations, data augmentation has emerged as a promising solution, involving the generation of synthetic data resembling the original data. This approach expands the training dataset and enhances the generalizability of predictive models. However, traditional data augmentation techniques, such as adding noise or scaling the data, may introduce distortions and inconsistencies in the generated data. This is where Generative Adversarial Networks (GANs) come into play (Goodfellow et al. 2020). GANs, capable of generating random samples, can enhance the robustness and generalization of financial models. Consequently, researching the application of GANs in prediction tasks has become an increasingly popular topic (Xu et al. 2022). GANs are a type of deep learning model capable of generating new data that closely resemble the original data. In the context of financial data, which are often limited in volume and frequency, GANs have proven to be valuable. Studies also indicate that GANs can significantly enhance the performance of Bi-LSTM for forecasting financial data; for example, a 2018 study showed a 10% improvement in classification accuracy when using GAN-generated synthetic data compared to traditional augmentation methods (Frid-Adar et al. 2018; Brophy et al. 2021).
To achieve more accurate forecasting, this study proposes an approach consisting of the following steps:
  • Training GANs to generate synthetic time series data that closely mimic real-world datasets;
  • Utilizing a Bidirectional Wasserstein Generative Adversarial Network (Bi-WGAN), an advanced model that integrates the features of Bidirectional LSTM with Wasserstein GANs;
  • Leveraging the generated data to train deep learning models, including LSTM networks, for time series prediction;
  • Comparing the performance of the forecasting models trained on a combination of real and synthetic data.
The proposed approach overcomes some of the limitations of traditional time series forecasting methods by harnessing the strengths of both GANs and deep learning models. The innovations introduced in this research can be categorized into two main aspects:
  • Combining Bi-directional LSTM (Bi-LSTM) networks with the Wasserstein Generative Adversarial Network (WGAN) to enhance training and prevent mode collapse in real data distribution mapping;
  • Conducting a comparative study on the use of WLoss and Bi-LSTM functions separately and in combination, along with a comparison of prediction errors in the predictive model.
The paper’s structure is as follows: Section 2 provides a brief review of the literature, highlighting the effectiveness of deep neural networks, particularly RNNs such as LSTM and GANs, in time series prediction applications. Section 3 explains the hybrid deep learning model that combines WGANs and Bi-LSTM networks for forecasting time series data. In Section 4, the dataset is described and implementation details are defined. Section 5 presents the experimental results and discussions, followed by the discussion and conclusion in Section 6 and Section 7.

2. Literature Review

Time series forecasting plays a pivotal role in contemporary business planning, shaping both short-term and long-term organizational goals. The pursuit of accurate and reliable forecasts has become an ongoing effort for many enterprises, resulting in substantial cost savings and efficiencies (Bandara et al. 2020). Leveraging historical and current time series data, organizations make informed predictions about the future. These predictions span a wide spectrum of applications, ranging from weather forecasting to financial projections (Chandra 2015). For time series prediction, deep neural networks have proven effective across a wide range of applications, including energy (Somu et al. 2021; Wang et al. 2020a), the financial domain (Lu et al. 2021b; Moghar and Hamiche 2020; Yadav et al. 2020), gold volatility prediction (Vidal and Kristjanpoller 2020), cryptocurrency price prediction (Patel et al. 2020), weather forecast indicators (Xayasouk et al. 2020), travel time (Liu et al. 2017), etc.

2.1. Effectiveness of Deep Neural Networks in Time Series Prediction

Deep neural networks, particularly Recurrent Neural Networks (RNNs) such as the Long Short-Term Memory (LSTM) network, have been shown to be effective in time series prediction. These networks are able to capture the temporal dependencies in time series data, which is crucial for accurate forecasting. RNNs are specialized architectures designed to extract temporal information from raw sequential data; they use a vector of hidden states that acts as memory, preserved and updated over time. Although RNNs were developed over two decades ago, their popularity has surged across diverse application domains only recently. This surge can be attributed to advancements in optimization techniques, computational hardware, and the availability of large-scale datasets (Tran et al. 2018; Liu et al. 2020).
RNNs have been successfully applied to a variety of time series prediction tasks, including energy consumption, financial forecasting, air pollution prediction, travel time prediction, and COVID-19 prevalence rate prediction. For example, recurrent networks and LSTM, combined with temporal correlation modification, have been used to forecast photovoltaic (PV) energy output under climate fluctuations and changing solar conditions (Wang et al. 2020a). In a 2020 study that used a similar technique to identify periodicity in energy consumption trends, the LSTM error (RMSE) was 64.59% lower than that of the conventional linear ARMA method (Wang et al. 2020b). In another study, the Kernel Convolutional Neural Network (kCNN-LSTM) method was introduced to predict energy consumption from spatio-temporal properties (Somu et al. 2021). This method consists of a clustering algorithm and deep networks in two structures: Convolutional Neural Networks (CNNs) and LSTM. The CNN is a type of feedforward neural network specifically designed for feature extraction from data using convolutional structures. Unlike traditional feature extraction methods, CNNs eliminate the need for manual feature extraction. The architecture of CNNs draws inspiration from visual perception: each biological neuron corresponds to an artificial neuron, CNN kernels act as diverse receptors capable of responding to various features, and activation functions simulate the behavior whereby only neural electric signals surpassing a certain threshold are transmitted to the next neuron (Li et al. 2021). Recently, CNNs have also been explored as tools for feature extraction and classification in intrusion detection systems, thanks to their capability to handle complex data (Aldweesh et al. 2020).
The effectiveness of hybrid models was also explored by testing photovoltaic energy time data, revealing that reinforcing the LSTM network with a CNN network could reduce prediction errors over extended time periods (Agga et al. 2021).

2.2. Benefits of Bidirectional LSTM for Time Series Prediction

Applications in the financial field include the use of recurrent networks in stock market forecasting (Moghar and Hamiche 2020), the optimization of LSTM networks for Indian stock market forecasting (Yadav et al. 2020), and the combination of CNN, Bi-LSTM, and an Attention Mechanism (AM) for a similar purpose (Lu et al. 2021b). Xayasouk and colleagues predicted the air pollution index using datasets of feature records collected over three years with deep LSTM and deep autoencoder networks; comparing the Root Mean Square Error (RMSE) of the two networks, they found that LSTM performed better (Xayasouk et al. 2020).
In forecasting travel time, LSTM achieved a lower RMSE than classical methods such as linear regression, logistic regression, lasso regression, and ARIMA, another example of the superiority of this method over linear and less complex approaches. However, conventional and less complex methods can help to strengthen deep learning methods (Liu et al. 2017). To this end, researchers in 2021 combined SARIMA and LSTM to forecast daily tourist arrivals in Macau SAR; SARIMA, capable of identifying seasonal patterns, halved the RMSE compared to traditional LSTM (Wu et al. 2021). As another application, LSTM networks have been used to predict the prevalence rate of COVID-19 (Gautam 2022; Luo et al. 2021). Utilizing LSTM alongside the suggested Centralized Cluster Distribution (CCD) and Weighted Empirical Stretching (WES) loss function presents a novel method for enhancing Bitcoin price forecasting performance; this approach resulted in RMSE improvements of 11.5% and 22.5% across various label domains (Koo and Kim 2024).
Another type of LSTM, the bidirectional LSTM, has improved predictive performance in some applications. A Bi-LSTM network integrates two hidden layers that process data in opposite directions before merging their outputs into a single layer. This configuration allows the network to consider both past and future states, thereby enhancing its predictive accuracy by learning from two data directions. Essentially, a Bi-LSTM divides standard LSTM neurons into a positive state (forward time direction) and a reverse state (backward time direction). At each step, the output combines the forward and backward LSTM states, enabling the network to train on historical data from both directions and extract more valuable information (Graves and Schmidhuber 2005). In Figure 1, we illustrate the structure of a basic Bi-LSTM for clarity. For example, in a comparative study, the Bi-LSTM structure, owing to its two-way data traversal, better captured the sequential features in time series data, leading to a 37.78% improvement in model accuracy (Siami-Namini et al. 2019). In an applied example, combining the Empirical Mode Decomposition approach with a Bi-LSTM network structure, Beijing air quality index prediction was investigated over three time horizons of 3 h, 6 h, and 12 h, with the combined model reporting the lower RMSE in all three (Zhang et al. 2021). Additionally, in another applied research study, short-term traffic was predicted using Bi-LSTM, with an approximate 20% improvement in RMSE attributable to the bidirectional layers (Ma et al. 2021). Bi et al. (2024) proposed a hybrid prediction method called Variational Mode Decomposition, Bidirectional Input Attention, Encoder, Decoder (VBAED) to predict water-quality time series data. This method combines variational mode decomposition, bidirectional input attention, a bidirectional LSTM encoder, and a bidirectional temporal attention mechanism in a decoder (Bi et al. 2024). The method is superior to existing methods in capturing long-term dependence, noise filtering, and prediction accuracy.

2.3. Data Augmentation with Generative Adversarial Networks (GANs)

Deep neural networks often require large datasets for training; however, in many real-world applications, data availability is limited. To enrich the dataset, one can turn to a generative method proven in numerous studies: Generative Adversarial Networks (GANs). After the introduction of GAN architectures (Goodfellow et al. 2014), considerable research examined their application in machine vision using image datasets (Frid-Adar et al. 2018; Karras et al. 2017; Wang et al. 2018). However, these networks have also been successful in various time series applications. For example, an LSTM-based Variational Autoencoder Generative Adversarial Network (VAE-GAN) anomaly detection method was designed for equipment condition monitoring (Niu et al. 2020), and further research has used GANs for anomaly detection (Lee et al. 2021; Lu et al. 2021a). To employ GANs for analyzing sequential data, a method called Quant GAN was introduced in 2020. This technique enables the generation of high-quality financial time series data, including the S&P 500 (Standard & Poor's 500) index dataset spanning nine years (from May 2009 to December 2018) (Wiese et al. 2020). Furthermore, short-term load prediction has been investigated by combining GAN and LSTM structures to provide artificial data for better training of deep networks (Zhang et al. 2020). The Balanced Generative Adversarial Network (B-GAN) structure is an example of intrusion data generation that merges LSTM layers as a data balancing strategy; the generated samples helped in detecting intrusions in industrial control systems (Yuan et al. 2023). GANs have also been used in the time series field to help classify time series under imbalanced conditions (Deng et al. 2021). LSTM networks are widely acknowledged as highly effective sequential models for time series prediction. We assess their performance against GANs, renowned for their ability to maintain realism; GANs have demonstrated remarkable performance in generating photorealistic, high-quality facial images (Quilodrán-Casas et al. 2022). Another example of GAN application in time series is the prediction of the COVID-19 outbreak (Silva et al. 2021). Additionally, the mode collapse issue was addressed through the WGAN architecture in 2017, which minimizes a distance between distributions; this structure subsequently found application in the time series domain (Arjovsky et al. 2017; Pfenninger et al. 2021). One study combined bidirectional LSTM with GANs in the time series domain (Wiese et al. 2020), and another combined bidirectional LSTM with GANs to impute and predict time series (Gupta and Beheshti 2020). Through experimentation on three dataset series, the hybrid networks introduced in both applications demonstrated superior performance compared to other conventional methods. Another instance involved incorporating bidirectional LSTM layers into a GAN structure to analyze crude oil time series data for market risk prediction (Zou et al. 2023).

2.4. Contribution of the Present Research

Based on the research background outlined above, the present study uses deep learning methods to predict effective economic indicators in the Iranian market. Due to the lack of temporally aligned data, and to reduce the effect of data deficiency, Wasserstein GANs are used to strengthen the dataset. The new dataset, a combination of synthetic and real data, is then used for prediction with Bi-LSTM networks. The contribution of this research is twofold: investigating the effect of data enhancement with WGAN, and evaluating the performance of the WGAN–Bi-LSTM hybrid network in comparison with the methods used in recent research.
Given the diversity of methods used in time series forecasting, Table 1 presents a concise comparison of various approaches, including their acronyms, along with their respective advantages and disadvantages. This information is intended to enhance the reader’s understanding of each method’s applicability and limitations, supporting more informed decision-making in the field of time series forecasting.

3. Methodology

Given that the goal of solving a prediction problem is to simultaneously analyze multiple indicators, deep networks and LSTM layers are used as the prediction methods in this paper. Since these methods require large datasets, before training the prediction algorithm we first use a deep GAN to produce samples similar to real ones and thereby strengthen the dataset. A GAN comprises two networks: a generator, whose output is synthetic sequences, and a discriminator, which distinguishes real samples from synthetic ones by assigning a real/fake label (label 1 for a real sample, label 0 for a fake sample). However, due to the mode collapse and vanishing-gradient problems associated with conventional GANs, this paper utilizes the Wasserstein GAN with Gradient Penalty (WGAN-GP) method. In these networks, a critic network is employed instead of a discriminator. The role of the critic is akin to identifying fake samples, but its output is not constrained to the range 0 to 1, as a sigmoid is no longer used in its final layer (Arjovsky et al. 2017). We preferred the WGAN over traditional GANs because the Wasserstein distance used as its loss function mitigates gradient vanishing and training instability, improving convergence, particularly for high-dimensional data; these advantages make it well suited to time series, allowing for more reliable results, as demonstrated in our experiments. Equations (1) and (2) are based on the standard formulations of Generative Adversarial Networks (Goodfellow et al. 2014). The difference between GANs and the WGAN lies in their loss function. In conventional GANs, the Binary Cross Entropy (BCE) loss, common in binary classification problems, is used:

$\mathrm{Loss}_G = \frac{1}{m}\sum_{i=1}^{m}\log\left(1 - D\left(G(z_i)\right)\right)$, (1)

$\mathrm{Loss}_D = -\frac{1}{m}\sum_{i=1}^{m}\left[\log D(x_i) + \log\left(1 - D\left(G(z_i)\right)\right)\right]$, (2)
where D and G denote the discriminator and generator networks, respectively; z is the input noise vector fed to G; and x is a sample from the real dataset. G(z) is thus the output of the generator and is treated as a fake sample. In the $\mathrm{Loss}_D$ calculation, a batch containing both real and fake data is given to the discriminator to update its weights, while for $\mathrm{Loss}_G$, fake samples alone are used to compute the loss and update the generator's weights. Figure 2 shows the structure of a simple WGAN.
In WGANs, to address mode collapse and vanishing gradients, the cost function is defined so that the output distribution of the generator is close to the real data distribution, and the generator learns a mapping from the random noise space to the real data distribution space. Equation (3) represents the resulting cost function, the Wasserstein loss, as introduced by Arjovsky et al. (2017):
$W_{\mathrm{Loss}} = \min_{G} \max_{C} \; \mathbb{E}_{x \sim P_r}\left[C(x)\right] - \mathbb{E}_{\tilde{x} \sim P_g}\left[C(\tilde{x})\right]$, (3)
where $\tilde{x}$ is the output of the generator network (i.e., a fake sample), $P_g$ is the distribution modeled by the generator network, and $P_r$ is the actual data distribution. In WGAN training, the generator G seeks to minimize this value, while the critic C seeks to maximize it. The gradient of this function behaves better than that of the conventional loss, yielding a higher-quality optimization of the generator (Arjovsky et al. 2017).
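For illustration, the following is a minimal PyTorch sketch (not the authors' implementation) of how the critic and generator losses of Equation (3), together with the gradient penalty used in WGAN-GP, can be computed for one training step; the `generator` and `critic` modules, the tensor shapes, and the penalty weight are assumptions.

```python
# Hedged sketch of one WGAN-GP training step; `generator` and `critic`
# are assumed nn.Module instances, shapes are illustrative.
import torch

def critic_loss_wgan_gp(critic, generator, real, noise, gp_weight=10.0):
    fake = generator(noise).detach()                     # samples from P_g, no generator grads
    w_dist = critic(fake).mean() - critic(real).mean()   # minimize -(E[C(x)] - E[C(x~)])

    # Gradient penalty on random interpolates between real and fake samples.
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    penalty = ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return w_dist + gp_weight * penalty

def generator_loss_wgan(critic, generator, noise):
    return -critic(generator(noise)).mean()              # generator minimizes -E[C(G(z))]
```

In practice, the critic is updated several times per generator update, consistent with the critic learn frequency of 5 reported in Table 3.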
LSTM is an advanced variant of Recurrent Neural Networks (RNNs) that excels at capturing long-term dependencies in sequential data. Unlike conventional RNNs, LSTM tackles the vanishing gradient problem by employing specialized memory units. LSTM networks utilize two state vectors to effectively capture long-term dependencies in sequential data. The first state vector, $h_t$, stores the current state of the layer and its output at time t. The second state vector, $c_t$, incorporates more distant dependencies by continuously updating and modifying itself. These updates are regulated by four gates: input (i), output (o), forget (f), and cell candidate (g), which regulate the flow of information between the memory units.
  • Input gate: The input gate $i_t$ controls the flow of new information into the cell state. It determines which parts of the current input are relevant and should be incorporated into the state representation.
  • Forget gate: The forget gate $f_t$ controls the flow of old information out of the cell state. It decides which parts of the previous state should be retained or discarded.
  • Cell candidate gate: The cell candidate gate $g_t$ proposes the new information that should be added to the cell state. It generates a candidate state vector that potentially contains new information from the current input.
  • Output gate: The output gate $o_t$ controls the output of the LSTM cell. It determines which parts of the cell state are relevant and should be passed on as the next hidden state.
These gates work together to ensure that the LSTM cell can effectively capture long-term dependencies in the data by selectively storing and forgetting information over time. To enable the LSTM network to learn effectively, it is necessary to train the input weights, recurrent weights, and biases stored in W, R, and b, respectively. Input weights W combine information from different input nodes in the network. The recurrent weights in R control the proportion of information that moves from one time step to the next, and the vector b contains biases. Depending on whether they belong to the input, forget, output, or cell candidate gates, the subsets of weights and biases are assigned to a subvector with the subscript i, f, o, or g, respectively (Equations (4)–(12) are based on standard formulations in the literature regarding LSTM and Bi-LSTM networks (Graves and Schmidhuber 2005)).
$\text{weights and bias matrices} = \left\{W_k, R_k, b_k\right\}$, (4)

$k_t = \sigma_k\left(W_k x_t + R_k h_{t-1} + b_k\right), \quad k \in \{i, f, o, g\}$, (5)

$\sigma_{i,f,o}(x) = \text{sigmoid}(x) = \left(1 + e^{-x}\right)^{-1}$, (6)

$\sigma_c(x) = \tanh(x) = \dfrac{e^{2x} - 1}{e^{2x} + 1}$, (7)
The cell state $c_t$ and hidden state $h_t$ of the LSTM at time step t are then updated as follows:
$c_t = f_t \odot c_{t-1} + i_t \odot g_t$, (8)

$h_t = o_t \odot \tanh\left(c_t\right)$, (9)
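To make the gate equations concrete, the following NumPy sketch performs a single LSTM cell step according to Equations (5)–(9); the weight dictionaries W, R, and b are illustrative placeholders, not the paper's trained parameters.

```python
# Hedged sketch: one LSTM cell step following Equations (5)-(9).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, R, b):
    # W, R, b are dicts keyed by gate name 'i', 'f', 'o', 'g' (Eq. (4)).
    i_t = sigmoid(W['i'] @ x_t + R['i'] @ h_prev + b['i'])   # input gate
    f_t = sigmoid(W['f'] @ x_t + R['f'] @ h_prev + b['f'])   # forget gate
    o_t = sigmoid(W['o'] @ x_t + R['o'] @ h_prev + b['o'])   # output gate
    g_t = np.tanh(W['g'] @ x_t + R['g'] @ h_prev + b['g'])   # cell candidate
    c_t = f_t * c_prev + i_t * g_t                           # Eq. (8)
    h_t = o_t * np.tanh(c_t)                                 # Eq. (9)
    return h_t, c_t
```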
The traditional LSTM is limited in its ability to capture sequential information due to its unidirectional processing, which can lead to an incomplete analysis of time series data. To address this issue, the Bi-LSTM was developed with a bidirectional architecture that processes input signals in both forward and backward directions. The Bi-LSTM consists of two LSTM layers that operate in parallel and in opposite directions. The forward propagation layer, represented by $h_f(t)$, captures information from past sequence values, while the backward propagation layer, represented by $h_b(t)$, extracts information from future values. The t-th hidden states of the Bi-LSTM for the forward and backward directions are calculated using the following equations:
$h_f(t) = \sigma_c\left(W_{fh}\, x_t + R_{fh}\, h_f(t-1) + b_f\right)$, (10)

$h_b(t) = \sigma_c\left(W_{bh}\, x_t + R_{bh}\, h_b(t+1) + b_b\right)$, (11)
where $W_{fh}$ and $W_{bh}$ represent the forward and backward synaptic weights from the input to the internal units, $R_{fh}$ and $R_{bh}$ represent the forward and backward recurrent feedback weights, and $b_f$ and $b_b$ correspond to the bias terms in the two directions. The final output of the Bi-LSTM, $y_t$, is calculated using the following equation:

$y_t = \sigma\left(W_{fhy}\, h_f(t) + W_{bhy}\, h_b(t) + b_y\right)$, (12)

where $W_{fhy}$ and $W_{bhy}$ represent the forward and backward weights of the output layer, respectively, and $b_y$ denotes the output bias. The activation function of the output layer (σ) in a Bi-LSTM can be either a sigmoidal or a linear function, depending on the desired outcome.
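As a concrete illustration, a minimal PyTorch sketch of a Bi-LSTM forecaster of the kind described above is given below; the layer sizes and the linear output head are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a Bi-LSTM forecaster; sizes are illustrative.
import torch
import torch.nn as nn

class BiLSTMForecaster(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        # bidirectional=True runs the forward and backward layers of
        # Equations (10)-(11) in parallel and concatenates their outputs.
        self.bilstm = nn.LSTM(n_features, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_features)  # Eq. (12) with a linear sigma

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.bilstm(x)
        return self.head(out[:, -1, :])   # predict the next time step
```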
The metrics utilized for evaluating prediction accuracy are the Mean Square Error (MSE) and the Root Mean Square Error (RMSE), which gauge the alignment between predicted and actual closing prices. Here, $\hat{y}_i(t)$ denotes the predicted closing price at time t, $y_i(t)$ represents the actual closing price at time t, and n denotes the total number of data points. Lower values of MSE and RMSE indicate superior model performance.
$\mathrm{MSE} = \dfrac{1}{n}\sum_{i=1}^{n}\left(y_i(t) - \hat{y}_i(t)\right)^2$, (13)

$\mathrm{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(y_i(t) - \hat{y}_i(t)\right)^2}$, (14)
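For reference, these two metrics can be computed directly, as in the following NumPy sketch.

```python
# Hedged sketch: the MSE and RMSE of Equations (13) and (14) in NumPy.
import numpy as np

def mse(y_true, y_pred):
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def rmse(y_true, y_pred):
    return np.sqrt(mse(y_true, y_pred))
```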

4. Implementation

In this section, we outline the step-by-step process followed throughout the research. As depicted in Figure 3, the workflow consists of several key stages, from data collection to real-world testing. Below is a detailed explanation of each step to provide a clearer understanding of the methodology.
  • Data Collection (Step 1):
We collected financial data from various sources in Iran, resulting in ten distinct data categories.
  • Data Preprocessing (Step 2):
During preprocessing, we encountered challenges such as differing weekend closures—some datasets closed on Thursday and Friday, while others closed on Saturday and Sunday. We resolved these inconsistencies using a noise correction method. Additionally, we observed a strong correlation between gold and coin prices, so we removed one to avoid redundancy. We also normalized the data to ensure consistency for modeling.
  • Data Augmentation (Step 3):
LSTM models typically require large amounts of data, and GANs are highly effective for data generation. Therefore, we utilized a WGAN network, which is particularly well suited for time series data like ours. This technique enhanced the dataset, improving the training process.
  • Improving the LSTM Model (Step 4):
We enhanced the LSTM model by incorporating Bidirectional LSTM (Bi-LSTM), which allows the model to learn in both forward and backward directions. This approach led to improved learning outcomes and more accurate predictions.
  • Model Tuning (Step 5):
In time series analysis, model tuning is critical. We followed best practices and set the lambda parameter to 1/2. Detailed hyperparameters are available in Table 2 and Table 3.
  • Comparison (Step 6):
We compared the performance of models with and without GAN-based data augmentation to assess its impact on the final results.
  • Real-World Testing (Step 7):
Finally, we tested the model on real-world data to verify its performance in practical scenarios.

4.1. Data Collection

To explore the hybrid implementation method, we extracted and compiled a dataset containing economic parameters from Iran, sourced from the TGJU financial information platform. This dataset spans from 22 November 2019 to 21 December 2021 and includes eight key international and domestic economic indicators. These indicators consist of the global stock index (S&P), Bitcoin and Ethereum prices, aluminum prices, global gold prices, gold prices in IRR, coin prices, and the Iranian stock index. Figure 4 presents a visual representation of the price features within this dataset, and a summary of the dataset's characteristics is provided in Table 4.

4.2. Data Processing

Due to the significant differences in the value ranges of the various features input into the model, it is essential to normalize the data. This step is crucial to prevent the negative effects that differing numerical ranges can have on data augmentation and prediction. Normalization rescales each feature to a fixed range using min–max scaling:

$y = \dfrac{x - \mathrm{minValue}}{\mathrm{maxValue} - \mathrm{minValue}}$, (15)
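A minimal sketch of this min–max normalization, applied column-wise so that each feature is scaled independently, is given below (assuming the data are held in a NumPy array).

```python
# Hedged sketch: column-wise min-max normalization per the formula above.
import numpy as np

def minmax_normalize(data):
    min_v = data.min(axis=0)
    max_v = data.max(axis=0)
    return (data - min_v) / (max_v - min_v)
```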
Table 4 describes the dataset used in this study, which includes various financial series. These series consist of the global stock index (S&P), the prices of Bitcoin and Ethereum cryptocurrencies, the global price of aluminum, and both the worldwide and the Iranian prices of gold. Additionally, it includes the price of coins and the Iranian stock index. The dataset is structured with a sequence length of 20, meaning that each data point is based on a sequence of 20 consecutive daily records. In total, the dataset contains 740 records, divided into 592 for training and 148 for testing. This combination of diverse financial indicators allows for comprehensive analysis and accurate forecasting across various time series.
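As an illustration of how the sequence length of 20 translates into training samples, the following hedged sketch builds sliding windows of 20 consecutive records paired with next-step targets; the function name and array layout are assumptions.

```python
# Hedged sketch: building length-20 input windows from daily records,
# matching the sequence length reported in Table 4.
import numpy as np

def make_windows(series, seq_len=20):
    # series: (n_records, n_features); returns inputs X of shape
    # (n_records - seq_len, seq_len, n_features) and next-step targets y.
    X = np.stack([series[i:i + seq_len] for i in range(len(series) - seq_len)])
    y = series[seq_len:]
    return X, y
```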
The proposed Bi-WGAN algorithm is implemented following the process outlined in Figure 2. First, the dataset is normalized and cleaned. Next, the WGAN algorithm is trained using the parameters specified in Table 2 and Table 3. Once training is complete, the generator network produces synthetic samples to enhance the training process of the predictive network. In similar applications (Sundaram and Hulkund 2021), less than 50% of the data are artificial (typically between 20% and 40%); accordingly, the generator provides about 20% of the predictive algorithm's training data, as sketched below.
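A minimal sketch of this mixing step, under the assumption that the real and synthetic windows share the same shape and that enough synthetic samples are available, is given below; the 20% ratio follows the discussion above.

```python
# Hedged sketch: augmenting the training set so that roughly `frac`
# of it is generator output.
import numpy as np

def augment_with_synthetic(real_train, synthetic, frac=0.20):
    # Draw enough synthetic windows that they make up `frac` of the result.
    n_syn = int(len(real_train) * frac / (1 - frac))
    idx = np.random.choice(len(synthetic), size=n_syn, replace=False)
    mixed = np.concatenate([real_train, synthetic[idx]], axis=0)
    return mixed[np.random.permutation(len(mixed))]
```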
Based on the experimental results, LSTM works better than other prediction methods, and GAN–LSTM works best when data are scarce, because LSTM needs more data and the GAN helps to fill that gap. Also, Bi-GAN performs better than GAN, and among GANs, WGAN works best for time series data. Overall, prediction with the WGAN tool and augmented data resulted in a lower error. Experiments were carried out under two different conditions, for two different mixing ratios of synthetic data, in which the LSTM and Bi-LSTM methods were compared. Table 5 shows that the best case was the Bi-LSTM method with network training on 20% artificial data. The error reported in this table is the MSE (Lu et al. 2021b), or mean squared error (Zhang et al. 2020), with the help of which the predictive algorithm was used in the next step.

The percentages of 10% and 20% for generated data were chosen based on findings from previous studies that showed the benefits of augmenting real-world datasets with synthetic data. Specifically, it has been shown that augmenting with less than 50% synthetic data, usually between 20% and 40%, yields optimal results. These percentages were selected to balance enhancing the dataset against causing overfitting or introducing excessive noise. The values of 10% and 20% were used as sample points to observe how model performance changes as the proportion of synthetic data increases. While this study focused on 10% and 20% synthetic data, further exploration of other percentages could provide additional insights; due to resource and time constraints, only these two percentages were tested.

In this study, we performed extensive hyperparameter tuning to optimize the performance of our models. One of the key parameters we tuned was the learning rate, which we adjusted logarithmically through several stages. Initially, we set the learning rate to 0.1 and then progressively reduced it: first to 0.05, then to 0.01, followed by 0.005, 0.001, and, finally, 0.0005. After testing each value, we found that the best-performing learning rate was the one reported in the results section. This iterative process helped to ensure that the models were trained efficiently, avoiding both underfitting and overfitting.
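The learning-rate sweep described above can be expressed as a simple loop; `train_and_eval` is a hypothetical helper (not part of the paper's code) that trains the model at a given rate and returns the validation MSE.

```python
# Hedged sketch of the learning-rate sweep; train_and_eval is assumed.
candidate_lrs = [0.1, 0.05, 0.01, 0.005, 0.001, 0.0005]

best_lr, best_mse = None, float("inf")
for lr in candidate_lrs:
    val_mse = train_and_eval(learning_rate=lr)  # hypothetical helper
    if val_mse < best_mse:
        best_lr, best_mse = lr, val_mse
```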

5. Results and Discussion

The following section describes the performance of the proposed method and presents relevant analyses for the dataset of economic indicators. As mentioned earlier, the use of GANs comes with challenges, including the lack of a suitable method for quantitatively evaluating their results. Since this approach is an unsupervised learning method, there is no ground truth available, making it difficult to define a stopping condition for the learning process or to effectively evaluate the algorithm’s progress and output. For this purpose, the evaluation of the WGAN’s performance involves utilizing the visualization of the generated distribution, similar to prior research (Pfenninger et al. 2021).
Figure 5 illustrates the learning processes of both the generator and the critic networks. The higher number of epochs for the critic network is due to the repeated training cycles within each learning loop specific to this network. Additionally, batch size impacts the number of weight update loops in the network. With a batch size of 64 and 720 learning samples, approximately 10 weight updates occur per epoch. Considering 100 epochs, the total number of weight updates is around 1000.
In Figure 6, the diagram on the left illustrates the model’s improvement process over time as it learns, typically showing performance metrics such as error or accuracy during each training period. The diagram on the right demonstrates how closely the model’s predictions align with the actual data values. By examining this graph, one can assess the model’s prediction accuracy and observe the differences between predicted and actual values.
After ten epochs of training, samples of the signals generated by the network are shown in Figure 7. The range and manner of oscillation of these synthetic signals are similar to those of real signals, and they share characteristics such as non-stationarity. The range of the generated values likewise resembles that of the real signals.
The heatmap in Figure 8 visualizes the Mean Squared Error (MSE) for the predictions made by a Support Vector Regression (SVR) model. With dimensions of (1, 148), each column shows the MSE of one test sample. The depicted results represent the different stages of evaluating the SVR using MSE, covering input analysis, model training, MSE computation, and heatmap creation.
Nevertheless, to assess the network's effectiveness, the distribution of generated samples is examined. As depicted in Figure 9, the log return distributions of the real and generated data for oil, gold, aluminum, and Bitcoin are plotted together and show considerable similarity. This suggests that the WGAN has effectively learned to generate realistic synthetic samples.

6. Discussion

When selecting a model for forecasting, it is vital to balance the trade-offs between accuracy, complexity, and computational cost. Traditional models like ARMA and ARIMA may be preferable for simpler, linear time series with limited resources, while neural network models like LSTM and Bi-LSTM offer powerful tools for capturing long-term dependencies in complex data. GAN-based models, although computationally intensive, provide a unique advantage by generating synthetic data and are particularly valuable in cases where data are scarce or imbalanced.
Table 6 summarizes the advantages and limitations of various forecasting models commonly used in time series analysis and prediction tasks. The methods range from traditional statistical models like ARMA and ARIMA to more complex neural networks such as RNNs, LSTM, Bi-LSTM, GANs, and Bi-WGANs.
The ARMA and ARIMA models are favored for their simplicity and effectiveness with linear and stationary data. However, they struggle with non-linear patterns and require careful parameter tuning, especially ARIMA, which is also computationally intensive due to differencing steps needed for non-stationary data.
In contrast, neural networks like RNNs and LSTM are well suited to handle complex, sequential data. LSTM in particular excels in capturing long-term dependencies in time series data, although it comes with higher computational costs and demands more data for accurate predictions. Bi-LSTM further enhances this by capturing context from both past and future sequences, but at the cost of even greater computational requirements.
Generative Adversarial Networks (GANs) offer an advanced approach by generating synthetic data and capturing complex patterns, making them useful for tasks involving data augmentation. However, GANs are challenging to train and require substantial tuning to prevent issues like mode collapse. The Bi-WGAN variant improves on GANs by adding stability and is particularly useful for generating high-quality synthetic data for imbalanced datasets, albeit with a significant increase in computational demand.
The choice of model ultimately depends on the specific requirements of the forecasting task, including the nature of the data, the need for accuracy, and available computational resources. For applications that require high precision in capturing both short-term and long-term dependencies, LSTM and Bi-LSTM are strong contenders. In cases where synthetic data are needed to boost model performance, GANs or Bi-WGANs may be the optimal choice, despite their training complexity.

7. Conclusions

In this paper, our primary objective was to enhance multivariate time series prediction using data sourced from economic indicators affecting the Iranian market. To achieve this goal, we implemented data reinforcement through the Bi-WGAN generative algorithm. Our analysis of the results demonstrates a notable enhancement in predictive accuracy. Specifically, when we integrated 20% artificial data generated by the generator network into the dataset, we observed a remarkable 53% reduction in Mean Squared Error (MSE) compared to the baseline data state, and a substantial 34% improvement of the Bi-LSTM over the conventional LSTM model. It is worth noting that while the network's performance exhibits significant progress, there remains room for optimization, primarily attributed to the absence of a universally recognized and reliable measurement benchmark. Nonetheless, a qualitative evaluation of the generated data, along with an examination of the log return distribution within signals related to economic features, reveals that the data distribution demonstrates a satisfactory degree of alignment with actual data. Moreover, the oscillation patterns in the generated signals closely resemble those observed in real-world signals. Additionally, our RMSE error was 0.098. In conclusion, both qualitative and quantitative analyses, as indicated by the MSE error reduction, affirm the effectiveness of our proposed approach in strengthening the dataset for improved time series prediction.

Author Contributions

Conceptualization, S.S.A., S.H.I. and H.H.; writing—review and editing, S.S.A., S.H.I. and H.H.; supervision, S.H.I. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Agga, Ali, Ahmed Abbou, Moussa Labbadi, and Yassine El Houm. 2021. Short-term self consumption PV plant power production forecasts based on hybrid CNN-LSTM, ConvLSTM models. Renewable Energy 177: 101–12. [Google Scholar] [CrossRef]
  2. Aldweesh, Arwa, Abdelouahid Derhab, and Ahmed Z. Emam. 2020. Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues. Knowledge-Based Systems 189: 105124. [Google Scholar] [CrossRef]
  3. Arjovsky, Martín, Soumith Chintala, and Léon Bottou. 2017. Wasserstein GAN. arXiv arXiv:1701.07875. [Google Scholar]
  4. Bandara, Kasun, Christoph Bergmeir, and Hansika Hewamalage. 2020. LSTM-MSNet: Leveraging forecasts on sets of related time series with multiple seasonal patterns. IEEE Transactions on Neural Networks and Learning Systems 32: 1586–99. [Google Scholar] [CrossRef]
  5. Bi, Jing, Zexian Chen, Haitao Yuan, and Jia Zhang. 2024. Accurate water quality prediction with attention-based bidirectional LSTM and encoder–decoder. Expert Systems with Applications 238: 121807. [Google Scholar] [CrossRef]
  6. Brophy, Eoin, Zhengwei Wang, Qi She, and Tomas Ward. 2021. Generative adversarial networks in time series: A survey and taxonomy. arXiv arXiv:2107.11098. [Google Scholar]
  7. Chandra, Rohitash. 2015. Competition and collaboration in cooperative coevolution of Elman recurrent neural networks for time-series prediction. IEEE Transactions on Neural Networks and Learning Systems 26: 3123–36. [Google Scholar] [CrossRef]
  8. Chang, Ting-Jen, Tian-Shyug Lee, Chih-Te Yang, and Chi-Jie Lu. 2023. A ternary-frequency cryptocurrency price prediction scheme by ensemble of clustering and reconstructing intrinsic mode functions based on CEEMDAN. Expert Systems with Applications 233: 121008. [Google Scholar] [CrossRef]
  9. Deng, Grace, Cuize Han, Tommaso Dreossi, Clarence Lee, and David S. Matteson. 2021. IB-GAN: A Unified Approach for Multivariate Time Series Classification under Class Imbalance. arXiv arXiv:2110.07460. [Google Scholar]
  10. Fang, Zhen, Xu Ma, Huifeng Pan, Guangbing Yang, and Gonzalo R. Arce. 2023. Movement forecasting of financial time series based on adaptive LSTM-BN network. Expert Systems with Applications 213: 119207. [Google Scholar] [CrossRef]
  11. Frid-Adar, Maayan, Idit Diamant, Eyal Klang, Michal Amitai, Jacob Goldberger, and Hayit Greenspan. 2018. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 321: 321–31. [Google Scholar] [CrossRef]
  12. Gautam, Yogesh. 2022. Transfer Learning for COVID-19 cases and deaths forecast using LSTM network. ISA Transactions 124: 41–56. [Google Scholar] [CrossRef] [PubMed]
  13. Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in Neural Information Processing Systems 2: 2672–80. [Google Scholar]
  14. Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative adversarial networks. Communications of the ACM 63: 139–44. [Google Scholar] [CrossRef]
  15. Graves, Alex, and Jürgen Schmidhuber. 2005. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18: 602–10. [Google Scholar] [CrossRef]
  16. Gupta, Mehak, and Rahmatollah Beheshti. 2020. Time-series Imputation and Prediction with Bi-Directional Generative Adversarial Networks. arXiv arXiv:2009.08900. [Google Scholar]
  17. Karras, Tero, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2017. Progressive growing of gans for improved quality, stability, and variation. arXiv arXiv:1710.10196. [Google Scholar]
  18. Koo, Eunho, and Geonwoo Kim. 2024. Centralized decomposition approach in LSTM for Bitcoin price prediction. Expert Systems with Applications 237: 121401. [Google Scholar] [CrossRef]
  19. Lee, Chang-Ki, Yu-Jeong Cheon, and Wook-Yeon Hwang. 2021. Studies on the GAN-based anomaly detection methods for the time series data. IEEE Access 9: 73201–15. [Google Scholar] [CrossRef]
  20. Lei, Tianyang, Chang Gong, Gang Chen, Mengxin Ou, Kewei Yang, and Jichao Li. 2023. A novel unsupervised framework for time series data anomaly detection via spectrum decomposition. Knowledge-Based Systems 280: 111002. [Google Scholar] [CrossRef]
  21. Li, Zewen, Fan Liu, Wenjie Yang, Shouheng Peng, and Jun Zhou. 2021. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems 33: 6999–7019. [Google Scholar] [CrossRef] [PubMed]
  22. Liu, Penghui, Jing Liu, and Kai Wu. 2020. CNN-FCM: System modeling promotes stability of deep learning in time series prediction. Knowledge-Based Systems 203: 106081. [Google Scholar] [CrossRef]
  23. Liu, Xiaolei, and Zi Lin. 2021. Impact of Covid-19 pandemic on electricity demand in the UK based on multivariate time series forecasting with Bidirectional Long Short Term Memory. Energy 227: 120455. [Google Scholar] [CrossRef] [PubMed]
  24. Liu, Yangdong, Yizhe Wang, Xiaoguang Yang, and Linan Zhang. 2017. Short-term travel time prediction by deep learning: A comparison of different LSTM-DNN models. Paper presented at the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, October 16–19. [Google Scholar]
  25. Lu, Haodong, Miao Du, Kai Qian, Xiaoming He, and Kun Wang. 2021a. GAN-based data augmentation strategy for sensor anomaly detection in industrial robots. IEEE Sensors Journal 22: 17464–74. [Google Scholar] [CrossRef]
  26. Lu, Wenjie, Jiazheng Li, Jingyang Wang, and Lele Qin. 2021b. A CNN-BiLSTM-AM method for stock price prediction. Neural Computing and Applications 33: 4741–53. [Google Scholar] [CrossRef]
  27. Luo, Junling, Zhongliang Zhang, Yao Fu, and Feng Rao. 2021. Time series prediction of COVID-19 transmission in America using LSTM and XGBoost algorithms. Results in Physics 27: 104462. [Google Scholar] [CrossRef]
  28. Ma, Changxi, Guowen Dai, and Jibiao Zhou. 2021. Short-term traffic flow prediction for urban road sections based on time series analysis and LSTM_BILSTM method. IEEE Transactions on Intelligent Transportation Systems 23: 5615–24. [Google Scholar] [CrossRef]
  29. Moghar, Adil, and Mhamed Hamiche. 2020. Stock market prediction using LSTM recurrent neural network. Procedia Computer Science 170: 1168–73. [Google Scholar] [CrossRef]
  30. Nazareth, Noella, and Yeruva Venkata Ramana Reddy. 2023. Financial applications of machine learning: A literature review. Expert Systems with Applications 219: 119640. [Google Scholar] [CrossRef]
  31. Nguyen, H. Du, Kim Phuc Tran, Sébastien Thomassey, and Moez Hamad. 2021. Forecasting and Anomaly Detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management. International Journal of Information Management 57: 102282. [Google Scholar] [CrossRef]
  32. Niu, Zijian, Ke Yu, and Xiaofei Wu. 2020. LSTM-based VAE-GAN for time-series anomaly detection. Sensors 20: 3738. [Google Scholar] [CrossRef] [PubMed]
  33. Patel, Mohil Maheshkumar, Sudeep Tanwar, Rajesh Gupta, and Neeraj Kumar. 2020. A deep learning-based cryptocurrency price prediction scheme for financial institutions. Journal of Information Security and Applications 55: 102583. [Google Scholar] [CrossRef]
  34. Pfenninger, Moritz, Samuel Rikli, and Daniel Nico Bigler. 2021. Wasserstein GAN: Deep Generation Applied on Financial Time Series. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3885659 (accessed on 8 September 2024).
  35. Quilodrán-Casas, César, Vinicius L. S. Silva, Rossella Arcucci, Claire E. Heaney, YiKe Guo, and Christopher C. Pain. 2022. Digital twins based on bidirectional LSTM and GAN for modelling the COVID-19 pandemic. Neurocomputing 470: 11–28. [Google Scholar] [CrossRef] [PubMed]
  36. Siami-Namini, Sima, Neda Tavakoli, and Akbar Siami Namin. 2019. The performance of LSTM and BiLSTM in forecasting time series. Paper presented at the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, December 9–12. [Google Scholar]
  37. Silva, Vinicius L. S., Claire E. Heaney, Yaqi Li, and Christopher C. Pain. 2021. Data Assimilation Predictive GAN (DA-PredGAN) Applied to a Spatio-Temporal Compartmental Model in Epidemiology. Journal of Scientific Computing 94: 25. [Google Scholar] [CrossRef]
  38. Somu, Nivethitha, M. R. Gauthama Raman, and Krithi Ramamritham. 2021. A deep learning framework for building energy consumption forecast. Renewable and Sustainable Energy Reviews 137: 110591. [Google Scholar] [CrossRef]
  39. Sundaram, Shobhita, and Neha Hulkund. 2021. GAN-based Data Augmentation for Chest X-ray Classification. arXiv arXiv:2107.02970. [Google Scholar]
  40. Tran, Dat Thanh, Alexandros Iosifidis, Juho Kanniainen, and Moncef Gabbouj. 2018. Temporal attention-augmented bilinear network for financial time-series data analysis. IEEE Transactions on Neural Networks and Learning Systems 30: 1407–18. [Google Scholar] [CrossRef]
  41. Vidal, Andrés, and Werner Kristjanpoller. 2020. Gold volatility prediction using a CNN-LSTM approach. Expert Systems with Applications 157: 113481. [Google Scholar] [CrossRef]
  42. Wang, Fei, Zhiming Xuan, Zhao Zhen, Kangping Li, Tieqiang Wang, and Min Shi. 2020a. A day-ahead PV power forecasting method based on LSTM-RNN model and time correlation modification under partial daily pattern prediction framework. Energy Conversion and Management 212: 112766. [Google Scholar] [CrossRef]
  43. Wang, Heshan, Yiping Zhang, Jing Liang, and Lili Liu. 2023. DAFA-BiLSTM: Deep autoregression feature augmented bidirectional LSTM network for time series prediction. Neural Networks 157: 240–56. [Google Scholar] [CrossRef]
  44. Wang, Jian Qi, Yu Du, and Jing Wang. 2020b. LSTM based long-term energy consumption prediction with periodicity. Energy 197: 117197. [Google Scholar] [CrossRef]
  45. Wang, Ting-Chun, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. 2018. High-resolution image synthesis and semantic manipulation with conditional gans. Paper presented at the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18–23. [Google Scholar]
  46. Wiese, Magnus, Robert Knobloch, Ralf Korn, and Peter Kretschmer. 2020. Quant GANs: Deep generation of financial time series. Quantitative Finance 20: 1419–40. [Google Scholar] [CrossRef]
  47. Wu, Don Chi Wai, Lei Ji, Kaijian He, and Kwok Fai Geoffrey Tso. 2021. Forecasting tourist daily arrivals with a hybrid Sarima–Lstm approach. Journal of Hospitality & Tourism Research 45: 52–67. [Google Scholar]
  48. Xayasouk, Thanongsak, HwaMin Lee, and Giyeol Lee. 2020. Air pollution prediction using long short-term memory (LSTM) and deep autoencoder (DAE) models. Sustainability 12: 2570. [Google Scholar] [CrossRef]
  49. Xu, Hongfeng, Donglin Cao, and Shaozi Li. 2022. A self-regulated generative adversarial network for stock price movement prediction based on the historical price and tweets. Knowledge-Based Systems 247: 108712. [Google Scholar] [CrossRef]
  50. Yadav, Anita, C. K. Jha, and Aditi Sharan. 2020. Optimizing LSTM for time series prediction in Indian stock market. Procedia Computer Science 167: 2091–100. [Google Scholar] [CrossRef]
  51. Yuan, Lixiang, Siyang Yu, Zhibang Yang, Mingxing Duan, and Kenli Li. 2023. A data balancing approach based on generative adversarial network. Future Generation Computer Systems 141: 768–76. [Google Scholar] [CrossRef]
  52. Zhang, Jianguang, Xuyang Zhang, Jianfeng Yang, Zhaoxu Wang, Yufan Zhang, Qian Ai, Zhaoyu Li, Ziru Sun, and Shuangrui Yin. 2020. Deep lstm and gan based short-term load forecasting method at the zone level. Paper presented at the 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Fukuoka, Japan, February 19–21. [Google Scholar]
  53. Zhang, Luo, Peng Liu, Lei Zhao, Guizhou Wang, Wangfeng Zhang, and Jianbo Liu. 2021. Air quality predictions with a semi-supervised bidirectional LSTM neural network. Atmospheric Pollution Research 12: 328–39. [Google Scholar] [CrossRef]
  54. Zou, Yingchao, Lean Yu, and Kaijian He. 2023. Forecasting crude oil risk: A multiscale bidirectional generative adversarial network based approach. Expert Systems with Applications 212: 118743. [Google Scholar] [CrossRef]
Figure 1. The structure of a basic Bi-LSTM.
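For readers who wish to connect Figure 1 to an implementation, a minimal bidirectional LSTM forecaster can be sketched as follows. This is an illustrative PyTorch sketch; the layer width, single-layer depth, feature count, and class name are assumptions for exposition, not the exact architecture trained in this study.

```python
import torch
import torch.nn as nn

class BiLSTMForecaster(nn.Module):
    """Minimal Bi-LSTM for one-step-ahead forecasting (illustrative only)."""

    def __init__(self, n_features: int, hidden_size: int = 64):
        super().__init__()
        # bidirectional=True runs one LSTM forward and one backward in time,
        # which is the defining feature of the Bi-LSTM in Figure 1
        self.lstm = nn.LSTM(n_features, hidden_size,
                            batch_first=True, bidirectional=True)
        # the two directions are concatenated, hence 2 * hidden_size inputs
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, x):              # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)          # out: (batch, seq_len, 2 * hidden_size)
        return self.head(out[:, -1])   # forecast from the final time step

model = BiLSTMForecaster(n_features=8)        # illustrative feature count
y_hat = model(torch.randn(4, 20, 8))          # e.g., 20-day windows
```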
Figure 2. The structure of a basic WGAN.
Figure 3. Flow diagram of the research process.
Figure 4. A visual representation of the price features included in this dataset.
Figure 5. Loss function changes during WGAN training: (a) generator loss; (b) critic loss.
Figure 6. Training error and prediction results for the forecasting models: (a,c,e) the model’s error rate decreasing throughout the training process; (b,d,f) the actual data compared with the model’s predictions for both the training and testing datasets.
Figure 7. Examples of synthetically generated signals and comparisons with real signals: (a) real samples; (b) generated samples.
Figure 8. Heatmap of the MSE for the SVR model.
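Figure 8 does not specify which SVR hyperparameters the heatmap sweeps. A common construction, shown here as a hedged sketch with placeholder data, grid-searches the regularization constant C and the kernel coefficient gamma and plots the resulting test MSE.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# placeholder data standing in for flattened forecasting windows
rng = np.random.default_rng(0)
X_train, y_train = rng.random((572, 160)), rng.random(572)
X_test, y_test = rng.random((148, 160)), rng.random(148)

Cs, gammas = [0.1, 1, 10, 100], [0.001, 0.01, 0.1, 1]
grid = np.zeros((len(Cs), len(gammas)))
for i, C in enumerate(Cs):
    for j, g in enumerate(gammas):
        pred = SVR(C=C, gamma=g).fit(X_train, y_train).predict(X_test)
        grid[i, j] = mean_squared_error(y_test, pred)  # one cell per (C, gamma)

plt.imshow(grid, cmap="viridis")
plt.xticks(range(len(gammas)), gammas)
plt.yticks(range(len(Cs)), Cs)
plt.xlabel("gamma"); plt.ylabel("C")
plt.colorbar(label="MSE")
plt.show()
```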
Figure 9. Log return distribution: (a) log return distribution for oil; (b) log return distribution for int_gold; (c) log return distribution for alum; (d) log return distribution for btc.
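The log returns shown in Figure 9 follow the standard definition r_t = ln(P_t / P_{t−1}). The computation and histogram can be reproduced in a few lines; the simulated price path below is only a placeholder for any one of the oil, int_gold, alum, or btc series.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# placeholder daily prices: a positive geometric random walk of 740 records
prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_normal(740)))
log_returns = np.diff(np.log(prices))   # r_t = ln(P_t / P_{t-1})

plt.hist(log_returns, bins=50, density=True)
plt.xlabel("log return")
plt.ylabel("density")
plt.show()
```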
Table 1. Comparison of various methods in time series forecasting.
| Method | Acronym | Advantages | Disadvantages |
|---|---|---|---|
| Long Short-Term Memory | LSTM | Captures temporal dependencies | Requires large datasets |
| Bidirectional LSTM | Bi-LSTM | Considers both past and future data | More complex architecture |
| Kernel Convolutional Neural Network | kCNN-LSTM | Effective for feature extraction from spatio-temporal data | May require extensive computational resources |
| Variational Autoencoder GAN | VAE-GAN | Effective for anomaly detection | Sensitive to hyperparameters |
| Wasserstein GAN | WGAN | Addresses mode collapse issues | More complex to implement than standard GANs |
| Balanced GAN | B-GAN | Useful for generating balanced datasets | May not generalize well to unseen data |
| Generative Adversarial Network | GAN | Good at generating realistic data | Requires careful training and tuning |
Table 2. Specifications of the generator network structure.
Generator
| Hyperparameter | Value |
|---|---|
| Layer type | conv, FC |
| Layer num | 3 |
| Dropout | 0 |
| Epoch | 100 |
| Learning rate | 0.0005 |
| Optimizer | RMSProp |
Table 3. Specifications of the critic network structure.
Critic
| Hyperparameter | Value |
|---|---|
| Layer type | LSTM, FC |
| Layer num | 4 |
| Dropout | 0 |
| Epoch | 100 |
| Learning rate | 0.0005 |
| Optimizer | RMSProp |
| Critic learn frequency | 5 |
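Taken together, Tables 2 and 3 imply a standard WGAN training loop: both networks use RMSProp at a learning rate of 0.0005, and the critic is updated five times for every generator update. The sketch below illustrates such a loop; the `generator` and `critic` modules, the noise dimension, and the weight-clipping constant of 0.01 are assumptions made for exposition, not details confirmed by this study.

```python
import torch

def train_wgan(generator, critic, loader, noise_dim=32, epochs=100,
               lr=5e-4, n_critic=5, clip=0.01, device="cpu"):
    """Illustrative WGAN loop using the hyperparameters of Tables 2 and 3."""
    g_opt = torch.optim.RMSprop(generator.parameters(), lr=lr)
    c_opt = torch.optim.RMSprop(critic.parameters(), lr=lr)
    for _ in range(epochs):
        for step, real in enumerate(loader):
            real = real.to(device)
            # critic step: approximate the Wasserstein distance by
            # maximizing E[D(real)] - E[D(fake)]
            z = torch.randn(real.size(0), noise_dim, device=device)
            fake = generator(z).detach()
            c_loss = critic(fake).mean() - critic(real).mean()
            c_opt.zero_grad(); c_loss.backward(); c_opt.step()
            for p in critic.parameters():      # weight clipping keeps the
                p.data.clamp_(-clip, clip)     # critic roughly 1-Lipschitz
            if step % n_critic == 0:           # generator trains once per
                z = torch.randn(real.size(0), noise_dim, device=device)
                g_loss = -critic(generator(z)).mean()  # n_critic critic steps
                g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```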
Table 4. Description of the scraped dataset.
| Field | Value |
|---|---|
| Series | Global stock index (S&P); price of Bitcoin and Ethereum cryptocurrencies; price of aluminum; global price of gold; price of gold in IRR; price of coins; Iranian stock index |
| Sequence length | 20 |
| Period | One day |
| Number of records | 740 |
| Test records | 148 |
| Train records | 592 |
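Given the specification in Table 4, the forecasting inputs can be built by sliding a 20-step window over the 740 daily records and splitting chronologically into 592 training and 148 test records. The sketch below is illustrative; the target column and variable names are assumptions.

```python
import numpy as np

def make_windows(series: np.ndarray, seq_len: int = 20):
    """Slice a (T, n_features) array into overlapping input windows
    and one-step-ahead targets."""
    X = np.stack([series[t:t + seq_len] for t in range(len(series) - seq_len)])
    y = series[seq_len:, 0]                  # assume column 0 is the target
    return X, y

data = np.random.rand(740, 8)                # placeholder for the scraped data
train_raw = data[:592]                       # 592 training records (Table 4)
test_raw = data[592 - 20:]                   # keep 20 steps of context
X_train, y_train = make_windows(train_raw)
X_test, y_test = make_windows(test_raw)      # yields the 148 test targets
```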
Table 5. Comparative summary of the results. The MSE is reported with and without applying GAN in the prediction of time series.
| Model | Augmentation | MSE (10% Generated) | MSE (20% Generated) |
|---|---|---|---|
| LSTM | WGAN | 0.01912 | 0.0155 |
| LSTM | GAN | 0.01834 | 0.02049 |
| LSTM | no-GAN | 0.02211 | 0.02438 |
| Bi-LSTM | WGAN | 0.01374 | 0.01019 |
| Bi-LSTM | GAN | 0.01926 | 0.01007 |
| Bi-LSTM | no-GAN | 0.01801 | 0.01184 |
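A natural reading of the “10% Generated” and “20% Generated” columns is that synthetic windows amounting to 10% or 20% of the real training set are appended before fitting each forecaster, which is then scored with the MSE. The sketch below illustrates that mixing step under this assumption; the function names are illustrative.

```python
import numpy as np

def augment(X_real, y_real, X_synth, y_synth, ratio=0.1, seed=0):
    """Append GAN-generated windows equal to `ratio` of the real set size."""
    rng = np.random.default_rng(seed)
    n = int(ratio * len(X_real))
    idx = rng.choice(len(X_synth), size=n, replace=False)
    return (np.concatenate([X_real, X_synth[idx]]),
            np.concatenate([y_real, y_synth[idx]]))

def mse(y_true, y_pred):
    """Mean squared error, the metric reported in Table 5."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
```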
Table 6. Comparative summary of the forecasting methods.
| Method | Pros | Cons |
|---|---|---|
| ARMA | Simple to implement; effective for stationary time series; good for short-term forecasting | Limited to linear patterns; not suitable for non-stationary data; struggles with complex dependencies |
| ARIMA | Handles non-stationary data with differencing; well suited for data with seasonality or trends | Computationally expensive; requires manual parameter tuning; assumes linearity |
| RNN | Capable of handling sequential data; captures temporal dependencies | Prone to vanishing/exploding gradient issues; struggles with long-term dependencies |
| LSTM | Good at capturing long-term dependencies; reduces vanishing gradient problem; suitable for time series forecasting | High computational cost; longer training time; requires a large amount of data |
| Bi-LSTM | Better at capturing context from both past and future sequences; enhanced accuracy for sequential data | Increased computational complexity; higher resource demand; slower training times |
| GAN | Excellent at generating synthetic data; captures complex patterns in data | Difficult to train; risk of mode collapse; requires careful tuning of hyperparameters |
| Bi-WGAN | Improved stability over traditional GAN; can generate high-quality synthetic data; good for imbalanced datasets | High computational cost; complex architecture; requires a significant amount of data and tuning |