Article

Forecasting Orange Juice Futures: LSTM, ConvLSTM, and Traditional Models Across Trading Horizons

by
Apostolos Ampountolas
School of Hospitality Administration, Boston University, Boston, MA 02215, USA
J. Risk Financial Manag. 2024, 17(11), 475; https://doi.org/10.3390/jrfm17110475
Submission received: 15 September 2024 / Revised: 4 October 2024 / Accepted: 18 October 2024 / Published: 22 October 2024
(This article belongs to the Special Issue Machine Learning Applications in Finance, 2nd Edition)

Abstract

This study evaluated the forecasting accuracy of various models over 5-day and 10-day trading horizons to predict the prices of orange juice futures (OJ = F). The analysis included traditional models like Autoregressive Integrated Moving Average (ARIMA) and advanced neural network models such as Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), Backpropagation Neural Network (BPNN), Support Vector Regression (SVR), and Convolutional Long Short-Term Memory (ConvLSTM), incorporating factors like the Commodities Index and the S&P500 Index. We employed loss function metrics and various tests to assess model performance. The results indicated that for the 5-day horizon, the LSTM and ConvLSTM consistently outperformed the other models. LSTM achieved the lowest error rates and demonstrated superior capability in capturing temporal dependencies, especially in single-factor and S&P500 Index predictions. ConvLSTM also performed strongly, effectively modeling spatial and temporal data patterns. In the 10-day horizon, similar trends were observed. LSTM and ConvLSTM models had significantly lower errors and better alignment with actual values. The BPNN model performed well when all factors were included, and the SVR model maintained consistent accuracy, particularly for single-factor predictions. The Diebold–Mariano (DM) test indicated significant differences in forecasting accuracy, favoring advanced neural network models. In addition, incorporating multiple influencing factors further improved predictive performance, enhancing investment outcomes and reducing risk.

1. Introduction

The study of commodity market dynamics has been a cornerstone of financial research, offering valuable perspectives on price formation, risk management, and market efficiency. The global economy relies heavily on commodities, particularly oil and natural gas, along with energy, agriculture, minerals, and metals. Among these, orange juice futures (OJ = F) have been a particularly interesting area of study for several decades (Roll 1984) due to their distinctive market traits and the significant impact of both natural and economic factors. In addition, evaluating orange juice prices is crucial because several interrelated factors significantly impact the global market and consumer behavior (Wang and Wei 2021; Zhang et al. 2018).
Is it possible to reliably predict commodity prices? This question has been the subject of ongoing discussion in the financial and economic literature. For example, the recent surge in orange juice prices can be attributed to extreme weather events and persistent diseases affecting major orange-producing regions, e.g., hurricanes and pest infestations (Durbin and Pollastri 2024). Unlike most other widely produced commodities, orange juice production is heavily dependent on weather. Nevertheless, commodity prices are generally considered more unpredictable than stock prices or exchange rates, posing challenges for accurate forecasting. Factors such as the interaction of demand and supply, economic expansion, market expectations, government regulations, and unexpected events such as spillover effects, pandemics, war, and global debt crises all affect commodity futures prices (Zhao et al. 2016). These complex factors are the primary drivers of the significant fluctuations in commodity spot prices. As a result, predicting commodity price trends, in this case orange juice futures prices, is expected not only to help mitigate volatility and reduce risk in commodity markets but also to support governmental entities in making sound, long-term economic choices. The motivation for this research is therefore tied to the unique characteristics of the orange juice futures market, which is heavily influenced by factors that create challenges for traders and investors. The study highlights the need for robust forecasting models to improve investment outcomes and reduce market risks, especially given the significant volatility of orange juice futures prices compared to commodities like gold or oil.
Given the latest technological advancements, various methods are being used to forecast prices in the financial industry (Ampountolas 2023; Gupta and Nigam 2020). While traditional econometric techniques such as the vector autoregressive (VAR) model struggle to capture the non-linear behavior of commodity prices because of their strong linearity assumptions (Sun et al. 2022; Wang and Fang 2022), machine learning models have gained significant attention for their ability to exploit volatility characteristics, non-linear information, and historical data effectively (Ampountolas 2024; Butler et al. 2021; Zhao et al. 2017), as have model combinations (Barrow and Crone 2016). Accordingly, many authors in the financial and economics literature, beginning with early work such as Kroner et al. (1995), have employed machine learning techniques, including Support Vector Regression (SVR), Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNNs), Multi-Layer Perceptrons (MLPs), convolutional neural networks (CNNs), gated recurrent units (GRUs), and backpropagation neural networks (BPNNs), among other models, to validate the impact of various factors on predicting commodity futures prices.
Limited literature examines price forecasts of orange juice futures (OJ = F) as an independent asset. Most research focuses on commodities like gold or oil, and many papers examine multiple commodities jointly. Motivated by this gap and the enormous price growth over the last two years, we examine various forecasting models (ARIMA, LSTM, RNN, BPNN, SVR, NAR, and ConvLSTM) to predict orange juice futures prices in the commodity futures market. We also include other factors, namely the S&P500 futures (ES = F) and the S&P500 Index, and employ two forecasting horizons: 5 trading days and 10 trading days. This study thus aims to contribute to current research by analyzing and predicting the price trends of orange juice futures in the selected commodity markets. Additionally, we present a comparative analysis of the forecasting models based on loss functions and performance metrics. Predicting orange juice futures prices is essential because this market is highly volatile and affected by a range of unpredictable factors with significant economic impacts. Accurate predictions can help stakeholders mitigate volatility, manage risk, and make more informed decisions in the commodity market. Moreover, given the increasing price volatility in recent years, enhanced forecasting methods can improve profitability for stakeholders involved in futures trading.
Our results revealed that for both the 5-day and 10-day horizons, advanced neural network models, particularly LSTM and ConvLSTM, consistently outperformed the other forecasting models. These models achieved the lowest error rates and demonstrated superior capability in capturing temporal dependencies, with ConvLSTM also effectively modeling spatial and temporal data patterns. The directional accuracy and Diebold and Mariano (1995) test supported the findings. In the 10-day horizon, the LSTM and ConvLSTM models again showed significantly lower errors and better alignment with actual values than ARIMA, which had the highest error rates. The BPNN model performed well when all factors were included, and the SVR model maintained consistent accuracy, especially for single-factor predictions. The DM test indicated significant differences in forecasting accuracy, favoring advanced neural network models.
Section 2 briefly overviews the current literature, and Section 3 discusses the relevant forecasting models and performance assessment metrics and details the data. Section 4 then presents an analysis of the empirical study’s findings. Section 5 summarizes the study’s conclusions and outlines potential directions for future research.

2. Literature Review

In one of the earliest studies, Roll (1984) confirmed that the weather condition variable impacts the market for frozen concentrated juice. Orange juice prices are impacted by high volatility as a result of concerns about extreme weather events that could affect production. However, he demonstrated that weather accounts for only a small portion of the fluctuations seen in futures prices. In another work, Kroner et al. (1995) utilized time-series approaches to generate long-term predictions of commodity price volatility by integrating investors’ anticipated volatility. The authors assessed various forecasts of commodity price volatility, categorizing them into three groups: (1) forecasts based solely on expectations derived from options prices, (2) forecasts relying exclusively on time-series modeling, and (3) forecasts that combine market expectations and time-series techniques. They concluded that the forecasts proposed in category (3) outperformed the other two categories. Brooks et al. (2013) analyzed whether there is consistency in the evidence supporting two theories on commodity future pricing over time. The authors explored if the ability of commodity futures to predict prices is related to their seasonal fluctuations, and they also examined if there are changes in the pricing relationships at different times. They found more compelling evidence of seasonal patterns in the basis, which aligns with the storage theory. The findings reveal that structural changes mainly involve adjustments in the starting points rather than the trends, indicating that the predictive power of the basis remains consistent across various economic conditions. The study by Black et al. (2014) investigates how stock and commodity prices are related and whether this connection can be utilized to predict stock returns. 
Since both prices are associated with anticipated future economic performance, they are expected to have a lasting relationship, while shifts in sentiment toward commodity investments may impact how the response to imbalances occurs. The findings indicated that there is a long-term relationship between stock and commodity prices, and further tests identify disruptions in the predictive regression. The paper by Atsalakis et al. (2016) introduces an innovative method for predicting the price direction of 25 commodities on the global market using a neuro-fuzzy controller. The prediction system utilizes two adaptive neural fuzzy inference systems (ANFISs) to create an inverse controller for each commodity. The findings demonstrate a 68.33% hit rate with a significant improvement in return on equity compared to the buy-and-hold strategy.
In addition to traditional econometric approaches, various machine learning techniques are used to uncover the inherent complexity of commodity prices. The most common machine learning methods include neural networks (NNs) and support vector machines (SVMs), favored for their ability to model intricate characteristics like nonlinearity and volatility. Hybrid models have also demonstrated superior forecasting accuracy compared to standalone machine learning models. Drachal and Pawłowski (2021) provide a brief overview of how genetic algorithms (GAs) are used to predict commodity prices. The authors concentrated on hybrid methods (i.e., combining genetic algorithms with other approaches) used in situations such as determining whether a complete forecasting technique can be split into two or more distinct parts, with one part based on a GA and the others on different methods. Another study by Jiang et al. (2022) utilized various machine learning techniques to confirm the influence of investor sentiment on estimating the price of crude oil futures. The authors included several forecasting models, such as the MLP, LSTM, SVR, RNN, and GRU models. The results indicated that the Long Short-Term Memory model yielded the best results when combined with the composite sentiment index, attributed to a reduced rate of accuracy errors and improved directional accuracy when forecasting next-day-ahead prices. In a similar study, Guo et al. (2023) utilized machine learning to analyze historical data, volatility, and non-linear characteristics. They assessed the predictive capabilities of neural network models such as the GRU, MLP, LSTM, RNN, CNN, SVR, and BPNN models on crude oil futures. The set of assessment tests illustrated that the GRU model surpassed other models in terms of accuracy and performance when forecasting crude oil futures prices.
Moreover, the incorporation of relevant factors resulted in enhanced forecast accuracy for the proposed models. A recent study by Zheng et al. (2024) reported the effectiveness of hybrid models in enhancing the accuracy of crude oil price forecasts when compared to single models. Their research introduces an innovative interval-based approach. Initially, they apply variational mode decomposition (VMD) to split the original training series into low- and high-frequency components. The low-frequency component is considered an inseparable random set. It is forecasted using a newly developed autoregressive conditional interval (ACI) model, while the high-frequency component is predicted using interval Long Short-Term Memory (iLSTM) networks. The final interval-valued prediction is obtained by combining the forecasts of both components. Additionally, the study designs and implements a daily trading strategy based on interval-valued data.
Ren et al. (2024) introduced an innovative imaging technique to predict the daily price data of crude oil futures. Utilizing convolutional neural networks (CNNs), they achieved higher accuracy in predicting future price trends than other standard forecasting methods. The findings indicate that images can capture more nonlinear information, which is advantageous for energy price prediction, particularly during significant fluctuations in crude oil prices. In a different study, Ampountolas (2024) studied GARCH models and the Support Vector Regression (SVR) model to understand better how volatility changes in commodity returns, like gold and cocoa, as well as the financial market index S&P500. The evaluation showed that Support Vector Regression (SVR) performs better than traditional GARCH models for short-term forecasting, suggesting it could be a valuable alternative for predicting financial market trends. These results highlight the importance of choosing the right modeling techniques for specific types of assets and forecasting time frames.
In conclusion, an extensive body of literature discusses predicting volatility in commodity futures markets, mainly for energy, crude oil, or metals. Throughout the years, forecasting techniques have progressed from traditional econometric approaches to innovative machine learning methods. Consequently, the accuracy of forecast models is gradually increasing, and at the same time, it has been demonstrated that the variables influencing the prediction of commodity futures prices are varied.

3. Data and Methodology

3.1. Data

This study’s data set contained daily historical time series data for three financial assets: orange juice futures (OJ = F), S&P500 futures (ES = F), and the S&P500 Index (GSPC). The dependent variable for this study is the price of orange juice futures. We aim to accurately forecast the price of orange juice futures in the USA market. The dataset covers the period from July 2022 to June 2024 and has 504 observations. The data were obtained from Yahoo Finance. In addition, we utilized the S&P500 futures index and the stock market, i.e., the S&P500 Index, as impact factors in the orange juice futures price estimation model.
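Such a dataset could be assembled along the following lines. This is a hedged sketch: the `yf.download` call that would fetch the actual Yahoo Finance series is shown commented out, and synthetic random-walk prices stand in for the real data so the snippet runs offline; only the tickers, date range, and observation count are taken from the text.

```python
import pandas as pd
import numpy as np

# In practice the three series could be pulled from Yahoo Finance, e.g.:
#   import yfinance as yf
#   prices = yf.download(["OJ=F", "ES=F", "^GSPC"],
#                        start="2022-07-01", end="2024-06-30")["Close"]
# Synthetic random walks stand in here so the sketch runs offline.
rng = np.random.default_rng(0)
idx = pd.bdate_range("2022-07-01", periods=504)  # 504 daily observations
prices = pd.DataFrame({
    "OJ=F": 175 + rng.normal(0, 5, 504).cumsum(),
    "ES=F": 3900 + rng.normal(0, 30, 504).cumsum(),
    "^GSPC": 3850 + rng.normal(0, 30, 504).cumsum(),
}, index=idx)

# Target: orange juice futures; impact factors: S&P500 futures and index.
y = prices["OJ=F"]
X = prices[["ES=F", "^GSPC"]]
print(len(y), X.shape)
```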

3.2. Descriptive Statistics

3.2.1. Dataset Trend

Figure 1 illustrates a noticeable upward trend in the price of orange juice futures (OJ = F) over the two years, characterized by substantial volatility and periodic corrections. Similarly, the trend index confirms the bullish trajectory, with the futures price increasing by 145.17% from the start of the dataset. In this context, the analysis highlights the potential for significant returns while emphasizing the volatility of commodity futures markets. Such insights are vital for investors and market analysts to make informed trading and investment decisions.

3.2.2. Summary Statistics

Moreover, Table 1 reports a comprehensive overview of the study’s summary statistics of the three financial assets: OJ = F, ES = F, and GSPC. OJ = F has a standard deviation of 82.74, which, relative to its mean, suggests significant variability in prices. This is also supported by the OJ = F price range, which shows a broad price range that aligns with its high standard deviation. In contrast, ES = F and GSPC have higher absolute standard deviations of 483.97 and 478.46, respectively, but these are small relative to their higher mean values and thus have relatively low volatility compared to OJ = F. Finally, OJ = F has a kurtosis of −1.1256, suggesting less frequent extreme deviations from the mean. ES = F and GSPC have kurtosis values of −0.7725 and −0.7436, respectively, indicating a similar but slightly less pronounced platykurtic distribution. At the same time, OJ = F shows a slight positive skewness of 0.1262, suggesting a marginally longer right tail. The financial indices, ES = F and GSPC, have higher positive skewness values of 0.5726 and 0.5850, respectively, indicating a more noticeable asymmetry with a longer right tail.
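The statistics in Table 1 can be recomputed with pandas. The sketch below uses a synthetic stand-in series (the actual OJ = F data are not embedded here), so the printed numbers will not match the table; only the computation is illustrated. Note that pandas reports excess (Fisher) kurtosis, consistent with the negative, platykurtic values quoted above.

```python
import pandas as pd
import numpy as np

# Synthetic stand-in for the OJ=F daily price series.
rng = np.random.default_rng(1)
oj = pd.Series(175 + rng.normal(0, 5, 504).cumsum(), name="OJ=F")

stats = {
    "mean": oj.mean(),
    "std": oj.std(),            # large relative to the mean => high variability
    "skew": oj.skew(),          # > 0 indicates a longer right tail
    "kurtosis": oj.kurtosis(),  # excess kurtosis; < 0 is platykurtic
}
print(stats)
```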

3.3. Forecasting Models

3.3.1. Autoregressive Integrated Moving Average (ARIMA)

The Autoregressive Integrated Moving Average (ARIMA) model is a prominent statistical forecasting technique within the ARMA linear model class. According to Hyndman and Athanasopoulos (2018), the development of exponential smoothing models hinges on identifying trends and seasonality in the data. In contrast, ARIMA models are adept at handling stationary, non-stationary, and seasonal processes of order $(p, d, q)$. The general form of the ARIMA model (here shown with seasonal terms of period 4) is represented as
$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B)(1 - B^4)\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^4)\, \varepsilon_t$
In this equation, $y_t$ denotes the observed value at time $t$, and $\varepsilon_t$ represents the error term, assumed to be white noise with a Gaussian distribution, having a mean of zero and a constant variance $\sigma^2$. The ARIMA model is denoted by ARIMA$(p, d, q)$, where selecting the appropriate order $(p, d, q)$ is a critical aspect of the ARIMA modeling procedure.
ARIMA models can be applied to both seasonal and non-seasonal data. Seasonal ARIMA requires a more intricate specification of the model components. Prior to estimating the time series models, it is essential to perform the augmented Dickey–Fuller (ADF) test (Dickey and Fuller 1979) to determine the stationarity of the dataset. If the series is found to be non-stationary, data transformation is necessary. The ADF test is defined as follows:
$\Delta x_t = \alpha_0 + b_0 x_{t-1} + \sum_{i=1}^{k} c_i \Delta x_{t-i} + w_t$
Here, $\Delta$ denotes the difference operator; $\alpha_0$, $b_0$, and the $c_i$ are coefficients to be estimated; $x$ is the variable under examination; and $w_t$ is the white noise error term. The null hypothesis ($b_0 = 0$) indicates that the series is non-stationary, while the alternative hypothesis ($b_0 < 0$) suggests that the series is stationary.

3.3.2. Recurrent Neural Network (RNN)

The RNN is structured with input, hidden, and output layers, allowing it to handle and retain new data simultaneously, thus enabling information transfer to subsequent periods (Henrique et al. 2018). Due to its feedback mechanism, the RNN incorporates historical data in its predictions. However, it struggles with retaining long-term data and may suffer from gradient explosion issues (Jiang et al. 2022). The RNN calculations are as follows:
$h_t = f_h(u_t x_t + W_{t-1} h_{t-1})$
$y_{t+T} = f_y(v_t h_t + b_y)$
where $h_t$ represents the hidden layer vector, $x_t$ is the input layer vector, $y_{t+T}$ is the output, $u_t$ is the input-to-hidden weight at time $t$, $v_t$ is the hidden-to-output weight at time $t$, and $W_{t-1}$ is the weight connecting the hidden state at time $t-1$ to the hidden state at time $t$.
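The recurrence above can be written directly in numpy. The layer sizes, the tanh activation, and the random weights below are illustrative assumptions, not the paper's trained configuration.

```python
import numpy as np

# Minimal RNN forward pass: h_t = f_h(u x_t + W h_{t-1}), y = f_y(v h_t + b).
rng = np.random.default_rng(3)
n_in, n_hid = 1, 8
u = rng.normal(0, 0.1, (n_hid, n_in))   # input-to-hidden weights
W = rng.normal(0, 0.1, (n_hid, n_hid))  # hidden-to-hidden weights
v = rng.normal(0, 0.1, (1, n_hid))      # hidden-to-output weights
b_y = np.zeros(1)

def rnn_forecast(x_seq):
    h = np.zeros(n_hid)
    for x_t in x_seq:                    # carry the hidden state through time
        h = np.tanh(u @ np.atleast_1d(x_t) + W @ h)
    return v @ h + b_y                   # one-step-ahead output

y_hat = rnn_forecast([0.1, 0.2, 0.15, 0.3])
print(y_hat.shape)
```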

3.3.3. Long Short-Term Memory (LSTM)

LSTM is an advanced version of RNN featuring forget, input, and output gates. It leverages RNN’s strengths while mitigating its weaknesses, making it suitable for time series prediction. Based on Jiang et al. (2022), the transfer process is detailed as follows:
$F_t = \rho(W_{fx} x_t + W_{fh} h_{t-1} + b_f)$
$I_t = \rho(W_{ix} x_t + W_{ih} h_{t-1} + b_i)$
$O_t = \rho(W_{ox} x_t + W_{oh} h_{t-1} + b_o)$
$C_t = F_t \odot C_{t-1} + I_t \odot \tanh(W_{cx} x_t + W_{ch} h_{t-1} + b_c)$
$h_t = O_t \odot \tanh(C_t)$
where $F_t$ is the forget gate at time $t$, the $W$ are weight matrices, $x_t$ is the input vector at time $t$, $b$ is a bias parameter, $h_t$ is the hidden state vector at time $t$, $I_t$ is the input gate at time $t$, $O_t$ is the output gate at time $t$, $\rho$ and $\tanh$ are the activation functions, $\odot$ denotes the element-wise (Hadamard) product, and $C_t$ is the cell state.
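A single LSTM step implementing these gate equations can be sketched in numpy. The weight shapes, the sigmoid/tanh choices, and the stacking of the four gate blocks into one matrix are standard conventions, assumed here for compactness.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    # Wx: (4*n_hid, n_in), Wh: (4*n_hid, n_hid); rows stack F, I, O, candidate.
    z = Wx @ x_t + Wh @ h_prev + b
    n = h_prev.size
    f = sigmoid(z[:n])            # forget gate F_t
    i = sigmoid(z[n:2 * n])       # input gate I_t
    o = sigmoid(z[2 * n:3 * n])   # output gate O_t
    g = np.tanh(z[3 * n:])        # candidate values
    c = f * c_prev + i * g        # cell state C_t
    h = o * np.tanh(c)            # hidden state h_t
    return h, c

rng = np.random.default_rng(4)
n_in, n_hid = 1, 8
Wx = rng.normal(0, 0.1, (4 * n_hid, n_in))
Wh = rng.normal(0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in [0.1, 0.2, 0.15]:        # run three time steps
    h, c = lstm_step(np.array([x]), h, c, Wx, Wh, b)
print(h.shape, c.shape)
```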

3.3.4. Convolutional Long Short-Term Memory (ConvLSTM)

The Convolutional Long Short-Term Memory (ConvLSTM) model represents an advanced neural network architecture specifically designed to handle spatiotemporal data by integrating convolutional operations within the LSTM framework. The traditional fully connected LSTM (FC-LSTM) is powerful for sequence modeling but lacks the capability to effectively capture spatial correlations, as it uses fully connected layers that disregard spatial information. ConvLSTM addresses this limitation by incorporating convolutional structures in both the input-to-state and state-to-state transitions, allowing it to capture local spatial dependencies better.
The fundamental equations governing ConvLSTM are as follows:
$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i)$
$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f)$
$C_t = f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$
$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o)$
$H_t = o_t \circ \tanh(C_t)$
Here, $i_t$, $f_t$, and $o_t$ represent the input, forget, and output gates, respectively. The symbols $X_t$ and $H_t$ denote the input and hidden state at time $t$, while $C_t$ is the cell state. The convolution operator is denoted by $*$, and the Hadamard product by $\circ$.
The ConvLSTM model thus maintains the advantages of traditional LSTM in handling long-term dependencies while enhancing its ability to process data with spatial structures. This makes ConvLSTM particularly suitable for applications like precipitation nowcasting, where capturing both spatial and temporal patterns is crucial. By stacking multiple ConvLSTM layers and forming an encoding–forecasting structure, the model achieves robust performance in predicting future states from historical data, significantly outperforming traditional FC-LSTM models in spatiotemporal sequence forecasting tasks.
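To make the convolution (*) versus Hadamard (∘) distinction concrete, a single ConvLSTM input-gate computation can be sketched with numpy and scipy. The 8×8 grid, the 3×3 kernels, and the random peephole weights are illustrative assumptions only.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
H, W = 8, 8                        # spatial grid (illustrative)
X_t = rng.normal(size=(H, W))      # input frame at time t
H_prev = np.zeros((H, W))          # previous hidden state
C_prev = np.zeros((H, W))          # previous cell state
W_xi = rng.normal(0, 0.1, (3, 3))  # input-to-state convolution kernel
W_hi = rng.normal(0, 0.1, (3, 3))  # state-to-state convolution kernel
W_ci = rng.normal(0, 0.1, (H, W))  # peephole weights (Hadamard, not conv)
b_i = 0.0

# i_t = sigma(W_xi * X_t + W_hi * H_{t-1} + W_ci o C_{t-1} + b_i)
i_t = sigmoid(convolve2d(X_t, W_xi, mode="same")
              + convolve2d(H_prev, W_hi, mode="same")
              + W_ci * C_prev + b_i)
print(i_t.shape)
```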

3.3.5. Backpropagation Neural Network (BPNN)

The Backpropagation Neural Network (BPNN) is among the most popular and extensively used models in artificial neural networks, renowned for its robustness and simplicity. BPNN employs a Multi-Layer Perceptron structure, typically consisting of an input layer, one or more hidden layers, and an output layer. The core principle of BPNN is the backpropagation algorithm, which adjusts the network weights to minimize the error between the predicted outputs and the actual targets. This is achieved through an iterative process of forward and backward passes.
During the forward pass, input data are propagated through the network, generating an output. The error is then calculated using a loss function, such as the mean squared error (MSE):
$E = \frac{1}{2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
where $y_i$ is the actual target value and $\hat{y}_i$ is the value predicted by the network.
This error is propagated backward through the network to update the weights in the backward pass. The learning algorithm performs a gradient descent optimization on the weights linking the nodes in each layer. The weight update rule is derived from the gradient descent method, where the weights are adjusted in the direction that reduces the error. The update for a weight w i j from neuron i to neuron j is given by
$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}}$
where $\eta$ is the learning rate, controlling the step size of the weight update. The partial derivative $\partial E / \partial w_{ij}$ is computed using the chain rule, which involves calculating the gradient of the error with respect to the weights.
Despite its advantages, BPNN has shortcomings, such as long training times and potential overtraining. However, its robustness and generally good performance across a wide range of applications make it a valuable tool in neural network modeling. Due to its effectiveness and ease of use, BPNN is often considered a benchmark for comparing the performance of other neural network models. This iterative weight adjustment process continues until the network converges to a state where the error is minimized, thereby improving the model’s accuracy. BPNN’s ability to fine-tune weights through gradient descent is highly effective for various applications, including pattern recognition, time series forecasting, and complex function approximation.
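The forward pass, backward pass, and gradient-descent update described above can be sketched end-to-end in numpy. The one-hidden-layer architecture, toy linear target, learning rate, and epoch count are illustrative assumptions, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, (64, 2))
y = (X[:, 0] * 0.5 - X[:, 1] * 0.3).reshape(-1, 1)   # toy target

W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)     # hidden layer
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)     # output layer
eta = 0.1                                            # learning rate

def loss(y, y_hat):
    return 0.5 * np.mean((y - y_hat) ** 2)           # E = 1/2 sum of squares

loss_before = None
for epoch in range(200):
    Hid = np.tanh(X @ W1 + b1)                       # forward pass
    y_hat = Hid @ W2 + b2
    if loss_before is None:
        loss_before = loss(y, y_hat)
    dy = (y_hat - y) / len(X)                        # dE/dy_hat
    dH = (dy @ W2.T) * (1 - Hid ** 2)                # chain rule through tanh
    W2 -= eta * Hid.T @ dy                           # Δw = -η ∂E/∂w
    b2 -= eta * dy.sum(0)
    W1 -= eta * X.T @ dH
    b1 -= eta * dH.sum(0)

loss_after = loss(y, np.tanh(X @ W1 + b1) @ W2 + b2)
print(loss_before > loss_after)
```

The loop repeats forward and backward passes until the error shrinks, which is exactly the iterative convergence behavior described above.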

3.3.6. Support Vector Regression (SVR)

Support Vector Regression (SVR) is a non-linear regression technique based on Support Vector Machine (SVM) principles. SVR excels in approximating functions and works by identifying a regression hyperplane in a high-dimensional feature space with minimal risk. According to Kazem et al. (2013), the formulation of SVR can be expressed as follows:
$f(x) = w^T \phi(x) + b$
Minimize
$\frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$
Subject to
$y_i - (w \cdot \phi(x_i) + b) \le \epsilon + \xi_i$
$(w \cdot \phi(x_i) + b) - y_i \le \epsilon + \xi_i^*$
$\xi_i, \xi_i^* \ge 0$
In this formulation, $x_i \in \mathbb{R}^k$ for $i = 1, 2, \ldots, n$ and $y_i \in \mathbb{R}$. Here, $y_i$ represents the target value of $x_i$, $w$ is the weight vector, $\phi(x)$ denotes a non-linear mapping function, and $b$ is a bias term. The variables $\xi_i$ and $\xi_i^*$ are slack variables that account for deviations beyond the margin of tolerance $\epsilon$.
SVR aims to determine the optimal hyperplane that approximates the data with a minimal margin of error and maintains the model’s generalization ability by managing the trade-off between the hyperplane’s flatness and the error tolerance.
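An ε-insensitive SVR on lagged prices can be sketched with scikit-learn. The synthetic series, the 5-lag design, and the kernel, C, and epsilon values are illustrative assumptions, not the paper's tuned settings.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic price series standing in for OJ=F.
rng = np.random.default_rng(7)
prices = 175 + rng.normal(0, 3, 300).cumsum()

# Build a lag matrix: each row holds 5 past prices, target is the next price.
lags = 5
X = np.column_stack([prices[i:len(prices) - lags + i] for i in range(lags)])
y = prices[lags:]

model = SVR(kernel="rbf", C=100.0, epsilon=0.1)
model.fit(X[:-5], y[:-5])              # hold out the last 5 points
y_hat = model.predict(X[-5:])          # pseudo 5-day-ahead evaluation
print(y_hat.shape)
```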

3.3.7. Non-Linear Autoregressive (NAR) Neural Network

A neural network is a computational model designed for data processing that can capture relationships within data. One of the significant advantages of artificial neural networks (ANNs) over other forecasting and modeling approaches is their ability to approximate complex functions with high precision and identify nonlinear patterns in input data without preset assumptions. Dynamic neural networks, particularly the NAR model, are extensively utilized for modeling and forecasting time series data, such as financial time series.
The NAR model addresses nonlinear time series problems by utilizing a single time series and predicting its future values based solely on its past values. Mathematically, the future value of a time series $Y_t$ is forecasted from its previous values $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-d}$, where $f$ represents the mapping function performed by the neural network:
$Y_t = f(Y_{t-1}, \ldots, Y_{t-d})$
This model aims to learn the optimal weights for the neurons to minimize the error between the network’s output and the actual values. A crucial aspect of neural-network-based forecasting is the network’s architecture, which defines the number of neurons in each layer and the connections between them. A feed-forward network with a hidden layer is commonly employed for time series modeling and forecasting. The NAR neural network typically features a feed-forward structure with a tansigmoid transfer function in the hidden layer and a linear transfer function in the output layer.
Determining the number of hidden neurons and the number of delays in observations (denoted by d) is essential because these parameters significantly influence the autocorrelation structure of the time series. Researchers often rely on trial-and-error experiments to choose these parameters due to the lack of a theoretical method for their determination. In one-step-ahead forecasting tasks, the number of neurons in the output layer is usually set to one.
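A NAR-style setup can be sketched with a scikit-learn feed-forward network: a tanh hidden layer, a linear output, and $d$ delayed observations as inputs. The delay $d = 8$ and the 10-neuron hidden layer are trial choices of the kind the text describes, not theoretically determined values.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic univariate series (noisy sinusoid) standing in for a price series.
rng = np.random.default_rng(8)
series = np.sin(np.arange(400) * 0.1) + rng.normal(0, 0.05, 400)

# Build delay-embedded inputs: Y_t predicted from Y_{t-1}, ..., Y_{t-d}.
d = 8
X = np.column_stack([series[i:len(series) - d + i] for i in range(d)])
y = series[d:]

nar = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                   max_iter=2000, random_state=0)
nar.fit(X[:-1], y[:-1])
one_step = nar.predict(X[-1:])         # one output neuron: one-step-ahead
print(one_step.shape)
```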

3.4. Assessment Indicators

3.4.1. Loss Functions

The study presents a comprehensive analysis of the forecasting accuracy of various loss functions, including the commonly used Mean Absolute Percentage Error (MAPE), as well as the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
Mean Absolute Error: $\mathrm{MAE} = \frac{1}{h} \sum_{j=1}^{h} \left| y_{t+j} - \hat{y}_{t+j} \right|$,
Root Mean Square Error: $\mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{j=1}^{h} \left( y_{t+j} - \hat{y}_{t+j} \right)^2}$,
Mean Absolute Percentage Error: $\mathrm{MAPE} = \frac{1}{h} \sum_{j=1}^{h} \left| \frac{y_{t+j} - \hat{y}_{t+j}}{y_{t+j}} \right|$,
where $\hat{y}_{t+j}$ indicates the model's forecast at time $t+j$, $y_{t+j}$ refers to the dataset's actual values, $h$ is the forecasting horizon, and $j$ indexes the steps within the horizon. A lower value of these evaluation indicators signifies a smaller error, indicating that the predictive model converges toward accurate results.
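The three loss functions translate directly into numpy; the only added assumption is that MAPE is scaled by 100 to match the percentage values reported in the results.

```python
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):
    return np.mean(np.abs((y - y_hat) / y)) * 100  # in percent

# Tiny worked example over a 5-step horizon.
y = np.array([100.0, 102.0, 101.0, 105.0, 104.0])
y_hat = np.array([101.0, 101.0, 102.0, 104.0, 106.0])
print(mae(y, y_hat), rmse(y, y_hat), mape(y, y_hat))  # MAE here is 1.2
```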

3.4.2. Forecasting Performance Metrics

In addition, we utilize the directional accuracy (DA) and accuracy improvement (AI) metrics and the Diebold and Mariano (DM) test to evaluate the performance of forecasting models.
The directional accuracy (DA) is a metric used to assess forecasting models by measuring their ability to predict the direction of changes in observed values. This is especially valuable in financial forecasting, where accurately predicting whether prices will increase or decrease is often more important than predicting the exact value. A higher DA indicates better forecasting model performance in predicting the direction of changes.
$DA = \frac{100}{T} \sum_{t=1}^{T} d_t$
where $d_t$ is defined as
$d_t = \begin{cases} 1 & \text{if } \left( Y(t) - Y(t-1) \right) \left( \hat{Y}(t) - \hat{Y}(t-1) \right) \ge 0 \\ 0 & \text{otherwise} \end{cases}$
where $Y(t)$ and $\hat{Y}(t)$ are the actual and predicted values at time $t$, respectively, and $T$ is the sample size. The indicator $d_t$ checks whether the predicted change in the value (from $t-1$ to $t$) matches the actual change. If both the actual and predicted changes are in the same direction (both up or both down), $d_t$ equals 1, indicating a correct prediction. If the directions do not match, $d_t$ equals 0, indicating an incorrect prediction.
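Directional accuracy reduces to a few lines of numpy. One implementation detail assumed here: differencing a length-$T$ series yields $T-1$ changes, so the sketch averages over those $T-1$ indicator values.

```python
import numpy as np

def directional_accuracy(y, y_hat):
    dy = np.diff(y)                         # actual changes
    dy_hat = np.diff(y_hat)                 # predicted changes
    d_t = (dy * dy_hat >= 0).astype(int)    # 1 when directions agree
    return 100.0 * d_t.mean()

y = np.array([100.0, 101.0, 99.0, 102.0, 103.0])
y_hat = np.array([100.5, 102.0, 100.0, 101.0, 100.0])
print(directional_accuracy(y, y_hat))  # 3 of 4 directions match -> 75.0
```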
We have also employed the accuracy improvement (AI) and the Diebold and Mariano (DM) tests to compare the forecasting models more accurately.
The accuracy improvement (AI) is designed to compare two forecasting models. The accuracy improvement is defined as
$\mathrm{AI} = \frac{S - S_p}{S} \times 100\%$
where $S$ denotes the sum of the absolute errors for a specified benchmark model and $S_p$ the sum of the absolute errors for the proposed model. If $\mathrm{AI} > 0$, the proposed forecasting model performs better, whereas if $\mathrm{AI} < 0$, the proposed model has not overcome the specified model's drawback. The AI index provides a more intuitive way to compare precision.
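The AI index is a one-line computation on the two models' absolute errors; the error vectors below are made-up numbers for illustration.

```python
import numpy as np

def accuracy_improvement(errors_base, errors_proposed):
    S = np.sum(np.abs(errors_base))        # benchmark absolute-error sum
    S_p = np.sum(np.abs(errors_proposed))  # proposed-model absolute-error sum
    return (S - S_p) / S * 100.0           # > 0 means the proposed model wins

base = np.array([2.0, 1.0, 3.0, 2.0])      # e.g., a benchmark's errors
prop = np.array([1.0, 0.5, 1.5, 1.0])      # e.g., a proposed model's errors
print(accuracy_improvement(base, prop))    # (8 - 4) / 8 * 100 = 50.0
```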
Finally, we used the predictive accuracy test suggested by Diebold and Mariano (1995) to assess the statistical significance of enhancements in forecast accuracy. This test is commonly utilized to compare the predictive capabilities of various models and ascertain if the differences in accuracy are statistically meaningful.
DM = d̄ / √(σ̂_d² / T)
where d̄ is the mean of the loss differential series d_t, defined as the difference between the loss of the first model and the loss of the second model, T is the number of observations, and σ̂_d² is an estimate of the variance of d_t.
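The DM statistic in its simplest form can be sketched as below. This is an illustrative version assuming NumPy and a plain sample variance of the loss differential; the original Diebold–Mariano (1995) procedure uses an autocovariance-corrected (HAC) variance for multi-step forecasts, which is omitted here for brevity.

```python
import numpy as np

def diebold_mariano(loss1, loss2):
    """DM = dbar / sqrt(var(d) / T), where d_t = loss1_t - loss2_t.

    Simplified sketch: uses the sample variance of d with no
    autocovariance correction. Compare |DM| to standard normal
    critical values for significance.
    """
    d = np.asarray(loss1, dtype=float) - np.asarray(loss2, dtype=float)
    t = d.size
    dbar = d.mean()                 # mean loss differential
    var_d = d.var(ddof=1)           # sample variance of d_t
    return dbar / np.sqrt(var_d / t)
```

A large positive DM indicates the first model incurs systematically larger losses than the second; values beyond ±2.58 correspond roughly to the 1% level reported in Tables 4 and 7.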

4. Estimation Results

This study’s estimation results present forecasts over horizons of 5 trading days and 10 trading days. For each forecasting model, we first produced a forecast without influencing factors (single factor—OJ = F). We then added the ES = F factor alongside OJ = F, next introduced the S&P500 Index factor with OJ = F and conducted a new estimation, and finally performed estimations including all factors in the forecasting process.

4.1. Forecast Results in the 5-Trading Day Horizon

Table 2 presents the forecast accuracy results for the study’s various models and the evaluation indicators across different financial indices (Single-factor (OJ = F), Commodities Index, S&P500 Index, and a combined category of all factors) in a 5-trading-day horizon (steps). Compared to advanced models, the traditional ARIMA model shows the highest error rates across all metrics, indicating its limited capacity to handle complex time series data. LSTM stands out with the lowest error rates, particularly excelling in single-factor and S&P500 Index predictions, showcasing its strength in modeling temporal dependencies with high accuracy (e.g., MAE of 12.4155 and 9.4766 and MAPE of 3.1107% and 2.4010%, respectively). RNN and BPNN also perform well, though RNN shows higher errors in the Commodities Index, indicating variability in performance across different data types. SVR exhibits consistent but moderate accuracy, with relatively low errors but less effectiveness than neural networks. BPNN shows low errors, specifically when introducing the Commodities Index and when we combine all factors to forecast the daily price of orange juice futures. While improving over ARIMA, NAR still presents higher errors, especially in the combined category of all factors. ConvLSTM demonstrates robust performance with low errors across most categories, second only to LSTM, highlighting its efficacy in capturing spatial and temporal data patterns. Overall, the results emphasize the superiority of advanced neural network models, particularly BPNN, LSTM, and ConvLSTM, in achieving accurate forecasts in financial time series data.
Table 3 compares the performance of various models in terms of directional accuracy and average improvement over a 5-day prediction period. The ARIMA model, serving as a baseline, shows a directional accuracy of 50.53%. The ConvLSTM model outperforms all others, with the highest directional accuracy of 62.11% and an average improvement of 65.33%. SVR also demonstrates strong performance with a 58.95% directional accuracy and 56.13% average improvement. The LSTM and BPNN models provide moderate enhancements, with directional accuracies of 55.79% and 51.58%, respectively, and average improvements of 39.11% and 51.82%. Interestingly, despite its poor directional accuracy of 46.32%, the NAR model shows the highest average improvement at 66.86%, suggesting it may excel in other prediction aspects. RNN, with a directional accuracy of 47.37%, shows a significant average improvement of 53.59%. Overall, ConvLSTM is the most reliable model for directional predictions, followed by SVR, while NAR and RNN might enhance different prediction metrics beyond directional accuracy.
Next, in Figure 2, we compare each prediction model’s estimated results with the actual values over the 5-trading-day horizon. The ARIMA model performs poorly: its forecasted values are flat and do not capture the trend and volatility of the actual data. The LSTM and ConvLSTM models, on the other hand, perform well, closely aligning with the actual data and accurately capturing both the upward trend and its fluctuations. The RNN model shows moderate accuracy, better than ARIMA, but still misses some key variations in the data. The BPNN also performs reasonably well, capturing the overall trend with some deviations. The SVR, however, displays substantial divergence from the actual data, indicating its inadequacy in this forecasting context. The NAR model improves upon ARIMA and SVR, capturing the general trend but still missing several peaks and troughs. We conclude that advanced neural network models, particularly LSTM and ConvLSTM, demonstrate superior forecasting capabilities and effectively handle the complexities and nonlinearities present in the time series data.
Finally, in Table 4, we present the results of the Diebold–Mariano test. The findings reveal significant differences in forecasting accuracy among various models, with ARIMA showing highly significant differences ( p < 0.01 ) compared to all other models, suggesting its distinct performance characteristics. Notably, LSTM consistently outperforms other models, as indicated by significant positive DM statistics across all comparisons ( p < 0.01 ) . In contrast, RNN exhibits a notable negative DM value when compared with the BPNN, indicating inferior performance while showing better performance against SVR, NAR, and ConvLSTM. The BPNN model shows significant differences with SVR, NAR, and ConvLSTM, highlighting its unique predictive capabilities. The SVR and NAR comparisons also indicate significant differences, suggesting varied forecasting strengths. Furthermore, the comparison between ConvLSTM and NAR shows no significant difference, implying similar performance. As such, we observe that the DM test results underscore the variability in forecasting accuracy among the models.

4.2. Forecast Results in 10-Trading Day Horizon

Table 5 presents the accuracy of the forecasting models evaluated over 10-day (horizon—trading day) steps across different datasets, similar to 5-day steps. The ARIMA model consistently shows the highest error rates across all metrics and datasets, with an MAE ranging from 51.4101 to 75.0684 and a MAPE from 12.8525% to 18.8773%, indicating its limited efficacy in forecasting complex time series data. In contrast, the LSTM model demonstrates significantly lower errors, with an MAE between 17.7515 and 19.3754 and a MAPE around 4.4288% to 4.7611%, highlighting its superior ability to capture temporal dependencies. The RNN model also performs well, particularly for the Commodities Index, but shows higher variability with an MAE from 18.9532 to 40.0245 and a MAPE from 4.6796% to 10.0940%. The BPNN model exhibits robust performance with the lowest errors among the neural networks in the category of all factors, achieving an MAE of 11.7737 and a MAPE of 2.9399%. The SVR outperforms the other models and maintains consistent, higher accuracy across datasets with an MAE from 12.4496 to 13.9587 and a MAPE of around 3.0931% to 3.4438%. It is noticeable that the SVR is the most accurate model for a single factor, including the Commodities Index and the S&P500 Index. While improving over ARIMA, the NAR model still shows higher errors than LSTM and BPNN, with an MAE between 18.7106 and 19.4742 and a MAPE from 4.7287% to 4.8448%. Lastly, ConvLSTM displays strong forecasting capability, excelling in the Commodities Index with the lowest MSE of 407.6756 and an RMSE of 20.1910 and maintaining a competitive performance across other datasets. These empirical results indicate that the SVR model and, in one case, the BPNN model are the most accurate models for the forecasting of orange juice futures prices, even if additional influencing factors are included in the prediction process. 
These findings are, to an extent, similar to the estimation results in the 5-trading day horizon, although in the 5-day step forecasts, the LSTM and the ConvLSTM demonstrated superior forecasting accuracy in some cases. Additionally, concurrently incorporating extra relevant factors can enhance all predictive models’ performance. Thus, it has been shown once more that integrating influencing factors can decrease the forecasting model’s prediction error and boost the accuracy of forecasting orange juice (OJ = F) futures prices.
Table 6 presents the results of the directional accuracy (DA) and accuracy improvement (AI) criterion for various forecasting models over 10-day steps. Directional accuracy measures the percentage of the correctly predicted direction of changes, while AI reflects the percentage improvement over a baseline model, in this study the ARIMA model. ARIMA shows a directional accuracy of 48.89%, serving as the AI baseline. Among the models, BPNN achieves the highest directional accuracy at 54.44%, indicating superior predictive capability in capturing the direction of changes. ConvLSTM, despite having a directional accuracy of 50.00%, shows the most substantial average improvement (63.75%) over ARIMA, highlighting its efficacy in enhancing forecasting accuracy. NAR and LSTM exhibit notable average improvements despite lower directional accuracies (45.56% and 44.44%). RNN and SVR demonstrate moderate directional accuracies (51.11% and 52.22%) but differ in accuracy improvement, with RNN showing a significant improvement (33.30%) compared to SVR (8.20%). Therefore, the results suggest that while directional accuracy varies across models, advanced neural networks like ConvLSTM and NAR substantially improve forecasting accuracy.
Then, we evaluated the forecasted values from the estimation models, which take into account all the relevant factors with the actual value, as shown in Figure 3. We can observe that the ARIMA model significantly underperforms, failing to capture the upward trend and volatility, indicating its limitations in forecasting complex, nonlinear patterns. On the other hand, the LSTM and ConvLSTM models demonstrate a closer alignment with actual values, particularly in capturing the general upward trend and peak levels, highlighting their superior ability to handle time series data with temporal dependencies. The RNN and BPNN models exhibit moderate performance, capturing some trends but with notable deviations and missed volatility. The SVR model shows substantial divergence from actual values, at least in this context, reflecting its inadequacy. Finally, while better than ARIMA, the NAR model still shows significant discrepancies, particularly in capturing peak values. Therefore, by comparing the predicted and actual values, we find that the LSTM and ConvLSTM models stand out for their enhanced forecasting capabilities, which effectively model the underlying patterns and trends in the data.
Finally, we employed the Diebold–Mariano (DM) test to examine the forecasting accuracy of the various models over 10-day steps. Table 7 presents the test results. More specifically, the ARIMA model shows highly significant differences ( p < 0.01 ) with all models except SVR, where it slightly underperforms (−0.6466 ***), suggesting ARIMA’s generally distinct predictive behavior. LSTM exhibits significantly better performance compared to RNN (−8.3953 ***) and SVR (−17.9334 ***) but slightly outperforms BPNN (1.8406 ***) and demonstrates superior accuracy against NAR and ConvLSTM. RNN’s performance is significantly worse than SVR (−16.0307 ***) but better than BPNN (5.8633 ***) and significantly improved over NAR and ConvLSTM. BPNN shows similar trends, underperforming against SVR (−16.2270 ***) but outperforming NAR and ConvLSTM. SVR demonstrates significant superiority over NAR (16.5782 ***) and ConvLSTM (19.7874 ***). Lastly, NAR’s performance is significantly outperformed by ConvLSTM (15.6224 ***). These findings highlight the variable forecasting capabilities across models, with advanced neural networks like LSTM and ConvLSTM often exhibiting superior performance compared to traditional models such as ARIMA, particularly in handling complex time series data.

5. Conclusions

This study evaluated the forecasting accuracy of various models with different configurations over 5-day and 10-day trading horizons to forecast orange juice futures (OJ = F) prices. We have employed a dataset from July 2022 to June 2024. Our analysis included traditional models like ARIMA and advanced neural network models such as LSTM, RNN, BPNN, SVR, and ConvLSTM, with varying influencing factors like the Commodities Index and the S&P500 Index. In addition, we have adopted a set of loss function metrics to evaluate the accuracy of each model and various tests to assess the performance of each forecasting model.
For the 5-trading day forecasting horizon, the advanced neural network models, particularly LSTM and ConvLSTM, consistently outperformed traditional models like ARIMA. LSTM achieved the lowest error rates and demonstrated superior capability in capturing temporal dependencies, especially in single-factor and S&P500 Index predictions. ConvLSTM also exhibited strong performance, highlighting its effectiveness in modeling spatial and temporal data patterns. The directional accuracy and Diebold–Mariano test further supported the superiority of LSTM and ConvLSTM over other models.
In the 10-trading day forecasting period, we observed similar trends. While ARIMA displayed the highest error rates, the LSTM and ConvLSTM models showed significantly lower errors and better alignment with actual values. The BPNN model also performed well, mainly when we incorporated all factors. The SVR model maintained consistent accuracy across datasets, especially for single-factor predictions. The Diebold–Mariano test results indicated significant differences in forecasting accuracy, with advanced neural network models generally outperforming traditional models.
The findings of this study demonstrate that advanced models such as LSTM and ConvLSTM outperform traditional methods like ARIMA in forecasting orange juice futures prices. Specifically, LSTM achieved the lowest error rates across various factors, including the Commodities Index and S&P500 Index. This differs from previous research on commodities such as crude oil and gold, which favored machine learning techniques (e.g., LSTM, GRU) while emphasizing different influencing factors such as investor sentiment and macroeconomic indicators. For example, research by Guo et al. (2023) highlighted the superior performance of GRU in crude oil forecasting, particularly when considering relevant factors like volatility and historical data. Furthermore, while previous studies applied hybrid models to energy commodities, this study demonstrates the advantage of neural network models for commodity markets, emphasizing the need to customize forecasting tools to the distinctive characteristics of each market.
Our empirical results also have practical implications. Investors and analysts can promptly analyze market trends and identify potential risks based on the forecasting model results. As we have observed, the findings emphasize the superiority of advanced neural network models, particularly LSTM and ConvLSTM, in forecasting complex time series data. These models effectively capture underlying patterns and trends, offering enhanced forecasting capabilities compared to traditional models like ARIMA. Incorporating influencing factors further improves the predictive performance of these models, underscoring the importance of considering multiple variables in the forecasting of financial assets. This optimization enhances investors’ investment performance and reduces risk. The DM test in both periods supports these findings by indicating that models like LSTM and ConvLSTM not only provide statistically better predictions but also offer traders and investors more reliable forecasts for decision-making. This could lead to improved returns and reduced risks, especially in volatile markets such as orange juice futures.

Limitations and Further Research

Despite the promising results, this study, like any other, has limitations that warrant further research. First, the dataset was limited to specific financial indices and assets. As such, future research could explore a broader range of variables and datasets to enhance the generalizability of the findings. In addition, we observed only two forecasting horizons (5-trading day and 10-trading day steps). Examining shorter- or longer-term forecasting estimations could provide more insights into the robustness and reliability of these models.
Second, while advanced neural network models showed superior performance, model optimization is also very important. Future studies should explore different optimization methods to enhance forecasting accuracy or incorporate additional forecasting models, such as hybrid models and parameters. At the same time, although neural networks like LSTM and ConvLSTM can effectively model nonlinear relationships, they require extensive data for training to avoid overfitting, especially in highly volatile markets like orange juice futures. Additionally, neural networks are computationally intensive, requiring significant time and resources for both training and fine-tuning, particularly as the complexity of the network increases. Finally, another practical consideration is the interpretability of these models. Neural networks are often seen as black boxes, making it difficult for users to understand how predictions are derived.
Finally, the study primarily focused on point forecasts. Introducing probabilistic forecasting methods could offer a more comprehensive evaluation of model performance by considering uncertainty and confidence intervals in predictions. Furthermore, the economic implications of these forecasts were not analyzed. Future research should assess the practical applications and financial benefits of employing advanced neural network models for trading and investment strategies.
Nevertheless, the study demonstrated the potential of the models utilized in financial forecasting, and further research could lead to even more robust and practical forecasting solutions.

Funding

This research received no external funding.

Data Availability Statement

Data are publicly available.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ampountolas, Apostolos. 2023. Comparative analysis of machine learning, hybrid, and deep learning forecasting models: Evidence from European financial markets and Bitcoins. Forecasting 5: 472–486. [Google Scholar] [CrossRef]
  2. Ampountolas, Apostolos. 2024. Enhancing forecasting accuracy in commodity and financial markets: Insights from GARCH and SVR models. International Journal of Financial Studies 12: 59. [Google Scholar] [CrossRef]
  3. Atsalakis, George, Dimitrios Frantzis, and Constantin Zopounidis. 2016. Commodities’ price trend forecasting by a neuro-fuzzy controller. Energy Systems 7: 73–102. [Google Scholar] [CrossRef]
  4. Barrow, Devon K., and Sven F. Crone. 2016. Cross-validation aggregation for combining autoregressive neural network forecasts. International Journal of Forecasting 32: 1120–1137. [Google Scholar] [CrossRef]
  5. Black, Angela J., Olga Klinkowska, David G. McMillan, and Fiona J. McMillan. 2014. Forecasting stock returns: Do commodity prices help? Journal of Forecasting 33: 627–639. [Google Scholar] [CrossRef]
  6. Brooks, Chris, Marcel Prokopczuk, and Yingying Wu. 2013. Commodity futures prices: More evidence on forecast power, risk premia and the theory of storage. The Quarterly Review of Economics and Finance 53: 73–85. [Google Scholar] [CrossRef]
  7. Butler, Sunil, Piotr Kokoszka, Hong Miao, and Han Lin Shang. 2021. Neural network prediction of crude oil futures using B-splines. Energy Economics 94: 105080. [Google Scholar] [CrossRef]
  8. Dickey, David A., and Wayne A. Fuller. 1979. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74: 427–431. [Google Scholar]
  9. Diebold, Francis X., and Roberto S. Mariano. 1995. Comparing predictive accuracy. Journal of Business & Economic Statistics 13: 134–144. [Google Scholar]
  10. Drachal, Krzysztof, and Michał Pawłowski. 2021. A review of the applications of genetic algorithms to forecasting prices of commodities. Economies 9: 6. [Google Scholar] [CrossRef]
  11. Durbin, Dee-Ann, and Tatianna Pollastri. 2024. High Orange Juice Prices May Be on the Table for a While Due to Disease and Extreme Weather—finance.yahoo.com. Available online: https://finance.yahoo.com/news/high-orange-juice-prices-may-151316322.html (accessed on 18 July 2024).
  12. Guo, Lili, Xinya Huang, Yanjiao Li, and Houjian Li. 2023. Forecasting crude oil futures price using machine learning methods: Evidence from China. Energy Economics 127: 107089. [Google Scholar] [CrossRef]
  13. Gupta, Nalini, and Shobhit Nigam. 2020. Crude oil price prediction using artificial neural network. Procedia Computer Science 170: 642–647. [Google Scholar] [CrossRef]
  14. Henrique, Bruno Miranda, Vinicius Amorim Sobreiro, and Herbert Kimura. 2018. Stock price prediction using Support Vector Regression on daily and up to the minute prices. The Journal of Finance and Data Science 4: 183–201. [Google Scholar] [CrossRef]
  15. Hyndman, Rob J., and George Athanasopoulos. 2018. Forecasting: Principles and Practice. Melbourne: OTexts. Available online: http://OTexts.com/fpp2/ (accessed on 10 July 2024).
  16. Jiang, Zhe, Lin Zhang, Lingling Zhang, and Bo Wen. 2022. Investor sentiment and machine learning: Predicting the price of China’s crude oil futures market. Energy 247: 123471. [Google Scholar] [CrossRef]
  17. Kazem, Ahmad, Ebrahim Sharifi, Farookh Khadeer Hussain, Morteza Saberi, and Omar Khadeer Hussain. 2013. Support Vector Regression with chaos-based firefly algorithm for stock market price forecasting. Applied Soft Computing 13: 947–958. [Google Scholar] [CrossRef]
  18. Kroner, Kenneth F., Kevin P. Kneafsey, and Stijn Claessens. 1995. Forecasting volatility in commodity markets. Journal of Forecasting 14: 77–95. [Google Scholar] [CrossRef]
  19. Ren, Xiaohang, Wenting Jiang, Qiang Ji, and Pengxiang Zhai. 2024. Seeing is believing: Forecasting crude oil price trend from the perspective of images. Journal of Forecasting 43: 2809–2821. [Google Scholar] [CrossRef]
  20. Roll, Richard. 1984. Orange juice and weather. The American Economic Review 74: 861–880. [Google Scholar]
  21. Sun, Yongxuan, Bowen Zhang, Zhizhong Ding, Momiao Zhou, Mingxi Geng, Xi Wu, Jie Li, and Wei Sun. 2022. Environment-aware vehicle lane change prediction using a cumulative probability mapping model. International Journal of Sensor Networks 40: 1–9. [Google Scholar] [CrossRef]
  22. Wang, Donghua, and Tianhui Fang. 2022. Forecasting crude oil prices with a WT-FNN model. Energies 15: 1955. [Google Scholar] [CrossRef]
  23. Wang, Wenting, and Longbao Wei. 2021. Impacts of agricultural price support policy on price variability and welfare: Evidence from China’s soybean market. Agricultural Economics 52: 3–17. [Google Scholar] [CrossRef]
  24. Zhang, Dongqing, Guangming Zang, Jing Li, Kaiping Ma, and Huan Liu. 2018. Prediction of soybean price in China using a QR-RBF neural network model. Computers and Electronics in Agriculture 154: 10–17. [Google Scholar] [CrossRef]
  25. Zhao, Lin, Xun Zhang, Shouyang Wang, and Shanying Xu. 2016. The effects of oil price shocks on output and inflation in China. Energy Economics 53: 101–110. [Google Scholar] [CrossRef]
  26. Zhao, Yang, Jianping Li, and Lean Yu. 2017. A deep learning ensemble approach for crude oil price forecasting. Energy Economics 66: 9–16. [Google Scholar] [CrossRef]
  27. Zheng, Li, Yuying Sun, and Shouyang Wang. 2024. A novel interval-based hybrid framework for crude oil price forecasting and trading. Energy Economics 130: 107266. [Google Scholar] [CrossRef]
Figure 1. Orange juice futures price and trend.
Figure 2. Comparison of forecasting models’ actual and predicted values—5-day steps.
Figure 3. Forecasts model comparison of actual and predicted values—10-day steps.
Table 1. Dataset summary statistics.

| Asset | Obs | Mean | Std. Dev | Min | Max | Kurtosis | Skewness |
|---|---|---|---|---|---|---|---|
| OJ = F | 504 | 289.4238 | 82.7351 | 150.6500 | 487.2000 | −1.1256 | 0.1262 |
| *Financial Indices* | | | | | | | |
| ES = F | 504 | 4394.4995 | 483.9666 | 3588.5000 | 5491.0000 | −0.7725 | 0.5726 |
| GSPC | 503 | 4376.1165 | 478.4562 | 3577.0300 | 5487.0298 | −0.7436 | 0.5850 |
Table 2. Accuracy of models’ forecasting results in 5-trading-day steps.

| Models | Metrics | Single Factor | Commodities Index | S&P500 Index | All |
|---|---|---|---|---|---|
| ARIMA | MAE | 75.3148 | 77.0628 | 52.4859 | 52.3981 |
| | MSE | 6904.3332 | 7212.5565 | 3565.8827 | 3546.0604 |
| | RMSE | 83.0923 | 84.9268 | 59.7150 | 59.5488 |
| | MAPE | 18.8555 | 19.2984 | 13.0741 | 13.0551 |
| LSTM | MAE | 12.4155 | 19.1915 | 9.4766 | 14.4386 |
| | MSE | 254.8753 | 522.2959 | 158.0147 | 341.6509 |
| | RMSE | 15.9648 | 22.8538 | 12.5704 | 18.4838 |
| | MAPE | 3.1107 | 4.8310 | 2.4010 | 3.5943 |
| RNN | MAE | 13.7396 | 26.1952 | 14.7374 | 14.3007 |
| | MSE | 316.8579 | 970.6563 | 401.1044 | 329.8499 |
| | RMSE | 17.8005 | 31.1554 | 20.0276 | 18.1618 |
| | MAPE | 3.4733 | 6.6225 | 3.6936 | 3.6512 |
| BPNN | MAE | 8.4186 | 42.2337 | 10.2134 | 9.9090 |
| | MSE | 137.5138 | 1943.8305 | 155.2525 | 159.2351 |
| | RMSE | 11.7266 | 44.0889 | 12.4600 | 12.6188 |
| | MAPE | 2.1074 | 10.9688 | 2.6142 | 2.5316 |
| SVR | MAE | 8.6390 | 10.7829 | 10.8827 | 10.1447 |
| | MSE | 142.4235 | 165.7649 | 166.9276 | 156.7200 |
| | RMSE | 11.9341 | 12.8750 | 12.9200 | 12.5188 |
| | MAPE | 2.1728 | 2.8067 | 2.8366 | 2.6253 |
| NAR | MAE | 14.1338 | 13.7667 | 13.6553 | 19.1757 |
| | MSE | 289.0329 | 288.4339 | 288.7256 | 501.5138 |
| | RMSE | 17.0010 | 16.9833 | 16.9919 | 22.3945 |
| | MAPE | 3.6222 | 3.5077 | 3.4741 | 4.9971 |
| ConvLSTM | MAE | 11.4094 | 11.4070 | 10.4159 | 14.5008 |
| | MSE | 185.1033 | 182.2130 | 162.6494 | 272.5280 |
| | RMSE | 13.6053 | 13.4986 | 12.7534 | 16.5084 |
| | MAPE | 2.9474 | 2.9387 | 2.6706 | 3.7759 |
Table 3. Directional accuracy and accuracy improvement—5-day steps (all factors).

| Models | Directional Accuracy (%) | Average Improvement (%) |
|---|---|---|
| ARIMA | 50.53 | — |
| LSTM | 55.79 | 39.11 |
| RNN | 47.37 | 53.59 |
| BPNN | 51.58 | 51.82 |
| SVR | 58.95 | 56.13 |
| NAR | 46.32 | 66.86 |
| ConvLSTM | 62.11 | 65.33 |
Table 4. Diebold–Mariano (DM) test results among forecasting models in 5-day steps. Columns give the benchmark model.

| Models | LSTM | RNN | BPNN | SVR | NAR | ConvLSTM |
|---|---|---|---|---|---|---|
| ARIMA | 17.8053 *** | 24.3258 *** | 20.7271 *** | 25.5028 *** | 24.4659 *** | 22.6057 *** |
| LSTM | | 16.8237 *** | 26.5703 *** | 50.8174 *** | 26.8106 *** | 30.5143 *** |
| RNN | | | −1.9965 ** | 3.0291 * | 18.2003 *** | 9.1926 *** |
| BPNN | | | | 9.5119 *** | 18.0418 *** | 19.7829 *** |
| SVR | | | | | 11.7014 *** | 12.5263 *** |
| NAR | | | | | | −1.6399 |

Note: * Significance at 10% level, ** at the 5% level, *** at the 1% level.
Table 5. Accuracy of models’ forecasting results in the 10-trading-day horizon.

| Models | Metrics | Single Factor | Commodities Index | S&P500 Index | All |
|---|---|---|---|---|---|
| ARIMA | MAE | 67.8574 | 75.0684 | 51.4101 | 51.4660 |
| | MSE | 5836.0380 | 6898.8519 | 3465.5084 | 3473.0407 |
| | RMSE | 76.3940 | 83.0593 | 58.8686 | 58.9325 |
| | MAPE | 17.0000 | 18.8773 | 12.8525 | 12.8667 |
| LSTM | MAE | 18.8399 | 19.2947 | 17.7515 | 19.3754 |
| | MSE | 640.2273 | 668.2661 | 586.0565 | 707.8805 |
| | RMSE | 25.3027 | 25.8508 | 24.2086 | 26.6060 |
| | MAPE | 4.6487 | 4.7541 | 4.4288 | 4.7611 |
| RNN | MAE | 40.0245 | 20.8836 | 18.9532 | 21.0377 |
| | MSE | 2206.0296 | 812.4532 | 647.3866 | 900.3059 |
| | RMSE | 46.9684 | 28.5036 | 25.4438 | 30.0051 |
| | MAPE | 10.0940 | 5.1418 | 4.6796 | 5.1378 |
| BPNN | MAE | 14.5993 | 17.1953 | 14.3158 | 11.7737 |
| | MSE | 377.5428 | 426.3074 | 344.9291 | 256.9264 |
| | RMSE | 19.4305 | 20.6472 | 18.5723 | 16.0289 |
| | MAPE | 3.6595 | 4.4805 | 3.5939 | 2.9399 |
| SVR | MAE | 12.9888 | 12.4810 | 12.4496 | 13.9587 |
| | MSE | 345.7689 | 285.5075 | 284.0731 | 374.0397 |
| | RMSE | 18.5949 | 16.8970 | 16.8545 | 19.3401 |
| | MAPE | 3.1915 | 3.1000 | 3.0931 | 3.4438 |
| NAR | MAE | 18.7106 | 18.9656 | 18.9775 | 19.4742 |
| | MSE | 622.8136 | 621.6618 | 633.2322 | 649.2775 |
| | RMSE | 24.9562 | 24.9331 | 25.1641 | 25.4809 |
| | MAPE | 4.7287 | 4.7720 | 4.7523 | 4.8448 |
| ConvLSTM | MAE | 19.2422 | 15.5235 | 15.8455 | 17.5845 |
| | MSE | 621.6677 | 407.6756 | 408.7932 | 495.4774 |
| | RMSE | 24.9333 | 20.1910 | 20.2186 | 22.2593 |
| | MAPE | 4.9771 | 3.9664 | 4.0803 | 4.5623 |
Table 6. Directional accuracy and accuracy improvement—10-day steps (all factors).

| Models | Directional Accuracy (%) | Average Improvement (%) |
|---|---|---|
| ARIMA | 48.89 | — |
| LSTM | 44.44 | 40.33 |
| RNN | 51.11 | 33.30 |
| BPNN | 54.44 | 42.88 |
| SVR | 52.22 | 8.20 |
| NAR | 45.56 | 47.96 |
| ConvLSTM | 50.00 | 63.75 |
Table 7. Diebold–Mariano (DM) test results among forecasting models in 10-day steps. Columns give the benchmark model.

| Models | LSTM | RNN | BPNN | SVR | NAR | ConvLSTM |
|---|---|---|---|---|---|---|
| ARIMA | 18.5606 *** | 17.0005 *** | 17.7258 *** | −0.6466 *** | 18.1408 *** | 21.2885 *** |
| LSTM | | −8.3953 *** | 1.8406 *** | −17.9334 *** | 8.5297 *** | 19.5791 *** |
| RNN | | | 5.8633 *** | −16.0307 *** | 14.6996 *** | 19.0879 *** |
| BPNN | | | | −16.2270 *** | 3.1734 *** | 19.2480 *** |
| SVR | | | | | 16.5782 *** | 19.7874 *** |
| NAR | | | | | | 15.6224 *** |
Note: *** Significance at the 1% level.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
