1. Introduction
Energy consumption is required to power the worldwide economy, including most human activities [
1]. Accordingly, nearly all worldwide industrial production activities, transportation systems, economic development projects, and machinery and equipment rely upon the energy production and distribution of oil, natural gas, coal, electricity, biofuels, waste, and other sources of supply and their derivatives to enable all of these integrated components of the worldwide economy to function [
2]. This also means that energy prices, which affect every part of the supply chain and distribution system, make up a significant fraction of the price of every commercial product that requires shipping or electricity [
Because of this dependence, unexpected steep increases in energy prices can suppress economic growth. They could generate inflation in countries that are net importers of oil, depending upon circumstances, duration, the type of oil, existing inventory, the size of the nation, energy substitution capability, and the number of industries affected [
5]. Conversely, substantial energy price drops create serious budgetary problems for oil-exporting countries [
6]. At present, oil, which is by far the most highly consumed and concentrated energy source, provides somewhere between 30–35% of all of the energy needed to run the global economy, which is 5–10% more than coal, 10–15% more than natural gas, and 25–30% more than renewable and nuclear energy [
Despite the importance of energy in human activities, there is still little consensus on the roles of energy consumption and energy pricing mechanisms [8]. Historical oil price shocks, such as those of 1973/74, 1979/80, 1986, 1997/2000, and 2003/2008, are complex events that have had both stimulating and depressing effects on national economies. There is also a lack of consensus about the predictive accuracy of energy commodity pricing models, e.g., [
10]. The lack of understanding of how oil prices are formed and structured has resulted in unexpected oil price volatility, price spikes, and supply and demand shocks in international markets. Price volatility has a significant impact on global financial markets, economic development activity, and political stability, as well as on automobile fuel prices, airline transportation prices, and shipping rates. It also affects national economies and the prices of consumer goods and services.
To address the oil price modeling challenges, researchers have used a panoply of models, ranging from the simple random walk of Stevenson and Bear [11] to highly sophisticated nonlinear models, such as dynamic model averaging, as well as recent machine learning approaches to commodity forecasting. Most of the literature on commodity price forecasting can be grouped into three categories: univariate time series, multivariate time series, and volatility forecasting. In turn, under each category, researchers have used linear and nonlinear models with and without exogenous variables.
The autoregressive moving average (ARMA) model of Box and Jenkins [12] and its variants are widely used in time series forecasting. ARMA models stipulate that a stationary time series can be modeled as a weighted average of its own past observations (AR process) and of the past observations of a white noise process (MA process). A time series is stationary if its mean, variance, and autocovariances do not vary with time. If the series is not stationary, it can be made stationary by taking successive differences, yielding an autoregressive integrated moving average (ARIMA) process. The literature abounds with studies that use ARIMA as either a forecasting tool or a benchmark in predicting commodity prices, for example, [
16] for oil prices, [
17] for electricity prices, and [
18] for gold prices.
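As an illustration of the differencing step described above, the following minimal sketch fits an AR(1) model to the first differences of a series by ordinary least squares, which corresponds to an ARIMA(1,1,0) specification. The function names, the synthetic data, and the use of plain least squares are assumptions made for illustration; they are not the estimation procedure used in this paper.

```python
import numpy as np

def fit_ar1_on_differences(prices):
    """Estimate c and phi in  dy_t = c + phi * dy_{t-1} + e_t  by least squares."""
    dy = np.diff(prices)  # first difference removes the unit root
    # Regressors: an intercept and the lagged difference
    X = np.column_stack([np.ones(len(dy) - 1), dy[:-1]])
    c, phi = np.linalg.lstsq(X, dy[1:], rcond=None)[0]
    return c, phi

def forecast_next_price(prices, c, phi):
    """One-step-ahead forecast: last price plus the predicted next difference."""
    last_diff = prices[-1] - prices[-2]
    return prices[-1] + c + phi * last_diff

# Synthetic example (hypothetical data, not the WTI series)
rng = np.random.default_rng(0)
n, true_phi = 2000, 0.5
dy = np.zeros(n)
for t in range(1, n):
    dy[t] = true_phi * dy[t - 1] + rng.normal()
prices = 50.0 + np.cumsum(dy)  # integrating the differences gives a nonstationary level

c_hat, phi_hat = fit_ar1_on_differences(prices)
next_price = forecast_next_price(prices, c_hat, phi_hat)
```

On the simulated series, the estimated coefficient should land close to the value used to generate the data, and the forecast is obtained by adding the predicted difference back to the last observed level.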
There are other variations of the ARIMA model, such as the seasonal autoregressive integrated moving average or SARIMA, the controlled autoregressive integrated segmented moving average or CARISMA [
19], the fuzzy autoregressive integrated moving average or FARIMA [
20], and the fuzzy seasonal autoregressive integrated moving average (FSARIMA) that combines the seasonal SARIMA with the fuzzy regression model [
21]. Poisson, Markov, autoregressive, moving average, ARMA, and ARIMA processes are limited to short-range dependence. In contrast, the autoregressive fractionally integrated moving average (ARFIMA) model is a fractional-order signal processing technique that generalizes the ARIMA and ARMA models [
22]. Because of this generalization, the authors of [22] found that the ARFIMA model applies more broadly than the previously mentioned time series methods, as it captures both short-range and long-range dependence. They also found that the ARFIMA forecasting model yields a superior fit compared with established integer-order models when working with spatial or time series data with long-range dependence (LRD), that is, long memory or long-range persistence.
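To make the idea of fractional integration concrete, the sketch below computes the weights of the fractional difference operator (1 - B)^d from its binomial expansion and applies them to a series. The function names and the truncation length are illustrative assumptions. With d = 1 the weights reduce to the ordinary first difference, whereas a fractional d produces slowly decaying weights, which is how long memory enters the model.

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Weights w_k of (1 - B)^d = sum_k w_k B^k, via w_k = w_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(series, d, n_weights):
    """Apply the truncated fractional difference filter to a 1-D series."""
    x = np.asarray(series, dtype=float)
    w = frac_diff_weights(d, n_weights)
    # Element t of the convolution is sum_k w_k * x_{t-k} over the available lags
    return np.convolve(x, w)[: len(x)]

integer_weights = frac_diff_weights(1.0, 4)     # ordinary differencing
fractional_weights = frac_diff_weights(0.4, 4)  # long-memory case
```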
To overcome the limitations of the ARMA model and its variants, some researchers investigate whether oil price time series exhibit long-memory properties in returns or in volatility (see, for example, [27]). However, because they provide evidence of nonlinear dependence and of predictable returns and volatility, long-memory properties contradict the validity of weak-form oil market efficiency. Although some studies have empirically tested the modeling and forecasting of long-memory fluctuations in crude oil markets using generalized autoregressive conditional heteroskedasticity (GARCH)-type models (e.g., [31]), these studies still treat long memory in returns and long memory in volatility as separate, unrelated phenomena. Nevertheless, because market shocks are well known to have a substantial simultaneous influence on returns and volatility, the two can exhibit dual long-memory properties. On this basis, researchers such as [
35] use the joint ARFIMA-FIGARCH model to study the relationship between returns and volatility in economic and financial time series. This model is well suited to conducting such an analysis of a process demonstrating dual long-memory properties.
The recent availability of economic data, such as the Federal Reserve Economic Data (FRED; for a detailed description, see [36]) and the Economic Policy Uncertainty (EPU) data [37], has opened new research perspectives for forecasting oil prices with exogenous variables. For instance, the authors of [
38] use FRED and EPU datasets and other financial variables to predict crude oil prices using various machine learning models. However, these data exhibit high multicollinearity among their variables. For instance, Appendix
Table A1 indicates that approximately 50% of the variables in FRED data show a correlation coefficient greater than 0.50 in absolute value, with more than 26% exhibiting a correlation coefficient greater than 0.75.
The high correlation and dimensionality of large datasets such as FRED make data reduction techniques attractive. These techniques project the high-dimensional data onto a few orthogonal components, which eliminates multicollinearity and reduces the computational cost of working with the full set (for instance, the dynamic model averaging introduced by [39] does not allow more than 30 variables in the DMA R package). Principal component analysis (PCA) appears to be a popular choice among researchers in the area of economics and finance (see, for example, [
43]). The partial least squares (PLS) method is another data reduction technique to avoid the multicollinearity problem while taking advantage of the data pattern. While the PCA technique uses only the explanatory variables to extract the principal components without considering how each variable relates to the dependent variable, the partial least squares method offers an alternative approach to PCA by capturing the relationship between explanatory variables and dependent variables when extracting the components.
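The difference between the two techniques can be sketched with the first component of each. In the minimal example below, the first PCA direction is the leading right singular vector of the centered predictors, whereas the first PLS direction is proportional to the covariance between each predictor and the dependent variable. The data are synthetic and the function names are hypothetical; this is a sketch of the general idea, not the implementation used in this paper.

```python
import numpy as np

def first_pca_weights(X):
    """Direction of maximum variance of X; the dependent variable is ignored."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

def first_pls_weights(X, y):
    """Unit-norm direction that maximizes the covariance between X @ w and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)

# Synthetic predictors: five highly collinear columns plus one independent column
rng = np.random.default_rng(1)
n = 300
common = rng.normal(size=n)  # shared factor that creates multicollinearity
X = np.column_stack(
    [common + 0.1 * rng.normal(size=n) for _ in range(5)] + [rng.normal(size=n)]
)
y = 2.0 * X[:, 5] + 0.5 * rng.normal(size=n)  # target driven by the independent column

w_pca = first_pca_weights(X)
w_pls = first_pls_weights(X, y)

Xc = X - X.mean(axis=0)
cov_pca = abs(np.cov(Xc @ w_pca, y)[0, 1])  # covariance of the PCA score with y
cov_pls = abs(np.cov(Xc @ w_pls, y)[0, 1])  # covariance of the PLS score with y
```

By construction, the PLS score has at least as large a covariance with the dependent variable as the PCA score, because PCA selects its direction from the predictors alone.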
This paper assesses the out-of-sample forecasting accuracy of several time series models in predicting the West Texas Intermediate (WTI) crude oil prices. We consider a basic ARIMA model, a seasonal ARIMA (SARIMA) model, a partial least squares model using FRED economic data, augmented ARIMA and SARIMA models with exogenous variables (the PLS components, COVID-19, and the 2008 financial crisis dummy variables), an autoregressive conditional heteroskedasticity (ARCH) model, and a generalized autoregressive conditional heteroskedasticity (GARCH) model. We also ask and answer the following question: Can economic data, such as FRED, improve oil forecasting accuracy? We compare these alternative models using the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the root-mean-squared error (RMSE) of the out-of-sample predictions.
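For reference, the three accuracy criteria can be computed as in the following minimal sketch. The function names and the toy numbers are illustrative assumptions; MAPE is expressed in percent and requires nonzero actual values.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(a - f)))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(100.0 * np.mean(np.abs((a - f) / a)))

def rmse(actual, forecast):
    """Root-mean-squared error."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

# Toy out-of-sample comparison (hypothetical prices)
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
scores = {"MAE": mae(actual, forecast),
          "MAPE": mape(actual, forecast),
          "RMSE": rmse(actual, forecast)}
```

Because RMSE squares the errors before averaging, it penalizes large misses more heavily than MAE, which matters when the test period contains extreme events.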
This paper adds to the existing literature on oil price forecasting in at least two aspects. First, we combine mean and variance models to produce more accurate oil price forecasts. The results show that considering the volatility equation alongside the mean equation reduces the forecasting error by more than 70%. Second, we use the FRED data after reducing its dimension using partial least squares. This is crucial as the original FRED data are characterized by a high correlation among most variables, potentially inducing overfitting.
The rest of the paper is organized as follows. In Section 2, we describe the alternative models used for prediction. Section 3 describes the data and estimation issues. In Section 4 and Section 5, we present and discuss the results. Finally, in Section 6, we conclude and suggest future research avenues.
5. Discussion
In this study, we provided alternative models to forecast WTI crude oil prices. We also emphasized the forecasting power of the FRED-MD economic data by estimating models with and without the data. In
Table 15, we used the mean absolute percentage error (MAPE), the mean absolute error (MAE), and the root mean squared error (RMSE) to compare alternative models. Our results indicate that the ARIMA(1,1,0)-GARCH(1,1) model outperforms all of the other models in the three criteria. In addition, the models with conditional volatility outperform all of the other models using the three criteria.
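For readers who want the structure of the best-performing specification at a glance, a textbook form of an ARIMA(1,1,0) mean equation combined with a GARCH(1,1) variance equation is given below. The symbols (c, \phi, \omega, \alpha, \beta, z_t) are assumed here for illustration and may differ from the notation used in Section 2.

```latex
\begin{aligned}
\Delta y_t &= c + \phi\,\Delta y_{t-1} + \varepsilon_t, \qquad \Delta y_t = y_t - y_{t-1},\\
\varepsilon_t &= \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0,1),\\
\sigma_t^2 &= \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,
\qquad \omega > 0,\ \alpha \ge 0,\ \beta \ge 0 .
\end{aligned}
```

Setting \beta = 0 gives the ARCH(1) variance equation, so the ARCH model compared in the table is a special case of this form.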
Table 15 shows a significant difference between the ARIMA family models and volatility models, whether we use FRED data or not. For instance, even the worst volatility model (in terms of RMSE), namely, ARIMA(1,1,0) with ARCH(1), reduces the forecasting error of ARIMAX1(3,1,1) by more than 70%. This highlights the importance of considering nonlinear time series when studying variables with extreme events.
On the other hand, using FRED-MD data reduces the forecasting error of the corresponding model, except for GARCH. For instance, ARIMA(3,1,1)’s mean absolute percentage error drops from 32.62% to 27.80% when we add the PLS components extracted from FRED-MD data. Moreover, the forecasting error also drops when the PLS components are added to the mean equation of the ARCH(1) model, from 5.69 to 5.34 in terms of root mean squared error.
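The improvements quoted above are stated as changes in the level of each criterion. A short calculation, using only the figures reported in this paragraph, converts them into relative reductions; the function name is a hypothetical helper.

```python
def relative_reduction(before, after):
    """Percentage reduction of an error measure relative to its initial value."""
    return 100.0 * (before - after) / before

# Figures quoted in the text
mape_drop = relative_reduction(32.62, 27.80)  # ARIMA(3,1,1), MAPE in percent
rmse_drop = relative_reduction(5.69, 5.34)    # ARCH(1) mean equation, RMSE

print(f"MAPE reduction: {mape_drop:.1f}%  RMSE reduction: {rmse_drop:.1f}%")
```

In relative terms, the PLS components reduce the MAPE of ARIMA(3,1,1) by roughly 15% and the RMSE of the ARCH(1) model by roughly 6%, both of which are modest next to the gap between the linear and the volatility models.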
Regarding previous studies, the authors of [14] and [38] offer two studies that we can compare with our findings, as their goal was also to forecast crude oil prices using somewhat similar series. The authors of [
14] use ARIMA and SARIMA to forecast crude oil prices in the United States and Europe from January 2017 to September 2021. Their ARIMA and SARIMA models yielded an out-of-sample MAPE of 0.05 and 0.09, respectively. However, their training set included the COVID-19 pandemic period. In contrast, our training set did not include that period, but our conditional variance models (ARCH and GARCH) were successful in producing highly accurate forecasts.
The study by [
38] is close to our study regarding the variables and the training set used. The authors used several machine learning models to forecast the WTI crude oil prices, using data from March 1993 to December 2021 (the data we used were from February 1992 to October 2022) with a test set including the COVID-19 pandemic. In addition, their study also used FRED data alongside the economic policy uncertainty data and other financial data. Their change point-adaptive recursive neural network (CP-ADARNN) achieves a greater reduction in forecast errors than our best model. However, one has to be cautious regarding the use of economic data without orthogonalization (partial least squares or principal components) due to multicollinearity and the consequent risk of overfitting. In contrast, the GARCH model does not present the overfitting risk and is available in most traditional statistical software.
6. Conclusions
In this study, we propose alternative time series models to forecast WTI crude oil prices. We also assess the forecasting power of economic data, namely, FRED data. Our results indicate that when linear models are used (ARIMA and SARIMA), the inclusion of the partial least squares components extracted from FRED reduces the forecasting error without providing an accurate forecast during high-volatility periods. In contrast, including PLS components from FRED data does not improve the forecasting power of volatility models, especially the GARCH model.
Moreover, this paper offers empirical evidence against ignoring conditional volatility in forecasting commodity prices. Even when augmented with economic data spanning all economic activities, linear models, such as ARIMA, provide poor forecasts, especially during extreme events. Our finding highlights the importance of considering nonlinear time series when studying variables with extreme events. In addition, this study’s outcome has practical implications for commodity traders as the GARCH estimation only requires the commodity price.
The findings of this study confirm the use of the generalized autoregressive conditional heteroskedasticity (GARCH) models as a statistical tool that provides high-quality forecasts, not only for the volatility but also for the time series mean. The GARCH model is important because it makes it possible to model financial and economic time series data more precisely. The model effectively captures the key characteristics of the data by taking volatility clustering into consideration. This is critical in crude oil pricing since forecasting depends on precise modeling of volatility. As indicated by
Figure 1, the WTI oil prices exhibited volatility clusters during several periods in the 1992–2022 monthly data. Unlike studies such as [14], our study accounts for volatility clustering, an important feature of WTI oil prices, in order to produce accurate forecasts.
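Volatility clustering can be reproduced with a short simulation. The sketch below generates a GARCH(1,1) process with Gaussian innovations and compares the lag-1 autocorrelation of the simulated shocks with that of their squares. The parameter values, function names, and the Gaussian assumption are chosen for illustration only and are not estimates from the WTI data.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate eps_t = sigma_t * z_t with sigma_t^2 = omega + alpha*eps_{t-1}^2 + beta*sigma_{t-1}^2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    sigma2 = np.empty(n)
    eps = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

eps, sigma2 = simulate_garch11(5000, omega=0.05, alpha=0.10, beta=0.85)
acf_levels = lag1_autocorr(eps)        # shocks themselves: little linear dependence expected
acf_squares = lag1_autocorr(eps ** 2)  # squared shocks: dependence reflects clustering
```

In such a process the shocks are serially uncorrelated in theory while their squares are positively autocorrelated, which is the pattern that a model with a constant variance cannot reproduce.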
This study can be improved in several respects. First, future studies may extend the factor models to include global demand, international economic, and geopolitical indicators instead of limiting the analysis to the United States’ economic indicators. In fact, the FRED database only includes variables related to U.S. economic activities. Exploring other variables or indices, such as the ones provided by the Economic Policy Uncertainty (EPU) database, may take into consideration global economic activities.
Second, to mitigate the shortcomings of linear models, one promising nonlinear modeling technique is the dynamic model averaging developed by [39]. The model integrates the state space approach with a Markov chain process, which allows both the parameters and the set of explanatory variables to vary over time. To select the model with the highest probability at any given time, the model is frequently accompanied by dynamic model selection (see, for instance, [
Finally, future studies should also consider asymmetric volatility models. Empirical evidence shows that negative shocks have different impacts on volatility than positive ones. Asymmetric GARCH models, such as the exponential generalized autoregressive conditional heteroskedasticity (EGARCH) model developed by [53] and the Glosten–Jagannathan–Runkle GARCH (GJR-GARCH) model developed by [54], account for this asymmetry.
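For orientation, commonly used first-order forms of these two asymmetric specifications are sketched below. The notation (\omega, \alpha, \beta, \gamma, z_t) is assumed for illustration and follows standard textbook presentations rather than any equation in this paper.

```latex
\begin{aligned}
\text{GJR-GARCH(1,1):}\quad
\sigma_t^2 &= \omega + \left(\alpha + \gamma\,\mathbb{1}\{\varepsilon_{t-1} < 0\}\right)\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,\\
\text{EGARCH(1,1):}\quad
\ln \sigma_t^2 &= \omega + \beta \ln \sigma_{t-1}^2
+ \alpha\left(|z_{t-1}| - \mathbb{E}|z_{t-1}|\right) + \gamma\, z_{t-1},
\qquad z_t = \varepsilon_t / \sigma_t .
\end{aligned}
```

In the GJR-GARCH form, \gamma > 0 means that a negative shock raises next-period variance by more than a positive shock of the same size. In the EGARCH form, the same effect corresponds to \gamma < 0, and the logarithmic specification keeps the variance positive without sign restrictions on the parameters.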