Article

Forecasting the CBOE VIX and SKEW Indices Using Heterogeneous Autoregressive Models

by Massimo Guidolin * and Giulia F. Panzeri
BAFFI CAREFIN Centre, Bocconi University, 21100 Milan, Italy
* Author to whom correspondence should be addressed.
Forecasting 2024, 6(3), 782-814; https://doi.org/10.3390/forecast6030040
Submission received: 2 August 2024 / Revised: 8 September 2024 / Accepted: 8 September 2024 / Published: 14 September 2024
(This article belongs to the Section Forecasting in Economics and Management)

Abstract

We analyze the predictability of daily data on the CBOE VIX and SKEW indices, used to capture the average level of risk-neutral risk and downside risk, respectively, as implied by S&P 500 index options. In particular, we use forecast models from the Heterogeneous Autoregressive (HAR) class to test whether and how lagged values of the VIX and of the SKEW may increase the forecasting power of HAR for the SKEW and the VIX. We find that a simple HAR is very hard to beat in out-of-sample experiments aimed at forecasting the VIX. In the case of the SKEW, the benchmarks (the random walk and an AR(1)) are clearly outperformed by HAR models at all the forecast horizons considered and there is evidence that special definitions of the SKEW index based on put options data only yield superior forecasts at all horizons.

1. Introduction

Volatility modeling and forecasting is an important research topic in financial markets, and it has received much attention from researchers and practitioners over the last three decades. A range of different models have been used, starting from simple one-factor Autoregressive Conditional Heteroskedasticity (ARCH) type models, extending to multiple-factor stochastic volatility models, and eventually, owing to the increasing availability of high-frequency intraday data, to the use of realized variance for measuring and modeling volatility (see the reviews of the literature in standard financial econometrics textbooks, e.g., Guidolin and Pedio [1], Hautsch [2]). In spite of such a rich drive to develop and test progressively more complex models, the literature has increasingly emphasized how the forecasts of future volatility of the underlying returns that (whenever possible) can be inferred from the prices of traded options can and do help to sharpen the predictive accuracy of many models (see, e.g., Han and Park [3] and references therein). Therefore, one portion of this expanding body of research has investigated the dynamic properties and predictability of the main indices that have been proposed to measure implied volatility and its main features, such as the Chicago Board Options Exchange (CBOE) Volatility Index (in what follows, referred to simply as the VIX) and the CBOE Skew Index (the SKEW) in the case of S&P 500 index options (see Fassas and Siriopoulos [4] for a review of implied volatility indices).
One of the most important characteristics of both realized and option-implied variance is their long memory, i.e., the strong persistence of their sample autocorrelations. As argued by Bandi and Perron [5], this may be an indication of the presence of fractional integration in both series. Fractional integration implies that the autocorrelations of a time series decay at a hyperbolic rate rather than at an exponential rate, as seen in non-fractionally integrated series. This slower decay indicates that even distant past values have a significant impact on the current and future states of the series. As is well known, in the case of physical realized variance, to account for this stylized fact, Corsi [6] developed a simple and parsimonious Heterogeneous Autoregressive (HAR) model that builds on the Heterogeneous Market Hypothesis. With reference to spot, physical realized variance, Corsi showed that the model provided more accurate out-of-sample forecasts of realized variance than the alternatives. Even though HAR has turned out to be considerably successful, both in its original structure and in its subsequent extensions (see, e.g., the review in Corsi et al. [7]), much less is known about the performance of HAR models when applied to the classical, key implied volatility indices, such as the VIX and especially the SKEW. The goal of our paper is precisely to provide a thorough, state-of-the-art assessment of the performance of the baseline HAR model and a few of its variations in the prediction of the CBOE Volatility Index and the Skew Index. The choice to stick to rather simple forms of HAR modelling derives from the fact that it is generally challenging to provide more complex specifications of the pure HAR model that clearly outperform it in terms of forecasting accuracy.
The model presented by Psaradellis and Sermpinis [8], the HAR-GASVR model, incorporates a Support Vector Regression (SVR) algorithm optimized using a Genetic Algorithm (GA) to capture nonlinear relationships in the data. Among all tested hybrid models, the HAR-GASVR stands out as the only one achieving significantly superior forecasting performance compared to the pure HAR model. In particular, and quite distinctively, we rely on the HAR model not only for modeling and forecasting the daily behaviour of the VIX but also for analysing the time series of the SKEW index. Since both indices display similar properties in their time series, we believe the same rationale that led to the development of the HAR model for realized volatilities (and its extension to implied volatilities) can be extended to risk-neutral skewness, and thus to the SKEW index.
The Chicago Board Options Exchange Volatility Index is a real-time market index representing the market’s expectations for volatility over the coming 30 days. Often referred to as the “fear gauge”, the VIX measures anticipated market volatility derived from the prices of S&P 500 index options. The focus is primarily on Out-of-The-Money (OTM) options, which are considered more sensitive to market volatility expectations. The VIX was first introduced in 1993 by the CBOE and it was initially based on the implied volatility of eight S&P 100 (OEX) options. However, in 2003, the CBOE updated the VIX’s calculation methodology to better reflect market conditions and expectations by shifting the underlying index to the S&P 500. This change allowed for a broader and more comprehensive measure of market volatility, as the S&P 500 is a more widely followed index with a larger number of component stocks. The VIX calculation is based on the prices of a wide range of S&P 500 index options, particularly focusing on options (both puts and calls) with 23 to 37 days until expiration. It uses a model-free approach that does not assume any specific distribution for the returns of the underlying assets (see, e.g., Whaley [9] for a general discussion). The VIX serves as a crucial tool for investors and analysts, providing insights into market sentiment and expectations of volatility. It is widely used for hedging purposes and as a benchmark for volatility-based derivatives (see, e.g., Zhang et al. [10]).
The CBOE Skew Index was designed (and officially launched in 1990) to measure the market’s perception of tail risk, which refers to the risk of extreme events that are unlikely but potentially very impactful on aggregate stock prices. The concept of “skew” in this context refers to the tendency of OTM put options to be more expensive than OTM call options, reflecting the market’s greater concern for significant downside risks than upside potential. Specifically, the Skew Index is calculated using the prices of out-of-the-money options on the S&P 500 index and it quantifies the market’s expectations of a significant deviation from the average, particularly focusing on the risk of extreme negative returns. (The index values typically hover around 100 in stable market conditions. Values much higher than 100 suggest that market participants are expecting higher chances of large drops in the index than large rises, indicating greater perceived tail risk.) The SKEW is considered a “fear index” similar to the VIX, but it specifically measures the perceived risk of significant negative market moves. A higher SKEW indicates that investors are willing to pay more for downside protection, suggesting concerns about potential large market downturns. Therefore, the SKEW can provide valuable insights into market sentiment and the likelihood of tail events, which are important for risk management and investment decision-making: it serves as a signal to investors about the prevailing market fears and can indicate periods of higher financial market vulnerability (see also the discussion in Cao et al. [11]).
The literature regarding the use of the VIX to forecast a range of key financial market variables is massive and we shall make no effort to summarize it here because of space constraints. There is, however, also a literature on forecasting the VIX itself (see also the discussion in Majmudar and Banerjee [12]), which started with Blair et al. [13], who emphasized the VIX’s predictability, comparing it with historical volatility measures, and hence its superior performance as a predictor of future volatility. Degiannakis [14] is the first known paper to apply Autoregressive Integrated Moving Average (ARIMA) models to predict the VIX, demonstrating that an ARIMA(1,1,1) model could effectively forecast the VIX’s direction with over 58 percent accuracy, while adding other financial or macroeconomic variables would not significantly improve the model’s performance. Ahoniemi [15] generalized this effort by examining various time-series models for predicting the VIX, including ARIMA and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) specifications. The study found that these models, particularly when combined with financial indicators like S&P 500 returns, can predict the direction of the VIX with notable accuracy. Fernandes et al. [16] have evaluated different HAR-type models, showing that HAR consistently outperforms simpler models like the autoregressive model and the random walk in terms of predictive power for the VIX, especially in short-term forecasts. Recently, Taylor [17] has proposed a model that includes various economic indicators and financial variables and has applied multivariate regressions and machine learning methods to analyze and predict VIX futures returns. He reports that his model significantly outperforms traditional linear models in terms of both statistical accuracy and economic value.
There is also a small literature that has focused on predicting the CBOE Skew Index itself. Mora-Valencia et al. [18] have used ARIMA-GARCH models to forecast the CBOE Skew Index and found that a specification combining a Moving Average (MA) model and an Exponential Generalized Autoregressive Conditional Heteroskedasticity (EGARCH) model, specifically an MA(4)-EGARCH(1,3) with Student’s t-distributed innovations, was particularly effective in capturing the characteristics of the Skew Index. They found that the VIX, alongside other variables, could help forecast the SKEW, indicating that changes in market volatility, as measured by the VIX, are related to expectations of tail risk reflected in the Skew Index. They also report that accurate forecasting of the Skew Index could provide warnings about potential market downturns. Bevilacqua and Tunaru [19] have examined the predictive power of decomposed components of the SKEW, the Positive Skew Index ($SKEW^{+}$) and the Negative Skew Index ($SKEW^{-}$), for forecasting the index itself. They found that including these components in a First-Order Autoregressive (AR(1)) model improved out-of-sample forecasting performance, suggesting that the decomposition provides more nuanced predictive information about market sentiments and tail risks. Moreover, including the lagged value of the VIX in their models improved the out-of-sample forecasting accuracy for the SKEW index. This suggests that the VIX, which captures market volatility expectations, provides significant information that can be used to predict changes in the SKEW (Elyasiani et al. [20] have explored the SKEW as a measure of market greed and fear, indicating its usefulness in predicting market sentiment shifts).
We use daily data on the CBOE VIX, the SKEW, and the panel of all put and call options that were tradable on the S&P 500 index for the sample period from 4 January 1996 to 31 December 2019 to perform a classical recursive Out-Of-Sample (OOS) forecasting exercise. We limit our data to 2019 to avoid dealing with the disruptions in the role and informational effectiveness of the US index options market, from which the VIX and SKEW are derived (see, e.g., John and Li [21]). Of course, it would be interesting to extend our analysis and formally contrast the pre- and post-2020 samples when enough data become available. Where appropriate, besides using data from the CBOE website, we compute afresh the entire daily history of the index in order to decompose the SKEW index and extract only its negative part, or to distill a version of the index based on put options only. In the prediction exercise, we employ a 1000-day rolling window recursive forecasting scheme. On each day starting with observation 1001, we compute the 1-, 5-, 10-, and 21-day ahead forecasts. To produce predictions beyond the 1-day horizon, we employ a classical, indirect dynamic forecast scheme.
We obtain four key results. First, within an otherwise standard HAR framework, the SKEW index does not contain useful information to improve the in-sample prediction of the VIX index. However, when it comes to performing in-sample prediction of the SKEW (especially of its logarithm), both the lagged value of the SKEW and the lagged value of the VIX improve the predictive power of the HAR model. Second, a simple, Corsi [6]-style HAR is very hard to beat in OOS experiments aimed at forecasting the VIX. Specifically, using Hansen’s Superior Predictive Ability (SPA) test for the squared error loss function, neither the HAR-SK nor the HAR-SK-PUTS models turn out to be clearly superior in any of the cases investigated. The same conclusion holds under an absolute error loss function. Furthermore, when forecasting the levels of the VIX, the HAR model is clearly superior to its two extensions, except for the 1-day ahead forecast horizon. In fact, when the Model Confidence Set (MCS) methodology is deployed, the naive HAR model turns out to be the only model included in the confidence set 85% of the time. Third, when it comes to forecasting the SKEW, the benchmarks (here, the random walk and an AR(1)) are clearly outperformed by HAR models in predicting all the transformations of the SKEW used as a target variable and at all the forecast horizons considered. In fact, the benchmarks generally fail to be included in the MCS. Fourth, again in the case of the SKEW, when we compare the performance of the HAR model with the HAR-SK-PUTS model, the latter produces superior forecasts at all horizons and this tends to be confirmed by MCS-based evidence.
The rest of the paper is organized as follows. Section 2 adds to the papers already surveyed here with an additional, short review of the literature. Section 3 introduces the models tested and the OOS research design adopted in our paper. Section 4 describes the data and presents preliminary summary statistics. Section 5 shows our main empirical results, especially those concerning the OOS performance of the different models. Section 7 concludes.

2. Brief Literature Review

As already mentioned, there exists a relatively large body of literature that has employed econometric methods to forecast the VIX index. For instance, Ahoniemi [15] estimated various time series models for the VIX using daily data from 1990 to 2007. The best model in terms of in-sample goodness-of-fit and OOS correct sign prediction turned out to be a simple ARIMA(1,1,1)-GARCH(1,1). However, the best performance in terms of the Mean Squared Forecast Error (MSFE) criterion was given by a homoskedastic ARIMA(1,1,1) and an Autoregressive Integrated Moving Average with Exogenous variables model, ARIMAX(1,1,1). Konstantinidi et al. [22] expanded this effort by using six models to study the predictability of U.S. and European daily implied volatility indices, i.e., of VIX-type indices extended to a range of markets. They found that while point and interval forecasts offered good predictive accuracy under statistical loss functions, particularly when vector autoregressive models employing principal components were used, these same predictions turned out to be unable to support economically significant OOS results. For instance, they failed to yield abnormal profits when used to support simple volatility trading strategies.
In the body of research that has used the same type of models explored in this paper, Fernandes et al. [16] examined the daily predictability of the US VIX index using a range of HAR-type models. They reported that it was challenging to outperform a pure HAR model in terms of forecasting performance, especially at the shortest horizons. Psaradellis and Sermpinis [8] confirmed these findings and reported that a HAR-GASVR (genetic algorithm-support vector regression) hybrid model and a GASVR regression applied to the residuals of the pure HAR model produced the most accurate forecasts and the largest economic value. However, these findings did open the door to the idea that by extending HAR models to include sufficient levels of complexity, forecasting accuracy might be gained. Our paper returns to this idea but works by including lags of the target variables within the HAR framework to yield a stronger predictive power.
Byun and Kim [23] introduced the idea that combining information from both (conditional) variance and skewness may offer optimal predictions of the former. They therefore investigated the impact of risk-neutral skewness on forecasting future, spot realized volatility. They compared three models and found that the coefficient of risk-neutral skewness is positive and significant for daily volatility forecasts, with its significance decreasing with the forecasting horizon. They adopted the methodology to compute the risk-neutral skewness proposed by Bakshi et al. [24], which is the same methodology that underlies the construction of the SKEW index we follow. Their OOS analysis showed that risk-neutral skewness significantly improves forecasting performance for daily, weekly, and monthly horizons. In our empirical applications, we also pursue this idea, even though our effort is entirely devoted to the prediction of risk-neutral moments, and hence of the VIX instead of spot, realized variance.
Liu and Faff [25] have developed SIX, an index forecasting 30-day realized skewness of the spot returns on the S&P 500. They compared SIX with the CBOE SKEW index and a non-parametric measure, studying their relationship with the VIX index. They argued that daily changes in VIX and SIX are negatively related, as opposed to changes in VIX and SKEW, which are instead positively related. They also examined the predictive power of SIX and SKEW for one-day-ahead S&P 500 returns, concluding that a lower risk-neutral skewness implies (more) negative returns. Lastly, they explored whether risk-neutral skewness may forecast the future physical skewness of the S&P 500 return distribution and found that SIX efficiently predicts future realized (physical) skewness, whereas SKEW did not. Even though our paper only focuses on the prediction of the VIX and of the SKEW, and hence of risk-neutral, option-implied moments, and refrains from embarking on the prediction of realized quantities, a few of the ideas in Liu and Faff [25] are pursued and expanded further.
Recently, Bevilacqua and Tunaru [19], also extending the work by Liu and Faff [25], have decomposed the CBOE SKEW index into positive ($SKEW^{+}$) and negative ($SKEW^{-}$) skewness. They find $SKEW^{-}$ to be a strong predictor of market downturns and recessions, while $SKEW^{+}$ would capture market sentiment (see also Han [26]). Additionally, $SKEW^{+}$ dampens the actual level of risk conveyed by $SKEW^{-}$ alone, thereby resulting in an underestimation of actual tail risk. In studying the OOS one-step-ahead forecasting performance of the resulting models, Bevilacqua and Tunaru [19] compare an AR(1) benchmark model to three competing models. These models include the AR(1) model augmented with the lagged values of SKEW, $SKEW^{-}$, and $SKEW^{+}$ as additional regressors. They report that all three competing models outperformed the benchmark AR(1) model in terms of forecasting accuracy. Although we work within the tradition of using HAR models as a predictive framework, we also pursue the idea that $SKEW^{-}$ may contain different and additional forecasting power vs. the overall SKEW index. Byun and Kim [23] have focused on the predictive power of risk-neutral skewness for future volatility. They assess whether incorporating this skewness information into forecasting models can improve predictions of future realized volatility.
While the CBOE SKEW index is a specific measure of perceived tail risk and skewness in the S&P 500 options market, the paper by Byun and Kim broadly examines risk-neutral skewness across different contexts and its role in forecasting volatility, rather than focusing exclusively on the SKEW index itself. Therefore, the paper does not primarily aim to forecast the CBOE SKEW index but rather uses skewness as a factor to enhance volatility forecasting models.
Finally, in this paper we test the separate predictive power of SKEW-PUTS, the risk-neutral measure of skewness that can be derived from using put options only. Doran et al. [27] had already investigated the existence of predictive informational content of the implied volatility skew using options on the S&P 100 index, evaluating whether the shape of the volatility skew could be used in forecasting future market movements. Using differences in implied volatilities between OTM and In-The-Money (ITM) options for both call and put options, they found that the put volatility skew exhibited strong predictive power for short-term market crashes. In contrast, the call volatility skew displayed weaker predictive power for short-term upward spikes in volatility. We extend this evidence on the separate market value of risk-neutral skewness extracted from put vs. call options to the problem of forecasting the VIX, hence risk-neutral volatility, instead of aggregate stock market returns.

3. Methodology

3.1. The Heterogeneous Autoregressive Models

The heterogeneous autoregressive model was proposed by Corsi [6] with the goal of providing accurate forecasts of (spot) realized volatilities, which in this paper we extend to the modelling and forecasting of implied volatilities. The HAR model is well known to be able to capture the highly persistent nature of volatility and it is logically grounded on the heterogeneous market hypothesis of Müller et al. [28], by which there would exist (at least) three types of market participants: short-term (characterized by high trading frequencies, typically daily), medium-term (who have longer trading horizons, typically weekly, including some types of speculative funds), and long-term (who generally have monthly trading frequencies, for instance institutional investors) traders and investors. The heterogeneity derives from the different time horizons of market participants and it leads to three distinct types of volatility: daily, weekly, and monthly. This creates an overall volatility cascade from low to high frequencies. Corsi [6] shows that this structure for the volatility cascade leads to a simple, restricted linear autoregressive model that employs volatilities realized over different time horizons. Typically, the HAR model features daily, weekly, and monthly moving averages of realized volatilities to replicate the three components of volatility associated with different types of trading activity. In this paper, we follow the approach of Fernandes et al. [16], expanding the original HAR model by also including biweekly and quarterly components of implied volatility. Hence, the HAR model specification we employ is:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_{5d} Y_{5d} + \beta_{10d} Y_{10d} + \beta_{21d} Y_{21d} + \beta_{63d} Y_{63d} + \epsilon_t$$
where $Y_t$ is daily implied volatility recorded at time $t$ and:
$$Y_{5d} = \frac{Y_{t-1} + Y_{t-2} + Y_{t-3} + Y_{t-4} + Y_{t-5}}{5}$$
$$Y_{10d} = \frac{Y_{t-1} + Y_{t-2} + \dots + Y_{t-9} + Y_{t-10}}{10}$$
$$Y_{21d} = \frac{Y_{t-1} + Y_{t-2} + \dots + Y_{t-20} + Y_{t-21}}{21}$$
$$Y_{63d} = \frac{Y_{t-1} + Y_{t-2} + \dots + Y_{t-62} + Y_{t-63}}{63}$$
In words, the implied option volatilities associated with the weekly, biweekly, monthly, and quarterly components are represented by the 5-day, 10-day, 21-day, and 63-day moving averages, respectively, assuming 5 trading days per week and 21 trading days per month.
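As an illustration only, the moving-average components defined above could be computed along the lines of the following minimal Python sketch. The function name `har_components`, the constant `HAR_WINDOWS`, and the toy series are assumptions made for this example; they are not taken from the paper.

```python
from statistics import fmean

# Window lengths (in trading days) of the HAR components used in the text:
# daily, weekly, biweekly, monthly, and quarterly.
HAR_WINDOWS = (1, 5, 10, 21, 63)


def har_components(series, t, windows=HAR_WINDOWS):
    """Return the average of series[t-w], ..., series[t-1] for each window w.

    `series` is a list of daily observations and `t` is the 0-based index of
    the day being explained, so only values strictly before t are used.
    """
    if t < max(windows):
        raise ValueError("not enough history for the longest window")
    return [fmean(series[t - w:t]) for w in windows]


# Toy example (hypothetical data): a linear trend 0, 1, 2, ...
toy = [float(i) for i in range(70)]
components = har_components(toy, t=63)
print(components)
```

The first element is simply the lagged value, so the returned list lines up with the five slope coefficients of the HAR specification.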
The main benefit of the HAR model is its ability to capture complex long-range dependence in time series of volatility estimates while preserving a manageable number of parameters. In contrast, because of their long memory and the corresponding slow decay of the SAC and PAC functions, when modeling time series of both realized and implied volatilities using ARMA-type models, the number of parameters in both the autoregressive and moving average components tends to be very large. Additionally, Fernandes et al. [16] found that Autoregressive Fractionally Integrated Moving Average (ARFIMA) models perform very poorly both in-sample and out-of-sample when analyzing the time series of asset returns volatility because these models impose a linear form of long memory that depends exclusively on a single parameter.
The objective of this paper is, however, to extend the HAR modelling strategy to involve the SKEW index, both as a potential predictor of the VIX and as a target variable to be forecast on its own. As discussed in the Introduction, the results in Byun and Kim [23] showed that risk-neutral skewness significantly enhances the forecasting accuracy of HAR models for realized volatility across daily, weekly, and monthly horizons. Building on this finding, we employ a similar framework but adapted to predict implied volatility rather than realized volatility. This model, which we shall call HAR-SK, integrates the moving-average components of implied volatility of the HAR specification above, supplemented by one additional predictor: the lagged value of risk-neutral skewness, i.e.,
$$VIX_t = \beta_0 + \beta_1 VIX_{t-1} + \beta_{5d} VIX_{5d} + \beta_{10d} VIX_{10d} + \beta_{21d} VIX_{21d} + \beta_{63d} VIX_{63d} + \beta_{SKEW} SKEW_{t-1} + \epsilon_t$$
Furthermore, also following Bevilacqua and Tunaru [19], to explore the distinct informational content of the SKEW index and of its negative component calculated using put options only ($SKEW^{-}$), we introduce a similar model named HAR-SK-PUTS. This model substitutes the lagged value of $SKEW^{-}$ for the lagged value of the SKEW index in the HAR-SK model:
$$VIX_t = \beta_0 + \beta_1 VIX_{t-1} + \beta_{5d} VIX_{5d} + \beta_{10d} VIX_{10d} + \beta_{21d} VIX_{21d} + \beta_{63d} VIX_{63d} + \beta_{SKEW\text{-}PUTS} SKEW^{-}_{t-1} + \epsilon_t,$$
where $SKEW^{-}_{t-1}$ indicates the lagged value of the SKEW calculated using put options only.
All the models specified above are used in modelling and forecasting the daily behaviour of the VIX index. We shall also apply the HAR model to forecast the time series of the SKEW index. To investigate whether the negative part of the SKEW is a significant driver of its evolution, we also adopt the hybrid HAR-SK-PUTS model mentioned earlier:
$$SKEW_t = \beta_0 + \beta_1 SKEW_{t-1} + \beta_{5d} SKEW_{5d} + \beta_{10d} SKEW_{10d} + \beta_{21d} SKEW_{21d} + \beta_{63d} SKEW_{63d} + \beta_{SKEW\text{-}PUTS} SKEW^{-}_{t-1} + \epsilon_t$$
Finally, to explore the relationship between the VIX and SKEW indices, we estimate and produce recursive forecasts using one additional model, the HAR-IV, as in Byun and Kim [23]. This model extends the pure HAR model applied to forecast the SKEW by including the lagged value of the VIX index as an additional predictor:
$$SKEW_t = \beta_0 + \beta_1 SKEW_{t-1} + \beta_{5d} SKEW_{5d} + \beta_{10d} SKEW_{10d} + \beta_{21d} SKEW_{21d} + \beta_{63d} SKEW_{63d} + \beta_{VIX} VIX_{t-1} + \epsilon_t.$$
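To make the structure of these augmented regressions concrete, the following is a minimal sketch of how a HAR-IV design row could be assembled and the coefficients estimated by ordinary least squares. It assumes OLS estimation via the normal equations, which the text does not state explicitly; the helper names (`har_iv_row`, `solve_linear`, `ols`) are hypothetical, and the data are synthetic placeholders rather than actual index values.

```python
import random
from statistics import fmean

HAR_WINDOWS = (1, 5, 10, 21, 63)  # daily, weekly, biweekly, monthly, quarterly


def har_iv_row(skew, vix, t, windows=HAR_WINDOWS):
    """Regressors for day t: constant, HAR averages of SKEW, lagged VIX."""
    return [1.0] + [fmean(skew[t - w:t]) for w in windows] + [vix[t - 1]]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, targets):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic placeholder series (NOT real SKEW or VIX data).
rng = random.Random(0)
skew = [120.0 + rng.gauss(0.0, 5.0) for _ in range(300)]
vix = [20.0 + rng.gauss(0.0, 3.0) for _ in range(300)]

start = max(HAR_WINDOWS)
rows = [har_iv_row(skew, vix, t) for t in range(start, len(skew))]
beta = ols(rows, skew[start:])
print(len(beta))  # intercept, five HAR slopes, and the lagged-VIX slope
```

Replacing `vix[t - 1]` with a lagged skewness series would give the analogous design row for the HAR-SK or HAR-SK-PUTS variants. In practice a numerically more robust least-squares routine would likely be preferable to the normal equations.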

3.2. Research Design

The core of our empirical study consists of comparing the forecasting power of the models described above. Following the setup in Ahoniemi [15], we employ a 1000-day rolling window recursive forecasting scheme. On each day starting with observation 1001, we compute the 1-, 5-, 10-, and 21-day ahead forecasts. To produce predictions beyond the 1-day horizon, we employ a classical, indirect dynamic forecast scheme, producing the forecast for period $t+1$ and treating this as actual data to compute the $t+2$ prediction, and so on.
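The indirect (iterated) scheme just described can be sketched as follows for the pure HAR model. This is a minimal illustration that takes the coefficient vector `beta` as given; in the rolling scheme it would be re-estimated on the most recent 1000 observations at each forecast origin. The function names and the toy inputs are assumptions for this example.

```python
from statistics import fmean

HAR_WINDOWS = (1, 5, 10, 21, 63)  # daily, weekly, biweekly, monthly, quarterly


def har_one_step(history, beta, windows=HAR_WINDOWS):
    """One-step-ahead HAR forecast made from the end of `history`.

    `beta` holds the intercept followed by one slope per window.
    """
    n = len(history)
    if n < max(windows):
        raise ValueError("not enough history for the longest window")
    averages = [fmean(history[n - w:]) for w in windows]
    return beta[0] + sum(b * a for b, a in zip(beta[1:], averages))


def har_iterated_forecast(history, beta, horizon, windows=HAR_WINDOWS):
    """h-step forecast obtained by feeding each forecast back in as data."""
    path = list(history)  # copy, so the caller's data are not modified
    for _ in range(horizon):
        path.append(har_one_step(path, beta, windows))
    return path[-1]


# Toy inputs (hypothetical): a flat series and an AR(1)-like coefficient set.
flat = [10.0] * 63
ar1_like = [2.0, 0.9, 0.0, 0.0, 0.0, 0.0]
one_ahead = har_iterated_forecast(flat, ar1_like, horizon=1)
two_ahead = har_iterated_forecast(flat, ar1_like, horizon=2)
print(one_ahead, two_ahead)
```

For models with an extra lagged regressor (for instance, the lagged SKEW in HAR-SK), an iterated forecast beyond one step also requires a value for that regressor at future dates; how this is handled is not specified in this section, so the sketch is limited to the pure HAR case.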
To evaluate the forecasting accuracy of the various models introduced earlier, three main indicators shall be considered: the MSFE, the Mean Absolute Forecast Error (MAFE), and the percentage of correctly predicted changes of direction. The latter indicator is particularly useful from a trader’s point of view because many trading strategies rely on the correct prediction of the direction of changes rather than on their magnitude (see, e.g., Costantini et al. [29] and Leitch and Tanner [30]).
We define the point forecast error for every observation in our (pseudo) OOS period, variable $j$, forecast horizon $h$, and model $M_i$ at time $t$ as:
$$\hat{e}^{\,j,M_i}_{t,t+h} = y^{j}_{t+h} - \hat{y}^{\,j,M_i}_{t,t+h},$$
where $y^{j}_{t+h}$ is the actual value of variable $j$ observed at time $t+h$ and $\hat{y}^{\,j,M_i}_{t,t+h}$ is the predicted value as of time $t$ for the variable $j$ at time $t+h$.
Next, we define MSFE as:
$$MSFE_{j,h,M_i} = \frac{1}{T-h} \sum_{t=1}^{T-h} \left( \hat{e}^{\,j,M_i}_{t,t+h} \right)^{2},$$
and MAFE as
$$MAFE_{j,h,M_i} = \frac{1}{T-h} \sum_{t=1}^{T-h} \left| \hat{e}^{\,j,M_i}_{t,t+h} \right|,$$
where $T$ is the size of the out-of-sample period and $t = 1$ is the first observation of such out-of-sample period. Lastly, the percentage of correct change of direction is defined as:
$$\text{Correct change of direction} = \frac{1}{T-h} \sum_{t=1}^{T-h} x_{t,t+h}$$
where
$$x_{t,t+h} = \begin{cases} 1 & \text{if } \left( y^{j}_{t+h} - y^{j}_{t} \right) \left( \hat{y}^{\,j,M_i}_{t,t+h} - y^{j}_{t} \right) > 0 \\ 0 & \text{otherwise} \end{cases}$$
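The three indicators defined above translate almost directly into code. The sketch below is a minimal illustration for a single variable, horizon, and model; the function names and the four toy observations are assumptions for this example. Each function averages over the forecast/realization pairs it is given, which corresponds to the $T-h$ terms in the sums above.

```python
def msfe(actual, forecast):
    """Mean squared forecast error over matched (actual, forecast) pairs."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)


def mafe(actual, forecast):
    """Mean absolute forecast error over matched (actual, forecast) pairs."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors)


def correct_direction_share(origin, actual, forecast):
    """Share of forecasts whose predicted change from the origin value y_t
    has the same sign as the realized change (ties count as misses)."""
    hits = [1 if (a - o) * (f - o) > 0 else 0
            for o, a, f in zip(origin, actual, forecast)]
    return sum(hits) / len(hits)


# Toy numbers (hypothetical): value at the forecast origin, realized value
# h days later, and the model forecast for that date.
origin = [20.0, 21.0, 19.0, 22.0]
actual = [21.0, 20.0, 20.0, 22.0]
forecast = [20.5, 21.5, 19.5, 23.0]

print(msfe(actual, forecast))                             # 0.9375
print(mafe(actual, forecast))                             # 0.875
print(correct_direction_share(origin, actual, forecast))  # 0.5
```

In the toy data the last observation shows no realized change, so the strict inequality in the definition counts it as a miss.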
The indicators introduced above are not particularly informative by themselves, especially when it comes to assessing whether the differences between models are statistically significant. In order to assess the different OOS forecasting performances of the various models, we rely on two different tests.
The first type of testing framework employed is Hansen’s test for Superior Predictive Ability (Hansen [31]). In this framework, all alternative forecasts are compared to a benchmark prediction model. Forecasting power is defined by expected loss. The framework is the same as White’s Reality Check (RC) test (White [32]), but Hansen’s test uses a different (studentized) test statistic and a sample-dependent distribution under the null hypothesis. Compared to White’s test, Hansen’s SPA test is more powerful and less sensitive to irrelevant alternative models (see Hansen [31]). In this framework, our rolling estimation scheme is acceptable when the comparison of forecasts is interpreted as being conditional on the estimated parameters. The composite null hypothesis tested using Hansen’s SPA test is that the benchmark model is not inferior to any of the alternative models, namely:
H 0 : E ( d t ) 0 ,
where  E ( d t )  is the vector of expected values of the distances in performances  d t = ( d 1 , t , d 2 , t , , d m , t )  defined as:
d k , t = L ( Y t , δ 0 , t h ) L ( Y t , δ k , t h ) ,
where  L ( Y t , δ 0 , t h )  is the loss function of a selected benchmark model  δ 0 , t h  for the prediction made h-steps in advance for the target variable  Y t  and  L ( Y t , δ k , t h )  is the loss function for the k-th alternative model with  k = 1 , 2 , , m . The distribution of the test statistic is approximated by implementing the stationary bootstrapping method of Politis and Romano [33].
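Two ingredients of the SPA procedure lend themselves to a short sketch: the loss differentials $d_{k,t}$ and the resampling of time indices by the stationary bootstrap. The code below illustrates only these two pieces and is not a full implementation of the SPA test (the studentization and the recentering of the bootstrap distribution are omitted). The function names, the mean block length, and the number of replications are hypothetical choices made for illustration.

```python
import random
from typing import List, Sequence


def loss_differentials(actual: Sequence[float],
                       benchmark: Sequence[float],
                       alternative: Sequence[float],
                       loss: str = "se") -> List[float]:
    """d_{k,t}: benchmark loss minus alternative loss, for squared or absolute error."""
    def point_loss(y: float, f: float) -> float:
        e = y - f
        return e * e if loss == "se" else abs(e)
    return [point_loss(y, b) - point_loss(y, a)
            for y, b, a in zip(actual, benchmark, alternative)]


def stationary_bootstrap_indices(n: int, mean_block: float,
                                 rng: random.Random) -> List[int]:
    """One resample of 0..n-1 built from blocks with geometric random lengths."""
    p = 1.0 / mean_block                    # probability of starting a new block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))    # new block at a random start date
        else:
            idx.append((idx[-1] + 1) % n)   # continue the block, wrapping around
    return idx


def bootstrap_means(d: Sequence[float], n_boot: int = 500,
                    mean_block: float = 10.0, seed: int = 0) -> List[float]:
    """Bootstrap distribution of the average loss differential."""
    rng = random.Random(seed)
    n = len(d)
    means = []
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n, mean_block, rng)
        means.append(sum(d[i] for i in idx) / n)
    return means
```

A positive average of `d` means that the alternative model produced a lower average loss than the benchmark over the sample, which is the direction in which the null hypothesis can be rejected.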
Our second framework to test for superior predictive accuracy is that of Giacomini and White [34], who proposed an alternative approach after emphasizing that other frameworks for evaluating alternative OOS performances may not necessarily be useful in real-time forecast selection. Giacomini and White’s (GW) approach is based on inference about conditional expectations of forecasts and forecast errors rather than unconditional expectations. The authors argue that their framework assesses forecasting methods rather than forecasting models, thereby including evaluation choices about estimation procedures. Another advantage of the GW test is that it can be applied to both nested and non-nested models, and to multi-step point, interval, and density forecasts, without a need for major adjustments. In fact, a fundamental assumption of GW is the heterogeneity of the process of the time series to be predicted instead of its stationarity. (The source of heterogeneity considered by GW is that induced by a distribution that changes over time. Furthermore, even if the underlying economic processes were stationary, heterogeneity in the observed time series can arise from changes in the measurement process; see Giacomini and White [34]. In this context, they suggest a rolling estimation scheme rather than an expanding window one).
The null hypothesis tested in GW’s approach is specified as:
$$H_0: E\left[ L_{t+\tau}(Y_{t+\tau}, \hat{f}_{m,t}) - L_{t+\tau}(Y_{t+\tau}, \hat{g}_{m,t}) \mid \mathcal{F}_t \right] = 0,$$
where $L_{t+\tau}(Y_{t+\tau}, \hat{f}_{m,t})$ and $L_{t+\tau}(Y_{t+\tau}, \hat{g}_{m,t})$ are the loss functions relative to the predicted variables from forecasting methods f and g, and $\mathcal{F}_t$ is defined as $\mathcal{F}_t = [1, \; L_{t+\tau-1}(\hat{Y}_{t+\tau-1}, \hat{f}_{m,t-1}) - L_{t+\tau-1}(\hat{Y}_{t+\tau-1}, \hat{g}_{m,t-1})]$. If the null hypothesis fails to be rejected, this implies that the two models evaluated are, on average, equally accurate. A positive sign of the test statistic indicates that the forecast losses of model f are, on average, larger than those produced by model g, while a negative sign of the test statistic indicates the opposite.
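To make the mechanics of the conditional test concrete, the following is a minimal sketch of a GW-type Wald statistic. It rests on simplifying assumptions that are ours and are not taken from the paper: a one-step horizon (τ = 1), an instrument vector made of a constant and the lagged loss differential, and no HAC correction (which multi-step horizons would require). With two instruments the statistic is asymptotically chi-squared with two degrees of freedom under the null, so its p-value has the closed form exp(−stat/2). The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def gw_test_one_step(loss_f: Sequence[float],
                     loss_g: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Wald statistic, p-value, mean loss differential) for tau = 1.

    Instruments: h_t = (1, dL_t), where dL_t = loss_f[t] - loss_g[t].
    """
    dl = [a - b for a, b in zip(loss_f, loss_g)]
    n = len(dl) - 1                        # one observation is lost to the lag
    if n < 3:
        raise ValueError("need at least four loss observations")
    # Z_{t+1} = h_t * dL_{t+1}: two series of products
    z1 = [dl[t + 1] for t in range(n)]
    z2 = [dl[t] * dl[t + 1] for t in range(n)]
    m1, m2 = sum(z1) / n, sum(z2) / n
    # Sample second-moment matrix of Z (2 x 2, symmetric)
    s11 = sum(a * a for a in z1) / n
    s12 = sum(a * b for a, b in zip(z1, z2)) / n
    s22 = sum(b * b for b in z2) / n
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("second-moment matrix is singular")
    # Quadratic form m' S^{-1} m using the explicit 2 x 2 inverse
    quad = (s22 * m1 * m1 - 2.0 * s12 * m1 * m2 + s11 * m2 * m2) / det
    stat = n * quad
    p_value = math.exp(-stat / 2.0)        # chi-squared(2) survival function
    mean_dl = sum(dl) / len(dl)
    return stat, p_value, mean_dl
```

Because the Wald statistic is non-negative by construction, the sign used to say which method is more accurate has to come from elsewhere; in this sketch it is the third returned value, the average loss differential, which is positive when method f produces larger average losses than method g.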
In the final part of the paper, we also resort to the model confidence set (MCS) methodology of Hansen et al. [35] to identify a set of models that are statistically indistinguishable in terms of predictive accuracy. The MCS process starts with a broad set of candidate models and iteratively eliminates them based on performance, until reaching a set of models that are equivalent in terms of forecasting precision. It is interesting to note how this approach differs from the SPA method described above, which focuses on detecting the superiority of a specific model compared to a benchmark, evaluating whether a single model has significantly better predictive ability. Essentially, while the MCS seeks to find a group of equally valid models, the SPA test aims to identify the best model, comparing it with a benchmark and rejecting the others. Even though the MCS method is particularly useful when there is no clear “winner” among the models, as it provides a robust set of optimal models, if it yields a set composed of only one model (i.e., a singleton), this would clearly be a very strong indication in favor of that model.

4. Data

4.1. Original Data and Filtering Procedures

The data consist of daily observations for the VIX, SKEW, and $SKEW^{-}$ indices over the period 4 January 1996–31 December 2019, for a total of 6005 daily observations. These data come from different sources: the time series of the VIX and SKEW indices are collected from the CBOE website. To decompose the SKEW index and extract only its negative part, we re-compute the entire daily history of the index. This involved running the filtering algorithm described in Appendix A and calculating afresh the SKEW index using the prices of the power payoffs detailed in the appendix. The only difference vs. the official procedure adopted by the CBOE is that we apply this procedure considering put options only. Because we have S&P 500 options available only with reference to a 4 January 1996–31 December 2019 sample, we limit our analysis to this range of dates even though the full sample of VIX and SKEW data from the CBOE is in principle available for a longer, 2 January 1990–30 November 2020 sample. In any event, this sample provides more than six years of additional data vs. the sample covered by Fernandes et al. [16]. The options price data are sourced from the OptionMetrics database available through WRDS (Wharton Research Data Services).
To calculate the $SKEW^{-}$ index, besides the prices of S&P 500 put options, the only additional piece of information needed is the set of rates implicit in the U.S. Treasury yield curve, taken as a proxy for the risk-free rates at a range of relevant maturities. Such yield data are collected from the Federal Reserve Bank Reports, also available in WRDS. In practice, the relevant Constant Maturity Treasury rates are interpolated using the model of Nelson and Siegel [36], which consists of fitting a parametric model to yield data to obtain fitted (interpolated) continuously compounded rates for all relevant option maturities. The model parameters are estimated by converting observed market Treasury rates into discount factors and minimizing the sum of squared errors, defined as the differences between the model-implied and market discount factors.
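As an illustration of the interpolation step, the sketch below writes the Nelson–Siegel yield function in its common textbook parameterization, together with a sum-of-squared-errors objective defined on discount factors. The parameter names (`beta0`, `beta1`, `beta2`, `lam`) and the function names are assumptions of this sketch, and so is the treatment of the market rates as continuously compounded zero-coupon yields when they are converted into discount factors. No optimizer is included; any numerical minimizer could be applied to `discount_sse`.

```python
import math
from typing import Sequence


def ns_yield(tau: float, beta0: float, beta1: float, beta2: float, lam: float) -> float:
    """Nelson-Siegel continuously compounded yield at maturity tau (in years)."""
    x = tau / lam
    loading = (1.0 - math.exp(-x)) / x      # slope loading
    return beta0 + beta1 * loading + beta2 * (loading - math.exp(-x))


def discount_factor(tau: float, y: float) -> float:
    """Discount factor implied by a continuously compounded yield y."""
    return math.exp(-y * tau)


def discount_sse(params: Sequence[float],
                 maturities: Sequence[float],
                 market_yields: Sequence[float]) -> float:
    """Sum of squared gaps between model-implied and market discount factors."""
    beta0, beta1, beta2, lam = params
    total = 0.0
    for tau, y_mkt in zip(maturities, market_yields):
        d_model = discount_factor(tau, ns_yield(tau, beta0, beta1, beta2, lam))
        d_mkt = discount_factor(tau, y_mkt)
        total += (d_model - d_mkt) ** 2
    return total
```

Once the four parameters are estimated on a given day, `ns_yield` returns a fitted rate for any option maturity, including maturities that fall between the quoted Treasury tenors.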
Once the novel $SKEW^{-}$ index has been calculated, we obtain the complete time series for all the three variables of interest. In principle, the series should contain a total of 6042 observations. However, the datasets must also be adjusted for missing data points by stepwise deletion, considering only those dates for which data are available for all series. The resulting, final dataset contains a total of 6005 observations.

4.2. Descriptive Statistics

Figure 1 and Figure 2 show the evolution of both the VIX and SKEW indices between 4 January 1996 and 31 December 2019. Because in what follows we shall forecast both the level of the indices and their natural logarithm, Figure 1 concerns levels, while Figure 2 covers the case of the log of the indices. Importantly, the relatively long sample period used in our paper spans a series of financial crises and turmoil periods in the financial markets that were characterized by high levels of the VIX index: the Asian crisis in 1997, the Russian crisis in 1998, the dot-com bubble burst in 2000, the 9/11 terrorist attack in 2001, the Lehman Brothers bankruptcy in September 2008, and the subsequent global financial crisis. From the plots, we observe that, at least until the 2008 financial crisis, the SKEW index tended to increase contemporaneously with the events marked by spikes in the VIX. Starting in 2009, however, we note that an upward trend in the SKEW sets in, in spite of the fact that the VIX remains rather stable or even declines between 2011 and 2018. Hence Figure 1 fails to provide any clear insights concerning the overall relationship between the two implied volatility-based CBOE indices.
Table 1 reports standard summary statistics for the VIX and SKEW indices and their logarithms. The distribution of the VIX index turns out to be very skewed to the right and leptokurtic, and the Jarque-Bera test strongly rejects the null hypothesis of normality. Applying the logarithm to the series of the VIX removes a substantial portion of the non-normal features, but the Jarque-Bera test still rejects the null hypothesis of a normal distribution. Regarding the time series of the SKEW index, it also displays positive skewness and positive excess kurtosis. However, in this case applying a log transformation has very weak effects on the measurable sample deviations from normality, at least in comparison to the VIX index.
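For readers who want to reproduce this type of summary statistics, a minimal sketch of the sample skewness, kurtosis, and Jarque-Bera statistic follows. The function name is hypothetical, the moments are the simple (biased) sample moments, and the p-value uses the asymptotic chi-squared distribution with two degrees of freedom, whose survival function is exp(−JB/2).

```python
import math
from typing import Sequence, Tuple


def jarque_bera(x: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                        # equals 3 under normality
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)              # chi-squared(2) survival function
    return skew, kurt, jb, p_value
```

Applying the function to a series and to its logarithm gives a direct way to check how much of the skewness and excess kurtosis a log transformation removes.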
The (unreported, to save space) Sample Autocorrelation Function (SACF) and Sample Partial Autocorrelation Function (SPACF) of the series all display high levels of persistence, which may be due to the presence of a unit root. To address this, we conduct unit root tests, specifically the Augmented Dickey–Fuller (ADF) and Phillips–Perron (PP) tests for unit roots, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for stationarity. The ADF and PP tests strongly reject the null hypothesis of a unit root, whereas the KPSS rejects the null of stationarity.
As is well known, because structural change and unit roots are closely related phenomena, we should be mindful that conventional unit root tests are biased toward a false unit root null when the data are trend stationary but contain a structural break. However, unit root tests exist that remain valid in the presence of a break. In particular, we apply modified augmented Dickey-Fuller tests implemented in two steps, which allow either a change in level or a change in both level and trend when the breaks occur instantaneously (additive breaks), which we consider more typical of financial markets. In the first-step ADF test, the number of lags is selected by minimization of standard information criteria (we cannot find any difference between BIC and Hannan-Quinn results), given a maximum of 33 lags. The break date is selected as the one providing the strongest evidence of a break when all the break dates after day 34 in the sample are tested. In all cases, we reject the null of a unit root in the series considered. For instance, in the case of the raw VIX index, the baseline ADF test statistic (with 7 lags selected by the BIC) is −5.142 (p-value of 0.000); when the occurrence of breaks is allowed for, the ADF test statistic declines to −7.818 (p-value of 0.000). The corresponding statistics in the case of the SKEW index are −4.503 and −14.543 (p-values are both 0.000). Detailed results are available upon request. The contradiction between the ADF and PP tests, on the one hand, and the KPSS test, on the other, holds for both indices and persists even when the logarithmic transformation is applied. The tests are also applied to the first differences of the series. In this case, the KPSS test fails to reject the null hypothesis of stationarity for all series under analysis, i.e., the differenced series and the differenced logarithm series, so that all three tests lead to the same conclusion. To account for this, we conduct our out-of-sample analysis using both the original and differenced series.

5. Empirical Results

5.1. In-Sample Analysis

We start our analysis by inspecting the in-sample predictability properties of (a few transformations of) the time series of the  V I X  and  S K E W  indices.

5.1.1. In-Sample Forecasting of the  V I X  Index

Because the literature has shown that ARIMA and ARFIMA fail to outperform the HAR model and its extensions (see Fernandes et al. [16]), to save space in the following we shall focus exclusively on the HAR model in both in-sample and out-of-sample analyses. This choice aligns with our objective to explore whether the SKEW index offers any additional information for modeling and forecasting the VIX index beyond what the HAR model already incorporates. The other predictive models employed are the HAR-SK and the HAR-SK-PUTS. These models are applied to both the time series of the VIX index as well as to its logarithmic transformation. (For consistency, when modelling the logarithm of the VIX, the HAR-SK and HAR-SK-PUTS models include as additional regressors the lagged value of the logarithm of the SKEW index and of the $SKEW^{-}$, respectively).
Table 2 reports the estimated coefficients of the three models obtained over the whole sample. The first insight we can draw is that neither SKEW nor $SKEW^{-}$ is characterized by a statistically significant coefficient when used as a predictor. Hence, their addition to the pure HAR model does not add any explanatory power in terms of adjusted $R^2$. Furthermore, among all the specified regressors, only the coefficients of the lagged value of the VIX and the 10-day average are significant. Only when modeling the logarithm of the VIX is the coefficient of the 63-day average also significant at the 5% test size level. However, this result may be influenced by high levels of linear dependence among the regressors (i.e., multicollinearity), which delivers a good adjusted $R^2$ (in fact, always in excess of 96 percent in Table 2) but modest significance of the individual coefficients. The lower panels of Table 2, devoted to the estimation of models for changes in VIX and log-VIX, apart from a structurally lower R-squared vs. the case of levels and log levels (2–3% vs. 96–97%, as one would expect), disclose the same rates of statistical significance as the upper panels, but now tilted away from the first lag of the series and the 10-day average, towards the 5- and 63-day averages.
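The multicollinearity mentioned above arises mechanically from the way HAR regressors are built, as overlapping trailing averages of the same series. The sketch below constructs such regressors. The default set of windows is an assumption of this sketch: the text names the lagged value and the 5-, 10-, and 63-day averages, while the 22-day window is added here as a hypothetical monthly component; the function name and the dictionary layout are also illustrative.

```python
from typing import Dict, List, Sequence


def har_regressors(series: Sequence[float],
                   windows: Sequence[int] = (1, 5, 10, 22, 63)) -> List[Dict[str, float]]:
    """Trailing averages of `series` ending at each forecast origin t.

    A window of length 1 is simply the latest value. The first usable origin
    is the one at which the longest window is fully available.
    """
    max_w = max(windows)
    rows = []
    for t in range(max_w - 1, len(series)):
        row: Dict[str, float] = {"t": float(t)}
        for w in windows:
            chunk = series[t - w + 1: t + 1]   # the last w observations up to t
            row[f"avg_{w}"] = sum(chunk) / w
        rows.append(row)
    return rows
```

Each row would then be paired with the target observed h days after the origin t; an extended specification of the HAR-SK type would add one more column holding the lagged value of the additional predictor.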
To address these issues in our OOS analysis, we shall consider two specifications of the HAR model (and hence of the HAR-SK and HAR-SK-PUTS models): one which includes all the regressors, and one which includes only those predictors that display significant coefficients in the in-sample specification. Even though this is clearly sub-optimal and may imply some degree of hindsight bias, it provides an upper bound to the forecasting performance of the models when a substantial part of their tendency to over-fit is removed from the problem. (The first-best solution would consist of re-estimating and re-testing for significance the coefficients of all models in a recursive fashion throughout the pseudo OOS exercise). By re-estimating the models using only the significant predictors, we aim at obtaining additional evidence of the robustness of our baseline findings that use predictions from the full models that include all predictors. However, in unreported tables, we observe that even in the parsimonious specifications, the inclusion of the lagged values of the SKEW index and of the $SKEW^{-}$ does not add any explanatory power to the pure HAR model and that their coefficients are not significantly different from zero. (Complete and tabulated results concerning the case of the models trimmed of the insignificant coefficients obtained over the full sample are available from the authors upon request).

5.1.2. In-Sample Forecasting of the  S K E W  Index

As discussed in Section 4 earlier, the time series of the SKEW index displays, as is the case for the VIX index, high levels of persistence. Furthermore, from the estimation of ARMA-type models, we learn that model selection based on standard information criteria typically leads to highly parameterized models, including dozens of autoregressive and moving average terms. Because of this, we also employ the HAR model directly for the in-sample analysis of the SKEW index. The predictors adopted are the same employed for the VIX index, as specified in Equation (1). Moreover, because Bevilacqua and Tunaru [19], using a decomposition of the SKEW index, have argued that $SKEW^{-}$ plays a major role in driving the evolution of the overall SKEW index, we also implement the HAR-SK-PUTS specification in Equation (8). Finally, as there is mixed evidence in the literature about the usefulness of the SKEW in modelling the time series of the VIX index, we also test whether the opposite relationship is somewhat clearer and we recursively estimate the HAR-IV model in Equation (9).
Table 3 shows the results of parameter estimation over the full sample for the three models cited above, both for the SKEW index and for its logarithm. Consistent with what was done in the case of the VIX, when the SKEW is taken in logs, the additional predictors (VIX and $SKEW^{-}$) are taken in logs too. The coefficient estimates show that the only non-significant coefficient in the HAR model is the one associated with the quarterly moving average. As for the overall goodness of fit, also in Table 3, the HAR model produces satisfactory results in terms of adjusted $R^2$, always in excess of 88 percent. In the case of the HAR-SK-PUTS and the HAR-IV specifications, we observe that the first specification adds some explanatory power in terms of adjusted $R^2$, while the second does not. Hence, it seems that the VIX index does not contain useful information for the in-sample fit of the SKEW index. Indeed, the coefficient related to the $SKEW^{-}$ is significantly different from zero while the coefficient associated with the VIX index is not. However, by looking at the results produced by the in-sample analysis of the logarithm of the SKEW index, the results change: in the pure HAR model, all the predictors are characterized by significant coefficients and both the lagged value of the $SKEW^{-}$ and the lagged value of the VIX improve the explanatory power of the HAR model, although marginally (at least in terms of the in-sample, adjusted $R^2$). The two lower panels of Table 3, devoted to in-sample modelling of changes in the SKEW, apart from a comparable drop in R-squared coefficients (here from 88% to 12%), show qualitatively similar results in terms of widespread coefficient significance (apart from the loss of estimation accuracy of the first lag, as one would expect when the series are first-differenced). Yet, the point estimates tend to turn negative, as one would expect as a result of over-differencing the SKEW in levels.
In unreported results based on employing only those predictors that turned out to be significant in the HAR model (thus excluding the 63-day moving average), the conclusions regarding the in-sample predictive power of the VIX and $SKEW^{-}$ reported above are unchanged: the coefficient associated with $SKEW^{-}$ in the HAR-SK-PUTS model is significantly different from zero, while the coefficient associated with the VIX index in the HAR-IV model is not.

5.2. Out-of-Sample Analysis

After evaluating the models in terms of their in-sample forecasting accuracy, as measured by their $R^2$, we focus our attention on a pseudo out-of-sample analysis. The pseudo nature of the exercise stems from the fact that the predictive HAR models have been selected and fine-tuned in Section 5.1.2 using the full sample of data. Nonetheless, the recursive implementation of the OOS forecasting scheme described in Section 3.2 guarantees that no hindsight bias occurs in the calculation of the point forecasts used to assess the accuracy of alternative models. As planned, in what follows, we present the results obtained in terms of MSFE, MAFE, and the percentage of correctly predicted changes in direction. The OOS forecasting performance of the various models is evaluated using the test for superior predictive ability of Hansen [31] and the test for conditional predictive ability of Giacomini and White [34]. As a robustness check of our results, and given the persistence of the series, we also employ the models to predict the first differences of the variables.
All the results that follow refer to the prediction of the level of the  V I X  and  S K E W  indices. Hence, when we employ transformations of such variables, we first transform back the outputs of the models in levels of the  V I X  and  S K E W  before calculating the indicators of forecasting accuracy. This choice is justified by the fact that the results obtained in this manner have a more solid practical meaning when compared to those derived from transformed variables.
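The conversion of forecasts back to levels can be sketched as follows. This is an illustration under our own assumptions rather than the exact procedure used in the paper: the four labels are hypothetical, a forecast of a change is assumed to refer to the cumulative change between the forecast origin t and t + h, and the exponential transformation is applied naively, without any correction for the convexity of the exponential function.

```python
import math


def to_level(forecast: float, last_level: float, spec: str) -> float:
    """Map a forecast of a transformed target into a forecast of the level.

    spec: "level" (forecast is already a level), "log" (log of the level),
    "diff" (change in the level from t to t+h), or "dlog" (change in the
    log of the level from t to t+h). last_level is the value observed at t.
    """
    if spec == "level":
        return forecast
    if spec == "log":
        return math.exp(forecast)
    if spec == "diff":
        return last_level + forecast
    if spec == "dlog":
        return last_level * math.exp(forecast)
    raise ValueError(f"unknown specification: {spec}")
```

With every forecast expressed in levels, the same MSFE, MAFE, and direction-of-change indicators can be compared across the four specifications of the target variable.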

5.2.1. Forecasting the  V I X  Index

Table 4 and Table 5 report the results in terms of MSFE, MAFE, and the percentage of correct direction changes when forecasting the  V I X  index using four different specifications of the target variable, i.e., levels, changes, logs, and change in logs. (The predictors in these tables include all the variables used in the in-sample analysis presented above, without excluding those found to be insignificant. Nonetheless we shall comment, where appropriate, on the main differences revealed by supplementary analyses based on models that have been trimmed, as described in Section 5.1.1 and Section 5.1.2, to exclude any insignificant predictors in the in-sample analyses). The tables also report the p-values of Hansen’s SPA test using both the squared error (SPA(SE)) and the absolute error (SPA(AE)) loss functions. A rejection of the null hypothesis in Hansen’s test as signalled by a low p-value indicates that the model taken as a benchmark is inferior to at least one of the alternative models treated in the set. Forecasts are recursively computed at the one-, five-, 10-, and 21-day ahead horizons.
Table 4 and Table 5 show that all in all, the results obtained in the literature about forecasting the  V I X  index are confirmed by our findings: the pure  H A R  model is very hard to beat. Specifically, in evaluating the results using Hansen’s SPA test for the squared error loss function, neither the  H A R - S K  nor the  H A R - S K - P U T S  models turn out to be clearly superior in any of the cases investigated. The same conclusion holds when employing Hansen’s test under the absolute error loss function. In fact, the results for SPA(AE) reveal that at several horizons (always in the case of the  V I X  taken in levels), the  H A R - S K  and  H A R - S K - P U T S  models are rejected with small p-values from the model confidence set.
Furthermore, in forecasting the levels of the  V I X  index (Panel A in Table 4), the  H A R  model is clearly superior to its two extensions, except for the 1-day ahead forecast horizon. In predicting the log transformation of the  V I X  (Panel B, Table 4), the superior forecasting performance of the simple  H A R  model becomes more evident at the longest horizons. Specifically, the  H A R - S K - P U T S  model performs the worst, being outperformed in terms of absolute errors at all forecast horizons except in the 1-day ahead set of exercises. Also in terms of squared errors, the  H A R - S K - P U T S  model is outperformed at all forecast horizons except for the 1-day ahead horizon. Interestingly, forecasting the level of the  V I X  at a 5-day horizon and longer, yields more accuracy under a squared loss function (the  M S F E ) vs. forecasting the log- V I X  and then transforming such predictions back to levels; yet, the opposite occurs in the case of absolute loss (the  M A F E  criterion). This is probably caused by the fact that a squared loss is more influenced by extreme outlier prediction errors that tend to appear more frequently as a result of recursive model estimations based on the logarithm of the  V I X  that get then compounded in the exponential transformation.
In both panels A and B of Table 4, the average correct change of direction statistic fails to give unequivocal indications, apart from showing that when the target variable is the level of the VIX, applying transformations appears to generally hurt performance. In particular, in panel A for horizons of five days and higher, HAR-SK slightly outperforms. In panel B, every horizon implies a different ranking, even though the differences are small. As pointed out by one referee, our results on the correct change of direction tend to be below 50%, which is relatively disappointing in the light of previous literature. From a casual inspection of the results when forecasting the VIX and SKEW in levels using the plain vanilla HAR model, we note that it is the inclusion of the 2008–2010 data that erodes the correct change of direction performance of the HAR model. For instance, in the case of the VIX, the fraction of correctly predicted signs exceeds 52% when these three years are excluded; the corresponding statistic is 51% in the case of SKEW forecasts. Very similar conclusions are achieved with reference to the prediction of the first differences of the VIX in Panel A of Table 5. The HAR-SK-PUTS model performs the worst, being outperformed (i.e., excluded from the model confidence set) both in terms of squared and absolute errors at all the forecast horizons except for the 1-day-ahead horizon (but just in terms of squared errors).
Panel B in Table 5 shows instead the results for predicting the first difference of the logarithm of the VIX (i.e., its continuous growth rate). In this case, there is no clearly superior model in terms of squared errors, while the HAR-SK model is outperformed at all forecast horizons in terms of absolute errors, i.e., when the models are ranked in terms of their MAFE. However, in panel B, forecasting the first difference of the logarithm of the VIX always yields, despite the need for an exponential transformation to convert the predictions back to levels, more accurate predictions vs. the case in which one directly predicts the first difference of the VIX. Hence, whether the exponentiation required by the prediction of variables in logs eventually penalizes realized predictive performance remains unclear and is probably data-dependent.
Also with reference to Table 5, when it comes to the average percentage of correct predicted changes in direction, none of the models tested yields particularly precise prediction results, as this percentage ranges between 48% and 49%. Hence, none of the models appears to be specifically suitable for setting up a purely directional trading strategy as they would fail to outperform a naive coin-tossing device. This is in some contrast with earlier literature and appears to depend on poor performance of all models during the Great Financial Crisis.
We have also examined the OOS realized accuracy of forecasting the  V I X  index by including only those predictors that were found to be significant in the in-sample analysis in Section 5.1.1. In unreported tables, we find results that are qualitatively similar to the ones commented above. According to Hansen’s test applied to both the squared and absolute loss functions, the  H A R  model is not outperformed by the other two at any of the forecast horizons. Moreover, in the case of the logarithm of the  V I X , in terms of the absolute error loss function, the  H A R - S K - P U T S  model is outperformed at all horizons except the 1-day. Nonetheless, also in this case, we find no structurally superior model when their performance is assessed under a squared loss function. Additionally, the forecasting results obtained from the first difference of both the  V I X  and its logarithm show that the inclusion of non-significant predictors does not affect the OOS results: the  H A R - S K - P U T S  model is outperformed in predicting the  V I X  via its first difference (according to both the SPA (SE) and SPA (AE) tests), whereas the  H A R - S K  model is inferior in predicting the logarithm of the  V I X  (yet, according to the SPA (AE) test only). Once again, no particularly useful information emerges from the percentage of correct predicted changes in direction indicator.
The picture of the rankings of OOS performances that has emerged so far appears to be clear: the lagged value of the SKEW index does not add any useful information to the OOS forecasting of the VIX index beyond what is already accounted for by the pure HAR model. This conclusion is robust across different specifications of the target variables predicted and across different model specifications. These results extend one of the findings in Bevilacqua and Tunaru [19], who found that the inclusion of the SKEW index, its “negative” part ($SKEW^{-}$), and its positive part ($SKEW^{+}$) produces superior out-of-sample forecasting performances compared to a simple AR(1) process. Yet, we also found that when employing a more sophisticated model, the effect of the SKEW index in predicting the VIX vanishes. Also when including the $SKEW^{-}$ index obtained from put options only, the predictive power of the SKEW index in predicting the VIX fails to improve and, in fact, occasionally worsens the OOS accuracy of the otherwise simpler, standard HAR.
Table 6 and Table 7 show the results of the test of conditional predictive ability of Giacomini and White [34]. This test differs from Hansen’s SPA test because it evaluates only models taken in pairs and because it is a conditional test rather than an unconditional one, i.e., it exploits the cross-serial correlation structure linking a set of instruments with the forecast errors generated by a predictive model. The loss function employed for this test is the squared error function, but the results we have obtained were essentially similar for an absolute value loss. In a GW test, a rejection of the null hypothesis due to a low p-value of the test statistic indicates that a model (say A) produces a larger average loss than another model (say B) if the test statistic is positive, and vice versa, model B produces a larger average loss than model A if the test statistic is negative. In the tables, models in the rows correspond to model A, and models in the columns correspond to model B when performing the test. If the null hypothesis fails to be rejected, the two models produce equally accurate forecasts.
The results of the test reported in Table 6 and Table 7 confirm the key conclusions already commented: when forecasting both levels and first differences of  V I X  and log- V I X , respectively, the  H A R - S K  and  H A R - S K - P U T S  models never produce lower losses than the HAR model, and the  H A R - S K  model is never significantly inferior to the  H A R - S K - P U T S  model. The test statistic is non-zero (generally negative, thus favoring  H A R - S K  over  H A R - S K - P U T S ) only when comparing these last two models at longer horizons and predicting the differenced series, but the null hypothesis is never rejected, at least in the case differencing is applied before forecasting.
Also in this case, when considering the results obtained by including only the significant predictors in the models, there are no notable discrepancies vs. the results detailed in Table 6 and Table 7.

5.2.2. Out-of-Sample Forecasting of the  S K E W  Index

We now focus on evaluating the OOS forecasting power of the different models in predicting the SKEW index. Since no model has been established in the literature that promises a particularly good predictive performance for the SKEW index, we compare the predictions made using the three models in Equations (1), (8), and (9) with those of two additional, simple benchmarks: a Random Walk (RW) model and an AR(1) model. (Even though the analysis in Taylor [17] is applied to VIX futures returns (i.e., differences in the log-VIX) only, his fractionally integrated white noise model may represent a further, important benchmark, especially because Taylor shows that this model provides more accurate predictions of price changes and proportional returns, even when incorporating economic predictors like VIX futures price changes, S&P 500 index returns, interest rates, and the futures basis).
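The two benchmarks are simple enough to be sketched directly. The code below is an illustration under our own assumptions: the AR(1) is estimated by ordinary least squares with an intercept on the estimation window, and its h-step forecast is obtained by iterating the one-step recursion (the paper may instead use direct h-step regressions). The function names are hypothetical.

```python
from typing import Sequence, Tuple


def rw_forecast(series: Sequence[float], h: int) -> float:
    """Random walk: the forecast at any horizon is the last observed value."""
    return series[-1]


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """OLS estimates (c, phi) of y_t = c + phi * y_{t-1} + error."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def ar1_forecast(series: Sequence[float], h: int) -> float:
    """Iterated h-step-ahead AR(1) forecast from the end of the window."""
    c, phi = fit_ar1(series)
    forecast = series[-1]
    for _ in range(h):
        forecast = c + phi * forecast
    return forecast
```

In a rolling scheme, both functions would be called at every forecast origin on the most recent estimation window, so that only data available up to that date enter the forecast.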
Table 8 and Table 9 document the results obtained using all the predictors of our earlier in-sample analysis in Section 5.1.2. From the tables, we can draw one conclusion: the RW and AR(1) benchmarks are clearly outperformed by all other models in predicting all the specifications of the SKEW at all the forecast horizons evaluated. This is shown by Hansen's SPA test performed using both the squared and the absolute loss functions. The only exception is the AR(1) model, which is not clearly inferior to other models in predicting the first difference of the SKEW and the first difference of the logarithm of the SKEW at the 1-day horizon under the absolute loss.
Panel A in Table 8 displays the results obtained in predicting the levels of the SKEW. According to the SPA (SE) test, the HAR-SK-PUTS model outperforms the benchmarks at predicting the SKEW index at all horizons except at the 1-day horizon according to the SPA (AE) test. Yet, there is no clear superior model between the HAR and the HAR-SK-PUTS. The same results can be observed from Panel B in Table 8 when forecasting the SKEW via its logarithm.
Again, as we did when evaluating the out-of-sample forecasts of the VIX index, we also evaluate our predictions using the first differences of both the SKEW and its logarithm. Panels A and B of Table 9 report the corresponding results. In this case, there is no clear superior model among the three HAR frameworks, either in terms of squared errors or in terms of absolute errors, even though they all outperform the benchmarks.
Unlike the case of the VIX index in Section 5.2.1, the percentage of correctly predicted directions of change reported in Table 8 and Table 9 offers some useful insights. At the 1-day horizon, this percentage is approximately 42% for all the models employed, including the benchmark models. Paradoxically, this suggests that establishing a trading strategy based on the opposite of the direction predicted by the models might yield superior returns. However, at longer horizons, the percentage obtained by the benchmark models hovers around 50%, indicating that no clear indication about the direction of the SKEW index can be extracted using either a RW or an AR(1) model. On the other hand, employing the HAR model and its other two specifications yields a percentage ranging from 44% to 46%, which reveals that while the benchmarks may face problems when forecasting the SKEW as a whole, matters are more complicated when it comes to predicting the sign of the change in the SKEW. These conclusions apply to both the levels of the SKEW index (Panel A of Table 8) and to its logarithm (Panel B of Table 8). When the forecasts are instead based on the first difference of the SKEW series, the percentage of correct sign predictions remains around 42–43% at all horizons for all the alternative models and for the AR(1) model. The only exception appears to be the RW model, for which the percentage ranges from 43% to 46%.
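As an illustration of the directional statistic discussed above, the sketch below computes the share of forecasts that get the sign of the h-day change right. The function name and the conventions it adopts (periods with no realized change are skipped, and a predicted change of zero counts as "not up") are assumptions introduced here, not the definition used to build the tables.

```python
def direction_hit_rate(actual, forecasts, h=1):
    """Share of correctly predicted signs of the h-step change.

    actual[t]    : value of the index at date t
    forecasts[t] : forecast made at date t for date t + h
    """
    hits = total = 0
    for t in range(len(actual) - h):
        realized = actual[t + h] - actual[t]
        predicted = forecasts[t] - actual[t]
        if realized == 0:
            continue  # assumption: unchanged values carry no direction
        total += 1
        hits += (realized > 0) == (predicted > 0)
    return hits / total
```

A value persistently below 50%, as reported at the 1-day horizon, would mean that the opposite of the predicted sign is right more often than the predicted sign itself.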
Considering only those predictors that were found to be significant when analyzing the in-sample properties of the time series of the SKEW index in Section 5.1.2, we obtain additional evidence of the robustness of our results. The HAR-SK-PUTS model is superior to the others according to Hansen's test using the squared error loss function at all forecast horizons except the 1-day horizon. Nonetheless, when we assess predictive performance under the absolute error loss function, there is no clear superior model between the HAR and the HAR-SK-PUTS. Results therefore remain mixed when it comes to comparing different strands of HAR models and in terms of realized signed change prediction performance, even though there is never any doubt that HAR-type models are superior to standard benchmarks.
Table 10, Table 11, Table 12 and Table 13 report the results of the GW test of Giacomini and White [34] for conditional forecasting ability in the case of SKEW predictions. As in the case of the VIX, the models in the rows correspond to model A and the models in the columns correspond to model B when performing the test. A clear finding is that, also from the perspective of a GW test, all the HAR models consistently produce superior forecasts compared to the two benchmarks, as indicated by the rejection of the null hypothesis and by the positive test statistics when comparing the benchmarks vs. HAR models. Furthermore, when comparing the performance of the HAR model with that of the HAR-SK-PUTS model, the latter produces superior forecasts at all horizons in predicting the SKEW index and its logarithm (Table 10 and Table 11). However, this conclusion cannot be confirmed when we forecast the first difference of the SKEW and of the logarithm of the SKEW (see Table 12 and Table 13). Another interesting result that can be inferred from Table 10 and Table 11 is that the HAR-IV model is always found to be inferior to the HAR-SK-PUTS model in predicting the SKEW index and its logarithm. Once more, this conclusion does not hold when assessing the two models in terms of their ability to predict the first difference of the series.
When we investigate the models that include only the predictors found to be statistically significant in the in-sample estimation of the HAR model in Section 5.1.2 (the specifications and realized predictive performances of the RW and AR(1) benchmark models do not change because these models contain all predictors by construction), we cannot find any relevant difference in the OOS results with respect to what was concluded above.
Finally, by comparing the conclusions derived from Hansen [31]'s SPA test with those from the GW test of Giacomini and White [34], we find that the two types of predictive accuracy tests yield consistent results. The only difference arises at the 1-day horizon, where Hansen's test cannot detect any superior model among the three different specifications of the HAR model, while the GW test does.

6. Model Confidence Set Results

As a final, conclusive step, we apply to our problem the Model Confidence Set (henceforth, MCS) methodology by Hansen et al. [35]. The question then becomes not so much whether a plain vanilla HAR is the best performing model in terms of OOS accuracy, but whether it is possible to identify a subset of models that are indistinguishable in terms of predictive accuracy. Instead of selecting a single "best" model, the MCS recognizes that multiple models can perform similarly. The method begins with a set of candidate models and iteratively removes those that are statistically worse than the others. It uses a test statistic, essentially a t-statistic in our implementation, to compare differences in predictive performance. At each step, the model with the highest test statistic is eliminated, and the process continues until only models that are not significantly different remain. To ensure robustness and handle sample variability, block bootstrap techniques are used to calculate p-values and assess the significance of the differences. The MCS thus provides a set of reliable models, avoiding the premature selection of a model that may appear better only because of sample noise.
In particular, in our implementation, we use 20,000 bootstraps and select heterogeneous block lengths based on the minimization of the BIC information criterion. The threshold to determine which models are removed from the MCS is set at 0.05. The test statistic is the Selection Quantile (SQ) one, the first one proposed in Hansen et al. [35]. Table 14 presents the empirical results for both the  V I X  and the  S K E W  in their four alternative definitions (levels, differences, logs, and differences in logs). The table lists empirical findings for both the squared and the absolute value loss functions and in correspondence to each model we report the SQ statistic p-value. The p-values that are boldfaced refer only to the model(s) that end up belonging to the MCS.
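The elimination logic described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation behind Table 14: it uses the max-t statistic (the largest studentized relative loss) in place of the SQ statistic, a moving-block bootstrap with a fixed block length instead of BIC-selected block lengths, and far fewer than 20,000 replications. All function names, parameter names, and default values are hypothetical.

```python
import random


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def block_bootstrap_indices(n, block_len, rng):
    """Moving-block bootstrap: concatenate random blocks until n indices are drawn."""
    idx = []
    while len(idx) < n:
        start = rng.randrange(n - block_len + 1)
        idx.extend(range(start, start + block_len))
    return idx[:n]


def model_confidence_set(losses, alpha=0.05, reps=500, block_len=5, seed=0):
    """losses maps a model name to its list of per-period losses.

    Returns the surviving model names and the MCS p-values of the
    eliminated models.
    """
    rng = random.Random(seed)
    names = list(losses)
    n = len(losses[names[0]])
    # Resample once and reuse the same draws at every elimination step.
    draws = [block_bootstrap_indices(n, block_len, rng) for _ in range(reps)]
    boot = {m: [_mean(losses[m][t] for t in d) for d in draws] for m in names}
    full = {m: _mean(losses[m]) for m in names}
    pvals, running_p = {}, 0.0
    while len(names) > 1:
        grand = _mean(full[m] for m in names)
        d = {m: full[m] - grand for m in names}  # loss relative to the set average
        centred = {
            m: [boot[m][b] - _mean(boot[k][b] for k in names) - d[m]
                for b in range(reps)]
            for m in names
        }
        se = {m: _mean(x * x for x in centred[m]) ** 0.5 for m in names}
        t = {m: d[m] / se[m] for m in names}
        t_max = max(t.values())
        boot_max = [max(centred[m][b] / se[m] for m in names) for b in range(reps)]
        p = _mean(1.0 if x >= t_max else 0.0 for x in boot_max)
        if p >= alpha:
            break  # equal predictive ability is not rejected: stop eliminating
        worst = max(names, key=t.get)
        running_p = max(running_p, p)  # MCS p-values are non-decreasing
        pvals[worst] = running_p
        names.remove(worst)
    return names, pvals
```

With per-period squared or absolute forecast errors as inputs, the returned list plays the role of the confidence set at the chosen threshold; a model whose losses are clearly larger than the rest is removed in the first round.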
Table 14 documents the main results of the MCS tests. Out of a total of 64 instances in which the MCS was applied to the forecast errors derived from the three HAR-style models (HAR, HAR-SK/IV, and HAR-SKEW-PUT) and the two benchmarks (RW and AR(1)), the plain vanilla HAR model belongs to the confidence set in 54 cases when a 5% p-value threshold is imposed. This is all the more impressive because, out of the 64 sets estimated, the MCS includes a single model in only 38 cases, and in 29 of these 38 instances that single model is the plain vanilla HAR. This means that a simple HAR framework is indeed very hard to dismiss from the set of forecasts a researcher ought to entertain. For instance, the first row of the table shows that when forecasting the VIX level OOS under a squared loss, the MCS contains only the plain vanilla HAR; under an absolute value loss, the MCS contains not only the simple HAR but also HAR-SK and the two benchmarks, although the latter carry rather modest p-values of 0.125 (yet exceeding 0.05). This conclusion does not depend on the assumed loss function and applies with particular strength to the VIX in its various transformations (30 cases out of 32).
Interestingly, the HAR model that includes put-based only skewness measures also tends to belong to the MCS, in 23 cases out of 64 in total, and this is especially the case when we forecast the SKEW (21 cases out of 32). The two benchmarks are featured in the MCS very rarely: even when we combine both the RW and the AR(1) model, this occurs in only 5 cases out of 64, which may almost be attributed to pure chance (even though such an application of the concept of size of a test may not be straightforward in the case of the MCS).

7. Conclusions

We have assessed the OOS forecasting power of simple HAR models to forecast (a number of transformations of) the CBOE VIX and SKEW indices. Besides ranking the realized OOS performance of the standard HAR and of variations of HAR that allow lagged SKEW to forecast the VIX and vice versa, we use Hansen [31]'s Superior Predictive Ability test, Giacomini and White [34]'s test, and Hansen et al. [35]'s model confidence set method to test whether the (unconditional and conditional, respectively) differentials in predictive accuracy are statistically significant and whether any model can be removed from the set that is worthwhile to consider in forecasting applications. We devote special attention to the idea that HAR models for the VIX (SKEW) that also include lagged (transformations of) the SKEW (VIX) may provide superior predictive performance vs. standard HAR models and that special definitions of the SKEW index (based only on put options or just measuring downside risk) may carry special information content that is valuable in OOS applications. Nonetheless, we conclude that, especially in VIX forecasting applications, a plain vanilla HAR model delivers the best predictive performance and that only specific cases involving specific transformations of the SKEW leave a role for HAR models that are augmented to include additional option-implied information. Because all HAR models can outperform simple but common benchmarks in the literature (such as the random walk and AR(1) models), we lean towards the idea that it is the robust predictive power of HAR that drives our results. This finding is confirmed by the MCS results: in almost 85% of the combinations of target forecast variables, loss functions, and horizons (54 out of 64), a plain vanilla HAR turns out to be impossible to remove from the set to be taken into account.
Therefore, applications to VIX and SKEW forecasting confirm once more a classical result of applications of HAR modelling to realized, physical volatility, i.e., that systematically outperforming the standard framework popularized by Corsi [6] remains difficult in most cases.
Our objective was not to claim superiority of the HAR approach but to stress that considerable predictive mileage can also be obtained from a very simple approach. HAR is justified for forecasting the VIX and SKEW indices by its ability to capture long-memory properties and the impact of volatility at different time scales while remaining intuitive and simple; wavelet modeling is more complex and focuses on formally decomposing time series into different frequency components. Wavelets offer insights into localized variations at multiple resolutions, but they can be computationally intensive and require careful tuning of parameters; they remain an exciting avenue for future research.

Author Contributions

Conceptualization, M.G. and G.F.P.; methodology, M.G. and G.F.P.; software, M.G. and G.F.P.; validation, M.G. and G.F.P.; formal analysis, M.G. and G.F.P.; data curation, M.G. and G.F.P.; writing—original draft preparation, M.G.; writing—review and editing, G.F.P.; funding acquisition, M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Union—NextGenerationEU, in the framework of the GRINS—Growing Resilient, INclusive and Sustainable project (GRINS PE00000018—CUP B43C22000760006).

Data Availability Statement

Restrictions apply to the availability of these data. The raw data were obtained from Wharton Research Data Services and are available at https://wrds-www.wharton.upenn.edu/ (accessed on 13 February 2021) with the permission of https://wrds-www.wharton.upenn.edu/ or under adequate subscription conditions. Publicly available VIX data and interim outputs of the analysis are available at https://zenodo.org/ (DOI: 10.5281/zenodo.13759314).

Acknowledgments

We thank Nicholas Battistutta for his seminal work and excellent research assistance. We are also grateful to three anonymous referees for excellent comments and encouragement.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARCH: Autoregressive Conditional Heteroskedasticity
CBOE: Chicago Board Options Exchange
VIX: CBOE Volatility Index
SKEW: CBOE Skew Index
HAR: Heterogeneous Autoregressive
SVR: Support Vector Regression
GA: Genetic Algorithm
HAR-GASVR: Heterogeneous Autoregressive-Genetic Algorithm Support Vector Regression
OTM: Out-of-The-Money
ARIMA: Autoregressive Integrated Moving Average
GARCH: Generalized Autoregressive Conditional Heteroskedasticity
MA: Moving Average
MCS: Model Confidence Set
EGARCH: Exponential Generalized Autoregressive Conditional Heteroskedasticity
SKEW+: Positive Skew Index
SKEW−: Negative Skew Index
AR(1): First-Order Autoregressive
OOS: Out-Of-Sample
SPA: Superior Predictive Ability
MSFE: Mean Squared Forecast Error
ARIMAX: Autoregressive Integrated Moving Average with Exogenous variables
ITM: In-The-Money
ARFIMA: Autoregressive Fractionally Integrated Moving Average
MAFE: Mean Absolute Forecast Error
RC: Reality Check
GW: Giacomini and White
WRDS: Wharton Research Data Services
SACF: Sample Autocorrelation Function
SPACF: Sample Partial Autocorrelation Function
ADF: Augmented Dickey–Fuller
PP: Phillips–Perron
KPSS: Kwiatkowski–Phillips–Schmidt–Shin
SPA(SE): Hansen's SPA test using squared error
SPA(AE): Hansen's SPA test using absolute error
RW: Random Walk

Appendix A. The SKEW Index

The Chicago Board Options Exchange introduced the  V I X  index in 1993 to measure the market’s expectation of 30-day volatility derived from at-the-money S&P 100 options. In 2003, the methodology was updated to include 30-day calls and puts on the S&P 500 index, using weights inversely proportional to the squared strike price. This new method includes options across the volatility skew, providing a market-determined forward-looking estimate of one-month stock market volatility, see Hentschel [37]. Yet, the distribution of S&P 500 log-returns is unlikely to be normal if there are large jumps in returns, causing “tail risk” to emerge. Because the standard deviation alone cannot fully describe this risk, as the empirical distribution of S&P 500 returns is leptokurtic and negatively skewed, the  V I X  cannot fully capture the perceived risk of investing in the S&P 500. In fact, at least since the market crash of October 1987, the volatility smile has shifted, biased toward the put side, reflecting market awareness of large downward jumps. Hence, the CBOE developed the  S K E W  index to measure perceived tail risk.
The SKEW index, published since February 2011, is based on the methodology proposed by Bakshi and Madan [38] and Bakshi et al. [24] for the calculation of risk-neutral skewness. SKEW is calculated from out-of-the-money S&P 500 options and replicates an exposure to the skewness of S&P 500 returns. The SKEW index is calculated as:
$$SKEW = 100 - 10 \cdot S_k,$$
where:
$$S_k = E\left[\left(\frac{R - \mu}{\sigma}\right)^{3}\right].$$
$S_k$ represents the risk-neutral skewness of 30-day S&P 500 log returns, $R$ is the 30-day log return, $\mu$ its expected value, and $\sigma$ its standard deviation, under the risk-neutral probability measure. $S_k$ is also the market price of a skewness payoff determined by the asymmetry of the S&P 500 log-return, with zero payoff for symmetric returns and negative (positive) payoff for negative (positive) biases.
Bakshi et al. [24] show that $S_k$ can be calculated using the power payoffs $P_1$, $P_2$, and $P_3$:
$$S_k = \frac{E[R^3] - 3E[R]E[R^2] + 2E[R]^3}{\left(E[R^2] - E[R]^2\right)^{3/2}} = \frac{P_3 - 3P_1P_2 + 2P_1^3}{\left(P_2 - P_1^2\right)^{3/2}}$$
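The mapping from the three power payoffs to the index value can be written out directly. The sketch below is only an illustration of the two formulas above; the function name and the guard against a non-positive variance are assumptions added here.

```python
def skew_index(p1, p2, p3):
    """Map the power payoffs P1, P2, P3 to SKEW = 100 - 10 * S_k."""
    variance = p2 - p1 ** 2
    if variance <= 0:
        raise ValueError("P2 - P1^2 must be positive")
    s_k = (p3 - 3 * p1 * p2 + 2 * p1 ** 3) / variance ** 1.5
    return 100 - 10 * s_k
```

A symmetric risk-neutral distribution ($S_k = 0$) gives a SKEW of 100, while a risk-neutral skewness of −2 gives a SKEW of 120, so higher index values correspond to a more negatively skewed distribution.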
Power payoffs can be replicated by delta-hedged portfolios of ATM and OTM options, so calculating $P_1$, $P_2$, and $P_3$, and consequently SKEW, is analogous to what is routinely done by the CBOE in the case of the VIX. The filtering algorithm for selecting relevant option contracts is the same as for the VIX: one identifies the strike price at which the absolute difference between the call and put prices (mid-prices) is smallest and calculates the index forward level $F$ as:
$$F = \text{strike price} + e^{rT} \times (\text{Call price} - \text{Put price})$$
Once $F$ is determined, $K_0$ is chosen as the strike price just below $F$. We focus only on call options with strike $K_i > K_0$ and put options with strike $K_i < K_0$. From these selected contracts, options with a bid price of zero are excluded. If consecutive puts (calls) have zero bid prices, no further puts (calls) with lower (higher) strikes are included in the VIX calculation.
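The selection steps above can be sketched as follows. The data layout (dictionaries keyed by strike), the function names, the treatment of a strike exactly equal to $F$ as admissible for $K_0$, and the reading of "consecutive" as two zero bids in a row are assumptions made for illustration.

```python
import math


def forward_and_k0(quotes, r, T):
    """quotes maps strike -> (call_mid, put_mid); r is the risk-free rate, T the maturity in years."""
    # Strike at which call and put mid-prices are closest to each other.
    k_star = min(quotes, key=lambda k: abs(quotes[k][0] - quotes[k][1]))
    call, put = quotes[k_star]
    forward = k_star + math.exp(r * T) * (call - put)
    # K0: highest listed strike not above the forward level (assumed tie rule).
    k0 = max(k for k in quotes if k <= forward)
    return forward, k0


def select_puts(put_bids, k0):
    """Walk down from K0, drop zero-bid puts, stop after two zero bids in a row."""
    selected, zero_run = [], 0
    for k in sorted((k for k in put_bids if k < k0), reverse=True):
        if put_bids[k] == 0:
            zero_run += 1
            if zero_run == 2:
                break
            continue
        zero_run = 0
        selected.append(k)
    return selected
```

The call side would mirror `select_puts`, walking upward from $K_0$ instead of downward.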
For near and next-term maturities, the power payoff prices are calculated as:
$$P_1 = E[R_T] = e^{rT}\left(-\sum_i \frac{1}{K_i^2}\, Q(K_i)\, \Delta K_i\right) + \varepsilon_1$$
$$P_2 = E[R_T^2] = e^{rT}\left(\sum_i \frac{2}{K_i^2}\left(1 - \ln\frac{K_i}{F_0}\right) Q(K_i)\, \Delta K_i\right) + \varepsilon_2$$
$$P_3 = E[R_T^3] = e^{rT}\left(\sum_i \frac{3}{K_i^2}\left\{2\ln\frac{K_i}{F_0} - \ln^2\frac{K_i}{F_0}\right\} Q(K_i)\, \Delta K_i\right) + \varepsilon_3,$$
where $T$ is the time to expiration, $F_0$ the forward index level computed from option prices, $K_i$ the strike price of the $i$th out-of-the-money option, $\Delta K_i$ the interval between strikes determined as half the difference between the strikes on either side of $K_i$ ($\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$), $r$ the risk-free interest rate, $Q(K_i)$ the mid-price for each option with strike $K_i$, and the $\varepsilon_j$ terms are adjustment factors for the differences between $K_0$ and $F_0$. Specifically:
$$\varepsilon_1 = -\left(1 + \ln\frac{F_0}{K_0} - \frac{F_0}{K_0}\right)$$
$$\varepsilon_2 = 2\ln\frac{K_0}{F_0}\left(\frac{F_0}{K_0} - 1\right) + \frac{1}{2}\ln^2\frac{K_0}{F_0}$$
$$\varepsilon_3 = 3\ln^2\frac{K_0}{F_0}\left(\frac{1}{3}\ln\frac{K_0}{F_0} - 1 + \frac{F_0}{K_0}\right)$$
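For a single maturity, the power payoff sums and their adjustment terms can be evaluated as in the sketch below. It assumes the standard CBOE sign conventions for the sums and adjustment terms (a leading minus sign in the first sum and in the first adjustment term). The function name, the input layout, and the one-sided strike interval used at the lowest and highest strikes are assumptions; the interpolation between near and next-term maturities is not shown.

```python
import math


def power_payoffs(strikes, mids, f0, k0, r, T):
    """Return (P1, P2, P3) for one maturity.

    strikes : out-of-the-money strikes in ascending order (at least two)
    mids    : mid-quotes Q(K_i), aligned with strikes
    """
    n = len(strikes)
    s1 = s2 = s3 = 0.0
    for i, (k, q) in enumerate(zip(strikes, mids)):
        if i == 0:
            dk = strikes[1] - strikes[0]        # assumed one-sided interval
        elif i == n - 1:
            dk = strikes[-1] - strikes[-2]      # assumed one-sided interval
        else:
            dk = (strikes[i + 1] - strikes[i - 1]) / 2
        log_k = math.log(k / f0)
        weight = q * dk / k ** 2
        s1 += weight
        s2 += 2 * (1 - log_k) * weight
        s3 += 3 * (2 * log_k - log_k ** 2) * weight
    log_0 = math.log(k0 / f0)
    ratio = f0 / k0
    eps1 = -(1 + math.log(ratio) - ratio)
    eps2 = 2 * log_0 * (ratio - 1) + 0.5 * log_0 ** 2
    eps3 = 3 * log_0 ** 2 * (log_0 / 3 - 1 + ratio)
    growth = math.exp(r * T)
    return -growth * s1 + eps1, growth * s2 + eps2, growth * s3 + eps3
```

When $F_0 = K_0$ all three adjustment terms vanish, so the payoffs reduce to the discounted-forward sums alone.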

References

  1. Guidolin, M.; Pedio, M. Essentials of Time Series for Financial Applications; Academic Press: La Jolla, CA, USA, 2018.
  2. Hautsch, N. Econometrics of Financial High-Frequency Data; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
  3. Han, H.; Park, M.D. Comparison of realized measure and implied volatility in forecasting volatility. J. Forecast. 2013, 32, 522–533.
  4. Fassas, A.P.; Siriopoulos, C. Implied volatility indices–A review. Q. Rev. Econ. Financ. 2021, 79, 303–329.
  5. Bandi, F.M.; Perron, B. Long memory and the relation between implied and realized volatility. J. Financ. Econom. 2006, 4, 636–670.
  6. Corsi, F. A simple approximate long-memory model of realized volatility. J. Financ. Econom. 2009, 7, 174–196.
  7. Corsi, F.; Audrino, F.; Renó, R. HAR Modeling for Realized Volatility Forecasting. In Handbook of Volatility Models and Their Applications; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2012; pp. 363–382.
  8. Psaradellis, I.; Sermpinis, G. Modelling and trading the US implied volatility indices. Evidence from the VIX, VXN and VXD indices. Int. J. Forecast. 2016, 32, 1268–1283.
  9. Whaley, R.E. Understanding the VIX. J. Portf. Manag. 2009, 35, 98–105.
  10. Zhang, J.E.; Shu, J.; Brenner, M. The new market for volatility trading. J. Futures Mark. 2010, 30, 809–833.
  11. Cao, J.; Ruan, X.; Zhang, W. Inferring information from the S&P 500, CBOE VIX, and CBOE skew indices. J. Futures Mark. 2020, 40, 945–973.
  12. Majmudar, U.; Banerjee, A. Vix Forecasting. 2004. Available online: https://ssrn.com/abstract=533583 (accessed on 1 February 2021).
  13. Blair, B.J.; Poon, S.H.; Taylor, S.J. Forecasting S&P 100 volatility: The incremental information content of implied volatilities and high-frequency index returns. J. Econom. 2001, 105, 5–26.
  14. Degiannakis, S.A. Forecasting VIX. Available online: https://ssrn.com/abstract=1806044 (accessed on 1 February 2021).
  15. Ahoniemi, K. Modeling and Forecasting the VIX Index. 2008. Available online: https://ssrn.com/abstract=1033812 (accessed on 1 February 2021).
  16. Fernandes, M.; Medeiros, M.C.; Scharth, M. Modeling and predicting the CBOE market volatility index. J. Bank. Financ. 2014, 40, 1–10.
  17. Taylor, N. Forecasting returns in the VIX futures market. Int. J. Forecast. 2019, 35, 1193–1210.
  18. Mora-Valencia, A.; Rodríguez-Raga, S.; Vanegas, E. Skew index: Descriptive analysis, predictive power, and short-term forecast. N. Am. J. Econ. Financ. 2021, 56, 101356.
  19. Bevilacqua, M.; Tunaru, R. The SKEW index: Extracting what has been left. J. Financ. Stabil. 2021, 53, 100816.
  20. Elyasiani, E.; Gambarelli, L.; Muzzioli, S. The skewness index: Uncovering the relationship with volatility and market returns. Appl. Econ. 2021, 53, 3619–3635.
  21. John, K.; Li, J. COVID-19, volatility dynamics, and sentiment trading. J. Bank. Financ. 2021, 133, 106162.
  22. Konstantinidi, E.; Skiadopoulos, G.; Tzagkaraki, E. Can the evolution of implied volatility be forecasted? Evidence from European and US implied volatility indices. J. Bank. Financ. 2008, 32, 2401–2411.
  23. Byun, S.J.; Kim, J.S. The information content of risk-neutral skewness for volatility forecasting. J. Empir. Financ. 2013, 23, 142–161.
  24. Bakshi, G.; Kapadia, N.; Madan, D. Stock return characteristics, skew laws, and the differential pricing of individual equity options. Rev. Financ. Stud. 2003, 16, 101–143.
  25. Liu, Z.F.; Faff, R. Hitting skew for six. Econ. Model. 2017, 64, 449–464.
  26. Han, B. Investor sentiment and option prices. Rev. Financ. Stud. 2008, 21, 387–414.
  27. Doran, J.S.; Peterson, D.R.; Tarrant, B.C. Is there information in the volatility skew? J. Futures Mark. Futures Opt. Other Deriv. Prod. 2007, 27, 921–959.
  28. Müller, U.A.; Dacorogna, M.M.; Davé, R.D.; Pictet, O.V.; Olsen, R.B.; Ward, J.R. Fractals and Intrinsic Time—A Challenge to Econometricians. 2008. Available online: https://ssrn.com/abstract=5370 (accessed on 1 February 2021).
  29. Costantini, M.; Cuaresma, J.C.; Hlouskova, J. Forecasting errors, directional accuracy and profitability of currency trading: The case of EUR/USD exchange rate. J. Forecast. 2016, 35, 652–668.
  30. Leitch, G.; Tanner, J.E. Economic forecast evaluation: Profits versus the conventional error measures. Am. Econ. Rev. 1991, 81, 580–590.
  31. Hansen, P.R. A test for superior predictive ability. J. Bus. Econ. Stat. 2005, 23, 365–380.
  32. White, H. A reality check for data snooping. Econometrica 2000, 68, 1097–1126.
  33. Politis, D.N.; Romano, J.P. The stationary bootstrap. J. Am. Stat. Assoc. 1994, 89, 1303–1313.
  34. Giacomini, R.; White, H. Tests of conditional predictive ability. Econometrica 2006, 74, 1545–1578.
  35. Hansen, P.R.; Lunde, A.; Nason, J.M. The model confidence set. Econometrica 2011, 79, 453–497.
  36. Nelson, C.R.; Siegel, A.F. Parsimonious modeling of yield curves. J. Bus. 1987, 60, 473–489.
  37. Hentschel, L. Errors in implied volatility estimation. J. Financ. Quant. Anal. 2003, 38, 779–810.
  38. Bakshi, G.; Madan, D. Spanning and derivative-security valuation. J. Financ. Econ. 2000, 55, 205–238.
Figure 1. Evolution of VIX and SKEW indices in levels.
Figure 2. Evolution of VIX and SKEW indices in logarithms.
Table 1. Summary Statistics: The table reports the key sample statistics for the levels and the logarithms of the VIX and SKEW indices calculated on the full sample, from 4 January 1996 to 31 December 2019. The table also shows the p-values for the Jarque-Bera test of normality. Under the null hypothesis of zero skewness and of kurtosis equal to three, the Jarque-Bera test statistics is distributed as a chi-squared with 2 degrees of freedom.

| Statistic | VIX | SKEW | LOG(VIX) | LOG(SKEW) |
|---|---|---|---|---|
| Mean | 19.9434 | 120.5813 | 2.9255 | 4.7902 |
| Median | 18.3200 | 119.3200 | 2.9080 | 4.7818 |
| Maximum | 80.8600 | 159.0300 | 4.3927 | 5.0691 |
| Minimum | 9.1400 | 104.0900 | 2.2127 | 4.6453 |
| Std. Dev. | 8.1434 | 8.0220 | 0.3548 | 0.0648 |
| Skewness | 2.0343 | 0.9538 | 0.5673 | 0.7590 |
| Kurtosis | 10.1969 | 4.0143 | 3.2928 | 3.5171 |
| Jarque-Bera | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Observations | 6005 | 6005 | 6005 | 6005 |
Table 2. In-sample modelling of VIX and LOG(VIX) including all predictors in Equations (1), (6), and (7). p-values are reported in parenthesis. In the table, * means statistical significance at 10%, ** and *** indicate statistical significance at 5 and 1%, respectively.

| | VIX: HAR | VIX: HAR-SK | VIX: HAR-SK-PUTS | LOG(VIX): HAR | LOG(VIX): HAR-SK | LOG(VIX): HAR-SK-PUTS |
|---|---|---|---|---|---|---|
| β_0 | 0.2454 (0.000) *** | 0.6436 (0.083) * | 0.3832 (0.178) | 0.0283 (0.000) *** | 0.1083 (0.159) | 0.0709 (0.258) |
| β_1 | 0.8623 (0.000) *** | 0.8614 (0.000) *** | 0.8616 (0.000) *** | 0.8985 (0.000) *** | 0.8975 (0.000) *** | 0.8973 (0.000) *** |
| β_5d | 0.0189 (0.479) | 0.0190 (0.476) | 0.0190 (0.476) | −0.0050 (0.847) | −0.0048 (0.853) | −0.0047 (0.855) |
| β_10d | 0.1175 (0.000) *** | 0.1176 (0.000) *** | 0.1175 (0.000) *** | 0.0829 (0.004) *** | 0.0831 (0.004) *** | 0.0830 (0.004) *** |
| β_21d | −0.0191 (0.335) | −0.0187 (0.345) | −0.01889 (0.341) | −0.0049 (0.796) | −0.0048 (0.801) | −0.0047 (0.805) |
| β_63d | 0.0081 (0.311) | 0.0071 (0.379) | 0.0078 (0.332) | 0.0188 (0.022) ** | 0.0178 (0.031) ** | 0.0184 (0.025) ** |
| β_SKEW | | −0.0031 (0.277) | | | −0.0158 (0.296) | |
| β_SKEW-PUTS | | | −0.0009 (0.620) | | | −0.0079 (0.493) |
| Adjusted R² | 0.96118 | 0.96115 | 0.96115 | 0.96495 | 0.96495 | 0.96495 |

| | ΔVIX: HAR | ΔVIX: HAR-SK | ΔVIX: HAR-SK-PUTS | ΔLOG(VIX): HAR | ΔLOG(VIX): HAR-SK | ΔLOG(VIX): HAR-SK-PUTS |
|---|---|---|---|---|---|---|
| β_0 | −0.0009 (0.965) | −0.0009 (0.964) | −0.0009 (0.965) | −0.0001 (0.942) | −0.0001 (0.941) | −0.0001 (0.942) |
| β_1 | −0.0516 (0.000) *** | −0.0485 (0.001) *** | −0.0463 (0.001) *** | −0.0358 (0.011) ** | −0.0332 (0.020) ** | −0.0321 (0.025) ** |
| β_5d | −0.2327 (0.000) *** | −0.2328 (0.000) *** | −0.2333 (0.000) *** | −0.1301 (0.004) *** | −0.1304 (0.003) *** | −0.1306 (0.003) *** |
| β_10d | −0.1049 (0.195) | −0.1069 (0.187) | −0.1072 (0.185) | −0.1943 (0.012) ** | −0.1959 (0.011) ** | −0.1957 (0.011) ** |
| β_21d | −0.0663 (0.589) | −0.0668 (0.586) | −0.0655 (0.593) | −0.0981 (0.408) | −0.0977 (0.410) | −0.0977 (0.410) |
| β_63d | −0.7251 (0.000) *** | −0.7244 (0.000) *** | −0.7261 (0.000) *** | −0.8042 (0.000) *** | −0.8039 (0.000) *** | −0.8048 (0.000) *** |
| β_SKEW | | 0.0109 (0.128) | | | 0.0443 (0.236) | |
| β_SKEW-PUTS | | | 0.0059 (0.032) ** | | | 0.0229 (0.180) |
| Adjusted R² | 0.0263 | 0.0267 | 0.0270 | 0.0213 | 0.0215 | 0.0216 |
Table 3. In-sample modelling of SKEW and LOG(SKEW) including all the regressors specified in Equations (1), (8), and (9). p-values are reported in parenthesis. In the table, * means statistical significance at 10%, ** and *** indicate statistical significance at 5 and 1%, respectively.

| | SKEW: HAR | SKEW: HAR-SK-PUTS | SKEW: HAR-IV | LOG(SKEW): HAR | LOG(SKEW): HAR-SK-PUTS | LOG(SKEW): HAR-IV |
|---|---|---|---|---|---|---|
| β_0 | 2.3106 (0.000) *** | 2.6646 (0.000) *** | 2.6718 (0.000) *** | 0.0878 (0.000) *** | 0.0951 (0.000) *** | 0.1152 (0.000) *** |
| β_1 | 0.6231 (0.000) *** | 0.6115 (0.000) *** | 0.6226 (0.000) *** | 0.6180 (0.000) *** | 0.6082 (0.000) *** | 0.6168 (0.000) *** |
| β_5d | 0.2666 (0.000) *** | 0.2643 (0.000) *** | 0.2667 (0.000) *** | 0.2786 (0.000) *** | 0.2768 (0.000) *** | 0.2787 (0.000) *** |
| β_10d | −0.0900 (0.012) ** | −0.0913 (0.011) ** | −0.0911 (0.011) ** | −0.091 (0.011) ** | −0.0922 (0.010) ** | −0.0930 (0.010) *** |
| β_21d | 0.1574 (0.000) *** | 0.1590 (0.000) *** | 0.1584 (0.000) *** | 0.1490 (0.000) *** | 0.1506 (0.000) *** | 0.1505 (0.000) *** |
| β_63d | 0.0239 (0.107) | 0.0204 (0.170) | 0.0221 (0.140) | 0.0271 (0.065) * | 0.0244 (0.097) * | 0.0240 (0.105) |
| β_SKEW-PUTS | | 0.0119 (0.010) *** | | | 0.0118 (0.035) ** | |
| β_VIX | | | −0.0046 (0.337) | | | −0.0016 (0.089) * |
| Adjusted R² | 0.88068 | 0.88079 | 0.88068 | 0.88447 | 0.88454 | 0.88451 |

| | ΔSKEW: HAR | ΔSKEW: HAR-SK-PUTS | ΔSKEW: HAR-IV | ΔLOG(SKEW): HAR | ΔLOG(SKEW): HAR-SK-PUTS | ΔLOG(SKEW): HAR-IV |
|---|---|---|---|---|---|---|
| β_0 | 0.0146 (0.685) | 0.0146 (0.684) | 0.0146 (0.686) | 0.0001 (0.692) | 0.0001 (0.684) | 0.0001 (0.685) |
| β_1 | −0.2022 (0.000) *** | −0.2017 (0.000) *** | −0.2016 (0.000) *** | −0.2081 (0.000) *** | −0.2070 (0.000) *** | −0.2078 (0.000) *** |
| β_5d | −0.1832 (0.000) *** | −0.1831 (0.000) *** | −0.1830 (0.000) *** | −0.1847 (0.000) *** | −0.1845 (0.000) *** | −0.1846 (0.000) *** |
| β_10d | −0.5442 (0.000) *** | −0.5442 (0.000) *** | −0.5440 (0.000) *** | −0.5342 (0.000) *** | −0.5342 (0.000) *** | −0.5340 (0.000) *** |
| β_21d | −0.7632 (0.000) *** | −0.7632 (0.000) *** | −0.7636 (0.000) *** | −0.7093 (0.000) *** | −0.7092 (0.000) *** | −0.7093 (0.000) *** |
| β_63d | −1.2132 (0.001) ** | −1.2129 (0.001) ** | −1.2127 (0.001) ** | −1.2832 (0.001) *** | −1.2827 (0.001) *** | −1.2832 (0.001) *** |
| β_SKEW-PUTS | | −0.0006 (0.909) | | | −0.0015 (0.794) | |
| β_VIX | | | 0.0076 (0.733) | | | 0.0007 (0.876) |
| Adjusted R² | 0.1204 | 0.1204 | 0.1204 | 0.1220 | 0.1220 | 0.1220 |
Table 4. VIX and LOG(VIX) out-of-sample forecasting performance; models specified using all the regressors detailed in Equations (1), (6), and (7). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

Panel A: VIX

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 2.83505 | 0.0010 *** | 1.02704 | 0.0208 ** | 48.071% |
| AR(1) | 2.79258 | 0.0354 ** | 1.02883 | 0.0778 * | 48.071% |
| HAR | 2.77097 | 1.0000 | 1.02599 | 1.0000 | 48.151% |
| HAR-SK | 2.77736 | 0.1628 | 1.02861 | 0.0348 ** | 48.091% |
| HAR-SK-PUTS | 2.77991 | 0.1616 | 1.02884 | 0.0020 *** | 48.231% |
| 5 Days Ahead | | | | | |
| RW | 9.73185 | 0.0000 *** | 2.01384 | 0.0088 *** | 49.770% |
| AR(1) | 9.43394 | 0.0000 *** | 2.03524 | 0.0081 *** | 49.510% |
| HAR | 8.45600 | 1.0000 | 1.92459 | 1.0000 | 49.750% |
| HAR-SK | 8.55837 | 0.0512 * | 1.93905 | 0.0548 * | 49.790% |
| HAR-SK-PUTS | 8.58210 | 0.0058 *** | 1.94787 | 0.0002 *** | 49.750% |
| 10 Days Ahead | | | | | |
| RW | 14.90454 | 0.0000 *** | 2.52102 | 0.0007 *** | 50.850% |
| AR(1) | 14.22340 | 0.0000 *** | 2.53821 | 0.0005 *** | 51.150% |
| HAR | 10.9335 | 1.0000 | 2.25484 | 1.0000 | 50.100% |
| HAR-SK | 11.1682 | 0.0216 ** | 2.28500 | 0.0196 ** | 50.160% |
| HAR-SK-PUTS | 11.2505 | 0.001 *** | 2.30479 | 0.0000 *** | 50.040% |
| 21 Days Ahead | | | | | |
| RW | 26.81030 | 0.0000 *** | 3.34224 | 0.0000 *** | 49.789% |
| AR(1) | 24.58905 | 0.0000 *** | 3.38506 | 0.0000 *** | 49.970% |
| HAR | 12.7081 | 1.0000 | 2.50222 | 1.0000 | 48.164% |
| HAR-SK | 13.20000 | 0.006 *** | 2.55884 | 0.0032 *** | 48.545% |
| HAR-SK-PUTS | 13.4585 | 0.0000 *** | 2.59602 | 0.0004 *** | 48.364% |

Panel B: LOG(VIX)

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 2.80133 | 0.0017 *** | 1.02091 | 0.0243 ** | 48.071% |
| AR(1) | 2.76920 | 0.0184 ** | 1.01743 | 0.0394 ** | 48.471% |
| HAR | 2.72317 | 1.0000 | 1.01171 | 1.0000 | 48.171% |
| HAR-SK | 2.72818 | 0.3314 | 1.01243 | 0.3734 | 48.151% |
| HAR-SK-PUTS | 2.73619 | 0.1304 | 1.01227 | 0.4496 | 48.411% |
| 5 Days Ahead | | | | | |
| RW | 9.73185 | 0.0000 *** | 2.01384 | 0.0075 ** | 49.230% |
| AR(1) | 9.36455 | 0.0000 *** | 1.99364 | 0.0114 ** | 49.490% |
| HAR | 8.50527 | 1.0000 | 1.90265 | 1.0000 | 49.430% |
| HAR-SK | 8.55933 | 0.3136 | 1.90857 | 0.2724 | 49.610% |
| HAR-SK-PUTS | 8.62174 | 0.1098 | 1.91658 | 0.0240 ** | 49.710% |
| 10 Days Ahead | | | | | |
| RW | 14.90454 | 0.0000 *** | 2.52102 | 0.0007 ** | 50.901% |
| AR(1) | 14.07639 | 0.0000 *** | 2.46299 | 0.0006 ** | 51.181% |
| HAR | 11.2748 | 1.0000 | 2.22273 | 1.0000 | 50.120% |
| HAR-SK | 11.4111 | 0.2030 | 2.24202 | 0.0370 ** | 49.940% |
| HAR-SK-PUTS | 11.5662 | 0.0520 * | 2.25992 | 0.0020 *** | 49.900% |
| 21 Days Ahead | | | | | |
| RW | 26.81030 | 0.0000 *** | 3.34224 | 0.0000 ** | 49.789% |
| AR(1) | 24.14271 | 0.0000 *** | 3.22153 | 0.0000 ** | 49.970% |
| HAR | 13.3846 | 1.0000 | 2.44966 | 1.0000 | 48.304% |
| HAR-SK | 13.7287 | 0.0382 ** | 2.49199 | 0.0070 *** | 48.164% |
| HAR-SK-PUTS | 14.0684 | 0.0110 ** | 2.53122 | 0.0004 *** | 48.385% |
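The three point-forecast criteria reported in these tables (MSFE, MAFE, and the percentage of correctly predicted changes of direction) can be computed as in the sketch below. This is an illustration under stated assumptions, not the authors' implementation: in particular, it assumes that a change of direction is measured relative to the last value observed at the forecast origin, and the function name `forecast_metrics` is hypothetical.

```python
import numpy as np

def forecast_metrics(actual, forecast, last_observed):
    """Return (MSFE, MAFE, percent of correct direction-of-change calls).

    actual        : realized values at the target dates
    forecast      : model forecasts for the same dates
    last_observed : value of the series at each forecast origin
                    (assumed reference point for the direction of change)
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    last_observed = np.asarray(last_observed, dtype=float)

    errors = actual - forecast
    msfe = float(np.mean(errors ** 2))      # mean squared forecast error
    mafe = float(np.mean(np.abs(errors)))   # mean absolute forecast error

    # A hit occurs when the predicted and realized changes share the same sign.
    hits = np.sign(forecast - last_observed) == np.sign(actual - last_observed)
    ccd = float(100.0 * np.mean(hits))
    return msfe, mafe, ccd

msfe, mafe, ccd = forecast_metrics(
    actual=[21.0, 19.0, 22.0, 20.0],
    forecast=[20.5, 19.5, 21.0, 21.0],
    last_observed=[20.0, 20.0, 20.0, 21.0],
)
print(msfe, mafe, ccd)  # 0.625 0.75 75.0
```

Note that a random walk forecast equals the last observed value, so under this definition its predicted change is always zero; how the tables treat that case is not stated here.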
Table 5. D(VIX) and D(LOG(VIX)) out-of-sample forecasting performance; models specified using all the regressors detailed in Equations (1), (6), and (7). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

Panel A: D(VIX)

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 2.80620 | 0.04420 ** | 1.02230 | 0.4780 | 48.321% |
| AR(1) | 2.79258 | 0.03545 ** | 1.02883 | 0.0778 ** | 48.071% |
| HAR | 2.80063 | 1.00000 | 1.02206 | 1.0000 | 48.221% |
| HAR-SK | 2.80472 | 0.17340 | 1.02409 | 0.0016 *** | 48.001% |
| HAR-SK-PUTS | 2.80485 | 0.31820 | 1.02572 | 0.0108 *** | 48.361% |
| 5 Days Ahead | | | | | |
| RW | 3.38879 | 0.00580 *** | 1.13876 | 0.0097 ** | 47.939% |
| AR(1) | 9.43394 | 0.00000 *** | 2.03524 | 0.0081 ** | 49.510% |
| HAR | 2.80291 | 1.00000 | 1.02163 | 0.4430 | 48.579% |
| HAR-SK | 2.80512 | 0.20240 | 1.02154 | 1.0000 | 48.419% |
| HAR-SK-PUTS | 2.81609 | 0.00500 *** | 1.02708 | 0.0002 *** | 48.299% |
| 10 Days Ahead | | | | | |
| RW | 3.11284 | 0.00000 *** | 1.08347 | 0.0009 *** | 47.927% |
| AR(1) | 14.22340 | 0.00000 *** | 2.53821 | 0.0005 *** | 51.150% |
| HAR | 2.79450 | 1.00000 | 1.02080 | 0.4532 | 48.508% |
| HAR-SK | 2.79720 | 0.15540 | 1.02074 | 1.0000 | 48.448% |
| HAR-SK-PUTS | 2.80761 | 0.00540 *** | 1.02626 | 0.0008 *** | 48.248% |
| 21 Days Ahead | | | | | |
| RW | 3.05253 | 0.00000 *** | 1.07647 | 0.0000 *** | 47.927% |
| AR(1) | 24.58905 | 0.00000 *** | 3.38506 | 0.0000 *** | 49.970% |
| HAR | 2.78851 | 1.00000 | 1.02142 | 0.2554 | 48.535% |
| HAR-SK | 2.79037 | 0.24460 | 1.02105 | 1.0000 | 48.234% |
| HAR-SK-PUTS | 2.80247 | 0.00280 *** | 1.02700 | 0.0002 *** | 48.173% |

Panel B: D(LOG(VIX))

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 7.13495 | 0.00000 *** | 1.53507 | 0.0000 ** | 49.100% |
| AR(1) | 2.76920 | 0.01840 ** | 1.01743 | 0.0411 ** | 48.471% |
| HAR | 2.74492 | 0.48740 | 1.01354 | 1.0000 | 48.221% |
| HAR-SK | 2.74757 | 0.34780 | 1.01454 | 0.0326 ** | 48.141% |
| HAR-SK-PUTS | 2.74307 | 1.00000 | 1.01418 | 0.34420 | 48.421% |
| 5 Days Ahead | | | | | |
| RW | 3.65120 | 0.00000 *** | 1.14897 | 0.0064 *** | 48.459% |
| AR(1) | 9.36455 | 0.00000 *** | 1.99364 | 0.01140 ** | 49.490% |
| HAR | 2.74398 | 1.00000 | 1.01123 | 1.0000 | 48.279% |
| HAR-SK | 2.74630 | 0.41880 | 1.01230 | 0.0284 ** | 48.139% |
| HAR-SK-PUTS | 2.74534 | 0.53680 | 1.01188 | 0.3506 | 48.499% |
| 10 Days Ahead | | | | | |
| RW | 3.20295 | 0.00000 *** | 1.08643 | 0.0088 *** | 48.087% |
| AR(1) | 14.07639 | 0.00000 *** | 2.46299 | 0.0000 *** | 51.181% |
| HAR | 2.74120 | 1.00000 | 1.01075 | 1.0000 | 48.288% |
| HAR-SK | 2.74296 | 0.49580 | 1.01186 | 0.0184 ** | 48.207% |
| HAR-SK-PUTS | 2.74309 | 0.47800 | 1.01151 | 0.3136 | 48.468% |
| 21 Days Ahead | | | | | |
| RW | 3.00392 | 0.00060 *** | 1.02554 | 0.0380 *** | 48.314% |
| AR(1) | 24.14271 | 0.00000 *** | 3.22153 | 0.0000 *** | 49.970% |
| HAR | 2.74088 | 1.00000 | 1.01013 | 1.0000 | 48.294% |
| HAR-SK | 2.74323 | 0.34840 | 1.01131 | 0.0116 ** | 48.254% |
| HAR-SK-PUTS | 2.74226 | 0.51900 | 1.01076 | 0.3466 | 48.454% |
Table 6. Giacomini-White type test for conditional predictive ability comparing VIX and LOG(VIX) models including all the regressors in Equations (1), (6), and (7). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

| Horizon | Model | Panel A: VIX, vs. HAR-SK | Panel A: VIX, vs. HAR-SK-PUTS | Panel B: LOG(VIX), vs. HAR-SK | Panel B: LOG(VIX), vs. HAR-SK-PUTS |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | HAR | 0.231 (negative) | 0.274 (negative) | 0.302 (negative) | 0.169 (negative) |
| 1 Day Ahead | HAR-SK | | 0.387 (negative) | | 0.516 (negative) |
| 5 Days Ahead | HAR | 0.086 * (negative) | 0.016 ** (negative) | 0.475 (negative) | 0.150 (negative) |
| 5 Days Ahead | HAR-SK | | 0.584 (negative) | | 0.243 (negative) |
| 10 Days Ahead | HAR | 0.067 * (negative) | 0.001 *** (negative) | 0.357 (negative) | 0.144 (negative) |
| 10 Days Ahead | HAR-SK | | 0.429 (negative) | | 0.152 (negative) |
| 21 Days Ahead | HAR | 0.021 ** (negative) | 0.000 *** (negative) | 0.0076 *** (negative) | 0.003 *** (negative) |
| 21 Days Ahead | HAR-SK | | 0.007 *** (negative) | | 0.078 * (negative) |
Table 7. Giacomini-White type test for conditional predictive ability comparing D(VIX) and D(LOG(VIX)) models including all the regressors in Equations (1), (6), and (7). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

| Horizon | Model | Panel A: D(VIX), vs. HAR-SK | Panel A: D(VIX), vs. HAR-SK-PUTS | Panel B: D(LOG(VIX)), vs. HAR-SK | Panel B: D(LOG(VIX)), vs. HAR-SK-PUTS |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | HAR | 0.298 (negative) | 0.479 (negative) | 0.581 (negative) | 0.671 (positive) |
| 1 Day Ahead | HAR-SK | | 0.865 (negative) | | 0.651 (positive) |
| 5 Days Ahead | HAR | 0.695 (negative) | 0.013 ** (negative) | 0.686 (negative) | 0.040 ** (negative) |
| 5 Days Ahead | HAR-SK | | 0.044 ** (negative) | | 0.279 (positive) |
| 10 Days Ahead | HAR | 0.312 (negative) | 0.008 *** (negative) | 0.121 (negative) | 0.482 (negative) |
| 10 Days Ahead | HAR-SK | | 0.061 * (negative) | | 0.822 (negative) |
| 21 Days Ahead | HAR | 0.702 (negative) | 0.007 *** (negative) | 0.225 (negative) | 0.379 (negative) |
| 21 Days Ahead | HAR-SK | | 0.022 ** (negative) | | 0.696 (positive) |
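To make the pairwise p-values in these tables more concrete, the sketch below implements a one-step-ahead conditional predictive ability test in the spirit of Giacomini and White. It is an illustration only: the instrument set (a constant and the lagged loss differential) is an assumption, since the instruments and the sign convention behind the "(positive)" and "(negative)" labels are not restated in this section, and the function name `gw_test_one_step` is hypothetical. Multi-step horizons would additionally require a HAC estimate of the covariance matrix, which is omitted here.

```python
import math
import numpy as np

def gw_test_one_step(loss_a, loss_b):
    """One-step conditional predictive ability test (Wald form).

    loss_a, loss_b : per-period losses (e.g., squared errors) of two models.
    Returns (statistic, p_value, mean_loss_differential), where the loss
    differential is loss_a - loss_b.
    """
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)

    # Assumed instruments known at time t: a constant and the lagged differential.
    h = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    z = h * d[1:, None]                 # z_{t+1} = h_t * d_{t+1}
    n = z.shape[0]

    zbar = z.mean(axis=0)
    omega = z.T @ z / n                 # sample second-moment matrix of z
    stat = float(n * zbar @ np.linalg.solve(omega, zbar))

    # Under the null the statistic is chi-squared with 2 degrees of freedom,
    # whose survival function has the closed form exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value, float(np.mean(d))
```

A small p-value rejects equal conditional predictive ability; the sign of the mean loss differential then indicates which model had the lower average loss in the sample.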
Table 8. SKEW and LOG(SKEW) out-of-sample forecasting performance; models specified using all the regressors detailed in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

Panel A: SKEW

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 9.41630 | 0.0000 *** | 2.06954 | 0.0006 *** | 42.674% |
| AR(1) | 9.03937 | 0.0000 *** | 2.07538 | 0.0000 *** | 42.854% |
| HAR | 8.32678 | 0.2972 | 2.00196 | 1.0000 | 42.395% |
| HAR-SK-PUTS | 8.25872 | 1.0000 | 2.00742 | 0.3126 | 42.395% |
| HAR-IV | 8.34485 | 0.1448 | 2.00652 | 0.0488 | 42.455% |
| 5 Days Ahead | | | | | |
| RW | 22.4008 | 0.0000 *** | 3.41638 | 0.0000 *** | 48.870% |
| AR(1) | 23.0523 | 0.0000 *** | 3.54032 | 0.0000 *** | 49.030% |
| HAR | 13.6714 | 0.0174 ** | 2.68243 | 0.201 | 46.349% |
| HAR-SK-PUTS | 13.0664 | 1.0000 | 2.66209 | 1.0000 | 47.590% |
| HAR-IV | 13.7928 | 0.0072 *** | 2.70624 | 0.0044 *** | 46.509% |
| 10 Days Ahead | | | | | |
| RW | 30.0114 | 0.0000 *** | 4.02928 | 0.0000 *** | 49.840% |
| AR(1) | 33.0707 | 0.0000 *** | 4.32587 | 0.0000 *** | 49.780% |
| HAR | 13.9164 | 0.0144 ** | 2.71174 | 0.2158 | 44.814% |
| HAR-SK-PUTS | 13.2893 | 1.0000 | 2.69068 | 1.0000 | 46.656% |
| HAR-IV | 14.0907 | 0.0066 *** | 2.74054 | 0.0054 *** | 45.334% |
| 21 Days Ahead | | | | | |
| RW | 36.1040 | 0.0000 *** | 4.52303 | 0.0000 *** | 49.910% |
| AR(1) | 41.4957 | 0.0000 *** | 4.93136 | 0.0000 *** | 50.110% |
| HAR | 13.9770 | 0.0368 ** | 2.72108 | 0.2914 | 44.714% |
| HAR-SK-PUTS | 13.4169 | 1.0000 | 2.70722 | 1.0000 | 46.596% |
| HAR-IV | 14.2065 | 0.0148 ** | 2.75435 | 0.0022 *** | 44.854% |

Panel B: LOG(SKEW)

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 9.41630 | 0.0000 *** | 2.06954 | 0.0002 *** | 42.674% |
| AR(1) | 9.04601 | 0.0000 *** | 2.07234 | 0.0000 *** | 42.854% |
| HAR | 8.34038 | 0.1890 | 2.00136 | 1.0000 | 42.415% |
| HAR-SK-PUTS | 8.25255 | 1.0000 | 2.00376 | 0.4378 | 42.375% |
| HAR-IV | 8.34977 | 0.1504 | 2.00395 | 0.2728 | 42.534% |
| 5 Days Ahead | | | | | |
| RW | 22.4008 | 0.0000 *** | 3.41638 | 0.0000 *** | 48.870% |
| AR(1) | 23.0969 | 0.0000 *** | 3.53040 | 0.0000 *** | 49.030% |
| HAR | 13.6335 | 0.0142 ** | 2.67487 | 0.2510 | 46.349% |
| HAR-SK-PUTS | 12.9894 | 1.0000 | 2.65051 | 1.0000 | 47.389% |
| HAR-IV | 13.6889 | 0.0040 *** | 2.69241 | 0.0410 ** | 46.789% |
| 10 Days Ahead | | | | | |
| RW | 30.0114 | 0.0000 *** | 4.02928 | 0.0000 *** | 49.840% |
| AR(1) | 33.1226 | 0.0000 *** | 4.31466 | 0.0000 *** | 49.760% |
| HAR | 13.8701 | 0.0202 ** | 2.70257 | 0.1684 | 44.794% |
| HAR-SK-PUTS | 13.2060 | 1.0000 | 2.67817 | 1.0000 | 46.676% |
| HAR-IV | 13.9511 | 0.009 *** | 2.72385 | 0.0286 ** | 45.294% |
| 21 Days Ahead | | | | | |
| RW | 36.1040 | 0.0000 *** | 4.52303 | 0.0000 *** | 49.910% |
| AR(1) | 41.6049 | 0.0000 *** | 4.92647 | 0.0000 *** | 50.211% |
| HAR | 13.9312 | 0.0434 ** | 2.71198 | 0.2452 | 45.053% |
| HAR-SK-PUTS | 13.3310 | 1.0000 | 2.69433 | 1.0000 | 46.899% |
| HAR-IV | 14.0464 | 0.0148 ** | 2.73713 | 0.0206 ** | 44.873% |
Table 9. D(SKEW) and D(LOG(SKEW)) out-of-sample forecasting performance; models specified using all the regressors detailed in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

Panel A: D(SKEW)

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 24.28220 | 0.000 *** | 3.30213 | 0.0000 *** | 43.802% |
| AR(1) | 8.65950 | 0.000 *** | 2.06680 | 0.1280 | 42.383% |
| HAR | 8.34900 | 1.0000 | 1.99530 | 0.7776 | 42.603% |
| HAR-SK-PUTS | 8.35399 | 0.4744 | 1.99649 | 0.4824 | 42.223% |
| HAR-IV | 8.35871 | 0.3104 | 1.99495 | 1.0000 | 42.803% |
| 5 Days Ahead | | | | | |
| RW | 19.43840 | 0.000 *** | 3.06528 | 0.0000 *** | 45.258% |
| AR(1) | 9.41923 | 0.000 *** | 2.06980 | 0.0000 *** | 42.817% |
| HAR | 8.60468 | 1.0000 | 2.00717 | 1.0000 | 42.717% |
| HAR-SK-PUTS | 8.63325 | 0.1226 | 2.00964 | 0.2256 | 42.597% |
| HAR-IV | 8.60680 | 0.6044 | 2.00926 | 0.1702 | 42.637% |
| 10 Days Ahead | | | | | |
| RW | 18.68600 | 0.000 *** | 3.01757 | 0.0000 *** | 45.564% |
| AR(1) | 9.42973 | 0.000 *** | 2.07105 | 0.0000 *** | 42.740% |
| HAR | 8.60865 | 1.0000 | 2.00746 | 1.0000 | 42.620% |
| HAR-SK-PUTS | 8.63791 | 1.0000 | 2.01005 | 0.2098 | 42.580% |
| HAR-IV | 8.61016 | 0.6278 | 2.00892 | 0.2946 | 42.660% |
| 21 Days Ahead | | | | | |
| RW | 18.67410 | 0.000 *** | 3.01037 | 0.0000 *** | 46.126% |
| AR(1) | 9.44570 | 0.000 *** | 2.07259 | 0.0000 *** | 42.854% |
| HAR | 8.62356 | 1.0000 | 2.00960 | 1.0000 | 42.714% |
| HAR-SK-PUTS | 8.64947 | 0.1528 | 2.01178 | 0.2594 | 42.794% |
| HAR-IV | 8.62377 | 0.6734 | 2.01055 | 0.4300 | 42.774% |

Panel B: D(LOG(SKEW))

| Model | MSFE | SPA(SE) | MAFE | SPA(AE) | Correct Change of Direction |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | | | | | |
| RW | 24.4788 | 0.000 *** | 3.30586 | 0.0000 *** | 43.862% |
| AR(1) | 8.66328 | 0.000 *** | 2.00726 | 0.0906 * | 42.363% |
| HAR | 8.34660 | 1.0000 | 1.99398 | 0.8326 | 42.463% |
| HAR-SK-PUTS | 8.35007 | 0.5300 | 1.99496 | 0.5646 | 42.143% |
| HAR-IV | 8.35830 | 0.2868 | 1.99373 | 0.0000 *** | 42.743% |
| 5 Days Ahead | | | | | |
| RW | 24.4932 | 0.000 *** | 3.30680 | 0.0000 *** | 43.838% |
| AR(1) | 9.42282 | 0.000 *** | 2.07033 | 0.0000 *** | 42.857% |
| HAR | 8.60552 | 1.0000 | 2.00607 | 1.0000 | 42.557% |
| HAR-SK-PUTS | 8.62885 | 0.1538 | 2.00790 | 0.2780 | 42.757% |
| HAR-IV | 8.60913 | 0.5668 | 2.00763 | 0.3088 | 42.577% |
| 10 Days Ahead | | | | | |
| RW | 24.5129 | 0.000 *** | 3.30849 | 0.0000 *** | 43.861% |
| AR(1) | 9.42973 | 0.000 *** | 2.07105 | 0.0000 *** | 42.740% |
| HAR | 8.60958 | 1.0000 | 2.00633 | 1.0000 | 42.600% |
| HAR-SK-PUTS | 8.63365 | 0.1522 | 2.00821 | 0.2742 | 42.740% |
| HAR-IV | 8.61467 | 0.5172 | 2.00789 | 0.3068 | 42.339% |
| 21 Days Ahead | | | | | |
| RW | 24.55210 | 0.000 *** | 3.31047 | 0.0000 *** | 43.938% |
| AR(1) | 9.44570 | 0.000 *** | 2.07259 | 0.0000 *** | 42.854% |
| HAR | 8.62369 | 1.0000 | 2.00836 | 1.0000 | 42.633% |
| HAR-SK-PUTS | 8.64449 | 0.1862 | 2.00991 | 0.3358 | 42.834% |
| HAR-IV | 8.62796 | 0.5448 | 2.00976 | 0.3370 | 42.493% |
Table 10. Giacomini-White type test for conditional predictive ability comparing SKEW models including all the regressors in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.

| Horizon | Model | AR(1) | HAR | HAR-SK-PUTS | HAR-IV |
| --- | --- | --- | --- | --- | --- |
| 1 Day Ahead | RW | 0.003 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 1 Day Ahead | AR(1) | | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 1 Day Ahead | HAR | | | 0.000 *** (positive) | 0.165 (negative) |
| 1 Day Ahead | HAR-SK-PUTS | | | | 0.004 *** (negative) |
| 5 Days Ahead | RW | 0.439 (negative) | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 5 Days Ahead | AR(1) | | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 5 Days Ahead | HAR | | | 0.016 ** (positive) | 0.036 ** (negative) |
| 5 Days Ahead | HAR-SK-PUTS | | | | 0.004 *** (negative) |
| 10 Days Ahead | RW | 0.114 (negative) | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 10 Days Ahead | AR(1) | | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 10 Days Ahead | HAR | | | 0.002 *** (positive) | 0.022 ** (negative) |
| 10 Days Ahead | HAR-SK-PUTS | | | | 0.001 *** (negative) |
| 21 Days Ahead | RW | 0.088 * (negative) | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 21 Days Ahead | AR(1) | | 0.000 *** (positive) | 0.000 *** (positive) | 0.000 *** (positive) |
| 21 Days Ahead | HAR | | | 0.003 *** (positive) | 0.051 * (negative) |
| 21 Days Ahead | HAR-SK-PUTS | | | | 0.003 *** (negative) |
Table 11. Giacomini-White type test for conditional predictive ability comparing LOG(SKEW) models including all the regressors in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.
1 Day Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.006 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.000 ***
(positive)
0.000 ***
(positive)
0.543
(negative)
HAR-SK-PUTS 0.015 **
(negative)
5 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.388
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.008 ***
(positive)
0.000 ***
(positive)
0.154
(negative)
HAR-SK-PUTS 0.004 ***
(negative)
10 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.095 *
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.001 ***
(positive)
0.000 ***
(positive)
0.132
(negative)
HAR-SK-PUTS 0.002 ***
(negative)
21 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.074 *
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.003 ***
(positive)
0.000 ***
(positive)
0.083 *
(negative)
HAR-SK-PUTS 0.003 ***
(negative)
Table 12. Giacomini-White type test for conditional predictive ability comparing D(SKEW) models including all the regressors in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.
1 Day Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.137
(negative)
0.000 ***
(positive)
0.600
(negative)
HAR-SK-PUTS 0.264
(negative)
5 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.351
(negative)
0.000 ***
(positive)
0.011 **
(negative)
HAR-SK-PUTS 0.462
(positive)
10 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.086
(negative)
0.000 ***
(positive)
0.402
(negative)
HAR-SK-PUTS 0.224
(positive)
21 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.211
(negative)
0.000 ***
(positive)
0.790
(negative)
HAR-SK-PUTS 0.272
(positive)
Table 13. Giacomini-White type test for conditional predictive ability comparing D(LOG(SKEW)) models including all the regressors in Equations (1), (8), and (9). In the table, *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels, respectively.
1 Day Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.155
(negative)
0.000 ***
(positive)
0.460
(negative)
HAR-SK-PUTS 0.239
(negative)
5 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.439
(negative)
0.000 ***
(positive)
0.163
(negative)
HAR-SK-PUTS 0.575
(positive)
10 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.083
(negative)
0.000 ***
(positive)
0.431
(negative)
HAR-SK-PUTS 0.331
(positive)
21 Days Ahead
AR(1)HARHAR-SK-PUTSHAR-IV
RW0.000 ***
(negative)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
AR(1)0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
0.000 ***
(positive)
HAR 0.306
(negative)
0.000 ***
(positive)
0.727
(negative)
HAR-SK-PUTS 0.440
(positive)
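Table 14. Model Confidence Set results for alternative models and horizons under squared and absolute loss functions, for VIX and SKEW. Boldfaced p-values indicate models included in the MCS.

Panel A: VIX

| Series | Squared: HAR | Squared: HAR-SK | Squared: HAR-SK-PUT | Squared: RW | Squared: AR(1) | Absolute: HAR | Absolute: HAR-IV | Absolute: HAR-SK-PUT | Absolute: RW | Absolute: AR(1) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Horizon = 1-day | | | | | | | | | | |
| VIX in levels | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.125 | 0.000 | 0.125 | 0.125 |
| Δ VIX | 0.143 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.143 | 0.000 |
| log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Δ log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Horizon = 5-day | | | | | | | | | | |
| VIX in levels | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
| Δ VIX | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Δ log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Horizon = 10-day | | | | | | | | | | |
| VIX in levels | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.125 | 0.000 | 0.000 | 0.000 |
| Δ VIX | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| log(VIX) | 1.000 | 0.125 | 0.125 | 0.000 | 0.000 | 1.000 | 0.125 | 0.125 | 0.000 | 0.000 |
| Δ log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Horizon = 21-day | | | | | | | | | | |
| VIX in levels | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Δ VIX | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Δ log(VIX) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |

Panel B: SKEW

| Series | Squared: HAR | Squared: HAR-SK | Squared: HAR-SK-PUT | Squared: RW | Squared: AR(1) | Absolute: HAR | Absolute: HAR-IV | Absolute: HAR-SK-PUT | Absolute: RW | Absolute: AR(1) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Horizon = 1-day | | | | | | | | | | |
| SKEW in levels | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.100 | 0.000 | 0.000 |
| Δ SKEW | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.111 | 1.000 | 0.000 | 0.000 | 0.000 |
| log(SKEW) | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.100 | 0.000 | 0.000 |
| Δ log(SKEW) | 1.000 | 0.111 | 0.111 | 0.000 | 0.000 | 0.111 | 1.000 | 0.000 | 0.000 | 0.000 |
| Horizon = 5-day | | | | | | | | | | |
| SKEW in levels | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ SKEW | 1.000 | 0.111 | 0.111 | 0.000 | 0.000 | 1.000 | 0.000 | 0.111 | 0.000 | 0.000 |
| log(SKEW) | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ log(SKEW) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.111 | 0.111 | 0.000 | 0.000 |
| Horizon = 10-day | | | | | | | | | | |
| SKEW in levels | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ SKEW | 1.000 | 0.111 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| log(SKEW) | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ log(SKEW) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.111 | 0.000 | 0.000 |
| Horizon = 21-day | | | | | | | | | | |
| SKEW in levels | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ SKEW | 1.000 | 0.111 | 0.000 | 0.000 | 0.000 | 1.000 | 0.111 | 0.000 | 0.000 | 0.000 |
| log(SKEW) | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.100 | 0.000 | 1.000 | 0.000 | 0.000 |
| Δ log(SKEW) | 1.000 | 0.111 | 0.000 | 0.000 | 0.000 | 1.000 | 0.111 | 0.000 | 0.000 | 0.000 |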
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Guidolin, M.; Panzeri, G.F. Forecasting the CBOE VIX and SKEW Indices Using Heterogeneous Autoregressive Models. Forecasting 2024, 6, 782-814. https://doi.org/10.3390/forecast6030040