1. Introduction
We aim to improve volatility modeling by adding information on the latent volatility process that evolves while markets are closed and no transactions occur. We build upon the observation that the price at market closing usually differs from the price at the next market opening, even though no transactions occur between the two recordings. Previously proposed models usually estimate volatility from past daily and intraday volatility, computed from prices recorded during the trading day and sampled at various time intervals. Some papers have proposed methods to address overnight returns. The latent volatility component present when markets are closed, revealed by the difference between the two prices, may reflect events, domestic or international, that occurred while the market was closed, or other latent factors that usually influence financial markets, and it may prove useful in volatility modeling. We propose an estimate of this night latent volatility and suggest a new model that uses day, intraday, and night volatility information to model day volatility. What distinguishes our contribution from other papers on similar topics is that we propose a two-factor structure in a realized generalized autoregressive conditional heteroskedasticity (GARCH) setting that takes advantage of the natural relationship between the realized measure and the conditional (day and night) variance. The mathematical structure is thus elegant, facilitates volatility estimation, and allows the inclusion of return-volatility dependence. We call the structure bivariate because it uses both day and night volatility information, as opposed to univariate structures, which use only day information.
To strengthen the robustness of our empirical research, we further extended this idea to a number of realized GARCH models that use day and intraday volatility information, creating an equivalent set of bivariate models that additionally use night volatility information. We obtained a class of realized GARCH models that incorporate day, night, and intraday volatility measures; they were assessed against their counterparts that did not include night volatility information using an extended set of 10 stock prices. Empirical results of the forecasting performance assessment show a degree of improvement of the newly proposed models over those that do not include night volatility measures. This finding suggests the potential of our method for volatility forecasting problems for financial assets and other assets with night latent volatility information.
Financial volatility modeling has benefited significantly from the availability of high-frequency data. The main interest in modeling using frequently sampled information and integrating it into models built to estimate day conditional variance was initiated by Andersen and Bollerslev (1998), who used realized volatility estimates extracted from intraday data (realized variance) as better estimates of conditional volatility than squared returns. They proved that by adding up squared intraday returns, the forecasted volatility would correlate closely to the future latent volatility factor. Engle (2002) was among the first econometricians who extended the standard GARCH model to include an exogenous realized measure (the realized variance) in the conditional variance (GARCH) equation. In this model, the realized measures’ variation is not explained; thus, such models (GARCH-X) are considered incomplete. Engle and Gallo (2006) proposed the multiplicative error model (MEM), which was the first attempt to contain a separate GARCH structure equation for the realized measure. A similar complete model nested in a MEM setting is the high frequency based volatility (HEAVY) model of Shephard and Sheppard (2010). Both MEM and HEAVY models are difficult to use as they work with multiple latent processes: for every realized measure used, there is a corresponding latent volatility process. The Realized GARCH model proposed by Hansen et al. (2012) combines a GARCH structure for returns with realized measures of volatility. Compared with MEM and HEAVY models, the Realized GARCH model takes advantage of the natural relationship between the realized measure and the conditional variance. Instead of introducing additional latent factors, it proposes a single measurement equation in which the realized measure is a consistent estimator of the integrated variance. Besides its elegant mathematical structure, the Realized GARCH model is easy to estimate, captures the return-volatility dependence (leverage effect), and has been empirically shown to outperform conventional GARCH. A more robust version of the Realized GARCH model was introduced by Banulescu-Radu et al. (2019), suggesting a variant that is less sensitive to outliers and minimizes the impact on volatility of days with extreme negative volatility shocks. A realized exponential GARCH model that can use multiple realized volatility measures for the modeling of a return series, using a similar framework, has also been proposed (Hansen and Huang 2016). Finding that the Realized GARCH model was insufficient for capturing the long memory of underlying volatility, Huang et al. (2016) developed a parsimonious variant of the Realized GARCH model by introducing Corsi’s (2009) heterogeneous autoregressive (HAR) specification in the volatility dynamics. A multivariate GARCH model that incorporates realized measures of variances and covariances was also introduced by Hansen et al. (2014), but it did not suggest the introduction of night volatility information. Bollerslev et al. (2018) proposed asymmetric multivariate volatility models that exploit estimates of variances and covariances based on the signs of high-frequency returns to allow for more nuanced responses to positive and negative return shocks than the threshold leverage effect. Hansen et al. (2019) proposed a multivariate GARCH model that incorporates realized measures for the covariance matrix of returns.
Overnight (close-to-open) volatility is usually higher than the five-minute realized volatility estimated during trading hours, and the close-to-open price differential may trigger a distorting effect on the realized volatility. Thus, the inclusion of overnight returns when constructing the realized conditional covariance matrix of the daily returns has been empirically documented to reduce information loss and consequently improve volatility forecasting. A common approach to account for volatility during the market’s closing hours has been to calculate a close-to-open return from the price change recorded between the trading day closing and the next trading day opening, and then add its squared value to the sum of squared intraday returns (Bollerslev et al. 2009; Martens 2002; Blair et al. 2001). Hansen and Lunde (2005) derived optimal weights for the squared overnight return and for the sum of squared intraday returns, and Fleming and Kirby (2011) and Fuertes and Olmo (2013) further applied this weighting. De Pooter et al. (2008) and Fleming et al. (2003) computed it in matrix form by incorporating the cross-product of the vector of overnight returns in the summation of the matrix that provided the covariance matrix of the daily returns, acknowledging that the outer product of the vector of overnight returns is an inaccurate estimator of the integrated covariance matrix for the period when markets were closed (Fleming et al. 2003). Koopman et al. (2005), Martens (2002), and Angelidis and Degiannakis (2008) excluded the noisy overnight returns to compute an estimate of volatility during trading hours, instead of daily volatility; then, they scaled up the sum of squared intraday returns to cover the whole 24-h day. The literature has not yet reached a consensus on the best method of accounting for overnight returns; however, Ahoniemi and Lanne (2013) suggested that the weighted sum of the squared overnight return and the sum of intraday squared returns was the most accurate measure of realized volatility for the Standard & Poor’s 500 (S&P 500) index.
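The three ways of treating the overnight return surveyed above (adding its square, weighting it, or scaling the trading-hours measure) can be illustrated with a minimal Python sketch. The function names, the inputs, and the default weights are assumptions made for illustration; in particular, the weights are placeholders rather than the optimal weights of Hansen and Lunde (2005), and the scaling factor shown is only one common choice.

```python
import numpy as np

def daily_variance_proxies(prev_close, intraday_prices, w_night=1.0, w_day=1.0):
    """Build three daily variance proxies from one day's prices.

    prev_close      : previous trading day's closing price (hypothetical input)
    intraday_prices : prices sampled during trading hours, first element = open
    w_night, w_day  : illustrative weights only; optimal weights would have to
                      be estimated from data
    """
    log_p = np.log(np.asarray(intraday_prices, dtype=float))
    r_intraday = np.diff(log_p)                       # intraday log returns
    r_night = float(log_p[0] - np.log(prev_close))    # close-to-open log return
    rv_day = float(np.sum(r_intraday ** 2))           # trading-hours realized variance
    rv_added = rv_day + r_night ** 2                  # add squared overnight return
    rv_weighted = w_night * r_night ** 2 + w_day * rv_day  # weighted combination
    return rv_day, rv_added, rv_weighted

def scale_to_whole_day(rv_day, r_open_close, r_close_open):
    """Scale trading-hours RV to the whole day.

    Assumed scaling factor: (var_oc + var_co) / var_oc, where var_oc and var_co
    are sample variances of open-to-close and close-to-open returns.
    """
    var_oc = np.var(r_open_close)
    var_co = np.var(r_close_open)
    return np.asarray(rv_day, dtype=float) * (var_oc + var_co) / var_oc
```

With both weights set to one, the weighted proxy coincides with the simple sum of the squared overnight return and the trading-hours realized variance.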
This paper suggests a method of capturing and incorporating night volatility into the day conditional volatility equation of one low-frequency as well as a number of high-frequency GARCH models. We propose a two-factor structure of the conditional variance, one for night and one for day variance, in a realized GARCH setting that takes advantage of the natural relationship between the realized measure and the conditional (day and night) variance. The mathematical structure is thus elegant, facilitates volatility estimation, and allows the inclusion of the return-volatility dependence. A general framework is formulated; based on it, a set of GARCH models is adapted such that it uses the estimation of night latent volatility to model day conditional volatility. This approach enabled us to document, in an empirical context, whether the introduction of the night volatility component, in the two-factor structure and realized GARCH setting we propose, improved the volatility modeling for each of the models discussed. The new models are called bivariate as they use both night and day volatility information and are defined to work in typical financial settings, such as volatility modeling of stock and commodity prices. We assessed the performance of the bivariate models by comparing the error functions of the forecasts of the bivariate models with those obtained when the simple versions of the models, which do not use night volatility information, were used. We call the latter models univariate models. The scope of this study was thus to analyze whether the use of night volatility information in the forms proposed improves the modeling of day volatility.
The paper proceeds as follows. Section 2 proposes the new set of bivariate realized models. Section 3 describes the data and methodology, and Section 4 summarizes the results. The paper concludes with Section 5, where final remarks are presented and some future lines of research are proposed.
3. Data and Estimation Methodology
We used tick data sampled over 3537 trading days during the period 30 August 2004–31 December 2018 for 10 stocks: AIG (American International Group, Inc.), AXP (American Express Company), BAC (Bank of America Corporation), CSCO (Cisco Systems, Inc.), F (Ford Motor Company), GE (General Electric Company), INTC (Intel Corporation), JPM (JPMorgan Chase & Co.), MSFT (Microsoft Corporation), and T (AT&T Inc.). To avoid the outliers that would result from quiet days, the half trading days around the Christmas and Thanksgiving holidays were removed.
We opted to estimate intraday volatility by computing realized kernels instead of the more widely used realized variance, as it is generally acknowledged that squared daily returns provide a poor estimate of actual intraday volatility. Realized kernels are robust to microstructure errors or frictions, which are known to cause endogenous and dependent noise terms. They are used to estimate the quadratic variation of an efficient price process when the time stamps within each day do not match (non-synchronous, irregularly spaced observations) and when the high-frequency price series are noisy, with many microstructure effects. We computed the realized kernels as measures of intraday volatility using the methodology of Barndorff-Nielsen et al. (2009, 2011). The framework is given by $Y$, a variable that is the sum of a Brownian semi-martingale and a jump process, as follows:
$$Y_t = \int_0^t a_u\,du + \int_0^t \sigma_u\,dW_u + J_t,$$
where $a$ is the drift, $\sigma$ is the volatility process, $W$ is a Brownian motion, and $J_t = \sum_{i=1}^{N_t} C_i$ is a finite-activity jump process.
For the purpose of our exercise, we need to find the quadratic variation of $Y$, $[Y] = \int_0^T \sigma_u^2\,du + \sum_{i=1}^{N_T} C_i^2$. Barndorff-Nielsen et al. (2009, 2011) estimated it from the noisy discrete observations $X_{\tau_0}, \ldots, X_{\tau_n}$ of $Y$, where $X_{\tau_j} = Y_{\tau_j} + U_{\tau_j}$ and $U$ represents the market microstructure effects (noise). Barndorff-Nielsen et al. (2009, 2011) estimated this quadratic variation by proposing realized kernels, a non-negative estimator that is constructed as follows.
The first challenge with the tick data is the non-synchronicity. Non-synchronous trading occurs when the trades or quotes appear at irregularly spaced times across stocks, which is usually the case in stock markets, especially those with low liquidity or stale prices. Barndorff-Nielsen et al. (2011) solved this by suggesting a refresh time when all the stocks are traded. We implemented the same method by recording the prices only when (and immediately after) all of them were traded.
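The refresh-time idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the empirical work: the function names and the input layout (one sorted array of trade times per stock) are hypothetical. The first refresh time is the first moment at which every stock has traded at least once; each subsequent refresh time is the first moment at which every stock has traded again.

```python
import numpy as np

def refresh_times(trade_times):
    """Refresh times for several assets.

    trade_times : list of sorted 1-D arrays of trade timestamps, one per asset
                  (hypothetical input layout).
    """
    arrays = [np.asarray(t, dtype=float) for t in trade_times]
    if any(a.size == 0 for a in arrays):
        return np.array([])
    out = []
    tau = max(a[0] for a in arrays)          # all assets have traded by tau
    while True:
        out.append(tau)
        next_trades = []
        for a in arrays:
            k = np.searchsorted(a, tau, side="right")  # first trade strictly after tau
            if k == a.size:                  # one asset has no further trades
                return np.array(out)
            next_trades.append(a[k])
        tau = max(next_trades)               # all assets have traded again

def prices_at(times, prices, taus):
    """Last observed price at or before each refresh time (previous-tick rule)."""
    idx = np.searchsorted(np.asarray(times, dtype=float), taus, side="right") - 1
    return np.asarray(prices, dtype=float)[idx]
```

For example, with trade times `[1, 2, 4, 7]` for one stock and `[3, 5, 6, 9]` for another, the refresh times are 3, 5, and 7.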
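To eliminate start and end effects and their associated errors, which are averaged out by this procedure, we jittered (averaged) the first two and the last two prices, as also suggested by Barndorff-Nielsen et al. (2011). Having synchronized the data and constructed the time series by jittering at the initial and final time points, we defined the positive semi-definite realized kernel as follows, according to Barndorff-Nielsen et al. (2009, 2011):
$$K(X) = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right)\gamma_h, \qquad \gamma_h = \sum_{j=|h|+1}^{n} x_j\,x_{j-|h|},$$
where $x_j$ is the $j$th high-frequency return of the observed price, $H$ is the bandwidth, $k$ is a kernel weight function that has the $k(0)=1$, $k'(0)=0$ property, and $k$ is twice differentiable with continuous derivatives. Barndorff-Nielsen et al. (2009) used a Parzen kernel as it satisfies the smoothness conditions through $k'(0)=k'(1)=0$, and its estimates are non-negative. We made the same choice, and used the same Parzen kernel function:
$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 1/2, \\ 2(1-x)^3, & 1/2 \le x \le 1, \\ 0, & x > 1. \end{cases}$$
The optimal choice of bandwidth, according to Barndorff-Nielsen et al. (2009), which we chose to use, is $H^* = c^*\,\xi^{4/5}\,n^{3/5}$, with $c^* = \left\{ k''(0)^2 / k_{\bullet}^{0,0} \right\}^{1/5}$, $k_{\bullet}^{0,0} = \int_0^1 k(x)^2\,dx$, and $\xi^2 = \omega^2 \big/ \sqrt{T \int_0^T \sigma_u^4\,du}$, where $\omega^2$ is the variance of the noise and $c^* = 3.5134$ for the Parzen kernel.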
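The quantity $\int_0^T \sigma_u^4\,du$ is called the integrated quarticity. In our empirical exercise, following Barndorff-Nielsen et al. (2009), $\sqrt{T \int_0^T \sigma_u^4\,du}$ was approximated by the integrated variance, estimated by $RV_{sparse}$, a subsampled realized variance based on 20-min returns. By calculating 1200 realized variances, shifting the time of the first observation in 1-s increments, we obtained a set of realized variance estimators; averaging them gave $RV_{sparse}$. The noise variance $\omega^2$ was estimated by calculating the realized variance using every $q$th trade. We varied the starting point, and thereby produced $q$ realized variances, namely $RV_{dense}^{(1)}, \ldots, RV_{dense}^{(q)}$. Thus, our $\hat{\omega}_{(i)}^2$ estimator was calculated as:
$$\hat{\omega}_{(i)}^2 = \frac{RV_{dense}^{(i)}}{2\,n_{(i)}}, \qquad i = 1, \ldots, q,$$
where $n_{(i)}$ is the number of non-zero returns used to estimate $RV_{dense}^{(i)}$. The estimate of $\omega^2$ is then the average of the $q$ estimates, $\hat{\omega}^2 = \frac{1}{q} \sum_{i=1}^{q} \hat{\omega}_{(i)}^2$.
By design, the realized kernel is positive semi-definite and the rate of convergence is $n^{1/5}$.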
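The steps described above (jittering the end points, weighting autocovariances of high-frequency returns with a Parzen kernel, and choosing the bandwidth from a noise-to-signal ratio) can be sketched as follows. This is a univariate illustration in the spirit of Barndorff-Nielsen et al. (2009), not the code behind our estimates. The function names are hypothetical, and the inputs `noise_var` and `iv` are assumed to be supplied by the user, for instance from a dense-sampling noise estimate and a sparse subsampled realized variance.

```python
import numpy as np

def parzen(x):
    """Parzen kernel weight; k(0) = 1 and k(x) = 0 for x > 1."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def jitter(prices, m=2):
    """Replace the first m and last m prices by their averages (end-point jittering)."""
    p = np.asarray(prices, dtype=float)
    return np.concatenate(([p[:m].mean()], p[m:-m], [p[-m:].mean()]))

def noise_variance(log_prices, q):
    """Average of RV_dense / (2 * n_nonzero) over the q possible starting points."""
    p = np.asarray(log_prices, dtype=float)
    estimates = []
    for start in range(q):
        r = np.diff(p[start::q])              # returns from every q-th observation
        n_nonzero = np.count_nonzero(r)
        if n_nonzero > 0:
            estimates.append(np.sum(r ** 2) / (2.0 * n_nonzero))
    return float(np.mean(estimates)) if estimates else float("nan")

def bandwidth(noise_var, iv, n, c_star=3.5134):
    """H* = c* xi^(4/5) n^(3/5) with xi^2 approximated by noise_var / iv."""
    xi2 = noise_var / iv
    return int(np.ceil(c_star * xi2 ** 0.4 * n ** 0.6))

def realized_kernel(log_prices, H):
    """Sum of Parzen-weighted autocovariances of high-frequency returns."""
    x = np.diff(np.asarray(log_prices, dtype=float))
    n = x.size
    rk = float(np.dot(x, x))                  # gamma_0, the realized variance
    for h in range(1, min(H, n - 1) + 1):
        gamma_h = float(np.dot(x[h:], x[:-h]))
        rk += 2.0 * parzen(h / (H + 1)) * gamma_h   # gamma_h and gamma_{-h} are equal
    return rk
```

With `H = 0` the estimator reduces to the plain realized variance; larger bandwidths add down-weighted autocovariance terms that offset the effect of microstructure noise.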
We estimated both the univariate and the bivariate models in-sample and out-of-sample for each of the 10 stocks, with the 3000th day in the sample (24 November 2016) as the cutoff point. The univariate models considered are the standard realized versions of the GARCH model (Realized GARCH, Realized EGARCH, EGARCH-X, and Realized GARCH (2,2)), as well as the EGARCH model. The estimated bivariate models are those mentioned in Section 2 (Bivariate EGARCH, reduced and complete forms of Bivariate Realized GARCH, Bivariate Realized EGARCH, Bivariate EGARCH-X, and Bivariate Realized GARCH (2,2)).
The estimation was performed by maximizing the total log-likelihood functions (MLE), namely the sum of partial log-likelihood functions for the returns and for the intraday measures; the ranking criterion with respect to the MLE was the partial log-likelihood function for returns solely. We used MLE to estimate both the proposed bivariate models and a number of univariate models that do not include night volatility information.
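To make the split between the total and the partial log-likelihood concrete, the sketch below evaluates both for a univariate log-linear Realized GARCH(1,1) in the style of Hansen et al. (2012). It is a stand-in chosen for illustration and is not the bivariate specification proposed in this paper; the parameter names, the initialization of the variance recursion, and the Gaussian assumptions are all assumptions of the sketch.

```python
import numpy as np

def realized_garch_loglik(params, r, x, h0=None):
    """Total and partial (returns-only) Gaussian log-likelihood.

    Assumed model (log-linear Realized GARCH(1,1)):
        log h_t = omega + beta * log h_{t-1} + gamma * log x_{t-1}
        log x_t = xi + phi * log h_t + tau1 * z_t + tau2 * (z_t^2 - 1) + u_t
    with z_t = r_t / sqrt(h_t) and u_t ~ N(0, sigma_u^2).
    r : daily returns, x : realized measure (e.g., a realized kernel).
    """
    omega, beta, gamma, xi, phi, tau1, tau2, sigma_u = params
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    T = r.size
    log_h = np.empty(T)
    # Initialization is an assumption: sample variance unless h0 is supplied.
    log_h[0] = np.log(np.var(r)) if h0 is None else np.log(h0)
    for t in range(1, T):
        log_h[t] = omega + beta * log_h[t - 1] + gamma * np.log(x[t - 1])
    h = np.exp(log_h)
    z = r / np.sqrt(h)
    u = np.log(x) - xi - phi * log_h - tau1 * z - tau2 * (z ** 2 - 1.0)
    # Partial log-likelihood of the returns.
    ll_r = -0.5 * np.sum(np.log(2.0 * np.pi) + log_h + r ** 2 / h)
    # Partial log-likelihood of the realized measure.
    ll_x = -0.5 * np.sum(np.log(2.0 * np.pi) + np.log(sigma_u ** 2) + u ** 2 / sigma_u ** 2)
    return ll_r + ll_x, ll_r
```

In practice the negative of the first returned value would be passed to a numerical optimizer, while the second value is the returns-only quantity that can be compared across models with different measurement equations.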
The log-likelihood function used in the estimation of the above models takes the form
for Bivariate EGARCH and Bivariate EGARCH-X, or
for Bivariate Realized GARCH complete version, Bivariate Realized EGARCH (1,1), and Bivariate Realized GARCH (2,2) (
Appendix A), where
and
.
To evaluate whether introducing night volatility estimates in the models’ equations improves the day volatility estimation, we calculated two loss functions, root mean squared error (RMSE) and mean absolute error (MAE). For each in-sample and out-of-sample estimation and for each of the 10 stocks, we counted the models for which the bivariate version produced a smaller RMSE or MAE than its univariate counterpart. This allowed us to draw conclusions about the relative performance of the bivariate and univariate models, and thus to document whether or not night volatility information improves the estimation of day volatility for the main GARCH-type models proposed in the literature.
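The evaluation logic is simple enough to state in code. The sketch below is illustrative only: the function names are hypothetical, and the choice of volatility proxy used as the forecasting target is left to the caller.

```python
import numpy as np

def rmse(forecast, target):
    """Root mean squared error between forecasts and the volatility proxy."""
    f = np.asarray(forecast, dtype=float)
    y = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((f - y) ** 2)))

def mae(forecast, target):
    """Mean absolute error between forecasts and the volatility proxy."""
    f = np.asarray(forecast, dtype=float)
    y = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(f - y)))

def count_improvements(loss_univariate, loss_bivariate):
    """Number of (model, stock) cells where the bivariate loss is strictly smaller."""
    lu = np.asarray(loss_univariate, dtype=float)
    lb = np.asarray(loss_bivariate, dtype=float)
    return int(np.sum(lb < lu))
```

Applied to a 6 × 10 array of losses (models by stocks), `count_improvements` returns the kind of "cases out of 60" tally reported in the results.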
The maximized log-likelihood functions in univariate and bivariate estimations are provided in
Table A1 and
Table A2 in
Appendix B. As the log-likelihood functions of the bivariate models differ from those of the univariate versions (for the bivariate estimation, we maximized a bi-dimensional vector
with a non-null correlation factor (
) between its subvectors), it makes little sense to compare the values of the MLEs across the univariate and bivariate models to document an improvement or loss of performance when introducing night volatility estimates. Specifically, the log-likelihood function for the bivariate models is:
where
In the univariate models’ case, the log-likelihood function is
for EGARCH and EGARCH-X, and
for Realized EGARCH, Realized GARCH, and Realized GARCH (2,2). As such, we could not use this method to evaluate the performance of the bivariate models, as we would be comparing the values of estimations of different functions.
Thus, for the purpose of documenting the gain or loss in accuracy, we used the standard method in econometrics for evaluating the models’ performance—that of calculating two loss functions (RMSE and MAE)—which would better assess whether adding night volatility information with a two-factor structure in a realized GARCH setting improves estimations of next-day volatility.
4. Results
The standard method used in econometrics to evaluate models’ performance is to calculate the size of the loss functions, among which RMSE and MAE are the most common and reliable. We calculated them for both in-sample and out-of-sample estimations, and our results indicate an improvement when night volatility estimations were included in the equations of the day conditional volatility in almost every case.
We worked with a number of models that have different features and for which adding an estimate of night volatility may contribute to the volatility estimation. For example, inspecting the in-sample RMSE results in Table 2, the improvement was evident in 55 out of 60 cases (1 loss function × 6 models evaluated × 10 stocks). The cases in which the improvement could not be documented are marked with red (for RMSE) or green (for MAE) numbers in Table 2. Of the five cases in which the improvement was not evident, four were for Realized GARCH (2,2). This suggests that, given the way in which the model was designed, Realized GARCH (2,2) has features that did not benefit from the night volatility estimates. This may be because, unlike the other models, which model next-day volatility using only information from the previous day and night, Realized GARCH (2,2) uses information on the previous night volatility together with information on returns and volatility from the previous two days. We suspect that this mismatch is the source of the problem, but this would need to be verified empirically; we leave this question for future work.
This conclusion was strengthened by the MAE results. Under MAE, the bivariate models produced superior forecasting ability in 59 out of 60 cases, indicating an improvement for the models that included night volatility estimates in the day volatility modeling. The single case in which the improvement was not evident was again for the Realized GARCH (2,2) model. As such, the model itself appears to be problematic, not the evaluation we performed. As mentioned above, we suspect that the problem is that this model includes information on day volatility and returns from the previous two days, instead of one day only as in the other models, whereas in Bivariate Realized GARCH (2,2) we included the night volatility estimate from one night only rather than from the previous two nights.
Univ and Biv stand for Univariate and Bivariate, respectively, while com and red stand for complete and reduced, respectively. Red and green numbers indicate the instances in which bivariate models perform worse than the univariate ones (when evaluated according to RMSE or MAE, respectively).
When examining the results for the out-of-sample estimations in Table 3, we found that of 60 evaluations with RMSE, 53 showed forecasting improvement when night volatility information was used. Of the seven cases in which the improvement was not evident, three were recorded for the same Realized GARCH (2,2) model. The remaining four belonged to various other models, one for each. However, we observed another pattern: most of the failures to document an improvement were for the same stock, AIG. This suggests that the results were sensitive not only to the model (as explained earlier with respect to the way in which Realized GARCH (2,2) was built) but also to the stock choice. Since AIG persistently failed to show an improvement when night volatility information was used, AIG price recordings should be examined more carefully to understand what makes the stock less sensitive to this modeling suggestion, including the size of the stock price differential (the difference between the market closing and the market opening prices) and the roots of the volatility transmission for this stock in particular. Again, we leave this as exploratory work for a future paper. When ranked according to MAE, 58 results out of 60 indicated improvement, whereas only two cases (among them, one for Realized GARCH (2,2)) did not. Again, both loss functions provided strong evidence in favor of including night volatility estimates in the modeling of day volatility.
Counting the number of cases that fail to show improvement is valuable for two reasons: (1) it provides a common basis for comparing models estimated by MLE, given that the univariate and bivariate log-likelihood functions differ and their maximized values therefore cannot be compared directly; and (2) the cases in which we failed to see improvement were concentrated in a specific model and a specific stock. This opens the opportunity for future work in which we might try to understand why the Realized GARCH (2,2) model and the AIG stock persistently showed less evidence of improvement than the other models and stocks, for which adding night volatility information produced improved volatility estimates.
Red and green numbers indicate the instances in which bivariate models perform worse than the univariate ones (when evaluated according to RMSE or MAE, respectively).
Thus, we concluded that the proposed bivariate models improved the forecasting performance compared with the univariate models; as such, adding night volatility estimations according to the methodology suggested improves next-day volatility estimates.
5. Conclusions
This paper provided a methodology that captures and integrates night volatility into the modeling of day volatility. In univariate context, this method led to formulating four bivariate realized GARCH models (Bivariate EGARCH-X, Bivariate Realized GARCH, Bivariate Realized GARCH (2,2), and Bivariate Realized EGARCH) and one bivariate non-realized model (Bivariate EGARCH). The novelty of this method is the incorporation of a night measure of volatility into the models, computed from price changes between the closing and opening of the trading market with a two-factor structure of the conditional variance in a realized GARCH setting that takes advantage of the natural relationship between the realized measure and the conditional variance. This captures the leverage effect and maintains an elegant mathematical structure that facilitates the estimation of volatility.
With respect to assessing forecasting performance, the first finding was that rankings were sensitive to the stock and model choice but displayed little sensitivity to the ranking criterion and estimation methodology. Nevertheless, the bivariate models performed better than the univariate models in most instances. As such, we concluded that by adding night volatility estimates to the volatility models according to the methodology described, better estimates of next-day volatility could be obtained. This represents a step beyond including high-frequency data in GARCH models, in that estimates of night volatility are added to the equation of the day conditional variance according to the novel methodology we suggest.
The assessment could be extended in future work to multivariate assets (e.g., portfolios of stocks) by documenting a method of forecasting the volatility of assets using principal component (PC) analysis or other statistical procedures that use an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables, taking advantage of the autoregressive conditional heteroskedastic models we proposed that use estimates of day, intraday, and night volatility. We might refer to these models as PC Bivariate Realized GARCH models, and these might be used to formulate the general form of one multivariate asset’s conditional variance–covariance matrix expressed in terms of the conditional variances of the component assets and of their principal components. This would allow the estimation of the volatility of one multivariate asset through estimates of the volatility of principal components using day, intraday, and night volatility information. Then, by reducing the n-multivariate to a stock dimension ( positive integers), we could estimate the new models and assess their one-day-ahead forecasting performance. Constructing models that use volatility information from the previous two days and two nights may further improve the modeling of volatility, as we noted when inspecting the results for the current bivariate form of Realized GARCH (2,2). Discriminating among the stocks according to their underlying volatility features may provide a better method of more consistently modeling their volatility patterns.
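The orthogonal transformation mentioned above can be illustrated with a short sketch. It only shows the decorrelation step and the mapping from principal-component variances back to an asset covariance matrix; fitting a volatility model to each component is left out. The function names are hypothetical, and the use of the sample covariance matrix is an assumption of the sketch.

```python
import numpy as np

def principal_components(returns):
    """Rotate demeaned returns (T x n) into linearly uncorrelated components."""
    R = np.asarray(returns, dtype=float)
    R = R - R.mean(axis=0)
    cov = np.cov(R, rowvar=False)             # sample covariance of the assets
    eigvals, W = np.linalg.eigh(cov)          # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]         # sort by explained variance
    eigvals, W = eigvals[order], W[:, order]
    return R @ W, W, eigvals

def covariance_from_pc_variances(W, pc_variances):
    """Rebuild the asset covariance matrix from component variances."""
    return W @ np.diag(np.asarray(pc_variances, dtype=float)) @ W.T
```

Replacing the static component variances with one-day-ahead conditional variance forecasts, and keeping only the leading components, would give a reduced-dimension forecast of the asset covariance matrix.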
Integration of volatility estimates of highly interlinked markets that are open during the closing time of the reference market is another suggestion for further research. For example, proposing models for the U.S. market that estimate day volatility using night volatility estimates from the Asian markets open during the non-trading times of the U.S. market would allow for integration in such models of systemic risk and financial contagion related elements, with likely benefits for volatility estimation and forecasting.