Article

A Bayesian Semiparametric Realized Stochastic Volatility Model

Sobey School of Business, Saint Mary’s University, Halifax, NS B3H 3C3, Canada
J. Risk Financial Manag. 2021, 14(12), 617; https://doi.org/10.3390/jrfm14120617
Submission received: 2 November 2021 / Revised: 13 December 2021 / Accepted: 15 December 2021 / Published: 19 December 2021
(This article belongs to the Special Issue Financial Markets, Financial Volatility and Beyond)

Abstract

This paper proposes a semiparametric realized stochastic volatility model by integrating the parametric stochastic volatility model utilizing realized volatility information and the Bayesian nonparametric framework. The flexible framework offered by Bayesian nonparametric mixtures not only improves the fitting of asymmetric and leptokurtic densities of asset returns and logarithmic realized volatility but also enables flexible adjustments for estimation bias in realized volatility. Applications to equity data show that the proposed model offers superior density forecasts for returns and improved estimates of parameters and latent volatility compared with existing alternatives.

1. Introduction

Asset volatility plays a crucial role in many financial problems such as derivative pricing, risk management, and portfolio allocation. The generalized autoregressive conditional heteroscedasticity (GARCH) model introduced by Bollerslev (1986) and stochastic volatility (SV) model formalized by Taylor (1986) are standard econometric tools for estimating and forecasting financial asset volatility. Recent developments in financial econometrics benefit volatility modeling in several aspects. For one thing, the availability of high-frequency data provides realized measures of ex-post volatility, which are data on historical volatility. For another, econometric techniques such as the Bayesian nonparametric mixture enable modeling data in more flexible ways. This paper proposes an extended SV model that utilizes information from both returns and realized volatility (RV) under a flexible Bayesian nonparametric framework. Compared with existing models, the proposed models significantly improve density forecasts of returns and volatility measures.
To better accommodate asymmetric and heavy-tailed features of asset returns, many works have extended the SV model by relaxing the Gaussian distributional assumption. Non-Gaussian innovation distributions used in the SV framework include the Student’s t (Chib et al. 2002; Sandmann and Koopman 1998), normal inverse Gaussian (Barndorff-Nielsen 1997), finite Gaussian mixture (Kim et al. 1998; Mahieu and Schotman 1998) and generalized hyperbolic skew Student’s t (Nakajima and Omori 2012). Recent advances in Bayesian nonparametrics enable modeling data without parametric distributional assumptions. Jensen and Maheu (2010) extend the SV model to a semiparametric version with a nonparametric mixture innovation distribution. Other semiparametric SV models include those in Yu (2012), Delatola and Griffin (2013), Jensen and Maheu (2014) and Virbickaitė and Lopes (2019).
For two decades, estimation of ex-post volatility using high-frequency data has been a very active research topic in financial econometrics. Andersen et al. (2001) and Barndorff-Nielsen and Shephard (2002) show that the RV estimator defined as the summation of squared intraday returns is consistent for ex-post daily volatility in an ideal scenario. In practice, price observations are contaminated with market microstructure noise, which leads to biased RV measures. Zhang et al. (2005) suggest that the average of subsampled RV measures outperforms RV and introduce a two-scales volatility estimator. Barndorff-Nielsen et al. (2008) propose a kernel-based approach for volatility estimation. Other estimators include the realized power variation (Barndorff-Nielsen and Shephard 2004), range-based volatility estimator (Christensen and Podolskij 2007), pre-averaged RV (Jacod et al. 2009) and quasi-maximum likelihood estimator (Xiu 2010). High-frequency volatility estimators quantify latent volatility nonparametrically and provide data on volatility. Maheu and McCurdy (2011) show that jointly modelling returns and RV leads to significant improvements in return density forecasts. Takahashi et al. (2009) propose a realized SV (RSV) model that simultaneously analyzes returns and RV measures. Shirota et al. (2014) and Asai et al. (2017) extend the RSV model to allow for leverage, long memory or asymmetry. Other joint return-RV models include the realized GARCH (Hansen et al. 2012), high-frequency-based volatility (HEAVY) model (Shephard and Sheppard 2010) and Markov switching model with RV (Liu and Maheu 2018).
Compared with return data, ex-post volatility estimates offer more accurate volatility measures but are subject to estimation bias caused by market microstructure noise. Takahashi et al. (2009) equip the RSV model with a constant correction term to adjust for the estimation bias. Nevertheless, later works such as Bandi et al. (2013) note that the bias in RV could be time-varying due to the variation in market microstructure noise. In addition, most works, including Takahashi et al. (2009), assume that logarithmic RV (logRV) follows a Gaussian distribution. However, Corsi et al. (2008) show that the volatility of RV is time-varying and that residuals in several logRV models are not normally distributed. Huang et al. (2019) study option-implied volatility and find that the volatility of volatility varies over time.
This paper extends the RSV model to its semiparametric version by relaxing assumptions about innovation distributions and RV estimation bias. Such an extension provides two benefits. First, the non-Gaussian features of both return and logRV are better accommodated under the Bayesian nonparametric framework with no distributional assumptions. Following Jensen and Maheu (2010), I incorporate the RSV model with the Dirichlet process mixture (DPM), which is a Bayesian nonparametric mixture model allowing a nonfixed number of clusters. Second, I assume that the RV estimation bias is time-varying. Instead of adjusting the RV bias via a constant parameter, the proposed model adopts a varying correction term to filter out the bias, which facilitates the extraction of volatility information from RV data. I consider three versions of semiparametric RSV models, in which return and logRV processes are influenced by a common DPM, are governed by two independent DPMs, or only the return innovation terms follow a DPM.
The proposed model is evaluated against existing SV models including RSV, SV-DPM, and standard SV and GARCH models with normal or Student’s t innovations. Empirical applications to three U.S. equities (Disney (DIS), IBM, SPDR S&P 500 ETF (SPY)) and one South Korean stock (SK Hynix (SKHY)) reveal the benefit of the proposed extension of the RSV model. The semiparametric RSV model captures stronger volatility persistence and results in a less noisy log volatility process compared with the RSV model. The nonparametric mixture characterizes skewed and heavy-tailed densities well for both returns and logRV, as shown in predictive density plots. In contrast to the semiparametric SV model without RV, the semiparametric RSV model fits the return density using a mixture with fewer clusters. In out-of-sample forecasting, the proposed model significantly improves return density forecasts compared with benchmark models. Combining the RSV model with Bayesian nonparametric mixtures also benefits the forecast of logRV densities. Both in-sample and out-of-sample results are robust to the choice of assets, subsample periods, and RV measures.
The remainder of the paper is organized as follows. Section 2 provides a brief summary of ex-post volatility estimation and discusses the data. Section 3 illustrates the proposed models, benchmarks, Bayesian inference, and model comparison. Full sample estimates and out-of-sample forecasting results are reported in Section 4. Section 5 concludes the paper, followed by Appendix A.

2. Ex-Post Volatility Estimation and Data

2.1. Ex-Post Volatility Estimation

Consider the following stochastic process for the logarithmic price $p(\tau)$:
$$dp(\tau) = m(\tau)\,d\tau + \sigma(\tau)\,dw(\tau), \tag{1}$$
where $m(\tau)$ is a drift term, $\sigma(\tau)$ stands for the instantaneous volatility, and $w(\tau)$ is a Brownian motion. The integrated variance $V_t$ is the true variance measure of the return over the period $(t-1, t)$ and is defined as
$$V_t = \int_{t-1}^{t} \sigma^2(\tau)\,d\tau. \tag{2}$$
Andersen et al. (2001) and Barndorff-Nielsen and Shephard (2002) show that the RV defined as the sum of squared intraperiod returns is a consistent estimator of $V_t$ in an ideal setting without market microstructure noise.
Due to the bid-ask bounce, discrete price changes and measurement error, price observations are contaminated with errors. Let $\tilde{p}_{t,i\Delta} = p_{t,i\Delta} + \epsilon_{t,i\Delta}$ denote the log price observed at time $i\Delta$ on day $t$, where $p_{t,i\Delta}$ and $\epsilon_{t,i\Delta}$ represent the frictionless log price and error term, respectively. The $i$th intraday return over $\Delta$ seconds is given as
$$\tilde{r}_{t,i} = \tilde{p}_{t,i\Delta} - \tilde{p}_{t,(i-1)\Delta} = p_{t,i\Delta} - p_{t,(i-1)\Delta} + \epsilon_{t,i\Delta} - \epsilon_{t,(i-1)\Delta}. \tag{3}$$
The presence of microstructure noise induces autocorrelation in the return series and leads to biased RV. One simple way to reduce the estimation bias is to form RV using low-frequency data such as $\Delta = 300$ or 600 s. Such an approach leads to a less biased but noisy volatility estimator. Zhang et al. (2005) suggest that an improved estimator with reduced estimation noise can be obtained by averaging sparsely sampled RV estimators from different subsamples. Each subsample contains returns with the same $\Delta$ but different starting times. The subsampled RV (SRV) with $K$ subsampling groups is defined as
$$\text{SRV}(K)_t = \frac{1}{K} \sum_{k=1}^{K} \text{RV}_t^k, \quad \text{RV}_t^k = \sum_{i=1}^{n_t} \left(\tilde{r}_{t,i}^{k}\right)^2, \tag{4}$$
where $\tilde{r}_{t,i}^{k} = \tilde{p}_{t,(i+k/K)\Delta} - \tilde{p}_{t,(i-1+k/K)\Delta}$ is the return from the subsample that shifts the time period $((i-1)\Delta, i\Delta)$ by $(k/K)\Delta$ and $n_t$ is the number of intraday returns on day $t$.
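The subsampling idea can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the log prices of one day are already sampled on a regular grid of step $\Delta/K$, and it indexes the offsets as $k = 0, \ldots, K-1$ rather than $1, \ldots, K$; the function names are hypothetical and not taken from the paper.

```python
def realized_variance(log_prices):
    """Sum of squared successive log-price differences (plain RV)."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def subsampled_rv(log_prices, K):
    """Average of K sparse RVs, with log_prices on a grid of step Delta / K.

    Subsample k keeps every K-th price starting at offset k, so each of its
    returns spans Delta seconds but starts k * Delta / K later. Subsamples
    may differ by one return in length when the grid does not divide evenly.
    """
    if K < 1 or len(log_prices) < K + 1:
        raise ValueError("need K >= 1 and at least K + 1 prices")
    rvs = [realized_variance(log_prices[k::K]) for k in range(K)]
    return sum(rvs) / K
```

Under these assumptions, a 600-s SRV with 20 subsampling groups would take log prices on a 30-s grid as input and be computed as `subsampled_rv(prices, 20)`.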
Another popular ex-post volatility estimator robust to microstructure noise is the realized kernel (RK) proposed by Barndorff-Nielsen et al. (2008). Barndorff-Nielsen et al. (2009) recommend the nonnegative RK, which guarantees RK estimates to be positive, for practical application. The nonnegative RK is defined as
$$\text{RK}_t = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \gamma_h, \quad \gamma_h = \sum_{i=|h|+1}^{n_t} \tilde{r}_{t,i}\, \tilde{r}_{t,i-|h|}, \tag{5}$$
where $k(\cdot)$ stands for the kernel weight function and $\gamma_h$ is a realized autocovariance term. $H$ is the bandwidth controlling the number of $\gamma_h$ terms used in constructing $\text{RK}_t$. Barndorff-Nielsen et al. (2008) suggest that the optimal choice of $H$ is $H^* = c_0\, n_t^{0.6} (\omega_t / \text{IQ}_t)^{0.8}$ and the preferred kernel function is the Parzen kernel. $\omega_t^2$ stands for the variance of microstructure noise and can be estimated as $\text{RV}_{\text{dense},t} / (2 n_{\text{dense},t})$ following Bandi and Russell (2008). $\text{IQ}_t$ is the integrated quarticity, which can be approximated by the square of the 10-min $\text{SRV}_t$. For the Parzen kernel, $c_0 = 3.5134$.
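A rough Python sketch of the kernel estimator follows. The Parzen weight function is written in its standard textbook form, which is an assumption here because the text above does not spell it out; the bandwidth `H` is passed in directly rather than computed from $H^*$, and all function names are illustrative.

```python
def parzen(x):
    """Standard Parzen kernel weight on [0, 1]; zero outside."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0


def realized_autocov(returns, h):
    """gamma_h: sum over i of r_i * r_{i-|h|} (0-based indexing)."""
    h = abs(h)
    return sum(returns[i] * returns[i - h] for i in range(h, len(returns)))


def realized_kernel(returns, H):
    """RK: sum over h = -H..H of k(h / (H + 1)) * gamma_h."""
    return sum(
        parzen(h / (H + 1)) * realized_autocov(returns, h)
        for h in range(-H, H + 1)
    )
```

With `H = 0` the estimator collapses to the plain sum of squared returns. For a bouncing return series such as `[1, -1, 1, -1]`, whose negative first-order autocovariance mimics bid-ask bounce, `H = 1` pulls the estimate below the plain RV.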

2.2. Data Source and Motivation

The tick-by-tick transactions of DIS, IBM, and SPY from 2 January 2004 to 31 December 2020 and SKHY from 2 January 2009 to 31 December 2020 are obtained from Tick Data. The data are cleaned following the procedure used in Barndorff-Nielsen et al. (2009) and converted to continuously compounded returns. Both the 600-s SRV with 20 subsampling groups and the 30-s nonnegative RK are employed to estimate the ex-post volatility.
Equities are actively traded during trading hours, but after-market transactions are very sparse. The RV based on high-frequency data over trading hours measures the variance of the open-to-close return, rather than the close-to-close return. Following Hansen and Lunde (2005), I construct the daily volatility measure by combining RV over trading hours with the squared overnight return $r_{t,co}$, where $r_{t,co}$ is defined as the log difference between the closing price on day $t-1$ and the opening price on day $t$. The RV corresponding to the close-to-close daily return is measured as
$$\text{RV}_{t,cc} = \text{RV}_{t,oc} + r_{t,co}^2. \tag{6}$$
To simplify the notation, I drop the subscript “cc” and use $\text{RV}_t$ in the remaining sections.
Figure 1 plots daily returns and logarithmic realized volatility measures of DIS. Table 1 provides summary statistics for daily returns, two versions of RV and logRV measures of DIS, IBM, SPY and SKHY. As evident from the nonzero skewness values and large values of kurtosis, all four return series exhibit asymmetric and leptokurtic features. The skewness of every logRV series deviates from zero and the kurtosis values are all greater than 6.5, which clearly departs from a normality assumption. The summary statistics of logRV data are consistent with findings reported in Corsi et al. (2008).
Volatility estimators such as SRV and RK are not bias-free, especially in finite samples. Table 1 shows that the two RV measures overestimate the return variance on average for DIS and IBM, but provide underestimations of SPY and SKHY return variance. However, the above results are based on the full sample and may not be true for subsamples. Figure 2 shows the ratios between the sample variance of returns and the sample mean of RV measures in 100-day rolling windows based on DIS. The ratio fluctuates between 0.57 and 1.33. Similarly, the ratio varies from 0.68 to 1.65 in the SPY case. The varying ratios suggest that the gap between return variance and RV average is not a constant and that RV estimation bias could be time-varying or state-dependent.
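The rolling comparison described above can be sketched in a few lines of Python. The sketch assumes the unbiased (n - 1) sample variance, which the text does not specify, and the function name is hypothetical.

```python
def rolling_variance_ratio(returns, rv, window=100):
    """Ratio of the sample variance of returns to the mean RV per window."""
    if window < 2 or len(returns) != len(rv):
        raise ValueError("window must be >= 2 and inputs equally long")
    ratios = []
    for end in range(window, len(returns) + 1):
        r = returns[end - window:end]
        v = rv[end - window:end]
        mean_r = sum(r) / window
        # Assumption: unbiased sample variance with an (n - 1) denominator.
        var_r = sum((x - mean_r) ** 2 for x in r) / (window - 1)
        ratios.append(var_r / (sum(v) / window))
    return ratios
```

A ratio persistently above one in a window would indicate that RV underestimates the return variance there, and a ratio below one the opposite.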

3. Models

3.1. Semiparametric Realized Stochastic Volatility Models

Following Jensen and Maheu (2010), I adopt the flexible distributional framework offered by the DPM and integrate it with the RSV model. The Gaussian innovation distributions for returns and logRV in the RSV model are replaced with nonparametric mixtures. In addition, the RV bias correction term is assumed to follow the DPM to more flexibly adjust the RV estimation bias. DPM is a nonparametric version of the finite mixture model. Unlike the conventional mixture, which requires a predetermined number of distributions, DPM allows the number of clusters to be nonfixed and learned endogenously from the data. Such a flexible framework is achieved with the use of the Dirichlet process (DP) formally introduced by Ferguson (1973). The DP is an infinite-dimensional generalization of the Dirichlet distribution and can be seen as a distribution of distributions. Let $G$ represent a discrete distribution for mixture parameters. Imposing $\text{DP}(\alpha, G_0)$ as a prior for $G$ makes the number of clusters and the corresponding weights in $G$ random. $\alpha$ in $\text{DP}(\alpha, G_0)$ is the concentration parameter that influences the likelihood of creating new clusters. The larger the value of $\alpha$, the more mixture components the DPM contains. $G_0$ is the base function for the DP and serves as the center of $G$.
Expressing the DP prior in the stick-breaking form of Sethuraman (1994), the RSV model combined with the DPM (RSV-DPM) is given as
$$r_t \mid s_t = \mu_{s_t} + \lambda_{s_t} \exp(h_t/2)\, z_t, \quad z_t \sim N(0, 1), \tag{7}$$
$$\log RV_t \mid s_t = \xi_{s_t} + h_t + \sigma_{u,s_t} u_t, \quad u_t \sim N(0, 1), \tag{8}$$
$$h_t = \rho_0 + \rho_1 h_{t-1} + v_t, \quad v_t \sim N(0, \sigma_v^2), \tag{9}$$
$$s_t \sim \text{Multinomial}(\Pi), \quad \Pi = (\pi_1, \pi_2, \ldots), \tag{10}$$
$$\pi_j = v_j \prod_{l=1}^{j-1} (1 - v_l), \quad v_j \sim \text{Beta}(1, \alpha), \tag{11}$$
$$\phi_j \sim G_0, \quad \phi_j = \{\mu_j, \lambda_j^2, \xi_j, \sigma_{u,j}^2\}. \tag{12}$$
The return and logRV are linked by the latent log volatility $h_t$, which follows the parametric autoregressive process defined by Equation (9). Parameters $\mu_{s_t}$, $\lambda_{s_t}^2$, $\xi_{s_t}$ and $\sigma_{u,s_t}^2$ are all state-dependent and follow the distribution $G$, whose prior is $\text{DP}(\alpha, G_0)$. As in Jensen and Maheu (2010), a mixture with state-dependent mean $\mu_{s_t}$ and volatility scalar $\lambda_{s_t}$ is applied to fit the return innovation distribution. Parameters $\xi_{s_t}$ and $\sigma_{u,s_t}^2$ in Equation (8) correspond to the RV estimation bias and logRV variance, respectively. Allowing $\xi_{s_t}$ and $\sigma_{u,s_t}^2$ to be nonfixed not only accommodates the non-Gaussian features of logRV, but also allows flexible adjustment for RV bias. To maintain parsimony, I assume that all state-dependent parameters are governed by the underlying state variable $s_t = 1, 2, \ldots, \infty$. $\Pi$ is the state probability vector whose elements are generated from the stick-breaking process in Equation (11). The DP prior’s base function $G_0$ is defined as $G_0(\mu_j) \equiv N(m_\mu, v_\mu^2)$, $G_0(\lambda_j^2) \equiv \text{IG}(v_0/2, s_0/2)$, $G_0(\xi_j) \equiv N(m_\xi, v_\xi^2)$ and $G_0(\sigma_{u,j}^2) \equiv \text{IG}(v_{0,u}/2, s_{0,u}/2)$, where IG stands for an inverse-gamma distribution. The prior for $(\rho_0, \rho_1)$ is $N(0, V)$ and $\sigma_v^2 \sim \text{IG}(v_{0,v}/2, s_{0,v}/2)$. A hierarchical prior $\text{Gamma}(a, b)$ is placed on $\alpha$ to add more flexibility.
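To make the data-generating process concrete, the following Python sketch simulates from a truncated version of the RSV-DPM model using the standard stick-breaking construction, in which each weight takes a Beta(1, α) fraction of the stick that remains. Everything numeric is an illustrative assumption: the truncation level `J`, the base-distribution hyperparameters, the default parameter values, and the reading of IG(a, b) as shape a and scale b. It is meant only to show how the state $s_t$ ties the return and logRV equations together.

```python
import math
import random


def stick_breaking(alpha, n_components, rng):
    """Truncated stick-breaking weights pi_j = v_j * prod_{l<j} (1 - v_l)."""
    weights, remaining = [], 1.0
    for _ in range(n_components):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining  # remaining = mass beyond the truncation


def inv_gamma(shape, scale, rng):
    """Inverse-gamma draw (assumed shape/scale parameterization)."""
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def simulate_rsv_dpm(T, alpha=1.0, rho0=0.0, rho1=0.97, sigma_v=0.2,
                     J=50, seed=0):
    rng = random.Random(seed)
    weights, _ = stick_breaking(alpha, J, rng)
    # phi_j = (mu_j, lambda_j^2, xi_j, sigma_{u,j}^2) drawn from a
    # hypothetical base distribution G_0.
    phi = [(rng.gauss(0.0, 0.1), inv_gamma(5.0, 5.0, rng),
            rng.gauss(0.0, 0.3), inv_gamma(5.0, 1.0, rng)) for _ in range(J)]
    h = rho0 / (1.0 - rho1)  # start at the stationary mean of h_t
    path = []
    for _ in range(T):
        h = rho0 + rho1 * h + rng.gauss(0.0, sigma_v)
        # choices() renormalizes the truncated weights internally.
        s = rng.choices(range(J), weights=weights)[0]
        mu, lam2, xi, sig_u2 = phi[s]
        r = mu + math.sqrt(lam2) * math.exp(h / 2.0) * rng.gauss(0.0, 1.0)
        log_rv = xi + h + math.sqrt(sig_u2) * rng.gauss(0.0, 1.0)
        path.append((r, log_rv, h, s))
    return path
```

Because one state variable selects all four cluster parameters, days assigned to the same cluster share both the return innovation shape and the RV bias term, which is the parsimony assumption made above.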
I further consider an alternative semiparametric RSV model termed RSV-DPM-ind, which assigns two independent DPMs to return and logRV processes and analyzes logRV as follows.
$$\log RV_t \mid w_t = \xi_{w_t} + h_t + \sigma_{u,w_t} u_t, \quad u_t \sim N(0, 1), \tag{13}$$
$$w_t \sim \text{Multinomial}(\Gamma), \quad \Gamma = (\gamma_1, \gamma_2, \ldots), \tag{14}$$
$$\gamma_j = c_j \prod_{l=1}^{j-1} (1 - c_l), \quad c_j \sim \text{Beta}(1, \alpha_2), \tag{15}$$
$$\chi_j \sim H_0, \quad \chi_j = \{\xi_j, \sigma_{u,j}^2\}. \tag{16}$$
Equations (7), (9)–(12) and (13)–(16) constitute the RSV-DPM-ind model. Underlying state variables $s_t$ and $w_t$ govern return and logRV, respectively. Parameters $\xi_{w_t}$ and $\sigma_{u,w_t}^2$ follow the distribution $H$, whose prior is $\text{DP}(\alpha_2, H_0)$ with base function $H_0(\xi_j) \equiv N(m_\xi, v_\xi^2)$ and $H_0(\sigma_{u,j}^2) \equiv \text{IG}(v_{0,u}/2, s_{0,u}/2)$. The model settings for return and latent log volatility processes are the same as in the RSV-DPM model, except that the DPM with prior $\text{DP}(\alpha_1, G_0)$ only governs $\mu_{s_t}$ and $\lambda_{s_t}^2$.
In addition, a semiparametric RSV model with DPM influencing only the return process is included for the purpose of model comparison. Termed RSV-DPM-ret, the model is constituted by Equations (7) to (12) but with fixed parameters $\xi$ and $\sigma_u^2$ in the logRV equation.
The first-order autoregressive latent volatility process can be generalized to include more lagged terms, a factor capturing the volatility-feedback effect, or additional predictors. For example, Huang et al. (2019) find that the option implied volatility (IV) benefits the prediction of future RV. Equation (9) could contain IV as an additional explanatory variable. In addition, motivated by Corsi et al. (2008), the realized quarticity (RQ), which measures the volatility of RV, could be incorporated in Equation (8) to improve the characterization of the logRV residual error. All of these are left for future studies.

3.2. Benchmark Models

Several volatility models are considered as benchmarks. The first benchmark is the RSV model, which analyzes return and logRV in the following parametric way.
$$r_t = \mu + \exp(h_t/2)\, z_t, \quad z_t \sim N(0, 1), \tag{17}$$
$$\log RV_t = \xi + h_t + \sigma_u u_t, \quad u_t \sim N(0, 1). \tag{18}$$
Combining Equations (9), (17) and (18) completes the RSV model. To evaluate the proposed models against existing semiparametric SV models, we consider the SV-DPM designed by Jensen and Maheu (2010). Equations (7), (9)–(12) constitute the SV-DPM model, which does not take the RV information into account. The conventional SV model formed by Equations (9) and (17) and the SV model with a Student’s t-distributed innovation term are included. Finally, the GARCH model and GARCH with Student’s t-distributed innovation (GARCH-t) are included as benchmarks. The GARCH model is given as
$$r_t = \mu + \sigma_t z_t, \quad z_t \sim N(0, 1), \tag{19}$$
$$\sigma_t^2 = \omega + \alpha (r_{t-1} - \mu)^2 + \beta \sigma_{t-1}^2. \tag{20}$$
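For reference, the GARCH variance recursion can be written as a short Python sketch. Initializing at the unconditional variance ω / (1 − α − β) is an assumption made here for illustration, since the text does not say how the recursion is started.

```python
def garch_variances(returns, mu, omega, alpha, beta):
    """Conditional variances sigma_t^2 for each observation in returns.

    sigma_t^2 = omega + alpha * (r_{t-1} - mu)^2 + beta * sigma_{t-1}^2.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for this start value")
    # Assumption: start the recursion at the unconditional variance.
    sigma2 = omega / (1.0 - alpha - beta)
    out = []
    for r in returns:
        out.append(sigma2)  # variance in force when r is observed
        sigma2 = omega + alpha * (r - mu) ** 2 + beta * sigma2
    return out
```

Unlike the SV-type models above, the conditional variance here is a deterministic function of past returns, so no latent volatility needs to be sampled.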

3.3. Bayesian Inference

The semiparametric RSV models are estimated using the Markov chain Monte Carlo (MCMC) technique. Taking the RSV-DPM model as an example, the parameter set includes $\{\phi_j\}_{j=1}^{\infty} = \{\mu_j, \lambda_j^2, \xi_j, \sigma_{u,j}^2\}_{j=1}^{\infty}$, $\psi = \{\rho_0, \rho_1, \sigma_v^2\}$, $\alpha$ and the latent volatility series $h_{1:T} = \{h_1, h_2, \ldots, h_T\}$. The slice sampling technique introduced by Walker (2007) and further developed by Kalli et al. (2011) is applied to facilitate model estimation in infinite state space. Conditional on a set of auxiliary variables $u_{1:T} = \{u_1, u_2, \ldots, u_T\}$, the infinite number of clusters is randomly truncated to a finite number $K$, which facilitates the use of Gibbs sampling or Metropolis-Hastings algorithms to estimate model parameters.
Let $y_t = (r_t, \log RV_t)$, $y_{1:T} = \{y_1, y_2, \ldots, y_T\}$ and $\theta = \{\{\phi_j\}_{j=1}^{K}, \psi\}$. After augmenting $s_{1:T}$ and $u_{1:T}$, the joint posterior $p(\{\phi_j\}_{j=1}^{\infty}, \psi, h_{1:T}, s_{1:T}, u_{1:T} \mid y_{1:T})$ is proportional to
$$\prod_{t=1}^{T} \mathbb{1}(u_t < \pi_{s_t})\, N\!\left(r_t \mid \mu_{s_t}, \lambda_{s_t}^2 \exp(h_t)\right) N\!\left(\log RV_t \mid \xi_{s_t} + h_t, \sigma_{u,s_t}^2\right) N\!\left(h_t \mid \rho_0 + \rho_1 h_{t-1}, \sigma_v^2\right) \cdot \prod_{j=1}^{K} p(\phi_j)\, p(\psi).$$
Each MCMC iteration contains the following sampling steps.
1. Sample model parameters $\{\phi_j\}_{j=1}^{K}$ and $\psi$ conditional on $r_{1:T}$, $\log RV_{1:T}$, $h_{1:T}$, $s_{1:T}$.
Given conjugate priors, the conditional posterior distributions of $\mu_j$, $\lambda_j^2$, $\xi_j$, $\sigma_{u,j}^2$, $\rho_0$, $\rho_1$ and $\sigma_v^2$ can be easily derived. See Appendix A for details. Model parameters are estimated by iteratively using Gibbs samplers as follows.
(a) $\mu_j \mid r_{1:T}, h_{1:T}, s_{1:T}, \lambda_j^2$ for $j = 1, \ldots, K$.
(b) $\lambda_j^2 \mid r_{1:T}, h_{1:T}, s_{1:T}, \mu_j$ for $j = 1, \ldots, K$.
(c) $\xi_j \mid \log RV_{1:T}, h_{1:T}, s_{1:T}, \sigma_{u,j}^2$ for $j = 1, \ldots, K$.
(d) $\sigma_{u,j}^2 \mid \log RV_{1:T}, h_{1:T}, s_{1:T}, \xi_j$ for $j = 1, \ldots, K$.
(e) $\rho_0, \rho_1 \mid h_{1:T}, \sigma_v^2$.
(f) $\sigma_v^2 \mid h_{1:T}, \rho_0, \rho_1$.
2. Sample latent volatility $h_t$ for $t = 1, 2, \ldots, T$.
Latent volatility variables are sampled using the Metropolis-Hastings algorithm with a single-move sampler. The conditional posterior of $h_t$ is given as
$$p\!\left(h_t \mid r_t, \log RV_t, h_{t-1}, h_{t+1}, \{\phi_j\}_{j=1}^{K}, \psi\right) \propto p\!\left(r_t \mid h_t, \{\phi_j\}_{j=1}^{K}\right) p\!\left(\log RV_t \mid h_t, \{\phi_j\}_{j=1}^{K}\right) p(h_t \mid h_{t-1}, \psi)\, p(h_{t+1} \mid h_t, \psi).$$
The proposal distribution $f(h_t \mid \cdot)$ for $h_t$ is derived from the conditional posterior following the approach in Kim et al. (1998). Details are left to Appendix A. A proposed value $h_t' \sim f(h_t \mid \cdot)$ is accepted with probability $\min\left\{1, \dfrac{f(h_t)\, p(h_t' \mid y_t, h_{t-1}, h_{t+1}, \theta)}{f(h_t')\, p(h_t \mid y_t, h_{t-1}, h_{t+1}, \theta)}\right\}$.
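3. Sample state variable $s_t$ for $t = 1, \ldots, T$ from
$$p\!\left(s_t = j \mid r_t, h_t, \{\phi_j, \pi_j\}_{j=1}^{K}, u_t\right) \propto \mathbb{1}(u_t < \pi_j)\, N\!\left(r_t \mid \mu_j, \lambda_j^2 \exp(h_t)\right) N\!\left(\log RV_t \mid \xi_j + h_t, \sigma_{u,j}^2\right), \quad j = 1, \ldots, K.$$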
4. Sample auxiliary variable $u_t$ for $t = 1, \ldots, T$.
(a) Calculate $\pi_j = v_j \prod_{l=1}^{j-1} (1 - v_l)$ for $j = 1, \ldots, K$, where $v_j$ is sampled from
$$p(v_j \mid s_{1:T}, \alpha) \propto \text{Beta}\!\left(1 + \sum_{t=1}^{T} \mathbb{1}(s_t = j),\; \alpha + \sum_{t=1}^{T} \mathbb{1}(s_t > j)\right).$$
(b) Sample $u_t$ for $t = 1, \ldots, T$ from $p(u_t \mid s_t, \pi_{1:K}) \propto \text{Uniform}(0, \pi_{s_t})$.
(c) Find the smallest $K$ such that $\sum_{j=1}^{K} \pi_j > 1 - \min(u_{1:T})$.
5. Sample $\alpha$ based on $K$.
Following the method proposed by Escobar and West (1994), $\alpha$ is sampled from the Gamma mixture below.
$$p(\alpha \mid K) \propto q \cdot \text{Gamma}(a + K,\, b - \log \zeta) + (1 - q) \cdot \text{Gamma}(a + K - 1,\, b - \log \zeta),$$
where $q = \dfrac{a + K - 1}{a + K - 1 + T (b - \log \zeta)}$ and $\zeta \sim \text{Beta}(\alpha + 1, T)$.
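Steps 4 and 5 involve only standard distributions, so they can be sketched with the Python standard library. The sketch below is one possible reading of those steps, not the author's C implementation: states are coded 1, 2, ..., sticks use the standard construction in which each weight takes a fraction of the remaining stick, sticks for empty clusters are drawn from Beta(1, α), the second Gamma argument is treated as a rate, and `max_clusters` is a hypothetical safety cap.

```python
import math
import random


def update_weights_and_slices(states, alpha, rng, max_clusters=1000):
    """Steps 4(a)-(c): redraw sticks, slice variables and the truncation K."""
    weights, remaining = [], 1.0

    def add_stick():
        nonlocal remaining
        j = len(weights) + 1
        n_eq = sum(1 for s in states if s == j)
        n_gt = sum(1 for s in states if s > j)
        v = rng.betavariate(1.0 + n_eq, alpha + n_gt)
        weights.append(v * remaining)
        remaining *= 1.0 - v

    for _ in range(max(states)):
        add_stick()
    # u_t ~ Uniform(0, pi_{s_t})
    u = [rng.uniform(0.0, weights[s - 1]) for s in states]
    # Extend until sum_j pi_j > 1 - min(u); empty clusters have zero counts,
    # so their sticks come from Beta(1, alpha).
    while sum(weights) <= 1.0 - min(u) and len(weights) < max_clusters:
        add_stick()
    return weights, u


def sample_alpha(alpha, K, T, a, b, rng):
    """Step 5: Escobar-West Gamma-mixture update (b treated as a rate)."""
    zeta = rng.betavariate(alpha + 1.0, T)
    rate = b - math.log(zeta)
    q = (a + K - 1.0) / (a + K - 1.0 + T * rate)
    shape = a + K if rng.random() < q else a + K - 1.0
    return rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
```

The length of the returned `weights` list plays the role of the truncation level $K$ for the next sweep, so the number of clusters that must be tracked can grow or shrink from one iteration to the next.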
The estimation of the RSV-DPM-ind model is essentially the same as that of the RSV-DPM model, but two sets of DPM-related parameters need to be estimated. The RSV-DPM-ret model shares the same estimation steps as RSV-DPM, except that $\xi$ and $\sigma_u^2$ are sampled conditional on $y_{1:T}$.
Posterior statistics can be calculated from the MCMC draws after dropping the draws in a burn-in period. For example, the posterior mean of $\sigma_v^2$ based on $G$ MCMC outputs is given as
$$E(\sigma_v^2 \mid y_{1:T}) = \frac{1}{G} \sum_{i=1}^{G} \sigma_v^{2(i)},$$
where $\sigma_v^{2(i)}$ is the $i$th draw of $\sigma_v^2$. Similarly, the smoothed log volatility can be estimated as
$$E(h_t \mid y_{1:T}) = \frac{1}{G} \sum_{i=1}^{G} h_t^{(i)}.$$

3.4. Prediction

Since volatility is not observable but the return is, the density forecast of returns is a natural way to evaluate the predictive power of volatility models. The predictive likelihood for returns provides the measure of the density forecast and is defined as
$$p(r_{t+1} \mid y_{1:t}) = \int p(r_{t+1} \mid \theta, h_{t+1})\, p(h_{t+1} \mid \theta, y_{1:t})\, p(\theta \mid y_{1:t})\, d\theta\, dh_{t+1}.$$
For all semiparametric SV models, based on $G$ MCMC outputs, $p(r_{t+1} \mid y_{1:t})$ can be obtained by integrating out parameter uncertainties as
$$p(r_{t+1} \mid y_{1:t}) \approx \frac{1}{G} \sum_{i=1}^{G} p\!\left(r_{t+1} \mid \mu_{t+1}^{(i)}, \lambda_{t+1}^{2(i)} \exp(h_{t+1}^{(i)})\right),$$
where $h_{t+1}^{(i)} \sim N(\rho_0^{(i)} + \rho_1^{(i)} h_t^{(i)}, \sigma_v^{2(i)})$, $(\rho_0^{(i)}, \rho_1^{(i)}, \sigma_v^{2(i)}) \sim p(\theta \mid y_{1:t})$, and $\mu_{t+1}^{(i)}$ and $\lambda_{t+1}^{2(i)}$ are determined based on the predicted state $s_{t+1}^{(i)} \sim \text{Multinomial}(\Pi^{(i)}, K^{(i)} + 1)$. If $s_{t+1}^{(i)} \le K^{(i)}$, set $\mu_{t+1}^{(i)} = \mu_{s_{t+1}}^{(i)}$ and $\lambda_{t+1}^{2(i)} = \lambda_{s_{t+1}}^{2(i)}$. If $s_{t+1}^{(i)} = K^{(i)} + 1$, draw $\mu_{t+1}^{(i)} \sim N(m_\mu, v_\mu^2)$ and $\lambda_{t+1}^{2(i)} \sim \text{IG}(v_0/2, s_0/2)$.
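The Monte Carlo average above can be sketched in Python as follows. The layout of each posterior draw as a dictionary, the key names, and the treatment of the extra $(K+1)$th category as carrying the leftover mass $1 - \sum_j \pi_j$ are all assumptions made for this illustration, as is the shape/scale reading of the inverse-gamma distribution.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predictive_return_density(r_next, draws, base, rng):
    """Average of N(r | mu, lambda^2 * exp(h_next)) over posterior draws.

    Each draw is a dict with keys rho0, rho1, sigma_v2, h_t, pi, mu, lam2
    (hypothetical layout); base holds m_mu, v_mu, v0, s0 for G_0.
    """
    total = 0.0
    for d in draws:
        h_next = rng.gauss(d["rho0"] + d["rho1"] * d["h_t"],
                           math.sqrt(d["sigma_v2"]))
        K = len(d["pi"])
        # Assumption: category K + 1 gets the mass not covered by pi_1..pi_K.
        probs = list(d["pi"]) + [max(0.0, 1.0 - sum(d["pi"]))]
        s = rng.choices(range(K + 1), weights=probs)[0]
        if s < K:  # existing cluster
            mu, lam2 = d["mu"][s], d["lam2"][s]
        else:      # new cluster: draw its parameters from the base G_0
            mu = rng.gauss(base["m_mu"], base["v_mu"])
            lam2 = 1.0 / rng.gammavariate(base["v0"] / 2.0, 2.0 / base["s0"])
        total += normal_pdf(r_next, mu, lam2 * math.exp(h_next))
    return total / len(draws)
```

Evaluating this function on a grid of `r_next` values would trace out a predictive density curve, and evaluating it at the realized return gives one term of the log predictive likelihood.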
The log predictive likelihood (LPL) of model $M_1$ over the out-of-sample period from $t_0 + 1$ to $T$ is
$$\text{LPL}_1 = \sum_{t=t_0}^{T-1} \log p(r_{t+1} \mid y_{1:t}, M_1).$$
The model with a higher LPL is preferred. In model comparison, it is convenient to compute the log predictive Bayes factor (LBF). The LBF between models $M_1$ and $M_2$ equals $\text{LBF} = \text{LPL}_1 - \text{LPL}_2$. An LBF value greater than 5 suggests that $M_1$ strongly dominates $M_2$. The subsample performance of density forecasts can be investigated using the cumulative LBF defined as follows.
$$\text{CLBF}_s = \sum_{t=t_0}^{s} \left[\log p(r_{t+1} \mid y_{1:t}, M_1) - \log p(r_{t+1} \mid y_{1:t}, M_2)\right] \quad \text{for } s = t_0, \ldots, T-1.$$
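Given the one-step-ahead predictive densities of two models evaluated at the realized returns, these comparison measures reduce to sums of logs. A minimal Python sketch, with hypothetical function names:

```python
import math
from itertools import accumulate


def log_predictive_likelihood(pred_densities):
    """LPL: sum of log predictive densities over the out-of-sample period."""
    return sum(math.log(p) for p in pred_densities)


def cumulative_log_bayes_factor(dens_m1, dens_m2):
    """CLBF_s for each s: running sum of log p(.|M1) - log p(.|M2)."""
    diffs = (math.log(p1) - math.log(p2) for p1, p2 in zip(dens_m1, dens_m2))
    return list(accumulate(diffs))
```

The last element of the cumulative series equals the LBF over the whole out-of-sample period, and an upward-sloping series indicates that the first model gains steadily rather than in a few isolated episodes.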
For SV models incorporating RV measures, the predictive likelihood of logRV can be calculated similarly to evaluate the prediction of volatility measures.
$$p(\log RV_{t+1} \mid y_{1:t}) = \int p(\log RV_{t+1} \mid \theta, h_{t+1})\, p(h_{t+1} \mid \theta, y_{1:t})\, p(\theta \mid y_{1:t})\, d\theta\, dh_{t+1}.$$
Taking the RSV-DPM model as an example, $p(\log RV_{t+1} \mid y_{1:t})$ can be consistently estimated based on posterior outputs as
$$p(\log RV_{t+1} \mid y_{1:t}) \approx \frac{1}{G} \sum_{i=1}^{G} p\!\left(\log RV_{t+1} \mid \xi_{t+1}^{(i)} + h_{t+1}^{(i)}, \sigma_{u,t+1}^{2(i)}\right),$$
where $h_{t+1}^{(i)} \sim N(\rho_0^{(i)} + \rho_1^{(i)} h_t^{(i)}, \sigma_v^{2(i)})$ and $(\rho_0^{(i)}, \rho_1^{(i)}, \sigma_v^{2(i)}) \sim p(\theta \mid y_{1:t})$. For state-dependent parameters, $\xi_{t+1}^{(i)} = \xi_{s_{t+1}}^{(i)}$ and $\sigma_{u,t+1}^{2(i)} = \sigma_{u,s_{t+1}}^{2(i)}$ if $s_{t+1}^{(i)} \le K^{(i)}$, whereas $\xi_{t+1}^{(i)} \sim N(m_\xi, v_\xi^2)$ and $\sigma_{u,t+1}^{2(i)} \sim \text{IG}(v_{0,u}/2, s_{0,u}/2)$ if $s_{t+1}^{(i)} = K^{(i)} + 1$.

4. Empirical Applications

This section reports the results of applying the proposed and benchmark models to the four data series discussed in Section 2. Model estimation is based on 5000 MCMC draws after a burn-in of 5000 draws, and the code is written in C. The out-of-sample period starts on 2 January 2009 and contains 3021 days. We consider both SRV and RK discussed in Section 2 as volatility measures. The priors applied to the semiparametric RSV models are $\mu_j \sim N(0, 0.1)$, $\lambda_j^2 \sim \text{IG}(10/2, 10/2)$, $\xi_j \sim N(0, 1)$, $\sigma_{u,j}^2 \sim \text{IG}(10/2, 2/2)$ for $j = 1, \ldots, K$, $\rho_i \sim N(0, 100)$ for $i = 0$ or 1, $\sigma_v^2 \sim \text{IG}(10/2, 0.5/2)$ and $\alpha \sim \text{Gamma}(2, 8)$. The benchmark models have the same priors for parameters $\mu$, $\rho_0$, $\rho_1$, $\sigma_v^2$, $\xi$ and $\sigma_u^2$.

4.1. Parameter Estimates

Table 2 reports posterior estimates for the RSV-DPM, RSV-DPM-ind, RSV and SV-DPM models. Among the three models incorporating RV measures, the two semiparametric RSV models capture stronger volatility persistence than the conventional RSV model. The posterior means of $\rho_1$ in the RSV-DPM model for DIS, IBM, SPY, and SKHY are all higher than 0.96, while the RSV model reports $\rho_1$ values of 0.861, 0.887, 0.940, and 0.933 in the four cases. The variance estimates of log volatility in the proposed models are lower than those in the benchmark RSV model. For example, in the DIS application, the posterior mean of $\sigma_v^2$ in RSV-DPM is 0.0325, but the RSV model estimates $\sigma_v^2$ as 0.1922. To mitigate the gap between return variance and RV, the RSV model uses a positive bias correction term $\xi$ for DIS and IBM and a negative $\xi$ for SPY and SKHY. The sign of $\xi$ is consistent with the relationship between return variance and RV averages observed in Table 1. The semiparametric RSV models, in contrast, adjust the estimation bias more flexibly via a time-varying correction term. Figure 3 plots $E(\xi_t \mid y_{1:T})$ from the RSV-DPM model in the DIS case. $\xi_t$ is on average positive but varies substantially from large values to even negative values. In addition, the proposed models report lower posterior standard deviations of $\rho_1$ and $\sigma_v^2$ than the RSV model, which suggests that the Bayesian nonparametric extension improves the precision of latent volatility parameter estimation. The results are based on ex-post volatility measured by SRV, and using RK leads to similar results.
A comparison of the estimation results of the three semiparametric models shows that the RSV-DPM-ind model requires fewer mixtures to fit the asymmetric and leptokurtic properties of returns, whereas the SV-DPM model has to rely on additional mixtures. For example, the mixture for return distributions in the RSV-DPM-ind model contains on average 3.05, 2.89, 1.88 and 2.15 Gaussian distributions in the four applications, whereas the average numbers of clusters in the SV-DPM model are 5.51, 3.56, 8.34, and 3.59.
As shown in the top panel of Figure 4, the latent volatility estimates $E(h_t \mid y_{1:T})$ of the RSV model are noisier than those of the RSV-DPM model, which is consistent with the weaker volatility persistence and higher logRV volatility shown in Table 2. Such a result indicates that without non-Gaussian innovation terms, volatility fluctuates more to accommodate the return distribution, which sacrifices the time-dependent nature of volatility. The bottom panel of Figure 4 shows that the RSV-DPM model captures time-series volatility dynamics similar to those of the SV-DPM model.

4.2. Density Forecasts

The top panel of Figure 5 plots the predictive return density p ( r t + 1 | y 1 : t ) of DIS on 31 December 2020, from the RSV and RSV-DPM models. The bottom panel of Figure 5 provides the log predictive density of returns to more clearly visualize the tail pattern. The predictive return density under the RSV-DPM model is asymmetric and heavy-tailed, in contrast to the Gaussian density under the RSV model. The predictive density of logRV and its log density can be found in Figure 6. The RSV-DPM model fits logRV using a skewed density with a heavy right tail, which is not achievable with the Gaussian assumption in the benchmark RSV model.
Table 3 reports the log predictive likelihoods of returns at three forecast horizons and log predictive Bayes factors against the SV model. In all four asset cases, either the semiparametric SV or RSV model outperforms the basic SV model in terms of return density forecasting. While the parametric RSV model fails to provide better density forecasts of DIS and IBM returns than the SV-DPM model, the semiparametric versions of the RSV model improve return density forecasts significantly in all four asset cases, compared with the benchmarks. For example, based on SRV measures, the log predictive Bayes factors of RSV-DPM relative to RSV are 60.8, 96.5, 16.8, and 7.3 in DIS, IBM, SPY, and SKHY applications, respectively. Density forecasts of DIS, SPY, and SKHY returns suggest that RSV-DPM is the top-performing model, and the IBM result favors the RSV-DPM-ind model slightly more. The log predictive Bayes factors between the RSV-DPM and RSV model in DIS and IBM cases are larger than those in the other two cases, which suggests the proposed extension offers more improvement when logRV data is more skewed and has heavier tails. The RSV-DPM-ret model, which only extends the return innovation distribution to DPM, provides very small forecast improvements compared with the benchmark RSV model. The results among the three semiparametric RSV models show that the relaxation of assumptions about logRV distribution and RV bias are the main drivers for improving return density forecasts. The upward slope cumulative log predictive Bayes factors shown in Figure 7 confirm that the RSV-DPM model consistently offers density forecast improvement over the RSV model in subsample periods. Table 3 also shows that the proposed models have improved density forecasts of returns over five and ten days in most of the cases considered and the ranking of models is robust to the choice of RV measures.
Table 4 summarizes the logRV density forecast results in the four asset applications. The RSV-DPM and RSV-DPM-ind models offer more accurate logRV density forecasts than the benchmark RSV model. The one-period-ahead log predictive likelihoods of both RSV-DPM and RSV-DPM-ind are approximately 300, 500, 30, and 90 units higher than those of the RSV model in the DIS, IBM, SPY, and SKHY cases, respectively. The cumulative log predictive Bayes factors shown in Figure 8 confirm that the logRV density forecast results are robust to subsamples. The RSV-DPM-ind model, which assumes that returns and logRV are governed by two independent DPMs, offers the best one-period-ahead density forecast of logRV. The more parsimonious RSV-DPM model yields better density forecasts of logRV over longer horizons.
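As a rough illustration of how the quantities in Tables 3 and 4 relate to each other, the sketch below sums per-period log predictive densities into a log predictive likelihood and takes the difference between two models as a log predictive Bayes factor. The running sum of the differences gives the kind of cumulative path plotted in Figures 7 and 8. The function names and the input numbers are hypothetical; in the paper the per-period densities come from the MCMC output of each model, which is not reproduced here.

```python
import numpy as np


def log_predictive_likelihood(log_pred_dens):
    """Sum of per-period log predictive densities log p(y_{t+1} | y_{1:t})."""
    return float(np.sum(log_pred_dens))


def cumulative_log_bayes_factor(log_pred_a, log_pred_b):
    """Running sum of log p_A - log p_B over the out-of-sample period."""
    diff = np.asarray(log_pred_a, dtype=float) - np.asarray(log_pred_b, dtype=float)
    return np.cumsum(diff)


# Hypothetical per-period log predictive densities (illustrative values only,
# not taken from the paper).
lp_model_a = np.array([-1.20, -0.95, -1.60, -1.10])   # e.g. a semiparametric model
lp_model_b = np.array([-1.35, -1.00, -2.10, -1.05])   # e.g. a parametric benchmark

lpl_a = log_predictive_likelihood(lp_model_a)
lpl_b = log_predictive_likelihood(lp_model_b)
log_bf = lpl_a - lpl_b                                 # log predictive Bayes factor
path = cumulative_log_bayes_factor(lp_model_a, lp_model_b)

print(f"LPL A = {lpl_a:.2f}, LPL B = {lpl_b:.2f}, log BF = {log_bf:.2f}")
```

A positive log Bayes factor favors the first model, and an upward-sloping cumulative path indicates that the gain accumulates steadily rather than coming from a few observations.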

5. Conclusions

This paper contributes to the SV modeling literature by integrating the RSV model with a Bayesian nonparametric framework. This extension benefits volatility modeling in two respects. First, the Bayesian nonparametric mixture fits the empirical distributions of returns and logRV better than Gaussian densities do. Second, the information in high-frequency volatility measures can be better utilized by allowing a more flexible RV bias adjustment.
Applications to DIS, IBM, SPY, and SKHY data compare the proposed model with benchmarks including RSV, SV-DPM, SV-t, SV, and GARCH. The semiparametric RSV model significantly improves density forecasts of returns and logRV, especially for asset data that are more strongly asymmetric and leptokurtic. Compared with the parametric RSV model, the proposed model yields a less noisy and more persistent latent volatility series.
The forecasting improvements mainly originate from the generalization of the logRV framework. The empirical results of this paper are consistent with the finding of Corsi et al. (2008) that non-Gaussian densities better characterize logRV. Another suggestion from this study is that the RV estimation bias may not remain constant, and that it is beneficial to filter out the time-varying bias flexibly. One potential direction for future research is to adopt a time-dependent mixture, so that the volatility of logRV and the bias adjustment parameter are related to their past values and are allowed to cluster. Another is to explore whether the information in realized quarticity could be incorporated to improve the characterization of the volatility of logRV.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study were obtained from Tick Data (https://www.tickdata.com/, accessed on 15 August 2021).

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Parameter estimation for the RSV-DPM model consists of the following steps.
1.
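$\mu_j \mid r_{1:T}, h_{1:T}, s_{1:T}, \lambda_j^2$ for $j = 1, \ldots, K$.
Given the prior $\mu_j \sim N(m_\mu, v_\mu^2)$, the conditional posterior of $\mu_j$ is given as
$$p(\mu_j \mid \cdot) \propto \exp\left(-\sum_{s_t = j} \frac{(r_t - \mu_j)^2}{2\lambda_j^2 \exp(h_t)}\right) \exp\left(-\frac{(\mu_j - m_\mu)^2}{2 v_\mu^2}\right) \propto N(\bar{m}_\mu, \bar{v}_\mu^2)$$
where
$$\bar{v}_\mu^2 = \left(\sum_{s_t = j} \frac{1}{\lambda_j^2 \exp(h_t)} + \frac{1}{v_\mu^2}\right)^{-1}, \qquad \bar{m}_\mu = \bar{v}_\mu^2 \left(\sum_{s_t = j} \frac{r_t}{\lambda_j^2 \exp(h_t)} + \frac{m_\mu}{v_\mu^2}\right).$$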
2.
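$\lambda_j^2 \mid r_{1:T}, h_{1:T}, s_{1:T}, \mu_j$.
Given the prior $\lambda_j^2 \sim \mathrm{IG}\left(\frac{v_0}{2}, \frac{s_0}{2}\right)$, the conditional posterior of $\lambda_j^2$ is given as
$$p(\lambda_j^2 \mid \cdot) \propto (\lambda_j^2)^{-\frac{n_j}{2}} \exp\left(-\sum_{s_t = j} \frac{(r_t - \mu_j)^2}{2 \exp(h_t) \lambda_j^2}\right) (\lambda_j^2)^{-\frac{v_0}{2} - 1} \exp\left(-\frac{s_0}{2\lambda_j^2}\right) \propto \mathrm{IG}\left(\frac{v_0 + n_j}{2},\ \sum_{s_t = j} \frac{(r_t - \mu_j)^2}{2 \exp(h_t)} + \frac{s_0}{2}\right)$$
where $n_j = \sum_{t=1}^{T} \mathbb{1}(s_t = j)$.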
3.
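$\xi_j \mid \log RV_{1:T}, h_{1:T}, s_{1:T}, \sigma_{u,j}^2$.
Given the prior $\xi_j \sim N(m_\xi, v_\xi^2)$, the conditional posterior of $\xi_j$ is given as
$$p(\xi_j \mid \cdot) \propto \exp\left(-\sum_{s_t = j} \frac{(\log RV_t - h_t - \xi_j)^2}{2\sigma_{u,j}^2}\right) \exp\left(-\frac{(\xi_j - m_\xi)^2}{2 v_\xi^2}\right) \propto N(\bar{m}_\xi, \bar{v}_\xi^2)$$
where
$$\bar{m}_\xi = \frac{v_\xi^2 \sum_{s_t = j} (\log RV_t - h_t) + m_\xi \sigma_{u,j}^2}{n_j v_\xi^2 + \sigma_{u,j}^2}, \qquad \bar{v}_\xi^2 = \frac{v_\xi^2 \sigma_{u,j}^2}{n_j v_\xi^2 + \sigma_{u,j}^2}.$$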
4.
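$\sigma_{u,j}^2 \mid \log RV_{1:T}, h_{1:T}, s_{1:T}, \xi_j$.
Given the prior $\sigma_{u,j}^2 \sim \mathrm{IG}\left(\frac{v_{0,u}}{2}, \frac{s_{0,u}}{2}\right)$, the conditional posterior of $\sigma_{u,j}^2$ is given as
$$p(\sigma_{u,j}^2 \mid \cdot) \propto (\sigma_{u,j}^2)^{-\frac{n_j}{2}} \exp\left(-\sum_{s_t = j} \frac{(\log RV_t - h_t - \xi_j)^2}{2\sigma_{u,j}^2}\right) (\sigma_{u,j}^2)^{-\frac{v_{0,u}}{2} - 1} \exp\left(-\frac{s_{0,u}}{2\sigma_{u,j}^2}\right) \propto \mathrm{IG}\left(\frac{v_{0,u} + n_j}{2},\ \sum_{s_t = j} \frac{(\log RV_t - h_t - \xi_j)^2}{2} + \frac{s_{0,u}}{2}\right).$$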
5.
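$\rho_0, \rho_1 \mid h_{1:T}, \sigma_v^2$.
Let $\rho = (\rho_0, \rho_1)'$, $Y = (h_1, h_2, \ldots, h_T)'$ and $X = (X_1', X_2', \ldots, X_T')'$, where $X_t = [1, h_{t-1}]$. Given the prior $\rho \sim N(M, V)$, the conditional posterior of $\rho$ is given as
$$p(\rho \mid \cdot) \propto \exp\left(-\frac{1}{2\sigma_v^2} (Y - X\rho)'(Y - X\rho)\right) \exp\left(-\frac{1}{2} (\rho - M)' V^{-1} (\rho - M)\right) \propto \exp\left(-\frac{1}{2} (\rho - \bar{M})' \bar{V}^{-1} (\rho - \bar{M})\right) \propto N(\bar{M}, \bar{V})$$
where
$$\bar{M} = \bar{V} \left(V^{-1} M + \frac{1}{\sigma_v^2} X'Y\right), \qquad \bar{V} = \left(V^{-1} + \frac{1}{\sigma_v^2} X'X\right)^{-1}.$$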
6.
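$\sigma_v^2 \mid h_{1:T}, \rho_0, \rho_1$.
Given the prior $\sigma_v^2 \sim \mathrm{IG}\left(\frac{v_{0,v}}{2}, \frac{s_{0,v}}{2}\right)$, the conditional posterior of $\sigma_v^2$ is given as
$$p(\sigma_v^2 \mid \cdot) \propto \mathrm{IG}\left(\frac{T + v_{0,v}}{2},\ \frac{s_{0,v} + \sum_{t=1}^{T} [h_{t+1} - \rho_0 - \rho_1 h_t]^2}{2}\right).$$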
7.
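$h_t \mid r_t, \log RV_t, h_{t-1}, h_{t+1}, \{\phi_j\}_{j=1}^{K}, \psi$ for $t = 1, 2, \ldots, T$.
The conditional posterior of $h_t$ is given as
$$\begin{aligned}
p(h_t \mid y_t, h_{-t}, \theta) &\propto p(r_t \mid h_t, \{\phi_j\}_{j=1}^{K})\, p(\log RV_t \mid h_t, \{\phi_j\}_{j=1}^{K})\, p(h_{t+1} \mid h_t, \psi)\, p(h_t \mid h_{t-1}, \psi) \\
&\propto \frac{1}{\exp(h_t/2)} \exp\left(-\frac{(r_t - \mu_{s_t})^2}{2\lambda_{s_t}^2 \exp(h_t)}\right) \exp\left(-\frac{(\log RV_t - h_t - \xi_{s_t})^2}{2\sigma_{u,s_t}^2}\right) \\
&\quad \cdot \exp\left(-\frac{(h_{t+1} - \rho_0 - \rho_1 h_t)^2}{2\sigma_v^2}\right) \exp\left(-\frac{(h_t - \rho_0 - \rho_1 h_{t-1})^2}{2\sigma_v^2}\right) \\
&\propto \frac{1}{\exp(h_t/2)} \exp\left(-\frac{(r_t - \mu_{s_t})^2}{2\lambda_{s_t}^2 \exp(h_t)}\right) \exp\left(-\frac{(\log RV_t - h_t - \xi_{s_t})^2}{2\sigma_{u,s_t}^2}\right) \exp\left(-\frac{(h_t - \mu_h^{*})^2}{2\sigma_h^{*2}}\right) \\
&\propto \exp\left(-\frac{h_t}{2} - \frac{1}{2} \exp(-h_t) \frac{(r_t - \mu_{s_t})^2}{\lambda_{s_t}^2}\right) \exp\left(-\frac{(h_t - \mu_h^{**})^2}{2\sigma_h^{**2}}\right)
\end{aligned}$$
where
$$\mu_h^{*} = \frac{\rho_0 (1 - \rho_1) + \rho_1 (h_{t-1} + h_{t+1})}{1 + \rho_1^2}, \qquad \sigma_h^{*2} = \frac{\sigma_v^2}{1 + \rho_1^2}$$
and
$$\mu_h^{**} = \frac{(\log RV_t - \xi_{s_t}) \sigma_h^{*2} + \mu_h^{*} \sigma_{u,s_t}^2}{\sigma_h^{*2} + \sigma_{u,s_t}^2}, \qquad \sigma_h^{**2} = \frac{\sigma_h^{*2} \sigma_{u,s_t}^2}{\sigma_h^{*2} + \sigma_{u,s_t}^2}.$$
Kim et al. (1998) show that
$$\begin{aligned}
&\exp\left(-\frac{h_t}{2} - \frac{1}{2} \exp(-h_t) \frac{(r_t - \mu_{s_t})^2}{\lambda_{s_t}^2}\right) \exp\left(-\frac{(h_t - \mu_h^{**})^2}{2\sigma_h^{**2}}\right) \\
&\quad \le \exp\left(-\frac{h_t}{2} - \frac{1}{2} \exp(-\mu_h^{**}) \frac{(r_t - \mu_{s_t})^2}{\lambda_{s_t}^2} (1 + \mu_h^{**} - h_t)\right) \exp\left(-\frac{(h_t - \mu_h^{**})^2}{2\sigma_h^{**2}}\right) \\
&\quad \propto \exp\left(-\frac{(h_t - \mu_h^{***})^2}{2\sigma_h^{**2}}\right) \propto N(\mu_h^{***}, \sigma_h^{**2}) \equiv f(h_t \mid \cdot)
\end{aligned}$$
where
$$\mu_h^{***} = \mu_h^{**} + \frac{\sigma_h^{**2}}{2} \left[\left(\frac{r_t - \mu_{s_t}}{\lambda_{s_t}}\right)^2 \exp(-\mu_h^{**}) - 1\right]$$
and $f(h_t \mid \cdot)$ is the proposal distribution for drawing $h_t$. A new value $h_t' \sim f(h_t \mid \cdot)$ is accepted with probability
$$\min\left\{1,\ \frac{f(h_t)\, p(h_t' \mid y_t, h_{t-1}, h_{t+1}, \theta)}{f(h_t')\, p(h_t \mid y_t, h_{t-1}, h_{t+1}, \theta)}\right\}.$$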
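The sampling steps above can be sketched in a few lines of Python. The block below is a minimal illustration, not the author's implementation: it covers only the conjugate normal update for a cluster mean (step 1), the inverse-gamma draw for a cluster scale (step 2), and one independence Metropolis-Hastings update of a single log volatility (step 7). All function and variable names are hypothetical, each function is assumed to receive only the observations allocated to one cluster, and the prior settings and inputs in the demo are arbitrary.

```python
import numpy as np


def mu_posterior(r, h, lam2, m_mu, v2_mu):
    """Step 1: posterior mean and variance of a cluster mean mu_j."""
    w = 1.0 / (lam2 * np.exp(h))             # precision of each return in the cluster
    v_bar = 1.0 / (w.sum() + 1.0 / v2_mu)
    m_bar = v_bar * ((w * r).sum() + m_mu / v2_mu)
    return m_bar, v_bar


def draw_lam2(r, h, mu, v0, s0, rng):
    """Step 2: draw lambda_j^2 from IG((v0 + n_j)/2, (sum e_t^2 + s0)/2)."""
    shape = 0.5 * (v0 + r.size)
    scale = 0.5 * (np.sum((r - mu) ** 2 / np.exp(h)) + s0)
    return scale / rng.gamma(shape)           # scale / Gamma(shape, 1) is IG(shape, scale)


def log_target_h(x, r_t, lrv_t, mu, lam2, xi, sig2_u, mu_s, sig2_s):
    """Log kernel of p(h_t | ...): return, logRV, and AR(1) neighbour terms."""
    return (-0.5 * x
            - 0.5 * (r_t - mu) ** 2 / (lam2 * np.exp(x))
            - 0.5 * (lrv_t - x - xi) ** 2 / sig2_u
            - 0.5 * (x - mu_s) ** 2 / sig2_s)


def mh_step_h(h_t, h_prev, h_next, r_t, lrv_t, mu, lam2, xi, sig2_u,
              rho0, rho1, sig2_v, rng):
    """Step 7: one independence Metropolis-Hastings update of h_t."""
    mu_s = (rho0 * (1.0 - rho1) + rho1 * (h_prev + h_next)) / (1.0 + rho1 ** 2)
    sig2_s = sig2_v / (1.0 + rho1 ** 2)
    # Combine the AR(1) neighbour term with the Gaussian logRV term.
    mu_ss = ((lrv_t - xi) * sig2_s + mu_s * sig2_u) / (sig2_s + sig2_u)
    sig2_ss = sig2_s * sig2_u / (sig2_s + sig2_u)
    # Proposal mean obtained from the linear bound on exp(-h_t).
    mu_sss = mu_ss + 0.5 * sig2_ss * ((r_t - mu) ** 2 / lam2 * np.exp(-mu_ss) - 1.0)
    prop = rng.normal(mu_sss, np.sqrt(sig2_ss))

    def log_f(x):
        return -0.5 * (x - mu_sss) ** 2 / sig2_ss

    args = (r_t, lrv_t, mu, lam2, xi, sig2_u, mu_s, sig2_s)
    log_ratio = (log_target_h(prop, *args) + log_f(h_t)
                 - log_target_h(h_t, *args) - log_f(prop))
    if np.log(rng.uniform()) < log_ratio:
        return prop, True
    return h_t, False


# Demo with arbitrary inputs (not data from the paper).
rng = np.random.default_rng(0)
r = np.full(10_000, 2.0)
h = np.zeros(10_000)
m_bar, v_bar = mu_posterior(r, h, lam2=1.0, m_mu=0.0, v2_mu=100.0)
lam2_draw = draw_lam2(r[:50], h[:50], mu=1.5, v0=5.0, s0=1.0, rng=rng)
h_new, accepted = mh_step_h(0.1, 0.0, 0.2, r_t=0.5, lrv_t=0.3, mu=0.0, lam2=1.0,
                            xi=0.0, sig2_u=0.2, rho0=0.0, rho1=0.97,
                            sig2_v=0.05, rng=rng)
```

Working with log densities keeps the acceptance ratio numerically stable. The updates for the logRV bias and variance parameters (steps 3 and 4) have the same conjugate form as steps 1 and 2, with `log RV_t - h_t` in place of the scaled returns.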

Notes

1
Under the previous-tick scheme, $\tilde{p}_{t,i\Delta}$ is the last price observed before time $i\Delta$.
2
The Parzen kernel function is given as
$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 1/2 \\ 2(1 - x)^3, & 1/2 < x \le 1 \\ 0, & x > 1. \end{cases}$$
3
$RV_{\text{dense},t}$ is calculated using high-frequency returns sampled, for example, every $q$ trades, and $n_{\text{dense},t}$ is the number of nonzero returns. I set $q = 5$.
4
https://www.tickdata.com/, accessed on 15 August 2021.
5
The kurtosis measure is calculated using the formula $K = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{x_i - \hat{\mu}}{\hat{\sigma}}\right)^4$.
6
The innovation term $\epsilon_t \sim t(\nu)$, where $\nu$ is the degrees of freedom.
7
To better visualize the difference in bias correction between the RSV-DPM and RSV models, we set the scaling parameter $\lambda_j^2$ to 1 so that the two models have the same setting for the return variance.
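For reference, the Parzen kernel in note 2 and the kurtosis measure in note 5 can be written as short Python functions. This is a minimal sketch under two assumptions that the notes do not state: the kernel is evaluated only at non-negative arguments, and the standard deviation in the kurtosis formula is the population version (`ddof=0`). The function names are illustrative.

```python
import numpy as np


def parzen_kernel(x):
    """Parzen kernel k(x) for x >= 0 (scalar or array input)."""
    x = np.asarray(x, dtype=float)
    inner = 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3   # branch for 0 <= x <= 1/2
    outer = 2.0 * (1.0 - x) ** 3                # branch for 1/2 < x <= 1
    return np.where(x <= 0.5, inner, np.where(x <= 1.0, outer, 0.0))


def kurtosis(x, ddof=0):
    """Kurtosis K = mean(((x - mean) / std)^4); ddof=0 is an assumption."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=ddof)
    return float(np.mean(z ** 4))


weights = parzen_kernel(np.array([0.0, 0.25, 0.5, 0.75, 1.0, 2.0]))
k_demo = kurtosis([-1.0, 1.0, -1.0, 1.0])
```

The two polynomial branches meet at `x = 1/2`, where both equal 0.25, so the kernel is continuous and decays smoothly to zero at `x = 1`.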

References

  1. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Heiko Ebens. 2001. The distribution of realized stock return volatility. Journal of Financial Economics 61: 43–76.
  2. Asai, Manabu, Chia-Lin Chang, and Michael McAleer. 2017. Realized stochastic volatility with general asymmetry and long memory. Journal of Econometrics 199: 202–12.
  3. Bandi, Federico M., and Jeffrey R. Russell. 2008. Microstructure noise, realized variance, and optimal sampling. The Review of Economic Studies 75: 339–69.
  4. Bandi, Federico M., Jeffrey R. Russell, and Chen Yang. 2013. Realized volatility forecasting in the presence of time-varying noise. Journal of Business & Economic Statistics 31: 331–45.
  5. Barndorff-Nielsen, Ole E. 1997. Normal inverse Gaussian distributions and stochastic volatility modelling. Scandinavian Journal of Statistics 24: 1–13.
  6. Barndorff-Nielsen, Ole E., and Neil Shephard. 2002. Estimating quadratic variation using realized variance. Journal of Applied Econometrics 17: 457–77.
  7. Barndorff-Nielsen, Ole E., and Neil Shephard. 2004. Power and Bipower Variation with Stochastic Volatility and Jumps. Journal of Financial Econometrics 2: 1–37.
  8. Barndorff-Nielsen, Ole E., Peter Reinhard Hansen, Asger Lunde, and Neil Shephard. 2008. Designing realized kernels to measure the ex post variation of equity prices in the presence of noise. Econometrica 76: 1481–536.
  9. Barndorff-Nielsen, Ole E., Peter Reinhard Hansen, Asger Lunde, and Neil Shephard. 2009. Realized kernels in practice: Trades and quotes. Econometrics Journal 12: C1–C32.
  10. Bollerslev, Tim. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31: 307–27.
  11. Chib, Siddhartha, Federico Nardari, and Neil Shephard. 2002. Markov chain Monte Carlo methods for stochastic volatility models. Journal of Econometrics 108: 281–316.
  12. Christensen, Kim, and Mark Podolskij. 2007. Realized range-based estimation of integrated variance. Journal of Econometrics 141: 323–49.
  13. Corsi, Fulvio, Stefan Mittnik, Christian Pigorsch, and Uta Pigorsch. 2008. The volatility of realized volatility. Econometric Reviews 27: 46–78.
  14. Delatola, Eleni-Ioanna, and Jim E. Griffin. 2013. A Bayesian semiparametric model for volatility with a leverage effect. Computational Statistics & Data Analysis 60: 97–110.
  15. Escobar, Michael D., and Mike West. 1994. Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association 90: 577–88.
  16. Ferguson, Thomas S. 1973. A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1: 209–30.
  17. Hansen, Peter Reinhard, and Asger Lunde. 2005. A Realized Variance for the Whole Day Based on Intermittent High-Frequency Data. Journal of Financial Econometrics 3: 525–54.
  18. Hansen, Peter Reinhard, Zhuo Huang, and Howard Howan Shek. 2012. Realized GARCH: A joint model for returns and realized measures of volatility. Journal of Applied Econometrics 27: 877–906.
  19. Huang, Darien, Christian Schlag, Ivan Shaliastovich, and Julian Thimme. 2019. Volatility-of-volatility risk. Journal of Financial and Quantitative Analysis 54: 2423–52.
  20. Jacod, Jean, Yingying Li, Per A. Mykland, Mark Podolskij, and Mathias Vetter. 2009. Microstructure noise in the continuous case: The pre-averaging approach. Stochastic Processes and their Applications 119: 2249–76.
  21. Jensen, Mark J., and John M. Maheu. 2010. Bayesian semiparametric stochastic volatility modeling. Journal of Econometrics 157: 306–16.
  22. Jensen, Mark J., and John M. Maheu. 2014. Estimating a semiparametric asymmetric stochastic volatility model with a Dirichlet process mixture. Journal of Econometrics 178: 523–38.
  23. Kalli, Maria, Jim E. Griffin, and Stephen G. Walker. 2011. Slice sampling mixture models. Statistics and Computing 21: 93–105.
  24. Kim, Sangjoon, Neil Shephard, and Siddhartha Chib. 1998. Stochastic volatility: Likelihood inference and comparison with ARCH models. The Review of Economic Studies 65: 361–93.
  25. Liu, Jia, and John M. Maheu. 2018. Improving Markov switching models using realized variance. Journal of Applied Econometrics 33: 297–318.
  26. Maheu, John, and Thomas McCurdy. 2011. Do high-frequency measures of volatility improve forecasts of returns distributions? Journal of Econometrics 160: 69–76.
  27. Mahieu, Ronald, and Peter C. Schotman. 1998. An empirical application of stochastic volatility models. Journal of Applied Econometrics 13: 333–60.
  28. Nakajima, Jouchi, and Yasuhiro Omori. 2012. Stochastic volatility model with leverage and asymmetrically heavy-tailed error using GH skew Student’s t-distribution. Computational Statistics & Data Analysis 56: 3690–704.
  29. Sandmann, Gleb, and Siem Jan Koopman. 1998. Estimation of stochastic volatility models via Monte Carlo maximum likelihood. Journal of Econometrics 87: 271–301.
  30. Sethuraman, Jayaram. 1994. A constructive definition of Dirichlet priors. Statistica Sinica 4: 639–50.
  31. Shephard, Neil, and Kevin Sheppard. 2010. Realising the future: Forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics 25: 197–231.
  32. Shirota, Shinichiro, Takayuki Hizu, and Yasuhiro Omori. 2014. Realized stochastic volatility with leverage and long memory. Computational Statistics and Data Analysis 76: 618–41.
  33. Takahashi, Makoto, Yasuhiro Omori, and Toshiaki Watanabe. 2009. Estimating stochastic volatility models using daily returns and realized volatility simultaneously. Computational Statistics & Data Analysis 53: 2404–26.
  34. Taylor, Stephen J. 1986. Modelling Financial Time Series. Chichester: John Wiley.
  35. Virbickaitė, Audronė, and Hedibert F. Lopes. 2019. Bayesian semiparametric Markov switching stochastic volatility model. Applied Stochastic Models in Business and Industry 35: 978–97.
  36. Walker, Stephen G. 2007. Sampling the Dirichlet mixture model with slices. Communications in Statistics—Simulation and Computation 36: 45–54.
  37. Xiu, Dacheng. 2010. Quasi-maximum likelihood estimation of volatility with high frequency data. Journal of Econometrics 159: 235–50.
  38. Yu, Jun. 2012. A semiparametric stochastic volatility model. Journal of Econometrics 167: 473–82.
  39. Zhang, Lan, Per A. Mykland, and Yacine Aït-Sahalia. 2005. A tale of two time scales: Determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association 100: 1394–411.
Figure 1. Daily returns and logarithmic realized volatility measures of DIS.
Figure 2. Ratios between return variance and RV average of DIS in a 100-day rolling window.
Figure 3. Posterior mean of $\xi_t$ of DIS from the RSV-DPM model.
Figure 4. Posterior mean of log volatilities $h_t$ of DIS.
Figure 5. (Top): predictive return density $p(r_{t+1} \mid y_{1:t})$; (Bottom): log predictive return density $\log p(r_{t+1} \mid y_{1:t})$ of the RSV-DPM and RSV models.
Figure 6. (Top): Predictive logRV density $p(\log(RV_{t+1}) \mid y_{1:t})$; (Bottom): Log predictive logRV density $\log p(\log(RV_{t+1}) \mid y_{1:t})$ of the RSV-DPM and RSV models.
Figure 7. Cumulative log predictive Bayes factors for semiparametric RSV models vs the RSV model (from top to bottom: DIS, IBM, SPY and SKHY).
Figure 8. Cumulative log predictive Bayes factors for logRV between RSV-DPM and RSV models, $\sum_{t=t_0}^{T-1} \log\left(p(\log RV_{t+1} \mid y_{1:t}, \text{RSV-DPM}) / p(\log RV_{t+1} \mid y_{1:t}, \text{RSV})\right)$.
Table 1. Descriptive statistics for returns and volatility measures of DIS, IBM, SPY and SKHY.

| Data | Mean | | St. Dev. | Skewness | Kurtosis | Min | Max |
|---|---|---|---|---|---|---|---|
| Panel A: DIS | | | | | | | |
| $r_t$ | 0.048 | 0.045 | 2.922 | 0.378 | 17.063 | −13.908 | 14.818 |
| $SRV_t$ | 3.022 | 1.255 | 74.446 | 12.149 | 214.107 | 0.101 | 222.816 |
| $RK_t$ | 3.076 | 1.281 | 80.633 | 12.779 | 234.898 | 0.124 | 222.505 |
| $\log(SRV_t)$ | 0.372 | 0.227 | 0.982 | 1.023 | 7.823 | −2.296 | 5.406 |
| $\log(RK_t)$ | 0.401 | 0.248 | 0.943 | 1.100 | 8.026 | −2.089 | 5.405 |
| Panel B: IBM | | | | | | | |
| $r_t$ | 0.007 | 0.025 | 2.055 | −0.378 | 14.591 | −13.755 | 10.899 |
| $SRV_t$ | 2.232 | 0.952 | 34.375 | 9.621 | 142.584 | 0.096 | 138.501 |
| $RK_t$ | 2.296 | 0.969 | 36.599 | 9.258 | 126.467 | 0.113 | 128.970 |
| $\log(SRV_t)$ | 0.094 | −0.049 | 0.911 | 1.207 | 8.340 | −2.340 | 4.931 |
| $\log(RK_t)$ | 0.125 | −0.032 | 0.890 | 1.284 | 8.555 | −2.180 | 4.860 |
| Panel C: SPY | | | | | | | |
| $r_t$ | 0.028 | 0.065 | 1.476 | −0.389 | 21.891 | −11.589 | 13.558 |
| $SRV_t$ | 1.377 | 0.457 | 22.279 | 15.417 | 352.039 | 0.013 | 148.459 |
| $RK_t$ | 1.405 | 0.466 | 22.680 | 15.250 | 343.385 | 0.017 | 147.030 |
| $\log(SRV_t)$ | −0.630 | −0.782 | 1.367 | 0.713 | 7.048 | −4.305 | 5.000 |
| $\log(RK_t)$ | −0.595 | −0.764 | 1.331 | 0.757 | 7.104 | −4.079 | 4.991 |
| Panel D: SKHY | | | | | | | |
| $r_t$ | 0.097 | 0.000 | 6.586 | 0.237 | 5.504 | −13.062 | 13.958 |
| $SRV_t$ | 6.559 | 4.348 | 54.981 | 4.240 | 28.222 | 0.589 | 87.957 |
| $RK_t$ | 6.294 | 4.071 | 55.954 | 4.165 | 26.820 | 0.338 | 85.496 |
| $\log(SRV_t)$ | 1.559 | 1.470 | 0.536 | 0.727 | 6.758 | −0.529 | 4.477 |
| $\log(RK_t)$ | 1.481 | 1.404 | 0.606 | 0.616 | 6.690 | −1.084 | 4.448 |

This table reports the summary statistics of daily returns ($r_t$), subsampled realized variance ($SRV_t$), realized kernel ($RK_t$) and log volatility measures of DIS, IBM, SPY and SKHY.
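Table 2. Posterior estimates of the RSV-DPM, RSV-DPM-ind, RSV and SV-DPM models.

| | RSV-DPM Mean | RSV-DPM St. Dev. | RSV-DPM-ind Mean | RSV-DPM-ind St. Dev. | RSV Mean | RSV St. Dev. | SV-DPM Mean | SV-DPM St. Dev. |
|---|---|---|---|---|---|---|---|---|
| Panel A: DIS | | | | | | | | |
| $\mu$ | | | | | 0.0480 | 0.0139 | | |
| $\xi$ | | | | | 0.0538 | 0.0240 | | |
| $\sigma_u^2$ | | | | | 0.1934 | 0.0124 | | |
| $\rho_0$ | 0.0120 | 0.0042 | 0.0607 | 0.0139 | 0.0055 | 0.0074 | 0.0023 | 0.0039 |
| $\rho_1$ | 0.9711 | 0.0041 | 0.9424 | 0.0072 | 0.8615 | 0.0147 | 0.9704 | 0.0057 |
| $\sigma_v^2$ | 0.0325 | 0.0026 | 0.0816 | 0.0080 | 0.1922 | 0.0192 | 0.0461 | 0.0095 |
| $K$ | 6.2806 | 1.3239 | 3.0540 | 1.0079 | | | 5.5072 | 1.4271 |
| $\alpha$ | 0.4478 | 0.1878 | 0.2448 | 0.1379 | | | 0.3975 | 0.1857 |
| $K_2$ | | | 3.9734 | 1.1658 | | | | |
| $\alpha_2$ | | | 0.2988 | 0.1567 | | | | |
| Panel B: IBM | | | | | | | | |
| $\mu$ | | | | | 0.0688 | 0.0154 | | |
| $\xi$ | | | | | 0.0525 | 0.0240 | | |
| $\sigma_u^2$ | | | | | 0.1763 | 0.0102 | | |
| $\rho_0$ | 0.0275 | 0.0056 | 0.0446 | 0.0085 | 0.0360 | 0.0075 | 0.0132 | 0.0042 |
| $\rho_1$ | 0.9662 | 0.0049 | 0.9553 | 0.0058 | 0.8866 | 0.0112 | 0.9795 | 0.0043 |
| $\sigma_v^2$ | 0.0446 | 0.0042 | 0.0535 | 0.0049 | 0.1776 | 0.0152 | 0.0324 | 0.0053 |
| $K$ | 5.1318 | 1.1947 | 2.8922 | 1.2027 | | | 3.5662 | 1.4451 |
| $\alpha$ | 0.3797 | 0.1758 | 0.2363 | 0.1413 | | | 0.2747 | 0.1576 |
| $K_2$ | | | 5.2840 | 2.0734 | | | | |
| $\alpha_2$ | | | 0.3817 | 0.2005 | | | | |
| Panel C: SPY | | | | | | | | |
| $\mu$ | | | | | 0.0937 | 0.0094 | | |
| $\xi$ | | | | | −0.0808 | 0.0230 | | |
| $\sigma_u^2$ | | | | | 0.2176 | 0.0088 | | |
| $\rho_0$ | −0.0337 | 0.0067 | −0.0484 | 0.0124 | −0.0328 | 0.0068 | −0.0177 | 0.0057 |
| $\rho_1$ | 0.9667 | 0.0045 | 0.9429 | 0.0066 | 0.9405 | 0.0065 | 0.9793 | 0.0040 |
| $\sigma_v^2$ | 0.0707 | 0.0056 | 0.1300 | 0.0106 | 0.1357 | 0.0101 | 0.0689 | 0.0090 |
| $K$ | 5.1950 | 1.2383 | 1.8864 | 0.9991 | | | 8.3496 | 3.5307 |
| $\alpha$ | 0.3769 | 0.1723 | 0.1746 | 0.1214 | | | 0.5806 | 0.2952 |
| $K_2$ | | | 3.1826 | 1.1761 | | | | |
| $\alpha_2$ | | | 0.2532 | 0.1422 | | | | |
| Panel D: SKHY | | | | | | | | |
| $\mu$ | | | | | 0.0885 | 0.0384 | | |
| $\xi$ | | | | | −0.0566 | 0.0308 | | |
| $\sigma_u^2$ | | | | | 0.2068 | 0.0077 | | |
| $\rho_0$ | 0.0459 | 0.0087 | 0.0830 | 0.0139 | 0.1074 | 0.0157 | 0.0165 | 0.0053 |
| $\rho_1$ | 0.9688 | 0.0056 | 0.9473 | 0.0083 | 0.9331 | 0.0096 | 0.9676 | 0.0071 |
| $\sigma_v^2$ | 0.0175 | 0.0022 | 0.0328 | 0.0043 | 0.0434 | 0.0055 | 0.0199 | 0.0039 |
| $K$ | 5.9140 | 1.0316 | 2.1472 | 1.2665 | | | 3.5878 | 1.4628 |
| $\alpha$ | 0.4347 | 0.1782 | 0.1937 | 0.1339 | | | 0.2848 | 0.1637 |
| $K_2$ | | | 3.3990 | 1.3635 | | | | |
| $\alpha_2$ | | | 0.2701 | 0.1531 | | | | |

This table reports the posterior means and standard deviations of model parameters in the DIS, IBM, SPY and SKHY applications.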
Table 3. Return density forecasts.

| Model | LPL (h = 1) | LBF (h = 1) | LPL (h = 5) | LBF (h = 5) | LPL (h = 10) | LBF (h = 10) |
|---|---|---|---|---|---|---|
| Panel A: DIS | | | | | | |
| SV | −5174.3 | | −5194.5 | | −5209.1 | |
| SV-t | −5129.4 | 44.9 | −5157.6 | 36.9 | −5175.1 | 33.9 |
| GARCH | −5383.1 | −208.8 | −5333.2 | −138.7 | −5339.8 | −130.8 |
| GARCH-t | −5149.5 | 24.8 | −5177.0 | 17.6 | −5189.3 | 19.7 |
| SV-DPM | −5121.1 | 53.2 | −5150.5 | 44.0 | −5174.1 | 35.0 |
| RSV (SRV) | −5128.8 | 45.5 | −5185.9 | 8.6 | −5238.6 | −29.5 |
| RSV (RK) | −5140.4 | 33.9 | −5188.8 | 5.7 | −5235.8 | −26.7 |
| RSV-DPM-ret (SRV) | −5128.3 | 46.1 | −5184.9 | 9.6 | −5237.7 | −28.7 |
| RSV-DPM-ret (RK) | −5138.7 | 35.6 | −5189.1 | 5.4 | −5234.2 | −25.2 |
| RSV-DPM (SRV) | −5066.8 | 107.5 | −5142.6 | 51.9 | −5185.3 | 23.7 |
| RSV-DPM (RK) | −5070.2 | 104.1 | −5141.4 | 53.1 | −5185.5 | 23.5 |
| RSV-DPM-ind (SRV) | −5073.9 | 100.4 | −5150.9 | 43.6 | −5195.5 | 13.5 |
| RSV-DPM-ind (RK) | −5075.1 | 99.2 | −5152.0 | 42.5 | −5196.7 | 12.3 |
| Panel B: IBM | | | | | | |
| SV | −4851.8 | | −4872.9 | | −4894.2 | |
| SV-t | −4797.7 | 54.1 | −4832.9 | 40.0 | −4860.6 | 33.60 |
| GARCH | −5127.2 | −275.4 | −4993.1 | −120.2 | −5008.6 | −114.43 |
| GARCH-t | −4825.4 | 26.4 | −4851.7 | 21.2 | −4872.6 | 21.53 |
| SV-DPM | −4792.2 | 59.6 | −4834.0 | 38.8 | −4859.8 | 34.40 |
| RSV (SRV) | −4846.2 | 5.6 | −4880.3 | −7.4 | −4913.3 | −19.10 |
| RSV (RK) | −4841.5 | 10.3 | −4880.3 | −7.4 | −4903.5 | −9.30 |
| RSV-DPM-ret (SRV) | −4844.9 | 6.9 | −4881.6 | −8.7 | −4913.9 | −19.77 |
| RSV-DPM-ret (RK) | −4834.1 | 17.7 | −4878.2 | −5.3 | −4902.0 | −7.84 |
| RSV-DPM (SRV) | −4749.3 | 102.5 | −4812.7 | 60.1 | −4843.2 | 50.95 |
| RSV-DPM (RK) | −4744.9 | 106.8 | −4814.3 | 58.6 | −4845.9 | 48.22 |
| RSV-DPM-ind (SRV) | −4745.5 | 106.2 | −4815.2 | 57.7 | −4844.6 | 49.57 |
| RSV-DPM-ind (RK) | −4739.6 | 112.2 | −4816.5 | 56.4 | −4843.7 | 50.47 |
| Panel C: SPY | | | | | | |
| SV | −3855.2 | | −3956.1 | | −4016.2 | |
| SV-t | −3867.8 | −12.6 | −3943.1 | 12.9 | −3998.0 | 18.1 |
| GARCH | −3975.8 | −120.6 | −4060.3 | −104.3 | −4126.5 | −110.4 |
| GARCH-t | −3851.5 | 3.7 | −3945.9 | 10.1 | −4011.1 | 5.1 |
| SV-DPM | −3843.2 | 12.0 | −3940.1 | 16.0 | −3996.7 | 19.5 |
| RSV (SRV) | −3767.6 | 87.6 | −3908.3 | 47.8 | −3981.1 | 35.1 |
| RSV (RK) | −3768.8 | 86.4 | −3910.0 | 46.0 | −3980.0 | 36.1 |
| RSV-DPM-ret (SRV) | −3766.4 | 88.8 | −3908.0 | 48.1 | −3980.0 | 36.2 |
| RSV-DPM-ret (RK) | −3767.8 | 87.4 | −3909.8 | 46.2 | −3979.1 | 37.0 |
| RSV-DPM (SRV) | −3752.2 | 103.0 | −3891.3 | 64.7 | −3970.7 | 45.5 |
| RSV-DPM (RK) | −3750.6 | 104.6 | −3893.4 | 62.6 | −3972.8 | 43.3 |
| RSV-DPM-ind (SRV) | −3766.2 | 89.0 | −3910.7 | 45.4 | −3980.6 | 35.5 |
| RSV-DPM-ind (RK) | −3766.7 | 88.5 | −3913.3 | 42.7 | −3981.0 | 35.1 |
| Panel D: SKHY | | | | | | |
| SV | −4308.1 | | −4304.7 | | −4297.7 | |
| SV-t | −4320.9 | −12.7 | −4315.2 | −10.6 | −4312.3 | −14.6 |
| GARCH | −4509.3 | −201.2 | −4507.5 | −202.8 | −4509.8 | −212.0 |
| GARCH-t | −4303.2 | 4.9 | −4302.5 | 2.2 | −4302.2 | −4.5 |
| SV-DPM | −4305.0 | 3.1 | −4299.5 | 5.2 | −4293.9 | 3.9 |
| RSV (SRV) | −4290.6 | 17.5 | −4288.5 | 16.1 | −4287.8 | 9.9 |
| RSV (RK) | −4290.5 | 17.6 | −4289.6 | 15.1 | −4288.5 | 9.2 |
| RSV-DPM-ret (SRV) | −4289.5 | 18.6 | −4288.3 | 16.4 | −4287.5 | 10.3 |
| RSV-DPM-ret (RK) | −4290.4 | 17.7 | −4290.0 | 14.6 | −4288.9 | 8.8 |
| RSV-DPM (SRV) | −4283.4 | 24.8 | −4287.3 | 17.4 | −4285.7 | 12.1 |
| RSV-DPM (RK) | −4281.4 | 26.7 | −4290.5 | 14.2 | −4286.4 | 11.3 |
| RSV-DPM-ind (SRV) | −4289.1 | 19.0 | −4289.2 | 15.5 | −4286.4 | 11.3 |
| RSV-DPM-ind (RK) | −4289.7 | 18.4 | −4291.8 | 12.8 | −4287.7 | 10.1 |

This table reports the log predictive likelihood (LPL) and log Bayes factors (LBF). Bold numbers indicate the highest values.
Table 4. Density forecasts of log realized volatility measures.

| Model | LPL (h = 1) | LBF (h = 1) | LPL (h = 5) | LBF (h = 5) | LPL (h = 10) | LBF (h = 10) |
|---|---|---|---|---|---|---|
| Panel A: DIS | | | | | | |
| RSV (SRV) | −3166.5 | | −3634.7 | | −3853.0 | |
| RSV-DPM-ret (SRV) | −3159.4 | 7.1 | −3618.4 | 16.3 | −3853.4 | −0.5 |
| RSV-DPM (SRV) | −2864.9 | 301.6 | −3313.5 | 321.2 | −3476.9 | 376.0 |
| RSV-DPM-ind (SRV) | −2860.9 | 305.5 | −3346.6 | 288.1 | −3559.6 | 293.3 |
| RSV (RK) | −2985.0 | | −3496.2 | | −3738.4 | |
| RSV-DPM-ret (RK) | −2981.8 | 3.2 | −3492.2 | 4.0 | −3735.5 | 2.9 |
| RSV-DPM (RK) | −2614.3 | 370.7 | −3146.6 | 349.6 | −3328.9 | 409.5 |
| RSV-DPM-ind (RK) | −2598.4 | 386.5 | −3170.7 | 325.5 | −3410.2 | 328.2 |
| Panel B: IBM | | | | | | |
| RSV (SRV) | −3306.9 | | −3569.9 | | −3784.9 | |
| RSV-DPM-ret (SRV) | −3304.2 | 2.8 | −3570.1 | −0.2 | −3784.3 | 0.5 |
| RSV-DPM (SRV) | −2746.3 | 560.7 | −3123.1 | 446.8 | −3324.3 | 460.6 |
| RSV-DPM-ind (SRV) | −2748.6 | 558.4 | −3163.0 | 406.9 | −3368.1 | 416.7 |
| RSV (RK) | −3168.4 | | −3464.7 | | −3669.8 | |
| RSV-DPM-ret (RK) | −3171.8 | −3.4 | −3461.3 | 3.4 | −3676.9 | −7.0 |
| RSV-DPM (RK) | −2536.9 | 631.5 | −2973.8 | 490.9 | −3201.6 | 468.2 |
| RSV-DPM-ind (RK) | −2526.7 | 641.8 | −3009.4 | 455.3 | −3248.1 | 421.7 |
| Panel C: SPY | | | | | | |
| RSV (SRV) | −3328.5 | | −3846.6 | | −4067.6 | |
| RSV-DPM-ret (SRV) | −3327.4 | 1.1 | −3844.5 | 2.1 | −4071.4 | −3.8 |
| RSV-DPM (SRV) | −3289.9 | 38.6 | −3849.9 | −3.3 | −4031.5 | 36.1 |
| RSV-DPM-ind (SRV) | −3273.7 | 54.8 | −3817.2 | 29.5 | −4042.9 | 24.7 |
| RSV (RK) | −3187.8 | | −3762.3 | | −3996.4 | |
| RSV-DPM-ret (RK) | −3187.6 | 0.2 | −3768.1 | −5.8 | −3987.6 | 8.8 |
| RSV-DPM (RK) | −3152.9 | 34.9 | −3742.2 | 20.1 | −3946.6 | 49.8 |
| RSV-DPM-ind (RK) | −3112.3 | 75.5 | −3713.9 | 48.4 | −3955.0 | 41.4 |
| Panel D: SKHY | | | | | | |
| RSV (SRV) | −1722.1 | | −1874.3 | | −1911.4 | |
| RSV-DPM-ret (SRV) | −1722.3 | −0.2 | −1873.0 | 1.2 | −1911.0 | 0.4 |
| RSV-DPM (SRV) | −1632.2 | 90.0 | −1782.0 | 92.3 | −1836.5 | 74.8 |
| RSV-DPM-ind (SRV) | −1629.8 | 92.4 | −1810.1 | 64.2 | −1856.6 | 54.8 |
| RSV (RK) | −1857.5 | | −2013.0 | | −2052.1 | |
| RSV-DPM-ret (RK) | −1857.7 | −0.2 | −2013.3 | −0.3 | −2054.5 | −2.4 |
| RSV-DPM (RK) | −1803.7 | 53.8 | −1962.2 | 50.8 | −2003.0 | 49.1 |
| RSV-DPM-ind (RK) | −1798.2 | 59.3 | −1975.9 | 37.1 | −2016.2 | 35.9 |

This table reports the log predictive likelihood (LPL) and log Bayes factors (LBF) of predicting logRV densities. Bold numbers indicate the highest values.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liu, J. A Bayesian Semiparametric Realized Stochastic Volatility Model. J. Risk Financial Manag. 2021, 14, 617. https://doi.org/10.3390/jrfm14120617
