Article

Does Systematic Sampling Preserve Granger Causality with an Application to High Frequency Financial Data?

by Gulasekaran Rajaguru 1,*, Michael O’Neill 1 and Tilak Abeysinghe 2
1 Bond Business School, Bond University, Robina, QLD 4226, Australia
2 Department of Economics, National University of Singapore, Singapore 117570, Singapore
* Author to whom correspondence should be addressed.
Econometrics 2018, 6(2), 31; https://doi.org/10.3390/econometrics6020031
Submission received: 17 March 2018 / Revised: 6 June 2018 / Accepted: 11 June 2018 / Published: 15 June 2018

Abstract

In applied econometric literature, the causal inferences are often made based on temporally aggregated or systematically sampled data. A number of studies document that temporal aggregation has distorting effects on causal inference and systematic sampling of stationary variables preserves the direction of causality. Contrary to the stationary case, this paper shows for the bivariate VAR(1) system that systematic sampling induces spurious bi-directional Granger causality among the variables if the uni-directional causality runs from a non-stationary series to either a stationary or a non-stationary series. An empirical exercise illustrates the relative usefulness of the results further.

1. Introduction

The use of highly temporally aggregated and systematically sampled data for causal inference is quite common in the applied econometric literature. The issue of applying regular interval sampling techniques is often encountered when dealing with high frequency data in the field of finance. Sampling at regular frequencies when trades arrive non-synchronously and at high frequency can distort correlation measures (Epps 1979; Hayashi and Yoshida 2005; Scholes and Williams 1977). However, many studies of cross-correlations between equity market instruments and their derivatives still rely on regularly spaced data, sampled at 1 min, 5 min, or 10 min intervals as discussed in Bollen et al. (2016). These issues are becoming more important as the availability of equity market derivative products increases and the frequency of trading intensifies.
There is a substantial literature on the effects of temporal aggregation and systematic sampling on various aspects of time series analysis, such as the univariate ARIMA structure (see Stram and Wei (1986); Wei (1990) and citations therein), unit roots (Rossana and Seater 1992, 1995), cointegration, exogeneity, measures of persistence, impulse response functions and forecasting (see Lütkepohl (1987); Marcellino (1999); Weiss (1984) and references therein). Studies that have analysed the dynamic relationship between variables include Zellner and Montmarquette (1971); Sims (1971); Tiao and Wei (1976); Wei (1978); Wei and Mehta (1980) and the citations therein. Wei and Mehta (1980) examine the information loss due to aggregation in parameter estimation of distributed lag models. They also investigate the relative efficiencies of several estimators of the underlying parameters through simulation. They show that, in terms of parameter estimation, there is a substantial loss of information due to aggregation, especially if the input series is negatively autocorrelated. Breitung and Swanson (2002); Ericsson et al. (2001); and Rajaguru (2004a) show that the information loss due to aggregation concentrates on contemporaneous correlations between the variables. Breitung and Swanson (2002) examine how spurious instantaneous relations are induced from Granger causal relationships by temporal aggregation and systematic sampling in a VAR framework. They derive the condition under which temporal aggregation and systematic sampling preserve the instantaneous relations. Ericsson et al. (2001) draw attention to the misspecifications involved in cross-country regressions that result from heavy temporal aggregation. They derive the probability limits of the contemporaneous regression coefficient from a bi-directional causal bivariate VAR(1) system.
Based on two period averages, the probability limits of the estimated contemporaneous regression coefficients are found to be positive, negative, or zero, even if the original unaveraged variables are contemporaneously uncorrelated. Both Breitung and Swanson (2002) and Ericsson et al. (2001) concentrate on how instantaneous relations are created due to temporal aggregation and systematic sampling from Granger causal systems. However, they fail to address the creation of a spurious Granger causal relationship from a non-causal relation due to temporal aggregation and systematic sampling.
There is a sizeable theoretical literature that analyses the impact of temporal aggregation and systematic sampling on Granger causality and on weak and strong exogeneity. Sims (1971) warns that aggregation could result in a spurious causal relationship. Marcellino (1999) also shows that Granger non-causality is generally not invariant to temporal aggregation. As a result, strict and strong exogeneity are also not invariant. Triacca (2017) shows the absence of causal relationship when two causal variables are included simultaneously. Wei (1982), using the linear decomposition of Geweke (1982), demonstrates for stationary variables that temporal aggregation can change a true one-sided Granger causal relationship into a two-sided causal system. On the other hand, he shows that systematic sampling preserves the one-sided causal relationship between the variables and that the unidirectional causal system becomes weaker when they are systematically sampled further. Cunningham and Vilasuso (1995) find through Monte Carlo simulation that temporal aggregation is between two and ten times more likely to fail to detect a true causal relationship than is systematic sampling. Cunningham and Vilasuso (1997) examine the influence of temporal aggregation and systematic sampling on the money-output relationship. Their results favour the use of systematic sampling rather than temporal aggregation in forming time aggregates.
Most of the literature has examined the effect of temporal aggregation and systematic sampling on causal inference for the stationary case and finds that systematic sampling preserves the direction of causality. When the series are non-stationary, in practice, we difference them appropriately and conduct a Granger causality test to determine the direction of causation. Mamingi (1996), using Monte Carlo simulations, shows that systematic sampling of integrated processes produces misleading causal inferences. Rajaguru (2004b) derives the relationship between the cross-covariances of the systematically sampled and disaggregated processes and finds that systematic sampling of integrated processes converts a uni-directional system into a bi-directional one. Both studies arrive at the general conclusion that systematic sampling of integrated processes induces spurious causal relationships. They assume that all variables in the system are non-stationary, and so fail to address the scenario where one of the series is non-stationary while the other variables are stationary. In such cases, the causality in the underlying data generating process could run from the stationary variable to the non-stationary one or vice versa. It is essential to analyse the nature of the causal distortion due to systematic sampling in such scenarios. Moreover, conclusions about causal distortion in the presence of unit roots that rest on cross-covariance analysis (Rajaguru 2004b) could be misleading when lagged dependent variables are present, as they are in a VAR framework. This paper derives the relationship between the cross-covariances of the aggregated and disaggregated processes for the general integrated process of order d. The main contribution of the paper is that it derives, in a VAR framework that incorporates lagged dependent variables, the condition on the order of integration under which a one-sided causal relationship tends to appear as a two-sided causal relationship due to systematic sampling.
In the next section, we derive the relationship between the theoretical cross covariance of aggregated and disaggregated processes. This result plays a fundamental role in our exercise and is applicable to both stationary and integrated processes. In Section 3, we then derive the limiting values of least squares estimates and the corresponding t-ratios of a VAR(1) process under different levels of systematic sampling. In Section 4, we test each of our theoretical findings empirically using equity market indices and associated derivatives. In the concluding section we highlight some important issues involved in Granger causality testing with systematically sampled data.

2. Relationship between Cross Covariances of Disaggregated and Systematically Sampled Series

In this section, we extend the derivation of Rajaguru (2004b) to derive the relationship between the cross-covariances of the underlying data generating process and the systematically sampled series. Let $z_t = (z_{1t}, z_{2t}, \ldots, z_{nt})'$, $t = 1, 2, \ldots, T$, be an equally spaced basic disaggregated series. Systematic sampling means the construction of the series $Z_\tau = z_{m\tau}$ ($\tau = 1, 2, \ldots, N$ and $T = mN$) by sampling from $z_t$ at every $m$-th interval ($m$ is a positive integer). Let $w_t = (w_{1t}, w_{2t}, \ldots, w_{nt})'$, with $w_{jt} = (1 - L)^{d_j} z_{jt}$ and $L$ the backward shift operator, be a weakly stationary process with mean zero and variance covariance matrix
$$\Gamma_w(k) = E(w_t w_{t-k}') = [\gamma_{ij}^w(k)], \quad i, j = 1, 2, \ldots, n,$$
where $\gamma_{ii}^w(k)$ is the autocovariance of the $i$-th component, $w_{it}$, at lag $k$, and $\gamma_{ij}^w(k)$ is the cross covariance between the $i$-th and $j$-th components. Further, $\gamma_{ii}^w(0)$ is the variance of the $i$-th series and $\gamma_{ij}^w(0)$ represents the contemporaneous cross covariance between the series.
Notice that the order of integration $d_j$ of the $j$-th component, $w_{jt}$, of $w_t$ need not be the same for all $j$. Allowing for different orders of integration of the individual components will be useful in examining how Granger causality distortions depend not only on systematic sampling but also on $d_j$. As we shall see in the subsequent sections, the order of integration $d_j$ plays an important role in inducing spurious causal relationships between the aggregated series. Let $\mathcal{L}$ be the backward shift operator on the sampling unit $\tau$. Thus, $(1 - \mathcal{L}) Z_\tau = Z_\tau - Z_{\tau-1} = z_{m\tau} - z_{m(\tau-1)} = (1 - L^m) z_{m\tau}$. Since the unit root property is invariant upon systematic sampling, we let $W_\tau = (W_{1\tau}, W_{2\tau}, \ldots, W_{n\tau})'$, with $W_{j\tau} = (1 - \mathcal{L})^{d_j} Z_{j\tau} = (1 - L^m)^{d_j} z_{jm\tau} = (1 + L + \cdots + L^{m-1})^{d_j} w_{jm\tau}$. The $d_j$-th difference of the systematically sampled series ($j$-th component) is simply a weighted sum of the $d_j$-th difference of the basic series. The following proposition shows the relationship between the cross-covariances of the systematically sampled series and the basic series (see Rajaguru (2004b) for the bivariate case).
Proposition 1.
The cross covariance between the $i$-th and $j$-th components of the systematically sampled series, $W_{i\tau}$ and $W_{j\tau-k}$, can be expressed in terms of the cross covariances of the $i$-th and $j$-th components of the basic disaggregated series, $w_{it}$ and $w_{jt}$, that is,
$$\gamma_{ij}^W(k) = Cov(W_{i\tau}, W_{j\tau-k}) = (1 + L + L^2 + \cdots + L^{m-1})^{d_i + d_j}\, \gamma_{ij}^w(mk + d_j(m-1))$$
$$\gamma_{ji}^W(k) = Cov(W_{j\tau}, W_{i\tau-k}) = (1 + L + L^2 + \cdots + L^{m-1})^{d_i + d_j}\, \gamma_{ji}^w(mk + d_i(m-1))$$
where $L$ operates on the index of $\gamma_{ij}^w(k)$ such that $L\gamma_{ij}^w(k) = \gamma_{ij}^w(k-1)$, and $\gamma_{ij}^w(k) = \gamma_{ji}^w(-k)$.
In particular,
$$\gamma_{ii}^W(k) = Cov(W_{i\tau}, W_{i\tau-k}) = (1 + L + L^2 + \cdots + L^{m-1})^{2d_i}\, \gamma_{ii}^w(mk + d_i(m-1))$$
Further, the matrix representation of the above is given by
$$\Gamma_W(k) = E(W_\tau W_{\tau-k}') = [\gamma_{ij}^W(k)], \quad i, j = 1, 2, \ldots, n$$
Corollary 1.
If $d_1 = d_2 = \cdots = d_n = d$ (that is, all the components of the vector process are integrated of the same order $d$), then
$$\gamma_{ij}^W(k) = Cov(W_{i\tau}, W_{j\tau-k}) = (1 + L + L^2 + \cdots + L^{m-1})^{2d}\, \gamma_{ij}^w(mk + d(m-1))$$
and
$$\Gamma_W(k) = E(W_\tau W_{\tau-k}') = [\gamma_{ij}^W(k)] = (1 + L + L^2 + \cdots + L^{m-1})^{2d}\, \Gamma_w(mk + d(m-1)), \quad i, j = 1, 2, \ldots, n$$
where $L$ operates on each element of the matrix $\Gamma_w(k)$.
The above proposition reveals that the cross covariances between the systematically sampled series are simply weighted sums of the cross covariances of the basic series. The most striking observation in both Equations (1) and (2) is that the cross-covariance function is independent of the cause variable and is influenced by the effect variable.
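As a numerical illustration of Proposition 1, the following sketch applies the operator $(1 + L + \cdots + L^{m-1})^{d_i + d_j}$ to a user-supplied cross-covariance function. The function names `aggregation_weights` and `sampled_cross_cov`, and the AR(1) autocovariance used as input, are assumptions made for this example and are not part of the original derivation.

```python
import numpy as np


def aggregation_weights(m, power):
    """Coefficients of (1 + x + ... + x**(m-1))**power, lowest degree first."""
    weights = np.array([1.0])
    for _ in range(power):
        weights = np.convolve(weights, np.ones(m))
    return weights


def sampled_cross_cov(gamma_w, k, m, d_i, d_j):
    """Cross covariance of the sampled series at lag k (Proposition 1).

    gamma_w(h) must return the cross covariance of the basic series at any
    integer lag h, including negative lags.
    """
    weights = aggregation_weights(m, d_i + d_j)
    start = m * k + d_j * (m - 1)
    # L lowers the lag index by one, so the coefficient on L**s multiplies
    # gamma_w(start - s).
    return sum(c * gamma_w(start - s) for s, c in enumerate(weights))


# Illustrative input (an assumption): autocovariance of an AR(1) process
# with coefficient 0.6 and unit innovation variance.
phi = 0.6


def ar1_autocov(h):
    return phi ** abs(h) / (1.0 - phi ** 2)


m = 3
stationary_lag1 = sampled_cross_cov(ar1_autocov, 1, m, 0, 0)  # d = 0
integrated_var = sampled_cross_cov(ar1_autocov, 0, m, 1, 1)   # d = 1
```

With $d = 0$ the function simply picks out the lag-$mk$ covariance of the basic series, while with $d = 1$ it reproduces the variance of a sum of $m$ consecutive differences.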

3. Estimates of VAR(p) Process Based on Systematically Sampled Data

Consider the basic vector process $z_t$ and let $w_t = (w_{1t}, w_{2t}, \ldots, w_{nt})'$, $w_{jt} = (1 - L)^{d_j} z_{jt}$, be a weakly stationary process with mean zero and variance covariance matrix
$$\Gamma_w(k) = E(w_t w_{t-k}') = [\gamma_{ij}^w(k)], \quad i, j = 1, 2, \ldots, n$$
and $\Gamma_w(-k) = \Gamma_w(k)'$ $\forall k$.
Suppose the covariance stationary process $w_t$ has the following VAR(p) representation
$$w_t = \Phi_1 w_{t-1} + \Phi_2 w_{t-2} + \cdots + \Phi_p w_{t-p} + e_t$$
with
$$e_t \sim N\left(\underline{0},\; \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix}\right).$$
The variance covariance matrix of $e_t$ is set to be diagonal to make sure that there are no contemporaneous relationships among the variables in the basic form. The system of normal equations (Yule-Walker equations) for the above process is given by
$$\Gamma_w(k) = E(w_t w_{t-k}') = \Phi_1 E(w_{t-1} w_{t-k}') + \Phi_2 E(w_{t-2} w_{t-k}') + \cdots + \Phi_p E(w_{t-p} w_{t-k}') + E(e_t w_{t-k}') \quad \forall k$$
$$\phantom{\Gamma_w(k)} = \Phi_1 \Gamma_w(k-1) + \Phi_2 \Gamma_w(k-2) + \cdots + \Phi_p \Gamma_w(k-p)$$
Given the $\Phi_i$’s and using the fact that $\Gamma_w(-k) = (\Gamma_w(k))'$, we can solve for the theoretical cross covariances $\Gamma_w(k)$ $\forall k$ from the system of simultaneous equations in (5).
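Let $Z_\tau$ be the $m$-period non-overlapping aggregate (systematic sampling) of $z_t$. Let $W_\tau = (W_{1\tau}, W_{2\tau}, \ldots, W_{n\tau})'$ such that $W_{j\tau} = (1 - \mathcal{L})^{d_j} Z_{j\tau}$, where $\mathcal{L}$ shifts the sampling index $\tau$. We now consider estimating the following $n$-variate VAR(p) model with systematically sampled series:
$$W_\tau = \Phi_1 W_{\tau-1} + \Phi_2 W_{\tau-2} + \cdots + \Phi_p W_{\tau-p} + E_\tau$$
where $E_{i\tau}$ ($i = 1, 2, \ldots, n$) represent the error process of the aggregated model. The probability limits of the estimates $\hat{\Phi}_i$, i.e., $\operatorname{plim} \hat{\Phi}_i$ (denoted by $\tilde{\Phi}_i$), can be obtained by solving the system of normal equations
$$\Gamma_W(k) = E(W_\tau W_{\tau-k}') = \tilde{\Phi}_1 \Gamma_W(k-1) + \tilde{\Phi}_2 \Gamma_W(k-2) + \cdots + \tilde{\Phi}_p \Gamma_W(k-p) + E(E_\tau W_{\tau-k}')$$
for $k = 1, 2, \ldots, p$.
It is clear from the construction that $E(E_\tau W_{\tau-k}') = 0$ for $k = 1, 2, \ldots, p$. Thus, (7) reduces to
$$\Gamma_W(k) = E(W_\tau W_{\tau-k}') = \tilde{\Phi}_1 \Gamma_W(k-1) + \tilde{\Phi}_2 \Gamma_W(k-2) + \cdots + \tilde{\Phi}_p \Gamma_W(k-p)$$
Equivalently, $\Gamma_W(k)' = \Gamma_W(k-1)' \tilde{\Phi}_1' + \Gamma_W(k-2)' \tilde{\Phi}_2' + \cdots + \Gamma_W(k-p)' \tilde{\Phi}_p'$. This can be rewritten as
$$\begin{pmatrix} \Gamma_W(1)' \\ \Gamma_W(2)' \\ \vdots \\ \Gamma_W(p)' \end{pmatrix} = \begin{pmatrix} \Gamma_W(0) & \Gamma_W(1) & \cdots & \Gamma_W(p-1) \\ \Gamma_W(-1) & \Gamma_W(0) & \cdots & \Gamma_W(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \Gamma_W(-p+1) & \Gamma_W(-p+2) & \cdots & \Gamma_W(0) \end{pmatrix} \begin{pmatrix} \tilde{\Phi}_1' \\ \tilde{\Phi}_2' \\ \vdots \\ \tilde{\Phi}_p' \end{pmatrix}$$
$$\tilde{\Phi} = \Gamma^{-1} \begin{pmatrix} \Gamma_W(1)' \\ \Gamma_W(2)' \\ \vdots \\ \Gamma_W(p)' \end{pmatrix}$$
where
$$\Gamma = \begin{pmatrix} \Gamma_W(0) & \Gamma_W(1) & \cdots & \Gamma_W(p-1) \\ \Gamma_W(-1) & \Gamma_W(0) & \cdots & \Gamma_W(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \Gamma_W(-p+1) & \Gamma_W(-p+2) & \cdots & \Gamma_W(0) \end{pmatrix} \quad \text{and} \quad \tilde{\Phi} = \begin{pmatrix} \tilde{\Phi}_1' \\ \tilde{\Phi}_2' \\ \vdots \\ \tilde{\Phi}_p' \end{pmatrix}$$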
Note from (10) that the probability limit of the estimates of the model based on systematically sampled data is a function of the cross covariances of the aggregated process, which in turn can be expressed in terms of the cross covariances of the basic disaggregated process. Furthermore, from (5), the cross-covariances of the basic process are a function of the coefficients of the VAR(p) of the disaggregated process. Thus, the estimated parameters of the aggregated VAR(p) are weighted sums of the cross-covariances of the basic process, with the weights being the coefficients of the VAR(p) of the disaggregated process. Since our objective is to assess the effects of systematic sampling on Granger causality, and to simplify computations, we specialize the analysis to a bivariate VAR(1) process.

Aggregated VAR(1) Process

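In order to examine the effect of systematic sampling on Granger causality, we consider the following bivariate VAR(1) system$^{1}$ with $z_{1t} \sim I(d_1)$ and $z_{2t} \sim I(d_2)$ such that $w_{it} = (1 - L)^{d_i} z_{it}$ for $i = 1, 2$:
$$\begin{pmatrix} w_{1t} \\ w_{2t} \end{pmatrix} = \begin{pmatrix} \varphi_{11} & \varphi_{12} \\ \varphi_{21} & \varphi_{22} \end{pmatrix} \begin{pmatrix} w_{1t-1} \\ w_{2t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}, \quad \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix}\right)$$
i.e., $w_t = \Phi_1 w_{t-1} + e_t$, where $w_t = \begin{pmatrix} w_{1t} \\ w_{2t} \end{pmatrix}$ and $\Phi_1 = \begin{pmatrix} \varphi_{11} & \varphi_{12} \\ \varphi_{21} & \varphi_{22} \end{pmatrix}$.
In this system the coefficients $\varphi_{12}$ and $\varphi_{21}$ measure the causal relationships between $w_{1t}$ and $w_{2t}$, with $\varphi_{12} \neq 0$ implying Granger causality from $w_2$ to $w_1$ and $\varphi_{21} \neq 0$ implying Granger causality from $w_1$ to $w_2$. In this exercise, the contemporaneous correlation between the errors is set to zero (i.e., $cov(e_{1t}, e_{2t}) = 0$) in order to assess the effect of systematic sampling on this correlation.$^{2}$ As in Rajaguru and Abeysinghe (2010), the variances, autocovariances and cross-covariances of system (12) are given by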
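$$\gamma_{11}^w(0) = \sigma_{w_1}^2 = E(w_{1t} w_{1t}) = \varphi_{11}^2 \sigma_{w_1}^2 + \varphi_{12}^2 \sigma_{w_2}^2 + 2\varphi_{11}\varphi_{12}\gamma_{12}^w(0) + \sigma_1^2$$
$$\gamma_{22}^w(0) = \sigma_{w_2}^2 = E(w_{2t} w_{2t}) = \varphi_{21}^2 \sigma_{w_1}^2 + \varphi_{22}^2 \sigma_{w_2}^2 + 2\varphi_{21}\varphi_{22}\gamma_{12}^w(0) + \sigma_2^2$$
$$\gamma_{12}^w(0) = \gamma_{21}^w(0) = E(w_{1t} w_{2t}) = \varphi_{11}\varphi_{21}\sigma_{w_1}^2 + \varphi_{12}\varphi_{22}\sigma_{w_2}^2 + (\varphi_{11}\varphi_{22} + \varphi_{12}\varphi_{21})\gamma_{12}^w(0)$$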
The system of equations described in (5) can be written as
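$$\Gamma_w(k) = \Phi_1 \Gamma_w(k-1) \;\Rightarrow\; \begin{pmatrix} \gamma_{11}^w(k) & \gamma_{12}^w(k) \\ \gamma_{21}^w(k) & \gamma_{22}^w(k) \end{pmatrix} = \begin{pmatrix} \varphi_{11} & \varphi_{12} \\ \varphi_{21} & \varphi_{22} \end{pmatrix} \begin{pmatrix} \gamma_{11}^w(k-1) & \gamma_{12}^w(k-1) \\ \gamma_{21}^w(k-1) & \gamma_{22}^w(k-1) \end{pmatrix}$$
That is,
$$\gamma_{11}^w(k) = E(w_{1t} w_{1t-k}) = \varphi_{11}\gamma_{11}^w(k-1) + \varphi_{12}\gamma_{21}^w(k-1)$$
$$\gamma_{22}^w(k) = E(w_{2t} w_{2t-k}) = \varphi_{21}\gamma_{12}^w(k-1) + \varphi_{22}\gamma_{22}^w(k-1)$$
$$\gamma_{12}^w(k) = E(w_{1t} w_{2t-k}) = \varphi_{11}\gamma_{12}^w(k-1) + \varphi_{12}\gamma_{22}^w(k-1)$$
$$\gamma_{21}^w(k) = E(w_{2t} w_{1t-k}) = \varphi_{21}\gamma_{11}^w(k-1) + \varphi_{22}\gamma_{21}^w(k-1)$$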
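The moment equations above can also be solved numerically. The sketch below stacks the lag-zero equations into the matrix form $\Gamma_w(0) = \Phi_1 \Gamma_w(0) \Phi_1' + \Sigma$ and then applies the recursion $\Gamma_w(k) = \Phi_1 \Gamma_w(k-1)$. The function names and the parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def var1_gamma0(Phi, Sigma):
    """Solve Gamma(0) = Phi Gamma(0) Phi' + Sigma for a stationary VAR(1)."""
    n = Phi.shape[0]
    lhs = np.eye(n * n) - np.kron(Phi, Phi)
    return np.linalg.solve(lhs, Sigma.reshape(-1)).reshape(n, n)


def var1_gamma(Phi, Sigma, k):
    """Gamma(k) = E(w_t w_{t-k}'); negative lags use Gamma(-k) = Gamma(k)'."""
    gamma0 = var1_gamma0(Phi, Sigma)
    gamma_k = np.linalg.matrix_power(Phi, abs(k)) @ gamma0
    return gamma_k if k >= 0 else gamma_k.T


# Illustrative parameter values (assumptions, not taken from the paper).
Phi = np.array([[0.5, 0.2],
                [0.3, 0.4]])
Sigma = np.eye(2)

gamma0 = var1_gamma0(Phi, Sigma)
gamma1 = var1_gamma(Phi, Sigma, 1)
```

Solving the stacked linear system avoids the closed-form algebra and extends unchanged to more than two variables.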
Solving (13)–(15), we get
$$\sigma_{w_1}^2 = \frac{c_3[\sigma_1^2(b_2 c_3 - b_3 c_2) - \sigma_2^2(b_1 c_3 - b_3 c_1)]}{[a_1 c_3 - a_3 c_1][b_2 c_3 - b_3 c_2] - [a_2 c_3 - a_3 c_2][b_1 c_3 - b_3 c_1]}$$
$$\sigma_{w_2}^2 = \frac{c_3[\sigma_1^2(a_2 c_3 - a_3 c_2) - \sigma_2^2(a_1 c_3 - a_3 c_1)]}{[b_1 c_3 - b_3 c_1][a_2 c_3 - a_3 c_2] - [b_2 c_3 - b_3 c_2][a_1 c_3 - a_3 c_1]}$$
$$\gamma_{12}^w(0) = \frac{-[a_3 \sigma_{w_1}^2 + b_3 \sigma_{w_2}^2]}{c_3}$$
where $a_1 = 1 - \varphi_{11}^2$, $b_1 = -\varphi_{12}^2$, $c_1 = -2\varphi_{11}\varphi_{12}$, $a_2 = -\varphi_{21}^2$, $b_2 = 1 - \varphi_{22}^2$, $c_2 = -2\varphi_{21}\varphi_{22}$, $a_3 = -\varphi_{11}\varphi_{21}$, $b_3 = -\varphi_{12}\varphi_{22}$ and $c_3 = 1 - [\varphi_{11}\varphi_{22} + \varphi_{12}\varphi_{21}]$, so that (13)–(15) read $a_i \sigma_{w_1}^2 + b_i \sigma_{w_2}^2 + c_i \gamma_{12}^w(0) = \sigma_i^2$ for $i = 1, 2$ and $a_3 \sigma_{w_1}^2 + b_3 \sigma_{w_2}^2 + c_3 \gamma_{12}^w(0) = 0$. Let $Z_{1\tau}$ and $Z_{2\tau}$ be the $m$-period non-overlapping aggregates of $z_{1t}$ and $z_{2t}$ respectively. Let $W_{1\tau} = (1 - \mathcal{L})^{d_1} Z_{1\tau}$ and $W_{2\tau} = (1 - \mathcal{L})^{d_2} Z_{2\tau}$. Since systematic sampling of a VAR(1) process produces a VARMA(1, $h$; $h \le 1$) process at low levels of aggregation (Marcellino 1999), we first carried out a Monte Carlo experiment by fitting VAR(p), $p = 1, 2, 3$, models to $W_{1\tau}$ and $W_{2\tau}$ derived from (5) for $m = 3$. Based on T = 10,000 replications, we observe that the coefficient estimates of the systematically sampled VAR(1) model remain largely unaffected by the VAR order. The AIC and BIC criteria also lead to the selection of a VAR(1) process for the systematically sampled series. We, therefore, proceeded to obtain analytical results from the following bivariate VAR(1) process:
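$$\begin{pmatrix} W_{1\tau} \\ W_{2\tau} \end{pmatrix} = \begin{pmatrix} \varphi_{11} & \varphi_{12} \\ \varphi_{21} & \varphi_{22} \end{pmatrix} \begin{pmatrix} W_{1\tau-1} \\ W_{2\tau-1} \end{pmatrix} + \begin{pmatrix} E_{1\tau} \\ E_{2\tau} \end{pmatrix}$$
i.e., $W_\tau = \Phi_1 W_{\tau-1} + E_\tau$, where $E_{i\tau}$ ($i = 1, 2$), as defined earlier, represents the error process of the aggregated model, $W_\tau = \begin{pmatrix} W_{1\tau} \\ W_{2\tau} \end{pmatrix}$ and $\Phi_1 = \begin{pmatrix} \varphi_{11} & \varphi_{12} \\ \varphi_{21} & \varphi_{22} \end{pmatrix}$. The OLS estimates $\hat{\varphi}_{ij}$ and their probability limits $\operatorname{plim} \hat{\varphi}_{ij}$ are given by:
$$\tilde{\Phi} = \tilde{\Phi}_1' = \Gamma^{-1} \Gamma_W(1)' = \Gamma_W(0)^{-1} \Gamma_W(1)' = \begin{pmatrix} \gamma_{11}^W(0) & \gamma_{12}^W(0) \\ \gamma_{21}^W(0) & \gamma_{22}^W(0) \end{pmatrix}^{-1} \begin{pmatrix} \gamma_{11}^W(1) & \gamma_{12}^W(1) \\ \gamma_{21}^W(1) & \gamma_{22}^W(1) \end{pmatrix}'$$
$$\tilde{\Phi}_1' = \begin{pmatrix} \operatorname{plim} \hat{\varphi}_{11} & \operatorname{plim} \hat{\varphi}_{21} \\ \operatorname{plim} \hat{\varphi}_{12} & \operatorname{plim} \hat{\varphi}_{22} \end{pmatrix} = \frac{1}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2} \begin{pmatrix} \gamma_{22}^W(0) & -\gamma_{21}^W(0) \\ -\gamma_{12}^W(0) & \gamma_{11}^W(0) \end{pmatrix} \begin{pmatrix} \gamma_{11}^W(1) & \gamma_{21}^W(1) \\ \gamma_{12}^W(1) & \gamma_{22}^W(1) \end{pmatrix}$$
That is,
$$\hat{\varphi}_{11} = \frac{(\sum W_{1\tau} W_{1\tau-1})(\sum W_{2\tau-1}^2) - (\sum W_{1\tau} W_{2\tau-1})(\sum W_{1\tau-1} W_{2\tau-1})}{(\sum W_{1\tau-1}^2)(\sum W_{2\tau-1}^2) - (\sum W_{1\tau-1} W_{2\tau-1})^2} \;\Rightarrow\; \operatorname{plim} \hat{\varphi}_{11} = \frac{\gamma_{11}^W(1)\gamma_{22}^W(0) - \gamma_{12}^W(1)\gamma_{12}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}$$
and similarly
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\gamma_{12}^W(1)\gamma_{11}^W(0) - \gamma_{11}^W(1)\gamma_{12}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}$$
$$\operatorname{plim} \hat{\varphi}_{21} = \frac{\gamma_{21}^W(1)\gamma_{22}^W(0) - \gamma_{22}^W(1)\gamma_{12}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}$$
$$\operatorname{plim} \hat{\varphi}_{22} = \frac{\gamma_{22}^W(1)\gamma_{11}^W(0) - \gamma_{21}^W(1)\gamma_{12}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}$$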
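The probability limits above amount to solving a 2 × 2 linear system. The following sketch computes them both by matrix inversion and with the scalar formula for $\operatorname{plim} \hat{\varphi}_{12}$. The input matrices are illustrative numbers assumed for this example, and the function names are hypothetical.

```python
import numpy as np


def plim_var1_ols(gamma_W0, gamma_W1):
    """Probability limit of the OLS VAR(1) coefficient matrix.

    Row i of the result holds (plim phi_i1, plim phi_i2), obtained from
    Phi' = Gamma_W(0)^{-1} Gamma_W(1)'.
    """
    return np.linalg.solve(gamma_W0, gamma_W1.T).T


def plim_phi12_scalar(g0, g1):
    """Scalar formula for plim phi_12 in the bivariate case."""
    det = g0[0, 0] * g0[1, 1] - g0[0, 1] ** 2
    return (g1[0, 1] * g0[0, 0] - g1[0, 0] * g0[0, 1]) / det


# Assumed example values: Gamma_W(0) must be symmetric positive definite.
gamma_W0 = np.array([[4.0, 2.0],
                     [2.0, 20.0 / 3.0]])
gamma_W1 = np.array([[1.5, 0.5],
                     [3.0, 19.0 / 6.0]])

Phi_tilde = plim_var1_ols(gamma_W0, gamma_W1)
phi12_from_formula = plim_phi12_scalar(gamma_W0, gamma_W1)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical convenience; both routes give the same coefficients.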
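For the $t$ statistics
$$t(\hat{\varphi}_{ij}) = \hat{\varphi}_{ij} / se(\hat{\varphi}_{ij}),$$
we get
$$Var(\hat{\varphi}_{11}) = \frac{\gamma_{22}^W(0)\,\hat{\sigma}_1^2}{(T-1)\left[\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2\right]}$$
$$Var(\hat{\varphi}_{12}) = \frac{\gamma_{11}^W(0)\,\hat{\sigma}_1^2}{(T-1)\left[\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2\right]}$$
$$Var(\hat{\varphi}_{21}) = \frac{\gamma_{22}^W(0)\,\hat{\sigma}_2^2}{(T-1)\left[\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2\right]}$$
$$Var(\hat{\varphi}_{22}) = \frac{\gamma_{11}^W(0)\,\hat{\sigma}_2^2}{(T-1)\left[\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2\right]}$$
$$\hat{\sigma}_1^2 = \frac{1}{T-2}\sum (W_{1\tau} - \hat{\varphi}_{11} W_{1\tau-1} - \hat{\varphi}_{12} W_{2\tau-1})^2 = \frac{T-1}{T-2}\left[(1 + \hat{\varphi}_{11}^2)\gamma_{11}^W(0) + \hat{\varphi}_{12}^2\gamma_{22}^W(0) - 2\hat{\varphi}_{11}\gamma_{11}^W(1) - 2\hat{\varphi}_{12}\gamma_{12}^W(1) + 2\hat{\varphi}_{11}\hat{\varphi}_{12}\gamma_{12}^W(0)\right]$$
and
$$\hat{\sigma}_2^2 = \frac{T-1}{T-2}\left[(1 + \hat{\varphi}_{22}^2)\gamma_{22}^W(0) + \hat{\varphi}_{21}^2\gamma_{11}^W(0) - 2\hat{\varphi}_{21}\gamma_{21}^W(1) - 2\hat{\varphi}_{22}\gamma_{22}^W(1) + 2\hat{\varphi}_{21}\hat{\varphi}_{22}\gamma_{12}^W(0)\right],$$
where $T$ is the effective sample size after aggregation. For the case of systematic sampling
$$\gamma_{12}^W(k) = (1 + L + \cdots + L^{m-1})^{d_1 + d_2}\, \gamma_{12}^w(mk + d_2(m-1))$$
$$\gamma_{21}^W(k) = (1 + L + \cdots + L^{m-1})^{d_1 + d_2}\, \gamma_{21}^w(mk + d_1(m-1))$$
and
$$\gamma_{ii}^W(k) = (1 + L + \cdots + L^{m-1})^{2d_i}\, \gamma_{ii}^w(mk + d_i(m-1))$$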
An important and well-known problem of temporal aggregation or systematic sampling is the creation of contemporaneous correlation even when such a correlation is absent in the basic series. Using the VAR(1) system in (12) with $\varphi_{11} = 0$ and $\varphi_{22} = 0$, Breitung and Swanson (2002) and Ericsson et al. (2001) examined the effect of temporal aggregation on the contemporaneous regression coefficient for $m = 2$ and observed that this coefficient could be positive, negative, or zero. Here we generalize their result to the case of systematic sampling for any $m$. From the contemporaneous regression relationship $W_{2\tau} = c\, W_{1\tau} + u_\tau$ with systematically sampled data we get
$$\hat{c} = \frac{\sum W_{1\tau} W_{2\tau}}{\sum W_{1\tau}^2}, \quad \text{and} \quad \operatorname{plim} \hat{c} = \frac{\gamma_{12}^W(0)}{\gamma_{11}^W(0)}$$
and the corresponding test statistic is given by
$$t(\hat{c}) = \hat{c} / se(\hat{c})$$
with
$$Var(\hat{c}) = \frac{\gamma_{22}^W(0) + \hat{c}^2\gamma_{11}^W(0) - 2\hat{c}\,\gamma_{12}^W(0)}{(T-1)\,\gamma_{11}^W(0)},$$
where
$$\gamma_{12}^W(0) = (1 + L + \cdots + L^{m-1})^{d_1 + d_2}\, \gamma_{12}^w(d_2(m-1))$$
and
$$\gamma_{ii}^W(0) = (1 + L + \cdots + L^{m-1})^{2d_i}\, \gamma_{ii}^w(d_i(m-1))$$
It can be shown that the above parameters, described in (25)–(30), of the systematically sampled process can be expressed in terms of the moments of the disaggregated process, and these in turn can be expressed in terms of the parameters of the original basic disaggregated process using (13)–(22). In order to examine the effect of systematic sampling on Granger causality, we consider three cases: (i) there is no Granger causality between the variables in the disaggregated form; (ii) causality between the variables in the disaggregated form is one-sided; and (iii) causality between the variables in the disaggregated form is bi-directional.
Case 1: No Granger causality between the variables in the disaggregated form
Proposition 2.
If there does not exist Granger causality between the basic series then the Granger causality between the systematically sampled series is also absent.
Proof of Proposition 2.
In this case $\varphi_{12} = \varphi_{21} = 0$ with $cov(e_{1t}, e_{2t}) = 0$. Therefore, from (18), (19) and (22), $\gamma_{ij}^w(k) = 0$ for all $k$ and $i \neq j$ ($i, j = 1, 2$). Further, we can see that $\gamma_{ij}^W(k) = 0$ for all $k$ and $i \neq j$. Thus, if the cross-covariances between the basic series are zero then the cross-covariances between the systematically sampled series will also be zero. From (25) and (26) we can see that $\operatorname{plim} \hat{\varphi}_{12} = \operatorname{plim} \hat{\varphi}_{21} = 0$. Thus, if there is no Granger causality between the basic series then Granger causality between the systematically sampled series will also be absent. ☐
The general result described by Proposition 2 does not depend on the $d_i$’s. In particular, two systematically sampled independent random walk processes will remain causally unrelated when they are estimated in the differenced form. It can also be inferred that $\operatorname{plim} \hat{c} = 0$, suggesting that systematic sampling does not create any contemporaneous correlation among the variables when Granger causality between the variables in the disaggregated form is absent.
Case 2: Causality between the disaggregated series is one-sided
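Let $\varphi_{12} = 0$ such that $w_{2t}$ does not Granger cause $w_{1t}$ and there exists uni-directional causality from $w_{1t}$ to $w_{2t}$. It can be shown that
$$\gamma_{11}^w(0) = \sigma_{w_1}^2 \quad \text{and} \quad \gamma_{11}^w(k) = \varphi_{11}\gamma_{11}^w(k-1) = \varphi_{11}^k \sigma_{w_1}^2$$
$$\gamma_{12}^w(0) = \frac{\varphi_{11}\varphi_{21}\sigma_{w_1}^2}{1 - \varphi_{11}\varphi_{22}}, \quad \gamma_{12}^w(k) = \varphi_{11}\gamma_{12}^w(k-1) = \varphi_{11}^k\, \frac{\varphi_{11}\varphi_{21}\sigma_{w_1}^2}{1 - \varphi_{11}\varphi_{22}} \quad \forall k > 0$$
$$\gamma_{22}^w(k) = \varphi_{21}\gamma_{12}^w(k-1) + \varphi_{22}\gamma_{22}^w(k-1) = \varphi_{21}\varphi_{11}^{k-1}\, \frac{\varphi_{11}\varphi_{21}\sigma_{w_1}^2}{1 - \varphi_{11}\varphi_{22}} + \varphi_{22}\gamma_{22}^w(k-1)$$
$$\gamma_{21}^w(k) = \varphi_{21}\gamma_{11}^w(k-1) + \varphi_{22}\gamma_{21}^w(k-1) = \varphi_{11}^{k-1}\varphi_{21}\sigma_{w_1}^2 + \varphi_{22}\gamma_{21}^w(k-1) \quad \forall k > 0$$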
It has been well established in the earlier literature (Mamingi (1996) and Rajaguru (2004b)) that, for the stationary case, systematic sampling preserves the direction of Granger causality. In this section, we establish the condition under which the unidirectional causal system turns into a feedback system due to systematic sampling. Thus we have the following theorem.
Theorem 1.
In a bivariate VAR(1) framework, systematic sampling induces spurious bi-directional Granger causality among the variables if the uni-directional causality runs from a non-stationary series to either a stationary or a non-stationary series.
Equivalently, systematic sampling induces spurious bi-directional Granger causality among the variables if $d_1 > 0$.
Proof of Theorem 1.
See Appendix A. ☐
It can also be shown that the expression for $\operatorname{plim} \hat{\varphi}_{12}$ when $\varphi_{12} = 0$ in the basic form, for the case of systematic sampling with $d_1 = d_2 = 1$, is the same as in the case of temporal aggregation (see Rajaguru and Abeysinghe 2010) with $d_1 = d_2 = 0$.
That is, systematic sampling induces spurious Granger causality when $d_1 = d_2 = 1$, and it can be expressed as
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\varphi_{11}(1 + \varphi_{11} + \varphi_{11}^2 + \cdots + \varphi_{11}^{m-1})^2\, \sigma_{w_1}^2 \left[\gamma_{11}^W(0)\, \dfrac{\varphi_{11}\varphi_{21}}{1 - \varphi_{11}\varphi_{22}} - \gamma_{12}^W(0)\right]}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}$$
It can also be shown that if the one-sided causality runs from a white noise series (in differences) to a difference-stationary series in the basic disaggregated form, then systematic sampling will not produce a spurious feedback relationship even if $d_1 = 1$. However, this may not hold when $d_1 > 1$. In general, as $m$ increases VAR(1) tends to become VAR(0). However, when $\varphi_{11}$ reaches unity, we get a near co-integrated specification in the I(2) space and, as a result, VAR(1) remains VAR(1) as $m$ increases. The conversion from VAR(1) to VAR(0) at higher orders of systematic sampling confirms that the converse of Proposition 2 need not be true. In turn, we can conclude that not finding causality among the variables with systematically sampled data does not necessarily mean that the variables are unrelated in the disaggregated form. It can also be shown that $\operatorname{plim} \hat{c} = \gamma_{12}^W(0) / \gamma_{11}^W(0) \neq 0$ as $m$ increases, and all causal information concentrates in the contemporaneous relationship among the variables due to systematic sampling of integrated processes. Moreover, the spurious contemporaneous relationships do not disappear even if the sampling interval becomes larger.
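Theorem 1 can be checked numerically by combining Proposition 1 with the probability limits in (25) and (26). The sketch below does this for a one-way system with $\varphi_{12} = 0$ and $m = 2$. The parameter values and all function names are assumptions made for illustration; the code is a sanity check under those assumptions and not a substitute for the proof in Appendix A.

```python
import numpy as np


def var1_gamma(Phi, Sigma, k):
    """Gamma_w(k) = E(w_t w_{t-k}') of a stationary VAR(1)."""
    n = Phi.shape[0]
    vec0 = np.linalg.solve(np.eye(n * n) - np.kron(Phi, Phi), Sigma.reshape(-1))
    gamma_k = np.linalg.matrix_power(Phi, abs(k)) @ vec0.reshape(n, n)
    return gamma_k if k >= 0 else gamma_k.T


def sampled_gamma(Phi, Sigma, k, m, d):
    """Gamma_W(k) of the sampled, differenced series via Proposition 1."""
    n = Phi.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Coefficients of (1 + L + ... + L**(m-1))**(d_i + d_j).
            weights = np.array([1.0])
            for _ in range(d[i] + d[j]):
                weights = np.convolve(weights, np.ones(m))
            start = m * k + d[j] * (m - 1)
            out[i, j] = sum(c * var1_gamma(Phi, Sigma, start - s)[i, j]
                            for s, c in enumerate(weights))
    return out


def plim_phi(Phi, Sigma, m, d):
    """plim of the OLS VAR(1) coefficient matrix fitted to sampled data."""
    g0 = sampled_gamma(Phi, Sigma, 0, m, d)
    g1 = sampled_gamma(Phi, Sigma, 1, m, d)
    return np.linalg.solve(g0, g1.T).T


# Illustrative one-way system (assumed values): phi_12 = 0, so w_2 does not
# Granger cause w_1 in the basic series.
Phi = np.array([[0.5, 0.0],
                [0.5, 0.5]])
Sigma = np.eye(2)

phi12_stationary = plim_phi(Phi, Sigma, m=2, d=(0, 0))[0, 1]
phi12_integrated = plim_phi(Phi, Sigma, m=2, d=(1, 1))[0, 1]
phi12_mixed = plim_phi(Phi, Sigma, m=2, d=(1, 0))[0, 1]
phi12_effect_only = plim_phi(Phi, Sigma, m=2, d=(0, 1))[0, 1]
```

In this example the limit of $\hat{\varphi}_{12}$ is zero whenever $d_1 = 0$ and non-zero whenever $d_1 > 0$, whatever the value of $d_2$, which is the pattern the theorem describes.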
Case 3: Granger causality between the original series is bi-directional
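In this case both $\varphi_{12}$ and $\varphi_{21}$ are non-zero. The required aggregated parameters ($\operatorname{plim} \hat{\varphi}_{12}$, $\operatorname{plim} \hat{\varphi}_{21}$) are given in (25) and (26). To make computations easier, without loss of generality, we set $\varphi_{11} = 0$ and $\varphi_{22} = 0$. When both $\varphi_{11} = 0$ and $\varphi_{22} = 0$, the underlying VAR(1) process is stationary if and only if $|\varphi_{12}\varphi_{21}| < 1$.
$$\gamma_{12}^w(0) = 0$$
$$\gamma_{11}^w(1) = 0 \quad \text{and} \quad \gamma_{11}^w(k) = \varphi_{12}\gamma_{21}^w(k-1)$$
$$\gamma_{22}^w(1) = 0 \quad \text{and} \quad \gamma_{22}^w(k) = \varphi_{21}\gamma_{12}^w(k-1)$$
$$\gamma_{12}^w(1) = \varphi_{12}\sigma_{w_2}^2 \quad \text{and} \quad \gamma_{12}^w(k) = \varphi_{12}\gamma_{22}^w(k-1)$$
$$\gamma_{21}^w(1) = \varphi_{21}\sigma_{w_1}^2 \quad \text{and} \quad \gamma_{21}^w(k) = \varphi_{21}\gamma_{11}^w(k-1)$$
Through recursive substitution, we also get
$$\gamma_{11}^w(2k-1) = 0, \quad \gamma_{11}^w(2k) = (\varphi_{12}\varphi_{21})^k \sigma_{w_1}^2, \quad k = 1, 2, \ldots$$
$$\gamma_{22}^w(2k-1) = 0, \quad \gamma_{22}^w(2k) = (\varphi_{12}\varphi_{21})^k \sigma_{w_2}^2, \quad k = 1, 2, \ldots$$
$$\gamma_{12}^w(2k-1) = \varphi_{12}(\varphi_{12}\varphi_{21})^{k-1}\sigma_{w_2}^2, \quad \gamma_{12}^w(2k) = 0, \quad k = 1, 2, \ldots$$
$$\gamma_{21}^w(2k-1) = \varphi_{21}(\varphi_{12}\varphi_{21})^{k-1}\sigma_{w_1}^2, \quad \gamma_{21}^w(2k) = 0, \quad k = 1, 2, \ldots$$
and
$$\sigma_{w_2}^2 = \frac{1 + \varphi_{21}^2}{1 - \varphi_{12}^2\varphi_{21}^2}, \quad \sigma_{w_1}^2 = \frac{1 + \varphi_{12}^2}{1 - \varphi_{12}^2\varphi_{21}^2}$$
where $\gamma_{ij}^W(0) = \gamma_{ij}^w(0)$ and $\gamma_{ij}^W(1) = \gamma_{ij}^w(m)$ if $d_1 = d_2 = 0$, and
$\gamma_{ij}^W(0) = (1 + L + \cdots + L^{m-1})^2 \gamma_{ij}^w((m-1))$ and $\gamma_{ij}^W(1) = (1 + L + \cdots + L^{m-1})^2 \gamma_{ij}^w(2m-1)$ if $d_1 = d_2 = 1$, for $i, j = 1, 2$.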
Scenario 1: Stationary processes: d 1 = d 2 = 0
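If $d_1 = d_2 = 0$ then $\gamma_{ij}^W(0) = \gamma_{ij}^w(0)$ and $\gamma_{ij}^W(1) = \gamma_{ij}^w(m)$. Now, the causal parameters of the model based on aggregated data take the form
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\gamma_{12}^W(1)\gamma_{11}^W(0) - \gamma_{11}^W(1)\gamma_{12}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2} \quad \text{and} \quad \operatorname{plim} \hat{\varphi}_{21} = \frac{\gamma_{21}^W(1)\gamma_{22}^W(0) - \gamma_{22}^W(1)\gamma_{21}^W(0)}{\gamma_{11}^W(0)\gamma_{22}^W(0) - (\gamma_{12}^W(0))^2}.$$
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\gamma_{12}^w(m)\gamma_{11}^w(0) - \gamma_{11}^w(m)\gamma_{12}^w(0)}{\gamma_{11}^w(0)\gamma_{22}^w(0) - (\gamma_{12}^w(0))^2} \quad \text{and} \quad \operatorname{plim} \hat{\varphi}_{21} = \frac{\gamma_{21}^w(m)\gamma_{22}^w(0) - \gamma_{22}^w(m)\gamma_{21}^w(0)}{\gamma_{11}^w(0)\gamma_{22}^w(0) - (\gamma_{12}^w(0))^2}$$
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\gamma_{12}^w(m)\sigma_{w_1}^2 - 0}{\sigma_{w_1}^2\sigma_{w_2}^2 - 0} \quad \text{and} \quad \operatorname{plim} \hat{\varphi}_{21} = \frac{\gamma_{21}^w(m)\sigma_{w_2}^2 - 0}{\sigma_{w_1}^2\sigma_{w_2}^2 - 0}$$
$$\operatorname{plim} \hat{\varphi}_{12} = \frac{\gamma_{12}^w(m)}{\sigma_{w_2}^2} \quad \text{and} \quad \operatorname{plim} \hat{\varphi}_{21} = \frac{\gamma_{21}^w(m)}{\sigma_{w_1}^2}$$
$$\operatorname{plim} \hat{\varphi}_{12} = \begin{cases} \varphi_{12}(\varphi_{12}\varphi_{21})^{(m-1)/2}, & \text{if } m \text{ is odd} \\ 0, & \text{if } m \text{ is even} \end{cases}$$
and
$$\operatorname{plim} \hat{\varphi}_{21} = \begin{cases} \varphi_{21}(\varphi_{12}\varphi_{21})^{(m-1)/2}, & \text{if } m \text{ is odd} \\ 0, & \text{if } m \text{ is even} \end{cases}$$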
The interesting feature of the above derivation is that systematic sampling preserves the feedback causal relation among the variables when the order of systematic sampling is odd, at lower levels of aggregation. Since by construction $|\varphi_{12}\varphi_{21}| < 1$, from (45) and (46) we can conclude that VAR(1) becomes VAR(0) as $m$ increases even if $m$ is odd. On the other hand, when the order of aggregation is even, one may not observe any causal relationship among the variables even at lower levels of aggregation, although the causality between them is bi-directional. We can also observe from the systematically sampled data that $\operatorname{plim} \hat{\varphi}_{11} \neq 0$ and $\operatorname{plim} \hat{\varphi}_{22} \neq 0$ when the order of systematic sampling $m$ is even. However, we may not observe these patterns when $\varphi_{11} \neq 0$ or $\varphi_{22} \neq 0$. If either $\varphi_{11} \neq 0$ or $\varphi_{22} \neq 0$, then we observe that $\gamma_{12}^w(0) \neq 0$. In turn, we get $\operatorname{plim} \hat{\varphi}_{12} \neq 0$ and $\operatorname{plim} \hat{\varphi}_{21} \neq 0$ even if the order of aggregation is even. Causal inferences based on systematically sampled data could be misleading when $m$ is large, as the causal parameter may become insignificant. Another key conclusion from this exercise is that the converse of Proposition 2 need not be true, i.e., not finding Granger causality among the variables based on systematically sampled data does not imply the absence of Granger causality in the basic form. The above results also point to the misspecification involved in dynamic relationships among the variables in the aggregated form.
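The odd and even pattern in (45) and (46) can be reproduced with a few lines of linear algebra. The sketch below computes $\Gamma_w(0)^{-1}\Gamma_w(m)'$ for a feedback system with $\varphi_{11} = \varphi_{22} = 0$ and compares the result with the closed form. The values $\varphi_{12} = 0.8$ and $\varphi_{21} = 0.5$ and the function names are assumptions chosen for illustration.

```python
import numpy as np


def plim_phi_stationary(Phi, Sigma, m):
    """plim of OLS VAR(1) coefficients when every m-th observation is kept (d = 0)."""
    n = Phi.shape[0]
    vec0 = np.linalg.solve(np.eye(n * n) - np.kron(Phi, Phi), Sigma.reshape(-1))
    gamma0 = vec0.reshape(n, n)                         # Gamma_w(0)
    gamma_m = np.linalg.matrix_power(Phi, m) @ gamma0   # Gamma_w(m)
    return np.linalg.solve(gamma0, gamma_m.T).T


def closed_form_phi12(phi12, phi21, m):
    """Closed form for plim phi_12 when phi_11 = phi_22 = 0."""
    if m % 2 == 0:
        return 0.0
    return phi12 * (phi12 * phi21) ** ((m - 1) // 2)


phi12, phi21 = 0.8, 0.5          # assumed values with |phi12 * phi21| < 1
Phi = np.array([[0.0, phi12],
                [phi21, 0.0]])
Sigma = np.eye(2)

results = {m: plim_phi_stationary(Phi, Sigma, m) for m in (1, 2, 3, 4, 5)}
for m, Phi_tilde in results.items():
    print(m, Phi_tilde.round(4))
```

For even $m$ the off-diagonal limits vanish while the diagonal ones do not, so the sampled system looks like two unrelated autoregressions.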
Contemporaneous Correlations:
Based on the contemporaneous regression equation for systematic sampling when $d_1 = d_2 = 0$ we get
$$\operatorname{plim} \hat{c} = \frac{\gamma_{12}^W(0)}{\gamma_{11}^W(0)} = \frac{\gamma_{12}^w(0)}{\gamma_{11}^w(0)} = 0$$
Thus, when both $\varphi_{11} = \varphi_{22} = 0$ and $d_1 = d_2 = 0$, systematic sampling does not induce any contemporaneous relations among the variables of interest. However, as in Breitung and Swanson (2002), if the $\varphi_{ij}$’s, $i, j = 1, 2$, are all non-zero then $\gamma_{12}^w(0) \neq 0$ and hence we observe contemporaneous correlations between the variables.
Scenario 2: Integrated Process: $d_1 = d_2 = 1$
Now, the causal parameters of the model based on systematically sampled data take the form

$$\operatorname{plim}\hat{\phi}_{12} = \frac{\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}} \quad\text{and}\quad \operatorname{plim}\hat{\phi}_{21} = \frac{\gamma_{21}^{W}(1)\gamma_{22}^{W}(0) - \gamma_{22}^{W}(1)\gamma_{21}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}},$$

where $\gamma_{ij}^{W}(0) = (1 + L + \cdots + L^{m-1})^{2}\gamma_{ij}^{w}(m-1)$ and $\gamma_{ij}^{W}(1) = (1 + L + \cdots + L^{m-1})^{2}\gamma_{ij}^{w}(2m-1)$. Notice that the expression above for systematic sampling when $d_1 = d_2 = 1$ is the same as that for temporal aggregation when $d_1 = d_2 = 0$ (see Rajaguru and Abeysinghe (2010) for temporal aggregation). Thus all the inferences made for temporal aggregation of stationary processes are applicable to systematic sampling of I(1) processes.
The key findings are summarized below:
  • Just as in the one-way causal system the VAR(1) in the feedback system tends to become VAR(0) as m increases.
  • What is more disturbing, though, is that a positive $\phi_{12}$ may yield a negative $\operatorname{plim}\hat{\phi}_{12}$. Furthermore, the magnitudes of $\operatorname{plim}\hat{\phi}_{12}$ are such that in practice it is quite possible to conclude that causality is one-way though it is bi-directional.
Contemporaneous Correlation:
Again, consider estimating the contemporaneous regression equation given by (29). The cross-covariances in this expression take the form

$$\gamma_{ij}^{W}(0) = (1 + L + \cdots + L^{m-1})^{2}\gamma_{ij}^{w}(m-1) \quad\text{and}\quad \gamma_{ij}^{W}(1) = (1 + L + \cdots + L^{m-1})^{2}\gamma_{ij}^{w}(2m-1).$$

Notice that the cross-covariances for systematic sampling when $d_1 = d_2 = 1$ are the same as for temporal aggregation when $d_1 = d_2 = 0$. Again, all the inferences made for temporal aggregation of stationary processes are applicable to systematic sampling of I(1) processes. The results are consistent with Breitung and Swanson (2002).
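To make the lag-polynomial notation concrete, the sketch below expands $(1 + L + \cdots + L^{m-1})^{2}$ into its coefficients and applies it to an autocovariance function, using the convention $L\,\gamma(k) = \gamma(k-1)$. The helper names and the AR(1)-type autocovariance $\gamma(k) = 0.5^{|k|}$ are assumptions chosen purely for illustration.

```python
import numpy as np

def aggregation_weights(m, power):
    """Coefficients of L^0, L^1, ... in (1 + L + ... + L^(m-1))**power."""
    base = np.ones(m, dtype=int)
    weights = np.array([1], dtype=int)
    for _ in range(power):
        weights = np.convolve(weights, base)
    return weights

def apply_lag_polynomial(weights, gamma, start_lag):
    """Evaluate sum_i weights[i] * gamma(start_lag - i), i.e. the lag
    polynomial applied to the function gamma at lag start_lag."""
    return sum(c * gamma(start_lag - i) for i, c in enumerate(weights))

if __name__ == "__main__":
    m = 3
    w = aggregation_weights(m, 2)
    print(w.tolist())
    # Hypothetical autocovariance function of an AR(1)-type process.
    gamma = lambda k: 0.5 ** abs(k)
    print(apply_lag_polynomial(w, gamma, m - 1))
```

With $m = 3$ the weights are symmetric around the middle lag, which is why applying the squared polynomial at lag $m - 1$ yields the variance of a sum of $m$ consecutive observations.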

4. Monte Carlo Simulation

We find for the bivariate VAR(1) case that systematic sampling induces spurious bi-directional Granger causality among the variables if the uni-directional causality runs from a non-stationary series to either a stationary or a non-stationary series. However, this result may not carry over to higher-order and higher-dimensional VAR processes. In this section, we use an extensive Monte Carlo simulation to examine the validity of Theorem 1 for the cases where the system has (i) more than one lag and (ii) more than two variables. We consider the following four scenarios to analyse the consequences of systematic sampling on Granger causality: (1) bivariate VAR(1) process; (2) bivariate VAR(2) process; (3) trivariate VAR(1) and (4) bivariate VAR(1) with a non-synchronous data generating process. For the first three cases, we assume that the data are generated from an equally spaced discrete time series process. Financial time series, however, are observed at random intervals. It is therefore important to assess the validity of Theorem 1 in the more realistic situation where the data arrive at random times and the Granger causality tests are conducted on equally spaced observations sampled from non-synchronous data. We use scenario 4 to analyse this case for the bivariate VAR(1) process.
Scenario 1:
Consider the following data generating process (DGP) where the observations are drawn from an equally spaced bivariate VAR(1) model:
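$$\begin{pmatrix} w_{1t} \\ w_{2t} \end{pmatrix} = \begin{pmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{pmatrix} \begin{pmatrix} w_{1t-1} \\ w_{2t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}, \qquad \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right)$$

where $z_{1t} \sim I(d_1)$ and $z_{2t} \sim I(d_2)$ such that $w_{it} = (1 - L)^{d_i} z_{it}$ for $i = 1, 2$.
In order to identify the source of causal distortion we consider the combinations of situations where (i) both $z_{1t} \sim I(0)$ and $z_{2t} \sim I(0)$; (ii) $z_{1t} \sim I(0)$ and $z_{2t} \sim I(1)$; (iii) $z_{1t} \sim I(1)$ and $z_{2t} \sim I(0)$ and (iv) both $z_{1t} \sim I(1)$ and $z_{2t} \sim I(1)$.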
We assume that the unidirectional causality runs from $z_1$ to $z_2$ in the basic disaggregated form by setting the parameter $\phi_{12} = 0$. The remaining parameters are randomly drawn from the uniform distribution $U(-10, 10)$ such that the roots of the VAR(1) polynomial lie outside the unit circle. We randomly generated 100,000 such models to examine the validity of our theoretical results. For each model, we ran a Monte Carlo simulation with 100,000 replications. For each model at each replication, we randomly generated 1200 observations (representing 100 years of monthly data). This represents the case where the order of aggregation is $m = 0$. We subsequently used sampling frequencies of $m = 3$ (400 quarterly observations from the sample of 1200) and $m = 12$ (100 annual observations). The rejection frequencies (in percentage) of $\hat{\phi}_{12}$ and $\hat{\phi}_{21}$ at the 5% level of significance are recorded across all 100,000 models over 100,000 replications at each level of aggregation ($m = 0$, $m = 3$ and $m = 12$). The results are reported in Panel A of Table 1. They reconfirm the validity of Theorem 1 and show that if the causality runs from a stationary variable to either a stationary or a non-stationary variable then the direction of unidirectional causality is preserved at the lower level of aggregation (sampling interval $m = 3$). Systematic sampling induces spurious bidirectional Granger causality in about 35% of the cases where the unidirectional causality runs from a non-stationary variable to either a stationary or a non-stationary variable in the basic disaggregated form. Moreover, the VAR(1) converges to VAR(0) at the higher level of aggregation (sampling interval $m = 12$).
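A scaled-down sketch of this experiment is given below for readers who wish to reproduce its logic. It is not the authors' code: it uses a single fixed parameter matrix instead of 100,000 random draws, far fewer replications, an asymptotic critical value of 1.96, and hypothetical function names. In this sketch `m = 1` denotes the unsampled series (the case the text labels $m = 0$), and an I(1) component is differenced after sampling.

```python
import numpy as np

def simulate_levels(phi, d, n, rng, burn=200):
    """Simulate w_t = phi @ w_{t-1} + e_t, then integrate component i
    d[i] times to obtain the level series z_t."""
    e = rng.standard_normal((n + burn, 2))
    w = np.zeros((n + burn, 2))
    for t in range(1, n + burn):
        w[t] = phi @ w[t - 1] + e[t]
    z = w[burn:].copy()
    for i in range(2):
        for _ in range(d[i]):
            z[:, i] = np.cumsum(z[:, i])
    return z

def granger_t_stats(z, d, m):
    """Keep every m-th level observation, difference component i d[i] times,
    fit a VAR(1) by OLS and return the t-statistics of phi12 and phi21."""
    s = z[m - 1::m]
    cols = []
    for i in range(2):
        x = s[:, i]
        for _ in range(d[i]):
            x = np.diff(x)
        cols.append(x)
    k = min(len(c) for c in cols)
    W = np.column_stack([c[-k:] for c in cols])
    X = np.column_stack([np.ones(k - 1), W[:-1]])   # const, lag w1, lag w2
    Y = W[1:]
    xtx_inv = np.linalg.inv(X.T @ X)
    B = xtx_inv @ X.T @ Y
    resid = Y - X @ B
    s2 = (resid ** 2).sum(axis=0) / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.outer(np.diag(xtx_inv), s2))
    t = B / se
    return t[2, 0], t[1, 1]   # (lag w2 in w1 equation, lag w1 in w2 equation)

def rejection_rates(phi, d, m, n=1200, reps=200, seed=0, crit=1.96):
    """Share of replications in which phi12 and phi21 are significant."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(2)
    for _ in range(reps):
        t12, t21 = granger_t_stats(simulate_levels(phi, d, n, rng), d, m)
        hits += [abs(t12) > crit, abs(t21) > crit]
    return hits[0] / reps, hits[1] / reps

if __name__ == "__main__":
    phi = np.array([[0.5, 0.0], [0.5, 0.3]])   # phi12 = 0: z1 causes z2 only
    for d in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(d, rejection_rates(phi, d, m=3))
```

The parameter values are assumptions chosen so that the system is stable; the rejection rates from such a small run are only indicative and will differ from the figures reported in Table 1.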
Scenario 2:
Consider the following data generating process (DGP) where the observations are drawn from an equally spaced bivariate VAR(2) model:
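$$\begin{pmatrix} w_{1t} \\ w_{2t} \end{pmatrix} = \begin{pmatrix} \phi_{1,11} & \phi_{1,12} \\ \phi_{1,21} & \phi_{1,22} \end{pmatrix} \begin{pmatrix} w_{1t-1} \\ w_{2t-1} \end{pmatrix} + \begin{pmatrix} \phi_{2,11} & \phi_{2,12} \\ \phi_{2,21} & \phi_{2,22} \end{pmatrix} \begin{pmatrix} w_{1t-2} \\ w_{2t-2} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}, \qquad \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right)$$

where $z_{1t} \sim I(d_1)$ and $z_{2t} \sim I(d_2)$ such that $w_{it} = (1 - L)^{d_i} z_{it}$, $d_i \in \{0, 1\}$ for $i = 1, 2$.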
We set $\phi_{1,12} = \phi_{2,12} = 0$, indicating that the unidirectional causality runs from $z_1$ to $z_2$ in the basic disaggregated form. As in scenario 1, all other parameters are drawn from a uniform distribution across all 100,000 models with 100,000 replications. The rejection frequencies (in percentage) of $\hat{\phi}_{1,12}$, $\hat{\phi}_{2,12}$, $\hat{\phi}_{1,21}$ and $\hat{\phi}_{2,21}$ at the 5% level of significance are reported in Panel B of Table 1. It is clear from these results that systematic sampling induces spurious bidirectional Granger causality when the underlying data generating process is VAR(2), regardless of the order of integration.
Scenario 3:
Consider the following data generating process (DGP) where the observations are drawn from equally spaced trivariate VAR(1) model:
$$\begin{pmatrix} w_{1t} \\ w_{2t} \\ w_{3t} \end{pmatrix} = \begin{pmatrix} \phi_{11} & \phi_{12} & \phi_{13} \\ \phi_{21} & \phi_{22} & \phi_{23} \\ \phi_{31} & \phi_{32} & \phi_{33} \end{pmatrix} \begin{pmatrix} w_{1t-1} \\ w_{2t-1} \\ w_{3t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \end{pmatrix}, \qquad \begin{pmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right)$$

where $z_{1t} \sim I(d_1)$, $z_{2t} \sim I(d_2)$ and $z_{3t} \sim I(d_3)$ such that $w_{it} = (1 - L)^{d_i} z_{it}$, $d_i \in \{0, 1\}$ for $i = 1, 2, 3$.
We set $\phi_{12} = 0$, indicating that the unidirectional causality runs from $z_1$ to $z_2$ in the basic disaggregated form. As in scenario 1, all other parameters are drawn from a uniform distribution across all 100,000 models with 100,000 replications. The rejection frequencies (in percentage) of $\hat{\phi}_{12}$ and $\hat{\phi}_{21}$ at the 5% level of significance at all levels of aggregation are reported in Panel C of Table 1. In a multivariate framework, the results show that the only case where the direction of Granger causality is preserved is when all variables are stationary. As in the bivariate case, VAR(1) converges to VAR(0) at the higher level of aggregation.
Scenario 4:
In order to analyse the effects of systematic sampling on Granger causality in a more realistic framework of non-synchronous data, we consider the following data generating process:
$$w_{1t} = \phi_{11} w_{1s} + \phi_{12} w_{2v} + e_{1t}$$

$$w_{2t'} = \phi_{21} w_{1s'} + \phi_{22} w_{2v'} + e_{2t'}$$

where $t$ and $t'$ are randomly chosen, sequentially available non-synchronous time periods; $w_{1s}$ and $w_{2v}$ are the previously available information at time $t$, and $w_{1s'}$ and $w_{2v'}$ are the previously available information at time $t'$. We first generate observations using the DGP above across 1200 grids, allowing the number of observations to differ across grids. We systematically sample the last observation from each grid to represent the case where $m = 0$. We subsequently systematically sample 400 and 100 observations to represent $m = 3$ and $m = 12$ respectively. For the DGP with an $I(1)$ variable, a drift term is introduced and is allowed to vary across grids. As in scenario 1, the rejection frequencies (in percentage) of $\hat{\phi}_{12}$ and $\hat{\phi}_{21}$ at the 5% level of significance are recorded across all 100,000 models over 100,000 replications. The results reported in Panel D of Table 1 show that the results based on the non-synchronous DGP are similar to those of the equally spaced DGP. However, VAR(1) converges to VAR(0) much faster for the equally spaced DGP than for the non-synchronous DGP.
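The step of taking the last observation from each grid can be sketched as follows. The function name is hypothetical, and carrying the previous tick forward into a grid that contains no observation is an assumption of this sketch, since the text does not say how empty grids are handled.

```python
import numpy as np

def last_tick_per_grid(times, values, grid_width, n_grids):
    """For each grid [k*width, (k+1)*width), return the last value observed
    before the end of that grid (previous-tick rule, an assumption here).
    Grids that end before the first observation are returned as NaN."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times, kind="stable")
    times, values = times[order], values[order]
    grid_ends = grid_width * np.arange(1, n_grids + 1)
    # Index of the last tick strictly before each grid end (-1 if none yet).
    idx = np.searchsorted(times, grid_ends, side="left") - 1
    out = np.full(n_grids, np.nan)
    has_obs = idx >= 0
    out[has_obs] = values[idx[has_obs]]
    return out

if __name__ == "__main__":
    # Irregularly spaced ticks (illustrative values only).
    t = [0.2, 0.7, 1.5, 3.1]
    v = [1.0, 2.0, 3.0, 4.0]
    base = last_tick_per_grid(t, v, grid_width=1.0, n_grids=4)
    print(base)          # one value per grid
    print(base[1::2])    # coarser systematic sample: every 2nd grid
```

Coarser sampling intervals are then obtained by keeping every $m$-th element of the gridded series, as in the equally spaced scenarios.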

5. Empirical Applications

5.1. Example 1—VIX vs. SPVXSTR I(0)/I(1)

The CBOE Volatility Index (VIX) is calculated from price quotes on the nearest and second nearest S&P 500 index options as described on the CBOE’s website at http://www.cboe.com/micro/vix/vixwhite.pdf. It represents a market estimate of expected 30 day stock market volatility, and is often described as the “investor fear gauge”. Standard and Poor’s also calculates a constant maturity VIX futures index called the S&P 500 VIX short-term total return index (SPVXSTR). The index has 30 days to expiration, and tracks a portfolio comprising positions in the nearest and second nearest futures with average maturity of 1 month at the close of trading each day, as described on Standard and Poor’s website (www.us.spindices.com).
These two series have been the focus of recent academic work assessing bidirectional causality in high frequency data. Frijns et al. (2015) find evidence of bi-directional Granger causality between the VIX and VIX futures, while Bollen et al. (2016) analyse the lead-lag relations between the SPVXSTR and the VIX and find that the VIX futures price lagged the VIX cash index in the first few years after it was launched, but that VIX futures now lead the VIX. This research suggests that these two series provide an important empirical application.
We test case 3 using intraday data for the SPVXSTR and VIX available from Thomson Reuters using the SIRCA portal from January 2010 to December 2014. Causality is analysed with sampling at 15 s, 1 min, 5 min, and 10 min intervals for each day. VIX is observed to be stationary in levels, while SPVXSTR (abbreviated VST in the tables) is non-stationary in levels.³ The summary results are reported in Appendix B, Table A1. The first column represents the underlying Granger causality at the lowest level of systematic sampling (15 s). Subsequently, the data are sampled at 1 min, 5 min, and 10 min intervals. The Granger causality results at these higher levels of systematic sampling are reported in the panels labelled 1 min, 5 min and 10 min. The results are consistent with the theoretical literature that bi-directional causality remains bi-directional at the lower level of sampling intervals. For example, 566 out of 1029 days remain bi-directional when the sampling interval is increased from 15 s to 1 min. The empirical results also reconfirm the theoretical finding that bi-directional Granger causality could be incorrectly interpreted as uni-directional Granger causality. For example, 440 days are misinterpreted as uni-directional causality from either VIX to VST or VST to VIX at the lower level of systematic sampling. At the higher sampling intervals, bi-directional causal relationships could be misinterpreted as no causal links between the variables of interest. The numbers of such cases at the 5 min and 10 min intervals are 486 and 646 respectively. What is left at the higher sampling intervals is the contemporaneous correlation between VIX and VST. Importantly, as in the theoretical results, the no-causal relationship remains the same at all levels of sampling intervals.
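The day-by-day bookkeeping behind a table such as Table A1 can be sketched as follows: each day is classified at each sampling interval from the two Granger causality p-values, and the classifications at the base and coarser intervals are then cross-tabulated. This is an illustrative reading of how such a table could be assembled; the function names, the 5% threshold applied to p-values, and the convention that a label names the causing variable are assumptions, not details confirmed by the text.

```python
from collections import Counter

def classify_day(p_first_to_second, p_second_to_first,
                 first="VIX", second="VST", alpha=0.05):
    """Label one day as 'Both', the name of the causing series, or 'None',
    given the p-values of the two Granger causality tests (hypothetical)."""
    forward = p_first_to_second < alpha
    backward = p_second_to_first < alpha
    if forward and backward:
        return "Both"
    if forward:
        return first
    if backward:
        return second
    return "None"

def transition_counts(base_labels, coarse_labels):
    """Count days by (label at base interval, label at coarser interval)."""
    return Counter(zip(base_labels, coarse_labels))

if __name__ == "__main__":
    # Made-up p-values for three days at 15 s and at 1 min (illustration only).
    p_15s = [(0.01, 0.02), (0.01, 0.40), (0.30, 0.60)]
    p_1min = [(0.03, 0.20), (0.04, 0.50), (0.70, 0.80)]
    base = [classify_day(a, b) for a, b in p_15s]
    coarse = [classify_day(a, b) for a, b in p_1min]
    for (row, col), n in sorted(transition_counts(base, coarse).items()):
        print(row, "->", col, n)
```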

5.2. Example 2—SPX vs. VIX I(0)/I(0)

The S&P 500 index (SPX) is the most widely used gauge for US equities, and its calculation is described on Standard and Poor’s website (www.us.spindices.com). The behavior of the S&P 500 versus the VIX is well documented in the finance literature (Whaley 2009). Like the VIX, the S&P 500 index is stationary in levels. We use intraday data for the VIX and SPX, again available from Thomson Reuters using the SIRCA portal from January 2010 to December 2014. Causality is analysed with sampling at 15 s, 1 min, 5 min and 10 min intervals. The summary results are reported in Appendix B, Table A2. We observe at 15 s intervals that SPX leads VIX in 781 cases. The uni-directionality remains uni-directional in 649 cases at a 1 min interval. Spurious causality from VIX to SPX is observed in only 14 cases. This is consistent with our theoretical finding that uni-directional causality does not induce spurious reverse causality for stationary variables. At the higher sampling intervals, uni-directional causal relationships could be misinterpreted as no causal links between the variables of interest. Again, the no-causal relationship remains the same at all levels of sampling intervals.

5.3. Example 3—ES1 vs. SPVXSTR I(1)/I(1)

The E-Mini futures contract is based on the S&P 500 index (ES), and the SC1 Index tracks the closest to maturity E-mini contract, rolling close to maturity. Due to data availability, index data were collected from Bloomberg at 1 min intervals (SC1/ES1). Like SPVXSTR, ES1 is non-stationary in levels. Causality is analysed with sampling at 1 min, 5 min and 10 min intervals. The summary results are reported in Appendix B, Table A3. The results are consistent with the theoretical finding that bi-directional causality between non-stationary variables turns into uni-directional causality at the lower level of sampling intervals. For the same case, the uni-directional causality from ES1 to VST (383 episodes at the 5 min interval) turns into reverse causality from VST to ES1 at the 10 min interval. This is consistent with our theoretical finding that causality between non-stationary variables leads to spurious Granger causality when they are estimated in differenced form. At the higher sampling intervals, bi-directional causal relationships could be misinterpreted as no causal links between the variables of interest even if the non-stationary variables are estimated in differenced form.

5.4. Example 4—SPX, VIX and RV

The Monte Carlo results discussed in Section 4 show that Granger causality among three variables is preserved as long as all three variables are stationary and are estimated in a VAR(1) framework. In this example, we evaluate the causal relationship between SPX, VIX and the realized volatility of SPX (RV). The realized volatility at the 15 s frequency is constructed by estimating a heterogeneous autoregressive realized volatility (HAR-RV) model (see Corsi (2009), Wang et al. (2017) and the citations therein).⁴ We further use these realized volatility measures to examine the causal relationship between SPX, VIX and RV within a VAR framework by sampling at 15 s, 1 min, 5 min and 10 min intervals. In particular, this helps to compare and contrast the effect of alternative volatility measures (VIX and RV) on SPX and vice versa. The summary results are reported in Appendix B, Table A4.
We observe at 15 s intervals that SPX leads VIX in 742 cases and SPX leads RV in 811 cases. The uni-directionality remains uni-directional at the lower sampling intervals. This is consistent with our simulation results for the three-variable case that uni-directional causality does not induce spurious reverse causality for stationary variables. At the higher sampling intervals, uni-directional causal relationships could be misinterpreted as no causal links between the variables of interest. The results are consistent across the different volatility measures (VIX and RV). They also show strong bidirectional causality between the volatility measures VIX and RV.

6. Conclusions

Economists often have to use systematically sampled data in Granger causality testing. It was known in the theoretical literature that temporal aggregation may distort the causal links between variables while systematic sampling preserves the causal directions. Our exercise provides an analytical, quantitative assessment of the nature of the distortions created by systematic sampling. The following observations emerge from this exercise: (1) If the one-sided causality runs from a white noise series (in differences) to a difference-stationary series in the basic disaggregated form then systematic sampling will not produce a spurious feedback relationship even if $d_1 = 1$. However, this may not hold when $d_1 > 1$; this may be similar to the case of temporal aggregation of non-stationary variables; (2) As $m$ increases, VAR(1) tends to become VAR(0). However, when $\phi_{11}$ reaches unity, we get a near co-integrated specification in the I(2) space and as a result VAR(1) remains VAR(1) as $m$ increases; (3) It can also be observed from the contemporaneous regressions that all causal information concentrates in the contemporaneous relationship among the variables due to systematic sampling of integrated processes. Moreover, the spurious contemporaneous relationships do not disappear even if the order of aggregation is larger.
The empirical results based on the stationary variables (SPX vs. VIX) show that a uni-directional causal relationship remains uni-directional at lower sampling intervals. This is consistent with our theoretical finding that the uni-directional causality does not induce spurious reverse causality for the stationary variables. At the higher sampling intervals, uni-directional causal relationships could be misinterpreted as no-causal links between the variables of interest. On the other hand, the causality between the non-stationary variables (ES1 vs. SPVXSTR) induces spurious causal relationships when they are estimated in differenced form. This is consistent with our theoretical findings that systematic sampling induces spurious causality when the non-stationary variables are estimated in differenced form.

Author Contributions

M.O. and G.R. have contributed to Section 1, Section 5 and Section 6. T.A. and G.R. contributed to Section 2 and Section 3. G.R. contributed to Section 4.

Acknowledgments

We thank the Editors and two anonymous referees for their constructive comments, which helped us to improve the quality of the paper. We thank James Todd for excellent research assistance. The usual disclaimer applies.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

For the completeness of the proof of this theorem, we need to consider the following four scenarios: (1) $d_1 = d_2 = 0$; (2) $d_1 = 0$ but $d_2 > 0$; (3) $d_1 > 0$ but $d_2 = 0$; and (4) $d_1 > 0$ and $d_2 > 0$.
Without loss of generality, assume⁵ that $\phi_{11} \neq 0$, $\phi_{21} \neq 0$ and $\phi_{22} \neq 0$.
Scenario 1: Suppose that both $d_1 = d_2 = 0$. Then the variance, autocovariances and cross covariances of the systematically sampled series take the form $\gamma_{ij}^{W}(k) = \gamma_{ij}^{w}(mk)$, $i, j = 1, 2$. Now (33) takes the form

$$\begin{aligned}
\operatorname{plim}\hat{\phi}_{12} &= \frac{\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}} = \frac{\gamma_{12}^{w}(m)\gamma_{11}^{w}(0) - \gamma_{11}^{w}(m)\gamma_{12}^{w}(0)}{\gamma_{11}^{w}(0)\gamma_{22}^{w}(0) - (\gamma_{12}^{w}(0))^{2}} \\
&= \frac{\phi_{11}^{m}\,\dfrac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}}\,(\sigma_{w_1}^{2}) - (\phi_{11}^{m}\sigma_{w_1}^{2})\,\dfrac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}}}{\gamma_{11}^{w}(0)\gamma_{22}^{w}(0) - (\gamma_{12}^{w}(0))^{2}} = 0
\end{aligned}$$
Thus, if $d_1 = d_2 = 0$ and $\phi_{12} = 0$ then $\operatorname{plim}\hat{\phi}_{12} = 0$, suggesting that if the Granger causality between the stationary series is uni-directional then systematic sampling preserves the direction of causality. This is another proof of the results in Wei (1982) and Cunningham and Vilasuso (1995) (the latter based on Monte Carlo simulations). Based on this result, we strongly recommend that practitioners who study, for example, the relationship between short- and long-term interest rates not use time averages of the interest rates if the rates are I(0) series. They should instead use systematically sampled values such as end-of-period rates.
Scenario 2: $d_1 = 0$ but $d_2 > 0$
Here the VAR is constructed for $z_{1t}$ and $\Delta^{d_2} z_{2t}$. In this case, by construction, unidirectional Granger causality runs from a stationary series to a non-stationary series in the disaggregated form. Then the variance, autocovariances and cross covariances of the systematically sampled series take the form

$$\gamma_{ij}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{ij}^{w}(mk + d_2(m-1)) \quad\text{and}\quad \gamma_{ji}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{ji}^{w}(mk), \quad i, j = 1, 2.$$

Let $c_i$ be the coefficient of $L^{i}$ in the expression $(1 + L + L^{2} + \cdots + L^{m-1})^{d_2}$. Now (33) takes the form

$$\operatorname{plim}\hat{\phi}_{12} = \frac{\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}}$$

where

$$\begin{aligned}
\gamma_{12}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{12}^{w}(m + d_2(m-1)), \\
\gamma_{12}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{12}^{w}(d_2(m-1)), \\
\gamma_{11}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{0}\,\gamma_{11}^{w}(m) = \gamma_{11}^{w}(m) = \phi_{11}^{m}\sigma_{w_1}^{2}, \text{ and} \\
\gamma_{11}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{0}\,\gamma_{11}^{w}(0) = \gamma_{11}^{w}(0) = \sigma_{w_1}^{2}.
\end{aligned}$$

Now

$$\begin{aligned}
&\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0) \\
&\quad= (1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{12}^{w}(m + d_2(m-1))\,\sigma_{w_1}^{2} - (\phi_{11}^{m}\sigma_{w_1}^{2})\,(1 + L + L^{2} + \cdots + L^{m-1})^{d_2}\,\gamma_{12}^{w}(d_2(m-1)) \\
&\quad= \left[c_0\gamma_{12}^{w}(m + d_2(m-1)) + c_1\gamma_{12}^{w}(m + d_2(m-1) - 1) + \cdots + c_{d_2(m-1)}\gamma_{12}^{w}(m)\right](\sigma_{w_1}^{2}) \\
&\qquad- (\phi_{11}^{m}\sigma_{w_1}^{2})\left[c_0\gamma_{12}^{w}(d_2(m-1)) + c_1\gamma_{12}^{w}(d_2(m-1) - 1) + \cdots + c_{d_2(m-1)}\gamma_{12}^{w}(0)\right] \\
&\quad= \left[c_0\phi_{11}^{m + d_2(m-1)}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}} + c_1\phi_{11}^{m + d_2(m-1) - 1}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}} + \cdots + c_{d_2(m-1)}\phi_{11}^{m}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}}\right](\sigma_{w_1}^{2}) \\
&\qquad- (\phi_{11}^{m}\sigma_{w_1}^{2})\left[c_0\phi_{11}^{d_2(m-1)}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}} + c_1\phi_{11}^{d_2(m-1) - 1}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}} + \cdots + c_{d_2(m-1)}\phi_{11}^{0}\frac{\phi_{11}\phi_{21}\sigma_{w_1}^{2}}{1 - \phi_{11}\phi_{22}}\right] = 0
\end{aligned}$$
Thus, if $d_1 = 0$ but $d_2 > 0$ and $\phi_{12} = 0$ then $\operatorname{plim}\hat{\phi}_{12} = 0$. This implies that if the unidirectional Granger causality runs from a stationary series to a non-stationary series in the disaggregated form then one will not find a spurious feedback relationship between them due to systematic sampling. As in the previous case, to exploit this result practitioners must make sure to use systematically sampled series in their studies. For example, if one is studying the effect of an exogenously determined stationary interest rate (as in the case of Singapore) on the rate of change (log difference) of money demand, then both the interest rate and money demand series should be systematically sampled, for example by taking end-of-period values.
Scenario 3: $d_1 > 0$ but $d_2 = 0$
In this case, by construction, unidirectional Granger causality runs from a non-stationary series to a stationary series in the basic disaggregated form.⁶ Then the variance, autocovariances and cross covariances of the systematically sampled series take the form

$$\gamma_{ij}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_1}\,\gamma_{ij}^{w}(mk) \quad\text{and}\quad \gamma_{ji}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_1}\,\gamma_{ji}^{w}(mk + d_1(m-1)), \quad i, j = 1, 2.$$

Let $e_i$ be the coefficient of $L^{i}$ in the expression $(1 + L + L^{2} + \cdots + L^{m-1})^{d_1}$. Now (33) takes the form

$$\operatorname{plim}\hat{\phi}_{12} = \frac{\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}}$$

where

$$\begin{aligned}
\gamma_{12}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_1}\,\gamma_{12}^{w}(m), \\
\gamma_{12}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_1}\,\gamma_{12}^{w}(0), \\
\gamma_{11}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}\,\gamma_{11}^{w}(m + d_1(m-1)), \text{ and} \\
\gamma_{11}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}\,\gamma_{11}^{w}(d_1(m-1)).
\end{aligned}$$

Let $e_i$ and $f_i$ be the coefficients of $L^{i}$ in the expressions $(1 + L + L^{2} + \cdots + L^{m-1})^{d_1}$ and $(1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}$ respectively.
Now

$$\begin{aligned}
&\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0) \\
&\quad= (1 + L + \cdots + L^{m-1})^{d_1}\gamma_{12}^{w}(m)\,(1 + L + \cdots + L^{m-1})^{2d_1}\gamma_{11}^{w}(d_1(m-1)) \\
&\qquad- (1 + L + \cdots + L^{m-1})^{d_1}\gamma_{12}^{w}(0)\,(1 + L + \cdots + L^{m-1})^{2d_1}\gamma_{11}^{w}(m + d_1(m-1)) \\
&\quad= \left[e_0\gamma_{12}^{w}(m) + e_1\gamma_{12}^{w}(m-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(m - d_1(m-1))\right] \\
&\qquad\quad\times\left[f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(-d_1(m-1))\right] \\
&\qquad- \left[e_0\gamma_{12}^{w}(0) + e_1\gamma_{12}^{w}(-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(-d_1(m-1))\right] \\
&\qquad\quad\times\left[f_0\gamma_{11}^{w}(m + d_1(m-1)) + f_1\gamma_{11}^{w}(m + d_1(m-1) - 1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(m - d_1(m-1))\right] \\
&\quad= \left[e_0\gamma_{12}^{w}(m) + e_1\gamma_{12}^{w}(m-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(m - d_1(m-1))\right] \\
&\qquad\quad\times\left[f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(0) + f_{d_1(m-1)+1}\gamma_{11}^{w}(-1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(-d_1(m-1))\right] \\
&\qquad- \left[e_0\gamma_{12}^{w}(0) + e_1\gamma_{12}^{w}(-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(-d_1(m-1))\right] \\
&\qquad\quad\times\left[f_0\phi_{11}^{m}\gamma_{11}^{w}(d_1(m-1)) + f_1\phi_{11}^{m}\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\phi_{11}^{m}\gamma_{11}^{w}(0) + f_{d_1(m-1)+1}\gamma_{11}^{w}(m-1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(m - d_1(m-1))\right] \\
&\quad= \left[f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(0)\right] \\
&\qquad\quad\times\Big\{e_0\left[\gamma_{12}^{w}(m) - \phi_{11}^{m}\gamma_{12}^{w}(0)\right] + e_1\left[\gamma_{12}^{w}(m-1) - \phi_{11}^{m}\gamma_{12}^{w}(-1)\right] + \cdots + e_{d_1(m-1)}\left[\gamma_{12}^{w}(m - d_1(m-1)) - \phi_{11}^{m}\gamma_{12}^{w}(-d_1(m-1))\right]\Big\} \\
&\qquad+ f_{d_1(m-1)+1}\Big\{\gamma_{11}^{w}(-1)\left[e_0\gamma_{12}^{w}(m) + e_1\gamma_{12}^{w}(m-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(m - d_1(m-1))\right] \\
&\qquad\qquad- \gamma_{11}^{w}(m-1)\left[e_0\gamma_{12}^{w}(0) + e_1\gamma_{12}^{w}(-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(-d_1(m-1))\right]\Big\} \\
&\qquad+ \cdots + f_{2d_1(m-1)}\Big\{\gamma_{11}^{w}(-d_1(m-1))\left[e_0\gamma_{12}^{w}(m) + e_1\gamma_{12}^{w}(m-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(m - d_1(m-1))\right] \\
&\qquad\qquad- \gamma_{11}^{w}(m - d_1(m-1))\left[e_0\gamma_{12}^{w}(0) + e_1\gamma_{12}^{w}(-1) + \cdots + e_{d_1(m-1)}\gamma_{12}^{w}(-d_1(m-1))\right]\Big\} \neq 0.
\end{aligned}$$

This is because, in the above expression, only the term $\gamma_{12}^{w}(m) - \phi_{11}^{m}\gamma_{12}^{w}(0)$ equals zero, while the other terms are non-zero as $\phi_{11} \neq 0$, $\phi_{21} \neq 0$ and $\phi_{22} \neq 0$.
Thus, if the uni-directional Granger causality runs from a non-stationary series to a stationary series then one could observe bi-directional spurious feedback relationship between them if the variables are systematically sampled.
Scenario 4: $d_1 > 0$ and $d_2 > 0$
In this case, by construction, the uni-directional Granger causality runs from a non-stationary series to a non-stationary series. Then the variance, autocovariances and cross covariances of the systematically sampled series take the form

$$\gamma_{ij}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_1 + d_2}\,\gamma_{ij}^{w}(mk + d_2(m-1)) \quad\text{and}\quad \gamma_{ji}^{W}(k) = (1 + L + L^{2} + \cdots + L^{m-1})^{d_1 + d_2}\,\gamma_{ji}^{w}(mk + d_1(m-1)), \quad i, j = 1, 2.$$
Now (17) takes the form
$$\operatorname{plim}\hat{\phi}_{12} = \frac{\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0)}{\gamma_{11}^{W}(0)\gamma_{22}^{W}(0) - (\gamma_{12}^{W}(0))^{2}}$$

where

$$\begin{aligned}
\gamma_{12}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_1 + d_2}\,\gamma_{12}^{w}(m + d_2(m-1)), \\
\gamma_{12}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{d_1 + d_2}\,\gamma_{12}^{w}(d_2(m-1)), \\
\gamma_{11}^{W}(1) &= (1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}\,\gamma_{11}^{w}(m + d_1(m-1)), \text{ and} \\
\gamma_{11}^{W}(0) &= (1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}\,\gamma_{11}^{w}(d_1(m-1)).
\end{aligned}$$

Let $f_i$ and $g_i$ be the coefficients of $L^{i}$ in the expressions $(1 + L + L^{2} + \cdots + L^{m-1})^{2d_1}$ and $(1 + L + L^{2} + \cdots + L^{m-1})^{d_1 + d_2}$ respectively.
Now, writing for brevity

$$\begin{aligned}
G_{m} &= g_0\gamma_{12}^{w}(m + d_2(m-1)) + g_1\gamma_{12}^{w}(m + d_2(m-1) - 1) + \cdots + g_{(d_1 + d_2)(m-1)}\gamma_{12}^{w}(m + d_2(m-1) - d_1(m-1)), \\
G_{0} &= g_0\gamma_{12}^{w}(d_2(m-1)) + g_1\gamma_{12}^{w}(d_2(m-1) - 1) + \cdots + g_{(d_1 + d_2)(m-1)}\gamma_{12}^{w}(d_2(m-1) - d_1(m-1)),
\end{aligned}$$

we have

$$\begin{aligned}
&\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0) \\
&\quad= \left\{(1 + L + \cdots + L^{m-1})^{d_1 + d_2}\gamma_{12}^{w}(m + d_2(m-1))\,(1 + L + \cdots + L^{m-1})^{2d_1}\gamma_{11}^{w}(d_1(m-1))\right\} \\
&\qquad- \left\{(1 + L + \cdots + L^{m-1})^{d_1 + d_2}\gamma_{12}^{w}(d_2(m-1))\,(1 + L + \cdots + L^{m-1})^{2d_1}\gamma_{11}^{w}(m + d_1(m-1))\right\} \\
&\quad= \Big\{G_{m}\big(f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(0) + f_{d_1(m-1)+1}\gamma_{11}^{w}(-1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(-d_1(m-1))\big)\Big\} \\
&\qquad- \Big\{G_{0}\big(f_0\gamma_{11}^{w}(m + d_1(m-1)) + f_1\gamma_{11}^{w}(m + d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(m) + f_{d_1(m-1)+1}\gamma_{11}^{w}(m-1) + \cdots + f_{2d_1(m-1)}\gamma_{11}^{w}(m - d_1(m-1))\big)\Big\} \\
&\quad= \left[f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(0)\right]\left\{G_{m} - \phi_{11}^{m}G_{0}\right\} \\
&\qquad+ f_{d_1(m-1)+1}\left\{\gamma_{11}^{w}(-1)\,G_{m} - \gamma_{11}^{w}(m-1)\,G_{0}\right\} + \cdots + f_{2d_1(m-1)}\left\{\gamma_{11}^{w}(-d_1(m-1))\,G_{m} - \gamma_{11}^{w}(m - d_1(m-1))\,G_{0}\right\} \\
&\quad= \left[f_0\gamma_{11}^{w}(d_1(m-1)) + f_1\gamma_{11}^{w}(d_1(m-1) - 1) + \cdots + f_{d_1(m-1)}\gamma_{11}^{w}(0)\right] \\
&\qquad\quad\times\Big\{g_0\left[\gamma_{12}^{w}(m + d_2(m-1)) - \phi_{11}^{m}\gamma_{12}^{w}(d_2(m-1))\right] + g_1\left[\gamma_{12}^{w}(m + d_2(m-1) - 1) - \phi_{11}^{m}\gamma_{12}^{w}(d_2(m-1) - 1)\right] \\
&\qquad\qquad+ \cdots + g_{d_1(m-1)}\left[\gamma_{12}^{w}(m + d_2(m-1) - d_1(m-1)) - \phi_{11}^{m}\gamma_{12}^{w}(d_2(m-1) - d_1(m-1))\right]\Big\} \\
&\qquad+ f_{d_1(m-1)+1}\left\{\gamma_{11}^{w}(-1)\,G_{m} - \gamma_{11}^{w}(m-1)\,G_{0}\right\} + \cdots + f_{2d_1(m-1)}\left\{\gamma_{11}^{w}(-d_1(m-1))\,G_{m} - \gamma_{11}^{w}(m - d_1(m-1))\,G_{0}\right\}
\end{aligned}$$

In the above expression,

$$\gamma_{12}^{w}(m + d_2(m-1) - i) - \phi_{11}^{m}\gamma_{12}^{w}(d_2(m-1) - i) \;\begin{cases} = 0 & \text{if } 0 \le i \le d_2(m-1) \\ \neq 0 & \text{if } i > d_2(m-1). \end{cases}$$

Thus, $\gamma_{12}^{W}(1)\gamma_{11}^{W}(0) - \gamma_{11}^{W}(1)\gamma_{12}^{W}(0) \neq 0$ since $\phi_{11} \neq 0$, $\phi_{21} \neq 0$ and $\phi_{22} \neq 0$.
If the uni-directional Granger causality runs from a non-stationary series to a non-stationary series then one could observe a spurious bi-directional feedback relationship between them in systematically sampled form.
Thus, in summary, as long as the causal variable is non-stationary (i.e., $d_1 > 0$), regardless of whether the output series is stationary or not, one may observe spurious feedback relationships among the variables with systematically sampled data when the causality between them is uni-directional in the basic disaggregated form. Q.E.D.
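The four scenarios of the proof can also be checked numerically for $d_i \in \{0, 1\}$. The sketch below computes the probability limit of the OLS VAR(1) coefficient matrix from population autocovariances, using the fact that the difference of a systematically sampled I(1) series equals the sum of $m$ consecutive values of $w_t$. It is an independent illustration under assumed parameter values, not the authors' derivation, and all function names are hypothetical.

```python
import numpy as np

def stationary_cov(phi):
    """Gamma(0) of w_t = phi w_{t-1} + e_t with Var(e_t) = I,
    solved from Gamma(0) = phi Gamma(0) phi' + I."""
    n = phi.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(phi, phi),
                          np.eye(n).reshape(-1))
    return vec.reshape(n, n)

def autocov(phi, gamma0, k):
    """Gamma(k) = E[w_t w_{t-k}'], with Gamma(-k) = Gamma(k)'."""
    if k >= 0:
        return np.linalg.matrix_power(phi, k) @ gamma0
    return (np.linalg.matrix_power(phi, -k) @ gamma0).T

def plim_sampled_var1(phi, d, m):
    """plim of the OLS VAR(1) coefficients fitted to the systematically
    sampled series, with component i differenced d[i] (0 or 1) times."""
    gamma0 = stationary_cov(phi)
    # W_T = sum_j A[j] w_{mT-j}: a sampled I(0) component keeps only j = 0,
    # a differenced sampled I(1) component sums w over j = 0, ..., m-1.
    A = [np.eye(2)] + [np.diag(np.array(d, dtype=float)) for _ in range(1, m)]

    def cov_W(h):
        return sum(A[i] @ autocov(phi, gamma0, m * h - i + j) @ A[j].T
                   for i in range(m) for j in range(m))

    return cov_W(1) @ np.linalg.inv(cov_W(0))

if __name__ == "__main__":
    # Assumed illustrative parameters with phi12 = 0 (z1 causes z2 only).
    phi = np.array([[0.5, 0.0], [0.5, 0.3]])
    for d in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(d, plim_sampled_var1(phi, d, m=3)[0, 1])
```

In this sketch the element corresponding to $\operatorname{plim}\hat{\phi}_{12}$ is zero (up to rounding) when $d_1 = 0$ and non-zero when $d_1 = 1$, in line with the four scenarios above.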

Appendix B. Multivariate Granger Causality Tests

Table A1. Granger Causality between VIX and VST.
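| 15 s | 1 min: Both | 1 min: VIX | 1 min: VST | 1 min: None | 5 min: Both | 5 min: VIX | 5 min: VST | 5 min: None | 10 min: Both | 10 min: VIX | 10 min: VST | 10 min: None |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Both | 566 | 118 | 322 | 23 | 32 | 176 | 335 | 486 | 14 | 226 | 143 | 646 |
| VIX | 2 | 57 | 1 | 2 | 0 | 49 | 0 | 13 | 0 | 35 | 0 | 27 |
| VST | 9 | 21 | 7 | 2 | 4 | 23 | 4 | 8 | 1 | 17 | 2 | 19 |
| None | 1 | 3 | 2 | 16 | 0 | 2 | 0 | 20 | 0 | 1 | 0 | 21 |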
Table A2. Granger Causality between SPX and VIX.
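| 15 s | 1 min: Both | 1 min: SPX | 1 min: VIX | 1 min: None | 5 min: Both | 5 min: SPX | 5 min: VIX | 5 min: None | 10 min: Both | 10 min: SPX | 10 min: VIX | 10 min: None |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Both | 63 | 38 | 20 | 3 | 42 | 33 | 16 | 33 | 26 | 12 | 4 | 82 |
| SPX | 81 | 590 | 14 | 96 | 40 | 388 | 8 | 345 | 22 | 107 | 0 | 652 |
| VIX | 11 | 12 | 59 | 15 | 5 | 5 | 34 | 53 | 0 | 1 | 9 | 87 |
| None | 7 | 19 | 21 | 200 | 2 | 11 | 10 | 224 | 0 | 3 | 4 | 240 |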
Table A3. Granger Causality between ES1 and VST.
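| 1 min | 5 min: Both | 5 min: ES1 | 5 min: VST | 5 min: None | 10 min: Both | 10 min: ES1 | 10 min: VST | 10 min: None |
|---|---|---|---|---|---|---|---|---|
| Both | 219 | 383 | 237 | 147 | 63 | 176 | 198 | 549 |
| ES1 | 8 | 54 | 6 | 26 | 5 | 21 | 7 | 61 |
| VST | 7 | 6 | 19 | 14 | 1 | 2 | 11 | 32 |
| None | 1 | 0 | 5 | 20 | 0 | 0 | 1 | 25 |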
Table A4. Granger Causality between SPX, VIX and RV.
15 s1 min5 min10 min
SPX –>VIX742492420196
VIX –>SPX108695823
SPX <–>VIX16817310166
None231515670964
SPX –>RV811427384173
RV –>SPX95716747
SPX <–>RV1421749940
None201577699989
VIX–>RV171227619
RV –>VIX2916312876
VIX <–>RV912650512345
None291314533809
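
Tables A1 to A3 cross-classify the causality outcome at a fine sampling interval (rows) against the outcome at coarser intervals (columns). The sketch below shows one way such a cross-tabulation could be built. It assumes, hypothetically, that each cell counts test windows (for example trading days) and that every window has a pair of p-values, one per direction, at each sampling interval. The names `classify`, `cross_tabulate`, the generic labels `X` and `Y`, and the example p-values are illustrative and not taken from the paper.

```python
from collections import Counter

LABELS = ("Both", "X", "Y", "None")


def classify(p_x_to_y, p_y_to_x, alpha=0.05):
    """Label one test window by which causal directions are significant."""
    x_causes = p_x_to_y < alpha
    y_causes = p_y_to_x < alpha
    if x_causes and y_causes:
        return "Both"
    if x_causes:
        return "X"
    if y_causes:
        return "Y"
    return "None"


def cross_tabulate(fine_labels, coarse_labels):
    """Count windows by (fine-interval label, coarse-interval label)."""
    counts = Counter(zip(fine_labels, coarse_labels))
    return [[counts[(row, col)] for col in LABELS] for row in LABELS]


# Hypothetical p-values (X -> Y, Y -> X) for four windows at two intervals.
fine_p = [(0.01, 0.02), (0.01, 0.50), (0.01, 0.30), (0.40, 0.60)]
coarse_p = [(0.03, 0.20), (0.01, 0.70), (0.20, 0.90), (0.50, 0.80)]

fine = [classify(*p) for p in fine_p]
coarse = [classify(*p) for p in coarse_p]
table = cross_tabulate(fine, coarse)

for label, row in zip(LABELS, table):
    print(f"{label:>4}: {row}")
```

With this layout, each row total is the number of windows carrying that label at the fine interval, so it is the same whichever coarse interval is used for the columns.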

1. The results are, in general, applicable to a multivariate VAR(p) process.
2. If the contemporaneous correlation between the two error processes is non-zero, then one could argue that the causal distortion comes from these non-zero correlations instead of systematic sampling. This also allows us to isolate the effects of sampling on the contemporaneous correlation between the variables.
3. Unit root test results are available from the authors upon request.
4. In our HAR model, ln(RV) at time $t$ is expected to depend on ln(RV) at $t-1$, one minute and ten minutes.
5. This is to ensure that the preservation of uni-directionality does not occur due to zero values of the parameters $\varphi_{11}$, $\varphi_{21}$ and $\varphi_{22}$.
6. For example, if the interest rate is endogenously determined and stationary, one may want to study the effect of changes in money supply on the interest rate.
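Table 1. Monte Carlo simulation results.

Panel A: Bivariate VAR(1)

| $z_1$ | $z_2$ | $\varphi_{12}$: m = 0 | $\varphi_{12}$: m = 3 | $\varphi_{12}$: m = 12 | $\varphi_{21}$: m = 0 | $\varphi_{21}$: m = 3 | $\varphi_{21}$: m = 12 |
|---|---|---|---|---|---|---|---|
| I(0) | I(0) | 5% | 5% | 5% | 98% | 79% | 18% |
| I(0) | I(1) | 5% | 5% | 5% | 97% | 60% | 29% |
| I(1) | I(0) | 5% | 33% | 19% | 95% | 72% | 9% |
| I(1) | I(1) | 5% | 37% | 14% | 95% | 93% | 34% |

Panel B: Bivariate VAR(2)

| $z_1$ | $z_2$ | $\varphi_{1,12}$: m = 0 | $\varphi_{1,12}$: m = 3 | $\varphi_{1,12}$: m = 12 | $\varphi_{2,12}$: m = 0 | $\varphi_{2,12}$: m = 3 | $\varphi_{2,12}$: m = 12 | $\varphi_{1,21}$: m = 0 | $\varphi_{1,21}$: m = 3 | $\varphi_{1,21}$: m = 12 | $\varphi_{2,21}$: m = 0 | $\varphi_{2,21}$: m = 3 | $\varphi_{2,21}$: m = 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I(0) | I(0) | 5% | 37% | 14% | 5% | 20% | 6% | 97% | 88% | 34% | 97% | 59% | 10% |
| I(0) | I(1) | 5% | 26% | 9% | 5% | 16% | 6% | 95% | 79% | 28% | 95% | 52% | 9% |
| I(1) | I(0) | 5% | 61% | 25% | 5% | 37% | 10% | 98% | 83% | 19% | 98% | 57% | 8% |
| I(1) | I(1) | 5% | 48% | 22% | 5% | 28% | 10% | 97% | 89% | 53% | 98% | 72% | 20% |

Panel C: Trivariate VAR(1)

| $z_1$ | $z_2$ | $z_3$ | $\varphi_{12}$: m = 0 | $\varphi_{12}$: m = 3 | $\varphi_{12}$: m = 12 | $\varphi_{21}$: m = 0 | $\varphi_{21}$: m = 3 | $\varphi_{21}$: m = 12 |
|---|---|---|---|---|---|---|---|---|
| I(0) | I(0) | I(0) | 5% | 5% | 5% | 97% | 59% | 28% |
| I(0) | I(0) | I(1) | 5% | 64% | 22% | 98% | 80% | 26% |
| I(0) | I(1) | I(0) | 5% | 48% | 9% | 98% | 16% | 10% |
| I(1) | I(0) | I(0) | 5% | 70% | 43% | 98% | 67% | 11% |
| I(0) | I(1) | I(1) | 5% | 61% | 15% | 98% | 74% | 11% |
| I(1) | I(0) | I(1) | 5% | 68% | 39% | 98% | 74% | 16% |
| I(1) | I(1) | I(0) | 5% | 65% | 23% | 98% | 83% | 32% |
| I(1) | I(1) | I(1) | 5% | 69% | 27% | 98% | 84% | 36% |

Panel D: Bivariate VAR(1), non-synchronous DGP

| $z_1$ | $z_2$ | $\varphi_{12}$: m = 0 | $\varphi_{12}$: m = 3 | $\varphi_{12}$: m = 12 | $\varphi_{21}$: m = 0 | $\varphi_{21}$: m = 3 | $\varphi_{21}$: m = 12 |
|---|---|---|---|---|---|---|---|
| I(0) | I(0) | 5% | 5% | 5% | 95% | 75% | 14% |
| I(0) | I(1) | 5% | 5% | 5% | 94% | 57% | 23% |
| I(1) | I(0) | 5% | 31% | 18% | 94% | 70% | 9% |
| I(1) | I(1) | 5% | 33% | 11% | 94% | 91% | 31% |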
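
A Monte Carlo experiment of the kind summarised in Table 1 can be sketched as follows. This is a minimal illustration under assumed settings, not the authors' design: the coefficient matrix `phi`, the base sample length, the number of replications, the one-lag Wald test and its asymptotic 5% critical value are all assumptions, so the printed frequencies are not expected to reproduce the table. Here `m = 1` means that every observation is kept (no sampling).

```python
import numpy as np

CHI2_95_1DF = 3.841  # asymptotic 5% critical value for one restriction


def simulate(reps, n, phi, rng):
    """Draw `reps` paths of the stationary VAR(1) w_t = phi w_{t-1} + e_t."""
    e = rng.standard_normal((reps, n, 2))
    w = np.zeros_like(e)
    for t in range(1, n):
        w[:, t] = w[:, t - 1] @ phi.T + e[:, t]
    return w


def sampled_stationary(w, d, m):
    """Integrate series i d[i] times, keep every m-th point, difference back."""
    cols = []
    for i in range(2):
        z = w[:, i]
        for _ in range(d[i]):
            z = np.cumsum(z)
        cols.append(np.diff(z[m - 1::m], n=d[i]))
    k = min(len(c) for c in cols)  # align both series on their common end
    return np.column_stack([c[-k:] for c in cols])


def rejects(y, x):
    """One-lag Wald test of H0: x_{t-1} does not help predict y_t."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1], x[:-1]])
    target = y[1:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = resid @ resid / (len(target) - X.shape[1])
    var_b = s2 * np.linalg.inv(X.T @ X)[2, 2]
    return beta[2] ** 2 / var_b > CHI2_95_1DF


def rejection_rates(d, m, paths):
    """Share of paths rejecting non-causality in each direction."""
    hits = np.zeros(2)
    for w in paths:
        W = sampled_stationary(w, d, m)
        hits += [rejects(W[:, 0], W[:, 1]),  # z2 -> z1: absent in the DGP
                 rejects(W[:, 1], W[:, 0])]  # z1 -> z2: present in the DGP
    return hits / len(paths)


rng = np.random.default_rng(1)
phi = np.array([[0.5, 0.0], [0.5, 0.5]])  # assumed: z1 causes z2 only
paths = simulate(reps=200, n=1200, phi=phi, rng=rng)

rates = {(d, m): rejection_rates(d, m, paths)
         for d in [(0, 0), (0, 1), (1, 0), (1, 1)]
         for m in (1, 3, 12)}

for (d, m), (absent, present) in rates.items():
    print(f"d={d} m={m:2d}  absent direction: {absent:.2f}  "
          f"present direction: {present:.2f}")
```

The base series has a fixed length, so larger `m` also means fewer sampled observations. That is a design choice of this sketch, and it affects the power of the test in the direction where causality is present.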
