Article

Monitoring Volatility Change for Time Series Based on Support Vector Regression

Department of Statistics, Seoul National University, Seoul 08826, Korea
* Author to whom correspondence should be addressed.
Entropy 2020, 22(11), 1312; https://doi.org/10.3390/e22111312
Submission received: 21 October 2020 / Revised: 13 November 2020 / Accepted: 13 November 2020 / Published: 17 November 2020
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)

Abstract

This paper considers monitoring for an anomaly in sequentially observed time series with heteroscedastic conditional volatilities, based on the cumulative sum (CUSUM) method combined with support vector regression (SVR). The proposed online monitoring process is designed to detect a significant change in the volatility of financial time series. The tuning parameters are optimally chosen using particle swarm optimization (PSO). We conduct Monte Carlo simulation experiments to demonstrate the validity of the proposed method. A real data analysis with the S&P 500 index, the Korea Composite Stock Price Index (KOSPI), and the stock price of Microsoft Corporation is presented to demonstrate the versatility of our model.

1. Introduction

In this paper, we study the cumulative sum (CUSUM) monitoring procedure for sequentially detecting a significant change in time series with conditional volatilities. Since [1,2], the CUSUM method has been an acclaimed tool for detecting anomalies among observations in statistical process control (SPC). In SPC, a control chart is a primary component that graphically describes the behavior of sequentially observed time series. For examples of SPC, see [3] for control charts, [4] for the analysis of complex environmental time series, and [5] for backcasting-forecasting time series. In particular, the CUSUM chart is one of the most frequently adopted methods across various fields of SPC. For an overview of SPC, we refer to [6]. In general, the performance of control charts is measured by the average run length (ARL). However, some authors, such as [7], alternatively took the approach of controlling the type I error probability rather than the ARL to deal with the monitoring process in autoregressive time series. This design of the sequential monitoring method has the merit of attaining a lower false alarm rate, as seen in [8], who took a similar approach for generalized autoregressive conditional heteroscedastic (GARCH) time series. For more references on monitoring processes in time series, see [9]. Here, inspired by these previous studies, we aim to hybridize the CUSUM monitoring scheme with support vector regression (SVR) for GARCH-type time series.
The support vector machine (SVM) is one of the most popular nonparametric learning methods, used predominantly for classification and regression [10]. In particular, compared with traditional model-based methods, support vector regression (SVR) is better suited to approximating the nonlinearity of the underlying dynamic structure of datasets; refer to [11,12,13,14,15] and the papers cited therein. Moreover, SVR is structured to exploit quadratic programming optimization and to implement the structural risk minimization principle [16]. This makes SVR an estimation scheme that optimally minimizes the empirical risk while remaining relatively parsimonious [17], which further enables it to perform adequately for datasets of various sizes, including relatively small ones. Refs. [18,19] recently used SVR in hybridization with the CUSUM method to detect a change point in SVR-autoregressive and moving average (ARMA) and SVR-GARCH models. Therein, the residual-based CUSUM test was adopted, following the previous studies of [20,21,22]; see [23] for a general overview of the change point detection problem using CUSUM methods. In the same spirit, here we also take the approach of the CUSUM of squares test, rather than the score vector-based CUSUM test used in [7,8], as the former largely outperforms the latter in terms of stability and power, as seen in the empirical study of [21]; above all, the latter is only available for model-based CUSUM tests. As pointed out in the precedent works of [18,19], accurately computed residuals play a substantially important role, and thereby so does the precise prediction of the SVR method used. As the optimal choice of tuning parameters also matters to a significant degree, we adopt particle swarm optimization (PSO) [24] for tuning parameter optimization. For theoretical background, we refer to [25,26,27]; see also the survey papers [28,29,30]. PSO algorithms are population-based and have been widely used to obtain optimal tuning parameters in SVR. We evaluate the performance of the proposed SVR-GARCH-based CUSUM monitoring method equipped with PSO through Monte Carlo experiments.
The rest of this paper is organized as follows. Section 2 introduces the CUSUM monitoring and explains its fundamental principle and application to GARCH-type time series. Section 3 describes SVR and PSO in general, then elaborates on the monitoring process for SVR-GARCH models in more detail. Section 4 presents Monte Carlo simulations conducted to evaluate the performance of the proposed method. Section 5 performs a real data analysis using the S&P 500 index, KOSPI, and the stock returns of Microsoft Corporation datasets. Finally, Section 6 provides concluding remarks.

2. CUSUM Monitoring Procedure

In this section, we introduce our monitoring process, starting from the independent and identically distributed (iid) sample case. Consider the problem of monitoring for an anomaly in the variance of a stream of mean-zero observations $\epsilon_t$ up to time n. Under the null hypothesis of no anomalies, we assume that the $\epsilon_t$ are iid with unit variance over $t = 1, \dots, n$. Namely, we test
$$H_0 : \mathrm{Var}(\epsilon_t) = 1,\ t = 1, \dots, n \quad \text{vs.} \quad H_1 : \mathrm{Var}(\epsilon_t) = 1,\ t = 1, \dots, k; \ \ \mathrm{Var}(\epsilon_t) \neq 1,\ t = k+1, \dots, n, \ \text{for some } k = 2, \dots, n-1.$$
For this task, we first define
$$W_k = \sum_{t=1}^{k} (\epsilon_t^2 - 1)/\tau \qquad (1)$$
for each $k \ge 1$, where $\tau^2 = \mathrm{Var}(\epsilon_1^2)$ is assumed to be known. Ref. [7] considered the monitoring process based on:
$$T_n^{(1)} = \max_{1 \le k \le n} T_n^{(1)}(k) = \max_{1 \le k \le n} \max_{m \le k} \frac{1}{\sqrt{n}} \left( W_m - W_k \right), \qquad (2)$$
and later [8] additionally considered the monitoring process based on:
$$T_n = \max_{1 \le k \le n} T_n(k) = \max_{1 \le k \le n} \sup_{1 \le m < m' \le k} \frac{1}{\sqrt{n}} \left| W_{m'} - W_m \right|. \qquad (3)$$
In this study, however, we consider another monitoring process based on
$$T_n^{max} = T_n^{(1)} \vee T_n^{(2)} := \max\left\{ T_n^{(1)}, T_n^{(2)} \right\} \qquad (4)$$
with
$$T_n^{(2)} = \max_{1 \le k \le n} T_n^{(2)}(k) = \max_{1 \le k \le n} \left| \min_{m \le k} \frac{1}{\sqrt{n}} \left( W_m - W_k \right) \right|. \qquad (5)$$
We employ the test statistic $T_n^{max}$ because it combines the strengths of $T_n^{(1)}$ and $T_n^{(2)}$, which detect a decrease and an increase of the variance well, respectively. Note that if $E\epsilon_k^2$ has a positive shift, say from 1 to $1 + \delta$ with $\delta > 0$, at some point $k_0 = 2, \dots, n-1$, then $\left| \min_{m \le k} (W_m - W_k) \right|$ for $k > k_0$ tends to take larger values owing to the shift, which is not true for $\max_{m \le k} (W_m - W_k)$; this indicates that $T_n^{(2)}$ can detect the shift well while $T_n^{(1)}$ cannot. By contrast, if $E\epsilon_k^2$ has a negative shift, $T_n^{(1)}$ can detect the shift well while $T_n^{(2)}$ cannot. Additionally, $T_n^{max}$ is preferable to $T_n$ in (3), as the latter tends to detect well only changes that occur in the middle of the time series.
Using Donsker’s invariance principle [31] and the fact that $\sup_{0 \le s \le t} B(s) - B(t) \stackrel{D}{=} |B(t)|$ for any standard Brownian motion $B$, as described in [7], we have that, as $n \to \infty$,
$$T_n^{(1)} \stackrel{D}{\longrightarrow} T = \sup_{0 \le t \le 1} \left( \sup_{0 \le s \le t} B(s) - B(t) \right) \stackrel{D}{=} \sup_{0 \le t \le 1} |B(t)|, \qquad T_n^{(2)} \stackrel{D}{\longrightarrow} T' = \sup_{0 \le t \le 1} \left( \sup_{0 \le s \le t} (-B)(s) - (-B)(t) \right) \stackrel{D}{=} \sup_{0 \le t \le 1} |B(t)|,$$
where $\stackrel{D}{\longrightarrow}$ denotes convergence in distribution and $X \stackrel{D}{=} Y$ means that the random variables X and Y have the same distribution. Furthermore, by the continuous mapping theorem, $T_n^{max}$ in (4) satisfies
$$T_n^{max} \stackrel{D}{\longrightarrow} T \vee T'. \qquad (6)$$
Then, the null hypothesis is rejected if $T_n^{max}$ is larger than a constant c, which is determined asymptotically as the number satisfying $P(T \vee T' > c) = \alpha$ for a given significance level $\alpha \in (0, 1)$. In practice, the critical value c can be obtained empirically using Monte Carlo simulations; for example, $c = 2.46509$ when $\alpha = 0.05$. Thus, an anomaly is signaled at time k when $T_n^{max}(k) = T_n^{(1)}(k) \vee T_n^{(2)}(k) > c$ for some $k = 1, \dots, n$.
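For concreteness, the critical value can be approximated by simulating the limiting functional $T \vee T'$ on a fine grid. The following Python sketch is illustrative only; the grid size and number of replications are our own choices, not taken from the paper:

```python
import numpy as np

# Approximate c with P(T v T' > c) = alpha, where T and T' are the maximum
# drawdown and drawup of a standard Brownian motion on [0, 1].
rng = np.random.default_rng(0)
n_grid, n_rep = 5000, 20000
stats = np.empty(n_rep)
for b in range(n_rep):
    # scaled random walk approximating B on [0, 1]
    B = np.cumsum(rng.standard_normal(n_grid)) / np.sqrt(n_grid)
    T1 = np.max(np.maximum.accumulate(B) - B)   # sup_t ( sup_{s<=t} B(s) - B(t) )
    T2 = np.max(B - np.minimum.accumulate(B))   # the same functional applied to -B
    stats[b] = max(T1, T2)
print(np.quantile(stats, 0.95))   # should be close to c = 2.46509 at alpha = 0.05
```

Owing to discretization, the simulated quantile is slightly biased downward; a finer grid brings it closer to the asymptotic value.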
This monitoring process can be extended to the GARCH(1,1) model [32]:
$$y_t = \sigma_t \epsilon_t, \quad t \in \mathbb{Z}, \qquad \sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad (7)$$
with $\omega > 0$, $\alpha \ge 0$, $\beta \ge 0$ satisfying $\alpha + \beta < 1$, and $E\epsilon_1^4 < \infty$. This model is widely used to model financial time series with high volatilities, as it captures the volatility clustering phenomenon well. For its properties and characteristics and various GARCH variants, see [19,33] and the papers cited therein. Here, the monitoring process can be constructed from the residuals $\hat\epsilon_t = y_t / \hat\sigma_{t,0}$ with $\hat\sigma_{t,0}^2 = \omega_0 + \alpha_0 y_{t-1}^2 + \beta_0 \hat\sigma_{t-1,0}^2$ and some initial values $y_0$ and $\sigma_0$, when the true parameters $\omega_0, \alpha_0, \beta_0$ are known in advance from past experience. However, when they are unknown, which is prevalent in practice, we instead construct the CUSUM test with the residuals $\hat\epsilon_t = y_t / \hat\sigma_t$, where $\hat\sigma_t^2 = \hat\omega + \hat\alpha y_{t-1}^2 + \hat\beta \hat\sigma_{t-1}^2$ and $\hat\omega, \hat\alpha, \hat\beta$ are estimators of $\omega, \alpha, \beta$ obtained from a given training sample. Then, we employ the following:
$$\hat T_n^{max} = \hat T_n^{(1)} \vee \hat T_n^{(2)}, \qquad (8)$$
where the $\hat T_n^{(i)}$, $i = 1, 2$, are the same as the $T_n^{(i)}$ in (2) and (5) with $W_k$ in (1) replaced by $\hat W_k = \sum_{t=1}^{k} (\hat\epsilon_t^2 - \overline{\hat\epsilon^2}) / \hat\tau$, where $\overline{\hat\epsilon^2}$ and $\hat\tau^2$ are the sample mean and variance of the squared residuals obtained from the training sample, respectively. We reject the null hypothesis if $\hat T_n^{max} > c$ for the aforementioned critical value $c > 0$. Theoretically, the limiting distribution of $\hat T_n^{max}$ is anticipated to be the same as the one in (6) when certain conditions are fulfilled, namely, when the training sample size $m = m(n)$ grows suitably with n (e.g., $m = c' n / \log\log n$ for some $c' > 0$, as used in Section 4) and $(\hat\omega, \hat\alpha, \hat\beta)$ is a ($\sqrt{m}$-consistent) Gaussian quasi-maximum likelihood estimator (QMLE) as in [33]; the proof is rather standard and is omitted for brevity. The monitoring process for nonlinear time series via SVR is provided in Section 3.3 below.
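To illustrate, a minimal sketch of the statistic in (8), computed from the monitoring residuals with centering and scaling constants taken from the training residuals, might read as follows (the function name and interface are our own):

```python
import numpy as np

def cusum_max_stat(resid_test, resid_train):
    """Residual CUSUM statistic of (8): centering/scaling from training residuals."""
    e2 = resid_train ** 2
    mean2, tau = e2.mean(), e2.std(ddof=1)      # sample mean/sd of squared residuals
    n = len(resid_test)
    W = np.cumsum(resid_test ** 2 - mean2) / tau
    T1 = np.max(np.maximum.accumulate(W) - W) / np.sqrt(n)  # detects a variance decrease
    T2 = np.max(W - np.minimum.accumulate(W)) / np.sqrt(n)  # detects a variance increase
    return max(T1, T2)
```

An alarm is raised when the returned value exceeds the critical value c above.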

3. Monitoring Procedure via SVR-GARCH Model

3.1. Support Vector Regression

Support vector regression (SVR) is a nonparametric function estimation method, a branch of the support vector machine that originated in [34] as a classification tool. Here, SVR is utilized to conduct CUSUM tests, as it effectively accommodates circumstances where the underlying structure of the conditional variance in (7) is unknown and possibly nonlinear. Its details are elaborated in Section 3.3 below.
In SVR, we seek to find a function of the form:
$$f(x) = \langle w, \phi(x) \rangle + b,$$
where x denotes a vector of explanatory variables, w and b are regression parameters to be estimated, and $\phi$ is an implicit map that satisfies $K(x_1, x_2) = \phi(x_1)^{\top} \phi(x_2)$ for some pre-defined kernel function K.
To obtain the estimates of the regression parameters, we then formulate the problem where we can exploit its structure to employ a quadratic programming method [17]:
$$\text{minimize} \quad \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*), \qquad (9)$$
$$\text{subject to} \quad y_i - \left\{ w^{\top} \phi(x_i) + b \right\} \le \epsilon + \xi_i^*, \quad \left\{ w^{\top} \phi(x_i) + b \right\} - y_i \le \epsilon + \xi_i, \quad \xi_i \ge 0, \ \xi_i^* \ge 0,$$
where $\xi_i, \xi_i^* \ge 0$ denote slack variables, $C > 0$ denotes a penalty term, and $\epsilon$ is a tuning parameter that determines the level of tolerance regarding the error. Note that C and $\epsilon$ are user-defined tuning parameters of the model.
By the duality of the Karush–Kuhn–Tucker conditions, we obtain the following optimization problem with respect to the Lagrange multipliers $\alpha_i$ and $\alpha_i^*$ as dual variables [16]:
$$\text{maximize} \quad -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, \phi(x_i)^{\top} \phi(x_j) - \epsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, y_i, \qquad (10)$$
$$\text{subject to} \quad \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i \le C, \quad 0 \le \alpha_i^* \le C.$$
Consequently, the solutions of the optimization problem (10) construct the estimate $\hat f$ of f as follows:
$$\hat w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, \phi(x_i), \qquad \hat f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, K(x_i, x) + \hat b,$$
where b ^ can be obtained with various approaches. See [17] or [35] for references.
The tuning parameters of our SVR-GARCH model consist of C, $\epsilon$, and $\gamma^2$, as we utilize the Gaussian kernel function
$$K(x, y) = \exp\left( -\frac{\|x - y\|^2}{2\gamma^2} \right).$$
Moreover, as the tuning parameters are user-defined, they must either be selected prior to employing the SVR or be optimized. Here, we utilize PSO to select the set of tuning parameters; the PSO algorithm is elucidated in more detail in the next section.
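As an illustration, assuming one works with scikit-learn's SVR implementation (an assumption on our part; the paper does not specify the software), the kernel parameterization must be converted, since scikit-learn writes the RBF kernel as $\exp(-\gamma_{sk} \|x - y\|^2)$:

```python
import numpy as np
from sklearn.svm import SVR

# The Gaussian kernel exp(-||x - y||^2 / (2 gamma^2)) above corresponds to
# scikit-learn's gamma_sk = 1 / (2 * gamma^2).
C, eps, gam = 10.0, 0.01, 1.0                 # illustrative values; tuned by PSO below
svr = SVR(kernel="rbf", C=C, epsilon=eps, gamma=1.0 / (2.0 * gam ** 2))

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))             # e.g., features (y_{t-1}^2, proxy_{t-1}^2)
y = np.log(1.0 + X[:, 0] ** 2)                # toy log-volatility response
svr.fit(X, y)
y_hat = svr.predict(X[:5])
```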

3.2. Particle Swarm Optimization

PSO is one of the most widely acclaimed meta-heuristic methodologies and is extensively adopted for optimizing tuning parameters in various machine learning techniques. Initially proposed by [24], its ability to optimize nonlinear and nonconvex problems is inspired by the movement of organisms in bird flocks or animal herds. Here, we refer to [30] to describe the procedure.
Given a suitable objective function, a standard PSO algorithm aims to optimize a d-dimensional parameter x in the following search space:
$$X = \left\{ x = (x_1, \dots, x_d)^{\top} \in \mathbb{R}^d : l_k \le x_k \le u_k,\ k = 1, \dots, d \right\}, \quad d \ge 1,$$
for some $u_k, l_k \in \mathbb{R}$, $k = 1, \dots, d$. A particle j at time t, denoted by the d-dimensional vector $x_j(t) \in X$ with
$$x_j(t) = (x_{j1}(t), x_{j2}(t), \dots, x_{jd}(t))^{\top} \in X,$$
is considered a candidate for the set of optimal solutions. Moreover, a swarm at time t is defined as $S(t) = (x_1(t), \dots, x_N(t))^{\top}$. Each particle j has its own d-dimensional velocity vector at time t, denoted by $v_j(t) \in V$ for $j = 1, \dots, N$, where
$$V = \left\{ v = (v_1, \dots, v_d)^{\top} \in \mathbb{R}^d : v_{min} \le v_k \le v_{max},\ k = 1, \dots, d \right\}$$
is a velocity space with $v_{min}$ and $v_{max}$ being the lower and upper bounds of each velocity element, respectively [36]. Furthermore, the best solution of particle j up to time t is defined as:
$$p_j(t) = (p_{j1}, \dots, p_{jd})^{\top} \in X(j, t),$$
where $X(j, t) := \{x_j(1), \dots, x_j(t)\}$ denotes the trajectory of particle j up to time t.
Then, the global best solution over the trajectories of all particles up to time t is defined as $g(t) = (g_1, \dots, g_d)^{\top} \in \bigcup_{j=1}^{N} \{p_j(1), \dots, p_j(t)\}$. In each generation $t \in \{1, \dots, T_{max}\}$, where $T_{max}$ is a prescribed maximum generation time, the j-th particle in the swarm $S(t)$ is updated as follows:
$$v_j(t+1) = w(t)\, v_j(t) + c_1 z_1 \left( p_j(t) - x_j(t) \right) + c_2 z_2 \left( g(t) - x_j(t) \right), \qquad x_j(t+1) = x_j(t) + v_j(t+1),$$
where $c_1, c_2 \in \mathbb{R}$ are acceleration factors and $z_1, z_2$ are random variables generated from the uniform distribution $U[0, 1]$. In addition, $w(t)$ is an inertia term at time t, calculated as follows:
$$w(t) = (w_{\mathrm{start}} - w_{\mathrm{end}}) \frac{T_{max} - t}{T_{max}} + w_{\mathrm{end}},$$
where $w_{\mathrm{start}}$ and $w_{\mathrm{end}}$ are the predefined upper and lower bounds of the inertia values, respectively. Ref. [30] recently proved that the optimal solution obtained from the PSO algorithm converges to the global optimum with probability 1 under regularity conditions.
The process of the aforementioned PSO algorithm is condensed in Algorithm 1 below.
Algorithm 1 Standard PSO algorithm
1: procedure PSO($N$, $w_{max}$, $w_{min}$, $c_1$, $c_2$, $T_{max}$)
2:   acceleration factors $(c_1, c_2)$; maximum generation $T_{max}$
3:   initialize locations and velocities;
4:   while $t < T_{max}$ do
5:     $t \leftarrow t + 1$;
6:     $w(t) \leftarrow w_{max} - \frac{(w_{max} - w_{min})\, t}{T_{max}}$;
7:     for $i = 1, \dots, N$ do
8:       update $v_i(t)$;
9:       $x_i(t) \leftarrow x_i(t-1) + v_i(t)$;
10:      update $p_i(t)$;
11:    end for
12:    update $g(t)$;
13:  end while
14: end procedure
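A minimal Python rendering of Algorithm 1 for minimization might look as follows; the swarm size, acceleration factors, and inertia bounds are illustrative defaults rather than the paper's settings:

```python
import numpy as np

def pso(objective, lower, upper, n_particles=30, t_max=100,
        c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Standard PSO sketch following Algorithm 1 (minimization)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    d = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, d))        # initial locations
    v = np.zeros((n_particles, d))                              # initial velocities
    p, p_val = x.copy(), np.array([objective(xi) for xi in x])  # personal bests
    g = p[p_val.argmin()].copy()                                # global best
    for t in range(1, t_max + 1):
        w = w_max - (w_max - w_min) * t / t_max                 # inertia schedule
        z1, z2 = rng.random((n_particles, d)), rng.random((n_particles, d))
        v = w * v + c1 * z1 * (p - x) + c2 * z2 * (g - x)       # velocity update
        x = np.clip(x + v, lower, upper)                        # location update
        vals = np.array([objective(xi) for xi in x])
        better = vals < p_val
        p[better], p_val[better] = x[better], vals[better]
        g = p[p_val.argmin()].copy()
    return g
```

In our setting, the objective is the validation MAE of Section 3.3 viewed as a function of the tuning parameters $(C, \epsilon, \gamma)$.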

3.3. Monitoring Nonlinear Time Series via SVR

We extend the monitoring process introduced in Section 2 to embrace nonlinear GARCH models of the following form:
$$y_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = g(y_{t-1}^2, \sigma_{t-1}^2), \qquad (11)$$
where g is an unknown, possibly nonlinear, function to be estimated, $\sigma_t^2$ is the conditional volatility, and the $\epsilon_t$ are iid errors with zero mean and unit variance. To obtain the residuals that form the test statistic (8) of the monitoring process, we estimate g by $\hat g$ using a training sample $y_1, \dots, y_m$. For the estimation procedure, we utilize the SVR and adopt the notion of the retrospective test described in [19].
To estimate g via SVR, we first divide the training sample into two distinct parts, say $\{y_1, \dots, y_l\}$ and $\{y_{l+1}, \dots, y_m\}$, and regard the latter as the validation set. Subsequently, since $\sigma_t^2$ is unknown in practice, we replace it with a proxy $\tilde\sigma_t^2$. The authors of [14] employed the moving average method with a window size $s \ge 1$ as below:
$$\tilde\sigma_t^2 = \frac{1}{s} \sum_{j=1}^{s} y_{t-j+1}^2$$
for $t = 1, \dots, n$, where $y_0, y_{-1}, \dots$ are some sensible initial values. Although its validity with $s = 5$ is advocated in [19], we alternatively consider the exponentially weighted moving average (EWMA) estimator
$$\tilde\sigma_t^2 = \alpha \tilde\sigma_{t-1}^2 + (1 - \alpha) y_t^2,$$
where $\alpha \in (0, 1)$ is a tuning parameter, which we set to 0.94 in our study, and $\tilde\sigma_0^2 > 0$ is some initial value. This alteration improves both the stability and the performance of the model, as portrayed in Section 4.
Moreover, to further enhance the stability of the estimation of g, we log-transform the response variable of (11) and then take the exponential to obtain $\hat\sigma_t^2$. To elaborate, we recursively obtain $\hat\sigma_t^2$ through $\gamma_t := \log \sigma_t^2$ as follows:
$$\gamma_t = g^*(y_{t-1}^2, \sigma_{t-1}^2), \quad g^* = \log g; \qquad \hat\gamma_t = \hat g^*(y_{t-1}^2, \tilde\sigma_{t-1}^2), \quad \tilde\gamma_t = \log \tilde\sigma_t^2; \qquad \hat\sigma_t^2 = \exp(\hat\gamma_t). \qquad (12)$$
Given $\hat g$ and a space of tuning parameters $\Theta$, we then employ the PSO algorithm to obtain an optimal set of tuning parameters $\theta^* \in \Theta$ by evaluating the mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{m - l} \sum_{t=l+1}^{m} \left| \hat\sigma_t^2 - \tilde\sigma_t^2 \right|,$$
where $\hat\sigma_t^2$ and $\tilde\sigma_t^2$ are computed on the validation time series. Ultimately, the finalized $\hat g$ is obtained by utilizing the entire training sample $\{y_1, \dots, y_m\}$ together with $\theta^*$.
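Putting the pieces together, the following sketch illustrates this training/validation step under the EWMA proxy and the log-transformed response; the initial proxy value and the helper names are our own assumptions:

```python
import numpy as np
from sklearn.svm import SVR

def ewma_proxy(y, alpha=0.94, s2_init=None):
    """EWMA volatility proxy; initial value defaults to the sample variance (assumption)."""
    s2, prev = np.empty(len(y)), np.var(y) if s2_init is None else s2_init
    for t, yt in enumerate(y):
        prev = alpha * prev + (1.0 - alpha) * yt ** 2
        s2[t] = prev
    return s2

def validation_mae(theta, y, l):
    """Fit SVR on observations 1..l and return the MAE of sigma^2 on l+1..m."""
    C, eps, gam = theta
    s2 = ewma_proxy(y)
    X = np.column_stack([y[:-1] ** 2, s2[:-1]])   # features (y_{t-1}^2, proxy_{t-1}^2)
    z = np.log(s2[1:])                            # response gamma_t = log sigma_t^2
    svr = SVR(kernel="rbf", C=C, epsilon=eps,
              gamma=1.0 / (2.0 * gam ** 2)).fit(X[:l], z[:l])
    s2_hat = np.exp(svr.predict(X[l:]))           # back-transform to the variance scale
    return np.mean(np.abs(s2_hat - s2[1:][l:]))   # MAE on the validation part
```

Passing `lambda th: validation_mae(th, y_train_full, l)` to the PSO sketch of Section 3.2 then yields $\theta^*$.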
In practice, the test set of length n, denoted by $y_{m+1}, \dots, y_{m+n}$, emerges sequentially. Then, utilizing $\hat g$, we obtain the residuals
$$\hat\epsilon_t = y_t / \hat\sigma_t,$$
where $\hat\sigma_t^2$ is obtained recursively through (12). Afterwards, upon observing $y_l$, $m+1 \le l \le m+n$, the monitoring procedure is conducted by computing the CUSUM test statistic as in (8); we declare the process out of control when the statistic exceeds the prescribed critical value at the nominal level $\alpha \in (0, 1)$.
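A sketch of the resulting online loop is given below; it assumes a fitted predictor `g_star_hat` for $\log \sigma_t^2$ and the centering constants from the training residuals, and is illustrative rather than the authors' implementation:

```python
import numpy as np

def monitor(stream, g_star_hat, s2_init, mean2, tau, n, c=2.46509):
    """Sequentially update the CUSUM path and alarm once the statistic exceeds c."""
    s2, y_prev, W = s2_init, None, 0.0
    run_max, run_min = -np.inf, np.inf
    for k, y in enumerate(stream, start=1):
        if y_prev is not None:
            s2 = np.exp(g_star_hat(y_prev ** 2, s2))    # recursion as in (12)
        W += ((y * y) / s2 - mean2) / tau               # CUSUM of squared residuals
        run_max, run_min = max(run_max, W), min(run_min, W)
        if max(run_max - W, W - run_min) / np.sqrt(n) > c:
            return k                                    # out of control at time k
        if k == n:
            break
        y_prev = y
    return None                                         # in control throughout
```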
Remark 1.
In our proposed monitoring procedure, we use the theoretically obtained critical value c, which is justified by its notable performance in Section 4. However, if m is not large enough relative to n, there is a chance that the monitoring procedure might be undermined by the parameter estimation. In this case, one may obtain c empirically through a wild bootstrap approach with the following steps [21]:
1. Estimate $\omega, \alpha, \beta$ by $\hat\omega, \hat\alpha, \hat\beta$ from the training sample $y_1, \dots, y_m$;
2. Estimate $\sigma_t^2$ recursively by $\hat\sigma_t^2 = \hat\omega + \hat\alpha y_{t-1}^2 + \hat\beta \hat\sigma_{t-1}^2$ with some initial values $y_0$ and $\hat\sigma_0$;
3. Generate iid standard normal random variables $\eta_t^b$, $t = 1, \dots, n$, $b = 1, \dots, B$, and construct a bootstrap sample $y_t^b = \hat\sigma_t \eta_t^b$;
4. Based on $y_t^b$, $t = 1, \dots, n$, estimate $\omega, \alpha, \beta$ by $\hat\omega^*, \hat\alpha^*, \hat\beta^*$, and calculate the bootstrapped residuals $\hat\epsilon_t^{b*} = y_t^b / \hat\sigma_t^{b*}$, where $\hat\sigma_t^{b*}$ is obtained recursively by $\hat\sigma_t^{b*2} = \hat\omega^* + \hat\alpha^* (y_{t-1}^b)^2 + \hat\beta^* \hat\sigma_{t-1}^{b*2}$;
5. Based on these residuals, construct the monitoring processes $\hat T_n^{max,b}(k)$, $k = 1, \dots, n$, $b = 1, \dots, B$, similarly to $\hat T_n^{max}$ in (8), with $\hat W_k^b$ defined analogously to $\hat W_k$;
6. Finally, determine the critical value c as the $100\alpha\%$ upper quantile of $\hat T_n^{max,b} = \max_{1 \le k \le n} \hat T_n^{max,b}(k)$, $b = 1, \dots, B$.
The critical value c can also be obtained for the SVR-based procedure via a bootstrap method similar to that for the GARCH(1,1) model in (7). In this case, we use $y_t^b = \tilde\sigma_t \eta_t^b$. Based on the obtained bootstrap sample $y_t^b$, applying the SVR method again, we can get the conditional volatility estimates $\hat\sigma_t^{*b\,2} = \exp\{\hat g^*(y_{t-1}^{b\,2}, \tilde\sigma_{t-1}^{b\,2})\}$, where the $\tilde\sigma_t^{b\,2}$ are proxies obtained from the $y_t^b$. Then, the bootstrapped residuals are obtained as $\hat\epsilon_t^{b*} = y_t^b / \hat\sigma_t^{*b}$, and the critical value is obtained from these similarly to Steps 5 and 6 above. Instead of $\hat\sigma_t^{*b\,2}$, one might alternatively consider using $\tilde\sigma_t^{*b\,2} := \hat g(y_{t-1}^{b\,2}, \tilde\sigma_{t-1}^{b\,2})$ to obtain the residuals $\hat\epsilon_t^{b*} := y_t^b / \tilde\sigma_t^{*b}$.

4. Simulation Experiments

We assess the proposed SVR-GARCH monitoring process to measure its performance when the time series is simulated from linear or nonlinear variants of GARCH models, namely GARCH(p,q), asymmetric GARCH(p,q) (AGARCH), GJR-GARCH(p,q), and Box–Cox transformed threshold GARCH(p,q) (BCTT-GARCH), specified as follows:
$$\text{GARCH}(p,q):\quad y_t = \sigma_t \epsilon_t, \quad \sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \quad \alpha_i \ge 0,\ \beta_j \ge 0;$$
$$\text{AGARCH}(p,q):\quad y_t = \sigma_t \epsilon_t, \quad \sigma_t^2 = \omega + \alpha (y_{t-1} - b)^2 + \beta \sigma_{t-1}^2, \quad \alpha \ge 0,\ \beta \ge 0;$$
$$\text{GJR-GARCH}(p,q):\quad y_t = \sigma_t \epsilon_t, \quad \sigma_t^2 = \omega + \sum_{i=1}^{p} \left( \alpha_{1,i}\, y_{t-i}^{+2} + \alpha_{2,i}\, y_{t-i}^{-2} \right) + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \quad y_t^+ = \max(y_t, 0),\ y_t^- = \min(y_t, 0),\ \alpha_{1,i} \ge 0,\ \alpha_{2,i} \ge 0,\ \beta_j \ge 0;$$
$$\text{BCTT-GARCH}(p,q):\quad y_t = \sigma_t \epsilon_t, \quad \sigma_t^2 = \left\{ \omega + \sum_{i=1}^{p} \left[ \alpha_{1,i} \left( y_{t-i}^{+2} \right)^{\delta} + \alpha_{2,i} \left( y_{t-i}^{-2} \right)^{\delta} \right] + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^{2\delta} \right\}^{1/\delta}, \quad y_t^+ = \max(y_t, 0),\ y_t^- = \min(y_t, 0),\ \alpha_{1,i} \ge 0,\ \alpha_{2,i} \ge 0,\ \beta_j \ge 0,$$
where the orders p, q are fixed to 1 and the errors $\epsilon_t$ are iid $N(0, 1)$ random variables.
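For reference, the first and third of these data-generating processes can be sketched as follows; the starting variances are our own convention:

```python
import numpy as np

def garch11(n, w=0.3, a=0.3, b=0.3, seed=0):
    """GARCH(1,1) generator with iid N(0,1) errors."""
    rng, out = np.random.default_rng(seed), np.empty(n)
    s2 = w / (1.0 - a - b)                      # start at the unconditional variance
    for t in range(n):
        out[t] = np.sqrt(s2) * rng.standard_normal()
        s2 = w + a * out[t] ** 2 + b * s2       # sigma_{t+1}^2
    return out

def gjr_garch11(n, w=0.1, a1=0.3, a2=0.1, b=0.5, seed=0):
    """GJR-GARCH(1,1) generator with iid N(0,1) errors."""
    rng, out = np.random.default_rng(seed), np.empty(n)
    s2 = w / (1.0 - 0.5 * (a1 + a2) - b)        # rough starting value (assumption)
    for t in range(n):
        out[t] = np.sqrt(s2) * rng.standard_normal()
        yp, ym = max(out[t], 0.0), min(out[t], 0.0)
        s2 = w + a1 * yp ** 2 + a2 * ym ** 2 + b * s2
    return out
```

Injecting a change then amounts to switching the parameters at the change point while carrying $\sigma_t^2$ forward.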
In implementation, we generate a time series of length m from each of the above models and fit the SVR-GARCH model to it as described in Section 3. In this procedure, we use the first $\lfloor 0.7m \rfloor$ observations as the training set and the rest as the validation set, where $\lfloor x \rfloor$ denotes the largest integer not exceeding x. Subsequently, to mimic the circumstance where the monitoring dataset (testing sample) is observed sequentially, we design the underlying parametric model with a change located at a point $1 < s < n$. With this model, we generate one observation at a time, obtain the residual with the trained SVR-GARCH model, and compute the test statistic. We terminate this monitoring process either when the test statistic exceeds the critical value c or when the accumulated number of observations reaches the prescribed length n. To obtain the empirical sizes and powers, we iterate the procedure 1000 times at the significance level $\alpha = 0.05$; the empirical sizes and powers are calculated as the proportion of rejections of the null hypothesis $H_0$ out of the 1000 repetitions.
For the null hypothesis, we consider the following settings:
  • GARCH(1,1): $\omega = 0.3$, $\alpha = 0.3$, $\beta = 0.3$;
  • AGARCH(1,1): $\omega = 0.1$, $\alpha = 0.1$, $\beta = 0.8$, $b = 1$;
  • GJR-GARCH(1,1): $\omega = 0.1$, $\alpha_1 = 0.3$, $\alpha_2 = 0.1$, $\beta = 0.5$;
  • BCTT-GARCH(1,1): $\omega = 0.3$, $\alpha_1 = 0.4$, $\alpha_2 = 0.2$, $\beta = 0.3$, $\delta = 0.8$.
We inspect the cases of $(m, n) = (1000, 1000)$ and $(2000, 2000)$, where $m = cn/\log\log n$ is used, i.e., $c = 1.933, 2.028$ for $n = 1000, 2000$, respectively, and count the numbers of false alarms and anomaly detections when the underlying model experiences a change at diverse locations, namely at observation $\lfloor \gamma n \rfloor$ of the monitoring time series for $\gamma \in \{0.1, 0.2, \dots, 0.7\}$. For each set of experiments, we employ the PSO search algorithm to optimize the tuning parameters.
Table 1 and Figure 1 summarize the empirical sizes and powers for the above four cases. In a nutshell, the empirical sizes of all four models tend to stabilize around $\alpha = 0.05$. Additionally, the empirical powers approach 1 in the vast majority of the experiments, exhibiting a remarkable ability to detect changes. In particular, as anticipated, the power becomes larger as $\gamma$ decreases. Nevertheless, even when the change is located towards the end of the observations (i.e., a larger $\gamma$), the detection ability is not heavily deteriorated. This indicates that the performance of our monitoring process is not much affected by the location of the change.
The result for the AGARCH model, presented in Figure 1c,d, is particularly prominent. Although this model is systematically unstable because $\alpha + \beta$ is close to 1, the empirical sizes and powers appear highly reliable even at the relatively small sample size of $n = 1000$. This robustly confirms the validity of our proposed method. Overall, our findings show that the proposed monitoring process is highly applicable to datasets with high volatilities, such as daily returns of financial time series.

5. Real Data Analysis

In this section, we demonstrate the real-world applicability of our proposed SVR-GARCH monitoring scheme with three financial time series: the S&P 500 Composite Index (2 January 1991–31 December 2003), the Korea Composite Stock Price Index (KOSPI) (2 July 2012–29 September 2020), and the stock price of Microsoft Corporation (1 July 2009–30 September 2020), obtained from the websites Yahoo Finance and Investing.com. Throughout the analysis, we use the log returns, defined as $r_t = 100 \times \{\log(y_t) - \log(y_{t-1})\}$ for $t \ge 2$; the resulting series are denoted by "S&P500", "KOSPI", and "Microsoft".
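In code, the log returns are computed as follows (a two-line sketch; `prices` is a hypothetical array of daily closes):

```python
import numpy as np

prices = np.array([100.0, 101.2, 100.7, 102.3])   # toy closing prices
r = 100.0 * np.diff(np.log(prices))               # r_t = 100 * (log y_t - log y_{t-1})
```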
The procedure of this analysis comprises four steps. We first divided the time series of log returns into two chunks and assigned the former as the training set of size m. To reflect the monitoring situation, we set the maximum number of observations to $n = 1500$ and regarded the latter series as being observed sequentially. We then examined the general behavior of the given time series through its summary statistics, autocorrelation functions (ACFs), and partial ACFs (PACFs). Table 2 and Figure 2 present the basic characteristics of the datasets and the plots of the ACFs and PACFs up to lag 25, respectively. The characteristics of KOSPI are similar to those of the standard normal distribution in terms of skewness and excess kurtosis. By contrast, the excess kurtoses of S&P500 and Microsoft deviate from that of the standard normal distribution, and these series are moderately skewed compared with KOSPI. Moreover, the ACFs and PACFs of all three time series do not suggest significant autocorrelations, which supports the stationarity of the datasets. Additionally, fitting the SVR-GARCH model of [19] to examine the existence of a change within the training time series, we applied their retrospective CUSUM test to the training set. The resulting test statistics did not exceed the critical value of 1.3397, which confirmed the absence of a change within the training set at the significance level of 0.05.
Figure 3, Figure 4 and Figure 5 present the results of the monitoring procedure and the identified locations of change in conditional volatility. The change in volatility of the S&P500 appears to have occurred on 28 October 1997. Notably, our model promptly responded to the event of the previous day, when the log returns fluctuated by more than 7%. Additionally, we can observe an increase in volatility after the detected location. Indeed, the detected location falls within the period of the 1997 Asian financial crisis, specifically on the exact date when the South Korean won plummeted to a new low. This instantaneous response to an abrupt change in volatility strengthens the validity of our monitoring process.
The locations of change for KOSPI and Microsoft were detected on 11 March and 13 March 2020, respectively. Notice that these detected locations are close to the predetermined n. These findings indicate that the detection ability of our monitoring process is not impaired even when the volatility change is located near the end of the monitoring period. In addition, the change point in KOSPI predates by a few days the major spread of the global pandemic, when the log returns fluctuated abruptly, exceeding 8%. In fact, the detected locations of change for KOSPI and Microsoft lie within the period of the stock market crash caused by the pandemic, which still affects the global financial market to a certain extent. This result illuminates the potential of our monitoring scheme to signal a change at an early stage of a shift in the underlying structure of the model, and to answer the need for rapid detection of structural changes across various fields of research.

6. Concluding Remarks

In this study, we considered a novel monitoring process for detecting a significant change in the conditional volatilities of time series. For this task, we proposed a procedure based on a CUSUM test with a new test statistic and utilized the SVR-GARCH model to calculate the residuals used to construct the monitoring process, wherein PSO is employed to obtain an optimal set of tuning parameters. Our simulation study demonstrated the adequacy of the proposed monitoring process for various linear and nonlinear GARCH-type time series. A real data analysis of three financial time series, namely S&P500, KOSPI, and Microsoft, affirmed the practicability of our model in various real-world circumstances. Our proposed model can also be adopted in classical SPC, where one controls the ARL directly rather than the type I error; this could be achieved empirically using the bootstrap method stated in Section 3.3. As the need for control charts tailored to various circumstances keeps increasing, we leave this issue as a future research project.

Author Contributions

Conceptualization, S.L.; methodology, S.L.; software, C.K.K. and D.K.; validation, S.L. and C.K.K.; formal analysis, C.K.K. and D.K.; investigation, S.Y. and C.K.K.; resources, S.Y.; data curation, C.K.K.; writing—original draft preparation, S.L. and C.K.K.; writing—review and editing, S.L. and C.K.K.; visualization, C.K.K.; supervision, S.L.; project administration, S.L.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Science, ICT and Future Planning (grant 2018R1A2A2A05019433).

Acknowledgments

We thank the three anonymous referees for their careful reading and valuable comments that improved the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
CUSUM    cumulative sum
SPC    statistical process control
ARL    average run length
ARMA    autoregressive and moving average
ARCH    autoregressive conditional heteroskedasticity
GARCH    generalized autoregressive conditional heteroskedasticity
SVR    support vector regression
SVM    support vector machine
PSO    particle swarm optimization
iid    independent and identically distributed
MAE    mean absolute error
EWMA    exponentially weighted moving average
AGARCH    asymmetric GARCH
GJR-GARCH    Glosten, Jagannathan and Runkle GARCH
BCTT-GARCH    Box–Cox transformed threshold GARCH
QMLE    quasi-maximum likelihood estimator
KOSPI    Korea Composite Stock Price Index
ACF    autocorrelation function
PACF    partial ACF

References

1. Page, E.S. Continuous inspection schemes. Biometrika 1954, 41, 100–115.
2. Page, E.S. A test for a change in a parameter occurring at an unknown point. Biometrika 1955, 42, 523–527.
3. Wu, Z.; Jiao, J.; Yang, M.; Liu, Y.; Wang, Z. An enhanced adaptive CUSUM control chart. IIE Trans. 2009, 41, 642–653.
4. Regier, P.; Briceño, H.; Boyer, J.N. Analyzing and comparing complex environmental time series using a cumulative sums approach. MethodsX 2019, 6, 779–787.
5. Contreras-Reyes, J.E.; Idrovo-Aguirre, B.J. Backcasting and forecasting time series using detrended cross-correlation analysis. Phys. A Stat. Mech. Appl. 2020, 560, 125109.
6. Montgomery, D.C. Statistical Quality Control; Wiley Global Education: Hoboken, NJ, USA, 2012.
7. Gombay, E.; Serban, D. Monitoring parameter change in AR(p) time series models. J. Multivar. Anal. 2009, 100, 715–725.
8. Huh, J.; Oh, H.; Lee, S. Monitoring parameter change for time series models with conditional heteroscedasticity. Econ. Lett. 2017, 152, 66–70.
9. Na, O.; Lee, Y.; Lee, S. Monitoring parameter change in time series models. Stat. Methods Appl. 2011, 20, 171–199.
10. Vapnik, V. Statistical Learning Theory; John Wiley and Sons: New York, NY, USA, 1998.
11. Fernandez-Rodriguez, F.; Gonzalez-Martel, C.; Sosvilla-Rivero, S. On the profitability of technical trading rules based on artificial neural networks: Evidence from the Madrid stock market. Econ. Lett. 2000, 69, 89–94.
12. Cao, L.; Tay, F. Financial forecasting using support vector machines. Neural Comput. Appl. 2001, 10, 184–192.
13. Pérez-Cruz, F.; Afonso-Rodriguez, J.; Giner, J. Estimating GARCH models using SVM. Quant. Financ. 2003, 3, 163–172.
14. Chen, S.; Härdle, W.K.; Jeong, K. Forecasting volatility with support vector machine-based GARCH model. J. Forecast. 2010, 29, 406–433.
15. Bezerra, P.; Albuquerque, P. Volatility forecasting via SVR–GARCH with mixture of Gaussian kernels. Comput. Manag. Sci. 2017, 14, 179–196.
16. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 2000.
17. Smola, A.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222.
18. Lee, S.; Lee, S.; Moon, M. Hybrid change point detection for time series via support vector regression and CUSUM method. Appl. Soft Comput. 2020, 89, 106101.
19. Lee, S.; Kim, C.; Lee, S. Hybrid CUSUM change point test for time series with time-varying volatilities based on support vector regression. Entropy 2020, 22, 578.
20. Lee, S.; Tokutsu, Y.; Maekawa, K. The cusum test for parameter change in regression models with ARCH errors. J. Jpn. Stat. Soc. 2004, 34, 173–188.
21. Oh, H.; Lee, S. Modified residual CUSUM test for location-scale time series models with heteroscedasticity. Ann. Inst. Stat. Math. 2019, 71, 1059–1091.
22. Lee, S. Location and scale-based CUSUM test with application to autoregressive models. J. Stat. Comput. Simul. 2020, 90, 2309–2328.
23. Lee, S.; Ha, J.; Na, O.; Na, S. The CUSUM test for parameter change in time series models. Scand. J. Stat. 2003, 30, 781–796.
24. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; IEEE: Piscataway, NJ, USA, 1995; Volume 4, pp. 1942–1948.
25. Ozcan, E.; Mohan, C.K. Analysis of a simple particle swarm optimization system. Intell. Eng. Syst. Through Artif. Neural Netw. 1998, 8, 253–258.
26. Trelea, I.C. The particle swarm optimization algorithm: Convergence analysis and parameter selection. Inf. Process. Lett. 2003, 85, 317–325.
27. Yasuda, K.; Ide, A.; Iwasaki, N. Adaptive particle swarm optimization. In SMC’03 Conference Proceedings, Proceedings of the 2003 IEEE International Conference on Systems, Man and Cybernetics, Washington, DC, USA, 8 October 2003; IEEE: Piscataway, NJ, USA, 2003; Volume 2, pp. 1554–1559.
28. Zhang, Y.; Wang, S.; Ji, G. A comprehensive survey on particle swarm optimization algorithm and its applications. Math. Probl. Eng. 2015, 2015, 1–38.
29. Wang, D.; Tan, D.; Liu, L. Particle swarm optimization algorithm: An overview. Soft Comput. 2018, 22, 387–408.
30. Qian, W.; Li, M. Convergence analysis of standard particle swarm optimization algorithm and its improvement. Soft Comput. 2018, 22, 4047–4070.
31. Billingsley, P. Convergence of Probability Measures; Wiley: New York, NY, USA, 1968.
32. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
33. Francq, C.; Zakoian, J.M. GARCH Models: Structure, Statistical Inference and Financial Applications; John Wiley & Sons: New York, NY, USA, 2019.
34. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
35. Abe, S. Support Vector Machines for Pattern Classification; Springer: New York, NY, USA, 2005; Volume 2.
36. Marini, F.; Walczak, B. Particle swarm optimization (PSO). A tutorial. Chemom. Intell. Lab. Syst. 2015, 149, 153–165.
Figure 1. Plots of empirical power of the SVR-GARCH monitoring procedure.
Figure 2. Plot of ACF and PACF up to lag 25 of log-returns.
Figure 3. Raw index and its log returns of S&P500.
Figure 4. Raw index and its log returns of KOSPI.
Figure 5. Raw index and its log returns of Microsoft.
Table 1. Empirical size and power of the SVR-GARCH monitoring procedure for the GARCH(1,1), AGARCH(1,1), GJR-GARCH(1,1), and BCTT-GARCH(1,1) models. For each model, the size row gives the empirical size; each power row gives the empirical power under the indicated parameter change, for change locations 0.1n, 0.2n, ..., 0.7n with n = 1000 (left block) and n = 2000 (right block).

GARCH(1,1)    size: 0.038 (n = 1000), 0.045 (n = 2000)
  power, ω → 1:      0.903 0.893 0.879 0.867 0.824 0.778 0.737 | 0.953 0.945 0.934 0.916 0.893 0.845 0.805
  power, ω → 0.1:    0.954 0.942 0.898 0.824 0.701 0.475 0.206 | 0.971 0.963 0.959 0.916 0.840 0.616 0.317
  power, α → 0.6:    0.961 0.955 0.940 0.924 0.882 0.830 0.726 | 0.995 0.990 0.976 0.973 0.951 0.939 0.870
  power, β → 0.6:    0.974 0.958 0.955 0.946 0.907 0.871 0.832 | 0.996 0.992 0.988 0.981 0.975 0.940 0.920

AGARCH(1,1)    size: 0.039 (n = 1000), 0.037 (n = 2000)
  power, ω → 1:      0.993 0.989 0.987 0.985 0.968 0.959 0.939 | 0.996 0.997 0.988 0.995 0.989 0.980 0.975
  power, β → 0.2:    1.000 1.000 1.000 1.000 1.000 1.000 0.992 | 1.000 1.000 1.000 1.000 1.000 1.000 1.000
  power, (α, β) → (0.8, 0.1): 0.967 0.927 0.924 0.900 0.856 0.767 0.672 | 0.995 0.992 0.983 0.978 0.950 0.899 0.832
  power, b → 3:      0.989 0.986 0.981 0.978 0.962 0.943 0.920 | 0.994 0.995 0.994 0.984 0.987 0.988 0.966

GJR-GARCH(1,1)    size: 0.042 (n = 1000), 0.033 (n = 2000)
  power, ω → 1:      1.000 0.999 1.000 0.998 0.999 0.998 0.996 | 1.000 1.000 1.000 1.000 0.999 1.000 0.999
  power, ω → 0.01:   1.000 1.000 1.000 1.000 1.000 1.000 1.000 | 1.000 1.000 1.000 1.000 1.000 1.000 1.000
  power, α₂ → 0.8:   1.000 1.000 0.999 0.999 0.998 0.997 0.967 | 1.000 1.000 1.000 1.000 0.999 1.000 0.998
  power, β → 0.75:   0.996 0.993 0.994 0.986 0.986 0.977 0.954 | 1.000 1.000 1.000 0.998 0.998 0.997 0.989

BCTT-GARCH(1,1)    size: 0.048 (n = 1000), 0.039 (n = 2000)
  power, ω → 1:      0.978 0.973 0.973 0.957 0.948 0.934 0.897 | 0.997 0.996 0.993 0.989 0.979 0.975 0.955
  power, ω → 0.01:   0.995 0.995 0.993 0.978 0.951 0.855 0.556 | 0.999 0.997 0.997 0.997 0.987 0.958 0.819
  power, α₁ → 1:     0.997 0.995 0.988 0.992 0.962 0.947 0.903 | 1.000 0.999 1.000 1.000 0.996 0.989 0.968
  power, α₂ → 0.9:   0.987 0.971 0.968 0.941 0.921 0.875 0.817 | 0.995 0.993 0.988 0.992 0.980 0.958 0.911
  power, β → 0.6:    0.986 0.983 0.984 0.967 0.970 0.941 0.897 | 0.998 0.994 0.997 0.992 0.990 0.992 0.972
  power, δ → 0.2:    0.999 0.998 0.999 0.992 0.969 0.891 0.585 | 0.999 0.999 0.998 0.999 0.996 0.978 0.854
  power, δ → 2:      0.850 0.814 0.795 0.749 0.671 0.585 0.502 | 0.945 0.919 0.890 0.846 0.800 0.728 0.602
Table 2. Summary statistics and the results of the preliminary retrospective test on the training set, and the detected change location of the monitoring test, for S&P500, KOSPI, and Microsoft.

                                   S&P500        KOSPI         Microsoft
Summary statistics (training set)
  Observations                     1640          1016          1417
  Mean                             0.0604        0.0096        0.0428
  Standard deviation               0.6931        0.7728        1.4408
  Minimum                          −3.7272       −3.1429       −12.1033
  Median                           0.0352        0.0070        0.03145
  Maximum                          3.6642        2.9124        7.0330
  Skewness                         −0.1064       −0.0264       −0.6141
  Excess kurtosis                  2.2428        1.3848        6.8293
Retrospective test (training set)
  Test statistic                   0.8069        1.2876        0.5897
Monitoring test
  Detected location                1997/10/28    2020/03/11    2020/03/13