Article

Asymptotic Properties and Application of GSB Process: A Case Study of the COVID-19 Dynamics in Serbia

by Mihailo Jovanović 1,†, Vladica Stojanović 2,*,†, Kristijan Kuk 2,†, Brankica Popović 2,† and Petar Čisar 2,†
1 The Office for Information Technologies and eGovernment, 11000 Belgrade, Serbia
2 Department of Informatics & Computer Sciences, University of Criminal Investigation and Police Studies, 11000 Belgrade, Serbia
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2022, 10(20), 3849; https://doi.org/10.3390/math10203849
Submission received: 22 September 2022 / Revised: 13 October 2022 / Accepted: 14 October 2022 / Published: 17 October 2022
(This article belongs to the Special Issue Limit Theorems of Probability Theory)

Abstract

This paper describes one of the non-linear (and non-stationary) stochastic models, the GSB (Gaussian, or Generalized, Split-BREAK) process, which is used in the analysis of time series with pronounced and accentuated fluctuations. In the beginning, the stochastic structure of the GSB process and its important distributional and asymptotic properties are given. To that end, a method based on characteristic functions (CFs) was used. Various procedures for the estimation of model parameters, asymptotic properties, and numerical simulations of the obtained estimators are also investigated. Finally, as an illustration of the practical application of the GSB process, an analysis is presented of the dynamics and stochastic distribution of the infected and immunized population in relation to the disease COVID-19 in the territory of the Republic of Serbia.

1. Introduction

Stochastic models which are used in the analysis of time series with pronounced and permanent fluctuations are of particular importance in contemporary research. For this purpose, we start from the basic results of Engle and Smith [1], who first introduced the so-called STOchastic Permanent BREAKing process, popularly called the STOPBREAK process. Many authors have since considered the STOPBREAK notion, primarily in the field of econometrics. Some of its modifications were considered, among others, in [2,3,4,5], while its application was presented, for instance, in [6,7,8].
The original modification of the STOPBREAK process, named the Split-BREAK model, was introduced in [9]. After that, the general form of this process, named the Gaussian (or Generalized) Split-BREAK (GSB) process, was proposed in [10,11,12]. This stochastic model can also be viewed as a generalization of STOPBREAK, as well as of the well-known linear Auto-Regressive Moving Average (ARMA) model. As such, the GSB process has already been applied in analyzing non-linear time series with pronounced and permanent fluctuations. Note that the works mentioned above mainly considered the stochastic properties of the stationary components of the GSB process. The main goal of this paper is a more detailed investigation of the non-stationary components (time series) of the GSB model. These series naturally have a more complex stochastic structure, but they are of particular interest in contemporary research [13,14,15,16,17,18]. To this end, the asymptotic properties of the distributions of the GSB series will also be of specific interest.
In addition to the theoretical aspects, we also consider the application of the GSB process in describing the dynamics, and finding an adequate stochastic distribution, of the infected and immunized population with respect to COVID-19 on the territory of the Republic of Serbia. Many authors who deal with this still current issue have contributed various theoretical models that investigate it from several aspects. For instance, rigorous mathematical models, usually based on analyzing and solving systems of partial coupled equations, have been proposed, among others, in [19,20,21]. On the other hand, the works [22,23,24,25] combine deterministic and stochastic approaches, such as multiple and logistic regression, multifactor correlation, and the least squares estimation method, to predict the various effects caused by the COVID-19 pandemic. A particularly interesting approach is given in [26,27], where machine learning techniques and the construction of a complete information system are used to predict the COVID-19 dynamics more accurately. Finally, to the best of our knowledge, most stochastic approaches to date in the analysis of infection, immunization, and other indicators related to COVID-19 were based on the use of the gamma distribution [21,28], as well as the log-normal distribution [29]. This is one of the reasons why we believe that the approach given here is different, primarily in the stochastic modeling and study of this problem. At the same time, let us emphasize that our main goal is to model the temporal dynamics of COVID-19, based on a formal study of the stochastic structure of the GSB model. In this sense, some other indicators and features of this disease, which can also affect its dynamics (see, for instance, [30,31,32]), can to a certain degree be a limitation of this approach.
In the next section, starting from previous works [9,10,11,12], some definitions and basic stochastic properties of the GSB process are discussed. Section 3 contains the main and novel results related to this process’s detailed stochastic structure and asymptotic properties, where the method of characteristic functions (CFs) was used as the basic tool. Section 4 presents the procedure for estimating the unknown parameters of the GSB process and an investigation of the asymptotic properties of the obtained estimators. Numerical Monte Carlo simulations of the obtained estimators are considered in Section 5. In addition, the application of the GSB process in describing the dynamics and distribution of the size of infected and immunized populations on the territory of the Republic of Serbia is given here. Finally, concluding remarks are highlighted in Section 6.

2. Definition and Main Properties of the GSB Process

The basic series of GSB processes is defined by the following equality:
$$y_t = m_t + \varepsilon_t. \tag{1}$$
Here, $t = 0, 1, \dots, T$ are the known time values, $(m_t)$ is the series of the so-called martingale means, and $(\varepsilon_t)$ are the innovations, i.e., a series of independent identically distributed (IID) Gaussian $N(0, \sigma^2)$ random variables (RVs). Moreover, it is assumed that $(\varepsilon_t)$ is defined on a common probability space $(\Omega, \mathcal{F}, P)$, expanded by some filtration $\mathbb{F} = (\mathcal{F}_t)$, i.e., a nondecreasing family of $\sigma$-algebras on $\Omega$. In a practical sense, the filtration $(\mathcal{F}_t)$ represents the set of “information” available at time $t$. Therefore, it is assumed that, for each $t = 0, 1, \dots, T$, the RVs $\varepsilon_t$ are $\mathcal{F}_t$-adapted. Accordingly, the conditional expectation and the conditional variance of the RVs $\varepsilon_t$ are, respectively,
$$E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0, \qquad V(\varepsilon_t \mid \mathcal{F}_{t-1}) = E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2.$$
On the other hand, for martingale means ( m t ) , we assume that they are defined by the following recurrence relation:
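$$m_t = m_{t-1} + q_{t-1}\varepsilon_{t-1} = m_0 + \sum_{j=0}^{t-1} q_j \varepsilon_j. \tag{2}$$
Here, we can effectively assume that $m_0 \overset{as}{=} \mu$ (const.) and $\varepsilon_{-1} = \varepsilon_0 \overset{as}{=} 0$. Meanwhile, $q_t$ is the so-called noise indicator, i.e., the RV that depends on the innovations $(\varepsilon_t)$ in the following way:
$$q_t = I(\varepsilon_{t-1}^2 > c) = \begin{cases} 1, & \varepsilon_{t-1}^2 > c \\ 0, & \varepsilon_{t-1}^2 \le c. \end{cases}$$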
The value $c > 0$ represents the critical value of the reaction, i.e., the level that the previous realizations of the innovations $(\varepsilon_t)$ must exceed for them to be included in Equation (2). In other words, the value $q_{t-1} = 0$ indicates that there is no change in the martingale mean $m_t$ compared to the previous value $m_{t-1}$. Consequently, the value $y_t$ is obtained with a “small” fluctuation, which depends only on $\varepsilon_t$. By contrast, in the case of $q_{t-1} = 1$, a pronounced (permanent) fluctuation of $y_t$ is registered. Thus, the level of previous realizations of the series $(\varepsilon_t)$ affects the degree of variation in the series $(y_t)$, that is, it indicates the intensity of fluctuations in the GSB process. Furthermore, according to the previous equalities, it follows that:
$$E(y_t \mid \mathcal{F}_{t-1}) = m_t + E(\varepsilon_t \mid \mathcal{F}_{t-1}) = m_t,$$
from which we conclude that the realizations of the series $(y_t)$ are “close” to the martingale means $(m_t)$. Moreover, it holds that:
$$E(y_t) = E\left[E(y_t \mid \mathcal{F}_{t-1})\right] = E(m_t) = E(m_{t-1}) + E(q_{t-1}\varepsilon_{t-1}) = E(m_{t-1}) = \cdots = E(m_0) = \mu,$$
i.e., the mean values of the series $(y_t)$ and $(m_t)$ are equal and constant. The previous equalities reveal much about the stochastic nature of the GSB process, that is, about the additive decomposition (1). Since the sequence $(m_t)$ is measurable with respect to the $\sigma$-algebra $\mathcal{F}_{t-1}$, it represents the component of predictability and stability of the GSB process. In contrast, the innovations series $(\varepsilon_t)$ is the deviation factor (white noise) of the basic GSB series $(y_t)$ in relation to the martingale means $(m_t)$.
Further, we determine the conditional variance of the series ( y t ) from the equation:
$$V(y_t \mid \mathcal{F}_{t-1}) = E(y_t^2 \mid \mathcal{F}_{t-1}) - m_t^2 = 2 m_t E(\varepsilon_t) + E(\varepsilon_t^2) = \sigma^2,$$
and from here, one obtains:
$$V(y_t) = E(y_t^2) - \mu^2 = E(m_t^2) + 2E(m_t \varepsilon_t) + E(\varepsilon_t^2) - \mu^2 = V(m_t) + \sigma^2.$$
For each $t = 1, \dots, T$, it also holds that:
$$V(m_t) = E(m_t^2) - \mu^2 = E(m_{t-1}^2) + 2E(m_{t-1} q_{t-1} \varepsilon_{t-1}) + E(q_{t-1}^2 \varepsilon_{t-1}^2) - \mu^2 = V(m_{t-1}) + a_c \sigma^2,$$
where $a_c = E(q_t) = E(q_t^2) = P\{\varepsilon_t^2 > c\}$. It follows that the variance of the martingale means $(m_t)$, under the assumption $m_0 \equiv \mu$ (const.), can be expressed as:
$$V(m_t) = t\, a_c \sigma^2, \quad t \ge 0.$$
From here, the variance of the basic series $(y_t)$ can be obtained as follows:
$$V(y_t) = V(m_t) + \sigma^2 = (t\, a_c + 1)\sigma^2, \quad t \ge 0.$$
According to the previous equalities, the variances of the series ( y t ) and ( m t ) have non-constant values that depend on the point in time ( t ) in which they are observed.
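The variance formulas above can be evaluated numerically. The following is a minimal sketch, assuming Gaussian innovations as in the text, so that $P\{\varepsilon_t^2 \le c\} = \operatorname{erf}\big(\sqrt{c/(2\sigma^2)}\big)$; the function names `a_c`, `var_m`, and `var_y` are illustrative and not taken from the paper.

```python
import math

def a_c(c, sigma):
    """P{eps^2 > c} for eps ~ N(0, sigma^2), via the error function."""
    return 1.0 - math.erf(math.sqrt(c / (2.0 * sigma ** 2)))

def var_m(t, c, sigma):
    """V(m_t) = t * a_c * sigma^2, valid for t >= 0."""
    return t * a_c(c, sigma) * sigma ** 2

def var_y(t, c, sigma):
    """V(y_t) = (t * a_c + 1) * sigma^2, valid for t >= 0."""
    return var_m(t, c, sigma) + sigma ** 2

if __name__ == "__main__":
    # Both variances grow linearly in t, which reflects non-stationarity.
    for t in (0, 10, 100):
        print(t, var_m(t, c=1.0, sigma=1.0), var_y(t, c=1.0, sigma=1.0))
```

With $c = \sigma^2$, the probability $a_c$ is roughly 0.317, so about one innovation in three enters the martingale mean permanently.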
Correlation functions of the series $(y_t)$ and $(m_t)$ can be obtained in a similar way. Note that for every $s > t \ge 0$, it holds that:
$$Cov(m_t, m_s) = E(m_t m_s) - \mu^2 = E(m_t m_{s-1}) + E(m_t q_{s-1} \varepsilon_{s-1}) - \mu^2 = Cov(m_t, m_{s-1}),$$
and it is easy to see that the covariance of the series $(m_t)$ satisfies:
$$Cov(m_t, m_s) = V(m_t), \quad s > t \ge 0.$$
From here, the correlation function of the martingale means is obtained:
$$\tilde{K}(s,t) = \frac{Cov(m_t, m_s)}{\sqrt{V(m_t) \cdot V(m_s)}} = \begin{cases} \dfrac{\min(s,t)}{\sqrt{s \cdot t}}, & s \ne t \\ 1, & s = t. \end{cases}$$
Similarly, according to the equalities:
$$Cov(y_t, y_s) = E(y_t y_s) - \mu^2 = E(y_t m_s) + E(y_t \varepsilon_s) - \mu^2 = E(m_t m_s) + E(\varepsilon_t m_s) - \mu^2 = Cov(m_t, m_s) + a_c \sigma^2 = V(m_t) + a_c \sigma^2 = V(y_t), \quad s > t \ge 0,$$
the correlation function for $(y_t)$ can be obtained as follows:
$$K(s,t) = \begin{cases} \dfrac{a_c \min(s,t) + 1}{\sqrt{(a_c s + 1) \cdot (a_c t + 1)}}, & s \ne t \\ 1, & s = t. \end{cases}$$
Therefore, both correlation functions depend on the time arguments $t, s$ and indicate the non-stationarity of the series $(y_t)$ and $(m_t)$. This fact requires more complex techniques to examine their properties. Moreover, note that when $s > t \ge 0$,
$$\lim_{s \to t} \tilde{K}(s,t) = \lim_{s \to t} \frac{\min(s,t)}{\sqrt{s \cdot t}} = \frac{t}{\sqrt{t^2}} = 1, \qquad \lim_{s \to t} K(s,t) = \lim_{s \to t} \frac{a_c \min(s,t) + 1}{\sqrt{(a_c s + 1) \cdot (a_c t + 1)}} = \frac{a_c t + 1}{\sqrt{(a_c t + 1)^2}} = 1.$$
Thus, the correlation functions of both series $(y_t)$ and $(m_t)$ satisfy the $L^2$-continuity condition.
At the end of this section, we define a series of increments of the GSB process by the following equality:
$$X_t = y_t - y_{t-1}, \quad t = 1, \dots, T. \tag{3}$$
Almost all authors who have studied STOPBREAK processes highlight the importance of this sequence. This series, as can be easily seen from Equations (1) and (2), can be given in the following form:
$$X_t = \varepsilon_t - \theta_{t-1} \varepsilon_{t-1}, \tag{4}$$
where $\theta_t = 1 - q_t = I(\varepsilon_{t-1}^2 \le c)$. The series $(X_t)$ is named a Splitting Moving Average process (of order 1), shortened to Split-MA (1) process, because it operates in two regimes. Fluctuations of the innovations $(\varepsilon_t)$ that were pronounced at the previous time moment $(t-1)$ imply $\theta_{t-1} = 0$, so the equality $X_t = \varepsilon_t$ holds. On the other hand, fluctuations that do not exceed the critical value $c$ give a representation of $(X_t)$ in the form of a standard, linear MA (1) process. In this way, $(X_t)$ has similar properties to the MA (1) models, which can be used to study it. Thus, under the earlier assumptions, the mean value and variance of this series, obtained by simple computation, are:
$$E(X_t) = 0, \qquad V(X_t) = E(X_t^2) = \sigma^2 (b_c + 1),$$
where $b_c = 1 - a_c = P(\varepsilon_{t-1}^2 \le c)$. Moreover, the covariance of this sequence is:
$$Cov(X_t, X_s) = \begin{cases} (b_c + 1)\sigma^2, & s = t \\ -b_c \sigma^2, & |s - t| = 1 \\ 0, & \text{otherwise}, \end{cases}$$
and obviously has an identical structure to that of the standard MA (1) series. Based on the obtained covariance, we can easily see that the series $(X_t)$ is stationary and that its correlation function can be written in the form:
$$\rho_X(h) = \frac{Cov(X_t, X_{t+h})}{V(X_t)} = \begin{cases} 1, & h = 0 \\ -b_c / (b_c + 1), & h = \pm 1 \\ 0, & \text{otherwise}. \end{cases}$$
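The lag-one correlation of the Split-MA (1) process can be checked against a simulated path. The sketch below assumes Gaussian innovations and the convention $\varepsilon_{-1} = \varepsilon_0 = 0$ from Section 2; the helper names `split_ma1`, `sample_rho1`, and `theoretical_rho1` are hypothetical and introduced only for illustration.

```python
import math
import random

def split_ma1(T, sigma=1.0, c=1.0, seed=0):
    """Simulate X_t = eps_t - theta_{t-1} * eps_{t-1}, t = 1..T."""
    rng = random.Random(seed)
    # eps[i] stores eps_{i-1}, so eps[0] = eps_{-1} = 0 and eps[1] = eps_0 = 0.
    eps = [0.0, 0.0] + [rng.gauss(0.0, sigma) for _ in range(T)]
    x = []
    for t in range(1, T + 1):
        e_t, e_tm1, e_tm2 = eps[t + 1], eps[t], eps[t - 1]
        theta_tm1 = 1.0 if e_tm2 ** 2 <= c else 0.0  # theta_{t-1} = I(eps_{t-2}^2 <= c)
        x.append(e_t - theta_tm1 * e_tm1)
    return x

def sample_rho1(x):
    """Sample first-order correlation of a zero-mean series."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x)
    return num / den

def theoretical_rho1(c, sigma):
    """rho_X(1) = -b_c / (b_c + 1), with b_c = P{eps^2 <= c}."""
    b = math.erf(math.sqrt(c / (2.0 * sigma ** 2)))
    return -b / (1.0 + b)

if __name__ == "__main__":
    x = split_ma1(20000, sigma=1.0, c=1.0, seed=1)
    print(sample_rho1(x), theoretical_rho1(1.0, 1.0))
```

For a long path, the two printed values should be close, and both lie in the interval $(-0.5, 0)$ implied by $0 < b_c < 1$.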
Finally, according to Equations (3) and (4), it follows that:
$$y_t - y_{t-1} = \varepsilon_t - \theta_{t-1} \varepsilon_{t-1}, \quad t = 1, \dots, T,$$
which can be viewed as a non-linear Integrated Auto-Regressive Moving Average (ARIMA) model with “temporary” components $(\theta_{t-1} \varepsilon_{t-1})$. These imply the specific structure of the series $(X_t)$, as well as of the other components of the GSB process.
Later in the paper, as we have already pointed out, we also discuss the application of the GSB model in describing the dynamics of infection and immunization of the population on the territory of the Republic of Serbia. As will be seen, this kind of dynamics has pronounced fluctuations that can be described by the non-stationary components of the GSB process, primarily by its main time series $(y_t)$. In that case, due to its stationarity, the Split-MA (1) process plays an important role. As an illustration, Figure 1 shows the realizations of all the above-mentioned series obtained by the Monte Carlo simulation of the GSB model.
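A path of all four components can be generated directly from Equations (1)–(4). The following is a minimal sketch under the assumptions of this section (Gaussian innovations, $m_0 = \mu$, $\varepsilon_{-1} = \varepsilon_0 = 0$); the function `simulate_gsb` and its parameter defaults are illustrative and are not the simulation code used for Figure 1.

```python
import random

def simulate_gsb(T, mu=0.0, sigma=1.0, c=1.0, seed=0):
    """Return (y, m, x, eps): y_t, m_t, eps_t for t = 0..T and X_t for t = 1..T."""
    rng = random.Random(seed)
    eps = [0.0, 0.0]   # eps[i] stores eps_{i-1}: eps_{-1} = eps_0 = 0
    m = [mu]           # m_0 = mu
    y = [mu]           # y_0 = m_0 + eps_0
    x = []
    for t in range(1, T + 1):
        e_tm1, e_tm2 = eps[t], eps[t - 1]
        q_tm1 = 1.0 if e_tm2 ** 2 > c else 0.0  # q_{t-1} = I(eps_{t-2}^2 > c)
        m_t = m[-1] + q_tm1 * e_tm1             # Equation (2)
        e_t = rng.gauss(0.0, sigma)
        eps.append(e_t)
        m.append(m_t)
        y.append(m_t + e_t)                     # Equation (1)
        x.append(y[-1] - y[-2])                 # Equation (3)
    return y, m, x, eps[1:]

if __name__ == "__main__":
    y, m, x, eps = simulate_gsb(10, mu=5.0, sigma=1.0, c=1.0, seed=42)
    for t in range(len(y)):
        print(t, round(y[t], 3), round(m[t], 3))
```

The increments returned in `x` satisfy Equation (4) up to floating-point error, which gives a simple consistency check on the recursion.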

3. Stochastic Distribution and Asymptotic Properties of the GSB Process

In this section, some stochastic properties of the GSB process, regarding the distribution and asymptotic behavior of its basic stochastic components, are discussed in more detail. As explained in the previous section, the GSB model, given by Equations (1)–(4), contains four stochastic components: the basic series $(y_t)$, the innovations $(\varepsilon_t)$, the martingale means $(m_t)$, and the series of increments $(X_t)$. At the same time, the series $(\varepsilon_t)$ and $(X_t)$ represent the stationary components of the GSB process, where $(X_t)$ is “close” to the linear MA model. In general form, the stochastic structure of the series $(X_t)$ is described in [12], where the method of characteristic functions (CFs) was used. Following this approach, the basic stochastic properties of the series $(X_t)$ can be expressed by the following statement.
Theorem 1.
Let $(X_t)$ be the Split-MA (1) process defined by Equation (4). For arbitrary $x \in \mathbb{R}$ and $t = 0, 1, \dots, T$, the cumulative distribution function (CDF) of this stochastic process is given by:
$$F_X(x) = P\{X_t < x\} = (1 - b_c) F_\varepsilon(x) + b_c F_{\sqrt{2}\varepsilon}(x), \tag{5}$$
where $F_\varepsilon(x)$ and $F_{\sqrt{2}\varepsilon}(x)$ are the CDFs of the RVs $\varepsilon_t : N(0, \sigma^2)$ and $\sqrt{2}\,\varepsilon_t : N(0, 2\sigma^2)$, respectively.
Proof. 
For arbitrary $t = 0, 1, \dots, T$, let us denote the series of RVs $\eta_t = \theta_t \varepsilon_t$. Since $\theta_t$ and $\varepsilon_t$ are mutually independent RVs, it follows that
$$E(\eta_t) = E(\theta_t) E(\varepsilon_t) = 0, \qquad V(\eta_t) = E(\theta_t^2) E(\varepsilon_t^2) = b_c \sigma^2.$$
Moreover, it is easily shown that $Cov(\eta_t, \eta_{t+h}) = 0$ holds for every $h \ne 0$, i.e., $(\eta_t)$ is a series of uncorrelated RVs. By applying conditional probabilities, the CDF of these RVs can be obtained as follows:
$$F_\eta(x) = P\{\eta_t < x\} = P\{\eta_t < x \mid \theta_t = 1\} \cdot P\{\theta_t = 1\} + P\{\eta_t < x \mid \theta_t = 0\} \cdot P\{\theta_t = 0\} = P\{\varepsilon_t < x\} \cdot P\{\theta_t = 1\} + P\{x > 0\} \cdot P\{\theta_t = 0\} = b_c F_\varepsilon(x) + (1 - b_c) F_0(x),$$
where $F_0(x) = I(x > 0)$ is the CDF of the RV $I_0 \overset{as}{=} 0$. Based on that, for the CF of the RVs $\eta_t$, one obtains:
$$\varphi_\eta(u) = \int_{-\infty}^{+\infty} e^{iux} F_\eta(dx) = \int_{-\infty}^{+\infty} e^{iux} \left[ b_c F_\varepsilon + (1 - b_c) F_0 \right](dx) = b_c \varphi_\varepsilon(u) + (1 - b_c) \varphi_0(u).$$
Here, $\varphi_\varepsilon(u) = e^{-\sigma^2 u^2 / 2}$ and $\varphi_0(u) \equiv 1$ are the CFs of the RVs $\varepsilon_t$ and $I_0$, respectively. By substituting these CFs into the previous equality, we have:
$$\varphi_\eta(u) = 1 + b_c \left( e^{-\sigma^2 u^2 / 2} - 1 \right),$$
whence, by applying Equation (4), it follows that the CF of the RVs $X_t$ is:
$$\varphi_X(u) = \varphi_\varepsilon(u) \cdot \varphi_\eta(u) = e^{-\sigma^2 u^2 / 2} \left[ 1 + b_c \left( e^{-\sigma^2 u^2 / 2} - 1 \right) \right] = (1 - b_c)\, e^{-\sigma^2 u^2 / 2} + b_c\, e^{-\sigma^2 u^2}.$$
According to the last equality and Lévy’s correspondence theorem (see, e.g., [33] (p. 181)), Equation (5) immediately follows, that is, the statement of the theorem is proved. □
Remark 1.
As shown in [12], the CDF of the RVs $X_t$ can also be given in the following form:
$$F_X(x) = P\{X_t < x\} = \left[ (1 - b_c) F_0(x) + b_c F_\varepsilon(x) \right] \ast F_\varepsilon(x),$$
where $\ast$ denotes the convolution of two (arbitrary) CDFs $F(x), G(x)$:
$$(F \ast G)(x) = \int_{-\infty}^{+\infty} F(x - y)\, G(dy).$$
The equivalence of Equations (5a) and (5b) is directly obtained from the fact that the CDF $F_0(x)$ is neutral for the convolution operator, i.e.,
$$(F \ast F_0)(x) = (F_0 \ast F)(x) = \int_{-\infty}^{+\infty} I(x > y)\, F(dy) = F(x).$$
Finally, note that by differentiating Equation (5), one obtains the probability density function (PDF) of the series $(X_t)$:
$$f_X(x) = \frac{1 - b_c}{\sigma \sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} + \frac{b_c}{2\sigma \sqrt{\pi}}\, e^{-\frac{x^2}{4\sigma^2}}.$$
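As a quick numerical check, the PDF obtained by differentiating Equation (5) is a mixture of the $N(0, \sigma^2)$ and $N(0, 2\sigma^2)$ densities, so it should integrate to one and have second moment $\sigma^2(b_c + 1)$. The sketch below uses a plain Riemann sum; the function names and the integration grid are assumptions made for illustration.

```python
import math

def split_ma1_pdf(x, b_c, sigma):
    """Mixture density: (1 - b_c) * N(0, sigma^2) + b_c * N(0, 2 sigma^2)."""
    n1 = math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
    n2 = math.exp(-x * x / (4.0 * sigma ** 2)) / (2.0 * sigma * math.sqrt(math.pi))
    return (1.0 - b_c) * n1 + b_c * n2

def riemann_moment(power, b_c, sigma, half_width=12.0, step=0.01):
    """Approximate the integral of x^power * f_X(x) over [-half_width, half_width]."""
    n = int(round(half_width / step))
    return sum((i * step) ** power * split_ma1_pdf(i * step, b_c, sigma) * step
               for i in range(-n, n + 1))

if __name__ == "__main__":
    b, s = 0.6, 1.0
    print(riemann_moment(0, b, s))   # total mass, close to 1
    print(riemann_moment(2, b, s))   # second moment, close to s**2 * (b + 1)
```

The second moment agrees with the variance $V(X_t) = \sigma^2(b_c + 1)$ derived in Section 2.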
By a similar procedure as in the previous theorem and using the convolutions of CDFs, we describe the stochastic distribution of the other components of the GSB process, i.e., the series $(m_t)$ and $(y_t)$. As already shown in the previous section, these series represent non-stationary stochastic processes with a constant mean $\mu = E(m_t) = E(y_t)$. Accordingly, the following statement is valid.
Theorem 2.
Let $(y_t)$ and $(m_t)$ be the time series defined by Equations (1) and (2), respectively, where $m_0 \overset{as}{=} \mu$ (const.). For arbitrary $x \in \mathbb{R}$ and $t = 0, 1, \dots, T$, the CDFs of these series are as follows:
$$F_m(x, t) = P\{m_t < x\} = \overset{t}{\underset{j=1}{\ast}} \left[ (1 - b_c) F_j(x) + b_c F_0(x) \right] \ast F_\mu(x), \tag{6}$$
$$F_y(x, t) = P\{y_t < x\} = \overset{t}{\underset{j=1}{\ast}} \left[ (1 - b_c) F_j(x) + b_c F_0(x) \right] \ast F_\mu(x) \ast F_\varepsilon(x). \tag{7}$$
Here, $F_0(x)$ and $F_j(x)$ are the CDFs of the previously defined RVs $I_0$ and $\varepsilon_t$, respectively, and $F_\mu(x) = F_m(x, 0)$ is the CDF of the RV $m_0 \overset{as}{=} \mu$. In addition, when $T = +\infty$, the following convergences (in distribution) are valid:
$$\frac{1}{\sqrt{t}}\, m_t \xrightarrow{d} N(0, a_c \sigma^2), \qquad \frac{1}{\sqrt{t}}\, y_t \xrightarrow{d} N(0, a_c \sigma^2), \qquad t \to +\infty. \tag{8}$$
Proof. 
For arbitrary $t = 0, 1, \dots, T$, let us introduce a series of RVs $\xi_t = q_t \varepsilon_t$. In the same way as in the proof of the previous theorem, it is shown that $(\xi_t)$ is a series of mutually uncorrelated RVs, with $E(\xi_t) = 0$, $V(\xi_t) = a_c \sigma^2$, where $a_c = E(q_t) = P\{\varepsilon_t^2 > c\} = 1 - b_c$. By reapplying the conditional probabilities, the CDF of $\xi_t$ is obtained as follows:
$$F_\xi(x) = P\{\xi_t < x\} = P\{\xi_t < x \mid q_t = 1\} \cdot P\{q_t = 1\} + P\{\xi_t < x \mid q_t = 0\} \cdot P\{q_t = 0\} = P\{\varepsilon_t < x\} \cdot P\{q_t = 1\} + P\{x > 0\} \cdot P\{q_t = 0\} = a_c F_\varepsilon(x) + (1 - a_c) F_0(x).$$
According to this, the corresponding CF is obtained:
$$\varphi_\xi(u) = \int_{-\infty}^{+\infty} e^{iux} F_\xi(dx) = \int_{-\infty}^{+\infty} e^{iux} \left[ a_c F_\varepsilon + (1 - a_c) F_0 \right](dx) = a_c \varphi_\varepsilon(u) + (1 - a_c) \varphi_0(u) = 1 + a_c \left( e^{-\sigma^2 u^2 / 2} - 1 \right) = (1 - b_c)\, e^{-\sigma^2 u^2 / 2} + b_c.$$
Applying Equation (2), we find that the CFs of the RVs $(m_t)$ are as follows:
$$\varphi_m(u, t) = \varphi_\mu(u) \prod_{j=0}^{t-1} \varphi_\xi(u) = e^{iu\mu} \left[ (1 - b_c)\, e^{-\sigma^2 u^2 / 2} + b_c \right]^t, \tag{9}$$
where $\varphi_\mu(u) = e^{iu\mu}$ is the CF of the RV $m_0 \overset{as}{=} \mu$. Then, Equation (6) immediately follows from Equation (9) and Lévy’s correspondence theorem [33] (p. 181).
Similarly, by applying Equations (1) and (9), the CFs of the RVs $(y_t)$ are obtained:
$$\varphi_y(u, t) = \varphi_m(u, t) \cdot \varphi_\varepsilon(u) = e^{iu\mu - \sigma^2 u^2 / 2} \left[ (1 - b_c)\, e^{-\sigma^2 u^2 / 2} + b_c \right]^t. \tag{10}$$
From here, by reapplying Lévy’s theorem, Equation (7) immediately follows.
To prove the second part of the theorem, i.e., Equation (8), note first that the CFs of the RVs $m_t / \sqrt{t}$ and $y_t / \sqrt{t}$, when $t = 1, 2, \dots$, according to Equations (9) and (10), can be written as follows:
$$\varphi_m\!\left( \frac{u}{\sqrt{t}}, t \right) = e^{iu\mu / \sqrt{t}} \left[ 1 + a_c \left( e^{-\frac{\sigma^2 u^2}{2t}} - 1 \right) \right]^t = e^{iu\mu / \sqrt{t}} \left[ 1 - \frac{a_c \sigma^2 u^2}{2t} + o\!\left( \frac{u^2}{t} \right) \right]^t,$$
$$\varphi_y\!\left( \frac{u}{\sqrt{t}}, t \right) = e^{iu\mu / \sqrt{t} - \frac{\sigma^2 u^2}{2t}} \left[ 1 + a_c \left( e^{-\frac{\sigma^2 u^2}{2t}} - 1 \right) \right]^t = e^{iu\mu / \sqrt{t} - \frac{\sigma^2 u^2}{2t}} \left[ 1 - \frac{a_c \sigma^2 u^2}{2t} + o\!\left( \frac{u^2}{t} \right) \right]^t.$$
Here, $o(z)$ denotes an infinitesimal of higher order than $z$ as $z \to 0$. Hence, for a fixed but arbitrary $u \in \mathbb{R}$, we have:
$$\varphi_m\!\left( \frac{u}{\sqrt{t}}, t \right) \to e^{-\frac{a_c \sigma^2 u^2}{2}}, \qquad \varphi_y\!\left( \frac{u}{\sqrt{t}}, t \right) \to e^{-\frac{a_c \sigma^2 u^2}{2}}, \qquad t \to +\infty,$$
and the convergences thus obtained confirm the asymptotic relations in Equation (8). □
Remark 2.
Note again that the proofs of the previous two theorems are based on determining the CFs of the corresponding time series of the GSB process. In this sense, the CFs of the uncorrelated series of RVs $(\xi_t)$ and $(\eta_t)$ play a fundamental role. The series $(\xi_t)$ and $(\eta_t)$ can be viewed as “new” innovations with “optional” non-zero values, which essentially describe the stochastic structure of the GSB process. Nevertheless, as the relation $\eta_t + \xi_t \overset{as}{=} \varepsilon_t$ holds for each $t = 0, 1, \dots, T$, it is sufficient to consider only one of these two series of uncorrelated RVs (which is what was done in the statement of Theorem 2). Moreover, it can be easily shown that the CDFs:
$$F_\xi(x) = (1 - b_c) F_\varepsilon(x) + b_c F_0(x), \qquad F_\eta(x) = b_c F_\varepsilon(x) + (1 - b_c) F_0(x)$$
are continuous almost everywhere, with the only point of discontinuity at $x = 0$, where they have “jumps” of the values $b_c$ and $1 - b_c$, respectively (for more detail, see [34,35]). Therefore, the CDFs of the series $(\xi_t)$ and $(\eta_t)$ are mixtures of a Gaussian and a discrete-type distribution, usually named the Contaminated Gaussian Distribution (CGD). This is another important fact that prevents the application of some of the standard procedures in the investigation of the properties of the non-stationary series $(y_t)$ and $(m_t)$.
On the other hand, Equation (8) shows that even the non-stationary time series $(m_t)$ and $(y_t)$ can generate series $(m_t / \sqrt{t})$ and $(y_t / \sqrt{t})$ that converge toward a normal distribution when $t \to +\infty$. Moreover, based on the properties of the non-stationary components of the GSB process described in Section 2, the time series $(m_t / \sqrt{t})$ has a constant variance $a_c \sigma^2$. These facts will be of importance in the practical application of the GSB process and can be readily observed based on the convergence of the corresponding CFs $\varphi_m(u / \sqrt{t}, t)$ and $\varphi_y(u / \sqrt{t}, t)$. As an illustration, Figure 2 shows convergences of the modulus of these CFs, for different time indices $(t)$.
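The convergence of the modulus of $\varphi_m(u / \sqrt{t}, t)$ toward the Gaussian limit can also be inspected numerically. The following is a minimal sketch based on Equation (9); the function names and the parameter values are chosen only for illustration and do not reproduce Figure 2.

```python
import cmath
import math

def cf_m_scaled(u, t, a_c, sigma, mu=0.0):
    """phi_m(u / sqrt(t), t): the CF of m_t / sqrt(t), from Equation (9)."""
    v = u / math.sqrt(t)
    base = 1.0 + a_c * (math.exp(-(sigma * v) ** 2 / 2.0) - 1.0)
    return cmath.exp(1j * v * mu) * base ** t

def cf_limit(u, a_c, sigma):
    """CF of the limiting N(0, a_c * sigma^2) distribution."""
    return math.exp(-a_c * sigma ** 2 * u ** 2 / 2.0)

if __name__ == "__main__":
    u, a, s = 1.0, 0.3, 1.0
    for t in (10, 100, 1000, 10 ** 6):
        gap = abs(abs(cf_m_scaled(u, t, a, s, mu=2.0)) - cf_limit(u, a, s))
        print(t, gap)
```

The printed gap shrinks as $t$ grows, in line with Equation (8); the factor $e^{iu\mu/\sqrt{t}}$ has unit modulus and therefore does not affect it.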
At the end of this section, we describe some further asymptotic properties of series obtained by transformations of the non-stationary time series $(m_t)$ and $(y_t)$. They also refer to the possibility of finding their asymptotically normal (AN) distributions, which can be shown by the following statement:
Theorem 3.
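For arbitrary $\alpha \ge 1$ and the time series $(y_t)$ and $(m_t)$, given by Equations (1) and (2), respectively, let us define the so-called $\alpha$-mean series:
$$\bar{M}_{t;\alpha} = \frac{1}{t^\alpha} \sum_{j=1}^{t} m_j, \qquad \bar{Y}_{t;\alpha} = \frac{1}{t^\alpha} \sum_{j=1}^{t} y_j.$$
Then the following statements hold:
(i). When $1 \le \alpha \le 3/2$, the time series $\bar{M}_{t;\alpha}$ and $\bar{Y}_{t;\alpha}$ have an asymptotically normal distribution, i.e., the following relations are valid when $t \to +\infty$:
$$\bar{M}_{t;\alpha} \sim N\!\left( \mu t^{1-\alpha},\ \frac{a_c \sigma^2 t^{3-2\alpha}}{3} \right), \qquad \bar{Y}_{t;\alpha} \sim N\!\left( \mu t^{1-\alpha},\ \frac{a_c \sigma^2 t^{3-2\alpha}}{3} \right). \tag{11}$$
(ii). When $\alpha > 3/2$, the time series $\bar{M}_{t;\alpha}$ and $\bar{Y}_{t;\alpha}$ asymptotically vanish, i.e.,
$$\bar{M}_{t;\alpha} \xrightarrow{d} I_0, \qquad \bar{Y}_{t;\alpha} \xrightarrow{d} I_0, \qquad t \to +\infty.$$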
Proof. 
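We first show the statement of the theorem for the time series $\bar{M}_{t;\alpha}$. Based on the definition of the time series $(m_t)$, i.e., Equation (2), one obtains:
$$\bar{M}_{t;\alpha} = \frac{1}{t^\alpha} \sum_{j=1}^{t} m_j = \frac{1}{t^\alpha} \sum_{j=1}^{t} \left( m_0 + \sum_{k=0}^{j-1} q_k \varepsilon_k \right) = \frac{1}{t^\alpha} \left[ t\, m_0 + \sum_{j=0}^{t-1} (t - j)\, q_j \varepsilon_j \right] = t^{1-\alpha} m_0 + \sum_{k=1}^{t} \frac{k}{t^\alpha}\, \xi_{t-k}.$$
Thus, the series $\bar{M}_{t;\alpha}$ is represented as a sum of the uncorrelated RVs $\xi_{t-k}$, $k = 1, \dots, t$. By applying the well-known properties of CFs, as well as the expression for the CF of the series $(\xi_t)$, the CFs of $\bar{M}_{t;\alpha}$ are as follows:
$$\varphi_{\bar{M};\alpha}(u, t) = \varphi_m\!\left( \frac{u}{t^{\alpha-1}}, 0 \right) \prod_{k=1}^{t} \varphi_\xi\!\left( \frac{k u}{t^\alpha} \right) = e^{iu\mu t^{1-\alpha}} \prod_{k=1}^{t} \left[ 1 + a_c \left( e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}} - 1 \right) \right].$$
Taking the logarithm of the function $\varphi_{\bar{M};\alpha}(u, t)$ gives the function:
$$\psi_M(u, t, \alpha) = \ln \varphi_{\bar{M};\alpha}(u, t) = iu\mu t^{1-\alpha} + \sum_{k=1}^{t} f_k(u, t, \alpha),$$
where $f_k(u, t, \alpha) = \ln \left[ 1 + a_c \left( \exp\left( -k^2 \sigma^2 u^2 t^{-2\alpha} / 2 \right) - 1 \right) \right]$. After some computation, we find that, when $0 < a_c < 1$,
$$\frac{\partial f_k(0, t, \alpha)}{\partial u} = \left. -\frac{a_c k^2 \sigma^2 u\, t^{-2\alpha}\, e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}}}{1 + a_c \left( e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}} - 1 \right)} \right|_{u=0} = 0,$$
$$\frac{\partial^2 f_k(0, t, \alpha)}{\partial u^2} = \left. -\frac{a_c k^2 \sigma^2}{t^{2\alpha}} \cdot \frac{e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}} \left( (1 - a_c) \left( 1 - \frac{k^2 \sigma^2 u^2}{t^{2\alpha}} \right) + a_c\, e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}} \right)}{\left( 1 + a_c \left( e^{-\frac{k^2 \sigma^2 u^2}{2 t^{2\alpha}}} - 1 \right) \right)^2} \right|_{u=0} = -\frac{a_c k^2 \sigma^2}{t^{2\alpha}}.$$
Thus, the functions $f_k(u, t, \alpha)$ have local maxima at the point $u = 0$. Using a procedure similar to that in [34], that is, by the Laplace approximation of the functions $f_k(u, t, \alpha)$ at $u = 0$, one obtains:
$$\psi_M(u, t, \alpha) = iu\mu t^{1-\alpha} + \sum_{k=1}^{t} \left[ \frac{\partial^2 f_k(0, t, \alpha)}{\partial u^2} \cdot \frac{u^2}{2} + o_k(u^2) \right] = iu\mu t^{1-\alpha} - \sum_{k=1}^{t} \left[ \frac{a_c k^2 \sigma^2 u^2}{2 t^{2\alpha}} + o_k\!\left( t^{-2\alpha} u^2 \right) \right] = iu\mu t^{1-\alpha} - \frac{a_c \sigma^2 u^2}{12\, t^{2\alpha}}\, t (t + 1)(2t + 1) + o\!\left( t^{3-2\alpha} u^2 \right).$$
Then, by taking the asymptotic value of the last expression when $t \to +\infty$, it follows that:
$$\psi_M(u, t, \alpha) \sim \begin{cases} iu\mu t^{1-\alpha} - a_c \sigma^2 u^2 t^{3-2\alpha} / 6, & 1 \le \alpha \le 3/2 \\ 0, & \alpha > 3/2. \end{cases}$$
Substituting this expression into the CFs $\varphi_{\bar{M};\alpha}(u, t)$, it is easy to conclude that the statement of the theorem is valid for the series $\bar{M}_{t;\alpha}$.
The proof for the series $\bar{Y}_{t;\alpha}$ is carried out analogously. Using Equation (1), as well as the previously proven facts, we have that
$$\bar{Y}_{t;\alpha} = \frac{1}{t^\alpha} \sum_{j=1}^{t} (m_j + \varepsilon_j) = \bar{M}_{t;\alpha} + \sum_{j=1}^{t} \frac{\varepsilon_j}{t^\alpha} = t^{1-\alpha} m_0 + \sum_{k=1}^{t} \frac{k}{t^\alpha}\, \xi_{t-k} + \sum_{k=0}^{t-1} \frac{\varepsilon_{t-k}}{t^\alpha} = t^{1-\alpha} m_0 + \frac{\varepsilon_t}{t^\alpha} + \sum_{k=1}^{t} \left( 1 + k\, q_{t-k} \right) \frac{\varepsilon_{t-k}}{t^\alpha}.$$
Since the RVs $\varepsilon_{t-k}$, $k = 0, 1, \dots, t$, are mutually independent, after some computation, we obtain the CFs of the series $\bar{Y}_{t;\alpha}$ as follows:
$$\varphi_{\bar{Y};\alpha}(u, t) = \varphi_m\!\left( \frac{u}{t^{\alpha-1}}, 0 \right) \varphi_\varepsilon\!\left( \frac{u}{t^\alpha} \right) \prod_{k=1}^{t} \left[ (1 - a_c)\, \varphi_\varepsilon\!\left( \frac{u}{t^\alpha} \right) + a_c\, \varphi_\varepsilon\!\left( \frac{(k + 1) u}{t^\alpha} \right) \right] = e^{iu\mu t^{1-\alpha} - \frac{\sigma^2 u^2}{2 t^{2\alpha}}} \prod_{k=1}^{t} \left[ e^{-\frac{\sigma^2 u^2}{2 t^{2\alpha}}} + a_c \left( e^{-\frac{(k+1)^2 \sigma^2 u^2}{2 t^{2\alpha}}} - e^{-\frac{\sigma^2 u^2}{2 t^{2\alpha}}} \right) \right] = e^{iu\mu t^{1-\alpha} - \frac{\sigma^2 u^2 (t + 1)}{2 t^{2\alpha}}} \prod_{k=1}^{t} \left[ 1 + a_c \left( e^{-\frac{(k^2 + 2k) \sigma^2 u^2}{2 t^{2\alpha}}} - 1 \right) \right].$$
From here, using the same procedure as in the previous part of the proof, i.e., by taking the logarithm of the function $\varphi_{\bar{Y};\alpha}(u, t)$ and expanding $\psi_Y(u, t, \alpha) = \ln \varphi_{\bar{Y};\alpha}(u, t)$ at the point $u = 0$, we have:
$$\psi_Y(u, t, \alpha) = iu\mu t^{1-\alpha} - \frac{\sigma^2 u^2 (t + 1)}{2 t^{2\alpha}} + \sum_{k=1}^{t} \ln \left[ 1 + a_c \left( e^{-\frac{(k^2 + 2k) \sigma^2 u^2}{2 t^{2\alpha}}} - 1 \right) \right] = iu\mu t^{1-\alpha} - \frac{\sigma^2 u^2 (t + 1)}{2 t^{2\alpha}} - \sum_{k=1}^{t} \left[ \frac{a_c (k^2 + 2k) \sigma^2 u^2}{2 t^{2\alpha}} + o_k\!\left( t^{-2\alpha} u^2 \right) \right] = iu\mu t^{1-\alpha} - \frac{\sigma^2 u^2}{2} \left( t^{1-2\alpha} + t^{-2\alpha} \right) - \frac{a_c \sigma^2 u^2}{12\, t^{2\alpha}}\, t (t + 1)(2t + 7) + o\!\left( t^{3-2\alpha} u^2 \right).$$
Finally, taking the asymptotic values when $t \to +\infty$, one obtains:
$$\psi_Y(u, t, \alpha) \sim \begin{cases} iu\mu t^{1-\alpha} - \dfrac{\sigma^2 u^2}{2} \left( t^{1-2\alpha} + t^{-2\alpha} + \dfrac{a_c t^{3-2\alpha}}{3} \right), & 1 \le \alpha \le 3/2 \\ 0, & \alpha > 3/2. \end{cases}$$
Substituting this expression into the CFs $\varphi_{\bar{Y};\alpha}(u, t)$, the entire statement of the theorem is proved. □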
Remark 3.
In the previous theorem, the case $\alpha = 3/2$ is particularly interesting because Equation (11) then gives the following convergences:
$$\frac{1}{t^{3/2}} \sum_{j=1}^{t} m_j \xrightarrow{d} N\!\left( 0, \frac{a_c \sigma^2}{3} \right), \qquad \frac{1}{t^{3/2}} \sum_{j=1}^{t} y_j \xrightarrow{d} N\!\left( 0, \frac{a_c \sigma^2}{3} \right), \qquad t \to +\infty.$$
We will call these convergences, in the usual way, central limit theorems (CLTs) for the GSB process. As will be seen below, they will be helpful for estimating the unknown parameters of the GSB process, primarily the conditional variance $\sigma^2$.
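The role of the threshold $\alpha = 3/2$ can be seen from the quadratic coefficient that appears in the proof of Theorem 3, namely $a_c \sigma^2\, t(t+1)(2t+1) / (6 t^{2\alpha})$, compared with its asymptotic value $a_c \sigma^2 t^{3-2\alpha} / 3$. The sketch below only evaluates these two expressions; the function names are illustrative assumptions.

```python
def alpha_mean_variance(t, alpha, a_c, sigma):
    """a_c * sigma^2 * sum_{k=1}^{t} k^2 / t^(2 alpha), using the closed form of the sum."""
    return a_c * sigma ** 2 * t * (t + 1) * (2 * t + 1) / (6.0 * t ** (2 * alpha))

def asymptotic_variance(t, alpha, a_c, sigma):
    """Leading term a_c * sigma^2 * t^(3 - 2 alpha) / 3."""
    return a_c * sigma ** 2 * t ** (3 - 2 * alpha) / 3.0

if __name__ == "__main__":
    a, s = 0.3, 1.0
    for alpha in (1.0, 1.5, 2.0):
        for t in (10, 1000, 10 ** 6):
            print(alpha, t,
                  alpha_mean_variance(t, alpha, a, s),
                  asymptotic_variance(t, alpha, a, s))
```

For $\alpha = 3/2$ the coefficient stabilizes near $a_c \sigma^2 / 3$, for $\alpha < 3/2$ it grows with $t$, and for $\alpha > 3/2$ it tends to zero, which matches the two cases of the theorem.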

4. Parameter Estimation Procedures

Now, let us consider the problem of estimating the (unknown) parameters of the GSB process: the critical value $(c)$, the mean value $(\mu)$, and the conditional variance $(\sigma^2)$. To estimate the first parameter $c$, the series of increments $(X_t)$ will be used as the (only) observable and stationary component of the GSB model. Recall that we have named this series the Split-MA (1) process because it is close to standard, linear MA models. Although some of the estimation procedures we present here resemble standard estimation methods for MA models (see, for instance, [36]), the specificity of the Split-MA (1) model requires additional testing and analysis, primarily of the quality of the obtained estimates. To that end, the consistency and asymptotic normality of the estimators are examined. After that, several new approaches are considered, based on the observation of the non-stationary time series $(y_t)$. The main goal of these procedures is to obtain the estimated values of the parameters $\mu$ and $\sigma^2$.

4.1. Estimates of Critical Value (c)

Let $(X_t)$ be the Split-MA (1) process defined by Equation (4). As we have already shown, the first correlation coefficient of this series is:
$$\rho_X(1) = -\frac{b_c}{1 + b_c}, \quad 0 < b_c < 1.$$
From here, by solving for $b_c$, we get the estimated value of this parameter:
$$\tilde{b}_c = -\frac{\hat{\rho}_X(1)}{1 + \hat{\rho}_X(1)}, \quad 0 < b_c < 1, \tag{14}$$
where:
$$\hat{\rho}_X(1) = \left( \sum_{t=1}^{T} X_t X_{t-1} \right) \left( \sum_{t=1}^{T} X_t^2 \right)^{-1}$$
is the estimated value of the first correlation. Based on the estimate $\tilde{b}_c$, the corresponding estimate of the critical value $c = \tilde{c}$ can be determined as a solution to the equation:
$$P\{\varepsilon_t^2 \le c\} = \tilde{b}_c.$$
According to Equation (14), it is easy to see that $\tilde{b}_c$ and $\tilde{c}$ are appropriate estimates if the following inequalities hold:
$$0 < \tilde{b}_c < 1 \iff -0.5 < \hat{\rho}_X(1) < 0.$$
In [9], it was shown that the estimators thus obtained are strictly consistent if the innovations $(\varepsilon_t)$ have a continuous distribution. Moreover, the estimates $\tilde{b}_c$ and $\tilde{c}$ will also be asymptotically normal (AN) if the RVs $(\varepsilon_t)$ have a symmetric distribution. Note that both conditions are fulfilled in the case of Gaussian innovations $\varepsilon_t : N(0, \sigma^2)$, when the RVs $(\varepsilon_t / \sigma)^2$ have a $\chi_1^2$ distribution. Thus, the estimate of the critical value $\tilde{c}$ is simply found from the equality:
$$\tilde{c} = \tilde{\sigma}^2 \cdot F_{\chi_1^2}^{-1}(\tilde{b}_c).$$
Here, $\tilde{\sigma}^2$ is the estimated variance of the innovations $(\varepsilon_t)$, which will be described later.
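The moment-based estimates above can be sketched in a few lines. The code assumes Gaussian innovations, so that the $\chi_1^2$ quantile equals the square of a standard normal quantile, $F_{\chi_1^2}^{-1}(p) = \big(\Phi^{-1}((1 + p)/2)\big)^2$. The variance estimate `sigma2_hat` is treated as a given input, since its construction is described later in the paper, and all function names are hypothetical.

```python
from statistics import NormalDist

def estimate_b_c(x):
    """b_c estimate from the first sample correlation of the increments X_t."""
    rho1 = sum(a * b for a, b in zip(x[1:], x[:-1])) / sum(a * a for a in x)
    if not -0.5 < rho1 < 0.0:
        # Outside this range the estimate would fall outside (0, 1).
        raise ValueError("first sample correlation must lie in (-0.5, 0)")
    return -rho1 / (1.0 + rho1)

def chi2_1_quantile(p):
    """Quantile of the chi-square distribution with one degree of freedom."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z

def estimate_c(b_c_hat, sigma2_hat):
    """Critical value estimate: sigma2_hat * F^{-1}_{chi2_1}(b_c_hat)."""
    return sigma2_hat * chi2_1_quantile(b_c_hat)

if __name__ == "__main__":
    x = [2.0, -1.0, 0.0, 2.0, -1.0, 0.0]   # toy series with first correlation -0.4
    b_hat = estimate_b_c(x)
    print(b_hat, estimate_c(b_hat, sigma2_hat=1.0))
```

The explicit range check mirrors the condition $-0.5 < \hat{\rho}_X(1) < 0$ stated above; how to proceed when it fails is a design choice that the text does not prescribe.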
However, it can be shown that, as for the linear MA series, the estimate $\tilde{b}_c$ is not the most efficient estimate of $b_c$ (the asymptotic efficiency of $\tilde{b}_c$ is analyzed at the end of this subsection). To obtain more efficient estimates of the given parameters, we will modify the well-known Gauss-Newton method of estimating the parameters of nonlinear functions (see, for instance, [36]). First, notice that Equation (4) can be written in the form:
$$\varepsilon_t = X_t + \theta_{t-1}\,\varepsilon_{t-1}, \quad t = 1, \ldots, T$$
or, in functional form,
$$\varepsilon_t(X,\theta) = X_t + \theta_{t-1}\,\varepsilon_{t-1}(X,\theta).$$
On the other hand, if we define a series of RVs as
$$W_t(X,\theta) = \theta_t\, W_{t-1}(X,\theta) + \varepsilon_{t-1}(X,\theta), \tag{17}$$
then it is easy to see that the RVs $W_t(X,\theta)$ are $\mathcal{F}_{t-1}$-adapted for each $t = 1, \ldots, T$, and thus independent of $\varepsilon_t$ and $\theta_{t+1}$. According to the mentioned properties of the RVs $(\theta_t)$ and $(\varepsilon_t)$, it follows that $(W_t(X,\theta))$ is a stationary and ergodic series of RVs (see, for more detail, [37]) with $E(W_t(X,\theta)) = 0$ and correlation function $\rho_W(h) = b_c^{|h|}$, $h = 0, \pm 1, \ldots$ To this series, using the procedure described in [38], we add the so-called residual series:
$$R_t(X,\theta) = W_t(X,\theta) - b_c\, W_{t-1}(X,\theta). \tag{18}$$
The RVs $R_t(X,\theta)$ are also $\mathcal{F}_{t-1}$-adapted and mutually non-correlated, which can easily be shown. Namely, by applying Equations (16)–(18), for any integer $h > 0$, one obtains:
$$\begin{aligned} Cov\big(R_t(X,\theta), R_{t+h}(X,\theta)\big) &= E\big(R_t(X,\theta)\, R_{t+h}(X,\theta)\big) \\ &= E\big[R_t(X,\theta)\big(W_{t+h}(X,\theta) - b_c\, W_{t+h-1}(X,\theta)\big)\big] \\ &= E\big(R_t(X,\theta)\, W_{t+h}(X,\theta)\big) - b_c\, E\big(R_t(X,\theta)\, W_{t+h-1}(X,\theta)\big) \\ &= E\big[R_t(X,\theta)\, \theta_{t+h}\, W_{t+h-1}(X,\theta)\big] - b_c\, E\big(R_t(X,\theta)\, W_{t+h-1}(X,\theta)\big) = 0. \end{aligned}$$
The last equality holds because $\theta_{t+h}$ is independent of both $R_t(X,\theta)$ and $W_{t+h-1}(X,\theta)$, with $E(\theta_{t+h}) = b_c$, while $E\big(R_t(X,\theta)\,\varepsilon_{t+h-1}\big) = 0$.
Thus, Equation (18) defines the series $(W_t(X,\theta))$ as a linear autoregressive (AR) process with innovations $(R_t(X,\theta))$. From here, we obtain another estimate of the unknown parameter $b_c \in (0,1)$ by the following algorithmic procedure:
(1) Applying Equation (14), determine $\tilde{b}_c$ as the initial estimate of $b_c$ and, according to Equation (15), determine the estimate $\tilde{c}$.
(2) Based on Equations (16)–(18) and the obtained estimate $\tilde{b}_c$, compute, for each $t = 1, \ldots, T$, the values:
$$\begin{aligned} \tilde{\theta}_t &= I\big(\varepsilon_{t-1}^2(X,\tilde{\theta}) \le \tilde{c}\big), \\ \varepsilon_t(X,\tilde{\theta}) &= X_t + \tilde{\theta}_{t-1}\,\varepsilon_{t-1}(X,\tilde{\theta}), \\ W_t(X,\tilde{\theta}) &= \tilde{\theta}_t\, W_{t-1}(X,\tilde{\theta}) + \varepsilon_{t-1}(X,\tilde{\theta}), \\ R_t(X,\tilde{\theta}) &= W_t(X,\tilde{\theta}) - \tilde{b}_c\, W_{t-1}(X,\tilde{\theta}), \end{aligned}$$
where $\tilde{\theta}_0 = 1$ and $\varepsilon_0(X,\tilde{\theta}) = \varepsilon_{-1}(X,\tilde{\theta}) = W_0(X,\tilde{\theta}) = 0$.
(3) Using the standard regression procedure, i.e., the correlation function $\rho_W(h)$ when $h = 1$, obtain an estimate of $b_c$ in the form:
$$\hat{b}_c = \left(\sum_{t=0}^{T-1} W_t(X,\tilde{\theta})\, W_{t+1}(X,\tilde{\theta})\right)\left(\sum_{t=1}^{T} W_t^2(X,\tilde{\theta})\right)^{-1}.$$
(4) As in the first step, based on the estimate $\hat{b}_c$, the critical value $\hat{c}$ can be estimated as a solution of the equation (with respect to $c$):
$$P\{\varepsilon_t^2 \le c\} = \hat{b}_c.$$
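Steps (2) and (3) of the procedure above can be sketched as a single pass over the increments. The function name `refine_b` is hypothetical, and the start-up values $\tilde{\theta}_0 = 1$, $\varepsilon_0 = W_0 = 0$ follow the convention stated in step (2); this is a minimal illustration rather than the authors' implementation.

```python
def refine_b(x, c_tilde):
    """x holds X_1..X_T; returns b_hat from the lag-1 regression of (W_t).

    Recursions used (start-up: theta_0 = 1, eps_0 = W_0 = 0):
      theta_t = I(eps_{t-1}^2 <= c_tilde)
      eps_t   = X_t + theta_{t-1} * eps_{t-1}
      W_t     = theta_t * W_{t-1} + eps_{t-1}
    """
    eps_prev, theta_prev, w_prev = 0.0, 1.0, 0.0
    w = [w_prev]                               # W_0, W_1, ..., W_T
    for xt in x:
        theta_t = 1.0 if eps_prev * eps_prev <= c_tilde else 0.0
        eps_t = xt + theta_prev * eps_prev
        w_t = theta_t * w_prev + eps_prev
        w.append(w_t)
        eps_prev, theta_prev, w_prev = eps_t, theta_t, w_t
    num = sum(w[t] * w[t + 1] for t in range(len(w) - 1))   # sum over t = 0..T-1
    den = sum(v * v for v in w[1:])                         # sum over t = 1..T
    return num / den
```

In small samples the returned value is not guaranteed to lie in $(0,1)$, so a caller would presumably check the range before solving for $\hat{c}$ in step (4).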
We emphasize that in [9], strong consistency and AN of the estimates $\tilde{b}_c$ and $\tilde{c}$, as well as $\hat{b}_c$ and $\hat{c}$, were proved. At the same time, the distribution of innovations $(\varepsilon_t)$ was not explicitly used there. In the case of the GSB process, where innovations are Gaussian distributed, we can express these results as follows:
Theorem 4.
Estimates $\tilde{b}_c$ and $\hat{b}_c$ are strongly consistent for the parameter $b_c$, i.e.,
$$\tilde{b}_c \xrightarrow{a.s.} b_c, \quad \hat{b}_c \xrightarrow{a.s.} b_c, \quad T \to +\infty.$$
Moreover, the estimates $\tilde{b}_c$ and $\hat{b}_c$ are asymptotically normal for $b_c$, i.e.,
$$\sqrt{T}\,(\tilde{b}_c - b_c) \xrightarrow{d} \mathcal{N}(0, \tilde{V}), \quad \sqrt{T}\,(\hat{b}_c - b_c) \xrightarrow{d} \mathcal{N}(0, \hat{V}), \quad T \to +\infty,$$
where $\tilde{V}(b_c) = (b_c+1)^2\,(2b_c^2 + 4b_c + 1)$ and $\hat{V}(b_c) = (1-b_c)\,(3b_c^2 + 3b_c + 1)$.
Remark 4.
Based on the previous theorem, the consistency and AN of the estimates $\tilde{c}$ and $\hat{c}$, as continuous functions of $\tilde{b}_c$ and $\hat{b}_c$, are also valid (see, for instance, [9] or [39], p. 24). Additionally, for any $b_c \in (0,1)$, the inequality $\hat{V}(b_c) \le \tilde{V}(b_c)$ holds, with equality attained only in the limiting case $b_c = 0$, as can be seen in Figure 3. This means that the asymptotic variance $\hat{V}(b_c)$, as a measure of the "scattering" of $\hat{b}_c$ around the true value $b_c$, is (significantly) smaller than $\tilde{V}(b_c)$. So, $\hat{b}_c$ is a more efficient estimate than $\tilde{b}_c$, which justifies its introduction.
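As a quick numerical illustration of Remark 4, the two asymptotic variances can be tabulated on a grid. The sketch assumes the expressions $\tilde{V}(b_c) = (b_c+1)^2(2b_c^2+4b_c+1)$ and $\hat{V}(b_c) = (1-b_c)(3b_c^2+3b_c+1)$ from Theorem 4; the function names are hypothetical.

```python
def v_tilde(b):
    """Asymptotic variance of the moment-type estimate b_tilde (Theorem 4)."""
    return (b + 1.0) ** 2 * (2.0 * b * b + 4.0 * b + 1.0)

def v_hat(b):
    """Asymptotic variance of the Gauss-Newton-type estimate b_hat (Theorem 4)."""
    return (1.0 - b) * (3.0 * b * b + 3.0 * b + 1.0)

def efficiency_table(steps=10):
    """Rows (b_c, V_tilde, V_hat) on an even grid over [0, 1)."""
    return [(k / steps, v_tilde(k / steps), v_hat(k / steps)) for k in range(steps)]
```

For example, at $b_c = 0.5$ these formulas give $\tilde{V} = 7.875$ and $\hat{V} = 1.625$.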

4.2. Estimates of Mean ($\mu$)

As an estimator for the parameter $\mu = E(y_t)$, the sample mean of the series $(y_t)$ is usually used:
$$\tilde{\mu} = \bar{y}_T = \frac{1}{T}\sum_{t=1}^{T} y_t.$$
This estimator is obviously unbiased, $E(\tilde{\mu}) = E(\bar{y}_T) = \mu$, but its variance is not bounded. Namely, using the previously defined $\alpha$-mean series $\bar{Y}_{T;\alpha}$ when $\alpha = 1$, we can represent the estimator $\tilde{\mu}$ as a sum of uncorrelated RVs:
$$\tilde{\mu} = m_0 + \frac{1}{T}\left[\sum_{k=1}^{T} (1 + k\,q_{T-k})\,\varepsilon_{T-k} + \varepsilon_T\right].$$
Thus, for the variance of $\tilde{\mu}$ we get:
$$\begin{aligned} \tilde{V} = V(\tilde{\mu}) &= \frac{1}{T^2}\left[\sum_{k=1}^{T} V\big((1 + k\,q_{T-k})\,\varepsilon_{T-k}\big) + V(\varepsilon_T)\right] = \frac{\sigma^2}{T^2}\left[\sum_{k=1}^{T} E(1 + k\,q_{T-k})^2 + 1\right] \\ &= \frac{\sigma^2}{T^2}\left[\sum_{k=1}^{T} \big(1 + a_c\,k(k+2)\big) + 1\right] = \frac{\sigma^2}{T^2}\left[T + 1 + \frac{a_c\,T(T+1)(2T+7)}{6}\right] \\ &= \frac{\sigma^2 (T+1)}{T^2}\left(1 + \frac{a_c\,T(2T+7)}{6}\right) = \frac{a_c\,\sigma^2\,T}{3} + O(1) \to +\infty, \quad T \to +\infty. \end{aligned}$$
Note that, as expected, the variance $\tilde{V} = V(\tilde{\mu})$ is asymptotically identical to that in Theorem 3, i.e., as in Equation (11), when $\alpha = 1$. Moreover, $\tilde{V} \to 0$ when $a_c = 0$, that is, in the case of extremely large values of the parameter $c$. However, in practical applications, this condition is usually not met.
An alternative way to obtain an estimate for $\mu$ is to take the sample mean of the mean series $\bar{y}_t$, $t = 1, \ldots, T$, i.e.,
$$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} \bar{y}_t = \frac{1}{T}\sum_{t=1}^{T} \omega_t\, y_t. \tag{20}$$
Here, $\omega_t = H(T) - H(t-1)$, and $H(t) = \sum_{j=1}^{t} j^{-1}$, $t = 1, \ldots, T$, are the harmonic numbers, with the assumption $H(0) = 0$. Obviously, $\hat{\mu}$ is also an unbiased estimate of the parameter $\mu$, but with weights that are more pronounced at the "older" points of time $(t)$ at which realizations of the series $(y_t)$ are observed. This is consistent with the fact that the covariances of the RVs $y_t$ depend on these "older" time indices. Moreover, as shown in Section 2, at these time points, the covariances of the RVs $y_t$ are equal to their variances. For these reasons, it is expected that the estimate $\hat{\mu}$ will be more efficient than $\tilde{\mu}$. Indeed, using a similar procedure as before, we first represent the estimate $\hat{\mu}$ as a sum of uncorrelated RVs:
$$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} \omega_t\left(m_0 + \sum_{j=0}^{t-1} q_j\,\varepsilon_j\right) + \frac{1}{T}\sum_{t=1}^{T} \omega_t\,\varepsilon_t = \frac{1}{T}\left[m_0\sum_{t=1}^{T}\omega_t + \sum_{j=0}^{T-1}\left(q_j\,\varepsilon_j \sum_{t=j+1}^{T}\omega_t\right) + \sum_{t=1}^{T}\omega_t\,\varepsilon_t\right].$$
Since, for each $j = 1, \ldots, T$, the following holds:
$$\sum_{t=j}^{T}\omega_t = \sum_{t=j}^{T}\big(H(T) - H(t-1)\big) = \sum_{t=j}^{T}\sum_{k=t}^{T}\frac{1}{k} = T - (j-1)(\omega_j + 1),$$
it follows that one can also write:
$$\hat{\mu} = \frac{1}{T}\left[T(m_0 + q_0\,\varepsilon_0) + \sum_{j=1}^{T-1}\big(T - j(\omega_{j+1} + 1)\big)\,q_j\,\varepsilon_j\right] + \frac{1}{T}\sum_{t=1}^{T}\omega_t\,\varepsilon_t = m_0 + q_0\,\varepsilon_0 + \frac{1}{T}\sum_{j=1}^{T-1}(c_j\,q_j + \omega_j)\,\varepsilon_j + \frac{\varepsilon_T}{T^2},$$
where $c_j = T - j(\omega_{j+1} + 1)$. Thus, after some computation, the variance of $\hat{\mu}$ is obtained as:
$$\begin{aligned} \hat{V} = V(\hat{\mu}) &= \frac{1}{T^2}\left[\sum_{j=1}^{T-1} E(c_j\,q_j + \omega_j)^2\, E(\varepsilon_j^2) + \frac{E(\varepsilon_T^2)}{T^2}\right] = \frac{\sigma^2}{T^2}\left[\sum_{j=1}^{T-1}\big(a_c\,c_j(c_j + 2\omega_j) + \omega_j^2\big) + \frac{1}{T^2}\right] \\ &= \frac{\sigma^2\big(a_c(T-1) - 2\big)\,H(T-1)\,H(T)}{T} + o\big(H^2(T)\big) = a_c\,\sigma^2\,H^2(T) + o\big(H^2(T)\big) \to +\infty, \quad T \to +\infty. \end{aligned}$$
Notice that the variance $\hat{V} = V(\hat{\mu})$ is also unbounded, but with a lower asymptotic order than $\tilde{V} = V(\tilde{\mu})$, since:
$$\lim_{T \to +\infty}\frac{V(\hat{\mu})}{V(\tilde{\mu})} = \lim_{T \to +\infty}\frac{H^2(T)}{T} = 0.$$
This means that the estimate $\hat{\mu}$ is (asymptotically) more efficient than $\tilde{\mu}$, which can be seen in Figure 4. This figure shows 3D plots of both variances $\tilde{V}$ and $\hat{V}$, viewed as functions of the two variables $a_c \in (0,1)$ and $T > 0$.
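The two mean estimators differ only in their weights, which the following sketch makes explicit. The function names are hypothetical; the weights are $\omega_t = H(T) - H(t-1) = \sum_{k=t}^{T} 1/k$, accumulated from the tail so that no harmonic number is recomputed.

```python
def harmonic_weights(T):
    """omega_t = H(T) - H(t-1) = sum_{k=t}^{T} 1/k, for t = 1..T."""
    w = [0.0] * T
    tail = 0.0
    for t in range(T, 0, -1):      # accumulate 1/T, 1/(T-1), ..., 1/1
        tail += 1.0 / t
        w[t - 1] = tail
    return w

def mean_estimates(y):
    """Return (mu_tilde, mu_hat) for observations y_1..y_T."""
    T = len(y)
    mu_tilde = sum(y) / T                                    # plain sample mean
    w = harmonic_weights(T)
    mu_hat = sum(wt * yt for wt, yt in zip(w, y)) / T        # mean of running means
    return mu_tilde, mu_hat
```

Because the weights sum to $T$, both estimators are weighted averages of the same observations; for the toy input `[1.0, 2.0, 3.0]` the running means are 1, 1.5 and 2, so $\hat{\mu} = 1.5$ while $\tilde{\mu} = 2$.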

4.3. Estimates of Variance ($\sigma^2$)

Let us now consider estimates of the third unknown parameter $\sigma^2$, which represents the variance of the innovations $(\varepsilon_t)$, that is, the conditional variance of the base series $(y_t)$. It is precisely these facts that enable different estimation procedures for the parameter $\sigma^2$. First, notice that based on the previously obtained estimates $\tilde{b}_c$ and $\hat{b}_c$, i.e., the modeled innovation values $(\varepsilon_t)$ given by Equation (16), the variance $\sigma^2$ can be easily estimated. The usual estimation procedure is based on the sample variance:
$$\tilde{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}\varepsilon_t^2(X,\tilde{\theta}) \quad \text{or} \quad \hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}\varepsilon_t^2(X,\hat{\theta}). \tag{21}$$
Here, $\varepsilon_t(X,\tilde{\theta})$ and $\varepsilon_t(X,\hat{\theta})$ are the modeled innovation values obtained from the estimates $\tilde{b}_c$ and $\hat{b}_c$, respectively. Notice that in the case of Gaussian innovations $(\varepsilon_t)$, the estimates given by Equation (21) are identical to the maximum likelihood estimators. Indeed, the log-likelihood function then reads as follows:
$$L(y_1, \ldots, y_T; \sigma^2) = -\frac{T}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}(y_t - m_t)^2,$$
and by solving the equation $\partial L(y_1, \ldots, y_T; \sigma^2)/\partial\sigma^2 = 0$, the estimate of $\sigma^2$ is obtained as in Equation (21), that is, as the sample variance of the series $(\varepsilon_t)$. Thus, the consistency and AN of both estimates $\tilde{\sigma}^2$ and $\hat{\sigma}^2$ can be readily shown. We note that, due to their equivalence, only the estimate $\hat{\sigma}^2$ will be further considered (see Theorem 5 below).
On the other hand, note that the previous estimation procedure is based on unobservable, modeled values of the innovations $(\varepsilon_t)$. Another approach to estimating the variance $\sigma^2$ is based on the so-called two-stage procedure, using the previously estimated parameter $\hat{b}_c$. By applying the equality $V(X_t) = E(X_t^2) = \sigma^2(b_c + 1)$, as well as the sample variance of the series $(X_t)$, we can obtain an estimate:
$$\hat{\sigma}_X^2 = \frac{1}{T(\hat{b}_c + 1)}\sum_{t=1}^{T} X_t^2.$$
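Then, it follows:
Theorem 5.
Estimates $\hat{\sigma}^2$ and $\hat{\sigma}_X^2$ are strongly consistent for the parameter $\sigma^2$, i.e.,
$$\hat{\sigma}^2 \xrightarrow{a.s.} \sigma^2, \quad \hat{\sigma}_X^2 \xrightarrow{a.s.} \sigma^2, \quad T \to +\infty.$$
Moreover, the estimates $\hat{\sigma}^2$ and $\hat{\sigma}_X^2$ are asymptotically normal for $\sigma^2$, i.e.,
$$\sqrt{T}\,(\hat{\sigma}^2 - \sigma^2) \xrightarrow{d} \mathcal{N}(0, V_1), \quad \sqrt{T}\,(\hat{\sigma}_X^2 - \sigma^2) \xrightarrow{d} \mathcal{N}(0, V_2), \quad T \to +\infty, \tag{23}$$
where $V_1 = 2\sigma^4$ and $V_2 = \sigma^4\,(2 + 11b_c - b_c^2)\,(1 + 2b_c - 3b_c^3)^{-1}$.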
Proof.
Since $(\varepsilon_t^2)$ is an IID series of RVs, the stationarity and ergodicity of this series are apparent. Applying the strong law of large numbers (SLLN), it follows:
$$\hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}\varepsilon_t^2(X,\hat{\theta}) \xrightarrow{a.s.} \sigma^2.$$
Furthermore, it can easily be shown that $V(\hat{\sigma}^2) = 2\sigma^4/T$ is the variance of the estimate $\hat{\sigma}^2$. Thus, applying the central limit theorem (CLT), the first convergence in Equation (23) is obtained.
To prove the properties of the estimate $\hat{\sigma}_X^2$, we note that $(X_t^2)$ is also a stationary and ergodic series of RVs. If the strong law of large numbers is now applied to the statistic:
$$\overline{X_t^2} = \frac{1}{T}\sum_{t=1}^{T} X_t^2, \tag{24}$$
then one obtains:
$$\frac{1}{T}\sum_{t=1}^{T} X_t^2 \xrightarrow{a.s.} \sigma^2(b_c + 1).$$
At the same time, according to Theorem 4, $\hat{b}_c$ is a strongly consistent estimator of $b_c$, i.e., $\hat{b}_c + 1 \xrightarrow{a.s.} b_c + 1$ when $T \to +\infty$. Thus, the last two convergences give:
$$\hat{\sigma}_X^2 = \frac{\overline{X_t^2}}{\hat{b}_c + 1} \xrightarrow{a.s.} \sigma^2, \quad T \to +\infty.$$
To prove the AN of the estimate $\hat{\sigma}_X^2$, note first that the sequence $(X_t^2)$ is 1-dependent, in the sense of Definition 6.3.1 in [36] (p. 245). According to the Cauchy-Schwarz and Minkowski inequalities, applied to Equation (4), i.e., to the sixth moment of the sum $X_t = \varepsilon_t + (-\theta_{t-1}\varepsilon_{t-1})$, it follows that:
$$E|X_t|^6 \le \left[\big(E|\varepsilon_t|^6\big)^{1/6} + \big(b_c\,E|\varepsilon_{t-1}|^6\big)^{1/6}\right]^6 \le 15\sigma^6\big(1 + b_c^{1/6}\big)^6 < +\infty.$$
Then, the Hoeffding-Robbins theorem [40] can be applied, based on which it follows:
$$\sqrt{T}\;\overline{X_t^2} = T^{-1/2}\sum_{t=1}^{T} X_t^2 \xrightarrow{d} \mathcal{N}\big(\sigma^2(b_c + 1), V_0\big), \tag{25}$$
for which:
$$\begin{aligned} V_0 &= V(X_t^2) + 2\,Cov(X_t^2, X_{t+1}^2) = E(X_t^4) + 2E(X_t^2 X_{t+1}^2) - 3\sigma^4(1 + b_c)^2 \\ &= 3\sigma^4(1 + 3b_c) + 2\sigma^4(1 + 4b_c + b_c^2) - 3\sigma^4(1 + b_c)^2 = \sigma^4(2 + 11b_c - b_c^2). \end{aligned}$$
By applying the almost sure convergence of the estimate $\hat{b}_c$ and the previously obtained convergence in Equation (25), we have
$$\sqrt{T}\,\hat{\sigma}_X^2 = \frac{\sqrt{T}\;\overline{X_t^2}}{\hat{b}_c + 1} \xrightarrow{d} \mathcal{N}(\sigma^2, V_2), \quad T \to +\infty,$$
where $V_2 = V_0 / \hat{V}(b_c)$. Thus, according to Theorem 4, the second convergence in Equation (23) is obtained. □
Remark 5.
As in Theorem 4, by comparing the asymptotic variances $V_1$ and $V_2$ of the estimates $\hat{\sigma}^2$ and $\hat{\sigma}_X^2$, respectively, it is easy to see that the inequality $V_1 \le V_2$ holds. At the same time, the equality $V_1 = V_2 = 2\sigma^4$ is valid only when $b_c = 0$ (Figure 5a), so the estimator $\hat{\sigma}^2$ is more efficient than $\hat{\sigma}_X^2$.
However, according to the proof of the previous theorem, it can be easily seen that the variance of the statistic $\overline{X_t^2}$, given by Equation (24), satisfies (Figure 5b):
$$V\big(\overline{X_t^2}\big) = \frac{\sigma^4(2 + 11b_c - b_c^2)}{T} \to 0, \quad T \to +\infty.$$
Thus, $\overline{X_t^2}$ can be used as an estimator of the "hybrid" parameter $\sigma^2(b_c + 1)$, which will be of interest for practical research, that is, for the application of the GSB model discussed below.
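The comparison in Remark 5 can be checked numerically with a short sketch. It assumes the expressions $V_1 = 2\sigma^4$ and $V_2 = \sigma^4(2 + 11b_c - b_c^2)/(1 + 2b_c - 3b_c^3)$ from Theorem 5; the function names `v1` and `v2` are hypothetical.

```python
def v1(sigma2):
    """Asymptotic variance of sigma_hat^2 (Theorem 5): 2 * sigma^4."""
    return 2.0 * sigma2 ** 2

def v2(b, sigma2):
    """Asymptotic variance of sigma_hat_X^2 (Theorem 5), defined for 0 <= b < 1."""
    return sigma2 ** 2 * (2.0 + 11.0 * b - b * b) / (1.0 + 2.0 * b - 3.0 * b ** 3)

def variance_ratio(b):
    """V2 / V1, which does not depend on sigma^2."""
    return v2(b, 1.0) / v1(1.0)
```

The ratio equals 1 at $b_c = 0$ and grows without bound as $b_c \to 1$, since the denominator $1 + 2b_c - 3b_c^3$ vanishes there.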
Finally, another approach to finding estimates of the variance $\sigma^2$ is based on observations of the non-stationary series $(y_t)$. Applying Theorem 3, i.e., the previously proven convergence in Equation (13), we have:
$$\bar{Y}_{T;3/2} = \frac{1}{T^{3/2}}\sum_{t=1}^{T} y_t \xrightarrow{d} \mathcal{N}\left(0, \frac{a_c\,\sigma^2}{3}\right), \quad T \to +\infty.$$
If we now consider the statistic:
$$S_T^2 = \bar{Y}_{T;3/2}^{\,2} = \frac{1}{T^3}\left(\sum_{t=1}^{T} y_t\right)^2 = \frac{1}{T^3}\sum_{j=1}^{T}\sum_{k=1}^{T} y_j\, y_k, \tag{26}$$
after some computation, one obtains:
$$\begin{aligned} E(S_T^2) &= \frac{1}{T^3}\sum_{j=1}^{T}\sum_{k=1}^{T} E(y_j\, y_k) = \frac{1}{T^3}\sum_{j=1}^{T}\sum_{k=1}^{T}\big[Cov(y_j, y_k) + \mu^2\big] = \frac{1}{T^3}\sum_{j=1}^{T}\sum_{k=1}^{T}\big[\sigma^2(\min\{j,k\}\,a_c + 1) + \mu^2\big] \\ &= \frac{\sigma^2}{T^3}\left[a_c\sum_{j=1}^{T}\left(j + 2\sum_{k=1}^{j-1} k\right) + T^2\right] + \frac{\mu^2}{T} = \frac{\sigma^2}{T^3}\left(a_c\sum_{j=1}^{T} j^2 + T^2\right) + \frac{\mu^2}{T} \\ &= \frac{\sigma^2 a_c}{6T^2}(T+1)(2T+1) + \frac{\sigma^2 + \mu^2}{T} \to \frac{a_c\,\sigma^2}{3}, \quad T \to +\infty. \end{aligned}$$
Thus, $S_T^2$ is an asymptotically unbiased estimator for $a_c\sigma^2/3$, and using the estimate $\hat{a}_c = 1 - \hat{b}_c$, an estimator of the parameter $\sigma^2$ can be taken as:
$$\hat{\sigma}_Y^2 = \frac{3}{\hat{a}_c}\,S_T^2 = \frac{3}{\hat{a}_c\,T^3}\sum_{j=1}^{T}\sum_{k=1}^{T} y_j\, y_k. \tag{27}$$
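The two variance estimators that do not rely on modeled innovations can be sketched directly from their definitions. The function names are hypothetical, and an estimate `b_hat` of $b_c$ is assumed to be available (with $0 \le$ `b_hat` $< 1$, so that $\hat{a}_c = 1 - \hat{b}_c > 0$).

```python
def sigma2_from_increments(x, b_hat):
    """Two-stage estimate: mean of X_t^2 divided by (b_hat + 1)."""
    return sum(v * v for v in x) / (len(x) * (b_hat + 1.0))

def sigma2_from_levels(y, b_hat):
    """Estimate based on S_T^2 = (sum of y_t)^2 / T^3, scaled by 3 / a_hat."""
    T = len(y)
    s2 = sum(y) ** 2 / T ** 3          # the double sum over y_j * y_k equals (sum y)^2
    return 3.0 * s2 / (1.0 - b_hat)
```

Writing the double sum as a squared single sum keeps the second estimator linear in $T$ rather than quadratic.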

5. Numerical Simulation and Application of the GSB Process

As already mentioned in the introductory section, two important aspects related to the practical implementation of the GSB process will be explored here. Firstly, numerical Monte Carlo simulations of previously obtained GSB estimators are analyzed. Then, based on actual data, the GSB process was applied to analyze the dynamics and distribution of the infected and immunized population with respect to COVID-19 disease in the territory of the Republic of Serbia.

5.1. Numerical Simulations of GSB Estimators

We first describe a pseudo-algorithm for estimating the parameters of the GSB model based on $N = 1000$ independent Monte Carlo replications of the GSB series. To that end, we assume that all series have size $T = 500$, which is close to the length of the actual series to be considered below. The primary aim is to examine the convergence, i.e., the quality of the previously proposed estimators on a sample of a given length. Therefore, the corresponding estimation errors will also be investigated. Using the previously presented theoretical facts, the pseudo-algorithm for estimating the parameters of the GSB process can be formulated as follows:
  • In the first estimation step, compute the sample correlation $\hat{\rho}_X(1)$ for the series of increments $(X_t)$. If the condition $-0.5 < \hat{\rho}_X(1) < 0$ is fulfilled, the estimator $\tilde{b}_c$ can be obtained by using Equation (14).
  • Compute the statistic $\overline{X_t^2}$, given by Equation (24), as an estimate of the "hybrid" parameter $\sigma^2(b_c + 1)$. The following variance estimator is then obtained:
    $$\hat{\sigma}_X^2 = \frac{\overline{X_t^2}}{\tilde{b}_c + 1}.$$
  • According to Equation (15) and the previously obtained estimates $\tilde{b}_c$ and $\hat{\sigma}_X^2$, compute the estimator $\tilde{c} = \hat{\sigma}_X^2 \cdot F^{-1}_{\chi_1^2}(\tilde{b}_c)$.
  • By using the estimate $\tilde{c}$, for each $t = 1, \ldots, T$, generate the (modeled) values of the series $(\varepsilon_t)$ and $(m_t)$ by applying the iterative procedure:
    $$\begin{cases} \varepsilon_t = y_t - m_t, \\ m_t = m_{t-1} + \varepsilon_{t-1}\, I\{\varepsilon_{t-2}^2 > \tilde{c}\}, \end{cases}$$
    where $\varepsilon_0 = \varepsilon_{-1} = 0$, and $m_0 = y_0 = \hat{\mu}$ is given by Equation (20).
  • According to the previously obtained series $(\varepsilon_t)$, and by using Equation (21), compute a (more efficient) variance estimator $\tilde{\sigma}^2$.
  • By applying the Gauss-Newton procedure, i.e., Equations (16)–(18), the estimate $\hat{b}_c$ can be obtained.
  • According to the previously obtained estimates $\hat{b}_c$ and $\tilde{\sigma}^2$, compute the estimator $\hat{c} = \tilde{\sigma}^2 \cdot F^{-1}_{\chi_1^2}(\hat{b}_c)$.
We point out that in the above-mentioned pseudo-algorithm, the second step can be replaced by the following alternative step:
2'. Compute the statistic $S_T^2$, given by Equation (26), as an estimate of the "hybrid" parameter $a_c\sigma^2/3$. Then, according to Equation (27), the variance $\sigma^2$ can be estimated as:
$$\hat{\sigma}_Y^2 = \frac{3}{\tilde{a}_c}\,S_T^2,$$
where $\tilde{a}_c = 1 - \tilde{b}_c$.
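The iterative step of the pseudo-algorithm (generating modeled means and innovations from the observed series) can be sketched together with a small simulator used to exercise it. Both function names are hypothetical, and the simulator is only an assumed reading of the model recursion $m_t = m_{t-1} + \varepsilon_{t-1} I\{\varepsilon_{t-2}^2 > c\}$, $y_t = m_t + \varepsilon_t$ with Gaussian innovations and the start-up values $\varepsilon_0 = \varepsilon_{-1} = 0$; it is not the authors' simulation code.

```python
import random

def simulate_gsb(T, c, mu, sigma, seed=1):
    """Simulate y_1..y_T with start-up values eps_{-1} = eps_0 = 0 and m_0 = mu."""
    rng = random.Random(seed)
    eps = [0.0, 0.0]                  # eps_{-1}, eps_0
    m, y = mu, []
    for _ in range(T):
        if eps[-2] ** 2 > c:          # shock at time t-2 exceeded the critical value
            m += eps[-1]              # so the shock at time t-1 enters the mean
        e = rng.gauss(0.0, sigma)
        y.append(m + e)
        eps.append(e)
    return y, eps[2:]                 # observations and innovations for t = 1..T

def model_innovations(y, c_est, m0):
    """Recover modeled means m_1..m_T and innovations eps_1..eps_T from y_1..y_T."""
    eps = [0.0, 0.0]                  # eps_{-1}, eps_0
    m, means = m0, []
    for yt in y:
        if eps[-2] ** 2 > c_est:
            m += eps[-1]
        means.append(m)
        eps.append(yt - m)            # eps_t = y_t - m_t
    return means, eps[2:]
```

When the true $c$ and $m_0$ are supplied, the recursion reproduces the simulated innovations up to rounding; with estimated values it yields the modeled series used in the later steps. Setting `c_est = 0` reduces the means to the lagged observations, which corresponds to the random walk case.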
By applying this pseudo-algorithm, the obtained values of the estimated parameters can be summarized as shown in Table 1, which reports their average values (Mean), minimums (Min.), and maximums (Max.), along with the corresponding mean squared errors of estimation (MSEE) given in parentheses. Furthermore, testing results concerning the AN of the estimates thus obtained are also presented in Table 1. To that end, the Anderson-Darling and Cramér-von Mises normality tests were used. Their test statistics (denoted as AD and W, respectively), as well as their corresponding $p$-values, were calculated using procedures from the R-package "nortest" [41].
According to the obtained values, it is evident that most estimators have the AN property. This applies even to the estimates of the mean value $\tilde{\mu}$ and $\hat{\mu}$, which are obtained from realizations of the non-stationary GSB series $(y_t)$. As already explained, this is related to Theorems 2 and 3, which respectively describe the AN properties of the series $(y_t/\sqrt{t})$ and the so-called $\alpha$-mean series. Notice that the asymptotic variance of these estimators is not bounded, hence there is a large range of their observed values. On the other hand, the AN property is not particularly emphasized in the case where the critical value $(c)$ is estimated. This is because both estimates $\tilde{c}$ and $\hat{c}$ are obtained by the three-step procedure: estimates for the parameters $b_c$ and $\sigma^2$ should first be determined, and only then for $c$. In the case of the variance estimators $\tilde{\sigma}^2$ and $\hat{\sigma}^2$, obtained based on modeled innovations $(\varepsilon_t)$, it is easy to see that they have the highest and almost the same efficiency. Furthermore, the values of the estimator $\hat{\sigma}_X^2$ are only slightly "weaker" than $\tilde{\sigma}^2$ and $\hat{\sigma}^2$. This is expected since, according to Theorem 5, the AN property holds for all these variance estimators. However, the estimate $\hat{\sigma}_Y^2$ is by far the weakest variance estimate and can be omitted from further analysis. Moreover, based on previously obtained theoretical results, also confirmed through simulations, the most robust estimates of the unknown parameters $c$, $\mu$, $\sigma^2$ are $\hat{c}$, $\hat{\mu}$, $\hat{\sigma}^2$, respectively. For those reasons, these estimators will be used for GSB modeling of actual data on COVID-19, which will be discussed below.

5.2. Application of the GSB Process: A Case Study of COVID-19 Dynamics

In this section we give, as an illustration, a practical application of the GSB process in stochastic modeling of actual data. In other words, as mentioned in the introductory section, we will show that it can be an adequate stochastic model for describing the dynamics of the infected and vaccinated population in relation to the SARS-CoV-2 virus on the territory of the Republic of Serbia. To that end, we observe realizations of two time series $(U_t)$ and $(V_t)$ which represent, respectively, the daily number of infected persons and the daily number of persons vaccinated with the first dose of the vaccine, starting from 24 December 2020 (the start date of vaccination in Serbia) and ending with 6 June 2022. The dynamics of both time series, of length $T = 529$, are shown in Figure 6.
The main statistical indicators of these series (also labeled as Series A and Series B, respectively) are shown in Table 2. Based on the values thus obtained, it can be concluded that these are time series with distinct, pronounced fluctuations. For instance, the average number of infected people is (approximately) 3650 per day, ranging from 60 to 19,901 infected people. Similarly, the average number of vaccinated persons is 6348 per day, but the number of vaccinated persons varies from only 4 to as many as 68,678 persons per day. Therefore, we further consider the possibility that the GSB process can be used here as an appropriate stochastic model. For this purpose, as basic sequences, we observe the realizations of the so-called log-volumes, i.e., logarithmic values of the series $(U_t)$ and $(V_t)$:
$$y_t^{(1)} := \ln(U_t), \quad y_t^{(2)} := \ln(V_t), \quad t = 0, 1, \ldots, T. \tag{29}$$
Notice that the main goal of this transformation is to obtain more evenly distributed values of both series; since the logarithmic function is increasing, the emphasis of fluctuations will remain. Additionally, the inequalities $U_t, V_t \ge 1$ imply the non-negativity of both log-volume series $(y_t^{(1)}, y_t^{(2)} \ge 0)$.
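The log-volume transformation and the increments derived from it can be sketched as follows. The function names are hypothetical, and the counts in the usage note are made-up illustrative values, not the Serbian data analyzed here.

```python
import math

def log_volumes(counts):
    """y_t = ln(count_t); counts must be >= 1 so that the log-volumes are non-negative."""
    if any(v < 1 for v in counts):
        raise ValueError("all daily counts must be at least 1")
    return [math.log(v) for v in counts]

def increments(y):
    """X_t = y_t - y_{t-1} for t = 1..T, given y_0..y_T."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]
```

For instance, `increments(log_volumes([1, 10, 100]))` returns two increments, each equal to $\ln 10$, which shows how a tenfold jump in daily counts becomes a constant-size step on the log scale.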
Further, using the log-volumes as the basic series, and using Equation (3), the series of increments $(X_t^{(1)})$, $(X_t^{(2)})$ are determined entirely. Based on them, the estimates of the GSB process parameters can be obtained by applying the pseudo-algorithm presented above. We emphasize that here the estimation procedure is repeated twice, i.e., for both series (A and B). Thus, the modeled values of the martingale means and innovation series, for the log-volumes given by Equation (29), are generated as follows:
$$\begin{cases} \varepsilon_t^{(j)} = y_t^{(j)} - m_t^{(j)}, \\ m_t^{(j)} = m_{t-1}^{(j)} + \varepsilon_{t-1}^{(j)}\, I\big\{\big(\varepsilon_{t-2}^{(j)}\big)^2 > \tilde{c}\big\}, \end{cases} \tag{30}$$
where $j = 1, 2$. As initial values of the iterative procedure (30), as before, we have taken $\varepsilon_0^{(j)} = \varepsilon_{-1}^{(j)} = 0$, as well as $m_0^{(j)} = y_0^{(j)} = \hat{\mu}$. Table 3 contains the basic statistical indicators of the actual series, the log-volumes $(y_t^{(j)})$ and increments $(X_t^{(j)})$, as well as of the modeled series, the martingale means $(m_t^{(j)})$ and innovations $(\varepsilon_t^{(j)})$.
By analyzing the values thus obtained, an interesting connection can be observed, which can be explained by the previous theoretical results. Firstly, the average values of the log-volumes are "close" to the averages of the martingale means, which is in accordance with the equality $E(y_t) = E(m_t)$. Moreover, for Series A, almost equal values of other statistical indicators (standard deviations, for instance) are noticeable. This can also be seen by comparing the corresponding statistical indicators of the increments $(X_t^{(1)})$ and innovations $(\varepsilon_t^{(1)})$, which will be explained below. Table 4 shows the above-mentioned estimators obtained according to the previously described procedures. In addition, some other estimates are shown, such as the sample linear correlation $\hat{\rho}_X(1)$ and estimates of the value $b_c$. Accordingly, note that the condition $-0.5 < \hat{\rho}_X(1) < 0$ is fulfilled in the cases of both series. Moreover, let us notice, for instance, that the estimated values for $\sigma^2$ in the case of Series B are "close" to unity, so it can be assumed that the innovations $(\varepsilon_t)$ in this case have a standard $\mathcal{N}(0,1)$ distribution.
As we have already pointed out, the most robust estimators of the GSB process are $\hat{c}$, $\hat{\mu}$, $\hat{\sigma}^2$, and based on them, modeled values of the series $(m_t^{(j)})$ and $(\varepsilon_t^{(j)})$ were obtained. Let us recall that these series, respectively, represent the stability and the impact of fluctuations in the dynamics of the total number of infected and vaccinated people. The agreement between the modeled series and the actual data can be seen in Figure 7a where, along with the empirical values of the log-volumes $(y_t^{(j)})$, modeled values of the martingale means $(m_t^{(j)})$ are given. On the other hand, the agreement of the series of increments, i.e., the Split-MA(1) process $(X_t^{(j)})$, with the innovations $(\varepsilon_t^{(j)})$ is shown in Figure 7b.
It should also be noted that the high agreement between the actual and modeled series is particularly noticeable in the case of Series A. This can be explained theoretically, in the way it was done in Section 2. If, at some points in time, the innovations $(\varepsilon_t^{(1)})$ have a pronounced fluctuation, they become equal to the increments $(X_t^{(1)})$ at the next moment. The agreement between the realizations of these two series will be all the better if, in addition to large and pronounced fluctuations of $(\varepsilon_t^{(1)})$, the critical value $c$ is relatively small. Note that this is precisely the case with Series A, where "small" estimated values of the parameter $c$ indicate the possibility that the true value of this parameter is $c = 0$ (or, equivalently, $b_c = 0$). If the sample size is large enough, this assumption can be formally tested by the null hypothesis $H_0: c = 0$ or, equivalently, $H_0: b_c = 0$. According to Theorem 4, testing procedures can be based on the normal distribution, that is, using some standard, well-known statistical tests.
Note that in that case, the series of increments $(X_t^{(1)})$ coincides with the innovations $(\varepsilon_t^{(1)})$. This implies that $(y_t^{(1)})$ is a series with independent increments, i.e.,
$$X_t^{(1)} = y_t^{(1)} - y_{t-1}^{(1)} = \varepsilon_t^{(1)} \iff y_t^{(1)} = y_{t-1}^{(1)} + \varepsilon_t^{(1)}. \tag{31}$$
According to Equation (1), it follows that $y_{t-1}^{(1)} = m_t^{(1)}$, so all "information from the past" is contained in the previous realization of the series $(y_t^{(1)})$. In that way, the entire statistical analysis of this series, i.e., of the dynamics of the infected population, becomes simpler; namely, Series A then has (only) two stochastic components $(y_t^{(1)})$ and $(\varepsilon_t^{(1)})$, i.e., it represents a random walk series.
Finally, using the inverse of the transformations given in Equation (29), the PDFs of the actual series $(U_t)$ and $(V_t)$ are readily obtained:
$$f_U(x,t) = \frac{1}{x}\, f_{y^{(1)}}(\ln x, t), \quad f_V(x,t) = \frac{1}{x}\, f_{y^{(2)}}(\ln x, t). \tag{32}$$
Here, $f_{y^{(j)}}(\ln x, t)$, $j = 1, 2$, are the PDFs of the log-volumes $(y_t^{(j)})$, obtained by differentiating the CDFs given by Equation (9), which can be done simply. Still, due to the non-stationarity of the mentioned series, whose distributions also depend on time, it is necessary to apply some numerical procedures to calculate their PDFs. For this purpose, the R-package "distr" [42] has been used, and the results of the applied procedure are shown in Figure 8.
Figure 8 shows the empirical distributions, i.e., histograms of the number of infected and vaccinated persons per day, with their fitted PDFs, obtained using Equation (32). Due to the non-stationarity of the time series $(U_t)$ and $(V_t)$, and for comparison of the theoretical PDFs, fitting was also performed for the PDFs $f_U(x,t)$ and $f_V(x,t)$ of length $t = 50, 100, \ldots, 500 < T = 529$ (shown with dashed lines in Figure 8). In the case of the infected population (Series A), according to Equation (31) and the condition $c \approx 0$, it follows that the RVs $y_t^{(1)}$ have (an approximately) normal $\mathcal{N}(\mu, (t+1)\sigma^2)$ distribution. Thus, the RVs $U_t$ will have (an approximately) log-normal distribution, shown with the solid line in Figure 8a. Note that this result is close to that obtained in [29]. In contrast, the distribution of the vaccinated population (Series B), shown with the solid line in Figure 8b, has a more pronounced "peak" close to the origin. This can also be explained by previous theoretical results, primarily given in Theorem 2, i.e., by Equation (8), which concerns the asymptotic behavior of the main GSB series $(y_t)$.
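For the random walk case described above (Series A with $c \approx 0$), the change of variables in Equation (32) gives a log-normal density in closed form. The sketch below assumes $y_t \sim \mathcal{N}(\mu, (t+1)\sigma^2)$ as stated in the text; the function name `lognormal_pdf` and its arguments are hypothetical, and the general case would still need the numerical procedures mentioned above.

```python
import math

def lognormal_pdf(x, t, mu, sigma2):
    """f_U(x, t) = (1/x) * f_y(ln x, t), with y_t ~ N(mu, (t + 1) * sigma2)."""
    if x <= 0.0:
        return 0.0                           # the density is supported on x > 0
    var = (t + 1) * sigma2
    z = math.log(x) - mu
    normal_density = math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return normal_density / x                # Jacobian of the map x -> ln x
```

Evaluating this function for several values of `t` reproduces the qualitative effect noted in the text: the variance grows with $t$, so the fitted densities flatten as the series length increases.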

6. Conclusions

The stochastic analysis of the GSB process presented in this paper confirms its suitability for modeling actual time series with pronounced fluctuations. The applied methods of dynamic and statistical analysis, based on this process, aim here to understand the long-term tendency of the SARS-CoV-2 virus behavior, as well as of the immunization process. Along with other contemporary research, we hope this one can help the further development of successful methods of overcoming the pandemic. To this end, notice that new strains of the SARS-CoV-2 virus, which are very common, can affect the overall symptoms as well as the disease dynamics of COVID-19 (see, e.g., [43,44,45]). They may therefore change the dynamics of both time series investigated here, which may be a new goal and motivation for some future research.
Finally, let us emphasize that one of the main stochastic advantages of the GSB model is that it allows the simultaneous use of both stationary and non-stationary components. Thereby, the asymptotic behavior of the GSB time series as well as the corresponding estimates thus obtained are of particular importance. It should also be noted that the proposed parameter estimation procedure can be implemented algorithmically in a relatively simple way. Additionally, some other estimation methods, such as the Empirical Characteristic Function (ECF) method described in [12] can be used. As shown in [11,12], it can also be used to model some other types of real data with pronounced and persistent fluctuations.

Author Contributions

Conceptualization, M.J.; data curation, M.J.; formal analysis, V.S.; methodology, K.K.; project administration, B.P.; software, K.K. and B.P.; supervision, V.S.; validation, P.Č.; visualization, P.Č.; writing—original draft, M.J., V.S. and K.K.; writing—review and editing, B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Education, Science and Technological Development of the Republic of Serbia. (Grant number: III 47016.)

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Electronic Government of the Republic of Serbia and the Institute for Public Health “Milan Jovanović-Batut” for providing datasets used in this research.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Engle, R.F.; Smith, A.D. Stochastic Permanent Breaks. Rev. Econ. Stat. 1999, 81, 553–574.
  2. Diebold, F.X.; Inoue, A. Long Memory and Regime Switching. J. Econom. 2001, 105, 131–159.
  3. Gonzalo, J.; Martínez, O. Large Shocks vs. Small Shocks. (Or does size matter? May be so.). J. Econom. 2006, 135, 311–347.
  4. Dendramis, Y.; Kapetanios, G.; Tzavalis, E. Level Shifts in Stock Returns Driven by Large Shocks. J. Empir. Financ. 2014, 29, 41–51.
  5. Dendramis, Y.; Kapetanios, G.; Tzavalis, E. Shifts in Volatility Driven by Large Stock Market Shocks. J. Econom. Dynam. Control 2015, 55, 130–147.
  6. Huang, B.-N.; Fok, R.C.W. Stock Market Integration—an Application of the Stochastic Permanent Breaks Model. Appl. Econ. Lett. 2001, 8, 725–729.
  7. González, A. A Smooth Permanent Surge Process. In SSE/EFI Working Paper Series in Economics and Finance No. 572; Stockholm School of Economics, The Economic Research Institute: Stockholm, Sweden, 2004.
  8. Kapetanios, G.; Tzavalis, E. Modeling Structural Breaks in Economic Relationships Using Large Shocks. J. Econom. Dynam. Control 2010, 34, 417–436.
  9. Stojanović, V.; Popović, B.Č.; Popović, P. The Split-BREAK Model. Braz. J. Probab. Stat. 2011, 25, 44–63.
  10. Stojanović, V.; Popović, B.Č.; Popović, P. Stochastic Analysis of GSB Process. Publ. Inst. Math. 2014, 95, 149–159.
  11. Stojanović, V.; Popović, B.Č.; Popović, P. Model of General Split-BREAK Process. REVSTAT Stat. J. 2015, 13, 145–168.
  12. Stojanović, V.; Milovanović, G.V.; Jelić, G. Distributional Properties and Parameters Estimation of GSB Process: An Approach Based on Characteristic Functions. ALEA—Lat. Am. J. Probab. Math. Stat. 2016, 13, 835–861.
  13. Xu, Z.; Wang, H.; Zhang, H.; Zhao, K.; Gao, H.; Zhu, Q. Non-Stationary Turbulent Wind Field Simulation of Long-Span Bridges Using the Updated Non-Negative Matrix Factorization-Based Spectral Representation Method. Appl. Sci. 2019, 9, 5506.
  14. Granero-Belinchón, C.; Roux, S.G.; Garnier, N.B. Information Theory for Non-Stationary Processes with Stationary Increments. Entropy 2019, 21, 1223.
  15. Zhao, D.; Gelman, L.; Chu, F.; Ball, A. Novel Method for Vibration Sensor-Based Instantaneous Defect Frequency Estimation for Rolling Bearings Under Non-Stationary Conditions. Sensors 2020, 20, 5201.
  16. Qu, C.; Li, J.; Yan, L.; Yan, P.; Cheng, F.; Lu, D. Non-Stationary Flood Frequency Analysis Using Cubic B-Spline-Based GAMLSS Model. Water 2020, 12, 1867.
  17. Aguejdad, R. The Influence of the Calibration Interval on Simulating Non-Stationary Urban Growth Dynamic Using CA-Markov Model. Remote Sens. 2021, 13, 468.
  18. Narr, C.F.; Chernyavskiy, P.; Collins, S.M. Partitioning Macroscale and Microscale Ecological Processes Using Covariate-Driven Non-Stationary Spatial Models. Ecol. Appl. 2022, 32, e02485.
  19. Vaz, S.; Torres, D.F.M. A Discrete-Time Compartmental Epidemiological Model for COVID-19 with a Case Study for Portugal. Axioms 2021, 10, 314.
  20. Alqahtani, R.T.; Musa, S.S.; Yusuf, A. Unravelling the Dynamics of the COVID-19 Pandemic with the Effect of Vaccination, Vertical Transmission and Hospitalization. Results Phys. 2022, 39, 105715.
  21. Ghosh, S.; Volpert, V.; Banerjee, M. An Epidemic Model with Time Delay Determined by the Disease Duration. Mathematics 2022, 10, 2561.
  22. Almeshal, A.M.; Almazrouee, A.I.; Alenizi, M.R.; Alhajeri, S.N. Forecasting the Spread of COVID-19 in Kuwait Using Compartmental and Logistic Regression Models. Appl. Sci. 2020, 10, 3402.
  23. Rossi, C.; Bonanomi, A.; Oasi, O. Psychological Wellbeing during the COVID-19 Pandemic: The Influence of Personality Traits in the Italian Population. Int. J. Environ. Res. Public Health 2021, 18, 5862.
  24. Ponkratov, V.; Kuznetsov, N.; Bashkirova, N.; Volkova, M.; Alimova, M.; Ivleva, M.; Vatutina, L.; Elyakova, I. Predictive Scenarios of the Russian Oil Industry; with a Discussion on Macro and Micro Dynamics of Open Innovation in the COVID-19 Pandemic. J. Open Innov. Technol. Mark. Complex. 2020, 6, 85.
  25. Hassan, S.M.; Riveros Gavilanes, J.M. First to React Is the Last to Forgive: Evidence from the Stock Market Impact of COVID-19. J. Risk Financ. Manag. 2021, 14, 26.
  26. Flora, J.; Khan, W.; Jin, J.; Jin, D.; Hussain, A.; Dajani, K.; Khan, B. Usefulness of Vaccine Adverse Event Reporting System for Machine-Learning Based Vaccine Research: A Case Study for COVID-19 Vaccines. Int. J. Mol. Sci. 2022, 23, 8235. [Google Scholar] [CrossRef] [PubMed]
  27. Kouamé, K.-M.; Mcheick, H. An Ontological Approach for Early Detection of Suspected COVID-19 among COPD Patients. Appl. Syst. Innov. 2021, 4, 21. [Google Scholar] [CrossRef]
  28. Sarría-Santamera, A.; Abdukadyrov, N.; Glushkova, N.; Russell Peck, D.; Colet, P.; Yeskendir, A.; Asúnsolo, A.; Ortega, M.A. Towards an Accurate Estimation of COVID-19 Cases in Kazakhstan: Back-Casting and Capture–Recapture Approaches. Medicina 2022, 58, 253. [Google Scholar] [CrossRef] [PubMed]
  29. Shim, E.; Choi, W.; Song, Y. Clinical Time Delay Distributions of COVID-19 in 2020–2022 in the Republic of Korea: Inferences from a Nationwide Database Analysis. J. Clin. Med. 2022, 11, 3269. [Google Scholar] [CrossRef]
  30. Jankhonkhan, J.; Sawangtong, W. Model Predictive Control of COVID-19 Pandemic with Social Isolation and Vaccination Policies in Thailand. Axioms 2021, 10, 274. [Google Scholar] [CrossRef]
  31. Queirós-Reis, L.; Gomes da Silva, P.; Gonçalves, J.; Brancale, A.; Bassetto, M.; Mesquita, J.R. SARS-CoV-2 Virus−Host Interaction: Currently Available Structures and Implications of Variant Emergence on Infectivity and Immune Response. Int. J. Mol. Sci. 2021, 22, 10836. [Google Scholar] [CrossRef] [PubMed]
  32. Xu, L.; Xie, L.; Zhang, D.; Xu, X. Elucidation of Binding Features and Dissociation Pathways of Inhibitors and Modulators in SARS-CoV-2 Main Protease by Multiple Molecular Dynamics Simulations. Molecules 2022, 27, 6823. [Google Scholar] [CrossRef]
  33. Williams, D. Probability with Martingales; Cambridge University Press: Cambridge, UK, 1991. [Google Scholar]
  34. Stojanović, V.; Popović, B.Č.; Milovanović, G.V. The Split-SV model. Comput. Statist. Data Anal. 2016, 100, 560–581. [Google Scholar] [CrossRef]
  35. Stojanović, V.; Kevkić, T.; Jelić, G. Application of the Homotopy Analysis Method in Approximation of Convolutions Stochastic Distributions. Univ. Politeh. Buchar. Sci. Bull. 2017, 79, 103–112. [Google Scholar]
  36. Fuller, W.A. Introduction to Statistical Time Series; John Wiley & Sons: New York, NY, USA, 1996. [Google Scholar]
  37. Popović, B.Č. The First Order Random Coefficient (RC) Autoregressive Time Series. Sci. Rev. 1992, 21–22, 131–136. [Google Scholar]
  38. Lawrence, A.J.; Lewis, P.A.W. Reversed Residuals in Autoregressive Time Series Analysis. J. Time Series Anal. 1992, 13, 253–266. [Google Scholar] [CrossRef]
  39. Serfling, R.J. Approximation Theorems of Mathematical Statistics, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2002. [Google Scholar]
  40. Hoeffding, W.; Robbins, H. The central limit theorem for dependent random variables. Duke Math. J. 1948, 15, 773–780. [Google Scholar] [CrossRef]
  41. Gross, L. Tests for normality. R Package Version 1.0-2. 2013. Available online: http://CRAN.R-project.org/package=nortest (accessed on 21 September 2022).
  42. Ruckdeschel, P.; Kohl, M.; Stabla, T.; Camphausen, F. S4 Classes for Distributions. R News 2006, 6, 2–6. Available online: https://CRAN.R-project.org/doc/Rnews (accessed on 21 September 2022).
  43. Sivakumar, B.; Deepthi, B. Complexity of COVID-19 Dynamics. Entropy 2022, 24, 50. [Google Scholar] [CrossRef]
  44. Beškovnik, B.; Zanne, M.; Golnar, M. Dynamic Changes in Port Logistics Caused by the COVID-19 Pandemic. J. Mar. Sci. Eng. 2022, 10, 1473. [Google Scholar] [CrossRef]
  45. Zakharov, V.; Balykina, Y.; Ilin, I.; Tick, A. Forecasting a New Type of Virus Spread: A Case Study of COVID-19 with Stochastic Parameters. Mathematics 2022, 10, 3725. [Google Scholar] [CrossRef]
Figure 1. Dynamics of the basic series of the GSB model. (Parameter values: $\mu = 0$ and $c = \sigma = 1$.)
Figure 2. Convergence of the moduli of the characteristic functions $\varphi_m(u/t, t)$ and $\varphi_y(u/t, t)$ for $t = 1, 2, \ldots, 500$. (Parameter values: $\mu = c = \sigma = 1$.)
Figure 3. Asymptotic variances of the estimates $\tilde{b}_c$ (dashed line) and $\hat{b}_c$ (solid line) as functions of $b_c \in (0, 1)$.
Figure 4. Variances of the estimates $\tilde{\mu}$ (a) and $\hat{\mu}$ (b), shown as 3D plots, as functions of $a_c \in (0, 1)$ and $T > 0$. (The variance of the innovations is $\sigma^2 = 1$.)
Figure 5. (a) Asymptotic variances of the estimates $\hat{\sigma}^2$ (dashed line) and $\hat{\sigma}_X^2$ (solid line) as functions of $b_c \in (0, 1)$. (b) 3D plot of the variance of the statistic $\bar{X}_t^2$ as a function of $b_c \in (0, 1)$ and $T > 0$. (The variance of the innovations is $\sigma^2 = 1$.)
Figure 6. Dynamics of the total infected (a) and vaccinated (b) population with respect to the SARS-CoV-2 virus in the territory of the Republic of Serbia.
Figure 7. Graphs of empirical and modeled data: (a) log-volumes (solid lines) and martingale means (dashed lines); (b) Split-MA(1) process (solid lines) and innovations series (dashed lines). The upper panels represent the dynamics of the COVID-19 infection (Series A), and the lower panels represent the dynamics of the vaccinated population (Series B).
Figure 8. Empirical distributions of actual data (histograms) and their fitted PDFs (lines), obtained by the proposed estimation procedure: (a) distribution of the infected population (Series A); (b) distribution of the vaccinated population (Series B).
Table 1. Summary statistics of estimated parameters of the GSB process, obtained by a Monte Carlo study, along with realized statistics of normality tests.

| Parameter (estimator) | Min. | Mean | MSEE | Max. | AD (p-value) | W (p-value) |
|---|---|---|---|---|---|---|
| Mean ($\tilde{\mu}$) | −24.9395 | −0.0192 | 7.2791 | 26.8691 | 0.2886 (0.6161) | 0.0415 (0.6545) |
| Mean ($\hat{\mu}$) | −20.0310 | −0.00806 | 4.6055 | 19.7987 | 0.3363 (0.5056) | 0.0453 (0.5845) |
| Critical value ($\tilde{c}$) | 0.3849 | 1.0904 | 0.5069 | 1.6481 | 1.0160 * (0.0112) | 0.1449 * (0.0278) |
| Critical value ($\hat{c}$) | 0.5105 | 0.9844 | 0.1587 | 1.5033 | 0.5647 (0.1435) | 0.1074 (0.0889) |
| Variance ($\tilde{\sigma}^2$) | 0.8271 | 0.9991 | 0.0630 | 1.2182 | 0.3144 (0.5446) | 0.0494 (0.5182) |
| Variance ($\hat{\sigma}^2$) | 0.8248 | 1.0002 | 0.0631 | 1.2118 | 0.3247 (0.5231) | 0.0546 (0.4459) |
| Variance ($\hat{\sigma}_Y^2$) | 0.7796 | 1.0034 | 0.0842 | 1.3340 | 0.4018 (0.3584) | 0.0588 (0.3921) |
| Variance ($\hat{\sigma}_X^2$) | 0.1104 | 1.0937 | 1.4183 | 1.6313 | 90.626 ** (<2.2 × 10^−16) | 16.522 ** (7.37 × 10^−10) |

* p < 0.05, ** p < 0.01.
Table 2. Basic statistical indicators of observed actual series.

| Statistics | Infected (A) | Vaccinated (B) |
|---|---|---|
| Mean | 3650.84 | 6336 |
| Median | 2000 | 2960 |
| Mode | 136645 (digits of both columns fused in the source) | |
| Stand. deviation | 3650.84 | 1026.38 |
| Minimum | 60 | 4 |
| Maximum | 19,901 | 68,678 |
| Kurtosis | 8.1189 | 8.2609 |
| Skewness | 2.1418 | 2.7009 |
Table 3. Basic statistical indicators of actual and modeled series.

| Statistics | A: $y_t^{(1)}$ | A: $X_t^{(1)}$ | A: $m_t^{(1)}$ | A: $\varepsilon_t^{(1)}$ | B: $y_t^{(2)}$ | B: $X_t^{(2)}$ | B: $m_t^{(2)}$ | B: $\varepsilon_t^{(2)}$ |
|---|---|---|---|---|---|---|---|---|
| Mean | 7.4041 | −0.0033 | 7.4111 | −0.0054 | 7.3544 | −0.0068 | 8.9349 | −0.1769 |
| Median | 7.5976 | −0.0336 | 7.6061 | −0.0332 | 7.9940 | −0.0566 | 9.4269 | −0.1106 |
| Stand. deviation | 1.3247 | 0.1948 | 1.3244 | 0.1912 | 2.0546 | 1.0036 | 1.7589 | 1.0238 |
| Minimum | 4.0943 | −0.5990 | 4.0943 | −0.5990 | 1.3863 | −5.0554 | 1.0986 | −6.6837 |
| Maximum | 9.8985 | 0.9125 | 9.8985 | 0.7390 | 11.1372 | 5.5147 | 11.3099 | 4.5209 |
| Kurtosis | 2.3419 | 4.3332 | 2.3305 | 3.7214 | 2.4071 | 10.1761 | 3.6732 | 10.2208 |
| Skewness | −0.5493 | 0.6114 | −0.5605 | 0.4518 | −0.4958 | 0.4290 | −1.0703 | −0.1625 |
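
The log-volume series in Table 3 can be linked back to the raw counts in Table 2 with a short arithmetic check. The sketch below assumes that $y_t$ is the natural logarithm of the daily count; the base of the logarithm is an assumption inferred from the tabulated extremes, and the names `raw_extremes`, `log_extremes`, `log_volume` and `max_abs_gap` are illustrative, not taken from the paper.

```python
import math

# Minimum and maximum of the observed daily counts, copied from Table 2.
raw_extremes = {
    "A (infected)": (60, 19901),
    "B (vaccinated)": (4, 68678),
}

# Minimum and maximum of the log-volume series y_t, copied from Table 3.
log_extremes = {
    "A (infected)": (4.0943, 9.8985),
    "B (vaccinated)": (1.3863, 11.1372),
}


def log_volume(count):
    """Natural-log transform (the base is an assumption, see the lead-in)."""
    return math.log(count)


def max_abs_gap():
    """Largest gap between ln(Table 2 extreme) and the matching Table 3 value."""
    gaps = []
    for name, counts in raw_extremes.items():
        for count, reported in zip(counts, log_extremes[name]):
            gaps.append(abs(log_volume(count) - reported))
    return max(gaps)


if __name__ == "__main__":
    for name, (low, high) in raw_extremes.items():
        print(name, round(log_volume(low), 4), round(log_volume(high), 4))
    print("largest gap:", max_abs_gap())
```

Under this assumption the transformed extremes agree with the Table 3 entries to within rounding at the fourth decimal place, which is consistent with reading the $y_t$ columns as log-volumes of the series summarized in Table 2.
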
Table 4. Estimated values of GSB process parameters.

| Parameters | Series A | Series B |
|---|---|---|
| $\tilde{\mu}$ | 7.4041 | 7.3544 |
| $\hat{\mu}$ | 7.4454 | 8.1409 |
| $\hat{\rho}_X(1)$ | −0.0126 | −0.2577 |
| $\tilde{b}_c$ | 0.0127 | 0.3472 |
| $\tilde{c}$ | 0.0003 | 0.2118 |
| $\hat{b}_c$ | 0.0953 | 0.4436 |
| $\hat{c}$ | 0.0006 | 0.3477 |
| $\tilde{\sigma}^2$ | 0.0413 | 1.0462 |
| $\hat{\sigma}^2$ | 0.0403 | 1.0634 |
| $\hat{\sigma}_X^2$ | 0.0375 | 1.0053 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
