Article

Waiting Time Problems for Patterns in a Sequence of Multi-State Trials

Bara Kim, Jeongsim Kim and Jerim Kim
1 Department of Mathematics, Korea University, Seoul 02841, Korea
2 Department of Mathematics Education, Chungbuk National University, Chungbuk 28644, Korea
3 Department of Mathematics, University of Seoul, Seoul 02504, Korea
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(11), 1893; https://doi.org/10.3390/math8111893
Submission received: 15 September 2020 / Revised: 18 October 2020 / Accepted: 24 October 2020 / Published: 31 October 2020
(This article belongs to the Special Issue Queue and Stochastic Models for Operations Research)

Abstract

In this paper, we investigate waiting time problems for a finite collection of patterns in a sequence of independent multi-state trials. By constructing a finite GI/M/1-type Markov chain with a disaster and then using the matrix analytic method, we obtain the probability generating function of the waiting time. From this, we obtain not only the stopping probabilities and the mean waiting time, but also the waiting time distribution by numerical inversion.

1. Introduction

Waiting time problems for runs and patterns in a random sequence of trials are considered important, as they are of theoretical interest and have practical applications in various areas of statistics and applied probability such as reliability, sampling inspection, quality control, DNA/RNA sequence analysis, and hypothesis testing ([1]). For comprehensive surveys and applications of related waiting time problems, refer to the books of Balakrishnan and Koutras [2] and Fu and Lou [3].
Let $\{X_t, t \ge 1\}$ be a sequence of random variables taking values in a finite set $A$. A finite sequence of elements of $A$ is called a pattern. We consider a finite collection $\mathcal{C} = \{C_1, C_2, \ldots, C_K\}$ of patterns, possibly of different lengths. For $i = 1, \ldots, K$, let $\tau_{C_i}$ be the waiting time until the first occurrence of pattern $C_i$ as a run in the series $X_1, X_2, \ldots$. Let $W$ be the waiting time until one of the $K$ patterns appears, i.e.,
$$W = \min\{\tau_{C_1}, \ldots, \tau_{C_K}\}.$$
Many researchers have studied waiting time problems for general and specific choices of $\mathcal{C}$ in a random sequence of trials. When $\{X_t, t \ge 1\}$ is a sequence of independent and identically distributed (i.i.d.) Bernoulli trials, Fu and Koutras [4] developed a finite Markov chain embedding method, which was first employed by Fu [5], to study the exact distributions of the numbers of specified runs and patterns. Fu [6] extended the finite Markov chain embedding method to study the exact distributions of the numbers of runs and patterns in a sequence of i.i.d. multi-state trials. In addition, he obtained the waiting time distribution of a specified pattern.
In this paper, we are mainly interested in computing the waiting time distribution, as well as the stopping probabilities $P(W = \tau_{C_j})$, $j = 1, \ldots, K$. Li [7], Gerber and Li [8], Guibas and Odlyzko [9], Blom and Thorburn [10] and Antzoulakos [11] considered the case when $\{X_t, t \ge 1\}$ is a sequence of i.i.d. multi-state trials. Li [7] and Gerber and Li [8] used the martingale approach to obtain the mean waiting time $E[W]$ and the stopping probabilities $P(W = \tau_{C_i})$, $i = 1, \ldots, K$, for a finite collection $\mathcal{C}$ of patterns. Guibas and Odlyzko [9] used the combinatorial method to obtain the probability generating function of the waiting time. Blom and Thorburn [10] also used the combinatorial method to obtain the mean waiting time $E[W]$ and the stopping probabilities $P(W = \tau_{C_i})$, $i = 1, \ldots, K$, for a finite collection $\mathcal{C}$ of patterns of the same length. Antzoulakos [11] used the finite Markov chain embedding method to study waiting time problems for a single pattern as well as for a finite collection $\mathcal{C}$ of patterns.
Han and Hirano [12], Fu and Chang [13], Glaz et al. [14], Pozdnyakov [15], Gava and Salotti [16], Zhao et al. [17] and Kerimov and Öner [18] considered the case when $\{X_t, t \ge 1\}$ is a discrete time homogeneous Markov chain with a finite state space, i.e., a sequence of Markov dependent multi-state trials. Han and Hirano [12] studied waiting time problems for two different patterns. Fu and Chang [13] studied waiting time problems for a finite collection of patterns by using the finite Markov chain embedding method. Glaz et al. [14] obtained the mean waiting time $E[W]$ and the probability generating function of the waiting time for a finite collection of patterns in a two-state Markov chain by using the method of gambling teams and the martingale approach. Pozdnyakov [15] investigated the same problems as Glaz et al. [14] for multi-state Markovian trials. Gava and Salotti [16] derived a system of linear equations for the stopping probabilities $P(W = \tau_{C_i})$, $i = 1, \ldots, K$, by using the gambling-team methods developed in [14,15]. Recently, Zhao et al. [17] found a method, based on the method of [9], to calculate $E[W]$ and $P(W = \tau_{C_i})$, $i = 1, \ldots, K$. Even more recently, Kerimov and Öner [18] found oscillation properties of the expected stopping times and stopping probabilities for patterns consisting of two consecutive states. For useful reviews of different approaches to waiting time problems of patterns, for both i.i.d. and Markov dependent trials, refer to Fu and Lou [3].
Antzoulakos [11] and Fu and Chang [13] obtained the probability generating function of the waiting time for a finite collection of patterns in a sequence of i.i.d. and Markov dependent multi-state trials, respectively. They used a Markov chain with absorbing states corresponding to the patterns and considered the waiting time as the first entrance time into an absorbing state. The Markov chain has a transition probability matrix $P$ of the form
$$P = \begin{pmatrix} P_{TT} & P_{TA} \\ O & I \end{pmatrix}, \qquad (1)$$
where $P_{TT}$ is the submatrix of $P$ whose entries are the transition probabilities from a transient state to a transient state, $P_{TA}$ is the submatrix of $P$ whose entries are the transition probabilities from a transient state to an absorbing state, $O$ is the zero matrix and $I$ is the identity matrix. By using the general formula for the probability generating function of the first entrance time into an absorbing state, they obtained the probability generating function of the waiting time. Their results are expressed in terms of the submatrices $P_{TT}$ and $P_{TA}$, as well as variants of them. Chang [19] also studied waiting time problems for a finite collection of patterns; he investigated the distribution of the waiting time until the $r$th occurrence of any pattern in the collection, also using the expression (1) for the analysis.
In this paper, we consider a sequence of i.i.d. multi-state trials. We also use a Markov chain with a transition probability matrix of the form (1). However, we exploit the detailed structure of the submatrices $P_{TT}$ and $P_{TA}$. This enables us to construct a finite GI/M/1-type Markov chain with a disaster and to regard the waiting time as the time until the occurrence of the disaster. Based on this construction and the matrix analytic method, we obtain the probability generating function of the waiting time $W$ on $\{W = \tau_{C_j}\}$, $\Psi_j(z) = E[z^W 1_{\{W = \tau_{C_j}\}}]$, $j = 1, \ldots, K$. From this, we obtain not only the stopping probabilities $P(W = \tau_{C_i})$, $i = 1, \ldots, K$, and the conditional/unconditional mean waiting times $E[W \mid W = \tau_{C_j}]$ and $E[W]$, but also the waiting time distribution by numerical inversion. A benefit of our method is that it remains efficient even when the lengths of the patterns are large. Our method can also be extended to Markov dependent multi-state trials.
The paper is organized as follows. In Section 2, we formulate our waiting time problems. In Section 3, we construct a GI/M/1-type Markov chain with a disaster. From this we can obtain our results, which are given in Section 4. In Section 5, numerical examples are presented to illustrate our results. Conclusions are given in Section 6.

2. Problem Formulation

Let $\{X_t, t \ge 1\}$ be a sequence of i.i.d. trials taking values in a finite set $A$. Assume that for $t = 1, 2, \ldots$,
$$P(X_t = x) = p_x, \quad x \in A,$$
where $\sum_{x \in A} p_x = 1$. For a finite collection $\mathcal{C} = \{C_1, C_2, \ldots, C_K\}$ of patterns, suppose that pattern $C_i$ is of the form
$$C_i = s_1^i \cdots s_{l_i}^i, \quad i = 1, \ldots, K,$$
where $s_j^i \in A$, $j = 1, \ldots, l_i$, i.e., $C_i$ is an arbitrary pattern of length $l_i$. Here, $l_i$, $i = 1, \ldots, K$, are fixed positive integers with $l_1 \ge l_2 \ge \cdots \ge l_K$. Recall that $W$ is the waiting time until one of the $K$ patterns appears, i.e.,
$$W = \min\{\tau_{C_1}, \ldots, \tau_{C_K}\}.$$
We will call W the sooner waiting time.
Our main interest is to derive the probability generating function of the sooner waiting time $W$ on $\{W = \tau_{C_j}\}$, $j = 1, \ldots, K$, i.e.,
$$\Psi_j(z) = E\left[z^W 1_{\{W = \tau_{C_j}\}}\right], \quad j = 1, \ldots, K. \qquad (2)$$
From this, we can obtain the stopping probabilities, the conditional/unconditional probability mass functions of $W$, and the conditional/unconditional means of $W$ as follows:
  • The stopping probabilities $P(W = \tau_{C_j})$, $j = 1, \ldots, K$, are given by $P(W = \tau_{C_j}) = \Psi_j(1)$.
  • The conditional probability mass function of $W$ given $W = \tau_{C_j}$, i.e., $P(W = n \mid W = \tau_{C_j})$, $j = 1, \ldots, K$, can be computed from the conditional probability generating function $\Psi_j(z)/\Psi_j(1)$ of $W$ given $W = \tau_{C_j}$, by numerical inversion.
  • The probability mass function of $W$, $P(W = n)$, can be computed from $\sum_{j=1}^{K} \Psi_j(z)$ by numerical inversion.
  • The conditional mean of $W$, $E[W \mid W = \tau_{C_j}]$, $j = 1, \ldots, K$, can be obtained from $E[W \mid W = \tau_{C_j}] = W_j^{(1)}/\Psi_j(1)$, where $W_j^{(1)} = \frac{d}{dz}\Psi_j(z)\big|_{z=1}$. In addition, the unconditional mean of $W$, $E[W]$, can be obtained from $E[W] = \sum_{j=1}^{K} W_j^{(1)}$.

3. GI/M/1-Type Markov Chain with a Disaster

In this section, we construct a GI/M/1-type Markov chain with a disaster to obtain an expression for (2). We define the following three terms:
  • $s_1 \cdots s_j$ is a subpattern of pattern $C_i$ if $s_1 \cdots s_j = s_k^i s_{k+1}^i \cdots s_{k+j-1}^i$ for some $k$ with $1 \le k \le l_i - j + 1$; when $j = 0$, $s_1 \cdots s_j$ means the null pattern (i.e., the pattern of length 0).
  • A subpattern $s_1 \cdots s_j$ of pattern $C_i$ is proper if $j < l_i$ (i.e., $s_1 \cdots s_j \ne C_i$).
  • A subpattern $s_1 \cdots s_j$ of pattern $C_i$ is a leading subpattern of $C_i$ if $s_1 \cdots s_j = s_1^i \cdots s_j^i$ for some $j$ with $0 \le j \le l_i$.
Assume that for $i \ne j$, $C_i$ is not a proper subpattern of $C_j$.
We now introduce a two-dimensional process $(\tilde N_t, \tilde J_t)$, $t = 0, 1, 2, \ldots$, where $\tilde N_t$ and $\tilde J_t$ are defined as follows:
(i) $\tilde N_0 = 0$, and for $t = 1, 2, \ldots$:
  • $\tilde N_t$ is the largest $n \in \{0, 1, \ldots, \min\{l_1 - 1, t\}\}$ such that $(X_{t-n+1}, \ldots, X_t)$ is a proper leading subpattern of a pattern in $\mathcal{C}$, if $t < W$.
  • $\tilde N_t = \Delta$, where $\Delta$ is an extra point, if $t \ge W$.
(ii) $\tilde J_0 = 1$, and for $t = 1, 2, \ldots$:
  • $\tilde J_t$ is the smallest $i \in \{1, \ldots, K\}$ such that $(X_{t - \tilde N_t + 1}, \ldots, X_t)$ is a proper leading subpattern of pattern $C_i$, if $\tilde N_t \in \{1, \ldots, l_1 - 1\}$.
  • $\tilde J_t = 1$, if $\tilde N_t = 0$.
  • $\tilde J_t = j$, if $\tilde N_t = \Delta$ and $W = \tau_{C_j}$.
To clarify the definitions of $\tilde N_t$ and $\tilde J_t$, we provide the following example. Let $\{X_t, t \ge 1\}$ be a sequence of i.i.d. trials taking values in the finite set $A = \{a, b, c\}$. Suppose $\mathcal{C} = \{C_1, C_2, C_3\}$, where $C_1 = aaabbb$, $C_2 = aaba$ and $C_3 = abc$. If we consider the sequence of trials
$$b\ c\ a\ a\ b\ b\ b\ a\ a\ a\ b\ b\ b\ b\ c,$$
then $(\tilde N_t, \tilde J_t)$, $t = 0, 1, 2, \ldots$, are as given in Table 1. As another example, if we consider the sequence of trials
$$a\ b\ a\ b\ a\ c\ c\ a\ a\ c\ a\ a\ b\ a\ a,$$
then $(\tilde N_t, \tilde J_t)$, $t = 0, 1, 2, \ldots$, are as given in Table 2. Note that $\{(\tilde N_t, \tilde J_t), t = 0, 1, \ldots\}$ is a discrete time Markov chain.
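As an illustration (not part of the original derivation), the following Python sketch implements the definitions of $\tilde N_t$ and $\tilde J_t$ directly; the function name chain_states is arbitrary. Running it on the first sequence above reproduces Table 1.

```python
def chain_states(xs, patterns):
    """Trace (N~_t, J~_t) of Section 3 along a fixed trial sequence xs.
    `patterns` is the collection C, longest pattern first; the disaster
    level Delta is encoded as the string 'D'."""
    l1 = max(len(p) for p in patterns)
    path = [(0, 1)]                    # (N~_0, J~_0) = (0, 1)
    disaster = None                    # becomes j once W = tau_{C_j}
    for t in range(1, len(xs) + 1):
        if disaster is None:
            for j, p in enumerate(patterns, start=1):
                if t >= len(p) and xs[t - len(p):t] == p:
                    disaster = j       # pattern C_j occurs at time t, so W = t
                    break
        if disaster is not None:
            path.append(("D", disaster))
            continue
        # N~_t: largest n with (X_{t-n+1},...,X_t) a proper leading
        # subpattern of some pattern; J~_t: smallest such pattern index
        for n in range(min(l1 - 1, t), -1, -1):
            js = [j for j, p in enumerate(patterns, start=1)
                  if n < len(p) and p[:n] == xs[t - n:t]]
            if js:
                path.append((n, js[0]))
                break
    return path

# reproduces Table 1 for C1 = aaabbb, C2 = aaba, C3 = abc
print(chain_states("bcaabbbaaabbbbc", ["aaabbb", "aaba", "abc"]))
```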
Define $m_0 = 1$ and, for $k = 1, 2, \ldots, l_1 - 1$, let $m_k$ be the number of patterns in $\mathcal{C}$ whose lengths are larger than $k$, i.e.,
$$m_k = \max\{j \in \{1, \ldots, K\} : k < l_j\}, \quad k = 1, \ldots, l_1 - 1.$$
We also define $m_\Delta = K$. Note that $\tilde N_t \in \{0, 1, \ldots, l_1 - 1, \Delta\}$, and if $\tilde N_t = k$, then $\tilde J_t \in \{1, 2, \ldots, m_k\}$. Furthermore, the set of all possible values of $\tilde J_t$ when $\tilde N_t = k \in \{0, 1, \ldots, l_1 - 1, \Delta\}$ is given by
$$I_k = \begin{cases}
\{1\} & \text{if } k = 0, \\
\{i : 1 \le i \le m_k,\ s_1^i \cdots s_k^i \text{ is not a leading subpattern of } C_j \text{ for } j < i\} & \text{if } 1 \le k \le l_1 - 1, \\
\{1, \ldots, K\} & \text{if } k = \Delta.
\end{cases}$$
Therefore, the state space of the discrete time Markov chain $\{(\tilde N_t, \tilde J_t), t = 0, 1, \ldots\}$ is
$$E = \{(k, i) : k = 0, 1, \ldots, l_1 - 1, \Delta;\ i \in I_k\}. \qquad (3)$$
For each state $(k, i)$, the first component $k$ is called the level. The one-step transition probability matrix $P$ of $\{(\tilde N_t, \tilde J_t), t = 0, 1, \ldots\}$ is given, in lexicographic order with $\Delta$ being the last element in the set of levels, by
$$P = \begin{pmatrix}
P_{00} & P_{01} & O & O & \cdots & O & P_{0\Delta} \\
P_{10} & P_{11} & P_{12} & O & \cdots & O & P_{1\Delta} \\
P_{20} & P_{21} & P_{22} & P_{23} & \cdots & O & P_{2\Delta} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
P_{l_1-2,0} & P_{l_1-2,1} & P_{l_1-2,2} & P_{l_1-2,3} & \cdots & P_{l_1-2,l_1-1} & P_{l_1-2,\Delta} \\
P_{l_1-1,0} & P_{l_1-1,1} & P_{l_1-1,2} & P_{l_1-1,3} & \cdots & P_{l_1-1,l_1-1} & P_{l_1-1,\Delta} \\
O & O & O & O & \cdots & O & I_\Delta
\end{pmatrix}, \qquad (4)$$
where the submatrices are described below. A matrix consisting of $(i,j)$-components with $i \in I$ and $j \in J$ will be called an $I \times J$ matrix.
  • For $k = 0, 1, \ldots, l_1 - 1$, $P_{k0}$ is the $I_k \times I_0$ matrix whose $(i,1)$-component is
$$(P_{k0})_{i1} = \sum_{x \in A_{k0}^i} p_x,$$
where $A_{k0}^i$ is the subset of $A$ consisting of those $x$ such that $s_{k'}^i s_{k'+1}^i \cdots s_k^i x$ is not a leading subpattern of a pattern in $\mathcal{C}$ for any $k' \in \{1, \ldots, k+1\}$.
  • For $k = 0, 1, \ldots, l_1 - 1$, $P_{k\Delta}$ is an $I_k \times I_\Delta$ matrix. The $(i,j)$-component of $P_{k\Delta}$ is
$$(P_{k\Delta})_{ij} = p_{s_{l_j}^j}$$
if $s_1^i s_2^i \cdots s_k^i = s_1^j s_2^j \cdots s_{l_j-1}^j$. Otherwise, $(P_{k\Delta})_{ij} = 0$.
  • For $k = 0, 1, \ldots, l_1 - 1$, and $k' = 1, \ldots, \min\{k+1, l_1 - 1\}$, $P_{kk'}$ is an $I_k \times I_{k'}$ matrix. The $(i,j)$-component of $P_{kk'}$ is
$$(P_{kk'})_{ij} = p_{s_{k'}^j}$$
if the following three conditions hold:
(i) $s_{k-k'+2}^i s_{k-k'+3}^i \cdots s_k^i s_{k'}^j$ is a proper leading subpattern of pattern $C_j$;
(ii) $s_1^j \cdots s_{k'}^j$ is not a proper leading subpattern of pattern $C_{j'}$ for $j' \in \{1, \ldots, j-1\}$;
(iii) $s_n^i s_{n+1}^i \cdots s_k^i s_{k'}^j$ is not a leading subpattern of a pattern in $\mathcal{C}$ for $n \in \{1, 2, \ldots, k - k' + 1\}$.
Otherwise, $(P_{kk'})_{ij} = 0$.
  • The $O$'s are zero matrices (possibly of different sizes).
  • $I_\Delta$ is the $I_\Delta \times I_\Delta$ identity matrix.
To make it easier to understand how the matrix $P$ in (4) is constructed, we explain with an example. For the previously described example with $A = \{a, b, c\}$, $C_1 = aaabbb$, $C_2 = aaba$ and $C_3 = abc$, we have
$$I_0 = \{1\}, \quad I_1 = \{1\}, \quad I_2 = \{1, 3\}, \quad I_3 = \{1, 2\}, \quad I_4 = \{1\}, \quad I_5 = \{1\}, \quad I_\Delta = \{1, 2, 3\}.$$
The matrix $P$ is
$$P = \begin{pmatrix}
P_{00} & P_{01} & O & O & O & O & P_{0\Delta} \\
P_{10} & P_{11} & P_{12} & O & O & O & P_{1\Delta} \\
P_{20} & P_{21} & P_{22} & P_{23} & O & O & P_{2\Delta} \\
P_{30} & P_{31} & P_{32} & P_{33} & P_{34} & O & P_{3\Delta} \\
P_{40} & P_{41} & P_{42} & P_{43} & P_{44} & P_{45} & P_{4\Delta} \\
P_{50} & P_{51} & P_{52} & P_{53} & P_{54} & P_{55} & P_{5\Delta} \\
O & O & O & O & O & O & I_\Delta
\end{pmatrix},$$
where
$$P_{00} = \begin{pmatrix} p_b + p_c \end{pmatrix}, \quad P_{01} = \begin{pmatrix} p_a \end{pmatrix}, \quad P_{10} = \begin{pmatrix} p_c \end{pmatrix}, \quad P_{11} = \begin{pmatrix} 0 \end{pmatrix}, \quad P_{12} = \begin{pmatrix} p_a & p_b \end{pmatrix},$$
$$P_{20} = \begin{pmatrix} p_c \\ p_b \end{pmatrix}, \quad P_{21} = \begin{pmatrix} 0 \\ p_a \end{pmatrix}, \quad P_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad P_{23} = \begin{pmatrix} p_a & p_b \\ 0 & 0 \end{pmatrix},$$
$$P_{30} = \begin{pmatrix} p_c \\ p_b \end{pmatrix}, \quad P_{31} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad P_{32} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad P_{33} = \begin{pmatrix} p_a & 0 \\ 0 & 0 \end{pmatrix}, \quad P_{34} = \begin{pmatrix} p_b \\ 0 \end{pmatrix},$$
$$P_{40} = \begin{pmatrix} 0 \end{pmatrix}, \quad P_{41} = \begin{pmatrix} 0 \end{pmatrix}, \quad P_{42} = \begin{pmatrix} 0 & 0 \end{pmatrix}, \quad P_{43} = \begin{pmatrix} 0 & 0 \end{pmatrix}, \quad P_{44} = \begin{pmatrix} 0 \end{pmatrix}, \quad P_{45} = \begin{pmatrix} p_b \end{pmatrix},$$
$$P_{50} = \begin{pmatrix} p_c \end{pmatrix}, \quad P_{51} = \begin{pmatrix} p_a \end{pmatrix}, \quad P_{52} = \begin{pmatrix} 0 & 0 \end{pmatrix}, \quad P_{53} = \begin{pmatrix} 0 & 0 \end{pmatrix}, \quad P_{54} = \begin{pmatrix} 0 \end{pmatrix}, \quad P_{55} = \begin{pmatrix} 0 \end{pmatrix},$$
$$P_{0\Delta} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}, \quad P_{1\Delta} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}, \quad P_{2\Delta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & p_c \end{pmatrix}, \quad P_{3\Delta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & p_a & p_c \end{pmatrix},$$
$$P_{4\Delta} = \begin{pmatrix} 0 & p_a & p_c \end{pmatrix}, \quad P_{5\Delta} = \begin{pmatrix} p_b & 0 & 0 \end{pmatrix}, \quad I_\Delta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
That is, $P$ is given by
$$P = \begin{pmatrix}
p_b + p_c & p_a & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
p_c & 0 & p_a & p_b & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
p_c & 0 & 0 & 0 & p_a & p_b & 0 & 0 & 0 & 0 & 0 \\
p_b & p_a & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_c \\
p_c & 0 & 0 & 0 & p_a & 0 & p_b & 0 & 0 & 0 & 0 \\
p_b & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_a & p_c \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & p_b & 0 & p_a & p_c \\
p_c & p_a & 0 & 0 & 0 & 0 & 0 & 0 & p_b & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.$$
Let $\{(N_t, J_t), t = 0, 1, \ldots\}$ be a two-dimensional discrete time Markov chain with the same state space $E$ as that given in (3) and the same transition probability matrix $P$ as that given in (4), but with an arbitrary initial state. Note that $\{(N_t, J_t), t = 0, 1, \ldots\}$ is a finite GI/M/1-type Markov chain with a disaster; the disaster occurs when $N_t$ reaches $\Delta$.
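For readers who wish to experiment, the state space $E$ and the matrix $P$ of (4) can be assembled mechanically from a pattern collection. The following Python sketch is an illustration under the assumptions of Section 2 (the helper name build_chain is arbitrary); applied to the running example, it reproduces the 11 × 11 matrix displayed above, with states ordered as in (4).

```python
import numpy as np

def build_chain(patterns, probs):
    """State space E of (3) and transition matrix P of (4).
    `patterns`: the collection C with l_1 >= l_2 >= ... >= l_K;
    `probs`: dict of symbol probabilities p_x."""
    l1, K = len(patterns[0]), len(patterns)
    # transient states (k, i): one state per distinct length-k proper
    # leading subpattern, tagged with the smallest pattern index (set I_k)
    states = [(0, 0, "")]
    for k in range(1, l1):
        seen = set()
        for i, p in enumerate(patterns):
            if k < len(p) and p[:k] not in seen:
                seen.add(p[:k])
                states.append((k, i, p[:k]))
    n_tr = len(states)
    index = {(k, i): r for r, (k, i, _) in enumerate(states)}
    P = np.zeros((n_tr + K, n_tr + K))
    for r, (k, i, w) in enumerate(states):
        for x, px in probs.items():
            u = w + x
            hit = next((j for j, p in enumerate(patterns) if u.endswith(p)), None)
            if hit is not None:                    # disaster: level Delta, phase hit
                P[r, n_tr + hit] += px
                continue
            for n in range(min(l1 - 1, len(u)), -1, -1):
                js = [j for j, p in enumerate(patterns)
                      if n < len(p) and p[:n] == u[len(u) - n:]]
                if js:                             # largest feasible level n
                    P[r, index[(n, js[0])]] += px
                    break
    P[n_tr:, n_tr:] = np.eye(K)                    # level Delta is absorbing
    return states, P

states, P = build_chain(["aaabbb", "aaba", "abc"],
                        {"a": 1/3, "b": 1/3, "c": 1/3})
```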

4. Probability Generating Function of the Waiting Time

In this section, we derive an expression for (2). The analysis is based on the matrix analytic method. For more details, refer to Neuts [20,21] and Latouche and Ramaswami [22]. Let
$$\tau_n = \inf\{t \ge 1 : N_t = n\}, \quad n = 0, 1, \ldots, l_1 - 1, \qquad \tau_\Delta = \inf\{t \ge 1 : N_t = \Delta\}.$$
For $n = 0, 1, \ldots, l_1 - 2$, we define
$$(G_n(z))_{ij} = E\left[z^{\tau_{n+1}} 1_{\{\tau_{n+1} < \tau_\Delta,\, J_{\tau_{n+1}} = j\}} \mid (N_0, J_0) = (n, i)\right],$$
which is the probability generating function for the time of the first visit to state $(n+1, j)$, starting from state $(n, i)$ at time 0, before the first visit to level $\Delta$. Let $G_n(z)$ be the matrix of probability generating functions whose $(i,j)$-component is $(G_n(z))_{ij}$. Conditioning on the first transition, we have
$$G_n(z) = z P_{n,n+1} + z \sum_{k=0}^{n} P_{nk}\, G_{k:n+1}(z), \quad 0 \le n \le l_1 - 2, \qquad (5)$$
where the $(i,j)$-component of $G_{k:n+1}(z)$ is
$$(G_{k:n+1}(z))_{ij} = E\left[z^{\tau_{n+1}} 1_{\{\tau_{n+1} < \tau_\Delta,\, J_{\tau_{n+1}} = j\}} \mid (N_0, J_0) = (k, i)\right], \quad 0 \le k \le n \le l_1 - 2.$$
Since the level can increase by at most one per transition,
$$G_{n:n+1}(z) = G_n(z), \quad 0 \le n \le l_1 - 2, \qquad G_{k:n+1}(z) = G_k(z)\, G_{k+1:n+1}(z), \quad 0 \le k < n \le l_1 - 2,$$
we have
$$G_{k:n+1}(z) = G_k(z) G_{k+1}(z) \cdots G_n(z), \quad 0 \le k \le n \le l_1 - 2. \qquad (6)$$
Substituting (6) into (5), we obtain
$$G_n(z) = z P_{n,n+1} + z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_n(z), \quad 0 \le n \le l_1 - 2. \qquad (7)$$
Equation (7) can be interpreted as follows. Starting from level $n$, the Markov chain may visit level $n+1$ (while avoiding level $\Delta$) in two ways: it may move up to level $n+1$ at the very next transition (contributing the term $z P_{n,n+1}$), or it may move to level $k$ ($0 \le k \le n$) at the first transition, move up from level $k$ to level $k+1$, then from level $k+1$ to level $k+2$, and so on, until finally moving from level $n$ to level $n+1$ (contributing the term $z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_n(z)$).
From (7), we obtain
$$G_n(z) = z \left(I_n - z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_{n-1}(z)\right)^{-1} P_{n,n+1}, \quad 0 \le n \le l_1 - 2, \qquad (8)$$
where $I_n$ is the $I_n \times I_n$ identity matrix and the empty product (the case $k = n$) is interpreted as the identity matrix.
For $n = 0, 1, \ldots, l_1 - 1$, we define
$$(H_n(z))_{ij} = \begin{cases}
E\left[z^{\tau_\Delta} 1_{\{\tau_{n+1} > \tau_\Delta,\, J_{\tau_\Delta} = j\}} \mid (N_0, J_0) = (n, i)\right] & \text{if } n = 0, 1, \ldots, l_1 - 2, \\
E\left[z^{\tau_\Delta} 1_{\{J_{\tau_\Delta} = j\}} \mid (N_0, J_0) = (l_1 - 1, i)\right] & \text{if } n = l_1 - 1.
\end{cases}$$
That is, $(H_n(z))_{ij}$, $n = 0, 1, \ldots, l_1 - 2$, is the probability generating function for the time of the first visit to state $(\Delta, j)$, starting from state $(n, i)$, before the first visit to level $n+1$, and $(H_{l_1-1}(z))_{ij}$ is the probability generating function for the time of the first visit to state $(\Delta, j)$, starting from state $(l_1 - 1, i)$. Let $H_n(z)$ be the matrix of probability generating functions whose $(i,j)$-component is $(H_n(z))_{ij}$. Conditioning on the first transition, we have
$$H_n(z) = z P_{n\Delta} + z \sum_{k=0}^{n} P_{nk}\, H_{k:n+1}(z), \quad 0 \le n \le l_1 - 1, \qquad (9)$$
where the $(i,j)$-component of $H_{k:n+1}(z)$ ($0 \le k \le n \le l_1 - 1$) is
$$(H_{k:n+1}(z))_{ij} = \begin{cases}
E\left[z^{\tau_\Delta} 1_{\{\tau_{n+1} > \tau_\Delta,\, J_{\tau_\Delta} = j\}} \mid (N_0, J_0) = (k, i)\right] & \text{if } n = 0, 1, \ldots, l_1 - 2, \\
E\left[z^{\tau_\Delta} 1_{\{J_{\tau_\Delta} = j\}} \mid (N_0, J_0) = (k, i)\right] & \text{if } n = l_1 - 1.
\end{cases} \qquad (10)$$
Since
$$H_{n:n+1}(z) = H_n(z), \quad 0 \le n \le l_1 - 1, \qquad H_{k:n+1}(z) = H_k(z) + G_k(z)\, H_{k+1:n+1}(z), \quad 0 \le k < n \le l_1 - 1,$$
we have
$$H_{k:n+1}(z) = \sum_{k'=k}^{n} G_k(z) \cdots G_{k'-1}(z)\, H_{k'}(z), \quad 0 \le k \le n \le l_1 - 1. \qquad (11)$$
Substituting (11) into (9), we obtain
$$H_n(z) = z P_{n\Delta} + z \sum_{k=0}^{n} P_{nk} \sum_{k'=k}^{n} G_k(z) G_{k+1}(z) \cdots G_{k'-1}(z)\, H_{k'}(z), \quad 0 \le n \le l_1 - 1,$$
which can be written as
$$H_n(z) = z P_{n\Delta} + z \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} G_k(z) G_{k+1}(z) \cdots G_{k'-1}(z)\, H_{k'}(z) + z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_{n-1}(z)\, H_n(z), \quad 0 \le n \le l_1 - 1.$$
From this equation, we obtain
$$H_n(z) = z \left(I_n - z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_{n-1}(z)\right)^{-1} \left(P_{n\Delta} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} G_k(z) G_{k+1}(z) \cdots G_{k'-1}(z)\, H_{k'}(z)\right), \quad 0 \le n \le l_1 - 1. \qquad (12)$$
Recall that $\Psi_j(z) = E[z^W 1_{\{W = \tau_{C_j}\}}]$, $j = 1, \ldots, K$. Since
$$P(W = n, W = \tau_{C_j}) = P\left((\tau_\Delta, J_{\tau_\Delta}) = (n, j) \mid (N_0, J_0) = (0, 1)\right),$$
we have
$$\Psi_j(z) = E[z^W 1_{\{W = \tau_{C_j}\}}] = E\left[z^{\tau_\Delta} 1_{\{J_{\tau_\Delta} = j\}} \mid (N_0, J_0) = (0, 1)\right], \quad j = 1, \ldots, K,$$
which means, by (10),
$$\Psi_j(z) = (H_{0:l_1}(z))_{1j}.$$
Therefore, by (11),
$$(\Psi_1(z), \ldots, \Psi_K(z)) = \sum_{k=0}^{l_1 - 1} G_0(z) G_1(z) \cdots G_{k-1}(z)\, H_k(z).$$
In summary, we obtain the following theorem.
Theorem 1.
The probability generating functions of the sooner waiting time $W$ on $\{W = \tau_{C_j}\}$, $\Psi_j(z) = E[z^W 1_{\{W = \tau_{C_j}\}}]$, $j = 1, \ldots, K$, are given by
$$(\Psi_1(z), \ldots, \Psi_K(z)) = \sum_{k=0}^{l_1 - 1} G_0(z) G_1(z) \cdots G_{k-1}(z)\, H_k(z), \qquad (13)$$
where $G_k(z)$, $k = 0, 1, \ldots, l_1 - 2$, and $H_k(z)$, $k = 0, 1, \ldots, l_1 - 1$, are given by (8) and (12), respectively.
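Theorem 1 translates directly into an algorithm. The following Python sketch (illustrative code; the names psi_vector and Gprod are arbitrary) evaluates $(\Psi_1(z), \ldots, \Psi_K(z))$ from the blocks $P_{nk}$ and $P_{n\Delta}$ via the recursions (8) and (12) and formula (13). It accepts complex $z$, which the numerical inversion discussed below requires.

```python
import numpy as np

def Gprod(G, sizes, a, b):
    """G_a(z) G_{a+1}(z) ... G_{b-1}(z); the empty product (a == b) is I."""
    M = np.eye(sizes[a], dtype=complex)
    for k in range(a, b):
        M = M @ G[k]
    return M

def psi_vector(z, P_blocks, P_delta, l1):
    """(Psi_1(z), ..., Psi_K(z)) via (8), (12) and (13).
    P_blocks[(n, k)] is the block P_{nk} for levels n = 0..l1-1, k <= n+1;
    P_delta[n] is P_{n,Delta}; l1 is the length of the longest pattern."""
    sizes = {k: P_blocks[(k, 0)].shape[0] for k in range(l1)}   # |I_k|
    G, H = {}, {}
    for n in range(l1 - 1):                                     # recursion (8)
        M = sum(P_blocks[(n, k)] @ Gprod(G, sizes, k, n) for k in range(n + 1))
        G[n] = z * np.linalg.solve(np.eye(sizes[n]) - z * M,
                                   P_blocks[(n, n + 1)])
    for n in range(l1):                                         # recursion (12)
        M = sum(P_blocks[(n, k)] @ Gprod(G, sizes, k, n) for k in range(n + 1))
        S = P_delta[n] + sum(P_blocks[(n, k)]
                             @ sum(Gprod(G, sizes, k, kp) @ H[kp]
                                   for kp in range(k, n))
                             for k in range(n))
        H[n] = z * np.linalg.solve(np.eye(sizes[n]) - z * M, S)
    # formula (13), read off from the initial state (0, 1)
    return sum(Gprod(G, sizes, 0, k) @ H[k] for k in range(l1))[0]
```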
From Theorem 1, we can obtain the following results.
Corollary 1.
(i)
The stopping probabilities $P(W = \tau_{C_j})$, $j = 1, \ldots, K$, are given by
$$P(W = \tau_{C_j}) = \Psi_j(1).$$
(ii)
The conditional probability generating functions of $W$, given $W = \tau_{C_j}$, $j = 1, \ldots, K$, are given by
$$E[z^W \mid W = \tau_{C_j}] = \frac{\Psi_j(z)}{\Psi_j(1)}. \qquad (14)$$
(iii)
The marginal probability generating function of $W$ is given by
$$E[z^W] = \sum_{j=1}^{K} \Psi_j(z). \qquad (15)$$
Remark. As mentioned in Section 2, the conditional probability mass function $P(W = n \mid W = \tau_{C_j})$, $j = 1, \ldots, K$, can be computed from (14) by numerical inversion. In addition, the probability mass function $P(W = n)$ can be computed from (15) by numerical inversion. For the numerical inversion of probability generating functions, refer to Abate and Whitt [23].
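For concreteness, the following Python sketch implements one standard form of the inversion algorithm of Abate and Whitt [23] (the name invert_pgf is arbitrary): the coefficient $p_n$ of a probability generating function is recovered from a trapezoidal discretization of the Cauchy contour integral on a circle of radius $r < 1$, with $r = 10^{-\gamma/(2n)}$ keeping the aliasing error near $10^{-\gamma}$.

```python
import cmath

def invert_pgf(psi, n, gamma=8):
    """Approximate p_n from the PGF psi(z) = sum_m p_m z^m, following the
    lattice inversion of Abate and Whitt [23]:
      p_n ~ (2 n r^n)^{-1} [ psi(r) + (-1)^n psi(-r)
            + 2 sum_{k=1}^{n-1} (-1)^k Re psi(r e^{i pi k / n}) ],
    with aliasing error about 10^{-gamma} for r = 10^{-gamma/(2n)}."""
    r = 10.0 ** (-gamma / (2.0 * n))
    total = complex(psi(r)).real + (-1) ** n * complex(psi(-r)).real
    for k in range(1, n):
        total += 2 * (-1) ** k * complex(psi(r * cmath.exp(1j * cmath.pi * k / n))).real
    return total / (2 * n * r ** n)
```

For instance, $P(W = n, W = \tau_{C_j})$ is obtained by passing psi = lambda z: psi_vector(z, P_blocks, P_delta, l1)[j - 1] to invert_pgf.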
By Theorem 1, we can also obtain the conditional/unconditional means of the sooner waiting time. To this end, we introduce
$$G_k = G_k(1), \quad G_k^{(1)} = \frac{d}{dz} G_k(z)\Big|_{z=1}, \quad k = 0, 1, \ldots, l_1 - 2,$$
$$H_k = H_k(1), \quad H_k^{(1)} = \frac{d}{dz} H_k(z)\Big|_{z=1}, \quad k = 0, 1, \ldots, l_1 - 1.$$
Recall that $W_j^{(1)} = \frac{d}{dz}\Psi_j(z)\big|_{z=1}$, $j = 1, \ldots, K$. By differentiating (13) with respect to $z$ and evaluating at $z = 1$, we have
$$(W_1^{(1)}, \ldots, W_K^{(1)}) = \sum_{k=0}^{l_1 - 1} \left( \sum_{i=0}^{k-1} G_0 G_1 \cdots G_{i-1}\, G_i^{(1)}\, G_{i+1} \cdots G_{k-1}\, H_k + G_0 G_1 \cdots G_{k-1}\, H_k^{(1)} \right). \qquad (16)$$
Therefore, to obtain an expression for $W_j^{(1)}$, $j = 1, \ldots, K$, we need to determine $G_k^{(1)}$, $k = 0, 1, \ldots, l_1 - 2$, and $H_k^{(1)}$, $k = 0, 1, \ldots, l_1 - 1$. Equation (8) may be written as
$$\left(I_n - z \sum_{k=0}^{n} P_{nk}\, G_k(z) G_{k+1}(z) \cdots G_{n-1}(z)\right) G_n(z) = z P_{n,n+1}, \quad 0 \le n \le l_1 - 2,$$
from which, differentiating and evaluating at $z = 1$,
$$\left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right) G_n^{(1)} - \left(\sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1} + \sum_{k=0}^{n-1} P_{nk} \sum_{i=k}^{n-1} G_k \cdots G_i^{(1)} \cdots G_{n-1}\right) G_n = P_{n,n+1}, \quad 0 \le n \le l_1 - 2.$$
Therefore, $G_n^{(1)}$, $n = 0, 1, \ldots, l_1 - 2$, are obtained as follows:
$$G_n^{(1)} = \left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right)^{-1} \left(\sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1} + \sum_{k=0}^{n-1} P_{nk} \sum_{i=k}^{n-1} G_k \cdots G_i^{(1)} \cdots G_{n-1}\right) G_n + \left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right)^{-1} P_{n,n+1}, \quad 0 \le n \le l_1 - 2. \qquad (17)$$
Similarly, we can obtain $H_n^{(1)}$, $n = 0, 1, \ldots, l_1 - 1$, by using Equation (12). Equation (12) may be written as
$$\left(I_n - z \sum_{k=0}^{n} P_{nk}\, G_k(z) \cdots G_{n-1}(z)\right) H_n(z) = z \left(P_{n\Delta} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} G_k(z) \cdots G_{k'-1}(z)\, H_{k'}(z)\right), \quad 0 \le n \le l_1 - 1,$$
from which, differentiating and evaluating at $z = 1$,
$$\left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right) H_n^{(1)} - \left(\sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1} + \sum_{k=0}^{n-1} P_{nk} \sum_{i=k}^{n-1} G_k \cdots G_i^{(1)} \cdots G_{n-1}\right) H_n$$
$$= P_{n\Delta} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} G_k G_{k+1} \cdots G_{k'-1}\, H_{k'} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} \left(\sum_{i=k}^{k'-1} G_k \cdots G_i^{(1)} \cdots G_{k'-1}\, H_{k'} + G_k \cdots G_{k'-1}\, H_{k'}^{(1)}\right), \quad 0 \le n \le l_1 - 1.$$
Therefore, $H_n^{(1)}$, $n = 0, 1, \ldots, l_1 - 1$, are obtained as follows:
$$H_n^{(1)} = \left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right)^{-1} \left(\sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1} + \sum_{k=0}^{n-1} P_{nk} \sum_{i=k}^{n-1} G_k \cdots G_i^{(1)} \cdots G_{n-1}\right) H_n$$
$$+ \left(I_n - \sum_{k=0}^{n} P_{nk}\, G_k \cdots G_{n-1}\right)^{-1} \left\{P_{n\Delta} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} G_k \cdots G_{k'-1}\, H_{k'} + \sum_{k=0}^{n-1} P_{nk} \sum_{k'=k}^{n-1} \left(\sum_{i=k}^{k'-1} G_k \cdots G_i^{(1)} \cdots G_{k'-1}\, H_{k'} + G_k \cdots G_{k'-1}\, H_{k'}^{(1)}\right)\right\}, \quad 0 \le n \le l_1 - 1. \qquad (18)$$
Since $W_j^{(1)} = E[W 1_{\{W = \tau_{C_j}\}}]$, $j = 1, \ldots, K$, we can obtain the conditional mean waiting times $E[W \mid W = \tau_{C_j}]$, $j = 1, \ldots, K$, from $E[W \mid W = \tau_{C_j}] = W_j^{(1)} / P(W = \tau_{C_j})$. We can also obtain the unconditional mean waiting time $E[W]$ from $E[W] = \sum_{j=1}^{K} W_j^{(1)}$. From these two formulas and (16), we obtain the following theorem.
Theorem 2.
The conditional and unconditional means of the sooner waiting time $W$ are given, respectively, by
$$\left(E[W \mid W = \tau_{C_1}], \ldots, E[W \mid W = \tau_{C_K}]\right) = \left(\frac{W_1^{(1)}}{P(W = \tau_{C_1})}, \ldots, \frac{W_K^{(1)}}{P(W = \tau_{C_K})}\right), \qquad E[W] = \sum_{j=1}^{K} W_j^{(1)},$$
where
$$(W_1^{(1)}, \ldots, W_K^{(1)}) = \sum_{k=0}^{l_1 - 1} \left( \sum_{i=0}^{k-1} G_0 G_1 \cdots G_{i-1}\, G_i^{(1)}\, G_{i+1} \cdots G_{k-1}\, H_k + G_0 G_1 \cdots G_{k-1}\, H_k^{(1)} \right),$$
with $G_k^{(1)}$, $k = 0, 1, \ldots, l_1 - 2$, and $H_k^{(1)}$, $k = 0, 1, \ldots, l_1 - 1$, given by (17) and (18), respectively.
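Formulas (17) and (18) are exact but intricate; a quick numerical sanity check of Theorem 2 is to differentiate $\Psi_j$ numerically. The following sketch (illustrative only; it assumes the psi_vector helper from the sketch after Theorem 1) approximates $W_j^{(1)}$ by a central difference:

```python
import numpy as np

def mean_waiting_times(P_blocks, P_delta, l1, h=1e-6):
    """Numerical check of Theorem 2: approximate W_j^(1) = d/dz Psi_j(z)
    at z = 1 by a central difference of psi_vector, then form the
    conditional means E[W | W = tau_{C_j}] and the unconditional mean E[W]."""
    dPsi = (psi_vector(1 + h, P_blocks, P_delta, l1)
            - psi_vector(1 - h, P_blocks, P_delta, l1)) / (2 * h)
    stop = psi_vector(1.0, P_blocks, P_delta, l1)   # stopping probabilities
    return (dPsi / stop).real, dPsi.sum().real      # conditional means, E[W]
```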

5. Numerical Examples

In this section, we present numerical results for the computations of the stopping probabilities, the probability mass functions (along with the tail probabilities) of the sooner waiting time, and the conditional/unconditional means of the sooner waiting time. To illustrate our results, we provide two examples.
Example 1.
Let $\{X_n, n \ge 1\}$ be a sequence of i.i.d. trials taking values in the finite set $A = \{a, b, c\}$. Assume that for $n = 1, 2, \ldots$,
$$p_a = P(X_n = a) = \tfrac{1}{3}, \quad p_b = P(X_n = b) = \tfrac{1}{3}, \quad p_c = P(X_n = c) = \tfrac{1}{3}.$$
Suppose that $K = 10$, i.e., the collection $\mathcal{C}$ consists of 10 patterns, $\mathcal{C} = \{C_1, \ldots, C_{10}\}$. We select the collection of patterns $\{C_1, \ldots, C_{10}\}$ as shown in Table 3, where the lengths of the patterns, $l_1, \ldots, l_{10}$, are chosen from the order statistics of i.i.d. random variables with mean 5. The set of patterns given in Table 3 is an example of a randomly selected pattern set in which no pattern is a subpattern of another; the procedure for randomly selecting such a pattern set is omitted here.
By Theorem 1, we can compute $(\Psi_1(z), \ldots, \Psi_K(z))$. Table 4 shows the stopping probabilities
$$P(W = \tau_{C_j}) = \Psi_j(1), \quad j = 1, \ldots, 10.$$
In Figure 1, we plot the joint probabilities $P(W \ge n, W = \tau_{C_j})$, $j = 1, 4, 7, 10$, as $n$ varies. These can be computed by numerical inversion of the generating function
$$\sum_{n=1}^{\infty} P(W \ge n, W = \tau_{C_j})\, z^n = \frac{z}{1-z}\left(\Psi_j(1) - \Psi_j(z)\right).$$
In Table 5, we present the probability mass function of $W$,
$$P(W = n) = \sum_{j=1}^{10} P(W = n, W = \tau_{C_j}),$$
and the tail probability of $W$,
$$P(W \ge n) = \sum_{j=1}^{10} P(W \ge n, W = \tau_{C_j}),$$
as $n$ varies. Here, $P(W = n, W = \tau_{C_j})$ can be computed by numerical inversion of its generating function:
$$\sum_{n=1}^{\infty} P(W = n, W = \tau_{C_j})\, z^n = \Psi_j(z).$$
By Theorem 2, we can compute the conditional means of the sooner waiting time $W$, $E[W \mid W = \tau_{C_j}]$, $j = 1, \ldots, K$, and the unconditional mean of $W$, $E[W]$. Table 6 shows the conditional and unconditional mean waiting times for Example 1.
The next example concerns Bernoulli trials.
Example 2.
Let $\{X_n, n \ge 1\}$ be a sequence of i.i.d. Bernoulli trials, i.e., $\{X_n, n \ge 1\}$ takes values in the finite set $A = \{0, 1\}$. Assume that for $n = 1, 2, \ldots$,
$$p_0 = P(X_n = 0) = \tfrac{1}{2}, \quad p_1 = P(X_n = 1) = \tfrac{1}{2}.$$
Suppose that the collection $\mathcal{C}$ consists of 5 patterns, $\mathcal{C} = \{C_1, \ldots, C_5\}$, where
$$C_1 = 1111111111111111, \quad C_2 = 0101010101010101, \quad C_3 = 001001001001, \quad C_4 = 00010001, \quad C_5 = 0000.$$
For Example 2, the joint probabilities $P(W \ge n, W = \tau_{C_j})$, $j = 1, \ldots, 5$, are shown in Figure 2. Also, the stopping probabilities $P(W = \tau_{C_j})$, $j = 1, \ldots, 5$, the probability mass function of $W$ (along with the tail probability) and the conditional/unconditional means of $W$ are shown in Tables 7, 8 and 9, respectively.
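The numbers in Table 7 can be checked with the earlier sketches. The following hypothetical glue code (assuming the build_chain and psi_vector helpers defined above) slices the level blocks out of $P$ and evaluates the stopping probabilities; the printed vector should agree with Table 7 up to numerical error.

```python
import numpy as np

patterns = ["1111111111111111", "0101010101010101",
            "001001001001", "00010001", "0000"]
states, P = build_chain(patterns, {"0": 0.5, "1": 0.5})
n_tr, K, l1 = len(states), len(patterns), len(patterns[0])
# rows of P belonging to each level k = 0, ..., l1 - 1
idx = {k: [r for r, (kk, _, _) in enumerate(states) if kk == k]
       for k in range(l1)}
P_blocks = {(n, k): P[np.ix_(idx[n], idx[k])]
            for n in range(l1) for k in range(min(n + 2, l1))}
P_delta = {n: P[np.ix_(idx[n], list(range(n_tr, n_tr + K)))]
           for n in range(l1)}
print(psi_vector(1.0, P_blocks, P_delta, l1).real)   # compare with Table 7
```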

6. Conclusions

We have derived the probability generating function of the sooner waiting time for a finite collection of patterns in a sequence of i.i.d. multi-state trials. From this probability generating function, we have obtained not only the stopping probabilities and the mean waiting time, but also the waiting time distribution by numerical inversion. As mentioned in the Introduction, our method can be extended to Markov dependent multi-state trials.
For further research, we will investigate the tail asymptotics of the sooner waiting time $W$. Figures 1 and 2 suggest that the distribution of $W$ has a geometric tail. This is true under a certain aperiodicity condition, because $W$ is the first passage time to a subset of the state space in a discrete time Markov chain with a finite state space. Under such an aperiodicity assumption, the distribution of $W$ exhibits geometric tail behavior, i.e.,
$$P(W \ge n) \sim c\,\sigma^n \quad \text{as } n \to \infty$$
for some $c > 0$ and $\sigma \in (0, 1)$. Here, "$\sim$" means that the limit of the ratio of the two sides is 1. It would be of interest to find explicit expressions for $c$ and $\sigma$. We also have the geometric tail behavior
$$P(W \ge n, W = \tau_{C_i}) \sim c_i\,\sigma^n \quad \text{as } n \to \infty$$
for some $c_i > 0$ and $\sigma \in (0, 1)$. Here, $\sigma$ is independent of $i$ and is the same as that described above. It would also be of interest to find explicit expressions for $c_i$, $i = 1, \ldots, K$.
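Numerically, $\sigma$ is accessible even without an explicit expression: for a finite Markov chain with an absorbing set, the geometric decay rate of the first passage time is the spectral radius of the transient block $P_{TT}$ (a standard Perron-Frobenius argument, stated here as an assumption rather than proved in the paper). A minimal sketch, assuming the states and P objects from the Section 3 sketch:

```python
import numpy as np

# transient block of P: all rows/columns except the absorbing level Delta
P_TT = P[:len(states), :len(states)]
sigma = max(abs(np.linalg.eigvals(P_TT)))   # decay rate of P(W >= n)
```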

Author Contributions

Conceptualization, B.K. and J.K. (Jeongsim Kim); investigation, B.K. and J.K. (Jeongsim Kim); methodology, B.K. and J.K. (Jeongsim Kim); numerical investigation, J.K. (Jerim Kim); writing–original draft preparation, J.K. (Jeongsim Kim); writing–review and editing, B.K., J.K. (Jeongsim Kim) and J.K. (Jerim Kim). All authors have read and agreed to the published version of the manuscript.

Funding

The first author’s research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2B5B01001864). The second author’s research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1F1A1A01065568).

Acknowledgments

We are grateful to the reviewers for their valuable comments and suggestions, which greatly improved this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kim, B.; Kim, J. Sooner waiting time problems in a sequence of multi-state trials with random rewards. Stat. Probab. Lett. 2019, 153, 171–179.
  2. Balakrishnan, N.; Koutras, M.V. Runs and Scans with Applications; Wiley: New York, NY, USA, 2002.
  3. Fu, J.C.; Lou, W.Y.W. Distribution Theory of Runs and Patterns and Its Applications: A Finite Markov Chain Imbedding Approach; World Scientific: Singapore, 2003.
  4. Fu, J.C.; Koutras, M.V. Distribution theory of runs: A Markov chain approach. J. Am. Stat. Assoc. 1994, 89, 1050–1058.
  5. Fu, J.C. Reliability of large consecutive-k-out-of-n: F systems with (k−1)-step Markov dependence. IEEE Trans. Reliab. 1986, R-35, 602–606.
  6. Fu, J.C. Distribution theory of runs and patterns associated with a sequence of multi-state trials. Stat. Sin. 1996, 6, 957–974.
  7. Li, S.-Y.R. A martingale approach to the study of occurrence of sequence patterns in repeated experiments. Ann. Probab. 1980, 8, 1171–1176.
  8. Gerber, H.U.; Li, S.-Y.R. The occurrence of sequence patterns in repeated experiments and hitting times in a Markov chain. Stoch. Process. Their Appl. 1981, 11, 101–108.
  9. Guibas, L.J.; Odlyzko, A.M. String overlaps, pattern matching, and nontransitive games. J. Comb. Theory Ser. A 1981, 30, 183–208.
  10. Blom, G.; Thorburn, D. How many random digits are required until given sequences are obtained? J. Appl. Probab. 1982, 19, 518–531.
  11. Antzoulakos, D.L. Waiting times for patterns in a sequence of multistate trials. J. Appl. Probab. 2001, 38, 508–518.
  12. Han, Q.; Hirano, K. Sooner and later waiting time problems for patterns in Markov dependent trials. J. Appl. Probab. 2003, 40, 73–86.
  13. Fu, J.C.; Chang, Y.M. On probability generating functions for waiting time distributions of compound patterns in a sequence of multistate trials. J. Appl. Probab. 2002, 39, 70–80.
  14. Glaz, J.; Kulldorff, M.; Pozdnyakov, V.; Steele, J.M. Gambling teams and waiting times for patterns in two-state Markov chains. J. Appl. Probab. 2006, 43, 127–140.
  15. Pozdnyakov, V. On occurrence of patterns in Markov chains: Method of gambling teams. Stat. Probab. Lett. 2008, 78, 2762–2767.
  16. Gava, R.J.; Salotti, D. Stopping probabilities for patterns in Markov chains. J. Appl. Probab. 2014, 51, 287–292.
  17. Zhao, M.-Z.; Xu, D.; Zhang, H.-Z. Waiting times and stopping probabilities for patterns in Markov chains. Appl. Math. J. Chin. Univ. 2018, 33, 25–34.
  18. Kerimov, A.; Öner, A. Oscillation properties of expected stopping times and stopping probabilities for patterns consisting of consecutive states in Markov chains. Rocky Mt. J. Math. Available online: https://projecteuclid.org/euclid.rmjm/1588298550 (accessed on 1 May 2020).
  19. Chang, Y.M. Distribution of waiting time until the rth occurrence of a compound pattern. Stat. Probab. Lett. 2005, 75, 29–38.
  20. Neuts, M.F. Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach; Johns Hopkins University Press: Baltimore, MD, USA, 1981.
  21. Neuts, M.F. Structured Stochastic Matrices of M/G/1 Type and Their Applications; Marcel Dekker: New York, NY, USA, 1989.
  22. Latouche, G.; Ramaswami, V. Introduction to Matrix Analytic Methods in Stochastic Modeling; ASA-SIAM Series on Statistics and Applied Probability; SIAM: Philadelphia, PA, USA, 1999.
  23. Abate, J.; Whitt, W. Numerical inversion of probability generating functions. Oper. Res. Lett. 1992, 12, 245–251.
Figure 1. Plots of the joint probabilities $P(W \ge n, W = \tau_{C_j})$, $j = 1, 4, 7, 10$, for Example 1.
Figure 2. Plots of the joint probabilities $P(W \ge n, W = \tau_{C_j})$, $j = 1, \ldots, 5$, for Example 2.
Table 1. Sample paths of $\tilde N_t$ and $\tilde J_t$ corresponding to the sample path of $X_t$: b c a a b b b a a a b b b b c.

t    | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
X_t  |   | b | c | a | a | b | b | b | a | a | a  | b  | b  | b  | b  | c
Ñ_t  | 0 | 0 | 0 | 1 | 2 | 3 | 0 | 0 | 1 | 2 | 3  | 4  | 5  | Δ  | Δ  | Δ
J̃_t  | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1  | 1  | 1  | 1  | 1  | 1
Table 2. Sample paths of $\tilde N_t$ and $\tilde J_t$ corresponding to the sample path of $X_t$: a b a b a c c a a c a a b a a.

t    | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
X_t  |   | a | b | a | b | a | c | c | a | a | c  | a  | a  | b  | a  | a
Ñ_t  | 0 | 1 | 2 | 1 | 2 | 1 | 0 | 0 | 1 | 2 | 0  | 1  | 2  | 3  | Δ  | Δ
J̃_t  | 1 | 1 | 3 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 1  | 1  | 1  | 2  | 2  | 2
Table 3. The patterns used in Example 1.

C_1  = bbcabbccbacc
C_2  = cbcbccbbb
C_3  = cbbaacbcc
C_4  = abaccba
C_5  = bacabb
C_6  = cbcab
C_7  = acaab
C_8  = caaca
C_9  = aaca
C_10 = aaac
Table 4. The stopping probabilities P(W = τ_{C_j}), j = 1, ..., 10, for Example 1.

j  | P(W = τ_{C_j})
1  | 6.3509 × 10^{-5}
2  | 1.7152 × 10^{-3}
3  | 1.7152 × 10^{-3}
4  | 1.4655 × 10^{-2}
5  | 4.6715 × 10^{-2}
6  | 1.3893 × 10^{-1}
7  | 9.3671 × 10^{-2}
8  | 1.2852 × 10^{-1}
9  | 1.5249 × 10^{-1}
10 | 4.2152 × 10^{-1}
Table 5. The probability mass function P(W = n) and tail probability P(W ≥ n) for Example 1.

n   | P(W = n)         | P(W ≥ n)
5   | 2.8807 × 10^{-2} | 9.7531 × 10^{-1}
10  | 2.6300 × 10^{-2} | 8.3422 × 10^{-1}
15  | 2.2424 × 10^{-2} | 7.1069 × 10^{-1}
20  | 1.9103 × 10^{-2} | 6.0542 × 10^{-1}
30  | 1.3863 × 10^{-2} | 4.3936 × 10^{-1}
40  | 1.0060 × 10^{-2} | 3.1884 × 10^{-1}
50  | 7.3009 × 10^{-3} | 2.3138 × 10^{-1}
60  | 5.2983 × 10^{-3} | 1.6792 × 10^{-1}
70  | 3.8450 × 10^{-3} | 1.2186 × 10^{-1}
80  | 2.7903 × 10^{-3} | 8.8432 × 10^{-2}
90  | 2.0249 × 10^{-3} | 6.4175 × 10^{-2}
100 | 1.4695 × 10^{-3} | 4.6572 × 10^{-2}
200 | 5.9533 × 10^{-5} | 1.8867 × 10^{-3}
Table 6. The conditional means E[W | W = τ_{C_j}], j = 1, ..., 10, and the unconditional mean E[W] for Example 1.

(a) Conditional means
j  | E[W | W = τ_{C_j}]
1  | 41.7543
2  | 40.0476
3  | 40.0476
4  | 40.6783
5  | 41.0372
6  | 36.3888
7  | 35.0704
8  | 35.6263
9  | 29.7715
10 | 31.4236

(b) Unconditional mean
E[W] | 33.3582
Table 7. The stopping probabilities P(W = τ_{C_j}), j = 1, ..., 5, for Example 2.

j | P(W = τ_{C_j})
1 | 2.1412 × 10^{-4}
2 | 2.9831 × 10^{-4}
3 | 4.7235 × 10^{-3}
4 | 5.5265 × 10^{-2}
5 | 9.3950 × 10^{-1}
Table 8. The probability mass function P(W = n) and tail probability P(W ≥ n) for Example 2.

n   | P(W = n)         | P(W ≥ n)
5   | 3.1250 × 10^{-2} | 9.3750 × 10^{-1}
10  | 3.0273 × 10^{-2} | 7.7734 × 10^{-1}
15  | 2.4963 × 10^{-2} | 6.3660 × 10^{-1}
20  | 2.0452 × 10^{-2} | 5.2110 × 10^{-1}
30  | 1.3705 × 10^{-2} | 3.4916 × 10^{-1}
40  | 9.1827 × 10^{-3} | 2.3396 × 10^{-1}
50  | 6.1528 × 10^{-3} | 1.5676 × 10^{-1}
60  | 4.1227 × 10^{-3} | 1.0504 × 10^{-1}
70  | 2.7624 × 10^{-3} | 7.0380 × 10^{-2}
80  | 1.8509 × 10^{-3} | 4.7158 × 10^{-2}
90  | 1.2402 × 10^{-3} | 3.1598 × 10^{-2}
100 | 8.3100 × 10^{-4} | 2.1172 × 10^{-2}
200 | 1.5158 × 10^{-5} | 3.8620 × 10^{-4}
Table 9. The conditional means E[W | W = τ_{C_j}], j = 1, ..., 5, and the unconditional mean E[W] for Example 2.

(a) Conditional means
j | E[W | W = τ_{C_j}]
1 | 38.6231
2 | 38.7707
3 | 36.8873
4 | 37.4396
5 | 26.7334

(b) Unconditional mean
E[W] | 27.3792