1. Introductory Notes
Consider a stochastic system that has a population of members categorized in different states. Three types of movements are possible in the system. Firstly, members may move among the states of the system; secondly, members leave the system from the various states; and thirdly, new members enter the system to replace leavers and to expand the population. When the various movements of the system are modeled by a nonhomogeneous Markov chain, we call such a system a nonhomogeneous Markov system (NHMS).
An NHMS is actually a generalization of the classical Markov chain, in which a single particle moves among the states without the possibility of leaving the system and possibly being replaced by another particle with different characteristics. Of great importance is the vector of absolute probabilities, which consists of the probabilities of the particle being in each state of the Markov chain. In an NHMS, by contrast, we have a population of particles categorized according to their characteristics in the various states. Particles leave the population from all the states, and new particles enter the population to replace them and to expand the population. Of great importance is the vector of the expected relative population structure. Hence, the problems to be solved are considerably harder, and new strategies and tools are needed beyond those of the simple Markov chain. The roots of the motivation for the development of the theory of NHMS, which was first introduced in Vassiliou [1], lie in the use of Markov models in manpower systems. This started with the work of Young and Almond [2], Young [3,4], and Bartholomew [5,6], and it was extended in the works of Young and Vassiliou [7], Vassiliou [8,9], and McClean [10,11]. In the book by Vassiliou [12], one can find the evolution of the theory of NHMS and the large diversity of its developments in various directions, that is, NHMS in discrete and continuous time, stochastic control in NHMS, laws of large numbers for NHMS, perturbation theory, NHMS in a stochastic environment, Markov systems, and others. In Section 5.4 of [12], there is a synopsis of real and potential applications of NHMS which illustrates the breadth of applications and some of the reasons why the entire theory is central to these processes. Work on manpower planning using results from NHMS and new areas has continued up to the present day; see, for example, Garg et al. [13,14,15], Ugwugo and McClean [16], Vassiliadis [17,18], Georgiou et al. [19], Guerry [20,21], Pollard [22], and Esquivel et al. [23,24,25].
In Section 2 of the present study, we define and describe the NHMS in discrete time and space in a compact but hopefully readable way for the reader who comes into contact with these processes for the first time. We also state the expected relative population structure in the various states as a function of the parameters of the population that could be estimated from the available data. In Section 3, we start with some basic definitions of concepts and mathematical tools, as well as useful known results that will be used in what follows. The novel part of this section is Theorem 1, where we study strong ergodicity for NHMS by relaxing the basic assumption present in all studies of strong ergodicity for NHMS. That is, we will not assume that the inherent inhomogeneous Markov chain converges, as time goes to infinity, to a homogeneous Markov chain with a regular (consisting of one communicating class of states, which is aperiodic) transition probability matrix or, equivalently, that the inhomogeneous Markov chain is strongly ergodic. In Section 4, the novel part is Theorem 4, where we prove under what conditions the rate of convergence to strong ergodicity in an NHMS is geometrically fast. This is an important question in NHMS due to its large practical value. In Theorem 4, we relax the basic assumption present in all studies of the rate of convergence to the asymptotic behavior of an NHMS. That is, we will not assume that the inherent inhomogeneous Markov chain converges, as time goes to infinity, to a homogeneous Markov chain with a regular transition probability matrix geometrically fast. Finally, in Section 5, we provide an illustrative application from the area of manpower planning.
2. The NHMS in Discrete Time and Space
Let a population consist of any kind of entities, and let us denote by T(t), for t = 0, 1, 2, …, the total number of memberships at time t, that is, at the end of the interval (t − 1, t], which are being held by its members. At every point of time that a member leaves the population, the membership is transferred to a new member. For example, members could be patients in a hospital and memberships the beds they occupy. It is assumed that the total number of memberships is known, or that the sequence {T(t)}_{t=0}^∞ is a realization of a known stochastic process, depending on the application. The memberships are distributed over a finite number of states, and let S = {1, 2, …, k} be the state space. An important aspect of an NHMS (see Vassiliou [12], Section 5.2) is the population structure, that is, the vector of random variables N(t) = (N_1(t), N_2(t), …, N_k(t)), where N_i(t) is the number of memberships in state i at time t. Also very important is the relative population structure, which is the vector of random variables q(t) = N(t)/T(t). We denote by P(t) = {p_ij(t)}_{i,j ∈ S} the transition probability matrix of the internal transitions of the members of the population during the interval (t − 1, t], that is, the t-th interval. Also, we have probable leavers from the states in S in every time interval t, and let us denote the vector of loss probabilities by p_{k+1}(t) = (p_{1,k+1}(t), p_{2,k+1}(t), …, p_{k,k+1}(t)), where the state k + 1 represents the external environment. Finally, we have new entrants of memberships to the population in order to replace leavers and to expand the population. Let us collect the probabilities of allocation of the memberships to the various states in the t-th interval in the stochastic vector p_0(t) = (p_{01}(t), p_{02}(t), …, p_{0k}(t)), where the state 0 represents the new entrants with their memberships waiting to be allocated to the various states. Note that in what follows we assume that ΔT(t) = T(t) − T(t − 1) ≥ 0.
The transition probability matrix of the memberships Q(t) in the t-th interval can be found (Vassiliou [12], p. 193), and it is given by

Q(t) = P(t) + p′_{k+1}(t) p_0(t).   (1)

We call the inhomogeneous Markov chain defined uniquely by the sequence of transition probability matrices {Q(t)}_{t=1}^∞ the embedded or inherent nonhomogeneous Markov chain of the NHMS. A population or any physical phenomenon that can be modeled in the above-described way is defined to be a nonhomogeneous Markov system. Now, as previously, we define the relative population structure for a population which started at time s and is at time t to be q(s, t) = N(s, t)/T(t), where N(s, t) is the population structure of the population. It can be proved (Georgiou and Vassiliou [26] and Vassiliou [12], p. 195) that

E[q(s, t)] = (T(t − 1)/T(t)) E[q(s, t − 1)] Q(t) + (ΔT(t)/T(t)) p_0(t),   (2)

where s is the initial time and, therefore, q(s, s) = q(s) is the initial relative population structure, which is known; q(s, t) is the relative population structure at time t for the system that started with initial relative population structure q(s), which apparently is a random variable, and we denote by E[q(s, t)] its expected value.

From (2), recursively, we obtain

E[q(s, t)] = (T(s)/T(t)) q(s) Q(s, t) + (1/T(t)) Σ_{τ=s+1}^{t} ΔT(τ) p_0(τ) Q(τ, t),   (4)

where

Q(τ, t) = Q(τ + 1) Q(τ + 2) ⋯ Q(t), with Q(t, t) = I.
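The recursion for the expected relative population structure can be sketched numerically. The following is a minimal sketch, not from the source: the internal matrix, loss probabilities, allocation vector, and membership totals are all hypothetical, and the code only illustrates how the inherent matrix Q(t) is assembled and how the expected relative structure is propagated from one interval to the next.

```python
import numpy as np

def embedded_matrix(P, p_loss, p0):
    """Q(t) = P(t) + p'_{k+1}(t) p_0(t): leavers are replaced
    according to the allocation vector p_0(t)."""
    return P + np.outer(p_loss, p0)

def expected_relative_structure(q0, T, P_seq, loss_seq, p0_seq):
    """Iterate E[q(t)] = (T(t-1)/T(t)) E[q(t-1)] Q(t) + (dT(t)/T(t)) p_0(t)."""
    q = np.asarray(q0, dtype=float)
    for t in range(1, len(T)):
        Q = embedded_matrix(P_seq[t - 1], loss_seq[t - 1], p0_seq[t - 1])
        dT = T[t] - T[t - 1]
        q = (T[t - 1] / T[t]) * q @ Q + (dT / T[t]) * np.asarray(p0_seq[t - 1])
    return q

# Hypothetical two-state example
P = np.array([[0.7, 0.2],
              [0.1, 0.8]])          # internal transitions P(t)
pl = np.array([0.1, 0.1])           # loss probabilities p_{k+1}(t)
p0 = np.array([0.6, 0.4])           # allocation of new memberships p_0(t)
T = [100, 110, 120, 130]            # expanding totals, dT(t) >= 0

q = expected_relative_structure([0.5, 0.5], T, [P] * 3, [pl] * 3, [p0] * 3)
print(q, q.sum())                   # q remains a stochastic vector
```

Note that each row of Q(t) sums to one by construction, so the iteration keeps E[q(s, t)] a stochastic vector, exactly as the recursion requires.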
3. Strong Ergodicity in NHMS with Chronological Order
The asymptotic behavior of NHMS and of nonhomogeneous Markov chains, as well as of homogeneous Markov chains, has been one of the central problems for many years, as can be seen in Refs. [12,27,28,29,30,31]. The asymptotic behavior of NHMS started with Vassiliou [1,32], and an updated evolution of these theorems and their variants can be found in Vassiliou [12]. In the present section, we provide and prove a basic theorem for strong ergodicity in NHMS when the transition probability matrices of the inherent Markov chain {Q(t)}_{t=1}^∞ are given in chronological order; that is, it is assumed that the time order of the elements of the sequence is given and will not be changed. In Theorem 1, we relax the basic assumption present in all studies of asymptotic behavior for NHMS. That is, we will not assume that the inherent inhomogeneous Markov chain converges, as time goes to infinity, to a homogeneous Markov chain with a regular (consisting of one communicating class of states which is aperiodic) transition probability matrix or, equivalently, that the inhomogeneous Markov chain is strongly ergodic. We start with some basic definitions and results, which will be useful in what follows.
For what follows, we assume a complete probability space and consider an NHMS in discrete time and space. Therefore, we will not repeat this in every Definition, Lemma, Proposition, or Theorem.
Definition 1. We say that the NHMS is strongly ergodic if and only if there exists a stochastic vector ψ such that

lim_{t→∞} ||E[q(s, t)] − ψ|| = 0,   (5)

where, from now on, ||·|| is any vector norm, unless otherwise stated.

Definition 2. If P = {p_ij}_{i,j ∈ S} is any finite stochastic matrix with state space S = {1, 2, …, k}, then the well-known Dobrushin ergodicity coefficient is given by

τ_1(P) = (1/2) max_{i,j ∈ S} Σ_{m=1}^{k} |p_im − p_jm|.   (6)

We now define another class of ergodicity coefficients, which are generated by different norms.

Definition 3 (Seneta [27]). Let H be the set of 1 × k stochastic vectors, and let d be any metric on this set. Then, the quantity

τ_d(P) = sup_{x,y ∈ H, x ≠ y} d(xP, yP)/d(x, y),   (7)

for any stochastic matrix P, is called a coefficient of ergodicity.

Remark 1. Any vector norm ||·|| on ℝ^{1×k} provides an appropriate metric on H via d(x, y) = ||x − y||. Then, for any stochastic matrix P, we get

τ_{||·||}(P) = sup_{x,y ∈ H, x ≠ y} ||xP − yP||/||x − y||.   (8)

We call τ_{||·||}(P) the coefficient of ergodicity induced by the norm ||·||. It is proved that (8) can be written equivalently (Vassiliou [12], p. 118) as follows:

τ_{||·||}(P) = sup{||zP|| : z ∈ ℝ^{1×k}, z1′ = 0, ||z|| = 1}.   (9)

When the norm ||·||_1 is used, that is, for a 1 × k vector x with elements from ℝ,

||x||_1 = Σ_{i=1}^{k} |x_i|,   (10)

we obtain that

τ_{||·||_1}(P) = τ_1(P),   (11)

the Dobrushin ergodicity coefficient of Definition 2. We now give the definitions of strong and weak ergodicity for a nonhomogeneous Markov chain:

Definition 4. Consider an inhomogeneous Markov chain {X_t}_{t≥0} in discrete time and space. We say that {X_t}, with a sequence of transition probability matrices {P(t)}_{t=1}^∞, is strongly ergodic if there exists a stable stochastic matrix Π such that, for every t,

lim_{n→∞} ||P(t, t + n) − Π|| = 0,   (12)

where P(t, t + n) = P(t + 1)P(t + 2)⋯P(t + n). If the limit is zero uniformly in t, we say that {X_t}, or equivalently {P(t)}_{t=1}^∞, is uniformly strongly ergodic.

Definition 5. Consider an inhomogeneous Markov chain {X_t}_{t≥0} in discrete time and space. We say that {X_t}, with sequence of transition probability matrices {P(t)}_{t=1}^∞, is weakly ergodic if, for all states i, j, m,

lim_{n→∞} [p_im(t, t + n) − p_jm(t, t + n)] = 0.   (13)

Note that in (13), the limit of p_im(t, t + n) as n → ∞ is not actually necessary to exist.

Remark 2. Note that, equivalently, a nonhomogeneous Markov chain is weakly ergodic if lim_{n→∞} τ_1(P(t, t + n)) = 0 for every t.
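As a quick numerical companion to Definition 2, the Dobrushin coefficient of a small stochastic matrix can be computed directly from its rows. The matrix below is hypothetical; the second print illustrates the product (submultiplicativity) behavior of ergodicity coefficients on a concrete example.

```python
import numpy as np

def dobrushin(P):
    """tau_1(P): half the maximum l1 distance between any two rows of P."""
    P = np.asarray(P, dtype=float)
    k = P.shape[0]
    return 0.5 * max(np.abs(P[i] - P[j]).sum()
                     for i in range(k) for j in range(k))

P = np.array([[0.5, 0.5, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.5, 0.5]])
print(dobrushin(P))                       # 0.5: the first and third rows differ most
print(dobrushin(P @ P), dobrushin(P)**2)  # coefficient of the product vs product of coefficients
```

For this matrix the coefficient of the two-step product does not exceed the square of the one-step coefficient, mirroring the behavior that makes such coefficients useful for products of inhomogeneous transition matrices.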
We now state the following lemma, the proof of which exists in Vassiliou ([12], p. 119).

Lemma 1. The coefficient of ergodicity τ_d generated by any metric d as in (7), or induced by any vector norm on ℝ^{1×k}, has the following properties:
- (1) 0 ≤ τ_d(P) ≤ 1 for any stochastic matrix P;
- (2) τ_d(P_1 P_2) ≤ τ_d(P_1) τ_d(P_2) for any stochastic matrices P_1 and P_2.

The following lemma, the proof of which can be found in Paz [33], is useful in what follows:

Lemma 2. If P is a stochastic matrix and if E is any real matrix such that E1′ = 0′ and, in addition, ||E||_1 < ∞, then

||EP||_1 ≤ ||E||_1 τ_1(P).

We now state and prove one of the basic theorems of the present paper.
Theorem 1. Let there be the complete probability space and consider an NHMS in discrete time and space. We assume thatLet be the inherent nonhomogeneous Markov chain of the movement of memberships. If is weakly ergodic with , the sequence of transition probabilities, and in addition, there exists a stochastic vectorwhich is the left eigenvector of for , that isandthen the NHMS is strongly ergodic. Proof. From (
17) we obtain that there exists a stochastic vector
such that
From (
16) and (
18), we obtain that there exists a stochastic matrix
such that
Denote by
to be the stable matrix with row the stochastic vector
and
the stable matrix with row
. Then, we have
We now show that
We have that
Taking norms on (
22) we obtain that
We now have
where
is less than one due to weak ergodicity of
the inherent nonhomogeneous Markov chain. Also for fixed
r, we can always choose
such that
We now have that
Now, similarly, we obtain
and using this equation recursively, we obtain
From condition (
16), we obtain that
and
Therefore, from (
28)–(
30), as well as, Lemma 2, we have that
due to condition (
17). Since
for every
r and
and it is a sum of positive numbers, we have that its tail goes to zero. Hence, we could, for every
, fix
such that
for
and we can always take a
large enough so that
and
therefore, we obtain that
Now, from (
4), we have that
Now, we have
By fixing
, we can always find
such that
Now, for
from (
37), we have that
where the second part of (
38) for
is less than
due to condition (
14). The first part is also for,
, less than
since
due to condition (
14). From (
34) and (
36), and a
large enough
, we easily see that
hence, the NHMS is strongly ergodic. □
4. Rate of Convergence in NHMS with Chronological Order
An important question in nonhomogeneous Markov chains and NHMS, due to its very large practical value, is the rate of convergence to their asymptotic structure. In fact, it is important to find the necessary conditions under which the rate of convergence is geometric, because then the value of the asymptotic result is greater. The roots of the study of finding conditions under which the rate of convergence is geometric for nonhomogeneous Markov chains are in Huang et al. [34,35] and Seneta [31]. The study of the geometric rate of convergence in NHMS started in Vassiliou and Tsaklidis [36] and Georgiou and Vassiliou [26], and an updated evolution of these theorems and their variants can be found in Vassiliou [12]. The importance of answering this problem for practical purposes is apparent in Bartholomew [37] for the homogeneous Markov system, which is a very special case of an NHMS. In the present section, we provide and prove a basic theorem for the rate of convergence to strong ergodicity in NHMS when the transition probability matrices of the inherent Markov chain {Q(t)}_{t=1}^∞ are given in chronological order. In Theorem 4, we relax the basic assumption present in all studies of the rate of convergence to the asymptotic behavior of an NHMS. That is, we will not assume that the inherent inhomogeneous Markov chain converges, as time goes to infinity, to a homogeneous Markov chain with a regular transition probability matrix geometrically fast. We start with some basic definitions and results which will be useful in what follows.
Definition 6. We say that a sequence of matrices {A(t)}_{t=0}^∞ converges with geometrical rate to a matrix A if there exist constants c > 0 and 0 < β < 1 such that

||A(t) − A|| ≤ c β^t for every t.

Lemma 3 (Vassiliou and Tsaklidis [36]). The following statements are equivalent:
- (i) The sequence {ΔT(t)}_{t=1}^∞ converges to zero geometrically fast.
- (ii) The sequence {T(t)}_{t=0}^∞ converges to T geometrically fast.
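Lemma 3 can be illustrated with a toy sequence of totals (the numbers below are hypothetical): when T(t) approaches its limit geometrically, the increments ΔT(t) also vanish geometrically, and the consecutive error ratios of both sequences settle at the same β.

```python
# Hypothetical membership totals: T(t) = 1000 - 500 * 0.8**t increases to T = 1000
beta = 0.8
T = [1000 - 500 * beta ** t for t in range(30)]

err = [abs(1000 - x) for x in T]              # |T(t) - T| = 500 * beta**t
dT = [T[t] - T[t - 1] for t in range(1, 30)]  # = 500 * (1 - beta) * beta**(t - 1)

# consecutive ratios of both error sequences equal beta, i.e. both are geometric
print(err[6] / err[5], dT[6] / dT[5])         # both approximately 0.8
```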
Definition 7. A stochastic matrix P is called Markov if at least one column of P is entirely positive. Let M be the set of all Markov matrices. We say that the stochastic matrix P belongs to R, the set of all regular matrices, if P^n ∈ M for some n. We say that the stochastic matrix P belongs to G, the set of all scrambling matrices, if τ_1(P) < 1.

Remark 3. The distinction of the set R from the set of all stochastic regular matrices is due to the fact that the product of two regular matrices is not always regular. In addition, the product of two nonregular stochastic matrices could be regular. A practical way to check whether a stochastic matrix of small dimension is scrambling is the following: given any two rows i and j, there is at least one column m such that p_im > 0 and p_jm > 0.

Definition 8. The incidence matrix of a stochastic matrix P is the matrix obtained by placing the number 1 in the positions of the positive elements of P and 0 elsewhere. Therefore, two stochastic matrices P_1 and P_2 of the same dimension have the same incidence matrix if they have their positive elements in the same positions. Then, we write P_1 ~ P_2.
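The practical check in Remark 3, together with the definition of a Markov matrix, can be coded directly. The example matrix is hypothetical; it shows that a scrambling matrix need not be Markov, whereas every Markov matrix is scrambling, since any two of its rows meet in the entirely positive column.

```python
import numpy as np

def is_scrambling(P):
    """Every pair of rows shares at least one column with positive entries."""
    P = np.asarray(P)
    k = P.shape[0]
    return all(np.any((P[i] > 0) & (P[j] > 0))
               for i in range(k) for j in range(i + 1, k))

def is_markov(P):
    """At least one column of P is entirely positive."""
    return bool(np.any(np.all(np.asarray(P) > 0, axis=0)))

A = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
print(is_scrambling(A), is_markov(A))   # True, False: scrambling but not Markov
```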
We now state some known lemmas and theorems, the proofs of which can be found in Vassiliou ([12], p. 143).

Theorem 2. For all stochastic matrices P, we have τ_1(P) ≤ 1, with τ_1(P) < 1 if and only if P ∈ G.

Lemma 4. If P_1, P_2, …, P_r ∈ R with P_1 ~ P_2 ~ ⋯ ~ P_r, then P_1 P_2 ⋯ P_r ∈ M for r greater than or equal to the number of distinct incidence matrices corresponding to matrices with the same dimension as the P_i.
Theorem 3. Let there be a complete probability space, and consider a nonhomogeneous Markov chain in discrete time and space with a sequence of transition probability matrices {P(t)}_{t=1}^∞. If P(t) ∈ G for every t, and the positive elements of P(t) are bounded from below uniformly for all t, then weak ergodicity obtains at a uniform geometric rate. We now define the geometrically strongly ergodic NHMS, which is a central concept in the present section.

Definition 9. Let there be a complete probability space, and consider an NHMS in discrete time and space. We say that the NHMS is geometrically strongly ergodic if there exists a stochastic vector ψ and constants c > 0 and 0 < β < 1 such that

||E[q(s, t)] − ψ|| ≤ c β^t for every t.

We now state and prove the basic theorem of this section.
Theorem 4. Let there be a complete probability space, and consider an NHMS in discrete time and space. We assume that the total number of memberships is increasing (ΔT(t) ≥ 0) and that it converges geometrically fast to T; that is, there exist constants c_0 > 0 and 0 < β_0 < 1 such that

|T(t) − T| ≤ c_0 β_0^t.

Let {Q(t)}_{t=1}^∞ be the inherent nonhomogeneous Markov chain of the movement of memberships. If the inherent chain is weakly ergodic, with the sequence of transition probabilities {Q(t)}_{t=1}^∞ satisfying the conditions of Theorem 3, and if, in addition, there exists a stochastic vector ψ(t) which is the left eigenvector of Q(t) for the eigenvalue 1, that is,

ψ(t) Q(t) = ψ(t),

and, in addition, ψ(t) converges geometrically fast to a stochastic vector ψ, then the NHMS is geometrically strongly ergodic. Proof. In order to prove that the NHMS is geometrically strongly ergodic, we must show that the expected relative population structure satisfies Definition 9.
From (
46), we have that there exists a vector
and constants
and
such that
We define
and
as the stable matrices with rows
and
, respectively.
We now show that the inherent nonhomogeneous Markov chain
is geometrically strongly ergodic. That is, we need to show that there exist constants
and
such that
Let us denote by
the number of distinct incidence matrices corresponding to
with the same dimension as
. Then, for
, we have that
Now, we have that
For arbitrary but fixed
t and
with
with
m, the largest such integer from (50), we have that
Since
and by Lemma 4 we have that
and
From the weak ergodicity of the inherent nonhomogeneous Markov chain
, we have that
From (
51)–(
53), we arrive at
which as
goes to zero at a uniform geometric rate.
Following the steps of arriving at relation (
31), we straightforwardly obtain that
We now have the largest integer for
m, such that
From (
56), we have that
From (
55), we have that
Now, we have that
since
. Also, from condition (
46), we have that
Therefore,
Now, from (
49), (
54) and (
59), and condition (
46), we easily have that
hence, there exist constants
and
such that
What remains, according to Definition 9, is to show that the expected relative population structure
converges for every
t to the vector
as
goes to infinity geometrically fast.
From (
35), we have that
From (
61), we have that
with
and
From Lemma 3, we have that since
converges geometrically fast to
T, then
converges geometrically fast to zero. Hence,
Now, from (
37), we have that
With no loss of generality, we may assume that
, and then, from (
63) and (
64), we have that
From (
63) and (
66), we easily arrive at the proof of the theorem. □
5. An Illustrative Application
We will illustrate the results in the previous section with an example from a population of manpower. To visualize the setting better, the reader may have in mind a University system with three grades. That is, grade one is that of the Professors, grade two is that of the Associate Professors, and, finally, grade three is that of the Assistant Professors. The University has a plan for funding memberships for the next few years. When a member of staff leaves, their membership remains with the University; that is, the funding of their position is not lost but remains, and the University can go on to appoint someone at any grade. The external environment, to which the leavers go while their memberships are retained by the University, is the population of members of academic staff from Universities almost all over the world, as practice has shown. Hence, it is from this external environment that the new members who obtain the available memberships of the University will come.
A fundamental question for the practitioner is the estimation of the transition probabilities from the historical records. The way to estimate the transition probabilities of the memberships is a small extension of the way it is performed in manpower planning, which is well documented in Bartholomew [38]. Another quite similar problem is that of the competing risks model in the medical literature, as presented by Kalbfleisch and Prentice ([39], Chapter 8), Lee [40], and Cox and Oakes [41]. Also, similar problems exist in the study of reliability models and various actuarial studies, as discussed in Elandt-Johnson and Johnson ([42], Chapter 7).
Now define by
n_ij(t): the number of memberships moving from grade i to grade j in the t-th interval;
N_i(t): the number of memberships in state i at the beginning of the t-th interval.
Assume that the number of years available in the historical data of the University is n. Then, the maximum likelihood estimate of the probability p_ij(t) is the following:

p̂_ij(t) = n_ij(t)/N_i(t).   (67)
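The ratio estimator n_ij(t)/N_i(t) is immediate to compute; the flow counts and stocks below are hypothetical.

```python
# Hypothetical flow counts n_ij(t) and starting stocks N_i(t) for one interval
n = {(1, 1): 80, (1, 2): 10, (2, 1): 5, (2, 2): 85}   # flows grade i -> grade j
N = {1: 100, 2: 100}                                  # N_i(t) at start of interval

# maximum likelihood estimates p_hat_ij(t) = n_ij(t) / N_i(t)
p_hat = {(i, j): n[(i, j)] / N[i] for (i, j) in n}
print(p_hat[(1, 2)])   # 0.1
```

The row sums of the estimates fall short of one by exactly the estimated wastage probability of that grade, which is consistent with the loss probabilities p_{i,k+1}(t) of the model.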
It is an apparent advantage that the probabilities p_ij(t) are separately estimated for every t = 1, 2, …, n. In this way, the number of years of historical records necessary is significantly reduced. At this point, it is useful to test the hypothesis that the probabilities p_ij(t) are indeed functions of time. That is,

H_0 : p_ij(t) = p_ij for every t = 1, 2, …, n.   (68)

Considering the flow of memberships which move from grade i to grade j as a multinomial random variable, then (see Anderson and Goodman [43]) hypothesis (68) is tested by the statistic:

χ² = Σ_{t=1}^{n} Σ_{j} N_i(t) [p̂_ij(t) − p̂_ij]² / p̂_ij,   (69)

where p̂_ij is the maximum likelihood estimate under the null hypothesis, and (69) is asymptotically chi-square distributed with the corresponding number of degrees of freedom.
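A sketch of the stationarity test for one row i follows, under the assumption that the statistic compares the interval-specific estimates with the pooled estimate under H_0, as in Anderson and Goodman [43]; the counts are hypothetical.

```python
def stationarity_statistic(n_flows, stocks):
    """n_flows[t][j]: flows i -> j in interval t; stocks[t]: N_i(t).
    Returns sum_t sum_j N_i(t) * (p_hat_ij(t) - p_hat_ij)^2 / p_hat_ij."""
    periods = len(n_flows)
    states = len(n_flows[0])
    # pooled estimate under H0: p_ij = sum_t n_ij(t) / sum_t N_i(t)
    pooled = [sum(n_flows[t][j] for t in range(periods)) / sum(stocks)
              for j in range(states)]
    chi2 = 0.0
    for t in range(periods):
        for j in range(states):
            if pooled[j] > 0:
                p_tj = n_flows[t][j] / stocks[t]
                chi2 += stocks[t] * (p_tj - pooled[j]) ** 2 / pooled[j]
    return chi2

flows = [[80, 10], [70, 20], [60, 30]]   # hypothetical flows out of grade i
stocks = [100, 100, 100]                 # N_i(t) for the three intervals
print(stationarity_statistic(flows, stocks))   # about 12.86 for these counts
```

A large value of the statistic relative to the chi-square quantile would lead to rejecting H_0, that is, to treating the p_ij(t) as genuine functions of time.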
Now, suppose that the χ² test showed that the probabilities p_ij(t) are functions of time. Then, there is a need to predict their values as functions of time. For a specific pair (i, j), let x_1(t), x_2(t), …, x_m(t) be probable covariates for the specific application. Then, logistic stepwise regression is an appropriate model for these probabilities. Let us define

y_ij(t) = log[p_ij(t)/(1 − p_ij(t))];

then we obtain

y_ij(t) = b_0 + b_1 x_1(t) + b_2 x_2(t) + ⋯ + b_m x_m(t).

Now, it is obvious that stepwise regression will show which are the important covariates for predicting p_ij(t) (see also Vassiliou [44]).
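A minimal sketch of the log-odds fit follows. It uses a plain least-squares fit on the logit scale rather than full stepwise logistic regression, and the probabilities and the single time covariate are hypothetical.

```python
import numpy as np

# Hypothetical estimated probabilities p_ij(t) for t = 1, ..., 10
t = np.arange(1, 11)
p = 0.15 + 0.002 * t                 # mildly time-dependent transition probability
y = np.log(p / (1 - p))              # logit transform y_ij(t)

# design matrix: intercept b_0 plus a single time covariate x_1(t) = t
X = np.column_stack([np.ones_like(t, dtype=float), t])
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares fit of the log-odds

p_fit = 1 / (1 + np.exp(-(X @ b)))   # back-transform to fitted probabilities
print(b, np.abs(p_fit - p).max())    # fitted coefficients and worst-case error
```

In a real application, each candidate covariate would be added or dropped by the stepwise procedure according to its significance, and the retained coefficients would then be used to predict p_ij(t) forward in time.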
Suppose that, following the above-described steps, we ended up with the sequence of transition probability matrices given in (74). The total population of memberships was planned according to a sequence for which the total number of memberships converges geometrically fast, satisfying condition (41) in Theorem 4. It is not difficult to check that the sequence (74) satisfies condition (42), that is, Q(t) ∈ G for t = 1, 2, …. Also, it satisfies condition (43), since the positive elements of the matrices are uniformly bounded from below.
For the sequence of transition probability matrices of the memberships (74), we find, for condition (45), the vectors ψ(t) that satisfy it. We observe that already for t = 6 in (82) and (83) we have convergence of ψ(t), which satisfies condition (45) of Theorem 4; that is, the convergence is geometrically fast. Hence, we conclude that ψ(t) converges geometrically fast to a stochastic vector ψ.
Calculating, using the transition probability matrices given in (74), the matrix product Q(s, t), we find that for sufficiently large t it converges to a stable matrix. Hence, we conclude that Q(s, t), as t → ∞, converges geometrically fast, as was expected from Theorem 4, relation (48).
Now, given the convergence of Q(s, t) as t → ∞, it is straightforward to find, from Equation (2) applied recursively, or equivalently from Theorem 1, that the expected relative population structure converges geometrically fast (in fact, in 9 time steps) to the vector ψ. Hence, the NHMS is geometrically strongly ergodic.
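Since the actual matrices of the example are given in (74) and are not reproduced here, the computation cannot be repeated verbatim; the following sketch runs the same pipeline on hypothetical data: build the inherent chain Q(t) (here taken constant for simplicity), iterate recursion (2) with totals converging geometrically, and compare the limit with the left eigenvector ψ of Q.

```python
import numpy as np

# Hypothetical three-grade example
P = np.array([[0.80, 0.10, 0.00],
              [0.05, 0.75, 0.10],
              [0.00, 0.05, 0.85]])    # internal transitions
pl = 1 - P.sum(axis=1)                # loss probabilities (0.1 per grade)
p0 = np.array([0.2, 0.3, 0.5])        # recruitment allocation
Q = P + np.outer(pl, p0)              # inherent chain, rows sum to one

T = lambda t: 1000 - 400 * 0.8 ** t   # totals increase geometrically to T = 1000
q = np.array([1 / 3, 1 / 3, 1 / 3])   # initial relative structure
for t in range(1, 60):
    dT = T(t) - T(t - 1)
    q = (T(t - 1) / T(t)) * q @ Q + (dT / T(t)) * p0

# q should be close to the left eigenvector psi of Q with psi Q = psi
w, v = np.linalg.eig(Q.T)
psi = np.real(v[:, np.argmax(np.real(w))])
psi = psi / psi.sum()
print(q, psi)
```

With these hypothetical inputs, the iterated expected relative structure settles on the stochastic left eigenvector of Q after a few dozen steps, which is exactly the geometric strong ergodicity that Theorem 4 predicts.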