1. Introduction
Multi-server queueing systems with heterogeneous requests can be successfully applied for modeling, planning, evaluating, and optimization of real-world systems and networks. For example, such systems have been applied for description and optimization of the operation of emergency departments in hospitals (see, e.g., [
1,
2,
3]) and cells of cognitive radio systems (see [
4,
5,
6,
7,
8,
9,
10,
11,
12]).
It is usually supposed in such systems that primary requests (i.e., patients having a severe injury or licensed users) have preemptive priority over secondary requests (i.e., other patients or cognitive users). They can be dropped only if during their arrival moment all servers are busy and provide service for the primary requests. If some of them provide service for the secondary requests, service of one of secondary requests is instantaneously interrupted and this server is occupied by the primary request. The termination of service ahead of the schedule implies negative effects such as the loss of throughput due to the waste of the work done on servicing interrupted requests as well as dissatisfaction of these requests. Therefore, to smooth these negative effects, it is necessary to properly manage an admission of the secondary requests, e.g., via the temporarily suspending of their admission when the forced termination of service is likely because the number of busy servers is greater than a certain threshold fixed in advance (see for example [
13]). In that paper, it was shown that the proper choice of the reservation threshold may lead to better operation of the system. Let the number of servers be equal to
N. The strategy analysed in [
13] assumes that an arbitrary non-priority request is accepted to the system only when the number of occupied servers at the arrival epoch is less than a threshold
M which admits values in the range
. In [
14], the model analysed in [
13], as well as in the majority of other existing papers devoted to analysis of cognitive radio systems, has been significantly generalized into several directions: (i) instead of the mixture of two stationary Poisson processes as model of arrival flows of heterogeneous requests, the arrivals are defined by the Marked Markovian Arrival Process (
); (ii) it is supposed that the non-priority requests, which are not immediately accepted to the system at an arrival epoch or interrupted during a service, have the options to abandon the system or to move to so called orbit and attempt to enter the service again after a random amount of time.
The stationary Poisson arrival flows considered in the majority of the existing relevant papers are the simplest case of
. They do not reflect the correlation and variance of inter-arrival times in the arrival process flows which are inherent in modern telecommunication networks. This process is the extension of the well known Markovian Arrival Process (
) (see [
15,
16,
17]) to cover the case of heterogeneous requests. The problem of description of real flows by the
s has been intensively discussed in the existing literature, see, e.g., [
18,
19,
20].
The phenomenon of retrials (repeated attempts) is an inherent feature of many real world systems, including telecommunication networks and contact centers. Therefore, its accounting is very important for adequate modeling of these systems and networks. The classic surveys of research on retrial queues are [
21,
22].
The threshold (and multi-threshold) type strategies of control by service and admission rates or by the number of active servers in queueing systems are very popular because their use allows the economic effectiveness of the corresponding real queueing systems to be significantly improved. Information concerning the state of the art in analysis of such systems when the arrival flow is provided by the
or
can be found, e.g., in [
23].
The evident drawback of the threshold type strategies is the possibility of frequent change (oscillation) of the operation mode of the system when the number of requests in the system hesitates around the value of the thresholds. To avoid this negative effect, so-called hysteresis strategies are often applied. Such a strategy is the transparent extension of a threshold strategy. In the context of our paper, this extension consists of the following. Instead of fixing one threshold, say,
to stop admission of non-priority requests when the number of requests in the system achieves the level
and resuming acceptance when this number becomes less than
we fix two thresholds, say,
and
(
). One can think that
Admission of non-priority requests is stopped when the number of requests in the system exceeds the upper level
M and is resumed when this number becomes less than the lower level
Information about the state of the art in analysis of systems with hysteresis control when the arrival flow is provided by the
or
can be found, e.g., in paper [
24]. Queues with hysteresis control are sometimes called oscillating; see, e.g., [
25,
26].
In this paper, we analyse a system with a hysteresis policy of admission control. To the best of our knowledge, such a type of control has not been earlier applied for servers reservation in priority queues. The difficulty of the consideration of a system with hysteresis control compared to analysis of analogous systems with threshold control consists of the following. When using threshold control, the decision to admit or to reject an incoming request is made solely based on the comparison of the observed number of requests in the system and the threshold. When using hysteresis control, the decision in the situation when the observed number of requests admits values in the interval , and additionally depends on prior decisions. Therefore, in order to construct the Markov process that describes the dynamics of the system, the use of an additional component is required.
The rest of the paper is organized in the following way. In
Section 2, the analysed mathematical model is completely reported. In
Section 3, the dynamics of the queueing system under study is described as a level-dependent multi-dimensional continuous-time Markov chain. Necessary denotations are introduced and the infinitesimal generator of this Markov chain is derived. In
Section 4, the ergodicity (stability) condition of this Markov chain is provided. The invariant distribution of the system states is calculated and formulas for computation of the main performance characteristics of the system are presented in
Section 5.
Section 6 is devoted to numerical illustration of the feasibility of the presented algorithms and the possibility of solving optimization problems.
Section 7 concludes the paper. The main results of the presented analysis are summarized, and directions for future research are highlighted.
2. Mathematical Model
We consider a queueing system consisting of N identical servers and having no buffer space. Two types of requests are processed in this system. The service times of a type-k requests are independent identically exponentially distributed random variables with the rate Type-1 requests have preemptive (absolute) priority over type-2 requests. Arrival of type- requests is described by the Markovian Arrival Process. This process is defined by an underlying process that is an irreducible continuous-time Markov chain having states The transition intensities of the process within this state space are provided by the elements of the matrices and of size The non-diagonal entries of the matrix define the rates of transitions that are not accompanied by arrival of type-k requests. The diagonal entries of the matrix define the rates at which the process exits its states. The entries of the matrix define the rates of transitions that are accompanied by the arrival of a type-k request.
The matrix is the infinitesimal generator of the Markov chain The row vector defines the invariant distribution of this Markov chain. This vector is the only solution to the system of equations Here and throughout this paper, means a column vector of appropriate size consisting of ones and means a row vector of appropriate size consisting of zeros.
The fundamental (mean) arrival rate of type-k requests is defined by
We suggest that type-1 requests have preemptive priority over type-2 requests. This means that an arriving type-1 request is always admitted to the system except in a case when all servers are busy by providing service to type-1 requests at the arrival moment. In such a case, the type-1 request immediately departs from the system without service (that is, it is lost). If all servers during a type-1 request arrival epoch are busy, but a certain number of them are providing service to type-2 requests, service of one of type-2 requests is interrupted and the type-1 request occupies the corresponding server.
We suppose that the admission of type-2 requests to the system is implemented according to the hysteresis strategy. This strategy is more general and flexible than the threshold reservation strategy known in the literature. It is described as follows. The decision to admit or to reject an arriving type-2 request depends on the current state of the managing process This is a stochastic process having two possible values, 0 and 1. We say that the value 0 corresponds to an off-period during which arriving type-2 requests are not admitted into the service, while a value of 1 corresponds to an on-period, during which arriving type-2 requests may be admitted for service.
The mechanism for switching the states of the managing process is defined by the relation of the current number of busy servers and two integer thresholds, and
If the number of busy servers during the stay of process in state 1 is less than then any request trying to obtain access is admitted to the system and immediately starts service. If the number of busy servers during the stay of the process in state 1 is equal to M and a new request from outside arrives, then the process jumps to state 0. The arriving request is accepted if it is of type-1 and is rejected if it is of type-2. With probability a rejected request leaves the system permanently (it is lost). With probability q, this request decides to attempt to obtain access later on. We say that this request moves to a virtual place called a orbit. A request staying in the orbit repeats its attempts to get access independently of the behavior of other requests residing in the orbit after a random time interval duration, which has an exponential distribution with the rate An attempt is successful if the managing process resides in state 1 and the number of busy servers is less than If the attempt is successful, the request immediately occupies an available server and starts service. If the attempt is not successful, the retrying request departs from the system with probability . With the complementary probability, the request returns back to the orbit.
If the number of busy servers during the stay of process in state 0 is equal to and the service of any request finishes, then the process transits to state 1 (the on-period begins). At all other moments (i.e., when no arrival occurs during the on-line period in the presence of M busy servers or no service completion occurs during the off-line period in the presence of busy servers) no switches of the states of the managing process can occur.
As is stated above, service of a type-2 request receiving service can be terminated by the arrival of a type-1 request. In this case, a type-2 request leaves the system permanently with probability or transits to the orbit. The requests residing in the orbit are impatient (intolerant) and depart from the system without receiving service after a random amount of time. This time has an exponential distribution with the rate
In the rest of this paper, we analyze the invariant distribution of the system states and discuss computation of various performance characteristics of the system under any fixed values of the thresholds and
3. Process of the System States
Let
be the number of requests residing in the orbit
be the number of occupied servers
be the number of type-2 requests in service
be the state of the underlying Markov chain of the
be the state of the admission managing process: during off-period and during on-period
during moment t when .
Note that we assume that if if and that can be either 0 or 1 when
It is easy to verify that the six-dimensional stochastic process
is an irreducible continuous-time Markov chain having one denumerable and five finite components.
Let us enumerate the states of the chain in the direct lexicographic order of the components . The set of states having a value of the first two components are called a macro-state and the set of macro-states is called a level
Let Q be the infinitesimal generator of the Markov chain It consists of the matrix blocks which in turn consist of the matrices containing the transition rates of the chain from the macro-state to the macro-state The diagonal entries of matrix Q are negative. The modulus of each diagonal entry defines the departure rate of the Markov chain from the corresponding state.
To write the explicit expression for the blocks and subblocks of the generator Q, we use the following notation.
Let
I and O be the identity matrix and zero matrix, correspondingly. If the dimension of a matrix is not clear from the context, it is indicated by a suffix. Otherwise, it is omitted.
denote the column vector that is the transposition of the row vector .
⊗ and ⊕ be the symbols of the Kronecker product and sum of matrices, respectively; see [
27,
28,
29]. These symbols are very useful for describing the rates and probabilities of simultaneous transitions of several independent Markov chains.
be the Kronecker delta, which is equal to 1 if and 0 otherwise.
and .
be the size of blocks
, defined as
i.e., be a diagonal matrix with diagonal entries .
.
.
be matrices of size
, defined as
be matrices of size
, defined as
be square matrices of size
, defined by the formulas
be the auxiliary matrices of the corresponding size, defined as
be the auxiliary matrices of the corresponding size, defined as
Lemma 1. The infinitesimal generator Q of the Markov chain has the following block-tridiagonal structure: where the non-zero blocks of size are defined as follows: Proof of the lemma can be implemented by means of an analysis of different scenarios involving the behavior of the Markov chain during an interval of a very small duration while accounting for the transparent probabilistic meaning of all involved matrices and the extensive use of the operator of the Kronecker product and sum of matrices, which is not presented here.
The block-tridiagonal form of the generator Q obviously follows from the fact that during a very short interval the number of requests in the orbit can only not change its value or decrease or increase by one. This follows from the properties of the and the exponential distribution of service, retrial, and impatience rimes.
The block-tridiagonal form of the matrices block-twodiagonal form of the matrices , and block-diagonal form of the matrices analogously follows from the same properties.
4. Ergodicity (Stability) of the Process of the System States
Theorem 1. As it is assumed that the probability of refusal on joining a busy system upon arrival or retrial is positive (i.e., ) and requests can depart from the orbit without receiving service (i.e., ), the Markov chain is ergodic for any set of the system parameters.
To prove this Theorem, we use the results for Asymptotically Quasi-Toeplitz Markov Chains (
) presented in [
30]. To prove that the Markov chain
indeed belongs to the class of such chains, it is necessary to verify the existence of the limits
where
is a diagonal matrix with the elements defined as the moduli of the corresponding diagonal entries of the matrix
Then, it follows from [
30] that the sufficient condition for ergodicity of the
is that the inequality
is fulfilled, where the row vector
is the unique solution to the system of linear algebraic equations
while the sufficient condition for non-ergodicity of the
is the fulfillment of the inequality
It may be easily verified that the limits (2) for the Markov chain
do exist and are provided by the formulas
Then, it is evident when accounting for (4) and (5) that inequality (3) turns into the trivial form which implies that the Markov chain is ergodic for any values of the system parameters. Thus, the theorem is proven.
Remark 1. The statement of the Theorem is intuitively transparent because if any request can abandon joining the orbit or depart from there without service, then the number of requests in the orbit cannot infinitely grow and the process is ergodic.
5. Computation of Invariant Distribution of
the Markov Chain and Key Performance Characteristics of the System
Let us assume that the ergodicity condition is fulfilled. Then, the following invariant probabilities exist:
Let us construct the row vectors of the invariant probabilities of the levels as follows.
The row vector consists of invariant probabilities enumerated in the direct lexicographic order of the components
Then, the row vectors
of the macro-states are defined by
It is well known that the probability vectors
satisfy the following system of linear algebraic equations (the Chapman–Kolmogorov equations):
where
Q is the infinitesimal generator of the Markov chain
Note that in the case where
and
we have a model with request loss (i.e., without retrials). In this case, the infinite size generator
Q reduces to a single finite non-zero block
of the form provided above and the system of Equation (
6) is finite. The invariant probability vectors of the system states in this case can be computed by the direct solution of system (6) by computation or by means of numerically stable algorithms; see, e.g., [
31,
32,
33].
In the general case, system (6) has infinite size and cannot be directly solved by computation except by performing its truncation. Instead of such a rough method, the system can be solved by means of the numerically stable algorithms developed in [
30,
34] and based on replacement of this system by the infinite system derived using the notion of censored Markov chains; see, e.g., [
35,
36]. The new system has a better structure of the generator that allows for its solution via elaborate and effective numerical algorithms.
The algorithms presented in [
30,
34] are oriented to a more general form of the generator
Q (i.e., blocks above the off-diagonal blocks are not mandatory equal to zero). The version of the algorithm exactly oriented to a block-tridiagonal form of the generator
Q can be found, e.g., in [
37].
For the reader’s convenience, here we present a very short outline of the algorithm for the computation of the vectors of the invariant probabilities
Step 1 Compute the sequence of the matrices
from backward recursion:
with terminal condition
Step 2 Compute the non-negative matrices
from direct recursion
Step 3 Compute the probability vector
as the unique solution to the system
Step 4 Compute the vectors of the invariant probabilities
by the formula
Details of computer implementation of this algorithm can be found in [
30,
34,
37].
After the vectors of the invariant probabilities have been found, it is possible to compute a variety of performance characteristics of the system.
The invariant distribution of the number of requests in the orbit is
The mean number of requests in the orbit is
The mean number of requests in the system is
The mean number of busy servers is
The mean number of busy servers providing service to type-1 requests is
The mean number of busy servers providing service to type-2 requests is
The intensity of output of type-1 requests is
The intensity of output of type-2 requests is
The intensity of output of requests from the system is
The loss probability of type-1 requests is
The loss probability of type-2 requests is
The loss probability of an arbitrary request is
where
The probability of type-2 request loss upon arrival is computed by
The probability that a type-2 request enters the orbit upon arrival is computed by
The rate of type-2 requests lost upon arrival is
The probability that an arbitrary type-2 request will be forced to terminate service and go into the orbit is
The probability of an arbitrary type-2 request being forced to terminate service leading to loss is
The probability of a type-2 request being lost due to non-persistence is computed by
The probability of a type-2 request being lost due to impatience is computed by
The probability of an arbitrary type-2 request being lost from the orbit is
The mean number of state changes in the management process of admission during a unit of time is
6. Optimization Problem and Numerical Examples
The goals of this section are to briefly show the feasibility of the proposed method of constructing the generator of the Markov chain, describe the behavior of the system and computation performance characteristics of the system, and to illustrate the dependence of key performance characteristics on the parameters of the hysteresis admission control strategy.
Let us consider a system consisting of
servers. The matrices defining the arrival process of type-1 requests are as follows:
This arrival process has a mean arrival intensity , a squared coefficient of variation of inter-arrival times , and a coefficient of correlation of two neighbouring inter-arrival times
The matrices defining the arrival process of type-2 requests are
This arrival process has a mean arrival intensity , a squared coefficient of variation of inter-arrival times , and a coefficient of correlation of two neighbouring
The other system parameters are , , , and
Let us now vary the parameter M over the interval and the parameter over the interval .
First, note that the performance characteristics of the system related to type-1 requests do not depend on the parameters M and . Thus, for the example under consideration, , , and for all values of M and .
The mean number of type-2 requests in the orbit () is very large for small values of M and This admission policy is too strict, and type-2 requests are compelled to wait for service in the orbit. With the growth of the values of M and access by type-2 requests becomes easier and the value of essentially decreases.
The mean number of busy servers providing service to type-2 requests is small for small values M and , which restricts the value of from above. With the growth of the values of M and the value of quickly increases as more type-2 requests receive access to the service upon arrival.
The probability of type-2 request loss upon arrival is essential for small values of M and With the growth of the values of M and it quickly decreases. Similar observations relate to the probability of type-2 request loss due to impatience, the probability of type-2 request loss due to non-persistence, and the probability of arbitrary type-2 request loss from the orbit.
The probability of an arbitrary type-2 request being forced to terminate service and being lost is quite small for small values of M and This is because small values of M and imply strict restriction of admission of type-2 requests. Correspondingly, servicing of these requests is rarely terminated. With the growth of the values of M and more type-2 requests are accepted into the system despite many servers being busy. This evidently implies more frequent service interruptions and a sharp increase in the probability
The shape of the surface showing the probability of type-2 requests matches the form of the corresponding surfaces presenting the values of probabilities , and , as the probability is the sum of the listed probabilities.
The loss probability
of an arbitrary request as a function of
M and
is similar to the probability
of type-2 request loss presented in
Figure 8, and the probability
is essentially less than the probability
. This is easily explained by the fact that the number
N of servers in this example is chosen to be such that the probability
of type-1 requests lost is very small (on the order of
). Because 37.5 percent of arriving requests are type-1 requests, which are rarely lost, the value of
is essentially less than the value of
As anticipated, the value of is essentially larger when the values of the thresholds M and are close to each other. This value quickly decreases when the difference between increases.
Having clarified the shapes of the dependencies of the main performance characteristic of the system on the thresholds, we now illustrate the possibility of using the results of the presented analysis to optimize admission control in the considered system via proper choice of the thresholds M and , thereby defining the strategy of control.
Let the quality of the control of system operation be evaluated via the following cost function:
Here, the cost coefficients
and
f define the charges of the system caused by the loss of one type-2 request upon arrival to the system, loss after service termination, moving to the orbit after service termination, loss due to impatience, loss due to non-persistence, and for each change of the state of admission managing process, respectively. In the example, we set the following values of the cost coefficients:
The shape of the function
is presented on
Figure 11. The optimal value is
, which is reached for
and
To determine the time spent for computation related to this example, we used the algorithm for computation of invariant distribution developed in [
37] on a Lenovo Gaming 3 notebook with an Intel(R) Core(TM) i7-10750H CPU, 2.60 GHz, and 16 GB RAM in Wolfram Mathematica 12, with a result of is 10,351 s (172.51666 min). This relatively long computation time is explained by the fact that computations were implemented for all values
and
(190 pairs
in total); in addition, the size
of the finite blocks
of the infinite size generator
Q can be quite large (e.g., for
and
, this size is equal to 1756).
7. Conclusions
In this paper, we have analyzed a multi-server queueing system providing service for two types of requests. The system is suitable for modeling, in particular, of the operation of the cell of cognitive radio systems. Type-1 requests have preemptive priority over type-2 requests. Access to services by type-2 requests is restricted via a hysteresis mechanism defined by two thresholds. Their access is temporarily forbidden when the number of busy servers exceeds the higher threshold and is resumed again when this number drops below the lower threshold. Rejected type-2 requests have the options of abandoning the system or attempting to retry for service after random intervals of time. Under the fixed values of the thresholds, the dynamics of the system are described by the level-dependent six-dimensional Quasi-Birth-and-Death process. The generator of this process is written out. A sufficient condition for the ergodicity of this chain is presented and efficient algorithms for computation of its invariant distribution are recommended. The expressions for computing the main performance characteristics of the system via the vectors of the invariant probabilities of the Quasi-Birth-and-Death process are derived, and the feasibility of their use for solving optimization problems is numerically demonstrated.
Various extensions of the model can be considered based on the results presented in this paper, e.g., the randomized procedure of dropping type-2 requests and retrying requests in situations involving a lack of servers can be considered. Different thresholds defining acceptance/rejection of type-2 requests arriving from outside and from the orbit can be considered. The presence of small buffers for temporal maintenance of high priority requests and/or interrupted low priority requests can be assumed. The existence of several classes of low-priority requests with a preemptive or non-preemptive priority between the classes of low-priority requests (see, e.g., [
38]) can be supposed as well. A scenario in which the processor shares discipline instead of using fixed servers can be analysed. A phase-type distribution (see, e.g., [
39]) of service times instead of an exponential one can be considered, and non-reliability of servers can be allowed (see, e.g., [
40]).