1. Introduction
Queueing theory finds application across a wide range of industries, including electronic communication systems and devices. Many organizations today rely on new technologies and electronic communication systems because they enable sustainable, uninterrupted growth and are a prerequisite for competitive advantage. This reliance, however, brings both requirements and risks. The foremost requirement is reliability that accounts for the complexity and interdependence of the system; at the same time, the stochastic characteristics and complexity of such systems carry risks related to reliability control [1], transmission quality [2], availability, and security [3]. Research on electronic communication systems and devices often employs queueing models in which events flow into queues in a Markovian or non-Markovian manner. Here, queueing theory draws on the mathematical apparatus of the theory of stochastic processes, in particular Markov processes [4,5]. Queueing theory is also a useful tool for estimating capacity requirements or managing demand in any system where the demand for the system's services arrives at random times. Recent research on electronic communication systems supported by queueing theory indicates significant development potential in many areas, such as industry, business, transport, energy, and medicine. Many organizations, including banks, airlines, healthcare systems, telecommunications companies, and security departments, routinely use queueing theory models to determine the capacity levels needed to meet experienced demand more efficiently. An analysis of the literature on the subject shows that studies and applications relating to service control for queueing systems are rich and varied. The elements arriving at a queueing system to receive service can represent a variety of phenomena: customers arriving at a bank, patients arriving at a hospital, items in partially completed production, or transport vehicles (planes or cars) waiting to be served by the system. Queueing theory is also a tool heavily used in modeling and testing the quality of transmission in electronic communication systems and devices [6,7,8]. QoS (Quality of Service) is one of the most important challenges arising in the design and maintenance of NGNs (Next-Generation Networks) [9,10,11,12]. The interconnection of embedded devices, sensors, and machines over the Internet has revolutionized the way people live and work. Sustaining organizational growth and increasing capacity requires the introduction of new intelligent technology solutions; hence, various techniques have been proposed in the literature to improve current network architectures [13]. These requirements are particularly important with the development of new smart technologies, such as the IoT (Internet of Things) [14] and IoE (Internet of Everything) [15]. IoT and IoE technologies embody a vision of a future in which everything is connected to the Internet. The technological revolution ushering in this new era of ubiquitous connectivity, computing, and communication is driven primarily by the continuous demand for bandwidth, mobile communications at higher data rates, ultra-low latency, high reliability, and QoS guarantees for systems connecting billions of devices, machines, people, and intelligent processes. The queueing systems that inspired this article are those supporting QoS, found in particular in electronic communication devices and advanced telecommunications systems that manage intelligent data flow in networks.
This publication is divided into 10 main sections.
Section 1 provides an introduction to the topic and research area.
Section 2 discusses QoS functions and areas for which QoS provision is critical.
Section 3 discusses the performance of modern networks based on QoS architectures. Two architectures are discussed: IntServ and DiffServ.
Section 4 describes the means and methods of data markings used in the architectures discussed.
Section 5 describes the queueing systems used in the QoS architectures.
Section 6 discusses the parameters of the queueing system.
Section 7 presents the theoretical foundations and research context used to build the PQS models described in the following sections; it discusses Petri nets and defines timed Petri nets, illustrated with an example model.
Section 8 describes the designed and tested models of the priority queueing system.
Section 9 presents the performance characteristics of the tested models obtained through simulation using software tools.
Section 10 provides a summary.
2. QoS Functions and Application Areas
Quality of service is the ability to provide different priorities to different applications, users, or data flows, or to guarantee a certain level of performance to a data flow [16]. Intensive research work on QoS has been carried out for many years [17,18]. Several researchers have already reviewed the state of the art and challenges of QoS [19]. As the results of QoS research show, QoS is one of the most important challenges arising in the design and maintenance of NGNs [20] and modern TSNs (Time-Sensitive Networks) [21]. Although QoS research received sustained attention long before the advent of cloud computing, QoS is fundamental in this area as well. Adequate implementation of cloud services and cloud computing in enterprises can reduce the operational costs of organizations, but one of the main challenges posed by cloud applications is the management of QoS. Ardagna et al. [22] argue that QoS is fundamental both for cloud users, who expect providers to deliver the advertised quality characteristics, and for cloud providers, who need to find the right trade-off between QoS levels and operational costs. The findings of Abdelmaboud et al. [23] also confirm that QoS in cloud computing has become an important research topic in which open challenges and gaps remain, requiring exploration and future research. With the increasing use of services in cloud environments, QoS improvements are therefore needed for cloud computing solutions. Cloud computing technologies are as important as the IoT, a new paradigm that is gaining immense popularity in today's world. However, the success of the IoT world requires the provision of services that are ubiquitous, reliable, high-performing, efficient, and scalable. The IoT is one of the most disruptive technologies for wireless systems due to its ability to provide connectivity to anything, anytime, anywhere. However, existing heterogeneous IoT solutions cannot handle the highly diverse QoS requirements of the growing number of IoT applications. Conclusions and a detailed assessment of the current state of the art on proposed approaches to QoS issues in the IoT were presented by White et al. [24]. Another area of QoS research focuses on WSN (Wireless Sensor Network) architectures [25]. The applications of WSNs are numerous and include monitoring of the environment, health/healthcare, logistics, and smart grids. The increasing use of wireless sensors in a variety of areas makes QoS a critical parameter for determining their applicability. However, due to their dynamic architectural nature, it is difficult to ensure an adequate level of QoS in WSN architectures. An overview of methods and mechanisms for implementing various QoS metrics and enhanced communication capabilities in WSNs was presented by Letswamotse et al. [26], who investigated mechanisms for efficient resource management and identified critical design and implementation aspects for QoS requirements and guarantees in WSNs. Based on the analysis of the available research work, it can be concluded that QoS requirements are critical for many services and applications and remain a broad area of intensive research.
4. Marking and Classification
The operation of QoS requires the labeling and classification of data and the imposition of appropriate policies related to data transmission. Different labeling and classification schemes have been designed and standardized. For IPv4, this is the 8-bit ToS (Type of Service) field, while for IPv6, the 8-bit Traffic Class field is used [46]. In practice, however, the 6-bit DSCP (Differentiated Services Code Point) field located in the IP packet header is used to properly identify aggregates (see Figure 4).
The number found in the DSCP field is sometimes referred to as the DSCP code point. Each code point is a combination of bits with an a priori agreed meaning. Typically, the assignment of the corresponding DSCP code values is done at the entrance to the network infrastructure. Recommendation RFC 2475 [30] defines the division of the DSCP field into two parts: the class selector field, which ensures compatibility with previous solutions (Type of Service) by analogy with the 3-bit IP precedence field in the ToS field, and the drop probability field, which defines the level of probability of packet loss. The first three bits, therefore, define the IP precedence. The higher the value of this field, the more important the IP packet is. This means that in the event of congestion, the communication node will discard packets with a lower value of this field first. A binary value of 111 set in this field indicates the highest priority and a value of 000 indicates ordinary or no priority. The 2-bit ECN (Explicit Congestion Notification) field indicates congestion in the network and serves as a congestion indicator. Setting both ECN bits to 1 informs the communication nodes that the buffer is overloaded. The intention behind this mechanism was to inform a suitably intelligent protocol, e.g., TCP, of the occurrence of congestion at the network layer and to signal the sender to slow down the rate of sending new data. The class selector field allows network traffic to be identified by appropriately dividing the traffic into network service classes and prioritizing the packets belonging to this traffic. Packets with the same DSCP class value should be subject to similar handling rules at the communication nodes. Associated with each DSCP value is a specific forwarding strategy of the communication node, known as a PHB. In defining the PHB for each DSCP code point, an attempt was made to maintain compatibility with the original, i.e., to preserve the bit mapping of the DSCP pattern to the former ToS field structure (respectively, the P—precedence, D—delay, T—throughput, and R—reliability fields). The D, T, and R bits express the need for special treatment of the packet in terms of low delay (D), high throughput (T), and high reliability (R). An unclassified service is assigned a DSCP value of 000 000. For example, the mapping of a DSCP code with a decimal value of 22 to the P-D-T-R structure is as follows: the value 22 in binary notation is 010110, which maps to a precedence equal to 010 with the D and T bits set, so the PHB resulting from setting the DSCP to such a value should be characterized by an immediate priority (Table 2), with the packet having the lowest possible delay and the highest possible throughput. The relevant IETF documents define the assignment of specific code point values to different classes of service.
Table 2 shows the precedence values in the binary code.
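The mapping just described is easy to express in code. The sketch below (an illustrative Python function, not part of any standard library) decodes a 6-bit DSCP code point into the precedence (PHB class) bits and the legacy D, T, and R flags:

```python
def decode_dscp(dscp: int) -> dict:
    """Decode a 6-bit DSCP code point into the legacy ToS-style fields."""
    assert 0 <= dscp <= 0b111111, "DSCP is a 6-bit field"
    precedence = dscp >> 3          # three most significant bits (PHB class)
    d = (dscp >> 2) & 1             # low-delay flag
    t = (dscp >> 1) & 1             # high-throughput flag
    r = dscp & 1                    # high-reliability flag
    return {"precedence": precedence, "D": d, "T": t, "R": r}

# DSCP 22 = 010110b: precedence 010 (immediate), D and T set
print(decode_dscp(22))  # {'precedence': 2, 'D': 1, 'T': 1, 'R': 0}
```

For DSCP 22 (binary 010110), the function reports precedence 2 with the D and T bits set, matching the example above.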
In the mapping of the DSCP to the P-D-T-R structure, the three most significant bits of the DSCP identify the so-called PHB class, while the remaining bits reflect the packet's drop precedence: the higher its value, the more likely the packet is to be discarded under congestion. Thus, the PHB class subfield is the equivalent of the precedence subfield in the ToS/service class field. Although it is possible to use any combination of bits to create distinct traffic classes, it is nevertheless recommended to follow the standards defined by the IETF RFCs, which define two basic traffic classes: RFC 2597 [42] defines the AF class and RFC 3246 [41] defines the EF class. These two RFC specifications outline the fundamental traffic marking schemes, including the precise binary values that should populate the type of service byte in the IP header. Each AF class is independently forwarded with its guaranteed bandwidth. The EF PHB packet forwarding rule is designed to handle real-time application data streams and therefore provides handling mechanisms with the highest quality parameters. Both rules define the packet handling mechanisms implemented in the communication nodes of the DiffServ domain. Packets belonging to the same class are placed in the same queue. In order to distinguish between different classes of service and enforce priorities between them, a certain queueing and scheduling scheme is used in the communication nodes. A diagram of an example router architecture supporting DiffServ aggregation is presented in [47]. Packets arriving at a communication node are classified and, based on the value of the DSCP code, assigned to the appropriate DiffServ behavior aggregate. This means that packets with the same DSCP code value will be handled at the same quality level and placed in the same queue. Each aggregate, represented by a unique DSCP code value, is handled by one queue (see Figure 5). On the outgoing link, packets are classified and forwarded to the appropriate buffers implementing the corresponding PHB. This is followed by a serialized process of forwarding packets to the next communication node. A communication node can have multiple input links and multiple output links. For each packet that arrives on an input link, the router determines the next hop in its path and transmits the packet over the corresponding output link. The service discipline on each output port of the router allocates the resources of the corresponding output link between the different classes of service in link sharing [48]. In the DiffServ domain, packet flows are classified into a small fixed number of classes, such as CS, EF, or AF. Complex classification of individual flows is implemented only in the edge nodes; in the core, the nodes provide service guarantees on a class basis rather than a flow basis. The number of possible DSCP values is large, as 64 different classes can be defined using the 6-bit DSCP field. In practice, however, such detailed classification does not affect the efficiency of QoS, because most QoS prioritization schemes use only a few classes, usually between four and eight. Incoming packets are first classified according to the value of the DSCP field, and DSCP values other than CS, AF, and EF can be treated as BE (best-effort) traffic.
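The aggregation step described above (one queue per behavior aggregate) can be sketched as follows. The code is an illustrative Python fragment: the DSCP values 46 (EF), 10 (AF11), 18 (AF21), and 0 (BE) are the standardized code points, but the queue representation and packet format are made up for the example:

```python
from collections import defaultdict

# DSCP -> class mapping for a few standardized code points
DSCP_CLASS = {46: "EF", 10: "AF11", 18: "AF21", 0: "BE"}

def classify(packets):
    """Place packets into per-aggregate queues based on their DSCP value."""
    queues = defaultdict(list)
    for pkt in packets:
        cls = DSCP_CLASS.get(pkt["dscp"], "BE")   # unknown code points fall back to BE
        queues[cls].append(pkt["id"])
    return dict(queues)

packets = [{"id": 1, "dscp": 46}, {"id": 2, "dscp": 0},
           {"id": 3, "dscp": 46}, {"id": 4, "dscp": 35}]
print(classify(packets))  # {'EF': [1, 3], 'BE': [2, 4]}
```

Packets 1 and 3 share the EF queue, while the unclassified code point 35 falls back to best effort, mirroring the rule stated above.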
5. Queueing Systems in QoS Assurance
Queueing theory research has been carried out for many years and remains the subject of intensive study; the literature in this area is extensive. T. Saaty, in his work [49], provides a rather extensive list of references and discusses numerous queueing theory applications. Studies devoted to the mathematical aspects of queueing theory have been conducted by many other authors [50,51,52,53]. Queueing theory is also one of the intensively used tools in the modeling and research of QoS transmission quality in networks [54,55]. The theory makes use of the mathematical apparatus associated with the theory of stochastic processes, in particular Markov processes [56]. Queueing systems theory deals with the construction and analysis of analytical models that can be used in the rational management and optimization of critical parameters of queueing systems. A queueing system is understood to be a system in which, on the one hand, there is an influx of requests that require service and, on the other hand, there are service apparatuses that serve these requests. If the inflow of requests exceeds the capacity to handle them immediately, a queue forms. A block diagram of a simple queueing system is shown in Figure 6.
A service channel, an input and output stream, and a system queue are marked in the diagram. The service channel (marked 1) can be any device or team handling requests arriving at the system. If the service channel is busy, the request goes to the queue and waits there. The input stream is, therefore, made up of requests arriving at the system and the output stream of requests leaving the system. A characteristic feature of such a model is the sequence of waiting times of successive requests, Equation (3):

w_1, w_2, …, w_n, …   (3)

waiting to enter the system. We can derive a recursive equation for these waiting times. Consider the request C_n, marked with an arrow in Figure 6, arriving at the system at time t_n. The request C_n waits in the queue for a time period w_n and is then serviced in the system for the time period marked in the diagram as S_n; only after the time given by Equation (4)

t_n + w_n + S_n   (4)

does the request leave the system. The location of the request leaving the system is marked with an arrow in the diagram. The next request arriving in the system is request C_{n+1}, which arrives in the system at the time given by Equation (5):

t_{n+1} = t_n + τ_n   (5)

where τ_n is the inter-arrival time. The arrival time of request C_{n+1} is indicated by an arrow in the diagram. If the subsequent request C_{n+1} arrives in the system during the handling time of the earlier request C_n, that is, if

t_{n+1} < t_n + w_n + S_n

then the waiting time of request C_{n+1}, as shown in Figure 6, will be

w_{n+1} = w_n + S_n − τ_n

However, the next request C_{n+1} may arrive in the system after the handling of the earlier request has been completed; hence, if

t_{n+1} ≥ t_n + w_n + S_n

then the request arrives in an empty system. This case is indicated by the red arrow in Figure 6. The request arrives in an empty system, so the waiting time of such a request is w_{n+1} = 0. If we regard the interval t_{n+1} − (t_n + w_n + S_n) as the 'warming-up' time of the server, i.e., the time during which the server has no request to serve before C_{n+1} arrives at t_{n+1}, then the service of C_{n+1} begins immediately on arrival and takes the service time S_{n+1}. The above arguments, therefore, lead to the recursive relation Equation (9):

w_{n+1} = max(0, w_n + S_n − τ_n),   n = 1, 2, …   (9)

where w_1 = 0. From this relation, it follows by induction that the waiting time w_{n+1} for all n ≥ 1 is described by Equation (10):

w_{n+1} = max(0, max_{1≤k≤n} Σ_{j=k}^{n} (S_j − τ_j))   (10)

Similar to the non-negative waiting times w_n, we can also consider the quantity U_n = w_n + S_n − τ_n, which takes both non-negative and negative values. If U_n ≥ 0, then U_n is the waiting time of the request C_{n+1} arriving at the system; in other words, w_{n+1} = U_n. On the other hand, if U_n < 0, then |U_n| is the idle time of the server waiting for the request C_{n+1} to arrive in the system. The waiting time can, therefore, be described by Equation (11):

w_{n+1} = max(0, U_n)   (11)

Since U_n = w_n + S_n − τ_n, using Equation (10), we obtain the relation Equation (12):

U_n = max_{1≤k≤n} Σ_{j=k}^{n} (S_j − τ_j)   (12)
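The recursive relation in Equation (9) translates directly into a simulation. The following Python sketch is illustrative: it assumes exponentially distributed inter-arrival and service times (so the model is M/M/1), which is one possible choice rather than a requirement of the recursion:

```python
import random

def waiting_times(n, arrival_rate, service_rate, seed=1):
    """Iterate the recursion w_{k+1} = max(0, w_k + S_k - tau_k), with w_1 = 0."""
    rng = random.Random(seed)
    w = [0.0]                                  # w_1 = 0: the first request finds an empty system
    for _ in range(n - 1):
        tau = rng.expovariate(arrival_rate)    # inter-arrival time tau_k
        s = rng.expovariate(service_rate)      # service time S_k
        w.append(max(0.0, w[-1] + s - tau))    # recursion of Equation (9)
    return w

w = waiting_times(100_000, arrival_rate=0.8, service_rate=1.0)
# For M/M/1 the mean wait should approach rho/(mu - lambda) = 0.8/0.2 = 4
print(sum(w) / len(w))
```

The printed mean waiting time settles near the analytical M/M/1 value as the number of simulated requests grows.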
In order to characterize a queueing system, it is necessary to define the statistical distribution of the times between consecutive requests arriving at the system and the distribution of the times required to service consecutive requests. The model shown in Figure 6 consists of a single server to which a stream of requests arrives and is handled in order of arrival. Such a model is traditionally referred to as a FIFO (First-In-First-Out) queueing discipline; it is also the primary mechanism supporting data transfer in computer networks. FIFO queueing is easy to implement but does not support priorities: all incoming data are treated equally. The disadvantage of FIFO queueing is its limited use in providing QoS. If data from different traffic streams enter the system, one stream can easily disrupt the flow of the others; an aggressive stream can monopolize a large share of the capacity or even the entire queue buffer. The result can be a rapid deterioration of QoS, causing, for example, a sudden increase in the delay of transmitted data. Several scheduling algorithms have therefore been developed that provide better isolation between flows [57]. Queueing systems can be classified according to various criteria; one of the most commonly used is the queueing rule, which determines the order in which requests in the system are handled [58]. Several different queueing algorithms, depending on the needs and the characteristics of the traffic, are used to manage congestion at communication nodes. Apart from the aforementioned FIFO queueing, the most common queueing disciplines are:
PQ (Priority Queueing);
RR (Round Robin);
WRR (Weighted Round Robin);
DRR (Deficit Round Robin);
CBQ (Class-Based Queueing);
SFQ (Stochastic Fairness Queueing).
Queueing algorithms are the primary mechanisms used to shape traffic and ensure QoS transmission quality. They prioritize requests and send them according to a certain queueing discipline. In priority queues, separate queues are used for the data of different traffic classes, i.e., different priorities [59]. Requests intended for transmission over a common communication channel are always selected starting from the non-empty queue with the highest priority; consequently, requests from lower-priority queues are selected only if all higher-priority queues are empty. Priority queueing systems form a large class of queueing systems in which the incoming requests are distinguished by their importance [60]. A schematic of priority queueing systems is shown in Figure 7. This system is a fairly simple data serialization tool that consists of two or more FIFO queues with different priorities. Incoming requests are initially sorted and placed in the queues according to their priority. The individual queues are handled starting with the queue of the highest priority; the lower-priority queues are served only when the higher-priority queues are empty. The main problem with the PQ algorithm is that a continuous flow of requests to the highest-priority queue can completely block the lower-priority queues and starve their data streams. In the worst case, requests from such queues may never be served.
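The strict-priority selection rule described above can be sketched in a few lines of Python. The example is deliberately minimal (plain deques rather than a router data path) and makes the starvation effect visible: lower-priority requests are served only after the higher-priority queue drains:

```python
from collections import deque

def pq_select(queues):
    """Strict priority: serve the first non-empty queue, highest priority first."""
    for q in queues:                 # queues[0] has the highest priority
        if q:
            return q.popleft()
    return None                      # all queues empty

high = deque(["h1", "h2", "h3"])
low = deque(["l1", "l2"])
served = [pq_select([high, low]) for _ in range(5)]
print(served)  # ['h1', 'h2', 'h3', 'l1', 'l2']
```

If new high-priority requests kept arriving faster than they could be served, the `low` queue would never be reached, which is exactly the starvation problem noted above.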
There are various modifications of the PQ algorithm that eliminate the problem of stream starvation. These belong to the group of so-called FQ (Fair Queueing) algorithms. WRR (Weighted Round Robin) limits the number of consecutive requests belonging to the same class that can be transmitted over a channel. When the limit for requests belonging to the same class is reached, the scheduler switches to the next non-empty priority queue and follows the same rule. These limits are called weights and are denoted as w_1, w_2, …, w_k. With k classes of traffic handled by the algorithm, if there are enough packets in all classes, the algorithm will select w_1 requests from class 1, followed by w_2 requests from class 2, …, up to w_k requests from class k. At the end of the cycle, it again selects w_1 requests from class 1, and so on. As a result of such an operation, the communication channel is shared by the requests belonging to all classes according to Equation (13):

u_i = (w_i r_i) / (Σ_{j=1}^{k} w_j r_j)   (13)

where i = 1, …, k, and r_i is the transmission rate for the requests belonging to class i. If the transmission rates are identical for the requests belonging to all classes, then the ratio is described by Equation (14):

u_i = w_i / (Σ_{j=1}^{k} w_j)   (14)

Thus, in an example system with three priority classes and weights of 3, 2, and 1, respectively, these "utilisation bounds" are equal to 1/2, 1/3, and 1/6 for classes 1, 2, and 3, respectively. Another variation of the weighted priority algorithm is the DRR (Deficit Round Robin) algorithm, which allows flows with variable packet lengths to share bandwidth fairly. A detailed analysis of the performance of the DRR algorithm is presented in [61]. Like DRR, the SFQ algorithm also belongs to the group of fair queueing algorithms. Here, flows can be classified based on parameters such as source and destination addresses, and the allocation of data to the appropriate queue is based on the value of a hash function of these parameters. Each queue is handled fairly and cyclically in a round-robin fashion. It has been observed that SFQ outperforms DRR by maintaining a comparatively low per-packet delay; on the other hand, DRR has the higher packet ratio of the two [62]. Another fair queueing algorithm is the CBQ algorithm, a more advanced form of WRR. Here, data are classified into traffic classes so that specific groups of data can be given a specific priority, or weight; the weight can be specified as a percentage of the transmission channel throughput. A typical example of the use of the queueing algorithms discussed here involves communication nodes responsible for the transmission of various types of data in networks, where they allow the implementation of various QoS mechanisms. A communication node without any QoS mechanisms processes data in the order in which they were received, i.e., using the FIFO method. Devices supporting QoS can classify incoming data into predefined traffic classes and then handle the aggregated data streams; the handling process follows an implemented queueing mechanism and a defined handling and traffic-shaping policy to provide services with the established QoS [63].
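The utilisation bounds of Equation (14) can be checked with a minimal weighted round-robin sketch. The code below is illustrative and assumes every class is permanently backlogged, which is the same assumption under which Equation (14) holds:

```python
from collections import Counter

def wrr_cycle(weights):
    """One WRR cycle: take w_i requests from class i, in class order."""
    schedule = []
    for cls, w in enumerate(weights, start=1):
        schedule.extend([cls] * w)   # assumes every class is backlogged
    return schedule

weights = [3, 2, 1]
served = []
for _ in range(100):                 # 100 cycles of the scheduler
    served.extend(wrr_cycle(weights))
shares = {cls: n / len(served) for cls, n in Counter(served).items()}
print(shares)  # class 1: 0.5, class 2: ~0.333, class 3: ~0.167
```

With weights 3, 2, and 1, the served shares are exactly 1/2, 1/3, and 1/6, matching the utilisation bounds computed above.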
6. Queueing System Parameters
The notation proposed by David G. Kendall [64] is widely used to describe queueing systems. He proposed a simple symbolism with the following syntax:

A/B/m/N/S

where:
A—distribution of the inter-arrival times;
B—distribution of the service times;
m—number of servers;
N—queue capacity; if the queue is infinite (N = ∞), this letter is omitted;
S—optional parameter indicating the service discipline used (FIFO, LIFO, etc.); if the S parameter is omitted, the service discipline is assumed to be FIFO.
Appropriate letter designations are placed in the A and B positions to indicate the types of distributions used in the system. The following designations for distributions are commonly used:
M—exponential distribution, with the distribution function F(t) = 1 − e^{−λt} and density f(t) = λe^{−λt}, where the parameter λ > 0; it is a continuous distribution having the Markov (memoryless) property;
D—deterministic distribution;
E_k—Erlang distribution of order k; for the E_k distribution, the density is f(t) = λ^k t^{k−1} e^{−λt}/(k − 1)!;
H_k—hyperexponential distribution with k phases; here, the density is f(t) = Σ_{i=1}^{k} p_i λ_i e^{−λ_i t}, where Σ_{i=1}^{k} p_i = 1;
G—general distribution.
Of the distributions mentioned, the exponential distribution is considered extremely useful. On the one hand, it describes the behavior of many systems well and, on the other hand, its simplicity facilitates mathematical notation. The simplest queueing system with FIFO support can, therefore, be written in Kendall's notation as M/M/1. Such a model designation indicates a queueing system with a single service channel, an infinite queue, and an exponential distribution of both the service times and the inter-arrival times. For any queueing system, it is very important to determine its most important characteristics and parameters, which include the average number of requests residing in the system or queue, the average time a request stays in the system or queue, and the average intensity of the stream of requests in the system. The parameters describing the characteristics of the queueing system can be determined based on the argument presented in [65]. The average residence time of a request in the system is the sum of the average residence time of the request in the queue and the average handling time. Similarly, the average number of requests in the system is the sum of the average number of requests in the queue and the average number of requests being handled. Let us assume the following designations for the parameters in question:
L—average number of requests in the system;
W—average length of time a request stays in the system;
Q—average number of requests in the queue;
T—average waiting time of a request in the queue.
Let us consider the inflow and handling of requests in the queueing system shown in Figure 8. The process is described using the following notation:
α(t)—the number of requests flowing into the system up to time t;
δ(t)—the number of requests leaving the system up to time t;
T_i—the residence time of request i in the system;
τ—the observation period.
If the system is in a steady state, then the average number of incoming requests and the average number of requests leaving the system are equal to each other; both streams have an intensity equal to λ. The numbers of requests entering and leaving the system are described by the discrete functions α(t) and δ(t). At a given time t, their difference

N(t) = α(t) − δ(t)

determines the number of requests present in the system. If N(t) = 0, there are no requests in the system; in Figure 8, this is the area where α(t) and δ(t) form a single line. The average number of requests located in the system, L, during the analyzed time period τ can be determined as the integral of the function N(t) divided by the length of the time segment τ; hence, we have Equation (20):

L = (1/τ) ∫_0^τ N(t) dt   (20)

The calculated integral is the area of the figure shown in Figure 8. After a sufficiently long time τ, we can write the value of L as Equation (21):

L = (1/τ) Σ_{i=1}^{α(τ)} T_i   (21)

Multiplying and dividing the right-hand side of Equation (21) by the intensity λ = α(τ)/τ yields Equation (22):

L = (α(τ)/τ) · (1/α(τ)) Σ_{i=1}^{α(τ)} T_i   (22)

The value α(τ) is the number of requests that arrived in the system during τ. We can write the average time a request stays in the system as Equation (23):

W = (1/α(τ)) Σ_{i=1}^{α(τ)} T_i   (23)

By substituting Equation (23) into Equation (22), we finally obtain Little's law, Equation (24):

L = λW   (24)

Equation (24) expresses a very useful law of queueing theory, Little's law, which asserts that the time-averaged number of requests in a queueing system, L, is equal to the rate λ at which requests arrive and enter the system multiplied by the average sojourn time of a request, W. Little's law applies to any queueing system, regardless of the type of request stream, service time distribution, or queue discipline.
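Little's law can also be verified numerically. The sketch below simulates an M/M/1 FIFO queue (illustrative parameters λ = 0.5, μ = 1) and compares the time-averaged number of requests in the system, computed as in Equation (21), with the product λW:

```python
import random

def mm1_little(n, lam, mu, seed=2):
    """Simulate an M/M/1 FIFO queue and compare L with lam * W."""
    rng = random.Random(seed)
    t = 0.0
    prev_departure = 0.0
    arrivals, departures = [], []
    for _ in range(n):
        t += rng.expovariate(lam)                    # arrival time of the next request
        d = max(t, prev_departure) + rng.expovariate(mu)  # FIFO departure time
        arrivals.append(t)
        departures.append(d)
        prev_departure = d
    sojourns = [d - a for a, d in zip(arrivals, departures)]
    W = sum(sojourns) / n                            # mean time in system
    L = sum(sojourns) / departures[-1]               # time-average number in system
    lam_hat = n / arrivals[-1]                       # observed arrival rate
    return L, lam_hat * W

L, lamW = mm1_little(100_000, lam=0.5, mu=1.0)
print(L, lamW)  # both close to rho/(1 - rho) = 1.0
```

The two estimates agree up to boundary effects of the finite observation window, and both approach the analytical M/M/1 value L = ρ/(1 − ρ).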
7. Petri Nets and Timed Petri Nets
Petri nets, sometimes called net structures, are abstract formal models that have been developed in the search for natural, simple, and efficient methods of describing and analyzing the flow of information in systems. Carl A. Petri, in his work [66], proposed a formal model of systems composed of independently operating concurrent elements that can communicate with each other to exchange information or synchronize their processes. This model is commonly known as PNs (Petri nets) [67,68,69]. PNs represent a significant advancement in modeling parallel and concurrent processes. In PNs, the system's state is derived from a combination of local state variables, enabling a direct representation of concurrency, causality, and independence. A central feature of the model is that any form of synchronization in the system must be achieved through the exchange of information between the system components. This principle is reflected in the formal notation of PNs. A Petri net is an abstract concept, formally given in the form of an ordered triple [68,70]:

N = (P, T, A)

for which P is a non-empty finite set of places representing conditions, and T is a non-empty finite set of transitions representing events, such that (P ∩ T) = ∅ and (P ∪ T) ≠ ∅; A is a binary relation that binds the elements of the sets P and T:

A ⊆ (P × T) ∪ (T × P)

i.e., A does not directly connect places to places or transitions to transitions. It is known as the flow relation, also referred to as the causality relation; its elements are the incidences of the network vertices or, alternatively, the directed arcs. The relation A between places and transitions can be represented by two separate directional relations, A ∩ (P × T) and A ∩ (T × P). Both p-elements and t-elements have input and output sets: p-elements connected by arcs directed toward a t-element are called its input elements, while p-elements connected by arcs directed away from a t-element are called its output elements. The sets of input and output transitions of a p-element are defined as follows:

•p = {t ∈ T : (t, p) ∈ A},  p• = {t ∈ T : (p, t) ∈ A}.

Similarly, the sets of input and output places are defined for a particular transition t:

•t = {p ∈ P : (p, t) ∈ A},  t• = {p ∈ P : (t, p) ∈ A}.

A characteristic feature of nets is the graphical representation of information flow. The following interpretation is adopted: the net is represented as a bipartite graph G = (V, A), whose nodes are the elements of the sets P and T, i.e., V = P ∪ T, while the arcs are the pairs of the relation A.
Figure 9 shows an example of a net structure modeling the producer–consumer system.
The graphical representation of the network shown in
Figure 9 is a simple model of two processes communicating over an unlimited buffer, i.e., a producer–consumer system with an unbounded buffer. To fully illustrate the communication process, the model shown is supplemented with
tokens inside the
p-elements. The behavior of events in the network is represented precisely by the
tokens assigned to the
p-elements of the network. In PNs, the distribution of
tokens over places changes through occurrences, or transition firings. Enabling and firing rules are associated with transitions. Informally, the enabling rule defines the conditions that allow a transition to fire, and the firing rule specifies the change of state produced by the firing. Both rules are specified in terms of arc characteristics. A firing transition removes one token from each of its input places and adds one token to each of its output places. Typically, a
p-element with at least one
token associated with it means that the condition represented by that element is satisfied. The distribution of
tokens in p-elements can be described by the marking function m : P → {0, 1, 2, …} or can be represented as a vector describing the number of tokens assigned to the consecutive p-elements of the net, m = (m(p1), m(p2), …, m(pn)). The net N, together with the initial marking function m0, is called a marked net M and can be equivalently defined as in Equation (36):
M = (N, m0).
An example of a net structure extended with a
marking function, which determines the initial state or marking of the net, is shown in
Figure 9. In many practical applications of PNs, the activation of transitions is synchronized by external events, and certain actions are assigned to the locations of the net that are performed when the described system is in a certain state. The modeled example shows the process of asynchronous communication between the producer and consumer. The network, including the initial marking function with
tokens placed inside
p-elements, i.e.,
and
, represents a marked net,
M. The two cyclic subnets (
,
,
,
) and (
,
,
,
) represent the producer and the consumer, respectively, while place
is the buffer. As mentioned earlier, the marking function,
m, is an assignment of
tokens to
p-places and can be represented by a vector with as many components as there are places in the net; each
component then represents the number of
tokens assigned to a
p-place in the network. A change in the state of the sender occurs as a result of events assigned to transitions. Transition
is assigned the
data prepared event, and transition
is assigned the
data sent event. Similarly, the consumer may be in one of two states,
or
, and its change occurs as a result of the occurrence of events, assigned to transitions
and
, respectively. In a marked net,
M, a transition t is enabled by a marking m if every input place of t contains at least one token:
m(p) ≥ 1 for each input place p of t.
An enabled transition t can fire, transforming the marking m into a directly reachable marking m′:
m′(p) = m(p) − 1 for each input place p of t, m′(p) = m(p) + 1 for each output place p of t, and m′(p) = m(p) for all other places (with both adjustments applied when p is both an input and an output of t).
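The enabling and firing rules can be played out as a simple token game. The following sketch is illustrative only; the place and transition names are assumed and need not match the labels of Figure 9.

```python
# Minimal Petri net token game: a transition is enabled when every input
# place holds at least one token; firing moves one token along each arc.
def enabled(t, marking, inp):
    return all(marking[p] >= 1 for p in inp[t])

def fire(t, marking, inp, out):
    assert enabled(t, marking, inp)
    m = dict(marking)
    for p in inp[t]:
        m[p] -= 1          # remove one token from each input place
    for p in out[t]:
        m[p] += 1          # add one token to each output place
    return m

# A producer-consumer net with an unbounded buffer (names are illustrative):
# t1 = prepare data, t2 = send to buffer, t3 = receive, t4 = consume.
inp = {"t1": ["p1"], "t2": ["p2"], "t3": ["p3", "buf"], "t4": ["p4"]}
out = {"t1": ["p2"], "t2": ["p1", "buf"], "t3": ["p4"], "t4": ["p3"]}
m0 = {"p1": 1, "p2": 0, "buf": 0, "p3": 1, "p4": 0}

m = m0
for t in ["t1", "t2", "t1", "t2", "t3", "t4"]:
    m = fire(t, m, inp, out)
print(m)   # two items produced, one consumed -> one token left in "buf"
```

Replaying different firing sequences from the same initial marking is exactly the non-deterministic behavior described above.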
A pair comprising a place p and a transition t is called a self-loop if p is both an input and an output of t. A net is said to be pure if it has no self-loops. Pure nets are completely characterized by a single matrix, C. This incidence matrix, C, also referred to as the connectivity matrix, is an n × m matrix, where n is the number of p-elements and m is the number of
t-elements. For the network shown in
Figure 9, the incidence matrix takes the form:
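As an illustration of how such a matrix is constructed, the sketch below builds the incidence matrix for a producer–consumer-style net with assumed place and transition names (not taken from Figure 9): C[p][t] is +1 where t outputs to p, −1 where t inputs from p, and 0 otherwise. Firing a transition then updates the marking by adding the corresponding column of C.

```python
# Incidence matrix C (places x transitions) of a pure net:
# C[i][j] = +1 if t_j outputs to p_i, -1 if t_j inputs from p_i, else 0.
def incidence(places, transitions, inp, out):
    C = [[0] * len(transitions) for _ in places]
    for j, t in enumerate(transitions):
        for p in inp[t]:
            C[places.index(p)][j] -= 1
        for p in out[t]:
            C[places.index(p)][j] += 1
    return C

places = ["p1", "p2", "buf", "p3", "p4"]          # illustrative names
transitions = ["t1", "t2", "t3", "t4"]
inp = {"t1": ["p1"], "t2": ["p2"], "t3": ["p3", "buf"], "t4": ["p4"]}
out = {"t1": ["p2"], "t2": ["p1", "buf"], "t3": ["p4"], "t4": ["p3"]}

C = incidence(places, transitions, inp, out)
# Marking update for firing a transition t: m' = m + C[:, t]
m = [1, 0, 0, 1, 0]
j = transitions.index("t1")
m2 = [m[i] + C[i][j] for i in range(len(places))]
print(m2)   # [0, 1, 0, 1, 0]
```

This state equation view is what makes pure nets fully analyzable through the single matrix C.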
The analysis of the model shown in
Figure 9 shows that the depicted communication process is a unidirectional process and is determined by a sequence of arcs and nodes, corresponding to the unidirectional flow of
tokens in the model. Many practical process synchronization schemes are derived from this simple model. The model can be considered in different variants, where data
tokens flow into a queue buffer of varying capacities. Bernardini et al. [
71] researched models that synchronized through a buffer with a capacity of one element, as well as a parallel version with direct access to two separate buffers of equal capacity. Petri net theory has been studied extensively for many years, and this research has resulted in the definition of many net types and a substantial body of literature describing their properties. Many modifications of Petri nets have been created to solve specific problems of a complex nature. Although there are many different variations of Petri nets [
72,
73,
74,
75], their common feature is a structure based on a bipartite directed graph, i.e., a graph with two types of vertices alternately connected by directed edges (arcs). These two types of vertices represent, in a general sense, the conditions and events occurring in the modeled system, with each event being able to occur only if all the conditions associated with it are satisfied. A system interpretation that introduces events depending on their associated conditions (condition–event systems) is very convenient and helpful for modeling complex discrete systems [
76]. The choice of Petri net class is, thus, determined by the purpose of the analysis. The behavior of PNs is time-independent and characterized by the non-deterministic firing of transitions that are simultaneously enabled in a given marking. In order to investigate the performance aspects of systems modeled with PNs, however, the durations of the modeled activities must be taken into account. This can be done in different ways, resulting in different types of timed nets. Net models with attached temporal descriptions are called TPNs (Timed Petri Nets). We are dealing with TPNs when the firing of transitions is not instantaneous but takes a finite amount of time. This implies that, in defining such a net, timing characteristics are associated with the transitions. A TPN is defined as a pair,
(M, f), where
M is the marked net;
f is the firing time function, which assigns the average firing (occurrence) time to each transition of the net, f : T → R+, where R+ is the set of nonnegative real numbers. To analyze the performance of timed nets, an additional component is needed to describe random decisions in nondeterministic nets. This is the conflict resolution function, c, which assigns probabilities of firings to transitions in free-choice classes of transitions and relative frequencies of firings to transitions in conflict classes [
75]. The temporal characteristics of transitions can determine the time associated with the realization of a transition in various ways. In particular, this time can be a deterministic value or be described by a random variable with a given probability distribution. In the first case, the corresponding timed nets are referred to as D-timed nets [
77]; in the second, for the exponential distribution of occurrence times, the nets are called M-timed nets (Markovian nets) [
78]. In simulation applications, other distributions can also be used, such as the uniform distribution (U-timed nets). In TPNs, different distributions can be associated with different transitions of the same model, providing great flexibility in working with simulation models. A detailed characterization of TPNs, as well as of the behavior of timed nets with immediate and timed transitions, is presented in the literature [
75,
79,
80].
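The difference between D-timed and M-timed transitions thus amounts to how each occurrence time is drawn: a fixed value versus a sample from an exponential distribution with the given mean. A minimal sketch, with illustrative transition names and mean times:

```python
import random

# Firing time function f for timed transitions (illustrative values only):
# a D-timed transition always takes its fixed time; an M-timed (Markovian)
# transition draws an exponentially distributed time with the given mean.
def d_timed(mean_time):
    return lambda: mean_time                            # deterministic

def m_timed(mean_time):
    return lambda: random.expovariate(1.0 / mean_time)  # exponential

random.seed(7)
f = {"t_send": d_timed(1.0),   # e.g., transmission: 1 time unit (D-timed)
     "t_src": m_timed(2.0)}    # e.g., source: mean occurrence time 2.0 (M-timed)

samples = [f["t_src"]() for _ in range(100_000)]
print(round(sum(samples) / len(samples), 2))   # sample mean close to 2.0
print(f["t_send"]())                           # always exactly 1.0
```

In a simulation, each transition simply calls its own entry of f, which is what allows different distributions to coexist in one model.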
8. The Model
In order to evaluate the performance of the modeled systems, a Petri net model, shown schematically in
Figure 10, was used. It is advisable to drop data that exceed the priority queue size rather than incur additional queueing latency, which can lead to network congestion, resulting in delays, packet loss, and degradation of quality of service metrics. Therefore, the research presumes that the model of the system under study would consist of three priority queues with finite capacity. These will serve as system queues, i.e., storage areas for priority data after classification. Each queue will store incoming data of the same priority class. Such a mechanism prevents a priority queue from monopolizing the entire interface. This approach allows for effective traffic management for high, medium, and low priority data. In the models studied, the data are designated as class-1, class-2, and class-3, respectively. Transitions
,
, and
, together with places
,
, and
are independent sources of generated data for class-1, class-2, and class-3, respectively. The rates of generated data
,
, and
are determined by the occurrence times associated with
,
, and
. The Markovian mode of the timed transitions determines the (exponential) distribution of the inter-arrival times for each source. The three places,
,
, and
, are queues for class-1, class-2, and class-3 data, respectively. Timed transitions
,
, and
represent the transmission channels for class-1, class-2, and class-3 data. For simplicity of the description, it is assumed in this paper that the transmission times are approximately the same for data of different classes. In the model, the data transmission time is assumed to be deterministic at one time unit (for all three classes). In addition, in the model, place
is shared by transitions
,
, and
, in order to ensure that only one data type is transmitted at a time. Hence, place
is denoted by a single token (i.e.,
). In addition, arc inhibitors (
,
) do not allow data to be transmitted over transition
when there is class-1 data waiting to be transmitted. Similarly, arc inhibitors (
,
and
,
) only allow transmissions when there is no class-1 or class-2 data waiting to be transmitted. One characteristic of priority queues is that, as long as any data are in a higher-priority queue, data waiting in lower-priority queues cannot be transmitted. Transmitting data only from higher-priority queues can lead to a transmission starvation problem for data classified in lower-priority queues. The model under study uses a modification of the PQ algorithm that eliminates the problem of starvation of lower-priority data streams. It was assumed that the system would sequentially handle five requests from the class-1 queue, three requests from the class-2 queue, and finally two requests from the class-3 queue. In addition, the model used finite input queues, into which requests waiting to be handled by the system flow. Based on the models prepared, traffic shaping mechanisms were investigated in systems based on priority queueing with 5-3-2 priorities and finite input queues of 5, 25, and 75 requests. M-timed Petri nets (Markovian nets) were used as the sources of data generation in all three queues.
The rates of generated data are determined by the occurrence times associated with the three source transitions, and together they determine the traffic intensity. The average waiting time, the average queue length, and the utilization and throughput are estimated from the simulation results: the total (cumulative) firing times or total token waiting times, the number of firings or the number of tokens entering the places, and the simulation time.
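The 5-3-2 service pattern with finite input queues can be sketched as a simple event loop: per round, up to five class-1 requests, then up to three class-2 requests, then up to two class-3 requests are transmitted (one time unit each), and arrivals that find a full queue are dropped. This is a simplified illustration, not the timed Petri net model used in the study; the arrival rates, queue capacity, and horizon below are assumed.

```python
import random
from collections import deque

def simulate_pq532(lams=(0.4, 0.3, 0.2), cap=5, horizon=50_000, seed=3):
    """Weighted priority queueing with the 5-3-2 service pattern,
    deterministic unit transmission time, and finite queues."""
    random.seed(seed)
    weights = (5, 3, 2)
    queues = [deque() for _ in range(3)]
    nxt = [random.expovariate(l) for l in lams]   # next arrival per class
    sent = [0, 0, 0]
    dropped = [0, 0, 0]
    t = 0.0
    while t < horizon:
        # admit all arrivals up to time t; arrivals to a full queue are dropped
        for i in range(3):
            while nxt[i] <= t:
                if len(queues[i]) < cap:
                    queues[i].append(nxt[i])
                else:
                    dropped[i] += 1
                nxt[i] += random.expovariate(lams[i])
        # one 5-3-2 service round: 5 / 3 / 2 requests per class, if available
        idle = True
        for i in range(3):
            for _ in range(weights[i]):
                if queues[i]:
                    queues[i].popleft()
                    sent[i] += 1
                    t += 1.0          # unit transmission time
                    idle = False
        if idle:
            t = min(nxt)              # all queues empty: jump to next arrival
    return sent, dropped

sent, dropped = simulate_pq532()
print(sent, dropped)
```

Because the round always cycles through all three classes, lower-priority queues keep receiving service even under sustained high-priority load, which is the starvation-avoidance property of the 5-3-2 modification.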
9. Performance Characteristics
Performance results are obtained by event-driven simulation of the timed Petri net models. A collection of software tools for the analysis of timed Petri nets, TPN-tools, developed (and still being developed) at the Department of Computer Science of Memorial University, St. John’s, Canada, was used to perform the simulations and build the models of the priority queueing systems. In timed Petri nets, the occurrence times of some transitions may be equal to zero, which means that such occurrences are instantaneous; these transitions are referred to as immediate, while the others are referred to as timed [
75]. The model developed was evaluated for different combinations of modeling parameters. Performance results were compared for three data classes with weights equal to 5, 3, and 2 and for three queue lengths of 5, 25, and 75 requests. With three priority classes and weights 5, 3, and 2, the utilization bounds were equal to 0.5, 0.3, and 0.2 for classes 1, 2, and 3, respectively. In order to compare the system performance, an analysis was made of parameters such as the AWT (Average Waiting Time), the AQL (Average Queue Length), and the utilization. The research carried out has made it possible to estimate the impact of queue lengths on the performance characteristics of the system and the different classes of data. Experimental results reveal that, for the same number of data generated and the same channel usage, the AWT differs between classes. In addition, the study found significant differences between systems with queues of 5, 25, and 75. It was found that the AWT
characteristics in the range up to
are similar in all models and have linear behavior. However, for
, there are dynamic changes in the performance characteristics in the models studied. The largest increase in
AWT values was found for class-3 data. Furthermore, the maximum
AWT values for class-3 data were found to vary between different queue lengths. In the model with a queue length = 5, the
(
Figure 11), in the model with a queue length = 25, the value increases to
(
Figure 12), and for the longest queue length = 75, the
(
Figure 13). As can be seen, increasing the queue length to a value of 75 resulted in a
times the increase in the
value for class-3 data. Class-2 data are similar, but here, increasing the queue length to a value of 75 results in a
times increase in the
value. For the highest priority class-1 data, the increase is the greatest at
times. As can be seen, when the traffic volume approaches
,
becomes significant and approaches the maximum value for all three data classes.
The dynamics of the changes occurring in the model are most apparent for the AQL parameter, which is directly related to the AWT parameter. Both parameters show a similar character, as can be seen in (
Figure 14,
Figure 15 and
Figure 16). In all the studied models, the
tests showed that, for
, there is a sharp increase in queue length. As expected, in the first stage, the length of the queue of the lowest priority data, i.e., data of type class-3, increases sharply. In the studied model with a queue length = 75, it was found that for
, the
. On the other hand, increasing the value of
by
, i.e., increasing to
, results in a dynamic increase in data queueing, and the value is
. Similar dynamic changes were observed for the data of class-2 and class-1. The study showed that for
, the
values in all models are close to the maximum values.
In the next stage, the utilization parameter was examined. As before, systems with queue lengths of 5, 25, and 75 were analyzed. Experimental results reveal that this parameter has similar characteristics in all the models studied, as can be seen in
Figure 17,
Figure 18 and
Figure 19. As the value of
increases, there is a proportional increase in the value of the utilization parameter. The linear characteristics of the utilization parameter for the investigated classes occur in the range of
from 0 to
. In this range, all classes proportionally share the entire available bandwidth of the transmission channel. For
, class-1, class-2, and class-3 data use the entire available bandwidth, and the transmission channel is shared at a ratio of
per class. As the amount of incoming data increases, the available bandwidth of the system decreases. Once all the bandwidth is used, i.e., for
, we observe a process of data competition for access to the link. In the first instance, we observe a collapse in the linear characteristics of class-3 data, with the maximum utilization value for class-3 data occurring for
(see
Figure 18). As the incoming data continue to increase, the utilization value decreases, but stabilizes at
for
. A similar phenomenon is observed for class-2 data, but here, the maximum utilization value is higher, as class-2 data only compete for available bandwidth with class-1 data. As the
value increases, the utilization value stabilizes at
for
. Class-1 data are the highest priority data and do not have to compete for available bandwidth with other data. The linear behavior of the utilization parameter for class-1 data occurs until it reaches a value of
. For high throughputs, the value stabilizes and remains constant at the same level. Systems with queues of 5, 25, and 75 show no significant differences in the utilization parameter.
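Under full load, the long-run shares of the channel follow directly from the 5-3-2 weights: each class’s share is its weight divided by the sum of the weights, which can be checked in a few lines:

```python
# Per-class share of a fully loaded channel under 5-3-2 weighted service.
weights = {"class-1": 5, "class-2": 3, "class-3": 2}
total = sum(weights.values())
shares = {c: w / total for c, w in weights.items()}
print(shares)   # {'class-1': 0.5, 'class-2': 0.3, 'class-3': 0.2}
```

These are the levels at which the per-class utilization curves saturate once the offered traffic exceeds the channel capacity.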
In the final stage, the behavior of the transmission channel was investigated with a large amount of incoming data into the system. The behavior of the transmission channel shared by the three data classes was analyzed as a function of class-1 data traffic intensity and at constant traffic intensities for class-2 and class-3 data (class-2 = 0.6 and class-3 = 0.3) (
Figure 20). For class-3 data, channel utilization remains unchanged at the initial level for
. Thereafter, we observe a decrease in channel utilization as the value of
increases. The value of the utilization parameter stabilizes at
for
. For class-2 data, the channel utilization has a more dynamic course, and the channel utilization halves. From an initial value of utilization =
, we observe a decrease to
. The value of the utilization parameter stabilizes at
for
. With class-2 data, channel utilization decreases for
. This means that class-2 data start competing for link access when
reaches
. Interestingly, class-3 data (lower-priority data) only start competing for link access when
reaches
. This phenomenon is well illustrated by the AWT and AQL, as shown in
Figure 21 and
Figure 22. The AWT and AQL studies carried out revealed that when there is a large volume of data traffic in the system, the performances of higher-priority data can deteriorate. As can be seen in
Figure 21, for
, the average waiting time of class-2 data increases sharply, while the lower-priority class-3 data show much better characteristics in the same range. The AQL parameter shows a similar phenomenon: in the first stage, there is a sharp increase in the queue length for class-2 data, followed by a rise in the filling of the lower-priority class-3 queue (
Figure 22). Class-1 data, having the highest priority, show the best characteristics for both the AWT and AQL parameters examined.