1. Introduction and Related Work
We are currently in a transition stage, in which the digitization of production processes is becoming more widespread, with large areas of new technologies such as Big Data, Machine Learning or Artificial Intelligence depending on it. These new technologies will improve and optimize the configuration and maintenance of industrial processes, even leading to the proposal of a Digital Twin of an industrial scenario, with the ability to test new configurations in controlled virtual environments or to predict the behaviour of systems in order to anticipate unexpected behaviours such as the deterioration of a tool.
Personal area Wireless Sensor Networks (WSNs) allow for an increase in this level of digitization, enriching the systems already deployed in the plant with greater flexibility, which allows deployment of a large number of sensors, without strict limitations in range, location or power consumption. The Time-Slotted Channel Hopping (TSCH) mechanism proposed in the IEEE 802.15.4e standard [1] has positioned itself as one of the benchmarks for WSN deployment in industry, since its deterministic behaviour and the use of frequency hopping techniques are designed to work in the type of scenario where there is a high probability of interference.
The combination of the IEEE 802.15.4e standard with the IPv6 Routing Protocol for Low-Power and Lossy Networks (RPL) [2] allows the construction of meshed networks whose routes are updated dynamically and autonomously, which facilitates the unattended use of this type of network. One of the weaknesses of these systems based on TSCH and RPL is the network formation process, which can take a long time if the parameters involved in the process are not configured correctly. For applications such as home automation or smart cities, the network formation time may not be as relevant. However, in industrial applications, the network commissioning time and the energy efficiency associated with this process are key performance factors to consider when deploying this kind of WSN. On the TSCH side, this is due to the fact that the nodes do not know in advance at what instant or on what frequency the messages needed to synchronize will be transmitted. This process is not detailed in the standard, so different alternatives have been proposed to speed up this stage of the formation of a WSN.
Some authors, such as [3,4,5], propose different alternatives and schemes to improve synchronization processes, such as TSCH schedules dedicated to beacon transmission or modifications of the beacon transmission period to use optimal traffic to speed up synchronization. However, none of these approaches covers the entire network formation process, since they do not consider the creation of the RPL routing tables.
On the RPL side, the formation of both upstream and downstream routes is carried out once TSCH synchronization has been obtained, exchanging control messages with the rest of the neighbouring nodes to establish the optimal route to the root node. The formation of these routes can be compromised without proper use of radio resources, as interference can cause these control messages to collide, delaying the network formation process.
Other works have addressed the problem of the whole WSN formation process. The authors of [6] propose a dynamic algorithm to manage the resources in IPv6 over the TSCH mode of IEEE 802.15.4e (6TiSCH) [7] and RPL networks, improving the behaviour of previous state-of-the-art solutions. A network based on TSCH and RPL is also used in [8], reducing the network formation time by reducing the slotframe size during the network deployment phase. The authors of [9,10] conducted a study in which they observed how the synchronization and network formation time are affected by different configuration parameters, highlighting the beacon transmission period. Based on this analysis, they propose a mechanism called Bell-X, which adjusts this period dynamically, allowing the synchronization to be sped up, not only during deployment but also to improve the connection of a new node during the steady state of the network. Yaala et al. [11] develop an analytical model to characterize the synchronization process in TSCH using shared slots and sporadic traffic.
In order to characterize the behaviour in the different phases, an analytical model was proposed for the TSCH synchronization time and the formation of upstream routes through RPL [12]. This model allowed the behaviour to be observed under different configurations of the parameters involved in the network formation process, providing a global view that allows the configuration of these protocols to be optimized for different use cases. The formation of downstream routes has other complications, since interfering nodes affect each hop in a localized manner. This paper proposes an analytical model to characterize the time it takes to form the downstream routes in RPL, considering the number of interfering nodes in each hop. Using this model, it is possible to analyse the impact on the network formation time of different protocol parameters and physical configurations of the WSN, such as the number of interfering nodes that cause collisions with the RPL traffic.
This multi-hop message exchange covers the formation of downstream routes in RPL, and in combination with the published results of [12], including TSCH synchronization and upstream routes formation, the entire network formation process is temporally characterized. Thanks to these models, it is possible to predict the behaviour of a WSN based on TSCH and RPL under different configurations and deployments, in order to optimize the configuration of both protocols in industrial scenarios.
The rest of the article is structured as follows. Section 2 introduces the complete network formation process, looking at both the TSCH synchronization phase and the creation of upstream and downstream RPL routes. Section 3 proposes the analytical model for the downstream route formation time in RPL, while Section 4 describes the simulations performed to test the reliability of the model. Section 5 shows the results obtained in the different simulations. Finally, Section 6 presents the conclusions obtained after this analysis.
2. TSCH and RPL Network Deployment
The TSCH medium access method has advantages in industrial environments, thanks to its deterministic operation and frequency hopping mechanism. To achieve this determinism, TSCH networks need to maintain precise time synchronization, and thanks to the scheduling shared by the entire network, the equipment can transmit information without the probability of collision with other messages.
The resources to be managed through this planning are the timeslots, a series of slots into which the time space is divided, giving enough time to transmit a message of the maximum size defined in the standard and to receive an acknowledgement confirming that the message has been received correctly.
To maintain this synchronization throughout the lifetime of the network, all nodes periodically send beacon messages, called Enhanced Beacons (EBs). The parameter sent in the EBs to perform the synchronization is known as the Absolute Slot Number (ASN) and represents the total number of slots that have elapsed since the network was started. In this way, all nodes that are synchronized with the TSCH network share the same time source, obtained thanks to the ASN parameter. To maintain this time synchronization, each time a node receives a new EB message, it compares its ASN value with the one it has received and corrects any deviation that may have occurred.
According to the operation defined in the IEEE 802.15.4 standard, EB messages are transmitted with a period fixed in the configuration of the devices. Due to the frequency hopping mechanism and the fact that the slotframe size is a prime number, each time an EB message is sent, a different channel will be used, following a frequency hopping pattern defined in the device configuration. To calculate the channel frequency from the ASN value, the standard proposes the following function:
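The channel-selection rule usually given for TSCH takes the ASN plus a channel offset, modulo the length of the hopping sequence, as an index into that sequence. The following is a minimal sketch of that rule, not a reproduction of the paper's Equation (1); the function names, the four-channel hopping sequence and the slotframe lengths are illustrative assumptions.

```python
# Illustrative 4-channel hopping sequence (assumed values, not from the paper).
HOPPING_SEQUENCE = [15, 25, 26, 20]


def tsch_channel(asn, channel_offset, hopping_sequence=HOPPING_SEQUENCE):
    """Map an Absolute Slot Number to a physical channel."""
    index = (asn + channel_offset) % len(hopping_sequence)
    return hopping_sequence[index]


def eb_channels(first_asn, slotframe_len, count, channel_offset=0):
    """Channels used by `count` consecutive EBs sent once per slotframe."""
    return [
        tsch_channel(first_asn + k * slotframe_len, channel_offset)
        for k in range(count)
    ]


if __name__ == "__main__":
    # Slotframe length coprime with the number of channels: the EB channel rotates.
    print(eb_channels(first_asn=0, slotframe_len=397, count=4))
    # Slotframe length that is a multiple of the number of channels: the EB
    # would always land on the same channel.
    print(eb_channels(first_asn=0, slotframe_len=400, count=4))
```

What makes the channel change from one EB to the next is that the slotframe length is coprime with the number of channels; a prime length larger than the number of channels guarantees this.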
On the other hand, the nodes waiting to synchronize start the scanning process. Since the EB messages are transmitted on different frequencies following a certain hopping pattern, the scanning process is also performed on different frequencies, but the scanning frequency is changed randomly.
Figure 1 depicts the slotframes of two nodes during the TSCH synchronization phase using a total of four channels, both for normal operation and for the scan phase. As can be seen, the duration of scanning is much longer than the time in which a message is transmitted by the already connected nodes. The diagram also shows different cases where synchronization may fail. These failed synchronization attempts may be due to the fact that during the time the node is listening to a channel there is no message being transmitted or because the message is transmitted on a different frequency from the one the node is listening to. It is also interesting to note that, during the scanning process, the node keeps the radio always active on different channels until it receives the beacon, while nodes already synchronized can turn off their radio in the inactive slots to reduce energy consumption.
The configuration of the devices involves several parameters that affect this synchronization process, such as the total number of channels used, the number of channels being scanned, and the scan time of each channel or the frequency hopping pattern. In the 2.4 GHz band, a maximum of 16 channels can be used, which may be different from the channels being scanned. The configuration of these parameters is not detailed in the standard, so a wrong configuration can lead to a very long synchronization time or even an inability to synchronize at all.
Once the new devices have been synchronized in TSCH, the nodes will follow the same pattern of active and inactive slots within the slotframe, thus reducing their duty cycle with respect to the scanning phase. However, the nodes do not yet know the network topology, so they need to receive a DIO message from RPL. The nodes that are trying to join the topology will start sending DIS messages to request the information needed to join the RPL topology. These messages will be transmitted with a fixed period, which is configured by default in the devices.
On the other hand, the nodes already connected to the network will be sending DIO messages using the Trickle Timer mechanism, which modifies the period with which these messages are sent. The Trickle Timer mechanism increases the time between DIO messages, so that if the network remains stable, DIO messages are sent much less frequently. The Trickle Timer is reset only when there is a change in the topology, such as a change of parent, or when a DIS message is received from another device.
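The growth and reset of the DIO interval can be sketched as follows. This is a simplified illustration, assuming an interval that doubles on each expiry up to a maximum and returns to its minimum on a reset; the random transmission point inside each interval and the message suppression rule of the full Trickle algorithm are omitted. The class name and the numeric values are hypothetical.

```python
class TrickleTimer:
    """Simplified Trickle interval: doubles on expiry, resets to the minimum."""

    def __init__(self, i_min_s, doublings):
        self.i_min_s = i_min_s
        # Largest interval reachable after the allowed number of doublings.
        self.i_max_s = i_min_s * 2 ** doublings
        self.interval_s = i_min_s

    def on_interval_expired(self):
        """Network stable: double the interval, capped at the maximum."""
        self.interval_s = min(self.interval_s * 2, self.i_max_s)
        return self.interval_s

    def reset(self):
        """Topology change or DIS received: go back to the minimum interval."""
        self.interval_s = self.i_min_s
        return self.interval_s


if __name__ == "__main__":
    timer = TrickleTimer(i_min_s=4.0, doublings=3)  # assumed example values
    print([timer.on_interval_expired() for _ in range(4)])
    print(timer.reset())
```

With these example values the interval grows from 4 s to a ceiling of 32 s while the network is stable, and a DIS from a joining node brings it back to 4 s so that DIO messages are sent more often again.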
Figure 2 shows the RPL message exchange mechanism after a new node has synchronized. The figure shows how the node that is already part of the network modifies the period with which DIO messages are transmitted, and how the Trickle Timer is restarted when a DIS message is received from the new node. In this way, DIO messages are forced to be sent more frequently to encourage the RPL tree to be updated. The connection to the RPL topology only occurs when the new node receives a DIO message with information about the topology.
Finally, for the creation of downstream routes, the nodes that have just joined the RPL topology must use that link to the root node to transmit a DAO message, so that all nodes on the path know the downstream route to the new node.
Figure 3 shows the comparison between a multi-hop message exchange, such as the DAO message, and a multicast message exchange, which only reaches nodes within the coverage area. Unlike the previous processes, the DAO message is a unicast transmission, while EBs and broadcast DIO messages are transmitted in multicast to all nodes in the WSN network. This means that both EB and DIO transmissions will be single-hop transmissions to nodes within the coverage area, while the DAO message must be sent all the way to the root, so that in multi-hop meshed networks it must be forwarded over several hops until it reaches its destination. The RPL protocol allows this process to be configured so that, in addition to sending a DAO message from the new node to the root, a DAO-ACK is sent to confirm that the root node has correctly received the message.
3. DAO Multi-Hop Analytical Model
The time it takes for the new node to form the downstream routes by sending the DAO message is modelled next. This message originates from the node that has just joined the RPL topology after receiving a DIO message and must be transmitted up to the RPL topology coordinator via the default route. In this way, all nodes in the path add the route to the new node.
Table 1 shows the different parameters that influence the process of sending the DAO message. These parameters have been taken into account when defining the analytical model, which allows the time it takes for each node to send this message to the root to be characterized.
Taking into account the probability of successful transmission, given by the Packet Delivery Rate (PDR), it is possible to determine the time it takes to send a DAO message in ideal situations on a single-hop link. In the best case, this message will be transmitted in the next available slotframe, whose duration is determined by the number of timeslots dedicated to the RPL slotframe. Since the DAO message can be generated at any point in the slotframe, this waiting time is randomly distributed between 0 and the slotframe duration, so that on average the time it takes to transmit this message on a single-hop link will be half a slotframe, with a probability given by the PDR. If the message is lost, which happens with probability (1 − PDR), it will be resent in the next slotframe. The message can be resent up to four times. The time taken to send the DAO message behaves differently in the first hop than in the rest, since the DAO can be generated at any instant within the slotframe, while retransmissions and forwarded messages are always sent in the next slotframe. This behaviour is characterized by an indicator parameter, which has a value of 1 in the first hop and 0 in all other cases.
Using as an example the case in which
, Equation (2) would be represented as shown below,
where the first operand represents the first attempt to send the DAO message and the following are the forwarding messages if a collision occurs.
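The single-hop behaviour described above can be illustrated numerically. The sketch below is not the paper's Equation (2): it assumes that each attempt succeeds independently with probability PDR, that every retransmission adds one slotframe, and that the average is taken over the cases in which the DAO is delivered within the retry budget. The function name, the parameter names and the 310 ms slotframe duration used in the example are assumptions.

```python
def single_hop_dao_time(pdr, slotframe_s, first_hop, max_retx=4):
    """Expected time to deliver a DAO over one hop, given delivery succeeds.

    first_hop=True  -> the DAO is generated at a random point in the
                       slotframe, so the first wait averages half a slotframe.
    first_hop=False -> the DAO is forwarded in the next slotframe.
    """
    if not 0.0 < pdr <= 1.0:
        raise ValueError("pdr must be in (0, 1]")
    first_wait = slotframe_s / 2 if first_hop else slotframe_s
    weighted_time = 0.0
    delivered = 0.0
    for k in range(max_retx + 1):
        # Probability that the first k attempts fail and attempt k+1 succeeds.
        p = pdr * (1.0 - pdr) ** k
        weighted_time += p * (first_wait + k * slotframe_s)
        delivered += p
    return weighted_time / delivered


if __name__ == "__main__":
    for pdr in (1.0, 0.9, 0.5):
        print(pdr, single_hop_dao_time(pdr, slotframe_s=0.31, first_hop=True))
```

With a PDR of 1 the result reduces to half a slotframe in the first hop and one slotframe in later hops; lower PDR values add whole slotframes for each retransmission.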
This parameter only indicates the time to send the message over a single-hop link, so to reach the RPL root it would be necessary to multiply it by the number of hops to the root node. However, due to the configuration of the Orchestra slotframe [13] dedicated to RPL control traffic, DAO messages are sent through a single shared slot, so they may collide with other types of messages such as DIO, which are sent by all nodes within the coverage area following the pattern set by the Trickle Timer. In this way, in each hop the node that has to forward the message has a set of interfering nodes within its coverage area, which need not coincide with the interferers of another hop. To model this behaviour, a vector of inter-hop nodes has been defined, where the size of the vector indicates the number of hops and the value at each position indicates the number of interfering nodes in that hop. The following equation shows a vector of size three, which indicates the number of interfering nodes in each hop of a three-hop link up to the root.
To consider possible collisions with DIO messages, a parameter has been defined that indicates the probability that a DIO message is being transmitted in a given slotframe. This parameter is calculated from the Trickle Timer period with which the control messages are being transmitted and the size of the slotframe dedicated to RPL traffic.
Finally, the time it takes to send the DAO message to the root is calculated from the single-hop transmission time as a function of the PDR, taking into account the possible interference with DIO messages in each of the hops. For this purpose, the summation goes through the vector of interfering nodes defined above.
The first fraction of Equation (5) represents the sending time of the DAO message in the first hop of the path to the root node. It takes into account that the DAO message can be created at any time during the slotframe, so the average wait is half a slotframe, as explained for Equation (2), and it uses the first element of the vector of interfering nodes. The second part of the same equation covers all other hops along the path to the root, using a full slotframe as the wait, since the forwarding of the DAO message is assigned to the next slotframe after the previous forwarded message is received. In this case, the summation traverses the vector of interfering nodes hop by hop.
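The structure of the multi-hop calculation, with a first-hop term followed by a sum over the remaining hops, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's equation: the per-hop time uses an independent-retry model, and the effect of interfering nodes and DIO collisions is folded into a caller-supplied function `pdr_for`, which is hypothetical and does not reproduce the measured PDR curve of the paper.

```python
from typing import Callable, Sequence


def hop_time(pdr: float, slotframe_s: float, first_hop: bool,
             max_retx: int = 4) -> float:
    """Expected per-hop DAO time, conditioned on delivery (assumed model)."""
    if not 0.0 < pdr <= 1.0:
        raise ValueError("pdr must be in (0, 1]")
    first_wait = slotframe_s / 2 if first_hop else slotframe_s
    probs = [pdr * (1.0 - pdr) ** k for k in range(max_retx + 1)]
    times = [first_wait + k * slotframe_s for k in range(max_retx + 1)]
    return sum(p * t for p, t in zip(probs, times)) / sum(probs)


def dao_time_to_root(interferers: Sequence[int],
                     pdr_for: Callable[[int], float],
                     slotframe_s: float) -> float:
    """Sum the per-hop times along the path, one vector entry per hop."""
    return sum(
        hop_time(pdr_for(n), slotframe_s, first_hop=(hop == 0))
        for hop, n in enumerate(interferers)
    )


if __name__ == "__main__":
    # Placeholder PDR curve: purely illustrative, not measured data.
    def example_pdr(n_interferers: int) -> float:
        return 1.0 / (1.0 + 0.1 * n_interferers)

    # Three-hop path with ten, five and zero interfering nodes per hop.
    print(dao_time_to_root([10, 5, 0], example_pdr, slotframe_s=0.31))
```

Only the first entry of the vector uses the half-slotframe average wait; every later hop starts from a full slotframe, which mirrors the split between the first term and the summation described above.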
4. Simulations
To validate the analytical model, simulations have been performed with different deployment configurations of a WSN. As there is a wide variety of possible configurations, given that the interfering nodes in each hop can vary, the experiments have been limited to a maximum of three hops, while the number of interfering nodes in any given hop reaches at most fifteen. These configurations give enough information to validate the accuracy of the model, covering a wide range of applications for industrial scenarios, where the number of devices is usually between 10 and 30 nodes [14].
The protocol stack and the variations of each simulated configuration have been set up using the Contiki-NG operating system, as it includes the Cooja simulator, with which networks based on TSCH and RPL can be deployed.
Figure 4 shows one of the topologies under study, where the newly deployed node has ID 15, and is located three hops away from the root node (node 1), so it must relay its messages through nodes 3 and 2. The communication between node 15 and node 3 has ten interfering nodes, including node 2 and nodes 6 to 14. For the communication between node 3 and node 2, there would be five interfering nodes, including node 1 and nodes 4 to 7. The last hop, having no other node within the coverage area, has no interfering node. To validate the behaviour of this type of multi-hop network with sufficient repeatability, the selection of routes has been limited so that the same type of multi-hop network is always formed, while varying the number of interfering nodes at each stage up to the root.
Simulations were performed for 11 different configurations, whose topologies are shown in Figure 5 for those in which the new node is one hop away from the root, Figure 6 for those in which it is located two hops away, and Figure 7 for those in which the new node is deployed three hops away from the root. Each of these simulations was repeated 10 times, using a different seed for each repetition.
The procedure followed in each of the simulations begins by deploying an initial network, based on the configurations presented above. Deploying this initial network and letting it converge before connecting the node under consideration aims to normalize all simulations under the same initial conditions. Once the initial network is deployed, with both the TSCH synchronization and the RPL topology established, a new node is added at the location planned in each of the configurations. The simulation ends when the DAO message is received by the root. To analyse the time required to send and forward the multi-hop message, the processes at the MAC layer are taken into account. Through the Cooja simulator, it is possible to check the cases of retransmissions due to interference and the time needed to send the control message along the path to the root node.
Table 2 shows the basic configuration of some parameters of the selected communications architecture. For scheduling, we have used the Orchestra autonomous scheduler, which allows us to configure slotframes dedicated to different traffic planes. Since we want to evaluate only the behaviour of downstream route formation, RPL will use a single slotframe to transmit all its control messages, whether multicast or unicast. In order to maintain a constant level of interference in all simulations, a constant traffic of multicast DIO messages has been generated, instead of using the period set by the Trickle Timer. This makes the results as homogeneous as possible, so that they depend only on the number of interfering nodes and the number of hops to the root.
For the validation of the analytical model, it is necessary to know the behaviour of the PDR of the transmission medium. The PDR has not been evaluated over all the transmissions of a node, but only in the Orchestra slotframe dedicated to RPL, which is based on minimal 6TiSCH [15]. Thus, the PDR behaviour is associated with the transmissions of RPL messages on a single resource per slotframe.
Figure 8 shows the evolution of the PDR based on the number of interfering nodes.
5. Results and Model Validation
The different results obtained from the simulations described above are shown below. The main results show the evolution of the DAO message reception time as a function of the number of interfering nodes. To represent this time as a function of the interfering nodes of several hops, multidimensional graphs have been plotted, so that the behaviour of this configuration can be seen by representing the most relevant slices.
Figure 9 shows the results obtained for the simulations with a single hop. The curve represents the equation obtained from the analytical model, while the asterisks show the average value obtained from the simulations, and the standard deviation is represented by a plus sign. As can be seen in the figure, the results obtained from the simulations show a behaviour very close to the curve defined by the model. An interesting aspect that can be deduced from this figure is the behaviour of the standard deviation. Since the messages are sent only in the slots dedicated to RPL, there is only one opportunity to send a message every 310 ms, so the latency increases by multiples of this value each time interference occurs.
Figure 10 shows the two-dimensional response for the two-hop configurations. To facilitate the representation and comparison of the results of these configurations, the slices of the surface corresponding to the first-level interfering nodes have been chosen and are represented as curves in Figure 11. Each of the colours represents one of the sections marked in Figure 10. The curves represent the response of the proposed analytical model, while the asterisk and the plus sign represent the average and standard deviation obtained from the simulations. As with the single-hop simulations, the analytical model fits correctly, with the standard deviation increasing as the interference increases in each of the hops.
For simulations with three hops, the representation would be characterized by a volume, since the three axes represent the interfering nodes in each of the hops. However, Figure 12 shows only the four surfaces of interest according to the simulated configurations. Figure 13 shows these four surfaces separately, indicating the sections of interest to be analysed, which correspond to the simulated configurations.
As in the previous cases, Figure 14 shows only the curves for each of the configurations of interest, in order to show in a simple way the comparison of the results with the analytical model. In these last configurations with three hops, the results obtained in the simulations also coincide with the expected behaviour of the proposed model. Analysis of the results presented in Figure 9, Figure 11 and Figure 14 shows that the behaviour of the proposed model, as a function of the number of hops to the root node and the number of interfering nodes, is similar to the real response of a WSN based on TSCH and RPL.
6. Conclusions
The establishment of downstream routes represents the final stage of the WSN formation process. Its behaviour differs from that of the multicast control messages exchanged to obtain synchronization and form the upstream routes, since DAO messages can be sent over multi-hop links, where the interfering nodes vary with each retransmission. In this work, an equation that models the time needed to form the downstream routes using the RPL DAO message is proposed. This message, unlike those of the previous phases of node deployment, is transmitted in unicast and must be sent from the new node to the root, so that when they are not within the same coverage area, a multi-hop communication is performed.
The communications architecture used is designed to meet the requirements of industrial applications through the use of the TSCH mechanism. This provides determinism and reliability, while the RPL protocol allows routing tables in WSN networks to be built and maintained autonomously. The use of the proposed analytical model and the study that accompanies it allows us to know in detail which aspects are more relevant during the network formation phase, allowing better configurations to be chosen in order to speed up this process. With a bad configuration of the protocol stack parameters, network formation can take many minutes or may even fail to connect, resulting in a loss of resources and energy dedicated to the process of connecting the devices.
From the results obtained from the simulations, it has been verified that the analytical model follows the same behaviour as the simulated networks, whose variations have focused on determining the impact of interfering messages in RPL on the different hops of a link between node and root. This interference produced in the RPL traffic plane considerably affects the connection time and efficiency in the exchange of messages, so that if measures such as sending DAO-ACK or allocating more resources to RPL are not adopted, the construction of downstream routes may not be completed correctly.