1. Introduction
TCP, the de facto transport-layer protocol for the Internet, was long considered unsuitable for the wireless sensor networks that enable IoT [1,2]. However, protocol standardization for IoT resulted in TCP emerging as a major candidate transport protocol, even for constrained IoT networks, because TCP integrates better than UDP (User Datagram Protocol) with existing network infrastructures, where middleboxes such as firewalls or NATs (Network Address Translation devices) may block UDP packets. In Reference [1], Gomez et al. stated that recent industry and standardization tendencies, such as the Constrained Application Protocol (CoAP) over TCP, HTTP (Hypertext Transfer Protocol) optimized for IoT, and messaging protocols such as the Message Queuing Telemetry Transport (MQTT) and the Advanced Message Queuing Protocol (AMQP), suggest that TCP may gain extensive support in IoT scenarios. Regarding the historically claimed issues of TCP in IoT scenarios, they argue that most of these issues are also found in well-accepted IoT end-to-end reliability mechanisms, can be solved, or are not actual issues.
The major challenge in designing a protocol for constrained networks is making it work well under severe resource constraints such as limited battery and computing power, small memory, and insufficient wireless bandwidth and communication capability [3]. To reduce memory usage, for example, the TCP of the uIP stack [4,5] does not implement a sliding window for sending and receiving data and limits the number of outstanding segments to one. Several other embedded TCP implementations, including the BLIP (Berkeley Low-power IP) stack [6] in TinyOS [7] and GNRC (the Generic Network Stack) [8] in RIOT OS [9], also keep the TCP window at one segment [10,11,12]. However, this does not mean that these TCP implementations need no congestion control mechanism. Since periodic reporting by every node in scenarios such as continuous monitoring may cause congestion around the sink, TCP still needs to control congestion. In this case, managing the retransmission timer is the primary means of congestion control. An optimal mechanism for setting the retransmission timer value reduces retransmissions while maintaining high throughput. Reducing TCP retransmissions in wireless sensor networks is particularly important because end-to-end retransmission consumes much more energy than hop-by-hop retransmission.
Radio duty cycling (RDC) is a major technique for saving battery power in wireless sensor networks. However, despite a number of prior works on TCP for wireless sensor networks [10,13,14,15,16], few studies have considered RDC. TCP is required to work well regardless of whether an RDC mechanism runs underneath. We examined TCP performance in constrained networks with RDC enabled. In a preliminary test using the Cooja network simulator, we found that uIP TCP causes many retransmissions when ContikiMAC [17], the default radio duty cycling mechanism in Contiki OS, is enabled. uIP TCP's mechanism for managing the retransmission timer explains a considerable part of the problem. uIP TCP sets the retransmission timer based on RTT measurements for first transmissions; for retransmissions, however, it sets the timer based on a fixed retransmission timeout (RTO) of 3 s. Since ContikiMAC may cause large RTT variations due to the hidden terminal problem, a retransmission timer based on the fixed RTO value may cause many retransmissions, which could be avoided with an RTO estimated from RTT measurements. This motivated us to design a scheme that enhances uIP TCP's performance in networks with RDC enabled.
In devising a mechanism for managing the retransmission timer, we note that CoAP [18], an application protocol for constrained networks, also usually limits the number of outstanding messages to one. Congestion control relying only on RTO management was actively studied in the context of CoAP over UDP. CoAP Simple Congestion Control/Advanced (CoCoA) [19,20] is one of the best-known schemes for enhancing congestion control in CoAP. We propose a scheme that enhances uIP TCP's performance by employing notions introduced by CoCoA, such as weak RTT estimation and backoff with variability.
Some might argue that in-network congestion control is better than per-flow end-to-end congestion control in IoT networks. In-network congestion control is effective because network devices can use information that is not available to end systems. Accordingly, to improve congestion control and load balancing at the IPv6 Routing Protocol for Low-Power and Lossy Networks (RPL) layer, numerous RPL schemes have been proposed, including ORPL (Opportunistic RPL), QU-RPL (Queue-Utilization-based RPL), PC-RPL (Power-Controlled RPL), and BRPL (Backpressure RPL) [21,22,23,24]. However, to prevent the sources from overwhelming the network, we also need an end-to-end congestion control mechanism that makes the sources control their sending rates. In-network and end-to-end approaches can complement each other.
Our main contributions can be summarized as follows:
We investigate the effect of RTT and RTO estimation on uIP TCP performance in constrained networks.
We examine how to adopt, for uIP TCP, the notion of weak RTT and backoff with variability, which CoCoA uses for congestion control in CoAP.
We propose a new mechanism for managing the retransmission timer to enhance TCP performance in constrained networks with RDC enabled.
The rest of the paper is organized as follows: Section 2 provides the related work; Section 3 gives a brief overview of uIP TCP and CoCoA; Section 4 discusses the preliminary test results; Section 5 presents our proposed scheme; Section 6 discusses the performance evaluation; lastly, Section 7 draws conclusions.
3. Overview of uIP TCP and CoCoA
3.1. uIP TCP
The uIP TCP/IP stack is an extremely small implementation of the TCP/IP protocol suite designed for embedded systems [4]. uIP uses a single global buffer that is large enough to hold one packet of the maximum size. The same global packet buffer is used both for incoming packets and for the TCP/IP headers of outgoing data. When a packet arrives from the network, the device driver places it in the global buffer and calls the TCP/IP stack. To send data, the application passes a pointer to the data, as well as the length of the data, to the stack. The TCP/IP headers are written into the global buffer, and then the device driver sends the headers and the application data out onto the network. Since the data are not buffered for retransmission, the application has to reproduce the data if retransmission becomes necessary.
In uIP, the application is invoked in response to events such as data arriving, an incoming connection request, or a poll request from the stack. The main control loop repeatedly checks whether a packet has arrived from the network and whether the periodic timer has expired. If a packet has arrived, the input handler of the TCP/IP stack is invoked. The periodic timer, which expires every 0.5 s in the current implementation, serves several purposes. Firstly, retransmission is driven by the periodic timer: whenever it expires, the retransmission timer for each connection is decremented, and when the retransmission timer reaches zero, a retransmission is performed. Secondly, the periodic timer is used to measure RTT. Each time the periodic timer expires, uIP increments a counter for each connection that has unacknowledged data in the network. When an acknowledgement is received, the current value of the counter is recorded as the sampled RTT. uIP TCP uses Karn's algorithm, which does not measure RTT for retransmitted segments. The basic time unit for RTT and RTO is 1 s. Since the RTT and RTO estimates are calculated using Van Jacobson's fast algorithm [32], which relies only on integer operations, the computational overhead is low. The algorithm for setting the RTO is as follows:
While the maximum number of retransmissions is set to eight by default, the retransmission timer backs off exponentially up to four times with the fixed RTO value of 3 s.
uIP TCP does not implement the sliding window algorithm, which would require many 32-bit operations. Instead, it allows only a single unacknowledged TCP segment per connection at any given time. Even though each node generates low-rate traffic via this stop-and-wait behavior, congestion can still occur where traffic is concentrated. Hence, uIP TCP needs congestion control, and the RTO algorithm is its main tool.
Allowing only one outstanding TCP segment may cause the uIP TCP sender to interact poorly with the delayed acknowledgement mechanism of the receiver. Because the receiver receives only a single segment at a time, it may wait until the delayed acknowledgement timer (typically 200 ms but possibly as high as 500 ms) expires, which limits the maximum possible throughput of the sender.
3.2. CoCoA
We briefly describe the main features of CoCoA below.
Weak RTO estimator. Karn's algorithm does not take RTT samples from retransmitted TCP segments, in order to estimate RTT accurately. However, to raise the chances of measuring RTT, CoCoA takes an RTT sample even when the ACK is triggered by a retransmitted CON message. The RTO estimator using retransmitted messages is called the weak RTO estimator, as opposed to the strong RTO estimator, which uses only CON messages acknowledged without retransmission. CoCoA ignores RTT samples obtained after the third retransmission to avoid taking an RTT estimate much larger than the actual RTT. The strong RTO estimator follows the usual RTO computation,
RTO_strong = SRTT_strong + max(G, 4 × RTTVAR_strong),
where G is the clock granularity, and the overall RTO is updated as a weighted average:
RTO_new = 0.5 × RTO_strong + 0.5 × RTO_previous.
The weak RTO estimator is calculated as follows:
RTO_weak = SRTT_weak + 1 × RTTVAR_weak,
RTO_new = 0.25 × RTO_weak + 0.75 × RTO_previous.
It is notable that, for the weak estimator, the RTT variance multiplier is set to 1 instead of 4 to avoid increasing the RTO excessively, and that a lower weight is assigned to the weak estimator to reduce the effect of inaccurate measurements.
Variable backoff factor. Since the weak estimator of CoCoA may yield a larger RTT estimate than the actual one, the binary exponential backoff (BEB) mechanism may cause network underutilization. To address this problem, CoCoA uses a variable backoff factor (VBF) as follows:
VBF(RTO) = 3 if RTO < a; 2 if a ≤ RTO ≤ b; 1.3 if RTO > b,
where a and b are thresholds. CoCoA sets a and b to 1 s and 3 s, respectively.
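A minimal sketch of this rule, with the thresholds a = 1 s and b = 3 s hard-coded for illustration:

```c
#include <assert.h>

/* Sketch of CoCoA's variable backoff factor. On each retransmission
 * the new RTO is the old RTO multiplied by this factor, so small RTOs
 * back off aggressively and already-large RTOs back off gently. */
double cocoa_vbf(double rto)
{
  if (rto < 1.0) {
    return 3.0;   /* RTO below a = 1 s               */
  }
  if (rto > 3.0) {
    return 1.3;   /* RTO above b = 3 s               */
  }
  return 2.0;     /* in between: plain binary backoff */
}
```

With this rule, a 0.5 s RTO grows to 1.5 s after one retransmission, while a 10 s RTO grows only to 13 s, avoiding the multi-minute timeouts that BEB would produce from an inflated weak estimate.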
RTO aging. Furthermore, CoCoA uses an RTO aging mechanism to avoid keeping an RTO that is no longer valid. If the RTO is very small or very large and is not updated for a while, CoCoA moves the RTO back toward the default value.
5. Proposed Scheme
As stated above, since uIP TCP and reliable CoAP are similar in that both allow only one outstanding packet by default, previous studies on congestion control mechanisms for CoAP inspired our proposed scheme. However, we note that uIP TCP and reliable CoAP have different requirements. The most distinctive difference is how a failure of packet delivery after the maximum number of retransmission attempts is handled. In the case of CoAP, if a message delivery fails even after the number of retransmissions reaches the maximum, the application is notified that the message was not delivered successfully. That is why the delivery ratio is included among the performance metrics in some CoAP studies. In contrast, TCP must not allow delivery failure of any part of the byte stream. As long as the TCP connection is alive, TCP must guarantee that every segment is delivered to the application in order. If a segment still cannot be delivered when the number of retransmissions reaches the maximum, the upper layer is notified of a broken connection. Consequently, under the same network conditions, TCP requires a higher maximum number of retransmissions than CoAP. However, since retransmission by an end-to-end protocol incurs high energy consumption in an energy-constrained network, our main focus is on reducing TCP retransmissions without decreasing throughput.
As discussed in the previous section, an RDC mechanism may cause large RTT fluctuations, which may lead to many retransmissions. While the purpose of RDC is energy efficiency, inducing many retransmissions can nullify its benefits. Our primary goal is therefore to reduce retransmissions both with and without RDC enabled. To this end, we consider weak RTT estimation, backoff with variability, and dithering.
5.1. RTT Estimation
As shown in Figure 3, successful delivery without retransmission may be rare depending on the network conditions. In such cases, it is useful to employ weak RTT estimation, which measures RTT using retransmitted segments. As mentioned in Section 3, CoCoA measures RTT between the first transmission and the reception of the ACK, and it ignores RTT samples obtained after the third retransmission in order to avoid an RTT estimate much larger than the actual RTT. However, the results of our preliminary test with ContikiMAC indicated that we need a different approach to measuring weak RTT and setting the RTO. We found that RTT sometimes became very large even when we took the RTT sample from the latest retransmission. In such cases, measuring RTT from the first transmission to the ACK would have produced an RTT sample much larger than the actual RTT. Thus, we always measure RTT between the latest (re)transmission and the ACK, regardless of how many retransmissions have been performed. Once an RTT sample is taken, we calculate the RTO without differentiating weak RTT from strong RTT.
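The sampling rule above can be sketched as follows. uIP counts periodic-timer ticks (0.5 s each) while data are outstanding; in this sketch the counter is reset on every (re)transmission, so an ACK always yields the time since the latest transmission, never since the first one. The field and function names are illustrative, not the actual uIP identifiers.

```c
#include <assert.h>

/* Proposed weak-RTT sampling: the tick counter restarts on every
 * (re)transmission, so the sample taken on ACK reflects only the
 * latest transmission attempt. */
struct tcp_conn {
  int timer;   /* ticks since the latest (re)transmission */
};

void on_send(struct tcp_conn *c)    { c->timer = 0; }    /* first send or rexmit  */
void on_tick(struct tcp_conn *c)    { c->timer++; }      /* periodic timer expiry */
int  rtt_sample(struct tcp_conn *c) { return c->timer; } /* taken on ACK arrival  */
```

For example, after a transmission, three ticks, a retransmission, and two more ticks, the arriving ACK yields a sample of 2 ticks rather than 5, which is what keeps one slow exchange from inflating the RTO.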
5.2. Exponential Backoff with Variable Limits
We modified uIP TCP so that the retransmission timer backs off based on the estimated RTO instead of the fixed RTO. Several studies have argued that mechanisms using different backoff policies according to the RTO value help avoid setting the retransmission timer to an excessively large value and improve fairness between nodes close to and distant from the border router [1,19]. For this reason, CoCoA employs a variable backoff factor (VBF) mechanism that uses three backoff factors (3, 2, and 1.3) with two RTO thresholds of 1 and 3 s.
However, when an RDC mechanism is enabled, the RTO may be set to a very large value due to RTT fluctuation, as seen in the preliminary test results. This makes CoCoA-like variable backoff factors unsuitable for uIP TCP in networks with an RDC mechanism enabled. Although we examined a VBF mechanism whose thresholds were adapted to our network conditions, the results were not as good as expected. Thus, we instead vary the maximum number of backoffs according to the RTO value. In choosing the maximum number of backoffs and the thresholds, we borrowed some numbers from the original uIP TCP. Recall that, in the original uIP TCP, the fixed RTO of 3 s backs off up to four times, so the RTO increases up to 48 s (3 → 6 → 12 → 24 → 48). We simply chose thresholds of 3, 6, and 12 s and then adjusted the maximum number of backoffs (4/3/2/1) so that the RTO does not back off beyond 48 s. For example, if the RTO is less than 3 s, it backs off up to four times; for an RTO greater than 12 s, backoff is allowed only once. Table 2 shows the details.
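The variable backoff limit can be sketched as a simple table lookup. The exact handling of the boundary values 3, 6, and 12 s here is illustrative; the text above only fixes the open ends of the ranges.

```c
#include <assert.h>

/* Sketch of the variable backoff limit: the number of exponential
 * backoffs allowed depends on the current RTO, so that the timer
 * never backs off beyond 48 s, mirroring original uIP's fixed
 * 3 -> 6 -> 12 -> 24 -> 48 s sequence. */
int max_backoffs(int rto)
{
  if (rto < 3)  return 4;   /* small RTO: up to four doublings   */
  if (rto < 6)  return 3;
  if (rto < 12) return 2;
  return 1;                 /* RTO above 12 s: one doubling only */
}
```

A 2 s estimated RTO may thus double four times (to at most 32 s), while a 20 s RTO doubles once (to 40 s); in every bracket the final backed-off value stays under the 48 s cap.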
5.3. Dithering
As discussed, it is important to avoid synchronization, as CoAP does by dithering the retransmission timer value. We set the actual retransmission timer by adding a random duration to the retransmission timer. The interval from which the random duration is drawn depends on the retransmission timer value. To avoid setting the actual retransmission timer to an excessively large value, we set the interval by multiplying the retransmission timer value by 1, 1/2, or 1/4, with thresholds of 4 and 12 s. We chose the multipliers 1, 1/2, and 1/4 because the multiplication can be implemented using shift operations. The thresholds were chosen empirically. For other kinds of networks, the multipliers and thresholds may be optimized according to the RTT distribution. It is important to note that, since uIP TCP uses a time granularity of 0.5 s for the retransmission timer, the effect of dithering is limited. Nevertheless, the simulation results show that even this limited dithering is effective in improving the performance of uIP TCP.
Our algorithm to set the actual retransmission timer is summarized in Table 2.
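The dithering rule can be sketched as follows. The thresholds of 4 and 12 s and the shift-based multipliers follow the text above; the boundary handling and the use of rand() as a stand-in for the target platform's PRNG are assumptions of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the dithering rule: the random interval is the timer
 * value times 1, 1/2, or 1/4 (implemented with shifts), so larger
 * timers receive proportionally smaller random additions. */
int dithered_timer(int timer)
{
  int interval;
  if (timer < 4) {
    interval = timer;        /* multiplier 1   */
  } else if (timer < 12) {
    interval = timer >> 1;   /* multiplier 1/2 */
  } else {
    interval = timer >> 2;   /* multiplier 1/4 */
  }
  return timer + (interval > 0 ? rand() % (interval + 1) : 0);
}
```

For a 16 s timer the random addition is drawn from [0, 4] s, keeping the actual timer within 16-20 s rather than up to doubling it, which is the point of shrinking the multiplier as the timer grows.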
6. Performance Evaluation
To evaluate the performance of our proposed scheme, we used the same grid topologies as in the preliminary test. The maximum number of retransmissions was set to 15. We used three performance metrics: throughput (i.e., TCP goodput), total number of retransmissions, and fairness. Fairness between TCP flows is quantified using Jain's index. We compared the performance of three schemes: the original uIP TCP, uIP TCP with dithering, and our proposed scheme. uIP TCP with dithering was made by adding only dithering to the original uIP TCP. All simulations were run for 10 min and repeated 10 times. In the figures for the simulation results, we added the 95% confidence intervals for the means in each plot.
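For reference, Jain's fairness index over the per-flow goodputs x_1, ..., x_n is (Σx)² / (n · Σx²), which equals 1 when all flows receive equal goodput and approaches 1/n when a single flow takes everything. A direct implementation:

```c
#include <assert.h>
#include <math.h>

/* Jain's fairness index for per-flow goodputs x[0..n-1]:
 * J = (sum x)^2 / (n * sum x^2). */
double jain_index(const double *x, int n)
{
  double sum = 0.0, sum_sq = 0.0;
  for (int i = 0; i < n; i++) {
    sum += x[i];
    sum_sq += x[i] * x[i];
  }
  return (sum * sum) / (n * sum_sq);
}
```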
Firstly, to investigate the effect of the network size, we compared the performance of the three schemes on the 4 × 4 and 5 × 5 grid topologies. Compared to the 4 × 4 grid topology, the 5 × 5 grid topology not only causes larger differences in path lengths but also worsens the hidden terminal problem. In this test, we set the random loss rate to 0% and the link-layer queue size to eight packets.
Figure 4 shows the performance when no RDC mechanism is enabled. In terms of total goodput, the original uIP TCP underperforms the other two schemes. uIP TCP with dithering and our proposed scheme show almost equal performance, with uIP TCP with dithering performing slightly better, which indicates the benefit of dithering. The reason the goodput difference among the three schemes is not large is that RTT is very small. In terms of the number of retransmitted segments, our proposed scheme underperforms slightly; the schemes using the fixed RTO of 3 s benefit from the very small RTT and the 0% random loss rate. The fairness index of the original uIP TCP is lower than that of the other two schemes, while the fairness index of our proposed scheme is almost the same as that of uIP TCP with dithering.
Figure 5 shows the performance when ContikiMAC is used. For all schemes, the total goodput is much lower than when no RDC mechanism is used. The reason is that, as mentioned in Section 4.2.1, ContikiMAC worsens the hidden terminal problem. We note that the number of retransmissions of the original uIP TCP in the 5 × 5 grid topology is very large; it even exceeds the number of successfully received segments. In Figure 5a,b, the 95% confidence intervals for the means of our proposed scheme do not overlap those of the original uIP TCP for either topology. This suggests that our proposed scheme significantly outperforms the original uIP TCP in terms of goodput and the number of retransmissions, because it can adapt the actual retransmission timer to the large RTT fluctuations. Moreover, we observe from Figure 5a,b that, for the 5 × 5 grid topology, the 95% confidence intervals for the means of our proposed scheme and uIP TCP with dithering do not overlap. This implies that the more severe the hidden terminal problem, the more effective retransmission timer adjustment based on RTT estimates becomes. It is notable that, when ContikiMAC is used, TCP connection failures are often observed. In particular, the original uIP TCP suffers more TCP connection failures than the others, resulting in a lower fairness index.
To investigate the impact of the queue size, we compared performance with the link-layer queue size set to four and eight packets in the 4 × 4 grid topology. As shown in Figure 6, increasing the link-layer queue from four to eight packets does not significantly affect goodput or the number of retransmissions. In Figure 6a, there is no overlap of the 95% confidence intervals for the means between the original uIP TCP and our proposed scheme, which shows that our proposed scheme improves goodput for both queue sizes. Figure 6b shows that, when ContikiMAC is used, the 95% confidence intervals for the means of the original uIP TCP are relatively large but do not overlap those of our proposed scheme. This implies that, when ContikiMAC is used, our proposed scheme reduces retransmissions for both queue sizes.
To examine the impact of the random loss rate, we compared the three schemes with random loss rates of 0%, 5%, 10%, and 15%. In this test, the link-layer queue size was set to four packets. We used the 4 × 4 grid topology because too many connection failures occurred in the 5 × 5 grid topology.
Figure 7 depicts the performance when no RDC mechanism is used. As the random loss rate increases, the total goodputs and the fairness indexes of all the schemes converge. However, in Figure 7b, which shows the number of retransmissions, the 95% confidence intervals for the means of our proposed scheme do not overlap those of the original uIP TCP at loss rates of 10% and 15%. This implies that our proposed scheme is effective at reducing the number of retransmissions under high loss rates.
Figure 8 shows the performance when ContikiMAC is used. For all the schemes, the performance is less sensitive to the random loss rate than when no RDC mechanism is used. In Figure 8a,b, the 95% confidence intervals for the means of the original uIP TCP do not overlap those of our proposed scheme at any loss rate from 0% through 15%, which indicates that our proposed scheme significantly improves goodput and reduces the number of retransmissions. However, we observe from Figure 8c, which depicts fairness, that the confidence intervals of the original uIP TCP and our proposed scheme overlap at loss rates of 5% and 15%. As the random loss rate increases, the frequency of connection failures also tends to increase for all three schemes, which remains an issue to be addressed in future work.
7. Conclusions
uIP TCP may suffer many retransmissions and low throughput when RTT fluctuates widely due to the hidden terminal problem, which may worsen with RDC enabled. One of the main reasons for this poor performance is that uIP TCP sets the retransmission timer using a fixed RTO for retransmitted segments. To address this problem, we proposed a new scheme for retransmission timer management. Our scheme adjusts the retransmission timer according to the RTT estimated from retransmitted segments as well as first-transmitted segments. In addition, our scheme adopts CoCoA's notion of weak RTT estimation, exponential backoff with variable limits, and dithering. Simulation results show that our proposed scheme enhances throughput and significantly reduces retransmissions, particularly when an RDC mechanism is enabled.
In this work, we used only a periodic reporting scenario without bursty background traffic. For future work, we plan to extend our scheme to various scenarios. Furthermore, we will examine the effect of extending the TCP window beyond one segment.