1. Introduction
The rail sector is expected to grow substantially in the coming years. The forthcoming digital transformation aims to reduce carbon footprints of every kind, and transport will be one of the most affected sectors. Rail is expected to become the favored way to move people and freight worldwide because of its lower carbon dioxide (CO₂) emissions. The growing adoption of rail has raised interest worldwide and will bring new challenges for rail operators. They will be expected to operate more efficiently, increasing the number of trains per kilometer, while guaranteeing safety and service quality for both passengers and freight. Until a few years ago, safety and the optimization of rail traffic management, including the avoidance of train collisions, were handled by the Automatic Train Protection (ATP) system. This is a complex architecture based on track-side equipment placed about every 1–2 km along the rail path (and usually at
km in critical points such as rail exchanges and station approaching), which signal to the train driver how to adapt the train speed [
1]. This system is not very reactive to signals, which is a further problem, and it is standard in all rails crossing national borders.
Recent papers have investigated transport safety, including, for example, the high in-train compressive forces that can cause train derailment [2]. Nevertheless, a first step toward innovation in this field is the development of a Single European Railway Area to achieve fully interoperable rail in Europe. Its starting point was the European Rail Traffic Management System (ERTMS), set up by the European Union in Directive 96/48/EC [3] to specify procedures and technologies, and in Directive 2001/16/EC [4] for the interoperability of the trans-European conventional railway system. ERTMS is based on the European Train Control System (ETCS) and the Global System for Mobile Communications for Railway (GSM-R), and it has been deployed on mainlines. GSM-R has the same characteristics as classical GSM, but it transmits on a dedicated frequency spectrum assigned to rail operators. Moreover, rail operators have deployed dedicated base stations to provide connectivity along their lines and thus support rail applications. The supported services are (i.) voice communications (for rail staff and emergency calls), both point-to-point and group calls, and (ii.) low-data-rate services for railway operations (ETCS L2/L3 operations). Unfortunately, this strategy is no longer sustainable because of the cost and effort of managing the technological infrastructure. Moreover, support for GSM is guaranteed only until 2030.
Two main proposals have been put forward. The first is proposed and supported by the International Union of Railways (UIC) and by the European Union Agency for Railways (ERA) by 2013. In 2015, the Technical Committee Rail Telecommunication (within the European Telecommunications Standards Institute, ETSI) and the 3rd Generation Partnership Project (3GPP) started to work on the Future Railway Mobile Communication System (FRMCS). This study concerns (i.) the choice of the technology or technologies for communications; (ii.) the spectrum assigned to the future radio communication system for railways; and (iii.) the additional investment required for new radio sites with respect to the GSM-R radio sites. The second initiative is supported by the European Shift2Rail Joint Undertaking (S2R-JU), which proposes the concept of the Adaptable Communication System (ACS). While FRMCS is managed by telecommunication operators, ACS takes an Over-The-Top (OTT) approach, which implements bearer independence and supports connections over multiple networks.
Other works have proposed adopting Multipath TCP (MPTCP) to solve this problem. In [5], some experimental results are provided, but only with the aim of increasing the overall data transfer throughput by exploiting the multihoming capabilities of the communicating nodes. This is not what rail operators require, nor is it manageable by them, since they need technology redundancy rather than path redundancy over the same physical interface. Moreover, MPTCP suffers performance degradation as the number of lost packets increases (the well-known problem of the Transmission Control Protocol, TCP) [6], which is why this work uses a dedicated tunnel manager instead. Other works have addressed rail application connectivity by efficiently characterizing the radio channels in railway environments, as in [7]. In [8], experiments are presented that support the use of cellular systems for rail applications in train-to-network communication. Unlike these works, we propose to overcome the need to deploy a dedicated radio network or to use public networks to provide connectivity for rail applications. We do so by managing connections at the IP level and taking advantage of the radio bearers available along the rail line, selecting the most suitable bearer(s) based on the application requirements.
One of the contributions of this work is the analysis of user and system requirements related to the ACS designed to meet the growing needs of railway operators interested in communication for different types of critical services, where one of the most important aspects is noise resilience and the ability to manage redundancy provided by multiple bearers. The connectivity requirements particular to the ACS model have been translated into the domain of IP networks to find a reasonable abstraction and an effective and efficient implementation for some concepts, such as the representation of the redundant bearers, but using standard solutions and protocols, even if recombined in a novel way.
One of the requirements of the ACS is to ensure that user data traffic is transmitted on specific bearers, based on configuration criteria such as Quality of Service (QoS). The proposed ACS model and its implementation are based on the concept of multi-bearer gateways, which allow User Agents, both users and producers of communication services, to communicate by establishing pseudo-virtual circuits in which each communication endpoint is identified by an IP address associated with a logical interface. Since IP forwarding is based simply on the destination address, there is no guarantee that IP datagrams follow precise paths. We solved this problem by binding the agents to such logical interfaces (which are based on the Linux IP Virtual Local Area Network driver, IPVLAN) and by implementing an extension of the Session Initiation Protocol (SIP) to signal a circuit, so that both gateways and hosts dynamically update their forwarding tables during the signaling process. We thus still rely on standard IP routing but force datagrams, on a per-service basis, to follow specific protected tunnels between gateways.
The third important contribution of this work is the way we manage tunnel protection, which is based on multi-path transport protocols and data duplication, in addition to the retransmission mechanisms provided by the Transmission Control Protocol (TCP). The software we developed can combine different transport protocols with multiple protection criteria, but this complexity is hidden from the application layer, which does not need to know how the gateway forwards data across the bearers. Since we use standard IP protocols and technologies, the implementation is also efficient and cost-effective.
For this work, we implemented a testbed based on an emulator developed to evaluate the ACS performance. Part of the emulator was developed in [9], where simpler use cases (e.g., a tunnel protocol based only on Generic Routing Encapsulation, GRE) and metrics (e.g., latencies measured through the Internet Control Message Protocol, ICMP) were considered. In this paper, we propose a new version of the Tunnel Manager, which can set up hybrid transport protocols (e.g., TCP and the User Datagram Protocol (UDP) simultaneously). Moreover, a modified version of SIP has been adopted for service discovery and user registration, along with a new emulator for railway signalling protocols, which has been used to generate new and more realistic test cases. To investigate the most suitable strategies for selecting transport protocols, two main rail applications have been considered: (i.) signaling and (ii.) generic data transfer (i.e., file transfer). The first application is fundamental for normal train movement: the train transmits its position report (PR) to the remote control center, which manages train traffic over the whole railway. After receiving the PR of each train, the control center evaluates the position of the train and the positions of the other trains, and responds with a movement authorization (MA), so that the train does not have to reduce its speed or stop. The second rail application concerns the transmission of critical files. In this case, we evaluate the available data rate and the latency of a file transfer when multiple bearers are adopted by the ACS.
The paper is organized as follows. Section 2 describes the rail context and recalls the main rail applications and their requirements. Section 3 details the ACS architecture and its characteristics. Section 4 describes the proposed registration of the rail applications in the network, which makes them available to clients. Section 5 presents the testbed implemented to evaluate the potential of the ACS. Section 6 reports the results of the emulator and the testbed. Finally, conclusions are drawn in Section 7.
3. Modelling the ACS
In this section, the ACS model is described. The model is then used for the testbed implementation and for the performance evaluation.
The elements of the new ACS-based communication system are represented conceptually in Figure 2, which contains the essential components of an ACS-based network architecture. From left to right, the figure shows the on-board host devices where user agents (UAs) run. These devices are connected to an on-board ACS Gateway, which exposes interfaces that give the User Agents connectivity towards the bearers and, through them, to the services exposed by server User Agents connected to the network ACS Gateway (GW). The bearers, at the centre of the figure, can be thought of as channels implemented with different technologies for transporting IP network traffic.
Regardless of how the components of Figure 2 are implemented, the ACS can be effectively represented as a network of hosts and their network interfaces, distinguishing between physical network interfaces (e.g., those representing the bearers) and logical network interfaces (i.e., the tunnel interfaces across the bearers and the User Agent logical endpoints). Finally, none of the unnumbered interfaces needs to be represented in the topology, since they are transparent in terms of IP addressing.
Figure 4 represents this logical model, in which physical and virtual interfaces are part of the ACS layered network topology. The ACS topology places the gateways at the centre. The host devices where the User Agents run (as client and server applications) are connected to the adjacent gateway: the on-board devices are connected to the on-board gateway (OG), and the network devices are connected to the network gateway (NG). Connectivity between on-board and ground User Agents is established by a virtual circuit, which is nothing more than a defined path (through routes and IP tunnels) between two ACS User Agents. The role of the gateway is to abstract the complexity due to the presence of multiple bearers and to provide the signaling mechanism needed for service registration and circuit establishment.
A virtual circuit is obtained through IP routing and multi-path tunneling. Multi-path tunnels are created between the two gateways. The number of bearers and the transport technologies involved in a multi-path tunnel depend on the configuration, which is meant to ensure the required QoS. A tunnel can be associated with multiple aggregated bearers via a redundancy mechanism based on multi-path transmission techniques.
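One simple way to picture redundancy over aggregated bearers is packet duplication with receiver-side de-duplication. The sketch below is only an illustration under stated assumptions, not the Tunnel Manager implementation: the class names, the `lossy_bearer` helper, and the use of a plain sequence number are all hypothetical, and bearers are modelled as in-memory callbacks rather than real tunnels.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Packet = Tuple[int, bytes]  # (sequence number, payload)


@dataclass
class DuplicatingSender:
    """Tags each payload with a sequence number and copies it to every bearer."""
    bearers: List[Callable[[Packet], None]]
    next_seq: int = 0

    def send(self, payload: bytes) -> None:
        packet = (self.next_seq, payload)
        self.next_seq += 1
        for transmit in self.bearers:
            transmit(packet)


@dataclass
class DeduplicatingReceiver:
    """Delivers the first copy of each sequence number and drops later copies."""
    seen: Set[int] = field(default_factory=set)  # grows unbounded in this toy version
    delivered: List[bytes] = field(default_factory=list)

    def receive(self, packet: Packet) -> bool:
        seq, payload = packet
        if seq in self.seen:
            return False  # duplicate that arrived over another bearer
        self.seen.add(seq)
        self.delivered.append(payload)
        return True


def lossy_bearer(receiver: DeduplicatingReceiver,
                 drop_seqs: Set[int]) -> Callable[[Packet], None]:
    """Hypothetical bearer that silently drops the given sequence numbers."""
    def transmit(packet: Packet) -> None:
        if packet[0] not in drop_seqs:
            receiver.receive(packet)
    return transmit


receiver = DeduplicatingReceiver()
sender = DuplicatingSender([lossy_bearer(receiver, {0}),
                            lossy_bearer(receiver, {1})])
for i in range(3):
    sender.send(f"PR-{i}".encode())
```

In this toy run each bearer loses a different packet, yet the receiver still delivers every payload exactly once, which is the property the redundancy mechanism is meant to provide.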
The second important aspect of a virtual circuit is the logical address (and the logical device) that represents each endpoint of the circuit and identifies the User Agent in the ACS network. Each User Agent, which can be either a client or a server of a service, is identified by a unique IP address that is resolved by the Domain Name System (DNS) once the agent is registered.
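The mapping from a registered User Agent to a unique logical address can be sketched as a small in-memory registry. This is a minimal illustration, assuming a hypothetical address pool (`10.200.0.0/29`) and hypothetical agent names; it stands in for the combination of registration and DNS resolution and does not implement either protocol.

```python
import ipaddress
from typing import Dict, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


class LogicalAddressRegistry:
    """Assigns one logical IP address per User Agent name and resolves names."""

    def __init__(self, pool_cidr: str) -> None:
        # Iterate over the usable host addresses of the (hypothetical) pool.
        self._free = iter(ipaddress.ip_network(pool_cidr).hosts())
        self._by_name: Dict[str, Address] = {}

    def register(self, agent_name: str) -> Address:
        """Return the agent's address, allocating one on first registration."""
        if agent_name not in self._by_name:
            try:
                self._by_name[agent_name] = next(self._free)
            except StopIteration:
                raise RuntimeError("logical address pool exhausted") from None
        return self._by_name[agent_name]

    def resolve(self, agent_name: str) -> Address:
        """Look up a registered agent; raises KeyError if it is unknown."""
        return self._by_name[agent_name]


registry = LogicalAddressRegistry("10.200.0.0/29")
onboard = registry.register("ua-onboard-1")
ground = registry.register("ua-ground-1")
```

Registering the same name twice returns the same address, so an endpoint keeps a stable identity for as long as the registry entry exists.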
From the implementation point of view, each host device can create a virtual network device by using the IPVLAN driver [15] as a result of control-plane signaling based on the SIP [16] and DNS protocols. Each IPVLAN logical interface represents the endpoint of a virtual circuit.
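As a rough illustration of what creating such a logical endpoint involves on Linux, the helper below only builds the `ip` (iproute2) command lines and does not execute them, since that would require root privileges. The interface names, the address, and the choice of IPVLAN mode are assumptions made for the example; the mode actually used in the testbed is not stated here.

```python
from typing import List


def ipvlan_endpoint_commands(parent: str, name: str, address: str,
                             mode: str = "l2") -> List[List[str]]:
    """Build iproute2 commands that create and bring up an IPVLAN interface.

    parent  -- existing interface the IPVLAN device is attached to
    name    -- name of the new logical interface (hypothetical)
    address -- logical address in CIDR form, e.g. the one assigned at registration
    mode    -- IPVLAN mode; "l2" is an assumption, not taken from the paper
    """
    if mode not in {"l2", "l3", "l3s"}:
        raise ValueError(f"unsupported IPVLAN mode: {mode}")
    return [
        ["ip", "link", "add", "link", parent, "name", name,
         "type", "ipvlan", "mode", mode],
        ["ip", "addr", "add", address, "dev", name],
        ["ip", "link", "set", "dev", name, "up"],
    ]


commands = ipvlan_endpoint_commands("eth0", "ipvl0", "10.200.0.1/29")
for argv in commands:
    print(" ".join(argv))
```

Returning argument lists rather than shell strings keeps the helper easy to test and lets a caller pass each list to `subprocess.run` once it has the necessary privileges.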
4. Implementation of the ACS Control Plane
The creation of virtual circuits is based on a signaling protocol. The paradigm adopted for the ACS is that of SIP, a well-known application-layer protocol used to create, modify, and terminate sessions between two or more participants.
The ACS GWs and ACS host devices implement SIP to provide end-to-end application management functions related to connection status, session setup, required and negotiated QoS parameters, IP address settings, and user authentication and profiling. More specifically, SIP is used to let User Agents register with a Registrar in order to be assigned a logical IP address. The registration of a User Agent proceeds hop-by-hop, which means that each User Agent connects directly to its adjacent gateway. The OG assumes the role of SIP Proxy in the control plane, while the NG acts as SIP Registrar. SIP messages between two SIP User Agents are propagated hop-by-hop (as shown in Figure 5), since each host in the network needs to keep its local network topology database and the related routing table up to date through the SIP message exchange. The SIP registration mechanism allows the SIP participants to announce or request a service.
A SIP session and the related virtual circuit are created by an invitation mechanism, as described in RFC 3261 [16]. In Figure 6, a sequence diagram shows an example of an on-board User Agent registering with the network gateway via an OG (acting as SIP proxy) in order to be assigned a logical IP address of the ACS network.
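To make the registration step more concrete, the following sketch builds a plain RFC 3261 REGISTER request as text. It deliberately does not reproduce the SIP extension used by the ACS, which is not specified in this section; the user name, domain, host address, and identifiers are hypothetical, and no message is actually sent.

```python
def build_register(user: str, domain: str, contact_host: str,
                   call_id: str, branch: str, tag: str,
                   cseq: int = 1, expires: int = 3600) -> str:
    """Return a minimal RFC 3261 REGISTER request (standard headers only)."""
    if not branch.startswith("z9hG4bK"):
        # RFC 3261 requires this magic cookie at the start of branch IDs.
        raise ValueError("branch must start with 'z9hG4bK'")
    address_of_record = f"sip:{user}@{domain}"
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {contact_host}:5060;branch={branch}",
        "Max-Forwards: 70",
        f"To: <{address_of_record}>",
        f"From: <{address_of_record}>;tag={tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <sip:{user}@{contact_host}>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and an empty line after the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_register(
    user="ua-onboard-1",          # hypothetical User Agent name
    domain="acs.example",         # hypothetical registrar domain
    contact_host="192.0.2.10",    # documentation address, not a real host
    call_id="a84b4c76e66710@192.0.2.10",
    branch="z9hG4bK776asdhds",
    tag="1928301774",
)
print(message)
```

In the hop-by-hop scheme described above, a request of this kind would be sent to the adjacent gateway acting as proxy, which forwards it to the registrar.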
Once two User Agents are registered, they can create a session and consequently a virtual circuit by using the SIP INVITE method, as shown in
Figure 7.
A circuit between two agents is composed of the IPVLAN logical interfaces, which identify the endpoints of the circuit, and the tunnel interface reached via the local access interface that each gateway exposes to the agents. How packets traverse the bearers is decided by the gateway according to a given configuration, but this complexity is hidden from the agents, which can only address the virtual interfaces related to the multi-path tunnels. A simplified version of the ACS topology shows schematically how a virtual circuit is established. The schema shown in Figure 8 is the one used to carry out our performance tests.
6. Results
This section reports the results of the testbed described in the previous section. Two main cases were considered: the transmission of the ERTMS signaling (i.e., PR/MA within the time required to guarantee train operations) in Section 6.1, and the transmission of critical data in Section 6.2.
Performance tests were executed on several tunnel configurations that combine one or two bearers and use UDP, TCP, or a mix of the two as the tunnel transport protocol; the setup can easily be extended to other cases. Different types of tests were conducted with several tools to measure different aspects in all tunnel configuration scenarios. To measure bandwidth and data transfer performance, iperf3 was used [20]. It is a well-known networking tool that produces standard network performance measurements by creating data streams between the two endpoints (i.e., User Agents) and outputs a time-stamped report of the amount of data transferred and the throughput measured.
To measure round-trip time (RTT) and message loss rate, we used the “ping” command, which sends ICMP echo request packets to the target host and waits for an ICMP echo reply. The program reports errors, packet loss, and a statistical summary of the results, including the minimum, maximum, and mean round-trip times (and the standard deviation of the mean). Finally, a traffic generator tool (named udpTrEmu) was developed to simulate the railway signaling use cases with several traffic profiles typical of railway signaling, whose reference parameters were derived from the requirements on GSM-R networks for ETCS support [21]. Impairments were added through Linux Traffic Control (tc) with Network Emulator (netem) commands applied directly to the bearer physical interfaces [15]. Netem allows us to add delay, packet loss, duplication, and other characteristics to packets outgoing from a selected network interface. Netem is built on the existing QoS and Differentiated Services (diffserv) facilities in the Linux kernel.
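As an illustration of how such impairments are typically configured, the helper below assembles a `tc ... netem` command line for a given delay and loss. It is a sketch only: the interface name and the values are hypothetical, the exact commands used in the testbed are not given in the text, and the command is built but not executed because it requires root privileges.

```python
from typing import List


def netem_command(interface: str, delay_ms: int = 0,
                  loss_percent: float = 0.0,
                  replace: bool = False) -> List[str]:
    """Build a `tc qdisc ... netem` command adding delay and/or random loss."""
    if delay_ms < 0 or not 0.0 <= loss_percent <= 100.0:
        raise ValueError("delay must be >= 0 and loss within 0..100 percent")
    if delay_ms == 0 and loss_percent == 0.0:
        raise ValueError("specify at least one impairment")
    action = "replace" if replace else "add"
    command = ["tc", "qdisc", action, "dev", interface, "root", "netem"]
    if delay_ms > 0:
        command += ["delay", f"{delay_ms}ms"]
    if loss_percent > 0.0:
        # ":g" drops a trailing ".0", so 5.0 is rendered as "5%".
        command += ["loss", f"{loss_percent:g}%"]
    return command


cmd = netem_command("eth1", delay_ms=100, loss_percent=5)
print(" ".join(cmd))
```

Because netem acts on packets leaving an interface, emulating a symmetric channel usually means applying a command like this on the egress interface at each end of the bearer.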
6.1. Results on Rail Procedures
As described in the introduction, the signaling service is not a bandwidth-requiring service but it is very important for the train operation that the PR and its response MA are received within a given timeline. In
Figure 11, the behavior of the considered PR/MA procedure within ETCS has been reported. As shown in
Figure 11a, PR or MA can be lost twice before an MA is correctly received within
seconds, thus not stopping the train movement or reducing its speed. On the contrary, in case of high packet loss, the procedure timeline is not respected since no MA is received within
seconds (see
Figure 11b). Additionally in
Figure 11c, the procedure timeline is not respected due to packet loss and high channel delay.
Based on requirements in [
21], we investigate whether a 128-byte packet of MA arrives with a delay higher than
s. We suppose the train transmits a PR packet every
s. We considered that one or two bearers are set up by the Tunnel Manager and for each of them we can select TCP or UDP as transport protocol strategy. Then, we can have five possible combinations: (i.) 1 TCP bearer; (ii.) 1 UDP bearer; (iii.) 2 TCP bearers; (iv.) 2 UDP bearers; (v.) 1 TCP bearer and 1 UDP bearer. We considered an outage in the PR/MA protocol when an over-limit event occurs. Then, we reported the outage probability defined as:
Figure 12 reports the outage due to PR/MA over-limit as a function of the overall packet loss ratio in the network for a delay of 100 ms between the OG and the NG. All five transport protocol strategies were considered.
In order to stress the PR/MA rail application, high packet loss rates in the whole network were set. This considers all two-way end-to-end paths (i.e., from the on-board gateway to the network gateway in the remote control center and return for the authorization) including the wireless access network, which has the highest packet loss probability, the network operator network and the transport network to the rail operator center. In many cases (even when only one bearer is used), the train receives at least one MA within s. In fact, if one PR packet is lost the subsequent may arrive to the control center within s. As expected, in case one single bearer is selected, the over-limit rapidly increases. The worst performance occurs with UDP since is obtained for a packet loss of . If TCP is adopted, performance increases since the Tunnel Manager is able to require a packet retransmission, and due to the overall delay of the whole path of 100 ms in the considered case, it arrives in time (i.e., within s). is obtained for a packet loss of . As expected, the best performance occurs in case of 2 bearers using TCP.
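The qualitative effect of adding a second bearer can be illustrated with a very simple analytical model. This is not the outage definition used for the figures and it ignores delay and TCP retransmissions. It assumes independent losses with the same ratio on every bearer and in both directions, UDP-like duplication over all bearers, and a hypothetical number of PR attempts before the deadline expires (three by default, matching a procedure that tolerates two losses).

```python
def exchange_success(loss: float, bearers: int) -> float:
    """Probability that one PR/MA exchange completes.

    A direction succeeds if at least one duplicated copy survives, and the
    exchange needs both the PR (uplink) and the MA (downlink) to arrive.
    """
    if not 0.0 <= loss <= 1.0 or bearers < 1:
        raise ValueError("loss must be in [0, 1] and bearers >= 1")
    one_way = 1.0 - loss ** bearers
    return one_way ** 2


def outage_probability(loss: float, bearers: int, attempts: int = 3) -> float:
    """Probability that no MA arrives within `attempts` PR transmissions.

    `attempts` is a hypothetical parameter: the number of PR periods that fit
    into the deadline. Attempts are treated as independent.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    return (1.0 - exchange_success(loss, bearers)) ** attempts


for n in (1, 2):
    print(n, "bearer(s):", outage_probability(0.1, n))
```

Under these assumptions, duplicating over two bearers turns a per-direction loss of `p` into `p**2`, which is why the outage falls by orders of magnitude at moderate loss ratios; real channels with correlated losses or significant delay would benefit less.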
Figure 13 reports the outage due to PR/MA over-limit when TCP is selected as transport protocol for one bearer (solid lines) and two bearers (dashed lines). Several overall network delays ranging from 0 ms to 400 ms were considered.
As expected, increasing the delay causes an increase in the over-limit percentage of the rail application due to the reduced possibility to retransmit a lost packet. In case of one single TCP bearer, the outage is for channel losses of and delay of about 50–100 ms, but it reduces to lower than when two TCP bearers are used by the Tunnel Manager. In case of a low-delay channel, two TCP bearers can tolerate a packet loss of about (quite unrealistic in current networks but realistic only in very noisy wireless channels).
Figure 14 reports the outage due to PR/MA over-limit when UDP is selected as transport protocol for one bearer (solid lines) and two bearers (dashed lines). Several overall network delays ranging from 0 ms to 400 ms were considered.
As expected, UDP presents lower performance with respect to TCP due to lack of retransmissions. As an example, for UDP with one bearer the outage is lower than
when the channel delay is negligible and for packet loss equal to no more than
. In contrast to the TCP cases in
Figure 13, UDP is more sensitive to using more than one bearer, showing a good improvement in the over-limit performance in cases where two UDP bearers are used. In fact,
is obtained for channel delays of 100 ms and a packet loss of
.
Figure 15 reports the packet loss of the overall network as a function of the channel delay for several transport protocol strategies. Results in this figure are very useful to the Tunnel Manager in selecting the proper strategy based on the experienced packet losses as well as the measured delay in the network channels in order to respect the requirement
of the rail application with a given outage over-limit
. In
Figure 15, an over-limit threshold
was selected. This means that for
, we have a couple of
values, according to the
Figure 12,
Figure 13 and
Figure 14 for each transport protocol strategy.
For example, in case of a delay of 200 ms, the Tunnel Manager can select UDP and activate two bearers if they experience a packet loss of or can activate a second TCP bearer (thus having one TCP bearer and one UDP bearer) in case it experiences a packet loss of . In case the wireless channel is particularly noisy (e.g., ), two TCP bearers should be considered in order to guarantee the over-limit outage of , even for a channel with 300 ms of delay.
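The selection logic described in this example can be sketched as a lookup over per-strategy loss limits. Everything numeric below is a made-up placeholder and is not read from Figure 15; the strategy labels, the preference order (cheapest first), and the conservative rounding of the delay to the next grid point are likewise assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# For each strategy: (delay grid point in ms, maximum tolerable loss ratio).
# PLACEHOLDER numbers for illustration only, not measured values.
TOY_LIMITS: Dict[str, List[Tuple[int, float]]] = {
    "1xUDP": [(100, 0.02), (400, 0.01)],
    "2xUDP": [(100, 0.10), (400, 0.05)],
    "2xTCP": [(100, 0.30), (400, 0.20)],
}

# Hypothetical preference order: try the least resource-hungry strategy first.
PREFERENCE: List[str] = ["1xUDP", "2xUDP", "2xTCP"]


def tolerable_loss(strategy: str, delay_ms: float,
                   limits: Dict[str, List[Tuple[int, float]]]) -> float:
    """Loss limit at the smallest grid delay that is >= the measured delay."""
    for grid_delay, max_loss in sorted(limits[strategy]):
        if delay_ms <= grid_delay:
            return max_loss
    return 0.0  # delay beyond the table: treat the strategy as unsuitable


def select_strategy(delay_ms: float, loss: float,
                    limits: Dict[str, List[Tuple[int, float]]],
                    preference: List[str]) -> Optional[str]:
    """Return the first preferred strategy whose limit covers the measured loss."""
    for strategy in preference:
        if loss <= tolerable_loss(strategy, delay_ms, limits):
            return strategy
    return None  # no strategy meets the requirement under these limits


print(select_strategy(200, 0.03, TOY_LIMITS, PREFERENCE))
```

A real Tunnel Manager would populate the table from measured curves such as those in the figures and would re-evaluate the choice as the monitored loss and delay change.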
6.2. Critical Data Transfer
For data transfer simulation, iperf3 was used in scenarios with a single bearer or two redundant bearers (using either TCP or UDP as the tunnel transport protocol). We tried several scenarios, adding delay (from 0 to 400 ms) with and without additional packet loss.
Figure 16 reports the impact on the tunnel interfaces of packet loss induced on the bearers. Multi-path tunnels are consistently more resilient than single-path ones. In this configuration, a
packet loss rate was well-tolerated. As expected, the drop in performance (in terms of throughput) is less pronounced for multi-path tunnels. The TCP protocol proved to be more effective than UDP as the packet loss rate increases.
Figure 17 reports the throughput as a function of latency for one and two bearers, with TCP and UDP as transport protocols between the OG and the NG. The throughput progressively decreases as latency increases. In the presence of packet loss, the decrease is more pronounced (already for a packet loss equal to
the degradation is tangible). However, even in this case, the presence of two bearers produced a clear improvement (especially when using TCP as tunnel protocol).