Towards Robust Routing: Enabling Long-Range Perception with the Power of Graph Transformers and Deep Reinforcement Learning in Software-Defined Networks
Abstract
1. Introduction
- We integrate a GNN with MADRL for packet routing in SDNs, enhancing robustness against topology changes.
- We establish a virtual star node and a Transformer-based global message-passing mechanism, enabling nodes to perceive long-distance traffic information.
- We propose a path-based readout method, improving the ability of nodes to accurately detect potential congestion along multi-hop paths.
- We design an MADRL cooperation mechanism that facilitates the learning of general perception strategies and reduces computational overhead.
- We build an SDN simulation environment and conduct extensive experiments, demonstrating the superior robustness of the proposed algorithm compared to all baseline algorithms.
2. Related Work
3. System Model and DRL Formulation
3.1. System Model
3.2. DRL Formulation
4. Method
4.1. Global Message-Passing with Graph Transformer and Star Node
- (1) The Graph Transformer employs the Transformer attention mechanism to let nodes selectively aggregate information from their neighborhood, ensuring that crucial information is retained during message-passing.
- (2) The designed star node is a global virtual node connected to all nodes in the graph, serving as a bridge between any pair of nodes. The operation of the star node consists of two steps. First, the star node aggregates the features of all nodes via Transformer attention. Second, the star node participates in the original message-passing of each node. In this manner, the star node creates a shortcut for long-range perception: the information of distant nodes can be embedded in the hidden feature of the star node and then transmitted directly to any other node (see the sketch below).
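A minimal sketch of this two-step mechanism in plain PyTorch follows; the class name `StarNodeLayer`, the learnable star embedding, and the attention-mask construction are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class StarNodeLayer(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        # Learnable embedding of the global virtual (star) node.
        self.star = nn.Parameter(torch.zeros(1, 1, d_model))
        self.star_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.node_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: [N, d] node features; adj: [N, N] boolean adjacency matrix.
        n = x.size(0)
        nodes = x.unsqueeze(0)                              # [1, N, d]
        # Step 1: the star node attends over all nodes (global aggregation).
        star, _ = self.star_attn(self.star, nodes, nodes)   # [1, 1, d]
        # Step 2: each node attends over its neighbours, itself, and the star node.
        kv = torch.cat([nodes, star], dim=1)                # [1, N+1, d]
        blocked = torch.ones(n, n + 1, dtype=torch.bool)    # True = attention blocked
        blocked[:, :n] = ~(adj | torch.eye(n, dtype=torch.bool))
        blocked[:, n] = False                               # the star node is always visible
        out, _ = self.node_attn(nodes, kv, kv, attn_mask=blocked)
        return out.squeeze(0)                               # [N, d] updated node features
```

A single such layer already lets every node reach distant traffic information through the star shortcut, which is consistent with GTSR using only one Graph Transformer layer for global message-passing (cf. the GTSR-MP-k variant in Section 5.1).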
4.2. Path-Based Readout for Multiple Agents
- (1) During the expansion to multi-hop neighborhoods, a practical problem is that the number of nodes in different directions may vary, so the concatenated vectors for certain directions have different lengths. To address this issue, path-based readout concatenates node feature vectors along all paths from the central node, since paths with the same number of hops contain the same number of nodes.
- (2) Compared with operations such as averaging and summing, concatenation retains all information along each path, which helps the central node detect potential congestion on certain paths and subsequently decrease the corresponding link weights (see the sketch below).
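As an illustration of path-based readout, the sketch below enumerates all k-hop simple paths from the central node and concatenates the node features along each path; the dict-based graph representation and the function names are assumptions made here for brevity, not the authors' code.

```python
import torch

def k_hop_paths(adj: dict, src: int, k: int):
    """Enumerate all simple paths with exactly k hops starting from src (DFS)."""
    paths = []
    def dfs(path):
        if len(path) == k + 1:              # a k-hop path contains k + 1 nodes
            paths.append(path)
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:             # keep paths simple (no node revisited)
                dfs(path + [nxt])
    dfs([src])
    return paths

def path_based_readout(adj: dict, feats: dict, src: int, k: int) -> torch.Tensor:
    """Concatenate node features along every k-hop path from src.
    All such paths contain the same number of nodes, so every concatenated
    vector has the same length and can be stacked for the FC readout."""
    vecs = [torch.cat([feats[n] for n in path]) for path in k_hop_paths(adj, src, k)]
    return torch.stack(vecs) if vecs else torch.empty(0, 0)
```

How the resulting per-path vectors are grouped by outgoing link and fed to the FC readout is an implementation detail not shown here.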
4.3. MADRL Operation and Cooperation
- (1) We leverage federated learning to average only the GNN parameters among agents for global message-passing, while preserving independent updates of the FC NNs used for each agent's readout. In this architecture, the shared GNN module with averaged parameters is effectively a general model for all agents and facilitates learning a universal message-passing rule.
- (2) Without federated learning for parameter sharing, each agent would need to maintain a distinct GNN to process the entire graph while using only a few local node features for readout, leading to extensive redundant computation. With the proposed cooperation mechanism, all agents share one set of GNN parameters and operate on the same input graph. Therefore, global message-passing needs to be performed only once, and each agent directly fetches the required node features, greatly reducing computation (a sketch follows below).
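The sketch below illustrates this cooperation mechanism under an assumed agent structure: each agent holds a `gnn` module whose parameters are federated-averaged and a private `readout` FC head; attribute names such as `local_node_ids` are hypothetical and not taken from the authors' implementation.

```python
import torch

@torch.no_grad()
def federated_average(agents):
    """Average only the GNN parameters across agents; the FC readouts stay private."""
    names = [n for n, _ in agents[0].gnn.named_parameters()]
    for name in names:
        stacked = torch.stack([dict(a.gnn.named_parameters())[name].data for a in agents])
        mean = stacked.mean(dim=0)
        for a in agents:
            dict(a.gnn.named_parameters())[name].data.copy_(mean)

def routing_step(agents, node_feats, adj):
    """Run global message-passing once and let every agent reuse the shared embeddings."""
    shared = agents[0].gnn(node_feats, adj)            # identical for all agents after averaging
    link_weights = {}
    for a in agents:
        local = shared[a.local_node_ids]               # fetch only the node features this agent needs
        link_weights[a.node_id] = a.readout(local.flatten())   # private per-agent FC readout
    return link_weights
```

Since the averaged GNN is identical across agents and takes the same graph as input, a single forward pass per decision step suffices, which is where the computational saving comes from.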
5. Experiments
5.1. Simulation Setup
- ECMP [2], a traditional multi-path routing algorithm that statically distributes each flow across all equal-cost shortest paths between the source and the destination, with path cost measured in hop count (see the sketch after this list).
- GAPPO [18], an earlier packet routing algorithm that combines PPO and GAT within an SADRL architecture. The initial feature of each node is its corresponding row of the TM, zero-padded. The single agent uses a three-layer GNN to directly calculate the weights of all links.
- DGATR [14], a packet routing algorithm integrating a Deep Q-Network (DQN) with a GAT within an MADRL architecture. Each agent observes the indices of the current node, its neighboring nodes, and the destination node as the initial node feature, and employs the GNN to determine the next hop for each packet.
- GTSR-MP-k, a cut-down version of the proposed GTSR. The only difference from GTSR is that the single-layer Graph Transformer for global message-passing is replaced by a traditional k-layer MPNN, which reveals the impact of the number of GNN layers.
- GTSR-RO-k, another cut-down version of the proposed GTSR, in which the path-based readout is replaced by a simple readout over k-hop neighborhoods. This variant reveals the impact of different readout methods.
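As a point of reference for the first baseline above, the following is a minimal sketch of ECMP-style path selection with networkx; pinning a flow to one path by hashing is a common ECMP convention assumed here for illustration, not a detail taken from [2].

```python
import networkx as nx

def ecmp_paths(G: nx.Graph, src, dst):
    """All equal-cost shortest paths between src and dst (path cost = hop count)."""
    return list(nx.all_shortest_paths(G, src, dst))

def ecmp_assign(flow_id: int, G: nx.Graph, src, dst):
    """Deterministically pin a flow to one of its equal-cost paths via hashing."""
    paths = ecmp_paths(G, src, dst)
    return paths[hash((flow_id, src, dst)) % len(paths)]
```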
5.2. Analysis of Different GNN Layers
5.3. Analysis of Different Readout Methods
5.4. Evaluation on Unseen Topologies
5.5. Evaluation in Other Real-World Networks
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
CDF | Cumulative Distribution Function
DRL | Deep Reinforcement Learning
E2E | End-to-End
FC | Fully Connected
GNN | Graph Neural Network
MADRL | Multi-Agent Deep Reinforcement Learning
MPNN | Message Passing Neural Network
NN | Neural Network
QoS | Quality of Service
SADRL | Single-Agent Deep Reinforcement Learning
SDN | Software-Defined Network
TM | Traffic Matrix
References
- Moy, J. OSPF Version 2. Technical Report. 1997. Available online: https://www.rfc-editor.org/rfc/rfc2178.html (accessed on 20 January 2025).
- Hopps, C. Analysis of an Equal-Cost Multi-Path Algorithm. Technical Report. 2000. Available online: https://www.rfc-editor.org/rfc/rfc2992.html (accessed on 20 January 2025).
- Kreutz, D.; Ramos, F.M.; Verissimo, P.E.; Rothenberg, C.E.; Azodolmolky, S.; Uhlig, S. Software-defined networking: A comprehensive survey. Proc. IEEE 2014, 103, 14–76. [Google Scholar] [CrossRef]
- Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep reinforcement learning: A brief survey. IEEE Signal Process. Mag. 2017, 34, 26–38. [Google Scholar] [CrossRef]
- Xiao, Y.; Liu, J.; Wu, J.; Ansari, N. Leveraging deep reinforcement learning for traffic engineering: A survey. IEEE Commun. Surv. Tutorials 2021, 23, 2064–2097. [Google Scholar] [CrossRef]
- Guo, Y.; Lin, B.; Tang, Q.; Ma, Y.; Luo, H.; Tian, H.; Chen, K. Distributed Traffic Engineering in Hybrid Software Defined Networks: A Multi-agent Reinforcement Learning Framework. IEEE Trans. Netw. Serv. Manag. 2024, 21, 6759–6769. [Google Scholar] [CrossRef]
- You, X.; Li, X.; Xu, Y.; Feng, H.; Zhao, J.; Yan, H. Toward packet routing with fully distributed multiagent deep reinforcement learning. IEEE Trans. Syst. Man Cybern. Syst. 2020, 52, 855–868. [Google Scholar] [CrossRef]
- Rusek, K.; Suárez-Varela, J.; Mestres, A.; Barlet-Ros, P.; Cabellos-Aparicio, A. Unveiling the potential of graph neural networks for network modeling and optimization in SDN. In Proceedings of the 2019 ACM Symposium on SDN Research, San Jose, CA, USA, 3–4 April 2019; pp. 140–151. [Google Scholar]
- Rusek, K.; Suárez-Varela, J.; Almasan, P.; Barlet-Ros, P.; Cabellos-Aparicio, A. RouteNet: Leveraging graph neural networks for network modeling and optimization in SDN. IEEE J. Sel. Areas Commun. 2020, 38, 2260–2270. [Google Scholar] [CrossRef]
- Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2008, 20, 61–80. [Google Scholar] [CrossRef]
- Cui, Z.; Henrickson, K.; Ke, R.; Wang, Y. Traffic graph convolutional recurrent neural network: A deep learning framework for network-scale traffic learning and forecasting. IEEE Trans. Intell. Transp. Syst. 2019, 21, 4883–4894. [Google Scholar] [CrossRef]
- Hope, O.; Yoneki, E. GDDR: GNN-based data-driven routing. In Proceedings of the 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), Washington, DC, USA, 7–10 July 2021; pp. 517–527. [Google Scholar]
- Swaminathan, A.; Chaba, M.; Sharma, D.K.; Ghosh, U. GraphNET: Graph neural networks for routing optimization in software defined networks. Comput. Commun. 2021, 178, 169–182. [Google Scholar] [CrossRef]
- Mai, X.; Fu, Q.; Chen, Y. Packet routing with graph attention multi-agent reinforcement learning. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar]
- Almasan, P.; Suárez-Varela, J.; Rusek, K.; Barlet-Ros, P.; Cabellos-Aparicio, A. Deep reinforcement learning meets graph neural networks: Exploring a routing optimization use case. Comput. Commun. 2022, 196, 184–194. [Google Scholar] [CrossRef]
- He, Q.; Wang, Y.; Wang, X.; Xu, W.; Li, F.; Yang, K.; Ma, L. Routing optimization with deep reinforcement learning in knowledge defined networking. IEEE Trans. Mob. Comput. 2023, 23, 1444–1455. [Google Scholar] [CrossRef]
- Chen, B.; Zhu, D.; Wang, Y.; Zhang, P. An approach to combine the power of deep reinforcement learning with a graph neural network for routing optimization. Electronics 2022, 11, 368. [Google Scholar] [CrossRef]
- Li, X.; Xiao, Y.; Liu, S.; Lu, X.; Liu, F.; Zhou, W.; Liu, J. GAPPO-A Graph Attention Reinforcement Learning based Robust Routing Algorithm. In Proceedings of the 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Toronto, ON, Canada, 5–8 September 2023; pp. 1–7. [Google Scholar]
- Bhavanasi, S.S.; Pappone, L.; Esposito, F. Dealing with changes: Resilient routing via graph neural networks and multi-agent deep reinforcement learning. IEEE Trans. Netw. Serv. Manag. 2023, 20, 2283–2294. [Google Scholar] [CrossRef]
- Bernárdez, G.; Suárez-Varela, J.; López, A.; Shi, X.; Xiao, S.; Cheng, X.; Barlet-Ros, P.; Cabellos-Aparicio, A. Magnneto: A graph neural network-based multi-agent system for traffic engineering. IEEE Trans. Cogn. Commun. Netw. 2023, 9, 494–506. [Google Scholar] [CrossRef]
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1263–1272. [Google Scholar]
- Huang, W.; Rong, Y.; Xu, T.; Sun, F.; Huang, J. Tackling over-smoothing for general graph convolutional networks. arXiv 2020, arXiv:2008.09864. [Google Scholar]
- Wu, Q.; Zhao, W.; Yang, C.; Zhang, H.; Nie, F.; Jiang, H.; Bian, Y.; Yan, J. Simplifying and empowering transformers for large-graph representations. Adv. Neural Inf. Process. Syst. 2024, 36. [Google Scholar]
- Rampásek, L.; Galkin, M.; Dwivedi, V.P.; Luu, A.T.; Wolf, G.; Beaini, D. Recipe for a general, powerful, scalable graph transformer. arXiv 2022, arXiv:2205.12454. [Google Scholar]
- Varga, A.; Hornig, R. An overview of the OMNeT++ simulation environment. In Proceedings of the 1st International ICST Conference on Simulation Tools and Techniques for Communications, Networks and Systems, Marseille, France, 3–7 March 2008. [Google Scholar]
- Stampa, G.; Arias, M.; Sánchez-Charles, D.; Muntés-Mulero, V.; Cabellos, A. A deep-reinforcement learning approach for software-defined networking routing optimization. arXiv 2017, arXiv:1709.07080. [Google Scholar]
- Valadarsky, A.; Schapira, M.; Shahaf, D.; Tamar, A. Learning to route. In Proceedings of the 16th ACM Workshop on Hot Topics in Networks, Palo Alto, CA, USA, 30 November–1 December 2017; pp. 185–191. [Google Scholar]
- Zhang, W.; Yang, D.; Wu, W.; Peng, H.; Zhang, N.; Zhang, H.; Shen, X. Optimizing federated learning in distributed industrial IoT: A multi-agent approach. IEEE J. Sel. Areas Commun. 2021, 39, 3688–3703. [Google Scholar] [CrossRef]
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347. [Google Scholar]
- Knight, S.; Nguyen, H.X.; Falkner, N.; Bowden, R.; Roughan, M. The internet topology zoo. IEEE J. Sel. Areas Commun. 2011, 29, 1765–1775. [Google Scholar] [CrossRef]
- Fey, M.; Lenssen, J.E. Fast graph representation learning with PyTorch Geometric. arXiv 2019, arXiv:1903.02428. [Google Scholar]
Reference | Routing Type | DRL Architecture | GNN Model | Superior Robustness | Shortcomings | Common Shortcoming |
---|---|---|---|---|---|---|
[12] | Flow | SADRL | MPNN | No | Failing to distinctly outperform baselines after topology changes | Only utilizing basic GNNs, suffering from the risk of over-smoothing and the deficiency in feature extraction capabilities during long-range message-passing |
[13] | Packet | SADRL | MPNN | No | Requiring retraining after topology changes | |
[14] | Packet | MADRL | GAT | No | Lacking evaluation of robustness against topology changes | |
[15] | Flow | SADRL | MPNN | Yes | Selecting one path for each flow, only tailored for path-based flow routing | |
[16] | Flow | SADRL | MPNN | Yes | Only tailored for path-based flow routing, constrained to a single agent | |
[17] | Flow | SADRL | MPNN | Yes | Only tailored for path-based flow routing, constrained to a single agent | |
[18] | Packet | SADRL | GAT | Yes | Constrained to a single agent, only evaluating for simple traffic | |
[19] | Flow | MADRL | GCN | Yes | Only tailored for path-based flow routing | |
[20] | Flow | MADRL | MPNN | Yes | Only tailored for path-based flow routing |
Algorithm | E2E Delay Mean (ms) | E2E Delay Variance | E2E Delay < 100 ms | Packet Loss Rate Mean | Packet Loss Rate Variance | Packet Loss Rate < 10%
---|---|---|---|---|---|---
ECMP | 107.35 | | 67.46% | 22.46 | | 23.25%
GAPPO | 212.87 | | 24.76% | 38.51 | | 18.17%
DGATR | 173.48 | | 30.07% | 16.83 | | 38.33%
GTSR-MP-4 | 65.52 | | 80.47% | 11.20 | | 63.33%
GTSR-RO-1 | 49.42 | | 89.41% | 10.45 | | 59.25%
GTSR | 25.74 | | 99.20% | 0.003 | | 99.99%
Topology | (Nodes, Links) | Betweenness | Number | Betweenness | Number | Betweenness | Sending Rate (Mbps)
---|---|---|---|---|---|---|---
Sprint | (11, 18) | 0.0989 | 17 | [0.1191, 0.0888] [+20.40%, −10.20%] | 132 | [0.1393, 0.0916] [+40.81%, −7.397%] | 0.063
Navigata | (13, 17) | 0.1188 | 11 | [0.1282, 0.1200] [+7.843%, +0.9803%] | 52 | [0.1515, 0.1191] [+27.45%, +0.2614%] | 0.029
BtAsiaPac | (20, 31) | 0.0733 | 25 | [0.0804, 0.0729] [+9.561%, −0.6327%] | 294 | [0.0944, 0.0727] [+28.68%, −0.9257%] | 0.021
Hurricane-Electric | (24, 37) | 0.0800 | 30 | [0.0871, 0.0790] [+8.847%, −1.234%] | 427 | [0.0989, 0.0768] [+23.66%, −3.997%] | 0.014
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, X.; Li, J.; Zhou, J.; Liu, J. Towards Robust Routing: Enabling Long-Range Perception with the Power of Graph Transformers and Deep Reinforcement Learning in Software-Defined Networks. Electronics 2025, 14, 476. https://doi.org/10.3390/electronics14030476