Multi-Agent-Deep-Reinforcement-Learning-Enabled Offloading Scheme for Energy Minimization in Vehicle-to-Everything Communication Systems
Abstract
1. Introduction
- (1) To process the computation-intensive tasks of computation-limited vehicles, we study an MEC-assisted multi-vehicle V2X communication system in which multi-antenna RSUs with linear receivers and a BS with a zero-forcing (ZF) receiver jointly offload the vehicles' tasks. To control the energy consumption, we formulate an energy consumption minimization problem and transform the resulting non-convex optimization problem into a multi-agent decision process, so that each vehicle can make intelligent decisions in its own communication environment.
- (2) To solve the non-convex optimization problem while satisfying the delay and power constraints, we propose a multi-agent deep reinforcement learning (MADRL)-enabled scheme that jointly optimizes server association, transmit power and offloading ratio. Guided by a reward function related to delay and energy consumption, vehicles are trained to select beneficial transmit powers, server associations and offloading ratios. To reduce the action-space dimension and the complexity of the MADRL algorithm, an improved K-nearest neighbors (KNN) algorithm assigns vehicles located within the coverage area of several RSUs to specific RSUs. Vehicles allocated to the same RSU form a group, and vehicles in the l-th group can only offload tasks to RSU l or the BS.
- (3) Numerical results show that the proposed MADRL scheme reduces energy consumption compared with the full-offloading and maximum-transmit-power schemes, and that RSUs equipped with ZF receivers consume less energy than those with matched-filter (MF) receivers. In addition, the proposed DRL scheme converges stably under different numbers of vehicles and data packet sizes.
2. System Model
2.1. Channel Model of the Multi-Antenna RSUs
2.2. Communication Model of the Multi-Antenna BS
3. Computing Model of MEC-Assisted V2X Communication System
3.1. Local Computing Model
3.2. RSU Computing Model
- (1) V2I communication link
- (2) I2B communication link
3.3. BS Processing Model
3.4. Optimization Problem
4. Proposed Scheme for Solving the Problem (P1)
4.1. Vehicle Grouping Based on the Improved KNN Algorithm
Algorithm 1: The Improved K-Nearest Neighbors Algorithm
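As an illustration of this grouping step, the following is a minimal Python sketch of the idea described in the contributions: vehicles covered by a single RSU are assigned to it directly, while vehicles inside the overlapping coverage of several RSUs are assigned by a majority vote over their K nearest already-grouped neighbors. This is not the paper's exact Algorithm 1; the coverage radius, the choice of K, the Euclidean metric and all identifiers (group_vehicles, veh_pos, rsu_pos) are illustrative assumptions.

```python
import numpy as np

def group_vehicles(veh_pos, rsu_pos, radius=300.0, k=3):
    """veh_pos: (N,2) vehicle coordinates; rsu_pos: (L,2) RSU coordinates."""
    dist = np.linalg.norm(veh_pos[:, None, :] - rsu_pos[None, :, :], axis=2)
    covered = dist <= radius                      # (N, L) coverage indicator
    group = np.full(len(veh_pos), -1, dtype=int)  # -1 = not yet assigned

    # Unambiguous vehicles: exactly one covering RSU.
    single = covered.sum(axis=1) == 1
    group[single] = np.argmax(covered[single], axis=1)

    # Ambiguous vehicles: vote among the K nearest already-assigned neighbors,
    # restricted to RSUs that actually cover the vehicle.
    for i in np.where(~single)[0]:
        cand = np.where(covered[i])[0]
        if cand.size == 0:                        # uncovered: nearest RSU
            group[i] = np.argmin(dist[i])
            continue
        assigned = np.where(group >= 0)[0]
        d2v = np.linalg.norm(veh_pos[assigned] - veh_pos[i], axis=1)
        votes = group[assigned[np.argsort(d2v)[:k]]]
        votes = votes[np.isin(votes, cand)]
        group[i] = (np.bincount(votes).argmax() if votes.size
                    else cand[np.argmin(dist[i, cand])])
    return group

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vehicles = rng.uniform(0, 1000, size=(20, 2))
    rsus = np.array([[250.0, 500.0], [750.0, 500.0]])
    print(group_vehicles(vehicles, rsus, radius=600.0))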
4.2. Joint-Optimized Server Association, Offloading Ratio and Transmission Power Scheme Based on DRL
- (1) Agent: In the system, each vehicle works as an agent. Each vehicle knows only its own information; since the BS and RSUs have no direct link for exchanging information, it is difficult to make globally optimal offloading and power decisions. Thus, training is performed on the optimization problem so that the agents adapt to the dynamic environment. We choose the vehicle as the agent making decisions on the uplink, and the BS and RSUs send the corresponding information to the associated vehicles, i.e., the packet size of the task generated by vehicle k and the channel state information.
- (2) Action space: Due to the limited communication resources in the system, the server association, offloading ratio and transmit power affect the communication rate and determine the communication energy consumption. Thus, the action space of vehicle k comprises the server association, the offloading ratio and the transmit power. The offloading and power-control sub-spaces are discrete sets of fixed lengths (both set to 5 in the simulations; see the parameter table), and the k-th agent chooses its action from the joint space formed by the server association, offloading and power-control sub-spaces.
- (3) State space: According to the communication model established for the system, the system energy consumption is related to the transmit power, the transmission rate and the delay; therefore, the state space comprises the transmit power, the transmission rate and the communication delay in Equation (24). When vehicle k offloads to an RSU, the transmission rate is calculated by (14), and when it offloads to the BS, the transmission rate is calculated by (20). Thus, the state of vehicle k comprises these three quantities.
- (4) Reward: In the DRL algorithm, the agents are trained to adjust their action-selection strategy and obtain the expected value by accumulating the maximum reward return. Aiming to minimize the total energy consumption of the V2X system while meeting the delay constraint, the reward function of vehicle k is modeled so that lower energy consumption yields a higher reward and violations of the delay constraint are penalized, as sketched below.
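As a hedged illustration of this reward design (the exact expression is given by the paper's reward equation, which is not reproduced here), the snippet below rewards lower total energy consumption and applies a fixed penalty when the delay constraint is violated. The penalty value and the function name reward are assumptions; only the 8 s delay bound comes from the parameter table.

```python
T_MAX = 8.0        # maximum vehicle delay in seconds (parameter table)
PENALTY = -10.0    # hypothetical penalty for violating the delay constraint

def reward(energy_joules: float, delay_s: float) -> float:
    """Negative energy as reward; a fixed penalty if the delay bound fails."""
    if delay_s > T_MAX:
        return PENALTY
    return -energy_joules

print(reward(0.8, 5.0), reward(0.8, 9.0))  # -0.8  -10.0
```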
Algorithm 2: DRL-Based Multi-Agent Transmit Power and Offloading Optimization Scheme
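As a concrete illustration of the per-agent training loop that Algorithm 2 describes, here is a minimal PyTorch DQN sketch: a joint discrete action space (server association × offloading level × power level), ε-greedy selection, an experience replay buffer and a target network. The layer widths, buffer size, batch size, stubbed environment transition and all identifiers are assumptions, not the paper's implementation; the learning rate (0.01), discount factor (0.9), exploration setting (0.96) and 200 steps per episode follow the parameter table, with 0.96 interpreted here as the greedy-action probability.

```python
import random
from collections import deque
import torch
import torch.nn as nn

STATE_DIM = 3                      # e.g., (transmit power, rate, delay)
N_SERVERS, N_OFFLOAD, N_POWER = 2, 5, 5
N_ACTIONS = N_SERVERS * N_OFFLOAD * N_POWER
GAMMA, LR, EPS = 0.9, 0.01, 0.96   # values from the parameter table

def make_qnet():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, N_ACTIONS))

qnet, target = make_qnet(), make_qnet()
target.load_state_dict(qnet.state_dict())
opt = torch.optim.Adam(qnet.parameters(), lr=LR)
buffer = deque(maxlen=10_000)

def select_action(state):
    """Greedy action with probability EPS, random exploration otherwise."""
    if random.random() > EPS:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(qnet(torch.tensor(state)).argmax())

def decode(a):
    """Map a flat action index back to (server, offload level, power level)."""
    return a // (N_OFFLOAD * N_POWER), (a // N_POWER) % N_OFFLOAD, a % N_POWER

def train_step(batch_size=32):
    if len(buffer) < batch_size:
        return
    s, a, r, s2 = zip(*random.sample(buffer, batch_size))
    s, s2 = torch.tensor(s), torch.tensor(s2)
    a, r = torch.tensor(a), torch.tensor(r)
    q = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():                         # target network bootstrap
        y = r + GAMMA * target(s2).max(1).values
    loss = nn.functional.mse_loss(q, y)
    opt.zero_grad(); loss.backward(); opt.step()

# Stubbed interaction loop (shortened; the paper trains 350 episodes of 200
# steps). Replace the random transition with the V2X environment dynamics.
for episode in range(3):
    state = [0.0, 0.0, 0.0]
    for step in range(200):
        action = select_action(state)
        next_state = [random.random() for _ in range(STATE_DIM)]
        rew = -random.random()                    # stand-in for the reward
        buffer.append((state, action, rew, next_state))
        train_step()
        state = next_state
    target.load_state_dict(qnet.state_dict())     # periodic target update
```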
4.3. DRL Algorithm Complexity Analysis
- (1) Calculating the reward function: According to its state, each agent selects a beneficial action, interacts with the environment and obtains a reward. Thus, the computational complexity of an agent calculating the reward is $\mathcal{O}(|\mathcal{S}_t|)$, where $|\mathcal{S}_t|$ denotes the length of the state space at the t-th training step.
- (2) Selecting the beneficial action: For each agent, the number of layers in the DQN and the number of neurons in each layer are considered. Let the DQN have $M$ layers with $n_m$ neurons in the m-th layer. The computational complexity of the m-th layer is $\mathcal{O}(n_{m-1}n_m)$, and, for the t-th step, the computational complexity of an agent selecting the beneficial action is $\mathcal{O}\big(\sum_{m=2}^{M} n_{m-1}n_m + |\mathcal{A}_t|\big)$, where $|\mathcal{A}_t|$ denotes the length of the action space at the t-th training step. Therefore, the per-step computational complexity of an agent is $\mathcal{O}\big(|\mathcal{S}_t| + \sum_{m=2}^{M} n_{m-1}n_m + |\mathcal{A}_t|\big)$. The computational complexity of a complete episode is $\mathcal{O}\big(T\,(|\mathcal{S}_t| + \sum_{m=2}^{M} n_{m-1}n_m + |\mathcal{A}_t|)\big)$, where $T$ is the number of steps in one episode, and the computational complexity of the whole algorithm is $\mathcal{O}\big(E\,T\,(|\mathcal{S}_t| + \sum_{m=2}^{M} n_{m-1}n_m + |\mathcal{A}_t|)\big)$, where $E$ is the number of iteration episodes. In this paper, all agents share the same DQN structure, so the computational complexity of all $K$ agents is $\mathcal{O}\big(K\,E\,T\,(|\mathcal{S}_t| + \sum_{m=2}^{M} n_{m-1}n_m + |\mathcal{A}_t|)\big)$. The complexity of the proposed algorithm is therefore determined primarily by the DQN network architecture and the dimensions of the state and action spaces. In contrast, the complexity of the Q-learning algorithm is determined primarily by the number of agents, and its implementation is more complex than that of the MADRL algorithm in the multi-vehicle V2X communication system. Thus, the proposed MADRL algorithm is better suited to solving the optimization problem (P1) in this work. A worked numerical instance of the per-step complexity is given below.
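To make the complexity expressions concrete, the following worked instance plugs in hypothetical sizes matching the DQN sketch above: $M = 4$ layers with $(n_1, n_2, n_3, n_4) = (3, 64, 64, 50)$, $|\mathcal{S}_t| = 3$ state features and $|\mathcal{A}_t| = 50$ joint actions. These sizes are illustrative assumptions, not the paper's configuration.

```latex
% Hypothetical sizes: M = 4, (n_1, n_2, n_3, n_4) = (3, 64, 64, 50),
% |S_t| = 3, |A_t| = 50.
\sum_{m=2}^{M} n_{m-1} n_m = 3 \cdot 64 + 64 \cdot 64 + 64 \cdot 50 = 7488,
\qquad
|\mathcal{S}_t| + \sum_{m=2}^{M} n_{m-1} n_m + |\mathcal{A}_t| = 3 + 7488 + 50 = 7541.
% With T = 200 steps per episode and E = 350 episodes (parameter table),
% the whole algorithm for K agents costs O(K * 350 * 200 * 7541) operations.
```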
5. Simulation Results
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| Acronym | Description | Acronym | Description |
|---|---|---|---|
| BS | Base station | RSU | Road-side unit |
| DQN | Deep Q-network | SINR | Signal-to-interference-plus-noise ratio |
| DRL | Deep reinforcement learning | 6G | Sixth generation |
| I2B | Infrastructure-to-base-station | UAV | Unmanned aerial vehicle |
| KNN | K-nearest neighbors | V2B | Vehicle-to-base-station |
| MF | Matched filter | V2X | Vehicle-to-everything |
| MEC | Mobile edge computing | V2I | Vehicle-to-infrastructure |
| MADRL | Multi-agent deep reinforcement learning | ZF | Zero forcing |
| MIMO | Multi-input multi-output | | |
| Parameter | Value |
|---|---|
| Maximum vehicle delay | 8 s |
| Processing capability of the vehicles | 0.1 Mbit/s |
| Processing capability of the RSUs | 5 Mbit/s |
| Processing capability of the BS | 10 Mbit/s |
| Data packet size | 2 Mbit |
| Processed data packet size | 1 Mbit |
| Bandwidth of the V2I communication link | 1 MHz |
| Bandwidth of the I2B communication link | 5 MHz |
| Bandwidth of the V2B communication link | 5 MHz |
| Maximum transmit power of the vehicles | 23 dBm |
| Transmit power of the RSUs | 40 dBm |
| Length of the offloading action space | 5 |
| Length of the power-control action space | 5 |
| Noise power | −90 dBm |
| Learning rate | 0.01 |
| Discount factor | 0.9 |
| Exploration probability | 0.96 |
| Iteration episodes | 350 |
| Training steps in one episode | 200 |