Computational Offloading for MEC Networks with Energy Harvesting: A Hierarchical Multi-Agent Reinforcement Learning Approach
Abstract
1. Introduction
- We design a multi-user, multi-server MEC network with energy harvesting, in which decisions on the task offloading location and the task offloading ratio are made under the limited computing resources and battery power of the UEs. The computation offloading decision is optimized to minimize the weighted sum of the average task delay, energy consumption, and task drop rate.
- We propose a computation offloading framework based on hierarchical multi-agent reinforcement learning to minimize the system cost. The problem is formulated as an MDP and solved with the HDMAPPO strategy, which consists of a high-level MAPPO algorithm and a low-level MAPPO algorithm. The high-level MAPPO algorithm determines the MEC server for task offloading and whether to actively drop a task, while the low-level MAPPO algorithm determines the task offloading ratio. The output of the high-level problem becomes part of the state of the low-level problem, which restricts the low-level state space and thus reduces its complexity.
- Experimental results demonstrate that the proposed hierarchical multi-agent reinforcement learning based computation offloading framework reduces the weighted sum of the average task delay, energy consumption, and task drop rate compared with baseline algorithms.
2. Related Work
3. System Model
3.1. Local Computing Mode
3.2. Mobile Edge Execution Model
3.3. Energy Harvesting
3.4. Problem Formulation
4. Computation Offloading Decision Strategy
4.1. MAPPO Algorithm
Algorithm 1: MAPPO for subproblems
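To make the structure of Algorithm 1 concrete, here is a minimal sketch of the clipped MAPPO update, assuming PyTorch and following the network shapes and hyperparameters listed in the simulation tables (128-unit tanh layers, K_epochs = 10, clip rate 0.2). The Actor below uses a discrete categorical head as needed by the high-level subproblem; the paper's network table lists a softplus output layer, which would instead parameterize the continuous ratio policy of the low-level subproblem. All class and variable names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Per-agent policy network: local observation -> action distribution."""
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs):
        # Discrete head for the server-selection subproblem.
        return torch.distributions.Categorical(logits=self.net(obs))

class Critic(nn.Module):
    """Centralized value network: joint (global) state -> scalar value."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state):
        return self.net(state).squeeze(-1)

def ppo_update(actor, critic, optimizer, batch, clip=0.2, k_epochs=10):
    """One MAPPO update from a rollout batch of one agent's transitions."""
    obs, act, old_logp, ret, adv, state = batch  # pre-collected tensors
    for _ in range(k_epochs):  # K_epochs passes, as in the hyperparameter table
        dist = actor(obs)
        ratio = torch.exp(dist.log_prob(act) - old_logp)
        # Clipped surrogate objective, centralized value loss, entropy bonus.
        policy_loss = -torch.min(
            ratio * adv, torch.clamp(ratio, 1 - clip, 1 + clip) * adv
        ).mean()
        value_loss = (critic(state) - ret).pow(2).mean()
        loss = policy_loss + 0.5 * value_loss - 0.01 * dist.entropy().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

An optimizer such as `torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)` would drive `ppo_update`; the critic sees the joint state during training while each actor acts on local observations only, which is the centralized-training, decentralized-execution pattern of MAPPO.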
4.2. Offloading Location Selection Strategy
- States: The state consists of the system information observed by each agent at the beginning of each time slot. Each UE observes its agent ID, the data size and the computational density of its task, as well as its remaining battery power and the energy harvested in the current time slot. In addition, information about the processing capacity of the MEC servers is needed. These quantities together constitute the observation of UE i at step t.
- Action: In the offloading location selection problem, the action space is {0, 1, …, M}, where M is the number of MEC servers. A value of 0 indicates that the task is dropped directly and no MEC server is selected for offloading, whereas a value in {1, …, M} denotes the identifier of the selected MEC server. The offloading ratio of the task is resolved in the lower-level subproblem, so the action space does not include the offloading ratio here.
- Reward: The reward function reflects the efficiency of the actions executed in the environment and must be consistent with the system design for optimal results. It considers the task delay, energy consumption, and task drop rate, and is formulated as the weighted-sum cost in Equation (29). The action of the higher-level subproblem is treated as part of the state of the lower-level subproblem (see the sketch after this list).
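As an illustration only, the sketch below assembles the high-level observation and reward described above. The field layout, the weights `w1`–`w3`, and `NUM_MEC` are assumptions for the sketch; the actual weighted-sum cost is the one given in Equation (29).

```python
import numpy as np

NUM_MEC = 3  # assumed number of MEC servers; actions are {0, 1, ..., NUM_MEC}

def high_level_observation(ue_id, task_bits, cycles_per_bit,
                           battery, harvested, mec_freqs):
    """Local task/energy info of UE i at slot t plus MEC processing capacities."""
    return np.concatenate(
        ([ue_id, task_bits, cycles_per_bit, battery, harvested], mec_freqs)
    )

def high_level_reward(delay, energy, dropped, w1=1.0, w2=1.0, w3=1.0):
    """Negative weighted sum of delay, energy, and task drop; stand-in for Eq. (29)."""
    return -(w1 * delay + w2 * energy + w3 * float(dropped))

# Action semantics: 0 -> drop the task; a in {1, ..., NUM_MEC} -> offload to server a.
```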
4.3. Task Offload Ratio Strategy
- States: The state includes the decision of the high-level algorithm and the local information observed by the agent.
- Action: The action determines the task offloading ratio after the offloading server has been selected. The action is continuous over the interval [0, 1]: the chosen value is the proportion of the task executed locally, and the remainder is the proportion offloaded to the MEC server.
- Reward: The purpose of the low-level algorithm is to reduce the task delay, energy consumption, and task drop rate, which is consistent with the high-level algorithm; therefore, the low-level reward function is identical to the high-level one and is calculated by Equation (29). A sketch of the low-level state construction follows this list.
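A minimal sketch of how the high-level decision can be folded into the low-level state; one-hot concatenation is one plausible encoding, assumed here rather than prescribed by the paper.

```python
import numpy as np

def low_level_state(local_obs, high_level_action, num_mec):
    """Append a one-hot encoding of the high-level choice to the local observation."""
    choice = np.zeros(num_mec + 1)
    choice[high_level_action] = 1.0  # index 0 = drop, 1..num_mec = server id
    return np.concatenate((local_obs, choice))

# The low-level action is a single ratio r in [0, 1]: fraction r of the task
# runs locally and 1 - r is offloaded. A Beta policy head (softplus outputs,
# consistent with the network table) keeps sampled ratios inside [0, 1].
```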
4.4. HDMAPPO Framework
Algorithm 2: Proposed HDMAPPO algorithm
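The per-slot decision flow of the HDMAPPO framework might look like the following sketch, assuming trained high-level and low-level actors that expose an `act` method; the interface and return structure are hypothetical.

```python
import numpy as np

def hdmappo_step(high_actor, low_actor, local_obs, num_mec):
    """One HDMAPPO decision for a single UE in a single time slot."""
    a_high = high_actor.act(local_obs)  # 0 = drop, 1..num_mec = chosen server
    if a_high == 0:
        return {"drop": True, "server": None, "ratio": None}
    # Fold the high-level action into the low-level state (one-hot encoding).
    choice = np.zeros(num_mec + 1)
    choice[a_high] = 1.0
    s_low = np.concatenate((local_obs, choice))
    ratio = low_actor.act(s_low)  # fraction of the task executed locally
    return {"drop": False, "server": int(a_high), "ratio": float(ratio)}
```

Conditioning the low-level policy on the committed high-level action, rather than letting one flat policy choose server and ratio jointly, is what keeps each subproblem's state and action spaces small.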
5. Performance Evaluation
5.1. Simulation Setup
- All Local (AL): The entire task is executed locally at the UE.
- All MEC (AM): The entire task is offloaded to a randomly selected MEC server.
- Random task offloading (RTO): A random fraction of each task is offloaded to a randomly selected MEC server (the three heuristic baselines are sketched in code after this list).
- Independent proximal policy optimization offloading (IPPO): The IPPO [32] algorithm considers each UE as a separate agent, with no direct interdependence between the individual agents. Each agent independently executes the PPO algorithm and generates both server selection and task offloading ratio decisions.
- Multi-agent deep deterministic policy gradient offloading (MADDPG): MADDPG [33] extends deep deterministic policy gradient (DDPG) to a multi-agent environment by introducing a framework of centralized training and decentralized execution. Each UE individually trains an actor network to generate actions based on local private information, while a centralized critic network is used to update policy parameters based on global information.
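For concreteness, here are hypothetical implementations of the three heuristic baselines, mirroring the decision interface of the HDMAPPO sketch in Section 4.4. A ratio of 1.0 means fully local execution and 0.0 fully offloaded, matching the convention in Section 4.3.

```python
import random

def all_local_step(num_mec):
    # AL: the whole task stays on the UE (ratio 1.0 = fully local).
    return {"drop": False, "server": None, "ratio": 1.0}

def all_mec_step(num_mec):
    # AM: the whole task goes to a random server (ratio 0.0 = fully offloaded).
    return {"drop": False, "server": random.randint(1, num_mec), "ratio": 0.0}

def random_task_offloading_step(num_mec):
    # RTO: random server and a uniformly random local/offload split.
    return {"drop": False, "server": random.randint(1, num_mec),
            "ratio": random.random()}
```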
5.2. Performance Comparison
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
| --- | --- |
| MEC | Multi-access edge computing |
| AP | Access point |
| MARL | Multi-agent reinforcement learning |
| MAPPO | Multi-agent proximal policy optimization |
| HDMAPPO | Hierarchical double multi-agent proximal policy optimization |
| WPT | Wireless power transfer |
| EH | Energy harvesting |
| IoT | Internet of Things |
| MDP | Markov decision process |
| MADDPG | Multi-agent deep deterministic policy gradient |
| AL | All local |
| AM | All MEC |
| RTO | Random task offloading |
| IPPO | Independent proximal policy optimization offloading |
References
- Mach, P.; Becvar, Z. Mobile edge computing: A survey on architecture and computation offloading. IEEE Commun. Surv. Tutor. 2017, 19, 1628–1656.
- Zhao, Y.; Hou, F.; Lin, B.; Sun, Y. Joint Offloading and Resource Allocation with Diverse Battery Level Consideration in MEC System. IEEE Trans. Green Commun. Netw. 2023.
- Guo, S.; Liu, J.; Yang, Y.; Xiao, B.; Li, Z. Energy-efficient dynamic computation offloading and cooperative task scheduling in mobile cloud computing. IEEE Trans. Mob. Comput. 2018, 18, 319–333.
- Yi, C.; Cai, J.; Su, Z. A multi-user mobile computation offloading and transmission scheduling mechanism for delay-sensitive applications. IEEE Trans. Mob. Comput. 2019, 19, 29–43.
- Kumar, K.; Liu, J.; Lu, Y.H.; Bhargava, B. A survey of computation offloading for mobile systems. Mob. Netw. Appl. 2013, 18, 129–140.
- Lin, H.; Zeadally, S.; Chen, Z.; Labiod, H.; Wang, L. A survey on computation offloading modeling for edge computing. J. Netw. Comput. Appl. 2020, 169, 102781.
- Min, M.; Xiao, L.; Chen, Y.; Cheng, P.; Wu, D.; Zhuang, W. Learning-based computation offloading for IoT devices with energy harvesting. IEEE Trans. Veh. Technol. 2019, 68, 1930–1941.
- Choi, K.W.; Aziz, A.A.; Setiawan, D.; Tran, N.M.; Ginting, L.; Kim, D.I. Distributed wireless power transfer system for Internet of Things devices. IEEE Internet Things J. 2018, 5, 2657–2671.
- Zaman, S.K.U.; Jehangiri, A.I.; Maqsood, T.; Umar, A.I.; Khan, M.A.; Jhanjhi, N.Z.; Shorfuzzaman, M.; Masud, M. COME-UP: Computation offloading in mobile edge computing with LSTM based user direction prediction. Appl. Sci. 2022, 12, 3312.
- Zaman, S.K.U.; Jehangiri, A.I.; Maqsood, T.; Haq, N.U.; Umar, A.I.; Shuja, J.; Ahmad, Z.; Dhaou, I.B.; Alsharekh, M.F. LiMPO: Lightweight mobility prediction and offloading framework using machine learning for mobile edge computing. Clust. Comput. 2022, 26, 99–117.
- Yu, C.; Velu, A.; Vinitsky, E.; Wang, Y.; Bayen, A.; Wu, Y. The surprising effectiveness of PPO in cooperative, multi-agent games. arXiv 2021, arXiv:2103.01955.
- Li, J.; Gao, H.; Lv, T.; Lu, Y. Deep reinforcement learning based computation offloading and resource allocation for MEC. In Proceedings of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain, 15–18 April 2018; pp. 1–6.
- Li, C.; Xia, J.; Liu, F.; Li, D.; Fan, L.; Karagiannidis, G.K.; Nallanathan, A. Dynamic offloading for multiuser muti-CAP MEC networks: A deep reinforcement learning approach. IEEE Trans. Veh. Technol. 2021, 70, 2922–2927.
- Ke, H.; Wang, J.; Deng, L.; Ge, Y.; Wang, H. Deep reinforcement learning-based adaptive computation offloading for MEC in heterogeneous vehicular networks. IEEE Trans. Veh. Technol. 2020, 69, 7916–7929.
- Xu, J.; Ai, B.; Chen, L.; Cui, Y.; Wang, N. Deep Reinforcement Learning for Computation and Communication Resource Allocation in Multiaccess MEC Assisted Railway IoT Networks. IEEE Trans. Intell. Transp. Syst. 2022, 23, 23797–23808.
- Qu, B.; Bai, Y.; Chu, Y.; Wang, L.e.; Yu, F.; Li, X. Resource allocation for MEC system with multi-users resource competition based on deep reinforcement learning approach. Comput. Netw. 2022, 215, 109181.
- Zhang, Z.; Yu, F.R.; Fu, F.; Yan, Q.; Wang, Z. Joint offloading and resource allocation in mobile edge computing systems: An actor-critic approach. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 9–13 December 2018; pp. 1–6.
- Liu, T.; Zhang, Y.; Zhu, Y.; Tong, W.; Yang, Y. Online computation offloading and resource scheduling in mobile-edge computing. IEEE Internet Things J. 2021, 8, 6649–6664.
- Ho, T.M.; Nguyen, K.K. Joint server selection, cooperative offloading and handover in multi-access edge computing wireless network: A deep reinforcement learning approach. IEEE Trans. Mob. Comput. 2020, 21, 2421–2435.
- Wang, J.; Hu, J.; Min, G.; Zhan, W.; Zomaya, A.Y.; Georgalas, N. Dependent task offloading for edge computing based on deep reinforcement learning. IEEE Trans. Comput. 2021, 71, 2449–2461.
- Peng, H.; Shen, X. Multi-agent reinforcement learning based resource management in MEC- and UAV-assisted vehicular networks. IEEE J. Sel. Areas Commun. 2020, 39, 131–141.
- Liu, C.; Tang, F.; Hu, Y.; Li, K.; Tang, Z.; Li, K. Distributed task migration optimization in MEC by extending multi-agent deep reinforcement learning approach. IEEE Trans. Parallel Distrib. Syst. 2020, 32, 1603–1614.
- Ke, H.; Wang, H.; Sun, H. Multi-Agent Deep Reinforcement Learning-Based Partial Task Offloading and Resource Allocation in Edge Computing Environment. Electronics 2022, 11, 2394.
- Zhou, H.; Long, Y.; Gong, S.; Zhu, K.; Hoang, D.T.; Niyato, D. Hierarchical Multi-Agent Deep Reinforcement Learning for Energy-Efficient Hybrid Computation Offloading. IEEE Trans. Veh. Technol. 2022, 72, 986–1001.
- Huang, X.; Leng, S.; Maharjan, S.; Zhang, Y. Multi-agent deep reinforcement learning for computation offloading and interference coordination in small cell networks. IEEE Trans. Veh. Technol. 2021, 70, 9282–9293.
- Chen, Z.; Zhang, L.; Pei, Y.; Jiang, C.; Yin, L. NOMA-based multi-user mobile edge computation offloading via cooperative multi-agent deep reinforcement learning. IEEE Trans. Cogn. Commun. Netw. 2021, 8, 350–364.
- Zhao, N.; Ye, Z.; Pei, Y.; Liang, Y.C.; Niyato, D. Multi-agent deep reinforcement learning for task offloading in UAV-assisted mobile edge computing. IEEE Trans. Wirel. Commun. 2022, 21, 6949–6960.
- Lin, W.; Ma, H.; Li, L.; Han, Z. Computing Assistance From the Sky: Decentralized Computation Efficiency Optimization for Air-Ground Integrated MEC Networks. IEEE Wirel. Commun. Lett. 2022, 11, 2420–2424.
- Gan, Z.; Lin, R.; Zou, H. A Multi-Agent Deep Reinforcement Learning Approach for Computation Offloading in 5G Mobile Edge Computing. In Proceedings of the 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Taormina, Italy, 16–19 May 2022; pp. 645–648.
- Gong, Y.; Yao, H.; Wang, J.; Jiang, L.; Yu, F.R. Multi-agent driven resource allocation and interference management for deep edge networks. IEEE Trans. Veh. Technol. 2021, 71, 2018–2030.
- Guo, S.; Xiao, B.; Yang, Y.; Yang, Y. Energy-efficient dynamic offloading and resource scheduling in mobile cloud computing. In Proceedings of the IEEE INFOCOM 2016—The 35th Annual IEEE International Conference on Computer Communications, San Francisco, CA, USA, 10–14 April 2016; pp. 1–9.
- de Witt, C.S.; Gupta, T.; Makoviichuk, D.; Makoviychuk, V.; Torr, P.H.; Sun, M.; Whiteson, S. Is independent learning all you need in the StarCraft multi-agent challenge? arXiv 2020, arXiv:2011.09533.
- Chen, X.; Liu, G. Energy-efficient task offloading and resource allocation via deep reinforcement learning for augmented reality in mobile edge networks. IEEE Internet Things J. 2021, 8, 10843–10856.
| Approaches | Advantages | Disadvantages |
| --- | --- | --- |
| Single-agent reinforcement learning based methods [12,13,19] | Simple model and easy-to-implement algorithm. | The growing number of user devices results in an explosive action space, impeding algorithmic learning. |
| Multi-agent reinforcement learning based methods [22,25,30] | Centralized training with decentralized execution; able to solve cooperation or competition problems. | Unstable learning processes may occur in complex scenarios. |
| The proposed approach | Decomposing the problem reduces complexity and can solve cooperation problems. | Poor adaptability; requires manual decomposition of the problem. |
| Parameter | Value |
| --- | --- |
| CPU frequency of MEC server | GHz |
| CPU frequency of UE | GHz |
| Bandwidth of channel W | 10 MHz |
| Average interference power I | W |
| Average transmit power of UE | W |
| Data size of the task | Kbits |
| Computational density of the task | cycles/bit |
| Maximum tolerable delay of a task | 1 s |
| Component | Network Structure | Hyperparameter | Value |
| --- | --- | --- | --- |
| Actor | fc(state_dim,128), tanh | Learning rate of actor | 0.0003 |
|  | fc(128,128), tanh | Learning rate of critic | 0.0004 |
|  | fc(128,action_dim), softplus | Reward discount | 0.99 |
| Critic | fc(state_dim,128), tanh | Optimizer | Adam |
|  | fc(128,128), tanh | K_epochs | 10 |
|  | fc(128,1) | Clip_rate | 0.2 |