Dynamic Spectrum Sharing Based on Deep Reinforcement Learning in Mobile Communication Systems
Abstract
1. Introduction
- (1) The resource allocation problem in the CR system is modeled as a reinforcement learning task. Each SU is represented as an agent, its choice of channel and transmission power defines its action, and the reward is based on the quality of communication and on potential collisions (a minimal sketch of this formulation follows this list).
- (2) A DRL-based algorithm is presented that enables SUs to access the spectrum and control their transmission power in the CR system. The algorithm builds on DQN and incorporates features such as experience replay and a frozen target network; both DQN and DRQN are used as the underlying neural network structures. With a well-designed algorithm and network structure, users can learn and optimize their access strategies through training.
- (3) Simulation experiments compare the proposed algorithm with other policies and investigate the impact of parameters such as the coefficients in the reward functions and the active rate. The results show that the proposed algorithm effectively enhances system performance and reduces interference to the PUs.
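To make the formulation in contribution (1) concrete, the following is a minimal Python sketch of a per-SU reward and the joint channel/power action encoding. The weights `w1`/`w2`, the collision-penalty structure, and all helper names are illustrative assumptions, not the paper's exact reward functions (2) and (3).

```python
import numpy as np

# Illustrative reward for one SU agent: rewards the achieved rate and
# penalizes collisions with PUs and with other SUs. The coefficients
# w1/w2 stand in for the (unspecified here) weights of the paper's
# reward functions (2) and (3).
def su_reward(sinr_db: float, pu_collision: bool, su_collision: bool,
              w1: float = 1.0, w2: float = 10.0) -> float:
    rate = np.log2(1.0 + 10.0 ** (sinr_db / 10.0))  # Shannon-style rate term
    penalty = w2 if pu_collision else (w2 / 2.0 if su_collision else 0.0)
    return w1 * rate - penalty

# Action space: a (channel, power) pair per SU, as described in (1).
# Power levels follow the multi-SU parameter table below.
N_CHANNELS, POWERS_MW = 8, [1.0, 10.0, 100.0]

def decode_action(a: int):
    """Map a flat action index in [0, N_CHANNELS * len(POWERS_MW))
    to a (channel index, transmission power in mW) pair."""
    return a // len(POWERS_MW), POWERS_MW[a % len(POWERS_MW)]
```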
2. Related Work
3. System Model
4. DRL-Based Spectrum Sharing Method
4.1. Reinforcement Learning Model
4.2. Deep Q-Network and Deep Recurrent Q-Network
4.3. DRL-Based Training Algorithm
Algorithm 1: Training Algorithm of the DQN Method
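The algorithm body is not reproduced in this version. As a hedged illustration of the two features named in contribution (2), here is a minimal PyTorch sketch of one DQN training step with experience replay and a frozen target network. The layer sizes, learning rate, and replay format are assumptions; only the discount rate γ = 0.9 is taken from the parameter tables below.

```python
import random
from collections import deque
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Fully connected Q-network: observation -> one Q-value per (channel, power) action."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))

    def forward(self, x):
        return self.net(x)

obs_dim, n_actions, gamma = 16, 24, 0.9      # gamma = 0.9 as in the parameter tables
q, q_target = QNet(obs_dim, n_actions), QNet(obs_dim, n_actions)
q_target.load_state_dict(q.state_dict())     # frozen target network
opt = torch.optim.Adam(q.parameters(), lr=1e-3)  # learning rate is illustrative
replay = deque(maxlen=10_000)                # experience replay buffer
# After each environment step, append a transition of tensors:
# replay.append((s, torch.tensor(a), torch.tensor(r), s_next))

def train_step(batch_size: int = 32):
    if len(replay) < batch_size:
        return
    # Sample a random minibatch; actions are long tensors, rewards floats.
    s, a, r, s2 = map(torch.stack, zip(*random.sample(replay, batch_size)))
    with torch.no_grad():                    # bootstrap from the frozen target net
        y = r + gamma * q_target(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q(s).gather(1, a.unsqueeze(1)).squeeze(1), y)
    opt.zero_grad(); loss.backward(); opt.step()

# Periodically refresh the frozen target:
# q_target.load_state_dict(q.state_dict())
```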
Algorithm 2: Training Algorithm of the DRQN Method
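The DRQN variant differs mainly in the network: a recurrent layer summarizes the history of partial observations before the Q-value head, and the replay buffer stores short observation sequences instead of single transitions. A minimal sketch, assuming an LSTM and illustrative layer sizes:

```python
import torch
import torch.nn as nn

class DRQNet(nn.Module):
    """Recurrent Q-network: an LSTM summarizes the history of partial
    observations before the Q-value head (sketch; sizes are illustrative)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, seq_len, obs_dim); replay holds sequences, not steps
        out, state = self.lstm(obs_seq, state)
        return self.head(out[:, -1]), state  # Q-values at the last time step
```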
5. Simulation Results
5.1. Performance under Different Numbers of Users
5.2. Parameters in the Reward Functions (2) and (3)
5.3. Active Rate
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Simulation parameters for the single-SU scenario:

Parameter | Value
---|---
Number of PUs | 15
Number of SUs | 1
Active Rate | 1
Selectable Transmission Power | 20 mW
Learning Rate |
Discount Rate (γ) | 0.9
Simulation parameters for the multi-SU scenario:

Parameter | Value
---|---
Number of PUs | 8
Number of SUs | 4
Active Rate | 0.3
Selectable Transmission Power | 1 mW, 10 mW, 100 mW
Learning Rate |
Discount Rate (γ) | 0.9
Method | |||
---|---|---|---|
DQN | 28.7938 | 23.8333 | −0.1785 |
DRQN | 28.4703 | 25.0642 | 0 |
Random | 23.6995 | 20.0518 | −28.3294 |
Greedy | 29.1707 | 25.9507 | 0 |
Method | |||
---|---|---|---|
DQN | −6.2809 | 15.2238 | 223.8512 |
DRQN | −3.7978 | 16.9011 | 246.8618 |
Random | −4.9356 | 4.9345 | 161.5352 |
Greedy | −3.9868 | 5.2668 | 135.2807 |
Method | | |
---|---|---|---
DQN | 20.6760% (with PU) | 18.1336% (with PU) | 21.4618% (with PU)
| 12.9720% (with SU) | 2.9058% (with SU) | 4.3354% (with SU)
DRQN | 17.4286% (with PU) | 16.8952% (with PU) | 15.4653% (with PU)
| 0 (with SU) | 0 (with SU) | 0 (with SU)
Random | 17.8571% (with PU) | 21.0887% (with PU) | 18.2176% (with PU)
| 3.9442% (with SU) | 5.1209% (with SU) | 2.0970% (with SU)
Greedy | 10.5238% (with PU) | 11.8952% (with PU) | 14.6789% (with PU)
| 63.7143% (with SU) | 59.7984% (with SU) | 46.2647% (with SU)
Method | Active Rate = 1 | Active Rate = 0.5 | Active Rate = 0.2 |
---|---|---|---|
DQN | 4.0523 | 5.9819 | 9.6539 |
DRQN | 2.0296 | 6.8767 | 9.5976 |
Random | −2.2040 | 3.7478 | 5.1396 |
Greedy | −2.9390 | −2.5156 | −2.5097 |
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: Liu, S.; Pan, C.; Zhang, C.; Yang, F.; Song, J. Dynamic Spectrum Sharing Based on Deep Reinforcement Learning in Mobile Communication Systems. Sensors 2023, 23, 2622. https://doi.org/10.3390/s23052622