A Usage Aware Dynamic Spectrum Access Scheme for Interweave Cognitive Radio Network by Exploiting Deep Reinforcement Learning
Abstract
1. Introduction
2. Related Work
3. System Model and Preliminaries
3.1. System Model
3.2. Preliminaries
3.2.1. Reinforcement Learning
3.2.2. Deep Reinforcement Learning
4. Proposed Deep Reinforcement Learning Based Usage Aware Spectrum Access Scheme
4.1. Problem Formulation
4.2. Existing Q-Learning and DQN Based Spectrum Access Methods
4.3. Proposed Usage Aware Spectrum Access Scheme
4.3.1. Compressed States Representation
4.3.2. Additional Action
4.3.3. Status Aware Cost Function
5. Simulation Results
5.1. Simulation Settings
5.2. Evaluation Results for Correlated Channel Usages
5.3. Evaluation Results for Uncorrelated Channel Usages
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
| DC | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 |
|---|---|---|---|---|---|
| | 0.1261 | 0.3195 | 0.4689 | 0.3195 | 0.1261 |
| | 0.2104 | 0.5119 | 0.4690 | 0.5119 | 0.2105 |
| | 0.2777 | 0.6519 | 0.7219 | 0.6519 | 0.2777 |
| | 0.3330 | 0.7539 | 0.7219 | 0.7539 | 0.3330 |
| | 0.3788 | 0.8141 | 0.8813 | 0.8247 | 0.3788 |
| | 0.4152 | 0.8247 | 0.8813 | 0.8141 | 0.4152 |
| | 0.4152 | 0.8659 | 0.9710 | 0.8659 | 0.4431 |
| | 0.4431 | 0.8669 | 0.9710 | 0.8669 | 0.4617 |
| | 0.4617 | 0.8813 | 1.0000 | 0.8813 | 0.4690 |
Parameters | Value |
---|---|
Number of PUs | 3, 10 |
Total time slots | 110,000 |
Evaluation time slots | 10,000 |
Mini-batch size | 2500 |
Replay memory size | 125,000 |
Exploration rate | 1 → 0.001 |
Learning rate | 0.00009 |
Discount factor | 0.1 |
Number of hidden layers | 1 |
Number of neurons | 512 |
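
The settings in the parameter table can be gathered into one configuration object, which may help when reproducing the simulation. The following is only a minimal sketch: the class name `DqnSettings` and all field names are hypothetical, the training length assumes that the evaluation slots are counted inside the total, and the linear exploration decay is an assumption, since the table gives only the start and end values of the exploration rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DqnSettings:
    """Values copied from the parameter table; field names are hypothetical."""
    num_pus_options: tuple = (3, 10)    # "Number of PUs": two scenarios
    total_time_slots: int = 110_000
    evaluation_time_slots: int = 10_000
    mini_batch_size: int = 2_500
    replay_memory_size: int = 125_000
    epsilon_start: float = 1.0          # exploration rate at the start
    epsilon_end: float = 0.001          # exploration rate at the end
    learning_rate: float = 0.00009
    discount_factor: float = 0.1
    num_hidden_layers: int = 1
    num_neurons: int = 512

    @property
    def training_time_slots(self) -> int:
        # Assumption: the evaluation slots are part of the total slot count.
        return self.total_time_slots - self.evaluation_time_slots

    def epsilon(self, slot: int) -> float:
        # Assumption: linear decay across the training slots. The table
        # states only the two endpoints, not the shape of the decay.
        frac = min(max(slot / self.training_time_slots, 0.0), 1.0)
        return self.epsilon_start + frac * (self.epsilon_end - self.epsilon_start)


settings = DqnSettings()
```

Under these assumptions, `settings.training_time_slots` gives the number of slots available for learning, and `settings.epsilon(slot)` returns the exploration rate to use at a given slot. A different decay shape (for example, exponential) would only require changing the `epsilon` method.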
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, X.; Teraki, Y.; Umehira, M.; Zhou, H.; Ji, Y. A Usage Aware Dynamic Spectrum Access Scheme for Interweave Cognitive Radio Network by Exploiting Deep Reinforcement Learning. Sensors 2022, 22, 6949. https://doi.org/10.3390/s22186949