Hardware Implementations of a Deep Learning Approach to Optimal Configuration of Reconfigurable Intelligence Surfaces
Abstract
1. Introduction
2. Edge Computing for RIS
2.1. Reconfigurable Intelligent Surfaces
2.2. Why Deep Learning and Not Other Approaches
2.3. Why on the Edge
2.4. Target Edge Devices
3. Methodology
3.1. Dataset Generation
- The algorithm starts from a random 0/1 configuration and takes the energy radiated in the desired direction as its reference. It then inverts the state of each cell in turn and re-evaluates the energy: if the energy has increased, the inversion is kept; otherwise, it is reverted. Once all elements have been processed, further sweeps are performed until the stopping criterion is met: at the end of a sweep, fewer than a threshold fraction of the elements' states have been inverted. This threshold is set because the computational cost of another sweep does not justify the marginal improvement in energy.
- Although the algorithm converges relatively quickly, its computational cost is high and sustained over time, which makes it unsuitable for real-time calculation.
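The sweep procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `directed_energy` is a hypothetical toy stand-in for the real energy evaluation (a 1-bit array factor), and `stop_fraction` is a placeholder for the stopping threshold, whose value is not given in the text.

```python
import cmath
import math
import random


def directed_energy(config, target_phase):
    """Toy stand-in (assumption) for the energy in the desired direction.

    config: list of 0/1 cell states; state 1 adds a 180-degree phase shift.
    target_phase: per-cell phase (radians) towards the desired direction.
    """
    field = sum(cmath.exp(1j * (phase + math.pi * state))
                for state, phase in zip(config, target_phase))
    return abs(field) ** 2


def greedy_sweeps(energy, start, stop_fraction, max_sweeps=100):
    """Invert cells one by one, keeping only inversions that raise the energy."""
    config = list(start)
    best = energy(config)
    for _ in range(max_sweeps):
        kept = 0
        for i in range(len(config)):
            config[i] ^= 1              # invert the state of cell i
            trial = energy(config)
            if trial > best:
                best = trial            # energy increased: keep the inversion
                kept += 1
            else:
                config[i] ^= 1          # no improvement: revert
        # Stop when a sweep kept fewer inversions than the threshold fraction.
        if kept < stop_fraction * len(config):
            break
    return config, best


rng = random.Random(0)
phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(15 * 15)]
start = [rng.randint(0, 1) for _ in range(15 * 15)]
energy = lambda cfg: directed_energy(cfg, phases)
final, best = greedy_sweeps(energy, start, stop_fraction=0.01)
print(f"energy before: {energy(start):.1f}, after: {best:.1f}")
```

Every accepted inversion costs one full energy evaluation, and every sweep evaluates all cells, which is consistent with the remark that the cost is too high for real-time use.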
3.2. Model Architecture Design
3.3. Custom Loss Function
3.4. Model Training and Evaluation
4. Implementation
4.1. Neural Network Adaptation
4.2. CPU—ROCK 4C Plus
4.3. GPU—NVIDIA Jetson Nano
4.4. TPU—Google Coral
4.5. FPGA—Intel® Arria® 10 SX SoC Development Kit
4.5.1. MATLAB® Deep Learning HDL Toolbox™
4.5.2. Intel® FPGA AI Suite
5. Results
5.1. Accuracy
5.2. Performance
Device | FP32 Accuracy | INT8 Accuracy |
---|---|---|
ROCK 4C Plus | 98.88% | 98.56% |
NVIDIA Jetson Nano | 98.88% | - |
Google Coral | - | 98.44% |
Intel® Arria® 10 SoC DevKit & MATLAB® DL Toolbox™ | 98.88% | 91.62% |
Intel® Arria® 10 SoC DevKit & FPGA AI Suite (A10_Generic) | 98.88% | - |
Intel® Arria® 10 SoC DevKit & FPGA AI Suite (A10_Performance) | 98.88% | - |
Device | FP32 FPS | FP32 Latency (ms) | INT8 FPS | INT8 Latency (ms) |
---|---|---|---|---|
ROCK 4C Plus | 2.38 | 420.17 | 65.60 | 15.24 |
NVIDIA Jetson Nano | 48.00 | 20.83 | - | - |
Google Coral | - | - | 345.93 | 2.89 |
Intel® Arria® 10 SoC DevKit & MATLAB® DL Toolbox™ | 51.19 | 19.54 | 104.92 | 9.53 |
Intel® Arria® 10 SoC DevKit & FPGA AI Suite (A10_Generic) | 881.09 | 1.13 | - | - |
Intel® Arria® 10 SoC DevKit & FPGA AI Suite (A10_Performance) | 971.14 | 1.03 | - | - |
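The latency figures in the performance table are consistent with latency being the reciprocal of throughput, which suggests that frames are processed one at a time. The short sketch below reproduces that relation and a relative speedup from the tabulated FPS values; the dictionary keys and function names are illustrative and not taken from the paper.

```python
# Throughput in frames per second (FPS), copied from the performance table.
FPS = {
    ("ROCK 4C Plus", "FP32"): 2.38,
    ("ROCK 4C Plus", "INT8"): 65.60,
    ("NVIDIA Jetson Nano", "FP32"): 48.00,
    ("Google Coral", "INT8"): 345.93,
    ("Arria 10 + MATLAB DL HDL Toolbox", "FP32"): 51.19,
    ("Arria 10 + MATLAB DL HDL Toolbox", "INT8"): 104.92,
    ("Arria 10 + FPGA AI Suite (A10_Generic)", "FP32"): 881.09,
    ("Arria 10 + FPGA AI Suite (A10_Performance)", "FP32"): 971.14,
}


def latency_ms(fps):
    """Per-frame latency in ms, assuming strictly sequential frame processing."""
    return 1000.0 / fps


def speedup(fps, baseline_fps):
    """Throughput ratio of one implementation over a baseline."""
    return fps / baseline_fps


for (device, precision), value in FPS.items():
    print(f"{device} [{precision}]: {latency_ms(value):.2f} ms")

baseline = FPS[("ROCK 4C Plus", "FP32")]
fastest = FPS[("Arria 10 + FPGA AI Suite (A10_Performance)", "FP32")]
print(f"A10_Performance vs. ROCK 4C Plus (FP32): {speedup(fastest, baseline):.0f}x")
```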
5.3. Analysis of Resource Usage and Performance for FPGA Implementations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Layer | Type | Output Shape | Activation | Number of Parameters |
---|---|---|---|---|
Input | Padding | [360,360,3] | - | 0 |
Lambda 1 | Lambda 1 | [360,360,3] | - | 0 |
Conv2d_0 | Conv2D | [180,180,8] | ReLU | 224 |
Conv2d_1 | Conv2D | [90,90,16] | - | 1168 |
BN_0 | BatchNormalization | [90,90,16] | - | 64 |
LR_0 | Activation Layer | [90,90,16] | ReLU | 0 |
Conv2d_2 | Conv2D | [45,45,32] | ReLU | 4640 |
Conv2d_3 | Conv2D | [15,15,64] | - | 18,496 |
BN_1 | BatchNormalization | [15,15,64] | - | 256 |
LR_1 | Activation Layer | [15,15,64] | ReLU | 0 |
Conv2d_4 | Conv2D | [8,8,128] | ReLU | 73,856 |
Conv2d_5 | Conv2D | [4,4,256] | ReLU | 131,328 |
Conv2d_6 | Conv2D | [1,1,512] | ReLU | 2,097,664 |
FC_0 | Dense | [1,1,256] | ReLU | 131,328 |
FC_1 | Dense | [1,1,225] | Sigmoid | 57,825 |
Reshape | Reshape | [15,15,1] | - | 0 |
Total | - | - | - | 2,516,849 |
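The parameter counts in the architecture table follow from the standard formulas for convolutional, dense and batch-normalization layers. The sketch below recomputes them from the channel counts in the table; the kernel sizes (3×3 for Conv2d_0 to Conv2d_4, 2×2 for Conv2d_5 and 4×4 for Conv2d_6) are assumptions inferred from the counts, not values stated in the text. With these assumptions, the sum reproduces the listed total of 2,516,849.

```python
def conv2d_params(kernel, c_in, c_out):
    """Weights of a square kernel plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Fully connected weights plus one bias per output unit."""
    return n_in * n_out + n_out


def batchnorm_params(channels):
    """Gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels


# Kernel sizes are assumptions inferred from the tabulated counts.
LAYERS = [
    ("Conv2d_0", conv2d_params(3, 3, 8)),
    ("Conv2d_1", conv2d_params(3, 8, 16)),
    ("BN_0", batchnorm_params(16)),
    ("Conv2d_2", conv2d_params(3, 16, 32)),
    ("Conv2d_3", conv2d_params(3, 32, 64)),
    ("BN_1", batchnorm_params(64)),
    ("Conv2d_4", conv2d_params(3, 64, 128)),
    ("Conv2d_5", conv2d_params(2, 128, 256)),
    ("Conv2d_6", conv2d_params(4, 256, 512)),
    ("FC_0", dense_params(512, 256)),
    ("FC_1", dense_params(256, 225)),
]

TOTAL = sum(count for _, count in LAYERS)

for name, count in LAYERS:
    print(f"{name:<9} {count:>10,}")
print(f"{'Total':<9} {TOTAL:>10,}")
```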
Specification | ROCK 4C Plus | NVIDIA Jetson Nano | Google Coral | Intel® Arria® 10 SX SoC Development Kit |
---|---|---|---|---|
Device Family | CPU + GPU | GPU + CPU | Edge TPU | FPGA + CPU |
CPU | ARM® Cortex™-A72 + ARM® Cortex™-A53 | ARM® Cortex™-A57 | ARM® Cortex™-A53 + ARM® Cortex™-M4 | ARM® Cortex™-A9 MPCore |
CPU Cores | 2 + 4 | 4 | 4 + 1 | 2 |
CPU Architecture | 64-bit | 64-bit | 64-bit | 32-bit |
CPU Max. Freq. | 1.5/1.0 GHz | 1.43 GHz | 1.5 GHz | 1.2 GHz |
AI Acceleration | - | NVIDIA Maxwell architecture with 128 NVIDIA CUDA® cores | Google Edge TPU coprocessor | FPGA with 251,680 ALMs + 1687 Variable Precision DSPs + 2131 20-kb BlockRAMs |
RAM Memory | 4 GB 64-bit LPDDR4 3200 MHz | 4 GB 64-bit LPDDR4 1600 MHz | 4 GB 32-bit LPDDR4 1600 MHz | 2 GB + 1 GB 16-bit DDR4 1200 MHz |
Framework | TensorFlow | TensorFlow | TensorFlow Lite | MATLAB® Deep Learning HDL Toolbox™/Intel® FPGA AI Suite |
Operating System | Debian Desktop 5.10.110 | Ubuntu Desktop 4.9.253 | Mendel Linux 4.14.98 | Linux Intel SoC 4.9.0/Yocto Linux 5.15.70 |
Data Type | Metric | ALM | BlockRAM Memory Bits | BlockRAM | Variable Precision DSP |
---|---|---|---|---|---|
Total | - | 251,680 | 43,642,880 | 2131 | 1687 |
FP32 | Used | 134,187 | 23,133,724 | 2131 | 255 |
FP32 | Usage % | 53.32% | 53.01% | 100.00% | 15.12% |
INT8 | Used | 160,818 | 17,584,432 | 2131 | 730 |
INT8 | Usage % | 63.90% | 40.29% | 100.00% | 43.27% |
Architecture | Metric | ALM | BlockRAM Memory Bits | BlockRAM | Variable Precision DSP |
---|---|---|---|---|---|
Total | - | 251,680 | 43,642,880 | 2131 | 1687 |
A10_Generic | Used | 48,899 | 14,371,616 | 777 | 182 |
A10_Generic | Usage % | 19.42% | 32.93% | 36.46% | 10.79% |
A10_Performance | Used | 68,624 | 20,452,640 | 1102 | 606 |
A10_Performance | Usage % | 27.26% | 46.86% | 51.71% | 35.92% |
Implementation | Metric | ALM | BlockRAM Memory Bits | BlockRAM | Variable Precision DSP |
---|---|---|---|---|---|
Intel® FPGA AI Suite A10_Performance | Used | 68,624 | 20,452,640 | 1102 | 606 |
MATLAB® DL Toolbox™ FP32 | Used | 134,187 | 23,133,724 | 2131 | 255 |
MATLAB® DL Toolbox™ FP32 | BM % | 195.54% | 113.11% | 193.38% | 42.08% |
MATLAB® DL Toolbox™ INT8 | Used | 160,818 | 17,584,432 | 2131 | 730 |
MATLAB® DL Toolbox™ INT8 | BM % | 234.35% | 85.98% | 193.38% | 120.46% |
Intel® FPGA AI Suite A10_Generic | Used | 48,899 | 14,371,616 | 777 | 182 |
Intel® FPGA AI Suite A10_Generic | BM % | 71.26% | 70.27% | 70.51% | 30.03% |
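The percentages in the resource tables can be recomputed from the raw counts. In the sketch below, "Usage %" is taken as the used amount divided by the device total, and "BM %" is assumed to be the used amount relative to the A10_Performance implementation as benchmark; that reading is inferred from the numbers and is not defined explicitly in the text. The short dictionary keys are illustrative labels.

```python
RESOURCES = ("ALM", "BlockRAM Memory Bits", "BlockRAM", "Variable Precision DSP")

# Device totals and used resources, copied from the tables above.
TOTAL = (251_680, 43_642_880, 2131, 1687)
USED = {
    "A10_Performance": (68_624, 20_452_640, 1102, 606),
    "A10_Generic": (48_899, 14_371_616, 777, 182),
    "MATLAB FP32": (134_187, 23_133_724, 2131, 255),
    "MATLAB INT8": (160_818, 17_584_432, 2131, 730),
}


def percent_of(values, reference):
    """Element-wise percentage of `values` relative to `reference`."""
    return [100.0 * v / r for v, r in zip(values, reference)]


benchmark = USED["A10_Performance"]  # assumed reference for the BM % rows
for name, used in USED.items():
    usage = percent_of(used, TOTAL)
    relative = percent_of(used, benchmark)
    for resource, u, b in zip(RESOURCES, usage, relative):
        print(f"{name:<16} {resource:<23} usage {u:6.2f}%  vs. benchmark {b:6.2f}%")
```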
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Martín-Martín, A.; Padial-Allué, R.; Castillo, E.; Parrilla, L.; Parellada-Serrano, I.; Morán, A.; García, A. Hardware Implementations of a Deep Learning Approach to Optimal Configuration of Reconfigurable Intelligence Surfaces. Sensors 2024, 24, 899. https://doi.org/10.3390/s24030899