Implementation of Highly Reliable Convolutional Neural Network with Low Overhead on Field-Programmable Gate Array
Abstract
1. Introduction
2. Basics of CNN Accelerator and SEU Fault Simulation
2.1. CNN Accelerator
2.2. SEU Fault Model
2.3. SEU Evaluation Platform
2.4. CNN Fault Type
3. SEU Tolerance Analysis of CNN
3.1. SEU Tolerance Analysis of PE Array
3.1.1. Analysis of Calculations in Pooling Layer
3.1.2. Analysis of Calculations in Convolutional Layer
3.1.3. Analysis of Calculations across Multiple Layers
3.2. SEU Tolerance Analysis of CTRL
3.3. MSC Identification
4. Hardening Strategies
4.1. FSM Error-Correcting Circuit
4.2. TMR Automatic-Hardening Technique
5. Evaluations and Verifications
5.1. Verifications of FSM-ECC and TMR-AHT
5.2. Comparisons of Design Overhead and Fault Tolerance
6. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Li, Q.; Cai, W.; Wang, X.; Zhou, Y.; Feng, D.D.; Chen, M. Medical image classification with convolutional neural network. In Proceedings of the 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV), Singapore, 10–12 December 2014; pp. 844–848.
- Yang, X.; Wu, T.; Wang, N.; Huang, Y.; Song, B.; Gao, X. HCNN-PSI: A hybrid CNN with partial semantic information for space target recognition. Pattern Recognit. 2020, 108, 107531.
- Priyadarshini, I.; Puri, V. Mars weather data analysis using machine learning techniques. Earth Sci. Inform. 2021, 14, 1885–1898.
- Kain, E.T.; Lovelly, T.M.; George, A.D. Evaluating SEU Resilience of CNNs with Fault Injection. In Proceedings of the 2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 22–24 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5.
- Lopes, I.C.; Kastensmidt, F.L.; Susin, A.A. SEU susceptibility analysis of a feedforward neural network implemented in a SRAM-based FPGA. In Proceedings of the 2017 18th IEEE Latin American Test Symposium (LATS), Bogota, Colombia, 13–15 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
- Li, W.; Ge, G.; Guo, K.; Chen, X.; Wei, Q.; Gao, Z.; Wang, Y.; Yang, H. Soft error mitigation for deep convolution neural network on FPGA accelerators. In Proceedings of the 2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Genova, Italy, 31 August–2 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5.
- Libano, F.; Wilson, B.; Wirthlin, M.; Rech, P.; Brunhaver, J. Understanding the impact of quantization, accuracy, and radiation on the reliability of convolutional neural networks on FPGAs. IEEE Trans. Nucl. Sci. 2020, 67, 1478–1484.
- Libano, F.; Rech, P.; Neuman, B.; Leavitt, J.; Wirthlin, M.; Brunhaver, J. How reduced data precision and degree of parallelism impact the reliability of convolutional neural networks on FPGAs. IEEE Trans. Nucl. Sci. 2021, 68, 865–872.
- Wang, H.B.; Wang, Y.S.; Xiao, J.H.; Wang, S.L.; Liang, T.J. Impact of single-event upsets on convolutional neural networks in Xilinx Zynq FPGAs. IEEE Trans. Nucl. Sci. 2021, 68, 394–401.
- Syed, R.T.; Ulbricht, M.; Piotrowski, K.; Krstic, M. Fault resilience analysis of quantized deep neural networks. In Proceedings of the 2021 IEEE 32nd International Conference on Microelectronics (MIEL), Nis, Serbia, 12–14 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 275–279.
- Du, B.; Azimi, S.; De Sio, C.; Bozzoli, L.; Sterpone, L. On the reliability of convolutional neural network implementation on SRAM-based FPGA. In Proceedings of the 2019 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Noordwijk, The Netherlands, 2–4 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
- Liu, C.; Chu, C.; Xu, D.; Wang, Y.; Wang, Q.; Li, H.; Li, X.; Cheng, K.T. HyCA: A hybrid computing architecture for fault-tolerant deep learning. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2021, 41, 3400–3413.
- Libano, F.; Wilson, B.; Anderson, J.; Wirthlin, M.J.; Cazzaniga, C.; Frost, C.; Rech, P. Selective hardening for neural networks in FPGAs. IEEE Trans. Nucl. Sci. 2018, 66, 216–222.
- Gao, Z.; Zhang, H.; Yao, Y.; Xiao, J.; Zeng, S.; Ge, G.; Wang, Y.; Ullah, A.; Reviriego, P. Soft error tolerant convolutional neural networks on FPGAs with ensemble learning. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2022, 30, 291–302.
- Dos Santos, F.F.; Draghetti, L.; Weigel, L.; Carro, L.; Navaux, P.; Rech, P. Evaluation and mitigation of soft-errors in neural network-based object detection in three GPU architectures. In Proceedings of the 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), Denver, CO, USA, 26–29 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 169–176.
- Rajappa, A.J.; Reiter, P.; Sartori, T.K.S.; Laurini, L.H.; Fourati, H.; Mercelis, S.; Hellinckx, P.; Bastos, R.P. SMART: Selective MAC zero-optimization for neural network reliability under radiation. In Proceedings of the 34th European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF), Toulouse, France, 2–5 October 2023.
- Xia, L.; Liu, M.; Ning, X.; Chakrabarty, K.; Wang, Y. Fault-tolerant training enabled by on-line fault detection for RRAM-based neural computing systems. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2018, 38, 1611–1624.
- Schorn, C.; Guntoro, A.; Ascheid, G. An efficient bit-flip resilience optimization method for deep neural networks. In Proceedings of the 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 25–29 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1507–1512.
- Chen, K.; Chen, X.; Zhang, Y.; Zhang, Z. A rapid evaluation technology for SEU in convolutional neural network circuits. In Proceedings of the 2021 IEEE 3rd International Conference on Circuits and Systems (ICCS), Chengdu, China, 29–31 October 2021; pp. 19–23.
- Chen, X.; Huo, L.; Xie, Y.; Shen, Z.; Xiang, Z.; Gao, C.; Zhang, Y. FPGA-Based Cross-Hardware MBU Emulation Platform for Layout-Level Digital VLSI. In Proceedings of the 2023 IEEE 32nd Asian Test Symposium (ATS), Beijing, China, 14–17 October 2023; pp. 1–6.
- Lu, Y.; Chen, X.; Zhai, X.; Saha, S.; Ehsan, S.; Su, J.; McDonald-Maier, K. A fast simulation method for analysis of SEE in VLSI. Microelectron. Reliab. 2021, 120, 114110.
Fault Types | Notes |
---|---|
System crash | No values are given on time. |
Serious error | Both and are wrong. |
Tolerable error | is wrong, but is correct. |
Benign error | Both and are correct. |
Resource | Unhardened | Ref. [13] | Proposed |
---|---|---|---|
LUTs | 8040 (100%) | 15,475 (192.5%) | 9705 (120.7%) |
FFs | 11,932 (100%) | 22,379 (187.6%) | 14,427 (120.9%) |
DSPs | 120 (100%) | 120 (100%) | 120 (100%) |
BRAMs | 33 (100%) | 33 (100%) | 33 (100%) |
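The percentage in parentheses is each count divided by the unhardened count. The following minimal Python sketch uses only the counts tabulated above and recomputes the relative increase over the unhardened design. It assumes that this increase is what the comparison table reports as "RIR" for the proposed design; the names `RESOURCES` and `relative_increase` are illustrative and do not come from the source.

```python
# Resource counts copied from the table above:
# (unhardened, Ref. [13], proposed).
RESOURCES = {
    "LUTs": (8040, 15475, 9705),
    "FFs": (11932, 22379, 14427),
    "DSPs": (120, 120, 120),
    "BRAMs": (33, 33, 33),
}


def relative_increase(hardened: int, unhardened: int) -> float:
    """Extra resources of a hardened design, in percent of the unhardened count.

    ASSUMPTION: this is taken to be the quantity labelled "RIR" in the
    comparison table; the excerpt does not spell the acronym out.
    """
    return 100.0 * (hardened - unhardened) / unhardened


if __name__ == "__main__":
    for name, (base, ref13, proposed) in RESOURCES.items():
        print(
            f"{name:5s}  Ref. [13]: {relative_increase(ref13, base):+6.1f}%  "
            f"Proposed: {relative_increase(proposed, base):+6.1f}%"
        )
```

Under this reading, the proposed design costs about 20.7% more LUTs and 20.9% more FFs than the unhardened accelerator, while DSP and BRAM usage is unchanged.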
BER | Error Probability (%): Unhardened | Error Probability (%): [13] | Error Probability (%): Proposed | HF (%): [13] | HF (%): Proposed |
---|---|---|---|---|---|
10^−8 | 0.6 (100%) | 0.30 (50%) | 0.00 (0%) | 0.17 (11.81%) | 1.44 (100%) |
10^−7 | 10.57 (100%) | 10.49 (99.24%) | 0.36 (3.41%) | 0.04 (0.16%) | 24.54 (100%) |
2 × 10^−7 | 20.56 (100%) | 19.26 (93.68%) | 1.08 (5.25%) | 0.72 (1.54%) | 46.80 (100%) |
10^−6 | 68.89 (100%) | 70.51 (102.35%) | 5.56 (8.07%) | −0.90 (−0.59%) | 152.17 (100%) |
2 × 10^−6 | 86.36 (100%) | 82.05 (95.01%) | 7.41 (8.58%) | 2.40 (1.27%) | 189.71 (100%) |
10^−5 | 100.0 (100%) | 100.0 (100%) | 39.13 (39.13%) | 0.00 (0%) | 146.25 (100%) |
2 × 10^−5 | 100.0 (100%) | 100.0 (100%) | 72.18 (72.18%) | 0.00 (0%) | 66.84 (100%) |
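The relationship between these columns can be checked numerically. The sketch below is a hedged reading of the table, not the authors' stated method. It assumes that the error reduction is the relative drop in error probability against the unhardened design, which reproduces the 96.59% and 91.93% "ERR" values listed for the proposed design in the comparison table. It further assumes that HF is the drop in error probability (in percentage points) divided by the sum of the fractional LUT and FF increases. That formula is inferred from the numbers alone and matches the tabulated HF values only to within rounding. All function and variable names are illustrative.

```python
# Error probabilities (%) copied from the table above, keyed by BER:
# (unhardened, Ref. [13], proposed).
ERROR_PROB = {
    "1e-8": (0.6, 0.30, 0.00),
    "1e-7": (10.57, 10.49, 0.36),
    "2e-7": (20.56, 19.26, 1.08),
    "1e-6": (68.89, 70.51, 5.56),
    "2e-6": (86.36, 82.05, 7.41),
    "1e-5": (100.0, 100.0, 39.13),
    "2e-5": (100.0, 100.0, 72.18),
}

# LUT and FF counts from the resource table: (unhardened, hardened).
LUTS = {"ref13": (8040, 15475), "proposed": (8040, 9705)}
FFS = {"ref13": (11932, 22379), "proposed": (11932, 14427)}


def error_reduction(unhardened: float, hardened: float) -> float:
    """Relative drop in error probability, in percent of the unhardened value."""
    return 100.0 * (unhardened - hardened) / unhardened


def overhead(design: str) -> float:
    """ASSUMPTION: overhead = fractional LUT increase + fractional FF increase."""
    lut_base, lut_hard = LUTS[design]
    ff_base, ff_hard = FFS[design]
    return (lut_hard - lut_base) / lut_base + (ff_hard - ff_base) / ff_base


def hf(unhardened: float, hardened: float, design: str) -> float:
    """ASSUMPTION (inferred, not defined in the source excerpt):
    HF = drop in error probability (percentage points) / overhead."""
    return (unhardened - hardened) / overhead(design)


if __name__ == "__main__":
    for ber, (base, ref13, proposed) in ERROR_PROB.items():
        print(
            f"BER {ber}: reduction {error_reduction(base, proposed):6.2f}%  "
            f"HF[13] {hf(base, ref13, 'ref13'):6.2f}  "
            f"HF proposed {hf(base, proposed, 'proposed'):7.2f}"
        )
```

Small differences from the printed HF column (on the order of 0.01) are expected, since the tabulated error probabilities are themselves rounded to two decimals.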
Item | Proposed | [7] | [8] | [9] | [10] | [13] | [16] | |||
---|---|---|---|---|---|---|---|---|---|---|
CNN Model | Lenet-5 | Lenet-4 | Lenet-4 | ZynqNet | MLP-3 | Lenet-4 | MLP-3 | |||
Platform | XC7Z035 | XCZU9EG | ZYNQ | XC7Z020 | Software | XCZU9EG | Software | |||
Baseline | Unhardened Fix W8A16 | FP32 | FP32 | Unhardened FP32 | Unhardened Fix 32 | Unhardened FP32 | Unhardened FP32 | |||
Precision | Fix W8A16 | Binary and FP32 | FP16 | INT8 | - | Fix 4 | - | - | ||
Fault Injection Method | Software | Hardware | Hardware | Hardware | Software | Hardware | Hardware | |||
Harden Method | FSM-ECC & TMR-AHT | Quant | Quant | Full TMR | Quant Fix 8 | Quant | Selective TMR | SMART+ TMR | ||
LUTs RIR | 20.7% | −31.46% | −47.9% | −76.3% | 119.3% | −17.6% | - | 6.74% | - | |
FFs RIR | 20.9% | 0% | −28.6% | −33.3% | 95.1% | −31.0% | - | 0% | - | |
BER | 10^−7 | 10^−6 | - | - | - | 10^−1 | - | - | ||
ERR | 96.59% | 91.93% | −12% | 22% | 72% | 33.59% | 40.30% | 39.96% | 14% | 55.52% |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, X.; Xie, Y.; Huo, L.; Chen, K.; Gao, C.; Xiang, Z.; Yang, H.; Wang, X.; Ge, Y.; Zhang, Y. Implementation of Highly Reliable Convolutional Neural Network with Low Overhead on Field-Programmable Gate Array. Electronics 2024, 13, 879. https://doi.org/10.3390/electronics13050879