Intelligent Real-Time Face-Mask Detection System with Hardware Acceleration for COVID-19 Mitigation
Abstract
1. Introduction
- A cost-effective technological solution is proposed for automatic face-mask detection. It combines computer-vision (CV)-based face detection with machine-learning classification to build an integrated system that labels each detected face as wearing a mask or not. Such a system, released as an open-source or commercial product, could be deployed across many public and private sectors and businesses to verify that the local population is adhering to mask-wearing policies.
- Three hardware-specific quantised models were built and benchmarked for further real-time system implementations.
- A thorough ablation study was conducted to prove the effectiveness of the proposed face-mask detection system on embedded hardware. The proposed model’s performance was also compared with that of other state-of-the-art DL approaches using evaluation metrics of accuracy, inference time, memory footprint, and cost.
2. Literature Review
3. Background
3.1. Deep Neural Network (DNN)
3.2. Transfer Learning (TL)
3.3. Quantisation
- Dynamic range quantisation: This statically quantises parameters from floating point to integers and dynamically quantises activations during inference. At inference, weights are converted from 8 bits of precision to floating point and computed using floating-point kernels. This conversion is performed once and cached to reduce latency. To further improve latency, dynamic-range operators quantise activations on the fly to 8 bits based on their observed range, and perform computations with 8-bit weights and activations. This optimisation yields latency close to fully fixed-point inference. However, outputs are still stored in floating point, so the speed-up with dynamic-range ops is less than that of full fixed-point computation.
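The weight-quantisation arithmetic underlying this scheme can be illustrated with a minimal NumPy sketch of symmetric per-tensor INT8 quantisation (a simplified stand-in; the function and variable names are illustrative and not part of TFLite's API):

```python
import numpy as np

def quantise_int8(w):
    """Symmetric per-tensor INT8 quantisation: scale maps max |w| to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    """Recover an approximate floating-point tensor from INT8 values."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.03, 1.0], dtype=np.float32)
q, scale = quantise_int8(w)
w_hat = dequantise(q, scale)
# Per-element quantisation error is bounded by half the scale step.
print(np.abs(w - w_hat).max() <= scale / 2 + 1e-6)  # prints True
```

This also makes the trade-off visible: storage drops to one byte per weight, at the cost of a rounding error no larger than half the quantisation step.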
- Full integer quantisation: This method statically quantises all weights and activations to INT8 and therefore achieves the lowest inference latency. In order to statically reduce precision to 8 bits, the method requires a small representative dataset: a generator function that yields enough typical input samples for the converter to estimate a dynamic range for all variable data. After the representative dataset is created, the model is converted into a TFLite model (https://www.tensorflow.org/lite/convert/ (accessed on 2 February 2022)) and is thereby quantised.
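The full-integer conversion workflow described above can be sketched with the public TFLite converter API. The toy model and random representative dataset below are placeholders, not the paper's MaskDetect; in practice the trained model would be loaded and the generator would draw from real training images:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; the trained MaskDetect model would be loaded instead.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

def representative_dataset():
    # Generator of typical inputs so the converter can estimate ranges.
    for _ in range(10):
        yield [np.random.rand(1, 128, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full INT8 quantisation of weights and activations.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()  # serialized INT8 TFLite flatbuffer
```

Forcing `TFLITE_BUILTINS_INT8` (rather than allowing float fallback) is what makes the resulting model deployable on integer-only accelerators such as the Coral Edge TPU.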
4. Proposed System
4.1. Proposed MaskDetect
4.2. Deployment on Embedded Platforms
Algorithm 1: Algorithmic summary of model deployment tailored to specific embedded devices used in this work
4.2.1. Intel Neural Compute Stick 2 (INCS2)
4.2.2. NVIDIA Jetson Nano
4.2.3. Coral Edge TPU
4.3. Transfer-Learning Models
5. Experimental Analysis
5.1. Datasets
5.2. Data Preprocessing
- Resizing: images were resized to a uniform size of 128 × 128, as the proposed MaskDetect requires uniform input dimensions;
- Normalisation: pixel intensity values were normalised to [−1, 1] for the best results from ReLU activations;
- Labelling: categorical labels were one-hot encoded.
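The three preprocessing steps above can be sketched in a few lines of NumPy (nearest-neighbour index sampling stands in for whatever resize routine the pipeline actually used; the function names are illustrative):

```python
import numpy as np

def preprocess(img, size=128):
    """Resize to size x size (nearest neighbour) and normalise to [-1, 1]."""
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size     # row indices to sample
    xs = np.arange(size) * w // size     # column indices to sample
    resized = img[ys][:, xs]
    return resized.astype(np.float32) / 127.5 - 1.0

def one_hot(labels, num_classes=2):
    """One-hot encode integer class labels (mask / no mask)."""
    return np.eye(num_classes, dtype=np.float32)[labels]

img = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)
x = preprocess(img)          # shape (128, 128, 3), values in [-1, 1]
y = one_hot(np.array([0, 1, 1]))   # shape (3, 2)
```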
5.3. Desktop-Based Implementations and Analysis
5.4. Quantisation Analysis
5.5. Extended Experiments on More Hardware Accelerations
- Coral USB-based acceleration: This implementation accelerated the MaskDetect model on Raspberry Pi 4 using the Coral Edge USB at a relative cost of approximately 0.33 BaseCost (Pi product specifications: https://www.pishop.us (accessed on 1 February 2022); Coral product specifications: https://coral.ai (accessed on 1 February 2022)). The MaskDetect-Coral configuration achieved the second-fastest real-time performance (cf. Table 3) among the three hardware-accelerated implementations, which could be attributed to the INT8 quantisation of the model. The system achieved an average performance of 19 FPS, increasing the real-time performance of MaskDetect by nearly 73%. However, there was a slight reduction in detection accuracy due to weight quantisation, resulting in an accuracy of 90.4%.
- INCS2-based acceleration: This implementation accelerated the MaskDetect model on Raspberry Pi 4 using the INCS2 VPU at a relative cost of approximately 0.355 BaseCost (Pi product specifications: https://www.pishop.us (accessed on 1 February 2022); INCS2 product specifications: https://store.intelrealsense.com (accessed on 1 February 2022)). The MaskDetect-INCS2 configuration achieved the third-fastest real-time performance at 18 FPS, increasing the performance of MaskDetect by nearly 64%. Since the INCS2 model runs at FP16 quantisation, there was no measurable accuracy loss compared with the original MaskDetect model.
- Jetson Nano acceleration: This implementation accelerated the MaskDetect model on the Jetson Nano Developer Kit, using the 2 GB hardware version, at a relative cost of approximately 0.147 BaseCost (Jetson product specifications: https://www.amazon.com (accessed on 1 February 2022)). This configuration provided the best real-time performance at 22 FPS, an increase of 100% over MaskDetect's baseline performance. Real-time performance was quickest on the Jetson Nano due to its GPU hardware, which is not available on the Raspberry Pi. There was no measurable accuracy loss, as the model used FP32 weights, matching the accuracy of the baseline model.
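The speed-up percentages quoted above follow directly from the measured frame rates against the 11 FPS baseline (figures taken from Table 3):

```python
# FPS figures from Table 3; the baseline MaskDetect runs at 11 FPS.
baseline_fps = 11
accelerated = {"Coral Edge TPU": 19, "INCS2": 18, "Jetson Nano": 22}

for name, fps in accelerated.items():
    speedup = (fps - baseline_fps) / baseline_fps * 100
    print(f"{name}: {speedup:.0f}% improvement")
# Coral Edge TPU: 73% improvement
# INCS2: 64% improvement
# Jetson Nano: 100% improvement
```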
5.6. Cost Analysis
6. Conclusions and Future Direction
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
CV | Computer vision |
CNN | Convolutional neural network |
DL | Deep learning |
DNN | Deep neural network |
FPS | Frames per second |
FDS | Face-detection subsystem |
INCS2 | Intel Neural Compute Stick 2 |
IR | Intermediate representation |
MDS | Mask-detection system/subsystem |
QAT | Quantisation-aware training |
ROI | Region of interest |
SSD | Single-shot detector |
TL | Transfer learning |
VPU | Vision processing unit |
References
Model Name | Classification Accuracy | Model Size in Memory (MB) | Number of Parameters | Inference Time (s) |
---|---|---|---|---|
MaskDetect (baseline) | 0.942 | 11.5 | 983,330 | 0.046 |
VGG16 | 0.987 | 43.8 | 4,945,858 | 0.049 |
ResNet-50V2 | 0.990 | 143.0 | 27,825,583 | 0.050 |
ResNet-50V2 | 0.974 | 103.0 | 22,917,794 | 0.055 |
Model Name | Size (MB) | Classification Accuracy |
---|---|---|
MaskDetect (baseline) | 11.5 | 94.2% |
QAT MaskDetect | 0.983 | 95.0% |
Hardware | Avg. FPS | Accuracy | Model | Relative Cost | Extra Hardware |
---|---|---|---|---|---|
Baseline | 11 | 0.942 | MaskDetect | BaseCost | No |
Raspberry Pi 4-B | 3 | 0.942 | MaskDetect | 0.183 × BaseCost | No |
Pi 4 + Coral USB | 19 | 0.904 | MaskDetect_edgeTPU.tflite | 0.330 × BaseCost | Yes |
Pi 4 + Intel NCS2 | 18 | 0.943 | MaskDetect_IR | 0.355 × BaseCost | Yes |
Jetson Nano | 22 | 0.942 | MaskDetect_TensorRT | 0.147 × BaseCost | No |
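One simple way to read Table 3 is frames per second per unit of relative cost, a rough cost-effectiveness ratio computed here from the table's own figures (the paper's cost analysis in Section 5.6 may weigh additional factors such as extra hardware):

```python
# (FPS, relative cost in multiples of BaseCost) from Table 3.
configs = {
    "Raspberry Pi 4-B": (3, 0.183),
    "Pi 4 + Coral USB": (19, 0.330),
    "Pi 4 + Intel NCS2": (18, 0.355),
    "Jetson Nano": (22, 0.147),
}

for name, (fps, cost) in configs.items():
    print(f"{name}: {fps / cost:.1f} FPS per BaseCost")
```

By this metric the Jetson Nano is the clear winner, delivering both the highest frame rate and the lowest relative cost of the accelerated configurations.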
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sertic, P.; Alahmar, A.; Akilan, T.; Javorac, M.; Gupta, Y. Intelligent Real-Time Face-Mask Detection System with Hardware Acceleration for COVID-19 Mitigation. Healthcare 2022, 10, 873. https://doi.org/10.3390/healthcare10050873