Attention-Mechanism-Based Face Feature Extraction Model for WeChat Applet on Mobile Devices
Abstract
1. Introduction
- To prevent the leakage of original facial images, we propose a framework for a facial recognition system in which facial features are extracted on the WeChat applet front end, and only the features, rather than the original facial image, are transmitted to the server for recognition.
- We propose the light-weight feature extractor FFEM-AM, which introduces depth-wise separable convolution and the ECA module so that it can be deployed in the WeChat applet while maintaining high accuracy.
- We constructed a large-scale facial image database in a real-world environment and evaluated the proposed FFEM-AM on this self-built database with the WeChat applet on mobile devices. The experiments show a prediction accuracy of 98.1%, which is better than that of popular light-weight models; the running time was less than 100 ms, and the memory cost was only 6.5 MB.
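The ECA module named in the highlights follows a simple recipe: squeeze each channel to a scalar by global average pooling, let neighboring channels interact via a 1-D convolution, and gate the channels with a sigmoid. The sketch below illustrates that flow in NumPy; the kernel weights and sizes are illustrative placeholders, not the trained parameters of FFEM-AM.

```python
# Illustrative sketch of Efficient Channel Attention (ECA):
# global average pooling -> 1-D cross-channel conv -> sigmoid gate -> rescale.
import numpy as np

def eca(feature_map: np.ndarray, k: int = 3) -> np.ndarray:
    """feature_map: (C, H, W). Returns the channel-reweighted map."""
    # Squeeze: one descriptor per channel via global average pooling.
    desc = feature_map.mean(axis=(1, 2))                # shape (C,)
    # Local cross-channel interaction: 1-D conv with a shared kernel.
    weights = np.ones(k) / k                            # placeholder kernel
    padded = np.pad(desc, k // 2, mode="edge")
    conv = np.convolve(padded, weights, mode="valid")   # shape (C,)
    gate = 1.0 / (1.0 + np.exp(-conv))                  # sigmoid in (0, 1)
    # Excite: rescale each channel by its attention weight.
    return feature_map * gate[:, None, None]

x = np.random.rand(8, 6, 6)
y = eca(x)
print(y.shape)  # (8, 6, 6)
```

Because the 1-D kernel only spans k neighboring channels, ECA adds a handful of parameters, which is what makes it attractive for a light-weight extractor.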
2. The Proposed Method
2.1. Framework
2.2. Network
2.3. Bottleneck
2.3.1. Efficient Channel Attention Module
2.3.2. Bottleneck with ECA Modules
2.4. Loss Function and Recognition
3. Experimental Results and Analysis
3.1. Set-Up
3.1.1. Model Training Platform and Metrics
3.1.2. Database
3.1.3. Model Deployment
3.2. Training Loss and Validation Accuracy
3.3. Evaluating the Prediction Accuracy
3.4. Performance on the WeChat Applet
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
- Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. Face recognition systems: A survey. Sensors 2020, 20, 342. [Google Scholar] [CrossRef] [PubMed]
- Sun, Y.; Chen, Y.; Wang, X.; Tang, X. Deep learning face representation by joint identification-verification. Adv. Neural Inf. Process. Syst. 2014, 27. Available online: https://proceedings.neurips.cc/paper/2014/file/e5e63da79fcd2bebbd7cb8bf1c1d0274-Paper.pdf (accessed on 30 October 2023).
- Mahdi, F.P.; Habib, M.; Ahad, A.R.; Mckeever, S.; Moslehuddin, A.; Vasant, P. Face recognition-based real-time system for surveillance. Intell. Decis. Technol. 2017, 11, 79–92. [Google Scholar] [CrossRef]
- Radzi, S.A.; Alif, M.M.F.; Athirah, Y.N.; Jaafar, A.S.; Norihan, A.H.; Saleha, M.S. IoT based facial recognition door access control home security system using raspberry pi. Int. J. Power Electron. Drive Syst. 2020, 11, 417. [Google Scholar] [CrossRef]
- Patel, K.; Han, H.; Jain, A.K. Secure face unlock: Spoof detection on smartphones. IEEE Trans. Inf. Forensics Secur. 2016, 11, 2268–2283. [Google Scholar] [CrossRef]
- Wei, M.; Liu, K. Development and design of the WeChat app. J. Web Syst. Appl. 2022, 4, 19–24. [Google Scholar]
- Wang, Z. Short video applet based on WeChat. In Application of Intelligent Systems in Multi-Modal Information Analytics, Proceedings of the International Conference on Multi-modal Information Analytics (MMIA 2021), Huhehaote, China, 23–24 April 2021; Volume 2, pp. 844–849. [Google Scholar]
- Ding, Y.; Lu, X.; Xie, Z.; Jiang, T.; Song, C.; Wang, Z. Evaluation of a novel WeChat applet for image-based dietary assessment among pregnant women in China. Nutrients 2021, 13, 3158. [Google Scholar] [CrossRef] [PubMed]
- Kumar, N.; Berg, A.C.; Belhumeur, P.N.; Nayar, S.K. Attribute and simile classifiers for face verification. In Proceedings of the International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 365–372. [Google Scholar]
- Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. Squeezenet: Alexnet-level accuracy with 50× fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360. [Google Scholar]
- Gholami, A.; Kwon, K.; Wu, B.; Tai, Z.; Yue, X.; Jin, P.; Zhao, S.; Keutzer, K. Squeezenext: Hardware-aware neural network design. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1638–1647. [Google Scholar]
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
- Zhang, X.; Zhou, X.; Lin, M.; Sun, J. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856. [Google Scholar]
- Ma, N.; Zhang, X.; Zheng, H.-T.; Sun, J. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 116–131. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar] [CrossRef]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
- Zhang, T.; Zhang, X.; Shi, J.; Wei, S. Depthwise separable convolution neural network for high-speed SAR ship detection. Remote Sens. 2019, 11, 2483. [Google Scholar] [CrossRef]
- Han, Z.; Yu, S.; Lin, S.-B.; Zhou, D.-X. Depth selection for deep ReLU nets in feature extraction and generalization. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 1853–1868. [Google Scholar] [CrossRef] [PubMed]
- Yan, C.; Tu, Y.; Wang, X.; Zhang, Y.; Hao, X.; Zhang, Y.; Dai, Q. STAT: Spatial-temporal attention mechanism for video captioning. IEEE Trans. Multimed. 2019, 22, 229–241. [Google Scholar] [CrossRef]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Smilkov, D.; Thorat, N.; Assogba, Y.; Nicholson, C.; Kreeger, N.; Yu, P.; Cai, S.; Nielsen, E.; Soergel, D.; Bileschi, S.; et al. TensorFlow.js: Machine learning for the web and beyond. In Proceedings of Machine Learning and Systems, Stanford, CA, USA, 31 March–2 April 2019; pp. 309–321. [Google Scholar]
| Input | Operator | t | c | n | s |
|---|---|---|---|---|---|
| 96² × 3 | conv2d | - | 32 | 1 | 2 |
| 48² × 32 | bottleneck | 1 | 16 | 1 | 1 |
| 48² × 16 | bottleneck | 6 | 24 | 2 | 2 |
| 24² × 24 | bottleneck | 6 | 32 | 2 | 2 |
| 12² × 32 | bottleneck | 6 | 64 | 3 | 2 |
| 6² × 64 | bottleneck | 6 | 96 | 2 | 1 |
| 6² × 96 | bottleneck | 6 | 160 | 3 | 2 |
| 3² × 160 | bottleneck | 6 | 320 | 1 | 1 |
| 3² × 320 | conv2d 1 × 1 | - | 1280 | 1 | 1 |
| 3² × 1280 | pool 3 × 3 | - | - | 1 | - |
| 1 × 1 × 1280 | conv2d 1 × 1 | - | k | - | - |
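The network table above (which mirrors the MobileNetV2 layout) can be sanity-checked by walking the stride column s from the assumed 96 × 96 input: each stage's output resolution should match the input column of the next stage.

```python
# Walk the stride column of the architecture table (assumed 96x96 input).
# Each entry of `sizes` is the spatial resolution after one stage, which
# should match the "Input" column of the following row.
strides = [2, 1, 2, 2, 2, 1, 2, 1]  # first conv2d + seven bottleneck stages
size = 96
sizes = []
for s in strides:
    size //= s
    sizes.append(size)
print(sizes)  # [48, 48, 24, 12, 6, 6, 3, 3]
```

After the final 3 × 3 pooling the map collapses to 1 × 1 × 1280, and the last 1 × 1 convolution maps it to the k-dimensional output.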
| Group | Accuracy (%) |
|---|---|
| Males | 98.6 |
| Females | 96.6 |
| With Glasses | 97.2 |
| Without Glasses | 97.0 |
| Device | Memory (MB) | Time (ms/Image) |
|---|---|---|
| Honor V30 | 6.5 | 86 |
| iPhone 12 | 6.5 | 31 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xiao, J.; Zhou, H.; Lei, Q.; Liu, H.; Xiao, Z.; Huang, S. Attention-Mechanism-Based Face Feature Extraction Model for WeChat Applet on Mobile Devices. Electronics 2024, 13, 201. https://doi.org/10.3390/electronics13010201