Robust Hand Gesture Recognition Using HOG-9ULBP Features and SVM Model
Abstract
1. Introduction
- We use a single Gaussian model (SGM) and the K-means algorithm to segment hand gestures against complex backgrounds. The SGM method can segment hand gestures from a general non-skin background; however, against a skin-like background, the skin-color pixels and the skin-color-like pixels cannot be correctly separated. Noticing that images after SGM segmentation contain three types of pixels, i.e., black background pixels, skin-color-like pixels, and skin-color pixels, we further apply the K-means algorithm to cluster these three kinds of pixels, separating the skin-color-like pixels from the skin-color pixels and achieving segmentation of hand gestures despite skin-like interference.
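The two-stage segmentation above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the Gaussian parameters, the probability threshold, and the rule of keeping the brightest cluster as the hand are all illustrative assumptions, and a tiny 1-D K-means stands in for the full algorithm.

```python
import numpy as np

def sgm_skin_prob(cbcr, mean, cov):
    """Skin probability of each pixel from a single Gaussian over (Cb, Cr)."""
    diff = cbcr.reshape(-1, 2) - mean
    mahal = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
    return np.exp(-0.5 * mahal).reshape(cbcr.shape[:2])

def kmeans_1d(values, k=3, iters=20):
    """Tiny K-means on pixel intensities; centers start evenly spaced."""
    centers = np.linspace(values.min(), values.max(), k)
    labels = np.zeros(len(values), int)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels

def segment_hand(gray, skin_prob, threshold=0.5):
    """SGM mask, then K-means (k = 3) to split the black background,
    skin-like, and skin pixels; the brightest cluster is kept as the hand
    (an illustrative assumption)."""
    masked = np.where(skin_prob >= threshold, gray, 0.0)
    labels = kmeans_1d(masked.ravel())
    means = [masked.ravel()[labels == j].mean() if np.any(labels == j) else -1
             for j in range(3)]
    return (labels == int(np.argmax(means))).reshape(gray.shape)
```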
- We propose an improved 9ULBP descriptor and combine it with the HOG feature for feature extraction of multi-scale, multi-angle (MSMA) hand gestures. The HOG feature captures contour information and maintains scale invariance, while the proposed 9ULBP descriptor captures texture information effectively and is rotation invariant. The combined HOG-9ULBP feature therefore offers both scale and rotation invariance and rich contour and texture information. An SVM is then used to complete the feature classification.
2. Methods
2.1. Hand Gesture Segmentation
2.2. Feature Extraction
2.2.1. HOG Feature
- Calculate the horizontal gradient $G_x(x, y)$ and vertical gradient $G_y(x, y)$ of each pixel $(x, y)$, respectively:
$$G_x(x, y) = I(x+1, y) - I(x-1, y), \qquad G_y(x, y) = I(x, y+1) - I(x, y-1),$$
where $I(x, y)$ denotes the gray value of pixel $(x, y)$.
- Calculate the gradient magnitude $G(x, y)$ and orientation $\theta(x, y)$ as follows:
$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}, \qquad \theta(x, y) = \arctan\frac{G_y(x, y)}{G_x(x, y)}.$$
- Calculate the orientation histogram vector of each cell. We take the calculation of the 9-dimensional gradient orientation histogram vector as an instance. First, the image is divided into blocks and cells of a specified size. Then, the orientation range $[0^\circ, 180^\circ)$ is equally divided into 9 bins. Let $h_1, h_2, \ldots, h_9$ represent the accumulated magnitudes of the bins $[0^\circ, 20^\circ), [20^\circ, 40^\circ), \ldots, [160^\circ, 180^\circ)$, respectively. For each pixel $(x, y)$ in a cell, if $\theta(x, y)$ falls into the $k$-th bin, let $h_k \leftarrow h_k + G(x, y)$. The obtained vector $h = (h_1, h_2, \ldots, h_9)$ is then $L_2$-norm normalized as follows:
$$h \leftarrow \frac{h}{\sqrt{\|h\|_2^2 + \varepsilon^2}},$$
where $\varepsilon$ is a small constant that avoids division by zero.
- HOG feature calculation. The calculation procedure of the HOG feature is illustrated in Figure 3. First, the HOG feature of each block is obtained by sequentially connecting the gradient orientation histogram of all the cells in the block. Then, the HOG feature of the entire image is obtained using the 4-pixel stride overlap strategy.
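The HOG steps above can be condensed into a short NumPy sketch: centered gradients, magnitude and unsigned orientation, and a 9-bin histogram per cell with $L_2$ normalization. Block grouping and the 4-pixel-stride overlap are omitted for brevity, so this computes per-cell histograms only.

```python
import numpy as np

def cell_hog(image, cell=8, bins=9):
    """Per-cell 9-bin gradient orientation histograms, L2-normalized."""
    img = image.astype(float)
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # horizontal centered gradient
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # vertical centered gradient
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            m = mag[r:r+cell, c:c+cell].ravel()
            a = ang[r:r+cell, c:c+cell].ravel()
            idx = np.minimum((a / (180.0 / bins)).astype(int), bins - 1)
            hist = np.bincount(idx, weights=m, minlength=bins)
            hist = hist / np.sqrt(np.sum(hist**2) + 1e-6)  # L2 normalization
            feats.append(hist)
    return np.concatenate(feats)

# A 64x64 image with (8, 8) cells yields an 8x8 grid of cells, 9 bins each:
# a 576-dimensional feature, matching the HOG dimension table.
feat = cell_hog(np.random.default_rng(0).random((64, 64)) * 255, cell=8)
assert feat.shape == (576,)
```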
2.2.2. 9ULBP Feature
- Count the number of 1s in each of the 58 uniform LBP (58ULBP) codes.
- Divide the LBP codes into 9 categories based on the number of 1s.
- Calculate decimal values of LBP codes in each class to find the smallest LBP code.
- Choose the smallest LBP code as the 9ULBP code.
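The labeling steps above can be sketched as follows: compute the 8-neighbor LBP code of a 3×3 patch, keep only uniform patterns (at most two 0/1 transitions around the circular code), and assign each uniform code to one of 9 classes by its number of 1s (0 through 8), which makes the label rotation invariant. Neighbor ordering and the handling of non-uniform codes are illustrative choices here.

```python
import numpy as np

def lbp_code(patch):
    """8-bit LBP of a 3x3 patch: neighbors thresholded against the center."""
    c = patch[1, 1]
    # clockwise neighbor order starting at the top-left corner
    nbrs = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
            patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if n >= c else 0 for n in nbrs]
    return bits, sum(b << (7 - i) for i, b in enumerate(bits))

def is_uniform(bits):
    """Uniform = at most two 0/1 transitions around the circular code."""
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8)) <= 2

def ulbp9_label(patch):
    """9ULBP class: the number of 1s (0..8) for uniform codes, else None
    (non-uniform codes are simply rejected in this sketch)."""
    bits, _ = lbp_code(patch)
    return sum(bits) if is_uniform(bits) else None
```

Because every rotation of a uniform code has the same number of 1s, all rotations share one class label; the smallest decimal code in a class then serves as that class's representative 9ULBP code.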
2.2.3. Feature Fusion
2.3. Gesture Classification
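Fusion and classification can be sketched together: the per-image HOG and 9ULBP vectors are concatenated and fed to an SVM. scikit-learn's `SVC` stands in for the SVM implementation (an assumption; the paper does not name a library), and the kernel, `C` value, and toy feature data are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

def fuse(hog_feat, ulbp_feat):
    """HOG-9ULBP fusion: concatenation of the two feature vectors."""
    return np.concatenate([hog_feat, ulbp_feat])

# Toy data: two gesture classes, 576-D HOG + 576-D 9ULBP -> 1152-D fused
# vectors; class 1 features are shifted so the classes are separable.
rng = np.random.default_rng(0)
X = np.stack([fuse(rng.random(576) + cls, rng.random(576) + cls)
              for cls in (0, 1) for _ in range(20)])
y_labels = np.repeat([0, 1], 20)

clf = SVC(kernel='rbf', C=10.0).fit(X, y_labels)
```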
3. Results
3.1. Dataset Creation
- Use Kinect’s human skeleton recognition technology to track the center point of the right hand palm.
- Intercept the right hand gesture pictures at four different scales.
- Rotate the obtained gesture images in five directions (−90°, −45°, 0°, 45°, 90°).
- Organize these multi-scale and multi-angle gesture images into the dataset.
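The multi-scale, multi-angle construction above can be sketched as follows. Kinect palm tracking is outside this sketch, the crop sizes are illustrative placeholders, and only the 0°/±90° rotations are implemented (via `np.rot90`); the ±45° cases would need an interpolating rotation such as `scipy.ndimage.rotate` or `cv2.warpAffine`.

```python
import numpy as np

def crop_scales(image, center, sizes=(64, 80, 96, 112)):
    """Crop square patches of four side lengths centered on the tracked
    palm point (clipped at the image border)."""
    cy, cx = center
    crops = []
    for s in sizes:
        half = s // 2
        top, left = max(cy - half, 0), max(cx - half, 0)
        crops.append(image[top:top + s, left:left + s])
    return crops

def rotate_right_angles(patch):
    """0 and +-90 degree variants; +-45 degrees would require an
    interpolating rotation and is omitted here."""
    return [np.rot90(patch, k) for k in (0, 1, -1)]
```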
3.2. Experiment on Self-Collected Dataset
3.3. Experiment on the NUS Dataset
3.4. Experiment on the MU HandImages ASL Dataset
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chakraborty, H.K.; Sarma, D.; Bhuyan, M.K.; Macdorman, K.F. Review of constraints on vision-based gesture recognition for human–computer interaction. IET Comput. Vis. 2018, 12, 3–15. [Google Scholar] [CrossRef]
- Guo, L.; Lu, Z.; Yao, L. Human–Machine Interaction Sensing Technology Based on Hand Gesture Recognition: A Review. IEEE Trans. Hum.-Mach. Syst. 2021, 51, 300–309. [Google Scholar] [CrossRef]
- Deafness and Hearing Loss. Available online: https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss (accessed on 14 February 2022).
- Neiva, D.H.; Zanchettin, C. Gesture recognition: A review focusing on sign language in a mobile context. Expert Syst. Appl. 2018, 103, 159–183. [Google Scholar] [CrossRef]
- Setiawardhana; Hakkun, R.Y.; Baharuddin, A. Sign language learning based on Android for deaf and speech impaired people. In Proceedings of the 2015 International Electronics Symposium, Surabaya, Indonesia, 29–30 September 2015; pp. 114–117. [Google Scholar]
- Aly, W.; Aly, S.; Almotairi, S. User-independent american sign language alphabet recognition based on depth image and PCANet features. IEEE Access 2019, 7, 123138–123150. [Google Scholar] [CrossRef]
- Pisharady, P.K.; Saerbeck, M. Recent methods and databases in vision-based hand gesture recognition: A review. Comput. Vis. Image Underst. 2015, 141, 152–165. [Google Scholar] [CrossRef]
- Zou, C.; Liu, Y.; Wang, J.; Si, H. Deformable Part Model Based Hand Detection against Complex Backgrounds. Adv. Image Graph. Technol. 2016, 634, 149–159. [Google Scholar]
- Choudhury, A.; Talukdar, A.K.; Sarma, K.K. A novel hand segmentation method for multiple-hand gesture recognition system under complex background. In Proceedings of the International Conference on Signal Processing and Integrated Networks, Noida, India, 20–21 February 2014; pp. 136–140. [Google Scholar]
- Stergiopoulou, E.; Sgouropoulos, K.; Nikolaou, N.; Papamarkos, N.; Mitianoudis, N. Real time hand detection in a complex background. Eng. Appl. Artif. Intell. 2014, 35, 54–70. [Google Scholar] [CrossRef]
- Cheng, F.C.; Chen, B.H.; Huang, S.C. A background model re-initialization method based on sudden luminance change detection. Eng. Appl. Artif. Intell. 2015, 38, 138–146. [Google Scholar] [CrossRef]
- Ban, Y.; Kim, S.K.; Kim, S.; Toh, K.A.; Lee, S. Face detection based on skin color likelihood. Pattern Recognit. 2014, 47, 1573–1585. [Google Scholar] [CrossRef]
- Hu, M.K. Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 1962, 8, 179–187. [Google Scholar]
- Li, G.; Ou, Q.; Luo, J. An Improved Hu-moment Algorithm in Gesture Recognition Based on Kinect Sensor. Inf. Technol. J. 2013, 12, 2963–2968. [Google Scholar]
- Priyal, S.P.; Bora, P.K. A robust static hand gesture recognition system using geometry based normalizations and Krawtchouk moments. Pattern Recognit. 2013, 48, 2202–2219. [Google Scholar] [CrossRef]
- Al-Utaibi, K.A.; Abdulhussain, S.H.; Mahmmod, B.M.; Naser, M.A.; Alsabah, M.; Sait, S.M. Reliable recurrence algorithm for high-order Krawtchouk polynomials. Entropy 2021, 23, 1162. [Google Scholar] [CrossRef]
- Žemgulys, J.; Raudonis, V.; Maskeliūnas, R.; Damaševičius, R. Recognition of basketball referee signals from videos using Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM). Procedia Comput. Sci. 2018, 130, 953–960. [Google Scholar] [CrossRef]
- Maqueda, A.I.; Del-Blanco, C.R.; Jaureguizar, F. Human–Computer Interaction based on Visual Hand-Gesture Recognition using Volumetric Spatiograms of Local Binary Patterns. Comput. Vis. Image Underst. 2015, 141, 126–137. [Google Scholar] [CrossRef] [Green Version]
- Zhou, S.; Liu, Y.H.; Li, K.Q. Recognition of multi-scale multi-angle gestures based on HOG-LBP feature. In Proceedings of the International Conference on Control, Automation, Robotics and Vision, Singapore, 18–21 November 2018; pp. 407–412. [Google Scholar]
- Yao, S.; Pan, S.; Wang, T.; Zheng, C.; Shen, W.; Chong, Y. A new pedestrian detection method based on combined HOG and LSS features. Neurocomputing 2015, 151, 1006–1014. [Google Scholar] [CrossRef]
- Muhammad, A.; Muhammad, J.I.; Iftikhar, A.; Madini, O.A.; Rayed, A.; Mohammad, B.; Muhammad, W. Real-time surveillance through face recognition using HOG and feedforward neural networks. IEEE Access 2019, 7, 121236–121244. [Google Scholar]
- Anwer, R.M.; Khan, F.S.; Weijer, J.; Molinier, M.; Laaksonen, J. Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification. ISPRS J. Photogramm. Remote Sens. 2018, 138, 74–85. [Google Scholar] [CrossRef] [Green Version]
- Singh, S.; Chintalacheruvu, S.C.K.; Garg, S.; Giri, Y.; Kumar, M. Efficient Face Identification and Authentication Tool for Biometric Attendance System. In Proceedings of the 2021 8th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 26–27 August 2021; pp. 379–383. [Google Scholar]
- Zhu, C.; Wang, R. Local multiple patterns based multiresolution gray-scale and rotation invariant texture classification. Inf. Sci. 2012, 187, 93–109. [Google Scholar] [CrossRef]
- Konstantinidis, D.; Stathaki, T.; Argyriou, V.; Grammalidis, N. Building detection using enhanced HOG-LBP features and region refinement processes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 10, 1–18. [Google Scholar] [CrossRef] [Green Version]
- Kumar, M.; Rani, A.; Raheja, S.; Munjal, G. Automatic Brain Tumor Detection Using Machine Learning and Mixed Supervision. In Evolving Role of AI and IoMT in the Healthcare Market; Springer: Cham, Switzerland, 2021; pp. 247–262. [Google Scholar]
- Lahiani, H.; Neji, M. A survey on hand gesture recognition for mobile devices. Int. J. Intell. Syst. Technol. Appl. 2020, 19, 458–485. [Google Scholar]
- Zheng, C.H.; Pei, W.J.; Yan, Q.; Chong, Y.W. Pedestrian detection based on gradient and texture feature integration. Neurocomputing 2017, 228, 71–78. [Google Scholar] [CrossRef] [Green Version]
- Ren, Y.; Xie, X.; Li, G.; Wang, Z. Hand Gesture Recognition With Multiscale Weighted Histogram of Contour Direction Normalization for Wearable Applications. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 364–377. [Google Scholar] [CrossRef]
- Liang, Z.; Sun, Z.; Cao, M. Recognition of static human gesture based on radiant projection transform and Fourier transform. In Proceedings of the International Congress on Image and Signal Processing, Sanya, China, 27–30 May 2008; pp. 635–640. [Google Scholar]
- Huang, Y.; Yang, J. A multi-scale descriptor for real time RGB-D hand gesture recognition. Pattern Recognit. Lett. 2021, 144, 97–104. [Google Scholar] [CrossRef]
- Zhou, Y.; Jiang, G.; Lin, Y. A novel finger and hand pose estimation technique for real-time hand gesture recognition. Pattern Recognit. 2015, 49, 102–114. [Google Scholar] [CrossRef]
- Chakraborty, B.K.; Bhuyan, M.K.; Kumar, S. Combining image and global pixel distribution model for skin colour segmentation. Pattern Recognit. Lett. 2017, 88, 33–40. [Google Scholar] [CrossRef]
- Kakumanu, P.; Makrogiannis, S.; Bourbakis, N. A survey of skin-color modeling and detection methods. Pattern Recognit. 2007, 40, 1106–1122. [Google Scholar] [CrossRef]
- Sun, J.; Wu, X. Infrared target recognition based on improved joint local ternary pattern. Opt. Eng. 2016, 55, 53–101. [Google Scholar] [CrossRef]
- Lategahn, H.; Gross, S.; Stehle, T.; Aach, T. Texture classification by modeling joint distributions of local patterns with Gaussian mixtures. IEEE Trans. Image Process. 2010, 19, 1548–1557. [Google Scholar] [CrossRef] [PubMed]
- Xia, S.; Chen, P.; Zhang, J.; Li, X.; Wang, B. Utilization of rotation-invariant uniform lbp histogram distribution and statistics of connected regions in automatic image annotation based on multi-label learning. Neurocomputing 2017, 228, 11–18. [Google Scholar] [CrossRef] [Green Version]
- Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
- Yang, Y.T.; Fishbain, B.; Hochbaum, D.S.; Norman, E.B.; Swanberg, E. The Supervised Normalized Cut Method for Detecting, Classifying, and Identifying Special Nuclear Materials. Informs J. Comput. 2013, 26, 45–58. [Google Scholar]
- Richhariya, B.; Tanveer, M. Eeg signal classification using universum support vector machine. Expert Syst. Appl. 2018, 106, 169–182. [Google Scholar] [CrossRef]
- Tang, H.; Liu, H.; Xiao, W.; Sebe, N. Fast and robust dynamic hand gesture recognition via key frames extraction and feature fusion. Neurocomputing 2019, 331, 424–433. [Google Scholar] [CrossRef] [Green Version]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Kelly, D.; Mcdonald, J.; Markham, C. A person independent system for recognition of hand postures used in sign language. Pattern Recognit. Lett. 2010, 31, 1359–1368. [Google Scholar] [CrossRef] [Green Version]
- Kumar, P.P.; Vadakkepat, P.; Loh, A.P. Hand posture and face recognition using a Fuzzy-Rough Approach. Int. J. Humanoid Robot. 2010, 7, 331–356. [Google Scholar] [CrossRef]
- Gupta, S.; Trivedi, M.C.; Kamya, S. Hand Skin Classification from Other Skin Objects Using Multi-direction 3D Color-Texture Feature and Cascaded Neural Network Classifier. Adv. Intell. Syst. Comput. 2016, 409, 523–534. [Google Scholar]
- Pisharady, P.K.; Vadakkepat, P.; Loh, A.P. Attention Based Detection and Recognition of Hand Postures Against Complex Backgrounds. Int. J. Comput. Vis. 2013, 101, 403–419. [Google Scholar] [CrossRef]
- Barczak, A.L.C.; Reyes, N.H.; Abastillas, M.; Piccio, A.; Susnjak, T. A New 2D Static Hand Gesture Colour Image Dataset for ASL Gestures. Ph.D. Thesis, Massey University, Palmerston North, New Zealand, 2011. [Google Scholar]
- Zhuang, H.; Yang, M.; Cui, Z.; Zheng, Q. A method for static hand gesture recognition based on non-negative matrix factorization and compressive sensing. Iaeng Int. J. Comput. Sci. 2017, 44, 52–59. [Google Scholar]
- Aowal, M.A.; Zaman, A.S.; Rahman, S.M.M.; Hatzinakos, D. Static hand gesture recognition using discriminative 2D Zernike moments. In Proceedings of the TENCON IEEE Region 10 Conference, Bangkok, Thailand, 22–25 October 2014; pp. 1–5. [Google Scholar]
- Kumar, V.; Nandi, G.C.; Kala, R. Static hand gesture recognition using stacked Denoising Sparse Autoencoders. In Proceedings of the International Conference on Contemporary Computing, Noida, India, 7–9 August 2014; pp. 99–104. [Google Scholar]
Block | Cell | Dimension | Recognition Rate |
---|---|---|---|
(64, 64) | (32, 32) | 36 | 36.39% |
(64, 64) | (16, 16) | 144 | 77.87% |
(64, 64) | (8, 8) | 576 | 90.02% |
(32, 32) | (16, 16) | 2916 | 89.41% |
(32, 32) | (8, 8) | 11,664 | 89.03% |
Cell | 9ULBP Dimension | Recognition Rate |
---|---|---|
(32, 32) | 36 | 20.97% |
(16, 16) | 144 | 40.49% |
(8, 8) | 576 | 85.47% |
(4, 4) | 2304 | 85.22% |
HOG | 9ULBP | HOG-9ULBP | Recognition Rate |
---|---|---|---|
2916 | 576 | 3492 | 97.46% |
2916 | 144 | 3060 | 97.43% |
2916 | 36 | 2952 | 97.40% |
576 | 576 | 1152 | 99.01% |
576 | 144 | 720 | 98.69% |
576 | 36 | 612 | 94.50% |
144 | 576 | 720 | 91.07% |
144 | 144 | 288 | 89.91% |
144 | 36 | 180 | 88.54% |
Recognition Method | Recognition Rate |
---|---|
The proposed method | 99.01% |
Fast R-CNN [42] | 97.15% |
KMD Classifier [15] | 96.94% |
Contour feature and SVM [29] | 95.00% |
Hu moment feature and SVM [43] | 91.80% |
HOG feature and SVM [17] | 90.02% |
LBP feature and SVM [8] | 85.47% |
Share and Cite
Li, J.; Li, C.; Han, J.; Shi, Y.; Bian, G.; Zhou, S. Robust Hand Gesture Recognition Using HOG-9ULBP Features and SVM Model. Electronics 2022, 11, 988. https://doi.org/10.3390/electronics11070988