An Experimental Analysis of Various Machine Learning Algorithms for Hand Gesture Recognition
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Feature Selection
- Starting Point Selection: Selecting the starting point for a feature subset is important. One option is to start with an empty subset and add relevant features to it; this is the forward selection method. Another technique is to start from the full feature set and eliminate irrelevant features from it; this is the backward selection method. The search can also be initiated from the middle, proceeding outwards.
- Evaluation Method: Evaluation strategies differ across feature selection algorithms. In the filter method, irrelevant and redundant features are eliminated before the learning algorithm runs. In the wrapper method, candidate subsets are evaluated with a particular induction algorithm, typically using cross-validation accuracy, so the bias of that algorithm is taken into account during feature selection.
- Stopping Criteria: The feature selection algorithm stops when adding or removing features no longer improves accuracy; the feature subset is revised for as long as its merit keeps improving. The exact criterion depends on the evaluation method.
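The forward and backward strategies above can be sketched with scikit-learn's `SequentialFeatureSelector`. This is a minimal illustration, not the paper's setup: the k-NN estimator, the synthetic data, and the target subset size of four features are all assumptions.

```python
# Sketch of forward vs. backward feature selection (illustrative only).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: 10 features, 4 of them informative.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3)

# Forward selection: start from an empty subset and greedily add features.
forward = SequentialFeatureSelector(knn, n_features_to_select=4,
                                    direction="forward", cv=3).fit(X, y)

# Backward selection: start from the full set and greedily drop features.
backward = SequentialFeatureSelector(knn, n_features_to_select=4,
                                     direction="backward", cv=3).fit(X, y)

print(forward.get_support())   # boolean mask over the 10 original features
print(backward.get_support())
```

Both directions stop once the requested subset size is reached; the cross-validated score inside the selector plays the role of the "merit" in the stopping criterion above.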
- The training data set comprised 27,455 cases (80%), and the test data set comprised 7172 cases (20%). Threefold cross-validation was used: one-third of the samples from each subject was reserved for training, and the rest were used for testing. Figure 1 represents the overall CNN network architecture used for the experimental setup; with it, images are classified into the different classes used for further analysis.
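The 80/20 split and the threefold cross-validation can be sketched as follows. The random arrays merely stand in for the flattened 28×28 sign-language images, and the logistic-regression scorer is an arbitrary placeholder estimator, not the paper's CNN.

```python
# Sketch of the 80/20 train/test split plus threefold cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 784))           # stand-in for flattened 28x28 images
y = rng.integers(0, 24, size=500)    # stand-in for 24 static gesture classes

# 80% training / 20% testing, mirroring the 27,455 / 7,172 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Threefold cross-validation on the training portion.
scores = cross_val_score(LogisticRegression(max_iter=200),
                         X_train, y_train, cv=3)
print(len(scores))  # one accuracy score per fold
```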
Features and Features Extraction
- Contour-based: These features are extracted from shape boundaries. They are known as local methods because, rather than extracting features from the whole image, features are extracted only from the region of interest (ROI).
- Region-based: These features are extracted from the whole image and are known as global methods. Examples of region-based features are geometric moments, Zernike moments, etc.
- Gradient-based: The image is split into dense blocks, and features are extracted from these blocks only. They capture the local shape information of objects, as in the Histogram of Oriented Gradients (HOG) and the Scale-Invariant Feature Transform (SIFT).
- Distinguishable: An efficient descriptor must be able to distinguish the classes of hand gestures to be discriminated; it must exhibit high inter-class variance.
- Invariant: The extracted features must be resistant to changes in rotation, translation, and scale, and should show a small intra-class variance even when the image changes slightly.
- Reliable: Features must be resistant to noise in the image. The feature set must be robust enough to handle noise and variations due to ambient conditions.
- Statistically Independent: Features must be independent of one another: a small change in one feature must not affect the rest of the feature set. Otherwise, a small error in one feature would degrade the accuracy of the whole system.
3.2. Analysis Using SVC Algorithm
3.3. Analysis Using KNN Algorithm
3.4. Analysis Using Logistic Regression
3.5. Analysis Using Naïve Bayes
3.5.1. Analysis Using Naïve Bayes for Multinomial Models
3.5.2. Analysis Using Naïve Bayes for Gaussian Models
3.6. Analysis Using SGDC
3.7. Analysis Using CNN
3.8. Analysis Using Random Forest
3.9. Analysis Using XGBoost
- i. A baseline model was trained to check the performance of the model in general.
- ii. A second model was trained with parameter tuning, and its results were compared with the baseline model.
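The two-step procedure above can be sketched as follows. As an assumption, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the synthetic data and parameter grid are illustrative, not the paper's configuration.

```python
# Sketch: (i) baseline boosted model, (ii) tuned model, then compare.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0)

# i. Baseline model with default parameters.
baseline = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# ii. Second model trained with parameter tuning via grid search.
grid = {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]}
tuned = GridSearchCV(GradientBoostingClassifier(random_state=0),
                     grid, cv=3).fit(X_tr, y_tr)

# Compare test accuracies of the baseline and the tuned model.
print(baseline.score(X_te, y_te), tuned.score(X_te, y_te))
print(tuned.best_params_)
```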
4. Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Dardas, N.H.; Georganas, N.D. Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques. IEEE Trans. Instrum. Meas. 2011, 60, 3592–3607.
- Skaria, S.; Al-Hourani, A.; Evans, R.J. Deep-Learning Methods for Hand-Gesture Recognition Using Ultra-Wideband Radar. IEEE Access 2020, 8, 203580–203590.
- Keskin, C.; Kirac, F.; Kara, Y.E.; Akarun, L. Randomized decision forests for static and dynamic hand shape classification. In Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, 16–21 June 2012; pp. 31–36.
- Chakraborty, D.; Garg, D.; Ghosh, A.; Chan, J.H. Trigger Detection System for American Sign Language using Deep Convolutional Neural Networks. In Proceedings of the 10th International Conference on Advances in Information Technology, Bangkok, Thailand, 10–13 December 2018; Association for Computing Machinery: New York, NY, USA, 2018; p. 4.
- Wang, S.B.; Quattoni, A.; Morency, L.P.; Demirdjian, D.; Darrell, T. Hidden conditional random fields for gesture recognition. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1521–1527.
- Just, A.; Marcel, S. A comparative study of two state-of-the-art sequence processing techniques for hand gesture recognition. Comput. Vis. Image Underst. 2009, 113, 532–543.
- Tam, S.; Boukadoum, M.; Campeau-Lecours, A.; Gosselin, B. A Fully Embedded Adaptive Real-Time Hand Gesture Classifier Leveraging HD-sEMG and Deep Learning. IEEE Trans. Biomed. Circuits Syst. 2020, 14, 232–243.
- Wahid, M.F.; Tafreshi, R.; Al-Sowaidi, M.; Langari, R. Subject-independent hand gesture recognition using normalization and machine learning algorithms. J. Comput. Sci. 2018, 27, 69–76.
- Li, H.; Wu, L.; Wang, H.; Han, C.; Quan, W.; Zhao, J. Hand Gesture Recognition Enhancement Based on Spatial Fuzzy Matching in Leap Motion. IEEE Trans. Ind. Inform. 2020, 16, 1885–1894.
- Lee, U.; Tanaka, J. Finger identification and hand gesture recognition techniques for natural user interface. In Proceedings of the 11th Asia Pacific Conference on Computer Human Interaction, Bangalore, India, 24–27 September 2013; Association for Computing Machinery: New York, NY, USA, 2013; pp. 274–279.
- Nogales, R.E.; Benalcázar, M.E. Hand gesture recognition using machine learning and infrared information: A systematic literature review. Int. J. Mach. Learn. Cyber. 2021, 12, 2859–2886.
- Cote-Allard, U.; Fall, C.L.; Drouin, A.; Campeau-Lecours, A.; Gosselin, C.; Glette, K.; Laviolette, F.; Gosselin, B. Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 760–771.
- Heo, H.; Lee, E.C.; Park, K.R.; Kim, C.J.; Whang, M. A realistic game system using multi-modal user interfaces. IEEE Trans. Consum. Electron. 2010, 56, 1364–1372.
- Dardas, N.; Chen, Q.; Georganas, N.D.; Petriu, E.M. Hand gesture recognition using Bag-of-features and multi-class Support Vector Machine. In Proceedings of the 2010 IEEE International Symposium on Haptic Audio Visual Environments and Games, Phoenix, AZ, USA, 16–17 October 2010; pp. 1–5.
- Zhang, X.; Chen, X.; Li, Y.; Lantz, V.; Wang, K.; Yang, J. A Framework for Hand Gesture Recognition Based on Accelerometer and EMG Sensors. IEEE Trans. Syst. Man Cyber.-Part A Syst. Hum. 2011, 41, 1064–1076.
- Keskin, C.; Kıraç, F.; Kara, Y.E.; Akarun, L. Hand Pose Estimation and Hand Shape Classification Using Multi-layered Randomized Decision Forests. In Computer Vision—ECCV 2012. ECCV 2012. Lecture Notes in Computer Science; Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7577.
- Ren, Z.; Yuan, J.; Meng, J.; Zhang, Z. Robust Part-Based Hand Gesture Recognition Using Kinect Sensor. IEEE Trans. Multimed. 2013, 15, 1110–1120.
- Ohn-Bar, E.; Trivedi, M.M. Hand Gesture Recognition in Real Time for Automotive Interfaces: A Multimodal Vision-Based Approach and Evaluations. IEEE Trans. Intell. Transp. Syst. 2014, 15, 2368–2377.
- Plouffe, G.; Cretu, A. Static and Dynamic Hand Gesture Recognition in Depth Data Using Dynamic Time Warping. IEEE Trans. Instrum. Meas. 2016, 65, 305–316.
- Zhang, W.; Wang, J.; Lan, F. Dynamic hand gesture recognition based on short-term sampling neural networks. IEEE/CAA J. Autom. Sin. 2021, 8, 110–120.
- Zhao, Y.; Wang, L. The Application of Convolution Neural Networks in Sign Language Recognition. In Proceedings of the 2018 Ninth International Conference on Intelligent Control and Information Processing (ICICIP), Wanzhou, China, 9–11 November 2018; pp. 269–272.
- Gajowniczek, K.; Grzegorczyk, I.; Ząbkowski, T.; Bajaj, C. Weighted Random Forests to Improve Arrhythmia Classification. Electronics 2020, 9, 99.
- Yu, J.-W.; Yoon, Y.-W.; Baek, W.-K.; Jung, H.-S. Forest Vertical Structure Mapping Using Two-Seasonal Optic Images and LiDAR DSM Acquired from UAV Platform through Random Forest, XGBoost, and Support Vector Machine Approaches. Remote Sens. 2021, 13, 4282.
- Aggarwal, A.; Kumar, M. Image surface texture analysis and classification using deep learning. Multimed. Tools Appl. 2021, 80, 1289–1309.
- Ding, X.; Jiang, T.; Xue, W.; Li, Z.; Zhong, Y. A New Method of Human Gesture Recognition Using Wi-Fi Signals Based on XGBoost. In Proceedings of the 2020 IEEE/CIC International Conference on Communications in China (ICCC Workshops), Chongqing, China, 9–11 August 2020; pp. 237–241.
- Li, T.; Zhou, M. ECG Classification Using Wavelet Packet Entropy and Random Forests. Entropy 2016, 18, 285.
- Paleczek, A.; Grochala, D.; Rydosz, A. Artificial Breath Classification Using XGBoost Algorithm for Diabetes Detection. Sensors 2021, 21, 4187.
- Jin, J.; Fu, K.; Zhang, C. Traffic Sign Recognition with Hinge Loss Trained Convolutional Neural Networks. IEEE Trans. Intell. Transp. Syst. 2014, 15, 1991–2000.
- Alshehri, M.; Kumar, M.; Bhardwaj, A.; Mishra, S.; Gyani, J. Deep Learning Based Approach to Classify Saline Particles in Sea Water. Water 2021, 13, 1251.
- Badi, H. Recent methods in vision-based hand gesture recognition. Int. J. Data Sci. Anal. 2016, 1, 77–87.
- Patwary, M.J.A.; Parvin, S.; Akter, S. Significant HOG-Histogram of Oriented Gradient Feature Selection for Human Detection. Int. J. Comput. Appl. 2015, 132, 20–24.
- Devineau, G.; Moutarde, F.; Xi, W.; Yang, J. Deep Learning for Hand Gesture Recognition on Skeletal Data. In Proceedings of the 13th IEEE Conference on Automatic Face and Gesture Recognition (FG’2018), Xi’an, China, 15–19 May 2018.
- Al-Hammadi, M.; Muhammad, G.; Abdul, W.; Alsulaiman, M.; Bencherif, M.A.; Mekhtiche, M.A. Hand Gesture Recognition for Sign Language Using 3DCNN. IEEE Access 2020, 8, 79491–79509.
- Bao, P.; Maqueda, A.I.; Del-Blanco, C.R.; García, N. Tiny hand gesture recognition without localization via a deep convolutional network. IEEE Trans. Consum. Electron. 2017, 63, 251–257.
- Pouyanfar, S.; Sadiq, S.; Yan, Y.; Tian, H.; Tao, Y.; Reyes, M.P.; Shyu, M.L.; Chen, S.C.; Iyengar, S.S. A survey on deep learning: Algorithms, techniques, and applications. ACM Comput. Surv. 2018, 51, 1–36.
- Rida, I.; Al-Maadeed, N.; Al-Maadeed, S.; Bakshi, S. A comprehensive overview of feature representation for biometric recognition. Multimed. Tools Appl. 2018, 79, 4867–4890.
- Yang, M.; Kidiyo, K.; Joseph, R. A survey of shape feature extraction techniques. Pattern Recognit. 2008, 15, 43–90.
- Ping Tian, D. A review on image feature extraction and representation techniques. Int. J. Multimed. Ubiquitous Eng. 2013, 8, 385–396.
- Rida, I.; Herault, R.; Marcialis, G.L.; Gasso, G. Palmprint recognition with an efficient data driven ensemble classifier. Pattern Recognit. Lett. 2019, 126, 21–30.
- Hamid, N.A.; Sjarif, N.N.A. Handwritten recognition using SVM, KNN, and Neural networks. arXiv 2017, arXiv:1702.00723.
- Kumar, M.; Sriastava, S.; Hensman, A. A Hybrid Novel Approach of Video Watermarking. Int. J. Signal Process. Image Process. Pattern Recognit. 2016, 9, 365–376.
- Chakradar, M.; Aggarwal, A.; Cheng, X.; Rani, A.; Kumar, M.; Shankar, A. A Non-invasive Approach to Identify Insulin Resistance with Triglycerides and HDL-c Ratio Using Machine learning. Neural Process. Lett. 2021, 1–21.
- Kumar, M.; Aggarwal, J.; Rani, A.; Stephan, T.; Shankar, A.; Mirjalili, S. Secure video communication using firefly optimization and visual cryptography. Artif. Intell. Rev. 2021, 1–21.
- Bhushan, S.; Alshehri, M.; Agarwal, N.; Keshta, I.; Rajpurohit, J.; Abugabah, A. A Novel Approach to Face Pattern Analysis. Electronics 2022, 11, 444.
- Singh, A.K.; Kumar, S.; Bhushan, S.; Kumar, P.; Vashishtha, A. A Proportional Sentiment Analysis of MOOCs Course Reviews Using Supervised Learning Algorithms. Ingénierie Syst. D’inf. 2021, 26, 501–506.
- Albawi, S.; Bayat, O.; Al-Azawi, S.; Ucan, O.N. Social Touch Gesture Recognition Using Convolutional Neural Network. Comput. Intell. Neurosci. 2018, 2018, 6973103.
- Fong, S.; Liang, J.; Fister, I.J.; Mohammed, S. Gesture Recognition from Data Streams of Human Motion Sensor Using Accelerated PSO Swarm Search Feature Selection Algorithm. J. Sens. 2015, 2015, 205707.
- Yan, S.; Xia, Y.; Smith, J.S.; Lu, W.; Zhang, B. Multiscale Convolutional Neural Networks for Hand Detection. Appl. Comput. Intell. Soft Comput. 2017, 2017, 9830641.
S. No | Year | Author | Detection Technique | Dataset | Other Characteristics |
---|---|---|---|---|---|
1 | 2010 | Heo et al. [13] | Binary open (stretching) and close (crooking) | Hand Gesture Dataset | Used for the game system, so only grabbing and not grabbing have been used in this. |
2 | 2011 | Dardas and Georganas [14] | Support vector machine (SVM), scale invariance feature transform (SIFT), and K-means clustering | Real-time Dataset | Accuracy of 96.23% under variable scale |
3 | 2012 | Zhang et al. [15] | Three-axis accelerometer (ACC) and multi-channel electromyography (EMG) sensors, multistream hidden Markov models, and HMM classifiers | Work on 72 CSL words and Hand Gesture Dataset | Accuracies of 95.3% and 96.3% for two subjects and by HMM accuracy is increased by 2.5% |
4 | 2012 | Keskin et al. [16] | Shape Classification Forest (SCF) | American Sign Language (ASL) dataset and ChaLearn Gesture Dataset (CGD2011) | Achieved a success rate of 97.8% when using the ASL dataset |
5 | 2013 | Ren et al. [17] | Kinect sensor, Finger-Earth Mover’s Distance (FEMD) | Hand Gesture Dataset, 10-gesture dataset | 93.2% mean accuracy, efficiency: 0.0750 s per frame |
6 | 2014 | Ohn-Bar and Trivedi [18] | A multimodal vision-based approach and evaluations | Real dataset (set of 19 gestures), RGBD fusion | Studied the feasibility of an in-vehicle vision-based gesture recognition system |
7 | 2016 | Plouffe and Cretu [19] | Kinect sensor, k-curvature algorithm, DTW algorithm | Real-time hand gesture. | Accuracy of 92.4% is achieved over 55 static and dynamic hand gestures. |
8 | 2018 | Wahid et al. [8] | Upper limb’s electromyography (EMG). For classification: k-Nearest Neighbor (kNN), Discriminant Analysis (DA), Naïve Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM) and non-parametric Wilcoxon signed-rank test. | Three different hand gestures as a fist, wave in, and wave out | Accuracy of 96.4% is achieved by using the area under the averaged root mean square curve (AUC-RMS) |
9 | 2021 | Zhang et al. [20] | Convolutional neural network (ConvNet), long short-term memory (LSTM) network | Jester dataset and Nvidia dataset | An accuracy of 95.73% was achieved on the Jester dataset and 95.69% on the “zoomed-out” Jester dataset; on the Nvidia dataset, an accuracy of 85.13% was achieved. |
10 | 2018 | Zhao and Wang [21] | Convolutional Neural Network (CNN) | American Sign Language (ASL) dataset and MNIST dataset | CNN gave the highest efficiency on parameter distribution on the ASL dataset. |
11 | 2019 | Allard et al. [12] | Raw EMG, spectrograms, and continuous wavelet transform (CWT) | Two datasets (19 and 17 able-bodied participants), the first used for pre-training; Myo armband, NinaPro database | Offline accuracy: 98.31% (7 gestures, 17 participants, CWT-based ConvNet) and 68.98% (18 gestures, 10 participants, raw EMG-based ConvNet) |
12 | 2020 | Li et al. [9] | Leap Motion gen.2, spatial fuzzy matching (SFM) algorithm | Hand gesture dataset | Static Gesture: accuracy ranges from 94 to 100%. Dynamic Gesture: More than 90% accuracy has been achieved. |
13 | 2020 | Tam et al. [7] | Convolutional neural network (CNN) and myoelectric control scheme | Nina database, real-time hand gesture | Accuracy of 98.2% was achieved. |
Reference | Model | Accuracy |
---|---|---|
Saad et al. [47] | Random Forest (RF) and Boosting Algorithms, Decision Tree Algorithm | 63% |
Fong et al. [47] | Model Induction Algorithm, K-star Algorithm, Updated Naïve Bayes Algorithm, Decision Tree Algorithm | 76% |
Yan et al. [48] | AdaBoost Algorithm, SAMME Algorithm, SGD Algorithm, Edgebox Algorithm | 81.25% |
Proposed Model | Convolutional Neural Networks (CNN) model applied to sign language MNIST dataset | 91.41% |
Bhushan, S.; Alshehri, M.; Keshta, I.; Chakraverti, A.K.; Rajpurohit, J.; Abugabah, A. An Experimental Analysis of Various Machine Learning Algorithms for Hand Gesture Recognition. Electronics 2022, 11, 968. https://doi.org/10.3390/electronics11060968