Intelligent Gesture Recognition Based on Screen Reflectance Multi-Band Spectral Features
Abstract
1. Introduction
2. Principles and System
2.1. Principles
2.2. System
3. Data Collection
4. RGB Multi-Channel CNN-LSTM Gesture Recognition Model
4.1. RGB Three-Channel 1D-CNN Feature Extractor
4.2. LSTM Network
4.3. Evaluation
5. Experimental Results and Discussion
5.1. Experiment I Results
5.2. Experiment II Results
5.3. Discussion
- (1) The gesture categories and spectral ranges in this work are limited. In future research, expanding the range of gesture classifications could enable more complex human–machine interactions, potentially incorporating dynamic movements. For example, integrating with a sign language database would greatly enhance the system’s practicality for individuals with hearing and speech impairments. To achieve this, detailed plans for data collection and window segmentation will be essential (a minimal windowing sketch follows this list). Additionally, this work collected spectral data only within the visible light range; future extensions could cover wider spectral ranges to fully leverage the characteristics of different bands.
- (2) This work established an RGB three-channel narrowband spectral gesture recognition system. Future efforts will focus on optimizing the reception system to improve the accuracy and applicability of the proposed method in diverse real-world scenarios. To enhance accuracy in complex interactions, deploying more narrowband receivers at multiple locations to establish a reception matrix would prove beneficial.
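To make the window segmentation mentioned in item (1) concrete, the following is a minimal Python sketch, assuming each raw recording is a (time, channels) NumPy array of RGB narrowband intensities. The sampling rate, window length, and overlap are illustrative assumptions, not values reported in this work.

```python
import numpy as np

def segment_windows(signal: np.ndarray, window_len: int, hop: int) -> np.ndarray:
    """Slice a (time, channels) recording into overlapping fixed-length windows.

    Returns an array of shape (num_windows, window_len, channels).
    """
    starts = range(0, signal.shape[0] - window_len + 1, hop)
    return np.stack([signal[s:s + window_len] for s in starts])

# Hypothetical example: a 10 s recording of RGB reflectance at 100 Hz,
# segmented into 1 s windows with 50% overlap (placeholder data only).
recording = np.random.rand(1000, 3)
windows = segment_windows(recording, window_len=100, hop=50)
print(windows.shape)  # (19, 100, 3)
```

For dynamic gestures, the hop size trades temporal resolution against dataset size; labeling windows by gesture class keeps the resulting dataset balanced.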
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Vrana, J.; Singh, R. Handbook of Nondestructive Evaluation 4.0; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 107–123.
2. Hewett, T.; Baecker, R.; Card, S.; Carey, T.; Gasen, J.; Mantei, M.; Perlman, G.; Strong, G.; Verplank, W. ACM SIGCHI Curricula for Human-Computer Interaction; ACM Press: New York, NY, USA, 1992; pp. 5–7.
3. Mourtzis, D.; Angelopoulos, J.; Panopoulos, N. The future of the human–machine interface (HMI) in society 5.0. Future Internet 2023, 15, 162.
4. Reipschlager, P.; Flemisch, T.; Dachselt, R. Personal augmented reality for information visualization on large interactive displays. IEEE Trans. Vis. Comput. Graph. 2021, 27, 1182–1192.
5. Biele, C. Hand movements using keyboard and mouse. Hum. Mov. Hum.-Comput. Interact. 2022, 996, 39–51.
6. Wu, J.; Zhu, Y.; Fang, X.; Banerjee, P. Touch or click? The effect of direct and indirect human-computer interaction on consumer responses. J. Mark. Theory Pract. 2023, 32, 158–173.
7. Jakobsen, M.R.; Hornbaek, K. Up close and personal: Collaborative work on a high-resolution multitouch wall display. ACM Trans. Comput.-Hum. Interact. 2014, 21, 1–34.
8. Nunes, J.S.; Castro, N.; Gonçalves, S.; Pereira, N.; Correia, V.; Lanceros-Mendez, S. Marked object recognition multitouch screen printed touchpad for interactive applications. Sensors 2017, 17, 2786.
9. Prouzeau, A.; Bezerianos, A.; Chapuis, O. Evaluating multi-user selection for exploring graph topology on wall-displays. IEEE Trans. Vis. Comput. Graph. 2016, 23, 1936–1951.
10. Huang, Z.; Huang, X. A study on the application of voice interaction in automotive human machine interface experience design. In Proceedings of the AIP Conference, Xi’an, China, 20–21 January 2018; p. 040074.
11. Uludağli, M.Ç.; Acartürk, C. User interaction in hands-free gaming: A comparative study of gaze-voice and touchscreen interface control. Turk. J. Electr. Eng. Comput. Sci. 2018, 26, 1967–1976.
12. Gao, L.; Liu, Y.; Le, J.; Liu, R. Research on the application of multi-channel interaction in information system. In Proceedings of the 2nd International Conference on Robotics, Artificial Intelligence and Intelligent Control (RAIIC), Mianyang, China, 11–13 August 2023; pp. 121–125.
13. Birch, B.; Griffiths, C.A.; Morgan, A. Environmental effects on reliability and accuracy of MFCC based voice recognition for industrial human-robot-interaction. Proc. Inst. Mech. Eng. B J. Eng. Manuf. 2021, 235, 1939–1948.
14. Alrowais, F.; Negm, N.; Khalid, M.; Almalki, N.; Marzouk, R.; Mohamed, A.; Al Duhayyim, M.; Alneil, A.A. Modified earthworm optimization with deep learning assisted emotion recognition for human computer interface. IEEE Access 2023, 11, 35089–35096.
15. Pereira, R.; Mendes, C.; Ribeiro, J.; Ribeiro, R.; Miragaia, R.; Rodrigues, N.; Costa, N.; Pereira, A. Systematic review of emotion detection with computer vision and deep learning. Sensors 2024, 24, 3484.
16. Aghajanzadeh, S.; Naidu, R.; Chen, S.H.; Tung, C.; Goel, A.; Lu, Y.H.; Thiruvathukal, G.K. Camera placement meeting restrictions of computer vision. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 3254–3258.
17. Harshitaa, A.; Hansini, P.; Asha, P. Gesture based home appliance control system for disabled people. In Proceedings of the Second International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 4–6 August 2021; pp. 1501–1505.
18. Ryumin, D.; Ivanko, D.; Axyonov, A. Cross-language transfer learning using visual information for automatic sign gesture recognition. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2023, 48, 209–216.
19. Zahra, R.; Shehzadi, A.; Sharif, M.I.; Karim, A.; Azam, S.; De Boer, F.; Jonkman, M.; Mehmood, M. Camera-based interactive wall display using hand gesture recognition. Intell. Syst. Appl. 2023, 19, 200262.
20. Benitez-Garcia, G.; Prudente-Tixteco, L.; Castro-Madrid, L.C.; Toscano-Medina, R.; Olivares-Mercado, J.; Sanchez-Perez, G.; Villalba, L.J.G. Improving real-time hand gesture recognition with semantic segmentation. Sensors 2021, 21, 356.
21. Luo, G.; Yang, P.; Chen, M.; Li, P. HCI on the table: Robust gesture recognition using acoustic sensing in your hand. IEEE Access 2020, 8, 31481–31498.
22. Hazra, S.; Santra, A. Robust gesture recognition using millimetric-wave radar system. IEEE Sens. Lett. 2018, 2, 1–4.
23. Cheng, Y.L.; Yeh, W.; Liao, Y.P. The implementation of a gesture recognition system with a millimeter wave and thermal imager. Sensors 2024, 24, 581.
24. Oudah, M.; Al-Naji, A.; Chahl, J. Hand gesture recognition based on computer vision: A review of techniques. J. Imaging 2020, 6, 73.
25. Galván-Ruiz, J.; Travieso-González, C.M.; Tejera-Fettmilch, A.; Pinan-Roescher, A.; Esteban-Hernández, L.; Domínguez-Quintana, L. Perspective and evolution of gesture recognition for sign language: A review. Sensors 2020, 20, 3571.
26. Sokhib, T.; Whangbo, T.K. A combined method of skin- and depth-based hand gesture recognition. Int. Arab J. Inf. Technol. 2020, 17, 137–145.
27. Xu, J.; Li, J.; Zhang, S.; Xie, C.; Dong, J. Skeleton guided conflict-free hand gesture recognition for robot control. In Proceedings of the 11th International Conference on Awareness Science and Technology (iCAST), Qingdao, China, 7–9 December 2020; pp. 1–6.
28. Alwaely, B.; Abhayaratne, C. Ghosm: Graph-based hybrid outline and skeleton modelling for shape recognition. ACM Trans. Multim. Comput. Commun. Appl. 2023, 19, 1–23.
29. Qiao, G.; Ning, N.; Zuo, Y.; Zhou, P.; Sun, M.; Hu, S.; Yu, Q.; Liu, Y. Spatio-temporal fusion spiking neural network for frame-based and event-based camera sensor fusion. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 2446–2456.
30. Ryumin, D.; Ivanko, D.; Ryumina, E. Audio-visual speech and gesture recognition by sensors of mobile devices. Sensors 2023, 23, 2284.
31. Hakim, N.L.; Shih, T.K.; Kasthuri Arachchi, S.P.; Aditya, W.; Chen, Y.C.; Lin, C.Y. Dynamic hand gesture recognition using 3DCNN and LSTM with FSM context-aware model. Sensors 2019, 19, 5429.
32. Sharma, P.; Anand, R.S. Depth data and fusion of feature descriptors for static gesture recognition. IET Image Process. 2020, 14, 909–920.
33. Zengeler, N.; Kopinski, T.; Handmann, U. Hand gesture recognition in automotive human–machine interaction using depth cameras. Sensors 2019, 19, 59.
34. Yu, J.; Qin, M.; Zhou, S. Dynamic gesture recognition based on 2D convolutional neural network and feature fusion. Sci. Rep. 2022, 12, 4345.
35. Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 4489–4497.
36. Hui, W.S.; Huang, W.; Hu, J.; Tao, K.; Peng, Y. A new precise contactless medical image multimodal interaction system for surgical practice. IEEE Access 2020, 8, 121811–121820.
37. Safavi, S.M.; Sundaram, S.M.; Heydarigorji, A.; Udaiwal, N.S.; Chou, P.H. Application of infrared scanning of the neck muscles to control a cursor in human-computer interface. In Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Republic of Korea, 11–15 July 2017; pp. 787–790.
38. Singh, J.; Raza, U. Passive visible light positioning systems: An overview. In Proceedings of the Workshop on Light Up the IoT, London, UK, 21 September 2020; pp. 48–53.
39. Fragner, C.; Krutzler, C.; Weiss, A.P.; Leitgeb, E. LEDPOS: Indoor visible light positioning based on LED as sensor and machine learning. IEEE Access 2024, 12, 46444–46461.
40. Pathak, P.H.; Feng, X.; Hu, P.; Mohapatra, P. Visible light communication, networking, and sensing: A survey, potential and challenges. IEEE Commun. Surv. Tutor. 2015, 17, 2047–2077.
41. Lu, Y.; Wu, F.; Huang, Q.; Tang, S.; Chen, G. Telling secrets in the light: An efficient key extraction mechanism via ambient light. IEEE Trans. Wirel. Commun. 2021, 20, 186–198.
42. Liao, Z.; Luo, Z.; Huang, Q.; Zhang, L.; Wu, F.; Zhang, Q.; Wang, Y. SMART: Screen-based gesture recognition on commodity mobile devices. In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking, New Orleans, LA, USA, 31 January–4 February 2022; pp. 283–295.
43. Lin, P.; Zhuo, R.; Wang, S.; Wu, Z.; Huangfu, J. LED screen-based intelligent hand gesture recognition system. IEEE Sens. J. 2022, 22, 24439–24448.
44. Jogin, M.; Madhulika, M.S.; Divya, G.D.; Meghana, R.K.; Apoorva, S. Feature extraction using convolution neural networks (CNN) and deep learning. In Proceedings of the 3rd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT), Bangalore, India, 18–19 May 2018; pp. 2319–2323.
45. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
46. Sherstinsky, A. Fundamentals of recurrent neural network and long short-term memory network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
47. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
48. Takahashi, K.; Yamamoto, K.; Kuchiba, A.; Koyama, T. Confidence interval for micro-averaged F1 and macro-averaged F1 scores. Appl. Intell. 2022, 52, 4961–4972.
49. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In Proceedings of the 19th Australasian Joint Conference on Artificial Intelligence, Berlin, Germany, 4–8 December 2006; pp. 1015–1021.
| Dataset | Light Source | Volunteers | Number of Gestures | Samples |
|---|---|---|---|---|
| 1 | Screen | 10 | 8 | 920 × 8 |
| 2 | Screen + ambient light | 10 | 8 | 920 × 8 |
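Reading 920 × 8 as 920 windows per gesture across the 8 gesture classes (an interpretation of the table, not a stated protocol), a dataset of this shape could be assembled and split as in the minimal sketch below; the window length T, array names, and 80/20 stratified split are illustrative assumptions, not the authors’ pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split

T = 100  # assumed window length in samples (not reported here)

# Placeholder arrays standing in for the collected spectral windows:
# 920 windows per gesture x 8 gestures, each window (T, 3) for R/G/B.
X = np.random.rand(920 * 8, T, 3)
y = np.repeat(np.arange(8), 920)  # one integer label per gesture class

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
```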
| Experiment | Channel | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|
| I | RGB three-channel | 99.93% | 99.73% | 99.73% | 99.73% |
| I | Red channel | 96.45% | 89.66% | 85.82% | 83.59% |
| I | Green channel | 95.82% | 88.94% | 83.26% | 81.43% |
| I | Blue channel | 98.07% | 93.15% | 92.28% | 91.98% |
| II | RGB three-channel | 99.89% | 99.57% | 99.57% | 99.57% |
| II | Red channel | 94.16% | 78.53% | 76.63% | 74.42% |
| II | Green channel | 96.56% | 85.94% | 86.25% | 85.32% |
| II | Blue channel | 89.29% | 63.62% | 57.17% | 54.45% |
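The accuracy, precision, recall, and F1-score tabulated above can be computed per experiment from the true and predicted gesture labels, as in the scikit-learn sketch below. Macro-averaging over the eight classes [48,49] is an assumption here, since the tables do not state the averaging scheme used.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision/recall/F1 over gesture classes."""
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": prec,
        "recall": rec,
        "f1": f1,
    }
```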
| System | Equipment | Accuracy | Number of Gestures | Algorithm |
|---|---|---|---|---|
| Zahra et al. [19] | Camera | 93.35% | 6 | Skin detection and genetic algorithm |
| Benitez-Garcia et al. [20] | Camera | 85.10% | 13 | Temporal segment networks (TSN), temporal shift modules (TSM) |
| Luo et al. [21] | Microphone | 93.20% | 7 | Feature extraction and support vector machine (SVM) |
| Cheng et al. [23] | Millimeter wave radar and a thermal imager | 100.00% | 5 | Feature extraction and gated recurrent unit (GRU) |
| Liao et al. [42] | Ambient light sensor | 96.10% | 9 | Feature extraction and k-nearest neighbors (KNN) |
| This work | Narrowband spectral receivers | 99.93% | 8 | RGB multi-channel CNN-LSTM |
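The last row’s RGB multi-channel CNN-LSTM pairs a per-channel 1D-CNN feature extractor (Section 4.1) with an LSTM (Section 4.2). The PyTorch sketch below illustrates that general shape of architecture only; the layer widths, kernel sizes, pooling, and fusion point are assumptions rather than the authors’ reported configuration.

```python
import torch
import torch.nn as nn

class RGBCNNLSTM(nn.Module):
    """Sketch: one 1D-CNN branch per RGB channel, fused features fed to an LSTM."""

    def __init__(self, num_classes: int = 8, conv_dim: int = 32, hidden: int = 64):
        super().__init__()
        # One small 1D-CNN feature extractor per narrowband channel (R, G, B).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(1, conv_dim, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2),
            )
            for _ in range(3)
        ])
        # The LSTM consumes the concatenated per-channel features over time.
        self.lstm = nn.LSTM(3 * conv_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, 3) -- one intensity sequence per RGB receiver.
        feats = [branch(x[:, :, i].unsqueeze(1))  # (batch, conv_dim, time/2)
                 for i, branch in enumerate(self.branches)]
        fused = torch.cat(feats, dim=1).transpose(1, 2)  # (batch, time/2, 3*conv_dim)
        out, _ = self.lstm(fused)
        return self.head(out[:, -1])  # classify from the final time step

model = RGBCNNLSTM()
logits = model(torch.randn(4, 100, 3))  # 4 windows of 100 samples each
print(logits.shape)  # torch.Size([4, 8])
```

Classifying from the final LSTM hidden state is one common choice; pooling over all time steps would be an equally plausible alternative.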