A Lightweight Hand Attitude Estimation Method Based on GCN Feature Enhancement
Abstract
1. Introduction
2. Methods
2.1. Related Theoretical Foundations
2.2. The Proposed Method
2.2.1. Backbone
2.2.2. Heat Map Estimation Based on Decov
2.2.3. Hand Pose Estimation Based on GCN Feature Enhancement
3. Experimental Results and Analysis
3.1. Experimental Setup
3.2. Evaluating Indicators
3.3. Comparative Analysis of Experimental Results
- (1) Visualization experiment results.
- (2) Comparative analysis of experimental results.
- (3) Ablation experiment.
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Zhou, X.; Chen, J.; Yang, Z.; Liu, W. Manipulation Action Recognition Based on Gesture Feature Fusion. Comput. Eng. Appl. 2021, 57, 169–175.
2. Zhang, W.; Lin, Z.; Cheng, J.; Ke, M.; Deng, X.; Wang, H. Survey of Dynamic Hand Gesture Understanding and Interaction. J. Softw. 2021, 32, 3051–3067.
3. Wang, R.; Popovic, J. Real-time hand-tracking with a color glove. ACM Trans. Graph. 2009, 28, 1–8.
4. Xu, C.; Nanjappa, A.; Zhang, X.; Cheng, L. Estimate Hand Poses Efficiently from Single Depth Images. Int. J. Comput. Vis. 2016, 116, 21–45.
5. Guo, X.; Quan, T.; Pan, Y. Position Inferring of Hand Joints Based on Kinect. Comput. Appl. Softw. 2020, 37, 5.
6. Yu, H.; Tang, X.; Liu, J.; Chen, Y.; Huang, C. Robust Single Fingertip Tracking Method Based on Palm Posture Self-adaption. J. Comput.-Aided Des. Comput. Graph. 2013, 25, 1793–1800.
7. Sun, X.; Wei, Y.; Liang, S.; Tang, X.; Sun, J. Cascaded hand pose regression. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 824–832.
8. Tang, D.; Chang, H.; Tejani, A.; Kim, T. Latent Regression Forest: Structured Estimation of 3D Hand Poses. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1374–1387.
9. Li, C.; Zhong, F.; Ma, X.; Qin, X. Real-Time Head Pose Estimation Based on Kalman Filter and Random Regression Forest. J. Comput.-Aided Des. Comput. Graph. 2017, 29, 2309–2316.
10. Santavas, N.; Kansizoglou, I.; Bampis, L.; Karakasis, E.; Gasteratos, A. Attention! A Lightweight 2D Hand Pose Estimation Approach. IEEE Sens. J. 2021, 21, 11488–11496.
11. Qiao, S.; Wang, Y.; Li, J. Real-time human gesture grading based on OpenPose. In Proceedings of the 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, China, 14–16 October 2017; pp. 1–6.
12. Cheng, B.; Xiao, B.; Wang, J.; Shi, H.; Huang, T.; Zhang, L. HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 5385–5394.
13. Zhang, F.; He, T. Action Recognition Combined with Lightweight Openpose and Attention-Guided Graph Convolution. Comput. Eng. Appl. 2022, 58, 8.
14. Papandreou, G.; Zhu, T.; Chen, L.; Gidaris, S.; Tompson, J.; Murphy, K. Personlab: Person pose estimation and instance segmentation with a part-based geometric embedding model. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 282–299.
15. Newell, A.; Yang, K.; Jia, D. Stacked Hourglass Networks for Human Pose Estimation. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; Volume 9912, pp. 483–499.
16. Xiao, B.; Wu, H.; Wei, Y. Simple Baselines for Human Pose Estimation and Tracking. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Volume 11210, pp. 472–487.
17. Doosti, B.; Naha, S.; Mirbagheri, M.; Crandall, D. HOPE-Net: A Graph-Based Model for Hand-Object Pose Estimation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6607–6616.
18. Lin, Y.; Lin, S.; Lin, Z. 3D Hand Pose Estimation Algorithm Based on Cascaded Features and Graph Convolution. Chin. J. Liq. Cryst. Disp. 2022, 37, 736–745.
19. Ma, S.; Zhang, Q.; Li, T.; Song, H. Basic motion behavior recognition of single dairy cow based on improved Rexnet 3D network. Comput. Electron. Agric. 2022, 194, 0168–1699.
20. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
21. Zimmermann, C.; Brox, T. Learning to estimate 3d hand pose from single rgb images. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4903–4911.
22. Zimmermann, C.; Ceylan, D.; Yang, J. FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images. In Proceedings of the Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 813–822.
23. Simon, T.; Joo, H.; Matthews, I.; Sheikh, Y. Hand Keypoint Detection in Single Images Using Multiview Bootstrapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4645–4653.
24. Ge, L.; Liang, H.; Yuan, J.; Thalmann, D. 3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1991–2000.
25. Chen, Z.; Sun, Y. Joint-wise 2D to 3D lifting for hand pose estimation from a single RGB image. Appl. Intell. 2023, 53, 6421–6431.
26. Lin, F.; Wilhelm, C.; Martinez, T. Two-hand global 3D pose estimation using monocular rgb. In Proceedings of the IEEE CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 2373–2381.
27. Guo, S.; Cai, Q.; Qi, L. CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 4896–4907.
28. Lin, K.; Wang, L.; Liu, Z. End-to-end human pose and mesh reconstruction with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 1954–1963.
29. Cho, J.; Kim, Y.; Oh, T. Cross-attention of disentangled modalities for 3D human mesh recovery with transformers. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland, 2022; pp. 342–359.
Method | Dataset | E_Mean | E_Median | AUC |
---|---|---|---|---|
SBL | CMU-Hand | 5.937 | 3.474 | 0.826 |
LearnableGroups-Hand | CMU-Hand | 6.237 | 4.767 | 0.816 |
Hourglass | CMU-Hand | 8.340 | 5.283 | 0.759 |
Lightweight CNN | CMU-Hand | 16.1 | 12.85 | 0.55 |
Our method | CMU-Hand | 6.675 | 4.228 | 0.801 |
Method | Dataset | E_Mean | E_Median | AUC |
---|---|---|---|---|
Chen [25] | RHD | 10.49 | 8.69 | 0.962 |
Lin [26] | RHD | 11.14 | 12.47 | 0.942 |
Guo [27] | RHD | - | 10.58 | 0.965 |
Our method | RHD | 10.21 | 8.34 | 0.970 |
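The two tables above report E_Mean, E_Median and AUC without restating how they are defined. As a rough sketch only, assuming E_Mean and E_Median are the mean and median keypoint endpoint error and AUC is the normalised area under the PCK (percentage of correct keypoints) curve, the three values could be computed as below. The function names, the toy data and the threshold range are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def endpoint_errors(pred, gt):
    """Per-keypoint Euclidean error; pred and gt have shape (N, K, D)."""
    return np.linalg.norm(pred - gt, axis=-1).ravel()

def pck_auc(errors, max_threshold, steps=100):
    """Area under the PCK curve on [0, max_threshold], scaled to [0, 1].

    The threshold range and step count are assumptions for illustration.
    """
    thresholds = np.linspace(0.0, max_threshold, steps)
    # PCK at threshold t: fraction of keypoints whose error is at most t.
    pck = np.array([(errors <= t).mean() for t in thresholds])
    # Trapezoidal integration, written out to avoid version-specific helpers.
    area = np.sum((pck[1:] + pck[:-1]) / 2.0 * np.diff(thresholds))
    return area / max_threshold

# Toy data (hypothetical): 2 samples, 3 keypoints, 2D coordinates.
gt = np.zeros((2, 3, 2))
pred = gt.copy()
pred[0, 0] = [3.0, 4.0]  # this keypoint is off by 5 units
pred[1, 2] = [0.0, 1.0]  # this keypoint is off by 1 unit

errors = endpoint_errors(pred, gt)
e_mean = errors.mean()
e_median = np.median(errors)
auc = pck_auc(errors, max_threshold=10.0)
```

Under these assumptions, lower E_Mean and E_Median and higher AUC indicate better localisation, which matches how the rows in the tables are compared.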
Method | FPS | Params | FLOPs |
---|---|---|---|
METRO [28] | 19.55 | 183.80 M | 41.47 G |
FastMETRO [29] | 21.88 | 133.90 M | 30.56 G |
Our method | 38.62 | 41.80 M | 26.17 G |
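To make the efficiency comparison above easier to read, the following sketch derives relative figures from the table values. The numbers are copied from the table; the dictionary layout and the helper name `relative_to` are illustrative assumptions.

```python
# Values copied from the efficiency table (FPS, parameters in M, FLOPs in G).
rows = {
    "METRO": {"fps": 19.55, "params_m": 183.80, "flops_g": 41.47},
    "FastMETRO": {"fps": 21.88, "params_m": 133.90, "flops_g": 30.56},
    "Our method": {"fps": 38.62, "params_m": 41.80, "flops_g": 26.17},
}

def relative_to(baseline, target="Our method"):
    """Return the target's FPS, parameter and FLOP figures as ratios of the baseline's."""
    b, t = rows[baseline], rows[target]
    return {
        "fps_ratio": t["fps"] / b["fps"],
        "params_ratio": t["params_m"] / b["params_m"],
        "flops_ratio": t["flops_g"] / b["flops_g"],
    }

vs_metro = relative_to("METRO")
vs_fastmetro = relative_to("FastMETRO")
```

Read this way, the proposed method runs at roughly 1.98 times the frame rate of METRO with about 23% of its parameters and about 63% of its FLOPs; against FastMETRO the corresponding figures are roughly 1.77 times, 31% and 86%.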
Method | Backbone | E_Mean | E_Median | AUC |
---|---|---|---|---|
Squeezenet | Squeezenet | 11.883 | 9.029 | 0.673 |
Squeezenet + Decov | Squeezenet | 9.386 | 8.162 | 0.703 |
Squeezenet + Decov + GCN | Squeezenet | 9.012 | 6.847 | 0.721 |
ResNet50 | ResNet | 8.360 | 5.691 | 0.746 |
ResNet50 + Decov | ResNet | 5.937 | 3.474 | 0.826 |
ResNet50 + Decov + GCN | ResNet | 3.378 | 1.621 | 0.857 |
RexNet | RexNet | 8.759 | 5.970 | 0.735 |
RexNet + Decov | RexNet | 7.275 | 4.628 | 0.773 |
RexNet + Decov + GCN (this paper) | RexNet | 6.675 | 4.228 | 0.801 |
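The contribution of each module in the ablation table above can be read off as the change between consecutive rows of the same backbone. The sketch below does that arithmetic for E_Mean and AUC; the values are copied from the table, while the data layout and the helper name `stage_gains` are illustrative assumptions.

```python
# (E_Mean, AUC) per configuration, in order: backbone, + Decov, + Decov + GCN.
ablation = {
    "Squeezenet": [(11.883, 0.673), (9.386, 0.703), (9.012, 0.721)],
    "ResNet50": [(8.360, 0.746), (5.937, 0.826), (3.378, 0.857)],
    "RexNet": [(8.759, 0.735), (7.275, 0.773), (6.675, 0.801)],
}

def stage_gains(rows):
    """Return (E_Mean reduction, AUC increase) for each added module."""
    gains = []
    for (prev_err, prev_auc), (err, auc) in zip(rows, rows[1:]):
        gains.append((round(prev_err - err, 3), round(auc - prev_auc, 3)))
    return gains

# gains[backbone][0] is the effect of adding Decov, gains[backbone][1] of adding GCN.
gains = {backbone: stage_gains(rows) for backbone, rows in ablation.items()}
```

For the RexNet configuration used in this paper, adding Decov lowers E_Mean by 1.484 and raises AUC by 0.038, and adding GCN on top lowers E_Mean by a further 0.600 and raises AUC by 0.028.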
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Rong, D.; Gang, F. A Lightweight Hand Attitude Estimation Method Based on GCN Feature Enhancement. Electronics 2024, 13, 4424. https://doi.org/10.3390/electronics13224424