Stroke Classification in Table Tennis as a Multi-Label Classification Task with Two Labels Per Stroke
Abstract
1. Introduction
- Our proposed method of stroke classification improves the accuracy of table tennis analysis;
- We demonstrate that the three-dimensional joint coordinates of players are appropriate inputs for the classification of game videos;
- The proposed multi-labeling technique improves the accuracy of several classification methods.
2. Related Work
3. Proposed Method
3.1. Dataset
3.2. Network Structure for Classification
3.3. Multi-Labeling
- Forehand: hitting the ball with the face of the racket, where the palm faces the incoming ball, and returning the ball;
- Backhand: hitting the ball with the face of the racket, where the back of the hand faces the incoming ball, and returning the ball.
- Topspin: applying upward rotation to the ball by swinging the racket up from the bottom;
- Push: returning a ball that falls on the table by pushing it forward;
- Block: adjusting the angle of the racket to minimize movement and returning a ball that falls on the table;
- Flick: applying upward rotation to a ball that falls on the table by quickly moving the wrist.
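The six definitions above suggest a simple encoding: each stroke carries one direction label (forehand or backhand) and one technique label (topspin, push, block, or flick). The following is a minimal sketch of such an encoding, assuming these two label groups are combined into a single multi-hot target. The names `split_label`, `join_label`, and `to_multi_hot`, and the index order, are hypothetical and not taken from the paper.

```python
# Illustrative only: names and index order are assumptions, not the authors' code.
DIRECTIONS = ("Forehand", "Backhand")
TECHNIQUES = ("Topspin", "Push", "Block", "Flick")


def split_label(class_name: str) -> tuple[int, int]:
    """Map a combined class such as 'Backhand Push' to (direction_id, technique_id)."""
    direction, technique = class_name.split(" ", 1)
    return DIRECTIONS.index(direction), TECHNIQUES.index(technique)


def join_label(direction_id: int, technique_id: int) -> str:
    """Rebuild the combined class name from the two label indices."""
    return f"{DIRECTIONS[direction_id]} {TECHNIQUES[technique_id]}"


def to_multi_hot(class_name: str) -> list[int]:
    """Encode a stroke as a 6-element vector: 2 direction bits, then 4 technique bits."""
    d, t = split_label(class_name)
    vec = [0] * (len(DIRECTIONS) + len(TECHNIQUES))
    vec[d] = 1
    vec[len(DIRECTIONS) + t] = 1
    return vec
```

With this layout, every stroke has exactly two active bits, one per label group, and a prediction can be mapped back to a combined class by taking the strongest entry in each group.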
4. Experiment
4.1. Evaluation Metrics
4.1.1. Recall and Precision
The terms TP, FP, TN, and FN have the following meanings:
- TP: the number of items classified as class A whose correct class was class A;
- FP: the number of items classified as class A whose correct class was not class A;
- TN: the number of items not classified as class A whose correct class was not class A;
- FN: the number of items not classified as class A whose correct class was class A.
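The four counts above can be computed directly from paired lists of correct and predicted classes. The sketch below does so for one class treated as "class A" and then applies the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN), which are assumed here. The function names and the toy labels are illustrative and are not data from the paper.

```python
def confusion_counts(y_true, y_pred, target):
    """Count TP, FP, TN, FN for one class treated as 'class A'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == target and truth == target:
            tp += 1
        elif pred == target:
            fp += 1
        elif truth == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision_recall(tp, fp, fn):
    """Standard definitions; return 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example (made-up labels, not data from the paper).
y_true = ["Push", "Push", "Block", "Flick", "Push"]
y_pred = ["Push", "Block", "Block", "Push", "Push"]
tp, fp, tn, fn = confusion_counts(y_true, y_pred, "Push")
precision, recall = precision_recall(tp, fp, fn)
```

In the toy example, "Push" is predicted three times and is correct twice, and one true "Push" is missed, so both precision and recall come out to 2/3.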
4.1.2. Accuracy and F1 Score
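As a minimal sketch of the metrics named in this heading, the code below uses the standard definitions: accuracy is the fraction of correctly classified items, F1 is the harmonic mean of precision and recall (equivalently 2TP / (2TP + FP + FN)), macro-F1 is the unweighted mean of per-class F1, and weighted-F1 weights each class by its number of true samples. Averaging only over classes that occur in the correct labels is an assumption of this sketch, and the paper's exact averaging may differ.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, written as 2TP / (2TP + FP + FN)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of items whose predicted class equals the correct class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    return sum(per_class_f1(y_true, y_pred, c) * n for c, n in support.items()) / len(y_true)
```

Macro-F1 treats a rare class such as a flick the same as a frequent one, whereas weighted-F1 is dominated by the frequent classes, so the two can diverge noticeably on an imbalanced dataset.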
4.2. Backbone Models
- TSTCNN [11]
- C3D [22]
- I3D [23]
- C2D [24]
- R(2+1)D [25]
- STGCN [26]
- AGCN [27]
4.3. Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix B
Backbone | Batch Size | Optimizer | Weight Decay | Learning Rate |
---|---|---|---|---|
TSTCNN [11] | 7 | AdamW | 5.60 × 10⁻⁹ | 2.07 × 10⁻⁴ |
C3D [22] | 7 | Adam | 2.39 × 10⁻⁸ | 4.69 × 10⁻⁴ |
I3D [23] | 8 | Adam | 1.68 × 10⁻⁶ | 4.31 × 10⁻³ |
C2D [24] | 6 | SGD | 3.92 × 10⁻⁹ | 1.56 × 10⁻⁵ |
R(2+1)D [25] | 7 | SGD | 1.05 × 10⁻⁹ | 4.57 × 10⁻⁵ |
STGCN [26] | 3 | Adam | 4.96 × 10⁻¹⁰ | 1.14 × 10⁻² |
AGCN [27] | 2 | Adam | 5.21 × 10⁻¹⁰ | 6.93 × 10⁻⁴ |
References
- Blank, P.; Hoßbach, J.; Schuldhaus, D.; Eskofier, B.M. Sensor-Based Stroke Detection and Stroke Type Classification in Table Tennis. In Proceedings of the 2015 ACM International Symposium on Wearable Computers, Osaka, Japan, 7–11 September 2015.
- Liu, R.; Wang, Z.; Shi, X.; Zhao, H.; Qiu, S.; Li, J.; Yang, N. Table Tennis Stroke Recognition Based on Body Sensor Network. In Proceedings of the Internet and Distributed Computing Systems, Naples, Italy, 10–12 October 2019.
- Fu, Z.; Shu, K.I.; Zhang, H. Ping Pong Motion Recognition based on Smart Watch. In Proceedings of the 3rd International Conference on Mechatronics Engineering and Information Technology, Dalian, China, 29–30 March 2019.
- Kulkarni, K.M.; Shenoy, S. Table Tennis Stroke Recognition using Two-Dimensional Human Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Nashville, TN, USA, 19–25 June 2021.
- Yenduri, S.; Chalavadi, V. Adaptive Temporal Aggregation for Table Tennis Shot Recognition. Neurocomputing 2024, 584, 127567.
- Tian, J.; Xiao, Y. Research on the Difference of Stroke Characteristics and Stroke Effect between Different Stroke Duration of Table Tennis Players. Sci. Rep. 2024, 14, 25405.
- Shi, Z.; Jia, Y.; Shi, G.; Zhang, K.; Ji, L.; Wang, D.; Wu, Y. Design of Motor Skill Recognition and Hierarchical Evaluation System for Table Tennis Players. IEEE Sens. J. 2024, 24, 5303–5315.
- Duan, K. Biomechanical analysis of pace adjustment in table tennis players combined with image recognition technology. Mol. Cell. Biomech. 2025, 22, 977.
- Bańkosz, Z.; Winiarski, S.; Lanzoni, I.M. Kinematic Analysis of Short and Long Services in Table Tennis. Appl. Sci. 2025, 15, 470.
- Martin, P.-E.; Benois-Pineau, J.; Péteri, R.; Morlier, J. Fine Grained Sport Action Recognition with Twin Spatio Temporal Convolutional Neural Networks. Multimed. Tools Appl. 2021, 70, 4571–4579.
- Martin, P.-E.; Benois-Pineau, J.; Péteri, R.; Morlier, J. 3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes using a Twin Spatio-Temporal Convolutional Neural Networks. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021.
- Hacker, L.; Bartels, F.; Martin, P.-E. Fine Grained Action Detection with RGB and Pose Information using Two Stream Convolutional Networks. In Proceedings of the MediaEval’22, Bergen, Norway, 12–13 January 2023.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019.
- Xie, C.; Fujiwara, M.; Shishido, H.; Kitahara, I. Table Tennis Stroke Recognition Based on Player Motions. In Proceedings of the IEEE Global Conference on Consumer Electronics, Kitakyushu, Japan, 29 October–1 November 2024.
- Song, H.; Li, Y.; Fu, C.; Xue, F.; Zhao, Q.; Zheng, X.; Jiang, K.; Liu, T. Using Complex Networks and Multiple Artificial Intelligence Algorithms for Table Tennis Match Action Recognition and Technical-tactical Analysis. Chaos Solitons Fractals 2024, 178, 114343.
- Bian, J.; Li, X.; Wang, T.; Wang, Q.; Huang, J.; Liu, C.; Zhao, J.; Lu, F.; Dou, D.; Xiong, H. P2ANet: A Large-Scale Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos. ACM Trans. Multimed. Comput. Commun. Appl. 2024, 20, 1–23.
- Voeikov, R.; Falaleev, N.; Baikulov, R. TTNet: Real-Time Temporal and Spatial Video Analysis of Table Tennis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020.
- Martinez, J.; Hossain, R.; Romero, J.; Little, J.J. A Simple Yet Effective Baseline for 3D Human Pose Estimation. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Ionescu, C.; Papava, D.; Olaru, V.; Sminchisescu, C. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 1325–1339.
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19), Anchorage, AK, USA, 4–8 August 2019.
- OpenMMLab’s Next Generation Video Understanding Toolbox and Benchmark. Available online: https://github.com/open-mmlab/mmaction2 (accessed on 12 November 2024).
- Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning Spatiotemporal Features with 3D Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015.
- Carreira, J.; Zisserman, A. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Wang, X.; Girshick, R.; Gupta, A.; He, K. Non-Local Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Tran, D.; Wang, H.; Torresani, L.; Ray, J.; LeCun, Y.; Paluri, M. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Yan, S.; Xiong, Y.; Lin, D. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
No. | Class Name | Amount of Data |
---|---|---|
1 | Forehand Topspin | 155 |
2 | Backhand Topspin | 171 |
3 | Forehand Push | 65 |
4 | Backhand Push | 141 |
5 | Forehand Block | 87 |
6 | Backhand Block | 148 |
7 | Forehand Flick | 10 |
8 | Backhand Flick | 15 |
9 | Forehand Lob | 0 |
10 | Backhand Lob | 0 |
Total | | 792 |
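Read under the two-label scheme in the title, each direction label and each technique label pools the samples of several combined classes. Forehand Flick, for example, has only 10 samples, while the Flick label alone is backed by 25 and the Forehand label by 317. The sketch below derives these per-label totals from the table. Whether this pooling explains any accuracy difference is an interpretation offered here, not a claim taken from the paper.

```python
from collections import Counter

# Class counts copied from the table above (the Lob classes have no samples).
CLASS_COUNTS = {
    "Forehand Topspin": 155,
    "Backhand Topspin": 171,
    "Forehand Push": 65,
    "Backhand Push": 141,
    "Forehand Block": 87,
    "Backhand Block": 148,
    "Forehand Flick": 10,
    "Backhand Flick": 15,
    "Forehand Lob": 0,
    "Backhand Lob": 0,
}

direction_counts = Counter()
technique_counts = Counter()
for name, n in CLASS_COUNTS.items():
    direction, technique = name.split(" ", 1)
    direction_counts[direction] += n
    technique_counts[technique] += n

total = sum(CLASS_COUNTS.values())
print(total)                   # 792
print(dict(direction_counts))  # {'Forehand': 317, 'Backhand': 475}
print(dict(technique_counts))  # {'Topspin': 326, 'Push': 206, 'Block': 235, 'Flick': 25, 'Lob': 0}
```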
Dataset | Use | Amount of Data | Number of Players |
---|---|---|---|
Data A | train/validation | 626 | 16 |
Data B | test | 166 | 8 |
Hardware Environment | |
---|---|
CPU | Intel(R) Core(TM) i7-13700KF |
GPU | NVIDIA GeForce RTX 3080 (NVIDIA, Santa Clara, CA, USA) |
RAM | 64 GB |
GPU Memory | 10 GB |
Hyperparameter | Search Range |
---|---|
Batch Size | 1 to 8 |
Optimizer | SGD, Adam, AdamW |
Weight Decay | 1 × 10⁻³ to 1 × 10⁻¹⁰ |
Learning Rate | 1 × 10⁻¹ to 1 × 10⁻⁵ |
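The search space above can be sampled as in the following minimal sketch. It uses plain random search from the Python standard library rather than Optuna, which appears in the reference list. Sampling weight decay and learning rate on a logarithmic scale is an assumption, as are the names `log_uniform` and `sample_config`; the paper's actual search procedure may differ.

```python
import math
import random

# Search space copied from the table above.
BATCH_SIZES = range(1, 9)  # 1 to 8 inclusive
OPTIMIZERS = ("SGD", "Adam", "AdamW")
WEIGHT_DECAY_RANGE = (1e-10, 1e-3)
LEARNING_RATE_RANGE = (1e-5, 1e-1)


def log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Sample so that every order of magnitude between low and high is equally likely."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


def sample_config(rng: random.Random) -> dict:
    """Draw one hyperparameter configuration from the search space."""
    return {
        "batch_size": rng.choice(BATCH_SIZES),
        "optimizer": rng.choice(OPTIMIZERS),
        "weight_decay": log_uniform(rng, *WEIGHT_DECAY_RANGE),
        "learning_rate": log_uniform(rng, *LEARNING_RATE_RANGE),
    }


rng = random.Random(0)  # fixed seed so the draws are reproducible
candidates = [sample_config(rng) for _ in range(5)]
```

Each candidate would then be used to train a backbone and scored on the validation split; a log scale is a common choice when a range spans several orders of magnitude, as both continuous ranges do here.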
Backbone | Multi-Labeling | Accuracy (%) | Precision | Recall | Macro-F1 | Weighted-F1 |
---|---|---|---|---|---|---|
TSTCNN [11] | no | 69.0 | 0.499 | 0.506 | 0.496 | 0.354 |
TSTCNN [11] | yes | 70.6 | 0.715 | 0.666 | 0.679 | 0.855 |
C3D [22] | no | 54.8 | 0.433 | 0.537 | 0.440 | 0.350 |
C3D [22] | yes | 62.2 | 0.506 | 0.572 | 0.524 | 0.463 |
I3D [23] | no | 55.3 | 0.421 | 0.513 | 0.424 | 0.196 |
I3D [23] | yes | 59.0 | 0.520 | 0.570 | 0.519 | 0.369 |
C2D [24] | no | 64.9 | 0.670 | 0.605 | 0.625 | 0.846 |
C2D [24] | yes | 66.0 | 0.631 | 0.624 | 0.568 | 0.416 |
R(2+1)D [25] | no | 55.9 | 0.460 | 0.540 | 0.461 | 0.238 |
R(2+1)D [25] | yes | 59.0 | 0.476 | 0.560 | 0.478 | 0.240 |
Backbone | Multi-Labeling | Accuracy (%) | Precision | Recall | Macro-F1 | Weighted-F1 |
---|---|---|---|---|---|---|
TSTCNN [11] | no | 53.0 | 0.386 | 0.348 | 0.347 | 0.145 |
TSTCNN [11] | yes | 63.9 | 0.444 | 0.411 | 0.410 | 0.131 |
C3D [22] | no | 16.9 | 0.134 | 0.111 | 0.102 | 0.028 |
C3D [22] | yes | 22.9 | 0.210 | 0.184 | 0.186 | 0.100 |
I3D [23] | no | 36.7 | 0.195 | 0.233 | 0.195 | 0.107 |
I3D [23] | yes | 18.7 | 0.168 | 0.172 | 0.162 | 0.086 |
C2D [24] | no | 39.8 | 0.221 | 0.199 | 0.197 | 0.044 |
C2D [24] | yes | 32.5 | 0.289 | 0.322 | 0.265 | 0.389 |
R(2+1)D [25] | no | 28.3 | 0.200 | 0.187 | 0.174 | 0.077 |
R(2+1)D [25] | yes | 28.9 | 0.199 | 0.191 | 0.185 | 0.091 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fujihara, Y.; Shimada, T.; Kong, X.; Tanaka, A.; Nishikawa, H.; Tomiyama, H. Stroke Classification in Table Tennis as a Multi-Label Classification Task with Two Labels Per Stroke. Sensors 2025, 25, 834. https://doi.org/10.3390/s25030834