A Spatio-Temporal Graph Convolutional Network Model for Internet of Medical Things (IoMT)
Abstract
1. Introduction
- In the context of IoMT, we introduce an efficient framework for extracting spatial and temporal features for HAR, together with a framework for applying those features.
- We propose a novel architecture, STGCN, that extracts spatial and temporal features independently. Owing to its reduced parameter count and efficient feature extraction scheme, the model requires only joint-level information to extract both spatial and temporal features.
- Finally, we provide a strong framework for skeleton-based HAR. Through extensive experimentation and analysis, we demonstrate that our model achieves accuracy competitive with state-of-the-art models. The baselines we establish should be useful for future research on skeleton-based HAR and vision-based patient monitoring.
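To make the second contribution concrete, the following is a minimal numpy sketch of one spatio-temporal graph convolutional block in the common ST-GCN formulation: a symmetrically normalized adjacency matrix aggregates neighboring joints within each frame (spatial step), then a 1D convolution runs over the frame axis (temporal step). All names, shapes, and the toy 5-joint skeleton are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize A + I: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def st_gcn_block(X, A, W_spatial, w_temporal):
    """One illustrative ST-GCN block.

    X: (T, V, C) array of frames x joints x channels.
    Spatial step: graph convolution over joints within each frame.
    Temporal step: 1D convolution over frames, per joint and channel.
    """
    A_norm = normalize_adjacency(A)           # (V, V)
    S = np.einsum('uv,tvc->tuc', A_norm, X)   # aggregate neighboring joints
    S = np.maximum(S @ W_spatial, 0.0)        # channel mixing + ReLU
    # 'same'-length temporal convolution via zero padding on the frame axis
    k = len(w_temporal)
    pad = k // 2
    S_pad = np.pad(S, ((pad, pad), (0, 0), (0, 0)))
    return sum(w_temporal[i] * S_pad[i:i + S.shape[0]] for i in range(k))

# Toy example: 5-joint chain skeleton, 8 frames, 3 input channels (x, y, z)
V, T, C_in, C_out = 5, 8, 3, 4
A = np.zeros((V, V))
for i in range(V - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0           # bones of the chain skeleton
rng = np.random.default_rng(0)
X = rng.standard_normal((T, V, C_in))
W = rng.standard_normal((C_in, C_out))
w_t = np.array([0.25, 0.5, 0.25])             # 3-tap temporal kernel
Y = st_gcn_block(X, A, W, w_t)
print(Y.shape)                                # (8, 5, 4)
```

Note how the block consumes only joint coordinates (joint-level information), which is what keeps the parameter count small relative to methods that also precompute bone or motion streams.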
2. Related Work
2.1. Human Action Recognition
2.1.1. Hand-Crafted Feature-Based Methods
2.1.2. DL-Based Methods
2.2. Skeleton-Based Action Recognition Methods
2.2.1. RNN-Based Methods
2.2.2. CNN-Based Methods
2.2.3. GCN-Based Methods
2.3. Vision-Based Methods for Healthcare Services
3. Proposed STGCN
3.1. Skeleton Graph Construction
3.2. Graph Convolution
3.3. Implementation of GCN
3.4. Spatio-Temporal Graph Convolutional Block
3.5. Spatio-Temporal Graph Convolutional Layers
4. Experimental Setup
4.1. Datasets
NTU RGB+D Dataset
4.2. Training Details
5. Results and Discussion
5.1. Visualization of Feature Selection
5.2. Ablation Study
5.3. Performance Analysis with Different Input Features
5.4. Comparison with the State-of-the-Art Models
5.5. Performance Evaluation for Patient Monitoring System
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Papaioannou, M.; Karageorgou, M.; Mantas, G.; Sucasas, V.; Essop, I.; Rodriguez, J.; Lymberopoulos, D. A survey on security threats and countermeasures in internet of medical things (IoMT). Trans. Emerg. Telecommun. Technol. 2022, 33, e4049.
2. Su, Y.S.; Ding, T.J.; Chen, M.Y. Deep learning methods in internet of medical things for valvular heart disease screening system. IEEE Internet Things J. 2021, 8, 16921–16932.
3. Yang, L.; Yu, K.; Yang, S.X.; Chakraborty, C.; Lu, Y.; Guo, T. An intelligent trust cloud management method for secure clustering in 5G enabled internet of medical things. IEEE Trans. Ind. Inform. 2021, 18, 8864–8875.
4. Swarna Priya, R.M.; Maddikunta, P.K.R.; Parimala, M.; Koppu, S.; Gadekallu, T.R.; Chowdhary, C.L.; Alazab, M. An effective feature engineering for DNN using hybrid PCA-GWO for intrusion detection in IoMT architecture. Comput. Commun. 2020, 160, 139–149.
5. Xiong, H.; Jin, C.; Alazab, M.; Yeh, K.H.; Wang, H.; Gadekallu, T.R.; Wang, W.; Su, C. On the Design of Blockchain-Based ECDSA With Fault-Tolerant Batch Verification Protocol for Blockchain-Enabled IoMT. IEEE J. Biomed. Health Inform. 2022, 26, 1977–1986.
6. Silva de Lima, A.L.; Evers, L.J.W.; Hahn, T.; Bataille, L.; Hamilton, J.L.; Little, M.A.; Okuma, Y.; Bloem, B.R.; Faber, M.J. Freezing of gait and fall detection in Parkinson’s disease using wearable sensors: A systematic review. J. Neurol. 2017, 264, 1642–1654.
7. Oskouei, R.J.; MousaviLou, Z.; Bakhtiari, Z.; Jalbani, K.B. IoT-Based Healthcare Support System for Alzheimer’s Patients. Wirel. Commun. Mob. Comput. 2020, 2020, 8822598.
8. Blunda, L.L.; Gutiérrez-Madroñal, L.; Wagner, M.F.; Medina-Bulo, I. A Wearable Fall Detection System Based on Body Area Networks. IEEE Access 2020, 8, 193060–193074.
9. Feng, M.; Meunier, J. Skeleton Graph-Neural-Network-Based Human Action Recognition: A Survey. Sensors 2022, 22, 2091.
10. Yin, J.; Han, J.; Wang, C.; Zhang, B.; Zeng, X. A Skeleton-based Action Recognition System for Medical Condition Detection. In Proceedings of the IEEE Biomedical Circuits and Systems Conference (BioCAS), Nara, Japan, 17–19 October 2019; pp. 1–4.
11. Jiao, L.; Chen, J.; Liu, F.; Yang, S.; You, C.; Liu, X.; Li, L.; Hou, B. Graph Representation Learning Meets Computer Vision: A Survey. IEEE Trans. Artif. Intell. 2022, 1–22.
12. Li, B.; Dai, Y.; Cheng, X.; Chen, H.; Lin, Y.; He, M. Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. In Proceedings of the IEEE International Conference on Multimedia Expo Workshops (ICMEW), Hong Kong, China, 10–14 July 2017; pp. 601–604.
13. Shahroudy, A.; Liu, J.; Ng, T.T.; Wang, G. NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1010–1019.
14. Zhang, P.; Lan, C.; Xing, J.; Zeng, W.; Xue, J.; Zheng, N. View Adaptive Neural Networks for High Performance Skeleton-Based Human Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1963–1978.
15. Nguyen, T.T.; Pham, D.T.; Vu, H.; Le, T.L. A robust and efficient method for skeleton-based human action recognition and its application for cross-dataset evaluation. IET Comput. Vis. 2018, 16, 709–726.
16. Liu, M.; Liu, H.; Chen, C. Enhanced skeleton visualization for view invariant human action recognition. Pattern Recognit. 2017, 68, 346–362.
17. Monti, F.; Boscaini, D.; Masci, J.; Rodola, E.; Svoboda, J.; Bronstein, M.M. Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5115–5124.
18. Chen, H.; Li, M.; Jing, L.; Cheng, Z. Lightweight Long and Short-Range Spatial-Temporal Graph Convolutional Network for Skeleton-Based Action Recognition. IEEE Access 2021, 9, 161374–161382.
19. Yan, S.; Xiong, Y.; Lin, D. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32, pp. 7444–7452.
20. Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 12026–12035.
21. Bobick, A.F.; Davis, J.W. The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 257–267.
22. Klaser, A.; Marszałek, M.; Schmid, C. A spatio-temporal descriptor based on 3d-gradients. In Proceedings of the BMVC 2008-19th British Machine Vision Conference, Leeds, UK, 1–4 September 2008; pp. 275–285.
23. Somasundaram, G.; Cherian, A.; Morellas, V.; Papanikolopoulos, N. Action recognition using global spatio-temporal features derived from sparse representations. Comput. Vis. Image Underst. 2014, 123, 1–13.
24. Carreira, J.; Zisserman, A. Quo vadis, action recognition? A new model and the Kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6299–6308.
25. Simonyan, K.; Zisserman, A. Two-Stream Convolutional Networks for Action Recognition in Videos. In Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 1, NIPS’14, Montreal, QC, Canada, 8–13 December 2014; pp. 568–576.
26. Vemulapalli, R.; Arrate, F.; Chellappa, R. Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 588–595.
27. Zhang, P.; Lan, C.; Xing, J.; Zeng, W.; Xue, J.; Zheng, N. View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition From Skeleton Data. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2136–2145.
28. Li, S.; Li, W.; Cook, C.; Zhu, C.; Gao, Y. Independently recurrent neural network (IndRNN): Building a longer and deeper RNN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5457–5466.
29. Wang, H.; Wang, L. Modeling temporal dynamics and spatial configurations of actions using two-stream recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 499–508.
30. Tu, J.; Liu, M.; Liu, H. Skeleton-Based Human Action Recognition Using Spatial Temporal 3D Convolutional Neural Networks. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), San Diego, CA, USA, 23–27 July 2018; pp. 1–6.
31. Ke, Q.; Bennamoun, M.; An, S.; Sohel, F.; Boussaid, F. A new representation of skeleton sequences for 3D action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3288–3297.
32. Niepert, M.; Ahmed, M.; Kutzkov, K. Learning convolutional neural networks for graphs. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 2014–2023.
33. Hbali, Y.; Hbali, S.; Ballihi, L.; Sadgal, M. Skeleton-based human activity recognition for elderly monitoring systems. IET Comput. Vis. 2018, 12, 16–26.
34. Gul, M.A.; Yousaf, M.H.; Nawaz, S.; Ur Rehman, Z.; Kim, H. Patient Monitoring by Abnormal Human Activity Recognition Based on CNN Architecture. Electronics 2020, 9, 1993.
35. Gao, Y.; Xiang, X.; Xiong, N.; Huang, B.; Lee, H.J.; Alrifai, R.; Jiang, X.; Fang, Z. Human Action Monitoring for Healthcare Based on Deep Learning. IEEE Access 2018, 6, 52277–52285.
36. Liu, J.; Shahroudy, A.; Xu, D.; Wang, G. Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Cham, Switzerland, 2016; pp. 816–833.
37. Song, S.; Lan, C.; Xing, J.; Zeng, W.; Liu, J. An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 4263–4270.
38. Zheng, W.; Li, L.; Zhang, Z.; Huang, Y.; Wang, L. Relational Network for Skeleton-Based Action Recognition. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, 8–12 July 2019; pp. 826–831.
39. Li, C.; Zhong, Q.; Xie, D.; Pu, S. Skeleton-based action recognition with convolutional neural networks. In Proceedings of the IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Hong Kong, China, 10–14 July 2017; pp. 597–600.
40. Huang, L.; Huang, Y.; Ouyang, W.; Wang, L. Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11045–11052.
41. Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Skeleton-Based Action Recognition With Directed Graph Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7904–7913.
| Methods | Accuracy (%) |
|---|---|
| STGCN (w/o A) | 91.8 |
| STGCN (w/o B) | 91.6 |
| STGCN (w/o C) | 91.8 |
| STGCN | 92.2 |
| Input Features | X-Sub (%) | X-View (%) |
|---|---|---|
| Bone data | 83.8 | 91.4 |
| Joint data | 84.5 | 92.2 |
| Methods | X-Sub (%) | X-View (%) |
|---|---|---|
| Lie Group [26] | 50.1 | 82.8 |
| Deep LSTM [13] | 60.7 | 67.3 |
| ST-LSTM [36] | 69.2 | 77.7 |
| STA-LSTM [37] | 73.4 | 81.2 |
| VA-LSTM [27] | 79.2 | 87.7 |
| ARRN-LSTM [38] | 81.8 | 89.6 |
| Ind-RNN [28] | 81.8 | 88.0 |
| Two-Stream 3DCNN [30] | 66.8 | 72.6 |
| TCN [15] | 74.3 | 83.1 |
| Clips + CNN + MTLN [31] | 79.6 | 84.8 |
| Synthesized CNN [16] | 80.0 | 87.2 |
| CNN + Motion + Trans [39] | 83.2 | 89.3 |
| STGCN (ours) | 84.5 | 92.2 |
| Methods | X-Sub (%) | X-View (%) | Parameters (M) | Complexity (GFLOPs) |
|---|---|---|---|---|
| ST-GCN [19] | 81.5 | 88.3 | 3.1 | 16.3 |
| 2s-AGCN [20] | 88.5 | 95.1 | 6.9 | 37.4 |
| PL-GCN [40] | 89.2 | 95.0 | 20.7 | - |
| DGNN [41] | 89.9 | 96.1 | 26.24 | - |
| STGCN (ours) | 84.5 | 92.2 | 3.6 | 20.9 |
| Action | Precision | Recall | F1-Score | Accuracy (%) |
|---|---|---|---|---|
| Sneeze/cough | 0.89 | 0.88 | 0.88 | 89.35 |
| Staggering | 0.91 | 0.95 | 0.93 | 91.44 |
| Falling | 0.98 | 0.99 | 0.99 | 97.82 |
| Headache | 0.88 | 0.93 | 0.90 | 87.99 |
| Stomachache/heart pain | 0.84 | 0.90 | 0.87 | 84.27 |
| Backache | 0.93 | 0.90 | 0.92 | 93.14 |
| Neck ache | 0.94 | 0.91 | 0.92 | 93.77 |
| Nausea/vomiting | 0.88 | 0.90 | 0.89 | 88.44 |
| Feeling warm | 0.93 | 0.95 | 0.94 | 93.46 |
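The per-action F1-scores above should be the harmonic mean of precision and recall. As a quick consistency check (a sketch using three rows copied from the table; the helper name is ours, not from the paper):

```python
# Verify F1 = 2PR / (P + R) against a few reported rows of the table.
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

rows = {
    "Sneeze/cough": (0.89, 0.88, 0.88),
    "Falling": (0.98, 0.99, 0.99),
    "Stomachache/heart pain": (0.84, 0.90, 0.87),
}
for action, (p, r, f1_reported) in rows.items():
    f1_calc = f1_score(p, r)
    # reported values are rounded to two decimals, so allow that tolerance
    assert abs(f1_calc - f1_reported) < 0.01
    print(f"{action}: reported {f1_reported:.2f}, computed {f1_calc:.4f}")
```

The reported values agree with the recomputed ones to within two-decimal rounding, which supports the internal consistency of the table.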
| Action | Precision | Recall | F1-Score | Accuracy (%) |
|---|---|---|---|---|
| Sneeze/cough | 0.82 | 0.63 | 0.71 | 81.60 |
| Staggering | 0.86 | 0.97 | 0.91 | 86.17 |
| Falling | 0.94 | 0.95 | 0.94 | 93.55 |
| Headache | 0.67 | 0.76 | 0.71 | 66.88 |
| Stomachache/heart pain | 0.83 | 0.87 | 0.85 | 82.53 |
| Backache | 0.79 | 0.87 | 0.83 | 78.95 |
| Neck ache | 0.85 | 0.78 | 0.81 | 84.58 |
| Nausea/vomiting | 0.89 | 0.87 | 0.88 | 88.81 |
| Feeling warm | 0.82 | 0.89 | 0.86 | 82.49 |
| Action | RC VA-LSTM Accuracy (%) | STGCN Accuracy (%) |
|---|---|---|
| Sneeze/cough | 86.30 | 89.35 |
| Staggering | 96.00 | 91.44 |
| Falling | 98.70 | 97.82 |
| Headache | 81.30 | 87.99 |
| Stomachache/heart pain | 85.30 | 84.27 |
| Backache | 86.10 | 93.14 |
| Neck ache | 84.30 | 93.77 |
| Nausea/vomiting | 90.40 | 88.44 |
| Feeling warm | 91.40 | 93.46 |
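To summarize the per-action comparison above, one can macro-average the accuracies over the nine monitored actions. The figures below are computed directly from the table; the macro averages themselves are our arithmetic, not numbers reported in the source.

```python
# Macro-average accuracy over the nine actions in the comparison table.
rc_va_lstm = [86.30, 96.00, 98.70, 81.30, 85.30, 86.10, 84.30, 90.40, 91.40]
stgcn      = [89.35, 91.44, 97.82, 87.99, 84.27, 93.14, 93.77, 88.44, 93.46]

mean_rc = sum(rc_va_lstm) / len(rc_va_lstm)
mean_st = sum(stgcn) / len(stgcn)
print(f"RC VA-LSTM macro accuracy: {mean_rc:.2f}%")  # 88.87%
print(f"STGCN macro accuracy:      {mean_st:.2f}%")  # 91.08%
```

On this macro-average view, STGCN leads by roughly two percentage points, driven mainly by the headache, backache, and neck ache classes, while RC VA-LSTM remains stronger on staggering and falling.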
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ghosh, D.K.; Chakrabarty, A.; Moon, H.; Piran, M.J. A Spatio-Temporal Graph Convolutional Network Model for Internet of Medical Things (IoMT). Sensors 2022, 22, 8438. https://doi.org/10.3390/s22218438