Improved Long Short-Term Memory Network with Multi-Attention for Human Action Flow Evaluation in Workshop
Abstract
Featured Application
1. Introduction
- Image-based recognition methods
- Human skeleton-based recognition methods
2. Methods
2.1. Manufacture Action Recognition Based on WA-LSTM
2.1.1. Encoding of Worker Skeleton Sequence
2.1.2. Feature Extraction of Workpiece and Fusion Method
2.1.3. Overall Workflow of WA-LSTM
2.2. Manufacturing Process Evaluation Based on KAA-LSTM
2.2.1. Encoding of Action Sequences
2.2.2. Key Action Attentional Mechanisms
3. Results
3.1. Case Description and Dataset
- (a) Polish the surface, shaping it;
- (b) Blow on the surface to cool it;
- (c) Tap the internal thread to ensure that the thread form meets the requirements;
- (d) Clean with a brush to remove surface debris;
- (e) Move the propellant to the next working procedure.
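For the later evaluation stages, this five-step flow becomes a discrete action sequence. A minimal sketch of one common encoding for such sequences (one-hot vectors); the short action names are hypothetical labels for steps (a)–(e), not identifiers from the paper:

```python
# The five workshop actions (a)-(e), with hypothetical short names.
ACTIONS = ["polish", "blow_cool", "tap_thread", "brush_clean", "move_next"]
INDEX = {name: i for i, name in enumerate(ACTIONS)}

def one_hot(name):
    """Encode one action as a one-hot vector over the action vocabulary."""
    v = [0] * len(ACTIONS)
    v[INDEX[name]] = 1
    return v

# An observed action flow becomes a sequence of one-hot vectors.
flow = ["polish", "blow_cool", "tap_thread", "brush_clean", "move_next"]
encoded = [one_hot(a) for a in flow]
print(encoded[0])  # [1, 0, 0, 0, 0]
```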
3.2. Experiment and Result for WA-LSTM
- DNN: The worker skeleton features and the workpiece features are directly input into a fully connected (FC) network to obtain the classification results, without considering the temporal information.
- LSTM: Only the skeleton features of the worker and the temporal information are considered; the workpiece information is ignored.
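The contrast between the two baselines can be sketched as follows. All shapes, weights, and the hand-rolled LSTM cell are illustrative stand-ins under assumed dimensions, not the paper's actual configuration:

```python
import numpy as np

# Illustrative shapes only (not from the paper): 30 frames, 75 skeleton
# dims (25 joints x 3 coords), 16 workpiece dims, 5 action classes.
T, D_SKEL, D_PIECE, C, H = 30, 75, 16, 5, 32
rng = np.random.default_rng(0)
skeleton_seq = rng.normal(size=(T, D_SKEL))    # worker skeleton per frame
workpiece_feat = rng.normal(size=(D_PIECE,))   # workpiece features

# DNN baseline: flatten the whole sequence, concatenate the workpiece
# features, and apply a single FC layer -- temporal order is discarded.
x = np.concatenate([skeleton_seq.ravel(), workpiece_feat])
W_fc = rng.normal(size=(C, x.size)) * 0.01
logits_dnn = W_fc @ x                          # (C,)

# LSTM baseline: consume the skeleton frames in order; workpiece ignored.
# A hand-rolled LSTM cell stands in for a framework layer.
Wx = rng.normal(size=(4 * H, D_SKEL)) * 0.1
Wh = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

h, c = np.zeros(H), np.zeros(H)
for x_t in skeleton_seq:                       # order matters here
    i, f, g, o = np.split(Wx @ x_t + Wh @ h + b, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)

W_out = rng.normal(size=(C, H)) * 0.1
logits_lstm = W_out @ h                        # classify from the last state
```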
3.3. Experiment and Result for KAA-LSTM
- a simple DNN model;
- an LSTM model without attentional mechanisms.
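What separates KAA-LSTM from the plain LSTM baseline is an attentional weighting over the per-step states. As a hedged sketch of the general idea (generic additive-style attention pooling with illustrative shapes and random stand-in values; the paper's key-action attention mechanism is more specific):

```python
import numpy as np

rng = np.random.default_rng(1)
T, H, C = 20, 32, 2                 # steps, hidden size, classes (illustrative)
hidden = rng.normal(size=(T, H))    # stand-in for per-step LSTM hidden states

# Score each step, softmax-normalize, and pool: steps judged "key" get
# larger weights, so they dominate the pooled representation.
w_att = rng.normal(size=(H,)) * 0.1
scores = hidden @ w_att                     # (T,) one score per step
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                        # attention weights, sum to 1
context = alpha @ hidden                    # (H,) weighted summary

W_out = rng.normal(size=(C, H)) * 0.1
logits = W_out @ context                    # classify from the pooled state
```

Compared with taking only the last hidden state, the pooled `context` lets the classifier focus on the key steps regardless of where they occur in the sequence.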
4. Discussion
4.1. Discussion on the Results of WA-LSTM
4.2. Discussion on the Results of KAA-LSTM
4.3. Prospect
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Evaluation results of the manufacturing process evaluation models (Section 3.3):

| Index | DNN | LSTM | KAA-LSTM |
|---|---|---|---|
| Cross Entropy | 0.1991 | 0.1843 | 0.1274 |
| Accuracy | 0.9478 | 0.9775 | 0.9936 |
| Recall | 0.8832 | 0.9023 | 0.9525 |
| Precision | 0.9227 | 0.9343 | 0.9710 |
| F1-score | 0.9025 | 0.9180 | 0.9617 |
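As a consistency check, the F1-score row follows from the precision and recall rows via F1 = 2PR/(P + R):

```python
# Recompute F1 = 2PR / (P + R) from the precision and recall rows above.
rows = {
    "DNN":      (0.9227, 0.8832),
    "LSTM":     (0.9343, 0.9023),
    "KAA-LSTM": (0.9710, 0.9525),
}
for model, (p, r) in rows.items():
    f1 = 2 * p * r / (p + r)
    print(f"{model}: F1 = {f1:.4f}")
# Matches the table: 0.9025, 0.9180, 0.9617
```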
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, Y.; Wang, J.; Liu, T.; Lv, X.; Bao, J. Improved Long Short-Term Memory Network with Multi-Attention for Human Action Flow Evaluation in Workshop. Appl. Sci. 2020, 10, 7856. https://doi.org/10.3390/app10217856