Improved Long Short-Term Memory Network with Multi-Attention for Human Action Flow Evaluation in Workshop
Round 1
Reviewer 1 Report
The paper presented a deep learning-based model for action detection and process evaluation in a workspace area. In particular, such a proposal is a fundamental component in identifying the relationship between the working process and product quality.
The authors designed a Workpiece Attention-based LSTM (WA-LSTM) network for action detection and a Key Action Attention-based LSTM (KAA-LSTM) network for action flow evaluation.
The methodology was tested on a self-created dataset, obtaining interesting results in recognizing a complex work process composed of 5 different sub-processes.
Finally, the authors compared the defined methodology with existing state-of-the-art Deep Neural Networks (DNN) and achieved an overall 99% accuracy.
The strong points of the methodology:
- The studied context is very interesting.
- Utilization of the OpenPose library.
- Brand-new testing dataset.
- Achieved results.
However, the paper presents some major weak points.
- The writing quality is low, presenting many errors and easily identifiable typos.
- Figures are not well explained and not clear, for example, Figure 3.
- Authors refer to figures but do not insert the reference.
- The methodology (Section II) is challenging to understand.
- Authors talk about a feature extraction phase, but it is not clear if it is an actual feature extraction phase or refers to the implicit feature extraction that DNN performs.
- In the Results section, the authors do not present the characteristics of the used testing datasets (e.g., length of the processes, the difference between standard and non-standard procedures, etc.)
- The proposed methodology must be tested on other types of video-based data available online to validate the achieved results.
Author Response
1. Thank you for your careful review. We have comprehensively checked and corrected the errors in the paper, including lines 67-69, 73, etc.
2. Thank you for your valuable advice. We have redrawn Figure 3 and added more explanation under Figure 2 (lines 89-92) and Figure 3 (lines 117-121).
3. Thank you very much for pointing out the mistake. This was an oversight in our writing process. We have inserted all the missing figure references.
4. Thank you for the comments. The statements in Section 2 have been partially adjusted to make them more concise and accurate (lines 79-82, 84, 85, etc.).
5. Thank you for your valuable advice. In general, this is an implicit feature extraction method. However, we guide the feature extraction through pre-training and fixed parameters. We have redrawn Figure 3 and added more explanation to make this clearer (lines 117-121).
6. Thank you very much for pointing out the mistake. We have added a description of the dataset in lines 208-211.
7. Thank you for the valuable advice. We did want to find a suitable dataset to verify our method, and we searched data.gov, Kaggle, Google datasets, etc. However, to the best of our knowledge, there is no such dataset for process evaluation. The main difference is that our task focuses on analyzing how standard the whole process is, rather than only recognizing single actions. We therefore prepared our own dataset to validate our method.
Author Response File: Author Response.docx
Reviewer 2 Report
It is an interesting work, which is well structured and presented.
However, I make a number of suggestions for improvement:
1. Unify the size and type of lettering across the figures.
2. Include more references from the last five years. Currently 18 of the 39 references are recent, that is 43%; it would be desirable to reach 60%.
3. Review the entries in the bibliography, since the years are missing in references 24, 29, and 31. Also review the journal's citation rules and adjust the references to them.
4. Regarding the comparative fit between the three experiments, it would be good to include a fit of the model through goodness-of-fit indices. One way of applying this can be found in the article by Sáiz, M.C., Escolar, M.C., Arnaiz, Á. (2020). Effectiveness of Blended Learning in Nursing Education. Int. J. Environ. Res. Public Health 17(5), 1-15. 10.3390/ijerph17051589
5. It would be advisable to include a Discussion section and, within it, to refer to future lines of research.
6. Review the spacing in the citations and the naming of the figures and tables in the text, adjusting them to the rules of the journal.
7. Since images of people appear, the approval number of your institution's ethics committee and the informed consent of the participants must be stated, and a Participants section and a Procedure section must be included. If the images are public, the repository from which they were taken must be referenced.
Author Response
1. Thank you for your careful review. We have comprehensively checked and corrected the errors in the paper, including lines 67-69, 73, etc.
2. Thank you for your valuable advice. We have replaced older references with literature published in recent years, for example in lines 341-342, 369-371, 377-379, 396-398, etc. The proportion of references from the last five years has now reached 29/38.
3. Thank you very much for pointing out the mistake. We have rewritten all the references according to the journal format (lines 311-465).
4. Thank you for your valuable advice. Goodness-of-fit indices are better suited to regression problems. Our problem is more of a classification problem, for which indicators such as recall and precision judge the model better. The table format in the suggested paper is worth learning from, so we have changed our table (line 253) accordingly and cited the paper (lines 437-438).
5. Thank you for your valuable advice. We have added a new section, Prospect, to the Discussion to envision future research directions.
6. Thank you very much for pointing out the mistake. We have formatted all the figures and tables according to the rules of the journal.
7. Thank you very much for pointing out the mistake. This was an oversight in our writing process. We have added information on author contributions, funding, and conflicts of interest after the Conclusion in line 301-109.
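To illustrate the point made in response 4, the following is a minimal sketch of how precision and recall are computed for a binary "standard / non-standard" label. The labels are made-up toy values, not results from the paper, and the function name `precision_recall` is hypothetical.

```python
from typing import Sequence, Tuple


def precision_recall(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (precision, recall) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (assumed for illustration): 1 = standard process, 0 = non-standard.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
```

Unlike a goodness-of-fit index, both quantities are defined directly on discrete class decisions, which is why they are the usual choice for classification tasks.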
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The proposed modifications are in line with the reviewer's comments.
The paper is more readable and easy to understand.
However, the experimental results still need to be improved. My suggestion would be to use sports datasets, such as basketball or soccer. The authors can find references to such datasets in the following papers:
- "Rashidi et al., Human Activity Recognition using Inertial, Physiological and Environmental Sensors: a Comprehensive Survey. "
- "P Pareek, and A Thakkar., A survey on video-based Human Action Recognition: recent updates, datasets, challenges, and applications."
- "HC Shih., A survey of content-aware video analysis for sports."
Author Response
Thank you for your valuable comments and suggestions.
We have read through the recommended papers. The differences are summarized as follows:
1. The datasets in Paper 1 were primarily collected by sensors, an approach that is not within our research domain.
2. Paper 2 and Paper 3 use the same technique as ours, computer vision. However, the dataset in this research is derived from an operation performed on a part, whereas datasets like Weizmann, KTH, MSR, and NUT_RGBD simply record the body's own behavior. Besides, each sample of our dataset consists of several continuous actions, while a sample in datasets like UCF101 and HMDB51 is a snippet of a single action.
3. The dataset in this research provides an evaluation label for every sample, enabling a “good or not” classification of sequential actions. As far as we know, this could be a novelty in this research field.
We have stated these points in lines 220-225. In addition, we would like to point out that:
1) The LSTM method has been shown to be effective on datasets like MSR and NUT_RGBD. It has been recognized as the most popular backbone in the domain of action recognition;
2) The dataset in this research consists of 520 samples, which should be sufficient for validation.
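The distinction drawn in points 2 and 3 above can be sketched as a data structure: a sample in a single-action dataset carries one action label, whereas a process-evaluation sample carries an ordered sequence of actions plus one evaluation label. All class names, field names, action names, and paths below are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SingleActionSample:
    """A clip showing one action (the shape of a UCF101/HMDB51-style sample)."""
    video_path: str
    action: str


@dataclass
class ProcessSample:
    """A recording of several consecutive actions with one overall label."""
    video_path: str
    actions: List[str]   # ordered action flow, as detected or annotated
    is_standard: bool    # "good or not" label for the whole sequence


# Hypothetical examples; action names and paths are placeholders.
clip = SingleActionSample(video_path="clip_0001.mp4", action="pick_up")
process = ProcessSample(
    video_path="process_0001.mp4",
    actions=["pick_up", "align", "fasten", "inspect", "put_down"],
    is_standard=True,
)
```

The evaluation label attaches to the whole sequence rather than to any single action, which is what separates process evaluation from plain action recognition.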
Author Response File: Author Response.docx
Reviewer 2 Report
1. It is recommended that the authors add a legend at the bottom of Table 1 giving the meaning of the acronyms.
2. The authors must correct the following bibliographic citation and write it properly:
Sáiz-Manzanares, M.C.; Escolar-Llamazares, M.C.; Arnaiz-González, Á. Effectiveness of blended learning in nursing education. Int. J. Environ. Res. Public Health 2020, 17, doi:10.3390/ijerph17051589.
Author Response
1. Thank you for your constructive suggestion. We have explained the meaning of the acronyms under Table 1 in lines 260-261 to make it clearer.
2. Thank you for your careful review. We have modified the corresponding bibliographic citation in lines 445-448.
Author Response File: Author Response.docx
Round 3
Reviewer 1 Report
The authors took into account the reviewer's suggestion.
The proposed modifications are in line with the reviewer's comments.
They also made their work easier to understand, clearly defining the differences and the novelty that their dataset introduces.
Author Response
Thank you for your recognition. Following your suggestions, we have made some minor modifications: (1) We explained the meaning of the acronyms under Table 1 in lines 260-261 to make it clearer. (2) We added a conclusion at the end of the discussion in lines 287-288 to summarize the results. (3) We modified the bibliographic citation format in lines 445-448 to meet the specification.
Author Response File: Author Response.docx