CNN Based on Transfer Learning Models Using Data Augmentation and Transformation for Detection of Concrete Crack
Round 1
Reviewer 1 Report
In this study, the authors proposed a transfer learning approach to the detection of concrete cracks from images. The approach is applied to the public CCIC dataset, where it reaches promising performance. However, some major points should be addressed, as follows:
1. Although the authors reached a good result, the novelty of the study is still unclear. If the authors only used AlexNet, what are the main points that made their study outperform previous studies, given that AlexNet is a well-known architecture and the authors did not modify the pretrained model?
2. The authors should have external validation data to evaluate the performance of the model on unseen data.
3. It is suggested to conduct cross-validation in the training process.
4. Uncertainties of models should be reported.
5. When comparing the predictive performance among methods/models, the authors should conduct some statistical tests to see significant differences. For example, in Table 4, we cannot say that AlexNet outperformed the others without statistical tests.
6. The authors should conduct hyperparameter tuning on their models and report this step in the manuscript.
7. Besides current metrics, the authors should add ROC curves and PR curves.
8. Measurement metrics (i.e., Recall, Precision, Accuracy, ...) are well known and have been used in previous studies such as PMID: 34915158 and PMID: 34502160. Thus, the authors are encouraged to refer to more such works in this description to attract a broader readership.
9. Fig. 13 is duplicated since all contents were displayed in Figs. 11-12. It is suggested to remove Figs. 11-12 and keep Fig. 13 only.
10. English writing and presentation style should be improved.
11. Table 3 should be removed since it is well-known.
12. Quality of figures should be improved.
13. Source codes should be provided for replicating the study.
Author Response
Response to Reviewer #1
Reviewer#1, Concern # 1: Although the authors reached a good result, the novelty of the study is still unclear. If the authors only used AlexNet, what are the main points that made their study outperform previous studies, given that AlexNet is a well-known architecture and the authors did not modify the pretrained model?
Author response: We thank the reviewer for the constructive comments. In the revised paper, in the last paragraph of the Introduction section, we have clearly conveyed the contribution of the article. The authors applied four transfer learning models, namely VGG16, ResNet18, DenseNet161, and AlexNet, and also modified the AlexNet model.
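For illustration, a minimal sketch of this transfer-learning setup, assuming PyTorch/torchvision; the paper's exact AlexNet modification is given in the manuscript itself, so the replaced classifier head below is only an assumption:

```python
import torch.nn as nn
from torchvision import models

# Load AlexNet pretrained on ImageNet and adapt it to the two CCIC
# classes (crack / non-crack). Illustrative only; the paper's exact
# modification of AlexNet may differ.
model = models.alexnet(pretrained=True)

# Optionally freeze the convolutional feature extractor so that only
# the classifier head is fine-tuned.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer; 4096 is the size of
# AlexNet's penultimate feature vector.
model.classifier[6] = nn.Linear(4096, 2)
```

The same head replacement applies analogously to VGG16, ResNet18, and DenseNet161 through their respective final layers.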
Reviewer#1, Concern # 2: The authors should have external validation data to evaluate the performance of the model on unseen data.
Author response: We cordially thank the reviewer for raising this valuable point. We conducted experiments only on the CCIC dataset.
Reviewer#1, Concern # 3: It is suggested to conduct cross-validation in the training process.
Author response: We thank the reviewer for this suggestion. We used cross-validation during model training: the training set was used for model training, and the test set was used for model validation.
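A minimal sketch of the hold-out evaluation described above, assuming the CCIC images are arranged in one subfolder per class; the folder name, split ratio, and batch size are placeholders:

```python
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Hypothetical layout: CCIC images in one subfolder per class.
dataset = datasets.ImageFolder("CCIC/", transform=transforms.ToTensor())

# Assumed 80/20 hold-out split, not necessarily the paper's ratio.
n_train = int(0.8 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32)
```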
Reviewer#1, Concern # 4: Uncertainties of models should be reported.
Author response: We appreciate the reviewer's comment. We do not believe there are any significant model uncertainties to report.
Reviewer#1, Concern # 5: When comparing the predictive performance among methods/models, the authors should conduct some statistical tests to see significant differences. For example, in Table 4, we cannot say that AlexNet outperformed the others without statistical tests.
Author response: We thank the reviewer for this valuable comment. We have updated this in the revised manuscript.
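For reference, one standard test for comparing two classifiers evaluated on the same test set is McNemar's test; the manuscript does not restate which test was applied, so the sketch below uses dummy predictions:

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Dummy ground truth and paired predictions of two models on the same
# test set; stand-ins for real model outputs.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
pred_a = y_true.copy()
pred_a[:10] ^= 1   # model A wrong on 10 samples
pred_b = y_true.copy()
pred_b[:25] ^= 1   # model B wrong on 25 samples

# 2x2 contingency table over correct/incorrect decisions of both models.
a_ok, b_ok = pred_a == y_true, pred_b == y_true
table = [[int(np.sum(a_ok & b_ok)), int(np.sum(a_ok & ~b_ok))],
         [int(np.sum(~a_ok & b_ok)), int(np.sum(~a_ok & ~b_ok))]]
print(mcnemar(table, exact=True).pvalue)  # small p-value: performance differs
```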
Reviewer#1, Concern # 6: The authors should conduct hyperparameter tuning on their models and report this step in the manuscript.
Author response: Thank you once again for this comment. We used the same hyperparameter optimization setup for all architectures. These hyperparameters are described in Table 2; please see Section 3.5, line 254.
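A sketch of such a shared training setup, reusing the `model` from the earlier sketch; the cross-entropy loss is named in Table 2, while the optimizer, learning rate, and epoch count below are placeholder assumptions:

```python
import torch.nn as nn
import torch.optim as optim

# Cross-entropy loss as listed in Table 2; the remaining values are
# placeholders, not the paper's actual hyperparameters.
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)  # hypothetical optimizer/LR
num_epochs = 20                                      # hypothetical
```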
Reviewer#1, Concern # 7: Besides current metrics, the authors should add ROC curves and PR curves.
Author response: We have added both ROC and PR curves. Please see Figures 13 and 14 in the Results and Discussion section, on pages 13 and 14.
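A minimal sketch of how such ROC and PR curves are computed with scikit-learn, using dummy labels and scores in place of the paper's test-set outputs:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc, precision_recall_curve, roc_curve

# Dummy labels and predicted crack probabilities.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])

fpr, tpr, _ = roc_curve(y_true, y_score)
prec, rec, _ = precision_recall_curve(y_true, y_score)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
ax1.set(xlabel="False positive rate", ylabel="True positive rate", title="ROC curve")
ax1.legend()
ax2.plot(rec, prec)
ax2.set(xlabel="Recall", ylabel="Precision", title="PR curve")
plt.tight_layout()
plt.show()
```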
Reviewer#1, Concern # 8: Measurement metrics (i.e., Recall, Precision, Accuracy, ...) are well known and have been used in previous studies such as PMID: 34915158 and PMID: 34502160. Thus, the authors are encouraged to refer to more such works in this description to attract a broader readership.
Author response: We have updated this. Please see Figures 13 and 14 in the Results and Discussion section, on pages 13 and 14.
Reviewer#1, Concern # 9: Fig. 13 is duplicated since all contents were displayed in Figs. 11-12. It is suggested to remove Figs. 11-12 and keep Fig. 13 only.
Author response: Thank you very much for this comment. We have updated this; the retained figure is now numbered Figure 11.
Reviewer#1, Concern # 10: English writing and presentation style should be improved.
Author response: Thank you very much for this comment. We have substantially improved the English in our revised manuscript. We hope the reviewer will notice this while going through the paper.
Reviewer#1, Concern # 11: Table 3 should be removed since it is well-known.
Author response: Thank you very much for this comment. We have removed it. Please check the section 4.
Reviewer#1, Concern # 12: Quality of figures should be improved.
Author response: Thank you very much for this comment. We have updated all figures.
Reviewer#1, Concern # 13: Source codes should be provided for replicating the study.
Author response: Thank you very much for this comment. We have attached a link to the source code at the end of the paper.
Author Response File: Author Response.docx
Reviewer 2 Report
The manuscript is interesting, however there are several major points that need to be addressed:
-In the abstract, too many decimal places are reported for the performance figures; please round. Also, does this performance refer to the test or training performance?
-Figure 1 is a bit asymmetric; please try to remake the figure to make it more symmetric.
-Line 110: the test set is not the validation set. Please specify this better and explain how you organized this division and splitting.
-The examples of rotation, flipping, and so on are good for understanding but should be organized better (merge some figures and make them more compact in the description).
-Figures of the proposed architectures are nice but small. Enlarge them to make them readable.
-Make some comments on the hyperparameter optimization used in the various architectures employed in the study.
-Section 3.4 should have a larger introduction to CNNs and deep learning models for images. Transfer learning should then be introduced, and the various models described. Some relevant references concerning deep learning and deep learning models applied to images should be added, for instance:
-Shin, Hoo-Chang, et al. "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning." IEEE Transactions on Medical Imaging 35.5 (2016): 1285-1298.
-Dimitri, Giovanna Maria, et al. "Unsupervised stratification in neuroimaging through deep latent embeddings." 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2020.
-Kim, D. H., and T. MacKinnon. "Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks." Clinical Radiology 73.5 (2018): 439-445.
-LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.
-Affonso, Carlos, et al. "Deep learning for biological image classification." Expert Systems with Applications 85 (2017): 114-122.
-Bengio, Yoshua, Yann LeCun, and Geoffrey Hinton. "Deep learning for AI." Communications of the ACM 64.7 (2021): 58-65.
-Table 2: there is a typo: "Cross Eentropy" should be "Cross Entropy".
-Figures 12 and 13 should be merged and enlarged (probably rotated vertically).
-The descriptions of the performance metrics should be improved and better formatted ("Precision, P", "Recall, R", and "F1Score, F1" are not good descriptions, also in terms of definition, and should be amended).
-There are several qualitative assessments throughout the paper, for example line 301, where the phrase "various standard evaluation scores" is used. Please amend these with precise scientific notation and description.
-Figure 14 has some issues (some labels are cropped and faded), and it is also very small. Please enlarge it.
-The notation in Table 4 is not clear; please remake it.
-Many numbers are specified with too many decimal places. Please amend this and round consistently across the paper.
-Table 6: please specify in the caption what the duration score is.
-Was cross-validation performed for assessing model performances?
-Are the results reported in Table 7 comparable to the results reported in the present paper? Were the same test/train divisions used? And also the same dataset?
Author Response
Response to Reviewer #2
Reviewer#2, General Concern #1: In the abstract, too many decimal places are reported for the performance figures; please round. Also, does this performance refer to the test or training performance?
Author response: We thank the reviewer for this correction. We have done this; the abstract now reads: "Using the publicly available CCIC dataset, the suggested technique on AlexNet outperforms existing models with a testing accuracy of 99.90%, precision of 99.92%, recall of 99.80%, and F1-score of 99.86% for the crack class."
Reviewer#2, Concern # 2: Figure 1 is a bit asymetric, please try to remake the figure in order to make it more symmetric
Author response: We have updated this.
Reviewer#2, Concern # 3: Line 110: the test set is not the validation set. Please specify this better and explain how you organized this division and splitting.
Author response: We have corrected this in line 111.
Reviewer#2, Concern # 4: The examples of rotation, flipping, and so on are good for understanding but should be organized better (merge some figures and make them more compact in the description).
Author response: Corrected; thank you once again.
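For illustration, a sketch of such rotation and flip augmentations with torchvision; the angle range and flip probabilities below are placeholders, not the paper's settings:

```python
from torchvision import transforms

# Placeholder augmentation pipeline; the exact settings are in the paper.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.ToTensor(),
])
```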
Reviewer#2, Concern # 5: figures of the proposed architectures are nice, but small. Enlarge them to make them readable
Author response: We appreciate your effort in pointing out this mistake. We have corrected it accordingly. Please see Figure 6 on page 9 for the correction.
Reviewer#2, Concern # 6: Make some comments on the hyperparameter optimization used in the various architectures employed in the study.
Author response: We used the same hyperparameter optimization setup for all architectures. These hyperparameters are described in Table 2.
Reviewer#2, Concern # 7: Section 3.4 should have a larger introduction to CNNs and deep learning models for images. Transfer learning should then be introduced, and the various models described. Some relevant references concerning deep learning and deep learning models applied to images should be added, for instance:
-Shin, Hoo-Chang, et al. "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning." IEEE Transactions on Medical Imaging 35.5 (2016): 1285-1298.
-Dimitri, Giovanna Maria, et al. "Unsupervised stratification in neuroimaging through deep latent embeddings." 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2020.
-Kim, D. H., and T. MacKinnon. "Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks." Clinical Radiology 73.5 (2018): 439-445.
-LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.
-Affonso, Carlos, et al. "Deep learning for biological image classification." Expert Systems with Applications 85 (2017): 114-122.
-Bengio, Yoshua, Yann LeCun, and Geoffrey Hinton. "Deep learning for AI." Communications of the ACM 64.7 (2021): 58-65.
Author response: Yes, we agree that these materials are relevant. We have incorporated them concisely in this revised manuscript. Please see the blue-coloured text on page 6.
Reviewer#2, Concern # 8: There is a typo in Table 2: "Cross Eentropy" should be "Cross Entropy".
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 9: Figures 12 and 13 should be merged and enlarged (probably rotated vertically).
Author response: Thank you for this comment. We have updated this; the merged figure is now Figure 11.
Reviewer#2, Concern # 10: The descriptions of the performance metrics should be improved and better formatted ("Precision, P", "Recall, R", and "F1Score, F1" are not good descriptions, also in terms of definition, and should be amended).
Author response: Thank you for this comment. We have updated this.
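For reference, the standard definitions of these metrics in terms of true/false positives and negatives (TP, FP, TN, FN):

```latex
\begin{aligned}
\mathrm{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN}, &
\mathrm{Precision} &= \frac{TP}{TP + FP}, \\
\mathrm{Recall}    &= \frac{TP}{TP + FN}, &
F_1 &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\end{aligned}
```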
Reviewer#2, Concern # 11: There are several qualitative assessments throughout the paper, for example line 301, where the phrase "various standard evaluation scores" is used. Please amend these with precise scientific notation and description.
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 12: Figure 14 has some issues (some labels are cropped and faded), and it is also very small. Please enlarge it.
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 13: The notation in Table 4 is not clear; please remake it.
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 14: Many numbers are specified with too many decimal places. Please amend this and round consistently across the paper.
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 15: Table 6: please specify in the caption what the duration score is.
Author response: Thank you for this comment. We have updated this.
Reviewer#2, Concern # 16: Was cross-validation performed for assessing model performances?
Author response: Yes. The training set was used for model training, and the test set was used for model validation.
Reviewer#2, Concern # 17: Are the results reported in Table 7 comparable to the results reported in the present paper? Were the same test/train divisions used? And also the same dataset?
Author response: Thank you for this comment. We have updated this; it is now in Table 6.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
Thanks for addressing my previous comments. However, some comments have not yet been addressed well and still need room for improvement as follows:
1. There must be external validation data to evaluate the performance of the model on unseen data. Without it, the study lacks novelty.
2. Cross-validation (i.e., 5-fold CV, 10-fold CV) should be conducted during the training process.
3. I still cannot see any statistical test conducted when comparing the performance.
4. Previous comment #8 suggested the authors add more references related to measurement metrics, but the authors skipped it.
Author Response
Response Letter is attached herewith.
Author Response File: Author Response.docx
Reviewer 2 Report
I think the authors have now addressed all the required comments.
Please check the English in the manuscript for a final revision.
Author Response
Response Letter is attached herewith.
Author Response File: Author Response.docx
Round 3
Reviewer 1 Report
My previous comments have been addressed.