Exploring Convolutional Neural Networks for the Thermal Image Classification of Volcanic Activity
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. In the first paragraph of the Introduction section, the sentence "as highlighted by [6], who achieved a remarkable overall accuracy of 98.3% in recognizing eruptive activity from satellite images at seven different volcanoes" lacks specific details, making it unclear which aspect of knowledge was utilized and not mentioning the methods or results of the study. It is suggested to provide more specific information so that readers can understand the content and achievements of previous research.
Revised suggestion: Provide more specific details, such as how previous research utilized pre-trained CNN models and large image datasets to identify volcanic eruption activity, including specific CNN models and accuracy results.
2. Thermal image classification of volcanic activity is a typical application of machine learning algorithms, and some immune-based machine learning algorithms have good reference value for this task, such as "Multimodal image registration techniques: a comprehensive survey", "Floating pollutant image target extraction algorithm based on immune extremum region", "Infrared image segmentation using growing immune field and clone threshold", and "Pipeline image diagnosis algorithm based on neural immune ensemble learning". Meanwhile, the authors should provide a more effective description of the practical application scope and significance of this article.
3. In the second paragraph of the Introduction section, the sentence "Operators aim to recognize eruptive events promptly, , and especially any sudden change in the state of the volcano, underscoring once more the importance of an accurate classification of the eruptive activity" contains a grammatical error, particularly the redundant comma.
Revised suggestion: Remove the redundant comma to improve the sentence structure and fluency, for example, "Operators aim to recognize eruptive events promptly, especially any sudden change in the state of the volcano, underscoring once more the importance of an accurate classification of the eruptive activity."
4. In the second paragraph of the Introduction section, the sentence "distinguishing the eruptive activity of Mount Etna into six classes" uses inaccurate terminology, which may lead to misunderstanding, as volcanic activity is classified into different states rather than categories.
Revised suggestion: Use more accurate terminology to describe, for example, "classifying the eruptive states of Mount Etna into six categories."
5. In the paragraph where the sentence "In addition, being the summit of Etna formed by four active craters, it is possible that they erupt together displaying a different kind of behavior, an event that is rare but not impossible [7]" is located, the sentence structure is slightly complex, and it can be expressed more clearly to convey the author's viewpoint.
Revised suggestion: Simplify the sentence structure to improve understanding, for example, "Additionally, since the summit of Etna comprises four active craters, it's possible for them to erupt simultaneously, exhibiting different behaviors, although such events are rare but not impossible [7]."
6. In the paragraph where the sentence "To address these challenges, we decided to use Convolutional Neural Networks (CNNs)" is located, the sentence lacks further explanation of why CNNs were chosen and how they address specific challenges.
Revised suggestion: Expand the sentence to explain why CNNs were chosen and how they address the challenges faced, for example, "To tackle these difficulties, we opted to utilize Convolutional Neural Networks (CNNs), given their proven effectiveness in handling complex classification tasks under diverse conditions, as demonstrated by their performance in competitions like the one described by [8]."
Comments on the Quality of English Language
Minor editing of English language required
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
I am fine with the paper, which builds on the outcomes of a previous paper by the same Authors (Remote Sens. 2022, 14, 2392) to make unsupervised detection robust. Recent literature hosts some good attempts in this direction by AI/ML/DL from ground-based cameras at the same Mt. Etna volcano (Tello et al., 2022, same issue, paper #4477), as previously on Mt. Klyuchevskoy (Remote Sens. 2021, 13, 4747). On account of the difference in view angle, resolution, atmospheric artifacts and – mainly – refresh frequency, I did not consider satellite-borne radiometers (e.g., Corradino et al. 2021, Remote Sens., 13, 4080) for comparison.
Beyond the combination of methods, the above papers significantly differ in the capacity of removing noise from large/huge datasets to improve robustness: a target achieved, in my opinion, in this case. Below, a few comments, questions, suggestions or caveats mainly addressed to operational implementations of the procedure.
Orderly:
1) Figure 8 (ROC) and the confusion matrix of Fig.7 highlight the efficiency of the VGG16 classifier in detecting paroxysms (green curve, steadily no-false positive) and degassing, where I assume that to be massive/dense plumes. Considering the view angle of thermal cameras it is surprising, however, that intensity-varying strombolian is not confused with low-level/height/paroxysmal. A question: how to distinguish between
2) Are lava fountains included in paroxysmal or strombolian, and how do you distinguish them? There could be definitional voids in the chosen broad classes: it is worth defining these classes better, with (maybe) examples, and/or clarifying the system behavior in transitional states.
3) Data acquisition vs. (re)processing rates. Acquisition rates in standard monitoring and the time required for processing are not quoted, but I imagine outputs to be instantaneous once the learning-and-decision sequence up to the VGG16 classifier is trained. In practice, however, activity may rapidly switch from strong strombolian to lava fountaining and back. So, two questions: (a) are refresh rates – fixed or varying, we don’t know – adequate to volcano status changes, and (b) is the system able to learn and retrain from the previous image/couple of images in times as short as minute(s)? Please insert a few performance data of both cameras and the CNN chain.
4) I understand that the system was developed on one camera. Not considering data merging and output optimization, do Authors think that this 5-net scheme would apply to other cameras at similar view angles but different azimuth, with nil-to-little change?
The above comments do not impact on my evaluation of the paper which deserves to be published. Questions aim at future implementation for operational scopes.
Also, a few sentences on the way forward will be appreciated.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The innovation of this paper is seriously insufficient; the algorithm still needs further study.
Comments on the Quality of English Language
Extensive editing of English language required
Author Response
Please see the attachment.
Author Response File: Author Response.pdf