Article
Peer-Review Record

Limitations of Predicting Substrate Classes on a Sedimentary Complex but Morphologically Simple Seabed

Remote Sens. 2020, 12(20), 3398; https://doi.org/10.3390/rs12203398
by Markus Diesing 1,*, Peter J. Mitchell 2, Eimear O’Keeffe 3, Giacomo O. A. Montereale Gavazzi 4 and Tim Le Bas 5
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 28 September 2020 / Revised: 12 October 2020 / Accepted: 13 October 2020 / Published: 16 October 2020
(This article belongs to the Section Ocean Remote Sensing)

Round 1

Reviewer 1 Report

I am satisfied with the authors' answers.

Author Response

No response required.

Reviewer 2 Report

Thank you for the opportunity to re-review. I judge the paper suitable for publication as it stands, following acceptance of what appear to be tracked changes in the document.

Author Response

No response required.

Reviewer 3 Report

The paper is well written and a great improvement on the original version. It seems, however, that the change-tracking function of Word (?) disrupted the PDF conversion, so the PDF version I received for review was sometimes quite difficult to follow (especially the Results section, where the authors made many changes). Beyond cleaning up the artefacts left by the tracking function, it would be nice to see a confusion matrix of the ensemble map, preferably in the main manuscript (not in the supplemental material). Were there any patterns as to which categories tended to cause disagreements among the five models (resulting in "no class")?

Author Response

We have now provided a confusion matrix of the ensemble map in the main manuscript and briefly describe the results of that analysis.
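For readers who wish to repeat this kind of check, the sketch below shows one way such a confusion matrix could be computed. It is a minimal, hypothetical Python example, not the authors' code; the class labels and the arrays y_true and y_pred_ensemble are placeholders standing in for the validation samples and the ensemble predictions at those locations.

# Minimal, hypothetical sketch (not the authors' code): confusion matrix for an
# ensemble map, comparing predicted substrate classes against validation samples.
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Placeholder class labels and example data; real labels would come from the
# sediment classification scheme used in the paper.
classes = ["Rock", "Coarse", "Mixed", "Sand", "Mud", "No class"]
y_true = np.array(["Sand", "Mud", "Mixed", "Sand", "Coarse", "Mud"])
y_pred_ensemble = np.array(["Sand", "Mixed", "No class", "Sand", "Mixed", "Mud"])

cm = confusion_matrix(y_true, y_pred_ensemble, labels=classes)
print(cm)                                      # rows = reference, columns = prediction
print(accuracy_score(y_true, y_pred_ensemble)) # overall accuracy of the ensemble map

The row and column totals of such a matrix also show which classes absorb most of the confusion, which speaks to the reviewer's question about where the "no class" disagreements tend to arise.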

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

I would like to congratulate the authors on taking up such an important and difficult subject as the comparison of supervised classification methods for benthic habitats. The article is a valuable contribution to the development of remote sensing and of hydroacoustic measurement and data processing. Although the classification results achieved low overall accuracies, the experiment was well planned and the results are discussed in detail. The manuscript is very interesting and generally well prepared.

I believe that some minor points should be improved in the manuscript:

  • In “Materials and Methods” there is no description of the hydroacoustic measurements. More details are required, such as the frequency of the acoustic signal used for the measurements, the type of multibeam echosounder, and the resolution of the input maps.
  • line 162 - „… using High Precision Acoustic Positioning (HiPAP) 500 model to measure…” - I suggest: „… using High Precision Acoustic Positioning System (HiPAP), model 500 to measure…”
  • line 165 - source is needed
  • In Figure 2a, b, c and d the same scale bar should be used, and location E should be marked.
  • Line 213 – method A - what predictors/ features were used?
  • methods B and D - no references to works in which such classification methods have already been presented
  • lines 274–276 - why is the information about the possibility of choosing the scale, and thus the level of detail, given? The scale can also be manipulated in the other presented methods. Was the result of method D, which has the highest accuracies with the validation samples, not the one selected for comparison? If so, this paragraph is unnecessary.
  • line 313 - Do you need an empty parenthesis? Was this supposed to be a reference?
  • Table 2 - If these are the maps presented above, they should be named in the same way, e.g. give the maps the numbers or symbols used in Figure 7. Why does map 1 have this for all bottom coverage classes? This is not explained in the text.
  • line 544 - If the data set is to be used for further comparisons and analyses by other researchers, the input maps should be made available as GeoTIFFs and not just as the same images as in the manuscript (https://storymaps.arcgis.com/stories/07eac32755c04ac8894af6d13525943a)
  • lines 561-562 - „Alternatively, continuous data modelling may overcome some of the issues encountered with maps of low thematic accuracy.” - This idea should be developed in the text.
  • Please confirm that the information provided at https://storymaps.arcgis.com/stories/07eac32755c04ac8894af6d13525943a will not be available until the manuscript is published. The data presented there are the same results as in the manuscript, so they should not be made available on the website earlier.

Once these comments have been addressed, the article may be accepted for publication.

Reviewer 2 Report

I'm embarrassed I didn't respond earlier; a few of the articles I recently reviewed required extensive edits. This paper is very well put together. It reads well and conveys the authors' findings concisely. The supplementary materials and available data are much appreciated. I don't have any significant suggestions for improvement. Excellent work!

Reviewer 3 Report

Benthic habitat mapping is an imprecise science at best, and this paper offers a comparison of some of the methods which have been used to undertake this.

The data for the survey were original, and the comparison of methods and the proposed way forward constitute a contribution to the conversation about the use of acoustic data for habitat mapping. Something the paper highlights is how far the discipline has to go to accurately map benthic habitats, particularly where no ground-truthing data are available.

Methods used for analysis are appropriate. References are appropriate. Training sample locations are comprehensive and representative.

Ground-truthing was not temporally coincident with all surveys, but the data also mark part of a time series.

 

A few minor comments to improve the presentation of the paper:

  • It’s implied, but not clear, on line 141 whether the subsequent 2015 survey was just for ground-truthing or whether it also acquired acoustic data.
  • Sediment samples were classified according to Long [36] in Fig. 1. I initially commented that it would be useful to also see a ternary plot of the sediment classifications in addition to the classification scheme, as Fig. 1 and Fig. 3 are a long way apart and it’s not obvious to the reader that this information is available later in the paper. I suggest considering placing those closer together for ease of comparison.
  • The use of acronyms to describe sediment types (R, Sa, Mu) in the text has the effect of distracting the reader from the message. I recommend spelling the words out in full.
  • Section 4.4: some formatting issues to be dealt with.

Reviewer 4 Report

This paper is an interesting attempt, and I can totally relate to the difficulty of "classifying" something when it is really a gradient. I have some concerns about the low sample size in this study. Also, as to the low accuracy of the models, I see that failed attempts can sometimes provide us with meaningful information, but I also think that the authors should dive deeper into their models and do more investigation to pinpoint the potential sources of the low accuracy, so that this paper can offer even more meaningful information to readers.

Please see the specific comments below.

Methods

Could you clarify the size of the study area in the text (about 20 km x 20 km based on Figure 2?) and also include the depth range of the area. I have seen a hard-bottom area at 75-100 m depth turn into a sandy bottom after a category 3 hurricane passed over the area. How can you be sure that the data you obtained in 2013 and 2015 (especially the 2015 data) still corresponded to the acoustic survey data collected in 2012?

I assume the testing dataset was completely withheld from the participants and that any model tuning, where necessary, was done using the training dataset (perhaps by splitting it). If so, it seems to make more sense to call the training dataset the "training/validation set."

Results

Figure 3: For the top left figure, according to Figure 1, some of the samples at the top right should be classified as "CS", not "Mx", as both figures follow Long 2006. I'm a bit confused. Also, for the two top figures, could you label each line segment? I assume the left line, where the gravel content (%) is shown in Figure 1, is actually the mud content. It puzzled me at first.

Lines 367-368: "The associated map of agreement between the predictions might be interpreted as a measure of confidence in the predictions." I've seen how a voting method can reduce classification errors, but in this particular case, where the accuracy of each model is ~50%, voting really doesn't work (as can be seen from the fact that the accuracy of the ensemble model is lower than that of Model A). I cannot see how it can be interpreted as a measure of confidence in the predictions (by the ensemble model, which has a lower accuracy than Model A, a component of the ensemble model). Given that it is in the Results section, I suggest simply deleting this sentence.
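To make the mechanics being discussed concrete, the following minimal Python sketch (an assumed, generic approach, not the authors' implementation) combines five categorical maps by per-pixel majority vote and records how many models agree. With individually weak models, high agreement can still coincide with a shared error, which is the reviewer's point.

# Hypothetical sketch of per-pixel majority voting over five categorical maps.
# The integer class codes, the strict-majority threshold and the "no class"
# convention are illustrative assumptions, not the authors' implementation.
import numpy as np

# votes: shape (n_models, n_pixels); each row is one model's predicted classes.
votes = np.array([
    [0, 1, 2, 2, 3],
    [0, 1, 1, 2, 3],
    [0, 2, 1, 2, 0],
    [1, 1, 1, 2, 3],
    [0, 1, 2, 2, 3],
])

n_models, n_pixels = votes.shape
ensemble = np.full(n_pixels, -1)        # -1 stands in for "no class"
agreement = np.zeros(n_pixels, dtype=int)

for i in range(n_pixels):
    labels, counts = np.unique(votes[:, i], return_counts=True)
    agreement[i] = counts.max()         # number of models backing the most-voted class
    if counts.max() > n_models // 2:    # require a strict majority (>2 of 5)
        ensemble[i] = labels[counts.argmax()]

print(ensemble)   # majority-vote class per pixel, -1 where no majority exists
print(agreement)  # agreement measures consensus among the models, not correctness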

Discussion

Lines 404-406: "The need to fit continuous data into predefined categories can therefore be problematic as samples that differ by only a few percent may be acoustically similar but classified as different sediment types." Please include a confusion matrix for each model so readers can see where misclassifications are happening. These should offer some insight into whether classifying continuous data into categories is indeed the issue here.

Lines 488-489: "A critical issue in this study was the high complexity and spatial heterogeneity of the site, with changes in substrate types at scales of several metres to a few tens of metres." This statement relates to my biggest question/concern about this study. If the site has high spatial heterogeneity, shouldn't there be more samples? I am used to collecting thousands of data points for each benthic category before being able to use machine learning algorithms to auto-classify a data point. The sample size of this study seems extremely small, especially given the high complexity of the classification.

Lines 510-512: "In Figure 8, areas where all or most maps predict the same class could suggest areas of high confidence for each class, and areas of split votes are more ambiguous and may be more complex or reclassified as hybrid classes." Again, I am not sure about this statement given that the underlying five maps all have ~50% accuracy. Just because all maps agree does not mean that their classification is correct. If all models misclassify in the same way, though, that is important information. Is there any way for you to dive deeper into this and report the accuracy of the classification where all or most maps predict the same class, and the proportion of each category where all or most maps produce the correct classification? Such investigations should offer more meaningful information, as they can potentially pinpoint the source of the overall low accuracy.
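One way to perform the check the reviewer suggests, sketched below as a hypothetical Python example, is to stratify validation accuracy by the number of models that agree; the arrays y_true, ensemble and agreement are assumed names (see the voting sketch above). If accuracy does not increase with agreement, consensus cannot be read as confidence.

# Hypothetical sketch: ensemble-map accuracy stratified by model agreement.
# y_true, ensemble and agreement are assumed to be aligned 1-D arrays at the
# validation sample locations.
import numpy as np

def accuracy_by_agreement(y_true, ensemble, agreement):
    """Return {agreement level: (n_samples, accuracy)} for the validation set."""
    out = {}
    for level in np.unique(agreement):
        mask = agreement == level
        out[int(level)] = (int(mask.sum()), float(np.mean(ensemble[mask] == y_true[mask])))
    return out

# Toy example: three samples where all five models agreed, two where only three did.
y_true    = np.array([2, 2, 3, 1, 0])
ensemble  = np.array([2, 2, 3, 1, 2])
agreement = np.array([5, 5, 5, 3, 3])
print(accuracy_by_agreement(y_true, ensemble, agreement))
# {3: (2, 0.5), 5: (3, 1.0)} -- here accuracy rises with agreement, as it should
# if agreement is to be interpreted as confidence.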
