Crop Type Mapping Based on Polarization Information of Time Series Sentinel-1 Images Using Patch-Based Neural Network
Round 1
Reviewer 1 Report
In this paper, the Authors present a study investigating radar polarization information, a patch strategy, and the combination of a traditional Convolutional Neural Network (CNN) with a Recurrent Neural Network (RNN). Their hypothesis is that these effectively improve accuracy for crop type mapping. They used Sentinel-1 SLC and GRD products as data sources to obtain VH, VV, VH/VV, VV+VH, Entropy, and Anisotropy, i.e., the features for classification. They show that the three-dimensional Convolutional Neural Network (Conv3d) performs best in comparison with classifiers that combine RNN and CNN. The study demonstrates the value of combining deep learning and polarization decomposition to provide robust technical support for large-scale crop mapping.
The review of the state of the art should be improved, and at least a flowchart of the final algorithm should be included.
Additionally, it is important that the manuscript describe the procedures used to organize the database with a view to better classification; otherwise, problems with accuracy and precision, i.e., in terms of data quality, will be present during operation.
How are false-positive occurrences treated by the presented algorithms? The results should be clarified, and additional quantitative results should be provided, considering the following points:
1. The Authors should explain where they have found enhancements in the integrity of the ontology-based models, and what their actual contribution is, if they have made any comparison with other published work or even patents related to such methods.
2. The Authors should better explain Figure 3 (The optimal three-dimensional Convolutional Neural Network (Conv3d) framework in this study, where t is 30 (the number of S-1 imaging time periods), and m represents 7 input channels. The output dimension is c*1, and the value of c represents the 9 crop categories.); Figure 3 (Test dataset’s accuracy and kappa value for different patch sizes with Conv3d.); Figure 7 (Temporal feature value profiles of different crop categories imaged by (A) Alpha, (B) Anisotropy, (C) Entropy, (D) VH, (E) VV, (F) VH/VV, (G) VH+VV.), and Figure 9 (Effect of filtering strategy on classification results under different patch sizes).
3. The Authors should provide additional information on the developed methodology and the choices made. Why those choices? How are they compatible with the related theories (e.g., some equations of the final model)?
4. The manuscript contains very few comments on integrating the methods into an applicable system. The agronomic intelligence aspects should also be included in the discussion of the results. Why are they missing from the paper?
5. The Conclusion section should be expanded.
Finally, the Authors do not make it clear whether the presented solutions are already being applied or if they just considered prospective aspects to have conceptual models.
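The additional quantitative results requested above could be reported from a confusion matrix. The following is a minimal Python sketch, not the Authors' evaluation code: the function name `confusion_metrics` and the two-class example matrix are hypothetical and serve only to show how per-class false positives, overall accuracy, and the kappa value relate.

```python
def confusion_metrics(matrix):
    """Summarise a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                       # true counts
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # predicted counts

    # Overall accuracy: share of samples on the diagonal.
    observed = sum(matrix[k][k] for k in range(n)) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / total ** 2
    # Cohen's kappa; undefined when chance agreement is already 1.
    kappa = float("nan") if expected == 1 else (observed - expected) / (1 - expected)

    # False positives of class j: predicted as j but belonging to another class.
    false_positives = [col_sums[j] - matrix[j][j] for j in range(n)]
    precision = [matrix[j][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]

    return {
        "accuracy": observed,
        "kappa": kappa,
        "false_positives": false_positives,
        "precision": precision,
    }


# Hypothetical two-class example (rows: true class, columns: predicted class).
example = [[8, 2],
           [1, 9]]
metrics = confusion_metrics(example)
```

Reporting false positives and precision per class, alongside overall accuracy and kappa, would make it visible when a high overall score hides a poorly separated crop class.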
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Detailed Comments:
- Abstract:
- L9: Radar data is not 'insensitive' to weather. Heavy rain and strong wind may affect the datasets. Please revise the sentence.
- L10 & L13 & L22: The word 'combination' is inappropriate here. You don't 'combine' PolSAR features with machine learning or deep learning. PolSAR data could be used/applied/tested within a machine learning or deep learning framework.
- L19-20: Which kind of rich information was derived from polarization decomposition? I would suggest shortening the sentence and making it more specific.
- Introduction:
- L31-34: Repetition of words 'information' and 'other'. Please revise the sentence and shorten.
- L42-43: The classification accuracy also depends on other factors, not only on the quality of the remote sensing data. Improving the dataset does not necessarily mean improved accuracies. Please revise the sentence.
- L43-47: Revise the sentence. It is too long for the content it is delivering. Please avoid using the word 'information' too often in your text.
- L47-48: I would write here why dense time-series is useful. It can better capture short-term changes of phenological growth stages in crops making them more separable.
- L48-50: There are hundreds of papers showing that optical data deliver better maps (with respect to classification accuracy) than those based on SAR data. Revise the sentence and do not repeat the word 'consider'.
- L71: it is not common that 'H/A/a' data is used for crop type mapping tasks. I would mention it here.
- L80: Please cite those 'lots of studies'. Also consider using another way of writing it.
- L81-93: Here you give general information about CNNs and RNNs. This could be done in your Methods section when you explain each model. I would instead add findings of other crop type mapping studies that use CNN and RNN.
- L99-100: What is the definition of 'good enough'? Please revise your first research question.
- L101-102: Improve accuracy compared to what? Please revise all of your research questions and make them more specific.
- Materials and Methods:
- L105: officially 'United States of America'
- L106: use internationally accepted metric system first and put miles in brackets.
- L105-111: As I understand it, your study site is not the whole state of Texas but a small AOI within it. Then please focus on the statistics of the AOI (area, climate, etc.) and not on the whole state of Texas. E.g., the data on the number of state residents are completely useless for the research you are reporting.
- Figure 1: What is your study site? Is it the footprint of S1 data or the yellow bounding box on C? Please make it more clear on your figure and figure captions.
- L122-124: What exactly do you mean by 'ascending orbit were selected due to VV+VH interacting with agricultural fields strongly'? Does descending mode with the same polarization interact differently? Please revise the whole sentence.
- L125-129: Why didn't you use only SLC data and calculate the radar backscatter coefficient from the SLC data?
- L133: Which 'relevant departments'? I would suggest either omitting this information or, if you decide to provide it, making it explicit.
- L133-135: Revise the sentence. Now it reads like 'map projection' provides information on over 200 crop types.
- L135-137: Revise the sentence.
- Table 1: Why do you give Trees a separate class rather than putting it in 'other vegetation'?
- Methods:
- L178-181: Do you mean the effect of class imbalance here? These sentences are a bit confusing. Consider using commonly known terms when explaining.
- L181-183: You provide only two sentences on the sampling approach. Please write in more detail. How many samples per category did you use? Did you consider the balance of classes within each category? Etc.
- L186-210: It is a long and confusing paragraph. Please split it into smaller paragraphs and give examples of studies where these models are used for crop type mapping.
- L213: 'destroy' is too strong a word here.
- L214-216: Do you know of any global crop type product that is based on object-based image analysis? I don't. While the literature shows its advantages, pixel-based approaches still seem to dominate when it comes to large-scale or global map production. Please revise the above sentences to make them scientifically appropriate.
- L217: What was your motivation to use such small patch sizes?
- Results and Analysis:
- L231: "4.1. Hardware configuration and software environment" does not belong to "Results" section
- L236: "4.2. Parameter tuning of models" does not belong to "Results" section
- Table 2: how many features (layers) did each input have?
- L294-314: The majority of this paragraph belongs in the Discussion section.
- L300: What kind of "interference information" do you mean?
- Figure 5: What do you mean by misclassified pixel? Which class was assigned to them? Also, applicable to all your maps: please use a different color scheme. The current color scheme is not colorblind safe!
- L322-323: confusing sentence, please revise.
- L324: change 'multi-classification' to 'multi-class classification'
- L327, L329: please avoid using non-scientific language. e.g. excellent. What is the definition of excellent?
- L334-336: What do you mean by "acceptable low accuracy"? Since yours is a multi-class classification problem, low accuracy in one class also affects the other classes. There is no such thing as "acceptable low accuracy". Please revise!
- L346-349: Did you make phenological growth plots? Are they similar?
- Discussion:
- "good enough" is not a scientific term. Please change it in your entire manuscript.
- L364-369: Remove or make your message more clear and organized.
- L374: and why didn't you translate DN to dB?
- L376-382: Can it also be because of the pre-processing issues with backscatter coefficients?
- Figure 7: instead of 'Feature Values', write directly which feature it is. Colors alone are not enough to distinguish the profiles; additionally use markers. How many samples does each profile contain?
- L390-392: Please do not use "excellent", "a bit", "good enough". This is not a scientific way of delivering information.
- Figure 8: use km not miles in your scale bar. Applicable to all other maps.
- L421-426: Why is the description of this missing from the Methods section?
- L421: Was your train/test split done at polygon level? How many crop type polygons did you have? And how big is your study site?
- Figure 10: very confusing figure. What exactly do you mean by 'misclassified' and 'predicted' pixels?
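Regarding the dB question raised for L374: by the usual convention, a calibrated linear backscatter value is expressed in decibels as 10·log10 of that value, so a linear ratio such as VH/VV becomes a difference in dB. The Python sketch below assumes the inputs are already calibrated, strictly positive linear backscatter values (calibration from raw DN is a separate step and is not shown); the function `to_db` and the sample values are hypothetical and not taken from the manuscript.

```python
import math


def to_db(linear_backscatter):
    """Convert a calibrated, strictly positive linear backscatter value to dB."""
    if linear_backscatter <= 0:
        raise ValueError("linear backscatter must be > 0 to take a logarithm")
    return 10.0 * math.log10(linear_backscatter)


# Illustrative values only (assumption), not measurements from the study.
vv_linear = 0.1
vh_linear = 0.01

vv_db = to_db(vv_linear)
vh_db = to_db(vh_linear)

# The linear ratio VH/VV corresponds to the difference VH_dB - VV_dB.
ratio_db = to_db(vh_linear / vv_linear)
```

Working in dB compresses the dynamic range of backscatter, which is one reason the choice between DN, linear, and dB inputs should be stated explicitly in the Methods section.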
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
1. For Table 1, could you please explain why certain categories are reclassified and combined into fewer classes?
2. In Section 5 (Discussion), could you please compare your crop mapping performance with crop mapping results based on passive remote sensing instruments? Especially compare with those who have similar spatial resolution, such as Sentinel and Landsat. For example, as reported in "Parcel-based Crop Classification in Ukraine using Landsat-8 data and Sentinel-1A Data" and "Potential of Red Edge Spectral Bands in Future Landsat Satellites on Agroecosystem Canopy Chlorophyll Content Retrieval", passive remote sensing based algorithms can achieve certain accuracy for various vegetation monitoring tasks.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Based on the revisions, I now understand that the study site is extremely small. Splitting raster features into patches by moving the 3x3 or 21x21 kernel by only one pixel obviously introduces a lot of repetition into the data. This data is then randomly split into model training and validation sets, which means that training and test patches may differ by just one row/column. The classification accuracies based on such data are biased. Even the basic rule of splitting data at polygon level, or in any other statistically sound manner, is not considered. This makes the outcomes of the whole study very questionable.
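The concern can be illustrated with a toy example. The Python sketch below is not the Authors' pipeline: the scene size, the square "fields" standing in for crop polygons, and all function names are assumptions made for illustration. It extracts 3x3 patches with a stride of one pixel, then compares a random patch-level split with a polygon-level split by measuring how many test patches share pixels with at least one training patch.

```python
import random

FIELD = 10   # side of each square toy field in pixels (assumption)
SCENE = 40   # side of the toy scene in pixels (assumption)
PATCH = 3    # patch size; windows are extracted with stride 1


def field_id(center):
    """Hypothetical polygon label: the square field containing the pixel."""
    r, c = center
    return (r // FIELD, c // FIELD)


def patch_centers():
    """Centers of all stride-1 windows lying entirely inside one field."""
    half = PATCH // 2
    inside = range(half, FIELD - half)
    return [(r, c) for r in range(SCENE) for c in range(SCENE)
            if r % FIELD in inside and c % FIELD in inside]


def random_split(centers, test_fraction, seed=0):
    """Patch-level random split: neighbouring patches can land on both sides."""
    shuffled = centers[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def polygon_split(centers, test_fraction, seed=0):
    """Polygon-level split: all patches of one field go to the same side."""
    fields = sorted({field_id(rc) for rc in centers})
    random.Random(seed).shuffle(fields)
    n_test = max(1, int(len(fields) * test_fraction))
    test_fields = set(fields[:n_test])
    train = [rc for rc in centers if field_id(rc) not in test_fields]
    test = [rc for rc in centers if field_id(rc) in test_fields]
    return train, test


def overlap_fraction(train, test):
    """Fraction of test patches sharing at least one pixel with a training patch."""
    train_set = set(train)
    reach = PATCH - 1  # windows overlap when centers differ by < PATCH on both axes
    hits = sum(
        any((r + dr, c + dc) in train_set
            for dr in range(-reach, reach + 1)
            for dc in range(-reach, reach + 1))
        for r, c in test
    )
    return hits / len(test)


centers = patch_centers()
leak_random = overlap_fraction(*random_split(centers, 0.2))
leak_polygon = overlap_fraction(*polygon_split(centers, 0.2))  # 0.0 in this toy setup
```

In this toy setup nearly every test patch from the random split overlaps a training patch, while the polygon-level split shares no pixels between the two sets. With real field boundaries, patches near the edges of adjacent polygons can still overlap, so a buffer between training and test polygons may also be needed.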
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
n/a
Author Response
We did not revise the manuscript as there were no new suggestions.