Estimating Wheat Grain Yield Using Sentinel-2 Imagery and Exploring Topographic Features and Rainfall Effects on Wheat Performance in Navarre, Spain
Round 1
Reviewer 1 Report
I strongly appreciate the efforts of the authors to improve the reliability of the proposed model for estimating wheat grain yield from Sentinel-2 data. The re-submitted version of the manuscript is greatly enriched in terms of both content and form. In particular, the added tests for verifying the collinearity of the independent variables correctly support the assumptions for the applicability of the proposed stepwise multilinear regression model. The implementation of such tests supported the identification of a completely different model (different variables and coefficients).
However, another effort is still required to make the manuscript publishable in RS. The novelty of the study does not yet emerge.
The manuscript still seems a mere exercise in the Navarra area. The added sentences on the study aims and the additional references on the existing studies on yield-models based on S2 are not sufficient.
In the current version, only the high resolution of S2 is highlighted as a novelty. This is not sufficient since yield models based on S2 already exist (references provided in the previous review). The authors have to detail (at least in the introduction, and then in the conclusions) the differences between their approach and those present in the literature.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors have clearly improved their manuscript.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The evaluation of the impact of topography and rainfall on wheat performance is an interesting approach; however, the description of the methods and the presentation of the results are not very clear. I believe that reformulating the text, together with moderate English editing, would improve the readability of the manuscript.
- One of my main concerns about this study is the small number of parcels for such a vast area: just 39 in total, 20 of which were used for modelling the grain yield estimation (2018) and 19 for validation (2019).
- Section 2.1.3 describes the use of temperature data to determine the phenological stage through GDD; however, no information about the number or location of the meteorological stations is provided. Further, in section 2.2.3 you refer to the use of rainfall data from 30 stations located throughout the three agricultural zones under study. Are they the same stations used for obtaining the temperature data? Please add the location of these stations to Figure 1. I also suggest moving this latter section to the previous section 2.1, where the data are described. Besides, section 2.1.3 is too confusing (too many things are described at the same time). Again, I suggest one section for the description of the meteorological data (temperature for GDD) and rainfall, and another for the description of the Sentinel-2 data and the derived vegetation indices.
- Regarding GDD, was it used at any step of the methodology? The purpose of GDD is not obvious to me.
- Sentinel-2 Level-2A images have been available globally since December 2018. Why did you download Level-1C images and decide to do the conversion to SR yourself using Sen2Cor?
- In section 2.1.5, you describe the polygon vector file (SIGPAC) with the parcels' information. This file has the limits of each parcel as well as information about the crop declared by the farmer, right? Can you explain the need to apply an RF classification in order to classify all wheat parcels, when I suppose that you have that information as an attribute in the SIGPAC files? Were the samples used for RF training and testing extracted from the SIGPAC file or collected in the field (line 193)?
- Regarding the RF classification, why were only 3 bands used when you had previously described the added value of the red-edge and SWIR bands? Besides, did you try more than 3 images beforehand, with preliminary results showing that late-season images were decisive for the classification?
- Improve the sentence in lines 299-301; it is not clear.
- Add some text before showing a Table/Figure at the beginning of a section (for example, sections 3.1 and 3.2).
- The legend of Figure 3 is broken by the image. The inside legend is not totally visible (the equation is missing).
- Highlight the Overall Accuracy values in Table 7 and mention this in the legend (it is not clear that those values are at the bottom right of each confusion matrix).
- Improve the sentence in lines 385-387 (legend of Table 8).
- Explain the acronyms in Table 9 (SE) and Table 10 (F), for example in the legend.
- In line 413, replace “In contracts” with “In contrast”.
- In line 418, replace “Regarding rainfall” with “Regarding the rainfall interpolation” for clarity!
- In line 425, replace “expect” with “except”.
- In the legend of Figure 5 replace “three analysed parameters” with “four analysed parameters”.
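To make the GDD comments above concrete, the following is a minimal sketch of how growing degree days are commonly accumulated from daily temperatures. It is not taken from the manuscript: the function name, the sample temperatures, the simple averaging formula, and the 0 °C base temperature are all assumptions made for illustration.

```python
def growing_degree_days(tmin, tmax, t_base=0.0):
    """Cumulative growing degree days from daily min/max temperatures (°C).

    Assumption: the common averaging method, daily GDD =
    max((Tmax + Tmin) / 2 - T_base, 0). The manuscript's exact
    formula and base temperature may differ.
    """
    total = 0.0
    cumulative = []
    for lo, hi in zip(tmin, tmax):
        # Days whose mean temperature is below the base contribute nothing.
        daily = max((lo + hi) / 2.0 - t_base, 0.0)
        total += daily
        cumulative.append(total)
    return cumulative


# Hypothetical four-day temperature record, for illustration only.
tmin = [-2.0, 1.0, 4.0, 6.0]
tmax = [1.0, 9.0, 14.0, 18.0]
gdd = growing_degree_days(tmin, tmax)
```

A cumulative series of this kind is typically compared against crop-specific thresholds to date phenological stages, which is the role the reviewer asks the authors to spell out.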
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The addition of an in-depth analysis of existing models to estimate grain yield from Sentinel-2 data clearly improved the novelty of the study.
In addition, the changes made throughout the manuscript have significantly improved its quality and readability.
Congratulations to the authors for the efforts made.
The manuscript is ready to be published in RS.
Author Response
Reviewer 1
The addition of an in-depth analysis of existing models to estimate grain yield from Sentinel-2 data clearly improved the novelty of the study.
In addition, the changes made throughout the manuscript have significantly improved its quality and readability.
Congratulations to the authors for the efforts made.
The manuscript is ready to be published in RS.
We acknowledge the significant scientific contributions of the reviewer which helped improve the manuscript regarding both its quality and readability.
The positive evaluation from Reviewer 1 is much appreciated. The previous suggestions allowed the authors to improve the manuscript together with other edits related to the other Reviewers and the MDPI Academic Editor.
Sincerely,
The Authors
[remotesensing-841721]
"Also in the attachment."
Author Response File: Author Response.pdf
Reviewer 3 Report
Some minor text editing is still required. For example, in the legend of Table 3 (line 274) the authors mention four red-edge bands, but in fact there are only 3 red-edge bands (B5, B6 and B7).
Another example is, in the sentence in lines 660-662:
"Regarding the goodness of the model the selected vegetation indices showed absence of collinearity as VIF, 2.16 in both cases, was lower to the collinearity threshold value of 10 [66]."
I suggest that it should be:
"Regarding the goodness of the model, the selected vegetation indices showed absence of collinearity, as a VIF of 2.16, obtained for both cases, was lower than the collinearity threshold value of 10 [66]."
Author Response
Reviewer 3
Some minor text editing is still required. For example, in the legend of Table 3 (line 274) the authors mention four red-edge bands, but in fact there are only 3 red-edge bands (B5, B6 and B7).
Following the suggestions of Reviewer 3 the legend of Table 3 has been reformulated to clarify and separate the three red-edge bands from the vegetation red-edge at 20 m of spatial resolution.
Table 3. Spectral bands and spatial resolutions of the Sentinel-2 Multispectral Instrument (MSI). Novelty in spectral coverage includes three red-edge spectral bands and the vegetation red-edge (RE) at 20 m as well as improved SWIR coverage at 20 and 60 m spatial resolutions. Broadband spectral coverage of the visible and near infrared are provided at 10 m spatial resolution.
Another example is, in the sentence in lines 660-662:
"Regarding the goodness of the model the selected vegetation indices showed absence of collinearity as VIF, 2.16 in both cases, was lower to the collinearity threshold value of 10 [66]."
I suggest that it should be:
"Regarding the goodness of the model, the selected vegetation indices showed absence of collinearity, as a VIF of 2.16, obtained for both cases, was lower than the collinearity threshold value of 10 [66]."
We believe that the reformulation proposed by the Reviewer gives more clarity to the sentence and consequently it has been modified (lines 550-552 of the revised manuscript).
The positive evaluation from Reviewer 3 is much appreciated. As suggested, the authors have improved the text for better clarity.
Sincerely,
The Authors
[remotesensing-841721]
Author Response File: Author Response.pdf
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
The article – Estimating wheat grain yield using Sentinel-2 imagery and exploring topographic and rainfall effects on performance in Navarre – reports a method to estimate wheat yield and assess the effects of topographic and climatic variables; it falls well in the scope of the Journal.
Several weak points have been observed in the article:
- A first criticism is the consideration (lines 50 to 60) that empirical models are “more popular and easy to use in practice” than growth models. Scientific research on physically-based models has produced many important results on this topic. I understand that the more input data you have, the more reliable the models will be. Regression models have large limitations, as stated by the authors (lines 54-55), regarding the inference that can be drawn from the results.
- The concept of “Topography” is very broad and cannot be limited to only a few features such as slope, height (I prefer “altitude”) and aspect.
- The number of fields used (39) is very low and the statistical procedure used appears very simple.
- The results of Table 10, about the relationships of topography and rainfall with grain yield, are very poor, even for the GWR method (R² = 0.28).
- There is no reference in the text to the previous crop effect: the soil fertility for wheat after a sunflower crop is higher than after an oat crop.
Additional weak points:
In the Title the word “performance” is far from the word “wheat”: please rephrase (add “its”, for example).
Key words are missing.
In the “2.1.2 Field data” section, essential information is missing entirely: the average field size with associated descriptive statistics (minimum, maximum, standard deviation, etc.).
Table 1: a reference to a crop phenological scale should be added (Zadoks scale or BBCH scale).
Line 153: jjjjjn nnnn: a mistake?
Line 155: 0 °C
Equation (2): define PA and UA
Lines 264-273: the words “ArcGis Pro 2.3.0” are repeated three times.
Line 296: larger
Line 322: Table 7
Lines 415-416: very obvious statement
In general, the article appears poor in innovation and originality: the results are of very low interest because of the limited inference that can be made with this kind of empirical model.
For these reasons I think that the MS is not acceptable for publication in the present form.
Reviewer 2 Report
This manuscript reports a wheat yield estimation using Sentinel-2 satellite images. The authors used a governmental database for crop classification to classify wheat fields in Navarre, northern Spain. They derived different vegetation indices from Sentinel-2 data and correlated them with total yield from 39 different fields from the 2018 and 2019 seasons. They also studied the influence of topography and rainfall on wheat grain yield variability. In general, the manuscript is well written, with enough information in all sections. However, I have the following minor comments:
L97: Add a reference for this 90%.
L126: Add more information about study fields area; the total area from each region and the average area per field.
L131: Are those fields the same for the two seasons or different? Clarify it.
L139: Convert yield units to be ton/ha as it is commonly used.
L153: Correct the summation variable.
L155: Correct the 0 °C degree sign.
L172: Define the term vegetation indices (VIs).
Table 5: Why do you have 2 image dates from each region at each phenological stage? The same for lines 225-227.
L244: What about the 2019 season? Did you perform the classification for it?
Equation 2: Define equation terms; Fscore, PA and UA.
L271: Correct the order 2.2.4.
L371: here in.
L469: You may start a new section from here (4. Discussion).
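The request above to define the terms of Equation 2 can be illustrated with a short sketch. It assumes that PA and UA denote producer's and user's accuracy, as is usual in remote sensing, and that the F-score is their harmonic mean; the manuscript's Equation 2 is not reproduced in this record, so this form is an assumption. The function name, the row/column convention, and the confusion-matrix counts are hypothetical.

```python
import numpy as np


def accuracy_metrics(confusion):
    """Per-class PA, UA and F-score plus overall accuracy.

    Assumed convention: rows are reference (ground-truth) classes,
    columns are predicted classes. Every class is assumed to have at
    least one reference sample and one prediction.
    """
    cm = np.asarray(confusion, dtype=float)
    correct = np.diag(cm)
    pa = correct / cm.sum(axis=1)  # producer's accuracy (recall)
    ua = correct / cm.sum(axis=0)  # user's accuracy (precision)
    f_score = 2 * pa * ua / (pa + ua)  # harmonic mean of PA and UA
    overall = correct.sum() / cm.sum()
    return pa, ua, f_score, overall


# Made-up counts for a two-class problem (wheat / not wheat).
cm = [[45, 5],
      [10, 40]]
pa, ua, f_score, overall = accuracy_metrics(cm)
```

With these counts, the producer's accuracy of the first class is 45/50 and its user's accuracy is 45/55, which shows how the two measures can diverge for the same class.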
Reviewer 3 Report
General comments
The paper proposes a method for estimating wheat grain yield from Sentinel-2 data. In particular, for such an estimate, the authors adopt stepwise multilinear regressions based on different vegetation indices measured at various phenological stages. In addition, they evaluate the relationships of the mapped per-parcel grain yield estimation with topography and rainfall. The topic is very interesting for RS readers.
The manuscript is well written and organized; unfortunately, in its current form, it seems a mere exercise in applying a method to a specific area.
The novelty of the implemented study is not shown. There is a vast literature on crop classification and yield estimation from Sentinel-2, including wheat cultivation, and the comparative improvements of the proposed models are not evident.
The model is applied without verifying the assumptions for its applicability, particularly the absence of, or low, correlation between predictor variables (no collinearity among the independent variables). The predictors used (statistics of vegetation indices) are generally highly correlated; therefore, a high correlation is expected for the minimum values of the two vegetation indices (i.e., MIN_GNDVI, MIN_MSAVI) used in the selected best model (no statistics of predictors or plots of residuals are shown).
A rearrangement of the manuscript, including an in-depth collinearity study of the stepwise multilinear regression method and a detailed overview of Sentinel-2 crop mapping and yield estimation, is required before the manuscript can be resubmitted to the RS journal.
Specific comments
Introduction: An overview of recent literature on Sentinel-2 applications for crop mapping and yield estimation is missing. There are many articles published in remote sensing journals, including RS and other MDPI journals. Such an overview would give a better picture of your study by highlighting the pros and cons of the selected approach.
Sentinel 2 wheat mapping and yield prediction
Skakun S. et al., “Winter Wheat Yield Assessment from Landsat 8 and Sentinel-2 Data: Incorporating Surface Reflectance, Through Phenological Fitting, into Regression Yield Models,” Remote Sensing, vol. 11, no. 15, p. 1768, Jan. 2019, doi: 10.3390/rs11151768.
Dedeoğlu M., L. Başayiğit, M. Yüksel, and F. Kaya, “Assessment of the vegetation indices on Sentinel-2A images for predicting the soil productivity potential in Bursa, Turkey,” Environ Monit Assess, vol. 192, no. 1, p. 16, Dec. 2019, doi: 10.1007/s10661-019-7989-8.
Waldner F., H. Horan, Y. Chen, and Z. Hochman, “High temporal resolution of leaf area data improves empirical estimation of grain yield,” Sci Rep, vol. 9, no. 1, p. 15714, Dec. 2019, doi: 10.1038/s41598-019-51715-7.
Belgiu M. and O. Csillik, “Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping analysis,” Remote Sensing of Environment, vol. 204, pp. 509–523, Jan. 2018, doi: 10.1016/j.rse.2017.10.005.
Sentinel 2 crop yield prediction
Kayad A., M. Sozzi, S. Gatto, F. Marinello, and F. Pirotti, “Monitoring Within-Field Variability of Corn Yield using Sentinel-2 and Machine Learning Techniques,” Remote Sensing, vol. 11, no. 23, p. 2873, 2019.
Habyarimana E., I. Piccard, M. Catellani, P. De Franceschi, and M. Dall’Agata, “Towards Predictive Modeling of Sorghum Biomass Yields Using Fraction of Absorbed Photosynthetically Active Radiation Derived from Sentinel-2 Satellite Imagery and Supervised Machine Learning Techniques,” Agronomy, vol. 9, no. 4, p. 203, Apr. 2019, doi: 10.3390/agronomy9040203.
Lambert M.-J., P. C. S. Traoré, X. Blaes, P. Baret, and P. Defourny, “Estimating smallholder crops production at village level from Sentinel-2 time series in Mali’s cotton belt,” Remote Sensing of Environment, vol. 216, pp. 647–657, Oct. 2018, doi: 10.1016/j.rse.2018.06.036.
Methods: Very few lines (214-216) are dedicated to the multilinear stepwise regression approach, even though it represents the core of the yield estimation model. More information is needed and the basic assumptions have to be declared (e.g. normal distribution of residuals, no or little multicollinearity, homoscedasticity).
For the type of selected predictors, all based on vegetation indices, a measure of collinearity has to be added, such as the variance inflation factor (VIF), which assesses how much the variance of an estimated regression coefficient increases if your predictors are correlated. The collinearity strongly limits stepwise selection methods, because if one, rather than another, collinear predictor is dropped from the model, the selection process may proceed on a wrong trajectory (Harrell 2001, Meloun et al. 2002, Dormann et al. 2013).
Harrell F. E. Jr 2001. Regression modeling strategies – with applications to linear models, logistic regression, and survival analysis. – Springer.
Meloun M. et al. 2002. Crucial problems in regression modelling and their solutions. Analyst 127: 433–450. DOI: 10.1039/b110779h
Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., ... & Münkemüller, T. (2013). Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27-46. doi.org/10.1111/j.1600-0587.2012.07348.x
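The VIF check requested above can be sketched as follows. This is an illustrative NumPy implementation of the standard definition, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors. It is not the authors' code (they worked in R), and the function name and the synthetic data are assumptions.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X (rows = observations, columns = predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the other predictors plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic, deliberately correlated predictors standing in for two
# vegetation-index statistics (illustration only, not study data).
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 0.7 * a + rng.normal(scale=0.7, size=200)
vifs = variance_inflation_factors(np.column_stack([a, b]))
```

With only two predictors, both VIF values coincide and equal 1 / (1 - r²), where r is their Pearson correlation, so a VIF near 1 indicates nearly uncorrelated predictors and larger values indicate stronger collinearity.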
The RStudio tool (library “MASS”) adopted by the authors for the multilinear stepwise regressions allows diagnostic plots to be created to check the assumptions of the model. Such plots have to be added to the results section to show the goodness of the proposed models.