Article
Peer-Review Record

Characterizing Leaf Nutrients of Wetland Plants and Agricultural Crops with Nonparametric Approach Using Sentinel-2 Imagery Data

Remote Sens. 2021, 13(21), 4249; https://doi.org/10.3390/rs13214249
by Mandla Dlamini 1,2,*, George Chirima 3,4, Mbulisi Sibanda 5, Elhadi Adam 1 and Timothy Dube 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 26 July 2021 / Revised: 18 October 2021 / Accepted: 19 October 2021 / Published: 22 October 2021
(This article belongs to the Special Issue Remote Sensing for Water Resources and Environmental Management)

Round 1

Reviewer 1 Report

General comments

The authors characterized the nutrient contents (N, K, Ca, Mg, P, S, Zn, B, Cu) of two agricultural crops and wild plants using Sentinel-2 MSI data and non-parametric regression methods. The manuscript has been substantially improved by the revision, which follows most of the important recommendations the reviewers made on the previous submission. My major concern was why the authors chose to apply only a non-parametric approach (random forest, RF) to describe and characterize the nutrient contents with several explanatory variables, and not a parametric approach (linear or nonlinear regression). The authors provided a reason for this; however, I am not fully satisfied with it. The authors may add further reasons to the manuscript, for example that a nonparametric approach such as RF can be suitable when several potential explanatory variables are involved and it is difficult to identify which contribute most to the variation in the response variable of interest (here, nutrient concentration). Another reason would be that, when explanatory variables have complex relationships (e.g., nonlinearity) and are highly correlated with each other, a nonparametric regression approach can deal with such problems effectively. The authors also have several such variables with complex relationships, of which only a few contribute most (Figure 4). Under these data conditions, it may be reasonable to apply RF. However, the effectiveness of this method should be verified by applying a parametric approach (linear or nonlinear regression, most preferably mixed-effects modeling). Another important issue, which I also raised in my previous review, is that a nonparametric approach such as RF cannot be used as a prediction model for new data in a new environment, because it does not produce the estimated parameter values and variances that any prediction model needs in order to be applicable to such new data and environments.
A nonparametric approach is a “black box” approach, suitable for describing the variation in the response variable of interest using the explanatory variables supplied to the box. The authors have consistently used the terms nutrient predictions, prediction models, etc.; however, I do not agree with using these terms, for the reasons given above. The authors are therefore advised to use more appropriate terminology than “prediction”. In fact, the authors have not predicted the nutrient concentrations; they have described, explained, or characterized the variation in the nutrient concentrations. Figure 3 uses the term “predicted values” on the y-axis, which should be “fitted values” if the authors did not apply their fitted model to new data and a new environment. How was this figure produced, and which dataset was used to produce it: the training dataset (70%) or the testing dataset (30%)? Please state how it was produced, and apply the suggested corrections if needed. The issues above are my major concerns, which are still not properly addressed, and the authors are asked to address them in the revision. After addressing those issues, the manuscript may be published in Remote Sensing, as it has the potential to contribute new knowledge to the readers of the journal. I have some minor issues, listed below, which should also be resolved.
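The distinction the reviewer draws between "fitted" values (estimates for the training samples) and "predicted" values (estimates for held-out samples) can be illustrated with a minimal sketch. The 70/30 split follows the report; everything else is an assumption made for illustration: the data are synthetic, and a one-nearest-neighbour regressor stands in for the random forest only to keep the sketch dependency-free. It is not the authors' model.

```python
import math
import random


def nearest_neighbour_predict(train, x):
    """Return the response of the training sample whose predictor is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def rmse(observed, estimated):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def split_70_30(pairs, seed=0):
    """Shuffle reproducibly and split into 70% training and 30% testing samples."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: one spectral predictor and one noisy nutrient response.
rng = random.Random(42)
samples = [(i / 100, 1.0 + 2.5 * (i / 100) + rng.gauss(0, 0.2)) for i in range(20, 90)]
train, test = split_70_30(samples)

# "Fitted" values: the model is evaluated on the samples it was trained on.
fitted = [nearest_neighbour_predict(train, x) for x, _ in train]
# "Predicted" values: the model is evaluated on samples it has never seen.
predicted = [nearest_neighbour_predict(train, x) for x, _ in test]

train_rmse = rmse([y for _, y in train], fitted)
test_rmse = rmse([y for _, y in test], predicted)
print(f"training RMSE (fitted): {train_rmse:.3f}")
print(f"testing RMSE (predicted): {test_rmse:.3f}")
```

In this sketch the training error is zero by construction, because each training sample is its own nearest neighbour, while the testing error is not. A scatterplot built from training samples would therefore look far better than one built from testing samples, which is why the dataset behind Figure 3 matters.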

Minor issues

Would be more attractive if you changed title to “Characterizing Leaf Nutrients of Wetland plants and Agricultural Crops with Nonparametric Approach using Sentinel-2 Imagery Data”

Line 19-40: Please define acronyms and abbreviations used here, and avoid them if they are not used in this section again and again, e.g. %RMSE and NDVI

Line 178-186: As mentioned earlier, using only one methodological approach, such as RF may be risky, as different approaches have different performances according to the data structure and variations. Thus, it would be wise to apply at least two or three approaches and check their difference. Verification of results produced with any method is necessary. You may add some additional reasons as suggested earlier why only RF was chosen in your study.

Furthermore, the hypotheses you evaluated or the questions you answered in your study should be stated in the last paragraph (lines 178-191).

Line 194: delete description, study area alone is good enough.

Line 227: why not “plant leaf sampling”?

Table 1: Sample collection date instead of collection date of samples

Table 2: Please define each acronym used in this table. Please provide left part of the bracket to the fifth formula

Line 399: ….nutrient prediction…. This is not appropriate term used here. You may consider applying appropriate terms as suggested earlier. “Description of dry and wet season vegetation nutrient variations” may be the term that is more appropriate. You should be clear that every model or algorithm describes the variation of any response variable of interest (nutrient concentration, in your case) with use of explanatory variables or predictor variables.

Throughout the results section you report statistical significance (p-values and alpha values). This is not necessary if you add one sentence to the analysis section, such as "unless otherwise stated, the significance level was set at 5%" (or 1%, as you wish).

Line 410: it is not necessary to say "statistically significant"; "significant" alone conveys this meaning, so please consider making this correction everywhere. "Significant" is a term of statistical analysis, so it obviously carries that meaning.

Table 4, 5, 6, 7, 8: Please define each acronym used in these tables to make them self-explanatory.

Figure 3 should be improved by adding a one-to-one line (diagonal line) to the data and the regression line (dotted line). The figure caption should also define what the dotted line is. When you add the one-to-one line to each panel, your data and regression line will very likely deviate substantially from it, meaning that your model, algorithm, or RF is not appropriate enough for good fitting/prediction performance. The one-to-one line and the regression line (dotted line) should both pass through the middle of the data cloud, which would not be possible with the RF applied here. With this consequence in mind, I suggested that you apply linear or nonlinear regression to describe the data variation adequately, which you declined to do. Also, please state which dataset is shown in this figure, the training dataset or the testing dataset, and use the appropriate term on the y-axis as suggested earlier. Also define the acronyms used in this figure.
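The one-to-one check requested for Figure 3 can also be expressed numerically. The sketch below fits the regression line of estimated on observed values and compares it with the one-to-one line, which has slope 1 and intercept 0. The function names and the five data points are hypothetical and are not taken from the manuscript.

```python
def regression_line(observed, estimated):
    """Least-squares line estimated = intercept + slope * observed."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    sxx = sum((o - mean_o) ** 2 for o in observed)
    sxy = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    slope = sxy / sxx
    return mean_e - slope * mean_o, slope


def mean_bias(observed, estimated):
    """Average signed difference between estimated and observed values."""
    return sum(e - o for o, e in zip(observed, estimated)) / len(observed)


# Hypothetical values in which the estimates are compressed toward the mean.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [2.0, 2.5, 3.0, 3.5, 4.0]

intercept, slope = regression_line(observed, estimated)
print(f"slope = {slope:.2f} (one-to-one line: 1.00)")
print(f"intercept = {intercept:.2f} (one-to-one line: 0.00)")
print(f"mean bias = {mean_bias(observed, estimated):.2f}")
```

Here the mean bias is zero, yet the slope of 0.5 shows that low values are overestimated and high values underestimated. That pattern is visible only when the one-to-one line is drawn alongside the regression line.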

Figure 4: Define acronyms used in this figure

Line 656: high coefficient of determination

Line 657: What do you mean by safe to conclude? Please conclude based on your data and results, whether they are appropriate or not, but do not use term safe to conclude.

Line 662-675: Please define acronyms and avoid using such terms in conclusion, which should be understood independently from other sections.

Author Response

Attached is the reply

Author Response File: Author Response.docx

Reviewer 2 Report

Comments regarding “Seasonal Wetland Vegetation and Crop Leaf Nutrients Characterisation using Sentinel-2 Imagery”. This manuscript is potentially of interest. It is exciting to read that S2 works reasonably well for crop leaf nutrient quantification. However, there are a few concerns that need to be resolved:

  • First and foremost, the use of these vegetation indices (VIs) is not really justified. After all, they are merely simple transformations of the original bands. If the authors believe that VIs provide added value, that should first be analysed against using the single bands alone. This is important because ML algorithms (e.g. RF) are well able to exploit the relevant information without simple transformations being applied first. Hence it may be that equally good results are obtained when simply entering the raw bands. Thus, if the authors are convinced that introducing VIs provides added value, that should be clearly demonstrated and quantified. If it appears that VIs do not provide added value, then I would not complicate matters by calculating these VIs first. Hence, I believe the Results section should be reorganized, with a more systematic analysis. I take this comment very seriously, because VIs are far too often introduced without a proper rationale for whether that step is really needed. Related to this, note that calculating PCA components as auxiliary information has often proved more effective than VIs. That is worth considering as well.
  • The authors decided to use random forests (RF). However, it is not quite clear why this would be the best-performing method. Did the authors do any analysis with alternative ML algorithms? For instance, note that several studies have compared multiple ML algorithms with S2 and did not conclude that RF is the best choice. See e.g. Verrelst et al. (2015), where all kinds of ML algorithms were compared. Gaussian process regression (GPR) clearly outperformed regression trees. See also related studies, and note that GPR also provides band ranking properties. It would be of interest to compare your dataset and RF results with GPR results.
  • It is a pity that no hyperspectral data was analysed. That would give a more physical interpretation about the relevant absorption regions. Now a rather shallow discussion is provided about relevant bands. For instance, N can be related to specific bands in the SWIR. See also the papers of Berger et al. (2020).
  • When using S2, and when claiming multiple times “mapping” or “monitoring” throughout the manuscript, then some maps are expected. Now results are restricted to validation data, yet we have no clue whether the maps are meaningful. After all, satellite data is much more than those <100 samples. Maps should make sense over larger areas.
  • See also specific comments within the ms.
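The first point above, that a vegetation index is merely a simple transformation of the original bands, can be made concrete with the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red). The sketch below is illustrative only: it assumes Sentinel-2 band 8 (near infrared) and band 4 (red) as inputs, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        raise ValueError("NDVI is undefined when NIR + red equals zero")
    return (nir - red) / total


# Hypothetical surface reflectances for two pixels (assumed bands: B8 = NIR, B4 = red).
pixel_a = {"B8": 0.40, "B4": 0.10}
pixel_b = {"B8": 0.80, "B4": 0.20}

ndvi_a = ndvi(pixel_a["B8"], pixel_a["B4"])
ndvi_b = ndvi(pixel_b["B8"], pixel_b["B4"])
print(f"pixel A: NDVI = {ndvi_a:.2f}")
print(f"pixel B: NDVI = {ndvi_b:.2f}")
```

Both pixels yield an NDVI of about 0.6 although their band reflectances differ by a factor of two. The index is fully determined by the two bands and discards their absolute level, which is consistent with the reviewer's request to test the raw bands against the VIs as model inputs.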

Comments for author File: Comments.pdf

Author Response

Attached please find the reply.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thank you authors for careful revision of the manuscript as suggested. I am satisfied with your revision works and response. Now your manuscript is suitable for publication in RS. Congratulations!

Author Response

We thank the reviewer for reviewing this manuscript and for considering it suitable for publication in the RS journal.

Reviewer 2 Report

Dear authors,

Thanks for preparing an updated version.

Unfortunately, I am not satisfied with the approach pursued. Rather than making the effort to carry out a more in-depth analysis, the authors offered only a shallow answer, referring to old and not necessarily appropriate literature. Let me explain:

(1) The authors used VIs in a machine learning regression algorithm. Yet, apparently the authors did not want to test the performances of the individual bands as opposed to vegetation indices. Nevertheless, it is a straightforward analysis that immediately will clarify whether the usage of VIs provide additional benefits as opposed to directly use the bands.  That such analysis is not done is problematic in several ways. Is it: (i) the authors lack curiosity about their data? They are simply not interested in systematically exploring their data? Or (ii) the authors did the test, but are not willing to show the results? Whatever the reason is, by ignoring a systematic analysis it gives an overall impression of an unfinished work.

(2) The authors referred to literature mentioning the benefits of using VIs. However, the authors seem not to realize that here the data is inputted into an adaptive (nonlinear) machine learning algorithm (RF). Hence, the RF exploits the available information to the fullest (i.e. more flexible than the simple formulations introduced by VIs). That is different as compared to classical (linear) exploitation of the VIs (as referred to in the mentioned literature). Please realize that the RF is highly adaptive, and thus perfectly able to exploit relevant information directly from the raw bands. See for instance (10.1109/TGRS.2011.2168962) where it was demonstrated that single bands in machine learning (GPR) can outperform the usage of VIs. 

Further, the answer about using only RF and not being willing to test other algorithms is also not satisfactory. While the authors refer to some articles (e.g., Mutanga et al., 2012), it is a pity that the authors apparently lack the curiosity to find out for themselves whether RF is truly top performing for their data. Note that the literature also contains papers demonstrating that other algorithms can outperform RF (again, it depends on the data used).

Altogether, a more critical reflection towards the analysis would make the manuscript more credible. 

Thanks for showing maps. I would suggest to give the variable name directly on top of each map. Now it exhausts the reader forcing each time to look down the caption to find out what is being mapped.

 

 

Author Response

Attached please find the reply.

Author Response File: Author Response.docx

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

  1. The Introduction can be improved. Although many references are cited, they are not cited properly: what matters is the relevance and relation of each reference to the present work. For example,

 "Their presence and concentrations in plants can now be easily determined by using remote sensing which is now well developed and capable of accurately estimating on foliar nitrogen, phosphorus and potassium [22-31]". This group citation is too general. Readers may want to know the particularities of and differences between the references, such as [27]: "Explaining Leaf Nitrogen Distribution in a Semi-Arid Environment Predicted on Sentinel-2 Imagery Using a Field Spectroscopy Derived Model". This article seems to be a key reference for this study, and a discussion of it is expected.

2. About the structure of the ms, Results and Discussion might be joined, and Conclusions need to be extended with quantitative results.

3. A flow chart which summarizes overall strategy is missing

4. Details of field data collection are missing

5. A brief and concise summary how random forest works is appreciated.

6. Figures and tables can be improved in presentation and quality. Additional proper labeling is missing. For example there are two Table 1. The quality of Figure 1 is poor. Landmarks are missing, labels are not clear. Figs 2 and 3 have too many panels, can be split or re-arranged.

7. Proper names need to be made uniform, such as UMFolozi, uMFolozi, umfolozi: which is correct? Do they differ only in the case of a letter?

8. A discussion of the performance of the approach is needed. For example, what might be the reasons for the poor detection of leaf magnesium, copper and sulphur in the wet season?

Reviewer 2 Report

Seasonal Wetland Vegetation and Crop Leaf Nutrients Characterisation using Sentinel-2 MSI Data

The article focuses on assessing the nutrient contents of wetlands using Sentinel-2 images.

Specific comments:

Line 109 - Sentinel-2 was actually not launched that recently - rather in 2015
Line 216 - Table 1 - given that the S2 images come from different relative orbits i.e. R049 and R092 it is unclear if they contain data for the same extent. Maybe some clarification on that would be useful
Line 286 - Table 2 - since the values of the elements have different ranges, listing RMSE and MSR does not shed any light on how good the regression was. The elements have to be scaled to the same range for the RMSE values to be comparable. Another question regarding RMSE: are you listing in the table the mean RMSE over all the samples?
Line 288 - Table 3 - is R2 different than the previously used Rsq?
Line 291-292 - Table 4 - it is not clear to me what you are showing in this table. What do 0.62 and 0.69 represent? These values have no column heading. The P-values shown in the table are greater than 0.05, which would mean that your results are not significant. Regarding P-values, you also did not mention the significance level (I assume 0.05).
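The scaling issue raised for Table 2 can be illustrated with a range-normalized RMSE, which is one common way to make errors comparable between variables measured on different scales. The sketch is not the authors' procedure: the nutrient values and units are hypothetical and were chosen so that both nutrients have the same relative error.

```python
import math


def rmse(observed, estimated):
    """Root mean square error in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def range_normalized_rmse(observed, estimated):
    """RMSE divided by the observed range, giving a unitless value."""
    spread = max(observed) - min(observed)
    if spread == 0:
        raise ValueError("observed values have zero range")
    return rmse(observed, estimated) / spread


# Hypothetical nutrients on different scales (e.g. N in % and Zn in mg/kg).
n_observed, n_estimated = [2.0, 3.0, 4.0], [2.2, 2.9, 4.1]
zn_observed, zn_estimated = [20.0, 30.0, 40.0], [22.0, 29.0, 41.0]

for name, obs, est in [("N", n_observed, n_estimated), ("Zn", zn_observed, zn_estimated)]:
    print(f"{name}: RMSE = {rmse(obs, est):.3f}, "
          f"normalized RMSE = {range_normalized_rmse(obs, est):.3f}")
```

The raw RMSE of the second nutrient is ten times larger purely because of its units, while the normalized values coincide. Dividing by the mean of the observations instead of the range is another common choice; which one the manuscript's %RMSE uses is not stated in this record.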

Broad comments:

The design of the statistical analysis is not convincing. First, it is not clear whether the statistics shown in Tables 1-6 are computed on the test or on the training data. Secondly, reading the tables, one would conclude from Table 4 that your results are not significant. Also, you provided the p-value only for Table 4. Scaling the absolute values of the nutrients so that they have the same range would give a better overview of the meaning of the RMSE and of how the estimates of the individual nutrients perform compared to each other.
Regarding RF, you did not specify the hyperparameters used, nor did you list the features used before presenting the results. Also, it is unclear to me whether you built two models, one for the dry and one for the wet season. In the case of two models, did you use exactly the same model settings and predictors?
On a theoretical level, it would be good to explain whether the concentration of nutrients is stable in time or whether it varies. Maybe you would have benefited from more S2 images around the dates of the field campaigns. Did you check whether more cloud-free images were available, at least for the dry season?
Regarding the reference data, it would be interesting to check whether the nutrients correlate with each other, to get a better overview of the subject. Is a high concentration in N coincident with a high concentration in C? In the scatterplots in Figure 2 it would be good to be able to identify the reference points (e.g. by a color code).
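The check suggested in the last comment, whether nutrient concentrations in the reference data correlate with each other, could be sketched as a pairwise Pearson correlation. The nutrient names are taken from the report, but the concentration values below are invented for illustration and do not come from the study.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical leaf nutrient concentrations for six samples.
nutrients = {
    "N": [2.1, 2.8, 3.0, 3.6, 4.1, 4.4],
    "P": [0.20, 0.26, 0.31, 0.33, 0.42, 0.45],
    "Mg": [0.50, 0.41, 0.44, 0.35, 0.30, 0.33],
}

# Correlation for every unordered pair of nutrients.
for a, b in combinations(nutrients, 2):
    print(f"r({a}, {b}) = {pearson(nutrients[a], nutrients[b]):+.2f}")
```

A table of such coefficients, or the matching scatterplots, would show which nutrients vary together in the field samples.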

Reviewer 3 Report

General comments

Authors intend to characterize nutrients of two agricultural crops and wild plants using Sentinel-2 MSI data, which is not substantially a new concept, as many other studies (some of them are also cited in the manuscript) have already been carried out in this subjects. However, to some extent, the current study may be of interest, as this intends to make the exploration on the nutrients of crops and plants growing in the wetlands, which may have a rich biodiversity. Leaf samples were collected, representing leaf conditions in both the winter and wet seasons. Various vegetation indices of the species of interest were extracted from the Sentinel-2 MSI data (image) of the sampled area. Authors applied the nonparametric modeling approach, such as random forest (RF) to describe or characterize the nutrients concentrations in the leaf with the vegetation indices used as the predictor variables. After reading this manuscript twice, I concluded that authors have a clear intention to present the nonparametric models to predict the concentrations of N, K, Ca, Mg, P, S, Zn, B, and Cu nutrients of wetland agricultural crops and wild plants. However, I did not find any such models presented in the results that can be used to apply in the new environment. The prediction models should be presented with the parameter estimates and their standard errors. However, authors did not present these values at all. Without this information, how model users will be able to apply the authors’ models to predict the concentrations of nutrients. What are usefulness of the presented results and how these results could be implemented? The method (RF) used in this study only describe the concentrations of nutrients based on the current data (Figure 2, in which ‘prediction’ term used is wrong, and this should be ‘fitted value’), but not able to make the prediction for new data in another environment. 
The most important issue is also why the authors chose only RF to describe the concentrations of nutrients, and why not other machine learning approaches and parametric approaches (linear or nonlinear regression), as each approach would provide different fitting/predicting performance on different datasets. Thus, it is necessary to evaluate a number of candidate models or algorithms from both the parametric and non-parametric modeling approaches. The authors seem to agree with this idea (Lines 135-138), but they did not consider applying any of the alternative approaches mentioned above to make their modeling scientifically more robust. The sampling and chemical analysis are also not clear, as the authors did not describe the methodology in greater detail. The authors used ‘purposive random sampling’: what does this mean? Sampling should be either purposive (subjective) or random (objective). When sampling is subjective, it may produce biased results. Sampling should thus be random, and the chosen samples should be representative. The authors kept all the leaf samples in the oven at 70 °C for at least 24 hours and milled them to particle sizes of <0.5 mm, regardless of crop and plant type. Is this approach appropriate for each crop and plant type, as they would have different moisture contents due to different physical and chemical properties? In order to make the study more interesting and innovative, the authors are required to address all the major issues mentioned above. The current version does not have significant novelty, and is therefore not suitable for the high-impact journal RS.

Reviewer 4 Report

The manuscript entitled “Seasonal wetland vegetation and crop leaf nutrients characterization using Sentinel-2 MSI Data” is interesting. At the beginning of the text, the reading is pleasant. On the other hand, the manuscript has problems with Methods and Results.

The document has some methodological issues to solve. Firstly, the authors affirm that images were atmospherically corrected to Top Of Atmosphere (TOA) reflectance. Actually, Sen2Cor converts to Bottom of Atmosphere (BOA) reflectance or surface reflectance. Another issue about atmospheric correction is validation. Were the results of the atmospheric correction assessed based on in situ measurements? In addition, justify why Sen2Cor was used. Cite some paper that used it or show results.

In addition, It is not clear what was done according to the description in lines 225-227. Why were images classified? What was the classifier used? Were really the images classified? Would not the authors have extracted surface reflectance values from the images?

Also, it is not clear the vegetation indexes used as input attributes. Specify each one and show the formula and abbreviation used. Did all the experiments use the same input dataset? It is important to specify the number of training and validation samples used in each stage. When we analyze the Tables and Figure displayed in the Results, it is not clear what was done.

Some problems are found in the statistical metrics. For example, what does MSR mean? Include the meaning. What is the RMSE coefficient? R2 is the coefficient of determination.

The ‘Results’ section needs to be substantially improved. This section is very superficial. The authors inserted 5 tables and 2 figures, but there are no consistent explanations. The results in those tables and figures are under-exploited. Include the unit of each element in Tables 2, 3, and 5. The unit should be included in the text as well. Explain RMSE% = 0 for Ca. How many samples were used in validation? Include the number of samples in the figures as well. Was Figure 2 created on the basis of the validation dataset? Figure 2 needs to be explained.

In Figure 3, what does each abbreviation on the y-axis mean? Include these abbreviations in the text. Include in the caption of the figure a description of (a), (b), (c), ... and (i). Figure 3 needs to be explained.

 

Specific comments

Line 17. Remove comma after ‘in’.

Line 20 and 21. Remove ‘In’ before ‘this’.

Line 21. Remove comma after ‘study’. After correction: ‘This study determined…’.

Line 26. Include comma after ‘seasons’.

Line 26. Remove comma after ‘R2’. After correction: (R2 < 0.5).

Line 28. Include units for 26.8 and 20.8.

Line 29. Replace comma for ‘=’. After correction: (p-value = 0.001).

Lines 28-30. ‘There was a statistically… for vegetation = 0.81)’. I do not understand the relationship between difference and R2.

Line 30. Include the central wavelength of the red-edge band in parentheses.

Line 30. Include acronym of ‘normalized difference vegetation index'.

Line 39. Remove space between number and %.

Line 42. Remove comma after ‘include’.

Line 53. Remove ‘Nyamadzawo et al.’

Line 65. Include 'and' before 'increased'.

Line 65. Remove ‘and’ before ‘are’.

Line 68. Include space between ‘[17]’ and ‘found’.

Line 77. Include space between ‘amounts’ and ‘[21]’.

Line 78. Remove ‘easily’.

Line 84. Include space between ‘spectroscopy’ and ‘[32]’.

Line 90. Replace ( for [.

Line 109. Remove extra space between ‘Sentinel-2’ and ‘multispectral’. In addition, it is ‘Multispectral’.

Line 113. Acronyms, symbols, and abbreviations are introduced the first time that the meaning appears in the text. In this case, S2 should be included at the beginning of this paragraph. Include the abbreviation for ‘Multispectral Imager’ (MSI) because it is used in line 121. Check this for other symbols and acronyms in the whole text.

Line 113. Actually, it is 10-60 m.

Line 116. ‘red-edge’ instead of only ‘red’.

Line 121. Actually, the sensor is MSI. S2 is the platform.

Lines 121-124. Include number of each band. After, this information will be important.

Line 127. Remove ‘estimation of the’ or ‘estimation’ before [50].

Line 147. Section 2.1. Is there any difference between UMfolozi, uMfolozi and Mfolozi? If not, standardize in the whole text.

Line 151. Remove space in 1 620.

Line 163-165. ‘When about… [66]’. There is something wrong with the phrase.

Line 169. Include space between ‘mm/a.’ and ‘[67]’.

Line 171. Figure 1. Specify (A), (B), and (C) in the caption of Fig.1.

Lines 189-191. Use symbols. Check the comment in line 113.

Line 193. First time in the text? If so, include symbols, hereafter, use only symbols.

Line 196. Would not it be ‘CO2’?

Line 223. What is the central wavelength?

Line 232. Remove comma after ‘by’.

Line 232-233. ‘(remotely sensed data in this study)’. What was the independent variable used as input? Was bottom-of-atmosphere reflectance extracted from atmospherically corrected S2 images?

Line 258. Specify bands used because bands 1, 9, and 10 were not used.

Line 265. What does MSR mean? Include the meaning.

Line 292. Table 4. Is Summer-Winter repeated?

Line 321. Remove blank line.

Line 343. Remove extra space before ‘Few’.

Lines 344-345. ‘(REP1 and NDVI)’. Abbreviations and acronyms need to be included together with their meaning in the text. This must be done in the first time that the term appears in the text.

Line 362. Use symbols.

Lines 365-375. Explain why such bands and indexes exhibited better performance compared to the other attributes.

Line 370. Use symbols.

Line 370. Include central wavelength between parentheses for red-edge position 1.

Line 372. Use symbols.

Line 376. Use symbols

Line 383. Use symbol.

Line 385. Use symbol.

Lines 386-387. Use symbols.

Line 395. Use symbols.

Line 405. Use symbols.

Lines 405-406. Is it possible to affirm this? Are there results showing the performance of models in estimating concentrations for each vegetation type?

Lines 410-417. Did the studies cited also use Sentinel-2 images? If so or not, including sensor information.

Line 411. Use symbols.

Line 412. Use symbols.

Line 413-414. ‘This means… some seasons’. But why?

Line 419. Use symbols.

Lines 427-428. Use et al.

Line 428. Include space between [102] and found.

Line 429. Is there any explanation for inverse performance?

Lines 459-460. Use symbols.
