Rapid Mapping of Large-Scale Greenhouse Based on Integrated Learning Algorithm and Google Earth Engine
Round 1
Reviewer 1 Report
This is an interesting paper proposing the construction of an integrated classifier, based on three state-of-the-art independent classifiers, for greenhouse mapping over large areas. The paper is in general well written, but it focuses mainly on the results and qualitative comments. Details about the classifiers’ parameters, for example, are missing. I guess that the authors want to preserve the ownership of their findings but, on the other hand, if the calculations cannot be reproduced independently, how sure can the reader be that they are accurate? For example, it is not clear to me how the results of the three classifiers are combined, or how the accuracy-related weights are calculated.
I also have the following minor comments:
- The references are not numbered, so it was practically impossible to know which paper is cited in a specific context.
- Various terms are not defined. These terms are common notions in the field, but for a wider audience they should be explained, for example:
- “Referring to the results of previous studies, the most common 6 texture features including the asm, contrast, corr, var, idm, and ent, are selected to train the classifier, and to reduce data overlap and redundancy between too many texture features”
- “CART, SVM, randomForest, gmoMaxEnt and naiveBayes”
- I think that the following phrase should be reformulated using authors’ words in a more explicit form:
- Due to remote sensing images generally have the phenomenon of "same object with different spectrums and different objects with same spectrum" [19], there may be misclassifications and omissions in the extraction of land types with a single spectral feature [20].
Author Response
Responses to Reviewer1
General Comment: This is an interesting paper proposing the construction of an integrated classifier, based on three state-of-the-art independent classifiers, for greenhouse mapping over large areas. The paper is in general well written, but it focuses mainly on the results and qualitative comments. Details about the classifiers’ parameters, for example, are missing. I guess that the authors want to preserve the ownership of their findings but, on the other hand, if the calculations cannot be reproduced independently, how sure can the reader be that they are accurate? For example, it is not clear to me how the results of the three classifiers are combined, or how the accuracy-related weights are calculated.
Response: Thank you for your insightful comments on our manuscript entitled “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” (remotesensing-1125630). We have read the comments very carefully and clarified every issue raised about our manuscript. We have adopted your suggestions, which play an important role in improving the quality and readability of the manuscript. The classifier parameters, the accuracy-related weight assignment of the integrated classification algorithm, and other related details are now clarified in the revised manuscript. For example, we added Table 2 (Feature selection of greenhouse identification and its connotation) to show the details of the 18 feature parameters used by the integrated learning classifier, and expanded on these details in the research method section. In addition, we added a detailed description of the idea and process of constructing an integrated learning classifier based on the classification accuracy of each classifier, both in the introduction of the integrated learning algorithm and in the result analysis section. For detailed changes, please check L192-L199, L207-L208, L234-L247, L278-L294.
We believe that these comments will help us to further improve the quality of our manuscript. We are also glad to discuss any further concerns you may have. The detailed responses to your comments are presented in red in this document. If you have any queries, please don’t hesitate to contact me.
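The response above refers to weights related to classifier accuracy, but the formula itself is not given in this record. As a minimal sketch only, assuming a plain accuracy-weighted average of per-class probabilities (the function name `weighted_vote`, the class labels, and every number below are hypothetical and not taken from the manuscript), such a combination might look like this:

```python
def weighted_vote(probabilities, accuracies):
    """Combine per-classifier class probabilities for a single pixel.

    probabilities: list of dicts (one per classifier), class label -> probability
    accuracies:    list of overall accuracies (one per classifier), used as weights
    """
    total = sum(accuracies)
    # Normalize so the weights sum to 1 (assumed weighting scheme).
    weights = [a / total for a in accuracies]

    combined = {}
    for weight, probs in zip(weights, probabilities):
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p

    # The pixel is assigned to the class with the highest combined probability.
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical outputs of three classifiers for one pixel.
pixel_probs = [
    {"greenhouse": 0.6, "farmland": 0.4},
    {"greenhouse": 0.3, "farmland": 0.7},
    {"greenhouse": 0.8, "farmland": 0.2},
]
# Hypothetical overall accuracies of the same three classifiers.
accuracies = [0.90, 0.80, 0.85]

label, combined = weighted_vote(pixel_probs, accuracies)
print(label, combined)
```

With three classifiers of similar accuracy, each normalized weight stays close to one third, so under this assumed scheme no single classifier can outvote the other two on its own.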
I also have the following minor comments:
Comment1: The references are not numbered so it was practically impossible to know which paper is cited in a specific context.
Response: Thank you for your insightful comment. We have adopted your suggestion and numbered the references one by one. We downloaded the manuscript from the submission system, as provided for your review, and confirmed that the references were not numbered. In fact, the original manuscript we uploaded included the reference numbers, but the submission system unified the format of the manuscript for later review and automatically deleted the numbers, which caused the missing numbering. We apologize for the trouble; we have renumbered the references. Thank you for your valuable comments on the errors in our manuscript. For detailed changes, please check L497-L578.
Comment2: Various terms are not defined. These terms are common notions in the field, but for a wider audience they should be explained, for example:
“Referring to the results of previous studies, the most common 6 texture features including the asm, contrast, corr, var, idm, and ent, are selected to train the classifier, and to reduce data overlap and redundancy between too many texture features”
“CART, SVM, randomForest, gmoMaxEnt and naiveBayes”
Response: Thank you for your comment; we have adopted your suggestions. We have added the full names of some professional abbreviations used in the field of remote sensing, and redefined some professional terms so that a wider audience can better understand the manuscript. Examples include the 6 most common texture features (asm, contrast, corr, var, idm, and ent) and the 5 classifiers (CART, SVM, randomForest, gmoMaxEnt and naiveBayes) that you mentioned. The revised content is as follows. For detailed changes, please check L192-L199, L227-L230.
“Referring to the results of previous studies, starting from the texture characteristics of the greenhouse in the region, considering the correlation, difference and redundancy between texture parameters, the 6 most common texture parameters are selected from the aspects of contrast, correlation, and entropy etc. Based on the Google Earth Engine cloud platform, the angular second moment (B2_asm), contrast (B2_contrast), correlation (B2_corr), variance (B2_var), inverse difference moment (B2_idm), and entropy (B2_ent), are selected to construct the characteristic parameters of greenhouse and train the classifier, and to reduce data overlap and redundancy between too many texture features [19].”
“this research constructs the feature bands based on spectral features, texture features, and topographic features, and selects five classifiers including classification and regression trees model (CART), support vector machines model (SVM), random forest model (randomForest), maximum entropy model (gmoMaxEnt) and naive bayesian model (naiveBayes), to interpret the spatial distribution of greenhouses in the study area”
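To make the six texture terms named above concrete, the following is a small illustrative sketch that computes them from a grey-level co-occurrence matrix (GLCM) in plain Python. It uses common textbook (Haralick-style) definitions, which are an assumption here: the exact formulas, window size, and offsets used by Google Earth Engine or by the authors may differ, and the 4x4 patch is hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Six texture statistics of a normalized GLCM p (assumed definitions)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mean_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mean_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    return {
        "asm": sum(v * v for _, _, v in cells),                  # angular second moment
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),  # local variation
        "corr": cov / math.sqrt(var_i * var_j) if var_i > 0 and var_j > 0 else 0.0,
        "var": var_i,                                            # grey-level variance
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells), # inverse difference moment
        "ent": -sum(v * math.log(v) for _, _, v in cells if v > 0),  # entropy
    }


# Hypothetical 4x4 patch already quantized to 4 grey levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(patch, levels=4))
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

A uniform patch gives a high asm and idm with low contrast and entropy, whereas a patch with many grey-level transitions shows the opposite pattern; this is the kind of difference such features are meant to capture.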
Comment3: I think that the following phrase should be reformulated using authors’ words in a more explicit form:
Due to remote sensing images generally have the phenomenon of "same object with different spectrums and different objects with same spectrum" [19], there may be misclassifications and omissions in the extraction of land types with a single spectral feature [20].
Response: Thanks to the reviewer for your comment. We have adopted your suggestion and rewritten this phrase, expressing the meaning of the original manuscript in our own words. We believe that the revised phrase will make it easier for readers to understand the meaning of this phrase. The revised phrase is “Due to remote sensing images generally have the phenomenon that the same spectrum on remote sensing images may actually be different ground features, and the same ground features may also have different spectral features on remote sensing images [19]. Therefore, selecting a single spectral feature for remote sensing extraction of land use types maybe cause partial errors and omissions in the classification results of remote sensing interpretation [20].” For detailed changes, please check L155-L161.
Author Response File: Author Response.docx
Reviewer 2 Report
The article is interesting, but requires several clarifications and corrections:
- Literature is not numbered, it is unknown to which paper the authors refer.
- Section 2.3, (2) Texture features, authors wrote:
„Referring to the results of previous studies, the most common 6 texture features including the asm, contrast, corr, var, idm, and ent, are selected to train the classifier, and to reduce data overlap and redundancy between too many texture features [19].”
Please explain the meaning of the texture features for which only function names are given.
Why were 6 texture features used? Some features are dependent on each other, so maybe you don't need 6 features, just 3 features, for example?
- Table 2 shows the values of producer’s accuracy (PA) and user’s accuracy (UA).
Please explain what is the difference between Producer’s accuracy (PA) and user’s accuracy (UA)?
- There are ambiguities in table 3.
Earlier, the authors write:
„From the perspective of producer's accuracy, the classification accuracy of greenhouses is 0.94, and there are 1027 real samples, of which 927 are correctly classified, and 100 are incorrectly classified into other land use types.”
Hence the accuracy is 0.90 and not 0.94
In Table 3, in the GreenHouse column, 927 samples are correctly classified. The remainder, i.e. 56 (33 + 3 + 2 + 18 = 56), were classified incorrectly. The authors write that 100 samples were incorrectly classified. What about the rest of the 44 samples?
How many test samples were there for the greenhouse?
Can the authors show examples of images that have been incorrectly classified, for example that have been recognized as Farmland or Forest Land instead of a greenhouse?
- Figure 1 - source not given.
Author Response
Responses to Reviewer2
General Comment: The article is interesting, but requires several clarifications and corrections:
Response: Thank you for your insightful comments on our manuscript entitled “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” (remotesensing-1125630). We have read the comments very carefully and clarified every issue raised about our manuscript. We believe that these comments will help us to further improve the quality of our manuscript. We are also glad to discuss any further concerns you may have. The detailed responses to your comments are presented in red in this document. If you have any queries, please don’t hesitate to contact me.
Comment1: Literature is not numbered, it is unknown to which paper the authors refer.
Response: Thank you for your insightful comment; another reviewer raised the same issue. We have adopted your suggestion, renumbered the references one by one, and matched them to the order of citation in the manuscript to make the article easier to read. We downloaded the manuscript from the submission system and found that the references were not numbered. In fact, the original manuscript we uploaded included the reference numbers, but the submission system unified the format of the manuscript and automatically deleted the numbers, which caused the missing numbering. We apologize for the trouble; we have renumbered the references. Thank you for correcting the error in our manuscript. For detailed changes, please check L497-L578.
Comment2: Section 2.3, (2) Texture features, authors wrote:
„Referring to the results of previous studies, the most common 6 texture features including the asm, contrast, corr, var, idm, and ent, are selected to train the classifier, and to reduce data overlap and redundancy between too many texture features [19].”
Please explain the meaning of the texture features for which only function names are given.
Why were 6 texture features used? Some features are dependent on each other, so maybe you don't need 6 features, just 3 features, for example?
Response: Thank you for your comment; we have adopted your suggestion. In the original version of the manuscript, we used the abbreviations of the texture features that can be used directly in Google Earth Engine, which was inconvenient for readers. In the revised manuscript, we introduce the six texture feature parameters one by one and explain why these six were selected to construct the feature parameters for greenhouse recognition. Google Earth Engine provides fast calculation of texture features based on the grey-level co-occurrence matrix (GLCM) through the function glcmTexture(size, kernel, average). The construction of texture feature parameters must, on the one hand, consider the similarity between texture features and, on the other hand, keep the feature set relatively comprehensive. Therefore, considering the correlation, difference and redundancy between texture parameters, the 6 most common texture parameters were selected, covering aspects such as contrast, correlation, and entropy. For detailed changes, please check L192-L199.
Comment3: Table 2 shows the values of producer’s accuracy (PA) and user’s accuracy (UA).
Please explain what is the difference between Producer’s accuracy (PA) and user’s accuracy (UA)?
Response: Thank you for your comment; we have adopted your suggestion. In the accuracy verification part, we only briefly introduced the use of producer’s accuracy, user’s accuracy, overall accuracy and the Kappa coefficient to evaluate the accuracy of greenhouse extraction, but did not discuss the meaning of each parameter or the differences between them. Therefore, in the revised version of the manuscript, we explain the meaning of these four parameters, their differences, and their connections, so that readers can better understand the results analysis and discussion chapters. The PA is the probability that the classifier labels a pixel as class A, given that the ground truth is class A; the UA is the probability that the ground truth class is A, given that the classifier labels the pixel as class A. For detailed changes, please check L253-L262.
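The distinction between the four measures named above can be illustrated with a short sketch. It assumes the common convention that rows of the confusion matrix hold the reference (ground truth) classes and columns hold the predicted classes; the orientation of the manuscript's own table is not stated in this record, and the 2x2 matrix below is hypothetical.

```python
def accuracy_metrics(matrix):
    """Return (PA per class, UA per class, overall accuracy, kappa).

    matrix[i][j] = number of samples whose reference class is i
                   and whose predicted class is j (assumed orientation).
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    row_sums = [sum(row) for row in matrix]                                 # reference totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]     # predicted totals
    diag = [matrix[k][k] for k in range(n)]                                 # correct samples

    pa = [diag[k] / row_sums[k] for k in range(n)]  # producer's accuracy
    ua = [diag[k] / col_sums[k] for k in range(n)]  # user's accuracy
    oa = sum(diag) / total                          # overall accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return pa, ua, oa, kappa


# Hypothetical two-class example: class 0 = greenhouse, class 1 = other.
matrix = [
    [90, 10],
    [30, 70],
]
pa, ua, oa, kappa = accuracy_metrics(matrix)
print(pa, ua, oa, kappa)
```

In this example 90 of the 100 reference greenhouse samples are found (PA = 0.90), but only 90 of the 120 samples labelled greenhouse are truly greenhouse (UA = 0.75), which shows why the two values can differ for the same class.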
Comment4: There are ambiguities in table 3.
Earlier, the authors write:
“From the perspective of producer's accuracy, the classification accuracy of greenhouses is 0.94, and there are 1027 real samples, of which 927 are correctly classified, and 100 are incorrectly classified into other land use types.”
Hence the accuracy is 0.90 and not 0.94
In Table 3, in the GreenHouse column, 927 samples are correctly classified. The remainder, i.e. 56 (33 + 3 + 2 + 18 = 56), were classified incorrectly. The authors write that 100 samples were incorrectly classified. What about the rest of the 44 samples?
How many test samples were there for the greenhouse?
Can the authors show examples of images that have been incorrectly classified, for example that have been recognized as Farmland or Forest Land instead of a greenhouse?
Response: Thank you for your comment; we have adopted your suggestion. We carefully checked this part of the data and the related analysis, and found that the discussion in the original version of the manuscript contained errors: the total number of verification sample points and the accuracy stated in the text were wrong, while the data in the table are correct. This table is the land use confusion matrix under the integrated learning classifier, from which the producer accuracy and mapping accuracy of each land category are calculated. We found that the precision for greenhouses is high: the producer’s accuracy is 0.94, and the total number of verification sample points for greenhouses is 985, of which 927 are correctly classified and 85 are not correctly classified. On this basis, we reorganized and rewrote this part. Thank you for pointing out the errors in our manuscript. We apologize for the inconvenience caused by the errors in the original manuscript. For detailed changes, please check L297-L305.
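The figures quoted in this exchange can be checked with simple arithmetic. The snippet below is only such a check, using the sample counts stated in the comment and the response; the helper name `producers_accuracy` is introduced here for illustration.

```python
def producers_accuracy(correct, reference_total):
    """Share of the reference samples of a class that were classified correctly."""
    return correct / reference_total


correct = 927  # correctly classified greenhouse samples, as quoted above
for reference_total in (1027, 985):  # totals from the original text and from the response
    pa = producers_accuracy(correct, reference_total)
    missed = reference_total - correct
    print(f"total={reference_total}  PA={pa:.2f}  misclassified={missed}")
```

With 1027 reference samples the ratio is about 0.90, as the reviewer states; with 985 it is about 0.94, matching the response. Under the latter figures 58 samples remain, so the count of 85 quoted in the response does not follow from 985 and 927 and may be worth checking against the table.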
Comment5: Figure 1 - source not given.
Response: Thanks to the reviewer for your comment, we have adopted your comment and marked the source of Figure 1. In the original manuscript, we only inserted Figure 1, but omitted the citation in the manuscript text. Figure 1 represents the geographic location of the study area and its basic overview. We have quoted it in the overview section of the study area. For detailed changes, please check L115.
Author Response File: Author Response.docx
Reviewer 3 Report
The paper “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” by Lin et al. is dedicated to the construction of an integrated classification algorithm for greenhouse recognition in Jiangsu Province based on regional characteristics. It uses Google Earth Engine with its own massive data and cloud computing capabilities. The combination of different spectral, texture and terrain features is evaluated and established as leading to highest classification accuracy. Also greenhouse distribution in the studied area is evaluated. Even without really understanding the significance to obtain the spatial distribution data of greenhouse quickly and accurately, I think that the paper is well written and could be published in Remote Sensing.
Author Response
Responses to Reviewer3
General Comment: The paper “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” by Lin et al. is dedicated to the construction of an integrated classification algorithm for greenhouse recognition in Jiangsu Province based on regional characteristics. It uses Google Earth Engine with its own massive data and cloud computing capabilities. The combination of different spectral, texture and terrain features is evaluated and established as leading to highest classification accuracy. Also greenhouse distribution in the studied area is evaluated. Even without really understanding the significance to obtain the spatial distribution data of greenhouse quickly and accurately, I think that the paper is well written and could be published in Remote Sensing.
Response: Thank you for your comments on our manuscript entitled “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” (remotesensing-1125630). We are very happy to hear that you rated our manuscript highly, and we are very grateful for your recognition of our work and your view that it can be published in Remote Sensing. Other reviewers also put forward comments for further improving the quality of our manuscript. On the basis of the original manuscript and the comments of the other reviewers, we revised it point by point. We believe that the revised version better meets the requirements of Remote Sensing. Thank you again for your support of our manuscript. We are also glad to discuss any further concerns the editor or reviewers may have. If you have any queries, please don’t hesitate to contact me.
Author Response File: Author Response.docx
Reviewer 4 Report
This manuscript describes research to develop a method to classify greenhouses using Google Earth Engine. The research topic fits well with the journal’s aims and scope, and it may be of interest to the journal’s international readership. The methods seem to be well applied and described, and the results are clearly described. However, there are some points that the authors need to improve:
- There is no initial hypothesis. The scientific method is based on testing hypotheses derived from current knowledge, and then discussing whether each hypothesis can be accepted or rejected. Objectives are not substitutes for hypotheses.
- The introduction provides scarce background on previous research on classifying greenhouses with remote sensing outside of China. The authors should make clear whether such previous research exists and, if so, clarify why this research is important, and why now.
- The discussion is too descriptive. There should be a deeper discussion on the geophysical mechanisms that allow the detection of greenhouses and how the algorithm developed works with them.
- The conclusions are too extensive; most of the text would actually be better placed in the discussion.
Author Response
Responses to Reviewer4
General Comment: This manuscript describes research to develop a method to classify greenhouses using Google Earth Engine. The research topic fits well with the journal’s aims and scope, and it may be of interest to the journal’s international readership. The methods seem to be well applied and described, and the results are clearly described. However, there are some points that the authors need to improve:
Response: Thank you for your insightful comments on our manuscript entitled “Rapidly mapping of large-scale greenhouse based on integrated learning algorithm and Google Earth Engine” (remotesensing-1125630). We are very grateful for your recognition of our work and your valuable suggestions for revising our manuscript. We have read the comments very carefully and clarified every issue raised about our manuscript. We believe that these comments will help us to further improve the quality of our manuscript. We are also glad to discuss any further concerns you may have. If you have any queries, please don’t hesitate to contact me. The detailed responses to your comments are presented in red in this document.
Comment1: There is no initial hypothesis. The scientific method is based on testing hypotheses derived from current knowledge, and then discussing whether each hypothesis can be accepted or rejected. Objectives are not substitutes for hypotheses.
Response: Thank you for the comment. We have adopted your suggestion; scientific hypotheses are indeed very important in research. In the revised version of the manuscript, we added a hypothesis for the study in the introduction. Scholars have already studied greenhouse extraction and achieved fruitful results, mainly by combining high-resolution images with object-oriented methods or by using Landsat images for large-scale remote sensing extraction of greenhouses. However, the existing methods have some problems, and it is difficult to achieve rapid remote sensing mapping of greenhouses over large areas. Given that the Google Earth Engine cloud platform provides efficient computing power together with massive remote sensing data and algorithms, can the platform’s own classification algorithms be combined, integrating the advantages of different classifiers, to build a better and more efficient large-scale greenhouse classification algorithm that achieves rapid remote sensing mapping of large-scale greenhouses under complex terrain? For detailed changes, please check L75-L79, L97-L102.
Comment2: The introduction provides scarce background on previous research on classifying greenhouses with remote sensing outside of China. The authors should make clear whether such previous research exists and, if so, clarify why this research is important, and why now.
Response: Thank you for the comment. We have adopted your suggestion and added some research progress on greenhouse identification conducted outside of China to the introduction. Generally speaking, there are two mainstream greenhouse identification approaches. The first uses high-resolution images such as GeoEye-1, WorldView-2, GaoFen-2 and Sentinel-2 for greenhouse interpretation. Although the interpretation accuracy is high, data acquisition is difficult, so generally only small-scale research can be carried out. The second selects relevant parameters according to the characteristics of the case area and constructs a greenhouse index to extract the large-scale spatial distribution of greenhouses; however, such an index generally has distinct local characteristics and transfers poorly to other areas. In view of this, this study relies on the massive data and cloud computing capabilities of the Google Earth Engine cloud platform, integrates the classification advantages of multiple classifiers to overcome the shortcomings of a single classification algorithm, and replaces the classification result of the traditional single classifier with the classification probability of the pixel. As a result, the research can enrich current greenhouse identification methods to some extent, realize high-efficiency identification of large-scale greenhouses under complex terrain, and enable effective transfer of the algorithm. Therefore, we believe that this research is of great significance to current research on greenhouse identification. For detailed changes, please check L52-L79.
Comment3: The discussion is too descriptive. There should be a deeper discussion on the geophysical mechanisms that allow the detection of greenhouses and how the algorithm developed works with them.
Response: Thank you for the comment. In this research, we were not able to study in depth the geophysical mechanisms related to greenhouses, but your suggestion provides a reference for our follow-up research. An in-depth study of these mechanisms is an important prerequisite for a better understanding of the spectral parameters and texture features, which can further improve the accuracy and effectiveness of greenhouse recognition. Therefore, in Section 4.3 (Limitations and outlook) of the discussion, we have added information about the influence of geophysical mechanisms on greenhouse identification, and about how it could be used to construct more scientific and effective identification parameters and improve identification accuracy. We then discuss the outlook for the influence of geophysical mechanisms on the identification and monitoring of greenhouses. For detailed changes, please check L430-L436, L446-L449.
Comment4: The conclusions are too extensive; most of the text would actually be better placed in the discussion.
Response: Thank you for the comment. We have adopted your suggestion and rewritten the conclusions in the manuscript. We simplified the conclusions of the original manuscript, retained the core conclusions and opinions, and deleted the redundant parts. Part of the original content was moved to the discussion for elaboration and expansion; for example, the discussion now compares this research with other greenhouse recognition algorithms, and some research results were added to better demonstrate the advantages of this method. This further improves the readability and clarity of the conclusions in the revised manuscript. For detailed changes, please check L411-L416, L458-L479.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The authors have improved the manuscript. However, there are still several points which need additional care.
Lines 236-242 – in my opinion the paragraph below is very hard to follow and in the end quite unclear. I suggest reformulating using shorter sentences.
In order to better distinguish the accuracy of the different classifiers, and to ensure that the probability of the classification result of a single classifier cannot be higher than the cumulative probability of multiple classifiers, then the classification results of different classifiers can be considered comprehensively, and the absolute influence of a single classifier can be avoided, so that the classification results of the integrated learning classification algorithm can obtain the advantages of different classifiers as much as possible.
Lines 261-262:
“The UA is the The producer accuracy is the probability that the corresponding ground […]” – something is wrong here
Lines 305-306: it seems that the phrase starting with “As shown in Table 34” does not have a continuation.
Reviewer 2 Report
Thank you for incorporating the comments and making the requested corrections.
I recommend accepting this revised manuscript for publication.