Evaluation of Rainfall Erosivity Factor Estimation Using Machine and Deep Learning Models
Round 1
Reviewer 1 Report
The authors present a study on the use of machine learning models to predict monthly R-factor values in Korea.
The use of multivariate techniques based on machine-learning algorithms is certainly interesting for the scientific community and worthy of publication. However, substantive changes to the manuscript are required. I suggest that the authors improve the methodological section. In particular, the description of the applied approaches is confusing and contains many inaccuracies (detailed below). Furthermore, I warmly recommend revising the text to avoid repetition and to define abbreviations before they are used.
Please find the major concerns below. Other suggestions are provided in the attached PDF.
1. Introduction: L63
Why do you write “etc.”? RUSLE has only five factors: rainfall erosivity (R), soil erodibility (K), slope length and steepness (LS), cover management (C), and conservation practice (P).
2. Introduction: L67
What does “BMPs” mean? Many abbreviations are used in the text before they are defined. Please check this issue throughout the manuscript.
3. Methods: Machine Learning models (L148-211)
I warmly recommend completely rewriting the description of the models. Even for researchers who commonly use such approaches, it is very difficult to understand your description of how each method works. Moreover, there are many inaccuracies.
In particular:
3a) L167: Why do you describe Support Vector Machine Regression (SVMR) in the subsection "Decision tree"? You have not used SVMR in the manuscript. You should simply describe how a decision tree works (e.g. a decision tree is a series of sequential decisions made to reach a specific result, etc.).
3b) L176-184: Why are KNN and Multilayer Perceptron (MLP) described in the same section? MLP is a type of artificial neural network and should be described together with the deep neural network.
3c) L186. Random forest: It is not only a classification approach but also a regression method.
4. Methods: Model performance
L232: It is not necessary to calculate both the coefficient of correlation (R) and the coefficient of determination (R²), since they provide the same information. Similarly, only one of them should be reported in the Results and Discussion sections.
5. Figures
Fig. 3. I suggest deleting this figure. It is not relevant to the manuscript.
Fig. 4. The caption should be expanded to explain how the maps were obtained.
Comments for author File: Comments.pdf
Author Response
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors performed a machine learning study to predict the USLE R-factor, the estimation of which requires time and specialized knowledge. The findings can support a quick method for analyzing rainfall intensity when estimating soil loss. However, the content of the manuscript appears to be “routine work”, which makes it weak as a publication in “Water”. Thus, a major revision is needed before publication in "Water", considering the following points:
- The authors include a lot of information that is unnecessary for their study, mostly introductory material on machine learning. For example:
“Machine learning is an algorithm that improves itself through numerous experiences by understanding the trends of the data themselves and predict the output of new input data.” (page 5, lines 149-150)
“Machine learning can be largely divided into supervised learning, unsupervised learning, and reinforcement learning [33-34]. The supervised learning is a method of learning a machine with a given label for the data.” (page 5, lines 151-153)
“In addition, detailed tuning of the hyperparameter in Random Forest is easier than an artificial neural network and support vector regression.” (page 6, lines 190-191, citation needed)
etc.
In general, such explanations belong in a textbook. The authors should focus on what they have done in this research and what research outputs were obtained. Thus, the authors should revise the manuscript to make the explanations concise.
- It is hard to evaluate machine learning models without information on the training data. Did the authors preprocess the raw data, for example by standardization or normalization? Neural network models tend to be more accurate with adequate preprocessing, so this information must be provided. Moreover, the size of the training dataset is not stated, although it greatly affects the accuracy of machine learning models. In general, the dataset should be provided as supporting information for post-review use.
- Please provide the details of the machine learning environment. The authors used scikit-learn and Keras for model building; however, the results can vary with the package version. Thus, the package versions should be provided. Additionally, Keras is a wrapper around TensorFlow, so the TensorFlow version is also needed.
- Please perform K-fold (folds = 5) cross-validation for each model and report the scores in the manuscript.
- Overall, the manuscript gives the impression of “routine work”: applying “scikit-learn” and “keras” for machine learning and tuning parameters. Thus, the authors should perform a detailed analysis of the source data and the predicted results. For example, the predicted results shown in Table 6 suggest that prediction accuracy differs by season. This could be explained by performing a statistical analysis of the source data and comparing the differences in accuracy.
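The preprocessing and cross-validation requests above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it uses only the Python standard library, a one-predictor least-squares line as a stand-in model, and synthetic data. The function names (`kfold_indices`, `fit_line`, `r2_score`, `cross_val_r2`) are hypothetical. The predictor is standardized with statistics from the training folds only, so that nothing from the held-out fold leaks into the fit.

```python
import random
import statistics


def kfold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Least-squares fit of y = a * x + b for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(x, y, k=5, seed=0):
    """Return one held-out R^2 score per fold."""
    folds = kfold_indices(len(x), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        x_tr = [x[j] for j in train_idx]
        y_tr = [y[j] for j in train_idx]
        # Standardize with training-fold statistics only (no leakage).
        mu = statistics.fmean(x_tr)
        sd = statistics.pstdev(x_tr)
        z_tr = [(v - mu) / sd for v in x_tr]
        z_te = [(x[j] - mu) / sd for j in test_idx]
        a, b = fit_line(z_tr, y_tr)
        y_te = [y[j] for j in test_idx]
        scores.append(r2_score(y_te, [a * z + b for z in z_te]))
    return scores


# Synthetic stand-in data (assumption): a noisy linear relation.
rng = random.Random(42)
x = [rng.uniform(0.0, 300.0) for _ in range(100)]
y = [2.0 * xi + 5.0 + rng.gauss(0.0, 10.0) for xi in x]

scores = cross_val_r2(x, y, k=5)
print("per-fold R^2:", [round(s, 3) for s in scores])
print("mean R^2:", round(statistics.fmean(scores), 3))
```

Reporting the five per-fold scores together with their mean, rather than a single train/test split, shows how sensitive each model is to the choice of held-out data.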
Author Response
We appreciate the Editor's and the reviewer's evaluations and valuable comments on this manuscript. We have incorporated the comments and suggestions into the revised manuscript to improve its quality.
Please see the attachment.
(We revised the whole manuscript and marked the changes in blue.)
Pages 1-19 (Revised manuscript)
Pages 20-27 (Itemized response to comments)
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Dear Authors,
I would like to thank you for considering my comments in revising your paper. I note that many of the concerns raised have been addressed.
However, in my opinion, the description of the machine learning models (L136-227) still requires additional effort to adequately describe how each approach works, without redundancy or confusion.
Moreover, some typos are still present.
Comments for author File: Comments.pdf
Author Response
We revised the whole manuscript and marked the changes with blue highlights. We appreciate your valuable and careful comments to improve this manuscript. Unfortunately, specifying the exact versions of the machine learning libraries (Lines 140-147) was suggested by the other reviewer, so we kept this information in the revised manuscript.
Please see the attachment.
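For reference, the library version information discussed in this exchange can be collected with a short script. The following is a minimal sketch that assumes the packages were installed under the distribution names `scikit-learn`, `keras`, and `tensorflow`; the helper name `report_versions` is hypothetical and is not taken from the manuscript.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version string."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is absent from this environment.
            versions[name] = "not installed"
    return versions


versions = report_versions(["scikit-learn", "keras", "tensorflow"])
for name, version in versions.items():
    print(f"{name}=={version}")
```

Listing the versions in this `name==version` form makes it straightforward for a reader to recreate the same environment.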
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors have revised the manuscript adequately, judging from the revised manuscript and the response to the reviewer; the details of the experiments are clearly given and the quality of presentation is improved.
Thus, the manuscript is considered suitable for publication in "Water".
Author Response
We appreciate the reviewer's evaluations and valuable comments on this manuscript for publication.