Soil Dynamics and Crop Yield Modeling Using the MONICA Crop Simulation Model and Time Series Forecasting Methods
Round 1
Reviewer 1 Report
The paper investigates the possibility of using time series forecasting methods in modelling crop and soil dynamics using the MONICA simulation model, and compares the performance of different time series forecasting methods. The manuscript provides a detailed description of the methodology and results, but there are still some issues that need to be improved.
Major issues
L197, Meteorological data are much easier to obtain than crop and soil data, and studies usually use decades of meteorological data for smoothed average calculations. Only 4 years of meteorological data were used as the model training dataset in this study, which seems too small, especially for machine learning methods. It is recommended to increase this to more than 10 years.
L260, Please explain why crop yields, above-ground biomass and LAI can be accurately predicted when meteorological data differ significantly from actual observations? Does this mean that the effect of meteorological data on crop yield, above-ground biomass and LAI estimates is itself small?
L331, What is causing the two peaks in the observed nitrate leaching? Fertilizer management or rainfall?
And what is the reason for only one peak in the simulation model?
Tables, The tables in the full manuscript appear to have formatting errors, with "," and "." being mixed up. The R2 value for nitrate leaching in "Table 6" is incorrect and it is recommended to check the other tables.
Minor issues
L73, Please add a description of crop growth period, April-September?
L78, The article does not contain data on soils at different depths or descriptions of soil properties at different depths, please adjust the content to be consistent throughout the text.
Author Response
Major issues
Comment: L197, Meteorological data are much easier to obtain than crop and soil data, and studies usually use decades of meteorological data for smoothed average calculations. Only 4 years of meteorological data were used as the model training dataset in this study, which seems too small, especially for machine learning methods. It is recommended to increase this to more than 10 years.
Reply: Thank you for your question. In our research, we are employing classical machine learning models and statistical analysis techniques. These methods have a distinct characteristic: they can provide accurate predictions without needing a large volume of training data, which sets them apart from deep learning models. Deep learning models typically demand substantial data for practical training and accurate forecasts. In contrast, our chosen methodologies are able to achieve precise predictions while operating with a more modest dataset. Also, we did not consider the issue of changes in climatic norms, which may be noticeable in the range of 10-15 years, so we used a sample of 4-5 years to train models.
Comment: L260, Please explain why crop yields, above-ground biomass and LAI can be accurately predicted when meteorological data differ significantly from actual observations? Does this mean that the effect of meteorological data on crop yield, above-ground biomass and LAI estimates is itself small?
Reply: Thank you for your question. In our study, we evaluated only how forecast quality affects the yield model's predictive capabilities. Firstly, producing an acceptable weather forecast for the whole season is challenging. However, it is worth noting that some weather indicators are predicted better (temperature) and others worse (precipitation). We wanted to determine which weather indicators are most important for forecasting LAI, crop biomass, and crop yield. Furthermore, we also wanted to test a scenario in which we need a better weather forecast to get an acceptable yield forecast. We have described our assumption about the impact of forecast quality on yield models in lines 64-68 of the revised manuscript.
Comment: L331, What is causing the two peaks in the observed nitrate leaching? Fertilizer management or rainfall?
Reply: Thank you for the question. The nitrate leaching peaks are probably related to precipitation peaks, and the answer involves several aspects.
Firstly, the two peaks of nitrate leaching under actual weather observations are most likely associated with intense precipitation in spring. As can be seen from the graphs and the crop rotation table, nitrogen fertilizers were applied in early April, which caused an increase in leaching for all weather scenarios. However, the actual precipitation observations then show two heavy rain events, which are absent from the model scenarios.
Secondly, it is necessary to predict extreme weather events that lead to high leaching levels. Admittedly, the ability of models to predict such events still needs to improve. We added more comments about this in the Discussion (L 431-434).
To improve understanding, we have added precipitation dynamics for the 2016 season to the Appendix.
Comment: And what is the reason for only one peak in the simulation model?
Reply: The single leaching peak in the simulation models is due to the difficulty of predicting extreme precipitation. For this reason, there is only one peak, associated with the application of nitrogen fertilizers at the beginning of the season. The other two peaks, associated with extreme precipitation (the Observed variant), do not appear on the graph because the models could not predict such precipitation.
Comment: Tables, The tables in the full manuscript appear to have formatting errors, with "," and "." being mixed up. The R2 value for nitrate leaching in "Table 6" is incorrect and it is recommended to check the other tables.
Reply: Thank you for your comment. We have corrected the tables in the text.
The coefficient of determination measures the model's ability to explain variations in a dependent variable using available predictors.
A negative value of the coefficient of determination occurs when the predictors included in the model show a limited or insignificant degree of relationship with the dependent variable or when the model does not sufficiently consider the data's variability.
The presence of a negative coefficient of determination is not necessarily a negative or erroneous phenomenon in the context of statistical modeling. It may indicate the limited information available, insufficient data quality, or inconsistency of the chosen model with natural processes.
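To make the point about negative values concrete, here is a minimal sketch using the standard definition R2 = 1 - SS_res / SS_tot. Under this definition, R2 drops below zero whenever the predictions have a larger squared error than simply predicting the mean of the observations. The function name `r_squared` and all numbers are illustrative assumptions and are not taken from the manuscript or its tables.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, chosen only to illustrate the three regimes.
observed = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.8]   # tracks the observations
mean_only = [2.5, 2.5, 2.5, 2.5]   # always predicts the observed mean
reversed_trend = [4.0, 3.0, 2.0, 1.0]  # worse than the mean baseline

r2_close = r_squared(observed, close_fit)       # close to 1
r2_mean = r_squared(observed, mean_only)        # exactly 0
r2_reversed = r_squared(observed, reversed_trend)  # negative

print(r2_close, r2_mean, r2_reversed)
```

A negative value is therefore a valid result rather than a computational error: it says that the predictions explain the observations less well than their mean does.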
Minor issues
Comment: L73, Please add a description of crop growth period, April-September?
Reply: Thank you for your question. The growing season for soybeans was from mid-April to early October, and for sugar beet from early May to early November. To clarify the data on crop rotation, we added them in table format; this is now Table 2 (after L 231).
Comment: L78, The article does not contain data on soils at different depths or descriptions of soil properties at different depths, please adjust the content to be consistent throughout the text.
Reply: Thank you for your comment. In the manuscript, we indicated the name of the soil according to the World Reference Base for Soil Resources (L 90). We also corrected Table 1 and indicated the soil properties for each horizon. These soil properties were used in modeling.
Reviewer 2 Report
The manuscript provides a clear overview of the study's focus, which is the evaluation of crop simulation models and their applicability in modeling soil conditions and agricultural ecosystem performance. It highlights the importance of crop simulation models in assessing agricultural ecosystems and the limitations posed by the high uncertainty of long-term weather forecasts and labor-intensive requirements.
1. Abstract should provide some metrics to show the sensitivity and reflect the modelling performance.
2. Line 19. Here could cite Liu et al., 2023 to further support this statement.
Liu, K., Harrison, M.T., Yan, H. et al. Silver lining to a climate crisis in multiple prospects for alleviating crop waterlogging under future climates. Nat Commun 14, 765 (2023). https://doi.org/10.1038/s41467-023-36129-4
3. Line 25-34. These descriptions could be deleted.
4. The current objectives are not well reflected and answered in the abstract.
5. Many regional and global climate models could be used to produce climate scenarios. What’s the benefit of SARIMAX and Prophet?
6. 3.4 Example? What’s this?
English reads well
Author Response
Comment: 1. Abstract should provide some metrics to show the sensitivity and reflect the modelling performance.
Reply: Thank you very much for the recommendation. We added the main yield metrics for sugar beet and soybean (L 14-18).
Comment: 2. Line 19. Here could cite Liu et al., 2023 to further support this statement.
Liu, K., Harrison, M.T., Yan, H. et al. Silver lining to a climate crisis in multiple prospects for alleviating crop waterlogging under future climates. Nat Commun 14, 765 (2023). https://doi.org/10.1038/s41467-023-36129-4
Reply: Thank you for your suggestion. We have carefully studied the article and added it to the Introduction (L 24).
Comment: 3. Line 25-34. These descriptions could be deleted.
Reply: Thank you for your comment. We agree and partially corrected this section (Lines 30-36). Although the scientific value of these descriptions is limited, they allow readers from other fields to better understand the models' logic and the context of the research work. For this reason, we have added a description of the logic of the models and links to the leading software solutions in this area.
Comment: 4. The current objectives are not well reflected and answered in the abstract.
Reply: Thank you for your comment. We had not directly outlined the study's goals in either the introduction or the abstract. In our study, we tried to assess the impact of forecast quality on the yield model's predictive capabilities. Firstly, obtaining a good weather forecast for the whole season is challenging. However, it is worth noting that some weather indicators are predicted better (temperature) and others worse (precipitation). We wanted to determine which weather indicators are most important for forecasting LAI, crop biomass and crop yield. Moreover, we also wanted to test a scenario in which we need a better weather forecast to get an acceptable yield forecast. We have clarified the goals of the work in the abstract (Lines 7-9).
Comment: 5. Many regional and global climate models could be used to produce climate scenarios. What’s the benefit of SARIMAX and Prophet?
Reply: Thank you for your question. Indeed, climate models are one of the primary data sources for assessing climate change. However, they have two limitations that prevent them from being used directly for local yield modeling with process-based models. Firstly, such models have low spatial resolution. Secondly, global climate models display notable fluctuations and deviations in crucial processes, such as precipitation, atmospheric river flux, and droughts, which could be essential for crop yield modeling. We added more about the use of climate models in the Introduction (L 47-50).
Comment: 6. 3.4 Example? What’s this?
Reply: Thank you for your comment. It was a typo, and we have corrected it.
Reviewer 3 Report
1- Abstract needs quantified results
2- Hypothesis and novelty need to be presented well in Introduction
3- Add map of the study area in methods
4- Add a paragraph about the importance of ML and DL in climate data with related references; the following papers may help you (https://doi.org/10.1038/s43247-021-00225-4;
https://doi.org/10.1016/j.heliyon.2023.e18200)
5- In Results and Discussion Section, Discussion needs substantial improvement with support of related references
6- The study limitations must be discussed as well.
7- Introduction section is too weak; you mentioned some crop models, but did not discuss the specific advantages of MONICA
8- In Methods, I don't see related information about the model calibration, please extend the text with model calibration and validation and support the text with additional Table
9- Results and Discussion, I don't see any discussion and all text about results. I do suggest separating Discussion from results and ensure the following points in discussion:
A- The discussion section is where you explore the underlying meaning of your research, its possible implications in other areas of study, and the possible improvements that can be made in order to further develop the concerns of your research.
B- Begin the Discussion section by restating your statement of the problem and briefly summarizing the major results. Do not simply repeat your findings. Rather, try to create a concise statement of the main results that directly answer the central research question that you stated in the Introduction section.
C- In brief you can write the discussion as:
- 1-Summary: A brief recap of your key results.
- 2- Interpretations: What do your results mean?
- 3- Implications: Why do your results matter?
- 4- Limitations: What can't your results tell us?
- 5- Recommendations: Avenues for further studies or analyses.
D- Ensure to cite recent and related references in each step.
Moderate
Author Response
Comment: 1- Abstract needs quantified results
Reply: Thank you for your comment. We added the research tasks and described the results in more detail, including adding the quality values of the yield forecast that were obtained (L 14-18).
Comment: 2 - Hypothesis and novelty need to be presented well in Introduction
Reply: Thank you for your comment. We added the novelty and our hypothesis to the Introduction section (L 57-68).
Comment: 3- Add map of the study area in methods
Reply: Thank you for your suggestion. We have prepared and added a study area map (Figure 1).
Comment: 4- Add a paragraph about the importance of ML and DL in climate data with related references; the following papers may help you (https://doi.org/10.1038/s43247-021-00225-4;
https://doi.org/10.1016/j.heliyon.2023.e18200)
Reply: Thank you for your suggestion. We suggested using ML and DL methods for scaling climate models, subseasonal forecasting and hybrid modeling in the introductory section (L 53-60).
Comment: 5- In Results and Discussion Section, Discussion needs substantial improvement with support of related references
Reply: Thank you for your comment. We have added discussion to the Results and Discussion section, and also added links to this part (L 416-443).
Comment: 6 - The study limitations must be discussed as well.
Reply: Thank you for your comment. We added the limitations of our study and also provided examples (L 428-438).
Comment: 7- Introduction section is too weak; you mentioned some crop models, but did not discuss the specific advantages of MONICA
Reply: Indeed, the Introduction section does not compare the MONICA model with other models. However, the main advantages of the model are described in subsection 2.5, Crop model MONICA (L 194-213). The main advantages of the MONICA model are the following: firstly, the soil block in the model describes the carbon cycle and the dynamics of nutrients in detail, unlike the WOFOST model. Secondly, the model has a command-line interface, allowing it to be used and integrated with libraries in Python and other languages.
Comment: 8 - In Methods, I don't see related information about the model calibration, please extend the text with model calibration and validation and support the text with additional Table
Reply: In our research, we did not consider the issue of calibration and validation of the model, but in our previous work we tested this model for sugar beet and soybean and conducted a sensitivity analysis. We added a reference to the sensitivity analysis of MONICA for sugar beet and soybean coupled with model testing (L 212-213).
Comment: 9 - Results and Discussion, I don't see any discussion and all text about results. I do suggest separating Discussion from results and ensure the following points in discussion:
Reply: Thank you for your comment. We have added discussion to the Results and Discussion section (L 416-443).
Comment: A- The discussion section is where you explore the underlying meaning of your research, its possible implications in other areas of study, and the possible improvements that can be made in order to further develop the concerns of your research.
Reply: Thank you for your comment. We took it into account and tried to describe these points in more detail (L 421-426).
Comment: B- Begin the Discussion section by restating your statement of the problem and briefly summarizing the major results. Do not simply repeat your findings. Rather, try to create a concise statement of the main results that directly answer the central research question that you stated in the Introduction section.
In brief you can write the discussion as:
- 1-Summary: A brief recap of your key results.
- 2- Interpretations: What do your results mean?
- 3- Implications: Why do your results matter?
- 4- Limitations: What can't your results tell us?
- 5- Recommendations: Avenues for further studies or analyses.
D - Ensure to cite recent and related references in each step.
Reply: Thank you for your comment and suggestion. We tried to describe the discussion part and considered your suggestions on the structure. We have added links to current research in Introduction and Discussion (L 416-443).
Round 2
Reviewer 1 Report
The authors have replied and revised the questions raised, and there are no more questions in the current version.
Reviewer 3 Report
The paper improved significantly after addressing the raised comments.
Minor editing of English language required.