Remotely Sensed Boro Rice Production Forecasting Using MODIS-NDVI: A Bangladesh Perspective
Round 1
Reviewer 1 Report
General Comments:
The manuscript has improved compared to the previous version. However, it needs to be edited by a proficient English speaker. The issue of NDVI saturation remains, and the authors must show that it did not occur in their case. I think this can be resolved if they add figures showing the relationship between NDVI and yield, particularly for March and April. In general, the introduction is fine and the methodology has been improved; I have some minor concerns in these sections. However, further effort is needed in the results section to make it a suitable contribution to the community. The quality of the figures is low, and the tables should be converted into time series plots. In addition, some parts of the introduction and methodology are repeated in the results section. I have listed the comments that the authors need to address in the next round.
Major:
Abstract:
Line 16-20: “MVC technique minimizes cloud contamination, reduces off-nadir directional viewing effects, minimizes sun-angle and shadow effects, and minimizes aerosol and water-vapor effects as well. Vegetation mask layers consists of (i) Forests and trees, (ii) Seasonal agricultural crops and (iii) Others which have been generated from high resolution satellite data and have been utilized to find the district-wise rice pixel only.” These details do not need to be mentioned in the introduction. Move them into the section describing the datasets used in the current study.
Line 23-25: “Regression analysis between district-based pixel-wise summation of MODIS-NDVI and district-wise BBS (Bangladesh Bureau of Statistics) estimated Boro production (R2=0.57-0.85) reveals significant correlation between the two.” R2 less than 0.7 cannot be considered as a significant correlation.
Line 25-26: “Model coefficients have been obtained through mathematical optimization of the regression function against yearly data for a given year.” This does not need to be mentioned in the introduction. Everyone knows how linear regression works.
Introduction:
Line 97-98: “MODIS images have been widely used at larger regional scale due to its faster re-visit time (~1 day) and relatively smaller datasets resulting from its lower resolution [29-33, 20].” Smaller datasets in terms of what (the memory size, scale, pixel size, …)?
Line 112-114: “Moreover various geo-bio-environmental factors such as climate change consequences, increased natural disasters, uncertainty in geo- environmental variability appear to be as possible factors determining the crop production to ensure nation-wide food security in this area.” It is not related to the previous lines considering the authors claimed that the remote sensing regression-based model are “suitable even under threats of climate changes and natural disasters”
Line 118-119: “In dense forest canopies the Normalized Difference Vegetation Index (NDVI) shows saturation effects [11]” Saturation is not solely related to dense forest. NDVI saturates in any densely vegetated area, particularly when it is mapped against an unbounded continuous parameter such as yield. To overcome this issue, some studies have used a non-normalized version of NDVI, namely N/R, and it is still not clear why the authors developed all of their regression models based on NDVI alone. I don’t know whether saturation occurred in the current study, but if the authors are confident that it did not occur in the developed models (because of the early growing stage, low-yield samples, etc.), they must clearly address it in the manuscript.
Materials and Methods:
Fig 1: still the quality of Fig 1 is too low. Make sure that the images have been exported in at least 300 dpi.
Table 2: Label those stages with months if applicable (e.g. Initial: May-June).
Line 226-228: “Ultimately, each image corresponding to each month as obtained is based on the selection of the best pixel value from a total of 16 images of 16 consecutive dates of the same pixel.” It is not clear to me whether the final image considered for April, for instance, is the best image among 16 images or it is a merged image from the best pixels from 16 pixels for each pixel location.
Line 282-283: “At this stage, spatial masking operation has been carried out in the ERDAS Imagine platform using previously prepared vegetation mask layer.” The mask layer showing those type of vegetation (the output of ERDAS imagine) must be illustrated in the manuscript. I don’t know how difficult it is but to show the low changes of the mask layer during these years (2011-2017), I recommend adding two subfigures, one for mask layer in 2011 and the other one for 2017 with an appropriate legend.
Line 318-320: “Therefore, in this study relatively slowly growing vegetation features have been generated from high resolution optical images and moderate high-resolution satellite image with spatial resolution of 5 meter or so on have been used for spatial details.” The source of a high-resolution satellite image (5 m) must be included in the context.
Line 347-355: “As a fundamental approach, the model … not used to retrieve the values of the parameters.” Still, it is not clear for the readers. As I understand, the authors consider 2011-2014 as a training set (retrieving regression coefficients) and 2014-2017 as testing or validating the models. If this is true, revise the paragraph to clearly address this concept.
Results:
Line 392-393: “The regression coefficient from January to April ranges from 0.61-0.84; 0.63-0.83; 0.63-0.84;0.64-0.85; 0.72-0.84; 0.57-0.75 and 0.57-0.65 in 2011; 2012; 2013; 2014; 2015; 2016 and 2017, respectively” Remove these numbers and just bring upper and lower bounds or mean of them in the manuscript.
Line 409-410: “Therefore, based on the highest regression coefficient of March 2014 a Boro crop production forecasting model has been selected to apply over 2011-2017 periods in this study”. As I understand, the authors chose the March 2014 regression model because it had the highest R-square value and then applied it to the datasets for prediction or simulation. However, all of these linear equations clearly come from an empirical approach, and all of them are site- and time-specific. I wonder why the authors decided to select one model among all the monthly models and apply it to the entire dataset when they could choose the best model in each month to diminish the time-specific effect.
Table 4. Move into the Appendix section and replace it with a time series graph showing the R-square in Y-axis and months (Jan-April) in the x-axis. It must have 7 lines for 7 datasets (2011-2017).
Line 423-424: “Therefore, MODIS-NDVI based developed model in Eq. (i) has been simulated in forecasting Boro crop production during 2011-2017 periods.” Use number instead of “i”. Remove “Therefore”.
Line 444-462: Move them in the methodology since the corresponding values have been reported in the previous sections.
Fig 3: These subfigures appear to have been stretched. Make sure that the aspect ratio is locked if you exported them from Excel. Also, the authors must discuss why the error increased from 2011 to 2017. What is the cause?
Minor:
Line 26-28: “Derived set of coefficient values for a given year have been utilized to obtain year-wise rice productions for all the remaining years.” Mention the remaining years specifically, such as 2016-2017.
Line 35-36: “and requires minimum human as well as machine resources.” Remove it.
Line 121-126: “The present study is focused on NDVI because it is widely used in phenological works, e.g. [45-47], and because it is known to be more sensitive to small increases in the amount of photosynthetic vegetation [48-49]. In view of the above, the objectives of this research are: a) to develop an effective methodological framework to retrieve useful information on seasonal crop and b) to forecast the seasonal crop production estimates based on the developed methodology for potential use in country’s national food security issues.” Put it in a new paragraph.
Line 192-194: “For over half the world's people Rice (Oryza sativa L.) is a staple food of which is grown on approximately 146 million hectares, more than 10 percent of total available land.” it requires a proper citation.
Line 327-328: “Hence present study duly considered the temporally dynamic and spatially heterogeneity characteristics in surface feature identification and masking operation with continuous time interval for regular updating (Table 3) for precise crop information” Check it again.
Line 376-377: “Various studies on the extensive use of NDVI values over the world for measuring the vegetation cover characteristics, crop assessment studies and….nations over the past 40 years including Bangladesh have been reported by different studies [73, 12].” These are not related to the results section. Move them into the introduction section.
Line 23-25: “Regression analysis between district-based pixel-wise summation of MODIS-NDVI and district-wise BBS (Bangladesh Bureau of Statistics) estimated Boro production (R2=0.57-0.85) reveals significant correlation between the two.” Change “reveals” to “revealed”.
Author Response
Response to Reviewer 1 Comments
Point 1: Line 16-20: “MVC technique minimizes cloud contamination, reduces off-nadir directional viewing effects, minimizes sun-angle and shadow effects, and minimizes aerosol and water-vapor effects as well. Vegetation mask layers consists of (i) Forests and trees, (ii) Seasonal agricultural crops and (iii) Others which have been generated from high resolution satellite data and have been utilized to find the district-wise rice pixel only.” These details do not need to be mentioned in the introduction. Move them into the section describing the datasets used in the current study.
Response 1: The sentence “MVC technique minimizes cloud contamination, reduces off-nadir directional viewing effects, minimizes sun-angle and shadow effects, and minimizes aerosol and water-vapor effects as well” has been deleted from the Abstract and moved to Section 2.5.
The sentence “Vegetation mask layers consists of (i) Forests and trees, (ii) Seasonal agricultural crops and (iii) Others which have been generated from high resolution satellite data and have been utilized to find the district-wise rice pixel only.” has been moved to Section 2.6.
Point 2: Line 23-25: “Regression analysis between district-based pixel-wise summation of MODIS-NDVI and district-wise BBS (Bangladesh Bureau of Statistics) estimated Boro production (R2=0.57-0.85) reveals significant correlation between the two.” R2 less than 0.7 cannot be considered as a significant correlation.
Response 2: This sentence has been revised and rewritten in the Abstract. In most cases the regression coefficient values are above 0.7; the exceptions are some values found in January and February among the 28 regression coefficients in Appendix 1. The regression coefficient values range from 0.57 to 0.85 across the twenty-eight (28) individually developed regression models. The lowest and highest values, R2=0.57 and R2=0.85, were found in April 2012 and March 2014, respectively. The reason is that the peak greenness period corresponds to the highest NDVI values [77-78, 29], and March 23/24 to April 6/7 (depending on leap year) is considered the peak greenness period in the context of Bangladesh [14, 23-24]. Other articles [79-81, 29, 78] also support the finding that the peak greenness period (March) is generally related to Boro crop production. The details of the regression coefficients are given in Section 3.1.
Point 3: Line 25-26: “Model coefficients have been obtained through mathematical optimization of the regression function against yearly data for a given year.” This does not need to be mentioned in the introduction. Everyone knows how linear regression works.
Response 3: The sentence “Model coefficients have been obtained through mathematical optimization of the regression function against yearly data for a given year” has been deleted from the Abstract.
Point 4: Line 97-98: “MODIS images have been widely used at larger regional scale due to its faster re-visit time (~1 day) and relatively smaller datasets resulting from its lower resolution [29-33, 20].”Smaller datasets in terms of what (the memory size, scale, pixel size, …)?
Response 4: We have added “memory size or data volume” to the text in the above-mentioned line.
Point 5: Line 112-114: “Moreover various geo-bio-environmental factors such as climate change consequences, increased natural disasters, uncertainty in geo- environmental variability appear to be as possible factors determining the crop production to ensure nation-wide food security in this area.” It is not related to the previous lines considering the authors claimed that the remote sensing regression-based model are “suitable even under threats of climate changes and natural disasters”
Response 5: Yes, I agree with your point. The phrase “suitable even under threats of climate changes and natural disasters” has been deleted from the Abstract as per your suggestion.
Point 6: Line 118-119: “In dense forest canopies the Normalized Difference Vegetation Index (NDVI) shows saturation effects [11]” Saturation is not solely related to dense forest. NDVI saturates in any densely vegetated area, particularly when it is mapped against an unbounded continuous parameter such as yield. To overcome this issue, some studies have used a non-normalized version of NDVI, namely N/R, and it is still not clear why the authors developed all of their regression models based on NDVI alone. I don’t know whether saturation occurred in the current study, but if the authors are confident that it did not occur in the developed models (because of the early growing stage, low-yield samples, etc.), they must clearly address it in the manuscript.
Response 6: According to [11], NDVI shows a saturation effect in dense forest canopies; hence we excluded the dense forest area of the country (the hilly districts of Bandarban, Khagrachari and Rangamati) to minimize this effect. We also noticed that saturation may occur above a certain range of NDVI values, which is why we masked out the other forest areas and homestead vegetation by applying the updated country-scale vegetation mask layer. In addition, Figure 6 shows a positive linear relation, and the curves show no sign of a saturation effect in the present study. Text on the saturation effect in the present study has been added to Section 2.7 as recommended.
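For readers following the saturation point, the normalized index and the non-normalized ratio N/R mentioned by the reviewer are related algebraically. The sketch below writes N and R for near-infrared and red reflectance; the symbol SR for the ratio is a label introduced here for illustration and is not taken from the manuscript.

```latex
\mathrm{NDVI} = \frac{N - R}{N + R}, \qquad
\mathrm{SR} = \frac{N}{R} = \frac{1 + \mathrm{NDVI}}{1 - \mathrm{NDVI}}
```

Because NDVI is bounded above by 1 while SR is not, equal steps in SR at high canopy density correspond to ever smaller steps in NDVI, which is one way to see why NDVI can flatten against an unbounded quantity such as yield.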
Point 7: Fig 1: still the quality of Fig 1 is too low. Make sure that the images have been exported in at least 300 dpi.
Response 7: We checked the figure quality, and the figure is now exported at 300 dpi.
Point 8: Table 2: Label those stages with months if applicable (e.g. Initial: May-June).
Response 8: Table 2 has been renumbered as Table 1, and the months corresponding to the days have been added as per your suggestion.
Point 9: Line 226-228: “Ultimately, each image corresponding to each month as obtained is based on the selection of the best pixel value from a total of 16 images of 16 consecutive dates of the same pixel.” It is not clear to me whether the final image considered for April, for instance, is the best image among 16 images or it is a merged image from the best pixels from 16 pixels for each pixel location.
Response 9: It is a merged image built from the best pixel among the 16 images at each pixel location. Moreover, the Terra MODIS product uses the Maximum Value Composite (MVC) technique in its data generation procedure; the details are given in Section 2.5.
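The per-pixel selection described in this response can be sketched in a few lines. This is a minimal illustration, assuming that "best pixel" means the maximum NDVI value at each location across the stack; the function name and the toy array are hypothetical, and operational products may apply additional quality screening that is not modelled here.

```python
import numpy as np

def maximum_value_composite(ndvi_stack):
    """Return the per-pixel maximum over the date axis (axis 0), ignoring NaNs.

    ndvi_stack has shape (n_dates, rows, cols); NaN marks a missing observation.
    """
    stack = np.asarray(ndvi_stack, dtype=float)
    return np.nanmax(stack, axis=0)

# Hypothetical stack: 3 dates over a 1 x 2 tile (a real composite would use 16 dates).
toy_stack = [
    [[0.2, 0.5]],
    [[0.6, np.nan]],   # second pixel unobserved on this date
    [[0.4, 0.3]],
]
composite = maximum_value_composite(toy_stack)
```

Each output pixel can come from a different date, which is what distinguishes a merged composite from picking the single best image of the 16.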
Point 10: Line 282-283: “At this stage, spatial masking operation has been carried out in the ERDAS Imagine platform using previously prepared vegetation mask layer.” The mask layer showing those type of vegetation (the output of ERDAS imagine) must be illustrated in the manuscript. I don’t know how difficult it is but to show the low changes of the mask layer during these years (2011-2017), I recommend adding two subfigures, one for mask layer in 2011 and the other one for 2017 with an appropriate legend.
Response 10: The mask layer is described in Sections 2.6 and 2.7. The procedure for generating and updating the vegetation mask layer is shown in Figure 2. As recommended, two figures (Figures 3a and 3b) have been added to the manuscript to show the changes in the mask layer over the period 2011-2017.
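As a rough illustration of how a mask layer feeds the district-wise NDVI sums discussed in this review, the following sketch applies a boolean crop mask and accumulates the remaining NDVI values per district. It assumes co-registered rasters of equal shape and integer district codes starting at 0; the function name and the toy arrays are hypothetical and do not reproduce the authors' ERDAS Imagine workflow.

```python
import numpy as np

def district_ndvi_sums(ndvi, crop_mask, district_ids, n_districts):
    """Sum NDVI over crop pixels for each district.

    ndvi, crop_mask and district_ids are co-registered arrays of the same shape.
    Pixels outside the mask, or with non-finite NDVI, are ignored.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    keep = np.asarray(crop_mask, dtype=bool) & np.isfinite(ndvi)
    ids = np.asarray(district_ids)[keep]
    return np.bincount(ids, weights=ndvi[keep], minlength=n_districts)

# Toy 2 x 2 raster: the top row belongs to district 0, the bottom row to district 1.
toy_ndvi = [[0.6, 0.7], [0.2, 0.5]]
toy_mask = [[1, 1], [0, 1]]          # 0 marks a masked-out (non-crop) pixel
toy_districts = [[0, 0], [1, 1]]
sums = district_ndvi_sums(toy_ndvi, toy_mask, toy_districts, n_districts=2)
```

Here district 0 sums to 1.3 and district 1 to 0.5, because the masked pixel with NDVI 0.2 is excluded.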
Point 11: Line 318-320: “Therefore, in this study relatively slowly growing vegetation features have been generated from high resolution optical images and moderate high-resolution satellite image with spatial resolution of 5 meter or so on have been used for spatial details.” The source of a high-resolution satellite image (5 m) must be included in the context.
Response 11: Table 2 provides the list of major surface features to be monitored along with necessary time interval, spatial resolution and data source.
Point 12: Line 347-355: “As a fundamental approach, the model … not used to retrieve the values of the parameters.” Still, it is not clear for the readers. As I understand, the authors consider 2011-2014 as a training set (retrieving regression coefficients) and 2014-2017 as testing or validating the models. If this is true, revise the paragraph to clearly address this concept.
Response 12: This paragraph has been revised and modified in the manuscript as:
“As a fundamental approach, the model coefficients have been derived through mathematical optimization of the functional model against a given data set covering a given time period 2011-2017. Then based on the highest regression coefficient value the regression model of March 2014 has been applied to independently generate crop production data values at country scale from year 2011-2017. After that the comparison has been made with the RS model based simulated results versus ground based BBS estimated crop production statistics for testing or validating the model which were not used to retrieve the values of the parameters.”
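The fit-then-apply workflow described in this revised paragraph can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line of district production against district NDVI sum; the function names and all numbers are hypothetical and are not the authors' data or code.

```python
import numpy as np

def fit_linear(ndvi_sum, production):
    """Least-squares fit of production = slope * ndvi_sum + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient of determination.
    """
    x = np.asarray(ndvi_sum, dtype=float)
    y = np.asarray(production, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2

def predict_total(slope, intercept, ndvi_sum):
    """Apply fitted coefficients to each district and add up the predictions."""
    x = np.asarray(ndvi_sum, dtype=float)
    return float(np.sum(slope * x + intercept))

def relative_error_percent(predicted, reported):
    """Signed relative error of a prediction against a reported value, in percent."""
    return 100.0 * (predicted - reported) / reported

# Illustrative values only: four districts in a "training" month and year.
train_ndvi = [120.0, 250.0, 310.0, 480.0]
train_prod = [95.0, 210.0, 240.0, 400.0]
slope, intercept, r2 = fit_linear(train_ndvi, train_prod)

# Apply the same coefficients to another year's NDVI sums and compare totals.
other_year_ndvi = [130.0, 240.0, 300.0, 500.0]
other_year_reported_total = 960.0
predicted_total = predict_total(slope, intercept, other_year_ndvi)
error = relative_error_percent(predicted_total, other_year_reported_total)
```

A comparison of this kind is only an independent check for years whose data were not used to fit the coefficients, which is the distinction the reviewer asks the authors to make explicit.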
Point 13: Line 392-393: “The regression coefficient from January to April ranges from 0.61-0.84; 0.63-0.83; 0.63-0.84;0.64-0.85; 0.72-0.84; 0.57-0.75 and 0.57-0.65 in 2011; 2012; 2013; 2014; 2015; 2016 and 2017, respectively” Remove these numbers and just bring upper and lower bounds or mean of them in the manuscript.
Response 13: The coefficient values for 2011-2017 are now reported in the manuscript as upper and lower bounds.
Point 14: Line 409-410: “Therefore, based on the highest regression coefficient of March 2014 a Boro crop production forecasting model has been selected to apply over 2011-2017 periods in this study”. As I understand, the authors chose the March 2014 regression model because it had the highest R-square value and then applied it to the datasets for prediction or simulation. However, all of these linear equations clearly come from an empirical approach, and all of them are site- and time-specific. I wonder why the authors decided to select one model among all the monthly models and apply it to the entire dataset when they could choose the best model in each month to diminish the time-specific effect.
Response 14: We chose the March 2014 regression model from the 28 regression models because it shows the highest R-square value and because, in the context of Bangladesh, it is the most practical and realistic choice given the vegetative growth pattern of the Boro crop. In Bangladesh, the Boro rice season lasts from January to April: transplantation begins in January and the crop matures at the end of March. From our experience over the last decade in monitoring the Boro crop life cycle for country-scale estimation of Boro rice area, we expect the highest NDVI values in the peak greenness period (March), and the regression coefficient is accordingly highest in March. Appendix 1 shows that in every year the regression coefficient for March is higher than for the other months. Other articles [83-85, 29, 82] also support the finding that the peak greenness period (March) is generally related to Boro crop production. We also applied the other regression equations to estimate Boro crop production but, as expected, the difference between the predicted and the estimated production was larger than with the presently applied model. Furthermore, the regression coefficient at yearly scale (Table 4) has been derived to show the closeness of the forecasted and estimated statistics at country scale. Therefore, in the context of Bangladesh, with its diverse landscape and small plot sizes, the best regression model in each month may not represent the actual surface feature conditions at country scale.
Point 15: Table 4. Move into the Appendix section and replace it with a time series graph showing the R-square in Y-axis and months (Jan-April) in the x-axis. It must have 7 lines for 7 datasets (2011-2017).
Response 15: As recommended, Table 4 has been moved to the Appendix, and a time series graph has been generated showing R-square on the Y-axis and months (Jan-April) on the x-axis, with 7 lines for the 7 datasets (2011-2017). The manuscript text has also been revised.
Point 16: Line 423-424: “Therefore, MODIS-NDVI based developed model in Eq. (i) has been simulated in forecasting Boro crop production during 2011-2017 periods.” Use number instead of “i”. Remove “Therefore”.
Response 16: All the equation numbers have been modified.
Point 17: Line 444-462: Move them in the methodology since the corresponding values have been reported in the previous sections.
Response 17: The mathematical equations have been moved to methodology section.
Point 18: Fig 3: These subfigures appear to have been stretched. Make sure that the aspect ratio is locked if you exported them from Excel. Also, the authors must discuss why the error increased from 2011 to 2017. What is the cause?
Response 18: Figure 3 has been renumbered as Figure 6 and exported at 300 dpi, as suggested by the previous reviewer, and the aspect ratio is now locked.
Point 19: Line 26-28: “Derived set of coefficient values for a given year have been utilized to obtain year-wise rice productions for all the remaining years.” Mention the remaining years specifically, such as 2016-2017.
Response 19: It has been modified.
Point 20: Line 35-36: “and requires minimum human as well as machine resources.” Remove it.
Response 20: These words “and requires minimum human as well as machine resources” have been removed.
Point 21: Line 121-126: “The present study is focused on NDVI because it is widely used in phenological works, e.g. [45-47], and because it is known to be more sensitive to small increases in the amount of photosynthetic vegetation [48-49]. In view of the above, the objectives of this research are: a) to develop an effective methodological framework to retrieve useful information on seasonal crop and b) to forecast the seasonal crop production estimates based on the developed methodology for potential use in country’s national food security issues.” Put it in a new paragraph.
Response 21: These lines have been put in a new paragraph.
Point 22: Line 192-194: “For over half the world's people Rice (Oryza sativa L.) is a staple food of which is grown on approximately 146 million hectares, more than 10 percent of total available land.” it requires a proper citation.
Response 22: A proper citation has been added to the manuscript text and to the reference section. All reference numbers have changed because of the new reference.
Point 23: Line 327-328: “Hence present study duly considered the temporally dynamic and spatially heterogeneity characteristics in surface feature identification and masking operation with continuous time interval for regular updating (Table 3) for precise crop information” Check it again.
Response 23: Table 3 has been renumbered as Table 2. It has been updated, and its details are described in Sections 2.6 and 2.7.
Point 24: Line 376-377: “Various studies on the extensive use of NDVI values over the world for measuring the vegetation cover characteristics, crop assessment studies and….nations over the past 40 years including Bangladesh have been reported by different studies [73, 12].” These are not related to the results section. Move them into the introduction section.
Response 24: These lines have been deleted because the same point is made elsewhere in the manuscript.
Point 25: Line 23-25: “Regression analysis between district-based pixel-wise summation of MODIS-NDVI and district-wise BBS (Bangladesh Bureau of Statistics) estimated Boro production (R2=0.57-0.85) reveals significant correlation between the two.” Change “reveals” to “revealed”.
Response 25: It has been changed.
Author Response File: Author Response.docx
Reviewer 2 Report
Summary:
This research developed a methodology to monitor and forecast rice crop production in Bangladesh using RS and GIS. This topic is interesting, but the writing needs to be substantially improved, especially for the methodology section. Specific comments:
Abstract section:
Line 16-20: “MVC technique minimizes cloud contamination, reduces off-nadir directional viewing effects, minimizes sun-angle and shadow effects, and minimizes aerosol and water-vapor effects as well. Vegetation mask layers consists of (i) Forests and trees, (ii) Seasonal agricultural crops and (iii) Others which have been generated from high resolution satellite data and have been utilized to find the district-wise rice pixel only.”
These two sentences comprehensively introduce MVC and the mask layer. It is important, but it is not necessary in the abstract section. Compared to the details of mask layer, the readers are more interested in the names of data/product, classification methods and accuracies of rice pixels.
Line 22: “Hence the district-wise sum of NDVI on pixel by pixel has been calculated.”
Please elaborate “sum of NDVI” in which month/period in which year.
Line 26-27: “Derived set of coefficient values for a given year have been utilized to obtain year-wise rice productions for all the remaining years.” This is very confusing. Do you mean that you obtain rice production for 2012-2017 using 2011, and for 2011 and 2013-2017 using 2012? What is the purpose of doing this iteratively?
Introduction section:
Line 105-106: “Previously, very few works on remote sensing-based crop yield forecasting systems have been performed in comparatively smaller area in Bangladesh for example:”
Why is a study in a smaller area more important?
Line 108-109: “[38] have found high R2 of 0.84, 0.72 and 0.80 for the NDVI, LAI and fPAR, respectively between the regression model of vegetation index and field level potato yield;” Please add the study area in this example.
Line 109-111: “[39] found good agreements between forecasted (i.e., MODIS-based) and ground-based Boro rice yield i.e., R2 (0.76 and 0.86); RMSE (0.21 and 0.29 Mton/ha), and RE (-5.45% and 6.65%).” Please also add the study area in this example.
Methodology section:
“Figure 1. Map showing the sample MODIS imagery over the study area”. The labels of grid only need to show degree for conciseness because minutes and seconds are both 0.
Table 1 is not necessary. Line 195: “mostly in Asia [54] and Bangladesh is the 4th top rice producing country in the world (Table 1)” This sentence is enough to highlight the importance of study area.
Line 207: “after this phonological stage and chlorophyll”
It is “phenological” instead of “phonological stage”.
Please add the start date and end date for each stage in Table 2. Because these dates would help choose the reasonable period of NDVI sum to correlate with crop production.
Line 224-226: “Data coverage for each month from January to April time period covering Boro lifecycle during 2011 to 2017 have been utilized.” Why do the authors choose the period from January to April? Is it the growing season? This period is very important, which should include the growing season at a pixel level.
L238: “estimation reporting system the selected sample clusters are visited four times in a year”
Please add the month/date for each visit, and what variables are measured for each visit?
“Section 2.6 Digital Overlay and Masking Operation”
This section is also very confusing. I recommend the authors at least highlight the name of data/product and methodology, and classification accuracy.
Author Response
Response to Reviewer 2 Comments
Point 1: Line 16-20: “MVC technique minimizes cloud contamination, reduces off-nadir directional viewing effects, minimizes sun-angle and shadow effects, and minimizes aerosol and water-vapor effects as well. Vegetation mask layers consists of (i) Forests and trees, (ii) Seasonal agricultural crops and (iii) Others which have been generated from high resolution satellite data and have been utilized to find the district-wise rice pixel only.”
These two sentences comprehensively introduce MVC and the mask layer. It is important, but it is not necessary in the abstract section. Compared to the details of mask layer, the readers are more interested in the names of data/product, classification methods and accuracies of rice pixels.
Response 1: These sentences have been deleted from the Abstract. The mask layer is described in Sections 2.6 and 2.7. The procedure for generating and updating the vegetation mask layer is shown in Figure 2. In addition, two figures (Figures 3a and 3b) have been added to the manuscript to show the changes in the mask layer over the period 2011-2017.
Point 2: Line 22: “Hence the district-wise sum of NDVI on pixel by pixel has been calculated. ”Please elaborate “sum of NDVI” in which month/period in which year.
Response 2: This line has been revised with month and year as per your suggestion.
Point 3: Line 26-27: “Derived set of coefficient values for a given year have been utilized to obtain year-wise rice productions for all the remaining years.” This is very confusing. Do you mean that you obtain rice production for 2012-2017 using 2011, and for 2011 and 2013-2017 using 2012? What is the purpose of doing this iteratively?
Response 3: This line has been modified. We meant that, based on the highest regression coefficient value, the regression model of March 2014 has been applied to independently generate crop production values at country scale for 2011-2017.
Point 4: Line 105-106: “Previously, very few works on remote sensing-based crop yield forecasting systems have been performed in comparatively smaller area in Bangladesh for example:” Why is a study in a smaller area more important?
Response 4: Yes, we agree with your statement; we have revised the lines and added the study area as well. This section was added in response to the previous reviewer's comment.
Point 5: Line 108-109: “[38] have found high R2 of 0.84, 0.72 and 0.80 for the NDVI, LAI and fPAR, respectively between the regression model of vegetation index and field level potato yield;” Please add the study area in this example.
Response 5: The study area name has been added in this section.
Point 6: Line 109-111: “[39] found good agreements between forecasted (i.e., MODIS-based) and ground-based Boro rice yield i.e., R2 (0.76 and 0.86); RMSE (0.21 and 0.29 Mton/ha), and RE (-5.45% and 6.65%).” Please also add the study area in this example.
Response 6: The study area name has been added in this section and the sentence has been modified.
Point 7: “Figure 1. Map showing the sample MODIS imagery over the study area”. The labels of grid only need to show degree for conciseness because minutes and seconds are both 0.
Response 7: The labels of grid in Figure 1 have been changed with degree only.
Point 8: Table 1 is not necessary. Line 195: “mostly in Asia [54] and Bangladesh is the 4th top rice producing country in the world (Table 1)” This sentence is enough to highlight the importance of study area.
Response 8: Table 1 has been deleted.
Point 9: Line 207: “after this phonological stage and chlorophyll” It is “phenological” instead of “phonological stage”.
Response 9: It has been modified.
Point 10: Please add the start date and end date for each stage in Table 2. Because these dates would help choose the reasonable period of NDVI sum to correlate with crop production.
Response 10: Table 2 has been renumbered as Table 1, and the months corresponding to the days have been added as per your suggestion.
Point 11: Line 224-226: “Data coverage for each month from January to April time period covering Boro lifecycle during 2011 to 2017 have been utilized.” Why do the authors choose the period from January to April? Is it the growing season? This period is very important, which should include the growing season at a pixel level.
Response 11: Generally, the Boro life cycle lasts from January to April in Bangladesh. Table 1 shows the life cycle of the Boro crop over the north-western part of Bangladesh. Based on our practical experience of monitoring Boro crop area, we selected January-April to cover the life cycle.
Point 12: L238: “estimation reporting system the selected sample clusters are visited four times in a year” Please add the month/date for each visit, and what variables are measured for each visit?
Response 12: BBS is a national organization dealing with national-level statistics, and it has a specific plan for visiting the defined clusters at country level. Field-level officers monitor the status of field crops continuously, but officially they visit each cluster four times a year. They record acreage, plot area, yield rate, production rate, etc. for specific crops and report to the authority. The details of the ground-based data collection procedure are given in Section 2.3.
Point 13: “Section 2.6 Digital Overlay and Masking Operation”
This section is also very confusing. I recommend the authors at least highlight the name of data/product and methodology, and classification accuracy.
Response 13: The procedure for generating and updating the mask layer is illustrated in Figure 2. The data products and classification accuracy are given in the text. The details of the mask layer are described in Sections 2.6 and 2.7.
Round 2
Reviewer 1 Report
Although the authors didn’t highlight the parts that are changed in this version, the manuscript has been improved and my major comments are addressed except one. I copied and pasted again along with the authors’ response and re-write it to make it more clear for the authors. In addition, there are several major issues in new tables and figures that must be revised before publishing. Here are my comments:
Major:
Reviewer Comment: Line 409-410: “Therefore, based on the highest regression coefficient of March 2014 a Boro crop production forecasting model has been selected to apply over 2011-2017 periods in this study”. As I understand, the authors chose March 2014 regression model since it had the highest value of r-square and then applied it on the datasets as prediction or simulation. However, it is clear that all these linear equations came from an empirical approach. All of them are site-and time-specific. I’m wondering to know why the authors decided to select one model among all those monthly models and employ it over total datasets while they can choose the best model in each month to diminish the effect of the time-specific issue.
Author Response: We chose the March 2014 regression model from the 28 regression models because it shows the highest value of r-square and because, given the vegetative growth structure of the Boro crop, it is the more practical and realistic choice for Bangladesh. The Boro rice season in Bangladesh lasts from January to April: transplantation begins in January and the crop matures at the end of March. From our experience over the last decade of monitoring the Boro crop life cycle to estimate Boro rice area at country scale, we expect the highest NDVI values in the peak greenness period (March), and the regression coefficient is accordingly highest in March. Appendix 1 shows that in every year the regression coefficient for March is higher than for the other months. Other articles [83-85, 29, 82] also support the finding that the peak greenness period (March) is generally related to Boro crop production. We also applied the other regression equations to estimate Boro crop production, but, as expected, the difference between the predicted and the estimated production was larger than with the presently applied model. Furthermore, the regression coefficient at yearly scale (Table 4) has been derived to show the closeness of the forecasted and estimated statistics at country scale. Therefore, in Bangladesh, where the landscape is diverse and plots are small, the best regression model in each month may not represent the actual surface conditions at country scale.
Reviewer Comment:
1- Look at Fig 5. My point is why the authors decided to apply the regression model fitted over March 2014 datasets for the entire datasets while its performance is not acceptable for January and February. In contrast, it is much better to use the dark blue line for January and February, the yellow line for March and the gray line for April. Using March 2014 model as the best model is questionable. It is not clear to readers why the authors pick the March 2014 model for simulating the crop product for January while we have a better model for simulating in January (Dark blue line). The authors could simply report their model for simulation of the crop product at the yearly scale using “if else statement”. For example: if month==1 or 2, use model 1 (dark blue), else if month==3, use model 2 (yellow line), else use model 3 (gray line).
2- Fig 1: use a stretch color bar for MODIS NDVI considering the low NDVI regions in red and high NDVI regions green.
3- Eq. (4) Remove Y, use BCP and define it as Boro Crop Production. Instead of NDVI_sum use the mathematical shape of that (use Sigma sign).
4- Fig 5: Use the names of the months instead of 1, 2, …, 4. In Fig 5, the Y-axis must be R-square, not regression coefficient values. If it is not drawn based on R-square, update it. In addition, remove Jan-April from the legend and keep only the years.
5- Table 4. Remove decimals for BBS Estimated, MBE and RMSE. Use a thousand separator for RS model estimated, BBS estimated, MBE and RMSE items.
6- Table 4. Move it into the Appendix section and add a radar chart showing R2, MBE, RMSE, ME for each year. But show them in one radar chart.
7- If it is possible, add a figure showing the simulated crop production based on the best model and for each year in subfigures.
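Comment 1 above proposes selecting the regression model by month instead of applying a single model everywhere. A minimal sketch of that if/else selection follows. The slope and intercept values are hypothetical placeholders, not coefficients from the manuscript, and `MODELS`, `select_model` and `estimate_bcp` are illustrative names. The sketch also restricts the final branch to April and rejects months outside the January-April Boro season, which is slightly stricter than the plain "else" in the comment.

```python
# Hypothetical (slope, intercept) pairs; placeholders only, not the fitted values.
MODELS = {
    "model_1": (0.10, 500.0),   # would be fitted on January/February data
    "model_2": (0.25, 1200.0),  # would be fitted on March data
    "model_3": (0.18, 800.0),   # would be fitted on April data
}


def select_model(month: int) -> str:
    """Return the model key for a calendar month (1 = January ... 4 = April)."""
    if month in (1, 2):
        return "model_1"
    elif month == 3:
        return "model_2"
    elif month == 4:
        return "model_3"
    raise ValueError(f"month {month} is outside the January-April season")


def estimate_bcp(ndvi_sum: float, month: int) -> float:
    """Estimate Boro crop production from a district NDVI sum with the month's linear model."""
    slope, intercept = MODELS[select_model(month)]
    return slope * ndvi_sum + intercept
```

With this layout, swapping in real fitted coefficients only changes the `MODELS` table; the selection logic stays the same.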
Minor:
1- Table 1: Make the text center.
2- Fig 2: change “GPS-based Ground Verification & Correction” to “GPS-based Ground Verification and Correction”
3- Eq. (2) and Eq (3): Change “I” to “i”
4- in Fig 6, subfigures are stretched.
Author Response
Response to Reviewer 1 Comments (Third round)
Point 1: Look at Fig 5. My point is why the authors decided to apply the regression model fitted over March 2014 datasets for the entire datasets while its performance is not acceptable for January and February. In contrast, it is much better to use the dark blue line for January and February, the yellow line for March and the gray line for April. Using March 2014 model as the best model is questionable. It is not clear to readers why the authors pick the March 2014 model for simulating the crop product for January while we have a better model for simulating in January (Dark blue line). The authors could simply report their model for simulation of the crop product at the yearly scale using “if else statement”. For example: if month==1 or 2, use model 1 (dark blue), else if month==3, use model 2 (yellow line), else use model 3 (gray line).
Response 1: Following your suggestion, we have developed two more models in the revised manuscript, one for January/February and one for April, based on the regression values in Figure 5. We should mention that we did not apply the model generated from March to January, February, or April across the full dataset (2011-2017). We applied the model developed from March (BCP Model 2) only to the month of March for all years (2011-2017), and it is also statistically significant. Based on your comments, the revised manuscript now states that BCP Model 1 and BCP Model 3 can be used, but that BCP Model 2 is more suitable and realistic because photosynthetic activity is highest in March (based on the crop calendar). We hope this gives the reviewer a realistic picture of the country's crop production. We believe your comment, which we had not fully understood previously, has now been addressed.
Point 2: Fig 1: use a stretch colour bar for MODIS NDVI considering the low NDVI regions in red and high NDVI regions green.
Response 2: Figure 1 has been revised according to your suggestion.
Point 3: Eq. (4) Remove Y, use BCP and define it as Boro Crop Production. Instead of NDVI_sum use the mathematical shape of that (use Sigma sign).
Response 3: All the equations have been modified as per your suggestions and sigma sign has been inserted at mathematical shape.
Point 4: Fig 5: Use the names of the months instead of 1, 2, …, 4. In Fig 5, the Y-axis must be R-square, not regression coefficient values. If it is not drawn based on R-square, update it. In addition, remove Jan-April from the legend and keep only the years.
Response 4: The names of the month have been modified instead of number and Y-axis has been revised as per suggestion. The legend has also been changed.
Point 5: Table 4. Remove decimals for BBS Estimated, MBE and RMSE. Use a thousand separator for RS model estimated, BBS estimated, MBE and RMSE items.
Response 5: Table 4 has been moved to Appendix 2 where the thousand separators have been used for RS model estimated, BBS estimated, MBE and RMSE items.
Point 6: Table 4. Move it into the Appendix section and add a radar chart showing R2, MBE, RMSE, ME for each year. But show them in one radar chart.
Response 6: According to your suggestion a radar chart in Figure 6 has been prepared which shows the MBE, RMSE, ME for each year.
Point 7: If it is possible, add a figure showing the simulated crop production based on the best model and for each year in subfigures.
Response 7: We plan to carry out a detailed study on this issue, which requires more time. We hope you understand our position. Thank you again for your constructive comments on all the issues.
Minor Revision:
Point 8: Table 1: Make the text center.
Response 8: The texts in the table have been centred.
Point 9: Fig 2: change “GPS-based Ground Verification & Correction” to “GPS-based Ground Verification and Correction”
Response 9: It has been changed.
Point 10: Eq. (2) and Eq (3): Change “I” to “i”
Response 10: The equation numbers have been modified. The RS models have been revised to BCP Model 1, BCP Model 2 and BCP Model 3 instead of Equation (iv-vi).
Point 11: in Fig 6, subfigures are stretched.
Response 11: All the subfigures have been prepared in 300 dpi as per instruction during the first round review. It has been exported at 300 dpi through Microsoft XL toolbox for publication quality.
Reviewer 2 Report
Overall, I am happy about the revision. Just one point:
Line 22:
“Therefore, derived set of coefficient value (BCP Model 2) has”. What is BCP Model 2? I recommend to say like this “ the highest regression coefficient value from derived set of coefficient value …….”
Author Response
Overall Comments: Overall, I am happy about the revision. Just one point:
Authors Response: Thank you very much for your constructive comments, which really helped us to improve the article from its previous version. We have acknowledged the reviewers' contributions within the manuscript.
Point 1: Line 22: “Therefore, derived set of coefficient value (BCP Model 2) has”. What is BCP Model 2? I recommend to say like this “the highest regression coefficient value from derived set of coefficient value …….”
Response 1: According to your suggestion, we have expanded BCP Model as Boro Crop Production Model at line 22 and rewritten the sentence as “the highest regression coefficient value from derived set of coefficient value (BCP-Boro Crop Production Model 2) has been utilized to obtain year-wise rice productions for all the years (2011-2017)”.
Round 3
Reviewer 1 Report
My comments are completely answered by the authors. There is only one issue in Fig 6 that needs to be resolved before final submission.
(1) According to Table 2 (Appendix), it seems the range of ME is between 0-1 while MBE and RMSE are not. That's why we cannot see the impact ME on the figure. Due to the scale and range of ME, I think it is better to remove ME from the figure or instead of one radar chart, prepare 3 radar charts for each of stats (MBE, RMSE, ME) separately.
(2) In addition to the range issue, there is a blue dot on "0" that needs be removed as well.
(3) Check the value of MBE in 2014 again (Table 2).
(4) If it is possible, report |MBE| (absolute error) in Fig 6 instead of MBE. I'm asking this because 0 of MBE is not located in the center. By reporting absolute values of MBE (|MBE|) in Fig 6, this issue will be solved.
Author Response
Overall Comments: My comments are completely answered by the authors. There is only one issue in Fig 6 that needs to be resolved before final submission.
Authors Response: Thank you very much for your helpful comments in reviewing the manuscript. The manuscript has really benefited from them, and we have acknowledged the reviewers' contributions within the manuscript.
Point 1: According to Table 2 (Appendix), it seems the range of ME is between 0-1 while MBE and RMSE are not. That's why we cannot see the impact ME on the figure. Due to the scale and range of ME, I think it is better to remove ME from the figure or instead of one radar chart, prepare 3 radar charts for each of stats (MBE, RMSE, ME) separately.
Response 1: We have removed ME from Figure 6 as per your observation and prepared a new radar chart.
Point 2: In addition to the range issue, there is a blue dot on "0" that needs be removed as well.
Response 2: The value of MBE ranges from negative value to positive value.
Point 3: Check the value of MBE in 2014 again (Table 2).
Response 3: The value of MBE in 2014 has been checked again and is correct.
Point 4: If it is possible, report |MBE| (absolute error) in Fig 6 instead of MBE. I'm asking this because 0 of MBE is not located in the center. By reporting absolute values of MBE (|MBE|) in Fig 6, this issue will be solved.
Response 4: Reporting the absolute value of MBE instead of MBE would require a lot of additional statistical calculation. We plan to carry out a detailed study on this issue, which requires more time. We hope you understand our position. In addition, we chose these statistical tools following relevant published articles that used them to assess the accuracy of predicted results.
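For reference, the error statistics discussed in Points 1-4 can be computed as in the sketch below. It assumes MBE is the mean of (predicted − observed) and RMSE is the square root of the mean squared difference; the manuscript's exact definitions are not shown in this record, so the sign convention is an assumption. ME is left out because its definition is not given here. The function names and the numbers are made up for illustration.

```python
import math


def mean_bias_error(predicted, observed):
    """Mean of (predicted - observed); negative when the model underestimates on average."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def root_mean_square_error(predicted, observed):
    """Square root of the mean squared difference between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up district-level values, not data from the manuscript.
predicted = [95.0, 105.0, 90.0, 102.0]
observed = [100.0, 100.0, 100.0, 100.0]

mbe = mean_bias_error(predicted, observed)          # signed bias
abs_mbe = abs(mbe)                                  # |MBE|, as requested for the radar chart
rmse = root_mean_square_error(predicted, observed)
```

Under these assumptions, |MBE| is obtained from an already computed MBE by taking its absolute value, so a radar chart axis that starts at zero can display it directly.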
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Summary:
The authors tried to predict Boro rice production by building the relationship between MODIS-NDVI and ground-based estimated Boro production. The topic is interesting. However, there are several major issues needed to be addressed.
1. This study only chose one NDVI image in each month from January to April. For example, January 1st, the second half month from February to April. What are the reasons to choose these four images?
2. Line 115: “(iii) necessary geometric and atmospheric corrections”. Please elaborate what kinds of geometric and atmospheric corrections were carried out.
3. L121-123: “the country scale vegetation layer has been used to mask the forest area [40] to consider the rice pixel only hence the district-specific sum of NDVI-values has been extracted from January to April at respective years”.
Which vegetation layer did you use? Does the vegetation layer only comprise of forest and rice? What about other vegetation types?
When you calculate the sum of NDVI-values for each district, how do you exclude the bad quality values?
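The masking questions in point 3 can be made concrete with a short sketch of a district-wise NDVI sum that keeps only rice pixels of acceptable quality. This is an illustration under stated assumptions, not the authors' procedure: `ndvi`, `district_id`, `rice_mask` and `good_quality` are hypothetical co-registered arrays, and the boolean quality array is assumed to have been derived beforehand from whatever quality information accompanies the NDVI product.

```python
import numpy as np


def district_ndvi_sums(ndvi, district_id, rice_mask, good_quality):
    """Sum NDVI per district over pixels that are rice, good quality, and finite."""
    valid = rice_mask & good_quality & np.isfinite(ndvi)
    sums = {}
    for d in np.unique(district_id[valid]):
        sums[int(d)] = float(ndvi[valid & (district_id == d)].sum())
    return sums


# Tiny made-up 2x3 scene with two districts.
ndvi = np.array([[0.5, 0.6, 0.7],
                 [0.2, 0.8, np.nan]])
district_id = np.array([[1, 1, 2],
                        [1, 2, 2]])
rice_mask = np.array([[True, True, True],
                      [False, True, True]])
good_quality = np.array([[True, False, True],
                         [True, True, True]])

sums = district_ndvi_sums(ndvi, district_id, rice_mask, good_quality)
```

In this toy scene one pixel is dropped for quality, one for not being rice, and one for a missing value. Districts with no valid pixels do not appear in the result, which is worth checking before any regression step.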
Specific comments:
1. Line 31-32: “Moreover, the scientific consensus for the GBM basin projected a temperature rise of 1-3°C and 20 percent more monsoon rainfall by 2050”. What does the GBM represent?
2. Line 44: “the European Community have been using RS data”. Please spell out “RS” when first using the abbreviation.
3. L116-117: “mosaicking of the collected images and respective band math to improve the quality of acquired satellite data”. How the mosaic and band math improve the quality of satellite data?
4. L238: in Table 2: Does Ground-based or model-based estimated production refer to the entire production in total 61 districts?
5. Line 289-290:“This section is not mandatory, but can be added to the manuscript if the discussion is unusually long or complex.” Is this sentence necessary?
Reviewer 2 Report
General Comments:
The authors developed linear regression models between NDVI obtained from MODIS imagery and ground-based measurements for estimation of rice productivity at the monthly scale. Then, they evaluated a general linear model selected among all the regression models on a yearly scale. The introduction was fine and told a coherent story to highlight the importance of rice productivity estimates in developing countries. However, the authors confirmed that there are several studies working and focusing on developing regression models between vegetation indices and rice productivity. Also, I believe that the hot topics in this field are moving towards the intelligence based techniques such as deep learning and involving the canopy structure information that can be captured by lidar systems. Therefore, fitting a linear regression model with one input (NDVI) is not a novel and attractive study for the readers. The methodology must be completed. I asked several questions in that section. The responses for those questions must be added in that section. The results must be completely revised. The figures were not informative and the quality of them was too low. There are many points that need to be clarified by the authors. I have listed my comments that need to be considered by the authors to improve the manuscript.
Major:
Abstract
1- Line 7-8: “The regression analysis between the sum of MODIS-NDVI and ground-based estimated Boro production statistics (R2=0.47-0.85) reveals strong agreements”. I don’t believe that, when R2 is less than 0.6, we can conclude there is a strong correlation between estimated and observed values. Also, it seems this study compared two empirical methods developed for rice production. If this is true, we cannot use the phrase “ground-based”.
2- Line 10-11: “the highest regression coefficient based model has been implemented to forecast the Boro crop estimation over 2011-2017 periods.” Since the authors considered a period between 2011 and 2017, this study cannot be classified as a forecasting problem. It is better to say they “simulated” or estimated the rice production using remotely sensed data.
3- Line 18-19: “The potentiality of the MODIS-NDVI based regression model to generate the Boro crop production statistics for forecasting on a timely basis at country scale”. It is really important to state the applicable scale when a model is developed for prediction or estimation. It is good that the authors report the proper scale of their regression model. However, I wonder how they concluded that the developed NDVI-production regression model is suitable at the country scale. Did they evaluate the model in terms of pixel resolution or a grid-size sensitivity analysis?
Materials and Methods
4- Fig1: the table which is inside the figure is stretched. Also, the legend must be added to show the name of the colored area. Also, I highly recommend adding a sample of MODIS imagery over Fig1 as a transparent layer.
5- The BBS estimated rice crop production method considered the ground-based records must be described in a short paragraph. In addition, it is not clear how the authors extract the rice pixel from MODIS imagery. The impact of a mixture of information consisting of rice and soil pixels must be addressed in this section.
6- Fig 2: it is not clear how the authors fitted the regression model between NDVI and rice production. Were they in the same grid size (spatial scale)? Did the authors use an upscaling or downscaling approach to make them at the same matrix size before fitting the regression? Since they used the summation of NDVI for correlation analysis, the impact of cloud and cloud shadows must be described. Did the authors filter out the cloudy imagery? The answers to these questions must be added in the Methodology section.
Introduction
7- Line 36-38: “food crises and grain price variation that can be the cause of social troubles due to economic and political changes. Hence the developing countries like Bangladesh need scientific instruments which would be able to provide them updated information” I’m wondering to know how the scientific instrument can prevent grain price variation. The answer to this question must be included before “Hence,…”
8- Line 53-54: “it is obvious that Bangladesh is facing numerous challenges in terms of rice production estimation” After reading the introduction, I think facing numerous challenges couldn’t be an obvious fact for the readers. The authors have to report some information about the increasing rate of population, the impact of climate change (trend of temperature, rainfall, water consumption, and water demands), land use changes particularly for Bangladesh.
9- Line 55-57: “Several works have been performed to find out the alternate method of remote sensing-based techniques for effective forecasting in rice yield and found strong relationship between satellite-based NDVI with ground-based rice yield for forecasting purpose [17-21, 12-13]” As the authors correctly mentioned, there numerous studies focusing on estimation and prediction of strategic crops particularly wheat and rice using remotely sensed data. In addition, considering only one parameter such as NDVI or any vegetation indices with a linear regression model is not new. When the trend of study in this type of problem is to moving toward deep learning considering many possible inputs, precision agriculture, and smart farm management, I believe that this study is lacking a novelty. After reading the introduction, I think the only part that may be new for the readers is related to the case study which is Bangladesh. So, if there is another aspect that is not clearly mentioned in the introduction, it must be described in detail in the last paragraph of the introduction. Besides, the authors mentioned that “The reason behind the selection of MODIS NDVI products is that the NDVI dynamics is representative of crop growth and biomass changes and is closely related to crop yield and has a direct correlation with LAI, biomass and vegetation cover”. However, there are many studies showing the relationships developed between vegetation indices and biomass are not reliable due to the saturation situation happening for well-developed canopies and crops. This means that the relationships between a vegetation indices such as ndvi and yield is not linear when the crop is ready to be harvested. According to those studies, the relationship is close to an exponential behavior. That’s why they conclude that the vegetation indices are not proper to be individually considered in a prediction or estimation model.
Results
10- Figure 3: The type of figure 3 must change. Since the x-axis of this graph is not a continuous parameter, using a continuous line leads to having a confusing figure. It must be converted to a scatter plot or bar plot. In some cases, it seems the order of NDVIs is not correct which means that with growing the crop we have lower values for NDVIs (see Kishoregonji, Bogra, etc). Showing BBS production records in this graph is not necessary. If the authors want to keep it, it is better to be added as a secondary y-axis, not in the same axis. Since these lines, inherently, are time series, I think it would be more informative to use a time series graph type, dates (January 2012-April 2012) in the x-axis, NDVI in axis and colors for each region.
11- Line 219-221: “Boro rice forecasting model from these regression models can be developed and based on the highest regression coefficient a Boro crop production model has been selected to apply over the 2011-2017 period in this study.” That is completely unclear to me why the authors decided to apply the highest regression coefficient which was found for each month and for each year (March 2013) for the entire study period (2011-2017)! Instead, they could fit the linear regression over all the data points (involving all the MODIS-NDVIs versus ground-based observations). In addition, the results presented based on the regression model is not consistent with the values reported in Table2. As you can see the R-square reported for 2011-2017 in Table 2 using the modeled equation (yearly scale) is much better than the regression model fitted over the monthly scale! This part of the study must be re-calculated. For instance, for 2011, I expected to see a value close to the average value of monthly scale (average of r-square for January 2011- April 2011) not close to the model found just for March.
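The alternative raised in comment 11, fitting one linear regression over all data points, can be sketched as below. This is a minimal illustration using `numpy.polyfit`; `fit_linear` and the `monthly` dictionary are hypothetical names, and the values are made up rather than taken from the manuscript.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares line y = slope * x + intercept, returned with its R-square."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


# Made-up (NDVI sum, production) samples keyed by (year, month).
monthly = {
    ("2014", "Feb"): ([1.0, 2.0], [3.0, 5.0]),
    ("2014", "Mar"): ([3.0, 4.0], [7.0, 9.0]),
}

# Pool every month and year into one sample before fitting.
pooled_x = np.concatenate([np.asarray(x) for x, _ in monthly.values()])
pooled_y = np.concatenate([np.asarray(y) for _, y in monthly.values()])
slope, intercept, r2 = fit_linear(pooled_x, pooled_y)
```

Calling `fit_linear` on each month's arrays separately gives the per-month models, so the pooled and monthly R-square values can be compared on the same footing.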
Minor:
1- The name of authors are not mentioned in the pdf version of the manuscript (after the title)
2- Line 15-16: “with MBE=(-29881.13 to 19431.53) M.Ton, ME=(0.86-0.94), and RMSE=(5238.11-11852.25) M.Ton have been obtained over the seven years”. Keep the numbers reported for MBE and RMSE with zero decimal and add a separator (“,”) for those numbers.
3- Line 31-32: “Moreover, the scientific consensus for the GBM basin projected a temperature rise of 1-3 C and 20 percent more monsoon rainfall by 2050”. What is GBM. All the abbreviations must be defined at the first appearance. Also see (RS in line 44, )
4- Line 44-45: What is “important crops”. Must be mentioned.
5- Line 45: Remove “Therefore”
6- In equation 3, change “I” to “i”. Also, remove “….” In those Eqs.
7- The quality of the figure is too low. It seems it is stretched. Make sure that the resolution of each image must be at least 300dpi.
Reviewer 3 Report
This paper describes an application for the estimation of Boro rice crop production in a set of 61 districts of Bangladesh using MODIS NDVI products downloaded for seven years, compared against official production data from the National Bureau of Statistics. The methods are simple, not properly described and unreliable. This work may be interesting as a technical report for local authorities, but it does not entail any research advance.
I am afraid that this paper does not provide anything new or original for the scientific community that would justify publication in a scientific journal.
English grammar and style need significant improvement; the manuscript should be revised by a native English speaker. Some examples of grammar/style mistakes, typos or similar:
L11 (abstract): "highly significant connotation" should be " highly significant correlation"
L31: What is GBM basin?
L60: "researches have also performed relevant researches"
L77: "are to: a) to develop" avoid the first "to"
L83: "20o34/ to 26o38/ North latitude" Use the proper signs for degrees and minutes.
L116: Every processing step must be concisely described. Sentences like "necessary geometric and atmospheric corrections" for instance, should be avoided. What kind of atmospheric and geometric corrections exactly?
Fig. 3. These graphs barely provide information. Why not show the temporal distribution for several districts, for instance, instead of this distribution of NDVI across the districts? Where are the units of BBS production in the graph? How can a sum of NDVI in one month be 2 million? The number of pixels involved in each district would help to give a sense of scale.
The methods are unreliable. Do you use the NDVI sum of all pixel values per district as the independent variable? Then, in equation (4) the dependent variable (Y) should be expressed in absolute values (MTons), not in relative units (MTons/ha); otherwise big districts would need a different equation than small districts. How have the models been evaluated? Did you use the same samples for training the model and for the evaluation? There is no information about this in the paper. No significance values are given for the models, either.
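One common way to keep training and evaluation samples separate, as asked above, is leave-one-year-out validation: fit the regression on all years except one and score it on the held-out year. The sketch below illustrates the idea with made-up numbers; it is not a description of what the authors did, and `leave_one_year_out` and the `data` layout are hypothetical.

```python
import numpy as np


def leave_one_year_out(data):
    """For each year, fit a line on the other years and return the held-out RMSE."""
    errors = {}
    for held_out in data:
        train_x = np.concatenate([x for yr, (x, _) in data.items() if yr != held_out])
        train_y = np.concatenate([y for yr, (_, y) in data.items() if yr != held_out])
        slope, intercept = np.polyfit(train_x, train_y, 1)
        test_x, test_y = data[held_out]
        predicted = slope * test_x + intercept
        errors[held_out] = float(np.sqrt(np.mean((predicted - test_y) ** 2)))
    return errors


# Made-up district samples per year: (NDVI sums, production), following y = 3x + 2.
data = {
    2011: (np.array([1.0, 2.0, 3.0]), np.array([5.0, 8.0, 11.0])),
    2012: (np.array([2.0, 4.0, 6.0]), np.array([8.0, 14.0, 20.0])),
    2013: (np.array([1.5, 3.5, 5.5]), np.array([6.5, 12.5, 18.5])),
}

errors = leave_one_year_out(data)
```

Because the toy data lie exactly on one line, every held-out RMSE is essentially zero; with real data the spread of these values indicates how well a model fitted on some years transfers to others.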
The first sentence of the conclusion "This section is not mandatory, but can be added to the manuscript if the discussion is unusually long or complex " should be erased. The conclusions are not what is expected in a paper: the main findings of the research should be extracted, not just application results, which is mainly what this work is about.