3.1. Evaluation at Within-Field Level in JSendra Rice
The evolution of the JSendra fields' coefficient of determination for each Sentinel-2 spectral band at 10 m (Figure A1a) and at 20 m (Figure A1b) spatial resolution is analyzed in Appendix A. These results include the R² temporal evolution of three simple linear regression models combining the different spectral bands (M1S in Figure A1a; M2S and M3S in Figure A1b). Note that the R² values studied represent all within-field pixels of the JSendra training data. The models can be written as linear combinations of the corresponding spectral bands, where the spatial resolution is specified in parentheses and the model coefficients are different for each equation and for each date.
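For reference, the general structure of this family of models can be sketched as follows; this is only a generic sketch, not the exact expressions fitted in the original analysis. M1S combines the four 10 m bands (B2, B3, B4 and B8, as noted later in this section), while the band subsets of M2S and M3S at 20 m are left generic here:

\mathrm{Yield}_{\mathrm{M1S}}(10\,\mathrm{m}) = a_0 + a_1\,\mathrm{B2} + a_2\,\mathrm{B3} + a_3\,\mathrm{B4} + a_4\,\mathrm{B8}
\mathrm{Yield}_{\mathrm{M}k\mathrm{S}}(20\,\mathrm{m}) = b_0 + \sum_{i \in S_k} b_i\,\mathrm{B}i, \quad k \in \{2, 3\}

where S_k denotes the subset of 20 m bands combined by model MkS and all coefficients are refitted for each date.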
Figure A1a shows that, at 10 m, the band with the highest correlation with yield is B8 (NIR), followed by B2 (Blue) and B3 (Green), while B4 (Red) shows the lowest correlation. At 20 m (Figure A1b), B8A shows the highest correlation, followed by B8 and B7 (Red edge), whose evolution is very close to that of B6 (Red edge) but has a minor peak at the beginning of the season. Other bands, such as B11 and B12 (SWIR), show lower correlation peaks, with only one significant peak on July 4. The visible bands show a tendency similar to that at 10 m resolution, and B5 (Red edge) is very close to B4 (Red). The linear models combining spectral bands (M1S, M2S and M3S) show the best correlation, which is consistent with previous works using remote sensing data to infer within-field maize and soybean yields [8]. The best performance (based on the coefficient of determination evolution) occurs on July 4 for both spatial resolutions, that is, when rice is in the tillering phase. Comparing the performance at 10 m and 20 m, the coefficient of determination remains almost the same and shows an equivalent evolution.
Focusing on July 4, Figure 3 shows the average spectral response of each band, splitting all training data into nine yield ranges. Note that the native spatial resolution of each band was preserved. B6, B7, B8A and especially B8 show the largest differences amongst yield ranges, with reflectance values increasing with yield.
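As an illustration of how such a per-range summary can be produced, the sketch below groups within-field pixels into nine yield ranges and averages the surface reflectance of each band per range. The column names, value ranges and data are placeholders, not the original processing chain.

import numpy as np
import pandas as pd

# Hypothetical per-pixel table: one row per pixel, one column per band plus the reference yield (t/ha).
rng = np.random.default_rng(0)
bands = ["B2", "B3", "B4", "B8"]
pixels = pd.DataFrame(rng.uniform(0.0, 0.5, size=(1000, len(bands))), columns=bands)
pixels["yield"] = rng.uniform(5.5, 8.0, size=len(pixels))  # assumed JSendra-like yield range

# Split the pixels into nine equal-width yield ranges and average each band's reflectance per range.
pixels["yield_range"] = pd.cut(pixels["yield"], bins=9)
mean_reflectance = pixels.groupby("yield_range", observed=True)[bands].mean()
print(mean_reflectance)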
Next, Table A1 and Table A2 (Appendix B) show the performance metrics of each model at 10 m and 20 m spatial resolution, respectively, focusing on the tillering phase and based on the training data. The previous results identified this phenological stage as critical for monitoring rice yield. Three Sentinel-2 acquisition dates were selected to analyze the performance metrics of each model: the date of the R² peak (July 4), the previous date (June 29) and the following one (July 19). Note that in these tables, the models named after spectral bands are built as a linear regression on that band or band combination. Additionally, the tables show the performance metrics of three models (M1S+, M2S+ and M3S+) defined through a stepwise regression program (StatGraphics), which combine the spectral bands and their interactions; again, the spatial resolution is specified in parentheses and the model coefficients are different for each equation and for each date.
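The exact StatGraphics procedure is not reproduced here; the following sketch illustrates a comparable forward stepwise selection over the spectral bands and their pairwise interaction terms using scikit-learn. The data, variable names and selection settings are placeholders, not the original implementation.

import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Placeholder data: per-pixel 10 m reflectances (B2, B3, B4, B8) and reference yield (t/ha).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 0.5, size=(1000, 4))
y = 7.0 + 4.0 * X[:, 3] - 2.0 * X[:, 0] + rng.normal(0.0, 0.2, size=1000)  # synthetic relation

# Candidate terms: the bands themselves plus all pairwise interactions (e.g., B2*B8).
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_terms = poly.fit_transform(X)
term_names = poly.get_feature_names_out(["B2", "B3", "B4", "B8"])

# Forward stepwise selection of terms, scored by cross-validated R2, followed by a final linear fit.
selector = SequentialFeatureSelector(LinearRegression(), direction="forward",
                                     n_features_to_select="auto", scoring="r2", cv=5)
selector.fit(X_terms, y)
selected = term_names[selector.get_support()]
model = LinearRegression().fit(X_terms[:, selector.get_support()], y)
print("Selected terms:", list(selected), "R2:", model.score(X_terms[:, selector.get_support()], y))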
As a summary of the results, Table 4 shows the evaluation of the best-performing models for each date and spatial resolution.
The best-performing models at 10 m are M1S and M1S+, which combine the four bands (B2, B3, B4 and B8). At 20 m, the best-performing models are M3S, M2S+ and M3S+. Comparing the dates, the July 4 observations provide the best performance metrics, with R² greater than 0.70, MAPEs lower than 4% (<3% at 20 m) and MAEs ranging from 0.20 to 0.25 t·ha⁻¹. Regarding the VIs (based on the visible, NIR, Red edge and SWIR regions), they do not considerably improve on the spectral bands or linear combinations of bands with higher R² (data, equations and references are shown in Appendix C). The best results for VIs are obtained with the Rice Growth Vegetation Index (RGVI) (R² = 0.66). Appendix D shows an example at 10 m and 20 m of a reference yield map, each spectral band, and the modeled yield.
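For clarity, the performance metrics used throughout this section (R², MAE and MAPE between modelled and reference yields) can be computed as in the following minimal sketch. The yield arrays are placeholders; note that scikit-learn returns MAPE as a fraction, so it is multiplied by 100 here.

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, r2_score

# Placeholder per-pixel reference and modelled yields (t/ha).
y_ref = np.array([6.8, 7.1, 7.4, 6.5, 7.0])
y_mod = np.array([6.6, 7.3, 7.2, 6.7, 7.1])

r2 = r2_score(y_ref, y_mod)                                   # coefficient of determination
mae = mean_absolute_error(y_ref, y_mod)                       # t/ha
mape = 100.0 * mean_absolute_percentage_error(y_ref, y_mod)   # %
print(f"R2={r2:.2f}, MAE={mae:.3f} t/ha, MAPE={mape:.2f}%")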
Finally, Figure 4 shows the evaluation of the best-performing models (M1S and M1S+ at 10 m and M3S and M3S+ at 20 m) for the JSendra variety on July 4, considering all data from the validation fields for each spatial resolution. M1S+ and M3S+ provide a yield modelling error (MAE) of 0.254 t·ha⁻¹ (3.73%) and 0.240 t·ha⁻¹ (3.48%), respectively, which is aligned with the statistics obtained on the training data. In fact, 60% of all evaluated pixels presented a difference between modelled and reference yield lower than the MAE value, with a maximum difference of 1.0 t·ha⁻¹ and with only 10% of the data exceeding a difference of 0.50 t·ha⁻¹. Additionally, Figure 4 (right) shows the performance metrics of the models without interactions at a spatial resolution of 10 m (M1S) and 20 m (M3S). The performance metrics (MAE of 0.318 t·ha⁻¹ (4.71%) at 10 m and 0.305 t·ha⁻¹ (4.50%) at 20 m) are worse than those obtained on the training data, which suggests that the interactions included in the M1S+ and M3S+ models provide better consistency and applicability. Finally, we tested a Random Forest model using all bands from the training data as inputs and evaluated its performance with the validation data. This model provides an R² of 0.79 and MAE of 0.292 t·ha⁻¹ (4.33%) at 10 m, and an R² of 0.78 and MAE of 0.302 t·ha⁻¹ (4.46%) at 20 m. These statistics are equivalent to those of M1S+ and M3S+.
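The Random Forest configuration (number of trees, depth, etc.) is not specified in the text; a minimal sketch of the kind of model described, trained on all bands from the training pixels and evaluated on independent validation pixels, could look as follows. The data and hyperparameters are assumptions, not the original setup.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

# Placeholder training/validation sets: rows are pixels, columns are Sentinel-2 band reflectances.
rng = np.random.default_rng(0)
X_train, X_val = rng.uniform(0.0, 0.5, size=(2000, 4)), rng.uniform(0.0, 0.5, size=(500, 4))
y_train = 7.0 + 4.0 * X_train[:, 3] + rng.normal(0.0, 0.2, size=2000)
y_val = 7.0 + 4.0 * X_val[:, 3] + rng.normal(0.0, 0.2, size=500)

# Fit on the training pixels, evaluate on the validation pixels.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_val)
print(f"R2={r2_score(y_val, y_pred):.2f}, MAE={mean_absolute_error(y_val, y_pred):.3f} t/ha")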
3.2. Evaluation at Within-Field Level in Bomba Rice
Focusing now on the Bomba variety, Figure A2 (Appendix A) shows the coefficient of determination evolution along the rice season for each Sentinel-2 band acquisition at 10 m (Figure A2a) and at 20 m (Figure A2b) spatial resolution, considering the training data. Additionally, the figures include the R² of three models which linearly combine all bands, following the same structure as the models described for JSendra (M1S, M2S and M3S) but with different model coefficients. Thus, for Bomba these equations are named M1B, M2B and M3B (M1B in Figure A2a; M2B and M3B in Figure A2b). Note that Bomba rice was harvested by mid-September, so later dates are excluded from the plot.
In this case, there are two dates with the maximum correlation at 10 m and 20 m, July 4 and August 28, although the latter includes half as many pixels as the former. As with JSendra, the results are similar at both spatial resolutions. However, in Bomba rice the visible bands (B2, B3, B4) show the best correlation with the final yields on July 4. B5 shows a very similar performance to the visible bands, and even improves on and anticipates their correlation peak with the final yield, peaking on June 29. Furthermore, these bands maintain a high R² throughout July and decrease in August (when the flowering stage occurs). Additionally, B3 (Green) and B5 show an acceptable R² value on August 28. On the contrary, the NIR bands show poor correlation on July 4, which improves by mid-season (end of July) and finally peaks on August 28. Analogously to JSendra, B8, B8A and B7 present very similar results. B6 shows an evolution equivalent to these bands but with a poor correlation during mid-season. Regarding the SWIR bands, they show a small correlation peak at the beginning of the season, which decreases to a minimum by the end of June and then improves again, reaching a peak by the end of July that is comparable to the visible bands' correlation. Finally, the models M1B, M2B and M3B show the best performance.
Focusing on July 4, Figure 5 shows the average spectral surface reflectance for all pixels within the training fields, split by yield ranges. For this variety and date, the visible bands (especially B3) show the largest separability, with lower SR values for higher yields. In contrast, the NIR, Red edge and SWIR reflectances do not show a clear dependency on yield.
Next, Table A3 and Table A4 (Appendix B) show the performance metrics of each model at 10 m and 20 m spatial resolution, respectively, focusing on the same dates as for JSendra. These three dates are essential for a forecast, since tillering is still being defined and it is still possible to modify crop management to improve final yields. Models are built and shown based on the same principles as in the JSendra variety. The resulting models of the stepwise regression program (StatGraphics), M1B+, M2B+ and M3B+, are defined analogously through stepwise selection of the spectral bands and their interactions; the spatial resolution is specified in parentheses and the model coefficients are different for each equation and for each date.
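As with the JSendra models, only the structure of these stepwise models can be sketched here. Based on the band compositions reported below (M1B+ with B2, B3, B4 and their interactions; M2B+ with B2 and B3; M3B+ with B2, B3, B11 and B12), a plausible generic form of, for example, M1B+ is:

\mathrm{Yield}_{\mathrm{M1B+}}(10\,\mathrm{m}) = c_0 + c_1\,\mathrm{B2} + c_2\,\mathrm{B3} + c_3\,\mathrm{B4} + c_4\,(\mathrm{B2}\cdot\mathrm{B3}) + c_5\,(\mathrm{B2}\cdot\mathrm{B4}) + c_6\,(\mathrm{B3}\cdot\mathrm{B4})

with the caveat that the interaction terms actually retained by the stepwise procedure, and the coefficients refitted for each date, are not reproduced in this section.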
As a summary of the results, Table 5 shows the evaluation of the best models for each date and spatial resolution.
Comparing the dates, the July 4 observations provide the best performance metrics, with R² greater than 0.85, MAPEs lower than 8% (<6% at 20 m) and MAEs ranging from 0.18 to 0.23 t·ha⁻¹. The best model at 10 m is M1B+, that is, the model including B2, B3, B4 and their interactions. Note that this model does not include the NIR band. Additionally, model M1B shows results similar to those of the simple linear model without the NIR band. At 20 m, M2B+ and M3B+ show the best performance metrics. Note that M2B+, which depends only on B2 and B3, shows performance metrics similar to M3B+ (which depends on B2, B3, B11 and B12) and M3B (which includes all bands). Additionally, it is important to note that B3 (Green) and B5 individually show excellent results. Regarding VIs, the best performance is obtained using the Normalized Difference Red Edge 2 (NDRE2) (R² = 0.59). All the performance metrics of the VIs tested, their equations and references can be found in Appendix C. Analogously to the JSendra section, Appendix D shows an example at 10 m and 20 m of a reference yield map, each spectral band, and the modeled yield.
Figure 6 (left) shows the validation of the best-performing models (M1B+ and M3B+) for the Bomba variety on July 4, considering the validation fields. These results show that the proposed models M1B+ and M3B+ provide a MAE of 0.218 t·ha⁻¹ (5.82%) and 0.223 t·ha⁻¹ (5.78%), respectively. A total of 60% of pixels presented a difference between modelled and reference yield lower than the MAE, with the maximum difference being around 0.80 t·ha⁻¹ and only 9% of the data exceeding a difference of 0.50 t·ha⁻¹. Note that the differences in MAPE between varieties are due to their corresponding average yields, 3.8 t·ha⁻¹ for Bomba and 7.0 t·ha⁻¹ for JSendra in the studied area (the MAPE is roughly the MAE divided by the mean yield: 0.218/3.8 ≈ 5.7% versus 0.254/7.0 ≈ 3.6%). The validation results of M1B and M3B are shown in Figure 6 (right) (MAE of 0.261 t·ha⁻¹ (6.93%) at 10 m and 0.273 t·ha⁻¹ (7.17%) at 20 m) and are slightly worse than those of M1B+ and M3B+, respectively. The decrease in R² compared to the training data evaluation is caused by the limited yield variability of the validation pixels (between 2.5 and 5.0 t·ha⁻¹) compared to the training pixels (between 0.8 and 5.0 t·ha⁻¹). Note that this is not the case for JSendra, where the training and validation pixels show similar variability (5.5–8.0 t·ha⁻¹). Finally, a Random Forest algorithm based on the same dataset provides an R² of 0.71 and MAE of 0.287 t·ha⁻¹ (7.67%) at 10 m and an R² of 0.67 and MAE of 0.297 t·ha⁻¹ (7.69%) at 20 m. These performance metrics are worse than those of the previous models based on linear regressions (M1B+, M3B+, M1B and M3B).