Article

Yield Prediction of Winter Wheat at Different Growth Stages Based on Machine Learning

Key Laboratory of Spatio-Temporal Information and Ecological Restoration of Mines of Natural Resources of the People’s Republic of China, Henan Polytechnic University, Jiaozuo 454003, China
* Author to whom correspondence should be addressed.
Agronomy 2024, 14(8), 1834; https://doi.org/10.3390/agronomy14081834
Submission received: 29 July 2024 / Revised: 14 August 2024 / Accepted: 18 August 2024 / Published: 20 August 2024

Abstract

Accurate and timely prediction of crop yields is crucial for ensuring food security and promoting sustainable agricultural practices. This study developed a winter wheat yield prediction model using machine learning techniques, incorporating remote sensing data and statistical yield records from Henan Province, China. The core of the model is an ensemble voting regressor, which integrates ridge regression, gradient boosting, and random forest algorithms. This study optimized the hyperparameters of the ensemble voting regressor and conducted an in-depth comparison of its yield prediction performance with that of other mainstream machine learning models, assessing the impact of key hyperparameters on model accuracy. This study also explored the potential of yield prediction at different growth stages and its application in yield spatialization. The results demonstrate that the ensemble voting regressor performed exceptionally well throughout the entire growth period, with an R2 of 0.90, an RMSE of 439.21 kg/ha, and an MAE of 351.28 kg/ha. Notably, during the heading stage, the model’s prediction performance was particularly impressive, with an R2 of 0.81, an RMSE of 590.04 kg/ha, and an MAE of 478.38 kg/ha, surpassing models developed for other growth stages. Additionally, by establishing a yield spatialization model, this study mapped county-level yield predictions to the pixel level, visually illustrating the spatial differences in land productivity. These findings provide reliable technical support for winter wheat yield prediction and valuable references for crop yield estimation in precision agriculture.

1. Introduction

Accurate crop yield prediction is vital for ensuring food security and promoting sustainable agricultural development [1]. Wheat, one of the world’s three major staple crops, constitutes 40% of the global food supply, thus playing a pivotal role in global food security [2]. With the rapid advancement of satellite-based Earth observation technologies, the significance of utilizing remote sensing techniques in the research of large-scale winter wheat yield prediction has become increasingly apparent.
Crop yield is influenced by various factors such as weather, climate, soil, and field management practices [3]. Crop yield prediction commonly employs either physical or statistical modeling approaches. Physical models typically use crop growth models to simulate the dynamics of crop growth and yield formation [4,5]. However, the number and complexity of the parameters they require, including crop varieties, soil types, and climate variables, limit their application in large-scale predictions [6,7]. Statistical models predict yields by establishing relationships between crop production and crop and environmental characteristics [8]. Remote sensing technology provides the data foundation for statistical models [9]. Remote sensing data offer wide spatial coverage and a broad spectral range, and can therefore support a variety of tasks, such as monitoring crop growth [10], identifying crop pests and diseases [11,12], and estimating weed density [13]. Vegetation indices (VIs) derived from remote sensing data are commonly used in crop yield forecasting. VIs are more sensitive to vegetation conditions than the original reflectance values and can better capture changes in crop growth and health status [14]. Temperature and evapotranspiration data can characterize crop health or stress [15]. Remote sensing products covering various spectral ranges have been extensively employed in crop yield prediction, including VIs [16,17,18], surface reflectance (SR) [19], leaf area index (LAI) [20], fraction of photosynthetically active radiation (FPAR) [21], solar-induced fluorescence (SIF) [9], land surface temperature (LST) [22], and gross primary productivity (GPP) [23]. Data-driven statistical models can exploit these products directly and are widely used for crop yield prediction on a large scale [19,24,25].
Although remote sensing provides a rich variety of data for crop yield prediction, the relationships among crop yield, crop biochemical information, and growth conditions are usually nonlinear, so statistical models built only on linear relationships fit poorly at large scales [26].
Machine learning possesses the ability to discern nonlinear relationships between target and feature variables, effectively aiding quantitative remote sensing research [27]. Currently, various machine learning algorithms, such as ridge regression (RR), Gaussian process regression (GPR), random forest (RF), Lasso regression (Lasso), support vector machine (SVM), and gradient boosting, among others, are extensively applied in crop yield prediction driven by remote sensing data [28,29,30,31,32]. However, single machine learning algorithms exhibit instability in crop yield prediction. For instance, Pang et al. employed the random forest (RF) algorithm with high-resolution imagery, meteorological variables, and yield data to predict wheat yields in the southeastern region of Australia, where the predictive performance in one planting area significantly lagged behind that of the other two areas [30]. Similarly, Zhou et al. utilized remote sensing data and climate variables, employing the RF, SVM, and Lasso algorithms for yield prediction in winter wheat planting areas in China, revealing substantial disparities in the predictive accuracy among the three machine learning algorithms [33]. Moreover, utilizing feature variables across the entire growth period of winter wheat for yield prediction obscures the potential variations in predictive capabilities across different growth stages, thereby limiting the timeliness of governmental decision-making. Zhou et al., considering both spectral features and agronomic trait parameters, assessed the impact of different growth stages on yield prediction outcomes [17]. Zhao et al., employing inputs such as cumulative biomass, climate adaptability indices, and extreme climate indices in a statistical regression model, predicted wheat yields in the North China Plain and evaluated the performance of yield prediction models concerning different growth stages [34]. 
While the previous studies explored the predictive performance across different growth stages, they all employed a single machine learning model for yield prediction, overlooking the impact of the model itself on the predictive potential across various growth stages. Additionally, the yield prediction results in the past were mostly presented at the county level and did not downscale county-level yield data to pixel-level resolution. Pixel-level yield information is crucial in helping the government take necessary measures in the agricultural production process to achieve yield maximization.
To address the aforementioned issues, this study utilized eight parameters, namely normalized difference vegetation index (NDVI), land surface temperature (LST), gross primary productivity (GPP), enhanced vegetation index (EVI), fraction of photosynthetically active radiation (FPAR), potential evapotranspiration (PET), actual evapotranspiration (ET), and leaf area index (LAI). Combining these parameters with winter wheat yield statistics, an ensemble voting model based on the gradient boosting, random forest, and ridge regression algorithms was constructed. This study analyzed the yield prediction potential across different growth stages of winter wheat and established a spatialization model for winter wheat yield at both county and pixel levels (Figure 1).

2. Materials and Methods

2.1. Study Area

This study focuses on Henan Province, a primary winter wheat cultivation area in China, located between 31°23′ and 36°22′ N and between 110°21′ and 116°39′ E (Figure 2). The region encompasses diverse land-use types, ranked in descending order by proportion: arable land, forest land, built-up land, water bodies, grassland, and unused land [35]. The climate is characterized as subtropical and temperate monsoon, featuring distinct seasonal variations, with an average annual temperature ranging from 12 to 16 °C and annual precipitation between 500 and 900 mm [36]. In the study area, winter wheat is typically sown in October and harvested from late May to early June of the following year [37]. Precipitation in both spring and winter benefits winter wheat growth and other early spring crops. However, natural precipitation alone is insufficient to meet the growth requirements of winter wheat, prompting local farmers to adopt groundwater extraction for irrigation as an additional water source [38].

2.2. Data and Pre-Processing

2.2.1. Statistical Data

The county-level total yield and total planting area data for winter wheat in Henan Province were sourced from the Statistical Yearbook of Henan Province published by the Henan Provincial Bureau of Statistics [39]. Unit yield data for each county-level entity were obtained by dividing the total yield by the total planting area. Considering factors such as administrative changes and occasional missing statistical data, this study selected counties with complete records from 2012 to 2021 (Figure 3) as the modeling and analysis samples, resulting in an effective sample size of 1020 records.
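The sample construction described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the county names, and all numeric values are hypothetical, and the function `unit_yield_samples` is not from the original study.

```python
from collections import defaultdict

# Hypothetical yearbook rows: (county, year, total_yield_kg, planting_area_ha).
# All values are made up for illustration only.
records = [
    ("County A", year, 5.0e8, 8.0e4) for year in range(2012, 2022)
] + [
    ("County B", year, 3.0e8, 6.0e4)
    for year in range(2012, 2022)
    if year != 2015  # one missing year, so County B is dropped below
]

REQUIRED_YEARS = set(range(2012, 2022))  # 2012 to 2021 inclusive


def unit_yield_samples(rows):
    """Return {(county, year): unit yield in kg/ha} for counties with complete records."""
    years_by_county = defaultdict(set)
    for county, year, _total, area in rows:
        if area > 0:
            years_by_county[county].add(year)
    # Keep only counties that have a usable record for every required year.
    complete = {c for c, years in years_by_county.items() if REQUIRED_YEARS <= years}
    return {
        (county, year): total / area  # unit yield = total yield / planting area
        for county, year, total, area in rows
        if county in complete and year in REQUIRED_YEARS
    }


samples = unit_yield_samples(records)
```

If every retained county contributes all ten years, the 1020 records reported above would correspond to 102 counties.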

2.2.2. Winter Wheat Vector Data and Phenological Periods

The reliability of crop yield prediction using agri-environmental variables depends on the spatial aggregation of environmental variables. The use of annual masks for specific crop groups can effectively improve the accuracy of yield estimates compared to the use of general cropland masks [40]. The winter wheat vector data in this study were extracted from 10 m resolution Sentinel-2 satellite remote sensing images via an object-oriented deep learning method. In order to ensure the accuracy and quality of the data, the satellite images used underwent rigorous pre-processing, including atmospheric correction, radiometric correction, and geometric correction, to eliminate possible sensor errors and atmospheric effects. At the same time, in order to ensure consistency between the extracted winter wheat planting distribution data and the actual planting situation, the confusion matrix was calculated by combining the ground survey sample data, and the Kappa coefficient was 0.82. Considering that Henan Province is the main production area of winter wheat and its crop cultivation structure is relatively stable, the winter wheat vector data in 2021 were selected as the mask data for extracting the model feature parameters. Table 1 shows the winter wheat phenology calendar in Henan Province.

2.2.3. Remote Sensing Data

The prediction of winter wheat yield is affected by many complex factors [41]. Yield prediction models that consider multiple factors achieve better accuracy in yield estimation [33,42]. In this paper, based on MODIS remote sensing products, EVI and NDVI, which reflect vegetation growth, and GPP, FPAR, LST, LAI, ET, and PET, which represent ecosystem functions, were calculated. The data type, resolution, and source are detailed in Table 2. To ensure data consistency and accuracy, the remote sensing data were first filtered according to the quality control band of the MODIS data products, then masked with the winter wheat vector data of each county, and finally averaged over each county. In addition, some feature variables may contain outliers that could affect the model's prediction performance. For this reason, RobustScaler, a scaler that is more robust to outliers, was used to pre-process the feature variables and reduce the impact of outliers on the prediction results, thereby improving the reliability and accuracy of the prediction.
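To make the scaling step concrete, the sketch below reproduces in plain Python what scikit-learn's RobustScaler does with its default settings: each feature is centered on its median and divided by its interquartile range. The function name `robust_scale` and the sample values are illustrative assumptions, not part of the original study.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Center by the median and scale by the interquartile range (IQR)."""
    # method="inclusive" uses linear interpolation between data points.
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # avoid division by zero for constant features
    center = median(values)
    return [(v - center) / iqr for v in values]


# Hypothetical feature column with one extreme value.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = robust_scale(feature)
```

Because the median and IQR are barely affected by the single extreme value, the four ordinary values stay within one unit of zero after scaling, whereas a mean and standard deviation would be dominated by the outlier.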

2.3. Machine Learning Methods for Yield Prediction

Statistical approaches commonly used for yield prediction include linear regression, classical machine learning, and deep learning. Linear regression relies on simple weights and coefficients but cannot capture nonlinear relationships within the data. Deep learning models are often more complex and require training on large datasets to perform well. Classical machine learning models sit between the two: they can capture both linear and nonlinear relationships between data features and target variables, are well suited to small datasets, have relatively low model complexity, and are less prone to overfitting.

2.3.1. Linear Models and Regularization Methods

In this study, two regularized linear models, ridge regression and elastic net regression, were first employed to predict winter wheat yield. Ridge regression addresses multicollinearity among predictors by introducing an L2 regularization term [43]. The regularization strength was set to α = 1.0 to ensure robust performance in environments with highly correlated features. Elastic net regression, which combines L1 and L2 regularization (with l1_ratio = 0.5), handles both sparsity and multicollinearity, making it particularly suitable for high-dimensional data.
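A minimal sketch of these two models with scikit-learn is given below, using the stated α = 1.0 for ridge and l1_ratio = 0.5 for elastic net. The data are synthetic, the elastic net strength `alpha=1.0` is the library default and an assumption here (the text does not state it), and the ordinary least squares fit is included only as a reference point.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression, Ridge

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table: 200 samples, 8 features,
# with feature 1 built to be almost collinear with feature 0.
X = rng.normal(size=(200, 8))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)                    # unregularized reference
ridge = Ridge(alpha=1.0).fit(X, y)                    # L2 penalty
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)  # mixed L1/L2 penalty

ols_norm = float(np.linalg.norm(ols.coef_))
ridge_norm = float(np.linalg.norm(ridge.coef_))
n_zeroed = int(np.sum(enet.coef_ == 0.0))  # coefficients removed by the L1 part
```

Comparing `ridge_norm` with `ols_norm` shows the shrinkage that stabilizes coefficients on near-collinear features, while `n_zeroed` counts how many coefficients the elastic net sets exactly to zero.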

2.3.2. Decision Tree Models and Their Extensions

This study also utilized various decision tree-based models to capture the nonlinear characteristics of winter wheat yield, including the decision tree regressor, extra tree regressor, and random forest regressor. The decision tree regressor recursively partitions the data space and makes predictions within each partition, making it particularly suitable for noisy data [44]. To mitigate the risk of overfitting, the absolute error criterion (criterion = “absolute_error”) was employed; a random state was also set so that results are reproducible. The extra tree regressor enhances model robustness by introducing randomness during node splitting [45], using a random splitter (splitter = “random”) and the squared error criterion (criterion = “squared_error”). The random forest regressor reduces model variance by integrating multiple decision trees, showing excellent performance, especially in handling high-dimensional and missing data. In this study, we used 100 trees (n_estimators = 100) and full feature selection (max_features = 1.0) to ensure the model’s robustness.

2.3.3. Distance-Based Models

This study also employed the k-nearest neighbors regressor (KNeighborsRegressor), an instance-based learning method that makes predictions from the distances between a new sample and the samples in the training set [46]. In this research, three nearest neighbors (n_neighbors = 3) and the Minkowski distance metric with p = 2 (equivalent to the Euclidean distance) were used. This approach is particularly suitable for analyzing datasets with smaller sample sizes or low-dimensional feature subsets.
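The prediction rule can be illustrated with a small plain-Python sketch: find the k closest training samples under the Minkowski distance and average their targets (uniform weighting, which is also the scikit-learn default). The function name `knn_predict` and the toy data are hypothetical.

```python
def knn_predict(train_X, train_y, query, k=3, p=2):
    """Average the targets of the k training samples nearest to `query`."""

    def minkowski(a, b):
        # p = 2 gives the Euclidean distance, p = 1 the Manhattan distance.
        return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

    order = sorted(range(len(train_X)), key=lambda i: minkowski(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


# Toy training set: two scaled features per sample, yields in kg/ha.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [5000.0, 5300.0, 5600.0, 8000.0, 8200.0]

prediction = knn_predict(train_X, train_y, query=(0.2, 0.2), k=3, p=2)
```

Here the three nearest samples are the first three, so the prediction is their mean yield. Because distances drive the result, features on very different scales should be rescaled first.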

2.3.4. Support Vector Regression Models

This study also introduced support vector regression (SVR and NuSVR) for yield prediction. Both models are based on the theory of support vector machines (SVM) and are well-suited for handling small sample sizes and nonlinear problems. NuSVR controls the proportion of support vectors through an additional parameter (ν); here it used a linear kernel function (kernel = “linear”) and a regularization parameter C = 0.5. In contrast, SVR used a radial basis function (RBF) kernel (kernel = “rbf”) to handle nonlinear relationships. Although SVR is advantageous for small samples and nonlinear problems, its higher computational complexity limits its application in large-scale data and high-dimensional features [47].
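A minimal sketch of the two support vector configurations in scikit-learn follows. The data are synthetic, and every parameter not named in the text (for example ν itself, or C and ε for SVR) is left at the library default, which is an assumption.

```python
import numpy as np
from sklearn.svm import SVR, NuSVR

rng = np.random.default_rng(0)

# Synthetic, already standardized features; 8 columns as in the study.
X = rng.normal(size=(150, 8))
y = (2.0 * X[:, 0] - X[:, 1] + 0.5 * np.tanh(X[:, 2])
     + rng.normal(scale=0.1, size=150))

nu_svr = NuSVR(kernel="linear", C=0.5).fit(X, y)  # nu left at its default
rbf_svr = SVR(kernel="rbf").fit(X, y)             # C and epsilon at defaults

# Fraction of training samples retained as support vectors by NuSVR.
frac_sv = len(nu_svr.support_) / len(X)
predictions = rbf_svr.predict(X[:5])
```

Inspecting `frac_sv` shows how ν influences the share of training samples kept as support vectors. Both models are sensitive to feature scale, so scaling beforehand matters in practice.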

2.3.5. Ensemble Models

To further enhance model performance, this study employed various ensemble methods, including the gradient boosting regressor, AdaBoost regressor, and voting regressor. The gradient boosting regressor improves predictive accuracy by incrementally fitting new models [37]. The learning rate was set to 0.2 (learning_rate = 0.2) and 200 iterations were used (n_estimators = 200). The AdaBoost regressor increases the model’s accuracy by adjusting sample weights to handle difficult-to-predict samples more effectively [38], with 50 weak learners (n_estimators = 50) and a default learning rate (learning_rate = 1.0). The voting regressor enhances overall model performance by combining the predictions of three base learners: ridge, random forest, and gradient boosting. Each base learner was individually tuned, with ridge set to α = 0.01, random forest using a random state, and gradient boosting with a learning rate of 0.2 and 200 iterations. This approach leverages the strengths of multiple models, making it suitable for complex regression problems that require a balance between bias and variance [48]. Figure 4 provides a visual representation of the ensemble methods used in this study.
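The voting ensemble described above can be sketched with scikit-learn's VotingRegressor, using the stated settings (ridge with α = 0.01, a random forest with a fixed random state, and gradient boosting with a learning rate of 0.2 and 200 iterations). The data are synthetic, the `random_state=0` values are assumptions, and equal weighting of the three members is the library default rather than something the text confirms.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic yield-like target with a linear and a nonlinear component.
X = rng.normal(size=(300, 8))
y = (6000.0 + 400.0 * X[:, 0] + 300.0 * np.sin(X[:, 1])
     + rng.normal(scale=50.0, size=300))
X_train, X_test, y_train = X[:240], X[240:], y[:240]

voter = VotingRegressor(estimators=[
    ("ridge", Ridge(alpha=0.01)),
    ("rf", RandomForestRegressor(random_state=0)),
    ("gb", GradientBoostingRegressor(
        learning_rate=0.2, n_estimators=200, random_state=0)),
])
voter.fit(X_train, y_train)

ensemble_pred = voter.predict(X_test)
# Predictions of the three fitted members, one row per member.
member_preds = np.array([est.predict(X_test) for est in voter.estimators_])
```

With default weights, the ensemble prediction is simply the mean of the member predictions, which is why a linear member and two tree-based members can offset each other's errors.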

2.4. Winter Wheat Yield Prediction

To assess the performance of various machine learning algorithms in yield prediction, initially, we constructed models using data spanning the entire growth period. We partitioned the county-level modeling factors for each growth stage of winter wheat, calculating the mean values of these factors during the corresponding growth stage as feature variables for model construction. Data from 2012 to 2019 served as the training set for training the models, including the ensemble voting model based on ridge, random forest, and gradient boosting, along with 10 other commonly used machine learning models. The data from 2020 were used as the validation set and the data from 2021 as the test set in the evaluation of the predictive accuracy of the models.
To explore the yield prediction potential across different growth stages, we selected the machine learning models that achieved high accuracy over the entire growth period, together with the ensemble voting model constructed in this study, for modeling analysis at each growth stage. In this process, each class of modeling factors was partitioned based on the winter wheat growth stage and used as a set of feature variables for model construction. We again used data from 2012 to 2019 as the training set, 2020 as the validation set, and 2021 as the test set. The algorithms were then employed for yield prediction at each growth stage, followed by accuracy validation.
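The year-based partition used in both experiments can be sketched as below. The sample records are hypothetical, and the figure of 102 counties is an assumption derived from the 1020 records over ten years reported earlier.

```python
def split_by_year(samples):
    """Split samples into train (2012-2019), validation (2020), and test (2021)."""
    train = [s for s in samples if 2012 <= s["year"] <= 2019]
    val = [s for s in samples if s["year"] == 2020]
    test = [s for s in samples if s["year"] == 2021]
    return train, val, test


# Hypothetical county-year samples with made-up yields.
samples = [
    {"county": f"county_{c}", "year": year, "yield_kg_ha": 6000.0 + 10.0 * c}
    for c in range(102)
    for year in range(2012, 2022)
]

train, val, test = split_by_year(samples)
```

Splitting by year rather than at random keeps the most recent seasons out of training, so the validation and test scores reflect prediction for unseen years.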

2.5. Accuracy Assessment

This study constructed winter wheat yield prediction models using historical yield data and remote sensing data, followed by accuracy evaluation. All constructed yield prediction models underwent validation using K-fold cross-validation, with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2) employed as evaluation metrics.
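$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left| y_i - y_p \right|}{n}, \tag{1}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i - y_p \right)^2}{n}}, \tag{2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - y_p \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}, \tag{3}$$
where $n$ represents the number of samples, $y_i$ denotes the actual observed value for the ith sample, $y_p$ is the predicted value for the ith sample, and $\bar{y}$ is the mean of all the observed values.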
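As a worked example of the three metrics, the following plain-Python sketch evaluates them on a tiny set of hypothetical yields. The function names and the numbers are illustrative only.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted yields in kg/ha.
observed = [6000.0, 6500.0, 7000.0]
predicted = [6000.0, 6500.0, 7100.0]

mae_value = mae(observed, predicted)    # 100 / 3
rmse_value = rmse(observed, predicted)  # sqrt(10000 / 3)
r2_value = r2(observed, predicted)      # 1 - 10000 / 500000
```

MAE and RMSE are in the same units as yield (kg/ha), and RMSE penalizes the single 100 kg/ha miss more heavily than MAE does.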

2.6. Construction of Spatialization Model for Yield

By constructing a yield spatialization model, we aimed to depict productivity visually at the pixel level and reveal spatial variations in winter wheat yield across Henan Province. The model utilized the county-level averages of each feature variable as a bridge, linking the feature variables at the pixel level to the county-level predicted yields. This process allowed for the back-calculation of county-level winter wheat yield data to the pixel level. The yield spatialization model is represented by Equation (4):
$$y_{pixel} = y_{pre} \times \frac{\sum_{j=1}^{N} \frac{Feature_{i}}{Feature_{mean}}}{N}, \tag{4}$$
where $N$ represents the number of modeling driver types, which is 8 in this study; $Feature_{i}$ corresponds to the pixel value for each class of modeling driver factors; $Feature_{mean}$ denotes the pixel average value for each class of modeling driver factors; $y_{pre}$ is the predicted yield; and $y_{pixel}$ is the yield for each pixel.
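Equation (4) can be illustrated with a short plain-Python sketch. It assumes that the average for each driver is taken over the pixels of the county being downscaled; the function name `pixel_yields`, the two-feature example, and all values are hypothetical (the study uses N = 8 drivers).

```python
def pixel_yields(y_pre, pixel_features):
    """Downscale a county-level predicted yield to pixels.

    pixel_features: one list of N feature values per pixel in the county.
    """
    n_pixels = len(pixel_features)
    n_features = len(pixel_features[0])
    # County mean of each feature over its pixels (assumed Feature_mean).
    means = [
        sum(px[j] for px in pixel_features) / n_pixels for j in range(n_features)
    ]
    # Average the pixel-to-mean ratios over the N features, then scale y_pre.
    return [
        y_pre * sum(px[j] / means[j] for j in range(n_features)) / n_features
        for px in pixel_features
    ]


# Three pixels, two illustrative features (for example NDVI and LAI).
county_pixels = [[0.6, 3.0], [0.4, 2.0], [0.5, 2.5]]
yields = pixel_yields(6000.0, county_pixels)
county_mean = sum(yields) / len(yields)
```

Under this assumption each feature's ratio averages to 1 over the county, so the mean of the pixel yields reproduces the county-level prediction while pixels with above-average feature values receive above-average yields.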

3. Results

3.1. Comparison of Accuracy in Yield Prediction Algorithms for Entire Growth Period

This study utilized winter wheat yield data as the target variable and constructed 11 winter wheat yield prediction models covering the entire growth period in Henan Province. Table 3 summarizes the accuracy of these models on the validation and test sets for winter wheat yield prediction.
In the validation set, the ensemble voting model achieved the highest R2 value, reaching 0.90. Following closely were ridge, gradient boosting, and random forest, with R2 values ranging from 0.69 to 0.86, while other models had R2 values below 0.69. Simultaneously, the ensemble voting model obtained the lowest RMSE and MAE at 439.21 kg/ha and 351.28 kg/ha, respectively. Ridge, gradient boosting, and random forest exhibited relatively higher RMSE and MAE, ranging from 509.50 to 756.39 kg/ha and 389.66 to 611.46 kg/ha, respectively. The RMSE and MAE for other models fell within the range of 847.14 to 1332.69 kg/ha and 725.36 to 1165.48 kg/ha, respectively.
In the test set, the ensemble voting model continued to achieve the highest R2 value and the lowest RMSE and MAE, with values of 0.90, 424.44 kg/ha, and 313.92 kg/ha, respectively. Following closely were ridge, gradient boosting, and random forest, with R2 values ranging from 0.75 to 0.79 and RMSE and MAE ranging from 609.80 to 676.49 kg/ha and 475.07 to 517.72 kg/ha, respectively. Other models exhibited R2 values below 0.75, with RMSE and MAE ranging from 821.96 to 1316.57 kg/ha and 614.30 to 1161.38 kg/ha, respectively. The results from the validation and test sets indicated that the ensemble voting model outperformed other models in predicting winter wheat yield across the entire growth period in Henan Province.
To assess the contribution of each sub-model to the overall performance of the ensemble voting model, we conducted an ablation experiment. The ensemble voting model comprised three sub-models: ridge, random forest, and gradient boosting. We removed these sub-models one at a time and evaluated the resulting model using R2, RMSE, and MAE; Table 4 summarizes the changes in model performance. Removing any sub-model reduced performance on both the validation and test sets, with the largest decline observed when removing ridge and the smallest when removing random forest. Specifically, R2 decreased by 0.02 to 0.13, while RMSE increased by 38.41 to 213.29 kg/ha and MAE increased by 30.34 to 174.76 kg/ha.
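The logic of such an ablation can be sketched in plain Python: average the member predictions for the full ensemble, then repeat with one member left out and compare the error. The predictions below are invented numbers chosen for illustration and do not reproduce Table 4; equal weighting of members is also an assumption.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average(pred_lists):
    """Element-wise mean of several prediction lists (equal weights)."""
    return [sum(vals) / len(vals) for vals in zip(*pred_lists)]


# Hypothetical observed yields and member predictions (kg/ha).
observed = [6000.0, 6500.0, 7000.0, 5500.0]
member_preds = {
    "ridge": [6100.0, 6400.0, 7100.0, 5400.0],
    "random_forest": [5900.0, 6700.0, 6800.0, 5700.0],
    "gradient_boosting": [6050.0, 6450.0, 6950.0, 5600.0],
}

full_rmse = rmse(observed, average(list(member_preds.values())))

# Change in RMSE when each member is removed from the ensemble.
ablation = {}
for removed in member_preds:
    kept = [p for name, p in member_preds.items() if name != removed]
    ablation[removed] = rmse(observed, average(kept)) - full_rmse
```

A positive entry in `ablation` means the error grows when that member is removed, which is the sense in which a sub-model contributes to the ensemble.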
From Figure 5, it is evident that the ensemble voting model achieved the highest accuracy on both the validation and test sets, demonstrating robust performance. Figure 6 and Figure 7 depict scatter plots between predicted values and actual values for models with an R2 value exceeding 0.75 in the validation and test sets, respectively. These plots provide a visual representation of the performance and residual information of each predictive model on the validation and test sets. The results further confirm the effectiveness and superiority of the ensemble voting model in winter wheat yield prediction.

3.2. Comparison of Accuracy in Yield Prediction Algorithms for Individual Growth Periods

This study focused on winter wheat yield as the target variable, utilizing different modeling factors for each growth stage as the feature variables. High-precision machine learning algorithms, including gradient boosting, random forest, ridge, and ensemble voting, were employed for modeling and analysis at each growth stage throughout the entire winter wheat growing season. Figure 8 illustrates the accuracy of the four algorithm models in predicting winter wheat yield on the validation and test sets. For the validation set, considering the entire growth period, the ranges of R2, RMSE, and MAE across different models were 0.20–0.81, 585.34–1211.34 kg/ha, and 468.07–1040.78 kg/ha, respectively. On an individual growth stage basis, the differences in R2, RMSE, and MAE for each model ranged from 0.05 to 0.30, 48.32 to 260.84 kg/ha, and 3.01 to 198.16 kg/ha, respectively. Ensemble voting showed higher yield prediction accuracy. For the test set, considering the entire growth period, the ranges of R2, RMSE, and MAE across different models were 0.33–0.77, 645.66–1101.03 kg/ha, and 534.93–877.08 kg/ha, respectively. On an individual growth stage basis, the differences in R2, RMSE, and MAE for each model ranged from 0.16 to 0.42, 268.3 to 372.83 kg/ha, and 263.02 to 403.70 kg/ha, respectively. The heading stage exhibited the highest yield prediction accuracy.
In both the validation and test sets, the ensemble voting model consistently exhibited stable predictive capabilities. Figure 9 illustrates the scatter plot and residual information for the ensemble voting model during the heading stage. The curve in the graph indicates a well-fitted linear relationship between the predicted and actual yields during the wheat heading stage.

3.3. Pixel-Level Spatialization of Yield

Based on the aforementioned research, the stability and superiority of the ensemble voting model were confirmed, particularly in predicting winter wheat yields during the heading stage. Therefore, this study adopted the ensemble voting model, utilizing various feature variables during the wheat heading stage as inputs, to predict the winter wheat yield in Henan Province for the year 2021. By employing the yield spatialization model, the predicted results at the county level were mapped to a more refined pixel scale. The enhancement in spatial resolution allowed for more accurate capture of the geographical variations in winter wheat productivity, unveiling local growth characteristics and yield fluctuations within specific regions.
Figure 10 demonstrates the spatial distribution of winter wheat yield at the pixel scale in Henan Province. It can be observed from the figure that the high-yielding areas of winter wheat in Henan Province are mainly concentrated in the eastern region, and the yield in the western region is relatively low. In addition, the planting structure of winter wheat is more fragmented in the western region. The more concentrated distribution of high-yielding plots highlights the existence of more favorable conditions for winter wheat cultivation in the eastern region. The results of this spatialization of yields help us to more comprehensively understand the geographic variability of winter wheat yields in Henan Province and provide visual support for agricultural decision-making and management.

4. Discussion

4.1. Performance Comparison of Winter Wheat Yield Prediction Models

We compared commonly used machine learning models with the ensemble voting model constructed in this study. The results show that the R2 values for different algorithm models range from 0.03 to 0.90, demonstrating significant performance differences. Among these models, the ensemble voting model exhibited the highest R2 and the lowest RMSE and MAE for both the full growth period and individual growth stages. This model integrates ridge, gradient boosting, and random forest using a weighted approach during training, where the results of weak learners compensate for the errors of individual learners [49,50], enhancing the flexibility of reducible error [51] and showcasing the substantial advantages of ensemble learning in overall prediction performance. When selecting the best ensemble method for a given problem, it is important to consider the suitability of the setup (such as class imbalance and high dimensionality) as well as computational costs [52]. Owing to their fast training speed and low computational cost, ensemble learning methods are widely used in various fields, including short-term power load forecasting, cost estimation, and plasma reaction dynamics modeling [53,54,55].

4.2. Analysis of Yield Prediction Potential for Individual Growth Periods

The spatial heterogeneity of the soil and physiological characteristics of crops change during different growth stages [18,56,57,58]. Figure 8 illustrates the accuracy trends of various yield prediction models for winter wheat. In this study, we observed significant differences in the accuracy of winter wheat yield predictions across different growth stages. The accuracy of yield predictions gradually increased with the growth of winter wheat, peaking at the heading stage before declining. The peak prediction potential at the heading stage is likely due to the formation of spikes and ears, which stabilize the plant’s morphology and structure. Nutrient accumulation and growth conditions play a crucial role in the final yield of winter wheat. The subsequent decline in prediction accuracy may be attributed to the inclusion of vegetation index data among the features. After the heading stage, nutrients are transferred from the stems and leaves to the grains, leading to a decrease in chlorophyll in the leaves. This reduction in chlorophyll affects the vegetation index data related to chlorophyll, thereby reducing its correlation with winter wheat yield and resulting in decreased accuracy of the yield prediction models.

4.3. Analysis of Spatialization in Yield Research

Small-scale yield prediction for winter wheat is crucial for understanding planting structures and achieving optimal agricultural resource allocation. Previous studies typically used methods such as drone imagery and crop growth models for yield estimation at the field level [4,56]. However, these methods often struggle to cover provincial scales simultaneously, and some crop growth models require numerous parameters. Due to spatial heterogeneity in soil, weather, and environmental factors, unifying some of these parameters can be challenging.
In this study, the ensemble voting model was used to predict winter wheat yield during the heading stage, which offers the greatest yield prediction potential. By applying a spatial yield model, winter wheat yield estimates at the county level were downscaled to the pixel level. This approach not only meets the need for county-level yield prediction but also visually represents winter wheat productivity at a finer scale. The feasibility and practicality of this method make it a powerful tool for supporting agricultural decision-making and management, providing a scientific basis for precision agriculture.

5. Conclusions

In this study, we estimated the yield of winter wheat in Henan Province using eight feature variables, including LAI, LST, and GPP, together with historical yield data. We proposed an ensemble voting model that combines gradient boosting, random forest, and ridge regression. The ensemble voting model achieved the highest accuracy among the machine learning models compared, both across the entire growth period and within individual growth stages, which indicates that the approach is stable and accurate for crop yield estimation. We also found that the heading stage had the greatest yield prediction potential, which may be linked to the stabilization of plant morphology and nutrient accumulation during this period. This finding supports the early allocation of agricultural resources and thereby contributes to food security and precision agriculture. By constructing a yield spatialization model, we refined the county-level yield predictions to the pixel level, which avoids the complexity and computational cost of direct pixel-level yield estimation and offers an effective route to pixel-level winter wheat yield prediction.
Nevertheless, this study has some uncertainties and room for improvement. Crop yield is influenced by multiple factors, including climate, soil properties, and human management practices, and the features selected in this study may not cover all key factors, which could make prediction performance unstable. Future research should consider incorporating additional features to improve the model’s coverage and accuracy. Furthermore, the relatively limited training data may restrict how well the model trains and generalizes. Future studies could use longer time series or enlarge the sample with methods such as adversarial networks, thereby better capturing the long-term trends and cyclical variations in winter wheat yield.

Author Contributions

Conceptualization, X.L. and Z.L.; methodology and data curation, Z.L. and S.L.; writing—original draft preparation, Z.L.; writing—review and editing, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the National Key Research and Development Plan of China (No. 2016YFC0803103).

Data Availability Statement

Data are contained within the article. The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tian, H.; Wang, P.; Tansey, K.; Han, D.; Zhang, J.; Zhang, S.; Li, H. A Deep Learning Framework under Attention Mechanism for Wheat Yield Estimation Using Remotely Sensed Indices in the Guanzhong Plain, PR China. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102375.
  2. Chen, P.; Li, Y.; Liu, X.; Tian, Y.; Zhu, Y.; Cao, W.; Cao, Q. Improving Yield Prediction Based on Spatio-Temporal Deep Learning Approaches for Winter Wheat: A Case Study in Jiangsu Province, China. Comput. Electron. Agric. 2023, 213, 108201.
  3. Xu, X.; Gao, P.; Zhu, X.; Guo, W.; Ding, J.; Li, C.; Zhu, M.; Wu, X. Design of an Integrated Climatic Assessment Indicator (ICAI) for Wheat Production: A Case Study in Jiangsu Province, China. Ecol. Indic. 2019, 101, 943–953.
  4. Zhuo, W.; Fang, S.; Gao, X.; Wang, L.; Wu, D.; Fu, S.; Wu, Q.; Huang, J. Crop Yield Prediction Using MODIS LAI, TIGGE Weather Forecasts and WOFOST Model: A Case Study for Winter Wheat in Hebei, China during 2009–2013. Int. J. Appl. Earth Obs. Geoinf. 2022, 106, 102668.
  5. Ren, Y.; Li, Q.; Du, X.; Zhang, Y.; Wang, H.; Shi, G.; Wei, M. Analysis of Corn Yield Prediction Potential at Various Growth Phases Using a Process-Based Model and Deep Learning. Plants 2023, 12, 446.
  6. Cao, J.; Zhang, Z.; Tao, F.; Zhang, L.; Luo, Y.; Han, J.; Li, Z. Identifying the Contributions of Multi-Source Data for Winter Wheat Yield Prediction in China. Remote Sens. 2020, 12, 750.
  7. Li, L.; Wang, B.; Feng, P.; Li Liu, D.; He, Q.; Zhang, Y.; Wang, Y.; Li, S.; Lu, X.; Yue, C.; et al. Developing Machine Learning Models with Multi-Source Environmental Data to Predict Wheat Yield in China. Comput. Electron. Agric. 2022, 194, 106790.
  8. Cao, J.; Wang, H.; Li, J.; Tian, Q.; Niyogi, D. Improving the Forecasting of Winter Wheat Yields in Northern China with Machine Learning–Dynamical Hybrid Subseasonal-to-Seasonal Ensemble Prediction. Remote Sens. 2022, 14, 1707.
  9. Cai, Y.; Guan, K.; Lobell, D.; Potgieter, A.B.; Wang, S.; Peng, J.; Xu, T.; Asseng, S.; Zhang, Y.; You, L.; et al. Integrating Satellite and Climate Data to Predict Wheat Yield in Australia Using Machine Learning Approaches. Agric. For. Meteorol. 2019, 274, 144–159.
  10. Madugundu, R.; Al-Gaadi, K.A.; Tola, E.; Edrris, M.K.; Edrees, H.F.; Alameen, A.A. Optimal Timing of Carrot Crop Monitoring and Yield Assessment Using Sentinel-2 Images: A Machine-Learning Approach. Appl. Sci. 2024, 14, 3636.
  11. Kaur, P.; Harnal, S.; Tiwari, R.; Upadhyay, S.; Bhatia, S.; Mashat, A.; Alabdali, A.M. Recognition of Leaf Disease Using Hybrid Convolutional Neural Network by Applying Feature Reduction. Sensors 2022, 22, 575.
  12. Nagaraju, M.; Chawla, P.; Upadhyay, S.; Tiwari, R. Convolution Network Model Based Leaf Disease Detection Using Augmentation Techniques. Expert Syst. 2022, 39, e12885.
  13. Mishra, A.M.; Harnal, S.; Gautam, V.; Tiwari, R.; Upadhyay, S. Weed density estimation in soya bean crop using deep convolutional neural networks in smart agriculture. J. Plant Dis. Prot. 2022, 129, 593–604.
  14. Khan, S.N.; Li, D.; Maimaitijiang, M. A Geographically Weighted Random Forest Approach to Predict Corn Yield in the US Corn Belt. Remote Sens. 2022, 14, 2843.
  15. Proutsos, N.D.; Fotelli, M.N.; Stefanidis, S.P.; Tigkas, D. Assessing the Accuracy of 50 Temperature-Based Models for Estimating Potential Evapotranspiration (PET) in a Mediterranean Mountainous Forest Environment. Atmosphere 2024, 15, 662.
  16. Guo, Y.; Wang, H.; Wu, Z.; Wang, S.; Sun, H.; Senthilnath, J.; Wang, J.; Robin Bryant, C.; Fu, Y. Modified Red Blue Vegetation Index for Chlorophyll Estimation and Yield Prediction of Maize from Visible Images Captured by UAV. Sensors 2020, 20, 5055.
  17. Zhou, H.; Yang, J.; Lou, W.; Sheng, L.; Li, D.; Hu, H. Improving Grain Yield Prediction through Fusion of Multi-Temporal Spectral Features and Agronomic Trait Parameters Derived from UAV Imagery. Front. Plant Sci. 2023, 14, 1217448.
  18. Zhou, X.; Zheng, H.B.; Xu, X.Q.; He, J.Y.; Ge, X.K.; Yao, X.; Cheng, T.; Zhu, Y.; Cao, W.X.; Tian, Y.C. Predicting Grain Yield in Rice Using Multi-Temporal Vegetation Indices from UAV-Based Multispectral and Digital Imagery. ISPRS J. Photogramm. Remote Sens. 2017, 130, 246–255.
  19. Sun, J.; Di, L.; Sun, Z.; Shen, Y.; Lai, Z. County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model. Sensors 2019, 19, 4363.
  20. Wang, J.; Si, H.; Gao, Z.; Shi, L. Winter Wheat Yield Prediction Using an LSTM Model from MODIS LAI Products. Agriculture 2022, 12, 1707.
  21. Tan, C.; Wang, D.; Zhou, J.; Du, Y.; Luo, M.; Zhang, Y.; Guo, W. Remotely Assessing Fraction of Photosynthetically Active Radiation (FPAR) for Wheat Canopies Based on Hyperspectral Vegetation Indexes. Front. Plant Sci. 2018, 9, 776.
  22. Schwalbert, R.A.; Amado, T.; Corassa, G.; Pott, L.P.; Prasad, P.V.V.; Ciampitti, I.A. Satellite-Based Soybean Yield Forecast: Integrating Machine Learning and Weather Data for Improving Crop Yield Prediction in Southern Brazil. Agric. For. Meteorol. 2020, 284, 107886.
  23. Khan, S.N.; Li, D.; Maimaitijiang, M. Using Gross Primary Production Data and Deep Transfer Learning for Crop Yield Prediction in the US Corn Belt. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103965.
  24. Peng, B.; Guan, K.; Pan, M.; Li, Y. Benefits of Seasonal Climate Prediction and Satellite Data for Forecasting U.S. Maize Yield. Geophys. Res. Lett. 2018, 45, 9662–9671.
  25. Huber, F.; Yushchenko, A.; Stratmann, B.; Steinhage, V. Extreme Gradient Boosting for Yield Estimation Compared with Deep Learning Approaches. Comput. Electron. Agric. 2022, 202, 107346.
  26. Zhang, L.; Zhang, Z.; Luo, Y.; Cao, J.; Xie, R.; Li, S. Integrating Satellite-Derived Climatic and Vegetation Indices to Predict Smallholder Maize Yield Using Deep Learning. Agric. For. Meteorol. 2021, 311, 108666.
  27. Fei, S.; Hassan, M.A.; Xiao, Y.; Su, X.; Chen, Z.; Cheng, Q.; Duan, F.; Chen, R.; Ma, Y. UAV-Based Multi-Sensor Data Fusion and Machine Learning Algorithm for Yield Prediction in Wheat. Precis. Agric. 2023, 24, 187–212.
  28. Ahmed, A.A.M.; Sharma, E.; Jui, S.J.J.; Deo, R.C.; Nguyen-Huy, T.; Ali, M. Kernel Ridge Regression Hybrid Method for Wheat Yield Prediction with Satellite-Derived Predictors. Remote Sens. 2022, 14, 1136.
  29. Wang, Y.; Shi, W.; Wen, T. Prediction of Winter Wheat Yield and Dry Matter in North China Plain Using Machine Learning Algorithms for Optimal Water and Nitrogen Application. Agric. Water Manag. 2023, 277, 108140.
  30. Pang, A.; Chang, M.W.L.; Chen, Y. Evaluation of Random Forests (RF) for Regional and Local-Scale Wheat Yield Prediction in Southeast Australia. Sensors 2022, 22, 717.
  31. Kumar, S.; Attri, S.D.; Singh, K.K. Comparison of Lasso and Stepwise Regression Technique for Wheat Yield Prediction. J. Agrometeorol. 2019, 21, 188–192.
  32. Son, N.-T.; Chen, C.-F.; Cheng, Y.-S.; Toscano, P.; Chen, C.-R.; Chen, S.-L.; Tseng, K.-H.; Syu, C.-H.; Guo, H.-Y.; Zhang, Y.-T. Field-Scale Rice Yield Prediction from Sentinel-2 Monthly Image Composites Using Machine Learning Algorithms. Ecol. Inform. 2022, 69, 101618.
  33. Zhou, W.; Liu, Y.; Ata-Ul-Karim, S.T.; Ge, Q.; Li, X.; Xiao, J. Integrating Climate and Satellite Remote Sensing Data for Predicting County-Level Wheat Yield in China Using Machine Learning Methods. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102861.
  34. Zhao, Y.; Xiao, D.; Bai, H.; Tang, J.; Liu, D.L.; Qi, Y.; Shen, Y. The Prediction of Wheat Yield in the North China Plain by Coupling Crop Model with Machine Learning Algorithms. Agriculture 2023, 13, 99.
  35. Zhao, Y.; Zhang, Y.; Yang, Y.; Li, F.; Dai, R.; Li, J.; Wang, M.; Li, Z. The Impact of Land Use Structure Change on Utilization Performance in Henan Province, China. Int. J. Environ. Res. Public Health 2023, 20, 4251.
  36. Huang, J.; Zhou, L.; Zhang, F.; Hu, Z.; Tian, H. Responses of Yield Variability of Summer Maize in Henan Province, North China, to Large-Scale Atmospheric Circulation Anomalies. Theor. Appl. Clim. 2021, 143, 1655–1665.
  37. Xie, Y.; Shi, S.; Xun, L.; Wang, P. A Multitemporal Index for the Automatic Identification of Winter Wheat Based on Sentinel-2 Imagery Time Series. GIScience Remote Sens. 2023, 60, 2262833.
  38. Wang, Y.; Zhang, Y.; Zhang, R.; Li, J.; Zhang, M.; Zhou, S.; Wang, Z. Reduced Irrigation Increases the Water Use Efficiency and Productivity of Winter Wheat-Summer Maize Rotation on the North China Plain. Sci. Total Environ. 2018, 618, 112–120.
  39. National Bureau of Statistics of China. Data and Statistics. National Bureau of Statistics of China. Available online: https://data.stats.gov.cn (accessed on 1 October 2022).
  40. Ronchetti, G.; Manfron, G.; Weissteiner, C.J.; Seguini, L.; Nisini Scacchiafichi, L.; Panarello, L.; Baruth, B. Remote Sensing Crop Group-Specific Indicators to Support Regional Yield Forecasting in Europe. Comput. Electron. Agric. 2023, 205, 107633.
  41. Wang, X.; Huang, J.; Feng, Q.; Yin, D. Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches. Remote Sens. 2020, 12, 1744.
  42. Tian, L.; Wang, C.; Li, H.; Sun, H. Yield Prediction Model of Rice and Wheat Crops Based on Ecological Distance Algorithm. Environ. Technol. Innov. 2020, 20, 101132.
  43. Khalaf, G.; Månsson, K.; Shukur, G. Modified Ridge Regression Estimators. Commun. Stat.-Theory Methods 2013, 42, 1476–1487.
  44. Polat, K.; Guenes, S. A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems. Expert Syst. Appl. 2009, 36, 1587–1592.
  45. Tian, H.; Cheng, L.; Wu, D.; Wei, Q.; Zhu, L. Regional Monitoring of Leaf ChlorophyII Content of Summer Maize by Integrating Multi-Source Remote Sensing Data. Agronomy 2023, 13, 2040.
  46. Appelhans, T.; Mwangomo, E.; Hardy, D.R.; Hemp, A.; Nauss, T. Evaluating machine learning approaches for the interpolation of monthly air temperature at Mt. Kilimanjaro, Tanzania. Spat. Stat. 2015, 14, 91–113.
  47. Song, J.; Zhang, L.; Jiang, Q.; Ma, Y.; Zhang, X.; Xue, G.; Shen, X.; Wu, X. Estimate the Daily Consumption of Natural Gas in District Heating System Based on a Hybrid Seasonal Decomposition and Temporal Convolutional Network Model. Appl. Energy 2022, 309, 118444.
  48. Mienye, I.D.; Sun, Y. A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects. IEEE Access 2022, 10, 99129–99149.
  49. Sagi, O.; Rokach, L. Ensemble Learning: A Survey. WIREs Data Min. Knowl. Discov. 2018, 8, e1249.
  50. Sankalpa, C.; Kittipiyakul, S.; Laitrakun, S. Forecasting Short-Term Electricity Load Using Validated Ensemble Learning. Energies 2022, 15, 8567.
  51. Yulisa, A.; Park, S.H.; Choi, S.; Chairattanawat, C.; Hwang, S. Enhancement of Voting Regressor Algorithm on Predicting Total Ammonia Nitrogen Concentration in Fish Waste Anaerobiosis. Waste Biomass Valorization 2023, 14, 461–478.
  52. Banfield, R.E.; Hall, L.O.; Bowyer, K.W.; Kegelmeyer, W.P. Ensemble Diversity Measures and Their Application to Thinning. Inf. Fusion 2005, 6, 49–62.
  53. Hanicinec, M.; Mohr, S.; Tennyson, J. A Regression Model for Plasma Reaction Kinetics. J. Phys. D Appl. Phys. 2023, 56, 374001.
  54. Phyo, P.-P.; Byun, Y.-C.; Park, N. Short-Term Energy Forecasting Using Machine-Learning-Based Ensemble Voting Regression. Symmetry 2022, 14, 160.
  55. Natras, R.; Soja, B.; Schmidt, M. Ensemble Machine Learning of Random Forest, AdaBoost and XGBoost for Vertical Total Electron Content Forecasting. Remote Sens. 2022, 14, 3547.
  56. Perros, N.; Kalivas, D.; Giovos, R. Spatial Analysis of Agronomic Data and UAV Imagery for Rice Yield Estimation. Agriculture 2021, 11, 809.
  57. Sagan, V.; Maimaitijiang, M.; Bhadra, S.; Maimaitiyiming, M.; Brown, D.R.; Sidike, P.; Fritschi, F.B. Field-Scale Crop Yield Prediction Using Multi-Temporal WorldView-3 and PlanetScope Satellite Data and Deep Learning. ISPRS J. Photogramm. Remote Sens. 2021, 174, 265–281.
  58. Xu, W.; Yang, W.; Chen, S.; Wu, C.; Chen, P.; Lan, Y. Establishing a Model to Predict the Single Boll Weight of Cotton in Northern Xinjiang by Using High Resolution UAV Remote Sensing Data. Comput. Electron. Agric. 2020, 179, 105762.

Figure 1. Workflow of winter wheat yield prediction.
Figure 2. Distribution map of land-use types in Henan Province in 2021.
Figure 3. Statistical graph of production data.
Figure 4. Overall architecture diagram of multi-model ensemble methods.
Figure 5. Comparison of model prediction accuracy for the entire growth period of winter wheat.
Figure 6. Scatter plots and residuals of the validation set models for the entire growth period. Subfigures (a–d) depict the scatter plots and residual information for the ensemble voting, ridge, gradient boosting, and random forest models, respectively.
Figure 7. Scatter plots and residuals of the test set models for the entire growth period. Subfigures (a–d) depict the scatter plots and residual information for the ensemble voting, ridge, gradient boosting, and random forest models, respectively.
Figure 8. Accuracy of single growth period models. Subfigures comprise (a) a heatmap of R² for the validation set, (b) a heatmap of R² for the test set, (c) a heatmap of RMSE for the validation set, (d) a heatmap of RMSE for the test set, (e) a heatmap of MAE for the validation set, and (f) a heatmap of MAE for the test set.
Figure 9. Scatter plots of ensemble voting’s yield predictions during the heading stage. Subfigures comprise (a) a scatter plot for the validation set and (b) a scatter plot for the test set.
Figure 10. Spatialization of winter wheat yield prediction in Henan Province, 2021.

Table 1. Phenological calendar of winter wheat in Henan Province.

| Phenology | Time |
|---|---|
| Emergence | Late September to late October |
| Tillering | Early November to early December |
| Overwintering | Mid-December to mid-February |
| Green-Up | Mid-February to mid-March |
| Jointing | Mid-March to early April |
| Heading | Mid-April to late April |
| Milk Ripening | May |
| Maturation | Early June to late June |
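
Table 2. Summary of winter wheat production forecast data.

| Category | Variables | Spatial Resolution | Temporal Resolution | Sources |
|---|---|---|---|---|
| Statistical data | Yield | County-level | Yearly | Statistical Yearbook [39] |
| Statistical data | Wheat area | County-level | Yearly | Statistical Yearbook [39] |
| Vector data | Wheat | County-level | Yearly | Sentinel-2 image extraction |
| Vegetation index | EVI | 500 m | 8-day | MOD09A1 |
| Vegetation index | NDVI | 500 m | 8-day | MOD09A1 |
| Vegetation index | FPAR | 500 m | 8-day | MOD15A2H |
| Vegetation index | LST | 1 km | 8-day | MOD11A2 |
| Ecological data | GPP | 500 m | 8-day | MOD17A2H |
| Ecological data | LAI | 500 m | 8-day | MOD15A2H |
| Hydrological data | ET | 500 m | 8-day | MOD16A2 |
| Hydrological data | PET | 500 m | 8-day | MOD16A2 |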

Table 3. Accuracy of 11 yield prediction models (RMSE and MAE in kg/ha).

| Model | Validation R² | Validation RMSE | Validation MAE | Test R² | Test RMSE | Test MAE |
|---|---|---|---|---|---|---|
| Elastic Net | 0.52 | 933.28 | 767.95 | 0.52 | 930.86 | 754.47 |
| Gradient Boosting | 0.78 | 631.74 | 507.25 | 0.78 | 633.30 | 462.28 |
| Random Forest | 0.69 | 756.39 | 611.46 | 0.75 | 676.49 | 517.72 |
| Ridge | 0.86 | 509.50 | 389.66 | 0.79 | 609.80 | 475.07 |
| Adaboost | 0.61 | 848.46 | 725.36 | 0.57 | 878.83 | 731.44 |
| KNeighbors | 0.61 | 847.14 | 685.25 | 0.63 | 821.96 | 614.30 |
| DecisionTree | 0.55 | 904.67 | 628.91 | 0.49 | 954.56 | 700.81 |
| ExtraTree | 0.32 | 1116.44 | 812.83 | 0.38 | 1054.95 | 752.55 |
| NuSVR | 0.25 | 1172.38 | 1036.04 | 0.26 | 1156.47 | 1021.89 |
| SVR | 0.03 | 1332.69 | 1165.48 | 0.04 | 1316.57 | 1161.38 |
| Ensemble Voting | 0.90 | 439.21 | 351.28 | 0.90 | 424.44 | 313.92 |
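
For readers who want to reproduce the three accuracy measures reported in this table, the following plain-Python sketch computes R², RMSE, and MAE from observed and predicted yields. It assumes R² is the usual coefficient of determination (one minus the ratio of residual to total sum of squares); the section does not state which definition the study used, and the sample values are made up.

```python
import math


def regression_metrics(observed, predicted):
    """Return R², RMSE, and MAE for paired observed and predicted yields."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    ss_res = sum(r * r for r in residuals)                   # residual sum of squares
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Made-up yields in kg/ha for three counties.
metrics = regression_metrics([5000.0, 6000.0, 7000.0], [5200.0, 5900.0, 6900.0])
```

RMSE penalizes large errors more heavily than MAE, so reporting both, as the table does, gives a sense of whether a model's error is dominated by a few badly predicted counties.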

Table 4. Accuracy of ablation experimental models (RMSE and MAE in kg/ha).

| Removed Model | Validation R² | Validation RMSE | Validation MAE | Test R² | Test RMSE | Test MAE |
|---|---|---|---|---|---|---|
| Gradient Boosting | 0.82 | 569.00 | 458.29 | 0.86 | 506.82 | 386.23 |
| Random Forest | 0.88 | 477.62 | 381.62 | 0.88 | 474.53 | 349.19 |
| Ridge | 0.77 | 652.50 | 526.04 | 0.78 | 636.83 | 470.86 |
| None (Initial Model) | 0.90 | 439.21 | 351.28 | 0.90 | 424.44 | 313.92 |
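
An ablation of this kind can be sketched as a leave-one-model-out loop: drop one base learner, average the remaining predictions, and recompute the error. The Python below is a simplified illustration under stated assumptions. It reuses fixed, made-up base-model predictions and equal-weight averaging, whereas the study may have refitted or retuned the reduced ensembles; all function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))


def mean_prediction(prediction_lists):
    """Equal-weight average across models, sample by sample."""
    return [sum(values) / len(values) for values in zip(*prediction_lists)]


def ablation_rmse(observed, base_predictions):
    """RMSE of the full ensemble and of each ensemble with one model removed."""
    results = {"None": rmse(observed, mean_prediction(list(base_predictions.values())))}
    for removed in base_predictions:
        kept = [p for name, p in base_predictions.items() if name != removed]
        results[removed] = rmse(observed, mean_prediction(kept))
    return results


# Made-up observed yields and base-model predictions (kg/ha).
observed = [6050.0, 5450.0, 6900.0]
preds = {
    "ridge": [6100.0, 5400.0, 7000.0],
    "gradient_boosting": [5900.0, 5600.0, 6800.0],
    "random_forest": [6000.0, 5500.0, 6600.0],
}
ablation = ablation_rmse(observed, preds)
```

A member whose removal raises the error the most is the one the ensemble depends on most, which is how a table of this form is typically read.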

Citation

Lou, Z.; Lu, X.; Li, S. Yield Prediction of Winter Wheat at Different Growth Stages Based on Machine Learning. Agronomy 2024, 14, 1834. https://doi.org/10.3390/agronomy14081834