Prediction of Potato (Solanum tuberosum L.) Yield Based on Machine Learning Methods
Abstract
1. Introduction
- Weather traits;
- Agricultural traits;
- Traits conditioned by genotype and phytophenological traits;
- Soil environment;
- Spectral data, including vegetation indices;
- Indicators related to plant productivity.
- Dependence on historical data.
- Disregarding nonlinear factors.
- Variability of environmental conditions.
- Errors in measurements and other inputs.
- No consideration of changes in agricultural practices.
- Complexity of the interaction between factors.
2. Materials and Methods
2.1. Dataset Description
2.1.1. Data Augmentation
- A random change fraction between 0.01 and 0.05 (that is, 1–5%) is drawn; it determines the degree of modification applied during augmentation.
- Noise is generated based on the random change percentage and is applied to both the features and the target variable. This noise addition simulates realistic variability within the data.
- The new synthetic record, created by applying noise, is then denormalized to bring it back to the original data scale.
- The synthetic record is appended to the augmented dataset along with its corresponding textual data.
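The four steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `augment_record`, the record layout, and the use of uniform multiplicative noise are all assumptions, since the text only states that the noise is derived from the random change value and that the record is denormalised afterwards.

```python
import random


def augment_record(norm_values, col_mins, col_maxs, text_fields, rng,
                   low=0.01, high=0.05):
    """Create one synthetic record from a min-max-normalised record.

    norm_values holds the normalised features followed by the normalised
    target, so the noise is applied to both.
    """
    # Step 1: draw the change fraction from the predefined range.
    change = rng.uniform(low, high)
    # Step 2: perturb each value by a relative amount within +/- change
    # (uniform multiplicative noise is an assumption of this sketch).
    noisy = [v * (1.0 + rng.uniform(-change, change)) for v in norm_values]
    # Step 3: denormalise back to the original scale of each column.
    restored = [lo + v * (hi - lo)
                for v, lo, hi in zip(noisy, col_mins, col_maxs)]
    # Step 4: keep the textual data unchanged alongside the new values.
    return {"values": restored, "text": dict(text_fields)}


# Toy usage: two columns (one feature, one target) and one text field.
rng = random.Random(0)
synthetic = augment_record([0.5, 0.2], [0.0, 10.0], [100.0, 20.0],
                           {"variety": "Innovator"}, rng)
```

Passing the random generator in explicitly keeps the augmentation reproducible for a given seed.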
2.1.2. BBCH-Scale
- (1) BBCH 0–10 (from planting to the beginning of emergence);
- (2) BBCH 11–50 (from the beginning of emergence to the beginning of tuber setting);
- (3) BBCH > 50 (from the beginning of tuber setting to harvest).
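As a small illustration of the three periods listed above, the helper below maps a BBCH code to its period number. The function name `bbch_period` and the choice to reject negative codes are assumptions made for this sketch.

```python
def bbch_period(bbch_code):
    """Return 1, 2 or 3 for the three BBCH periods used for aggregation."""
    if bbch_code < 0:
        raise ValueError("BBCH code must be non-negative")
    if bbch_code <= 10:
        return 1  # planting to the beginning of emergence
    if bbch_code <= 50:
        return 2  # beginning of emergence to the beginning of tuber setting
    return 3      # beginning of tuber setting to harvest


# Example: label a short sequence of observed BBCH codes.
periods = [bbch_period(code) for code in (0, 10, 11, 50, 51, 91)]
```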
2.1.3. Agronomic Data
2.1.4. Climate Data
2.1.5. Soil Data
2.1.6. Satellite Data
2.1.7. Selyaninov Hydrothermal Coefficient
- $n$: the length of the period considered in days;
- $P_i$: the rainfall on the i-th day (mm);
- $T_i$: the average daily temperature on the i-th day (°C).
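A minimal sketch of the coefficient built from the quantities defined above, assuming the commonly cited form of Selyaninov's index: total rainfall divided by one tenth of the summed daily mean temperatures over the same period. The function name is hypothetical, and any temperature threshold for including a day (often 10 °C in the literature) is left to the caller because it is not stated here.

```python
def hydrothermal_coefficient(daily_rain_mm, daily_mean_temp_c):
    """Selyaninov-style HTC for one period (commonly cited form, assumed).

    HTC = sum(rain) / (0.1 * sum(temp)), written as 10 * sum(rain) / sum(temp).
    """
    if len(daily_rain_mm) != len(daily_mean_temp_c):
        raise ValueError("rainfall and temperature series must have equal length")
    total_temp = sum(daily_mean_temp_c)
    if total_temp <= 0:
        raise ValueError("temperature sum must be positive")
    return 10.0 * sum(daily_rain_mm) / total_temp


# Example: three days with 10 mm of rain in total and a temperature sum of 50 °C.
htc = hydrothermal_coefficient([2.0, 0.0, 8.0], [15.0, 20.0, 15.0])
```

Computing the value separately for each BBCH period would give one aggregated feature per period.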
2.1.8. GDD Features
- Assessing a region’s suitability for cultivating specific crops;
- Estimating the growth stages of crops, weeds, or insects;
- Predicting the maturity and cutting dates of forage crops;
- Determining the optimal timing of fertiliser or pesticide application;
- Estimating heat stress on crops;
- Planning the spacing of planting dates to produce separate harvest dates.
- GDD: the Growing Degree Days (°C);
- $n$: the length of the period considered in days;
- $T_i$: the average daily air temperature ≥ 0 (°C).
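The definitions above suggest an accumulation of daily mean temperatures that are at or above 0 °C. The sketch below assumes that reading; the function name and the `base_temp` parameter (0 °C by default) are illustrative and not taken from the text.

```python
def growing_degree_days(daily_mean_temps_c, base_temp=0.0):
    """Accumulate daily mean temperature above a base temperature.

    With base_temp=0.0, days below 0 °C contribute nothing and all other
    days contribute their mean temperature.
    """
    return sum(max(t - base_temp, 0.0) for t in daily_mean_temps_c)


# Example: the -2 °C day is ignored, the other two days add 5 and 10.
gdd = growing_degree_days([5.0, -2.0, 10.0])
```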
2.1.9. Total Numerical Features
2.1.10. Data Pruning: Addressing Missing Values
- Data integrity: When more than half of the data for a variable are missing, the integrity and representativeness of that variable become questionable. With over 50% missing data, any form of imputation would largely be based on speculation, rather than trends or patterns inherent in the data.
- Statistical reliability: Variables with a large share of missing values can skew the results and lead to unreliable conclusions. By setting the threshold at 50%, we aimed to keep only variables with enough observed values to support the models, so that they were built on representative data.
- Balance between data retention and quality: The 50% threshold strikes a balance between retaining as much data as possible and ensuring the quality of the dataset. This threshold allowed us to keep a substantial portion of the dataset while avoiding the pitfalls of basing our analysis on largely imputed or speculative data.
- Benchmarking against standard practices: This threshold is in line with common practices in data science and statistical analysis, where a 50% cutoff is often used as a standard for determining the viability of a variable in a dataset.
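The pruning rule described above can be sketched in a few lines. The column-oriented layout (a dictionary of lists with `None` marking a missing value) and the function name are assumptions chosen to keep the example dependency-free.

```python
def drop_sparse_columns(table, max_missing=0.5):
    """Keep only columns whose share of missing values is at most max_missing.

    table maps a column name to a list of values; None marks a missing value.
    """
    kept = {}
    for name, values in table.items():
        if not values:
            continue  # an empty column carries no information
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share <= max_missing:
            kept[name] = values
    return kept


# Example: "mg" is 75% missing and is dropped; "k" is exactly 50% and is kept.
soil = {
    "ph": [6.1, None, 5.9, 6.0],
    "mg": [None, None, None, 4.2],
    "k": [None, None, 1.0, 2.0],
}
pruned = drop_sparse_columns(soil)
```

A column that is exactly half missing is retained here because the rule targets variables with more than half of their values missing.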
2.2. Data Imputation
Methods of Data Imputation
Algorithm 1 Hybrid imputation procedure.
2.3. Data Normalisation
- $X_{\text{norm}}$ is the normalised value;
- $X$ is the original value;
- $X_{\min}$ is the minimum value in the feature column;
- $X_{\max}$ is the maximum value in the feature column.
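The quantities defined above correspond to min-max scaling, in which each value is shifted by the column minimum and divided by the column range. The sketch below is illustrative: the function names are hypothetical, and mapping a constant column to zeros is a choice made here to avoid division by zero.

```python
def fit_min_max(column):
    """Return the (minimum, maximum) of a feature column."""
    return min(column), max(column)


def min_max_scale(column, col_min, col_max):
    """Scale values to [0, 1] using a previously fitted minimum and maximum."""
    span = col_max - col_min
    if span == 0:
        return [0.0 for _ in column]  # constant column: no spread to scale
    return [(x - col_min) / span for x in column]


# Example: fit on one column and scale it.
lo, hi = fit_min_max([10.0, 20.0, 30.0])
scaled = min_max_scale([10.0, 20.0, 30.0], lo, hi)
```

Keeping the fit and the transform separate allows the minimum and maximum from one subset to be reused on another.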
2.4. Prototyping the 3 AI Models: Non-Satellite, Satellite, and Hybrid
2.5. Data Partitioning: Training, Validation, and Testing
- df_train: data from the years 2018 and 2019;
- df_val: data from the year 2020;
- df_test: data from the years 2021 and 2022.
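A year-based split like the one listed above can be sketched as follows. The record layout (a list of dictionaries with a `year` key) and the function name are assumptions; the year groups are passed in as arguments so the helper does not hard-code any particular assignment.

```python
def split_by_year(records, train_years, val_years, test_years, year_key="year"):
    """Assign each record to train, val or test according to its year."""
    splits = {"train": [], "val": [], "test": []}
    for rec in records:
        year = rec[year_key]
        if year in train_years:
            splits["train"].append(rec)
        elif year in val_years:
            splits["val"].append(rec)
        elif year in test_years:
            splits["test"].append(rec)
        # records from any other year are left out of all three subsets
    return splits


# Toy usage with made-up yield values.
toy = [
    {"year": 2018, "yield": 41.0},
    {"year": 2019, "yield": 38.5},
    {"year": 2020, "yield": 44.2},
    {"year": 2021, "yield": 40.1},
]
parts = split_by_year(toy, train_years={2018, 2019}, val_years={2020},
                      test_years={2021})
```

Splitting by whole seasons rather than at random keeps later seasons unseen during training.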
2.6. Feature Selection
2.6.1. Stepwise Regression
- Fit the initial model.
- If any predictors not in the model have p-values less than the entry tolerance (e.g., 0.05), add the one with the smallest p-value and repeat this step. If not, proceed to the next step.
- If any predictors in the model have p-values greater than the exit tolerance (e.g., 0.10), remove the one with the largest p-value, and go back to the previous step. If not, stop.
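The control flow of the three steps above can be sketched without committing to a particular regression library. In this illustration the p-values come from a caller-supplied function `p_value_fn(feature, model_features)`, which is hypothetical, as are the function name, the empty initial model, and the `max_steps` safeguard.

```python
def stepwise_select(candidates, p_value_fn, p_enter=0.05, p_remove=0.10,
                    max_steps=1000):
    """Forward-backward stepwise selection driven by p-values.

    p_value_fn(feature, model_features) must return the p-value of `feature`
    in a model fitted on `model_features` (which includes `feature`).
    """
    selected = []  # initial model: no predictors (an assumption of this sketch)
    for _ in range(max_steps):
        # Entry step: add the best outside predictor if it is below p_enter.
        outside = [f for f in candidates if f not in selected]
        if outside:
            p_in = {f: p_value_fn(f, selected + [f]) for f in outside}
            best = min(p_in, key=p_in.get)
            if p_in[best] < p_enter:
                selected.append(best)
                continue  # repeat the entry step
        # Exit step: drop the worst inside predictor if it is above p_remove.
        if selected:
            p_out = {f: p_value_fn(f, selected) for f in selected}
            worst = max(p_out, key=p_out.get)
            if p_out[worst] > p_remove:
                selected.remove(worst)
                continue  # go back to the entry step
        break  # nothing entered and nothing left: stop
    return selected


# Toy usage with fixed, made-up p-values per feature.
fixed_p = {"rain": 0.01, "temp": 0.03, "ph": 0.20}
chosen = stepwise_select(list(fixed_p), lambda f, model: fixed_p[f])
```

Keeping the entry tolerance below the exit tolerance prevents a predictor from being added and removed in an endless cycle.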
2.6.2. Pearson Correlation
2.6.3. Chi-Squared Test
2.6.4. Principal Component Analysis
3. Outlier Detection
3.1. Local Outlier Factor
Algorithm 2 Pseudocode for local outlier factor.
3.2. One-Class SVM
Algorithm 3 Pseudocode for One-Class SVM.
4. Results and Discussion
4.1. Non-Satellite Model
4.2. Satellite Model
4.3. Hybrid Model
4.4. Models Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Variety | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
---|---|---|---|---|---|---|
Innovator | 60 | 65 | 40 | 55 | 20 | 240 |
Ludmilla | 20 | 30 | 30 | 20 | 15 | 115 |
Ivory Russet | 5 | 10 | 0 | 0 | 0 | 15 |
Zorba | 5 | 20 | 15 | 10 | 0 | 50 |
Total | 90 | 115 | 95 | 85 | 35 | 420 |
Group of Features | No. of Features |
---|---|
Aggregated weather features | 7 |
Weather features | 92 |
Soil features | 17 |
Agrotechnical treatment features | 6 |
Vegetation indices GE | 64 |
Vegetation indices PL | 64 |
Total | 250 |
Variable Type | List of Variables |
---|---|
Agrotechnical treatment features (4 items) | Liquid fertilisation, spraying, planting, broadcast fertilisation |
Weather features (23 items) | Average temperature (°C), rainfall (mm), air temperature1 (°C), air temperature2 (°C), air temperature3 (°C), solar panel (mV), precipitation (mm), wind speed AVG (m/s), wind speed Min (m/s), wind speed Max (m/s), battery (mV), leaf wetness time (min), HC serial number, HC air temperature AVG (°C), HC air temperature Max (°C), HC air temperature Max (°C), HC relative humidity AVG (%), HC relative humidity AVG (%), HC relative humidity AVG (%), Dew point temperature AVG (°C), Dew point temperature Max (°C), vapour pressure deficit AVG (mBar), vapour pressure deficit Min (mBar) |
Aggregated weather features (6 items) | HTC 0–10, HTC 11–50, HTC > 50, GDD 0–10, GDD > 50, GDD 11–50 |
Soil features (4 items) | Soil pH, phosphorus (mg/100 g), potassium (mg/100 g), magnesium (mg/100 g) |
Vegetation indices GE (calculated based on Sentinel via Google Earth) (64 items) | EVI_GE_0_10_Max, EVI_GE_11_50_Max, EVI_GE_50_Max, EVI_GE_daily_Max, NDVI_GE_0_10_Max, NDVI_GE_11_50_Max, NDVI_GE_50_Max, NDVI_GE_daily_Max, RDVI_GE_0_10_Max, RDVI_GE_11_50_Max, RDVI_GE_50_Max, RDVI_GE_daily_Max, SAVI_GE_0_10_Max, SAVI_GE_11_50_Max, SAVI_GE_50_Max, SAVI_GE_daily_Max, and so on, for mean, Min, StdDev variants |
Vegetation indices PL (calculated based on PlanetScope via Planet Labs) (64 items) | EVI_PL_0_10_Max, EVI_PL_11_50_Max, EVI_PL_50_Max, EVI_PL_daily_Max, NDVI_PL_0_10_Max, NDVI_PL_11_50_Max, NDVI_PL_50_Max, NDVI_PL_daily_Max, RDVI_PL_0_10_Max, RDVI_PL_11_50_Max, RDVI_PL_50_Max, RDVI_PL_daily_Max, SAVI_PL_0_10_Max, SAVI_PL_11_50_Max, SAVI_PL_50_Max, SAVI_PL_daily_Max, and so on, for mean, Min, StdDev variants |
Group of Features | No. of Features |
---|---|
Aggregated weather features | 4 |
Weather features | 23 |
Soil features | 4 |
Agrotechnical treatment features | 6 |
Vegetation indices GE | 64 |
Vegetation indices PL | 64 |
Total | 165 |
No. | Penter | Premove |
---|---|---|
1 | 0.01 | 0.06 |
2 | 0.02 | 0.07 |
3 | 0.03 | 0.08 |
4 | 0.04 | 0.09 |
… | … | … |
87 | 0.87 | 0.92 |
88 | 0.88 | 0.93 |
89 | 0.89 | 0.94 |
90 | 0.9 | 0.95 |
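The rows shown in the table above follow a regular pattern: the entry tolerance rises in steps of 0.01 and the removal tolerance is always 0.05 higher. Assuming the elided rows follow the same pattern, the grid can be generated as below; the function name and its parameters are illustrative.

```python
def tolerance_grid(n_pairs=90, gap_hundredths=5):
    """Return (p_enter, p_remove) pairs: p_enter = i/100, p_remove = (i+gap)/100."""
    return [(i / 100, (i + gap_hundredths) / 100) for i in range(1, n_pairs + 1)]


# The first and last pairs match the first and last rows of the table.
grid = tolerance_grid()
first_pair, last_pair = grid[0], grid[-1]
```

Working in integer hundredths avoids accumulating floating-point error across the 90 steps.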
Model (Number) | Training Set (Samples/Features) | Validation Set (Samples/Features) | Test Set (Samples/Features) |
---|---|---|---|
NSM Without Outlier Detection (1) | 205/37 | 95/37 | 120/37 |
NSM With Outlier Detection Using Local Outlier Factor (2) | 200/37 | 95/37 | 120/37 |
NSM With Outlier Detection Using One-Class SVM (3) | 103/37 | 95/37 | 120/37 |
SM Without Outlier Detection (4) | 205/128 | 95/128 | 120/128 |
SM With Outlier Detection Using Local Outlier Factor (5) | 201/128 | 95/128 | 120/128 |
SM With Outlier Detection Using One-Class SVM (6) | 104/128 | 95/128 | 120/128 |
HM Without Outlier Detection (7) | 205/165 | 95/165 | 120/165 |
HM With Outlier Detection Using Local Outlier Factor (8) | 200/165 | 95/165 | 120/165 |
HM With Outlier Detection Using One-Class SVM (9) | 101/165 | 95/165 | 120/165 |
Model Number | Is_Stepwise_Fit_Used | Penter | Premove | Is_Pearson_Used | Is_Chi2_Used | Is_PCA | n_PCA_Components |
---|---|---|---|---|---|---|---|
(1) | True | 0.8|0.85 | False | False | True | 5 |
(2) | False | N/A|N/A | False | False | True | 5 |
(3) | True | 0.3|0.35 | False | False | True | 5 |
(4) | True | 0.5|0.55 | False | False | False | 0 |
(5) | True | 0.4|0.45 | False | False | False | 0 |
(6) | True | 0.4|0.45 | False | False | False | 0 |
(7) | True | 0.6|0.65 | True | False | False | 0 |
(8) | True | 0.2|0.25 | False | False | False | 0 |
(9) | True | 0.1|0.15 | False | False | True | 5 |
Group of Features | No. |
---|---|
Aggregated weather features | 4 |
Weather features | 23 |
Soil features | 4 |
Agrotechnical treatment features | 6 |
Total | 37 |
Type | Model | Outlier Detection | No. of Features | MAPE | PCA Used | PCA No. of Features |
---|---|---|---|---|---|---|
Non-satellite | SVR | N/A | 32 | 17.31% | True | 5 |
Non-satellite | SVR | LOF | 37 | 16.99% | True | 5 |
Non-satellite | XGB | SVM | 18 | 8.47% | True | 5 |
Satellite | Ridge | N/A | 92 | 14.87% | False | 0 |
Satellite | Ridge | LOF | 83 | 15.43% | False | 0 |
Satellite | Ridge | SVM | 102 | 16.38% | False | 0 |
Hybrid | XGB | N/A | 79 | 6.10% | False | 0 |
Hybrid | Random Forest | LOF | 80 | 6.94% | False | 0 |
Hybrid | XGB | SVM | 57 | 5.85% | True | 5 |
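The table above compares models by MAPE. For reference, a minimal sketch of the metric is given below, assuming the standard definition (the mean of absolute errors relative to the actual values, expressed as a percentage); the function name is hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Example: both predictions are off by 10% of the actual value.
error_pct = mape([100.0, 200.0], [90.0, 220.0])
```

Because each error is divided by the actual value, the metric is comparable across fields with different yield levels.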
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kurek, J.; Niedbała, G.; Wojciechowski, T.; Świderski, B.; Antoniuk, I.; Piekutowska, M.; Kruk, M.; Bobran, K. Prediction of Potato (Solanum tuberosum L.) Yield Based on Machine Learning Methods. Agriculture 2023, 13, 2259. https://doi.org/10.3390/agriculture13122259