A Multi-Farm Global-to-Local Expert-Informed Machine Learning System for Strawberry Yield Forecasting
Abstract
1. Introduction
Novelty and Contributions
- Our proposed approach combines growers’ pre-season forecasts with a machine learning model and ERA5 climate reanalysis data to develop a strawberry yield forecasting system;
- Motivated by the intricacies of real data from multiple farms across the UK, we present a comprehensive end-to-end framework and a forecasting model that can support the growers’ decision-making process;
- With our global-to-local model, we demonstrate how data from multiple farms can inform decision-making at the local level, supporting a global-to-local approach rather than the more commonly used local-to-local one (a minimal sketch of this setup follows the list).
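The following minimal sketch illustrates the global-to-local setup described above. It is not the authors’ exact pipeline; the file name, feature columns, and train/test split are illustrative assumptions. A single model is fitted on data pooled from all farms and then evaluated separately for each farm and plot:

```python
# Minimal sketch of the global-to-local idea (illustrative assumptions throughout):
# pool data from every farm to fit one "global" model, then forecast and score
# at the "local" (farm/plot) level.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

df = pd.read_csv("multi_farm_strawberry.csv")  # hypothetical combined multi-farm dataset

features = ["week", "grower_preseason_forecast", "era5_mean_temp"]  # assumed feature names
target = "yield"

# Global fit: train on earlier seasons from all farms pooled together.
train = df[df["season"] < 2022]
test = df[df["season"] == 2022]

model = RandomForestRegressor(n_estimators=500, random_state=0)
model.fit(train[features], train[target])

# Local evaluation: score the global model separately on each farm and plot,
# mirroring the per-farm tables reported in the appendix.
for (farm, plot), grp in test.groupby(["farm", "plot"]):
    pred = model.predict(grp[features])
    rmse = mean_squared_error(grp[target], pred) ** 0.5
    mae = mean_absolute_error(grp[target], pred)
    print(f"farm {farm}, plot {plot}: RMSE={rmse:.4f}, MAE={mae:.4f}")
```

Pooling the farms in this way lets plots with short histories borrow strength from the rest of the data, which is the core of the global-to-local argument; the same pattern applies when the Random Forest is swapped for XGBoost.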
2. Related Work
3. Materials and Methods
3.1. Angus Soft Fruits Data
3.1.1. Pre-Season Forecast
3.1.2. Weekly Forecasts
3.2. Data Wrangling
3.2.1. The Base Dataset
3.2.2. Datasets for Comparison
- The base dataset (machine learning model);
- The dataset with the growers’ forecasts added (expert-informed model);
- The dataset with the ERA5 weather data added (machine learning model plus climate);
- The dataset with both the growers’ forecasts and the ERA5 weather data added (expert-informed model plus climate; see the sketch after this list).
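A minimal sketch of how these four dataset variants could be assembled with pandas; the file names, join keys, and column names are assumptions for illustration, not the paper’s exact schema:

```python
# Assemble the four dataset variants compared in this study (assumed schema).
import pandas as pd

base = pd.read_csv("base_dataset.csv")            # yields plus calendar features
grower_fc = pd.read_csv("grower_forecasts.csv")   # growers' pre-season/weekly forecasts
era5 = pd.read_csv("era5_temperature.csv")        # ERA5 temperature features

keys = ["farm", "plot", "year", "week"]           # assumed join keys

datasets = {
    "base": base,
    "expert": base.merge(grower_fc, on=keys, how="left"),
    "climate": base.merge(era5, on=keys, how="left"),
    "expert_climate": base.merge(grower_fc, on=keys, how="left")
                          .merge(era5, on=keys, how="left"),
}
```

Feeding all four variants through an otherwise identical training pipeline keeps the comparison fair: any change in RMSE or MAE can then be attributed to the added grower or ERA5 features.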
3.3. ERA5 Temperature Data
Comparison to a Farm Weather Station
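A minimal sketch of one way such a comparison could be run, aligning daily ERA5 2 m temperature with a farm weather station and reporting bias, MAE, and RMSE; the file and column names are illustrative assumptions, not the data used in the paper:

```python
# Align ERA5 2 m temperature with farm station readings on a shared daily index
# and summarise the agreement (illustrative file/column names).
import numpy as np
import pandas as pd

era5 = pd.read_csv("era5_daily_temp.csv", parse_dates=["date"]).set_index("date")
station = pd.read_csv("farm_station_temp.csv", parse_dates=["date"]).set_index("date")

joined = era5[["t2m_mean"]].join(station[["station_temp"]], how="inner").dropna()
diff = joined["t2m_mean"] - joined["station_temp"]

print(f"bias: {diff.mean():+.2f} °C")
print(f"MAE:  {diff.abs().mean():.2f} °C")
print(f"RMSE: {np.sqrt((diff ** 2).mean()):.2f} °C")
```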
3.4. Models
3.4.1. Random Forest
3.4.2. XGBoost
3.4.3. Forecasting Framework
4. Results
4.1. Model Variations
4.2. Random Forest Baseline
4.3. Base Model vs. Expert-Informed Model
4.4. Climate Model Data
4.5. Expert-Informed + Climate Model Data
4.6. Comparisons with Grower Forecasts
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Additional Forecasting Plots
Appendix A.1
Appendix A.2. Random Forest Results
| Farm | Plot | Base RMSE | Base MAE | Expert RMSE | Expert MAE | Climate RMSE | Climate MAE | Expert + Climate RMSE | Expert + Climate MAE |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.1695 ± 0.0014 | 0.0793 ± 0.0006 | 0.0976 ± 0.0013 | 0.0429 ± 0.0006 | 0.1647 ± 0.0022 | 0.0781 ± 0.0009 | 0.0952 ± 0.0010 | 0.0411 ± 0.0005 |
| 1 | 2 | 0.0870 ± 0.0008 | 0.0380 ± 0.0004 | 0.0416 ± 0.0005 | 0.0166 ± 0.0004 | 0.0841 ± 0.0008 | 0.0354 ± 0.0004 | 0.0456 ± 0.0007 | 0.0170 ± 0.0004 |
| 1 | 3 | 0.2008 ± 0.0014 | 0.0751 ± 0.0004 | 0.1736 ± 0.0008 | 0.0606 ± 0.0004 | 0.1932 ± 0.0013 | 0.0771 ± 0.0007 | 0.1653 ± 0.0007 | 0.0598 ± 0.0003 |
| 1 | 4 | 0.0884 ± 0.0011 | 0.0431 ± 0.0005 | 0.0984 ± 0.0010 | 0.0431 ± 0.0007 | 0.0723 ± 0.0010 | 0.0318 ± 0.0005 | 0.0990 ± 0.0010 | 0.0420 ± 0.0004 |
| 1 | 5 | 0.0959 ± 0.0011 | 0.0407 ± 0.0006 | 0.0495 ± 0.0012 | 0.0210 ± 0.0005 | 0.0993 ± 0.0012 | 0.0433 ± 0.0006 | 0.0483 ± 0.0006 | 0.0218 ± 0.0003 |
| 2 | 1 | 0.0801 ± 0.0011 | 0.0368 ± 0.0007 | 0.0653 ± 0.0015 | 0.0260 ± 0.0007 | 0.0605 ± 0.0011 | 0.0271 ± 0.0007 | 0.0592 ± 0.0010 | 0.0222 ± 0.0004 |
| 2 | 2 | 0.0871 ± 0.0011 | 0.0380 ± 0.0005 | 0.0909 ± 0.0012 | 0.0407 ± 0.0006 | 0.0830 ± 0.0016 | 0.0390 ± 0.0008 | 0.0913 ± 0.0010 | 0.0408 ± 0.0004 |
| 3 | 1 | 0.1047 ± 0.0004 | 0.0284 ± 0.0004 | 0.1105 ± 0.0009 | 0.0299 ± 0.0004 | 0.0788 ± 0.0012 | 0.0356 ± 0.0006 | 0.1140 ± 0.0008 | 0.0316 ± 0.0002 |
| 3 | 2 | 0.1076 ± 0.0005 | 0.0526 ± 0.0004 | 0.1365 ± 0.0009 | 0.0574 ± 0.0005 | 0.1030 ± 0.0012 | 0.0514 ± 0.0006 | 0.1398 ± 0.0013 | 0.0591 ± 0.0007 |
| 3 | 3 | 0.0781 ± 0.0007 | 0.0369 ± 0.0003 | 0.0694 ± 0.0015 | 0.0283 ± 0.0006 | 0.0830 ± 0.0016 | 0.0390 ± 0.0008 | 0.0913 ± 0.0007 | 0.0344 ± 0.0003 |
| 4 | 1 | 0.0680 ± 0.0008 | 0.0297 ± 0.0003 | 0.0418 ± 0.0004 | 0.0176 ± 0.0002 | 0.0605 ± 0.0011 | 0.0271 ± 0.0007 | 0.0365 ± 0.0006 | 0.0138 ± 0.0003 |
| 5 | 1 | 0.0828 ± 0.0019 | 0.0352 ± 0.0010 | 0.1295 ± 0.0005 | 0.0585 ± 0.0004 | 0.0847 ± 0.0023 | 0.0392 ± 0.0011 | 0.1279 ± 0.0004 | 0.0564 ± 0.0003 |
| 6 | 1 | 0.1180 ± 0.0009 | 0.0523 ± 0.0004 | 0.1009 ± 0.0014 | 0.0457 ± 0.0006 | 0.1101 ± 0.0004 | 0.0461 ± 0.0003 | 0.1006 ± 0.0010 | 0.0468 ± 0.0006 |
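A minimal sketch of how the ± figures reported alongside the scores in these tables could be produced, assuming they are standard deviations over repeated training runs with different random seeds; the number of repeats and the hyperparameters are assumptions:

```python
# Repeat training with several seeds and aggregate RMSE/MAE as "mean ± std"
# (assumed protocol; 10 repeats and 500 trees are illustrative choices).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

def run_once(seed, X_train, y_train, X_test, y_test):
    model = RandomForestRegressor(n_estimators=500, random_state=seed)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return mean_squared_error(y_test, pred) ** 0.5, mean_absolute_error(y_test, pred)

def summarise(X_train, y_train, X_test, y_test, n_runs=10):
    scores = np.array([run_once(s, X_train, y_train, X_test, y_test) for s in range(n_runs)])
    (rmse_m, mae_m), (rmse_s, mae_s) = scores.mean(axis=0), scores.std(axis=0)
    return f"{rmse_m:.4f} ± {rmse_s:.4f} (RMSE), {mae_m:.4f} ± {mae_s:.4f} (MAE)"
```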
XGBoost results for each farm and plot (RMSE and MAE):

| Farm | Plot | Base RMSE | Base MAE | Expert RMSE | Expert MAE | Climate RMSE | Climate MAE | Expert + Climate RMSE | Expert + Climate MAE |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.1239 ± 0.0026 | 0.0542 ± 0.0013 | 0.1097 ± 0.0024 | 0.0486 ± 0.0014 | 0.1340 ± 0.0018 | 0.0562 ± 0.0010 | 0.1068 ± 0.0026 | 0.0452 ± 0.0016 |
| 1 | 2 | 0.0624 ± 0.0029 | 0.0281 ± 0.0015 | 0.0360 ± 0.0018 | 0.0127 ± 0.0009 | 0.0462 ± 0.0016 | 0.0211 ± 0.0010 | 0.0387 ± 0.0019 | 0.0126 ± 0.0011 |
| 1 | 3 | 0.1670 ± 0.0012 | 0.0628 ± 0.0010 | 0.1601 ± 0.0018 | 0.0537 ± 0.0012 | 0.1694 ± 0.0029 | 0.0653 ± 0.0014 | 0.1565 ± 0.0028 | 0.0587 ± 0.0008 |
| 1 | 4 | 0.0850 ± 0.0014 | 0.0366 ± 0.0007 | 0.0800 ± 0.0029 | 0.0331 ± 0.0016 | 0.0724 ± 0.0018 | 0.0310 ± 0.0009 | 0.0824 ± 0.0024 | 0.0351 ± 0.0010 |
| 1 | 5 | 0.0601 ± 0.0024 | 0.0243 ± 0.0013 | 0.0536 ± 0.0019 | 0.0226 ± 0.0008 | 0.0635 ± 0.0043 | 0.0238 ± 0.0018 | 0.0454 ± 0.0011 | 0.0201 ± 0.0005 |
| 2 | 1 | 0.0436 ± 0.0023 | 0.0211 ± 0.0010 | 0.0478 ± 0.0045 | 0.0186 ± 0.0018 | 0.0311 ± 0.0010 | 0.0147 ± 0.0011 | 0.0365 ± 0.0024 | 0.0150 ± 0.0012 |
| 2 | 2 | 0.0938 ± 0.0017 | 0.0412 ± 0.0008 | 0.0985 ± 0.0020 | 0.0448 ± 0.0010 | 0.0774 ± 0.0018 | 0.0370 ± 0.0010 | 0.0961 ± 0.0030 | 0.0442 ± 0.0015 |
| 3 | 1 | 0.1176 ± 0.0018 | 0.0314 ± 0.0009 | 0.1154 ± 0.0019 | 0.0327 ± 0.0007 | 0.1184 ± 0.0017 | 0.0313 ± 0.0008 | 0.1180 ± 0.0026 | 0.0326 ± 0.0006 |
| 3 | 2 | 0.0922 ± 0.0052 | 0.0452 ± 0.0027 | 0.1125 ± 0.0025 | 0.0472 ± 0.0010 | 0.0953 ± 0.0029 | 0.0447 ± 0.0013 | 0.1267 ± 0.0027 | 0.0541 ± 0.0010 |
| 3 | 3 | 0.0716 ± 0.0026 | 0.0348 ± 0.0009 | 0.0650 ± 0.0017 | 0.0238 ± 0.0006 | 0.0730 ± 0.0030 | 0.0322 ± 0.0015 | 0.0822 ± 0.0029 | 0.0318 ± 0.0013 |
| 4 | 1 | 0.0645 ± 0.0029 | 0.0259 ± 0.0015 | 0.0364 ± 0.0021 | 0.0149 ± 0.0009 | 0.0426 ± 0.0038 | 0.0175 ± 0.0018 | 0.0337 ± 0.0022 | 0.0126 ± 0.0013 |
| 5 | 1 | 0.1342 ± 0.0020 | 0.0620 ± 0.0010 | 0.1244 ± 0.0022 | 0.0516 ± 0.0018 | 0.1453 ± 0.0048 | 0.0649 ± 0.0021 | 0.1233 ± 0.0010 | 0.0497 ± 0.0009 |
| 6 | 1 | 0.1042 ± 0.0023 | 0.0455 ± 0.0010 | 0.0718 ± 0.0017 | 0.0302 ± 0.0008 | 0.0934 ± 0.0019 | 0.0350 ± 0.0011 | 0.0868 ± 0.0026 | 0.0323 ± 0.0010 |
Mean RMSE and MAE across all farms and plots for XGBoost and Random Forest (RF):

| Method | XGBoost Mean RMSE | XGBoost Mean MAE | RF Mean RMSE | RF Mean MAE |
|---|---|---|---|---|
| Base ML | 0.0939 | 0.0395 | 0.1110 | 0.0475 |
| Expert-informed ML | 0.0855 | 0.0334 | 0.0922 | 0.0371 |
| Climate ERA5 plus ML | 0.0894 | 0.0365 | 0.1079 | 0.0463 |
| Expert-informed ML plus Climate ERA5 | 0.0872 | 0.0342 | 0.0934 | 0.0375 |
Mean RMSE and MAE of the growers’ forecasts compared with the expert-informed models:

| Method | Mean RMSE | Mean MAE |
|---|---|---|
| Pre-season Grower | 0.1008 | 0.0412 |
| Mid-season Grower | 0.1310 | 0.0519 |
| Expert-informed ML (XGBoost) | 0.0855 | 0.0334 |
| Expert-informed ML (RF) | 0.0922 | 0.0371 |
Per-farm and per-plot comparison of the growers’ pre-season and mid-season forecasts with the expert-informed model:

| Farm | Plot | Grower Pre-season RMSE | Grower Pre-season MAE | Grower Mid-season RMSE | Grower Mid-season MAE | Expert RMSE | Expert MAE |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.0989 | 0.0355 | 0.1082 | 0.0401 | 0.1097 ± 0.0024 | 0.0486 ± 0.0014 |
| 1 | 2 | 0.0681 | 0.0307 | 0.0992 | 0.0413 | 0.0360 ± 0.0018 | 0.0127 ± 0.0009 |
| 1 | 3 | 0.1756 | 0.0626 | 0.2133 | 0.0819 | 0.1601 ± 0.0018 | 0.0537 ± 0.0012 |
| 1 | 4 | 0.0646 | 0.0241 | 0.1071 | 0.0329 | 0.0800 ± 0.0029 | 0.0331 ± 0.0016 |
| 1 | 5 | 0.0765 | 0.0314 | 0.2925 | 0.1345 | 0.0536 ± 0.0019 | 0.0226 ± 0.0008 |
| 2 | 1 | 0.0426 | 0.0184 | 0.0809 | 0.0306 | 0.0478 ± 0.0045 | 0.0186 ± 0.0018 |
| 2 | 2 | 0.1102 | 0.0471 | 0.1207 | 0.0459 | 0.0985 ± 0.0020 | 0.0448 ± 0.0010 |
| 3 | 1 | 0.1029 | 0.0327 | 0.1226 | 0.0337 | 0.1154 ± 0.0019 | 0.0327 ± 0.0007 |
| 3 | 2 | 0.1574 | 0.0704 | 0.1169 | 0.0545 | 0.1125 ± 0.0027 | 0.0472 ± 0.0010 |
| 3 | 3 | 0.1153 | 0.0456 | 0.1320 | 0.0440 | 0.0650 ± 0.0029 | 0.0238 ± 0.0013 |
| 4 | 1 | 0.0364 | 0.0146 | 0.0631 | 0.0277 | 0.0364 ± 0.0022 | 0.0149 ± 0.0013 |
| 5 | 1 | 0.1451 | 0.0674 | 0.1410 | 0.0665 | 0.1244 ± 0.0010 | 0.0516 ± 0.0009 |
| 6 | 1 | 0.1169 | 0.0549 | 0.1055 | 0.0404 | 0.0718 ± 0.0026 | 0.0303 ± 0.0010 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).