Machine Learning Approaches for Forecasting the Best Microbial Strains to Alleviate Drought Impact in Agriculture
Abstract
1. Introduction
2. Materials and Methods
2.1. Description of the Data on Microbial Strains and Drought Conditions
2.2. Data Collection and Measurement Methods
- Bioassays for Plant-Growth-Promoting Traits: The production of plant growth hormones, siderophore production, and phosphate solubilization were detected using various bioassays. ACC deaminase activity, which is associated with the ability of bacteria to alleviate plant stress, was also measured.
- Bacterial Counts in Substrate: The bacterial populations in the substrate were assessed using the dilution plate method. This involved taking substrate samples from each pot and plating them on Tryptone Soya Agar. The colony-forming units (CFU) were then counted after incubation.
- Chlorophyll “a” Fluorescence: The health and stress level of the plants were assessed by measuring the parameters of chlorophyll fluorescence using a spectrofluorometer. This provided information on the efficiency of photosystem II, which can be affected by drought stress.
- Greenhouse Gas Emission Measurements: The emissions of NH3, CO2, N2O, and CH4 were measured using a field photoacoustic gas meter connected to a static chamber. This allowed for the assessment of the impact of the microbial strains and drought conditions on greenhouse gas emissions from the soil surface.
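Two of the measurements above reduce to simple arithmetic, sketched below. The sketch assumes the standard plate-count formula (colonies divided by the plated fraction of the original suspension) and takes Fv/Fm, the maximum quantum yield of photosystem II, as the fluorescence parameter of interest. The study does not state which fluorescence parameters were used, so that choice, the function names, and the sample readings are all illustrative assumptions.

```python
def cfu_per_ml(colony_count: int, dilution: float, plated_volume_ml: float) -> float:
    """Colony-forming units per mL of the undiluted suspension.

    dilution is the fraction of the original suspension on the plate,
    e.g. 1e-5 for a 10^-5 serial dilution.
    """
    if colony_count < 0:
        raise ValueError("colony_count must be non-negative")
    if not 0 < dilution <= 1:
        raise ValueError("dilution must be in (0, 1]")
    if plated_volume_ml <= 0:
        raise ValueError("plated_volume_ml must be positive")
    return colony_count / (dilution * plated_volume_ml)


def fv_fm(f0: float, fm: float) -> float:
    """Maximum quantum yield of photosystem II: Fv/Fm = (Fm - F0) / Fm."""
    if fm <= 0 or not 0 <= f0 <= fm:
        raise ValueError("require Fm > 0 and 0 <= F0 <= Fm")
    return (fm - f0) / fm


# Hypothetical readings, not taken from the study.
print(f"{cfu_per_ml(42, 1e-5, 0.1):.2e} CFU/mL")
print(f"Fv/Fm = {fv_fm(300.0, 1500.0):.2f}")
```

Lower Fv/Fm values generally point to impaired photosystem II efficiency, which is why this ratio is a common indicator of stress.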
2.3. Explanation of the Machine Learning Techniques Used
2.4. Details of the Experimental Design and Data Analysis
2.5. Model Training Specifics
- Naive Bayes: Algorithm Type: Probabilistic. Training Approach: Applied Bayes' theorem with an assumption of independence among predictors. The model was trained using maximum likelihood estimation. Hyperparameters: Default priors were used, with no hyperparameter tuning applied.
- Generalized Linear Model (GLM): Algorithm Type: Regression. Training Approach: The model used a link function to relate the linear combination of the input variables to the mean of the output variable. Iteratively reweighted least squares was employed for model optimization. Hyperparameters: Standard exponential family distributions (e.g., Gaussian, Binomial) were used.
- Logistic Regression: Algorithm Type: Classification. Training Approach: The model employed a logistic function to model the binary dependent variable. Hyperparameters: L2 regularization was utilized with a default regularization strength.
- Fast Large Margin: Algorithm Type: Classification. Training Approach: Used a margin-based classification method that seeks the hyperplane with the largest distance to the nearest training points of any class. Hyperparameters: Margin constraints were set with default values, with no hyperparameter tuning applied.
- Deep Learning: Algorithm Type: Neural network. Training Approach: Employed a feedforward deep neural network with backpropagation for optimization. Used ReLU activation functions for hidden layers and softmax for the output layer. Hyperparameters: Learning rate was set to 0.001, batch size was 32, and the model was trained for 50 epochs.
- Decision Tree: Algorithm Type: Tree-based. Training Approach: Utilized a top–down, recursive, divide-and-conquer approach. The Gini impurity was the criterion for splitting. Hyperparameters: Maximum depth was set to 5, and a minimum of 10 samples was required to split an internal node.
- Random Forest: Algorithm Type: Ensemble. Training Approach: This model trained multiple decision trees and averaged their predictions to improve predictive accuracy and control overfitting. Hyperparameters: Number of trees was set to 100, with a maximum depth of 5.
- Gradient Boosted Trees: Algorithm Type: Ensemble. Training Approach: Built trees one at a time, with each new tree fitted to correct the errors of the preceding ones. Used gradient descent to minimize the loss. Hyperparameters: Learning rate was set to 0.1, 100 boosting stages were performed, and a maximum depth of 3 was set for individual trees.
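The stage-wise idea behind Gradient Boosted Trees can be illustrated with a minimal pure-Python sketch. It is not the study's implementation: it assumes a squared-error loss on a toy one-feature regression problem (the study's task is classification), uses depth-1 stumps instead of depth-3 trees, and the data and all function names are hypothetical. Only the learning rate of 0.1 and the 100 boosting stages mirror the settings listed above.

```python
from statistics import mean


def fit_stump(x, residuals):
    """Find the one-split regression stump with the lowest squared error."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def boost(x, y, n_stages=100, learning_rate=0.1):
    """Fit stumps one at a time, each on the residuals of the current ensemble."""
    base = mean(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_stages):
        # For squared error, the negative gradient is the residual itself.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            pi + learning_rate * (left_value if xi <= threshold else right_value)
            for xi, pi in zip(x, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(lv if xi <= t else rv for t, lv, rv in stumps)


def mse(model, x, y):
    return mean([(yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)])


# Hypothetical toy data: two plateaus with small noise.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = boost(x, y)
mse_start = mse((mean(y), [], 0.1), x, y)  # constant baseline, no stumps
mse_end = mse(model, x, y)
print(f"training MSE: {mse_start:.4f} -> {mse_end:.4f}")
```

Each stage moves the predictions only a fraction (the learning rate) of the way toward the residuals, which is why a smaller learning rate typically needs more stages to reach the same fit.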
3. Results
4. Discussion
- Top Performer in Accuracy: Gradient Boosted Trees achieved the highest accuracy at 87%, followed by the Deep Learning model at 80%.
- Computational Efficiency: The Generalized Linear Model had the lowest total computation time, at 543,840 units, whereas Gradient Boosted Trees required 3,381,260 units, roughly six times as long. This highlights a significant trade-off between accuracy and computational efficiency.
- Consistency: The standard deviation indicates how consistent each model's results were; all models in the comparison fell within the 3–8% range. Gradient Boosted Trees combined the highest accuracy with a comparatively low standard deviation of 4%.
- Training Efficiency: Naive Bayes was the quickest to train on 1000 rows, at 2014.9 units. Gradient Boosted Trees took 22,101.2 units, about eleven times longer, so its accuracy comes at a substantially higher training cost.
- Gains: Gradient Boosted Trees also topped the gains metric at 68.0, followed by Logistic Regression at 46.0. This suggests that Gradient Boosted Trees not only offers high accuracy but also captures a larger share of true positives.
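The size of the accuracy-versus-cost trade-off described above can be checked with a few lines of arithmetic. The figures below are copied from the model comparison table; the time units are as reported there and are not specified. Variable names are illustrative.

```python
# Values copied from the model comparison table (time units unspecified).
total_time = {
    "Generalized Linear Model": 543_840.0,
    "Gradient Boosted Trees": 3_381_260.0,
}
training_time_1000_rows = {
    "Naive Bayes": 2014.9,
    "Gradient Boosted Trees": 22_101.2,
}
accuracy = {
    "Generalized Linear Model": 0.69,
    "Gradient Boosted Trees": 0.87,
}

# How many times more expensive is the most accurate model?
total_ratio = (total_time["Gradient Boosted Trees"]
               / total_time["Generalized Linear Model"])
training_ratio = (training_time_1000_rows["Gradient Boosted Trees"]
                  / training_time_1000_rows["Naive Bayes"])

# Accuracy gained in exchange, in percentage points.
accuracy_gain_points = 100 * (accuracy["Gradient Boosted Trees"]
                              - accuracy["Generalized Linear Model"])

print(f"total time ratio (GBT / GLM): {total_ratio:.1f}x")
print(f"training time ratio (GBT / Naive Bayes): {training_ratio:.1f}x")
print(f"accuracy gain (GBT vs GLM): {accuracy_gain_points:.0f} points")
```

In other words, Gradient Boosted Trees buys 18 percentage points of accuracy over the Generalized Linear Model at roughly six times the total computation time.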
4.1. Comparative Analysis of Machine Learning Models
4.2. Implications for Microbial Strain Selection in Agriculture
4.3. Implications for the Selection of Microbial Strains for Drought Mitigation
Discussion on Model Selection Based on Scenarios
- High Priority on Accuracy with Adequate Resources: Recommended Model: Gradient Boosted Trees. Reasoning: Achieving an accuracy of 87% and having a reasonable standard deviation of 4%, the Gradient Boosted Trees model stands out as the top performer. However, it also demands the highest computational resources, with a total time of 3,381,260 and a substantial training time per 1000 rows of data.
- Need for Quick Results with Moderate Accuracy: Recommended Model: Generalized Linear Model or Naive Bayes. Reasoning: Both these models offer decent accuracy, with the GLM slightly edging out at 69%. Their total computation time is relatively low, making them suitable for applications where quick insights are essential.
- Scenarios Requiring Deep Insights and Nonlinearity: Recommended Model: Deep Learning. Reasoning: With an accuracy of 80% and the ability to capture intricate patterns and relationships in data, Deep Learning models can be ideal. They are particularly effective when the dataset is large and when nonlinear relationships are suspected.
- Balancing Accuracy and Computation Time: Recommended Model: Decision Tree or Random Forest. Reasoning: Both models provide a good compromise between accuracy and computational efficiency. Random Forest, being an ensemble method, can handle more complex data patterns and offers slightly reduced variance compared to a single Decision Tree.
- Scenarios with Limited Data or Resources: Recommended Model: Fast Large Margin. Reasoning: With a relatively low computation time and modest accuracy, this model can be effective when computational resources are limited, or when a rapid prototype is required.
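The scenario-based reasoning above can be expressed as a small selection helper: filter the models by a resource budget, then take the most accurate survivor. This is only a sketch of that decision rule. The accuracy, standard deviation, and total time values come from the model comparison table, while the `pick_model` function and the budget thresholds in the usage lines are hypothetical.

```python
from typing import NamedTuple, Optional


class ModelResult(NamedTuple):
    name: str
    accuracy: float    # fraction correct
    std_dev: float     # standard deviation of accuracy
    total_time: float  # units as reported in the table


# Values copied from the model comparison table.
RESULTS = [
    ModelResult("Naive Bayes", 0.67, 0.03, 556_941.0),
    ModelResult("Generalized Linear Model", 0.69, 0.04, 543_840.0),
    ModelResult("Logistic Regression", 0.67, 0.06, 904_840.0),
    ModelResult("Fast Large Margin", 0.62, 0.03, 880_130.0),
    ModelResult("Deep Learning", 0.80, 0.06, 915_363.0),
    ModelResult("Decision Tree", 0.73, 0.08, 682_198.0),
    ModelResult("Random Forest", 0.71, 0.07, 953_485.0),
    ModelResult("Gradient Boosted Trees", 0.87, 0.04, 3_381_260.0),
]


def pick_model(results, max_total_time: Optional[float] = None,
               max_std_dev: Optional[float] = None) -> Optional[ModelResult]:
    """Return the most accurate model within the given limits, or None.

    Ties on accuracy are broken in favour of the lower total time.
    """
    candidates = [
        r for r in results
        if (max_total_time is None or r.total_time <= max_total_time)
        and (max_std_dev is None or r.std_dev <= max_std_dev)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.accuracy, -r.total_time))


# Hypothetical budgets, chosen only to illustrate the scenarios.
unconstrained = pick_model(RESULTS)
quick = pick_model(RESULTS, max_total_time=600_000)
moderate = pick_model(RESULTS, max_total_time=1_000_000)
print(unconstrained.name, "|", quick.name, "|", moderate.name)
```

A rule like this ranks on measured accuracy and cost only; qualitative considerations such as interpretability or dataset size would still need to be weighed separately.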
4.4. Future Directions
4.5. Limitations
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Soil Moisture Condition | Water Potential (kPa) | Description |
---|---|---|
Control (Optimal Soil Moisture) | −10 to −15 | The water potential was maintained at this level under control conditions. |
Water Deficit | −40 to −45 | The water potential was maintained at this level under conditions of water deficit in the substrate. |
Model | Accuracy | Standard Deviation | Gains | Total Time | Training Time (1000 Rows) | R² | MAE |
---|---|---|---|---|---|---|---|
Naive Bayes | 67% | 3% | 28.0 | 556,941.0 | 2014.9 | 0.65 | 5.4 |
Generalized Linear Model | 69% | 4% | 42.0 | 543,840.0 | 5125.0 | 0.68 | 5.2 |
Logistic Regression | 67% | 6% | 46.0 | 904,840.0 | 6985.1 | 0.66 | 5.3 |
Fast Large Margin | 62% | 3% | 18.0 | 880,130.0 | 4145.8 | 0.60 | 5.9 |
Deep Learning | 80% | 6% | 40.0 | 915,363.0 | 6279.8 | 0.97 | 4.0 |
Decision Tree | 73% | 8% | 18.0 | 682,198.0 | 5717.3 | 0.72 | 4.7 |
Random Forest | 71% | 7% | 20.0 | 953,485.0 | 5300.6 | 0.70 | 4.8 |
Gradient Boosted Trees | 87% | 4% | 68.0 | 3,381,260.0 | 22,101.2 | 0.89 | 3.2 |
Number of Trees | Maximal Depth | Learning Rate | Accuracy |
---|---|---|---|
30 | 4 | 0.1 | 0.8732 |
30 | 7 | 0.1 | 0.8683 |
90 | 4 | 0.01 | 0.8573 |
150 | 7 | 0.01 | 0.8573 |
150 | 4 | 0.01 | 0.8478 |
90 | 7 | 0.01 | 0.6785 |
30 | 2 | 0.1 | 0.6785 |
150 | 2 | 0.01 | 0.6836 |
150 | 4 | 0.001 | 0.6938 |
90 | 4 | 0.001 | 0.6989 |
30 | 2 | 0.01 | 0.6989 |
30 | 4 | 0.01 | 0.7040 |
30 | 7 | 0.01 | 0.7040 |
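The hyperparameter combinations in the table above can be summarized programmatically, for example by picking the best row and the best accuracy reached at each learning rate and depth. The sketch below uses only the rows listed in the table; the variable names are illustrative, and nothing is assumed about combinations that are not shown.

```python
# (number of trees, maximal depth, learning rate, accuracy), copied from the table.
GRID = [
    (30, 4, 0.1, 0.8732),
    (30, 7, 0.1, 0.8683),
    (90, 4, 0.01, 0.8573),
    (150, 7, 0.01, 0.8573),
    (150, 4, 0.01, 0.8478),
    (90, 7, 0.01, 0.6785),
    (30, 2, 0.1, 0.6785),
    (150, 2, 0.01, 0.6836),
    (150, 4, 0.001, 0.6938),
    (90, 4, 0.001, 0.6989),
    (30, 2, 0.01, 0.6989),
    (30, 4, 0.01, 0.7040),
    (30, 7, 0.01, 0.7040),
]

# Best single configuration among the listed rows.
best = max(GRID, key=lambda row: row[3])

# Best accuracy reached for each learning rate and for each depth.
best_by_learning_rate = {}
best_by_depth = {}
for trees, depth, learning_rate, acc in GRID:
    best_by_learning_rate[learning_rate] = max(
        acc, best_by_learning_rate.get(learning_rate, 0.0))
    best_by_depth[depth] = max(acc, best_by_depth.get(depth, 0.0))

print("best configuration:", best)
print("best accuracy per learning rate:", best_by_learning_rate)
print("best accuracy per depth:", best_by_depth)
```

Among the listed rows, no configuration with a depth of 2 or a learning rate of 0.001 exceeds 70% accuracy, while the best result uses 30 trees, a depth of 4, and a learning rate of 0.1.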
Metric | Value |
---|---|
Model Metrics Type | Multinomial |
Description | Metrics reported on full training frame |
Model ID | rm-h2o-model-model-61089 |
Frame ID | rm-h2o-frame-model-61089 |
RMSE | 0.8007551905 |
R² | 0.9719184 |
Logloss | 1.2460235 |
Mean Per Class Error | 0.36813188 |
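For readers unfamiliar with two of the multinomial metrics reported above, the sketch below computes log loss and mean per-class error from their standard definitions. It is not the code that produced the table: the class labels, probabilities, and function names are hypothetical, and the exact conventions of the software used in the study may differ.

```python
import math
from collections import defaultdict


def log_loss(true_labels, class_probs, eps=1e-15):
    """Multinomial log loss: mean of -log(probability given to the true class)."""
    total = 0.0
    for label, probs in zip(true_labels, class_probs):
        # Clip at eps so a zero probability does not produce log(0).
        total -= math.log(max(probs.get(label, 0.0), eps))
    return total / len(true_labels)


def mean_per_class_error(true_labels, predicted_labels):
    """Average, over classes, of the fraction of that class's rows misclassified."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for label, predicted in zip(true_labels, predicted_labels):
        totals[label] += 1
        if predicted != label:
            errors[label] += 1
    return sum(errors[c] / totals[c] for c in totals) / len(totals)


# Hypothetical labels and predicted class probabilities.
y_true = ["strain_A", "strain_A", "strain_B", "strain_C"]
y_prob = [
    {"strain_A": 0.7, "strain_B": 0.2, "strain_C": 0.1},
    {"strain_A": 0.3, "strain_B": 0.6, "strain_C": 0.1},
    {"strain_A": 0.1, "strain_B": 0.8, "strain_C": 0.1},
    {"strain_A": 0.2, "strain_B": 0.2, "strain_C": 0.6},
]
# Predicted label = class with the highest probability.
y_pred = [max(probs, key=probs.get) for probs in y_prob]

print(f"log loss: {log_loss(y_true, y_prob):.4f}")
print(f"mean per-class error: {mean_per_class_error(y_true, y_pred):.4f}")
```

Because mean per-class error averages the error rate of each class with equal weight, a rarely occurring class influences it as much as a common one, unlike overall accuracy.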
Step | Objective | Approach |
---|---|---|
1. Expanding Data Sources | Diversify and increase the robustness of predictive models. | Incorporate data from various geographical locations, covering different soil types, microbial ecologies, and climatic conditions. |
2. Incorporating Genomic Data | Achieve a deeper understanding of microbial strains. | Delve into the genomic data of the microbial strains to identify genetic markers associated with drought resistance. |
3. Ensemble Learning and Hybrid Models | Enhance prediction accuracy and model robustness. | Use ensemble methods combining various algorithms or create hybrid models blending traditional statistical methods with machine learning techniques. |
4. Real-time Monitoring and Prediction | Facilitate proactive interventions. | Develop a system with IoT devices for real-time monitoring and use machine learning models for impending drought stress predictions. |
5. Collaboration with Microbiologists | Ensure the biological viability of machine learning predictions. | Form interdisciplinary teams with microbiologists and soil scientists to validate the biological viability of machine learning recommendations. |
6. Model Explainability and Interpretation | Make machine learning models more transparent and understandable. | Implement techniques from Explainable AI (XAI) for insights into microbial strain selections. |
7. Field Trials and Validation | Empirically validate the efficacy of selected microbial strains. | Conduct controlled field trials monitoring plant health, yield, and drought resilience to validate machine learning recommendations. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Miller, T.; Mikiciuk, G.; Kisiel, A.; Mikiciuk, M.; Paliwoda, D.; Sas-Paszt, L.; Cembrowska-Lech, D.; Krzemińska, A.; Kozioł, A.; Brysiewicz, A. Machine Learning Approaches for Forecasting the Best Microbial Strains to Alleviate Drought Impact in Agriculture. Agriculture 2023, 13, 1622. https://doi.org/10.3390/agriculture13081622