Soil Temperature Estimation with Meteorological Parameters by Using Tree-Based Hybrid Data Mining Models
Abstract
1. Introduction
2. Materials and Methods
2.1. Material
2.2. Methods
2.2.1. Gradient Boosted Trees (GBT)
2.2.2. Decision Trees (DT)
2.2.3. Hybrid DT–GBT
2.2.4. Metrics Performed for Evaluation
- Root mean squared error (RMSE): the standard deviation of the residuals (prediction errors).
- Pearson correlation coefficient (r): measures the strength and direction of the linear relationship between the predicted and observed soil temperature values.
- Mean absolute error (MAE): the mean of the absolute differences between predicted and observed values; it is commonly used in time-series forecasting.
- Kling–Gupta efficiency (KGE): introduced by Gupta et al. [42] as an improvement on the Nash–Sutcliffe efficiency. It allows the relative contributions of correlation, bias, and variability to be analysed separately in hydrological modelling.
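The metrics listed above can be sketched in a few lines of plain Python. This is an illustrative sketch and not the authors' implementation: the function names are hypothetical, the KGE follows the 2009 formulation of Gupta et al. (correlation r, variability ratio alpha, bias ratio beta), the standard deviations are assumed to be population values, and the Nash–Sutcliffe efficiency is included only because the result tables report an NS column.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _std(values):
    # Population standard deviation (divides by n); an assumption of this sketch.
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def rmse(obs, pred):
    # Root mean squared error of the prediction residuals.
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    # Mean absolute error of the prediction residuals.
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    # Pearson correlation between observed and predicted values.
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

def nse(obs, pred):
    # Nash-Sutcliffe efficiency: 1 minus (squared error / variance of observations).
    mo = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mo) ** 2 for o in obs)
    return 1 - num / den

def kge(obs, pred):
    # Kling-Gupta efficiency; beta is undefined when the observed mean is zero.
    r = pearson_r(obs, pred)
    alpha = _std(pred) / _std(obs)    # variability ratio
    beta = _mean(pred) / _mean(obs)   # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Toy series: predictions carry a constant +1 degree bias.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
```

For the toy series, correlation and variability are perfect, so the KGE penalty comes from the bias term alone; this separation of error sources is what distinguishes KGE from a single aggregate score such as RMSE.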
2.2.5. Parameter Setup
2.2.6. Scenarios and Implementation
- load training data;
- load testing data;
- load test scenarios;
- for each test scenario in the list:
  - establish predicted value as specified in the scenario;
  - select only attributes specified;
  - generate model on the training data using windowing;
  - apply generated model on the test data;
  - store results.
- aggregate results.
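The procedure above can be sketched as a short Python loop. This is a hedged illustration and not the authors' workflow: the names `make_windows`, `fit_nearest_neighbour`, `run_scenarios`, and `aggregate` are hypothetical, the windowing scheme (features of the last `window` days predict the target on the final day) is an assumption, the column names and toy data are invented, and a 1-nearest-neighbour regressor stands in as a placeholder where a DT, GBT, or hybrid learner would be plugged in.

```python
import math

def make_windows(rows, features, target, window):
    # Assumed windowing: concatenate the selected features of `window`
    # consecutive days; the label is the target on the last day.
    X, y = [], []
    for end in range(window - 1, len(rows)):
        x = []
        for row in rows[end - window + 1 : end + 1]:
            x.extend(row[f] for f in features)
        X.append(x)
        y.append(rows[end][target])
    return X, y

def fit_nearest_neighbour(X, y):
    # Placeholder learner (NOT a tree model): predicts the label of the
    # closest training window. Swap in a tree-based learner here.
    def predict(x):
        best = min(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
        return y[best]
    return predict

def run_scenarios(train_rows, test_rows, scenarios, fit, window=3):
    results = []
    for sc in scenarios:
        # Establish the predicted value and select only the listed attributes.
        X_tr, y_tr = make_windows(train_rows, sc["features"], sc["target"], window)
        X_te, y_te = make_windows(test_rows, sc["features"], sc["target"], window)
        predict = fit(X_tr, y_tr)                # generate model on training data
        preds = [predict(x) for x in X_te]       # apply model to test data
        err = math.sqrt(sum((p - o) ** 2 for p, o in zip(preds, y_te)) / len(y_te))
        results.append({"target": sc["target"],  # store results
                        "features": sc["features"],
                        "rmse": err})
    return results

def aggregate(results):
    # Keep the lowest-RMSE scenario for each predicted variable.
    best = {}
    for r in results:
        if r["target"] not in best or r["rmse"] < best[r["target"]]["rmse"]:
            best[r["target"]] = r
    return best

# Toy data (invented): ST5 is MeanT + 1; SunDur is constant and uninformative.
train_rows = [{"MeanT": float(i), "SunDur": 8.0, "ST5": i + 1.0} for i in range(20)]
test_rows = [{"MeanT": float(i), "SunDur": 8.0, "ST5": i + 1.0} for i in range(5, 15)]
scenarios = [
    {"target": "ST5", "features": ["MeanT"]},
    {"target": "ST5", "features": ["SunDur"]},
]

results = run_scenarios(train_rows, test_rows, scenarios, fit_nearest_neighbour)
best = aggregate(results)
```

Passing the learner as the `fit` argument keeps the scenario loop independent of the algorithm, so the same loop could be reused for each of the three models compared in the study.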
3. Results
4. Discussion
Author Contributions
Funding
Conflicts of Interest
References
- Bond-Lamberty, B.; Wang, C.; Gower, S.T. Spatiotemporal measurement and modeling of stand-level boreal forest soil temperatures. Agric. For. Meteorol. 2005, 131, 27–40.
- Buckman, H.O.; Brady, N.C. The Nature and Properties of Soils, 6th ed.; The Macmillan Co.: New York, NY, USA, 1960.
- Seyfried, M.S.; Flerchinger, G.N.; Murdock, M.D.; Hanson, C.L.; Van Vactor, S. Long-Term Soil Temperature Database, Reynolds Creek Experimental Watershed, Idaho, United States. Water Resour. Res. 2001, 37, 2843–2846.
- Tenge, A.; Kaihura, F.B.; Lal, R.; Singh, B. Diurnal soil temperature fluctuations for different erosion classes of an oxisol at Mlingano, Tanzania. Soil Tillage Res. 1998, 49, 211–217.
- Zheng, D.; Hunt, E.; Running, S. A daily soil temperature model based on air temperature and precipitation for continental applications. Clim. Res. 1993, 2, 183–191.
- Yang, C.-C.; Prasher, S.O.; Mehuys, G.R.; Patni, N.K. Application of artificial neural networks for simulation of soil temperature. Trans. ASAE 1997, 40, 649–656.
- Paul, K.I.; Polglase, P.J.; Smethurst, P.J.; O’Connell, A.M.; Carlyle, C.J.; Khanna, P.K. Soil temperature under forests: A simple model for predicting soil temperature under a range of forest types. Agric. For. Meteorol. 2004, 121, 167–182.
- Bilgili, M. Prediction of soil temperature using regression and artificial neural network models. Meteorol. Atmos. Phys. 2010, 110, 59–70.
- Sattari, M.T.; Apaydin, H.; Shamshirband, S. Performance Evaluation of Deep Learning-Based Gated Recurrent Units (GRUs) and Tree-Based Models for Estimating ETo by Using Limited Meteorological Variables. Mathematics 2020, 8, 972.
- Sattari, M.T.; Dodangeh, E.; Abraham, J. Estimation of daily soil temperature via data mining techniques in semi-arid climate conditions. Earth Sci. Res. J. 2017, 21, 85–93.
- Apaydin, H.; Feizi, H.; Sattari, M.T.; Colak, M.S.; Shamshirband, S.; Chau, K.-W. Comparative Analysis of Recurrent Neural Network Architectures for Reservoir Inflow Forecasting. Water 2020, 12, 1500.
- Keskiner, A.; Ibrikci, T.; Cetin, M. Estimation and Comparison of Probabilistic Temperatures through Using Artificial Neural Networks in Geographic Information Systems Media. J. Agric. Sci. 2012, 17, 242–252.
- Yurekli, K.; Sattari, M.T.; Anli, A.S.; Hinis, M.A. Seasonal and annual regional drought prediction by using data-mining approach. Atmosfera 2012, 25, 85–105.
- Terzi, O.; Barak, M. Rainfall-Runoff Forecasting with Wavelet-Neural Network Approach: A Case Study of Kızılırmak River. J. Agric. Sci. 2015, 21, 546–557.
- Nourani, V.; Sattari, M.T.; Molajou, A. Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting. Water Resour. Manag. 2017, 31, 2645–2658.
- Sattari, M.T.; Mirabbasi, R.; Sushab, R.S.; Abraham, J.P. Prediction of Groundwater Level in Ardebil Plain Using Support Vector Regression and M5 Tree Model. Ground Water 2018, 56, 636–646.
- Rouzegari, N.; Hassanzadeh, Y.; Sattari, M.T. Using the Hybrid Simulated Annealing-M5 Tree Algorithms to Extract the If-Then Operation Rules in a Single Reservoir. Water Resour. Manag. 2019, 33, 3655–3672.
- Shabani, S.; Samadianfard, S.; Sattari, M.T.; Mosavi, A.; Shamshirband, S.; Kmet, T.; Várkonyi-Kóczy, A.R. Modeling Pan Evaporation Using Gaussian Process Regression K-Nearest Neighbors Random Forest and Support Vector Machines; Comparative Analysis. Atmosphere 2020, 11, 66.
- Zounemat-Kermani, M. Hydrometeorological Parameters in Prediction of Soil Temperature by Means of Artificial Neural Network: Case Study in Wyoming. J. Hydrol. Eng. 2013, 18, 707–718.
- Aslay, F.; Ozen, U. Estimating Soil Temperature with Artificial Neural Networks Using Meteorological Parameters. J. Polytech. 2013, 16, 139–145.
- Hosseinzadeh Talaee, P. Daily soil temperature modeling using neuro-fuzzy approach. Theor. Appl. Climatol. 2014, 118, 481–489.
- Kim, S.; Singh, V.P. Modeling daily soil temperature using data-driven models and spatial distribution. Theor. Appl. Climatol. 2014, 118, 465–479.
- Yener, D.; Ozgener, O.; Ozgener, L. Prediction of soil temperatures for shallow geothermal applications in Turkey. Renew. Sustain. Energy Rev. 2017, 70, 71–77.
- Samadianfard, S.; Asadi, E.; Jarhan, S.; Kazemi, H.; Kheshtgar, S.; Kisi, O.; Sajjadi, S.; Manaf, A.A. Wavelet neural networks and gene expression programming models to predict short-term soil temperature at different depths. Soil Tillage Res. 2018, 175, 37–50.
- Feng, Y.; Cui, N.; Hao, W.; Gao, L.; Gong, D. Estimation of soil temperature from meteorological data using different machine learning models. Geoderma 2019, 338, 67–77.
- Costache, R.; Pham, Q.B.; Avand, M.; Thuy Linh, N.T.; Vojtek, M.; Vojteková, J.; Lee, S.; Khoi, D.N.; Thao Nhi, P.T.; Dung, T.D. Novel hybrid models between bivariate statistics, artificial neural networks and boosting algorithms for flood susceptibility assessment. J. Environ. Manag. 2020, 265, 110485.
- Matei, O.; Rusu, T.; Petrovan, A.; Mihut, G. A data mining system for real time soil moisture prediction. Procedia Eng. 2017, 181, 837–844.
- Matei, O.; Rusu, T.; Bozga, A.; Pop, P.; Anton, A. Context-aware data mining: Embedding external data sources in a machine learning process. In International Conference on Hybrid Artificial Intelligence Systems; Springer: Cham, Switzerland, 2017.
- Anton, C.A.; Avram, A.; Petrovan, A.; Matei, O. Performance Analysis of Collaborative Data Mining vs Context Aware Data Mining in a Practical Scenario for Predicting Air Humidity. In Proceedings of the Computational Methods in Systems and Software; Springer: Cham, Switzerland, 2019; pp. 31–40.
- Wu, Z.; Zhou, Y.; Wang, H.; Jiang, Z. Depth prediction of urban flood under different rainfall return periods based on deep learning and data warehouse. Sci. Total Environ. 2020, 716, 137077.
- Anonymous. Sivas Investment Guide; Central Anatolia Development Agency: Kayseri, Turkey, 2017. (In Turkish)
- Anonymous. Activity Report; Republic of Turkey, Sivas Governorship Agriculture and Forest Provincial Directorate: Sivas, Turkey, 2019. (In Turkish)
- Anonymous. Meteorological Instruments; State Meteorological Service. Available online: https://www.mgm.gov.tr/genel/meteorolojikaletler.aspx (accessed on 8 August 2020). (In Turkish)
- Anonymous. Specifications of Meteorological Instruments; State Meteorological Service. Available online: https://www.mgm.gov.tr/FILES/kurumsal/mevzuat/ruzgar-gunes-ek.pdf (accessed on 8 August 2020). (In Turkish)
- Hofmann, M.; Klinkenberg, R. RapidMiner: Data Mining Use Cases and Business Analytics Applications; CRC Press: Boca Raton, FL, USA, 2016.
- Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. In European Conference on Computational Learning Theory; Springer: Berlin/Heidelberg, Germany, 1995; pp. 23–37.
- Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
- Rokach, L.; Maimon, O.Z. Data Mining with Decision Trees: Theory and Applications; World Scientific: Singapore, 2008; Volume 69.
- Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol. 1970, 10, 282–290.
- Hyndman, R.J.; Koehler, A.B. Another look at measures of forecast accuracy. Int. J. Forecast. 2006, 22, 679–688.
- Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol. 2009, 377, 80–91.
- Koskela, T.; Varsta, M.; Heikkonen, J.; Kaski, K. Timeseries prediction using recurrent som with local linear models. Int. J. Knowl. Based Intell. Eng. Syst. 1998, 2, 60–68.
- Avram, A.; Matei, O.; Pintea, C.; Pop, P.; Anton, C. Context-aware data mining vs classical data mining: Case study on predicting soil moisture. In International Workshop on Soft Computing Models in Industrial and Environmental Applications; Springer: Cham, Switzerland, 2019.
- Avram, A.; Matei, O.; Pintea, C.; Anton, C. Innovative Platform for Designing Hybrid Collaborative Context-Aware Data Mining Scenarios. Mathematics 2020, 8, 684.
- Anton, C.A.; Matei, O.; Avram, A. Collaborative Data Mining in Agriculture for Prediction of Soil Moisture and Temperature. In Computer Science On-Line Conference; Springer: Cham, Switzerland, 2019.
Statistic | ST5 (°C) | ST10 (°C) | ST20 (°C) | MinT (°C) | MeanT (°C) | MaxT (°C) | Sunshine Intensity (cal/cm²) | Sunshine Duration (h) | Precip. (mm) |
---|---|---|---|---|---|---|---|---|---|
Minimum | −9.3 | −8.4 | −4.6 | −19.8 | −15.5 | −11.3 | 0 | 0 | 0 |
Maximum | 41.6 | 36.6 | 31.2 | 26.3 | 31.7 | 41.1 | 750.78 | 13.1 | 33.6 |
Mean | 14.53 | 14.34 | 14.34 | 6.47 | 12.41 | 19.09 | 383.66 | 7.12 | 1.02 |
Stdev | 11.12 | 10.53 | 9.73 | 8.18 | 9.66 | 11.30 | 200.12 | 4.09 | 3.10 |
Number of records | 3316 | 3347 | 3347 | 3358 | 3358 | 3367 | 3275 | 3345 | 3395 |
Scenario | Meteorological Variables |
---|---|
1 | MinT-MaxT-MeanT-Sunshine Intensity-Sunshine Duration-Precipitation |
2 | MinT-MaxT-MeanT-Sunshine Intensity-Sunshine Duration |
3 | MinT-MaxT-MeanT-Sunshine Duration |
4 | MinT-MaxT-MeanT-Sunshine Intensity |
5 | Sunshine Intensity-Sunshine Duration |
6 | MinT-MaxT-MeanT |
7 | MeanT-Sunshine Duration |
8 | MeanT |
Window Value/Algorithm | 3 | 5 | 7 |
---|---|---|---|
DT | 1.3937 | 1.4219 | 1.4209 |
GBT | 2.0939 | 2.0982 | 2.1058 |
Hybrid DT–GBT | 1.8491 | 1.9546 | 1.9622 |
Value for Maximal Depth | Avg RMSE Per Scenario (°C) |
---|---|
3 | 3.8837 |
5 | 2.6878 |
7 | 2.2810 |
10 | 2.2010 |
15 | 2.3768 |
20 | 2.2789 |
No. of Trees | Max Depth | RMSE |
---|---|---|
30 | 10 | 8.0326 |
30 | 20 | 8.0313 |
30 | 30 | 8.0313 |
50 | 10 | 6.6889 |
50 | 20 | 6.6864 |
50 | 30 | 6.6864 |
100 | 10 | 4.3900 |
100 | 20 | 4.3869 |
100 | 30 | 4.3869 |
150 | 10 | 3.1269 |
150 | 20 | 3.1240 |
150 | 30 | 3.1240 |
200 | 10 | 2.4910 |
200 | 20 | 2.4892 |
200 | 30 | 2.4893 |
Scenario | Algorithm | RMSE ST5 (°C) | RMSE ST10 (°C) | RMSE ST20 (°C) |
---|---|---|---|---|
MeanT | DT | 2.0624 | 1.3795 | 0.8209 |
MeanT-Sunshine Duration | DT | 2.0454 | 1.3379 | 0.7703 |
MinT-MaxT-MeanT | DT | 2.0419 | 1.3677 | 0.7935 |
MinT-MaxT-MeanT-Sunshine Duration | DT | 2.0289 | 1.3165 | 0.7368 |
MinT-MaxT-MeanT-Sunshine Intensity | DT | 2.1041 | 1.3196 | 0.7522 |
MinT-MaxT-MeanT-Sunshine Int.-Sunshine Dur. | DT | 2.1226 | 1.3306 | 0.7481 |
MinT-MaxT-MeanT-Sunshine Intensity-Sunshine Duration-Precipitation | DT | 2.1271 | 1.3271 | 0.7479 |
Sunshine Intensity-Sunshine Duration | DT | 2.0188 | 1.3790 | 0.7694 |
MeanT | GBT | 2.6109 | 1.9734 | 1.6583 |
MeanT-Sunshine Duration | GBT | 2.6495 | 1.9783 | 1.6505 |
MinT-MaxT-MeanT | GBT | 2.6435 | 1.9674 | 1.6462 |
MinT-MaxT-MeanT-Sunshine Duration | GBT | 2.6673 | 1.9554 | 1.6389 |
MinT-MaxT-MeanT-Sunshine Intensity | GBT | 2.6686 | 1.9654 | 1.6509 |
MinT-MaxT-MeanT-Sunshine Int.-Sunshine Dur. | GBT | 2.6785 | 1.9640 | 1.6480 |
MinT-MaxT-MeanT-Sunshine Intensity-Sunshine Duration-Precipitation | GBT | 2.6803 | 1.9663 | 1.6473 |
Sunshine Intensity-Sunshine Duration | GBT | 2.6521 | 2.0109 | 1.6807 |
MeanT | Hybrid | 2.1007 | 1.4851 | 1.0770 |
MeanT-Sunshine Duration | Hybrid | 2.1440 | 1.4334 | 1.0295 |
MinT-MaxT-MeanT | Hybrid | 2.1609 | 1.4473 | 1.0445 |
MinT-MaxT-MeanT-Sunshine Duration | Hybrid | 2.1505 | 1.4351 | 1.0121 |
MinT-MaxT-MeanT-Sunshine Intensity | Hybrid | 2.1875 | 1.4432 | 1.0194 |
MinT-MaxT-MeanT-Sunshine Int.-Sunshine Dur. | Hybrid | 2.1878 | 1.4375 | 1.0157 |
MinT-MaxT-MeanT-Sunshine Intensity-Sunshine Duration-Precipitation | Hybrid | 2.2018 | 1.4366 | 1.0146 |
Sunshine Intensity-Sunshine Duration | Hybrid | 2.1315 | 1.4417 | 1.0644 |

Evaluation metrics for the lowest-RMSE input set at each soil depth, DT model:

Inputs | Output | NS | R | MAE | RMSE | KGE |
---|---|---|---|---|---|---|
Sunshine Intensity-Sunshine Duration | ST5 | 0.9669 | 0.9833 | 1.4533 | 2.0188 | 0.975 |
MinT-MaxT-MeanT-Sunshine Duration | ST10 | 0.9846 | 0.9922 | 0.9564 | 1.3165 | 0.989 |
MinT-MaxT-MeanT-Sunshine Duration | ST20 | 0.9942 | 0.9971 | 0.5171 | 0.7368 | 0.995 |

Evaluation metrics for the lowest-RMSE input set at each soil depth, GBT model:

Inputs | Output | NS | R | MAE | RMSE | KGE |
---|---|---|---|---|---|---|
MeanT | ST5 | 0.9446 | 0.9793 | 1.9144 | 2.6109 | 0.857 |
MinT-MaxT-MeanT-Sunshine Duration | ST10 | 0.9658 | 0.9915 | 1.5442 | 1.9554 | 0.861 |
MinT-MaxT-MeanT-Sunshine Duration | ST20 | 0.9713 | 0.9939 | 1.2689 | 1.6389 | 0.866 |

Evaluation metrics for the lowest-RMSE input set at each soil depth, hybrid DT–GBT model:

Inputs | Output | NS | R | MAE | RMSE | KGE |
---|---|---|---|---|---|---|
MeanT | ST5 | 0.9642 | 0.9839 | 1.5358 | 2.1007 | 0.921 |
MeanT-Sunshine Duration | ST10 | 0.9817 | 0.9934 | 1.1025 | 1.4334 | 0.922 |
MinT-MaxT-MeanT-Sunshine Duration | ST20 | 0.9890 | 0.9968 | 0.7779 | 1.0121 | 0.930 |
Methods | DT | GBT | Hybrid DT–GBT | Measured (ST5) |
---|---|---|---|---|
Best Scenario | Sunshine Intensity, Sunshine Duration | MeanT | MeanT | |
Minimum | −4.21 | −2.00 | −3.03 | −9.30 |
Maximum | 35.55 | 32.05 | 33.95 | 41.60 |
Mean | 16.29 | 16.12 | 16.29 | 14.53 |
Stdev | 10.89 | 9.54 | 10.25 | 11.12 |
Correlation | 0.9800 | 0.9923 | 0.9954 | 1.0000 |
Number of records | 965 | 965 | 965 | 3316 |
Methods | DT | GBT | Hybrid DT–GBT | Measured (ST10) |
---|---|---|---|---|
Best Scenario | MinT-MaxT-MeanT-Sunshine Duration | MinT-MaxT-MeanT-Sunshine Duration | MeanT-Sunshine Duration | |
Minimum | −5.25 | −1.43 | −2.28 | −8.40 |
Maximum | 33.83 | 30.70 | 31.95 | 33.90 |
Mean | 15.80 | 15.62 | 15.73 | 15.84 |
Stdev | 10.50 | 9.12 | 9.77 | 10.58 |
Correlation | 0.9983 | 0.9900 | 0.9900 | 1.0000 |
Number of records | 998 | 998 | 998 | 3347 |
Methods | DT | GBT | Hybrid DT–GBT | Measured (ST20) |
---|---|---|---|---|
Best Scenario | MinT-MaxT-MeanT-Sunshine Duration | MinT-MaxT-MeanT-Sunshine Duration | MinT-MaxT-MeanT-Sunshine Duration | |
Minimum | −2.95 | −0.06 | −1.47 | −4.60 |
Maximum | 31.02 | 28.53 | 29.78 | 31.20 |
Mean | 15.739 | 15.56 | 15.68 | 15.79 |
Stdev | 9.6425 | 8.38 | 9.00 | 9.66 |
Correlation | 0.9994 | 0.9933 | 0.9974 | 1.0000 |
Number of records | 998 | 998 | 998 | 3347 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Sattari, M.T.; Avram, A.; Apaydin, H.; Matei, O. Soil Temperature Estimation with Meteorological Parameters by Using Tree-Based Hybrid Data Mining Models. Mathematics 2020, 8, 1407. https://doi.org/10.3390/math8091407