Robustness of Optimized Decision Tree-Based Machine Learning Models to Map Gully Erosion Vulnerability
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Methodology
2.2.1. Gully Erosion Inventory Mapping
2.2.2. Dataset Preparation for Spatial Modelling
2.2.3. Multicollinearity Analysis
2.2.4. Decision Tree-Based Approaches
Random Forest (RF)
C5.0
Adaboost
Treebag
Gradient Boosting Machine (GBM)
Extreme Gradient Boosting (XGBoost)
2.2.5. Models’ Optimization
2.2.6. Validation and Accuracy Assessment
2.2.7. Variable Importance Analysis
3. Results
3.1. Preliminary Data Analysis
3.2. Spatial Relationship between Gully Locations and Effective Factors
3.3. Variable Importance Analysis
3.4. Gully Erosion Susceptibility Mapping
3.5. Model Accuracy and Validation Results
4. Discussion
4.1. Accuracy Assessment and Comparison
4.2. Geoenvironmental Variable Importance Analysis
4.3. Gully Erosion Vulnerability Maps
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Poesen, J.; Nachtergaele, J.; Verstraeten, G.; Valentin, C. Gully erosion and environmental change: Importance and research needs. Catena 2003, 50, 91–133.
- Roy, P.; Chandra Pal, S.; Arabameri, A.; Chakrabortty, R.; Pradhan, B.; Chowdhuri, I.; Tien Bui, D. Novel ensemble of multivariate adaptive regression spline with spatial logistic regression and boosted regression tree for gully erosion susceptibility. Remote Sens. 2020, 12, 3284.
- Li, Z.; Fang, H. Impacts of climate change on water erosion: A review. Earth-Sci. Rev. 2016, 163, 94–117.
- Zabihi, M.; Pourghasemi, H.R.; Motevalli, A.; Zakeri, M.A. Gully erosion modeling using GIS-based data mining techniques in Northern Iran: A comparison between boosted regression tree and multivariate adaptive regression spline. In Natural Hazards GIS-Based Spatial Modeling Using Data Mining Techniques; Springer: Cham, Switzerland, 2019; pp. 1–26.
- Gupta, G.S. Land degradation and challenges of food security. Rev. Eur. Stud. 2019, 11, 63.
- Borrelli, P.; Robinson, D.A.; Panagos, P.; Lugato, E.; Yang, J.E.; Alewell, C.; Wuepper, D.; Montanarella, L.; Ballabio, C. Land use and climate change impacts on global soil erosion by water (2015–2070). Proc. Natl. Acad. Sci. USA 2020, 117, 21994–22001.
- FAO. Global Soil Status, Processes and Trends. Status of the World’s Soil Resources (SWSR) – Main Report of the Food and Agriculture Organization; FAO: New York, NY, USA, 2015.
- Acharki, S.; El Qorchi, F.; Arjdal, Y.; Amharref, M.; Bernoussi, A.S.; Aissa, H.B. Soil erosion assessment in Northwestern Morocco. Remote Sens. Appl. Soc. Environ. 2022, 25, 100663.
- Markhi, A.; Laftouhi, N.; Grusson, Y.; Soulaimani, A. Assessment of potential soil erosion and sediment yield in the semi-arid N'fis basin (High Atlas, Morocco) using the SWAT model. Acta Geophys. 2019, 67, 263–272.
- Micheletti, N.; Foresti, L.; Robert, S.; Leuenberger, M.; Pedrazzini, A.; Jaboyedoff, M.; Kanevski, M. Machine learning feature selection methods for landslide susceptibility mapping. Math. Geosci. 2013, 46, 33–57.
- Smith, S.J.; Williams, J.R.; Menzel, R.G.; Coleman, G.A. Prediction of sediment yield from southern plains grasslands with the modified universal soil loss equation. J. Range Manag. 1984, 37, 295–297.
- Renard, K.G.; Foster, G.R.; Weesies, G.A.; Porter, J.P. RUSLE, revised universal soil loss equation. J. Soil Water Conserv. 1991, 46, 30–33.
- Flanagan, D.C.; Nearing, M.A. USDA-Water Erosion Prediction Project: Hill Slope and Watershed Model Documentation. NSERI Report No. 10; USDA-ARS National Soil Erosion Research Laboratory: West Lafayette, IN, USA, 1995.
- Wischmeier, W.H.; Smith, D.D. Predicting Rainfall Erosion Losses: A Guide to Conservation Planning. Agriculture Handbook. 282; USDA-ARS: Beltsville, MD, USA, 1978.
- Williams, J.R.; Jones, C.A.; Dyke, P.T. The EPIC Model. United States Department of Agriculture (USDA) Technical Bulletin No. 1768; United States Department of Agriculture: Washington, DC, USA, 1990.
- Gayen, A.; Saha, S. Application of weights-of-evidence (WoE) and evidential belief function (EBF) models for the delineation of soil erosion vulnerable zones: A study on Pathro river basin, Jharkhand, India. Model. Earth Syst. Environ. 2017, 3, 1123–1139.
- Alewell, C.; Borrelli, P.; Meusburger, K.; Panagos, P. Using the USLE: Chances, challenges and limitations of soil erosion modelling. Int. Soil Water Conserv. Res. 2019, 7, 203–225.
- Luca, F.; Conforti, M.; Robustelli, G. Comparison of GIS-based gullying susceptibility mapping using bivariate and multivariate statistics: Northern Calabria, South Italy. Geomorphology 2011, 134, 297–308.
- Svoray, T.; Michailov, E.; Cohen, A.; Rokah, L.; Sturm, A. Predicting gully initiation: Comparing data mining techniques, analytical hierarchy processes and the topographic threshold. Earth Surf. Process. Landf. 2012, 37, 607–619.
- Conoscenti, C.; Angileri, S.; Cappadonia, C.; Rotigliano, E.; Agnesi, V.; Marker, M. Gully erosion susceptibility assessment by means of GIS-based logistic regression: A case of Sicily (Italy). Geomorphology 2014, 204, 399–411.
- Dube, F.; Nhapi, I.; Murwira, A.; Gumindoga, W.; Goldin, J.; Mashauri, D.A. Potential of weight of evidence modelling for gully erosion hazard assessment in Mbire District – Zimbabwe. Phys. Chem. Earth 2014, 67, 145–152.
- Zakerinejad, R.; Maerker, M. An integrated assessment of soil erosion dynamics with special emphasis on gully erosion in the Mazayjan basin, southwestern Iran. Nat. Hazards 2015, 79, 25–50.
- Du Plessis, C.; Van Zijl, G.; Van Tol, J.; Manyevere, A. Machine learning digital soil mapping to inform gully erosion mitigation measures in the Eastern Cape, South Africa. Geoderma 2020, 368, 114287.
- Zhao, X.; Chen, W. GIS-based evaluation of landslide susceptibility models using certainty factors and functional trees-based ensemble techniques. Appl. Sci. 2020, 10, 16.
- Sahour, H.; Gholami, V.; Vazifedan, M. A comparative analysis of statistical and machine learning techniques for mapping the spatial distribution of groundwater salinity in a coastal aquifer. J. Hydrol. 2020, 591, 125321.
- Marjanović, M.; Kovačević, M.; Bajat, B.; Voženílek, V. Landslide susceptibility assessment using SVM machine learning algorithm. Eng. Geol. 2011, 123, 225–234.
- Chen, W.; Lei, X.; Chakrabortty, R.; Pal, S.C.; Sahana, M.; Janizadeh, S. Evaluation of different boosting ensemble machine learning models and novel deep learning and boosting framework for head-cut gully erosion susceptibility. J. Environ. Manag. 2021, 284, 112015.
- Alaboz, P.; Dengiz, O.; Demir, S.; Şenol, H. Digital mapping of soil erodibility factors based on decision tree using geostatistical approaches in terrestrial ecosystem. Catena 2021, 207, 105634.
- Pal, S.C.; Chakrabortty, R.; Arabameri, A.; Santosh, M.; Saha, A.; Chowdhuri, I.; Roy, P.; Shit, M. Chemical weathering and gully erosion causing land degradation in a complex river basin of Eastern India: An integrated field, analytical and artificial intelligence approach. Nat. Hazards 2022, 110, 847–879.
- Saha, S.; Roy, J.; Arabameri, A.; Blaschke, T.; Tien Bui, D. Machine Learning-Based Gully Erosion Susceptibility Mapping: A Case Study of Eastern India. Sensors 2020, 20, 1313.
- Pourghasemi, H.R.; Sadhasivam, N.; Kariminejad, N.; Collins, A.L. Gully erosion spatial modelling: Role of machine learning algorithms in selection of the best controlling factors and modelling process. Geosci. Front. 2020, 11, 2207–2219.
- Tiwari, A.; Arun, G.; Vishwakarma, B.D. Parameter importance assessment improves efficacy of machine learning methods for predicting snow avalanche sites in Leh-Manali Highway, India. Sci. Total Environ. 2021, 794, 148738.
- Rahmati, O.; Tahmasebipour, N.; Haghizadeh, A.; Pourghasemi, H.R.; Feizizadeh, B. Evaluation of different machine learning models for predicting and mapping the susceptibility of gully erosion. Geomorphology 2017, 298, 118–137.
- Conforti, M.; Aucelli, P.; Robustelli, G.; Scarciglia, F. Geomorphology and GIS analysis for mapping gully erosion susceptibility in the Turbolo Stream catchment (Northern Calabria, Italy). Nat. Hazards 2011, 56, 881–898.
- Sharma, M.; Garg, R.D.; Badenko, V.; Fedotov, A.; Min, L.; Yao, A. Potential of airborne LiDAR data for terrain parameters extraction. Quat. Int. 2021, 575, 317–327.
- Holloway, J.; Rudy, A.; Lamoureux, S.; Treitz, P. Determining the terrain characteristics related to the surface expression of subsurface water pressurization in permafrost landscapes using susceptibility modelling. Cryosphere 2017, 11, 1403–1415.
- Gutiérrez, Á.G.; Schnabel, S.; Contador, F.L. Gully erosion, land use and topographical thresholds during the last 60 years in a small rangeland catchment in SW Spain. Land Degrad. Dev. 2009, 20, 535–550.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Ravì, D.; Bober, M.; Farinella, G.M.; Guarnera, M.; Battiato, S. Semantic segmentation of images exploiting DCT based features and random forest. Pattern Recognit. 2016, 52, 260–273.
- Zhang, G.; Cai, Y.; Zheng, Z.; Zhen, J.; Liu, Y.; Huang, K. Integration of the Statistical Index Method and the Analytic Hierarchy Process technique for the assessment of landslide susceptibility in Huizhou, China. Catena 2016, 142, 233–244.
- Pandya, R.; Pandya, J. C5.0 algorithm to improved decision tree with feature selection and reduced error pruning. Int. J. Comput. Appl. 2015, 117, 18–21.
- Putra, F.; Sitanggang, I. Classification model of air quality in Jakarta using decision tree algorithm based on air pollutant standard index. IOP Conf. Ser. Earth Environ. Sci. 2020, 528, 012053.
- Pham, B.T.; Nguyen, M.D.; Nguyen-Thoi, T.; Ho, L.S.; Koopialipoor, M.; Kim Quoc, N.; Armaghani, D.J.; Le, H.V. A novel approach for classification of soils based on laboratory tests using Adaboost, Tree and ANN modeling. Transp. Geotech. 2021, 27, 100508.
- Freund, Y.; Schapire, R.E. Experiments with a new boosting algorithm. ICML 1996, 96, 148–156.
- West, D.; Dellana, S.; Qian, J. Neural network ensemble strategies for financial decision applications. Comput. Oper. Res. Appl. Neural Netw. 2005, 32, 2543–2559.
- Wang, S.; Mathew, A.; Chen, Y.; Xi, L.; Ma, L.; Lee, J. Empirical analysis of support vector machine ensemble classifiers. Expert Syst. Appl. 2009, 36, 6466–6476.
- Hong, H.; Liu, J.; Bui, D.T.; Pradhan, B.; Acharya, T.D.; Pham, B.T.; Zhu, A.-X.; Chen, W.; Ahmad, B.B. Landslide susceptibility mapping using J48 Decision Tree with AdaBoost, Bagging and Rotation Forest ensembles in the Guangchang area (China). Catena 2018, 163, 399–413.
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
- Chan, J.C.-W.; Paelinckx, D. Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sens. Environ. 2008, 112, 2999–3011.
- Banfield, R.E. Learning on Complex Simulations; University of South Florida: Tampa, FL, USA, 2007.
- Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. Available online: https://www.jstor.org/stable/2699986 (accessed on 10 May 2023).
- Sahin, E.K. Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl. Sci. 2020, 2, 1308.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA; pp. 785–794.
- Ramezan, C.A.; Warner, T.A.; Maxwell, A.E. Evaluation of sampling and cross-validation tuning strategies for regional-scale machine learning classification. Remote Sens. 2019, 11, 185.
- Breiman, L.; Cutler, A. A deterministic algorithm for global optimization. Math. Program. 1993, 58, 179–199.
- Lee, S.; Pradhan, B. Landslide hazard mapping at Selangor, Malaysia using frequency ratio and logistic regression models. Landslides 2006, 4, 33–41.
- Guo, Z.; Shi, Y.; Huang, F.; Fan, X.; Huang, J. Landslide susceptibility zonation method based on C5.0 decision tree and K-means cluster algorithms to improve the efficiency of risk management. Geosci. Front. 2021, 12, 101249.
- Masselink, R.H.; Temme, A.J.A.M.; Giménez Díaz, R.; Casalí Sarasíbar, J.; Keesstra, S.D. Assessing hillslope-channel connectivity in an agricultural catchment using rare-earth oxide tracers and random forests models. Cuad. Investig. Geográfica 2017, 43, 19–39.
- Tehrany, M.S.; Pradhan, B.; Mansor, S.; Ahmad, N. Flood susceptibility assessment using GIS-based support vector machine model with different kernel types. Catena 2015, 125, 91–101.
- Pham, B.T.; Prakash, I.; Singh, S.K.; Shirzadi, A.; Shahabi, H.; Tran, T.-T.T.; Tien Bui, D. Landslide susceptibility modeling using reduced error pruning trees and different ensemble techniques: Hybrid machine learning approaches. Catena 2019, 175, 203–218.
- Romer, C.; Ferentinou, M. Shallow landslide susceptibility assessment in a semiarid environment – A quaternary catchment of KwaZulu-Natal, South Africa. Eng. Geol. 2016, 201, 29–44.
- Arabameri, A.; Tiefenbacher, J.P.; Blaschke, T.; Pradhan, B.; Tien Bui, D. Morphometric analysis for soil erosion susceptibility mapping using novel GIS-based ensemble model. Remote Sens. 2020, 12, 874.
- Bouzekraoui, H.; El Khalki, Y.; Mouaddine, A.; Lhissou, R.; El Youssi, M.; Barakat, A. Characterization and dynamics of agroforestry landscape using geospatial techniques and field survey: A case study in central High-Atlas (Morocco). Agrofor. Syst. 2016, 90, 965–978.
- Azareh, A.; Rahmati, O.; Rafiei-Sardooi, E.; Sankey, J.B.; Lee, S.; Shahabi, H.; Ahmad, B.B. Modelling gully-erosion susceptibility in a semi-arid region, Iran: Investigation of applicability of certainty factor and maximum entropy models. Sci. Total Environ. 2019, 655, 684–696.
- Nazari Samani, A.; Ahmadi, H.; Jafari, M.; Ghoddousi, J. Geomorphic threshold conditions for gully erosion in Southwestern Iran (Boushehr-Samal watershed). J. Asian Earth Sci. 2009, 35, 180–189.
- Bochet, E.; García-Fayos, P. Factors controlling vegetation establishment and water erosion on motorway slopes in Valencia, Spain. Restor. Ecol. 2004, 12, 166–174.
- Wang, L.; Wei, S.; Horton, R.; Shao, M.A. Effects of vegetation and slope aspect on water budget in the hill and gully region of the Loess Plateau of China. Catena 2011, 87, 90–100.
- Beullens, J.; Van de Velde, D.; Nyssen, J. Impact of slope aspect on hydrological rainfall and on the magnitude of rill erosion in Belgium and northern France. Catena 2014, 114, 129–139.
- Luo, W.; Liu, C.C. Innovative landslide susceptibility mapping supported by geomorphon and geographical detector methods. Landslides 2018, 15, 465–474.
- Barakat, A.; Rafai, M.; Mosaid, H.; Islam, M.S.; Saeed, S. Mapping of Water-Induced Soil Erosion Using Machine Learning Models: A Case Study of Oum Er Rbia Basin (Morocco). Earth Syst. Environ. 2022, 7, 151–170.
- Meliho, M.; Khattabi, A.; Mhammdi, N. A GIS-based approach for gully erosion susceptibility modelling using bivariate statistics methods in the Ourika watershed, Morocco. Environ. Earth Sci. 2018, 77, 655.
Class | Description |
---|---|
1 | Silurian: Graptolitic shales |
2 | Stephano-Triassic: Sandstones and red conglomerates |
3 | Permian-Triassic: Basalt |
4 | Lower Lias: Limestones and red clays |
5 | Lower Lias: Limestones and marls |
6 | Middle Lias: Limestones |
7 | Upper Lias: Conglomerates, sandstones, and clays |
8 | Dogger: Marls and limestones |
9 | Quaternary: Alluvial and Rockfall |
Factors | Data Layers | Data Source |
---|---|---|
Topographic factors | Elevation; Slope (°); Stream Power Index (SPI); Topographic Position Index (TPI); Slope Length (LS); Aspect; Curvature; Topographic Wetness Index (TWI); Topographic Roughness Index (TRI) | SRTM DEM (Digital Elevation Model) downloaded from the website of the United States Geological Survey (USGS) (http://gdex.cr.usgs.gov/gdex/ (accessed on 2 August 2022)); pixel size of 30 m × 30 m. |
Hydrological factors | Distance To Rivers; Drainage Density | |
Geomorphological factors | Valley Depth; Geomorphons | |
Geological factors | Lithology | Geologic map of Ouaouizghte-Dades, 1/200,000 (Bourcart et al., 1942) [32]; Geologic map of Demnate-Telouate, 1/200,000 (Termier, 1941) [32] |
Climatic factors | Rainfall (mm) | TRMM data |
Land cover factors | Normalized Difference Vegetation Index (NDVI); Land Use-Land Cover (LULC) | Landsat-8 OLI/TIRS satellite image |
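
Several of the data layers listed above are derived indices rather than raw measurements. As a reading aid, the widely used textbook definitions of three of them are given below. These are assumptions for illustration: this excerpt does not state which formulation the authors applied. The symbols are also introduced here and do not come from the article: $A_s$ is the specific catchment area, $\beta$ is the local slope angle, and $\rho_{\mathrm{NIR}}$ and $\rho_{\mathrm{Red}}$ are reflectances in the near-infrared and red bands.

```latex
\begin{align}
\mathrm{TWI}  &= \ln\!\left(\frac{A_s}{\tan\beta}\right) \\
\mathrm{SPI}  &= A_s \tan\beta \\
\mathrm{NDVI} &= \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}}
\end{align}
```

Under these definitions, TWI grows where a large contributing area meets a gentle slope, SPI grows where a large contributing area meets a steep slope, and NDVI ranges from -1 to 1.
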
Factors | VIF | TOL |
---|---|---|
Elevation | 1.264 | 0.791 |
Aspect | 1.118 | 0.895 |
Curvature | 1.069 | 0.936 |
Slope | 2.805 | 0.356 |
SPI | 2.461 | 0.406 |
TWI | 1.075 | 0.930 |
Drainage Density | 1.214 | 0.824 |
Distance To Rivers | 1.088 | 0.919 |
Lithology | 1.366 | 0.732 |
Rainfall | 1.406 | 0.711 |
NDVI | 1.537 | 0.650 |
LULC | 1.464 | 0.683 |
Valley Depth | 1.207 | 0.828 |
TPI | 1.217 | 0.822 |
TRI | 3.301 | 0.303 |
LS | 3.396 | 0.295 |
Geomorphons | 1.494 | 0.669 |
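
The two columns of the multicollinearity table are linked by a simple identity: tolerance is the reciprocal of the variance inflation factor (TOL = 1/VIF). The short Python sketch below transcribes the table and checks that identity, then screens the factors against cutoffs. The cutoffs (VIF above 5 or TOL below 0.2) are common rules of thumb assumed here for illustration, not values taken from the article, and the function names are hypothetical.

```python
# VIF and TOL values transcribed from the multicollinearity table above.
VIF_TOL = {
    "Elevation": (1.264, 0.791),
    "Aspect": (1.118, 0.895),
    "Curvature": (1.069, 0.936),
    "Slope": (2.805, 0.356),
    "SPI": (2.461, 0.406),
    "TWI": (1.075, 0.930),
    "Drainage Density": (1.214, 0.824),
    "Distance To Rivers": (1.088, 0.919),
    "Lithology": (1.366, 0.732),
    "Rainfall": (1.406, 0.711),
    "NDVI": (1.537, 0.650),
    "LULC": (1.464, 0.683),
    "Valley Depth": (1.207, 0.828),
    "TPI": (1.217, 0.822),
    "TRI": (3.301, 0.303),
    "LS": (3.396, 0.295),
    "Geomorphons": (1.494, 0.669),
}

# Assumed screening cutoffs (rules of thumb, not stated in the article).
VIF_MAX = 5.0
TOL_MIN = 0.2


def flag_collinear(table, vif_max=VIF_MAX, tol_min=TOL_MIN):
    """Return the factors whose VIF or TOL breaches the chosen cutoffs."""
    return [
        name
        for name, (vif, tol) in table.items()
        if vif > vif_max or tol < tol_min
    ]


def max_reciprocal_gap(table):
    """Largest absolute difference between 1/VIF and the reported TOL."""
    return max(abs(1.0 / vif - tol) for vif, tol in table.values())


def highest_vif(table):
    """Name of the factor with the largest VIF."""
    return max(table, key=lambda name: table[name][0])


if __name__ == "__main__":
    print("Flagged factors:", flag_collinear(VIF_TOL))
    print("Highest VIF:", highest_vif(VIF_TOL))
    print("Max |1/VIF - TOL|:", round(max_reciprocal_gap(VIF_TOL), 4))
```

With the assumed cutoffs no factor is flagged, and the reported TOL values match 1/VIF to within the three-decimal rounding of the table. Stricter cutoffs can be passed as arguments to `flag_collinear` to see which factors would be affected first.
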
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Eloudi, H.; Hssaisoune, M.; Reddad, H.; Namous, M.; Ismaili, M.; Krimissa, S.; Ouayah, M.; Bouchaou, L. Robustness of Optimized Decision Tree-Based Machine Learning Models to Map Gully Erosion Vulnerability. Soil Syst. 2023, 7, 50. https://doi.org/10.3390/soilsystems7020050