A Comparative Study of Landslide Susceptibility Mapping Using Bagging PU Learning in Class-Prior Probability Shift Datasets
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Landslide Inventories and Influence Factors
2.3. Methodology
2.3.1. Landslide Susceptibility Mapping Models
2.3.2. Measure the Uncertainty of the Predicted Value
3. Results
3.1. Classification Performance of Models
3.2. Uncertainty in the Predicted Susceptibility Values
3.3. Generalization Prediction of Models
4. Discussion
4.1. Classification Ability of Models
4.2. Generalization Ability of Models
5. Conclusions
- The positive-to-negative sample ratio strongly affects the classification performance of LSM models. Models trained on unbalanced datasets exhibit better overall binary classification performance, whereas those trained on balanced datasets achieve higher landslide recall. This highlights the importance of the trade-off between precision and recall in LSM modeling, which sets it apart from typical binary classification problems.
- The positive-to-negative sample ratio significantly affects the mapping results. Models trained on unbalanced datasets tend to predict negative samples, lowering the overall landslide susceptibility probability in the region. Conversely, balanced datasets yield mapping results that are more reasonable for prevention and control planning.
- Applying Bagging PU Learning to classifiers has the potential to boost recall under class-prior probability shift, thereby improving the overall generalization performance of the model. The method can also reduce the uncertainty of model predictions in high-susceptibility areas. In this study, the BaggingPU-GDBT model showed the best performance.
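The bagging idea behind the BaggingPU models in the conclusions above can be illustrated with a minimal sketch. It is not the authors' implementation: the base learner here is a simple nearest-centroid scorer standing in for SVM or GDBT, and the names `fit_scorer` and `bagging_pu_scores`, the 1:1 draw size, and the number of rounds are all assumptions made for illustration. In each round, a bootstrap sample of unlabeled units is treated as pseudo-negative, a scorer is fitted against the known landslide units, and every unlabeled unit left out of that sample receives an out-of-bag score; the scores are then averaged.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_scorer(pos, neg):
    """Stand-in base learner (assumption): score in [0, 1], higher means
    closer to the positive centroid than to the pseudo-negative centroid."""
    cp, cn = centroid(pos), centroid(neg)

    def score(x):
        dp, dn = sq_dist(x, cp), sq_dist(x, cn)
        return dn / (dp + dn) if dp + dn > 0 else 0.5

    return score


def bagging_pu_scores(positives, unlabeled, n_rounds=50, seed=0):
    """Average out-of-bag score for every unlabeled sample."""
    rng = random.Random(seed)
    k = len(positives)  # assumed 1:1 positive-to-pseudo-negative draw per round
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        # Bootstrap (with replacement) a pseudo-negative set from the unlabeled pool.
        drawn = [rng.randrange(len(unlabeled)) for _ in range(k)]
        in_bag = set(drawn)
        score = fit_scorer(positives, [unlabeled[i] for i in drawn])
        # Only samples that were not drawn this round are scored.
        for i, x in enumerate(unlabeled):
            if i not in in_bag:
                totals[i] += score(x)
                counts[i] += 1
    # A sample that was never out-of-bag gets None instead of a score.
    return [t / c if c else None for t, c in zip(totals, counts)]
```

Unlabeled units that resemble the known landslides keep a high average score even though they were sometimes used as pseudo-negatives, which is one way to read the recall gain described above.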
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Category | Factors | Data Source |
---|---|---|
Topography | Elevation, aspect, slope, profile curvature, plan curvature, terrain surface texture, relative slope position, topographic wetness index (TWI), topographic roughness index (TRI), valley depth | ASTER GDEM data of the Geospatial Data Cloud (http://www.gscloud.cn/, accessed on 1 March 2023) |
Geology | Lithology, strata, distance from faults, slope structure | 1:200,000 scale geological map (http://dcc.ngac.org.cn/, accessed on 1 March 2023) |
Environment | NDVI, land use, distance from rivers, magnitude, annual average rainfall | Land use was extracted from GlobeLand30 (http://globeland30.org/, accessed on 1 March 2023). The NDVI was calculated in the Google Earth Engine platform. Rivers were derived from the 1:100,000 basic geographic database of China national catalogue service for geographic information. Magnitude and rainfall were provided by Hubei Geological Environment Station. |
Human activity | Distance from roads, POI kernel density | 1:100,000 basic geographic database of China national catalogue service for geographic information |
Positive and Unlabeled Sample Ratio | Model | SEM (Very Low) | SEM (Low) | SEM (Moderate) | SEM (High) | SEM (Very High) |
---|---|---|---|---|---|---|
1:1 | LR | 0.00211 | 0.00207 | 0.00463 | 0.00362 | 0.0016 |
1:1 | SVM | 0.00208 | 0.0021 | 0.00465 | 0.00427 | 0.00158 |
1:1 | BaggingPU-SVM | 0.00244 | 0.00209 | 0.00417 | 0.00366 | 0.00155 |
1:1 | RF | 0.00216 | 0.00207 | 0.00443 | 0.00477 | 0.00142 |
1:1 | GDBT | 0.00198 | 0.00207 | 0.0049 | 0.00529 | 0.0014 |
1:1 | BaggingPU-GDBT | 0.00186 | 0.00205 | 0.00491 | 0.00503 | 0.00136 |
1:5 | LR | 0.0006 | 0.00395 | 0.00212 | 0.00303 | 0.00411 |
1:5 | SVM | 0.00058 | 0.00423 | 0.0026 | 0.00356 | 0.00322 |
1:5 | BaggingPU-SVM | 0.00058 | 0.00367 | 0.00297 | 0.00334 | 0.00287 |
1:5 | RF | 0.00057 | 0.00425 | 0.00299 | 0.00335 | 0.00248 |
1:5 | GDBT | 0.00055 | 0.00479 | 0.00303 | 0.00337 | 0.00227 |
1:5 | BaggingPU-GDBT | 0.00055 | 0.0051 | 0.00309 | 0.00335 | 0.00212 |
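As a rough illustration of how per-class SEM values like those in the table above could be obtained, the sketch below assumes that SEM denotes the standard error of the mean (sample standard deviation divided by the square root of the sample size) of the predicted susceptibility values within each class. The extracted text does not spell out the computation, so this is an assumption, as are the function names and the class breaks, which are borrowed from the susceptibility intervals used elsewhere in the article.

```python
import bisect
import math
import statistics

# Assumed class breaks (hypothetical mapping of intervals to the five classes).
BREAKS = [0.20, 0.45, 0.55, 0.80]
LABELS = ["Very Low", "Low", "Moderate", "High", "Very High"]


def standard_error_of_mean(values):
    """Sample standard deviation divided by sqrt(n); needs at least two values."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    return statistics.stdev(values) / math.sqrt(len(values))


def sem_by_class(predictions):
    """Group predicted susceptibility values by class and compute SEM per class.

    Classes with fewer than two predictions are reported as None.
    """
    groups = {label: [] for label in LABELS}
    for p in predictions:
        # bisect_right places a value equal to a break into the upper class.
        groups[LABELS[bisect.bisect_right(BREAKS, p)]].append(p)
    return {
        label: standard_error_of_mean(vals) if len(vals) >= 2 else None
        for label, vals in groups.items()
    }
```

Under this reading, a smaller SEM in a class means the predicted values in that class are more tightly clustered around their mean.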
Model | Susceptibility | Unit Proportion | Landslide Unit Proportion | Frequency Ratio |
---|---|---|---|---|
Region B based BaggingPU-GDBT | 0.00–0.20 | 52.42% | 0.37% | 0.007 |
Region B based BaggingPU-GDBT | 0.20–0.45 | 23.44% | 4.01% | 0.171 |
Region B based BaggingPU-GDBT | 0.45–0.55 | 6.01% | 3.46% | 0.576 |
Region B based BaggingPU-GDBT | 0.55–0.80 | 10.64% | 13.10% | 1.231 |
Region B based BaggingPU-GDBT | 0.80–1.00 | 7.49% | 79.06% | 10.557 |
Region A based BaggingPU-GDBT | 0.00–0.20 | 42.51% | 6.87% | 0.162 |
Region A based BaggingPU-GDBT | 0.20–0.45 | 34.32% | 19.44% | 0.566 |
Region A based BaggingPU-GDBT | 0.45–0.55 | 6.66% | 7.81% | 1.173 |
Region A based BaggingPU-GDBT | 0.55–0.80 | 14.86% | 59.40% | 3.997 |
Region A based BaggingPU-GDBT | 0.80–1.00 | 1.65% | 6.48% | 3.917 |
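The frequency ratio column in the table above is consistent with dividing the landslide unit proportion of a susceptibility interval by the proportion of all units falling in that interval. The short sketch below recomputes it for a few Region B rows; the function name is hypothetical, and because the inputs are the rounded percentages printed in the table, recomputed values can differ from the published ones in the last digits.

```python
def frequency_ratio(unit_pct, landslide_unit_pct):
    """Landslide unit proportion divided by unit proportion (both in percent)."""
    if unit_pct <= 0:
        raise ValueError("unit proportion must be positive")
    return landslide_unit_pct / unit_pct


# Rows taken from the Region B based BaggingPU-GDBT part of the table:
# (susceptibility interval, unit proportion %, landslide unit proportion %)
region_b_rows = [
    ("0.00-0.20", 52.42, 0.37),
    ("0.20-0.45", 23.44, 4.01),
    ("0.45-0.55", 6.01, 3.46),
    ("0.55-0.80", 10.64, 13.10),
]

for interval, unit_pct, landslide_pct in region_b_rows:
    print(interval, round(frequency_ratio(unit_pct, landslide_pct), 3))
```

A ratio above 1 means landslides are over-represented in that interval relative to its share of the mapped area, which is the pattern expected in the high-susceptibility classes.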
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, L.; Ma, H.; Dong, J.; Wu, X.; Xu, H.; Niu, R. A Comparative Study of Landslide Susceptibility Mapping Using Bagging PU Learning in Class-Prior Probability Shift Datasets. Remote Sens. 2023, 15, 5547. https://doi.org/10.3390/rs15235547