CatBoost–Bayesian Hybrid Model Adaptively Coupled with Modified Theoretical Equations for Estimating the Undrained Shear Strength of Clay
Abstract
1. Introduction
2. Materials and Methods
2.1. CatBoost Algorithm
- 1.
- CatBoost handles categorical features better than conventional GBDT implementations. The simplest way to encode a categorical feature is to replace each category with the average label value of the samples that belong to it; in the decision tree, this label average is then used as the criterion for node splitting. This method is known as greedy target-based statistics (greedy TS). However, greedy TS has obvious drawbacks, notably that each sample's own label leaks into its encoded feature, so it is improved by adding a prior term that reduces the effect of noise and low-frequency categories on the data distribution [14,23].
- 2.
- Prediction shift is caused by biased gradient estimates. To overcome this problem, CatBoost introduces an algorithm called ordered boosting (Algorithm 1).
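The target statistics described in the first item can be sketched as follows. This is a minimal illustration rather than CatBoost's internal implementation: the function names, the toy data, the prior weight `a`, and the use of the global label mean as the prior `p` are all assumptions made for the example.

```python
from collections import defaultdict


def _category_sums(categories, labels):
    """Accumulate the label sum and sample count of every category."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(categories, labels):
        sums[c] += y
        counts[c] += 1
    return sums, counts


def greedy_ts(categories, labels):
    """Greedy TS: replace each category by the mean label of that category."""
    sums, counts = _category_sums(categories, labels)
    return [sums[c] / counts[c] for c in categories]


def smoothed_ts(categories, labels, a=1.0):
    """Greedy TS with a prior term: (sum_y + a * p) / (count + a)."""
    p = sum(labels) / len(labels)  # assumed prior: global mean label
    sums, counts = _category_sums(categories, labels)
    return [(sums[c] + a * p) / (counts[c] + a) for c in categories]


# Hypothetical toy data: one categorical feature and a numeric label.
soil = ["clay", "clay", "silt", "clay", "peat"]
y = [1.0, 0.0, 1.0, 1.0, 0.0]

plain = greedy_ts(soil, y)
smooth = smoothed_ts(soil, y, a=1.0)
for c, g, s in zip(soil, plain, smooth):
    print(f"{c:5s} greedy={g:.3f} with_prior={s:.3f}")
```

In this toy set, the single-sample categories `silt` and `peat` are encoded by greedy TS as exactly their own labels, while the prior term pulls both towards the global mean, which is the noise-damping effect for low-frequency categories mentioned above.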
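Algorithm 1: Ordered boosting pseudo-code
input: $\{(x_k, y_k)\}_{k=1}^{n}$, $I$
$\sigma \leftarrow$ random permutation of $[1, n]$
$M_i \leftarrow 0$ for $i = 1, \dots, n$
for $t \leftarrow 1$ to $I$ do
  for $i \leftarrow 1$ to $n$ do
    $r_i \leftarrow y_i - M_{\sigma(i)-1}(x_i)$
  for $i \leftarrow 1$ to $n$ do
    $\Delta M \leftarrow$ LearnModel$\big((x_j, r_j) : \sigma(j) \le i\big)$
    $M_i \leftarrow M_i + \Delta M$
return $M_n$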
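The key idea of ordered boosting is that the residual of each sample is computed with a model that was trained only on the samples preceding it in a random permutation, so the sample's own label never influences its residual. The sketch below illustrates only this residual step, not the full boosting loop. As a simplifying assumption, each prefix model is just the mean label of the preceding samples (so the features are ignored), and the function name and toy labels are hypothetical.

```python
import random


def ordered_residuals(ys, seed=0):
    """Residual of each sample against a model fitted on its predecessors.

    The "model" for a sample at position k+1 of the permutation is the mean
    label of the k samples before it; the empty model predicts 0.
    """
    n = len(ys)
    order = list(range(n))
    random.Random(seed).shuffle(order)  # order[k] = sample at position k + 1
    residuals = [0.0] * n
    running_sum = 0.0
    for k, i in enumerate(order):  # exactly k samples precede sample i
        prefix_prediction = running_sum / k if k > 0 else 0.0
        residuals[i] = ys[i] - prefix_prediction
        running_sum += ys[i]  # sample i becomes visible only to later samples
    return order, residuals


labels = [3.0, 1.0, 4.0, 1.0, 5.0]
order, res = ordered_residuals(labels, seed=0)
print("permutation:", order)
print("ordered residuals:", res)
```

Because a sample's prediction never depends on its own label, changing that label shifts its residual by exactly the same amount, which is the property that removes the bias behind the prediction shift. Maintaining one model per prefix is what makes the full algorithm expensive for large sample counts.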
2.2. Bayesian Optimization Algorithm (SMAC) and k-Fold Cross-Validation
2.3. Theoretical Equation
2.4. Bayesian Perspective of Unified Undrained Shear Strength Equation
2.4.1. Prior Distribution
2.4.2. Likelihood Function
2.4.3. Uncertainty Analysis of the Unified Constitutive Model
2.5. Quantitative Evaluation Indicators
3. Results and Discussion
3.1. Properties of Clays and the Database
3.2. Feature Importance of CatBoost–Bayesian Hybrid Model
3.3. Estimation of Clay Undrained Shear Strength
3.3.1. Uncertainty Analysis of Equation Parameters
3.3.2. Verification of the Feasibility of the Theoretical Equation
3.3.3. Comparative Analysis of Estimation Results
4. Conclusions
- 1.
- From the feature importance ranking of the CatBoost–Bayesian hybrid model, parameters with high importance and ease of measurement were selected; the overconsolidation ratio and the effective overburden pressure could reasonably explain the model and indirectly estimate the undrained shear strength of the clay.
- 2.
- The equation parameter of the clay in the overconsolidated state was affected by , and the equation parameter was affected by the plasticity index, . For the measured parameters of clay that were difficult to obtain, it was recommended that when the clay depth is , the calculation parameters of undrained shear strength should be computed according to the following recommended values , and . The theoretical equation was .
- 3.
- When the undrained shear strength of clay in the normally consolidated state was estimated at a depth of , the recommended theoretical equation was ; when the undrained shear strength of clay in the underconsolidated state was estimated at a depth of , the recommended theoretical equation was .
- 4.
- Compared with the calculation results of Ohta and Wang et al., it was found that the theoretical equation in this study can well estimate the undrained shear strength of isotropically consolidated clay. When the clay depth is , the huge fluctuation of the estimated value of is mainly due to the long-term influence of evaporative water loss in the upper part of the clay.
- 5.
- The CatBoost–Bayesian hybrid model could uncover the intrinsic relationships among the soil parameters, but on its own it could not provide comprehensive interpretability. By adaptively coupling the feature importance of the CatBoost–Bayesian hybrid model with the theoretical equation derived from the modified Cam-clay model, the undrained shear strength of isotropically consolidated clays was estimated in a way that is, to a certain extent, interpretable. Compared with the CatBoost–Bayesian hybrid model and similar hybrid models, the approach in this study reached an average $R^2$ of 0.92, with average RMSE and MAE of 0.19 and 0.03, respectively, so the overall performance was good.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Motaghedi, H.; Eslami, A. Analytical Approach for Determination of Soil Shear Strength Parameters from CPT and CPTu Data. Arabian J. Sci. Eng. 2014, 39, 4363–4376.
- Ladd, C.C.; Foott, R. New Design Procedure for Stability of Soft Clays. J. Geotech. Eng. Div. 1974, 100, 763–786.
- Mesri, G. Discussion of “New design procedure for stability of soft clays”. J. Geotech. Eng. Div. 1975, 101, 409–412.
- Jiang, S.H.; Zeng, S.H.; Yang, J.H.; Yao, C.; Huang, J.S.; Zhou, C.B. Slope Reliability Analysis by Simulation of Non-Stationary Random Field of Undrained Shear Strength. Yantu Lixue 2018, 39, 1071–1081. (In Chinese). Available online: http://ytlx.whrsm.ac.cn/EN/10.16285/j.rsm.2016.0609 (accessed on 20 May 2022).
- Marchetti, S.; Monaco, P.; Totani, G.; Calabrese, M. The Flat Dilatometer Test (DMT) in Soil Investigation; ISSMGE TC 16 Report; ISSMGE: London, UK, 2001; pp. 1–26.
- Robertson, P.K. Soil Behavior Type Using the DMT. In Proceedings of the 3rd International Flat Dilatometer Conference, Roma, Italy, 14–16 June 2015; pp. 14–16. Available online: https://www.cpt-robertson.com/PublicationsPDF/Robertson%20DMT15%202015.pdf (accessed on 20 May 2022).
- Nguyen, H.; Bui, X.-N.; Tran, Q.-H.; Mai, N.-L. A New Soft Computing Model for Estimating and Controlling Blast-Produced Ground Vibration Based on Hierarchical K-Means Clustering and Cubist Algorithms. Appl. Soft Comput. 2019, 77, 376–386.
- Xu, H.; Zhou, J.; Asteris, P.G.; Armaghani, D.J.; Tahir, M.M. Supervised Machine Learning Techniques to the Prediction of Tunnel Boring Machine Penetration Rate. Appl. Sci. 2019, 9, 3715.
- Zhou, J.; Li, E.; Yang, S.; Wang, M.; Shi, X.; Yao, S.; Mitri, H.S. Slope Stability Prediction for Circular Mode Failure Using Gradient Boosting Machine Approach Based on an Updated Database of Case Histories. Saf. Sci. 2019, 118, 505–518.
- Jiao, P.; Alavi, A.H. Artificial Intelligence in Seismology: Advent, Performance and Future Trends. Geosci. Front. 2020, 11, 739–744.
- Cui, K.; Jing, X. Research on Prediction Model of Geotechnical Parameters Based on BP Neural Network. Neural Comput. Appl. 2019, 31, 8205–8215.
- Tran, Q.A.; Ho, L.S.; Le, H.V.; Prakash, I.; Pham, B.T. Estimation of the Undrained Shear Strength of Sensitive Clays Using Optimized Inference Intelligence System. Neural Comput. Appl. 2022, 34, 7835–7849.
- Jong, S.; Ong, D.; Oh, E. State-of-the-Art Review of Geotechnical-Driven Artificial Intelligence Techniques in Underground Soil-Structure Interaction. Tunn. Undergr. Space Technol. 2021, 113, 103946.
- Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient Boosting with Categorical Features Support. arXiv 2018, arXiv:1810.11363.
- Tran, D.A.; Tsujimura, M.; Ha, N.T.; Van Binh, D.; Dang, T.D.; Doan, Q.-V.; Bui, D.T.; Ngoc, T.A.; Thuc, P.T.B.; Pham, T.D.; et al. Evaluating the Predictive Power of Different Machine Learning Algorithms for Groundwater Salinity Prediction of Multi-Layer Coastal Aquifers in the Mekong Delta, Vietnam. Ecol. Indic. 2021, 127, 107790.
- Xu, J.-G.; Hong, W.; Zhang, J.; Hou, S.-T.; Wu, G. Seismic Performance Assessment of Corroded RC Columns Based on Data-Driven Machine-Learning Approach. Eng. Struct. 2022, 255, 113936.
- Huang, G.; Wu, L.; Ma, X.; Zhang, W.; Fan, J.; Yu, X.; Zeng, W.; Zhou, H. Evaluation of CatBoost Method for Prediction of Reference Evapotranspiration in Humid Regions. J. Hydrol. 2019, 574, 1029–1041.
- Zhang, Y.X.; Zhao, Z.G.; Zheng, J.H. CatBoost: A New Approach for Estimating Daily Reference Crop Evapotranspiration in Arid and Semi-Arid Regions of Northern China. J. Hydrol. 2020, 588, 125087.
- Zhang, W.; Wu, C.; Zhong, H.; Li, Y.; Wang, L. Prediction of Undrained Shear Strength Using Extreme Gradient Boosting and Random Forest Based on Bayesian Optimization. Geosci. Front. 2021, 12, 469–477.
- Oh, H.-J.; Syifa, M.; Lee, C.-W.; Lee, S. Land Subsidence Susceptibility Mapping Using Bayesian, Functional, and Meta-Ensemble Machine Learning Models. Appl. Sci. 2019, 9, 1248.
- Roscoe, K.H.; Burland, J.B. On the Generalised Stress-Strain Behaviour of “wet” Clay. Eng. Plast. 1968, 535–609. Available online: https://trid.trb.org/view/124868 (accessed on 20 May 2022).
- Wang, L.; Ye, S.; Shen, K.; Hu, Y. Undrained Shear Strength of K0 Consolidated Soft Clays. Chin. J. Geotech. Eng. 2006, 28, 971–977.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Dutchess County, NY, USA, 2018; Volume 31.
- Mockus, J. The Application of Bayesian Methods for Seeking the Extremum. J. Glob. Optim. 1998, 2, 117. Available online: https://cir.nii.ac.jp/crid/137057611871035611 (accessed on 20 May 2022).
- Katakami, S.; Sakamoto, H.; Okada, M. Bayesian Hyperparameter Estimation Using Gaussian Process and Bayesian Optimization. J. Phys. Soc. Jpn. 2019, 88, 074001.
- Lindauer, M.; Eggensperger, K.; Feurer, M.; Biedenkapp, A.; Deng, D.; Benjamins, C.; Ruhkopf, T.; Sass, R.; Hutter, F. SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization. J. Mach. Learn. Res. 2022, 23, 1–9.
- Hutter, F.; Hoos, H.H.; Leyton-Brown, K. Sequential Model-Based Optimization for General Algorithm Configuration. In Proceedings of the Learning and Intelligent Optimization: 5th International Conference, LION 5, Rome, Italy, 17–21 January 2011.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Geisser, S. The Predictive Sample Reuse Method with Applications. J. Am. Stat. Assoc. 1975, 70, 320–328.
- Stone, M. Cross-Validatory Choice and Assessment of Statistical Predictions (with Discussion). J. R. Stat. Soc. B 1976, 38, 102.
- Jung, Y. Multiple Predicting K-Fold Cross-Validation for Model Selection. J. Nonparametr. Stat. 2018, 30, 197–215.
- Pham, B.T.; Qi, C.; Ho, L.S.; Nguyen-Thoi, T.; Al-Ansari, N.; Nguyen, M.D.; Nguyen, H.D.; Ly, H.-B.; Le, H.V.; Prakash, I. A Novel Hybrid Soft Computing Model Using Random Forest and Particle Swarm Optimization for Estimation of Undrained Shear Strength of Soil. Sustainability 2020, 12, 2218.
- Ohta, H.; Nishihara, A. Anisotropy of Undrained Shear Strength of Clays under Axi-Symmetric Loading Conditions. Soils Found. 1985, 25, 73–86.
- Karube, D. Nonstandard Triaxial Testing Method and Its Problems. In Proceedings of the 20th Symposium of the International Society for Rock Mechanics, JSSMFE, 1975; pp. 45–60. Available online: https://cir.nii.ac.jp/crid/1572261549455735296 (accessed on 20 May 2022).
- He, P.; Wang, W.; Xu, Z. Empirical Correlations of Compression Index and Swelling Index for Shanghai Clay. Yantu Lixue 2018, 39, 1–10. (In Chinese)
- Azzouz, A.S.; Krizek, R.J.; Corotis, R.B. Regression Analysis of Soil Compressibility. Soils Found. 1976, 16, 19–29.
- Ladd, C.C. Stability Evaluation during Staged Construction. J. Geotech. Eng. 1991, 117, 540–615.
- Zhang, J. Bayesian Method: A Natural Tool for Processing Geotechnical Information; TC205/TC304 Discussion Groups; ISSMGE: London, UK, 2016.
- Fu, Y.; Ma, C.; Bian, Y.; Lv, G.; Hu, Y.; Wang, C. Stochastic Mechanics-Based Bayesian Method Calibrating the Constitutive Parameters of the Unified Model for Clay and Sand with CPTU Data. Acta Geotech. 2022, 17, 4577–4598.
- Cao, Z.; Wang, Y. Bayesian Model Comparison and Characterization of Undrained Shear Strength. J. Geotech. Geoenviron. Eng. 2014, 140, 04014018.
- Zhao, Z.; Duan, W.; Cai, G.; Wu, M.; Liu, S. CPT-Based Fully Probabilistic Seismic Liquefaction Potential Assessment to Reduce Uncertainty: Integrating XGBoost Algorithm with Bayesian Theorem. Comput. Geotech. 2022, 149, 104868.
- Juang, C.H.; Zhang, J. Bayesian Methods for Geotechnical Applications—A Practical Guide; ASCE: Reston, VA, USA, 2017; pp. 215–246.
- Guan, Z.; Wang, Y. SPT-Based Probabilistic Evaluation of Soil Liquefaction Potential Considering Design Life of Civil Infrastructures. Comput. Geotech. 2022, 148, 104807.
- Guan, Z.; Wang, Y. CPT-Based Probabilistic Liquefaction Assessment Considering Soil Spatial Variability, Interpolation Uncertainty and Model Uncertainty. Comput. Geotech. 2022, 141, 104504.
- Juang, C.H.; Ching, J.; Ku, C.-S.; Hsieh, Y.-H. Unified CPTu-Based Probabilistic Model for Assessing Probability of Liquefaction of Sand and Clay. Geotechnique 2012, 62, 877–892.
- Ku, C.-S.; Juang, C.H.; Chang, C.-W.; Ching, J. Probabilistic Version of the Robertson and Wride Method for Liquefaction Evaluation: Development and Application. Can. Geotech. J. 2012, 49, 27–44.
- Draper, N.R.; Smith, H. Applied Regression Analysis; John Wiley & Sons: Hoboken, NJ, USA, 1998; Volume 326.
- Huang, J.-C.; Ko, K.-M.; Shu, M.-H.; Hsu, B.-M. Application and Comparison of Several Machine Learning Algorithms and Their Integration Models in Regression Problems. Neural Comput. Appl. 2020, 32, 5461–5469.
- Zhang, R.; Li, Y.; Goh, A.T.; Zhang, W.; Chen, Z. Analysis of Ground Surface Settlement in Anisotropic Clays Using Extreme Gradient Boosting and Random Forest Regression Models. J. Rock Mech. Geotech. Eng. 2021, 13, 1478–1484.
- FI-CLAY/14/856 Finland Clays. Available online: http://140.112.12.21/issmge/tc304.htm (accessed on 15 June 2022).
- Löfman, M.S.; Korkiala-Tanttu, L.K. Transformation Models for the Compressibility Properties of Finnish Clays Using a Multivariate Database. Georisk 2022, 16, 330–346.
- Rutledge, P.C. Cooperative Triaxial Shear Research Program of the Corps of Engineers. 1947. Available online: https://trid.trb.org/view/119101 (accessed on 20 May 2022).
- Jamiolkowski, M.; Ladd, C.C.; Germaine, J.T.; Lancellotta, R. New developments in field and laboratory testing of soils. In Proceedings of the XI International Conference on Soil Mechanics & Foundation Engineering, San Francisco, CA, USA, 12–16 August 1985.
- Yuchun, C. A Comparison of Simplified Calculation Methods of Undrained Shear Strength of Soft Clays after Consolidation. China. Civil. Eng. 2014, 47, 107–116. (In Chinese)
- Qiao, Y.F.; Lu, X.B.; Huang, J.; Ding, W.Q. Simplified calculation method for lateral pressure at rest in the under-consolidation stratum. Yantu Lixue 2020, 41, 3722–3729. (In Chinese). Available online: http://ytlx.whrsm.ac.cn/CN/10.16285/j.rsm.2020.0124 (accessed on 20 May 2022).
- Asaoka, A.; A-Grivas, D. Spatial Variability of the Undrained Strength of Clays. J. Geotech. Eng. Div. 1982, 108, 743–756.
- Xiao-qing, G.; Bin, Z.; Jin-chao, L.; et al. Experimental Study of Undrained Shear Strength and Cyclic Degradation Behaviors of Marine Clay in Pearl River Estuary. Yantu Lixue 2016, 37, 1005–1012. (In Chinese). Available online: http://ytlx.whrsm.ac.cn/CN/10.16285/j.rsm.2016.04.013 (accessed on 20 May 2022).
- Ching, J.; Arroyo, M.; Chen, J.; Jorge, C.; Lansivaara, T.; Li, D.; Mayne, P.; Phoon, K.; Prakoso, W.; Uzielli, M. Transformation Models and Multivariate Soil Databases. In Final Report of Joint TC205/TC304 Working Group on “Discussion of Statistical/Reliability Methods for Eurocodes”; International Society for Soil Mechanics and Geotechnical Engineering (ISSMGE): London, UK, 2017; p. 372.
Parameters | Min | Max | Std | Mean | COV | Unit |
---|---|---|---|---|---|---|
Organic content | 0.00 | 7.10 | 1.48 | 1.23 | 1.20 | % |
Clay content | 12.70 | 95.00 | 20.61 | 58.99 | 0.35 | % |
Void ratio | 0.81 | 3.88 | 0.69 | 2.13 | 0.32 | - |
Natural water content | 28.00 | 155.00 | 25.52 | 77.45 | 0.33 | % |
Liquid limit | 24.40 | 166.00 | 24.73 | 76.78 | 0.32 | % |
Plastic limit | 17.70 | 42.00 | 4.90 | 27.68 | 0.18 | % |
Effective in situ stress | 4.00 | 130.00 | 27.31 | 41.68 | 0.66 | kPa |
Preconsolidation pressure | 13.00 | 198.00 | 37.89 | 69.35 | 0.55 | kPa |
Overconsolidation ratio | 0.46 | 20.00 | 1.81 | 2.12 | 0.85 | - |
Compression index | 0.10 | 4.22 | 0.86 | 1.29 | 0.67 | - |
Sensitivity | 1.69 | 163.00 | 19.97 | 24.26 | 0.84 | - |
Undrained shear strength | 5.21 | 240.00 | 31.32 | 28.95 | 1.08 | kPa |
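The COV column in the table above is the coefficient of variation, that is, the standard deviation divided by the mean. The short check below recomputes it from the tabulated Std and Mean values. The dictionary layout, the helper name `cov`, and the tolerance of 0.01 (chosen to allow for two-decimal rounding) are assumptions made for this sketch.

```python
# (std, mean, tabulated COV) copied from the soil-parameter table.
table = {
    "Organic content": (1.48, 1.23, 1.20),
    "Clay content": (20.61, 58.99, 0.35),
    "Void ratio": (0.69, 2.13, 0.32),
    "Natural water content": (25.52, 77.45, 0.33),
    "Liquid limit": (24.73, 76.78, 0.32),
    "Plastic limit": (4.90, 27.68, 0.18),
    "Effective in situ stress": (27.31, 41.68, 0.66),
    "Preconsolidation pressure": (37.89, 69.35, 0.55),
    "Overconsolidation ratio": (1.81, 2.12, 0.85),
    "Compression index": (0.86, 1.29, 0.67),
    "Sensitivity": (19.97, 24.26, 0.84),
    "Undrained shear strength": (31.32, 28.95, 1.08),
}


def cov(std, mean):
    """Coefficient of variation: standard deviation relative to the mean."""
    return std / mean


TOLERANCE = 0.01  # assumed allowance for two-decimal rounding

mismatches = []
for name, (std, mean, listed) in table.items():
    computed = cov(std, mean)
    if abs(computed - listed) > TOLERANCE:
        mismatches.append(name)
    print(f"{name:26s} computed={computed:.3f} listed={listed:.2f}")

print("rows outside tolerance:", mismatches)
```

With the values as printed, every row reproduces the listed COV to within rounding except Sensitivity, where Std/Mean is about 0.82 against the listed 0.84. Parameters with a COV near or above 1 (organic content, undrained shear strength) are the most scattered relative to their mean.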
Optuna parameters of CatBoost | Value |
---|---|
loss_function | MAE |
n_estimators | 1000 |
learning_rate | 0.153 |
random_state | 2019 |
l2_leaf_reg | 0.030 |
colsample_bylevel | 0.098 |
depth | 1 |
boosting_type | Plain |
bootstrap_type | MVS |
min_data_in_leaf | 4 |
one_hot_max_size | 3 |
early_stopping_rounds | 100 |
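The hyperparameter table above maps directly onto keyword arguments of CatBoost's Python regressor. The sketch below only shows that mapping and constructs the estimator; it does not reproduce the study's training pipeline, data, or tuning procedure. The helper `build_model` is hypothetical, and the import is guarded so the snippet still runs where the `catboost` package is not installed.

```python
# Hyperparameters as listed in the table, keyed by CatBoost parameter names.
params = {
    "loss_function": "MAE",
    "n_estimators": 1000,
    "learning_rate": 0.153,
    "random_state": 2019,
    "l2_leaf_reg": 0.030,
    "colsample_bylevel": 0.098,
    "depth": 1,
    "boosting_type": "Plain",
    "bootstrap_type": "MVS",
    "min_data_in_leaf": 4,
    "one_hot_max_size": 3,
    "early_stopping_rounds": 100,
}


def build_model(settings):
    """Return a CatBoostRegressor built from `settings`, or None if the
    catboost package is unavailable (hypothetical helper for this sketch)."""
    try:
        from catboost import CatBoostRegressor
    except ImportError:
        return None
    return CatBoostRegressor(**settings, verbose=False)


model = build_model(params)
print("catboost available:", model is not None)
```

Note that `depth` is 1 in the table, so each tree is a single split, and `boosting_type` is `Plain` rather than `Ordered`, so the listed configuration uses the classic boosting scheme.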
Model | R2_Mean (Train) | R2_Mean (Test) | Evar_Mean (Train) | Evar_Mean (Test) | RMSE_Mean (Train) | RMSE_Mean (Test) | MAE_Mean (Train) | MAE_Mean (Test) |
---|---|---|---|---|---|---|---|---|
UCI_Su | - | 0.88 | - | 0.88 | - | 0.19 | - | 0.04 |
NCI_Su | - | 0.90 | - | 0.90 | - | 0.16 | - | 0.03 |
OCI_Su | - | 0.97 | - | 0.97 | - | 0.22 | - | 0.02 |
CatBoost–Bayesian | 0.91 | 0.86 | 0.91 | 0.86 | 0.30 | 0.37 | 0.12 | 0.20 |
LightGBM-Bayesian | 0.99 | 0.81 | 0.99 | 0.81 | 0.14 | 0.48 | 0.08 | 0.23 |
XGBoost-Bayesian | 0.94 | 0.81 | 0.94 | 0.82 | 0.24 | 0.41 | 0.16 | 0.24 |
RandomForest-Bayesian | 0.95 | 0.80 | 0.95 | 0.80 | 0.99 | 0.98 | 0.46 | 0.46 |
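The averages quoted in the fifth conclusion (0.92, 0.19 and 0.03) can be traced back to the three theoretical-equation rows of the results table above. The sketch below redoes that arithmetic and also computes the train-test gap in R2 for the four hybrid models as a rough overfitting indicator. The variable names and the use of the R2 gap for this purpose are choices made for this illustration, not part of the study.

```python
# Test-set metrics of the three theoretical-equation rows.
theory = {
    "UCI_Su": {"R2": 0.88, "RMSE": 0.19, "MAE": 0.04},
    "NCI_Su": {"R2": 0.90, "RMSE": 0.16, "MAE": 0.03},
    "OCI_Su": {"R2": 0.97, "RMSE": 0.22, "MAE": 0.02},
}

# (train R2, test R2) of the four hybrid machine-learning models.
hybrids = {
    "CatBoost-Bayesian": (0.91, 0.86),
    "LightGBM-Bayesian": (0.99, 0.81),
    "XGBoost-Bayesian": (0.94, 0.81),
    "RandomForest-Bayesian": (0.95, 0.80),
}


def mean_metric(rows, key):
    """Arithmetic mean of one metric over all rows."""
    return sum(row[key] for row in rows.values()) / len(rows)


averages = {k: round(mean_metric(theory, k), 2) for k in ("R2", "RMSE", "MAE")}
gaps = {name: round(train - test, 2) for name, (train, test) in hybrids.items()}
smallest_gap = min(gaps, key=gaps.get)

print("theoretical-equation averages:", averages)
print("train-test R2 gaps:", gaps)
print("smallest gap:", smallest_gap)
```

The averages match the values quoted in the conclusions, and among the hybrid models the CatBoost–Bayesian model shows the smallest drop from training to test R2.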
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, H.; Liu, Z.; Li, Y.; Wei, H.; Huang, N. CatBoost–Bayesian Hybrid Model Adaptively Coupled with Modified Theoretical Equations for Estimating the Undrained Shear Strength of Clay. Appl. Sci. 2023, 13, 5418. https://doi.org/10.3390/app13095418