Uncertainty Quantification in Shear Wave Velocity Predictions: Integrating Explainable Machine Learning and Bayesian Inference
Abstract
1. Introduction
2. Data-Driven Shear Wave Velocity (Vs) Prediction Models
2.1. Bayesian Generalized Linear Model
2.2. Extreme Gradient Boosting (XGBoost) Algorithm
3. Methodology
3.1. Training and Testing Dataset
3.2. Hungarian Seismic Cone Penetration Test (SCPT)
3.3. Performance Measurements
4. Discussion
4.1. Evaluation of the Models
4.2. Validation of the Models
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CHT | Cross-hole test |
CI | Credible interval |
colsample_bytree | Column subsampling ratio per tree |
CPT | Cone penetration test |
cv | Coefficient of variation |
DHT | Downhole test |
E | Error term |
fs | Sleeve friction |
fs_cv | Coefficient of variation of sleeve friction |
fs_mean | Mean of sleeve friction |
gamma | Minimum loss reduction |
GLM | Generalized linear model |
IA | Index of Agreement |
Ic | Soil behavior type index |
Ic_mean | Mean of soil behavior type index |
IQR | Interquartile range |
KGE | Kling–Gupta efficiency |
kPa | Kilopascal |
Loss function | |
Likelihood function | |
Predictor function | |
m/s | Meter per second |
MAE | Mean absolute error |
MARE | Mean absolute relative error |
MSRE | Mean square relative error |
MASW | Multichannel analysis of surface waves |
max_bin | Maximum number of bins |
max_depth | Maximum depth of trees |
MaxARE | Maximum absolute relative error |
MBE | Mean bias error |
MCMC | Markov Chain Monte Carlo |
MCSE | Monte Carlo Standard Error |
ML | Machine learning |
MPa | Megapascal |
MSE | Mean squared error |
N | Total number of input features |
n_eff | Effective sample size |
n_estimators | Number of trees |
PDPs | Partial dependence plots |
qc | Cone tip resistance |
qc_cv | Coefficient of variation of cone tip resistance |
qc_mean | Mean of cone tip resistance |
r | Linear correlation coefficient |
R2 | Coefficient of determination |
reg_alpha | L1 Regularization term on weights |
reg_lambda | L2 Regularization term on weights |
Rf | Friction ratio |
Rf_mean | Mean of friction ratio |
Rhat | Potential scale reduction factor |
RMSRE | Root mean squared relative error |
S | Subset of features |
scale_pos_weight | Balancing weight for positive and negative classes |
SCPT | Seismic cone penetration test |
SCPTu | Seismic cone penetration test with pore pressure measurement |
SHAP | Shapley Additive Explanations |
STD | Standard deviation |
Vs | Shear wave velocity |
Weight of leaf node | |
X | Measured shear wave velocity |
XGBoost | Extreme Gradient Boosting |
Y | Predicted shear wave velocity |
σ,v | Total overburden stress |
Mean of prior distribution | |
Variance of prior distribution | |
Intercept | |
Coefficients of predictor variables | |
Objective function | |
Regularization parameter | |
Regularization coefficients | |
Base value | |
SHAP value | |
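
The notation entries for S (subset of features), N (total number of input features), base value, and SHAP value correspond to the terms of the Shapley value formula. A minimal brute-force sketch of that formula is given below. The toy model, the single baseline point, and the instance are hypothetical and chosen only for illustration. The SHAP library uses faster, model-specific algorithms and a background dataset, so this is not how the study's values were computed.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n):
    """Exact Shapley values for a value function over feature indices 0..n-1.

    value(S) must return the model output when only the features in the
    frozenset S take their instance values.
    """
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (N - |S| - 1)! / N! for subsets S of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model with one interaction term (not the paper's model).
def model(x):
    return 5.0 + 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]


baseline = [1.0, 1.0, 0.0]   # assumed reference point
instance = [3.0, 2.0, 4.0]   # assumed point to explain


def value(subset):
    mixed = [instance[j] if j in subset else baseline[j] for j in range(3)]
    return model(mixed)


base_value = value(frozenset())
phi = shapley_values(value, 3)
print(base_value, phi, base_value + sum(phi), model(instance))
```

The base value plus the sum of the SHAP values reproduces the model output for the explained instance, which is the additivity property that makes the per-feature contributions interpretable.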
Metrics | qc_mean (MPa) | qc_cv (-) | fs_mean (kPa) | fs_cv (-) | Rf_mean (%) | Ic_mean (-) | Depth (m) | σ,v (kPa) | Vs (m/s) |
---|---|---|---|---|---|---|---|---|---|
Mean | 4.24 | 0.24 | 50.53 | 0.26 | 2.29 | 2.74 | 13.38 | 254 | 237.6 |
STD | 7.12 | 0.25 | 56.57 | 0.21 | 3.16 | 0.65 | 9.01 | 171 | 91.1 |
Minimum | 0.02 | 0 | 0.35 | 0 | 0.09 | 0 | 0.5 | 9.5 | 22 |
Maximum | 74.75 | 2.59 | 820.25 | 1.79 | 112.95 | 4.06 | 49.5 | 941 | 547 |
Count | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 |
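
As a worked illustration of the coefficient of variation (cv, STD divided by mean) defined in the Abbreviations, the sketch below applies it to selected Mean and STD rows of the table above. These are dataset-level values and are presumably not the same quantity as the qc_cv and fs_cv input features, which appear to be computed per measurement interval. That distinction is an assumption, as the source text describing the features is not available here.

```python
# (mean, STD) pairs transcribed from the summary table above
SUMMARY = {
    "qc_mean (MPa)": (4.24, 7.12),
    "fs_mean (kPa)": (50.53, 56.57),
    "Rf_mean (%)": (2.29, 3.16),
    "Ic_mean (-)": (2.74, 0.65),
    "Depth (m)": (13.38, 9.01),
    "Vs (m/s)": (237.6, 91.1),
}


def coefficient_of_variation(mean, std):
    """cv = STD / mean (dimensionless)."""
    return std / mean


DATASET_CV = {
    name: coefficient_of_variation(mean, std)
    for name, (mean, std) in SUMMARY.items()
}

for name, cv in DATASET_CV.items():
    print(f"{name}: cv = {cv:.2f}")
```

A cv above 1 for qc_mean reflects the strongly right-skewed tip resistance (maximum 74.75 MPa against a mean of 4.24 MPa), whereas Vs varies by roughly 40% of its mean.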
Metrics | Formula | Ideal Value | Equation No. |
---|---|---|---|
Correlation coefficient | | 1 | (13) |
Coefficient of determination | | 1 | (14) |
Index of Agreement | | 1 | (15) |
Kling–Gupta efficiency | | 1 | (16) |
Mean squared error | | 0 | (17) |
Root mean squared relative error | | 0 | (18) |
Mean absolute error | | 0 | (19) |
Mean absolute relative error | | 0 | (20) |
Mean square relative error | | 0 | (21) |
Mean bias error | | 0 | (22) |
Maximum absolute relative error | | 0 | (23) |
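
The metrics listed above can be sketched in code as follows, with X the measured and Y the predicted shear wave velocity, as in the Abbreviations. The definitions used are the commonly cited ones and are assumptions here, not transcriptions of Equations (13)–(23). In particular, IA follows Willmott's form, KGE follows the 2009 form with standard deviation and mean ratios, relative errors are normalized by the measured value, and MBE is taken as predicted minus measured.

```python
import math


def vs_metrics(x, y):
    """Performance metrics for measured Vs (x) against predicted Vs (y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sse = sum((a - b) ** 2 for a, b in zip(x, y))
    rel = [(b - a) / a for a, b in zip(x, y)]  # relative errors

    r = sxy / math.sqrt(sxx * syy)
    # Willmott's index of agreement (assumed variant)
    ia = 1 - sse / sum((abs(b - mx) + abs(a - mx)) ** 2 for a, b in zip(x, y))
    # Kling-Gupta efficiency: correlation, variability ratio, bias ratio
    alpha = math.sqrt(syy / sxx)
    beta = my / mx
    kge = 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    msre = sum(e ** 2 for e in rel) / n

    return {
        "r": r,
        "R2": 1 - sse / sxx,
        "IA": ia,
        "KGE": kge,
        "MSE": sse / n,
        "RMSRE": math.sqrt(msre),
        "MAE": sum(abs(a - b) for a, b in zip(x, y)) / n,
        "MARE": sum(abs(e) for e in rel) / n,
        "MSRE": msre,
        "MBE": sum(b - a for a, b in zip(x, y)) / n,
        "MaxARE": max(abs(e) for e in rel),
    }


# Hypothetical values in m/s, for illustration only
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 380.0]
for name, val in vs_metrics(measured, predicted).items():
    print(f"{name}: {val:.4f}")
```

For a perfect prediction the first four metrics equal 1 and the remaining seven equal 0, matching the Ideal Value column.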
Hyperparameters | Search Span | Optimized Values |
---|---|---|
Number of trees (n_estimators) | 50–600 | 444 |
Learning rate | 0.001–0.5 | 0.0094 |
Maximum depth of trees | 1–10 | 10 |
Subsampling ratio | 0.05–1 | 0.640 |
L1 Regularization term on weights (reg_alpha) | 0.01–1 | 0.208 |
L2 Regularization term on weights (reg_lambda) | 0.01–1 | 0.603 |
Column subsampling ratio per tree (colsample_bytree) | 0.5–1 | 0.707 |
Minimum loss reduction required to make a further split (gamma) | 0–10 | 8.79 |
Maximum number of bins for feature quantization (max_bin) | 128–512 | 499 |
Balancing weight for positive and negative classes (scale_pos_weight) | 0.1–10 | 6.09 |
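
The search spans and optimized values above can be held in a small configuration and checked programmatically. The sketch below is an illustration, not the authors' tuning script. The keys follow XGBoost's parameter names. Mapping "Learning rate", "Maximum depth of trees", and "Subsampling ratio" to `learning_rate`, `max_depth`, and `subsample` is an assumption, since the table names those three only in words.

```python
# (lower, upper) search spans transcribed from the table above
SEARCH_SPACE = {
    "n_estimators": (50, 600),
    "learning_rate": (0.001, 0.5),      # assumed parameter name
    "max_depth": (1, 10),               # assumed parameter name
    "subsample": (0.05, 1.0),           # assumed parameter name
    "reg_alpha": (0.01, 1.0),
    "reg_lambda": (0.01, 1.0),
    "colsample_bytree": (0.5, 1.0),
    "gamma": (0.0, 10.0),
    "max_bin": (128, 512),
    "scale_pos_weight": (0.1, 10.0),
}

# Optimized values transcribed from the same table
OPTIMIZED = {
    "n_estimators": 444,
    "learning_rate": 0.0094,
    "max_depth": 10,
    "subsample": 0.640,
    "reg_alpha": 0.208,
    "reg_lambda": 0.603,
    "colsample_bytree": 0.707,
    "gamma": 8.79,
    "max_bin": 499,
    "scale_pos_weight": 6.09,
}


def out_of_range(space, values):
    """Names whose optimized value falls outside its search span."""
    return [k for k, v in values.items()
            if not space[k][0] <= v <= space[k][1]]


def on_bound(space, values):
    """Names whose optimized value sits exactly on a span boundary."""
    return [k for k, v in values.items() if v in space[k]]


print("outside span:", out_of_range(SEARCH_SPACE, OPTIMIZED))
print("on a boundary:", on_bound(SEARCH_SPACE, OPTIMIZED))
```

An optimum that sits on a boundary of its span, as the maximum tree depth does here, may indicate that a wider span would be worth exploring.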
Performance Metrics | Test Dataset | Train Dataset |
---|---|---|
R2 | 0.54 | 0.91 |
IA | 0.84 | 0.97 |
KGE | 0.65 | 0.82 |
MSE | 3792 | 781 |
RMSRE | 0.39 | 0.22 |
MAE | 41 | 19.6 |
MARE | 0.21 | 0.11 |
MSRE | 0.15 | 0.05 |
MBE | 1.12 | 0.13 |
MaxARE | 3.30 | 3.95 |
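
The reported values can be cross-checked with simple arithmetic. The sketch below derives the root mean squared error from MSE, compares RMSRE with the square root of MSRE, and estimates a variance-explained proxy, 1 − MSE/STD², using the Vs standard deviation of the full 3600-record dataset (91.1 m/s). Using the full-dataset STD for both splits is an assumption, because the variances of the train and test subsets are not listed.

```python
import math

VS_STD = 91.1  # m/s, STD of Vs over all records (dataset summary table)

# Values transcribed from the performance table above
REPORTED = {
    "test": {"MSE": 3792.0, "RMSRE": 0.39, "MSRE": 0.15},
    "train": {"MSE": 781.0, "RMSRE": 0.22, "MSRE": 0.05},
}


def derived(split):
    """Quantities implied by the reported metrics of one split."""
    m = REPORTED[split]
    return {
        "RMSE (m/s)": math.sqrt(m["MSE"]),
        "1 - MSE/STD^2": 1.0 - m["MSE"] / VS_STD ** 2,
        "sqrt(MSRE)": math.sqrt(m["MSRE"]),
    }


for split in REPORTED:
    print(split, {k: round(v, 2) for k, v in derived(split).items()})
```

Under that assumption the proxy rounds to 0.54 for the test set and 0.91 for the training set, the same values as the first data row of the table, and the square root of MSRE reproduces the reported RMSRE for both splits.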
Parameters | Mean | STD | MCSE (%) | Rhat | n_eff |
---|---|---|---|---|---|
Intercept | 5.387 | 0.006 | 0.004 | 1.000 | 32,562 |
| | 0.079 | 0.010 | 0.007 | 0.999 | 19,341 |
| | 0.015 | 0.007 | 0.004 | 1.000 | 28,913 |
| | 0.092 | 0.010 | 0.007 | 1.000 | 19,305 |
| | 0.092 | 0.786 | 0.65 | 0.999 | 14,770 |
| | 0.109 | 0.787 | 0.65 | 1.000 | 14,771 |
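
The diagnostics in this table are linked by the usual Monte Carlo standard error estimate, MCSE = STD/√n_eff. The sketch below checks whether the tabulated MCSE (%) values are consistent with 100 × STD/√n_eff. That reading of the "(%)" column is an inference from the numbers and is not stated in the table itself. Rows are identified by position because only the intercept carries a label.

```python
import math

# (posterior STD, n_eff, reported MCSE in %) per table row, top to bottom
ROWS = [
    (0.006, 32562, 0.004),
    (0.010, 19341, 0.007),
    (0.007, 28913, 0.004),
    (0.010, 19305, 0.007),
    (0.786, 14770, 0.65),
    (0.787, 14771, 0.65),
]


def mcse_percent(std, n_eff):
    """Monte Carlo standard error, STD / sqrt(n_eff), expressed in percent."""
    return 100.0 * std / math.sqrt(n_eff)


COMPUTED = [mcse_percent(std, n_eff) for std, n_eff, _ in ROWS]

for (std, n_eff, reported), calc in zip(ROWS, COMPUTED):
    print(f"STD={std}, n_eff={n_eff}: computed {calc:.4f}, reported {reported}")
```

The computed values agree with the reported ones to within the rounding of the tabulated STD. With Rhat values between 0.999 and 1.000, this is in line with well-mixed chains.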
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chala, A.T.; Ray, R. Uncertainty Quantification in Shear Wave Velocity Predictions: Integrating Explainable Machine Learning and Bayesian Inference. Appl. Sci. 2025, 15, 1409. https://doi.org/10.3390/app15031409