Article

Uncertainty Quantification in Shear Wave Velocity Predictions: Integrating Explainable Machine Learning and Bayesian Inference

Ayele Tesema Chala * and Richard Ray

Structural and Geotechnical Engineering Department, Faculty of Architecture, Civil and Transport Sciences, Széchenyi István University, Egyetem tér 1, H-9026 Győr, Hungary
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(3), 1409; https://doi.org/10.3390/app15031409
Submission received: 4 January 2025 / Revised: 27 January 2025 / Accepted: 28 January 2025 / Published: 30 January 2025
(This article belongs to the Special Issue Uncertainty and Reliability Analysis for Engineering Systems)

Abstract

The accurate prediction of shear wave velocity (Vs) is critical for earthquake engineering applications. However, the prediction is inevitably influenced by geotechnical variability and various sources of uncertainty. This paper investigates the effectiveness of integrating an explainable machine learning (ML) model and a Bayesian generalized linear model (GLM) to enhance both predictive accuracy and uncertainty quantification in Vs prediction. The study utilizes an Extreme Gradient Boosting (XGBoost) algorithm coupled with Shapley Additive Explanations (SHAP) and partial dependence analysis to identify key geotechnical parameters influencing Vs predictions. Additionally, a Bayesian GLM is developed to explicitly account for uncertainties arising from geotechnical variability. The effectiveness and predictive performance of the proposed models were validated through comparison with real case scenarios. The results highlight the unique advantages of each model. The XGBoost model demonstrates good predictive performance, achieving high coefficient of determination (R²), index of agreement (IA), and Kling–Gupta efficiency (KGE) values and low error values, while effectively explaining the impact of input parameters on Vs. In contrast, the Bayesian GLM provides probabilistic predictions with 95% credible intervals, capturing the uncertainty associated with the predictions. The integration of these two approaches creates a comprehensive framework that combines the strengths of high-accuracy ML predictions with the uncertainty quantification of Bayesian inference. This hybrid methodology offers a powerful and interpretable tool for Vs prediction, providing engineers with the confidence to make informed decisions.

1. Introduction

Shear wave velocity (Vs) is a critical parameter in geotechnical and earthquake engineering, as it strongly influences the dynamic behavior of soil and rock materials during seismic events [1,2,3]. Vs can be measured with either invasive (e.g., seismic cone penetration test, SCPT; cross-hole test, CHT; downhole test, DHT) or noninvasive (e.g., multichannel analysis of surface waves, MASW) techniques [4,5,6,7,8]. Integrating these tests with laboratory tests enables site characterization and the identification of potentially liquefiable lithological units, which is essential for hazard assessment [9,10]. While these methods offer the benefit of testing the soil in its natural environment, they tend to be time-consuming and economically infeasible for routine geotechnical exploration programs. Consequently, researchers have developed empirical correlations to estimate Vs from CPT parameters (cone tip resistance qc and sleeve friction fs). Many notable examples of such correlations can be found in the literature (e.g., see the works of Robertson and others [11,12,13]). However, these empirical models often exhibit limited generalization, as they are based on site-specific data and are unable to capture the complex, non-linear relationships among the soil properties influencing Vs. Regardless of the technique, there is unavoidable uncertainty in the final Vs profile, which contributes to uncertainty in site response predictions [14].
Accounting for Vs uncertainty is of paramount importance in engineering applications. While there has been a great deal of research on the significance of incorporating Vs uncertainty into engineering practice, no universally accepted methods exist for realistically addressing this critical issue [14]. Matasovic and Hashash [15] highlighted the use of bounding-type profiles, i.e., developing upper and lower bounds (e.g., ±20%) around a base Vs profile to represent potential variability. Additionally, statistical approaches, such as Monte Carlo simulations and the Toro [16] Vs randomization model, have been utilized to generate a range of Vs profiles that reflect the inherent uncertainty in subsurface conditions.
With the increasing availability of high-quality, freely accessible data (e.g., CPT data), the use of machine learning (ML) algorithms for soil characterization has become increasingly popular in recent years [17,18]. Research on Vs prediction using various algorithms shows promising results. Notable examples include predictions of Vs from CPT parameters (e.g., see the authors’ previous research and many others) [19,20,21,22]. ML algorithms excel at learning complex relationships within data without relying on prior knowledge of the underlying mathematical models. This capability makes them highly effective for analyzing highly variable geotechnical data. However, the development of ML models must be approached with caution, as their performance heavily depends on several factors, including data preprocessing, feature selection, and hyperparameter tuning.
With the growing interest in predictive modeling for soil characterization, Bayesian generalized linear models (GLMs) have also emerged as a focus of study. Bayesian GLMs can be considered valuable tools alongside ML algorithms, particularly in geotechnical applications where data variability and uncertainty are prevalent. To the best of the authors’ knowledge, no previous study has assessed Vs predictions by integrating ML and Bayesian GLM approaches. The aim of this study is to develop Bayesian GLM and ML models to predict Vs using CPT parameters. For the ML model, the Extreme Gradient Boosting (XGBoost) algorithm coupled with Shapley Additive Explanations (SHAP) analysis was chosen due to its high predictive performance. Following the development of the Bayesian GLM and XGBoost models, a comparative analysis was conducted against real SCPT data from different locations to validate the predictive capabilities of the proposed models.

2. Data-Driven Shear Wave Velocity (Vs) Prediction Models

The following subsections discuss the Bayesian GLM and XGBoost algorithms used to develop the Vs prediction models.

2.1. Bayesian Generalized Linear Model

The Bayesian framework has been utilized in geotechnical parameter estimation for some time. This approach offers a powerful paradigm for estimating soil parameters in a probabilistic manner. However, studies on Vs prediction using the Bayesian framework are limited. A Bayesian GLM provides a robust framework for integrating CPT data to enhance the accuracy and reliability of Vs predictions. Bayesian GLMs are readily available through the rstanarm package [23] and can be implemented in the R programming language. In this study, we formulate a Bayesian approach to model the relationship between Vs and CPT parameters using a GLM framework. The general form of a Bayesian GLM can be expressed as follows [24]:
Let Y represent Vs and X represent the CPT parameters (e.g., cone tip resistance, qc; sleeve friction, fs; friction ratio, Rf). Then, the relationship between Y and the predictor variables X can be expressed as
$Y = M(X) + E$
where Y is the predicted Vs, $M(X)$ is the linear predictor function based on the predictor variables, and E is an error term.
The linear predictor can be defined as
$M(X) = \beta_0 + \beta_1 q_c + \beta_2 f_s + \cdots + \beta_p X_p$
where $\beta_0$ is the intercept and $\beta_i$ ($i = 1, 2, \ldots, p$) is the coefficient of $X_i$.
The Bayesian GLM requires specifying prior distributions for the intercept and coefficients. The values of these parameters are equally likely to be positive or negative, but they are highly unlikely to be far from zero. These beliefs can be represented using normal distributions with a mean of zero and a small standard deviation. The default priors in the rstanarm package perform well in most cases. The prior distributions for the coefficients can be defined as
$\beta_i \sim \mathrm{Normal}(0, \sigma_{\beta_i}^2)$
where $\sigma_{\beta_i}^2$ is the variance of the prior distribution of the coefficients. Similarly, the prior distribution for the intercept can be defined as
$\beta_0 \sim \mathrm{Normal}(0, \sigma_{\beta_0}^2)$
where $\sigma_{\beta_0}^2$ is the variance of the prior distribution of the intercept.
The error term is assumed to follow a normal distribution with a mean of 0 and variance $\sigma^2$:
$E \sim \mathrm{Normal}(0, \sigma^2)$
Common choices for the prior distribution of the standard deviation of the error term include the exponential, inverse gamma, and half-normal distributions. In this study, the exponential prior distribution (the default for GLMs in the rstanarm package, with rate parameter λ = 1) was considered. It is worth noting that the priors are internally adjusted in rstanarm to account for data scaling and centering.
$\sigma \sim \mathrm{Exponential}(\lambda)$
Assuming normally distributed errors, the likelihood function can be expressed as
$L(Y \mid X, \beta, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\left(Y_i - M(X_i)\right)^2}{2\sigma^2}\right)$
By using Bayes’ theorem, the posterior distribution of the parameters can be derived as
$P(\beta \mid Y, X) \propto L(Y \mid X, \beta, \sigma^2)\, P(\beta)$
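To make the preceding formulation concrete, the following Python sketch evaluates the unnormalized log-posterior implied by the priors, likelihood, and Bayes’ theorem above. It is only an illustration, not the rstanarm implementation used in this study; the prior scale of 2.5 and the unit rate parameter mirror the weakly informative defaults described later, and the function name and arguments are assumptions.

```python
import numpy as np
from scipy import stats

def log_posterior(beta0, beta, sigma, X, y, prior_sd=2.5, rate=1.0):
    """Unnormalized log-posterior of the Bayesian GLM described above.

    beta0 : intercept; beta : coefficient vector (one entry per CPT predictor);
    sigma : standard deviation of the error term;
    X     : (n, p) matrix of predictors (e.g., qc, fs, Rf);
    y     : (n,) vector of observed Vs values.
    """
    if sigma <= 0:
        return -np.inf                                             # sigma must be positive
    mu = beta0 + X @ beta                                          # linear predictor M(X)
    log_lik = stats.norm.logpdf(y, loc=mu, scale=sigma).sum()      # Gaussian likelihood
    log_prior = (
        stats.norm.logpdf(beta0, loc=0.0, scale=prior_sd)          # prior on the intercept
        + stats.norm.logpdf(beta, loc=0.0, scale=prior_sd).sum()   # priors on the coefficients
        + stats.expon.logpdf(sigma, scale=1.0 / rate)              # exponential prior on sigma
    )
    return log_lik + log_prior                                     # posterior up to a constant
```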
The actual values of posterior distributions of the parameters are rarely obtainable through analytical methods. Therefore, Markov Chain Monte Carlo (MCMC) simulation is commonly used to explore the posterior target distribution through random sampling [25,26]. In MCMC simulations, the Markov chains should converge to stationary distributions. To ensure that the chains mix well within parameter space, visual inspections of the chains and statistical tests, such as the Gelman and Rubin Rhat statistics [27], should be conducted. At convergence, Rhat is expected to be below 1.2, which indicates that the between-chain and within-chain variances for the model parameters are in good agreement [28,29,30]. This threshold serves as a guideline to confirm that the Markov chains have mixed well and that the posterior estimates are reliable and representative of the target distribution.
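As a brief illustration of the convergence diagnostic, the sketch below computes the basic Gelman–Rubin Rhat for a single parameter from a set of post-burn-in chains. The simulated draws are purely synthetic, and modern Stan-based tools typically compute a more refined split-chain variant of this statistic.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Potential scale reduction factor (Rhat) for one model parameter.

    chains : array of shape (m, n) holding m Markov chains of n post-burn-in draws each.
    Values close to 1 indicate that between-chain and within-chain variances agree.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()        # mean within-chain variance
    B = n * chain_means.var(ddof=1)              # between-chain variance
    var_plus = (n - 1) / n * W + B / n           # pooled estimate of the posterior variance
    return np.sqrt(var_plus / W)

# Synthetic example: four well-mixed chains drawn from the same distribution
rng = np.random.default_rng(0)
draws = rng.normal(loc=0.09, scale=0.01, size=(4, 9000))
print(gelman_rubin_rhat(draws))                  # close to 1.0 for converged chains
```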

2.2. Extreme Gradient Boosting (XGBoost) Algorithm

The XGBoost algorithm is a powerful machine learning algorithm known for its efficiency, scalability, and ability to handle large datasets with high-dimensional feature spaces and non-linear relationships [31]. In this study, XGBoost was selected to develop a Vs predictive model using a dataset of seismic cone penetration tests (SCPTs) and SCPTs with pore water pressure measurement (SCPTu). Each observation in the dataset represents a measurement from an SCPT or SCPTu (e.g., cone resistance (qc), sleeve friction (fs), and depth, among others). These observations (features) are used as inputs to the XGBoost model for the prediction of the output (in this case, Vs). The XGBoost algorithm has the capability to discover patterns and relationships in the data without requiring pre-defined mathematical models. For instance, the relationships between Vs and geotechnical parameters (e.g., qc and fs) may involve non-linearities and are difficult to express mathematically. XGBoost overcomes this limitation by learning directly from the data, making it particularly suited for SCPT-based Vs predictions where the physical or mathematical relationships may be unknown or too complex to model explicitly. The model uses an ensemble of decision trees, built iteratively to minimize the prediction error by optimizing a loss function. Specifically, XGBoost combines weak learners (individual trees) in a sequential manner, where each tree corrects the errors of its predecessors.
In XGBoost, a regularization term is incorporated to reduce model complexity and prevent overfitting. The objective function of XGBoost is defined as follows [31]:
$L(\phi) = \sum_{i=1}^{N} l(y_i, \hat{y}_i) + \sum_{t=1}^{T} \Omega(f_t)$
where $L(\phi)$ is the objective function, $l(\cdot)$ is the loss function that measures the difference between the actual value $y_i$ and the predicted value $\hat{y}_i$, $N$ is the number of observations, $T$ is the number of estimators, and $\Omega(f_t)$ is the regularization term defined as
$\Omega(f_t) = \gamma T + \frac{1}{2}\lambda \lVert \omega_t \rVert^2$
where $T$ is the number of leaf nodes, $\omega_t$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are regularization coefficients that control the model’s complexity.
The predictive performance and generalization capability of XGBoost depend on the values of its hyperparameters. Therefore, the optimal hyperparameter values must be determined through hyperparameter tuning techniques. A next-generation optimization framework, Optuna, is widely used for this purpose [32]. To prevent overfitting during the tuning process, k-fold cross-validation techniques are commonly employed. This process helps identify the optimal parameter configuration that maximizes the model’s coefficient of determination (R²) while avoiding overfitting.
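A minimal tuning loop of this kind is sketched below, assuming preprocessed arrays X_train and y_train. The search bounds echo those later reported in Table 3, while the trial count, random seeds, and the omission of a few parameters (e.g., max_bin) are illustrative choices rather than the study’s exact configuration.

```python
import optuna
import xgboost as xgb
from sklearn.model_selection import KFold, cross_val_score

def objective(trial):
    # Candidate hyperparameters; bounds follow the search spans in Table 3
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 600),
        "learning_rate": trial.suggest_float("learning_rate", 0.001, 0.5, log=True),
        "max_depth": trial.suggest_int("max_depth", 1, 10),
        "subsample": trial.suggest_float("subsample", 0.05, 1.0),
        "reg_alpha": trial.suggest_float("reg_alpha", 0.01, 1.0),
        "reg_lambda": trial.suggest_float("reg_lambda", 0.01, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "gamma": trial.suggest_float("gamma", 0.0, 10.0),
    }
    model = xgb.XGBRegressor(**params, random_state=42)
    # Mean R^2 over a 10-fold cross-validation of the training data
    scores = cross_val_score(model, X_train, y_train,
                             cv=KFold(n_splits=10, shuffle=True, random_state=42),
                             scoring="r2")
    return scores.mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)

# Refit on the full training set with the best configuration found
best_model = xgb.XGBRegressor(**study.best_params, random_state=42).fit(X_train, y_train)
```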
To analyze the influence of input features on the optimized XGBoost model, partial dependence plots (PDPs) and SHAP were utilized. SHAP, introduced by Lundberg [33], offers a unified framework for interpreting feature importance in the model. SHAP assigns a unique Shapley value ($\phi_j$) to each feature based on its contribution to the model output. For a prediction $f(x)$, SHAP is expressed as [33]:
$f(x) = \phi_0 + \sum_{j=1}^{|N|} \phi_j$
where $\phi_0$ is the base value (the average model output for the dataset) and $\phi_j$ is the SHAP value for feature $j$, defined as
$\phi_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!\,\left(|N| - |S| - 1\right)!}{|N|!}\left[f(S \cup \{j\}) - f(S)\right]$
where $N$ is the set of input features, $S$ is a subset of features excluding feature $j$, $f(S)$ is the model prediction for the subset $S$, $|S|$ is the number of features in subset $S$, and $S \subseteq N \setminus \{j\}$ means that $S$ is a subset of the feature set $N$ excluding feature $j$. The feature with the highest absolute SHAP value is considered the most influential in the model’s decision-making process.
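In practice, these Shapley values can be computed efficiently for tree ensembles. The sketch below assumes a fitted model best_model and a feature matrix X_test, and reproduces the two standard SHAP summary views (per-observation impacts and mean absolute SHAP rankings) used later in Figure 7.

```python
import shap

# Exact SHAP values for tree ensembles such as XGBoost
explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test)      # one value per feature per observation

# Beeswarm plot of per-observation impacts and bar chart of mean |SHAP| values
shap.summary_plot(shap_values, X_test)
shap.summary_plot(shap_values, X_test, plot_type="bar")
```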
PDPs can be used to visualize the interactions between the input features and the target variable in machine learning models [34]. In this study, PDPs were employed to analyze the relationship between input features and the target variable, Vs. By marginalizing over all other features, PDPs enable a clear depiction of the isolated effect of a single feature on Vs. The PDPs can reveal whether the relationship between Vs and input features is linear, monotonic, or non-linear. From the PDP analysis, it is possible to identify key features that contribute significantly to the predictions of Vs.
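A one-way partial dependence analysis of this kind can be generated with scikit-learn as sketched below; the fitted model, the feature matrix (assumed to be a pandas DataFrame), and the selected feature names are assumptions for illustration.

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

# Partial dependence of predicted Vs on selected features, obtained by
# averaging (marginalizing) over the remaining features
features = ["qc_mean", "fs_mean", "depth"]       # illustrative subset of the inputs
PartialDependenceDisplay.from_estimator(best_model, X_train, features)
plt.tight_layout()
plt.show()
```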

3. Methodology

This study focuses on developing XGBoost and Bayesian GLMs for predicting Vs. Figure 1 presents the methodology used for the development and evaluation of these predictive models. The process begins with the preprocessing of SCPT and SCPTu datasets. The preprocessed dataset was used for the development of the XGBoost model, conducting partial dependence analysis to identify impactful features, and fitting the Bayesian GLM. Model evaluations include PDPs and SHAP analysis for the XGBoost model and a comparison of the predictive performance of both models. The following sub-sections provide a detailed description of the research data, performance metrics, and the assessment of the models’ predictive capabilities.

3.1. Training and Testing Dataset

The data used in this study were provided by Graz University of Technology [35] (Graz, Austria). The complete dataset consists of 1339 cone penetration tests (CPT, CPTu, SCPT, and SCPTu) conducted in Austria and Germany by Premstaller Geotechnik ZT GmbH [35] (Hallein, Austria). For this study, we utilized only the SCPT (46 tests in total) and SCPTu (50 tests in total) datasets. In addition to cone penetration data (i.e., cone tip resistance, qc, and sleeve friction, fs), the SCPT and SCPTu can measure Vs at regular intervals (e.g., every 1 m) along the penetration depth. Before training the ML algorithm, the dataset underwent preprocessing. Raw data are rarely error-free or perfectly structured for ML training; they often arrive in messy or incomplete forms, rendering them unsuitable for direct use in ML models. Initially, the dataset was filtered to include only SCPT and SCPTu tests. Unnecessary columns and non-relevant measurements were removed. Rows with zero or negative values for critical parameters were excluded. The differences in sampling intervals between CPT and Vs measurements were reconciled following the procedures described in [36]. This involved computing interval-based statistics (mean and coefficient of variation, cv) for the relevant input features within the depth intervals where Vs data were recorded. The initial input features considered include the mean cone tip resistance (qc_mean) and its coefficient of variation (qc_cv); the mean sleeve friction (fs_mean) and its coefficient of variation (fs_cv); the mean friction ratio (Rf_mean) and its coefficient of variation (Rf_cv); the mean soil behavior type index (Ic_mean); the total overburden stress (σv); and the depth of soil, while Vs served as the target variable. Outliers in Vs were identified and removed using the interquartile range (IQR) method, with a threshold of 2.5 times the IQR above the third quartile. The cleaned and processed dataset was then saved for further modeling and analysis. Table 1 presents a statistical summary of the dataset after preprocessing, while Figure 2 displays a pair plot of the input features and the target variable, Vs. The diagonal of the matrix contains histograms that illustrate the distribution of each individual feature. The off-diagonal scatter plots depict the pairwise relationships between features. Additionally, the upper triangle includes linear correlation coefficients, highlighting the strength and direction of these relationships.
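The two preprocessing steps described above, aggregating CPT readings into the depth intervals where Vs was recorded and removing upper Vs outliers, can be sketched in pandas as follows. The column names, the half-metre aggregation window, and the DataFrame layout are assumptions for illustration; the actual reconciliation followed the procedures in [36].

```python
import pandas as pd

def iqr_upper_filter(df, col="Vs", k=2.5):
    """Drop rows whose value in `col` lies more than k*IQR above the third quartile."""
    q1, q3 = df[col].quantile([0.25, 0.75])
    return df[df[col] <= q3 + k * (q3 - q1)]

def interval_stats(cpt, vs_depths, half_window=0.5):
    """Aggregate CPT readings into the depth intervals where Vs was measured.

    cpt       : DataFrame with columns 'depth', 'qc', and 'fs' (illustrative names)
    vs_depths : iterable of depths (m) at which Vs readings are available
    Returns the mean and coefficient of variation of qc and fs per interval.
    """
    rows = []
    for d in vs_depths:
        window = cpt[(cpt["depth"] >= d - half_window) & (cpt["depth"] < d + half_window)]
        if window.empty:
            continue
        rows.append({
            "depth": d,
            "qc_mean": window["qc"].mean(),
            "qc_cv": window["qc"].std() / window["qc"].mean(),
            "fs_mean": window["fs"].mean(),
            "fs_cv": window["fs"].std() / window["fs"].mean(),
        })
    return pd.DataFrame(rows)
```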
The XGBoost model was trained on 80% of the dataset, while the remaining 20% of the dataset was used to evaluate its performance. To ensure the model’s robustness and minimize the risk of overfitting, hyperparameter optimization was conducted using the Optuna framework. This optimization process was paired with 10-fold cross-validation, where the training dataset was divided into 10 equal subsets. In each trial, the model was trained on nine folds and validated on the remaining fold. The primary objective of the optimization was to maximize the coefficient of determination (R²). The optimized model was used to conduct partial dependence analysis to examine the influence of each feature on Vs. Features with negligible importance were excluded, and the model was retrained using the updated feature set. For the Bayesian GLM, only features exhibiting a linear or nearly linear relationship with Vs were selected. The performance of the final XGBoost model was assessed using multiple statistical metrics. Additionally, the predictive performance of both the Bayesian GLM and XGBoost models was validated on a separate validation dataset.

3.2. Hungarian Seismic Cone Penetration Test (SCPT)

Cone penetration test (CPT) data have been in use by Hungarian geotechnical engineers for quite some time, and a wealth of CPT data is readily available for research purposes. This study used SCPT data (see Figure 3) from Hungary (Paks site) to validate the predictive performance of the Bayesian GLM and XGBoost models. The local soil deposit at the site is predominantly characterized by fluvial sediments from the Danube River. Details about the SCPT procedures and geological characteristics are available elsewhere, and readers are encouraged to refer to [3,36,37,38,39].

3.3. Performance Measurements

The predictive performance of the Bayesian GLM and XGBoost models was evaluated by computing various performance and error metrics (Table 2). For the Bayesian GLM, trace plots, Rhat, mean predictions, and 95% credible intervals were used to assess performance. Additionally, the linear correlation coefficients (r) between the measured Vs values and the predicted values for the validation dataset were computed for both models.
The predictive performance of the XGBoost model was further evaluated using several metrics, including the coefficient of determination (R²), index of agreement (IA), Kling–Gupta efficiency (KGE), mean squared error (MSE), root mean squared relative error (RMSRE), mean absolute error (MAE), mean absolute relative error (MARE), mean square relative error (MSRE), mean bias error (MBE), and maximum absolute relative error (MaxARE) [40,41,42]. These metrics provide a comprehensive quantitative evaluation of the models in predicting Vs. IA, KGE, and R² indicate overall agreement and explanatory power, with ideal values close to unity, while the error metrics quantify prediction errors (lower values indicate higher performance).
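For reference, the sketch below implements these metrics with numpy, following the formulas listed in Table 2. The Kling–Gupta components α (ratio of standard deviations) and β (ratio of means) follow the standard definition, which the table does not spell out.

```python
import numpy as np

def performance_metrics(x, y):
    """Metrics from Table 2, with x = measured Vs and y = predicted Vs."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    e = x - y                                        # prediction errors
    r = np.corrcoef(x, y)[0, 1]
    alpha = y.std() / x.std()                        # ratio of standard deviations
    beta = y.mean() / x.mean()                       # ratio of means
    return {
        "r": r,
        "R2": 1 - np.sum(e**2) / np.sum((x - x.mean())**2),
        "IA": 1 - np.sum(e**2) / np.sum((np.abs(y - x.mean()) + np.abs(x - x.mean()))**2),
        "KGE": 1 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2),
        "MSE": np.mean(e**2),
        "RMSRE": np.sqrt(np.mean((e / x)**2)),
        "MAE": np.mean(np.abs(e)),
        "MARE": np.mean(np.abs(e / x)),
        "MSRE": np.mean((e / x)**2),
        "MBE": np.mean(e),
        "MaxARE": np.max(np.abs(e / x)),
    }
```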

4. Discussion

In the following subsections, the Vs prediction capabilities of the proposed models are evaluated. The Bayesian GLM was implemented in R while the XGBoost model was implemented in Python (version 3.11.7).

4.1. Evaluation of the Models

Table 3 presents the optimized critical hyperparameters of the XGBoost model. These hyperparameters were fine-tuned through Optuna with a 10-fold cross-validation technique to enhance the model’s predictive performance. The Optuna framework iteratively trained and evaluated models using the mean R² score across the cross-validation folds as the objective metric. By focusing on maximizing this metric, Optuna identified the optimal set of hyperparameters (see Table 3) that balanced accuracy and generalization. The results of the hyperparameter importance analysis are illustrated in Figure 4. As can be observed, the relative importance of each parameter is ranked based on its contribution to model performance. Among the hyperparameters, the learning rate (eta) emerged as the most influential factor, underscoring its pivotal role in determining the pace of model updates during training. Following the learning rate, the subsampling ratio and maximum tree depth (max_depth) were found to significantly affect model performance. The max_depth governs the complexity of the decision trees, allowing the model to capture complicated patterns in the dataset. Notably, the regularization term (lambda) exhibited relatively lower importance, indicating that the other hyperparameters already provided sufficient control over model complexity.
The partial dependence analysis was conducted to explore the relationships between key input features and the predicted Vs. The results (see Figure 5) indicate that Vs exhibits positive correlations with σv, qc_mean, fs_mean, depth, and qc_cv. The relationship between Ic_mean and Vs, however, is non-linear, indicating a complex influence on Vs predictions. This non-linear interaction can be effectively captured by the XGBoost model but could not be accommodated by the linear framework of the Bayesian GLM. Consequently, Ic_mean was excluded from the Bayesian GLM. In contrast, fs_cv, Rf_mean, and Rf_cv show negligible influence on Vs, as observed from their flat PDP curves. These parameters were excluded from both the XGBoost and Bayesian GLMs to optimize performance. As a result, the final optimized XGBoost model was trained using σv, qc_mean, fs_mean, depth, qc_cv, and Ic_mean, while the Bayesian GLM was fitted using σv, qc_mean, fs_mean, depth, and qc_cv.
The predictive performance of the optimized XGBoost model is presented in Figure 6. In Figure 6a, the Vs values predicted by the XGBoost model for both the training and testing datasets are plotted against the measured Vs values. The diagonal dashed green line (y = x) represents perfect prediction (i.e., the predicted Vs values match the measured Vs values). A good agreement is observed between the predicted and measured Vs values. The residual (the difference between predicted and measured Vs) plots for the predictions are illustrated in Figure 6b. The residuals scatter randomly around zero (horizontal dashed green line) for both the training and testing datasets, indicating unbiased predictions. On the right side of the scatter plots, histograms of the residuals are presented. These histograms offer a visual summary of the distribution of residuals. For the training dataset, the residuals appear to be symmetrically distributed around zero, with a relatively narrow spread, indicating strong predictive performance and minimal bias. In contrast, for the testing dataset, the residuals exhibit a slightly wider spread, reflecting the model’s reduced performance on unseen data. However, the residuals are still centered around zero, suggesting that the model generalizes reasonably well without significant overfitting. The results demonstrate the robustness of the XGBoost model and emphasize the importance of data preprocessing, feature selection, and hyperparameter optimization in improving model performance.
The predictive performance of the model across various performance metrics (R², IA, KGE) and error metrics is summarized in Table 4 for both the training and testing datasets. For instance, the model achieved an R² of 0.54, an IA of 0.84, and a KGE of 0.65 on the test dataset. For the training dataset, the model achieved an R² of 0.91, an IA of 0.97, and a KGE of 0.82. The difference in performance between the training and testing datasets can be attributed to the complexity and variability inherent in the test data, which were not fully captured during model training. This difference is a common occurrence in ML models, particularly when the training data do not encompass all the potential variations present in the test set. Additionally, while the model was designed to generalize well and hyperparameter optimization techniques were employed to mitigate overfitting, the relatively lower performance on the test dataset reflects the challenges in accurately modeling highly complex or non-linear relationships between the input features and the target variable Vs. Nevertheless, these results are superior to those reported in [20], and the improvements can be attributed to the feature selection and data preprocessing steps.
Figure 7 shows the SHAP analysis results, where the input features are displayed on the y-axis in descending order based on their contribution to the Vs prediction; the most influential feature appears at the top. The x-axis represents SHAP values, which quantify the contribution of each feature to the model’s prediction. A positive SHAP value indicates that the feature increases the predicted Vs, whereas a negative SHAP value indicates a decrease in the predicted Vs. The SHAP values are represented by colored points that indicate the influence of the features on the model (Figure 7a). Each point reflects the impact of a specific feature on an observation from the entire database. Red indicates high feature values, while blue indicates low feature values. Positive SHAP values (red points) increase the Vs prediction, while negative SHAP values (blue points) decrease the Vs prediction. For instance, for cone tip resistance (qc_mean), the red points predominantly on the positive side indicate that higher qc_mean values generally increase Vs. On the other hand, the blue points on the negative side indicate that lower qc_mean values generally decrease Vs. Interestingly, instances of red points on the left or blue points on the right indicate a non-linear relationship between the feature and Vs. The global impact of features on Vs is shown by the mean absolute SHAP values of each feature (Figure 7b); features with longer bars have a larger impact on Vs. It can be observed that the total overburden stress σv is the most influential feature, followed by cone tip resistance and sleeve friction, whereas depth and the soil behavior type index have a lower impact on Vs. It is important to note that less impactful features were removed, and only the remaining features were incorporated into the final model.
The Bayesian GLM was fitted on the complete dataset using the final predictors identified through the partial dependence analysis. It is important to note that the term “predictors” is more appropriate for the Bayesian GLM, while the term “input features” is suited to the ML models; in this study, both terms refer to the same geotechnical parameters. The model was developed using the rstanarm package with four independent Markov chains, each consisting of 10,000 iterations, of which the first 1000 iterations were discarded as the burn-in period. Weakly informative priors (a normal prior (0, 2.5) for the intercept and coefficients, and an exponential prior (λ = 1) for the variance of the error term) were used. The performance of the model was evaluated using the Rhat convergence diagnostic (which assesses the agreement between the between-chain and within-chain estimates of the model parameters), the Monte Carlo standard error, MCSE (the ratio of the standard deviation of the model parameters to the square root of the effective sample size), and trace plots of the parameters.
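The model itself was fitted in R with rstanarm. Purely as an illustration of the same specification, the Python sketch below reproduces it with PyMC and ArviZ; the variable names, the log-transformed target, and the mapping of the 10,000 iterations to 1000 warm-up plus 9000 retained draws per chain are assumptions, not the code used in the study.

```python
import pymc as pm
import arviz as az

# X: (n, 5) matrix of predictors (qc_mean, qc_cv, fs_mean, depth, sigma_v)
# y: observed Vs values (assumed log-transformed, consistent with the lognormal posteriors)
with pm.Model() as vs_glm:
    intercept = pm.Normal("intercept", mu=0.0, sigma=2.5)
    coefs = pm.Normal("coefs", mu=0.0, sigma=2.5, shape=X.shape[1])
    sigma = pm.Exponential("sigma", lam=1.0)
    mu = intercept + pm.math.dot(X, coefs)
    pm.Normal("vs_obs", mu=mu, sigma=sigma, observed=y)

    # Four chains; 1000 warm-up draws discarded, 9000 draws retained per chain
    idata = pm.sample(draws=9000, tune=1000, chains=4, random_seed=42)

# Rhat, effective sample size, and Monte Carlo standard error for each parameter
print(az.summary(idata, var_names=["intercept", "coefs", "sigma"]))
```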
Figure 8 illustrates the convergence of Rhat values across the first 6000 iterations (burn-in period excluded) for the model fitted using four independent Markov chains. As can be observed, the Rhat values remained consistently close to one (0.996–1.016) across iterations for each model parameter, indicating effective convergence of the Markov chains. The trace plots (Figure 9) show the sampling behavior of the Markov chains for each model parameter. The sampling demonstrates stable posterior distributions and reliable inference for all model parameters, as evidenced by well-mixed chains with no apparent trends. Table 5 summarizes the MCMC diagnostics (mean, standard deviation, MCSE, Rhat, and effective sample size n_eff) for the model parameters. The final Rhat values (computed based on all iterations) are nearly one, further confirming the convergence of the MCMC chains. Additionally, high effective sample sizes and minimal MCSE values validate the robustness of the parameter estimates.

4.2. Validation of the Models

The Bayesian GLM was evaluated using validation SCPT data from Hungary to assess its predictive performance. Figure 10 presents the comparison between the measured and predicted Vs values across varying depths. Figure 10a displays the mean predictions (blue diamonds) with 95% credible intervals (2.5–97.5%), along with the measured Vs values (red stars). The close alignment of the predicted mean values with the measured values across all depths demonstrates the model’s accuracy and reliability. The lower plots (Figure 10b,c) provide a detailed analysis of the predictions at a depth of 10 m (selected for illustration purposes only). The histogram (Figure 10c) shows the posterior distribution of the Vs prediction at this depth, and the blue curve represents the fitted probability density curve. The distribution follows a lognormal distribution (attributable to the data transformation applied when fitting the model). The vertical red arrow marks the measured value of Vs, while the blue arrows mark the 2.5% bound (left), the mean prediction (middle), and the 97.5% bound (right) of the credible interval. The 95% credible interval at this depth (142–627 m/s) encompasses the measured value (258 m/s). Additionally, Figure 10b summarizes the predictors used for this prediction. In general, the results demonstrate the robustness of the Bayesian GLM in capturing uncertainty and its capability to provide reliable predictions of Vs across varying depths. The use of credible intervals and posterior distributions further underscores its significance in quantifying prediction reliability.
In Figure 11, the predictive performance of the XGBoost model is compared against the measured Vs values and the Bayesian GLM predictions for the same validation dataset. The analysis involved comparing the outputs of the XGBoost model and the mean predictions of the Bayesian GLM along the depth of the Vs profile. For the Bayesian GLM, the 95% credible interval was also included to capture the uncertainty in the Vs predictions. It can be observed that both the XGBoost model’s predictions and the Bayesian GLM’s mean predictions align well with the measured Vs values across depths. Notably, the XGBoost model’s predictions fall within the 95% credible interval across the entire depth, showcasing its strong predictive performance for Vs. The uncertainty captured by the Bayesian GLM (the credible intervals) is crucial for site response analysis, where accounting for variability in Vs is essential. These intervals can be used to conduct sensitivity studies by varying the Vs profile within the predicted bounds, enabling the estimation of a range of possible seismic responses for the site. This approach minimizes the risk of underestimating hazards and provides a foundation for prioritizing additional site investigations in areas with significant uncertainty. Furthermore, to quantitatively assess the predictive performance of the two models, linear correlation coefficients (r) were computed. Both models demonstrated good agreement with the measured Vs values, with the XGBoost model achieving a relatively higher r value (0.39).

5. Conclusions

This study integrates Bayesian GLM and the XGBoost algorithm, coupled with SHAP and partial dependence analysis, to robustly predict Vs. Six key features were identified as the most influential factors for Vs prediction: cone tip resistance and its coefficient of variation, sleeve friction, soil behavior type index, depth, and overburden stress. These features formed the foundation for the final development of the XGBoost model. For the Bayesian GLM, cone tip resistance and its coefficient of variation, sleeve friction, depth, and overburden stress were identified as key predictor variables to fit the model on the same preprocessed datasets as the XGBoost model.
The XGBoost model demonstrated strong predictive performance, achieving high scores across various performance metrics and maintaining low prediction errors. SHAP analysis enhanced the model’s interpretability by quantifying the contribution of each feature to the Vs predictions. Overburden stress was identified as the most influential factor, followed by cone tip resistance and sleeve friction. The validation results on the SCPT dataset revealed a strong agreement between measured and predicted Vs values for both models, with the Bayesian GLM offering credible intervals representing prediction uncertainty.
The integration of the Bayesian GLM and explainable ML algorithms provides a robust framework for predicting Vs while leveraging the strengths of both approaches. Explainable ML models excel at identifying critical input geotechnical parameters and explaining their contributions to the prediction. This capability not only enhances the interpretation of the underlying factors influencing Vs but also delivers high predictive performance. On the other hand, the Bayesian GLM offers complementary advantages by explicitly capturing the uncertainties associated with geotechnical variability, measurement errors, and other sources of uncertainty. This approach enhances confidence in Vs predictions and their applicability to seismic response assessments.
To further enhance the applicability of this work, developing a user-friendly computational platform for Vs prediction is recommended. Such a platform could integrate both Bayesian GLM and ML approaches, enabling geotechnical engineers to easily interpret predictions and uncertainties. Additionally, site-specific seismic response analyses using Vs profiles obtained from the two approaches should be conducted to further validate their performance in practical engineering scenarios.

Author Contributions

Conceptualization, A.T.C. and R.R.; methodology, A.T.C.; software, A.T.C.; validation, A.T.C.; formal analysis, A.T.C.; writing—original draft preparation, A.T.C.; writing—review and editing, R.R.; visualization, A.T.C.; supervision, R.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Széchenyi István University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data utilized for this study can be downloaded from the following link: https://www.tugraz.at/en/institutes/ibg/research/computational-geotechnics-group/database/ (accessed on 12 May 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

CHT: Cross-hole test
CI: Credible interval
colsample_bytree: Column subsampling ratio per tree
CPT: Cone penetration test
cv: Coefficient of variation
DHT: Downhole test
E: Error term
fs: Sleeve friction
fs_cv: Coefficient of variation of sleeve friction
fs_mean: Mean of sleeve friction
gamma: Minimum loss reduction
GLM: Generalized linear model
IA: Index of agreement
Ic: Soil behavior type index
Ic_mean: Mean of soil behavior type index
IQR: Interquartile range
KGE: Kling–Gupta efficiency
kPa: Kilopascal
$l(\cdot)$: Loss function
$L(\cdot)$: Likelihood function
$M(X)$: Linear predictor function
m/s: Meter per second
MAE: Mean absolute error
MARE: Mean absolute relative error
MASW: Multichannel analysis of surface waves
max_bin: Maximum number of bins
max_depth: Maximum depth of trees
MaxARE: Maximum absolute relative error
MBE: Mean bias error
MCMC: Markov Chain Monte Carlo
MCSE: Monte Carlo standard error
ML: Machine learning
MPa: Megapascal
MSE: Mean squared error
MSRE: Mean square relative error
N: Total number of input features
n_eff: Effective sample size
n_estimators: Number of trees
PDPs: Partial dependence plots
qc: Cone tip resistance
qc_cv: Coefficient of variation of cone tip resistance
qc_mean: Mean of cone tip resistance
r: Linear correlation coefficient
R²: Coefficient of determination
reg_alpha: L1 regularization term on weights
reg_lambda: L2 regularization term on weights
Rf: Friction ratio
Rf_cv: Coefficient of variation of friction ratio
Rf_mean: Mean of friction ratio
Rhat: Potential scale reduction factor
RMSRE: Root mean squared relative error
S: Subset of features
scale_pos_weight: Balancing weight for positive and negative classes
SCPT: Seismic cone penetration test
SCPTu: Seismic cone penetration test with pore pressure measurement
SHAP: Shapley Additive Explanations
STD: Standard deviation
Vs: Shear wave velocity
$\omega_t$: Weight of leaf node
X: Measured shear wave velocity
XGBoost: Extreme Gradient Boosting
Y: Predicted shear wave velocity
σv: Total overburden stress
$\mu_i$: Mean of prior distribution
$\sigma_i^2$: Variance of prior distribution
$\beta_0$: Intercept
$\beta_i$: Coefficients of predictor variables
$L(\phi)$: Objective function
$\Omega(f_t)$: Regularization term
$\gamma, \lambda$: Regularization coefficients
$\phi_0$: Base value
$\phi_j$: SHAP value

References

  1. Bazzurro, P. Ground-Motion Amplification in Nonlinear Soil Sites with Uncertain Properties. Bull. Seismol. Soc. Am. 2004, 94, 2090–2109. [Google Scholar] [CrossRef]
  2. Rathje, E.M.; Kottke, A.R.; Trent, W.L. Influence of Input Motion and Site Property Variabilities on Seismic Site Response Analysis. J. Geotech. Geoenviron. Eng. 2010, 136, 607–619. [Google Scholar] [CrossRef]
  3. Chala, A.; Ray, R. Impact of Randomized Soil Properties and Rock Motion Intensities on Ground Motion. Adv. Civ. Eng. 2024, 2024, 1–12. [Google Scholar] [CrossRef]
  4. Campanella, R.G.; Stewart, W.P. Seismic Cone Analysis Using Digital Signal Processing for Dynamic Site Characterization. Can. Geotech. J. 1992, 29, 477–486. [Google Scholar] [CrossRef]
  5. Hardee, H.C.; Elbring, G.J.; Paulsson, B.N.P. Downhole Seismic Source. Geophysics 1987, 52, 729–739. [Google Scholar] [CrossRef]
  6. Robertson, P.K.; Campanella, R.G.; Gillespie, D.; Rice, A. Seismic CPT to Measure in Situ Shear Wave Velocity. J. Geotech. Eng. 1986, 112, 791–803. [Google Scholar] [CrossRef]
  7. Stokoe, K.H.; Woods, R.D. In Situ Shear Wave Velocity by Cross-Hole Method. J. Soil Mech. Found. Div. 1972, 98, 443–460. [Google Scholar] [CrossRef]
  8. Park, C.B.; Miller, R.D.; Xia, J. Multichannel Analysis of Surface Waves. Geophysics 1999, 64, 800–808. [Google Scholar] [CrossRef]
  9. Meisina, C.; Bonì, R.; Bordoni, M.; Lai, C.G.; Bozzoni, F.; Cosentini, R.M.; Castaldini, D.; Fontana, D.; Lugli, S.; Ghinoi, A.; et al. 3D Engineering Geological Modeling to Investigate a Liquefaction Site: An Example in Alluvial Holocene Sediments in the Po Plain, Italy. Geosciences 2022, 12, 155. [Google Scholar] [CrossRef]
  10. Yang, H.Q.; Chu, J.; Wu, S.; Zhu, X.; Qi, X.; Chiam, K. Advancing Geological Modelling and Geodata Management: A Web-Based System with AI Assessment in Singapore. Georisk 2024. [Google Scholar] [CrossRef]
  11. Robertson, P.K. Interpretation of Cone Penetration Tests—A Unified Approach. Can. Geotech. J. 2009, 46, 1337–1355. [Google Scholar] [CrossRef]
  12. Mayne, P.W.; Rix, G.J. Correlations Between Shear Wave Velocity and Cone Tip Resistance in Natural Clays. Soils Found. 1995, 35, 107–110. [Google Scholar] [CrossRef] [PubMed]
  13. Andrus, R.D.; Mohanan, N.P.; Piratheepan, P.; Ellis, B.S.; Holzer, T.L. Predicting shear-wave velocity from cone penetration resistance. In Proceedings of the 4th International Conference on Earthquake Geotechnical Engineering, Thessaloniki, Greece, 24–29 June 2007. [Google Scholar]
  14. Griffiths, S.C.; Cox, B.R.; Rathje, E.M.; Teague, D.P. Surface-Wave Dispersion Approach for Evaluating Statistical Models That Account for Shear-Wave Velocity Uncertainty. J. Geotech. Geoenviron. Eng. 2016, 142, 04016061. [Google Scholar] [CrossRef]
  15. Matasovic, N.; Hashash, Y. NCHRP Synthesis 428: Practices and Procedures for Site-Specific Evaluations of Earthquake Ground Motions, a Synthesis of Highway Practice; National Academies Press: Washington, DC, USA, 2012. [Google Scholar]
  16. Toro, G.R. Probabilistic Models of Site Velocity Profiles for Generic and Site-Specific Ground-Motion Amplification Studies. Tech. Rep. 1995, 779574. [Google Scholar]
  17. Rauter, S.; Tschuchnigg, F. Cpt Data Interpretation Employing Different Machine Learning Techniques. Geosciences 2021, 11, 265. [Google Scholar] [CrossRef]
  18. Padarian, J.; Minasny, B.; McBratney, A.B. Machine Learning and Soil Sciences: A Review Aided by Machine Learning Tools. SOIL 2020, 6, 35–52. [Google Scholar] [CrossRef]
  19. Chala, A.T.; Ray, R.P. Machine Learning Techniques for Soil Characterization Using Cone Penetration Test Data. Appl. Sci. 2023, 13, 8286. [Google Scholar] [CrossRef]
  20. Felić, H.; Marzouk, I.; Tschuchnigg, F.; Peterstorfer, T. Data-Driven Site Characterization—Focus on Small-Strain Stiffness. In Proceedings of the 7th International Conference on Geotechnical and Geophysical Site Characterization—CIMNE, Barcelona, Spain, 18–21 June 2024. [Google Scholar]
  21. Olayiwola, T.; Tariq, Z.; Abdulraheem, A.; Mahmoud, M. Evolving Strategies for Shear Wave Velocity Estimation: Smart and Ensemble Modeling Approach. Neural Comput. Appl. 2021, 33, 17147–17159. [Google Scholar] [CrossRef]
  22. Taheri, A.; Makarian, E.; Manaman, N.S.; Ju, H.; Kim, T.H.; Geem, Z.W.; Rahimizadeh, K. A Fully-Self-Adaptive Harmony Search GMDH-Type Neural Network Algorithm to Estimate Shear-Wave Velocity in Porous Media. Appl. Sci. 2022, 12, 6339. [Google Scholar] [CrossRef]
  23. Goodrich, B.; Gabry, J.A.I.; Brilleman, S. Rstanarm: Bayesian Applied Regression Modeling via Stan, R Package Version 2.21.4. 2023.
  24. Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis; Chapman and Hall/CRC: Boca Raton, FL, USA, 1995; ISBN 0429258410. [Google Scholar]
  25. Yang, H.Q.; Zhang, L.; Pan, Q.; Phoon, K.K.; Shen, Z. Bayesian Estimation of Spatially Varying Soil Parameters with Spatiotemporal Monitoring Data. Acta Geotech. 2021, 16, 263–278. [Google Scholar] [CrossRef]
  26. Gong, W.; Tien, Y.M.; Juang, C.H.; Martin, J.R.; Luo, Z. Optimization of Site Investigation Program for Improved Statistical Characterization of Geotechnical Property Based on Random Field Theory. Bull. Eng. Geol. Environ. 2017, 76, 1021–1035. [Google Scholar] [CrossRef]
  27. Gelman, A.; Rubin, D.B. Inference from Iterative Simulation Using Multiple Sequences. Stat. Sci. 1992, 7, 457–472. [Google Scholar] [CrossRef]
  28. Wang, X.; Wang, X.S.; Li, N.; Wan, L. Bayesian Inversion of Soil Hydraulic Properties from Simplified Evaporation Experiments: Use of DREAM(ZS) Algorithm. Water 2021, 13, 2614. [Google Scholar] [CrossRef]
  29. Qin, S.; Song, R.; Li, N. Bayesian Model Updating for Bridge Engineering Applications Based on DREAMKZS Algorithm and Kriging Model. Structures 2023, 58, 105565. [Google Scholar] [CrossRef]
  30. Liu, G.; Jiang, W. Model Updating of a Prestressed Concrete Rigid Frame Bridge Using Multiple Markov Chain Monte Carlo Method and Dfferential Evolution. Int. J. Struct. Stab. Dyn. 2022, 22, 2240020. [Google Scholar] [CrossRef]
  31. Chen, T.; Guestrin, C. Xgboost: A Scalable Tree Boosting System. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  32. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019; pp. 2623–2631. [Google Scholar]
  33. Lundberg, S. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874. [Google Scholar]
  34. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
  35. Oberhollenzer, S.; Premstaller, M.; Marte, R.; Tschuchnigg, F.; Erharter, G.H.; Marcher, T. Cone Penetration Test Dataset Premstaller Geotechnik. Data Brief 2021, 34, 106618. [Google Scholar] [CrossRef]
  36. Ray, R.P.; Wolf, A.; Kegyes-Brassai, O. Harmonizing Dynamic Property Measurements of Hungarian Soils. In Proceedings of the 6th International Conference on Geotechnical and Geophysical Site Characterization (ISC2020), Budapest, Hungary, 7–11 September 2020. [Google Scholar]
  37. Kegyes-Brassai, O.; Wolf, Á.; Szilvágyi, Z.; Ray, R.P. Effects of Local Ground Conditions on Site Response Analysis Results in Hungary. In Proceedings of the 19th International Conference on Soil Mechanics and Geotechnical Engineering (19th ICSMGE), Seoul, Republic of Korea, 17–21 September 2017; pp. 2003–2006. [Google Scholar]
  38. Szilvágyi, Z.; Panuska, J.; Kegyes-brassai, O.; Wolf, Á.; Tildy, P.; Ray, R.P. Ground Response Analyses in Budapest Based on Site Investigations and Laboratory Measurements. World Acad. Sci. Eng. Technol. Int. J. Environ. Chem. Ecol. Geol. Geophys. Eng. 2017, 11, 307–317. [Google Scholar]
  39. Wolf, Á.; Ray, R.P. Comparison and Improvement of the Existing Cone Penetration Test Results: Shear Wave Velocity Correlations for Hungarian Soils. Int. J. Geol. Environ. Eng. 2017, 11, 362–371. [Google Scholar]
  40. Kazemi, F.; Asgarkhani, N.; Jankowski, R. Optimization-Based Stacked Machine-Learning Method for Seismic Probability and Risk Assessment of Reinforced Concrete Shear Walls. Expert Syst. Appl. 2024, 255, 124897. [Google Scholar] [CrossRef]
  41. Asgarkhani, N.; Kazemi, F.; Jakubczyk-Gałczyńska, A.; Mohebi, B.; Jankowski, R. Seismic Response and Performance Prediction of Steel Buckling-Restrained Braced Frames Using Machine-Learning Methods. Eng. Appl. Artif. Intell. 2024, 128, 107388. [Google Scholar] [CrossRef]
  42. Wakjira, T.G.; Kutty, A.A.; Alam, M.S. A Novel Framework for Developing Environmentally Sustainable and Cost-Effective Ultra-High-Performance Concrete (UHPC) Using Advanced Machine Learning and Multi-Objective Optimization Techniques. Constr. Build. Mater. 2024, 416, 135114. [Google Scholar] [CrossRef]
Figure 1. Flow diagram illustrating the methodology for developing and evaluating XGBoost and Bayesian GLMs.
Figure 2. Correlation matrix illustrating the relationships between input features.
Figure 3. Seismic cone penetration (SCPT) data used for model validation.
Figure 4. Hyperparameter importance of XGBoost algorithm.
Figure 5. Partial dependence plots.
Figure 6. Predictive performance of the XGBoost model: (a) scatter plots of predicted Vs versus measured Vs for the training and testing datasets; (b) residual scatter and histogram plots for training and testing datasets.
Figure 7. Summary of global feature impacts (a) and importance rankings (b) for input features.
Figure 8. Convergence diagnostic plot showing Rhat values against the number of iterations.
Figure 9. Trace plots depicting the sampling dynamics and convergence behavior of each model parameter in the Bayesian GLM.
Figure 10. Predictive performance of the Bayesian GLM: (a) comparison of predicted Vs against measured Vs, (b) predictor values at 10 m depth and obtained results, and (c) histogram of predicted Vs values at 10 m depth.
Figure 11. Predictive performance of Bayesian GLM and XGBoost models against validation dataset.
Table 1. Statistical summary of input features.
Metrics | qc_mean (MPa) | qc_cv (-) | fs_mean (kPa) | fs_cv (-) | Rf_mean (%) | Ic_mean (-) | Depth (m) | σv (kPa) | Vs (m/s)
Mean | 4.24 | 0.24 | 50.53 | 0.26 | 2.29 | 2.74 | 13.38 | 254 | 237.6
STD | 7.12 | 0.25 | 56.57 | 0.21 | 3.16 | 0.65 | 9.01 | 171 | 91.1
Minimum | 0.02 | 0 | 0.35 | 0 | 0.09 | 0 | 0.5 | 9.5 | 22
Maximum | 74.75 | 2.59 | 820.25 | 1.79 | 112.95 | 4.06 | 49.5 | 941 | 547
Count | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600 | 3600
Table 2. Performance and error metrics used to evaluate Vs predictive capabilities of XGBoost and Bayesian GLM.
Metrics | Formula | Ideal Value | Equation No.
Correlation coefficient | $r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$ | 1 | (13)
Coefficient of determination | $R^2 = 1 - \dfrac{\sum_{i=1}^{n}(x_i - y_i)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ | 1 | (14)
Index of agreement | $IA = 1 - \dfrac{\sum_{i=1}^{n}(x_i - y_i)^2}{\sum_{i=1}^{n}\left(|y_i - \bar{x}| + |x_i - \bar{x}|\right)^2}$ | 1 | (15)
Kling–Gupta efficiency | $KGE = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}$ | 1 | (16)
Mean squared error | $MSE = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^2$ | 0 | (17)
Root mean squared relative error | $RMSRE = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(\dfrac{x_i - y_i}{x_i}\right)^2}$ | 0 | (18)
Mean absolute error | $MAE = \dfrac{1}{n}\sum_{i=1}^{n}|x_i - y_i|$ | 0 | (19)
Mean absolute relative error | $MARE = \dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{x_i - y_i}{x_i}\right|$ | 0 | (20)
Mean square relative error | $MSRE = \dfrac{1}{n}\sum_{i=1}^{n}\left(\dfrac{x_i - y_i}{x_i}\right)^2$ | 0 | (21)
Mean bias error | $MBE = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - y_i)$ | 0 | (22)
Maximum absolute relative error | $MaxARE = \max\left|\dfrac{x_i - y_i}{x_i}\right|$ | 0 | (23)
Let X represent the measured Vs values and Y represent the predicted Vs values. Then, $x_i$ is the $i$th value of $X$, $\bar{x}$ is the mean of $X$, $y_i$ is the $i$th value of $Y$, $\bar{y}$ is the mean of $Y$, $r$ is the linear correlation coefficient, and $n$ is the number of data points. In the KGE, $\alpha$ and $\beta$ denote the ratios of the standard deviations and of the means of the predicted and measured values, respectively.
Table 3. Optimized hyperparameters values of XGBoost model.
Hyperparameters | Search Span | Optimized Value
Number of trees (n_estimators) | 50–600 | 444
Learning rate | 0.001–0.5 | 0.0094
Maximum depth of trees | 1–10 | 10
Subsampling ratio | 0.05–1 | 0.640
L1 regularization term on weights (reg_alpha) | 0.01–1 | 0.208
L2 regularization term on weights (reg_lambda) | 0.01–1 | 0.603
Column subsampling ratio per tree (colsample_bytree) | 0.5–1 | 0.707
Minimum loss reduction required to make a further split (gamma) | 0–10 | 8.79
Maximum number of bins for feature quantization (max_bin) | 128–512 | 499
Balancing weight for positive and negative classes (scale_pos_weight) | 0.1–10 | 6.09
Table 4. Performance of XGBoost model on test and train dataset.
Performance Metrics | Test Dataset | Train Dataset
R² | 0.54 | 0.91
IA | 0.84 | 0.97
KGE | 0.65 | 0.82
MSE | 3792 | 781
RMSRE | 0.39 | 0.22
MAE | 41 | 19.6
MARE | 0.21 | 0.11
MSRE | 0.15 | 0.05
MBE | 1.12 | 0.13
MaxARE | 3.30 | 3.95
Table 5. MCMC diagnostics for the Bayesian GLM.
Parameters | Mean | STD | MCSE (%) | Rhat | n_eff
Intercept | 5.387 | 0.006 | 0.004 | 1.000 | 32,562
qc_mean | 0.079 | 0.010 | 0.007 | 0.999 | 19,341
qc_cv | 0.015 | 0.007 | 0.004 | 1.000 | 28,913
fs_mean | 0.092 | 0.010 | 0.007 | 1.000 | 19,305
σv | 0.092 | 0.786 | 0.65 | 0.999 | 14,770
Depth | 0.109 | 0.787 | 0.65 | 1.000 | 14,771
