2.1. Analysis of Quality Anomaly Identification and Diagnosis in the Complex Product Manufacturing Process
General models for identifying and diagnosing quality anomalies have inherent flaws that limit both their accuracy and their efficiency.
Figure 1 illustrates the general methodology for identifying and diagnosing manufacturing process quality anomalies. The product quality formation process in the figure shows that complex product quality is influenced by numerous factors. Because the quality anomaly patterns learned from historical data are not comprehensive, newly emerging quality anomalies may not be identified in a timely and effective manner. The quality anomaly classification process in the figure shows that quality anomalies are typically divided into several types, but this classification adds complexity to the model. The data dimensionality reduction process, while reducing the dimensionality of the quality features, also discards some of them, which harms identification accuracy. Finally, when diagnosing the causes of quality anomalies, a general solution is typically provided for a class of problems rather than for each sample, so quality problems cannot be handled quickly, which delays product manufacturing.
The manufacturing process for complex products is multifaceted and involves various quality indicators. Meanwhile, the correlations between the different factors affecting product quality in the manufacturing process can produce quality abnormalities that are difficult to identify, and it is challenging to accurately diagnose the primary factors responsible for quality issues. To this end, this paper presents a quality anomaly identification method based on a deep neural network (DNN) and an SVM, and utilizes the SHAP model to diagnose the underlying causes of such anomalies, as illustrated in
Figure 2. Initially, as shown in the deep residual shrinkage network quality feature extraction model in
Figure 2, the feature extraction capability of the deep residual shrinkage network is used to extract the information relevant to anomaly identification from the quality characteristic data, which are characterized by high noise and strong redundancy. Subsequently, quality state classification is performed by the SVM model in
Figure 2. Product quality is categorized into normal and abnormal classes with the help of the SVM’s stable binary classification ability on high-dimensional data. Quality anomaly identification for individual products is then achieved by combining the reliable interpretation capability of the SHAP model with the deep learning algorithm, and a contribution ranking of the quality characteristic data affecting the anomaly is provided to facilitate the identification and diagnosis of manufacturing process quality anomalies. The flow of identification and diagnosis of quality anomalies in the manufacturing process based on DRSN-SVM-SHAP in the figure reflects the operating logic of the model. As the model structure shows, in addition to its inherent advantages of accelerated convergence and enhanced accuracy, the model possesses several intrinsic advantages for addressing the variability inherent to the manufacturing process: (1). Deep feature learning and characterization capacity: the deep architecture of the DRSN automatically acquires multi-level, non-linear feature representations from raw data while accounting for symmetry factors, effectively capturing complex, hidden variability patterns within the manufacturing process; the resulting representation lets the model maintain high anomaly identification accuracy in the face of multi-factor variability. (2). Generalization and adaptability: as a classifier, the SVM generalizes well and identifies anomalous samples reliably, which suits symmetric information feature spaces; meanwhile, SHAP provides local interpretations of model predictions, helps identify the key factors leading to anomalies, and facilitates the model’s adaptation to specific variability. (3). Dynamic feature importance assessment: in addition to a global view of feature contributions, SHAP dynamically assesses the importance of each feature in identifying product quality anomalies under specific manufacturing conditions, and SHAP analyses can reveal how a particular symmetry or symmetry-breaking pattern affects the identification of quality anomalies, allowing the model to respond quickly to changes in the manufacturing process and providing valuable insights for targeted process improvement and anomaly mitigation strategies. Together, these strategies enable the model to cope effectively with the variability inherent to the manufacturing process, to identify and diagnose anomalies accurately and promptly, and to improve the stability of the manufacturing process and product quality.
2.2. Construction of a Neural Network-Based Model for Identifying and Diagnosing Quality Anomalies
The complex product manufacturing process requires quality anomaly analysis to identify problems. Quality-related data exhibit imbalance, asymmetry, high noise, high redundancy, and high coupling, which collectively lead to low accuracy in recognizing anomalies and diagnosing their main causes. This paper presents a neural network-based model for recognizing and diagnosing quality anomalies in complex product manufacturing processes. The processing method for the highly imbalanced and asymmetrical quality data set is outlined first. A residual shrinkage network, capable of learning from high-noise, highly redundant data, is employed to extract effective features from the quality-related data. A support vector machine, which exhibits excellent binary classification performance, takes the extracted quality features as input to identify the current quality state. Finally, the SHAP model diagnoses the causes of quality anomalies by explaining the feature extraction and quality anomaly identification process.
When designing a prediction model for quality assurance in complex manufacturing processes, dataset imbalance and asymmetry must be addressed, as they are a core reason for poor accuracy in recognizing quality anomalies; proper processing methods are therefore needed. The common data processing techniques for unbalanced datasets include oversampling, undersampling, and mixed sampling. Among these techniques, mixed sampling combines oversampling to expand the data set with undersampling to clean the expanded data set. This avoids both the overfitting caused by repeatedly duplicating samples in oversampling and the loss of core data caused by undersampling.
The “SMOTE + ENN” mixed sampling technique first creates new minority-class samples according to Equation (1): the Euclidean distance is used to measure the “distance” between samples, and new samples are synthesized from randomly selected samples among the M nearest neighbors of each minority sample. The Edited Nearest Neighbors (ENN) approach is then applied to predict the samples in the expanded dataset, and samples whose predictions are incorrect are eliminated after the final sampling to generate the new dataset.
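For illustration, the following is a minimal sketch of the “SMOTE + ENN” step using the imbalanced-learn library; the feature matrix and labels are synthetic stand-ins for the real quality data.

```python
# Minimal "SMOTE + ENN" mixed-sampling sketch (synthetic stand-in data).
import numpy as np
from imblearn.combine import SMOTEENN

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))          # placeholder quality features
y = np.r_[np.ones(950), -np.ones(50)]    # highly imbalanced labels

# SMOTE synthesizes minority samples from nearest neighbors; ENN then
# removes samples whose class disagrees with their neighbors.
X_res, y_res = SMOTEENN(random_state=42).fit_resample(X, y)
print(X_res.shape, (y_res == -1).sum())  # minority class is enlarged
```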
2.2.1. Quality Feature Extraction Based on the Deep Residual Shrinkage Network
The manufacturing process for complex products is intricate and involves many quality features. Conventionally, the collected raw quality data undergo principal component analysis, random forest selection, correlation filtering, or other techniques to reduce the number of parameters. However, some attributes are lost in this dimension reduction, leading to inaccurate recognition results. In this paper, deep residual shrinkage feature extraction is adopted instead, exploiting the convolutional neural network’s ability to learn correlations in the data and the ability of the residual modules in the deep residual shrinkage network to handle high noise and redundant information, thereby achieving efficient feature dimension reduction. The DRSN used in this model is architecturally symmetric, making the feature extraction process more balanced and stable.
- 1.
Convolutional Neural Network
The Convolutional Neural Network (CNN) comprises an input layer, a hidden layer, and an output layer. Through convolution and pooling operations, the hidden layer autonomously learns and extracts features from extensive data.
Convolutional layer (Conv). To extract features from the original data, a convolution operation is used instead of the full connections between layers in a conventional artificial neural network. Local connections and weight sharing reduce computation while retaining upper-layer information so that lower layers can extract features fully. The operation is expressed in Formula (2):
$y_{j}^{l} = \sum_{i=1}^{k} w_{j,i}^{l} \, x_{i}^{l} + b_{j}^{l} \quad (2)$
Here, $w_{j,i}^{l}$ denotes the $i$th weight in the $j$th convolution kernel of the $l$th layer; $x_{i}^{l}$ denotes the value at the corresponding weight position in the $i$th convolved region of the $l$th layer; $b_{j}^{l}$ is the bias; and $k$ is the size of the convolution kernel.
Pooling layer (Pooling). It appears in pairs with the convolutional layer and compresses the features after convolution to achieve data dimensionality reduction. Depending on the calculation method, pooling is divided into random pooling, maximum pooling, and so on. Random pooling, for example, computes a probability value for the elements in the current window and randomly retains features according to that probability, which yields stronger generalization ability.
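As a concrete illustration, the sketch below shows the convolution + pooling pattern in PyTorch on one-dimensional quality-characteristic data; all shapes and layer sizes are assumptions for demonstration only.

```python
# Convolution + pooling sketch on 1-D quality data (shapes are assumed).
import torch
import torch.nn as nn

x = torch.randn(8, 1, 448)  # 8 samples, 1 channel, 448 quality features

conv = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3, padding=1)
pool = nn.MaxPool1d(kernel_size=2)  # compresses features after convolution

h = pool(torch.relu(conv(x)))
print(h.shape)  # torch.Size([8, 4, 224]); pooling halves the width
```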
- 2.
Deep residual networks
Deeper convolutional neural network layers provide more comprehensive feature extraction but may also suffer from overfitting and gradient explosion. The Residual Network (ResNet) [
18] increases the depth of the convolutional neural network by adding several batch normalization layers along with activation functions. The residual module consists of two Conv paths with an identity link, as depicted in
Figure 3. The identity path allows the gradient of the loss function to propagate more easily through the deep residual network, leading to small parameter updates, faster training, and mitigation of network degradation.
Here, the cuboid represents the feature map with C channels, W width, and 1 height, and K represents the number of convolution kernels in the convolution layer (K = C means that the number of convolution kernels is the same as the number of channels of the input feature). ① indicates that the size of the input feature map is equal to the size of the output feature map. ② means that the width of the output feature map is halved. ③ means that the width of the output feature map is halved and the number of channels is doubled.
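A minimal PyTorch sketch of the residual module described above (two convolutional layers plus an identity link) follows; channel counts and kernel sizes are illustrative assumptions, not the paper’s exact configuration.

```python
# Residual module sketch: BN -> ReLU -> Conv twice, plus identity path.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
        )

    def forward(self, x):
        # Identity path: gradients flow directly, mitigating degradation.
        return x + self.body(x)

block = ResidualBlock(channels=4)
print(block(torch.randn(8, 4, 224)).shape)  # shape preserved (case ①)
```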
Batch normalization (BN) is utilized to standardize the data and diminish the impact of variations in the data feature distribution on the model. Equations (3)–(6) reflect the batch calculation process:
$\mu = \frac{1}{m}\sum_{i=1}^{m} x_i \quad (3)$
$\sigma^{2} = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu)^{2} \quad (4)$
$\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^{2} + \epsilon}} \quad (5)$
$y_i = \gamma \hat{x}_i + \beta \quad (6)$
Here, $x_i$ and $y_i$ denote the input and output features of the $i$th sample; $\gamma$ and $\beta$ denote the two trainable parameters that scale and translate the distribution; and $\epsilon$ is a positive number close to zero.
Activation function (AF) is employed to transform the data non-linearly and prevent the vanishing gradient problem. Among the many activation functions, the Rectified Linear Unit (ReLU) is widely used in CNNs because, compared with the sigmoid and tanh functions, it both speeds up learning and alleviates gradient vanishing [
19]. The ReLU activation function, $y = \max(0, x)$, has a derivative of either 1 or 0, so the value range remains roughly unchanged when transferred between layers, which improves training speed. Here, $x$ and $y$ represent the function input and output characteristics, respectively.
- 3.
Deep Residual Shrinkage Network (DRSN)
A deep residual shrinkage network (DRSN) builds upon the deep residual network by integrating an attention mechanism and a soft threshold function [
20]. The attention mechanism locates the non-core features, the soft threshold function deactivates these unimportant features, and the useful features are conserved, so the module effectively extracts useful features from noisy signals. The DRSN consists of the input layer, a convolutional layer, stacked residual shrinkage building units (RSBU), batch normalization (BN), activation functions, global average pooling (GAP), and a fully connected layer [
21], as depicted in
Figure 4. The DRSN offers several key advantages for feature extraction in this model: (1). Residual learning: the DRSN contains residual connections that let the network learn additive residuals instead of mapping inputs directly to outputs. This alleviates the gradient vanishing problem in deep networks and allows deeper architectures to be trained efficiently; deeper networks capture more complex and abstract features, improving the discriminative power of the extracted features. (2). Efficient information propagation: residual connections bypass the nonlinear activation functions, letting gradients flow directly through the network, while shrinkage regularization suppresses noise and irrelevant features, accelerating training convergence, reducing overfitting, and ultimately improving model performance. (3). Reduced feature dimension: soft thresholding reduces the dimensionality of the feature space while retaining the important information. Together, the residual connections and shrinkage operations enable the model to generalize better to new data.
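The following sketch illustrates one plausible residual shrinkage building unit in PyTorch, combining the attention branch and soft thresholding described above; it follows the channel-wise thresholding idea of the DRSN, but the layer sizes and details are assumptions.

```python
# RSBU sketch: an attention sub-network estimates a per-channel threshold,
# and soft thresholding zeroes out small (noise-like) responses.
import torch
import torch.nn as nn

class RSBU(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=1),
        )
        # Attention branch learning a per-channel threshold scale in (0, 1).
        self.fc = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        r = self.conv(x)
        abs_mean = r.abs().mean(dim=2)                     # GAP of |features|
        tau = (abs_mean * self.fc(abs_mean)).unsqueeze(2)  # thresholds
        # Soft thresholding: shrink toward zero, deactivating weak features.
        r = torch.sign(r) * torch.clamp(r.abs() - tau, min=0.0)
        return x + r

print(RSBU(4)(torch.randn(8, 4, 224)).shape)
```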
2.2.2. Quality Anomaly Recognition Based on DRSN-SVM
The identification of quality anomalies in complex product manufacturing processes is primarily based on the current values of the product quality characteristic indexes to determine whether the product meets quality standards. The deep residual shrinkage network (DRSN) removes noise and redundancy from the original quality data, reducing the data dimensionality. The support vector machine (SVM) then identifies abnormal quality states from the reduced quality data. The combination of the deep residual shrinkage network and the SVM can address the high noise and high redundancy of product quality characteristic data, leading to good recognition results.
The support vector machine is a linear classifier in supervised learning [
22]. Its basic principle is to find a hyperplane that satisfies the classification requirements while keeping the training points as far from the hyperplane as possible, maximizing the margin on both sides. Let the training set be $D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ and $y_i \in \{-1, +1\}$ are the feature vectors and associated labels, respectively. A separating hyperplane can be written as $w^{T}x + b = 0$, where $w$ is the normal vector of the hyperplane and $b$ represents its offset.
The goal is to find the margin-maximizing hyperplane with the greatest distance to the nearest training points, such that all sample points satisfy
$y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0,$
where $\xi_i$ is the slack variable that relaxes the $i$th constraint. The optimization problem corresponding to the SVM model can then be formulated as
$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{n} \xi_i,$
where $C$ is a penalty factor applied when a constraint is violated, preventing overfitting. Introducing the Lagrange multipliers $\alpha_i$ converts this into the corresponding dual problem:
$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j, \quad \text{s.t.} \; \sum_{i=1}^{n} \alpha_i y_i = 0, \; 0 \le \alpha_i \le C.$
In the solution $\alpha^{*}$, each $\alpha_i^{*}$ is either 0 (an ordinary training sample) or positive (a support vector), and the classification hyperplane is determined by a small number of support vectors. The optimal classification function obtained is
$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i x_i^{T} x + b^{*} \right),$
where $\mathrm{sgn}(\cdot)$ is the sign function.
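A minimal scikit-learn sketch of the SVM classification stage follows; `X_feat` stands in for the DRSN-extracted features and is replaced here by synthetic data so the snippet runs on its own.

```python
# SVM classification sketch on stand-in features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_feat = rng.normal(size=(200, 32))          # stand-in for DRSN features
y = np.where(X_feat[:, 0] + X_feat[:, 1] > 0, 1, -1)

# RBF-kernel SVM; the paper's grid search later finds C = 1, gamma = 0.0003.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X_feat, y)
print(clf.predict(X_feat[:5]), clf.support_.size)  # few support vectors
```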
2.2.3. Quality Abnormal Cause Diagnosis Based on SHAP
The DRSN-SVM quality anomaly identification model can determine whether a product is qualified or not. When quality is abnormal, identifying the root cause promptly is crucial. Owing to the complexity of the production process for complex products and the high dimensionality and numerous characteristics of quality data, determining the true cause of an anomaly by empirical diagnosis alone is challenging. The SHAP model is an interpretable tool for understanding machine learning algorithms; it works by assessing the marginal contribution of each feature value to the prediction or classification process [
23]. SHAP helps identify the root cause of anomalies in the following ways: (1). Feature importance: SHAP values indicate the relative importance of each feature in determining the model predictions, which helps determine which features have the greatest impact when an anomaly occurs. (2). Interpretation of individual predictions: SHAP makes it possible to understand how combinations of features contribute to a particular prediction, thus identifying outlier combinations or extreme values that may be associated with an anomaly. (3). Feature interactions: SHAP can reveal how features interact and contribute to the model output, helping to identify complex relationships that may be contributing to anomalies. In this paper, we employ the SHAP model to visually demonstrate the quality anomaly recognition process. By calculating the contribution of each feature of each sample to the classification result, we can identify the fundamental characteristics responsible for product quality anomalies.
- (1)
Shapley value
A game, also called a countermeasure, refers to behavior with a competitive or adversarial nature. For the players in a cooperative game, how to fairly distribute the benefits of successful cooperation is a problem that needs to be studied. In 1953, Lloyd Shapley proposed the Shapley value distribution method, which measures the benefit each player should receive from the cooperation. Each player $i$ in the set of players $N$ receives
$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left[ v(S \cup \{i\}) - v(S) \right].$
Here, $S$ ranges over the subsets of $N$ that do not contain player $i$; $|S|$ is the number of elements of the set $S$; $v$ is the value function of a coalition; and $\frac{|S|!(|N|-|S|-1)!}{|N|!}$ is the weighting factor.
- (2)
SHAP (Shapley additive explanation) value
The SHAP value relies on the Shapley value distribution approach from game theory to compute each feature’s marginal contribution to a sample’s prediction in the machine learning model, averaged over all orderings in which the feature can be added to the model. The SHAP value of a feature thus represents its contribution to the prediction. This makes it possible to analyze how diverse features drive the output of a black-box machine learning model. Typical evaluations of variable significance, for instance Pearson correlations, can only establish a connection for the complete dataset and not for each sample. SHAP resolves the prediction black-box issue for each sample by providing local interpretability, as depicted in
Figure 5.
Taking a linear prediction model as an example, it is easy to calculate the impact of a single feature on the model:
$\hat{f}(x) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$
Here, $x$ is the sample instance, and each $x_j$ is a feature value of the sample $x$, with $j \in \{1, \dots, p\}$; $\beta_j$ is the weight corresponding to feature $j$. The marginal contribution $\phi_j$ of the $j$th feature to the prediction $\hat{f}(x)$ is
$\phi_j(\hat{f}) = \beta_j x_j - \beta_j E(X_j),$
where $\beta_j E(X_j)$ is the average estimated effect of feature $j$; the contribution is the difference between the feature effect and the mean effect. The total contribution of all features of sample $x$ is obtained by subtracting the average prediction from the prediction for sample $x$:
$\sum_{j=1}^{p} \phi_j(\hat{f}) = \hat{f}(x) - E(\hat{f}(X)).$
The interpretation model can represent the prediction process of the entire sample as well as calculate the contribution of each characteristic value in the examples. Consequently, it enables the identification of the fundamental features that cause abnormal results in the prediction.
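As a sketch of this workflow, the snippet below computes per-sample SHAP values with the shap library and ranks features by mean absolute contribution; the logistic regression model and synthetic data are stand-ins for the trained DRSN-SVM pipeline.

```python
# SHAP value sketch: explain a stand-in model and rank feature importance.
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 3] + 0.5 * X[:, 7] > 0).astype(int)

model = LogisticRegression().fit(X, y)
explainer = shap.Explainer(model, X)   # dispatches to a linear explainer
sv = explainer(X)                      # per-sample SHAP values

# Global importance: mean |SHAP| per feature; top anomaly drivers first.
print(np.argsort(np.abs(sv.values).mean(axis=0))[::-1][:5])
```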
2.2.4. Establishment of a Quality Anomaly Recognition and Diagnosis Model Based on DRSN-SVM-SHAP
As previously stated, quality data from the complex product manufacturing process are collected together with the corresponding results. The sample matrix is divided into training and test sets in proportion. The DRSN processes the training set for feature extraction and generates the input for the SVM classifier, which produces the quality anomaly classification results. SHAP then performs the quality anomaly diagnosis, yielding the final DRSN-SVM-SHAP model for identifying and diagnosing quality anomalies in complex product manufacturing.
The process of abnormal identification and diagnosis of complex product manufacturing quality based on DRSN-SVM-SHAP is as follows:
Step 1: Collect the relevant processing factor data during processing.
Step 2: Data cleaning.
Step 3: Apply “SMOTE + ENN” mixed sampling to augment the minority-class data.
Step 4: Apply max–min normalization to the data.
Step 5: Train the DRSN with the labeled quality anomaly data to obtain the DRSN initialization parameters.
Step 6: Transfer the parameters before the Flatten layer of the DRSN model to the DRSN part of the DRSN-SVM model, and use the grid search algorithm to optimize the hyperparameters.
Step 7: Repeat Steps 4–6 to generate the DRSN-SVM model.
Step 8: Feed the quality characteristic data into the model to identify abnormal quality.
Step 9: Input the abnormal quality data into the SHAP model for quality diagnosis.
Step 10: Output quality diagnosis results.
The abnormal identification of complex product manufacturing quality and the diagnosis process based on DRSN-SVM-SHAP are shown in
Figure 6.
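A condensed, runnable sketch of Steps 3–8 is given below; PCA stands in for the DRSN feature extractor and synthetic data replaces the real process data, so this mirrors the pipeline’s shape rather than the paper’s exact implementation.

```python
# Condensed pipeline sketch (PCA as a stand-in feature extractor).
import numpy as np
from imblearn.combine import SMOTEENN
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 50))
y = np.where(X[:, 0] > 1.2, -1, 1)                     # imbalanced labels

X_res, y_res = SMOTEENN(random_state=0).fit_resample(X, y)   # Step 3
pipe = Pipeline([
    ("scale", MinMaxScaler()),           # Step 4: max-min normalization
    ("extract", PCA(n_components=10)),   # stand-in for the DRSN (Steps 5/6)
    ("svm", SVC()),                      # classification stage
])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10]}, cv=5)    # Step 6
grid.fit(X_res, y_res)                                       # Step 7
print(grid.best_params_, grid.score(X_res, y_res))           # Step 8
```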
2.2.5. Model Evaluation
In this paper, the confusion matrix and ROC curve, commonly used evaluation indexes of classification models, were used to evaluate the model [
24].
- (1)
Confusion Matrix
The evaluation of a model is accomplished using the confusion matrix, which tabulates correct and incorrect classification results. In a binary classification model, positive samples predicted as positive are True Positives (TP), positive samples predicted as negative are False Negatives (FN), negative samples predicted as negative are True Negatives (TN), and negative samples predicted as positive are False Positives (FP).
Table 1 displays the confusion matrix. The matrix’s diagonal denotes the number of correctly classified samples; higher values indicate better model performance. A type I error occurs when the model predicts the positive class while the true class is negative; this error frequently results in overdiagnosis or overtreatment. A type II error occurs when the model predicts the negative class while the true class is positive; this error frequently results in underdiagnosis or failure to take necessary action. Other evaluation indexes derived from the confusion matrix are the classification Accuracy, Recall, Precision, and F-measure; values closer to 1 correspond to more accurate classification results. Equations (17)–(19) reflect the Accuracy, Recall, Precision, and F-measure calculation processes:
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{Recall} = \frac{TP}{TP + FN},$
$\mathrm{Precision} = \frac{TP}{TP + FP}, \quad F = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$
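These indexes can be computed directly from the confusion matrix, as the following scikit-learn sketch shows (the labels are toy values for illustration).

```python
# Computing the confusion-matrix-derived indexes with scikit-learn.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 1, 1, -1, -1, 1, -1, 1]
y_pred = [1, 1, -1, -1, 1, 1, -1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("F-measure:", f1_score(y_true, y_pred))
```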
- (2)
Receiver Operating Characteristic Curve (ROC Curve)
In the ROC curve, the horizontal axis represents the false positive rate (the proportion of negative samples incorrectly classified as positive), and the vertical axis represents the recall (true positive rate), a derivative evaluation index of the confusion matrix. The closer the curve is to the upper left corner, the better the classification performance of the model.
- (3)
Matthews Correlation Coefficient (MCC)
MCC is mainly used to measure binary classification problems. It takes TP, TN, FP, and FN into account, making it a more balanced index that remains usable when the samples are imbalanced and asymmetric. The value of MCC lies in the range [−1, 1]: a value of 1 means the prediction agrees perfectly with the actual result, a value of 0 means the prediction is no better than random, and a value of −1 means the prediction completely disagrees with the actual result. Thus,
MCC essentially describes the correlation coefficient between the predicted and actual results.
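For reference, the standard MCC definition, which matches the behavior described above (the formula itself was not reproduced in the text), is:
$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$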
2.3. Case Analysis
A data set of manufacturing processes for a complex semiconductor part consists of 1567 instances (104 of which are unqualified). Each instance represents a product entity with 591 features and 1 label (label 1 for qualified and label −1 for unqualified).
If the data is simply divided into a training set and a validation set according to a certain percentage, the resulting validation set will contain a very small number of data points and will exhibit significant fluctuations in the validation scores. In such cases, it is preferable to use k-fold cross-validation, which randomly divides the data set into
k partitions of equal size. For each partition
i, the model is trained on the remaining
k − 1 partitions, and then the model is evaluated on partition
i. The final validation score is equal to the average of the
k scores. The findings indicate that using
k = 10 yields higher performance and less biased estimates [
25]. In this paper, the value of
k is set to 10, which means that 90% of the data is used for training purposes, while the remaining 10% is employed as a test set for model testing.
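A minimal sketch of this 10-fold procedure with scikit-learn follows; the stratified split (an added assumption) keeps the qualified/unqualified ratio similar across folds.

```python
# 10-fold cross-validation sketch on synthetic, imbalanced data.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = np.where(X[:, 0] > 1.0, -1, 1)

scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True,
                                           random_state=0).split(X, y):
    model = SVC().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))
print(np.mean(scores))  # final score = average of the k fold scores
```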
2.3.1. Data Preprocessing
Firstly, features with outliers, such as null and missing values, are identified and sorted out. Data analysis revealed 12 columns with a single unique value, 33 columns with more than 40% null values, and 98 duplicated columns; 448 feature columns remained after their deletion. A non-null data set was obtained by imputing missing values with the mean of each feature, comprising 1463 positive examples (qualified) and 104 negative examples (unqualified). After “SMOTE + ENN” sampling, the data set has a total of 2184 examples, including 1442 positive examples and 742 negative examples. Part of the preprocessed data is shown in
Table 2.
2.3.2. Model Structure and Parameter Settings
The DRSN-SVM anomaly recognition model for complex product manufacturing quality is formed by stacking multiple deep residual shrinkage modules, with a grid search algorithm used for parameter optimization. This paper designs a deep residual shrinkage network with 12 residual shrinkage building units for feature extraction; the model structure and parameters involved are shown in
Table 3. The network structure parameters include the number of convolution kernels, their width, and the stride (“/2” denotes a stride of 2). The important SVM hyperparameters are the penalty factor C and the kernel coefficient γ: the optimized C of 1 indicates a low tolerance for errors, and the optimized γ of 0.0003 indicates that the model has strong generalization ability.
2.3.3. Result Analysis
The training set data is brought into the deep residual shrinkage network for feature extraction, and the extracted features are input into the support vector machine model for classification.
Figure 7 shows the learning curve of the manufacturing quality anomaly recognition model of complex products based on the DRSN-SVM model. The learning curve for quality anomaly identification using the naive Bayes classifier and the SVM classifier is shown in
Figure 8. In
Figure 7 and
Figure 8, the red and green dots and lines indicate the training score and cross-validation score, respectively, representing the model’s performance on the training and test sets. The light red and light green regions indicate the range of variation of these scores for different numbers of training instances; as the regions narrow, the model’s performance becomes more stable.
As can be seen from
Figure 8 and
Table 4, the accuracy of the naive Bayes classifier decreases first and then increases with the increase in sample size, and finally, the accuracy is 0.84 when the sample size reaches 1600. The accuracy of the support vector machine classifier is 0.98 when the number of samples is 1600. However, the accuracy of the DRSN-SVM model is 1 when the sample size is 1170. Obviously, the quality anomaly identification method based on DRSN-SVM converges earlier and has higher accuracy. On this basis, the SHAP model is used to diagnose quality anomalies.
- (1)
Analysis of the data sets
To combine feature importance with its effect on the recognition results, the SHAP values of all features for every sample in the dataset are drawn as a scatterplot, as shown in
Figure 9. The Y-axis lists the feature names of the dataset, sorted by feature importance. The X-axis shows the SHAP value of each feature, i.e., the impact of the feature on the prediction. The feature values are color-coded: red represents values positively related to the recognition result, and blue represents values negatively related to it. For each sample, if the combined result is greater than 0.5, the product is judged qualified; otherwise, it is judged unqualified. The jitter of a feature in the Y-axis direction represents the distribution of that feature’s SHAP values over the entire dataset; the wider the distribution, the more influential the feature. Owing to the large number of features, only the top 20 features affecting the recognition results are displayed.
Figure 9 demonstrates that feature 19 has the greatest influence on the recognition results, followed closely by feature 3 and feature 8, whereas feature 4 and feature 9 have a negligible influence on the model output. Nonetheless, because the plot shows the distribution of SHAP values, the image alone cannot convey the magnitude of each feature’s overall impact on the model. The color indicates the direction of each feature’s influence on the recognition result, with features ranked by importance.
Figure 10 displays a bar chart that clearly demonstrates the average impact of each feature on the model’s output. This enables us to assess the significance of each feature and make more informed decisions.
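The beeswarm and bar summaries of Figures 9 and 10 correspond to the following shap library calls; the model and data here are toy stand-ins for the trained pipeline.

```python
# Summary plots sketch: beeswarm (Figure 9) and mean |SHAP| bars (Figure 10).
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 25))
y = (X[:, 19] + 0.6 * X[:, 3] + 0.4 * X[:, 8] > 0).astype(int)

sv = shap.Explainer(LogisticRegression().fit(X, y), X)(X)
shap.summary_plot(sv.values, X, max_display=20)                   # beeswarm
shap.summary_plot(sv.values, X, plot_type="bar", max_display=20)  # bar chart
```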
- (2)
Sample analysis
By analyzing the input of a single sample using SHAP, we can determine the contribution value of each characteristic towards the recognition result. The outcomes are presented in
Figure 11, where red highlights the features that lead to a positive recognition result and blue indicates those that lead to a negative result. The values of the feature are represented by the numbers below the bar. The wider the color area of the bar chart, the greater the feature’s influence. If the f(x) value is less than 0.5, the instance’s identification result is unqualified, and vice versa. The single-sample SHAP analysis of samples 257 and 1290 was conducted as described below.
As depicted in
Figure 11, sample 257’s model output value is 0.11, resulting in an unqualified classification. The main contributor to identifying this sample as an anomaly was feature 3, followed by feature 19 and feature 5. The result can therefore be diagnosed: the abnormality of this sample is linked to feature 3, feature 19, and feature 5.
As illustrated in
Figure 12, sample 1290’s model output value is 0.59, leading to a qualified classification. During the qualification identification process, feature 19 is the most influential, with feature 14 and feature 18 as secondary contributors.
Figure 12 cannot present the SHAP value of every feature. By sorting the contribution values of the features and plotting them along the X-axis, we can generate a histogram (
Figure 13), which provides a clear visual representation of the SHAP value for every feature in sample 1290.
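Single-sample views like Figures 11–13 can be produced with the shap plotting API, as sketched below on toy stand-in data.

```python
# Single-sample SHAP views: waterfall and force plots for one instance.
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 25))
y = (X[:, 19] + 0.6 * X[:, 3] > 0).astype(int)

sv = shap.Explainer(LogisticRegression().fit(X, y), X)(X)
i = 42                                    # index of the sample to inspect
shap.plots.waterfall(sv[i])               # sorted per-feature contributions
shap.plots.force(sv[i], matplotlib=True)  # push toward/away from base value
```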
2.3.4. Model Evaluation
The DRSN-SVM quality anomaly recognition model was evaluated using the confusion matrix and ROC curve. The test set data was brought into the model for testing, and the resulting confusion matrix is displayed in
Figure 14. It is evident that 141 out of 143 positive cases were correctly classified, along with 75 out of 76 negative cases, resulting in TP = 141, TN = 75, FN = 2, and FP = 1. Further calculations yield Accuracy = 0.99, Recall = 0.986, Precision = 0.993, and F-measure = 0.979, with MCC = 0.97, indicating satisfactory classification performance of the model.
The ROC curve of the DRSN-SVM quality anomaly recognition model is depicted in
Figure 15. With an AUC of 0.99, the curve indicates that the model performs well in classification.