In this section, the simulation results of the SDKABL model are evaluated and discussed. At the base level, three ML classifiers, namely SVM, DT, and KNN, are implemented, and ABiLSTM is deployed at the meta-layer of the proposed model. To obtain accurate and efficient results, we use PCA to perform feature fusion.
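For orientation, the two-level pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the attention mechanism of ABiLSTM is omitted for brevity, all hyperparameters are placeholders, and synthetic data stands in for the real datasets.

```python
# Minimal sketch of an SDKABL-style stacking pipeline (attention omitted).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Bidirectional, LSTM, Dense

# Synthetic stand-in for the preprocessed training data.
X_train, y_train = make_classification(n_samples=300, n_features=13,
                                       random_state=42)

# Level 0: out-of-fold positive-class probabilities from each base classifier.
base_models = [SVC(probability=True),
               DecisionTreeClassifier(max_depth=5),
               KNeighborsClassifier(n_neighbors=7)]
meta_X = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models])

# Level 1: treat the three base outputs as a short sequence for the BiLSTM.
meta_X = meta_X.reshape(-1, len(base_models), 1)
meta_model = Sequential([
    Input(shape=(len(base_models), 1)),
    Bidirectional(LSTM(16)),
    Dense(1, activation="sigmoid"),
])
meta_model.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])
meta_model.fit(meta_X, y_train, epochs=50, batch_size=32, verbose=0)
```

The essential design choice is that the meta-learner is trained on out-of-fold base predictions, so it never sees a base model's predictions on that model's own training folds.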
4.1. Dataset Description
For this study, we used the Cleveland dataset, the Framingham dataset, and the Z-Alizadeh Sani dataset. The Cleveland and Z-Alizadeh Sani datasets were obtained from the UCI (University of California, Irvine) machine learning repository [33]. The Cleveland dataset contains 303 instances with 14 attributes (13 predictors; 1 class), such as exang, oldpeak, and slope (Table 3). The Z-Alizadeh Sani dataset contains 303 instances with 56 attributes (55 predictors; 1 class) [34]. The Framingham dataset was obtained from the Kaggle website [35] and consists of 4240 instances with 16 attributes (15 predictors; 1 class) (Table 4).
The Framingham dataset included missing values in the attributes education, cigsPerDay, BPMeds, totChol, BMI, heartRate, and glucose. There is no direct relationship between the education attribute and heart disease prediction, so its blank values are filled with zeros. The cigsPerDay attribute is the number of cigarettes smoked per day; its blanks are filled with the mean of the remaining data, 9. For the BPMeds attribute, the number of 0-valued entries is 4063 and the number of 1-valued entries is 124; since 0 values greatly outnumber 1 values, the blanks are filled with 0. The BMI attribute contains 19 blank entries, which are filled with 25.8, the mean of the remaining 4221 entries. The heartRate attribute has only one blank entry, filled with 76, the mean of the remaining 4239 entries. The glucose attribute has 388 blank entries, filled with the mean of the remaining 3852 entries. It is important to note that the unit of the age attribute is days, which can be converted to years. Since the Z-Alizadeh Sani dataset contains a large number of continuous variables, it is necessary to normalize some of its features, including FBS, TG, LDL, WBC, and PLT.
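As a rough illustration, the imputation described above could be performed with pandas as follows. This is a sketch that follows the description rather than the authors' actual code, and the file name framingham.csv is only an assumed local name for the Kaggle download.

```python
import pandas as pd

df = pd.read_csv("framingham.csv")  # assumed local file name

# education is unrelated to the prediction target, so blanks become 0.
df["education"] = df["education"].fillna(0)
# BPMeds: 0 values (4063) far outnumber 1 values (124), so blanks become 0.
df["BPMeds"] = df["BPMeds"].fillna(0)
# cigsPerDay, totChol, BMI, heartRate, glucose: fill blanks with the mean of
# the remaining non-missing values (e.g., about 9 for cigsPerDay, 25.8 for BMI).
for col in ["cigsPerDay", "totChol", "BMI", "heartRate", "glucose"]:
    df[col] = df[col].fillna(df[col].mean())
```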
In each dataset, 80% of the samples were used for training and 20% for testing. Because the Z-Alizadeh Sani dataset has too many features, it is not included in the attribute tables or the heat maps below.
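A sketch of the normalization and the 80/20 split follows, using the Z-Alizadeh Sani data as the example. The file name and the Cath target column name are assumptions, and fitting the scaler on the training portion only is our choice of idiom, not necessarily the authors'.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("z_alizadeh_sani.csv")      # assumed local file name
X, y = df.drop(columns="Cath"), df["Cath"]   # "Cath" target name assumed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Normalize the continuous features named above, fitting on the training set.
cont_cols = ["FBS", "TG", "LDL", "WBC", "PLT"]
scaler = StandardScaler().fit(X_train[cont_cols])
X_train.loc[:, cont_cols] = scaler.transform(X_train[cont_cols])
X_test.loc[:, cont_cols] = scaler.transform(X_test[cont_cols])
```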
The heatmap function in the seaborn library is used to display the correlation between each pair of features. The darker the grid color, the stronger the correlation between the corresponding two features: colors near red indicate a positive correlation, and colors near blue indicate a negative correlation. Conversely, the lighter the grid color, the weaker the correlation. The heat maps of the Cleveland and Framingham datasets are shown in Figure 3 and Figure 4.
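For reference, such a heat map can be produced roughly as follows. This is a sketch: cleveland.csv is an assumed file name, and the coolwarm palette is one common way to obtain the red-positive/blue-negative coloring described above.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv("cleveland.csv")  # assumed local file name

plt.figure(figsize=(12, 10))
# Pearson correlation between every pair of features; center=0 keeps white
# at zero correlation, with red for positive and blue for negative values.
sns.heatmap(df.corr(), annot=True, fmt=".2f", cmap="coolwarm", center=0)
plt.title("Feature correlation (Cleveland dataset)")
plt.tight_layout()
plt.show()
```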
As can be seen from Figure 3, the features trestbps, chol, thalach, oldpeak, ca, and HeartDisease are highly correlated with age. The features trestbps, chol, oldpeak, and ca are positively correlated with age, while thalach and HeartDisease are negatively correlated with it. Generally speaking, a feature increases as a positively correlated feature increases and decreases as a negatively correlated feature increases, and vice versa. For example, the heat map shows that the feature cp has a high positive correlation with HeartDisease. As Table 3 shows, cp is the type of chest pain; that is, the more severe the type of chest pain, the higher the likelihood of heart disease, which accords with common sense.
It can also be seen from Figure 3 that several features are highly correlated with HeartDisease. Ranked from high to low, they are exang, cp (and oldpeak), thalach, ca, slope, thal, sex, age, trestbps (and restecg), chol, and fbs.
As can be seen from Figure 4, the feature pairs cigsPerDay and currentSmoker, sysBP and diaBP, prevalentHyp and sysBP, and prevalentHyp and diaBP are highly correlated. The feature cigsPerDay is the number of cigarettes smoked per day, and currentSmoker indicates whether a person smokes; only a current smoker has a nonzero cigsPerDay. prevalentHyp, sysBP, and diaBP are all blood-pressure indicators: prevalentHyp indicates whether a person has hypertensive disease, sysBP is systolic blood pressure, and diaBP is diastolic blood pressure. People with hypertensive disease have high systolic and diastolic blood pressure, as one would expect. The feature glucose represents a person's glucose level, and persistently high glucose leads to diabetes, so glucose is correlated with the feature diabetes.
4.2. Experimental Results
In this section, we use different machine learning classifiers to build prediction models and validate the efficacy of the proposed method on the three datasets.
4.2.1. Experiment Using the Cleveland Dataset
First, Figure 5 and Table 5 compare the influence of the PCA method for feature fusion on prediction accuracy on dataset 1 (the Cleveland dataset).
From Figure 5 and Table 5, the influence of the PCA method on the performance indicators of the SDKABL model, including accuracy, precision, recall, and F1-score, can be obtained; confusion matrices (CMs) are also given to show the performance of the classifier. With the same classifier, the data processed by PCA show better performance: the values of all four indexes reach more than 91%, among which recall increases the most, by 15.6% compared with the data without PCA processing, while precision shows the smallest improvement, still improving by 12.7%. In the CM, the data processed by PCA greatly reduce false positives in the classification of the SDKABL model and greatly increase the number of true negative samples. Finally, the AUC values of the two methods were obtained from the ROC curves: the model trained on PCA-processed data reaches an AUC of 0.92, much higher than the 0.76 of the model trained without PCA processing.
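The PCA step itself is straightforward; below is a sketch in which enough components are kept to explain 95% of the variance, an assumed threshold since the paper's exact component count is not restated here. X_train and X_test are the scaled splits from the earlier preprocessing sketch.

```python
from sklearn.decomposition import PCA

# n_components=0.95 keeps the smallest number of components whose
# cumulative explained variance reaches 95% (an assumed threshold).
pca = PCA(n_components=0.95)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
print("components kept   :", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())
```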
Figure 6 and Table 6 show the comparison between the SDKABL method and DT, KNN, and SVM. The max_depth parameter of DT is 5, and the n_neighbors parameter of KNN is 7.
From Table 6, the performance of the SDKABL model is superior to that of the three comparison models in all aspects, followed by the DT model, whose AUC value is 0.82, and finally the KNN and SVM models, whose AUC values are both close to 0.70. All indicators of the SDKABL model reach more than 90%, and its AUC value of 0.92 is 0.1 higher than that of the second-best DT model and about 0.2 higher than those of the KNN and SVM models.
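The reported indexes for any single comparison classifier can be reproduced along these lines. This is a sketch, not the paper's evaluation code: the DT setting follows the max_depth = 5 quoted above, and the data variables come from the preceding preprocessing and PCA sketches.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(max_depth=5).fit(X_train_pca, y_train)
y_pred = clf.predict(X_test_pca)
y_score = clf.predict_proba(X_test_pca)[:, 1]  # positive-class probability

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))
print("CM:\n", confusion_matrix(y_test, y_pred))
print("AUC      :", roc_auc_score(y_test, y_score))
```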
4.2.2. Experiment Using the Framingham Dataset
Figure 7 and Table 7 compare the influence of the PCA method for feature fusion on the performance indexes of the SDKABL model on dataset 2 (the Framingham dataset).
It can be seen that the PCA-processed data show better performance under the same classifier. Although the values of the four indicators are not as good as those on dataset 1, they all reach more than 80%, with recall improving the most, by 11.2%, compared to the data without PCA processing, and accuracy improving the least, by 7.1%. On dataset 2, PCA-processed data similarly reduced false positives in the SDKABL model's classification and greatly increased the number of true negative (TN) samples. Finally, the AUC values of the two methods were obtained from the ROC curves: the AUC of the PCA-processed model is 0.89, much higher than the 0.78 of the model without PCA processing.
Figure 8 and Table 8 show the SDKABL method compared with DT, KNN, and SVM. The best max_depth parameter of DT obtained by the GridSearchCV method is 7, and the n_neighbors parameter of KNN is 10.
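The GridSearchCV step can be sketched as follows; the candidate ranges are assumptions chosen only to bracket the reported optima, not the authors' actual grids, and X_train and y_train here denote the dataset 2 training split.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# 5-fold cross-validated search over assumed candidate values.
dt_search = GridSearchCV(DecisionTreeClassifier(random_state=42),
                         {"max_depth": list(range(2, 16))}, cv=5)
dt_search.fit(X_train, y_train)
knn_search = GridSearchCV(KNeighborsClassifier(),
                          {"n_neighbors": list(range(3, 21))}, cv=5)
knn_search.fit(X_train, y_train)

print(dt_search.best_params_)   # e.g., {'max_depth': 7}
print(knn_search.best_params_)  # e.g., {'n_neighbors': 10}
```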
From Table 8, the performance of the SDKABL model is superior to that of the three comparison models in all aspects. The AUC values of the SVM, DT, and KNN models are 0.77, 0.75, and 0.70, respectively. All indicators of the SDKABL model on dataset 2 reach more than 80%, and the prediction accuracy exceeds 90%. The AUC of SDKABL is 0.89, which is 0.12 higher than that of the SVM model, 0.14 higher than that of the DT model, and nearly 0.2 higher than that of the KNN model.
4.2.3. Experiment Using the Z-Alizadeh Sani Dataset
Figure 9 and Table 9 compare the influence of the PCA method for feature fusion on the performance indexes of the SDKABL model on dataset 3 (the Z-Alizadeh Sani dataset).
The classifier trained on PCA-processed data achieves better classification performance, with improvements of varying degrees across multiple indicators. The performance of SDKABL on dataset 3 is similar to that on dataset 1, with both around 90%. Precision improves the most, increasing by more than 20% compared with the data without PCA processing, while accuracy improves the least, still by 18%. On dataset 3, PCA-processed data significantly reduced false negatives in the SDKABL model's classification and significantly increased the number of true positive samples. Finally, the AUC values of the two methods were obtained from the ROC curves: the AUC of the PCA-processed model is 0.89, much higher than that of the model without PCA processing. In summary, the PCA method plays a useful role in extracting key features and assists in data processing.
Figure 10 and Table 10 show the SDKABL method compared with DT, KNN, and SVM.
From Table 10, the performance of the SDKABL model is superior to that of the three comparison models in all aspects. The AUC values of the DT, SVM, and KNN models are 0.81, 0.80, and 0.73, respectively. The indicators of the SDKABL model on dataset 3 are around 90%, and the prediction accuracy exceeds 91%. Its AUC value of 0.89 is 0.08 higher than that of the DT model, 0.09 higher than that of the SVM model, and 0.16 higher than that of the KNN model.
4.2.4. Comparative Analysis of Existing and Proposed Methods
Many academics have employed ML methods to forecast heart disease in recent years. We therefore compared the experimental results of the SDKABL model with those reported in the literature, as shown in Table 11.
Table 11 compares our proposed method with those of other studies. These methods include SVM and KNN, a stacking model involving KNN, random forest, and an SVM classifier [38], and the Hybrid Random Forest with Linear Model (HRFLM) [39]. It should be noted that dataset 1 used in our model is the Cleveland dataset. From Table 11, the SDKABL model achieves considerable performance: taken together, our proposed method performs best on the Cleveland dataset, and it also achieves good results on the other two datasets.
It can be seen that the stacking model using DT, KNN, and SVM as the base classifiers and ABiLSTM as the meta-layer classifier greatly improves performance compared with any single classifier.