1. Introduction
Contemporary medicine depends on a large amount of information accumulated in medical datasets. Extracting such constructive knowledge supports scientific decision-making when diagnosing disease. Medical data can enhance the management of hospital information and support the growth of telemedicine. Medical data serve patient care first and research second; the main rationale for collecting medical data is to improve patient health [1]. The availability of large volumes of medical data introduces redundancy, which calls for effective techniques for processing the data and extracting useful knowledge. However, the diagnosis of various diseases raises significant issues in data analysis [2]. In practice, quantifiable diagnosis is guided by a doctor's judgment rather than by patterns in the medical dataset; thus, there is a possibility of incorrect diagnosis [3]. Cloud-based services can assist with managing medical data, including compliance management, policy integration, access controls, and identity management [4].
Nowadays, heart disease is a leading cause of death. As society moves towards a new industrial revolution, lifestyle changes are needed to prevent risk factors of heart disease, such as obesity, diabetes, hypertension, and smoking [5]. The treatment of disease is a complex task in the medical field, and the detection of heart disease, with its different risk factors, is considered a multi-layered problem [6]. Thus, patient medical data are collected to simplify the diagnosis process. Offering a valuable service at low cost remains a major challenge in the healthcare industry; in [7], valuable quality service refers to precise diagnosis and effective treatment of patients. Poor clinical decisions can lead to disasters that affect patient health. Automated approaches, such as machine learning [8,9] and data mining [10], help deliver clinical tests and diagnoses at reduced risk [11,12]. Classification and pattern recognition by machine learning algorithms are widely used in prognostic and diagnostic monitoring. Machine learning supports decision-making, increases patient safety, and helps avoid medical errors, so it can be used in clinical decision support systems (CDSS) [13,14].
Several methods have been devised for automatic heart disease detection, for example evaluating the efficiency of the decision tree and Naive Bayes [15]. Moreover, optimization with the genetic algorithm has been employed to minimize the number of attributes without forfeiting accuracy and efficiency in diagnosing heart disease [16]. Data mining methods for heart disease diagnosis include the bagging algorithm, neural network, support vector machine, and automatically defined groups [17]. In [18], the study acquired 493 samples from a cerebrovascular disease prevention program and utilized three classification techniques (the Bayesian classifier, decision tree, and backpropagation neural network) for constructing classification models. In [19], a method was devised for diagnosing coronary artery disease, using 303 samples and adapting a feature creation technique. In [20], a methodology was devised for automatically assessing the efficiency of features extracted from heart rate signals. In [21], a hybrid algorithm was devised with K-Nearest Neighbour (KNN) and the genetic algorithm for effective classification; the method utilized a genetic search as a goodness measure for ranking attributes, and the classification algorithm was then applied to the evaluated attributes for heart disease diagnosis. The extraction of valuable information from huge data is a time-consuming task [22]. The size of medical datasets is increasing rapidly, and advanced data mining techniques help physicians make effective decisions. However, heart disease data involve feature selection issues, of which the imbalance of samples and the lack of magnitude of features are just two [23]. Although methods exist for heart disease detection with real-world medical data, they still need improvement in detection accuracy and computation time [24]. In [25], a hybrid model combining cuckoo search (CS) and a rough set was adapted for diagnosing heart disease; its drawback is that the rough set produces an unnecessarily large number of rules. To address these challenges in heart disease diagnosis, a novel method, named the Taylor-based bird swarm algorithm–deep belief network (Taylor-BSA–DBN), is proposed for medical data classification.
The purpose of the research is to present a heart disease diagnosis strategy, for which the proposed Taylor-BSA–DBN is employed. The major contribution of the research is the detection of heart disease using selected features. Here, feature selection is performed using sparse FCM to select the imperative features. In addition, a DBN is employed for detecting heart disease from the selected features. The DBN is trained by the proposed Taylor-BSA, so that the model parameters are learned optimally. The proposed Taylor-BSA is developed by combining the Taylor series with the high global convergence property of BSA. Hence, the proposed Taylor-BSA–DBN renders effective accuracy, sensitivity, and specificity while facilitating heart disease diagnosis.
The major portion of the paper focuses on:
Proposed Taylor-BSA–DBN for heart disease diagnosis: Taylor-BSA–DBN (a classifier) is proposed by modifying the training algorithm of the DBN with the Taylor-BSA algorithm, which is newly derived by combining the Taylor series and the BSA algorithm, for the optimal tuning of weights and biases. The proposed Taylor-BSA–DBN is adapted for heart disease diagnosis.
Other sections of the paper are arranged as follows:
Section 2 elaborates on the conventional heart disease detection strategies in the literature, as well as the challenges they face, which are considered the inspiration for developing the proposed technique. The proposed method for heart disease diagnosis using the modified DBN is portrayed in Section 3. The outcomes of the proposed strategy compared with other methods are depicted in Section 4, and Section 5 presents the conclusion.
3. Proposed Taylor-BSA–DBN for Medical Data Classification
The accessibility of a large amount of medical data has led to the requirement of strong data analysis tools for extracting valuable knowledge. Researchers are adapting data mining and statistical tools to improve the analysis of huge datasets. Disease diagnosis is a foremost application in which data mining tools offer successful results. Medical data tend to be rich in information but poor in knowledge; thus, there is a deficiency of effective analysis tools for discovering hidden relations and trends in medical data generated from clinical records. Processing medical data yields meaningful insight only when powerful methods are available. Thus, the proposed Taylor-BSA–DBN is devised to process medical data for attaining effective heart disease diagnosis.
Figure 1 portrays the schematic view of the proposed Taylor-BSA–DBN for heart disease diagnosis. The complete process of the proposed model comprises pre-processing, feature selection, and detection. At first, the medical data are fed as input to the pre-processing phase, wherein log transformation is applied to pre-process the data; log transformation is applied to minimize skew and to normalize the data. Once the pre-processed data are obtained, they are subjected to the feature selection phase, in which the imperative features are selected with Sparse FCM. After obtaining the imperative features, detection is performed with the DBN, wherein the training of the DBN is carried out using Taylor-BSA. The proposed Taylor-BSA is devised by combining the Taylor series and BSA. The output produced by the classifier is the classified medical data.
Consider an input medical dataset with various attributes, expressed as
$$D = \{d_{u,v}\}; \quad 1 \le u \le g, \; 1 \le v \le h,$$
where $d_{u,v}$ denotes the $v^{th}$ attribute in the $u^{th}$ data record, $g$ specifies the total number of data records, and $h$ specifies the total number of attributes in each record. The dimension of the database is represented as $g \times h$.
3.1. Pre-Processing
The importance of pre-processing is to facilitate smoother processing of the input data. Additionally, pre-processing is carried out to eliminate the noise and artefacts contained in the data. In this method, pre-processing is carried out using log transformation, in which each value is replaced with its logarithm, where the base of the log is set by the analyst (typically 2 or 10). The process compresses large-valued data. In addition, log transformation is a widely adopted method for handling skewed data and assisting data normalization. The log transformation is formulated as
$$t_{u,v} = \log\left(d_{u,v}\right),$$
where $t_{u,v}$ is the pre-processed value of the attribute $d_{u,v}$. The dimension of the pre-processed dataset remains $g \times h$.
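As a concrete illustration of this step, the short Python sketch below applies a base-10 log transformation to a matrix of attribute values (the paper's implementation uses Java libraries; Python is used here only for brevity). The names data and eps are illustrative, and a small offset is added because the formulation above assumes strictly positive attribute values.

```python
import numpy as np

def log_transform(data, base=10.0, eps=1e-6):
    """Replace each attribute value with its logarithm to reduce skew.

    data : 2-D array of shape (g, h) holding g records with h attributes.
    eps  : small offset so zero-valued attributes do not produce -inf.
    """
    data = np.asarray(data, dtype=float)
    return np.log(data + eps) / np.log(base)

# Example: three records with four attributes each.
raw = np.array([[63, 145, 233, 150],
                [67, 160, 286, 108],
                [41, 130, 204, 172]])
print(log_transform(raw).round(3))
```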
3.2. Selection of Features with Sparse FCM Clustering
The pre-processed data are fed to the feature selection module, which uses the Sparse FCM algorithm [30], a modification of standard FCM. The benefit of using Sparse FCM is that it supports clustering of high-dimensional data. The pre-processed data contain different types of attributes, each holding an individual value. In the medical data classification strategy, sparse FCM is applied to determine the imperative features from the data by clustering, so that only the attributes that contribute most to the cluster structure are retained. Generally, sparse FCM provides effective dimensional reduction, supports disease diagnosis without delay, and combines easily with optimization techniques.
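The exact Sparse FCM formulation of [30] is not reproduced here; the sketch below is only a simplified stand-in that runs standard fuzzy c-means and then ranks attributes by how strongly the resulting cluster centroids spread along each attribute, keeping the top-ranked ones as the selected features. The names (fcm, select_features, n_keep) and the ranking rule are assumptions of this sketch; in Sparse FCM itself, the feature weights are constrained within the clustering objective rather than ranked afterwards.

```python
import numpy as np

def fcm(X, n_clusters=5, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means: returns centroids and the membership matrix U."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per record
    for _ in range(n_iter):
        Um = U ** m
        centroids = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + 1e-10
        new_U = 1.0 / (dist ** (2.0 / (m - 1.0)))
        new_U /= new_U.sum(axis=1, keepdims=True)
        if np.abs(new_U - U).max() < tol:
            U = new_U
            break
        U = new_U
    return centroids, U

def select_features(X, n_clusters=5, n_keep=8):
    """Rank attributes by between-centroid spread (a crude sparsity proxy)."""
    Xz = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-10)   # standardize attributes
    centroids, _ = fcm(Xz, n_clusters)
    score = centroids.var(axis=0)                # spread of centroids per attribute
    keep = np.argsort(score)[::-1][:n_keep]
    return np.sort(keep)

# Example on random data standing in for the pre-processed records.
X = np.random.default_rng(1).random((100, 13))
print(select_features(X, n_clusters=5, n_keep=8))
```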
3.3. Classification of Medical Data with Proposed Taylor-BSA-Based DBN
In this section, medical data classification using the proposed Taylor-BSA-based DBN is presented; the classification proceeds using the selected feature vector.
3.3.1. Proposed Taylor-BSA Algorithm
The proposed Taylor-BSA is the combination of the Taylor series and BSA. The Taylor series [31] describes functions of complex variables as the expansion of a function into an infinite sum of terms. It not only serves as a powerful tool, but also helps in evaluating integrals and infinite sums. Moreover, the Taylor series is a one-step process that can handle higher-order terms. The Taylor series is advantageous for derivations and can be used to obtain theoretical error bounds; above all, it supports the accuracy of classification and is a simple way to approximate complex functions. BSA [32] is based on the social behaviors of birds, which follow some idealized rules. BSA is more accurate than other standard optimization algorithms, with highly efficient, accurate, and robust performance, and it maintains a good balance between exploration and exploitation. The DBN has recently become a popular approach in machine learning for its promised advantages, such as fast inference and the ability to encode richer and higher-order network structures. The DBN is used to extract better feature representations, and several related tasks can be solved simultaneously by using shared representations. Moreover, it has the advantages of a multi-layer structure and pre-training with a fine-tuning learning method. The algorithmic steps of the proposed Taylor-BSA are described below:
Step 1. Initialization: the first step is the initialization of the population and the other algorithmic parameters, where the population size is denoted as $M$, $t_{max}$ represents the maximal iteration, $P$ indicates the probability of foraging for food, and the frequency of the flight behavior of birds is expressed as $FQ$.
Step 2. Determination of the objective function: the selection of the best position of a bird is formulated as a minimization problem, in which the minimal value of the error defines the optimal solution.
Step 3. Position update of the birds: for updating the positions, birds have three behaviors, which are selected using probability. Whenever a uniform random number is smaller than the foraging probability $P$, the update is based on foraging behavior; otherwise, the vigilance behavior commences. In addition, every $FQ$ iterations the swarm splits into scroungers and producers, which is modeled as flight behavior. Finally, the feasibility of the solutions is verified and the best solution is retrieved.
Step 4. Foraging behavior of birds: each individual bird searches for food based on its own experience and the behavior of the swarm. The standard equation of the foraging behavior of birds [32] is given by
$$x_{k,l}^{t+1} = x_{k,l}^{t} + \left(p_{k,l} - x_{k,l}^{t}\right) \times C \times rand(0,1) + \left(g_{l} - x_{k,l}^{t}\right) \times S \times rand(0,1) \quad (3)$$
where $x_{k,l}^{t}$ and $x_{k,l}^{t+1}$ denote the location of the $k^{th}$ bird in the $l^{th}$ dimension at iterations $t$ and $t+1$, $p_{k,l}$ refers to the previous best position of the $k^{th}$ bird, $rand(0,1)$ is an independent uniformly distributed number, $g_{l}$ indicates the best previous location shared by the bird swarm, $C$ denotes the cognitive accelerated coefficient, and $S$ denotes the social accelerated coefficient. Here, $C$ and $S$ are positive numbers.
According to the Taylor series [31], the update equation is expressed as an expansion over the previous positions of the bird, given in Equation (5). Substituting Equation (5) into Equation (3) yields the proposed Taylor-based foraging update.
Step 5. Vigilance behavior of birds: the birds move towards the center of the swarm, during which they compete with each other. Following the standard BSA formulation [32], the vigilance behavior is modeled as
$$x_{k,l}^{t+1} = x_{k,l}^{t} + A_{1}\left(\mu_{l} - x_{k,l}^{t}\right) \times rand(0,1) + A_{2}\left(p_{r,l} - x_{k,l}^{t}\right) \times rand(-1,1),$$
$$A_{1} = a_{1} \times \exp\!\left(-\frac{pFit_{k}}{sumFit + \varepsilon} \times M\right), \qquad A_{2} = a_{2} \times \exp\!\left(\frac{pFit_{k} - pFit_{r}}{\left|pFit_{r} - pFit_{k}\right| + \varepsilon} \cdot \frac{M \times pFit_{r}}{sumFit + \varepsilon}\right),$$
where $\mu_{l}$ denotes the $l^{th}$ element of the average position of the swarm, $M$ represents the number of birds, $a_{1}$ and $a_{2}$ are positive constants lying in the range $[0, 2]$, $pFit_{k}$ denotes the best fitness value of the $k^{th}$ bird, $sumFit$ corresponds to the sum of the best fitness values of the swarm, $\varepsilon$ is a small constant that keeps the optimization away from a zero-division error, and $r$ ($r \neq k$) is a randomly chosen positive integer.
Step 6. Flight behavior: the birds fly to another site in the case of threatening events or to find new foraging opportunities. When the birds reach a new site, they search for food again; some birds in the group act as producers and others as scroungers. Following [32], the behavior is modeled as
$$x_{k,l}^{t+1} = x_{k,l}^{t} + randn(0,1) \times x_{k,l}^{t} \quad \text{(producers, Equation (13))},$$
$$x_{k,l}^{t+1} = x_{k,l}^{t} + \left(x_{r,l}^{t} - x_{k,l}^{t}\right) \times FL \times rand(0,1) \quad \text{(scroungers, Equation (14))},$$
where $randn(0,1)$ refers to a Gaussian distributed random number with zero mean and unit standard deviation, $r$ ($r \neq k$) is a randomly chosen bird, and $FL$ ($FL \in [0,2]$) indicates that the scrounger follows the producer to search for food.
Step 7. Determination of the best solution: the best solution is evaluated based on the error function. If the newly computed solution is better than the previous one, the previous solution is replaced by the new one.
Step 8. Terminate: the optimal solutions are derived in an iterative manner until the maximum number of iterations is reached. The pseudo-code of the proposed Taylor-BSA algorithm is illustrated in Algorithm 1.
Algorithm 1. Pseudocode for the proposed Taylor-BSA algorithm
Input: Bird swarm population
Output: Best solution
Procedure:
Begin
  Initialize the population and set t = 1
  Read the parameters: M, t_max, P, and FQ (the frequency of the flight behavior of birds)
  Determine the fitness of the solutions
  While t < t_max
    If (t mod FQ) ≠ 0
      For each bird k
        If rand(0,1) < P
          Foraging behavior using Equation (3)
        Else
          Vigilance behavior using Equation (12)
        End if
      End for
    Else
      Split the swarm into scroungers and producers
      For each bird k
        If k is a producer
          Update using Equation (13)
        Else
          Update using Equation (14)
        End if
      End for
    End if
    Check the feasibility of the solutions
    Return the best solution
    t = t + 1
  End while
  Optimal solution is obtained
End
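For readers who want to experiment with the optimizer, the following Python sketch implements the standard BSA loop of [32] that Algorithm 1 follows. It minimizes a placeholder error function; the Taylor-based modification of the foraging step derived from Equation (5) is not reproduced, so the plain update of Equation (3) is used, and all parameter values and bounds are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def error(x):
    """Placeholder fitness: in the paper this would be the DBN training error."""
    return np.sum(x ** 2)

def bsa(dim=10, pop_size=30, t_max=200, P=0.8, FQ=10,
        C=1.5, S=1.5, a1=1.0, a2=1.0, FL=0.5, eps=1e-10):
    X = rng.uniform(-1, 1, (pop_size, dim))          # positions
    fit = np.array([error(x) for x in X])
    p_best, p_fit = X.copy(), fit.copy()             # personal bests
    g_best = X[fit.argmin()].copy()                  # swarm best

    for t in range(1, t_max + 1):
        if t % FQ != 0:
            for k in range(pop_size):
                if rng.random() < P:                 # foraging (Equation (3))
                    X[k] += (p_best[k] - X[k]) * C * rng.random(dim) \
                          + (g_best - X[k]) * S * rng.random(dim)
                else:                                # vigilance behavior
                    r = rng.choice([i for i in range(pop_size) if i != k])
                    mean_pos = X.mean(axis=0)
                    sum_fit = p_fit.sum() + eps
                    A1 = a1 * np.exp(-p_fit[k] / sum_fit * pop_size)
                    A2 = a2 * np.exp((p_fit[k] - p_fit[r])
                                     / (abs(p_fit[r] - p_fit[k]) + eps)
                                     * pop_size * p_fit[r] / sum_fit)
                    X[k] += A1 * (mean_pos - X[k]) * rng.random(dim) \
                          + A2 * (p_best[r] - X[k]) * rng.uniform(-1, 1, dim)
        else:                                        # flight: producers/scroungers
            producers = rng.random(pop_size) < 0.5
            for k in range(pop_size):
                if producers[k]:                     # Equation (13)
                    X[k] += rng.standard_normal(dim) * X[k]
                else:                                # Equation (14)
                    r = rng.choice(np.flatnonzero(producers)) if producers.any() else k
                    X[k] += (X[r] - X[k]) * FL * rng.random(dim)
        X = np.clip(X, -5.0, 5.0)                    # feasibility check (illustrative box constraint)
        fit = np.array([error(x) for x in X])
        improved = fit < p_fit                       # keep the better solutions
        p_best[improved], p_fit[improved] = X[improved], fit[improved]
        g_best = p_best[p_fit.argmin()].copy()
    return g_best, p_fit.min()

best, best_err = bsa()
print("best error:", best_err)
```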
3.3.2. Architecture of Deep Belief Network
The DBN [33] is a subset of the Deep Neural Network (DNN) and comprises different layers of Multilayer Perceptrons (MLPs) and Restricted Boltzmann Machines (RBMs). RBMs comprise visible and hidden units that are connected through weights. The basic structural design of the DBN is illustrated in Figure 2.
Training of Deep Belief Network
This section elaborates on the training process of the proposed Taylor-BSA–DBN classifier. An RBM is trained in an unsupervised manner based on the gradient descent method, whereas the MLP performs supervised learning using the standard backpropagation algorithm. Therefore, the training of the DBN is based on a gradient descent–backpropagation scheme, in which the most appropriate weights are chosen optimally for the update. The training procedure of the proposed DBN classifier is described below.
Training of RBM Layers
A training sample is given as the input to the first RBM layer, which computes the probability distribution of the data and encodes it into the weight parameters. The steps involved in the training process of the RBM are illustrated below, and a minimal code sketch follows the list.
The input training sample is read and the weight vector is produced randomly.
The probability function of each hidden neuron in the first RBM is calculated.
The positive gradient is computed using a visible vector and the probability of the hidden layer.
The probability of each visible neuron is obtained by reconstructing the visible layer from the hidden layer.
The probability of reconstruction of hidden neurons is obtained by resampling the hidden states.
The negative gradient is computed.
Weights are updated by subtracting the negative gradient from the positive gradient.
Weights are updated for the next iteration, using the steepest or gradient descent algorithm.
Energy is calculated for a joint configuration of the neurons in the visible and the hidden layers.
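The steps above correspond to one round of contrastive-divergence-style training for a single RBM. The following Python sketch, with illustrative names (rbm_cd1_step, visible, W) and a sigmoid activation assumed for both layers, shows how the positive and negative gradients and the weight update of the list might be computed; stacking a second RBM on the hidden activations of the first, and then the MLP, yields the DBN structure of Figure 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rbm_cd1_step(visible, W, b_h, b_v, lr=0.05):
    """One CD-1 style update for a binary RBM.

    visible : (batch, n_visible) input samples
    W       : (n_visible, n_hidden) weights; b_h, b_v: hidden/visible biases
    """
    # Probability of each hidden neuron given the visible layer.
    h_prob = sigmoid(visible @ W + b_h)
    # Positive gradient: visible vector (outer product) hidden probabilities.
    pos_grad = visible.T @ h_prob
    # Reconstruct the visible layer from sampled hidden states.
    h_state = (rng.random(h_prob.shape) < h_prob).astype(float)
    v_recon = sigmoid(h_state @ W.T + b_v)
    # Probability of the hidden neurons for the reconstruction (resampling step).
    h_recon = sigmoid(v_recon @ W + b_h)
    # Negative gradient and gradient-descent weight update.
    neg_grad = v_recon.T @ h_recon
    W += lr * (pos_grad - neg_grad) / visible.shape[0]
    b_h += lr * (h_prob - h_recon).mean(axis=0)
    b_v += lr * (visible - v_recon).mean(axis=0)
    return W, b_h, b_v

# Example: 8 selected features feeding an RBM with 6 hidden units.
X = rng.random((16, 8))
W = rng.normal(0, 0.1, (8, 6))
b_h, b_v = np.zeros(6), np.zeros(8)
W, b_h, b_v = rbm_cd1_step(X, W, b_h, b_v)
print(W.shape)
```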
Training of MLP
The training procedure of the MLP is based on a backpropagation approach, feeding the training data, which are the hidden outputs of the second RBM layer, through the network. The network is adjusted iteratively until the optimal weights are chosen. Moreover, Taylor-BSA is employed to compute the optimal weights, which are evaluated using the error function. The training procedure is summarized below, followed by a short sketch of the weight-evaluation step.
Randomly initialize the weights.
Read the input sample from the result of the preceding layer.
Obtain the average error, based on the difference between the obtained output and the desired output.
Calculate the weight updates in the hidden and the visible layers.
Obtain the new weights from the hidden and the visible layers by applying gradient descent.
Identify the new weights using the updated equation of Taylor-BSA.
Estimate the error function using gradient descent and Taylor-BSA.
Choose the minimum error and repeat the steps.
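To connect the list with the optimizer of Section 3.3.1, the sketch below shows, under illustrative assumptions, how a candidate weight vector proposed by Taylor-BSA could be unpacked into MLP weights and scored with a mean squared error; unpack_weights and mlp_error are hypothetical helper names, not functions from the paper. In the full method, this error would serve as the fitness minimized in Step 2 of the Taylor-BSA algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack_weights(vec, n_in, n_hidden, n_out):
    """Split a flat candidate solution into the two MLP weight matrices."""
    w1 = vec[:n_in * n_hidden].reshape(n_in, n_hidden)
    w2 = vec[n_in * n_hidden:].reshape(n_hidden, n_out)
    return w1, w2

def mlp_error(vec, X, y, n_hidden=6):
    """Error function minimized by the optimizer: MSE of the MLP output."""
    n_in, n_out = X.shape[1], 1
    w1, w2 = unpack_weights(vec, n_in, n_hidden, n_out)
    pred = sigmoid(sigmoid(X @ w1) @ w2).ravel()
    return np.mean((pred - y) ** 2)

# Example: score one random candidate for 8 RBM features and a binary label.
rng = np.random.default_rng(3)
X, y = rng.random((32, 8)), rng.integers(0, 2, 32)
candidate = rng.normal(0, 0.5, 8 * 6 + 6 * 1)
print(mlp_error(candidate, X, y))
```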
4. Results and Discussion
This section elaborates on the assessment of the proposed strategy with classical strategies for medical data classification using accuracy, sensitivity, and specificity. The analysis is done by varying training data. In addition, the effectiveness of the proposed Taylor-BSA–DBN is analyzed.
4.1. Experimental Setup
The implementation of the proposed strategy is carried out using Java libraries via Java Archive (JAR) files, on a PC running the Windows 10 OS with 2 GB RAM and an Intel Core i3 processor. The simulation setup of the proposed system is depicted in Table 1.
4.2. Dataset Description
The experimentation is done using the Cleveland, Hungarian, and Switzerland datasets taken from the heart disease data of the University of California Irvine (UCI) machine learning repository [34], which is commonly used for both detection and classification. The Cleveland database comes from the Cleveland Clinic Foundation and was contributed by David W. Aha. The Hungarian dataset is obtained from the Hungarian Institute of Cardiology, and the Switzerland dataset is obtained from the University Hospital, Basel, Switzerland. The dataset comprises 303 instances and 75 attributes, of which 13 attributes are employed for experimentation. Furthermore, the dataset is characterized as multivariate with integer and real attributes. The 13 attributes (features) are: resting blood pressure (trestbps), maximum heart rate achieved (thalach), the slope of the peak exercise ST segment (slope), age (age), sex (sex), fasting blood sugar (fbs), ST depression induced by exercise relative to rest (oldpeak), chest pain type (cp), serum cholesterol (chol), exercise-induced angina (exang), resting electrocardiographic results (restecg), number of major vessels (0–3) colored by fluoroscopy (ca), and thalassemia status (thal: 3 = normal; 6 = fixed defect; 7 = reversible defect).
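As an illustration of how these records can be pulled in, the following Python snippet loads the processed Cleveland file with the 13 attributes listed above. The UCI URL, the use of pandas, and the binarization of the label column num are assumptions of this sketch rather than details given in the paper; missing values in the file are marked with '?'.

```python
import pandas as pd

# Column order of the processed UCI heart-disease files (13 features + label).
COLS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
        "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

URL = ("https://archive.ics.uci.edu/ml/machine-learning-databases/"
       "heart-disease/processed.cleveland.data")   # assumed location

def load_heart_data(path_or_url=URL):
    """Read the processed Cleveland data; '?' marks missing entries."""
    df = pd.read_csv(path_or_url, header=None, names=COLS, na_values="?")
    X = df[COLS[:-1]]                 # 13 attributes used for experimentation
    y = (df["num"] > 0).astype(int)   # 0 = no disease, 1-4 = presence of disease
    return X, y

if __name__ == "__main__":
    X, y = load_heart_data()
    print(X.shape, y.value_counts().to_dict())
```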
4.3. Evaluation Metrics
The performance of the proposed Taylor-BSA–DBN and the comparative methods is analyzed using the following metrics: accuracy, sensitivity, and specificity.
4.3.1. Accuracy
Accuracy is described as the degree of closeness of an estimated value with respect to its original value in optimal medical data classification, and it is represented as
$$\text{Accuracy} = \frac{T_{P} + T_{N}}{T_{P} + T_{N} + F_{P} + F_{N}},$$
where $T_{P}$ represents true positives, $F_{P}$ indicates false positives, $T_{N}$ indicates true negatives, and $F_{N}$ represents false negatives.
4.3.2. Sensitivity
This measure is described as the ratio of positives that are correctly identified by the classifier, and it is represented as
$$\text{Sensitivity} = \frac{T_{P}}{T_{P} + F_{N}}.$$
4.3.3. Specificity
This measure is defined as the ratio of negatives that are correctly identified by the classifier, and it is formulated as
$$\text{Specificity} = \frac{T_{N}}{T_{N} + F_{P}}.$$
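A compact Python helper computing the three metrics from predicted and true binary labels, matching the formulas above; the function name evaluate is illustrative.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity

print(evaluate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```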
4.4. Comparative Methods
The methods employed for the analysis include the Support Vector Machine (SVM) [35], Naive Bayes (NB) [36], DBN [33], and the proposed Taylor-BSA–DBN.
4.5. Comparative Analysis
The proposed Taylor-BSA–DBN is compared with the conventional methods in terms of the accuracy, sensitivity, and specificity parameters. The analysis is performed by varying the training data on the Cleveland, Hungarian, and Switzerland databases.
4.5.1. Analysis with Cluster Size = 5
The analysis of the methods, considering cluster size = 5, on the Cleveland, Hungarian, and Switzerland databases is specified below:
Analysis Considering Cleveland Database
Table 2 elaborates the analysis of the methods using the Cleveland database, considering training data with the accuracy, sensitivity, and specificity parameters. The maximum accuracy, sensitivity, and specificity are considered the best performance. Here, the proposed system offers better performance than the existing methods, i.e., SVM, NB, and DBN.
Analysis Considering Hungarian Database
Table 3 elaborates the analysis of methods using the Hungarian database, considering training data with accuracy, sensitivity, and specificity parameters. The proposed system offers the best performance when considering 90% of training data.
Analysis Considering Switzerland Database
Table 4 elaborates the analysis of the methods using the Switzerland database, considering training data with the accuracy, sensitivity, and specificity parameters. The proposed system achieves the better performance, with values of 0.8462, 0.8571, and 0.8333 for accuracy, sensitivity, and specificity, respectively.
4.5.2. Analysis with Cluster Size = 9
The analysis of the methods, considering cluster size = 9, on the Cleveland, Hungarian, and Switzerland databases is specified below:
Analysis Considering Cleveland Database
Table 5 depicts the analysis of the methods using the Cleveland database, considering training data with the accuracy, sensitivity, and specificity parameters. The maximum accuracy, sensitivity, and specificity are considered the best performances. Here, the proposed system offers better performance than the existing methods, i.e., SVM, NB, and DBN.
Analysis Considering Hungarian Database
Table 6 shows the analysis of methods using the Hungarian database, considering training data with accuracy, sensitivity, and specificity parameters. The proposed system offers the best performance when considering 90% of training data.
Analysis Considering Switzerland Database
Table 7 depicts the analysis of the methods using the Switzerland database, considering training data with the accuracy, sensitivity, and specificity parameters. The proposed system achieves the better performance, with values of 0.7778, 0.7857, and 0.7692 for accuracy, sensitivity, and specificity, respectively.
4.5.3. Analysis Based on Receiver Operating Characteristic (ROC) Curve
Table 8 depicts the comparative analysis based on ROC curve, using Cleveland, Hungarian, and Switzerland databases. In the Cleveland dataset, when the false positive rate (FPR) is 5, the corresponding true positive rate (TPR) of the methods, such as SVM, NB, DBN, and the proposed Taylor-BSA–DBN is 0.8857, 0.9119, 0.9535, and 0.9684, respectively. By considering the Hungarian dataset, when the FPR is 4, the corresponding TPR of the proposed method is a maximum of 0.9348. For the same FPR, the TPR of the methods, such as SVM, NB, and DBN is 0.9030, 0.9130, and 0.9233, respectively. By considering the Switzerland dataset, when the FPR is 6, the TPR of the methods, such as SVM, NB, DBN, and the proposed Taylor-BSA–DBN is 0.9105, 0.9443, 0.9569, and 0.9794, respectively.
4.5.4. Analysis Based on k-Fold
Table 9 depicts the comparative analysis based on k-fold using the Cleveland, Hungarian, and Switzerland databases, for cluster size = 5. The Hungarian datasets offer the maximum accuracy of 0.9021, when k-fold = 8. By considering k-fold = 7, the specificity offered by the Cleveland datasets for the methods, such as SVM, NB, DBN, and the proposed Taylor-BSA–DBN, is 0.8032, 0.8189, 0.8256, and 0.8321, respectively. The proposed Taylor-BSA–DBN offers maximum accuracy, sensitivity, and specificity, when considering k-fold = 8.
4.6. Comparative Discussion
Table 10 portrays the analysis of the methods using the accuracy, sensitivity, and specificity parameters with varying training data. The analysis is done with the Cleveland, Switzerland, and Hungarian databases. Using cluster size = 5, and considering the Cleveland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.871, which is 13.43%, 12.17%, and 11.14% better than the existing SVM, NB, and DBN, respectively. Among the existing methods, the DBN offers the maximum sensitivity of 0.771, but the proposed method is 12.29% better than the existing DBN. The proposed method has a maximum specificity of 0.862; the improvement over the existing SVM, NB, and DBN is 12.99%, 12.06%, and 9.40%, respectively. Considering the Hungarian database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.913, maximal sensitivity of 0.933, and maximal specificity of 0.875. Considering the Switzerland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.846, which is 19.98%, 16.78%, and 15.60% better than the existing SVM, NB, and DBN, respectively. Similarly, the proposed system has a maximum sensitivity of 0.857; the improvement in sensitivity over the existing SVM, NB, and DBN is 19.72%, 19.25%, and 16.69%, respectively. Likewise, the proposed Taylor-BSA–DBN showed a maximal specificity of 0.833.
Using cluster size = 9, and considering the Cleveland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.934, which is 16.92%, 11.13%, and 3.96% better than the existing SVM, NB, and DBN, respectively. Among the existing methods, the DBN offers the maximum sensitivity of 0.913, but the proposed method is 3.89% better than the existing DBN. The proposed method has a maximum specificity of 0.903; the improvement over the existing SVM, NB, and DBN is 23.15%, 15.28%, and 3.10%, respectively. Considering the Hungarian database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.902, maximal sensitivity of 0.909, and maximal specificity of 0.893. Considering the Switzerland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.840, which is 19.17%, 10.12%, and 2.38% better than the existing SVM, NB, and DBN, respectively. Similarly, the proposed system has a maximum sensitivity of 0.846; the improvement in sensitivity over the existing SVM, NB, and DBN is 19.74%, 11.35%, and 1.89%, respectively. Likewise, the proposed Taylor-BSA–DBN showed a maximal specificity of 0.833.
Table 11 shows the computational time of the proposed system and the existing methods, such as SVM, NB, and DBN, in which the proposed Taylor-BSA–DBN has a minimum computation time of 6.31 sec.
Table 12 shows the statistical analysis of the proposed work and the existing methods based on mean and variance.
5. Conclusions
Contemporary medicine depends on a huge amount of information contained in medical databases. The availability of large medical data leads to the requirement of effective data analysis tools for extracting constructive knowledge. This paper proposes a novel, fully automated DBN for heart disease diagnosis using medical data. Here, sparse FCM is employed for selecting significant features; incorporating sparse FCM in the feature selection process makes the models easier to interpret, as this sparse technique identifies the features important for detection and can handle high-dimensional data. The selected features are fed to the DBN, which is trained by the proposed Taylor-BSA. The proposed Taylor-BSA is designed by combining the Taylor series and the BSA algorithm in order to generate the optimal weights for effective medical data classification. The proposed Taylor-BSA–DBN outperformed other methods, with a maximal accuracy of 93.4%, maximal sensitivity of 95%, and maximal specificity of 90.3%. The proposed method does not classify the type of heart disease. In the future, other medical data classification datasets will be employed to evaluate the efficiency of the proposed method. In addition, the proposed system will be further improved to classify heart diseases such as congenital heart disease, coronary artery disease, and arrhythmia.