2.2.2. Feature Extraction
In this work, transfer learning was used for feature extraction with the help of two deep CNNs, namely DenseNet201 and DarkNet53. Both networks are described below.
DarkNet53 is a deep CNN originally proposed as the backbone of the YOLOv3 object detection framework [10]. The network was pretrained on the ImageNet [30] database and is able to classify images into about 1000 different classes.
Table 1 shows the layer-wise details of the DarkNet53 architecture. The input layer received an image of size $256 \times 256 \times 3$. The network was composed of repeated convolution layers with filter sizes $1 \times 1$ and $3 \times 3$. The output of each convolutional layer was connected to batch normalization [31] and LeakyReLU [9] layers. Further, residual (shortcut) connections were added to address the vanishing gradient problem.
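As an illustration, a minimal PyTorch sketch of DarkNet53's basic residual unit (a $1 \times 1$ convolution followed by a $3 \times 3$ convolution, each with batch normalization and LeakyReLU, plus a shortcut addition) is given below; the class name and channel counts are illustrative and not taken from the paper.

import torch
import torch.nn as nn

class DarkNetResidualBlock(nn.Module):
    # Basic DarkNet53 residual unit: a 1x1 convolution halves the channels,
    # a 3x3 convolution restores them, and a shortcut adds the input back.
    def __init__(self, channels: int):
        super().__init__()
        hidden = channels // 2
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.1, inplace=True),
        )
        self.expand = nn.Sequential(
            nn.Conv2d(hidden, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.1, inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual (shortcut) addition mitigates vanishing gradients.
        return x + self.expand(self.reduce(x))

block = DarkNetResidualBlock(64)
out = block(torch.randn(1, 64, 128, 128))  # shape preserved: (1, 64, 128, 128)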
In order to re-train the DarkNet53 CNN, a fully connected layer (FCL) was inserted in the network in place of the “Conv53” layer. The number of FCL outputs was kept equal to the number of WBC classes in the dataset (i.e., five). Further, the softmax and classification layers of the network were also replaced by new layers. To match the network’s input layer, the dataset images were resized to $256 \times 256 \times 3$, followed by augmentation operations such as random rotation and flipping. Deep features were extracted from the “GlobalAvgPool” layer, yielding a feature vector of size 1024 from DarkNet53 for each image of the training as well as the testing datasets.
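The feature-extraction step can be sketched as follows, assuming the timm port of DarkNet53 (model name "darknet53") is available; the paper itself reads the “GlobalAvgPool” layer, which corresponds to the pooled 1024-dimensional output below.

import timm
import torch

# Pretrained DarkNet53; num_classes=0 strips the classifier head so that the
# forward pass returns the globally average-pooled 1024-D feature vector.
model = timm.create_model("darknet53", pretrained=True, num_classes=0)
model.eval()

images = torch.randn(8, 3, 256, 256)  # a batch resized to the network input size
with torch.no_grad():
    features = model(images)          # shape: (8, 1024), one vector per image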
DenseNet201. The dense convolutional network (DenseNet) [32] used in this work was composed of 201 layers and was also trained on the ImageNet [30] dataset. The network was designed to mitigate the vanishing gradient problem in deep neural networks. In DenseNet, layer concatenation is performed in such a way that each layer receives “collective knowledge” from all preceding layers. The resulting network is compact and highly efficient in terms of computational complexity and memory requirements.
Table 2 shows the layer-wise details of the DenseNet201 architecture.
To perform network training on our dataset, the “fc1000” layer was replaced with an FCL with five outputs (one per class). New softmax and classification layers (initially without class labels) were also inserted. In a similar fashion, the images were resized to $224 \times 224 \times 3$ and augmented (rotated, flipped, etc.) prior to training. The global average pooling layer was used to extract features from the trained network, giving a feature vector of size 1920 for each training and testing image.
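A corresponding sketch for DenseNet201, assuming the torchvision implementation (the paper does not name its framework), replaces the 1000-way “fc1000” classifier with a five-class layer and reads the 1920-dimensional pooled features.

import torch
import torch.nn as nn
from torchvision import models

# Pretrained DenseNet201; its final dense block outputs 1920 feature maps.
net = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)
net.classifier = nn.Linear(1920, 5)   # five WBC classes, as in the paper

net.eval()
images = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    # Global average pooling over the last feature maps reproduces the
    # 1920-D vector used for feature fusion.
    fmaps = nn.functional.relu(net.features(images))        # (8, 1920, 7, 7)
    feats = torch.flatten(
        nn.functional.adaptive_avg_pool2d(fmaps, 1), 1)     # (8, 1920)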
2.2.4. Feature Selection Using Quantum Inspired Evolutionary Algorithm
The considerably large size of the fused feature vector obtained in the above steps calls for an intelligent selection of the most important features, in order to reduce the computational complexity of the classification model while maintaining high accuracy. As discussed earlier, the classical filter-based approach selects features according to a predefined criterion, without considering the learning model. In the wrapper approach, on the other hand, the main selection criterion is the accuracy of the learning model. The main contribution of this work was to model feature selection as an optimization problem and to propose a meta-heuristic for its solution. The objective of the optimization problem was to minimize the number of selected features while maximizing the accuracy of the learning model.
Quantum Inspired Evolutionary Algorithm. The QIEA is a class of population-based algorithms inspired by the concepts of quantum mechanics and evolutionary computing [27]. In binary QIEA, the candidate solutions to an optimization problem, i.e., the chromosomes, are represented as vectors of qubits. A qubit is a vector $[\alpha \; \beta]^{T}$ such that $|\alpha|^{2}$ and $|\beta|^{2}$ are the probabilities of observing the qubit state as $|0\rangle$ or $|1\rangle$, respectively, complying with the condition $|\alpha|^{2} + |\beta|^{2} = 1$. A qubit state is represented as
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle.$$
A qubit individual (chromosome) $q$ is represented as a string of $m$ qubits as
$$q = \begin{bmatrix} \alpha_{1} & \alpha_{2} & \cdots & \alpha_{m} \\ \beta_{1} & \beta_{2} & \cdots & \beta_{m} \end{bmatrix}.$$
Each qubit of the chromosome represents a linear superposition of the quantum states $|0\rangle$ and $|1\rangle$. For example, consider a three-qubit system with amplitude pairs $(\alpha_{i}, \beta_{i})$, $i = 1, 2, 3$. Here, a linear superposition of $2^{3} = 8$ basis states, i.e., $|000\rangle, |001\rangle, \ldots, |111\rangle$, is represented, and the probability of observing a given basis state is the product of the corresponding $|\alpha_{i}|^{2}$ and $|\beta_{i}|^{2}$ values.
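As a concrete numerical illustration (with made-up amplitudes, not values from the paper), the following Python snippet enumerates the eight basis states of a three-qubit chromosome and their observation probabilities.

import itertools

# Illustrative amplitude pairs (alpha_i, beta_i); each satisfies
# alpha^2 + beta^2 = 1.
qubits = [(0.6, 0.8), (0.8, 0.6), (0.7071, 0.7071)]

# The probability of a basis state is the product, over the qubits, of
# alpha_i^2 (if bit i is 0) or beta_i^2 (if bit i is 1).
for bits in itertools.product([0, 1], repeat=3):
    p = 1.0
    for (alpha, beta), bit in zip(qubits, bits):
        p *= beta**2 if bit else alpha**2
    print("".join(map(str, bits)), round(p, 4))
# The eight probabilities sum to 1, reflecting the superposition of 2^3 states.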
A greater population diversity is achieved by QIEA as compared to classical evolutionary algorithms. A quantum population $Q$ consists of $n$ individual chromosomes of $m$ qubits each. From the quantum population $Q$, a binary population $P$ is generated by performing an observation of the quantum state of each qubit. The evaluation phase involves the calculation of the fitness value of each individual of $P$. Recombination is performed using uniform crossover. The quantum population is then updated by applying the quantum rotation operation to each qubit as
$$\begin{bmatrix} \alpha_{j}^{i+1} \\ \beta_{j}^{i+1} \end{bmatrix} = \begin{bmatrix} \cos\theta_{j} & -\sin\theta_{j} \\ \sin\theta_{j} & \cos\theta_{j} \end{bmatrix} \begin{bmatrix} \alpha_{j}^{i} \\ \beta_{j}^{i} \end{bmatrix}, \qquad (5)$$
where $\theta_{j}$ is the rotation angle and $i$ is the generation counter. The termination criterion is the maximum number of generations.
Feature selection using QIEA. In this study, we proposed a customized QIEA for the selection of the most discriminative features from the fused feature vector. The main computation steps of the proposed QIEA are listed in Algorithm 1.
Notations: In Algorithm 1, double-struck characters (e.g., $\mathbb{F}$) are used to represent matrices and vectors, whereas normal characters denote scalar quantities.
The inputs to the algorithm include the fused feature matrix $\mathbb{F}$, the label vector $\mathbb{L}$, the total number of generations $G$, the population size $N$, and the number of variables in each solution, i.e., $M$. The matrix $\mathbb{F}$ has dimensions $R \times M$, where $R$ denotes the number of training images and $M$ is the number of features in the fused feature vector. In Phase 1 of the algorithm, the main parameters are initialized, which include $p_{best}$, i.e., the probability of the best selected individuals in a population, the number of best selected features $N_{best}$, the angle of quantum rotation $\theta$, the iteration best individual $\mathbb{B}_{best}$, and the global best individual $\mathbb{G}_{best}$. In step 4 (Phase 2), all qubits of the quantum population are initialized to the equal probability $1/\sqrt{2}$. In steps 5–12, a binary population matrix $\mathbb{P}$ is generated by performing an observation procedure on the qubits.
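A minimal NumPy sketch of this initialization and observation step is given below; the population size is illustrative, and $M = 1024 + 1920 = 2944$ is the length of the fused feature vector.

import numpy as np

N, M = 20, 2944            # population size (illustrative); solution length
rng = np.random.default_rng(0)

# Phase 2: every qubit starts with alpha = beta = 1/sqrt(2), so the
# states 0 and 1 are observed with equal probability.
alpha = np.full((N, M), 1 / np.sqrt(2))
beta = np.full((N, M), 1 / np.sqrt(2))

# Observation: each qubit collapses to 1 with probability beta^2.
P = (rng.random((N, M)) < beta**2).astype(int)   # binary population matrix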
Steps 13–39 constitute the execution phase of the algorithm. The while loop runs for $G$ iterations (generations); during each iteration, all individuals of the population are evaluated for their fitness values in steps 16–17. The fitness values are computed using the CostFunction routine, which receives as inputs the matrix $\mathbb{F}$, the label vector $\mathbb{L}$, and a binary vector $\mathbb{b}$ that contains one individual of the population. Step 49 of the CostFunction first extracts all the features of $\mathbb{F}$ whose indexes correspond to non-zero values of $\mathbb{b}$, and the reduced feature matrix is then split into training and testing sets. In steps 52–58 of the CostFunction, model training and prediction are performed using the K-nearest neighbors (KNN) classifier with $k$ nearest neighbors to consider. The fitness value (cost) is computed using the classification error metric. The fitness values of all individuals returned by the CostFunction are stored in the vector $\mathbb{C}$ of the main function. In step 19 of the main function, the sort function arranges the fitness values in ascending order along with their indexes and stores them in the vectors $\mathbb{S}$ and $\mathbb{I}$, respectively. The individuals with minimum cost are the fittest individuals of a population. In steps 20–25, from the sorted fitness values, the iteration best and global best individuals are updated along with their fitness scores. In step 27, the probabilities of all qubits are modified by applying the rotation function of (5).
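Under the assumption that scikit-learn is used (the paper does not name a library), the CostFunction for one binary individual b can be sketched as follows; the values of k and the split ratio are illustrative.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def cost_function(F, L, b, k=5, test_size=0.3, seed=0):
    # Fitness of one binary individual b: the KNN classification error on
    # the features whose indexes correspond to non-zero entries of b.
    if b.sum() == 0:
        return 1.0                        # no feature selected: worst cost
    Fb = F[:, b != 0]                     # keep only the selected features
    F_tr, F_te, L_tr, L_te = train_test_split(
        Fb, L, test_size=test_size, random_state=seed, stratify=L)
    knn = KNeighborsClassifier(n_neighbors=k).fit(F_tr, L_tr)
    return 1.0 - knn.score(F_te, L_te)    # classification error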
Algorithm 1 QIEA for feature selection.
1: Inputs: fused feature matrix 𝔽 (R × M), label vector 𝕃, number of generations G, population size N, solution length M
2: Phase 1: Parameter Initialization: p_best, N_best, rotation angle θ, iteration best 𝔹_best, global best 𝔾_best, global best cost G_cost ← ∞, t ← 1
3: Phase 2: Generate Initial Quantum Population
4: α(x, y) ← 1/√2, β(x, y) ← 1/√2 for all x = 1, …, N and y = 1, …, M
5: for x = 1 : N do
6:   for y = 1 : M do
7:     if rand(0, 1) > |α(x, y)|² then
8:       ℙ(x, y) ← 1
9:     else
10:      ℙ(x, y) ← 0
11:    end if
12:  end for
13: end for
14: Phase 3: Execution
15: while t ≤ G do
16:   for x = 1 : N do
17:     ℂ(x) ← CostFunction(𝔽, 𝕃, ℙ(x, :))
18:   end for
19:   [𝕊, 𝕀] ← sort(ℂ) ▷ ascending order of cost
20:   𝔹_best ← ℙ(𝕀(1), :) ▷ iteration best individual
21:   B_cost ← 𝕊(1)
22:   if B_cost < G_cost then
23:     𝔾_best ← 𝔹_best
24:     G_cost ← B_cost
25:   end if
26:   Perform Quantum Rotation Gate Operation
27:   [α, β] ← RotateQubits(α, β, 𝔾_best)
28:   Population Update
29:   for x = 1 : N do
30:     for y = 1 : M do
31:       if rand(0, 1) > |α(x, y)|² then
32:         ℙ(x, y) ← 1
33:       else
34:         ℙ(x, y) ← 0
35:       end if
36:     end for
37:   end for
38:   t ← t + 1
39: end while
40: Extract Index of Best Features
41: 𝕚 ← indexes of the non-zero entries of 𝔾_best
42: 𝔽_sel ← 𝔽(:, 𝕚)
43: OUTPUT: selected feature indexes 𝕚 and reduced feature matrix 𝔽_sel
44: Function: C ← CostFunction(𝔽, 𝕃, 𝕓)
45: Inputs: feature matrix 𝔽, label vector 𝕃, binary individual 𝕓
46: Parameters: number of nearest neighbors k, training/testing split ratio
47: if sum(𝕓) = 0 then C ← 1 ▷ no feature selected: worst possible cost
48: else
49:   𝔽_b ← 𝔽(:, find(𝕓 ≠ 0)) ▷ keep features at non-zero positions of 𝕓
50:   split (𝔽_b, 𝕃) into training set (𝔽_tr, 𝕃_tr) and testing set (𝔽_te, 𝕃_te)
51:   ▷ KNN training and prediction
52:   initialize a KNN classifier with k nearest neighbors
53:   fit the classifier on (𝔽_tr, 𝕃_tr)
54:   𝕃_pred ← predicted labels of 𝔽_te
55:   e ← number of samples with 𝕃_pred ≠ 𝕃_te
56:   C ← e / |𝕃_te| ▷ classification error as fitness
57: end if
58: Return: C
59: Function: [α, β] ← RotateQubits(α, β, 𝔾_best)
60: Inputs: qubit amplitude matrices α, β; global best individual 𝔾_best
61: Initialize Parameters: maximum rotation angle Δθ
62: for x = 1 : N do
63:   for y = 1 : M do
64:     if 𝔾_best(y) = 1 then
65:       s ← +1 ▷ rotate toward state |1⟩
66:     else
67:       s ← −1 ▷ rotate toward state |0⟩
68:     end if
69:     θ ← s · rand(0, Δθ) ▷ a separate random angle for every qubit
70:     α′ ← cos(θ) · α(x, y) − sin(θ) · β(x, y)
71:     β′ ← sin(θ) · α(x, y) + cos(θ) · β(x, y)
72:     (α(x, y), β(x, y)) ← (α′, β′)
73:   end for
74: end for
75: Return: [α, β]
The classical QIEA uses the first individual solution of the current population to compute the rotation angle, which is then applied to all other individuals of the population. In this work, we proposed a customized version of QIEA that uses the qubit probabilities of all individuals of the population, performing the rotation on each of them with a separate rotation angle. This random rotation at the individual-solution level as well as the qubit level helps the algorithm avoid becoming stuck in local optima. In steps 29–39 of the main routine, a new population is generated from the updated qubit probability values. In steps 41–42, the indexes of the non-zero entries of the global best individual correspond to the indexes of the features selected from the fused feature vector. The reduced feature vector containing these selected features is returned by the algorithm.
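The customized update can be sketched in NumPy as below; every individual and every qubit receives its own random rotation angle, and the sign rule (rotate toward the bit value of the global best individual) is a common QIEA convention assumed here rather than taken verbatim from the paper.

import numpy as np

def rotate_qubits(alpha, beta, g_best, max_angle=0.05 * np.pi, rng=None):
    # alpha, beta: (N, M) amplitude matrices; g_best: (M,) global best
    # individual; max_angle is an illustrative bound on the rotation angle.
    rng = rng if rng is not None else np.random.default_rng()
    # Rotate toward |1> where the global best bit is 1, toward |0> otherwise.
    sign = np.where(g_best == 1, 1.0, -1.0)
    # A separate random angle for every qubit of every individual.
    theta = sign * rng.uniform(0.0, max_angle, size=alpha.shape)
    new_alpha = np.cos(theta) * alpha - np.sin(theta) * beta
    new_beta = np.sin(theta) * alpha + np.cos(theta) * beta
    return new_alpha, new_beta   # |alpha|^2 + |beta|^2 remains 1 per qubit

Because the angle is redrawn for each qubit of each individual in every generation, the update injects stochasticity at both levels, which is what helps the search escape local optima.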