1. Introduction
Machine learning has been widely used in many practical applications, such as data mining, text processing, pattern recognition, and medical image analysis, which often rely on large data sets [1,2]. Based on how they utilize label information, feature selection algorithms are mainly categorized as filter or wrapper approaches [3,4]. Wrapper-based methods are commonly used to complete the classification task [5]. Their main steps include choosing a classifier, defining evaluation criteria for features, and searching for the optimal feature subset [6].
The SVM algorithm is one of the most popular supervised models and is regarded as one of the most robust methods in the machine learning field [7,8]. Compared with other methods, SVM has several strong characteristics, such as excellent generalization performance: it can generate high-quality decision boundaries from a small subset of training data points [9]. The largest problems encountered in setting up an SVM model are selecting the kernel function and its parameter values; inappropriate parameter settings lead to poor classification results [10].
Swarm intelligence algorithms can solve complex engineering problems, but different optimization algorithms solve different engineering problems with different effectiveness [11,12]. Optimization algorithms can reduce computation time and improve accuracy. Many optimization algorithms have been proposed, such as the Genetic Algorithm (GA) [13], Particle Swarm Optimization (PSO) [14], Differential Evolution (DE) [15], Ant Colony Optimization (ACO) [16], the Artificial Bee Colony (ABC) algorithm [17], the Grey Wolf Optimizer (GWO) [18], the Ant Lion Optimizer (ALO) [19], Moth-Flame Optimization (MFO) [20], the Whale Optimization Algorithm (WOA) [21], the invasive weed optimization algorithm [22], and the Flower Pollination Algorithm [23]. Although each algorithm has advantages, the no-free-lunch (NFL) theorem [24] proves that no single algorithm can solve all optimization problems.
There is no perfect optimization algorithm, and existing algorithms must be improved to solve engineering problems better. Many scholars study strategies for improving optimization algorithms; the most commonly used strategies are the adaptive weight strategy and the chaotic map. Zhang Y. proposed an improved particle swarm optimization algorithm with an adaptive learning strategy [25]; the adaptive learning strategy increased the population diversity of PSO. Dong Z. proposed a self-adaptive weight vector adjustment strategy based on a chain segmentation strategy [26]; the self-adaptive strategy handled the shape of the true Pareto front (PF) of the multi-objective problem. Li E. proposed a multi-objective decomposition algorithm based on an adaptive weight vector and a matching strategy [27]; the adaptive weight vector addressed the degradation of the performance of the solution set. The chaotic map is a general nonlinear phenomenon whose behavior is complex and semi-random; mathematically, it is defined as randomness generated by a simple deterministic system [28]. Xu C. proposed an improved boundary bird swarm algorithm [29], which combined the good global convergence and robustness of the bird swarm algorithm. Tran, N. T. presented a method for fatigue life prediction of a 2-DOF compliant mechanism that combined the differential evolution algorithm and the adaptive neuro-fuzzy inference system [30]; the experimental results show that the accuracy of that method is high.
Teaching-Learning-Based Optimization (TLBO) was proposed by R. V. Rao to solve global optimization problems over continuous nonlinear functions [31]. The TLBO approach works on the philosophy of teaching and learning. Many scholars study strategies to improve its optimization ability for different problems. Gunji A. B. proposed an improved TLBO for solving assembly sequence problems [32]. Zhang H. proposed a hybridized TLBO [33] that enables better tracking accuracy and efficiency. Ho, N.L. presented a hybrid Taguchi teaching-learning-based optimization algorithm (HTLBO) [34], whose results agreed well with the predicted values. Such strategies can improve the optimization ability of TLBO. In this paper, to address the problems of learning efficiency and initial parameter setting, we use several strategies to enhance the optimization ability of TLBO.
The main contributions of our work include:
- (1) The enhanced Teaching-Learning-Based Optimization (ETLBO) is proposed to improve optimization ability. The adaptive weight and the Kent chaotic map are used to enhance TLBO; these two strategies improve the searching ability of the students and teachers in TLBO.
- (2) We adopt a Tsallis entropy-based feature selection method for finding the crucial features. The selected feature x and the parameter α of Tsallis entropy are optimized by ETLBO.
- (3) The parameter c of the SVM classifier is optimized by ETLBO to obtain high classification accuracy. The core idea of this method is to automatically determine the parameter α of Tsallis entropy and the parameter c of the SVM for different data.
The proposed method is tested on several feature selection and classification problems in terms of several common evaluation measures. The results are compared with other well-established optimization methods and show that the proposed ETLBO achieves better, promising results on almost all the tested problems.
The rest of the paper is organized as follows: Section 2 introduces the Tsallis entropy-based feature selection formula. Section 3 introduces the enhanced teaching-learning-based optimization and the ETLBO-optimized feature selection design. In Section 4 and Section 5, the feature selection results and the algorithm analysis are given. Finally, the conclusions are summarized in Section 6.
3. Enhanced Teaching-Learning-Based Optimization (ETLBO)
In this section, we introduce the proposed method in detail. First, we introduce TLBO and the strategies used in the proposed method. Then, the ETLBO is introduced. Finally, the flowchart of the proposed method is described.
3.1. Teacher Phase
It is the first part of the algorithm, where the learner with the highest marks acts as a teacher, and the teacher's task is to increase the mean marks of the class. The update process of the i-th learner in the teacher phase is formulated as:

$$X_i^{new} = X_i + rand \times (X_{teacher} - T_F \times X_{mean})$$

where $X_i$ is the solution of the i-th learner, $X_{teacher}$ represents the teacher's solution, $X_{mean}$ denotes the average of all learners, $rand$ is a random number in (0,1), and $T_F$ is the teaching factor that decides the value of the mean to be changed. Its value can be either 1 or 2, which is again a heuristic step, decided randomly with equal probability: $T_F = round[1 + rand(0,1)]$.

In addition, the new solution $X_i^{new}$ is accepted only if it is better than the previous solution, which can be formulated as:

$$X_i = \begin{cases} X_i^{new}, & f(X_i^{new}) < f(X_i) \\ X_i, & \text{otherwise} \end{cases}$$

where $f$ denotes the fitness function.
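To make the update concrete, a minimal NumPy sketch of the teacher phase is given below; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def teacher_phase(population, fitness_fn):
    """One TLBO teacher-phase sweep over the population (illustrative sketch)."""
    fitness = np.array([fitness_fn(x) for x in population])
    teacher = population[np.argmin(fitness)]        # best learner acts as teacher
    mean = population.mean(axis=0)                  # mean marks of the class
    for i in range(len(population)):
        tf = np.random.randint(1, 3)                # teaching factor: 1 or 2, equal probability
        candidate = population[i] + np.random.rand(population.shape[1]) * (teacher - tf * mean)
        if fitness_fn(candidate) < fitness[i]:      # greedy acceptance
            population[i] = candidate
    return population
```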
3.2. Learner Phase
The second part of the algorithm is where a learner updates its knowledge through interaction with other learners. In each iteration, two learners $X_i$ and $X_j$ ($i \neq j$) interact, and the more knowledgeable learner improves the marks of the other; that is, a learner learns new things only if the other learner has more knowledge. The phenomenon is described as follows:

$$X_i^{new} = \begin{cases} X_i + rand \times (X_i - X_j), & f(X_i) < f(X_j) \\ X_i + rand \times (X_j - X_i), & \text{otherwise} \end{cases}$$

The temporary solution is accepted only if it is better than the previous solution, which can be formulated as:

$$X_i = \begin{cases} X_i^{new}, & f(X_i^{new}) < f(X_i) \\ X_i, & \text{otherwise} \end{cases}$$
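A corresponding sketch of the learner phase, under the same illustrative conventions:

```python
import numpy as np

def learner_phase(population, fitness_fn):
    """One TLBO learner-phase sweep (illustrative sketch)."""
    n, dim = population.shape
    for i in range(n):
        j = np.random.choice([k for k in range(n) if k != i])  # random partner
        xi, xj = population[i], population[j]
        if fitness_fn(xi) < fitness_fn(xj):                    # i is the better learner
            candidate = xi + np.random.rand(dim) * (xi - xj)
        else:                                                  # j is the better learner
            candidate = xi + np.random.rand(dim) * (xj - xi)
        if fitness_fn(candidate) < fitness_fn(xi):             # greedy acceptance
            population[i] = candidate
    return population
```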
3.3. Adaptive Weight Strategy
The adaptive weight strategy makes it easier to jump out of local minima, facilitating global optimization. When TLBO solves complex optimization functions, the algorithm easily falls into local optima, whereas a smaller inertia factor benefits precise local search in the current search domain. We therefore design a new weight $t$, which can be written as follows:

$$t = 1 - \frac{iter}{Max\_iter}$$

where $iter$ is the current iteration number and $Max\_iter$ is the maximum number of iterations.
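As a sketch, such a linearly decreasing schedule is a one-liner; the exact schedule used in the paper may differ from this linear form.

```python
def adaptive_weight(iter_, max_iter):
    """Weight that is large early (global exploration) and small late (local search)."""
    return 1.0 - iter_ / max_iter
```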
3.4. Kent Chaotic Map (KCM)
Chaotic mapping is a kind of nonlinear mapping that can generate a random number sequence. It is sensitive to initial values, which ensures that the encoder can generate an uncorrelated encoding sequence. There are many kinds of chaotic maps, such as the Logistic map and the Kent map. In this paper, we use the Kent map as the improvement strategy. The formula of the Kent map is as follows:

$$x_{n+1} = \begin{cases} x_n / a, & 0 < x_n \leq a \\ (1 - x_n)/(1 - a), & a < x_n < 1 \end{cases}$$

where $a \in (0, 1)$ is a control parameter and $x_0$ is the initial value of the sequence. In this paper, $a$ is set to a fixed constant.
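A short sketch of the Kent map follows; the control value a = 0.4 is a commonly used setting and is an assumption here, since the paper's exact value is not shown.

```python
def kent_sequence(x0, a=0.4, n=100):
    """Generate n values of the Kent chaotic map (a = 0.4 is an assumed setting)."""
    seq, x = [], x0   # requires 0 < x0 < 1
    for _ in range(n):
        x = x / a if x <= a else (1.0 - x) / (1.0 - a)
        seq.append(x)
    return seq
```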
3.5. Proposed Method
There are two phases in the basic TLBO search process to update an individual's position. In the teacher phase, we use the Kent chaotic map to improve the teacher's original state, so that teachers can be endowed with different abilities to teach different students; this strategy allows the abilities of different teachers to be demonstrated. In the learner phase, we design a learning-efficiency weight to improve the students' learning state. The adaptive weight adapts as the iterations increase: students learn more knowledge at the beginning of the iterations, and once they have obtained enough knowledge toward the end, the adaptive weight becomes small, so students learn different amounts of knowledge in different phases. The formula can be represented as follows:

$$X_i^{new} = \begin{cases} t \times X_i + rand \times (X_i - X_j), & f(X_i) < f(X_j) \\ t \times X_i + rand \times (X_j - X_i), & \text{otherwise} \end{cases}$$

where $t$ is the adaptive weight.
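The weighted learner step can be sketched as follows, combining the adaptive weight t with the learner-phase interaction; this is an interpretation of the formula above, not the paper's verbatim implementation.

```python
import numpy as np

def weighted_learner_step(xi, xj, fi, fj, t):
    """Learner update with the adaptive weight t applied to the learner's own state."""
    direction = (xi - xj) if fi < fj else (xj - xi)  # move away from the worse learner
    return t * xi + np.random.rand(xi.size) * direction
```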
The proposed classification method can be divided into two parts: feature selection and parameter selection for the SVM. First, the Tsallis entropy of the target is calculated using Equation (1). Then the entropy of each feature with respect to the target is calculated and subtracted from the target's entropy using Equation (2). In this process, the selected feature x and the parameter α of Tsallis entropy are optimized by the ETLBO; the parameter α determines the behavior of the Tsallis entropy.
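As a rough illustration of this scoring, the sketch below assumes the common discrete Tsallis form $S_q(p) = (1 - \sum_i p_i^q)/(q - 1)$ for Equation (1) and an information-gain-style difference for Equation (2); the exact equations are defined in Section 2, and the function names here are hypothetical.

```python
import numpy as np

def tsallis_entropy(labels, q):
    """Discrete Tsallis entropy of a label vector (assumed form of Equation (1))."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def feature_score(feature, target, q):
    """Target entropy minus the feature-conditioned entropy (sketch of Equation (2))."""
    h_cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        h_cond += mask.mean() * tsallis_entropy(target[mask], q)
    return tsallis_entropy(target, q) - h_cond
```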
In the second part, we use the ETLBO to optimize the parameter c of the SVM. The penalty coefficient c is the compromise between the smoothness of the fitting function and the classification accuracy. When c is too large, the training accuracy is high but the generalization ability is poor; when c is too small, errors increase. Therefore, a reasonable selection of parameter c can obviously improve the model's classification accuracy and generalization ability.
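A minimal sketch of the fitness evaluation for this step, using scikit-learn for illustration; the kernel choice and cross-validation folds are assumptions, not specified by the paper.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def svm_fitness(c, X, y):
    """Fitness for ETLBO: cross-validated error of an SVM with penalty c."""
    accuracy = cross_val_score(SVC(C=c), X, y, cv=5).mean()
    return 1.0 - accuracy  # the optimizer minimizes, so return the error
```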
Finally, the selected feature x, the parameter α of Tsallis entropy, and the parameter c of the SVM are optimized by ETLBO. We use the parameters optimized by the ETLBO together with the SVM to classify the test dataset, and the SVM classifier outputs the classification result. The flowchart of the proposed method is shown in Figure 1.
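One way to view the search space is as a single concatenated vector holding the feature mask, α, and c; the encoding below is a hypothetical illustration, not the paper's exact scheme.

```python
import numpy as np

def decode_solution(solution, n_features):
    """Split one ETLBO solution vector into its three parts (assumed encoding)."""
    mask = solution[:n_features] > 0.5   # binary feature-selection mask
    alpha = solution[n_features]         # Tsallis entropy parameter
    c = solution[n_features + 1]         # SVM penalty parameter
    return mask, alpha, c
```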
4. Experiments and Results
To analyze the effectiveness of the proposed method, six optimization algorithms are used for comparison: PSO [14], WOA [21], HHO [36], TLBO [31], HSOA [37], and HTLBO [34]. PSO, WOA, HHO, and TLBO are original optimization algorithms with a strong ability to find the optimal values of mathematical functions, but when they optimize engineering problems their performance degrades. Many scholars study strategies to improve optimization algorithms. HSOA and HTLBO are improved methods; these two algorithms use hybridization to enhance the optimization ability of SOA and TLBO, and they perform excellently on the problems addressed in the references [34,37]. However, these algorithms may not solve all problems. Therefore, we select these algorithms as comparison algorithms to test the performance of the proposed method.
The parameter settings are the same as in the corresponding references. All methods, including the proposed ETLBO, are coded and implemented in MATLAB 2018B, and the experiments are run on a computer with an i7-11800H central processing unit. To keep the comparison fair, each algorithm runs 30 times independently, with a population size of 30 and a maximum of 500 iterations.
The results of the proposed method are described in this section. First, the fitness values obtained by the different optimization algorithms are compared to show the performance of these approaches. Then, we analyze the classification results of the compared algorithms. Finally, a discussion of the proposed method is given.
4.1. Datasets and Evaluation Index
The benchmark datasets used in the evaluations are introduced here. We selected 16 standard datasets from the University of California, Irvine (UCI) data repository [38]. Table 1 records the primary information of these selected datasets.
To evaluate the classification results, we use the F-score, the classification accuracy, and the CPU time as metric indices. The F-score can be defined as follows:

$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$

$$F\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}$$

where $TP$ is the number of true positives, $TN$ is the number of true negatives, $FP$ is the number of false positives, and $FN$ is the number of false negatives.
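For reference, a direct computation of the F-score from confusion-matrix counts:

```python
def f_score(tp, fp, fn):
    """Standard F-score from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```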
4.2. Experiment 1: Feature Selection
Table 2 shows the fitness values of the compared algorithms. The table shows that when the number of features is small, all the compared algorithms can reduce the number of features; as the number of features increases, it poses a greater challenge to the optimization algorithms. The ETLBO obtains better performance than the compared algorithms.
Table 3 shows the standard deviation (std) of the fitness values. It can be seen from the table that the ETLBO has strong robustness.
Table 4 shows the number of selected attributes. The compared algorithms can all reduce the number of features. When the number of attributes is small, the compared algorithms obtain the same result; when the number of attributes is large, the ETLBO selects the fewest attributes among the compared algorithms. Over the total attributes of the datasets, the ETLBO also selects fewer attributes than the other algorithms, which means that the ETLBO can effectively reduce the number of features. However, reducing the number of features does not by itself mean that the classification accuracy is high.
Table 5 shows the parameters obtained by ETLBO. It can be seen from the table that the ETLBO obtains different values on the diverse datasets. The ETLBO not only reduces the number of features but also acquires the parameter α of Tsallis entropy and the parameter c of the SVM. We will test the performance of the compared algorithms in the next section.
4.3. Experiment 2: Classification
Table 6 shows the classification results of the compared algorithms. Table 7 shows the F-scores of the compared methods. The results show that the ETLBO is better than the original TLBO, so the strategies improve the optimization ability of TLBO. At the same time, the HSOA and ETLBO are better than the other algorithms, which means that such strategies significantly boost the original optimization algorithms. In terms of their F-score results, the methods can be ordered as follows: ETLBO > HTLBO > HSOA > HHO > PSO > WOA > TLBO.
To sum up, the ETLBO obtains the best results among the compared algorithms: it not only reduces the number of features but also achieves high classification accuracy.
Table 8 shows the std of the classification accuracy. The ETLBO is more stable than the other algorithms; the proposed method has strong robustness in completing the classification task.
A statistical test is an essential measure to evaluate and prove the performance of the tested methods. Parametric statistical tests are based on various assumptions, so this section uses a well-known non-parametric statistical test, Wilcoxon's rank-sum test [39]. Table 9 shows the results of the Wilcoxon rank-sum test. It can be seen that the ETLBO is significantly different from the other methods.
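Such a pairwise comparison can be reproduced with SciPy's rank-sum implementation; the two run arrays below are placeholder data for illustration.

```python
import numpy as np
from scipy.stats import ranksums

etlbo_runs = np.random.rand(30)     # placeholder: 30 independent run accuracies
baseline_runs = np.random.rand(30)  # placeholder: a compared algorithm's runs

stat, p_value = ranksums(etlbo_runs, baseline_runs)
print(p_value < 0.05)  # True indicates a significant difference at the 5% level
```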
The CPU time is also an important index for practical engineering problems. The CPU time results of the compared algorithms are given in Table 10. The CPU time ordering of the algorithms is: TLBO < PSO < WOA < HHO < ETLBO < HTLBO < HSOA. Although the ETLBO costs considerable CPU time, its classification accuracy is good, and it uses less CPU time than the other improved methods, HTLBO and HSOA. This means that the strategies adapt well to TLBO: they enhance it at a lower CPU cost than the other improved methods.
4.4. Experiment 3: Comparison with Different Classifiers
In this section, we compare the proposed method with different classifiers: K-Nearest Neighbor (KNN), the original SVM, and random forest (RF) [40]. Table 11 shows the configuration parameters and characteristics of the classifier models. Table 12 reports the evaluation indices of the compared algorithms. The ETLBO obtains the best results among the compared classifiers on all indices, outperforming KNN, SVM, and RF with improvements of 3.45%, 2.94%, and 1.62%, respectively, in the F-score index. To sum up, the optimization algorithm obtains the optimal parameters of the SVM, and the resulting classification accuracy is higher than that of the other compared classifiers.
5. Discussion
The proposed method has strong optimization ability for solving the Tsallis entropy-based feature selection problem in the feature selection domain. The ETLBO selects a suitable parameter for the Tsallis entropy and, at the same time, successfully reduces the number of features. Optimization algorithms have robust optimization ability in general, but they do not adapt equally well to different optimization problems, so adaptive strategies are very effective for improving them.
In the classification field, the proposed method obtains better classification accuracy than the compared algorithms. It finds proper parameters for the SVM classifier and achieves higher classification accuracy and stronger robustness than the compared algorithms. At the same time, the proposed method is better than the other compared classifiers. Therefore, the ETLBO algorithm can be used in classification tasks.
The proposed method's limitation is that the optimization algorithm needs many iterations to find the optimal solution, which is time-consuming. Improving the optimization capability and reducing the number of iterations can address this problem. Therefore, it is necessary to explore more powerful optimization algorithms and new strategies in future work.