Article

A Cox Proportional-Hazards Model Based on an Improved Aquila Optimizer with Whale Optimization Algorithm Operators

by Ahmed A. Ewees 1,2,*, Zakariya Yahya Algamal 3, Laith Abualigah 4,5, Mohammed A. A. Al-qaness 6, Dalia Yousri 7, Rania M. Ghoniem 8 and Mohamed Abd Elaziz 9,10,11

1 Department of e-Systems, University of Bisha, Bisha 61922, Saudi Arabia
2 Department of Computer, Damietta University, Damietta 34517, Egypt
3 Department of Statistics and Informatics, University of Mosul, Mosul 41002, Iraq
4 Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
5 School of Computer Sciences, Universiti Sains Malaysia, Gelugor 11800, Pulau Pinang, Malaysia
6 State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
7 Department of Electrical Engineering, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
8 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
9 Faculty of Computer Science & Engineering, Galala University, Suez 435611, Egypt
10 Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates
11 Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(8), 1273; https://doi.org/10.3390/math10081273
Submission received: 4 March 2022 / Revised: 3 April 2022 / Accepted: 6 April 2022 / Published: 12 April 2022

Abstract: Recently, a new optimizer, called the Aquila Optimizer (AO), was developed to solve different optimization problems. Although the AO performs well on various problems, like other optimization algorithms it suffers from certain limitations in its search mechanism, such as local optima stagnation and slow convergence. This is a general problem that faces almost all optimization algorithms and can be addressed by enhancing the search process of an optimizer with an assistant search tool, such as hybridizing it with another optimizer or applying other search techniques to boost its search capability. Following this concept, in this paper we present an alternative version of the AO to alleviate the shortcomings of the traditional one. The main idea of the improved AO (IAO) is to use the search strategy of the Whale Optimization Algorithm (WOA) to boost the search process of the AO. Thus, the IAO benefits from the advantages of both the AO and the WOA, and it avoids the limitations of local search as well as the loss of solution diversity during the search process. Moreover, we apply the developed IAO optimization algorithm as a feature selection technique using different benchmark functions. It is tested with extensive experimental comparisons against the traditional AO and WOA algorithms, as well as several well-known optimizers used as feature selection techniques, such as particle swarm optimization (PSO), differential evolution (DE), the moth-flame optimizer (MFO), the firefly algorithm, and the genetic algorithm (GA). The outcomes confirmed that using the WOA operators has a significant impact on the AO performance; thus, the combined IAO obtained better results than the other optimizers.

1. Introduction

Data mining is an essential stage in extracting knowledge from data, and the knowledge gained from it is employed in various sectors, including industrial and medical applications [1]. Recently, there has been a growth in the number of features gathered and retained in databases, though not all of them are valuable for data analysis; some of them are utterly useless or unnecessary. Such features contribute nothing to the information extraction, but they mainly increase the complexity and incompleteness of the outcomes. As a result, feature selection aids in reducing the dimensionality of data prior to data processing [2]. A vast database with numerous features has n features to manage, and the computational cost to assess all feature subsets is exponential ($O(2^n)$), making exhaustive evaluation essentially unattainable. As a result, feature selection techniques serve as a foundation for data mining, allowing beneficial characteristics to be retained for further learning tasks while discarding the most irrelevant and less significant ones. In reality, feature selection approaches disregard unimportant features, allowing the learning process to be more successful [3]. It has also been demonstrated that feature selection improves the classification performance of data mining algorithms such as the KNN classifier.
The three main approaches to feature selection are filter methods, embedded methods, and wrapper methods [4]. In filter methods, the selected features are filtered depending on the general properties of the used datasets (i.e., known metrics, for example, correlation); these methods can be implemented without predictive models. Filter methods are fast, but they can struggle to avoid overfitting and may fail to select the best features. In contrast, wrapper methods act as wrappers around the predictive models and employ those models to select the best features. The main drawback of wrapper methods is their computational cost, but they produce better performance. In embedded methods, the process of selecting features is embedded in the learning model. Embedded methods are less computationally expensive than wrapper methods and can be considered better with respect to overfitting. A small sketch of the wrapper idea is given below.
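To make the wrapper idea concrete, the following is a minimal sketch (not taken from this paper) of how a candidate feature subset can be scored by wrapping a KNN classifier, assuming scikit-learn is available and X is a feature matrix with labels y; the function name and parameters are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def wrapper_score(X, y, mask, k=5):
    """Score a binary feature mask by the cross-validated accuracy of a KNN model."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                      # an empty subset cannot be evaluated
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()
```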
The approach adopted in this work is a wrapper approach. Choosing a vast number of features does not necessarily result in an excellent accuracy rate for many pattern classification problems. In some instances, the effectiveness of classification algorithms, in terms of both speed and predictive value, can degrade because features are unimportant, confusing, or strongly correlated with each other. During the learning stage, such characteristics might have a detrimental influence on the categorization process. Ideally, a feature selection approach decreases the cost of feature assessment while increasing classifier performance and quality. Several techniques have traditionally been used to choose features from training and testing data, including the Arithmetic Optimization Algorithm [5], Binary Butterfly Optimization [6], the Aquila Optimizer [7], the binary Gradient-based Optimizer [8], the Firefly Algorithm [9], Atomic Orbit Search [10], the Runge Kutta optimizer (RUN) [11], the Colony Predation Algorithm (CPA) [12], the Slime Mould Algorithm (SMA) [13], Harris Hawks Optimization (HHO) [14], Hunger Games Search (HGS) [15], and others.
A feature selection approach based on ant colony optimization is given in [16]. The approach uses numerous rounds to select the best feature subset without utilizing any learning techniques. Furthermore, the feature importance is estimated using the correlation among features, resulting in reduced redundancy. The experimental findings on numerous commonly used datasets demonstrate the method's efficiency and its improvements over earlier comparable approaches. The work in [17] provides a new machine learning technique for high-dimensional data, which uses the Henry gas solubility optimization (HGSO) method to pick key features and enhance classification performance. The suggested technique is assessed against well-established optimization algorithms using multiple datasets with a broad range of feature sizes, from small to large. The empirical research indicates that the suggested method is highly successful on both low- and high-dimensional data.
A new hybrid ant colony optimization approach for feature selection utilizing a learning algorithm is presented in [18]. Choosing a subset of salient features of reduced size is an essential part of this approach. The suggested method employs a hybrid search methodology that combines the benefits of the filter and wrapper techniques. The comparison details demonstrate that the presented process has a remarkable capacity to construct reduced subsets of prominent features while still giving high classification performance.
In [19], a new feature selection technique based on a mathematical model of how grasshoppers interact while searching for food is suggested. The grasshopper optimization technique was modified to make it suitable for the feature selection challenge. The proposed strategy is augmented by statistical measures to remove redundant features and keep the most interesting ones across repetitions. Comparative trials show that the suggested approach is more effective than existing classification techniques. The work in [20] attempted to increase the effectiveness of text classification using the particle swarm optimizer. Many exploratory search strategies are conducted in this study by examining current accomplishments of enhanced particle swarm optimizers and characteristics of traditional feature selection methods. The basic model is chosen first, followed by two upgraded models based on the structural inertia weight and steady restriction factor to optimize feature selection approaches. The trial findings and significance tests reveal that the dynamically upgraded model outperforms all others in text classification effectiveness and dimension reliability.
The work in [21] suggests a non-negative multi-label feature selection method with dynamic graph constraints to overcome the feature selection problem. In the presented model, linear regression is used to map the original data into a low-dimensional space to create the label matrix. The results demonstrate the efficiency of the suggested strategy on ten real datasets compared to other comparative approaches. A novel binary version of the grasshopper optimizer is presented and employed in [22] for the feature subset selection challenge. This binary grasshopper optimizer is evaluated against five optimization algorithms employed for the feature selection problem. These techniques have been developed and tested on different data sets of varying sizes. The findings showed that the suggested strategy outperformed the other approaches examined.
The paper [23] provides a novel feature-selection search strategy for feature selection-based intrusion detection systems (IDS) using the cuttlefish optimization algorithm. Because IDS deal with a vast quantity of data, one of their most important duties is maintaining the highest quality of features that represent the real data set while removing duplicate and unnecessary characteristics. Compared to the results produced utilizing all features, the feature subset derived via the proposed method provides a greater increase in the security and correctness rate.
Unfortunately, several issues are not addressed in the research mentioned above [24]. To begin with, all features are chosen at random with the same chance; as a result, the principal features cannot be quickly selected for inclusion in the newly generated feature subset. Moreover, traditional feature selection techniques cannot adequately select the most informative features, and the improved optimization methods still struggle to determine the most relevant features through an efficient search process. These limitations significantly reduce the efficiency of searching for the ideal feature subset.

Motivation and Contribution

To some extent, population-based optimization algorithms can prevent local optima stagnation, and they have a high capacity to converge to the optima. One of the primary motivations for this research is that there is no single optimizer suitable for addressing all kinds of problems, as stated by the No-Free-Lunch theorem; hence, the excellent performance of an optimization method on one set of problems does not guarantee compelling performance on other problems. To the best of our knowledge, no one has yet combined the Aquila Optimizer with the leading search operators of the Whale Optimization Algorithm to tackle feature selection in a systematic manner. The authors selected the Whale Optimization Algorithm because of its proven efficiency and superiority compared with numerous algorithms, such as PSO, GA, GWO, etc., in several optimization problems in different fields. Moreover, the logarithmic spiral function of the WOA is an attractive operator for enhancing the AO phases to cover a wider area of an uncertain search space. This was the primary motivation for us to select the Aquila Optimizer as the core of our work. This paper's overarching focus is on providing a new binary variant of the Aquila Optimizer, called IAO, for wrapper feature selection. The IAO enhances the original search strategies of the Aquila Optimizer by using the main operators of the Whale Optimization Algorithm. This modification enables the IAO to tackle the main weaknesses of using a single search method by avoiding the local search problem and the loss of solution diversity in the search stage. The suggested technique identifies the best feature subset, which reduces the feature subset size while increasing classification performance. The proposed IAO is evaluated on benchmark problems in terms of fitness values, the number of selected features, and classification accuracy. The obtained results showed that the IAO achieved promising outcomes compared to different feature selection methods. Moreover, the IAO's search ability is clearly observed in determining the best relative subset of features.
The remaining parts of the paper are arranged as follows: in Section 2, the Aquila Optimizer and the Whale Optimization Algorithm are described. Section 3 introduces the proposed algorithm for feature selection. Section 4 shows the experiments, results, and discussion. Section 5 summarizes the advantages and drawbacks of the proposed method, and Section 6 presents the conclusion and future work.

2. Background

2.1. The Aquila Optimizer (AO)

This subsection contains the basic formulation of the Aquila Optimizer (AO) [7]. The AO algorithm imitates the Aquila's behaviour while catching its prey. The AO has been adopted to solve various problems, such as time series forecasting [25], improving intrusion detection systems (IDS) [26], task scheduling [27], global optimization [28,29], and others [30].
AO is a population-based optimization technique. It involves the formation of X with N agents as in Equation (1).
$X_{ij} = r_1 \times (UB_j - LB_j) + LB_j, \quad j = 1, 2, \ldots, Dim, \quad i = 1, 2, \ldots, N$    (1)
where $UB_j$ and $LB_j$ denote the upper and lower bounds of the search space, $r_1 \in [0, 1]$ is a random value, and $Dim$ refers to the solution's dimension.
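As a hedged illustration of Equation (1), the following NumPy sketch initializes the population; the function and argument names are assumptions, not the authors' code.

```python
import numpy as np

def init_population(N, dim, lb, ub, rng=np.random.default_rng()):
    """Equation (1): X_ij = r1 * (UB_j - LB_j) + LB_j, with r1 drawn from [0, 1]."""
    r1 = rng.random((N, dim))
    return r1 * (ub - lb) + lb
```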
The next phase in the AO technique is to either explore or exploit until the best solution is identified. According to [7], there are two methods each for exploration and exploitation. In the first exploration method, the best agent $X_b$ and the mean of the solutions $X_M$ are used as follows:
$X_i(t+1) = X_b(t) \times \left(1 - \frac{t}{T}\right) + \left(X_M(t) - X_b(t) \times rand\right)$    (2)
$X_M(t) = \frac{1}{N} \sum_{i=1}^{N} X_i(t), \quad \forall j = 1, 2, \ldots, Dim$    (3)
In Equation (2), the search is governed by the term $\left(1 - \frac{t}{T}\right)$, where $T$ stands for the maximum number of generations.
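A minimal sketch of this expanded exploration update, Equations (2) and (3), is given below; it assumes X is the current (N, Dim) population, xb is the best solution, and t/T is the iteration ratio, with illustrative names throughout.

```python
import numpy as np

def expanded_exploration(X, xb, t, T, rng=np.random.default_rng()):
    xm = X.mean(axis=0)                                            # Equation (3): mean of all solutions
    return xb * (1 - t / T) + (xm - xb * rng.random(X.shape[1]))   # Equation (2)
```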
Meanwhile, the levy flight $Levy(D)$ is used in the second exploration method to improve the ability of the population to find the best solution. This process is formulated as:
$X_i(t+1) = X_b(t) \times Levy(D) + X_R(t) + (y - x) \times rand$    (4)
$Levy(D) = s \times \frac{u \times \sigma}{|\upsilon|^{\frac{1}{\beta}}}, \quad \sigma = \frac{\Gamma(1+\beta) \times \sin\left(\frac{\pi \beta}{2}\right)}{\Gamma\left(\frac{1+\beta}{2}\right) \times \beta \times 2^{\left(\frac{\beta - 1}{2}\right)}}$    (5)
where $s$ and $\beta$ are set to 0.01 and 1.5, respectively, and $u$ and $\upsilon$ are generated randomly. $X_R$ represents a solution selected randomly from X. Furthermore, $y$ and $x$ denote two parameters that are utilized to replicate the spiral shape:
$y = r \times \cos(\theta), \quad x = r \times \sin(\theta)$    (6)
$r = r_1 + U \times D_1, \quad \theta = -\omega \times D_1 + \theta_1, \quad \theta_1 = \frac{3\pi}{2}, \quad U = 0.00565, \quad \omega = 0.005$    (7)
here $r_1 \in [0, 20]$ stands for a random value.
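The Levy-flight term of Equation (5) can be sketched as below, following the formula literally with s = 0.01 and beta = 1.5; this is a simplified reading under those assumptions, not the authors' implementation.

```python
import numpy as np
from math import gamma, sin, pi

def levy(dim, beta=1.5, s=0.01, rng=np.random.default_rng()):
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)))
    u = rng.normal(0.0, 1.0, dim)                    # u and v generated randomly
    v = rng.normal(0.0, 1.0, dim)
    return s * u * sigma / np.abs(v) ** (1 / beta)   # Equation (5)
```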
The first strategy employed in [7] to improve the agents in the exploitation phase, comparable to exploration, is based on both $X_b$ and $X_M$, and it is written as:
$X_i(t+1) = (X_b(t) - X_M(t)) \times \alpha - rnd + ((UB - LB) \times rnd + LB) \times \delta$    (8)
where δ and α are exploitation parameters.
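A short sketch of the expanded exploitation update of Equation (8) follows, with alpha and delta set to the values reported in Table 1 (0.1); the helper name and arguments are assumptions of this sketch.

```python
import numpy as np

def expanded_exploitation(xb, xm, lb, ub, alpha=0.1, delta=0.1,
                          rng=np.random.default_rng()):
    rnd = rng.random(xb.shape)
    return (xb - xm) * alpha - rnd + ((ub - lb) * rnd + lb) * delta   # Equation (8)
```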
In the second exploitation approach, the position is updated by $Levy$, the quality function $QF$, and $X_b$. The mathematical definition of this technique is as follows:
$X_i(t+1) = X_b(t) \times QF - GX - G_2 \times Levy(D) + rnd \times G_1$    (9)
$GX = X(t) \times G_1 \times rnd$    (10)
$QF(t) = t^{\frac{2 \times rnd() - 1}{(1 - T)^2}}$    (11)
where $G_2$ is a parameter that is updated using the following formula:
$G_2 = 2 \times \left(1 - \frac{t}{T}\right)$    (12)
In addition, the parameter $G_1$, which is used to track the motion of the best solution, is updated as:
$G_1 = 2 \times rnd() - 1$
where $rnd$ represents a random value. Algorithm 1 lists the steps of the AO.
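Before the full listing, this narrowed exploitation step, Equations (9) to (12), can be sketched as follows, reusing the levy() helper from above; again, a hedged illustration rather than the authors' code.

```python
import numpy as np

def narrowed_exploitation(xi, xb, t, T, rng=np.random.default_rng()):
    qf = t ** ((2 * rng.random() - 1) / (1 - T) ** 2)    # Equation (11)
    g1 = 2 * rng.random() - 1                            # tracks the motion of the best solution
    g2 = 2 * (1 - t / T)                                 # Equation (12)
    gx = xi * g1 * rng.random()                          # Equation (10)
    return xb * qf - gx - g2 * levy(xi.shape[0], rng=rng) + rng.random() * g1   # Equation (9)
```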

2.2. Whale Optimization Algorithm

WOA [31] is an optimization technique whose mathematical formulation is inspired by the hunting behaviour of whales. The WOA mimics a unique hunting approach used by humpback whales, known as bubble-net feeding. Each whale's location indicates a solution that may be updated with respect to its attitude toward attacking the prey; such a position is denoted by the symbol $X_b$. Whales can use two tactics to attack their prey [32]. The first strategy is known as encircling prey, in which the humpback whale locates the target and encircles it. The target prey is assumed by the WOA to be the best solution $X_b(t)$. Once $X_b(t)$ has been identified, the other whales attempt to update their positions toward $X_b(t)$, as shown in Equations (13)–(15):
$Dis_i = |B \times X_b(t) - X_i(t)|, \quad B = 2r$    (13), (14)
$X_i(t+1) = X_b(t) - A \times Dis_i$    (15)
In Equation (15), $Dis_i$ stands for the distance between $X_i(t)$ and $X_b(t)$, and $A$ is a coefficient vector calculated by the following equation:
$A = 2a \times r - a$    (16)
where $r \in [0, 1]$, whereas the parameter $a$ is updated as:
$a = a - t \times \frac{a}{t_{max}}$
where $t_{max}$ stands for the maximum number of generations.
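A minimal sketch of this encircling-prey update, Equations (13) to (16), is given below; the linear decay of a from 2 to 0 is assumed here, and all names are illustrative.

```python
import numpy as np

def encircle(xi, xb, t, t_max, rng=np.random.default_rng()):
    a = 2 * (1 - t / t_max)              # assumed linear decay of a from 2 to 0
    r = rng.random(xi.shape)
    A = 2 * a * r - a                    # Equation (16)
    B = 2 * rng.random(xi.shape)         # Equation (14): B = 2r
    dis = np.abs(B * xb - xi)            # Equation (13)
    return xb - A * dis                  # Equation (15)
```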
Algorithm 1 Aquila Optimizer (AO)
1: Input: dimension of each agent Dim,
2:    number of solutions N,
3:    number of generations T.
4: Form the initial population X.
5: Set t = 1.
6: while t ≤ T do
7:     Compute the fitness of each X_i.
8:     Select the best solution X_b(t).
9:     for i = 1, 2, ..., N do
10:        if t ≤ (2/3)T then
11:            if rand() > 0.50 then
12:                Update X_i by Equation (2).
13:            else
14:                Update X_i by Equation (4).
15:            end if
16:        else
17:            if rand() ≤ 0.50 then
18:                Update X_i by Equation (8).
19:            else
20:                Update X_i by Equation (9).
21:            end if
22:        end if
23:        if Fit(X_i(t+1)) < Fit(X_i(t)) then
24:            X_i(t) = X_i(t+1)
25:            if Fit(X_i(t+1)) < Fit(X_b(t)) then
26:                X_b(t) = X_i(t+1)
27:            end if
28:        end if
29:    end for
30: end while
31: Output: return X_b.
The second strategy is bubble-net attacking, which represents the exploitation phase and has two techniques: the spiral updating position and the shrinking encircling mechanism. The shrinking encircling process is satisfied by decreasing the value of $a$ in Equation (16). Using the spiral updating position mechanism, the distance between $X_i$ and $X_b$ is used as follows:
$X(t+1) = Dis_i \times e^{bl} \times \cos(2\pi l) + X_b(t)$    (17)
In Equation (17), $l$ is a value that defines the shape of the logarithmic spiral and $b$ is a constant. In this regard, the whales can also swim around $X_b$ utilizing a spiral-shaped path and a diminishing circle at the same time. As a result, the following equation, which combines Equations (13)–(15) and Equation (17), can be used to update the location:
$X(t+1) = \begin{cases} X_b(t) - A \times Dis_i & \text{if } p < 0.50 \\ Dis_i \times e^{bl} \times \cos(2\pi l) + X_b(t) & \text{if } p \ge 0.50 \end{cases}$    (18)
In Equation (18), $p \in [0, 1]$ stands for a probability used to control the updating mechanism.
Furthermore, instead of using $X_b$, each whale's position can be updated by selecting an arbitrary search agent (whale), $X_r$, as shown in the equations below:
$X(t+1) = X_r - A \times Dis$    (19)
$Dis = |B \times X_r - X(t)|$    (20)
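The two exploitation-phase moves above can be sketched as follows, with b = 1 and l drawn from [-1, 1] as in Table 1; the distance used in the spiral is taken here as |X_b - X_i|, which is an assumption of this sketch rather than a statement of the authors' code.

```python
import numpy as np

def spiral_update(xi, xb, b=1.0, rng=np.random.default_rng()):
    l = rng.uniform(-1.0, 1.0)
    dis = np.abs(xb - xi)                                        # assumed spiral distance
    return dis * np.exp(b * l) * np.cos(2 * np.pi * l) + xb      # Equation (17)

def random_search(xi, xr, A, rng=np.random.default_rng()):
    B = 2 * rng.random(xi.shape)
    dis = np.abs(B * xr - xi)                                    # Equation (20)
    return xr - A * dis                                          # Equation (19)
```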
Algorithm 2 provides the basic steps of WOA.
Algorithm 2 WOA
1: Input: the maximum number of iterations t_max and the population size N.
2: Create a set of N random solutions (X).
3: Set t = 1.
4: Calculate each X_i's fitness value (F_i).
5: Find the solution X_b that corresponds to the best fitness value F_b.
6: for t = 1 : t_max do
7:     while a > 0 do
8:         for all X_i ∈ X do
9:             Update the value of p randomly as p = rand.
10:            if p ≥ 0.5 then
11:                To improve X_i, use Equation (17).
12:            else
13:                if A ≥ 0.5 then
14:                    To improve X_i, use Equation (19).
15:                else
16:                    To improve X_i, use Equation (15).
17:                end if
18:            end if
19:        end for
20:        Update the value of a.
21:    end while
22: end for

3. Proposed Method

This section provides a description of the IAO method. It utilizes the benefits of the WOA to enhance the performance of the original version of the AO. In detail, the WOA is used as a local search for the AO to raise its capability in solving different optimization problems, which adds more ability and flexibility to the IAO to explore and exploit the search space as well as to improve its diversity.
The structure of the IAO is illustrated in Figure 1. The IAO starts by declaring the global parameters and generating the initial population using random distribution methods. This population is evaluated to determine the best solution using the objective function. Throughout the optimization process of the IAO, the expanded exploitation of the original AO is improved using the spiral behavior of the WOA to update the solutions. In this regard, the solution updates that use the expanded exploitation equation of the AO, Equation (8), are instead performed with the WOA spiral equation, Equation (17). Therefore, the exploitation phase of the IAO benefits from both the AO and WOA algorithms. Then, each solution is checked and updated according to the objective function, and the best one is retained for the subsequent iteration. This sequence is iterated for all solutions until the stop condition is reached; afterwards, the best result within the population is selected and saved. Finally, the final results are presented.
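The description above can be summarized in the following high-level sketch of one IAO iteration, in which the WOA spiral update (Equation (17)) takes the place of the AO expanded exploitation (Equation (8)); it reuses the helper sketches from Section 2 and is an illustration of the idea under those assumptions, not the authors' implementation.

```python
import numpy as np

def iao_iteration(X, xb, t, T, lb, ub, fitness, rng=np.random.default_rng()):
    for i in range(X.shape[0]):
        if t <= (2 / 3) * T:                              # exploration phase of the AO
            if rng.random() > 0.5:
                cand = expanded_exploration(X, xb, t, T, rng)
            else:                                         # narrowed exploration, spiral term omitted
                xr = X[rng.integers(X.shape[0])]
                cand = xb * levy(X.shape[1], rng=rng) + xr + rng.random()
        else:                                             # exploitation phase
            if rng.random() <= 0.5:
                cand = spiral_update(X[i], xb, rng=rng)   # WOA operator replaces Equation (8)
            else:
                cand = narrowed_exploitation(X[i], xb, t, T, rng)
        cand = np.clip(cand, lb, ub)
        fc = fitness(cand)
        if fc < fitness(X[i]):                            # keep the better solution
            X[i] = cand
            if fc < fitness(xb):
                xb = cand.copy()
    return X, xb
```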
Furthermore, the IAO begins by declaring the parameters of both the AO and WOA. Then the AO generates a random binary population $X = [x_i], i = 1, 2, \ldots, N$, with size $N$ and dimension $D$. The population values are converted to binary values by Equation (21).
$X = \begin{cases} 1 & \text{if } x_i \ge 0.5 \\ 0 & \text{otherwise} \end{cases}$    (21)
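Equation (21) corresponds to a simple thresholding step; a one-line sketch (with an assumed threshold argument) is:

```python
import numpy as np

def binarize(x, threshold=0.5):
    """Equation (21): keep feature j when x_j >= threshold, drop it otherwise."""
    return (np.asarray(x) >= threshold).astype(int)
```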
Then, the initial objective function value is computed using the operators of the AO, whereas the remaining values of the objective function are computed using the IAO structure. This sequence is iterated until the stop condition is met, and in the final step the best results are presented as the output of the IAO. Equation (22) is applied to compute the objective function value:
$f(x_i(t)) = \xi \, E_{x_i(t)} + (1 - \xi) \left(\frac{|x_i(t)|}{|C|}\right)$    (22)
where $E_{x_i(t)}$ denotes the error of the fitness function in Equation (23), and $\xi \in [0, 1]$ balances the error and the number of selected features. The terms $|x_i(t)|$ and $|C|$ denote the number of selected features and the total number of features, respectively [5].
$fit = h(X_i, t) = h_0(t) \exp\left[\sum_{j=1}^{p} x_{ij} b_j\right]$    (23)
where $X_i = (x_{i1}, x_{i2}, \ldots)$ denotes the predictor variables, $h_0(t)$ refers to the baseline hazard rate function, and $h(X_i, t)$ refers to the hazard rate at time $t$ for $X_i$.
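A hedged sketch of Equations (22) and (23) is given below: the error term is taken here as the negative partial log-likelihood of a Cox proportional-hazards model fitted on the selected genes (using the third-party lifelines package), and the weight xi and the penalizer value are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def objective(mask, expr, time, event, xi=0.99):
    """Equation (22): weighted sum of a Cox-model error term and the feature ratio."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return np.inf                                  # reject empty feature subsets
    df = pd.DataFrame(expr[:, cols], columns=[f"g{c}" for c in cols])
    df["time"], df["event"] = time, event
    cph = CoxPHFitter(penalizer=0.1).fit(df, duration_col="time", event_col="event")
    error = -cph.log_likelihood_                       # assumed error term E_x
    return xi * error + (1 - xi) * (cols.size / expr.shape[1])
```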
Moreover, the complexity of the presented algorithm depends on the complexities of the AO and WOA, so it is $O(N \times t_{max} \times D)$, since $O_{IAO} = K_1 O_{AO} + (N - K_1) O_{WOA}$, where $O_{AO}$ and $O_{WOA}$ are each $O(N \times t_{max} \times D)$.

4. Experiment and Results

The use of regression modeling to investigate the effects of many factors on a response is commonplace. In examining time-to-event data, the model of Cox proportional hazards is widely applied. It is a model applied in medical studies to examine the association between a patient’s survival time and other predictor variables.
Creating a model of Cox proportional hazards that includes all of the predictors is undesirable when the number of predictors is enormous since it gives low prediction accuracy and is challenging to comprehend [33]. Variable selection has become a significant emphasis on Cox proportional-hazards modeling due to these factors.
Four biological benchmark datasets are used in this study to evaluate the performance of the modified variant of the AO (IAO). The diffuse large B-cell lymphoma dataset (DLBC2002) [34] comprises the samples of 240 lymphoma patients, each with 7399 gene expression measurements. The lung cancer dataset (Lung-cancer) [35] is the second dataset; it comprises information on 86 lung cancer patients, each with 7129 gene expression measurements. The Dutch breast cancer dataset (Duch Breasst) [36] is the third dataset and contains information on 295 breast cancer patients, with 4919 gene expression measurements per patient. The cytogenetically normal acute myeloid leukaemia dataset (AML-full) is the fourth collection [37]; it contains information from 165 patients, with 6283 gene expression measurements per patient. The survival time, whether censored or not, is the response variable in all datasets.
The improved version of the AO has been compared with a set of popular optimizers, including the standard AO algorithm, the firefly algorithm (Firefly), the genetic algorithm (GA), the salp swarm algorithm (SSA), the particle swarm optimizer (PSO), differential evolution (DE), the WOA, and the moth-flame optimizer (MFO). The parameters of those algorithms are listed in Table 1. All the aforementioned optimizers are implemented over 30 runs with 100 iterations and 50 search agents on the Matlab 2020a platform for an unbiased comparison. Several statistical metrics have been computed to provide a detailed analysis. The metrics reported in Table 2, Table 3, Table 4, Table 5 and Table 6 are the average, worst (Max, Equation (25)), and best (Min, Equation (24)) fitness function ($fit$) values (Equation (23)), as well as the standard deviation (Equation (26)) and the number of selected features:
$Min_{fit} = \min_{1 \le i \le r} fit_i$    (24)
$Max_{fit} = \max_{1 \le i \le r} fit_i$    (25)
$Standard\ deviation = \sqrt{\frac{1}{N} \sum_{i=1}^{N} |fit_i - \mu|^2}$    (26)
where $\mu$ refers to the mean of the fitness function ($fit$) values and $N$ refers to the number of samples.
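These reporting metrics can be computed from the fitness values collected over the independent runs with a small helper such as the following (illustrative only, with assumed names):

```python
import numpy as np

def summarize(fit_runs):
    fit = np.asarray(fit_runs, dtype=float)
    mu = fit.mean()
    return {
        "average": mu,
        "min": fit.min(),                                # Equation (24)
        "max": fit.max(),                                # Equation (25)
        "std": np.sqrt(np.mean(np.abs(fit - mu) ** 2)),  # Equation (26)
    }
```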
The average, Max, and Min values of the log-likelihood-based fitness function are reported in Table 2, Table 3 and Table 4, respectively, to highlight the performance that our IAO and the other employed algorithms can achieve on the four datasets (in all tables the best values are in boldface). According to Table 2, Table 3 and Table 4, the suggested algorithm, IAO, outperformed the other algorithms on all datasets, as it has the smallest fitness function values. The reported data in Table 5 affirm the highly consistent performance of the proposed optimizer compared with the AO, Firefly, SSA, GA, PSO, DE, and MFO. The WOA can be placed at the second rank, after the proposed IAO, in handling the first and fourth datasets (DLBC2002, AML-full). In contrast, the WOA is not efficient for the Lung-cancer dataset, and it shows a remarkable deviation from the IAO for the Duch Breasst dataset. Accordingly, the IAO can be considered a successful technique across all datasets. Regarding the number of selected features in Table 6, the IAO chose fewer genes than the other algorithms, whereas the MFO is the worst technique for handling these datasets as it picked the highest number of genes.
To measure the accumulated performance of the IAO, the mean values of the average, Min, Max, standard deviation, and number of selected features over the four datasets are depicted in Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6. The displayed figures are evidence of the efficiency and superiority of the IAO and its success in handling the four datasets, as it has the smallest average, Min, Max, standard deviation, and number of selected features with high performance. These observations are primarily due to the developed algorithm's ability to compensate for the limitations of the typical AO algorithm. In addition, Figure 7 illustrates the average computation time over all datasets; from this figure, the IAO shows an acceptable time compared to the other methods.
For further analysis, the Friedman test is applied to check the statistical significance of the experiment's methods. It is one of the most important statistical tests for indicating significant differences between compared algorithms [38,39]. Table 7 ranks all methods on all datasets using the Friedman test. From Table 7, we can conclude that the IAO was ranked first on the DLBC2002, Lung-cancer, and AML-full datasets, whereas it was ranked second on Duch Breasst after the Firefly method. Moreover, the IAO showed good performance, as in Figure 8, Figure 9, Figure 10 and Figure 11, which illustrate the boxplots for all datasets.
To analyse the exploration and exploitation of the IAO and the original version of the AO, their behaviours on the studied datasets are illustrated in Figure 12. The curves of Figure 12 illustrate the ratios of exploitation and exploration throughout the search stages on the studied datasets for the IAO and the standard AO. From these curves, it can be observed that there is a balance between the two curves of the IAO throughout the search process. The exploration ratio rises in the first part of the optimization process. Exploitation starts after about 10% of this process and then works together with exploration at a nearly equal ratio, as indicated in the cases of DLBC2002, Duch Breasst, and AML-full, while the AO is still searching for adequate solutions. Hence, the IAO achieves a successful trade-off between the two phases.

5. Advantages and Drawbacks

In this section, the attractive features and limitations of the proposed IAO can be summed up in the following bullet points:
  • The advantages of the proposed IAO are that it achieves a good balance between the exploration and exploitation stages, its consistency and efficiency are more remarkable than those of the standard AO, and it has a stronger ability to reach the optimal solutions compared with the other competitors (Firefly, GA, SSA, PSO, DE, WOA, and MFO).
  • The main limitation of the proposed IAO is the increased number of adjustable parameters introduced by using the operators of the WOA. To address this issue, the authors will adapt these parameters automatically in future work.

6. Conclusions

Advances in optimization methods have been adopted to address different problems, including feature selection (FS). Therefore, this study proposed a novel version of the Aquila Optimizer (AO) to solve FS applications. The main idea of this improved version, called IAO, is to boost the search performance of the original AO using the operators of the Whale Optimization Algorithm (WOA). Thus, this new combination provides the proposed IAO with a strong search ability to avoid local optima stagnation. We considered extensive evaluations and comparisons to verify the quality of the suggested IAO, using four medical datasets to test it. The results reflected that the IAO performed better than the original versions of the AO and WOA, as well as several well-known optimization methods, such as the particle swarm optimizer, differential evolution, the moth-flame optimizer, the firefly algorithm, and the genetic algorithm. In future work, the IAO will be evaluated in different fields, for example, multi-objective optimization and parameter estimation.

Author Contributions

Conceptualization, A.A.E. and M.A.A.A.-q.; Data curation, Z.Y.A.; Formal analysis, A.A.E., Z.Y.A., L.A. and D.Y.; Investigation, A.A.E., L.A., M.A.A.A.-q., R.M.G. and M.A.E.; Methodology, A.A.E., Z.Y.A. and M.A.A.A.-q.; Resources, Z.Y.A., R.M.G. and M.A.E.; Software, A.A.E.; Validation, A.A.E., Z.Y.A., D.Y., R.M.G. and M.A.E.; Visualization, A.A.E., L.A., M.A.A.A.-q., D.Y., R.M.G. and M.A.E.; Writing—original draft, A.A.E., Z.Y.A., L.A., M.A.A.A.-q., D.Y., R.M.G. and M.A.E.; Writing—review & editing, A.A.E., Z.Y.A., L.A., M.A.A.A.-q., D.Y. and M.A.E. All authors have read and agreed to the published version of the manuscript.

Funding

Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R138), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ghaemi, M.; Feizi-Derakhshi, M.R. Feature selection using forest optimization algorithm. Pattern Recognit. 2016, 60, 121–129. [Google Scholar] [CrossRef]
  2. Agrawal, R.; Kaur, B.; Sharma, S. Quantum based whale optimization algorithm for wrapper feature selection. Appl. Soft Comput. 2020, 89, 106092. [Google Scholar] [CrossRef]
  3. Gasca, E.; Sánchez, J.S.; Alonso, R. Eliminating redundancy and irrelevance using a new MLP-based feature selection method. Pattern Recognit. 2006, 39, 313–315. [Google Scholar] [CrossRef]
  4. Chuang, L.Y.; Tsai, S.W.; Yang, C.H. Improved binary particle swarm optimization using catfish effect for feature selection. Expert Syst. Appl. 2011, 38, 12699–12707. [Google Scholar] [CrossRef]
  5. Ibrahim, R.A.; Abualigah, L.; Ewees, A.A.; Al-qaness, M.A.; Yousri, D.; Alshathri, S.; Abd Elaziz, M. An Electric Fish-Based Arithmetic Optimization Algorithm for Feature Selection. Entropy 2021, 23, 1189. [Google Scholar] [CrossRef]
  6. Arora, S.; Anand, P. Binary butterfly optimization approaches for feature selection. Expert Syst. Appl. 2019, 116, 147–160. [Google Scholar] [CrossRef]
  7. Abualigah, L.; Yousri, D.; Abd Elaziz, M.; Ewees, A.A.; Al-qaness, M.A.; Gandomi, A.H. Aquila Optimizer: A novel meta-heuristic optimization Algorithm. Comput. Ind. Eng. 2021, 157, 107250. [Google Scholar] [CrossRef]
  8. Jiang, Y.; Luo, Q.; Wei, Y.; Abualigah, L.; Zhou, Y. An efficient binary Gradient-based optimizer for feature selection. Math. Biosci. Eng. 2021, 18, 3813–3854. [Google Scholar] [CrossRef]
  9. Ewees, A.A.; Abualigah, L.; Yousri, D.; Algamal, Z.Y.; Al-qaness, M.A.; Ibrahim, R.A.; Abd Elaziz, M. Improved Slime Mould Algorithm based on Firefly Algorithm for feature selection: A case study on QSAR model. Eng. Comput. 2021, 1–15. [Google Scholar] [CrossRef]
  10. Abd Elaziz, M.; Abualigah, L.; Yousri, D.; Oliva, D.; Al-qaness, M.A.; Nadimi-Shahraki, M.H.; Ewees, A.A.; Lu, S.; Ali Ibrahim, R. Boosting Atomic Orbit Search Using Dynamic-Based Learning for Feature Selection. Mathematics 2021, 9, 2786. [Google Scholar] [CrossRef]
  11. Ahmadianfar, I.; Heidari, A.A.; Gandomi, A.H.; Chu, X.; Chen, H. RUN beyond the metaphor: An efficient optimization algorithm based on Runge Kutta method. Expert Syst. Appl. 2021, 181, 115079. [Google Scholar] [CrossRef]
  12. Tu, J.; Chen, H.; Wang, M.; Gandomi, A.H. The colony predation algorithm. J. Bionic Eng. 2021, 18, 674–710. [Google Scholar] [CrossRef]
  13. Li, S.; Chen, H.; Wang, M.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323. [Google Scholar] [CrossRef]
  14. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872. [Google Scholar] [CrossRef]
  15. Yang, Y.; Chen, H.; Heidari, A.A.; Gandomi, A.H. Hunger games search: Visions, conception, implementation, deep analysis, perspectives, and towards performance shifts. Expert Syst. Appl. 2021, 177, 114864. [Google Scholar] [CrossRef]
  16. Tabakhi, S.; Moradi, P.; Akhlaghian, F. An unsupervised feature selection algorithm based on ant colony optimization. Eng. Appl. Artif. Intell. 2014, 32, 112–123. [Google Scholar] [CrossRef]
  17. Neggaz, N.; Houssein, E.H.; Hussain, K. An efficient henry gas solubility optimization for feature selection. Expert Syst. Appl. 2020, 152, 113364. [Google Scholar] [CrossRef]
  18. Kabir, M.M.; Shahjahan, M.; Murase, K. A new hybrid ant colony optimization algorithm for feature selection. Expert Syst. Appl. 2012, 39, 3747–3763. [Google Scholar] [CrossRef]
  19. Zakeri, A.; Hokmabadi, A. Efficient feature selection method using real-valued grasshopper optimization algorithm. Expert Syst. Appl. 2019, 119, 61–72. [Google Scholar] [CrossRef]
  20. Lu, Y.; Liang, M.; Ye, Z.; Cao, L. Improved particle swarm optimization algorithm and its application in text feature selection. Appl. Soft Comput. 2015, 35, 629–636. [Google Scholar] [CrossRef]
  21. Zhang, Y.; Ma, Y. Non-negative multi-label feature selection with dynamic graph constraints. Knowl.-Based Syst. 2021, 238, 107924. [Google Scholar] [CrossRef]
  22. Hichem, H.; Elkamel, M.; Rafik, M.; Mesaaoud, M.T.; Ouahiba, C. A new binary grasshopper optimization algorithm for feature selection problem. J. King Saud-Univ.-Comput. Inf. Sci. 2019, 34, 316–328. [Google Scholar] [CrossRef]
  23. Eesa, A.S.; Orman, Z.; Brifcani, A.M.A. A novel feature-selection approach based on the cuttlefish optimization algorithm for intrusion detection systems. Expert Syst. Appl. 2015, 42, 2670–2679. [Google Scholar] [CrossRef]
  24. Chen, Y.P.; Li, Y.; Wang, G.; Zheng, Y.F.; Xu, Q.; Fan, J.H.; Cui, X.T. A novel bacterial foraging optimization algorithm for feature selection. Expert Syst. Appl. 2017, 83, 1–17. [Google Scholar] [CrossRef]
  25. AlRassas, A.M.; Al-qaness, M.A.; Ewees, A.A.; Ren, S.; Abd Elaziz, M.; Damaševičius, R.; Krilavičius, T. Optimized ANFIS model using Aquila Optimizer for oil production forecasting. Processes 2021, 9, 1194. [Google Scholar] [CrossRef]
  26. Fatani, A.; Dahou, A.; Al-qaness, M.A.; Lu, S.; Abd Elaziz, M. Advanced Feature Extraction and Selection Approach Using Deep Learning and Aquila Optimizer for IoT Intrusion Detection System. Sensors 2021, 22, 140. [Google Scholar] [CrossRef]
  27. Kandan, M.; Krishnamurthy, A.; Selvi, S.; Sikkandar, M.Y.; Aboamer, M.A.; Tamilvizhi, T. Quasi oppositional Aquila optimizer-based task scheduling approach in an IoT enabled cloud environment. J. Supercomput. 2022, 1–15. [Google Scholar] [CrossRef]
  28. Wang, S.; Jia, H.; Liu, Q.; Zheng, R. An improved hybrid Aquila Optimizer and Harris Hawks Optimization for global optimization. Math. Biosci. Eng 2021, 18, 7076–7109. [Google Scholar] [CrossRef]
  29. Zhang, Y.J.; Yan, Y.X.; Zhao, J.; Gao, Z.M. AOAAO: The Hybrid algorithm of Arithmetic Optimization algorithm with Aquila Optimizer. IEEE Access 2022, 10, 10907–10933. [Google Scholar] [CrossRef]
  30. Vashishtha, G.; Kumar, R. Autocorrelation energy and aquila optimizer for MED filtering of sound signal to detect bearing defect in Francis turbine. Meas. Sci. Technol. 2021, 33, 015006. [Google Scholar] [CrossRef]
  31. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
  32. Al-qaness, M.A.; Ewees, A.A.; Abd Elaziz, M. Modified whale optimization algorithm for solving unrelated parallel machine scheduling problems. Soft Comput. 2021, 25, 9545–9557. [Google Scholar] [CrossRef]
  33. Leng, C.; Helen Zhang, H. Model selection in nonparametric hazard regression. Nonparametr. Stat. 2006, 18, 417–429. [Google Scholar] [CrossRef]
  34. Rosenwald, A.; Wright, G.; Chan, W.C.; Connors, J.M.; Campo, E.; Fisher, R.I.; Gascoyne, R.D.; Muller-Hermelink, H.K.; Smeland, E.B.; Giltnane, J.M.; et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 2002, 346, 1937–1947. [Google Scholar] [CrossRef] [PubMed]
  35. Beer, D.G.; Kardia, S.L.; Huang, C.C.; Giordano, T.J.; Levin, A.M.; Misek, D.E.; Lin, L.; Chen, G.; Gharib, T.G.; Thomas, D.G.; et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 2002, 8, 816–824. [Google Scholar] [CrossRef]
  36. Van Houwelingen, H.C.; Bruinsma, T.; Hart, A.A.; Van’t Veer, L.J.; Wessels, L.F. Cross-validated Cox regression on microarray gene expression data. Stat. Med. 2006, 25, 3201–3216. [Google Scholar] [CrossRef]
  37. Metzeler, K.H.; Hummel, M.; Bloomfield, C.D.; Spiekermann, K.; Braess, J.; Sauerland, M.C.; Heinecke, A.; Radmacher, M.; Marcucci, G.; Whitman, S.P.; et al. An 86-probe-set gene-expression signature predicts survival in cytogenetically normal acute myeloid leukemia. Blood J. Am. Soc. Hematol. 2008, 112, 4193–4201. [Google Scholar] [CrossRef]
  38. Al-qaness, M.A.; Ewees, A.A.; Fan, H.; Abualigah, L.; Abd Elaziz, M. Boosted ANFIS model using augmented marine predator algorithm with mutation operators for wind power forecasting. Appl. Energy 2022, 314, 118851. [Google Scholar] [CrossRef]
  39. Yousri, D.; AbdelAty, A.M.; Al-qaness, M.A.; Ewees, A.A.; Radwan, A.G.; Abd Elaziz, M. Discrete fractional-order Caputo method to overcome trapping in local optima: Manta Ray Foraging Optimizer as a case study. Expert Syst. Appl. 2022, 192, 116355. [Google Scholar] [CrossRef]
Figure 1. Structure of the IAO.
Figure 2. Average computed from the fitness values over the datasets.
Figure 3. Average computed from the MAX fitness values over the datasets.
Figure 4. Average computed from the MIN fitness values over the datasets.
Figure 5. Average accuracy measure over all datasets.
Figure 6. Average of the selected features for all datasets.
Figure 7. Average computation time.
Figure 8. Boxplot for DLBC2002 dataset.
Figure 9. Boxplot for Lung-cancer dataset.
Figure 10. Boxplot for Duch Breasst dataset.
Figure 11. Boxplot for AML-full dataset.
Figure 12. Exploration and exploitation curves of the IAO and AO for all datasets. (a) DLBC2002. (b) Lung-cancer. (c) Duch Breasst. (d) AML-full.
Table 1. Parameter settings.
Algorithm | Parameter values
AO | δ = 0.1, α = 0.1
Firefly | β = 0.2, α = 0.5, γ = 1
SSA | C3 ∈ [0, 1], C2 ∈ [0, 1]
GA | γ = 0.20, pc = 0.80, mu = 0.020, pm = 0.30, β = 8
PSO | wDamp = 0.990, w = 1, C1 = 1, C2 = 2
DE | pCR = 0.20, βmax = 0.80, βmin = 0.20
WOA | a ∈ [0, 2], l ∈ [−1, 1], b = 1
MFO | a ∈ [−2, −1], b = 1
IAO | δ = 0.1, α = 0.1, a ∈ [0, 2], b = 1, l ∈ [−1, 1]
Table 2. Numerical results of the fitness functions.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | −233.043 | −232.904 | −231.880 | −229.445 | −230.639 | −231.227 | −230.936 | −231.989 | −230.212
Lung-cancer | −62.913 | −58.9001 | −58.171 | −57.963 | −58.223 | −58.271 | −58.451 | −61.276 | −57.297
Duch Breasst | −307.687 | −301.913 | −305.818 | −303.578 | −304.606 | −305.465 | −305.285 | −305.154 | −301.364
AML-full | −93.533 | −88.8636 | −88.822 | −88.304 | −88.416 | −88.759 | −88.737 | −92.010 | −88.075
Table 3. Numerical results of the Max.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | −230.672 | −229.409 | −230.672 | −227.704 | −228.858 | −229.427 | −229.427 | −230.671 | −227.704
Lung-cancer | −58.087 | −56.1811 | −56.463 | −55.993 | −56.463 | −56.463 | −56.662 | −57.290 | −55.969
Duch Breasst | −303.420 | −298.355 | −301.943 | −300.563 | −299.440 | −301.887 | −301.979 | −300.589 | −297.438
AML-full | −91.573 | −86.986 | −86.808 | −87.063 | −87.214 | −87.655 | −87.451 | −90.867 | −87.107
Table 4. Numerical results of the Min.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | −233.409 | −233.343 | −233.342 | −231.866 | −233.310 | −233.409 | −233.409 | −232.713 | −232.010
Lung-cancer | −63.935 | −63.935 | −63.935 | −63.935 | −63.935 | −63.935 | −63.935 | −63.935 | −60.872
Duch Breasst | −314.193 | −306.826 | −314.193 | −314.193 | −311.276 | −314.184 | −314.193 | −311.264 | −311.276
AML-full | −93.838 | −90.8613 | −91.573 | −90.849 | −90.849 | −90.849 | −91.573 | −93.109 | −89.601
Table 5. Numerical results of the standard deviation.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | 0.77164 | 1.0321 | 0.84894 | 1.25801 | 1.28674 | 1.21517 | 1.24301 | 0.80291 | 1.25666
Lung-cancer | 1.33654 | 2.14285 | 2.34839 | 2.44722 | 2.32250 | 2.28789 | 2.19259 | 2.49236 | 1.40572
Duch Breasst | 2.46672 | 1.37495 | 3.53870 | 4.31523 | 3.63534 | 3.90358 | 3.94957 | 3.23130 | 4.18625
AML-full | 0.67068 | 1.17163 | 1.69382 | 1.32353 | 1.21093 | 1.21578 | 1.47812 | 0.92584 | 0.95712
Table 6. Numerical results of the selected features ratio.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | 0.3560 | 0.43878 | 0.4997 | 0.4996 | 0.4995 | 0.4982 | 0.4996 | 0.3914 | 0.5348
Lung-cancer | 0.3368 | 0.43412 | 0.4980 | 0.4987 | 0.4970 | 0.4962 | 0.4982 | 0.3824 | 0.5288
Duch Breasst | 0.4042 | 0.41995 | 0.4972 | 0.5010 | 0.4953 | 0.4993 | 0.5018 | 0.4660 | 0.5318
AML-full | 0.4152 | 0.44055 | 0.5005 | 0.5009 | 0.5029 | 0.5030 | 0.4984 | 0.4637 | 0.5253
Table 7. Results of the Friedman test for all datasets.
DS | IAO | AO | Firefly | SSA | GA | PSO | DE | WOA | MFO
DLBC2002 | 2.438 | 3.750 | 3.438 | 8.563 | 6.375 | 4.500 | 5.563 | 3.438 | 6.938
Lung-cancer | 2.250 | 5.750 | 5.375 | 6.563 | 5.000 | 5.125 | 4.313 | 2.375 | 8.250
Duch Breasst | 3.188 | 7.000 | 3.125 | 6.125 | 4.813 | 3.938 | 3.563 | 4.938 | 8.313
AML-full | 1.000 | 4.500 | 5.125 | 6.813 | 6.500 | 5.375 | 5.375 | 2.375 | 7.938
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
