Article

A Binary Machine Learning Cuckoo Search Algorithm Improved by a Local Search Operator for the Set-Union Knapsack Problem

1 Escuela de Ingeniería en Construcción, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362804, Chile
2 Escuela de Construcción Civil, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
3 Facultad de Ingeniería y Negocios, Universidad de las Américas, Santiago 7500975, Chile
4 Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
5 Facultad de Ingeniería, Ciencia y Tecnología, Universidad Bernardo O’Higgins, Santiago, Metropolitana 8370993, Chile
6 Escuela de Negocios Internacionales, Universidad de Valparaíso, Viña del Mar 2572048, Chile
* Authors to whom correspondence should be addressed.
Mathematics 2021, 9(20), 2611; https://doi.org/10.3390/math9202611
Submission received: 9 September 2021 / Revised: 6 October 2021 / Accepted: 12 October 2021 / Published: 16 October 2021
(This article belongs to the Special Issue Metaheuristic Algorithms)

Abstract

Optimization techniques, especially metaheuristics, are constantly refined to decrease execution times, increase solution quality, and address larger target cases. Hybridization is one such strategy, particularly noteworthy for its breadth of applications. In this article, a hybrid algorithm is proposed that integrates the k-means algorithm to generate a binary version of the cuckoo search technique, strengthened by a local search operator. The binary cuckoo search algorithm is applied to the NP-hard Set-Union Knapsack Problem. This problem has recently attracted great attention from the operational research community due to the breadth of its applications and the difficulty of solving medium and large instances. Numerical experiments were conducted to gain insight into the contribution of the k-means technique and the local search operator to the final results, and a comparison with state-of-the-art algorithms is made. The results demonstrate that the hybrid algorithm consistently produces superior results in the majority of the analyzed medium instances; on large instances its performance is competitive but degrades.

1. Introduction

Metaheuristics have demonstrated their efficacy in recent years in handling complex problems, especially combinatorial challenges. There are several examples in biology [1], logistics [2], civil engineering [3], and machine learning [4], among others. Despite this increased efficiency, and in part due to the vast scale of many combinatorial problems, it remains vital to strengthen metaheuristic approaches. Thus, hybrid techniques have been employed to enhance metaheuristic algorithmic performance.
Among the main approaches to integrating metaheuristics are hybrid heuristics [5], where multiple metaheuristic algorithms are merged to boost their capabilities. In [6], for example, the authors employed simulated annealing-based genetic and tabu-search-based genetic algorithms to address the ordering planning problem. The hybrid approaches were compared to the traditional approaches in that study, with the hybrid approaches outperforming the traditional ones. In [7], the cuckoo search and firefly algorithm search methods are combined in order to avoid getting the procedure stuck in local optima. The hybrid algorithm was applied to a job scheduling problem in high-performance computing systems. When compared to traditional policies, the results indicated significant reductions in server energy consumption.
Another interesting hybrid approach is matheuristics [8], which combines mathematical programming approaches with metaheuristic algorithms. The vehicle routing problem, for example, was studied utilizing mixed-integer linear programming and metaheuristic techniques in [9]. These methods generally do not take advantage of the auxiliary data created by metaheuristics in order to obtain more reliable results. In the solution-finding process, metaheuristics provide useful accessory data, which may be used to inform machine learning approaches. Artificial intelligence, and machine learning in particular, has grown in importance in recent times, being applied in diverse areas [10,11,12]. Combining machine learning approaches with metaheuristic algorithms is a novel area of research that has gained traction in recent years [13].
According to [13,14], there are three primary areas in which machine learning algorithms utilize metaheuristic data: low-level integrations, high-level integrations, and optimization problems. A current area of research in low-level integrations is the construction of binary versions of algorithms that operate naturally in continuous space. In [15], a state-of-the-art review of the different binarization techniques is developed, in which two main groups stand out. The first group corresponds to general binarization techniques in which the movements of the metaheuristic are not modified; rather, binarization of the solutions is applied after its execution. The second group corresponds to modifications applied directly to the movement of the metaheuristic. The first group has the advantage that the procedure can be used with any continuous metaheuristic; the second, when the adjustments are carried out adequately, has good performance. There are examples of integration between machine learning and metaheuristics in this domain. In [16,17], binary versions of the cuckoo search algorithm were generated using the k-nearest neighbor technique. These binary versions were applied to multidimensional knapsack and set covering problems, respectively. In the field of civil engineering [18,19], hybrid methods were proposed that utilize db-scan and k-means, respectively, as binarization methods, and are used to optimize the CO2 emissions of retaining walls.
In line with low-level integration between machine learning and metaheuristics, in this article a hybrid approach is used that combines a cuckoo search algorithm with the unsupervised k-means technique to obtain a binary version of the continuous cuckoo search algorithm. The suggested approach combines these two strategies with the objective of obtaining a robust binary version through the use of the data acquired during the execution of the metaheuristic. The proposed algorithm was applied to the set-union knapsack problem. The set union knapsack problem (SUKP) [20] is a generalization of the classical knapsack problem. SUKP has received attention from researchers in recent years [21,22,23] due to its interesting applications [24,25], as well as the difficulty of solving it efficiently. In SUKP, there is a set of items, where each item has a profit. Additionally, each item is associated with a set of elements, where each element has a weight related to the knapsack constraint. In the literature, it is observed that the algorithms that have addressed SUKP are mainly improved metaheuristics, which have allowed results to be obtained in reasonable times. When applying a metaheuristic in its standard form to SUKP, these algorithms have had limitations such as instability and decreased performance as the instance grows in size. For example, in [26], different transfer functions were used and evaluated on small and medium SUKP instances. This effect is observed when the algorithms are applied to standard SUKP instances; additionally, to increase the challenge, a new set of benchmark problems was recently generated in [27]. All of the above leads to exploring hybrid techniques in order to strengthen the performance of the algorithm. The following are the contributions made by this work:
1. A new greedy initialization operator is proposed.
2. The k-means technique, proposed in [28], is used to binarize the cuckoo search (CS) algorithm, tuned and applied for the first time to the SUKP. Additionally, a random binarization operator is designed and two transition probabilities are applied to evaluate the contribution of k-means to the final result. It should be noted that the binarization method allows binary versions of other continuous swarm intelligence metaheuristics to be generated.
3. A new local search operator is proposed to improve the exploitation of the search space.
4. The results obtained by the hybrid algorithm are compared with those of different algorithms that have addressed SUKP. It should be noted that both the standard SUKP instances and the new instances proposed in [27] were solved.
The following is a summary of the contents: Section 2 delves into the set-union knapsack problem and its applications. The k-means cuckoo search algorithm and the local search operator are described in Section 3. In Section 4, the detail of the numerical experiments and comparisons are developed. Finally, the conclusions and potential lines of research are discussed in Section 5.

2. The Set Union Knapsack Problem

The Set-Union Knapsack Problem (SUKP) is a generalized knapsack model with the following definition. First, let U be a set of n elements, with each element j ∈ U having a weight w_j > 0. Let V be a set of m items, with each item i ∈ V being a subset of elements U_i ⊆ U and having a profit p_i. Finally, for a knapsack with capacity C, SUKP entails identifying a set of items S ⊆ V that maximizes the total profit of S while guaranteeing that the total weight of the elements of S does not exceed the capacity C of the knapsack; the items belonging to the set S are the decision variables of the problem. It is worth noting that an element's weight is only tallied once, even if it belongs to several chosen items in S. SUKP may be written mathematically as follows:
Maximize P(S) = \sum_{i \in S} p_i. \quad (1)
subject to:
W(S) = \sum_{j \in \cup_{i \in S} U_i} w_j \leq C, \quad S \subseteq V. \quad (2)
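The objective (1) and constraint (2) can be sketched directly in code. The following is a minimal illustration, not the authors' implementation; the data layout (items stored as sets of element indices) is an assumption. Note that each element's weight is counted only once, via the set union.

```python
def evaluate(selected, items, profits, weights, capacity):
    """Return (profit, weight, feasible) for a set of selected item indices.

    items[i] is the set of element indices U_i, profits[i] is p_i,
    weights[j] is w_j, capacity is C. An element's weight is tallied
    once even if it belongs to several selected items.
    """
    used_elements = set().union(*(items[i] for i in selected)) if selected else set()
    weight = sum(weights[j] for j in used_elements)
    profit = sum(profits[i] for i in selected)
    return profit, weight, weight <= capacity
```

For instance, selecting two items that share an element charges the shared element's weight only once, which is the defining feature of SUKP.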
In reviewing the literature, SUKP has been found to have interesting applications, for example in [24]. The goal of that application is to improve the scalability and robustness of cyber systems. Consider a centralized cyber system with a fixed memory capacity that holds a collection of profit-generating services (or requests), each of which requires a set of data objects. When a data object is activated, it consumes a particular amount of memory, and using the same data object several times does not result in increased memory consumption (an important property matching SUKP). The goal is to choose a subset of services from among the candidate services that maximizes the total profit of those services while keeping the total memory required by the underlying data objects within the cyber system's memory capacity. The SUKP model, in which an item corresponds to a service with its profit and an element corresponds to a data object with its memory usage (element weight), is a convenient way to structure this application. Finding the optimal solution to the ensuing SUKP problem is thus equivalent to solving the data allocation problem.
Another interesting application is related to the rendering of an animated crowd in real time [29]. In that article, the authors present a method to accelerate the visualization of large crowds of animated characters. They adopt a caching system that enables a skinned key-pose (an element) to be re-used by multi-pass rendering, between multiple agents and across multiple frames, together with an interpolative approach that enables key-pose blending. In this problem, each item corresponds to a crowd member. Applications are also found in data stream compression through the use of Bloom filters [25].
SUKP is an NP-hard problem [20] that has been tackled by a variety of methods. In [20,30], theoretical studies using greedy approaches or dynamic programming are found. An integer linear programming model was developed in [31] and applied to small instances of 85 and 100 items, finding the optimal solutions.
Metaheuristic algorithms have also addressed SUKP. In [32], the authors use an artificial bee colony technique to tackle SUKP. In addition, this algorithm integrates a greedy operator with the aim of addressing infeasible solutions. In [33], the authors designed an enhanced moth search algorithm. To improve its efficiency, this algorithm incorporates a differential mutation operator. The Jaya algorithm was employed in [34]. Additionally, a differential evolution technique was incorporated to enhance exploration capability, Cauchy mutation is used to boost exploitation ability, and an enhanced repair operator was designed to repair infeasible solutions. In [26], the effectiveness of different transfer functions is studied in order to binarize the moth metaheuristic. A local search operator is designed in [35] and applied to large-scale instances of SUKP. That article proposes three strategies that conform to the adaptive tabu search framework and efficiently solve new instances of SUKP. In [36], the grey wolf optimizer (GWO) algorithm is adapted to address binary problems. For the algorithm to be robust, traditional binarization methods are not used. To replicate the GWO leadership hierarchy, a multiple-parent crossover is established with two distinct dominance tactics. In addition, an adaptive mutation with an exponentially decreasing step size is used to avoid early convergence and achieve a balance of intensification and diversification.

3. The Machine Learning Cuckoo Search Algorithm

This section describes the machine learning binary cuckoo search algorithm used to solve the SUKP. This hybrid algorithm consists of three main operators. A greedy initialization operator is detailed in Section 3.1. CS is then used to drive the optimization; note that CS produces results with values in R, which must therefore be binarized. A machine learning binarization operator, which uses the unsupervised k-means technique, then performs the binarization of the solutions generated by the cuckoo search algorithm; this operator is detailed in Section 3.2. Finally, a local search operator is applied whenever a new maximum is found. The logic of the local search operator is detailed in Section 3.3. Figure 1 shows the flowchart of the binary machine learning cuckoo search algorithm. It is also worth noting that CS can be replaced by any other continuous swarm intelligence metaheuristic.

3.1. Greedy Initialization Operator

The objective of this operator is to build the solutions that will start the search process. For this, the items are ordered using the ratio defined in Equation (3). As input to the operator, sortItems is utilized, which contains the items ordered by r from highest to lowest. As output, a valid solution, Sol, is obtained.
r = \frac{\text{item profit}}{\text{sum of element weights}} = \frac{p_i}{\sum_{j \in U_i} w_j} \quad (3)
In line 4, an empty solution Sol is initialized; then, in line 5, the fulfillment of the constraint by Sol is validated. While the weight of the solution items (weightSol), Equation (2), is not equal to or greater than the knapsack capacity (knapsackSize), a random number rand is generated in line 6 and compared in line 7 with β. If rand is greater than β, the next element of sortItems is added in line 8, respecting the order. Otherwise, in line 11, a random item is chosen and added to the solution, and in line 12 it is removed from sortItems. Once the knapsack is full, the solution is cleaned up in line 15, as its weight is greater than or equal to knapsackSize. If it is equal, no action is taken. If it is greater, the items of Sol are ordered using r, defined in Equation (3), and removed in order starting with the smallest, checking the constraint after each elimination. Once the constraint is fulfilled, the procedure stops and the solution Sol is returned. The pseudo-code is shown in Algorithm 1.
Algorithm 1 Greedy initialization operator
1: Function initSolutions(sortItems)
2: Input sortItems
3: Output Sol
4: Sol ← [ ]
5: while (weightSol < knapsackSize) do
6:   rand ← getRandom()
7:   if rand > β then
8:     Sol ← addSortItem(sortItems)
9:     sortItems ← removeFromSortItems(Item)
10:  else
11:    Sol ← addRandomItem(sortItems)
12:    sortItems ← removeFromSortItems(Item)
13:  end if
14: end while
15: Sol ← cleanSol(Sol)
16: return Sol
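Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the data layout (items as sets of element indices), the default `beta`, and the helper logic are assumptions, but the greedy/random mix and the clean-up step follow the pseudo-code.

```python
import random

def init_solution(items, profits, weights, capacity, beta=0.5, rng=random):
    """Greedy initialization (Algorithm 1 sketch).

    items[i] is a set of element indices, profits[i] its profit,
    weights[j] the weight of element j; beta controls the greedy/random mix.
    """
    def ratio(i):  # Equation (3): item profit over sum of element weights
        return profits[i] / sum(weights[j] for j in items[i])

    def sol_weight(sol):  # Equation (2): each element's weight counted once
        used = set().union(*(items[i] for i in sol)) if sol else set()
        return sum(weights[j] for j in used)

    sorted_items = sorted(range(len(items)), key=ratio, reverse=True)
    sol = []
    while sol_weight(sol) < capacity and sorted_items:
        if rng.random() > beta:
            item = sorted_items.pop(0)                       # best remaining ratio
        else:
            item = sorted_items.pop(rng.randrange(len(sorted_items)))  # random pick
        sol.append(item)
    # cleanSol: drop lowest-ratio items until the knapsack constraint holds
    sol.sort(key=ratio)
    while sol and sol_weight(sol) > capacity:
        sol.pop(0)
    return sol
```

The clean-up loop mirrors the text: items are removed in increasing order of r until the weight no longer exceeds the capacity.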

3.2. Machine Learning Binarization Operator

The machine learning binarization operator (MLBO) is responsible for the binarization process. It receives as input the list lSol of solutions obtained in the previous iteration, the metaheuristic (MH), in this case CS, the best solution obtained so far, bestSol, and the transition probability for each cluster, transProbs. In line 4, MH is applied to the list lSol. From the result of applying MH to lSol, the absolute values of the velocities, vlSol, are obtained. These velocities correspond to the transition vector obtained by applying MH to the list of solutions. The set of all velocities is clustered in line 5 using k-means (getKmeansClustering), in this particular case with K = 5.
Thus, for each Sol_i and each dimension j, a cluster is assigned, and each cluster is associated with a transition probability (transProbs), ordered by the value of the cluster centroid. In this case, the transition probabilities used were [0.1, 0.2, 0.4, 0.8, 0.9]. The set of points belonging to the cluster with the smallest centroid, represented by the green color in Figure 2, was associated with the transition probability 0.1. The group of blue points, whose centroid has the highest value, was associated with a transition probability of 0.9. The smaller the value of the centroid, the smaller the value of transProbs associated with it. Then, in line 8, each lSol_{i,j} is assigned a transition probability dimSolProb_{i,j}, which is later compared, on line 9, with a random number r1. If dimSolProb_{i,j} > r1, the dimension is updated considering the best value (line 10); otherwise, it is not updated (line 12). Once all the solutions have been updated, each of them is cleaned up using the process explained in Section 3.1. If a new best value is obtained, a local search operator is executed in line 19; this operator is detailed in the following section. Finally, the updated list of solutions lSol and the best solution bestSol are returned. The pseudo-code is shown in Algorithm 2.
Algorithm 2 Machine learning binarization operator (MLBO)
1: Function MLBO(lSol, MH, transProbs, bestSol)
2: Input lSol, MH, transProbs
3: Output lSol, bestSol
4: vlSol ← getAbsValueVelocities(lSol, MH)
5: lSolClust ← getKmeansClustering(vlSol, K)
6: for (each Sol_i in lSolClust) do
7:   for (each dimSol_{i,j} in Sol_i) do
8:     dimSolProb_{i,j} = getClusterProbability(dimSol, transProbs)
9:     if dimSolProb_{i,j} > r1 then
10:      Update lSol_{i,j} considering the best.
11:    else
12:      Do not update the item in lSol_{i,j}
13:    end if
14:  end for
15:  Sol_i ← cleanSol(Sol_i)
16: end for
17: tempBest ← getBest(lSol)
18: if cost(tempBest) > cost(bestSol) then
19:   tempBest ← execLocalSearch(tempBest)
20:   bestSol ← tempBest
21: end if
22: return lSol, bestSol
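A compact sketch of the MLBO pass, under stated assumptions: a tiny hand-rolled 1-D k-means stands in for getKmeansClustering, `velocities` holds |v| per solution and dimension, and "update considering the best" is taken to mean copying the incumbent best's bit. None of these choices come verbatim from the paper's code.

```python
import random

def kmeans_1d(values, k, iters=20, rng=random):
    """Minimal 1-D k-means: returns (centroids, labels)."""
    centroids = sorted(rng.sample(list(values), k))
    labels = [0] * len(values)
    for _ in range(iters):
        for idx, v in enumerate(values):
            labels[idx] = min(range(k), key=lambda c: abs(v - centroids[c]))
        for c in range(k):
            member = [v for v, lab in zip(values, labels) if lab == c]
            if member:
                centroids[c] = sum(member) / len(member)
    return centroids, labels

def mlbo_step(velocities, solutions, best, trans_probs, rng=random):
    """One MLBO pass (sketch): cluster all |velocity| values, map clusters
    (sorted by centroid) to trans_probs, and move each dimension toward the
    best solution with the cluster's transition probability."""
    flat = [v for row in velocities for v in row]
    k = len(trans_probs)
    centroids, labels = kmeans_1d(flat, k, rng=rng)
    order = sorted(range(k), key=lambda c: centroids[c])
    prob_of = {c: trans_probs[rank] for rank, c in enumerate(order)}
    n_dim = len(velocities[0])
    for i, sol in enumerate(solutions):
        for j in range(n_dim):
            if prob_of[labels[i * n_dim + j]] > rng.random():
                sol[j] = best[j]  # transition toward the incumbent best
    return solutions
```

A large velocity lands in a high-centroid cluster and therefore receives a high transition probability, exactly the ordering the text describes.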

3.3. Local Search Operator

According to Figure 1, the local search operator is executed every time the metaheuristic finds a new best value. As input, the local search operator receives the new best solution (bestSol), and as a first stage uses it to obtain the items that belong and do not belong to bestSol (line 4 of Algorithm 3). These two lists of items are iterated over, T = 300 times, performing a swap without repetition (line 7 of Algorithm 3). Once the swap is carried out, two conditions are evaluated: whether the swap improves the profit, and whether the weight of the knapsack is less than or equal to knapsackSize. If both conditions are met, bestSol is updated to tempSol; finally, bestSol is returned.
Algorithm 3 Local search
1: Function LocalSearch(bestSol)
2: Input bestSol
3: Output bestSol
4: lsolItems, lsolNoItems ← getItems(bestSol)
5: i = 0
6: while (i < T) do
7:   tempSol ← swap(lsolItems, lsolNoItems)
8:   if profit(tempSol) > profit(bestSol) and knapsack(tempSol) <= knapsackSize then
9:     bestSol ← tempSol
10:  end if
11:  i += 1
12: end while
13: return bestSol
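The swap-based local search can be sketched as follows (a hedged illustration: a 0/1 encoding of solutions and items as sets of element indices are assumed, and `T` is the iteration budget from the text):

```python
import random

def local_search(best_sol, items, profits, weights, capacity, T=300, rng=random):
    """Algorithm 3 sketch: best_sol is a 0/1 list over items. Repeatedly swap
    a selected item with a non-selected one; keep the swap when it improves
    profit without violating the knapsack capacity."""
    def profit_weight(sol):
        chosen = [i for i, b in enumerate(sol) if b]
        used = set().union(*(items[i] for i in chosen)) if chosen else set()
        return sum(profits[i] for i in chosen), sum(weights[j] for j in used)

    best_profit, _ = profit_weight(best_sol)
    for _ in range(T):
        ins = [i for i, b in enumerate(best_sol) if b]
        outs = [i for i, b in enumerate(best_sol) if not b]
        if not ins or not outs:
            break
        i, o = rng.choice(ins), rng.choice(outs)
        temp = list(best_sol)
        temp[i], temp[o] = 0, 1  # swap one item out, one item in
        p, w = profit_weight(temp)
        if p > best_profit and w <= capacity:
            best_sol, best_profit = temp, p
    return best_sol
```

Since only improving and feasible swaps are accepted, the returned solution is never worse than the input.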

4. Results

This section details the experiments conducted with MLBO and the cuckoo search metaheuristic to determine the proposed algorithm's effectiveness and contribution when applied to an NP-hard combinatorial problem. The specific version of MLBO that uses cuckoo search will be denoted MLCSBO. The SUKP was chosen as a benchmark problem because it has been approached by several algorithms and is not trivial to solve in small, medium, and large instances. However, it should be emphasized that the MLBO binarization technique is easily adaptable to other optimization algorithms. CS was chosen as the optimization algorithm because it is simple to parameterize and has been used to solve a wide variety of optimization problems.
The algorithm was built in Python 3.6 and run on a PC with Windows 10, a Core i7 processor, and 16 GB of RAM. To evaluate whether differences are statistically significant, the Wilcoxon signed-rank test was used, with 0.05 as the significance level. The test was chosen in accordance with the methodology outlined in [37,38]. The Shapiro–Wilk normality test is used initially in this process; if one of the populations is not normal and both have the same number of points, the Wilcoxon signed-rank test is proposed to determine the difference. In the experiments, the Wilcoxon test was used to compare the MLCSBO results with the other variants or algorithms in pairs, always using the complete list of results. Further, in the case of the experiment in Section 4.2, since there are multiple comparisons, a post hoc test with the Holm–Bonferroni correction was performed. The statsmodels and scipy libraries of Python were used to run the tests. Each instance was solved 30 times in order to obtain the best-value and average indicators. Additionally, the average time (in seconds) required for the algorithm to find the optimal solution is reported for each instance.
The first set of instances was proposed in [39]. These instances have between 85 and 500 items and elements and are characterized by two parameters. The first, \mu = (\sum_{i=1}^{m} \sum_{j=1}^{n} R_{ij})/(mn), represents the density of the incidence matrix, where R_{ij} = 1 means that item i includes element j. The second, \nu = C/\sum_{j=1}^{n} w_j, represents the ratio of the capacity C to the total weight of the elements. A SUKP instance is then named m_n_\mu_\nu. The second group of instances was introduced in [27]; these contain between 585 and 1000 items and elements and were built following the same structure.
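The two instance parameters can be computed directly from the item/element incidence matrix. A small sketch, with a hypothetical data layout (R as a list of 0/1 rows):

```python
def instance_params(R, weights, capacity):
    """Compute the density mu and capacity ratio nu that name a SUKP
    instance m_n_mu_nu. R is the m x n incidence matrix with
    R[i][j] == 1 when item i contains element j."""
    m, n = len(R), len(R[0])
    mu = sum(sum(row) for row in R) / (m * n)   # matrix density
    nu = capacity / sum(weights)                 # capacity over total weight
    return mu, nu
```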

4.1. Parameter Setting

The methods described in [28,40] were used to pick the parameters. To make an appropriate parameter selection, this methodology employs four metrics, specified by Equations (4)–(7). Values were generated using the instances 100_85_0.10_0.75, 100_100_0.15_0.85, and 85_100_0.10_0.75. Each parameter combination was run ten times. The set of parameters explored and selected is presented in Table 1. To determine the configuration, the polygon area obtained from the four-metric radar chart is calculated for each setting, and the configuration with the largest area is selected. In the case of the transition probabilities, only the probability of the third cluster was varied, considering the values [0.4, 0.5]; the rest of the values were held constant.
1. The difference in percentage terms between the best value achieved and the best known value:
   bSolution = 1 - \frac{KnownBestValue - BestValue}{KnownBestValue} \quad (4)
2. The percentage difference between the worst value achieved and the best known value:
   wSol = 1 - \frac{KnownBestValue - WorstValue}{KnownBestValue} \quad (5)
3. The percentage departure of the obtained average value from the best-known value:
   aSol = 1 - \frac{KnownBestValue - AverageValue}{KnownBestValue} \quad (6)
4. The convergence time used in the execution:
   nTime = 1 - \frac{convergenceTime - minTime}{maxTime - minTime} \quad (7)
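The four metrics and the radar-chart score can be sketched as follows. The polygon-area formula for equally spaced radar axes is an assumption consistent with the description (the configuration with the largest area wins); the function and argument names are illustrative only.

```python
import math

def config_metrics(best, worst, avg, known_best, conv_time, t_min, t_max):
    """The four normalized metrics of Equations (4)-(7); all lie in [0, 1]
    when values do not exceed the known best."""
    b = 1 - (known_best - best) / known_best
    w = 1 - (known_best - worst) / known_best
    a = 1 - (known_best - avg) / known_best
    t = 1 - (conv_time - t_min) / (t_max - t_min)
    return [b, w, a, t]

def radar_area(metrics):
    """Area of the polygon the metrics trace on a radar chart with equally
    spaced axes: 0.5 * sin(2*pi/k) * sum of products of adjacent radii."""
    k = len(metrics)
    step = 2 * math.pi / k
    return 0.5 * abs(math.sin(step)) * sum(
        metrics[i] * metrics[(i + 1) % k] for i in range(k))
```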

4.2. Insight into Binary Algorithm

The objective of this section is to determine the contribution of the MLCSBO operator and the local search operator to the final result of the optimization. To address this, a random operator is designed that replaces MLBO in Figure 1 with an operator that performs random transitions. In particular, two configurations are studied: Random-05, which has a 50% chance of making a transition, and Random-03, which has a 30% chance. Additionally, each algorithm is evaluated without (NL) and with the local search operator.
The results are shown in Table 2, Table 3, and Figure 3. From Table 2, it can be deduced that the best values obtained are for MLCSBO, which has the binarization mechanism based on k-means; this holds for both the average and best-value indicators. When comparing MLCSBO-NL (which does not have the local search operator) with Random-03-NL and Random-05-NL, MLCSBO-NL is more robust in both averages and best values. This allows evaluating the effect of incorporating k-means, with respect to a random binarization operator, on the optimization result. Furthermore, MLCSBO-NL works better than Random-03 and Random-05 even though the latter incorporate the local search operator. On the other hand, when analyzing the contribution of the local search operator, it is observed that its incorporation always generates an improvement in both the averages and the best values. First, the Wilcoxon statistical test was applied, comparing MLCSBO with the other variants; it indicates that the differences between MLCSBO and the other variants analyzed are significant. However, as there are multiple comparisons, the p-values were corrected using the Holm–Bonferroni procedure. For this correction, the experiments of the operators Random-03 and Random-05 were treated as independent groups. In Figure 3, we see that the highest time is for MLCSBO. In particular, Random-05-NL, which on average is the fastest variant, is 18.8% faster than MLCSBO. On the other hand, MLCSBO-NL, which does not have the local search operator, is 7.4% faster than MLCSBO.
In Figure 3 and Table 4, the %-Gap, defined in Equation (8), with respect to the best known value is compared across the different variants developed in this experiment. The comparison is made through box plots. In the figure, it is observed that MLCSBO behaves more robustly than the rest, since it obtains better values and smaller dispersions than the other variants. On the other hand, the variants with the worst performance correspond to those that have the random binarization operator and do not use the local search operator.
\%\text{-Gap} = 100 \times \frac{BestKnownValue - Value}{BestKnownValue} \quad (8)
Additionally, the significance has been analyzed using the Wilcoxon test for the other variants. The details of the results are shown in Table 5. In each cell of the table, the p-values of best|average are written. In the table, it is observed that the difference of MLCSBO-NL with respect to the Random variants is not significant in the best indicator, but it is significant in the average indicator. The same goes for Random03 with respect to Random05. However, when analyzing Random03-NL with respect to Random05-NL, there is no significant difference in any of the indicators.

4.3. Algorithm Comparisons

This section compares MLCSBO performance to that of other algorithms that have tackled SUKP. Different forms of approximation were used in the comparative selection. A genetic algorithm (GA) was included, using uniform mutation, single-point crossover, and roulette wheel selection operators; in particular, the crossover probability was p_c = 0.8 and the mutation probability was p_m = 0.01. An artificial bee colony (ABC_bin, BABC), with parameters a = 5 and limit defined as Max{m, n}/5, and a binary differential evolution technique (binDE), with factor F = 0.5 and crossover constant 0.3, were adapted in [32] to tackle the SUKP. In [41], a binary weighted superposition attraction algorithm (bWSA), with parameters τ = 0.8, φ = 0.008, and sl_1 = 0.4, is proposed to solve SUKP. Two variations, gPSO and gPSO*, of the particle swarm optimization algorithm were proposed in [22]. In the case of gPSO, the initial parameters used were r_1 = 0.05, φ = 0.005, p_1 = 0.2, and p_2 = 0.8; in the case of gPSO*, p_1 = 0.10 and p_2 = 0.70. An artificial search agent with cognitive intelligence (intAgents) was proposed in [42], with parameters θ_mut = 0.005, m_rate = 0.05, p_xover = 0.6, and p_xover_itMax = 0.1. Finally, the DH-Jaya algorithm was designed in [34], with parameters Cr = 0.8, F = 0.8, and Ca = 1. In Table 6 and Table 7, the comparisons for the 30 smallest instances of SUKP are presented. Table 8 shows the results for the 30 largest instances; in the latter case, results reported in the literature were found only for BABC and DH-Jaya.
Consider Table 6 and Table 7, which summarize the results for the 30 smallest instances. MLCSBO obtained the best value in 27 of the 30 cases. After that, GWOrbd has 16 best values, DH-Jaya has 11 best values, and GWOfbd also has 11 best values. More than one algorithm may obtain the best value in some cases, in which case each is counted. This shows a good performance of the MLCSBO algorithm with respect to the other algorithms, both in finding the best values and in reproducing them systematically. However, when the results for the 30 largest instances are analyzed (Table 8), the good performance obtained by MLCSBO is not repeated. In the larger instances, DH-Jaya performs better than MLCSBO: for the best-value indicator, DH-Jaya obtains 22 best values and MLCSBO 10; moreover, when the average indicator is compared, MLCSBO obtains 4 best averages and DH-Jaya 26. To rule out an exploitation problem with the local search operator, T was increased to 800 in these cases; however, no improvements were obtained. This raises the suspicion that the decrease in performance on large cases is related to the exploration of the algorithm.

5. Conclusions

In this research, a hybrid k-means cuckoo search algorithm has been proposed. This hybrid binarization method applies the k-means technique to binarize the solutions generated by the cuckoo search algorithm. Additionally, to make the procedure efficient, it was reinforced with a greedy initialization algorithm and with a local search operator. The proposed hybrid technique was used to solve medium- and large-scale instances of the set-union knapsack problem. The role of the binarization and local search operators was investigated. To do this, a random operator was designed using the two transition probabilities Random03 and Random05, which were compared under different configurations. Finally, when the proposed approach is compared with several state-of-the-art methods, the proposed algorithm is capable of improving on the previous results in most cases. We highlight that the proposed algorithm uses a general binarization framework based on k-means, which can easily be adapted to different metaheuristics and integrated with initialization and local search operators; in this particular case, it solves the SUKP with reasonable results.
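As an illustration of the k-means binarization idea, the sketch below clusters the magnitudes of the continuous values produced by the metaheuristic into K groups and assigns each group a transition probability, with larger magnitudes receiving larger probabilities of flipping the corresponding bit. The one-dimensional k-means routine, the probability list, and all names are illustrative assumptions, not the exact operator of the paper.

```python
import random

def kmeans_1d(values, k, iterations=100):
    """Tiny 1-D k-means: returns a cluster label per value and the centroids."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: centroids evenly spread over the range.
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        for i, v in enumerate(values):
            labels[i] = min(range(k), key=lambda c: abs(v - centroids[c]))
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids

def kmeans_binarize(bits, velocities, k=5, probs=(0.1, 0.2, 0.4, 0.8, 0.9)):
    """Flip each bit with a probability chosen by the cluster of |velocity|.

    Clusters are ranked by centroid so that larger velocity magnitudes
    map to larger transition probabilities (illustrative values).
    """
    magnitudes = [abs(v) for v in velocities]
    labels, centroids = kmeans_1d(magnitudes, k)
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: centroids[c]))}
    out = list(bits)
    for i, lab in enumerate(labels):
        if random.random() < probs[rank[lab]]:
            out[i] = 1 - out[i]
    return out
```

The same skeleton works for any continuous metaheuristic: only the source of the `velocities` vector changes, which is what makes the framework easy to port between algorithms.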
According to the behavior of MLCSBO, in the first 30 instances it performed robustly, significantly outperforming the algorithms used in the comparison. However, in the 30 largest instances, its efficiency was not as clear when compared to the algorithms that had solved these instances, and increasing the exploitation capacity of the local search operator by raising T produced no improvements. This suggests three lines of future research. The first is to improve the exploration of the search space, which can be achieved using different solution initialization mechanisms; in MLCSBO, a greedy initialization operator was used. The second, considering that the algorithm could be trapped in local optima, is to investigate the incorporation of a perturbation operator; at this point, machine learning techniques such as k-nearest neighbors can also be considered. Finally, the last idea is to explore binarization techniques based on other clustering algorithms or other binarization strategies.
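The k-nearest-neighbors perturbation idea mentioned above could, for example, take the following form: a stalled solution is pulled toward the bitwise majority of its k closest elite solutions under Hamming distance. This is only a sketch of the proposed research direction; the distance measure, the vote rule, and all names are assumptions rather than a tested operator.

```python
def hamming(a, b):
    """Number of differing bit positions between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def knn_perturb(solution, elite_pool, k=3):
    """Move each bit of `solution` to the majority value among its
    k nearest elite solutions; ties keep the current bit."""
    neighbors = sorted(elite_pool, key=lambda e: hamming(solution, e))[:k]
    out = list(solution)
    for i in range(len(solution)):
        ones = sum(e[i] for e in neighbors)
        if ones * 2 > len(neighbors):
            out[i] = 1
        elif ones * 2 < len(neighbors):
            out[i] = 0
    return out

elite = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
print(knn_perturb([0, 1, 0, 1], elite, k=3))  # [1, 1, 0, 0]
```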

Author Contributions

J.G.: Conceptualization, investigation, methodology, writing—review and editing, project administration, resources, formal analysis. J.L.-R., M.B.-R., F.A.: Conceptualization, investigation, validation. B.C., R.S., J.-M.R., P.M., A.P.B., A.P.F., G.A.: Validation, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded as follows: José García was supported by Grant CONICYT/FONDECYT/INICIACION/11180056 and PROYECTO DI INVESTIGACIÓN INNOVADORA INTERDISCIPLINARIA 039.414/2021. José Lemus-Romani is supported by the National Agency for Research and Development (ANID)/Scholarship Program/DOCTORADO NACIONAL/2019-21191692. Marcelo Becerra-Rozas is supported by the National Agency for Research and Development (ANID)/Scholarship Program/DOCTORADO NACIONAL/2021-21210740. Broderick Crawford is supported by Grant CONICYT/FONDECYT/REGULAR/1210810. Ricardo Soto is supported by Grant CONICYT/FONDECYT/REGULAR/1190129. Broderick Crawford, Ricardo Soto, and Marcelo Becerra-Rozas are supported by Grant Nucleo de Investigacion en Data Analytics/VRIEA/PUCV/039.432/2020.

Institutional Review Board Statement

Not applicable for studies not involving humans or animals.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data set used in this article can be obtained from: https://drive.google.com/drive/folders/1aH11zXXBFtWbKjS9MlKxv-7eZjgcvCpL?usp=sharing, accessed on 14 October 2021. The results of the experiments are in: https://drive.google.com/drive/u/2/folders/1xLY1Cu8loizh44oVa7vS0s4nqUAvhHNV, accessed on 14 October 2021.


Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Guo, H.; Liu, B.; Cai, D.; Lu, T. Predicting protein–protein interaction sites using modified support vector machine. Int. J. Mach. Learn. Cybern. 2018, 9, 393–398.
  2. Korkmaz, S.; Babalik, A.; Kiran, M.S. An artificial algae algorithm for solving binary optimization problems. Int. J. Mach. Learn. Cybern. 2018, 9, 1233–1247.
  3. Penadés-Plà, V.; García-Segura, T.; Yepes, V. Robust design optimization for low-cost concrete box-girder bridge. Mathematics 2020, 8, 398.
  4. Al-Madi, N.; Faris, H.; Mirjalili, S. Binary multi-verse optimization algorithm for global optimization and discrete problems. Int. J. Mach. Learn. Cybern. 2019, 10, 3445–3465.
  5. Talbi, E.G. Combining metaheuristics with mathematical programming, constraint programming and machine learning. Ann. Oper. Res. 2016, 240, 171–215.
  6. Tsao, Y.C.; Vu, T.L.; Liao, L.W. Hybrid Heuristics for the Cut Ordering Planning Problem in Apparel Industry. Comput. Ind. Eng. 2020, 144, 106478.
  7. Chhabra, A.; Singh, G.; Kahlon, K.S. Performance-aware energy-efficient parallel job scheduling in HPC grid using nature-inspired hybrid meta-heuristics. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 1801–1835.
  8. Caserta, M.; Voß, S. Metaheuristics: Intelligent problem solving. In Matheuristics; Springer: Berlin, Germany, 2009; pp. 1–38.
  9. Schermer, D.; Moeini, M.; Wendt, O. A matheuristic for the vehicle routing problem with drones and its variants. Transp. Res. Part C Emerg. Technol. 2019, 106, 166–204.
  10. Roshani, M.; Phan, G.; Roshani, G.H.; Hanus, R.; Nazemi, B.; Corniani, E.; Nazemi, E. Combination of X-ray tube and GMDH neural network as a nondestructive and potential technique for measuring characteristics of gas-oil–water three phase flows. Measurement 2021, 168, 108427.
  11. Roshani, S.; Jamshidi, M.B.; Mohebi, F.; Roshani, S. Design and Modeling of a Compact Power Divider with Squared Resonators Using Artificial Intelligence. Wirel. Pers. Commun. 2021, 117, 2085–2096.
  12. Nazemi, B.; Rafiean, M. Forecasting house prices in Iran using GMDH. Int. J. Hous. Mark. Anal. 2020, 14, 555–568.
  13. Talbi, E.G. Machine Learning into Metaheuristics: A Survey and Taxonomy. ACM Comput. Surv. (CSUR) 2021, 54, 1–32.
  14. Calvet, L.; de Armas, J.; Masip, D.; Juan, A.A. Learnheuristics: Hybridizing metaheuristics with machine learning for optimization with dynamic inputs. Open Math. 2017, 15, 261–280.
  15. Crawford, B.; Soto, R.; Astorga, G.; García, J.; Castro, C.; Paredes, F. Putting continuous metaheuristics to work in binary search spaces. Complexity 2017, 2017, 8404231.
  16. García, J.; Lalla-Ruiz, E.; Voß, S.; Lopez Droguett, E. Enhancing a machine learning binarization framework by perturbation operators: Analysis on the multidimensional knapsack problem. Int. J. Mach. Learn. Cybern. 2020, 11, 1951–1970.
  17. García, J.; Astorga, G.; Yepes, V. An analysis of a KNN perturbation operator: An application to the binarization of continuous metaheuristics. Mathematics 2021, 9, 225.
  18. García, J.; Martí, J.V.; Yepes, V. The buttressed walls problem: An application of a hybrid clustering particle swarm optimization algorithm. Mathematics 2020, 8, 862.
  19. García, J.; Yepes, V.; Martí, J.V. A hybrid k-means cuckoo search algorithm applied to the counterfort retaining walls problem. Mathematics 2020, 8, 555.
  20. Goldschmidt, O.; Nehme, D.; Yu, G. Note: On the set-union knapsack problem. Nav. Res. Logist. 1994, 41, 833–842.
  21. Wei, Z.; Hao, J.K. Multistart solution-based tabu search for the Set-Union Knapsack Problem. Appl. Soft Comput. 2021, 105, 107260.
  22. Ozsoydan, F.B.; Baykasoglu, A. A swarm intelligence-based algorithm for the set-union knapsack problem. Future Gener. Comput. Syst. 2019, 93, 560–569.
  23. Liu, X.J.; He, Y.C. Estimation of distribution algorithm based on Lévy flight for solving the set-union knapsack problem. IEEE Access 2019, 7, 132217–132227.
  24. Tu, M.; Xiao, L. System resilience enhancement through modularization for large scale cyber systems. In Proceedings of the 2016 IEEE/CIC International Conference on Communications in China (ICCC Workshops), Chengdu, China, 27–29 July 2016; pp. 1–6.
  25. Yang, X.; Vernitski, A.; Carrea, L. An approximate dynamic programming approach for improving accuracy of lossy data compression by Bloom filters. Eur. J. Oper. Res. 2016, 252, 985–994.
  26. Feng, Y.; An, H.; Gao, X. The importance of transfer function in solving set-union knapsack problem based on discrete moth search algorithm. Mathematics 2019, 7, 17.
  27. Wei, Z.; Hao, J.K. Kernel based tabu search for the Set-union Knapsack Problem. Expert Syst. Appl. 2021, 165, 113802.
  28. García, J.; Crawford, B.; Soto, R.; Castro, C.; Paredes, F. A k-means binarization framework applied to multidimensional knapsack problem. Appl. Intell. 2018, 48, 357–380.
  29. Lister, W.; Laycock, R.; Day, A. A Key-Pose Caching System for Rendering an Animated Crowd in Real-Time. Comput. Graph. Forum 2010, 29, 2304–2312.
  30. Arulselvan, A. A note on the set union knapsack problem. Discret. Appl. Math. 2014, 169, 214–218.
  31. Wei, Z.; Hao, J.K. Iterated two-phase local search for the Set-Union Knapsack Problem. Future Gener. Comput. Syst. 2019, 101, 1005–1017.
  32. He, Y.; Xie, H.; Wong, T.L.; Wang, X. A novel binary artificial bee colony algorithm for the set-union knapsack problem. Future Gener. Comput. Syst. 2018, 78, 77–86.
  33. Feng, Y.; Yi, J.H.; Wang, G.G. Enhanced moth search algorithm for the set-union knapsack problems. IEEE Access 2019, 7, 173774–173785.
  34. Wu, C.; He, Y. Solving the set-union knapsack problem by a novel hybrid Jaya algorithm. Soft Comput. 2020, 24, 1883–1902.
  35. Zhou, Y.; Zhao, M.; Fan, M.; Wang, Y.; Wang, J. An efficient local search for large-scale set-union knapsack problem. Data Technol. Appl. 2020.
  36. Gölcük, İ.; Ozsoydan, F.B. Evolutionary and adaptive inheritance enhanced Grey Wolf Optimization algorithm for binary domains. Knowl.-Based Syst. 2020, 194, 105586.
  37. Crawford, B.; Soto, R.; Lemus-Romani, J.; Becerra-Rozas, M.; Lanza-Gutiérrez, J.M.; Caballé, N.; Castillo, M.; Tapia, D.; Cisternas-Caneo, F.; García, J.; et al. Q-Learnheuristics: Towards Data-Driven Balanced Metaheuristics. Mathematics 2021, 9, 1839.
  38. Lanza-Gutierrez, J.M.; Crawford, B.; Soto, R.; Berrios, N.; Gomez-Pulido, J.A.; Paredes, F. Analyzing the effects of binarization techniques when solving the set covering problem through swarm optimization. Expert Syst. Appl. 2017, 70, 67–82.
  39. He, Y.; Wang, X. Group theory-based optimization algorithm for solving knapsack problems. Knowl.-Based Syst. 2021, 219, 104445.
  40. García, J.; Moraga, P.; Valenzuela, M.; Pinto, H. A db-scan hybrid algorithm: An application to the multidimensional knapsack problem. Mathematics 2020, 8, 507.
  41. Baykasoğlu, A.; Ozsoydan, F.B.; Senol, M.E. Weighted superposition attraction algorithm for binary optimization problems. Oper. Res. 2020, 20, 2555–2581.
  42. Ozsoydan, F.B. Artificial search agents with cognitive intelligence for binary optimization problems. Comput. Ind. Eng. 2019, 136, 18–30.
Figure 1. Machine learning cuckoo search binary algorithm.
Figure 2. K-means binarization procedure.
Figure 3. Box plots for MLCSBO and random operators, with and without local search operator.
Table 1. Parameter setting for the MLCSBO.
Parameter | Description | Value | Range
N | Number of nests | 20 | [10, 15, 20]
K | Number of clusters | 5 | [4, 5, 6]
γ | Step length | 0.01 | 0.01
κ | Lévy distribution parameter | 1.5 | 1.5
T | Maximum local search iterations | 300 | [300, 400, 800]
β | Random initialization parameter | 0.3 | [0.3, 0.5]
Transition probability | Transition probability | [0.1, 0.2, 0.4, 0.8, 0.9] | [0.1, 0.2, [0.4, 0.5], 0.8, 0.9]
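The Lévy-flight step implied by the parameters in Table 1 (step length γ = 0.01, Lévy parameter κ = 1.5) can be sketched with Mantegna's algorithm, a standard way of sampling Lévy-distributed steps in cuckoo search. This is a generic ingredient of the continuous algorithm, not the paper's exact implementation, and the function names are illustrative.

```python
import math
import random

def levy_step(kappa=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + kappa) * math.sin(math.pi * kappa / 2)
             / (math.gamma((1 + kappa) / 2) * kappa * 2 ** ((kappa - 1) / 2))) ** (1 / kappa)
    u = random.gauss(0.0, sigma)  # numerator: N(0, sigma^2)
    v = random.gauss(0.0, 1.0)    # denominator: N(0, 1)
    return u / abs(v) ** (1 / kappa)

def cuckoo_move(position, best, gamma=0.01, kappa=1.5):
    """New continuous position: current + gamma * Levy * (current - best)."""
    return [x + gamma * levy_step(kappa) * (x - b) for x, b in zip(position, best)]
```

The continuous vector returned by `cuckoo_move` is what the k-means operator subsequently binarizes.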
Table 2. Comparison between MLCSBO and random operators, with and without local search operator.
Instance | Random-03 (Avg, Best, Std) | Random-05 (Avg, Best, Std) | Random-03-NL (Avg, Best, Std) | Random-05-NL (Avg, Best, Std) | MLCSBO-NL (Avg, Best, Std) | MLCSBO (Avg, Best, Std)
100_85_0.10_0.7512,82513,089200.912,862.513,089136.312,779.413,044243.812,794.413,08920712,927.813,0899313,06013,28346.4
100_85_0.15_0.8512,071.612,23383.712,12812,23358.312,063.712,226100.612,039.412,272106.812,17712,27446.312,237.612,27418.2
200_185_0.10_0.7513,296.413,50289.413,296.513,44310213,26513,521112.713,252.113,40594.713,357.113,52172.813,429.813,52144.1
200_185_0.15_0.8513,701.314,215238.213,711.213,995163.413,62414,102226.913,581.913,97918113,729.514,187165.213,853.814,215149.9
300_285_0.10_0.7511,221.811,469118.811,282.711,563104.911,219.311,545167.411,218.911,545165.811,305.711,563110.111,419.411,56370.8
300_285_0.15_0.8512,082.212,402158.412,114.312,380121.711,942.912,273190.611,960.312,402221.612,116.112,38011912,263.412,40261.7
400_385_0.10_0.7511,28211,48499.911,333.211,48484.411,241.511,484109.211,273.411,484117.811,295.111,4848311,461.311,48448.9
400_385_0.15_0.8510,716.311,209209.710,836.511,209180.910,677.711,20923810,598.710,923188.810,837.511,209179.110,971.811,209164.4
500_485_0.10_0.7511,467.511,658101.311,53011,68996.811,416.711,610109.111,507.211,72912811,554.111,7227411,636.211,72938.3
500_485_0.15_0.859779.810,217126.29783.510,217136.19695.410,217144.99686.410,086154.99811.410,08698.19916.410,21799.8
100_100_0.10_0.7513,831.713,957105.213,835.213,95774.813,698.613,957165.613,686.113,93716013,856.813,95770.213,952.813,99011.4
100_100_0.15_0.8513,133.513,445182.413,15713,498180.513,057.213,407181.213,013.113,449249.913,180.413,498173.113,337.813,508148.3
200_200_0.10_0.7512,180.812,35087.212,181.312,522106.112,136.712,384126.112,088.612,301124.712,196.112,522110.812,330.612,522101.1
200_200_0.15_0.8511,79312,317153.111,797.112,048165.111,663.711,930156.311,681.412,100194.611,757.811,982118.711,975.112,317140.4
300_300_0.10_0.7512,539.512,81790.912,578.812,81798.312,52612,736114.912,465.112,817162.112,621.512,81785.312,716.412,81769.2
300_300_0.15_0.8511,157.811,425138.811,240.711,41098.911,137.411,41020011,042.911,410178.111,231.611,41097.811,408.811,42526.7
400_400_0.10_0.7511,397.711,665154.811,406.711,665134.311,328.711,665147.511,378.211,665143.511,415.111,665102.811,600.911,66573.9
400_400_0.15_0.8511,046.411,325166.911,087.211,325156.610,911.211,325244.510,936.811,325258.711,090.911,325122.811,271.211,32562.4
500_500_0.10_0.7510,753.210,94386.110,841.611,04185.510,748.911,078148.510,769.511,011121.210,846.410,9837010,954.311,07874.2
500_500_0.15_0.859847.310,108146.19874.610,194163.39768.110,160191.49829.910,2091889879.410,162146.510,056.610,209107.2
85_100_0.10_0.7511,761.212,045159.611,797.412,045131.711,684.111,964181.911,713.412,04516011,797.712,045159.811,945.512,045126.6
85_100_0.15_0.8511,994.512,369223.412,05312,299129.211,942.512,348235.511,98812,369233.612,08312,369152.912,253.112,36981.6
185_200_0.10_0.7513,539.913,65999.613,493.713,69590.813,46813,69612013,449.813,695125.113,558.713,69678.713,651.713,69636.1
185_200_0.15_0.8510,890.611,15594.410,883.211,298164.610,893.911,24213510,873.811,298149.910,969.711,298139.411,068.711,298162.4
285_300_0.10_0.7511,401.411,568100.411,407.411,568105.411,323.911,568146.611,359.611,568126.711,410.111,56883.311,54611,56813.5
285_300_0.15_0.8511,258.411,763221.411,368.111,763149.911,220.411,714225.811,193.411,590240.111,333.511,763208.411,564.311,763129.2
385_400_0.10_0.7510,274.110,39769.510,300.610,40771.210,248.810,436100.310,249.110,4678810,30210,40763.710,400.710,60047.3
385_400_0.15_0.859918.210,506242.89921.810,294212.1981510,354253.2977110,329358.29955.110,506274.910,16210,506172.7
485_500_0.10_0.7510,823.311,09483.110,828.511,115100.510,790.611,09713510,785.811,09797.610,895.611,115105.410,965.311,12596.2
485_500_0.15_0.859836.610,117162.29873.810,208150.79760.310,104160.19795.110,220191.99897.410,10412210,095.710,22074.1
Average11,594.111,883.4139.811,626.911,882125.111,535.011,860167.111,532.811,860170.611,646.311,890117.611,783.611,93183.2
p-value Wilcoxon | 1.7 × 10^-6 | 6.5 × 10^-4 | 1.8 × 10^-6 | 1.9 × 10^-4 | 1.7 × 10^-6 | 5.9 × 10^-5 | 1.7 × 10^-6 | 1.9 × 10^-4 | 1.7 × 10^-6 | 9.7 × 10^-4
p-value Holm–Bonferroni | 5.1 × 10^-6 | 0.0013 | 5.4 × 10^-6 | 3.8 × 10^-4 | 5.1 × 10^-6 | 1.2 × 10^-4 | 5.1 × 10^-6 | 3.8 × 10^-4 | 1.7 × 10^-6 | 9.7 × 10^-4
Table 3. Average runtime values in seconds for MLCSBO and random operators, with and without a local search operator.
Instance | MLCSBO | MLCSBO-NL | Random-03 | Random-03-NL | Random-05 | Random-05-NL
100_85_0.10_0.75 | 10 | 9 | 10 | 7 | 10 | 5
100_85_0.15_0.85 | 20 | 18 | 19 | 16 | 18 | 15
200_185_0.10_0.75 | 26 | 21 | 17 | 15 | 19 | 14
200_185_0.15_0.85 | 61 | 48 | 54 | 40 | 54 | 51
300_285_0.10_0.75 | 33 | 25 | 22 | 15 | 23 | 17
300_285_0.15_0.85 | 66 | 60 | 55 | 57 | 69 | 48
400_385_0.10_0.75 | 35 | 32 | 31 | 26 | 32 | 29
400_385_0.15_0.85 | 97 | 102 | 106 | 101 | 101 | 99
500_485_0.10_0.75 | 63 | 55 | 39 | 43 | 56 | 47
500_485_0.15_0.85 | 119 | 106 | 107 | 94 | 111 | 112
100_100_0.10_0.75 | 6 | 5 | 4 | 4 | 4 | 5
100_100_0.15_0.85 | 17 | 17 | 12 | 14 | 16 | 14
200_200_0.10_0.75 | 32 | 31 | 24 | 25 | 24 | 24
200_200_0.15_0.85 | 140 | 143 | 124 | 122 | 133 | 112
300_300_0.10_0.75 | 99 | 95 | 88 | 84 | 87 | 85
300_300_0.15_0.85 | 156 | 148 | 160 | 152 | 153 | 144
400_400_0.10_0.75 | 46 | 43 | 32 | 30 | 42 | 37
400_400_0.15_0.85 | 202 | 198 | 176 | 161 | 191 | 143
500_500_0.10_0.75 | 82 | 89 | 80 | 69 | 84 | 80
500_500_0.15_0.85 | 168 | 135 | 153 | 128 | 123 | 110
85_100_0.10_0.75 | 4 | 4 | 4 | 4 | 4 | 4
85_100_0.15_0.85 | 18 | 17 | 18 | 17 | 18 | 18
185_200_0.10_0.75 | 29 | 32 | 29 | 25 | 29 | 20
185_200_0.15_0.85 | 67 | 53 | 42 | 43 | 47 | 44
285_300_0.10_0.75 | 20 | 18 | 16 | 17 | 14 | 16
285_300_0.15_0.85 | 91 | 79 | 73 | 66 | 67 | 60
385_400_0.10_0.75 | 72 | 65 | 53 | 70 | 57 | 69
385_400_0.15_0.85 | 122 | 101 | 92 | 92 | 85 | 99
485_500_0.10_0.75 | 114 | 114 | 108 | 104 | 87 | 108
485_500_0.15_0.85 | 130 | 123 | 120 | 106 | 134 | 110
Average | 71.5 | 66.2 | 62.3 | 58.2 | 63.1 | 58.0
Table 4. Percentile values for MLCSBO and Random operators, with and without local search operator.
Percentile | MLCSBO | MLCSBO-NL | Random-03 | Random-03-NL | Random-05 | Random-05-NL
2.5 | 0.00 | 0.01 | 0.16 | 0.63 | 0.02 | 0.22
25 | 0.57 | 1.82 | 1.98 | 2.44 | 1.95 | 2.25
50 | 1.51 | 2.81 | 3.19 | 3.78 | 2.94 | 3.60
75 | 2.43 | 3.80 | 4.34 | 5.12 | 4.05 | 4.85
97.5 | 4.06 | 6.08 | 6.64 | 7.64 | 6.25 | 7.61
Table 5. Best|average p-values for the Wilcoxon test.
 | MLCSBO | MLCSBO-NL | Random-03 | Random-03-NL | Random-05
MLCSBO | -
MLCSBO-NL | 9.7 × 10^-4 | 1.7 × 10^-6 | -
Random-03 | 6.5 × 10^-4 | 5.1 × 10^-6 | 0.31 | 5.7 × 10^-5 | -
Random-03-NL | 1.2 × 10^-4 | 5.1 × 10^-6 | 0.06 | 1.7 × 10^-5 | 0.23 | 2.1 × 10^-5 | -
Random-05 | 3.8 × 10^-4 | 5.4 × 10^-6 | 0.97 | 0.0012 | 0.61 | 3.1 × 10^-4 | 0.05 | 1.9 × 10^-5 | -
Random-05-NL | 3.8 × 10^-4 | 5.1 × 10^-6 | 0.44 | 1.7 × 10^-5 | 0.47 | 8.4 × 10^-4 | 0.77 | 0.87 | 0.50 | 1.7 × 10^-5
Table 6. Comparison between GA, BABC, ABC bin, gPSO*, gPSO, intAgents, DH-Jaya, GWOfbd, GWOrbd and MLCSBO algorithms for medium instances.
Instance | Results | Best Known | GA | BABC | ABC_bin | binDE | bWSA | gPSO* | gPSO | intAgents | DH-Jaya | GWOfbd | GWOrbd | MLCSBO
100_85_0.10_0.75best13,28313,04413,25113,04413,04413,04413,16713,28313,28313,28313,08913,28313,283
Avg 12,956.413,028.512,818.512,99112,915.6712,937.0513,050.5313,061.0213,07613,041.3713,065.9313,060
std dev 130.6692.63153.0675.95185.45189.637.4144.0866.6131.1770.0246.4
100_85_0.15_0.85best12,47912,06612,23812,23812,27412,23812,21012,27412,27412,27412,27412,27412,274
Avg 11,54612,15512,049.312,123.911,527.4111,777.7112,084.8212,074.8412,192.512,079.412,053.5712,237.6
std dev 214.9453.2996.1167.61332.27277.1695.3886.3770.2593.3481.9918.2
200_185_0.10_0.75best13,52113,06413,24112,94613,24113,25013,30213,40513,50213,40513,40513,40513,521
Avg 12,492.513,064.411,861.512,940.712,657.6512,766.3813,286.5613,226.2813,306.613,282.313,280.7813,429.8
std dev 320.0399.57324.65205.7319.58304.8293.18150.9260.96102.88123.6344.1
200_185_0.15_0.85best14,21513,67113,82913,67113,67113,85813,99314,04414,04414,21514,21514,21514,215
Avg 12,802.913,359.212,53713,11012,585.3512,949.0513,492.613,441.0613,660.213,464.3513,479.9913,853.8
std dev 291.66234.99289.53269.69302.66325.58328.72324.96274.76358.97358.56149.9
300_285_0.10_0.75best11,56310,55310,428975110,42010,99110,60011,33511,33510,93411,41311,33511,563
Avg 9980.879994.769339.39899.2410,366.2110,090.4710,669.5110,576.110,703.210,707.5410,684.1711,419.4
std dev 142.97154.03158.15153.18257.1236.14227.85281.13112.95230.46242.4370.8
300_285_0.15_0.85best12,60711,01612,01210,91311,66112,09311,93512,24512,24712,24512,40212,25912,402
Avg 10,349.810,902.99957.8510,499.410,901.5910,750.311,607.111,490.2612,037.511,646.2311,606.3212,263.4
std dev 215.13449.45276.9403.95508.79524.53477.8518.81296.02517.63492.9961.7
400_385_0.10_0.75best11,48410,08310,766967410,57611,32110,69811,48411,48411,33711,48411,48411,484
Avg 9641.8510,065.29187.769681.4610,785.749946.9610,915.8710,734.6211,06210,884.4910,880.2211,461.3
std dev 168.94241.45167.08275.05361.45295.28367.75371.37273.63396.92386.7948.9
400_385_0.15_0.85best11,209983196498978964910,43510,16810,71010,71010,43110,71010,75711,209
Avg 9326.779135.988539.959020.879587.729417.29864.55973510,017.99894.549900.0110,971.8
std dev 192.2151.9161.83150.99360.29360.03315.38370.44207.98329.34325.03164.4
500_485_0.10_0.75best11,77111,03110,78410,34010,58611,54011,25811,72211,72211,72211,72211,77111,729
Avg 10,567.910,452.29910.3210,363.810,921.5810,565.911,184.5111,111.6311,269.411,276.4911,338.2611,636.2
std dev 123.15114.35120.8293.39351.69260.32322.98355.18275.37347.99351.4638.3
500_485_0.15_0.85best10,23894729090875991919681975610,02210,059977010,19410,19410,217
Avg 8692.678857.898365.048783.999013.098779.449299.569165.269354.289339.89398.079916.4
std dev 180.1294.55114.1131.05204.85300.11277.62282.55212.69252.98266.4699.8
100_100_0.10_0.75best14,04414,04413,86013,86013,81414,04413,96314,04414,04414,04414,04414,04413,990
Avg 13,80613,734.913,547.213,675.913,492.7113,739.7113,854.7113,767.2313,912.513,861.3513,847.8613,952.8
std dev 144.9170.76119.11119.53325.34119.5296.23131.5984.5584.62100.3311.4
100_100_0.15_0.85best13,50813,14513,50813,49813,40713,40713,49813,50813,50813,50813,50813,50813,508
Avg 12,234.813,352.413,103.113,212.812,487.8812,937.5313,347.5813,003.6213,439.113,312.5713,297.3713,337.8
std dev 388.66155.14343.46287.45718.23417.91194.34375.7444.86189.12172.16148.3
200_200_0.10_0.75best12,52211,65611,84611,19111,53512,27111,97212,52212,52212,52212,35012,52212,522
Avg 10,888.711,194.310,424.110,969.411,430.2311,232.5511,898.7311,586.2612,171.611,852.4411,906.9712,330.6
std dev 237.85249.58197.88302.52403.33349.39391.83419.09220.68371.57382.91101.1
200_200_0.15_0.85best12,31711,79211,52111,28711,46911,80412,16712,31711,91112,18711,99312,31712,317
Avg 10,827.510,94510,345.910,717.111,062.0611,026.8111,584.6411,288.2511,74611,612.0711,594.911,975.1
std dev 334.43255.14273.47341.08423.9421.22275.32410.54181.18217.13301.1140.4
300_300_0.10_0.75best12,81712,05512,18611,49412,30412,64412,73612,69512,69512,69512,78412,69512,817
Avg 11,755.111,945.810,922.311,864.412,227.5611,934.6412,411.2712,310.1912,569.312,44112,446.2112,716.4
std dev 144.45127.8182.63160.42308.11293.83225.8238.32114.13247.67227.4769.2
300_300_0.15_0.85best11,58510,66610,382963310,38211,11310,72411,42511,42511,11311,42511,42511,425
Avg 10,099.29859.699186.879710.3710,216.719906.8110,568.4110,38410,701.910,632.7110,648.5311,408.8
std dev 337.42177.02147.78208.48351.12399.13327.48378.42153.66345.63328.1326.7
400_400_0.10_0.75best11,66510,57010,62610,16010,46211,19911,04811,53111,53111,31011,53111,53111,665
Avg 10,112.410,101.19549.049975.810,624.7910,399.9710,958.9610,756.9210,914.810,961.2510,964.9811,600.9
std dev 157.89196.99141.27185.57266.46281.99274.9250.56216.47258.47276.1673.9
400_400_0.15_0.85best11,325923595419033938810,91510,26410,92710,92710,91510,92710,92711,325
Avg 8793.769032.958365.628768.429580.649195.249845.179608.079969.99849.049873.2711,271.2
std dev 169.52194.18153.4212.24411.83311.9358.91363.72287.61343.9373.762.4
500_500_0.10_0.75best11,24910,46010,75510,07110,54610,82710,64710,88810,96010,96010,92110,96011,078
Avg 10,185.410,328.59738.1710,227.710,482.810,205.0810,681.4610,610.5310,703.510,716.5510,742.9810,954.3
std dev 114.1991.62111.63103.32165.62190.05125.36169.73105.18140.87130.0574.2
500_500_0.15_0.85best10,381949693189262931210,082983910,19410,38110,17610,19410,19410,209
Avg 8882.889180.748617.919096.139478.719106.649703.629578.899801.59758.619737.4810,056.6
std dev 158.2184.91141.32145.45262.44257.65252.84278.06222.21243.59272.51107.2
Table 7. Comparison between GA, BABC, ABC bin, gPSO*, gPSO, intAgents, DH-Jaya, GWOfbd, GWOrbd and MLCSBO algorithms for medium instances.
Instance | Results | Best Known | GA | BABC | ABC_bin | binDE | bWSA | gPSO* | gPSO | intAgents | DH-Jaya | GWOfbd | GWOrbd | MLCSBO
85_100_0.10_0.75best12,04511,45411,66411,20611,35211,94711,71012,04512,04512,04512,04512,04512,045
Avg 11,092.711,182.710,879.511,07511,233.1611,237.0511,486.9511,419.7511,570.611,441.2311,430.4411,945.5
std dev 171.22183.57163.62119.42216.67168.96137.52140.77177.8611172127.56126.6
85_100_0.15_0.85best12,36912,12412,36912,00612,36912,36912,36912,36912,36912,36912,36912,36912,369
Avg 11,326.312,081.611,485.311,875.911,342.711,684.4611,994.3611,885.2112,31811,917.8311,942.9312,253.1
std dev 417193.79248.33336.94474.76353.79436.81431.67181.92442.25418.7281.6
185_200_0.10_0.75best13,69612,84113,04712,30813,02413,50513,29813,69613,69613,69613,64713,69613,696
Avg 12,236.612,522.811,667.912,277.512,689.0912,514.7213,204.2613,084.5213,350.213,121.2313,125.8513,651.7
std dev 198.18201.35177.14234.24336.51356.2366.56388.39182.56365.41367.0636.1
185_200_0.15_0.85best11,29810,92010,60210,37610,54710,83110,85611,29811,29811,29811,29811,29811,298
Avg 10,351.510,150.69684.3310,085.410,228.0710,208.3310,801.4110,780.1410,828.910,871.4910,819.3411,068.7
std dev 208.08152.91184.84160.6286.92263.73205.76239.61191.76240.32239.52162.4
285_300_0.10_0.75best11,56810,99411,15810,26911,15211,56811,31011,56811,56811,56811,56811,56811,568
Avg 10,640.110,775.99957.0910,661.311,105.0910,761.9611,317.9911,205.7211,327.710,001.3310,014.4711,546.0
std dev 126.84116.8141.48149.84197.78199.43182.82258.49166.91174.62215.3513.5
285_300_0.15_0.85best11,80211,09310,52810,05110,52811,37711,22611,51711,51711,40111,59011,76311,763
Avg 10,190.39897.929424.159832.3210,452.0310,309.1910,899.210,747.3311,025.910,470.3310,871.4911,564.3
std dev 249.76186.53197.14232.72416.76389.1230036334.25208.08340.47332.09129.2
385_400_0.10_0.75best10,600979910,0859235988310,414987110,48310,32610,41410,39710,48310,600
Avg 9432.829537.58904.949314.579778.039552.1410,013.439892.1710,01710,043.239902.7210,400.7
std dev 163.84184.62111.85191.59221.49234.1202.4179.19141.15163.95180.3247.3
385_400_0.15_0.85best10,506917394568932935210,077938910,33810,13110,30210,30210,30210,506
Avg 8703.669090.038407.068846.999203.528881.179524.989339.679565.729472.399455.2410,162.0
std dev 154.15156.69148.52210.91303.12283.3286.16288.88237.9242.25261.75172.7
485_500_0.10_0.75best11,32110,31110,82310,35710,72810,83510,59511,09411,09410,97110,98911,09711,125
Avg 9993.1610,483.49615.3710,159.410,607.2110,145.2610,687.6210,603.5310,754.810,702.7210,725.2610,965.3
std dev 117.73228.34151.41198.49191.86199.99168.06204.99112.69154.92160.00196.2
485_500_0.15_0.85best10,22093299333879992189603980710,10410,104971510,10410,10410,220
Avg 8849.469085.578347.828919.649141.948917.449383.289259.369467.894629455.2410,095.7
std dev 141.84115.62122.65168.9180.42267.49241.01268.33106.55229.88261.7474.1
Table 8. Comparison between BABC, DH-Jaya, and MLBO algorithms for large instances.
Instance | Best Known | BABC (Best, Avg, Std, t_avg) | DH-Jaya (Best, Avg, Std, t_avg) | MLBO (Best, Avg, Std, t_avg)
600_585_0.10_0.75991490989026.134.9498.696409450.060.2690.597219668.654.288.3
600_585_0.15_0.85935787368540.520.5172.591878998.579.2881.393139045.9119.6320.5
700_685_0.10_0.75988193119176.346.9363.497909602.056.0543.297369545.7103.3162.5
700_685_0.15_0.85916386718397.487.7302.691068894.1140.5426.191358834.1106.7356.7
800_785_0.10_0.75983792759192.420.3253.397719540.148.0637.394709268101.6316.6
800_785_0.15_0.85902484478366.572.0254.387978649.063.0236.889078611.966.3200.9
900_885_0.10_0.75972589538837.2103.2471.494559249.5109.1687.294549142116.9260.1
900_885_0.15_0.85862080727881.288.5228.484188244.587.9316.684278120.5186.2264.3
1000_985_0.10_0.75966892769254.227.9640.594249306.945.0309.991468642.4242.3202.5
1000_985_0.15_0.85845381338099.125.4648.284338280.590.9312.681497755.7223.7234.0
600_600_0.10_0.7510,52410,2079939.447.566.710,50710,504.319.7321.210,51810,470.922.5192.4
600_600_0.15_0.85906286218361.8101.3455.589108785.643.5572.089398891.331.7570.6
700_700_0.10_0.75978690789056.521.9224.495129409.028.7809.897869416.6156.6302.3
700_700_0.15_0.85922986148290.277.6126.891218985.565.9507.790688786140.9244.1
800_800_0.10_0.75993295179305.456.8418.598909656.451.4567.196799458.9110.7278.8
800_800_0.15_0.85910184448163.8132.7376.789618774.259.8161.788648433.7175.9276.7
900_900_0.10_0.75974592909273.014.6460.095269462.937.8671.095339289.8138.6254.4
900_900_0.15_0.85899081188114.59.2151.087188492.962.3702.786478233.4279.8250.5
1000_1000_0.10_0.75954490308891.339.0658.093489250.853.7542.290628656.8196.8198.9
1000_1000_0.15_0.85847478677627.844.9635.083308037.971.9932.681067767.3189.1318.6
585_600_0.10_0.7510,39397689677.881.9535.910,30010,161.572.898.210,001995452.1160.0
585_600_0.15_0.85925686898623.828.5461.990318944.261.7616.692568921.7122.6246.4
685_700_0.10_0.7510,12197969627.473.2248.710,0709953.649.0430.299149633.3144.9210.7
685_700_0.15_0.85917684538424.94.8958.791028860.8106.4160.091108828.685.9282.2
785_800_0.10_0.75938487658658.554.3869.091238885.154.1316.590398875.487.4304.4
785_800_0.15_0.85874682498021.9117.1577.085568482.351.5604.685558280.5108.4310.1
885_900_0.10_0.75931889388897.630.2587.291379079.146.7590.490198761.6114.9186.5
885_900_0.15_0.85842576107518.050.5869.782177881.465.8140.980017766.7134.6326.2
985_1000_0.10_0.75919389148741.3101.8739.990678994.545.0313.189348435.7190.5148.1
985_1000_0.15_0.85852880718066.515.2486.584538425.348.7504.081147600.7287.6190.4
Average9352.38800.48668.454.3458.09196.79041.462.5486.89120.18836.6136.4255.3
p-value 3.5 × 10 6 0.008 0.02 5.7 × 10 5
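As a minimal illustration of how summary rows of this kind are computed (the per-method column grouping is an assumption here, not taken from the original table headers), the sketch below reproduces a column-wise average and a relative percentage deviation (RPD) from a few transcribed rows:

```python
# Sketch: derive an "Average" summary row and a relative percentage deviation
# (RPD) from per-instance results, assuming each entry holds the best-known
# value (BKS) followed by the Avg column of each of the three methods.

# Three rows transcribed from the table above.
results = {
    "600_585_0.15_0.85": (9357, [8540.5, 8998.5, 9045.9]),
    "700_685_0.10_0.75": (9881, [9176.3, 9602.0, 9545.7]),
    "700_685_0.15_0.85": (9163, [8397.4, 8894.1, 8834.1]),
}

def mean(values):
    return sum(values) / len(values)

# Column-wise mean of the Avg values, as in the table's "Average" row.
col_avgs = [mean([avgs[i] for _, avgs in results.values()]) for i in range(3)]

def rpd(bks, avg):
    # Percentage gap between a method's average solution and the best-known
    # value; lower is better, 0 means the average matches the BKS.
    return 100.0 * (bks - avg) / bks

mean_rpds = [mean([rpd(bks, avgs[i]) for bks, avgs in results.values()])
             for i in range(3)]
```

RPD is a common scale-free way to compare metaheuristics across instances of different sizes, which is why averages alone (as in the table) are often complemented by such a gap measure.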
Share and Cite

García, J.; Lemus-Romani, J.; Altimiras, F.; Crawford, B.; Soto, R.; Becerra-Rozas, M.; Moraga, P.; Becerra, A.P.; Fritz, A.P.; Rubio, J.-M.; et al. A Binary Machine Learning Cuckoo Search Algorithm Improved by a Local Search Operator for the Set-Union Knapsack Problem. Mathematics 2021, 9, 2611. https://doi.org/10.3390/math9202611