1. Introduction
Ant colony optimization (ACO) is a nature-inspired metaheuristic that has been used as a design template for many successful algorithms. These algorithms are mostly used to solve NP-hard combinatorial optimization problems, although ACO has also been used for continuous and mixed-variable optimization problems [1,2,3,4,5,6]. ACO algorithms are stochastic and cannot guarantee optimal solutions, but they are commonly applied to NP-hard optimization problems because, for such problems, exact algorithms generally cannot yield optimal solutions in a reasonable time. It is sometimes also possible to combine ACO techniques such as pheromone trails with Monte Carlo methods to retain some of the advantages of both procedures [7].
The key concept in ACO is the use of pheromone trails associated with solution components, which allow the colony of artificial ants to accumulate collective knowledge and experience about the problem instance they are trying to solve. These pheromone trails bias future attempts of the artificial ants to construct better solutions. Being a metaheuristic, ACO gives only general recipes on how to devise an algorithm. For solving a specific type of problem, it is necessary to devise a specific ACO algorithm, and success depends not only on the nature of the problem but also greatly on how these recipes are applied. It is important to enable good guidance through the search space by properly balancing exploration and exploitation. An overly greedy algorithm might converge quickly toward mediocre or bad solutions, while too little greediness can result in slow convergence or unguided roaming through the search space with a very low probability of obtaining good solutions. For some NP-hard problems, ACO can get substantial help from heuristic information, but for other NP-hard problems such useful heuristic information is unknown or perhaps impossible to define. Heuristic information is very useful for problems where some kind of optimal route is the objective, such as the traveling salesman problem (TSP) [8], the asymmetric traveling salesman problem (ATSP) [8], the sequential ordering problem (SOP) [9,10], various vehicle routing problems (VRPs) [11,12], the car renter salesman problem (CaRS) [13], etc. In these cases, the search process can be faster because of heuristic information, and thus, the parameters and strategies used in ACO should be adjusted accordingly.
The standard methods for reinforcing pheromone trails in modern ACO variants have diametrically opposite strategies, leaving a large gap between too much and too little greediness. In this paper, we conduct experimental research on generalized reinforcement strategies to gain better control over algorithmic behavior and, thus, achieve better performance in ant colony optimization algorithms. The research questions that we tried to answer were the following. Can adjustable pheromone reinforcement strategies improve the algorithmic performance of ACO algorithms? How can κ-best and max-κ-best be generalized or extended to provide less greedy algorithmic behavior than with κ = 1, in cases when this is desirable? How do different adjustable pheromone strategies influence the behavior of ACO algorithms (with and without local optimization) for combinatorial problems that have efficient heuristic information? For the experiments performed to answer these research questions, we chose well-known instances of TSP and ATSP. Initial ideas for such strategies were presented in a conference paper [14]; here, we extend our proposal and carry out a comprehensive experimental evaluation. The results show that, by using adjustable reinforcement strategies, an ACO algorithm can obtain better solutions and allow greater control over algorithmic behavior. The main contributions of this paper are a novel 1/λ-best strategy and comprehensive experimental research that answers the previously stated research questions and provides insight into how adjustable pheromone strategies influence the algorithmic behavior of ACO, thus expanding scientific knowledge about ACO.
The remainder of this paper is structured as follows. After the relevant related work is covered in Section 2, Section 3 explains the combinatorial problems that are relevant to this study and important for understanding the MAX-MIN ant system described in Section 4. Section 5 explains the numerically adjustable strategies κ-best, max-κ-best, and 1/λ-best. Section 6 gives details about the experimental settings and procedures, while Section 7 presents the results of the experiments. Finally, the discussion is given in Section 8.
2. Related Work
Improvements made by ant colony optimization variants are achieved by changing either the solution construction procedure or the pheromone update procedure. In the case of the solution construction procedure, almost all ACO algorithms use the random proportional rule introduced by the ant system [15,16] or the more general variant, the pseudo-random proportional rule, used by the ant colony system [17]. Many improvements in ant colony optimization variants are based on changing the way the pheromone trails are modified in the pheromone update procedure. These changes are more often concerned with the way the pheromone trails are reinforced than with the way the pheromone trails are evaporated.
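For reference, the random proportional rule and the pseudo-random proportional rule are commonly written as follows, where τij is the pheromone trail, ηij the heuristic information, α and β the weighting parameters, N_i^k the feasible neighborhood of ant k located at node i, q a uniform random number in [0, 1], and q0 the exploitation parameter of ACS:
$$ p_{ij}^{k} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{l \in N_i^{k}} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}}, \quad j \in N_i^{k}; \qquad j = \begin{cases} \arg\max_{l \in N_i^{k}} \left\{ \tau_{il}\,\eta_{il}^{\beta} \right\}, & q \le q_0, \\ J, & \text{otherwise}, \end{cases} $$
where J is a node chosen according to the random proportional rule.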
The initial variant of ACO, named the ant system [18], uses all solutions from the current iteration in the pheromone reinforcement procedure. The components of better solutions are reinforced with proportionally more pheromone than those of worse solutions. This type of strategy provides only weak guidance to the subsequent solution construction procedures searching through the solution space: the pheromone reinforcement strategy introduced by the ant system causes too much exploration and too little exploitation of previous knowledge.
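In the ant system, this reinforcement takes the standard form in which every ant k in a colony of m ants deposits pheromone on the components of its solution s_k after evaporation:
$$ \tau_{ij} \leftarrow (1-\rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}, \qquad \Delta\tau_{ij}^{k} = \begin{cases} 1/f(s_k), & \text{if component } (i,j) \in s_k, \\ 0, & \text{otherwise}, \end{cases} $$
where ρ is the evaporation rate and f(s_k) the cost of solution s_k.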
In an attempt to improve algorithmic behavior, an ACO variant named the elitist ant system was proposed [16]. To make the algorithm greedier, in addition to the pheromone reinforcement by all solutions in the current iteration, the components of the best solution found since the beginning of the algorithm (also called the global-best solution) are reinforced with a weight determined by the parameter e.
More selective in choosing solutions for pheromone reinforcement is the rank-based ant system, which uses weights for pheromone reinforcement based on solution ranks and reinforces only w solutions from the current iteration [19], where w is an integer parameter.
Modern variants of ACO algorithms use only one solution, in some sense “the best” solution, to reinforce pheromone trails. The ant colony system (ACS) uses the global-best solution for pheromone reinforcement [16]. The MAX-MIN ant system (MMAS) uses the iteration-best or global-best solution [8].
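In MMAS, only the chosen solution s^upd (iteration-best or global-best) deposits pheromone, and the trails are kept within the limits [τMIN, τMAX]:
$$ \tau_{ij} \leftarrow \min\left\{ \tau_{MAX},\; \max\left\{ \tau_{MIN},\; (1-\rho)\,\tau_{ij} + \Delta\tau_{ij}^{upd} \right\} \right\}, \qquad \Delta\tau_{ij}^{upd} = \begin{cases} 1/f(s^{upd}), & (i,j) \in s^{upd}, \\ 0, & \text{otherwise}. \end{cases} $$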
The most commonly used variant of ACO is probably MMAS because of its excellent performance on many combinatorial optimization problems. When a faster and greedier variant is preferred, ACS might be used instead of MMAS.
The best–worst ant system (BWAS) uses only the global-best solution to reinforce pheromone trails but also decreases the pheromone trails associated with those components of the worst solution in the current iteration that are not contained in the global-best solution [20,21]. Although the authors of BWAS claimed good performance, the algorithm did not gain wide popularity.
Population-based ant colony optimization uses the iteration-best solution to reinforce pheromone trails but also has a special mechanism to completely undo the pheromone reinforcement made in a previous iteration, which is especially suitable for dynamic optimization problems [22,23].
There are also some studies about reinforcing pheromone trails that cannot be applied in general cases and address only specific situations. In [24], Deng et al. proposed a technique for situations when pheromone trails are associated with nodes instead of arcs of the construction graph. The best-so-far strategy is used together with new rules proposed by the authors specifically for this type of problem (these rules are named the r-best-node update rule and the relevant-node depositing rule; the first has a name somewhat similar to our κ-best strategy, although the actual methods are very different). Their approach is applied to the shortest path problem, even though a path is not uniquely defined by its set of nodes and, therefore, pheromone trails associated with arcs would seem a more appropriate choice. Associating pheromone trails with nodes in the shortest path problem might have some success only if a path contains a small subset of all possible nodes.
Pheromone modification strategies have also been proposed for dynamic optimization problems [25,26]. These strategies are not related to pheromone reinforcement; instead, they are about reacting to a change in the problem to avoid restarting the algorithm. The pheromone trails are modified to account for changes in the problem instance, which is applicable in cases where the changes are small or medium.
Rather complex rules for assigning weights to the additional pheromone values used in the pheromone reinforcement procedure of ACS were proposed in [27]. The authors performed an experimental evaluation of the proposed rules on three small instances of the Euclidean TSP and claim better results than those obtained by AS and standard ACS.
A pheromone update strategy based on the theory of learning automata, where the additional pheromone values for the reinforcement procedure depend not only on solution quality but also on the current pheromone trails, is used in [28].
Swarm intelligence approaches have also been successfully combined with machine learning, together forming a novel research field that has provided some outstanding results in different areas [29,30].
3. Combinatorial Optimization Problems Relevant to This Study
Ant colony optimization always uses pheromone trails and, for some problems, heuristic information to guide an ant’s construction of a particular solution. For some problems, useful heuristic information has not been discovered. For example, in the case of the quadratic assignment problem (QAP), one of the best-performing variants of the ACO algorithm, the MAX-MIN ant system, does not use heuristic information. Other examples of NP-hard problems for which successful ACO algorithms do not use heuristic information are the shortest common supersequence problem (SCSP) [31] and the maximum clique problem (MCP) [32].
For some problems, researchers have discovered heuristic information that can significantly guide and speed up the search process and, thus, improve the performance of ACO algorithms. This includes the traveling salesman problem (TSP), the asymmetric traveling salesman problem (ATSP), the sequential ordering problem (SOP), the vehicle routing problem with time window constraints (VRPTW) [33], the car renter salesman problem (CaRS), and many others. For example, by using heuristic information (with parameter β = 4) for the TSP problem instance pka379, one ant in ACO can increase the probability of obtaining a particular optimal solution in the first iteration by an enormous factor of 1.03 × 10^719. Since TSP with 379 nodes has 378!/2 ≈ 3.28 × 10^811 solutions, this probability is increased from 3.05 × 10^−812 without heuristic information to 3.14 × 10^−93 with heuristic information. Obviously, given those probabilities, we never expect to obtain an optimal solution in the first iteration of the algorithm, but heuristic information can help tremendously in the starting iterations of ACO algorithms [34].
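For the symmetric TSP, the number of distinct tours on n nodes is (n − 1)!/2, which yields the figures quoted above for pka379:
$$ \frac{(n-1)!}{2}\bigg|_{n=379} = \frac{378!}{2} \approx 3.28 \times 10^{811}, \qquad \frac{3.14 \times 10^{-93}}{3.05 \times 10^{-812}} \approx 1.03 \times 10^{719}. $$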
We used TSP and ATSP for the experimental investigation in this research, as they are often used for testing new techniques or strategies in swarm and evolutionary computation algorithms and have various interesting extensions such as SOP, VRPTW, and CaRS.
The asymmetric traveling salesman problem can be defined using a weighted directed graph with a set of vertices V and a set of arcs A. The objective is to visit all vertices exactly once and return to the starting vertex while minimizing the total tour weight. All the information needed to solve the problem can be stored in a matrix that specifies the distance between vertices in each direction, where the two directions may differ (e.g., because of one-way streets).
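Formally, for a tour represented as a permutation π of the n vertices, the objective is to minimize the total tour weight
$$ f(\pi) = \sum_{i=1}^{n-1} d_{\pi(i)\pi(i+1)} + d_{\pi(n)\pi(1)}, $$
where dij is the weight of the arc from vertex i to vertex j and, in the asymmetric case, dij and dji may differ.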
The symmetric variant of the problem, where dij = dji, is traditionally studied separately and is denoted simply as TSP. The symmetric TSP is possibly one of the most researched NP-hard problems, with significant achievements for some special variants. After decades of research on the metric variant of TSP, and especially the Euclidean TSP, there are specialized heuristics that can efficiently solve rather large instances, often to optimality. In the metric TSP, there is a constraint that, for any three nodes labeled i, j, and k, the following triangle inequality holds:
$$ d_{ij} \leq d_{ik} + d_{kj}. $$
In addition, for the Euclidean TSP, for any two nodes i and j with coordinates (x_i, y_i) and (x_j, y_j), the distance dij is given by:
$$ d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}. $$
In this context, metaheuristic researchers usually use metric and Euclidean TSP instances for testing new techniques and strategies rather than to produce the best-performing algorithm for such variants.
6. Experimental Settings
For the experiments, we implemented MMAS for TSP and ATSP, both with and without local optimization, in C++. For TSP, the implemented local optimization was 2-opt, which always accepts the first improvement gained by a 2-exchange. For ATSP, we used 2.5-opt, also with first improvement. For all algorithms, the parameter α was set to 1; for problem instances of size 100 or more, the favorite list size was set to 30, and the initial value for the pheromone trails was set according to the following equation:
$$ \tau_0 = \frac{1}{\rho \cdot f(s^{nn})}, $$
where f(snn) is the cost of an approximate solution obtained by the nearest neighbor heuristic, and the upper trail limit τMAX was set equal to τ0. For MMAS algorithms without local optimization, the colony size m was set equal to the size of the problem n, β was set to 4, and ρ was set to 0.02, while for MMAS algorithms with local optimization, the colony size m was set to 25, β was set to 2, and ρ to 0.2. These parameter values are recommended in the literature for MMAS designed for TSP and ATSP [8,37].
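As an illustration, a minimal C++ sketch of 2-opt local optimization with the first-improvement rule for the symmetric case is given below; this is a simplified outline rather than our exact implementation, and the data structures and names are illustrative:

```cpp
#include <vector>
#include <algorithm>

using Matrix = std::vector<std::vector<double>>;   // distance matrix d[i][j]

// 2-opt with first improvement: apply the first 2-exchange that shortens
// the tour and repeat the scan until no improving exchange remains.
void twoOptFirstImprovement(std::vector<int>& tour, const Matrix& d) {
    const int n = static_cast<int>(tour.size());
    bool improved = true;
    while (improved) {
        improved = false;
        for (int i = 0; i < n - 1 && !improved; ++i) {
            for (int j = i + 2; j < n; ++j) {
                if (i == 0 && j == n - 1) continue;      // would touch the same edge pair
                const int a = tour[i], b = tour[i + 1];
                const int c = tour[j], e = tour[(j + 1) % n];
                // Gain of replacing edges (a,b) and (c,e) with (a,c) and (b,e).
                const double delta = d[a][c] + d[b][e] - d[a][b] - d[c][e];
                if (delta < -1e-12) {                    // first improving exchange found
                    std::reverse(tour.begin() + i + 1, tour.begin() + j + 1);
                    improved = true;
                    break;
                }
            }
        }
    }
}
```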
For each problem instance, we tested 24 pheromone reinforcement strategies: 1/5-best, 1/4-best, 1/3-best, 1/2-best, 2-best, 4-best, 8-best, 16-best, 32-best, 64-best, 128-best, max-2-best, max-4-best, max-8-best, max-16-best, max-32-best, max-64-best, max-128-best, 5-1-ib-gb, 3-1-ib-gb, 2-1-ib-gb, 1-1-ib-gb, iteration-best = 1/1-best = 1-best = max-1-best, and global-best = ∞-best = max-∞-best. Each experiment was allowed 10,000 iterations, and each experiment was repeated 101 times. Altogether, 7.05 × 10^11 solutions were constructed by the MMAS algorithms, not counting changes achieved by local optimization. The experiments were conducted on the Isabella computer cluster.
7. Results
The results of the experiments are analyzed using medians as a suitable measure of average algorithmic performance [38,39]. It is worth noting that, for a single execution of the MMAS algorithm, which is stochastic by nature, there is at least a 50% probability of obtaining a solution that is as good as or better than the median solution. By running the MMAS algorithm multiple times, whether sequentially or in parallel, it is possible to arbitrarily increase this probability of getting a solution at least as good as the median solution. For five executions, the probability becomes 96.88%, which is rather high, and for ten executions it reaches a very high 99.9% [39].
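If p ≥ 0.5 denotes the probability that a single run reaches the median solution quality or better, then for k independent runs
$$ P(\text{at least one run reaches the median or better}) = 1 - (1-p)^{k} \ge 1 - 0.5^{k}, $$
which gives 1 − 0.5^5 ≈ 96.88% for k = 5 and 1 − 0.5^10 ≈ 99.9% for k = 10, the values cited above.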
Although the stopping criterion for each experiment was limited to 10,000 iterations, in the case of the MMAS algorithm with some smaller problem instances this was obviously far more iterations than necessary. With some strategies, the MMAS found optimal solutions within much fewer iterations in all 101 repetitions, while with other strategies, this was achieved only after many more iterations. Therefore, after collecting the results, we determined for each problem instance the number of iterations after which we performed the analyses of the results, using the following method. We tracked the best median solution from 10,000 iterations backward until we reached the iteration at which this median solution (out of 101 samples) was reached by at least one of the 24 strategies. This iteration was rounded up to the nearest hundred, after which we analyzed the obtained solution quality. So, for example, when MMAS without local optimization was used for gr21, the analysis was performed after 100 iterations, but in the case of u574, the analysis was performed only after 9800 iterations, as presented in Table 1. The same method was used for ATSP, and the iterations after which the analyses were performed are listed in Table 2, along with other data about the problem instances.
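A minimal C++ sketch of our reading of this selection procedure is given below; the container layout and names are illustrative, and medians[s][t] is assumed to hold the median best-so-far cost of strategy s after t iterations, precomputed from the 101 repetitions:

```cpp
#include <vector>
#include <algorithm>

// Returns the analysis iteration: the earliest iteration at which at least
// one strategy reaches the best median found at the final iteration,
// rounded up to the nearest hundred.
int analysisIteration(const std::vector<std::vector<double>>& medians) {
    const int T = static_cast<int>(medians[0].size()) - 1;   // e.g., 10,000 (index 0 unused)
    double bestMedian = medians[0][T];
    for (const auto& s : medians)
        bestMedian = std::min(bestMedian, s[T]);              // best median over all strategies
    int t = T;
    while (t > 1) {                                           // walk backward while still reached
        bool reachedEarlier = false;
        for (const auto& s : medians)
            if (s[t - 1] <= bestMedian) { reachedEarlier = true; break; }
        if (!reachedEarlier) break;
        --t;
    }
    return ((t + 99) / 100) * 100;                            // round up to the nearest hundred
}
```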
The median values out of 101 samples were calculated for each strategy and problem instance after the selected number of iterations, as presented in Table 1 and Table 2. The median results achieved by the four MMAS algorithms are presented in Figure 3, Figure 4, Figure 5 and Figure 6. These figures contain a combination of graphical and tabular forms. For each problem instance (first column), the size is given in the second column. The third column has the percentage deviation from the optimum for the median solution achieved by the best strategy, and the fourth column has the percentage deviation from the optimum for the median solution achieved by the worst strategy for that problem instance. These min. and max. deviation [%] values are calculated according to the following method. For each problem instance, we found the best and worst strategies according to their median solutions. Those median solutions, Mbest and Mworst, were divided by the optimal solution, 1 was subtracted from the resulting quotient, and the result was multiplied by 100. For example, in the case of MMAS without local optimization for TSP instance fl417, the algorithm with the 4-best strategy had a median solution of 12,055, and the algorithm with the 1/5-best strategy had a median solution of 12,972. Since the optimal solution is 11,861, the min. median deviation is 1.64% and the max. median deviation is 9.37%, as shown in Figure 3.
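In other words, for a problem instance with optimal cost f(s*), the reported values are
$$ \mathrm{dev}_{\min} = \left( \frac{M_{best}}{f(s^{*})} - 1 \right) \cdot 100\%, \qquad \mathrm{dev}_{\max} = \left( \frac{M_{worst}}{f(s^{*})} - 1 \right) \cdot 100\%, $$
so for fl417, (12,055/11,861 − 1) · 100 ≈ 1.64% and (12,972/11,861 − 1) · 100 ≈ 9.37%.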
For each problem instance, the best strategy was colored yellow, the worst strategy red, and all the other strategies with shades of color in between. The best strategy is also marked with the symbol “O” and the worst with the symbol “X”. Strategies that were the best within their own category are marked with the symbol “°”. In some cases, there is more than one best strategy; e.g., MMAS without local optimization achieves a median solution equal to the optimum on gr21 with all strategies, as shown in Figure 3.
Since κ-best and max-κ-best are intended to be adjustable and to be extended with the 1/λ-best strategy, we also performed analyses of combined strategies based on counting wins.
In addition, if, for some reason, it is not possible to adjust an appropriate strategy before the algorithm is used for solving some problem instance in practice, one possible approach would be to choose the strategy with the lowest average rank, as used in the Friedman test. Therefore, we also calculated the average rank for each strategy and performed the Friedman test and a post hoc procedure [40]. The null hypothesis of the Friedman test is that there is no difference in the performance of the MMAS algorithm for the tested pheromone reinforcement strategies. If the p-value is lower than the significance level α, which is usually set to 0.05, then the null hypothesis is rejected. When the null hypothesis is rejected, this means that the difference in results for some strategies is not due to statistical error. However, this does not mean that each strategy provided a statistically significant difference in comparison to every other strategy. To statistically compare the strategy with the best average rank with every other strategy, pairwise comparisons can be performed in a post hoc procedure, for which we used Holm’s and Hochberg’s procedures [40].
7.1. Results of the MAX-MIN Ant System without Local Optimization for TSP
Results of MMAS without local optimization, presented in Figure 3, show that choosing an appropriate strategy can make a significant difference in the performance of the algorithm. For some problem instances, the algorithm with the most suitable strategy was able to obtain a median solution equal to the optimum, while the most unsuitable strategy had a median solution that is a few percent worse than the optimum. This is, e.g., the case with hk48, lin105, brg180, ts225, and some others. There are also some cases where the most successful strategies have a median solution less than 1% worse than the optimum, but unsuccessful strategies have a median solution more than 10% worse than the optimum, such as in the cases of eil101, kroA200, lin318, pma343, pbm436, and some others. Thus, using an appropriate strategy is very important.
By looking at the colors, it is obvious that most red-colored strategies are of the 1/λ-best type, while most “X” marks are on the 1/5-best strategy. There is also a certain similarity between the coloring of κ-best and max-κ-best, and to some degree also ib-gb scheduling. This shows that, for some problem instances, the algorithm behaves similarly at similar levels of greediness.
To create an entire spectrum of adjustable strategies, we have proposed the κ-best, max-κ-best, and 1/λ-best strategies. In that respect, it is important to evaluate their capabilities in contrast to the classical iteration-best and global-best strategies and the occasionally used ib-gb scheduling. By counting exclusive wins, it is noticeable that iteration best has 5 out of 45 (11%) exclusive wins, global best has 0, the strategy 1/λ-best for λ = {5, 4, 3, 2} has 4 (9%), the strategy κ-best for κ = {2, 4, 8, 16, 32, 64, 128} has 10 (22%), the strategy max-κ-best for κ = {2, 4, 8, 16, 32, 64, 128} has 3 (7%), and ib-gb scheduling for combinations {5-1, 3-1, 2-1, 1-1} has 12 (27%) exclusive wins.
When the strategies are combined in some logical way so that they can be adjusted by a numerical parameter or a schedule, the results in terms of total wins (not exclusive wins) are summarized in Table 3. Any adjustable choice would be better than simply using the ib or gb strategies, but the combination of 1/λ-best and κ-best seems the most promising, followed by ib-gb scheduling with ib and gb included, and finally by the combination of 1/λ-best and max-κ-best.
If, for some reason, the strategy is not adjusted to a particular problem instance (although normally it should be), then the strategy with the best average rank might be used; in this case, this is 3-1-ib-gb, followed by 5-1-ib-gb, 2-best, 4-best, max-2-best, max-4-best, etc., with average ranks 6.8222, 6.9444, 7.0556, 7.0778, 7.5889, and 7.7111, respectively. The worst average rank, 23.2889, was achieved by 1/5-best. The Friedman statistic Q (distributed according to the chi-square distribution with 23 degrees of freedom) is 537.807778, which corresponds to a p-value of 4.67 × 10^−99, while the Iman and Davenport statistic T (distributed according to the F-distribution with 23 and 1012 degrees of freedom) is 47.594353, which corresponds to a p-value of 1.9 × 10^−143, so both p-values are much lower than the usual significance level of 0.05. This means that the differences in the results obtained by different strategies are statistically significant.
In the post hoc analysis, we used Holm’s and Hochberg’s procedures to pairwise compare the strategy that has the best average rank (3-1-ib-gb) with all the other strategies. Both post hoc procedures, with an overall significance level of 0.05, could not reject the null hypotheses for 5-1-ib-gb, 2-best, 4-best, max-2-best, max-4-best, 2-1-ib-gb, 8-best, 1-best, 1-1-ib-gb, and max-8-best, but they rejected the null hypotheses for all other strategies.
7.2. Results of the MAX-MIN Ant System with 2-opt Local Optimization for TSP
Figure 4 contains the results obtained by the MMAS with 2-opt local optimization that accepts the first improvement. These results show that using an appropriate strategy is important because, with the most successful strategy, the algorithm achieved for almost all instances a median solution that is less than 1% away from the optimum, whereas with an inappropriate strategy this distance can be more than 10%. As was the case with MMAS for TSP without local optimization, most of the red area is within the 1/λ-best strategy. There are also similarities between κ-best, max-κ-best, and ib-gb scheduling, which might be interpreted as similar behavior of the algorithm at similar levels of a strategy’s greediness. In this case, it can be observed that, for larger problem instances, the less greedy strategies cause poor performance.
When considering the number of exclusive wins, 1/λ-best and iteration best have zero exclusive wins, and global best has only one exclusive win. The κ-best strategy has the most exclusive wins, 16 out of 45 (36%), followed by ib-gb scheduling with 6 out of 45 (13%) and max-κ-best with 5 out of 45 (11%).
In the case of combined strategies, as shown in Table 4, it is obvious that adjustable strategies can improve the algorithmic performance in comparison with the classical approach of using only iteration best and global best. In this case, the most successful strategy was κ-best, followed by ib-gb scheduling and max-κ-best.
If the strategy is not adjusted to a problem instance (adjusting is highly recommended), then the 8-best strategy might be used as the strategy that achieved the best average rank in this group of experiments. After the 8-best strategy (average rank 8.4), in descending order follow max-4-best, max-8-best, 1-1-ib-gb, 4-best, 32-best, etc., with average ranks of 8.7, 8.9444, 9.1333, 9.3333, and 9.5333, respectively. The worst average rank, 22.3889, was achieved by 1/5-best. The Friedman statistic Q is 377.231778, which corresponds to a p-value of 8.47 × 10^−66, while the Iman and Davenport statistic T is 25.234114, which corresponds to a p-value of 1.5 × 10^−83, so both p-values are much lower than the usual significance level of 0.05.
In the post hoc analysis, Holm’s and Hochberg’s procedures, reaching identical decisions, compared the 8-best strategy against all the other strategies. The null hypotheses could not be rejected for max-4-best, max-8-best, 1-1-ib-gb, 4-best, 32-best, 64-best, global best, 128-best, max-64-best, max-16-best, max-32-best, 16-best, max-128-best, 2-1-ib-gb, 3-1-ib-gb, or 2-best, but they were rejected for all other strategies.
7.3. Results of the MAX-MIN Ant System without Local Optimization for ATSP
The results of MMAS for ATSP are presented in Figure 5. With the most successful strategies, the algorithm achieved for some instances a median solution that is optimal, but for other instances, it was up to 42.6% worse than optimal. For some instances, choosing the appropriate strategy is more important with respect to the distance from the optimum, while for others, it is not that important. For most instances, the worst strategy was of the 1/λ-best type, and generally, there is a similarity between the κ-best, max-κ-best, and ib-gb scheduling strategies with respect to the level of greediness and algorithmic performance.
When exclusive wins are counted, 1/λ-best with λ = {5, 4, 3, 2} has 13 out of 47 (28%), iteration best has 5 (11%), κ-best with κ = {2, 4, 8, 16, 32, 64, 128} has 8 (17%), max-κ-best also has 8 (17%), ib-gb scheduling has 6 (13%), and global best has 0 exclusive wins.
All dc* problem instances (dc112, dc126, …, dc932) have a similar pattern, where the algorithm gave the best performance with 1/2-best and iteration best (only for dc849 is this 8-best or max-8-best), while the greedier κ-best or max-κ-best strategies have rather bad performance.
When counting all wins, as shown in Table 5, the largest number of wins is achieved by the combination of 1/λ-best and κ-best and the combination of 1/λ-best and max-κ-best, both with 70.2%, followed by ib-gb scheduling (with ib and gb included), which has 38.3% of wins.
When the strategies are not adjusted for a particular problem instance, the strategy with the best average rank, in this case 5-1-ib-gb, might be used. The strategies with the best ranks are 5-1-ib-gb (7.6489), 2-best (7.8085), 3-1-ib-gb (8.1596), max-2-best (8.6277), 4-best (8.9255), 2-1-ib-gb (9.0319), etc. The worst average rank, 21.0957, belongs to the 1/5-best strategy. The Friedman statistic Q = 334.488085, which corresponds to a p-value of 4.62 × 10^−57, while the Iman and Davenport statistic T, distributed according to the F-distribution with 23 and 1058 degrees of freedom, is 20.611127, which corresponds to a p-value of 1.52 × 10^−69; both are much lower than 0.05.
In Holm’s and Hochberg’s post hoc procedures comparing 5-1-ib-gb against the other strategies, the null hypotheses could not be rejected for 2-best, 3-1-ib-gb, max-2-best, 4-best, 2-1-ib-gb, 1-best, 8-best, max-4-best, max-8-best, 1/2-best, or 1-1-ib-gb, but they were rejected for all other strategies.
7.4. Results of the MAX-MIN Ant System with 2.5-opt Local Optimization for ATSP
The results for ATSP obtained by MMAS with 2.5-opt local optimization are presented in Figure 6. For all but a few problem instances, with a suitable strategy, the algorithm obtained median solutions less than 1% away from the optimum and often exactly equal to the optimum. When the reinforcement strategy was not adequately matched to the characteristics of the problem instance, the median solutions for some instances were more than 10% worse than the optimum. The distribution of red color is more complex and not as predominantly reserved for the 1/λ-best strategy as with the other tested algorithms. There are also obvious similarities between κ-best, max-κ-best, and ib-gb scheduling with respect to the level of greediness and the corresponding algorithmic performance.
When it comes to exclusive wins, 1/λ-best with λ = {5, 4, 3, 2} has 11 out of 47 (23%), iteration best has 0, κ-best has 10 (21%), max-κ-best has 3 (6%), both with κ = {2, 4, 8, 16, 32, 64, 128}, ib-gb scheduling has 16 (34%), and global best has 0.
Table 6 presents the summarized results for combinations that cover a wider range of numerically controlled strategies. The best in terms of all wins was the combination of 1/λ-best and κ-best, followed by ib-gb scheduling (with ib and gb included) and the combination of 1/λ-best and max-κ-best. So, any of these combinations allowed better performance through adjustability than using only the iteration-best and global-best strategies.
For some instances, a lower level of greediness is preferred, while for others, it is the opposite. There are also some cases where only the 1/5-best and 1/4-best strategies allow the best performance of the algorithm.
The strategies with the best average ranks are 3-1-ib-gb (8.4468), 5-1-ib-gb (8.7766), 2-1-ib-gb (8.8191), 1-1-ib-gb (9.734), 4-best (10.1489), max-2-best (10.4149), etc. The worst average rank, 18.5426, belongs to the 1/5-best strategy. The Friedman statistic Q = 148.90383, which corresponds to a p-value of 2.04 × 10^−20, while the Iman and Davenport statistic T, distributed according to the F-distribution with 23 and 1058 degrees of freedom, is 7.348572, which corresponds to a p-value of 3.45 × 10^−22; both are much lower than 0.05.
Holm’s post hoc procedure, as well as Hochberg’s procedure, comparing 3-1-ib-gb against the other strategies, could not reject the null hypotheses for 5-1-ib-gb, 2-1-ib-gb, 1-1-ib-gb, 4-best, max-2-best, 2-best, 8-best, 1-best, 16-best, max-4-best, max-8-best, or 1/2-best, but they were rejected for all other strategies.
8. Discussion
A large number of experiments were carried out in this research to allow the analysis of the behavior of ant colony optimization algorithms (MMAS with and without local optimization) with respect to different pheromone trail reinforcement strategies that can be adjusted with numerical parameters. The experiments confirmed that, by using numerically adjustable strategies, it is possible to significantly improve algorithmic performance. Although some regularities between different strategies were observed, they are not so clear and simple as to allow recommending some predefined parameters and strategy that would be the best for all problem instances and all variants of the algorithm. There is also a similarity between κ-best, max-κ-best, and ib-gb scheduling, so it is possible to fine-tune an algorithm by using only one of these strategies, although in some cases κ-best and max-κ-best should be extended with 1/λ-best into one compound numerically controlled strategy with a lower level of greediness.
In our previous studies, we had some limited experience with the new adjustable strategies, which helped us, in some cases, achieve state-of-the-art results. However, there were no comprehensive analyses that would allow us to estimate the potential of adjustable strategies as a tool for improving ACO algorithms. The introduction of 1/λ-best was motivated by our observation that, in the case of the quadratic assignment problem, which does not use heuristic information and, thus, has much higher exploration at the start, strategies with low greediness often provide the best results. Because of this, we wanted to further extend the numerically adjustable strategies so that they could be even less greedy than iteration best. To our surprise, this research showed that 1/λ-best can have some success even with TSP, which has very efficient heuristic information. For MMAS without local optimization, 1/λ-best had some occasions of great success, in contrast to the classically used global-best strategy, which had 0 exclusive wins; however, in most cases, global best is safer for scenarios with limited parameter tuning. The 1/λ-best strategy is recommended only in combination, preferably with κ-best or alternatively with max-κ-best. The 1/λ-best strategy was shown to be rather important for ATSP with, and even more so without, local optimization. Only in the case of TSP, for which MMAS with good local optimization (2-opt) allows faster convergence toward very good solutions, does implementing the 1/λ-best strategy seem completely unnecessary.
Judging by the results of these experiments, it seems that a higher level of greediness is better for larger TSP problem instances, especially when local optimization is used, but for ATSP, the level of desirable greediness seems to be more connected to the group of problem instances, presumably with similar structure, than to the size of the problem. So, in the case of problems with good heuristic information, it might be helpful to implement the 1/λ-best and κ-best strategies and try κ-best first. If lower values of κ give better results, then smaller λ parameters might also be worth trying.
When considering the good potential of numerically adjustable strategies, it could be worth trying all of them for problems that are related to TSP and ATSP and have efficient heuristic information, such as SOP, VRPs, CaRS, CaRSP, etc.
It is highly advisable to adjust the pheromone reinforcement strategy to the particular problem instance that should be solved, but if, for some reason, this is not possible, then the strategy with the best average rank for similar problem instances and working conditions might be used. We used the Friedman test with Holm’s and Hochberg’s procedures to test the statistical significance of the differences among the achieved average ranks.
In our future research, in addition to trying out the adjustable strategies on some of the aforementioned problems, we plan to carry out a comprehensive study with ACO for a combinatorial problem that does not have useful heuristic information, where presumably a strategy with a low level of greediness might be a good fit for the overall balance between exploration and exploitation.