1. Introduction
Distributed computing platforms are becoming increasingly important as cost-effective alternatives to conventional high-performance computing platforms. Distributed systems continue to expand in size, heterogeneity, and diversity of network resources [1,2]. In such complicated platforms, workload management and load balancing become critical factors in keeping business activities afloat. Workload management has reached the top of business and management engineering research priorities [3,4]. One of the most complex concerns in real-time optimization problems is the robust management of a wide range of workload patterns. Robust management includes a resilient load balancing system [5]. Load balancing is an important element in distributed and parallel environments [6,7], as it is used to maximize resource utilization, avoid node overload, reduce response time, avoid network bottlenecks, and ensure system scalability. Dynamic load balancing remains a difficult global optimization problem because of the heterogeneity of the system structure, the Quality of Service (QoS) requirements of each application, and the administration of computational resources [8,9].
There are several algorithmic methods for optimizing load balancing problems [10,11,12]. These optimization algorithms are based on a metaheuristic (or stochastic) nature-inspired paradigm. Most metaheuristic algorithms generate random candidate solutions at each iteration to improve the chances of exploring the whole search space [13]. An initial feasible solution can be improved using mechanisms such as movement, mutation, exchange, and cooperative perception. The improvement process is repeated until a termination condition is met, and the best solution found is returned. Stochastic methods use a cost function (or fitness function) [14]. A set of agents moves through the solution space, and the position of an agent in the solution space represents a set of parameters of the cost function. Thus, the aim of the metaheuristic algorithm is to find the global minimum or global maximum of the cost function and to supply high-quality solutions within a feasible lead time [15].
Swarm-intelligence-based (SI-based) algorithms belong to the category of stochastic population-based bioinspired algorithms [16]. SI-based algorithms cover algorithms that are based on the behavior of different animal species, chemical processes, or other natural processes [17]. Several nature-inspired metaheuristic algorithms have been presented to handle optimization problems in different applications. Ant colony optimization (ACO) [18], the genetic algorithm (GA) [19], and particle swarm optimization (PSO) [20] are some examples of algorithms inspired by nature. ACO imitates the behavior of a colony of real foraging ants to find the most cost-effective path. The shortest or optimal path is found via stigmergy, an indirect communication mechanism in which pheromones steer agents toward promising solutions. ACO is designed for hard combinatorial optimization problems and for network applications such as routing and load balancing. ACO algorithms have been applied to problems such as the traveling salesman problem (TSP) [21], vehicle routing [22], sequential ordering [23], and scheduling [24]. GA is a search algorithm based on the concepts of natural selection and natural genetics (crossover and mutation). GA is employed in real-world optimization problems in, for example, manufacturing [25], warehouses [26], robotics [27], and automatic control [28]. PSO mimics the behavior of a flock of birds or a school of fish in search of food. In PSO, the particles of a swarm do not exchange material with one another. A particle is influenced by its current position, the best position in the swarm, and its velocity. Recently, PSO has been used to solve real-world engineering problems such as the design of a multilayered rectangular microstrip antenna using electromagnetic band-gap (EBG) structures [29] and the bandwidth improvement of an inverted-F antenna (IFA) [30]. Furthermore, PSO has been used to design new engineering components, for example, artificial magnetic conductors [31].
Apart from bioinspired algorithms, multiobjective optimization (MOO) is used to address hard combinatorial optimization problems and network applications. MOO is also known as Pareto optimization, and it is an area of multiple-criteria decision analysis (MCDA). MOO is used for optimization problems that include more than one conflicting objective function to be optimized simultaneously [32]. Multiobjective problems, such as routing in communication networks [33,34], compressor design [35,36], engineering [37,38], and logistics [39], require a number of objective functions to be optimized simultaneously [40].
Solving the task scheduling problem is essential to increase resource utilization and sharing rate in distributed systems. Task scheduling has become more difficult and complex in parallel and distributed environments due to resource distribution, heterogeneity, and autonomy. Different research projects have been carried out to improve the performance of ant colony optimization, for example, a novel pheromone update strategy [41,42], a modified ant colony optimization with improved tour construction and pheromone updating strategies [43], an AntNet with reward-penalty reinforcement learning [44], and an AntNet routing technique for real-world applications [45]. A further improvement on the original form of the ant system (AS) is called the elitist strategy (elitist-AS) [46]. The concept is to provide strong reinforcement to the edges of the best path found since the beginning of the algorithm. Using the elitist method with an appropriate number of elitist ants enables AS to locate better routes earlier in the run. Although existing ACO research can improve the efficiency of task scheduling, it does not contribute significantly to reducing the overall load imbalance of the system. In this paper, a novel ACO-NN approach is proposed to manage task assignment in a load balancing system. The developed algorithm (ACO-NN) makes the following main contributions:
- (i) Take into consideration the system heterogeneity and the dynamic task scheduling;
- (ii) Carry out experiments on different computing data sets to investigate load balancing performance;
- (iii) Minimize the total cost of the computing system;
- (iv) Validate the obtained results against the results of previous research using makespan and the optimal solution as performance measures.
The remainder of this paper is organized as follows. Section 2 introduces the hybrid of the nearest neighbor and ACO algorithms. In Section 3, various experiments are presented, and their outcomes are analyzed. Section 4 discusses the comparison tests. Finally, the conclusion is provided in Section 5.
2. Methods
In this research, a new load balancing algorithm based on ant-inspired behavior, called hybrid nearest-neighbor ant colony optimization (ACO-NN), is proposed. It is described in Algorithm 1. ACO-NN contains three strategies: the approval procedure, the nearest neighbor operator, and the ACO operator.
Besides the heuristic and fitness functions of load balancing, the approval procedure controls the management of workload. ACO-NN can minimize the amount of processing time used in scheduling.
Algorithm 1. Pseudocode of nearest-neighbor ant colony optimization.
 1: Initialization;
 2: Every task comes;
 3: …;
 4: while … do
 5:     Apply NN operator;
 6:     Store the current value of optimal path;
 7:     Update routing table;
 8:     Sort the ants for search using the routing table;
 9:     Path construction;
10:     Update pheromone table;
11:     if all ants complete their tour then
12:         Evaluate updated pheromone table;
13:         Select the best path;
14:         if … then
15:             Choose the optimal node based on pheromone table;
16:             Assign task to optimal node;
17:         else
18:             Find next optimal solution;
19:         end if
20:     end if
21: end while
2.1. Nearest Neighbor (NN) with Ant Colony Optimization (ACO)
To minimize computation time and improve the scheduling result for a system resource set $N$ and a finite set of tasks $T$, it is important to choose the right job scheduling order through an approval procedure. The local search phase, carried out by the nearest neighbor operator, constructs the problem graph and determines the starting node of each ant. The proposed method assumes that the job scheduling mechanism is analogous to the ant foraging process. The pheromone is updated according to the hardware performance parameters of the system resource nodes and the average load gap of the system, and the estimated time to execute all tasks is kept to a minimum. The execution time is the amount of time required by a task to complete its execution. We define $ET_i$ to be the execution time of task $i$. The parameters $AT_i$ and $CT_i$ represent, respectively, the arrival time of task $i$ for processing and the completion time of task $i$. The execution time of a task $i$ is defined by Equation (1):

$$ET_i = CT_i - AT_i \qquad (1)$$

The estimated total execution time of $n$ tasks is given by Equation (2):

$$ET_{total} = \sum_{i=1}^{n} ET_i \qquad (2)$$
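As a minimal illustration of the two quantities just defined, the following Python sketch computes the execution time of a task as its completion time minus its arrival time and sums these values over a list of tasks. The `Task` class, the function names, and the sample timestamps are hypothetical; they are not part of the MATLAB implementation used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A task with its arrival and completion timestamps (same time unit)."""
    arrival: float
    completion: float


def execution_time(task: Task) -> float:
    """Execution time of one task: completion time minus arrival time."""
    return task.completion - task.arrival


def total_execution_time(tasks: list[Task]) -> float:
    """Estimated total execution time: sum of the per-task execution times."""
    return sum(execution_time(t) for t in tasks)


# Illustrative values only, not taken from the experiments.
tasks = [Task(arrival=0.0, completion=4.0), Task(arrival=1.0, completion=3.5)]
total = total_execution_time(tasks)
```

A scheduler that minimizes this total would compare the value across candidate task-to-node assignments.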
The heuristic function of the ant colony optimization algorithm is calculated using the estimated completion time. Moreover, to assign a task, a resource node with a strong pheromone concentration (high efficiency and low load), maximum remaining memory, and a short completion time is chosen.
The aim of the proposed task scheduling method, nearest-neighbor ant colony optimization (ACO-NN), is to assign each task of the set $T$ to the resource set $N$ while maintaining load balance and increasing the efficiency of the system’s task scheduling. This hybrid algorithm combines ACO with NN to obtain an efficient and feasible optimization method. Figure 1 displays the flowchart of the specific steps of ACO-NN in distributed systems based on load balancing, while the generic pseudocode of the proposed method is presented in Algorithm 1.
2.2. Preapprove Procedure
Preapprove is the first stage of ACO-NN and runs each time tasks arrive. In this stage, the algorithm checks the available memory of each computing node and determines the maximum amount of memory left on any single node. If a request requires more memory than this maximum, the request is denied prior to scheduling. The preapprove step reduces the size of the ACO search space and, consequently, the computing time of the ACO scheduling. Consider two service nodes in the system, $N_1$ and $N_2$, whose residual memory is 2 and 3 GB, respectively. Suppose that two requests, $T_1$ and $T_2$, reach the system demanding 1 and 5 GB of memory. Whenever a new request comes in, preapprove evaluates the maximum available memory on a single node, which here is 3 GB, and then checks whether the new request can be approved. $T_1$ demands 1 GB, which is less than the remaining memory of a single service node, so $T_1$ can be served by any node. $T_2$ demands 5 GB, which is more than the remaining memory of any single node. As a consequence, $T_2$ is not approved because no available resource can serve it. A rejected request is held in a queue buffer until one of the resources can handle it. Requests in the queue have priority over new incoming requests, provided that an available resource can serve the pending workload.
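The preapprove check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node labels `N1` and `N2`, the function names, and the use of a `deque` as the queue buffer are hypothetical, and the sketch does not deduct memory once a request is approved, since that belongs to the scheduling stage.

```python
from collections import deque


def preapprove(request_gb, free_memory_gb):
    """Return True if at least one node has enough free memory for the request."""
    return request_gb <= max(free_memory_gb.values())


def admit(incoming, free_memory_gb, pending):
    """Check queued requests first, then new ones.

    Returns (approved_names, still_pending_queue).
    """
    approved = []
    still_pending = deque()
    # Requests already in the buffer take priority over new arrivals.
    for name, size in list(pending) + list(incoming):
        if preapprove(size, free_memory_gb):
            approved.append(name)
        else:
            still_pending.append((name, size))
    return approved, still_pending


# The worked example from the text: two nodes with 2 GB and 3 GB left,
# and two requests asking for 1 GB and 5 GB.
free_memory_gb = {"N1": 2.0, "N2": 3.0}
incoming = [("T1", 1.0), ("T2", 5.0)]
approved, pending = admit(incoming, free_memory_gb, deque())
```

With these values, the 1 GB request is approved and the 5 GB request stays in the buffer, because the largest free block on a single node is 3 GB.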
2.3. Nearest Neighbor Operator
The nearest neighbor operator is a local search strategy used for pattern recognition in the distributed system and for constructing the routing table for all ants. It is a simple and straightforward heuristic that generates a short tour using the Euclidean distance.
The Euclidean distance $d(x, y)$ between two nodes $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$ of $m$ dimensions is obtained by Equation (3):

$$d(x, y) = \sqrt{\sum_{j=1}^{m} (x_j - y_j)^2} \qquad (3)$$
The nearest neighbor operator proceeds as follows:
- step 1: Select a random node.
- step 2: Find the nearest unvisited node using a distance calculation and move to it.
- step 3: If unvisited nodes exist, repeat step 2.
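The three steps above can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used in the experiments; the function names, the tie-breaking rule (lowest node index), and the sample coordinates are assumptions made for the example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_neighbor_tour(coords, start=None, rng=random):
    """Build a tour by repeatedly moving to the nearest unvisited node."""
    n = len(coords)
    current = rng.randrange(n) if start is None else start  # step 1
    tour = [current]
    unvisited = set(range(n)) - {current}
    while unvisited:  # step 3: repeat while unvisited nodes exist
        # step 2: nearest unvisited node (ties broken by lowest index)
        nxt = min(unvisited, key=lambda j: (euclidean(coords[current], coords[j]), j))
        unvisited.remove(nxt)
        tour.append(nxt)
        current = nxt
    return tour


def tour_length(coords, tour):
    """Length of the closed tour (it returns to the starting node)."""
    return sum(
        euclidean(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Four made-up nodes placed on a line.
coords = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (6.0, 0.0)]
tour = nearest_neighbor_tour(coords, start=0)
```

Passing `start` fixes the first node, which makes runs reproducible; leaving it as `None` follows step 1 and picks the node at random.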
2.4. ACO Operator
The ACO operator is used to build possible solutions for all ants. Each ant chooses the next resource node according to the probability matrix $P_{xy}^{k}$, as shown in Equation (4). The transition probability of ant $k$ from resource node $x$ to resource node $y$ is used to select the next node to which a task is allocated.

$$P_{xy}^{k} = \begin{cases} \dfrac{[\tau_{xy}]^{\alpha}\,[\eta_{xy}]^{\beta}}{\sum_{s \in T} [\tau_{xs}]^{\alpha}\,[\eta_{xs}]^{\beta}}, & y \in T \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

In Equation (4), $\tau_{xy}$ is the pheromone value for the transition from resource node $x$ to resource node $y$. $\eta_{xy}$ is a heuristic function that represents the a priori desirability of the move from resource node $x$ to resource node $y$. $\alpha$ is a positive parameter that controls the influence of the pheromone concentration relative to the heuristic information in the scheduling order; $\beta$ is the expected heuristic factor, which sets the relative weight of the heuristic information in the sequence of selections made by an ant. $T$ denotes the set of unscheduled requests that remain. An ant can choose a task from the set $T$ to execute in the next move.
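The selection rule described above can be illustrated with a short Python sketch of the standard ACO transition rule: each candidate is weighted by its pheromone value raised to one exponent times its heuristic value raised to another, and the weights are normalized over the remaining candidates. The function names, the default exponents `alpha=1.0` and `beta=2.0`, and the sample matrices are assumptions for the example; the parameter values actually used are those listed in Table 2.

```python
import random


def transition_probabilities(x, candidates, tau, eta, alpha=1.0, beta=2.0):
    """Probability of moving from node x to each candidate node y.

    Each weight is tau[x][y]**alpha * eta[x][y]**beta, normalized over the
    candidates that are still allowed.
    """
    weights = {y: (tau[x][y] ** alpha) * (eta[x][y] ** beta) for y in candidates}
    total = sum(weights.values())
    if total == 0:
        # Degenerate case: fall back to a uniform choice.
        return {y: 1.0 / len(candidates) for y in candidates}
    return {y: w / total for y, w in weights.items()}


def choose_next(x, candidates, tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Roulette-wheel selection of the next node."""
    probs = transition_probabilities(x, candidates, tau, eta, alpha, beta)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[y] for y in nodes], k=1)[0]


# Made-up pheromone and heuristic values for a 3-node system.
tau = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
eta = [[0.0, 0.5, 1.0], [0.5, 0.0, 0.5], [1.0, 0.5, 0.0]]
probs = transition_probabilities(0, [1, 2], tau, eta)
```

With equal pheromone on both arcs, the node with the larger heuristic value receives the larger share of the probability, so the pheromone only tips the balance once trails start to differ.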
According to Equation (4), the two main factors influencing the choice of resource nodes are $\tau_{xy}$ and $\eta_{xy}$. Improving these two factors is the focus of the proposed algorithm. The heuristic function $\eta_{xy}$ is described by Equation (5):

$$\eta_{xy} = \omega_1 E_x^{cpu} + \omega_2 E_x^{mem} + \omega_3 E_x^{disk} \qquad (5)$$

where $\omega_1$, $\omega_2$, and $\omega_3$ are the effect weights of CPU utilization, memory, and disk utilization, respectively; $E_x^{cpu}$, $E_x^{mem}$, and $E_x^{disk}$ indicate the efficiency of resource node $x$ with respect to these three resources, as defined by Equations (6) to (8), where $U_x^{cpu}$ is the CPU utilization of node $x$, and $M_x$ and $U_x^{disk}$ represent the remaining memory amount and the disk utilization of node $x$, respectively.
In ACO, the heuristic function provides local information, which implies that each ant makes its own decision when it comes to path selection. As a result, each ant chooses its path based on the available resources on each server. In terms of CPU and disk usage, lower utilization means higher resource availability, as shown in Equations (6) and (8). From the standpoint of memory, more remaining memory on the server leads to better efficiency, as shown in Equation (7). The sum of these three values represents the average remaining resource, as indicated in Equation (5).
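The following Python sketch shows one way such a heuristic could be computed. It is an illustration under explicit assumptions, not the formulas of this paper: CPU and disk efficiency are taken as one minus the utilization, memory efficiency as the free fraction of total memory, and the three weights default to equal values. The function and parameter names are hypothetical.

```python
def node_efficiencies(cpu_util, mem_free, mem_total, disk_util):
    """Per-resource efficiency of one node, each in [0, 1].

    ASSUMPTION: CPU and disk efficiency are (1 - utilization) and memory
    efficiency is the free fraction; the paper may normalize differently.
    """
    return (1.0 - cpu_util, mem_free / mem_total, 1.0 - disk_util)


def heuristic(cpu_util, mem_free, mem_total, disk_util, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three efficiencies (the weights are hypothetical)."""
    e_cpu, e_mem, e_disk = node_efficiencies(cpu_util, mem_free, mem_total, disk_util)
    w_cpu, w_mem, w_disk = weights
    return w_cpu * e_cpu + w_mem * e_mem + w_disk * e_disk


# Illustrative nodes: a lightly loaded one and a heavily loaded one.
idle = heuristic(cpu_util=0.2, mem_free=6.0, mem_total=8.0, disk_util=0.1)
busy = heuristic(cpu_util=0.9, mem_free=1.0, mem_total=8.0, disk_util=0.7)
```

Under these assumptions the lightly loaded node scores higher, which is the behavior the text describes: lower CPU and disk utilization and more remaining memory make a node more attractive to an ant.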
2.5. Global Pheromone Update Operator
The proposed ACO-NN method updates the pheromone according to the load balancing value and the hardware performance of the resource nodes. Ants look up the shortest path in the neighborhood according to the current algorithm iteration. Pheromones rise exclusively on the routes that correspond to the best solution of the current iteration, whereas pheromones on the remaining routes decrease through the evaporation mechanism. The pheromone trail update follows the ant-cycle scheme, in which the ants update the pheromone after all ants have built their tours. The global pheromone update is performed using Equation (9):

$$\tau_{xy} \leftarrow (1 - \rho)\,\tau_{xy} + \sum_{k} \Delta\tau_{xy}^{k} \qquad (9)$$

where $\rho$ is the global pheromone decay rate, $\rho \in (0, 1)$. Pheromone trail evaporation keeps earlier bad decisions from being reinforced. $\Delta\tau_{xy}^{k}$ represents the amount of pheromone deposited by ant $k$ on arc $(x, y)$. Generally, $\Delta\tau_{xy}^{k}$ is defined by Equation (10):

$$\Delta\tau_{xy}^{k} = \begin{cases} Q / L_k, & \text{if ant } k \text{ traverses arc } (x, y) \\ 0, & \text{otherwise} \end{cases} \qquad (10)$$

where $Q$ is the quantity of pheromone to be distributed along the route and $L_k$ is the length of the path traveled by ant $k$.
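A compact Python sketch of an ant-cycle update of this kind is given below. It assumes the common form in which every trail is first multiplied by one minus the decay rate and each ant then deposits a fixed pheromone quantity divided by its tour length on the arcs it used; the function name, the parameter names `rho` and `q`, and the sample values are hypothetical.

```python
def update_pheromone(tau, tours, lengths, rho=0.1, q=1.0):
    """Ant-cycle update: evaporate every trail, then deposit q / L_k on each
    arc of every supplied tour. Passing only the iteration-best tour gives a
    best-only reinforcement.
    """
    n = len(tau)
    # Evaporation on all arcs.
    new_tau = [[(1.0 - rho) * tau[x][y] for y in range(n)] for x in range(n)]
    # Deposit along each closed tour (directed arcs, including the return arc).
    for tour, length in zip(tours, lengths):
        deposit = q / length
        for i in range(len(tour)):
            x, y = tour[i], tour[(i + 1) % len(tour)]
            new_tau[x][y] += deposit
    return new_tau


# Made-up 3-node example with a single tour of length 10.
tau = [[1.0] * 3 for _ in range(3)]
tau = update_pheromone(tau, tours=[[0, 1, 2]], lengths=[10.0], rho=0.5, q=1.0)
```

The sketch treats arcs as directed; for a symmetric problem the same deposit would also be added to the reverse arc.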
3. Results
The proposed approach was developed and implemented in MATLAB R2018a, which was used for numerical computing. In this work, each of the developed load balancing algorithms was implemented with the aid of the MATLAB toolbox, which allows matrix manipulation and the plotting of functions. The algorithms were run on the following computer configuration: Intel Core i5 CPU at 2.20 GHz, 12 GB of RAM, and Windows 10.
ACO-NN is a heuristic algorithm; its performance is influenced by its parameter values. Table 2 shows the parameters used by the proposed algorithm.
The layout of each experiment consists mainly of the total number of resource nodes, where the scheduler assigns tasks to the available computing nodes according to system measurements and the optimal tour cost between nodes.
3.1. Experiment 1: Comparison of ACO-NN with GA and SA Approaches
The Bayg29, Eil76, Gr120, Pcb442, and Gr666 problems are used to test the performance of ACO-NN. The obtained results are compared with those of GA and SA. Figure 2 displays the convergence for a small-scale instance, using Bayg29 as an example. ACO-NN has the best convergence rate, while GA has the worst.
As can be seen from Figure 3, ACO-NN obtains the optimal solution with the best convergence rate for the small-scale instance Eil76. SA has the worst performance among the optimization algorithms.
According to Figure 4, ACO-NN has better convergence performance than GA and SA over 10 experimental runs. For the medium-scale instance Gr120, SA has the worst performance.
As illustrated in Figure 5 and Figure 6, ACO-NN has the minimal tour cost for the five data sets as the number of runs increases. The performance of ACO-NN and GA is very similar for the large-scale instances Pcb442 and Gr666.
3.2. Experiment 2: Comparison of ACO-NN with the GRASP Approach
To verify the effectiveness of ACO-NN on the data sets Eil51, St70, and KroC100, we compare the proposed algorithm with GRASP.
3.2.1. Results of Optimal Solutions vs. Number of Runs
The comparison of the optimal solutions of ACO-NN and GRASP for two data sets as the number of runs increases is shown in Figure 7 and Figure 8. The simulation results demonstrate that the proposed algorithm has the lowest optimal-solution values over 10 runs; ACO-NN is superior to GRASP.
3.2.2. Results of Response Time vs. Number of Runs
The comparison of the response time of ACO-NN and GRASP for two data sets as the number of runs increases is shown in Figure 9 and Figure 10. The simulation results show that the proposed algorithm has the minimal response time over 10 runs and is superior to GRASP.
3.3. Experiment 3: Comparison of ACO-NN with GA and GRASP Approaches
ACO-NN, GA, and GRASP are compared in terms of two factors, tour cost and response time, as the number of runs increases; the results are shown in Figure 11 and Figure 12.
According to Figure 11, the simulation results show that the proposed algorithm has the lowest optimal-solution values over 10 runs for the data set KroC100 from the perspective of tour cost.
As seen in Figure 12, the response times of ACO-NN and GA are similar, while GRASP provides the worst result for the instance KroC100.
4. Discussion
Several tests were carried out to evaluate the performance of the proposed method. The results obtained by the nearest-neighbor ant colony optimization (ACO-NN) algorithm are compared to renowned metaheuristic algorithms such as artificial bee colony (ABC), genetic algorithm (GA), simulated annealing (SA), ant colony optimization (ACO), camel herd algorithm (CHA), black hole (BH), greedy randomized adaptive search procedure (GRASP), particle swarm optimization (PSO), traveling salesman problem based on simulated annealing and gene expression programming (TSP-SAGEP), simulated-annealing-based symbiotic organisms search (SOS-SA), multioffspring genetic algorithm (MO-GA), and discrete tree-seed algorithm (DTSA) with their variants (DTSA0, DTSAI, DTSAII).
4.1. Test 1
In this experiment, three state-of-the-art ACO-based algorithms developed in recent years were used to assess the performance of ACO-NN. The modified ant system was proposed by Yan et al. in 2017 [49]; it integrates adaptive tour construction and pheromone updating techniques into the standard ant system. The modified ACO (MACO) with improved tour construction and pheromone updating strategies was proposed by Gao in 2021 [43]. The hybrid elitist-ant system (Elitist-AS) with an external memory structure [46] gives strong reinforcement to the arcs of the best tour found since the beginning of the algorithm. The computing results for the instance Gr666 are summarized in Table 3.
As demonstrated in Table 3, ACO-NN proves its robustness through its low standard deviation (SD) value and its optimal solution. The comparison of ACO-NN with Elitist-AS is shown in Table 4. ACO-NN is faster than Elitist-AS for the instances Eil76, Ch150, and D198. From the perspective of optimal solution value, Elitist-AS performed better than ACO-NN.
4.2. Test 2
To demonstrate the benefits of the proposed algorithm, we compared ACO-NN to ACO [50], PSO [50], GA [50], BH [50], DTSA [50], and CHA [51] for the instance Eil76 in terms of the mean of the best solutions, the rate of difference (R-mean), and the standard deviation. The results are reported in Table 5.
According to Table 5, the ACO-NN algorithm is superior in computational results to the other algorithms. In Test 2, the rate of difference was used to assess the merits and disadvantages of the experimental outcomes of ACO-NN and the other methods. Equation (11) represents the rate of difference:

$$R_{mean} = \frac{S_{mean} - S_{opt}}{S_{opt}} \times 100\% \qquad (11)$$

where $R_{mean}$ defines the rate of difference, $S_{opt}$ is the best-known optimal solution, and $S_{mean}$ is the mean of the best solutions obtained by ACO-NN.
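For illustration, the rate of difference can be computed as the percentage gap between the mean of the best solutions found over several runs and the known optimum. The Python sketch below assumes this relative-gap form; the function name and the five run values are hypothetical and are not results from Table 5, while 538 is the published TSPLIB optimum for eil76.

```python
from statistics import mean


def rate_of_difference(best_per_run, known_optimum):
    """Percentage gap between the mean of the per-run best solutions
    and the known optimal solution."""
    return (mean(best_per_run) - known_optimum) / known_optimum * 100.0


# Hypothetical tour costs from five runs on eil76 (TSPLIB optimum: 538).
runs = [540.0, 545.0, 538.0, 550.0, 542.0]
r_mean = rate_of_difference(runs, 538.0)
```

A value of zero would mean that every run reached the known optimum; smaller values indicate a method that stays closer to it on average.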
4.3. Test 3
In this experiment, we compared ACO-NN to ACO [50], ABC [50], DTSA [50], and CHA [51] for the instance Eil101. As shown in Table 6, the ACO-NN method achieves good results, ranking third, and it obtains a better standard deviation.
4.4. Test 4
In this test, ACO-NN was compared with other optimization algorithms for the instance KroA100. As illustrated in Table 7, ACO-NN performed much better than the other approaches in terms of the optimal solution and the rate of difference. ACO [50] has the lowest standard deviation (SD) value, which indicates that it is a reliable algorithm. ACO-NN has the second-lowest SD, which supports its robustness.
4.5. Test 5
ACO-NN was compared with SA [50], DTSA0, DTSAI, and DTSA [50] for the instance KroC100. According to Table 8, our approach achieved better results than the other approaches.
As seen from Figure 13, ACO-NN is a better scheduler than TSA, DTSA0, DTSAI, and DTSA for the instance KroC100 from the perspective of rank values.
4.6. Test 6
In this experiment, we compare ACO-NN to TSP-SAGEP [52], SOS-SA [53], and MO-GA [54]. For the instances Gr120 and Gr666, our method achieves the best results, while it provides an acceptable outcome with the best execution time for the instance Pcb442. Table 9 displays the outcomes for the different data sets.
As seen from Figure 14, ACO-NN is the leading scheduler in comparison with TSP-SAGEP, SOS-SA, and MO-GA for the instances Gr120 and Pcb442. For the instance Gr666, ACO-NN achieves better results than SOS-SA and MO-GA.