1. Introduction
In the era of the Internet of Things (IoT) [1,2,3,4], the surge in the number of mobile devices and network traffic has created a high demand for data processing capabilities and response speeds. Mobile Edge Computing (MEC) technology effectively enhances the operational efficiency of smart devices by providing powerful computing resources at the network's edge, particularly in delay-sensitive applications such as Augmented Reality (AR), Virtual Reality (VR), and autonomous driving [5]. By offloading complex computational tasks to MEC servers [6,7,8,9,10,11], resource-constrained mobile devices (MDs) can significantly alleviate the pressure experienced when running high-demand applications and achieve a notable leap in performance. However, the limited battery capacity of mobile devices has become a bottleneck that restricts their further development [12]. This limitation is especially evident for devices that require continuous operation over long periods and are not easily recharged, such as in remote areas or emergency situations, where insufficient battery life can severely impact the functionality and reliability of the devices. Therefore, despite the significant advantages of MEC in enhancing network performance, the limitation of battery endurance remains a key issue that urgently needs to be addressed.
In addition to battery constraints, reducing the energy consumption of IoT devices during the data offloading process is equally important. The Internet of Things comprises trillions of tiny smart sensors that face significant limitations in computational capability and battery-supplied energy. Advances in wireless energy harvesting technologies, including renewable energy harvesting and wireless power transfer (WPT) [13], can alleviate the challenges previously posed by battery capacity limitations. Renewable energy sources such as solar, wind, and ocean energy can provide power to some extent, but they are significantly influenced by natural conditions such as weather and climate [14]. To address this issue, green wireless charging technology has emerged. This technology can deliver stable energy to devices through radio-frequency signals and store it in the batteries of IoT nodes for future use, extending battery life [15,16]. To ensure that nodes do not fail due to energy depletion, green wireless charging adheres to the principle of energy neutrality [17], which ensures that the energy consumed in any operation never exceeds the energy collected. Green Wireless-Powered Mobile Edge Computing (WPMEC) combines the strengths of WPT and MEC, enhancing devices' computational capabilities and energy self-sufficiency. In the upcoming 6G networks, green WPMEC will provide IoT devices with quick response and real-time experiences [18], while reducing operational costs and extending the lifespan of devices.
However, WPMEC networks face the challenge of the dual far–near effect [19] caused by positional differences, which has prompted the development of edge collaborative networks [20,21,22] to optimize application offloading performance. These networks introduce a user cooperation (UC) mechanism in which nearby users assist distant users in accelerating the processing of computational tasks while still effectively offloading their own tasks. This collaborative approach leverages the superior channel conditions of nearby users to harvest more energy during the WPT phase, which not only addresses the unfairness caused by geographical location but also enhances the efficiency of energy utilization. The dense deployment of smart IoT devices further creates opportunities to exploit the unused computing resources and harvested wireless energy of idle devices. By assisting in completing the computational tasks of distant users, these devices contribute to improving the overall computational performance of the WPMEC network.
Recent studies have demonstrated the effectiveness of UC. For example, References [23,24] effectively addressed the dual far–near effect in WPMEC networks through UC. The D2D communication in Reference [25] and the incentive mechanism in Reference [26] are both designed to facilitate resource sharing and collaborative offloading. In References [20,24,27,28], the authors focused on the basic three-node WPMEC model, in which a Far User (FU) is allowed to offload computational input data to a Near User (NU). In Reference [29], researchers designed a Non-Orthogonal Multiple Access (NOMA)-based computation offloading scheme aimed at enhancing the performance of multi-user MEC systems. Google has also developed federated learning technology, which enables multiple devices to collaborate on machine learning tasks. Nevertheless, these studies are often based on the assumption that future information is deterministic or predictable, and fail to fully account for the dynamic changes of the network environment, which may affect the efficiency and success rate of task offloading and processing.
This paper investigates the basic three-node green WPMEC network shown in Figure 1, focusing on the use of collaborative communication technology to accomplish computation-intensive and delay-sensitive tasks powered by the Hybrid Access Point (HAP). Our goal is to maximize the network's data processing capability in a real-time dynamic offloading system, taking into account the randomness of data generation and the high dynamics of wireless channel conditions. The challenges we face include the unpredictability of task arrivals and the dynamics of channel conditions, as well as the coupling of variables in resource allocation, which makes traditional convex optimization methods inapplicable. To tackle these challenges, we have designed an efficient dynamic task offloading algorithm, the User-Assisted Dynamic Resource Allocation Algorithm (UADRA). This algorithm employs Lyapunov optimization techniques to transform the problem into a simplified form that relies only on current information, and performs dynamic resource allocation in each time slot to enhance the network's data processing capability. Our primary contributions are summarized as follows:
We propose a long-term computation rate maximization model for green sustainable WPT-MEC networks. Our model extends previous works [28,30] to address the double near–far effect problem, while introducing an incentive mechanism grounded in data-weight assignment for near and far nodes to improve data transmission efficiency.
By applying the stochastic network optimization technique, the variable substitution method, and convex optimization theory, the multi-stage stochastic problem is transformed into a deterministic convex problem for each time slot, which can then be solved efficiently. Our proposed UADRA algorithm works efficiently without relying on prior system information.
We evaluate the proposed algorithm's performance under various system parameter configurations through extensive simulations. Simulation results show that our algorithm outperforms benchmark methods, enhancing overall performance by up to 4% while ensuring system queue stability.
The remainder of this paper is organized as follows. In Section 2, we present the system model of the user-assisted green WPMEC network and formulate the Multi-Stage Stochastic Optimization (MSSO) problem. In Section 3, we utilize the Lyapunov optimization approach to tackle the problem, putting forward an effective dynamic offloading algorithm with an accompanying theoretical analysis of its performance. Section 4 evaluates the efficacy of the proposed algorithm via simulation outcomes. Finally, Section 5 concludes the paper.
1.1. Related Work
The integration of WPT technology with MEC networks provides an effective solution for IoT devices, enhancing their energy and computing capabilities with controllable power supply and low-latency services. Recent research has extensively explored the potential of these wirelessly powered MEC networks. For instance, in [31], researchers optimized charging time and data offloading rates for WPT-MEC IoT sensor networks to improve computational rates in various scenarios. Furthermore, the authors in [32] investigated a NOMA-assisted WPT-MEC network with a nonlinear energy harvesting (EH) model, successfully enhancing the system's Computational Energy Efficiency (CEE) by fine-tuning key parameters within the network. To meet the energy consumption requirements of devices, the authors in [33] proposed a Particle Swarm Optimization (PSO)-based algorithm whose goal was to reduce the latency of devices processing computational data streams by jointly optimizing charging and offloading strategies. Additionally, in [34], the authors focused on the computational latency issue in WPT-MEC networks and found suitable offloading-ratio strategies to achieve synchronized latency for all wireless devices (WDs), effectively reducing the duration of the overall computational task.
To tackle the dual far–near effect, researchers have begun to focus on user-assisted WPMEC networks and have confirmed their effectiveness in enhancing the computing performance of distant users. Specifically, in [35], the study analyzed a three-node system composed of a distant user, a nearby user, and the base station within a user-assisted MEC-NOMA network model, addressing the optimization problem of joint transmission time and power allocation for users. Furthermore, References [36,37] respectively explored joint computing and communication collaboration schemes and the application of Device-to-Device (D2D) communication in MEC. The method proposed in [36] aims to maximize the total computing rate of the network with the assistance of nearby users, while [37] focuses on minimizing the overall network response delay and energy consumption through joint multi-user collaborative partial offloading, transmission scheduling, and computing allocation. Reference [16] extended this research from a single collaboration pair to multiple collaboration pairs, proposing a scheme to minimize the total energy consumption of the AP.
In user-assisted networks, online collaborative offloading methods, which are highly adaptable and can promptly respond to changes in task arrivals, have garnered significant attention from the research community. For instance, in [38], to address the randomness of energy and data arrivals, a Lyapunov optimization-based method was proposed to maximize the long-term system throughput. Furthermore, in [39], the authors proposed a Lyapunov-based Profit Maximization (LBPM) task offloading algorithm in the context of the Internet of Vehicles (IoV), which takes maximizing the time-averaged profit as its optimization goal. Additionally, in [40], for MEC applications in the industrial IoT, a Lyapunov-based privacy-aware framework was introduced that not only addressed privacy and security issues but also achieved optimization gains by reducing energy consumption. In [41], focusing on a multi-device single-MEC system, the energy-saving task offloading problem was formulated as a time-averaged energy minimization problem subject to queue length and resource constraints.
Unlike prior studies, this paper is dedicated to addressing the challenges of dynamic task offloading in green, sustainable WPMEC networks with user assistance. We take into account the system's total energy consumption constraint, the dynamically arriving tasks in real-time scenarios, and the high dynamics of wireless channel conditions. Moreover, the temporal coupling between WPT and user collaborative communication, along with the coupling of data offloading timing and transmission power in collaborative communication, poses significant challenges.
3. Algorithm Design
3.1. Lyapunov Optimization
To address the average power constraints, we introduce a virtual energy queue Z(t), where Z(t+1) = max{Z(t) + e(t) − Ē, 0} [42]. Here, Z(t) can be seen as a queue with random "energy arrivals" e(t) and a fixed "service rate" Ē. Thus, we derive the following Lemma 1.

Lemma 1. The long-term average power constraint will be met if the virtual queue Z(t) satisfies mean-rate stability.

Proof. From the update formula, Z(t+1) ≥ Z(t) + e(t) − Ē. Expanding this inequality over all time slots, summing the terms, and averaging yields (Z(K) − Z(0))/K ≥ (1/K) Σ_{t=0}^{K−1} e(t) − Ē. Taking the expectation on both sides and letting K → ∞, mean-rate stability gives lim_{K→∞} E[Z(K)]/K = 0, and it follows that lim_{K→∞} (1/K) Σ_{t=0}^{K−1} E[e(t)] ≤ Ē, i.e., the long-term average power constraint holds. The lemma is proven. □
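The virtual-queue mechanism behind Lemma 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the names `e_consumed` and `E_avg` for the per-slot energy consumption and the average energy budget are ours, not the paper's notation.

```python
def update_virtual_queue(Q, e_consumed, E_avg):
    """One-slot update: Q(t+1) = max(Q(t) + e(t) - E_avg, 0).

    e(t) acts as a random "energy arrival" and E_avg as the fixed
    "service rate"; mean-rate stability of Q then implies the
    long-term average energy constraint E[e(t)] <= E_avg.
    """
    return max(Q + e_consumed - E_avg, 0.0)

# Example trajectory: consumption alternates above and below the budget,
# so the backlog that accumulates in over-budget slots is worked off.
Q = 0.0
for e in [2.5, 1.5, 2.2, 1.8]:
    Q = update_virtual_queue(Q, e, E_avg=2.0)
```

Whenever a slot overspends the budget, Q grows; under-budget slots drain it, so a bounded Q certifies that the time-averaged consumption stays within Ē.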
By defining a network queue vector Θ(t) = [Q_N(t), Q_F(t), Z(t)], which encompasses the queue lengths of the NU, the FU, and the virtual queue, respectively, we can obtain the associated Lyapunov function L(Θ(t)) = ½[Q_N(t)² + Q_F(t)² + Z(t)²] and the one-slot Lyapunov drift Δ(Θ(t)) = E[L(Θ(t+1)) − L(Θ(t)) | Θ(t)]. Employing Lyapunov optimization theory, we derive the drift-plus-penalty expression Δ(Θ(t)) − V·E[r(t) | Θ(t)], where r(t) denotes the weighted task computation rate and V signifies the penalty's importance weight. The Lyapunov optimization method minimizes an upper bound of the drift-plus-penalty, thereby maximizing the objective function while ensuring queue stability; optimizing the objective in each time slot leads to long-term optimality. It is important to note that a higher value of V prioritizes objective maximization, whereas a lower value emphasizes queue stability. To obtain the upper bound of the drift-plus-penalty, we derive the following Lemma 2.
Lemma 2. At each time slot t, for any control strategy, the one-slot Lyapunov drift-plus-penalty is bounded as per the following inequality, where B is a constant that satisfies the condition below.

Proof. For all x ≥ 0, a ≥ 0, b ≥ 0, we have the inequality (max{x + a − b, 0})² ≤ x² + a² + b² + 2x(a − b). By applying this inequality to each queue update and combining inequalities (31)–(33), the resulting expression yields the upper bound of the Lyapunov drift-plus-penalty. □
Here, the parameter V serves as a trade-off between the task computation rate and queue stability. An increased value of V directs the algorithm to prioritize task computation rates, potentially at the expense of queue stability, while a suitable selection of V enables the system to balance the task computation rate against task queue stability. By applying the drift-plus-penalty minimization technique and eliminating the constant terms observable at the start of time slot t in (29), we can obtain the optimal solution to problem (P1) by solving the following problem in each individual time slot.
Due to the non-convex constraints (34b), (P2) remains a non-convex problem. We therefore introduce auxiliary variables; according to Equation (22j) and with the corresponding substitutions, and omitting the slot index t to simplify the mathematical expressions, (P2) can be equivalently rewritten as (P3) with appropriately defined coefficients. Owing to the non-convex constraint (35e), problem (P3) remains non-convex. To address this, we introduce further auxiliary variables through a change of variables, under which problem (P3) can be reformulated as follows:
When problem (P4) reaches its optimal solution, it aligns with the original problem (P3). Specifically, the function in constraint (36e) is concave, since the perspective operation preserves convexity [45]. Thus, constraint (36e) is convex, making problem (P4) a convex optimization problem. To solve it, we employ convex optimization tools such as CVX [46].
We introduce an efficient dynamic task offloading algorithm with user assistance to tackle problem (P4). Additionally, we apply the Lagrange method to gain meaningful insights into the properties of the optimal solution.
Theorem 1. Given non-negative Lagrange multipliers, the optimal power allocation must fulfill the stationarity conditions stated below.

Proof. Let the non-negative multipliers correspond to the constraints of problem (P4), and construct the associated Lagrangian function. Applying the first-order optimality conditions and taking the derivatives of the Lagrangian with respect to the decision variables establishes the necessary conditions for optimality; the relationship between the auxiliary variables and the original variables is then leveraged to derive the theorem. □
According to this theorem, we can infer that during radio-frequency energy transfer, higher power leads to better results. As the bandwidth W increases, both the FU and the NU are more motivated to offload data, which reduces the amount of computation performed locally. Furthermore, an increase in V prompts the FU and NU to offload a larger portion of their tasks. This shift toward offloading is driven by the desire to enhance the computation rate, which in turn leads the MDs to increase the volume of data offloaded to meet their objectives.
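The qualitative behavior described above follows the water-filling structure that first-order Lagrangian conditions of this kind typically produce. As a hedged illustration, assuming a generic rate term of the form W·log2(1 + p·h/σ²) rather than the paper's exact expressions, the stationary power allocation is:

```python
import math

def waterfill_power(lam, W, h, noise):
    """Maximizing W*log2(1 + p*h/noise) - lam*p over p >= 0 gives
    p* = max(W/(lam*ln 2) - noise/h, 0): a water-filling form.
    This is only the generic shape of such first-order conditions,
    not the exact expressions of Theorem 1.
    """
    return max(W / (lam * math.log(2)) - noise / h, 0.0)

# A better channel (larger h) receives more transmit power; a channel
# whose inverse gain lies above the "water level" receives none.
p_good = waterfill_power(lam=1.0, W=1.0, h=1.0, noise=0.1)
p_weak = waterfill_power(lam=1.0, W=1.0, h=0.05, noise=0.1)
```

Note how increasing W raises the water level W/(λ ln 2), pushing more links into active offloading, consistent with the bandwidth trend stated above.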
The process of solving the original MSSO problem, denoted as (P1), is encapsulated within Algorithm 1.
Algorithm 1: User-Assisted Dynamic Resource Allocation Algorithm (UADRA)
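The control flow of Algorithm 1 can be summarized as the following Python skeleton. This is illustrative only: the per-slot solver is a stub standing in for the convex problem (P4), and the energy and arrival models are placeholders of our own, not those of the paper.

```python
import random

def solve_per_slot(queues, Z, channel, V):
    # Stub for the per-slot convex problem (P4); a real implementation
    # would maximize the drift-plus-penalty objective with an
    # interior-point solver. Here: offload in proportion to backlog.
    cap = channel * V * 1e-3                   # placeholder service capacity
    return {user: min(backlog, cap) for user, backlog in queues.items()}

def uadra(T, mean_arrival, E_avg, V, seed=0):
    """Per-slot loop of Algorithm 1: observe queues and channels,
    solve the per-slot problem, then update task and virtual queues."""
    rng = random.Random(seed)
    queues = {"FU": 0.0, "NU": 0.0}            # task queue backlogs
    Z = 0.0                                    # virtual energy queue
    for _ in range(T):
        channel = rng.expovariate(1.0)         # Rayleigh fading factor
        rates = solve_per_slot(queues, Z, channel, V)
        energy = 0.1 * sum(rates.values())     # placeholder energy model
        for user in queues:                    # task queue updates
            arrival = rng.expovariate(1.0 / mean_arrival[user])
            queues[user] = max(queues[user] - rates[user], 0.0) + arrival
        Z = max(Z + energy - E_avg, 0.0)       # virtual queue update
    return queues, Z

queues, Z = uadra(T=200, mean_arrival={"FU": 1.75, "NU": 2.0},
                  E_avg=2.0, V=500)
```

The key structural point is that each iteration needs only the current queue states and channel realization, which is why the algorithm requires no prior system information.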
3.2. Algorithm Complexity Analysis
We have employed the Lyapunov optimization framework, which not only ensures the stability of the system but also allows us to decompose the complex overall problem into per-slot sub-problems (P2). By implementing Algorithm 1, we solve (P2) within each time slot, so the per-slot solution complexity is crucial in determining the overall performance and responsiveness of the algorithm. When solving problem (P4), we utilize the interior-point method, whose computational complexity is approximately O(n^3.5 log(1/ε)), where n denotes the total number of decision variables and ε reflects the required computational precision. In our algorithm, the number of decision variables is sufficiently small. This design not only ensures the efficiency of the algorithm but also enables us to complete the optimization of resource allocation within a reasonable time, thus meeting the performance requirements of practical applications.
3.3. Algorithm Performance Analysis
In this section, we demonstrate that the proposed scheme can achieve an optimal long-term time-average solution. First, we establish a key assumption; subsequently, we deduce that the expected value also converges to the same set of solutions. Furthermore, we establish the existence of an optimal solution based on the existing queue conditions, as follows.

Lemma 3. If problem (P1) is solvable, there exists a set of decisions that adheres to the stated conditions, where the asterisk * denotes the value associated with the optimal solution.

Proof. We omit the proof details for brevity; see Parts 4 and 5 of [42]. □
Theorem 2. The optimal long-term average weighted computation rate derived from (P1) is bounded below by a bound that is independent of time and space. The algorithm achieves the following:
- (1) The achieved long-term average weighted computation rate is within a gap of the optimum that shrinks as V grows;
- (2) All task queues and the virtual energy queue are mean-rate stable, thereby satisfying the constraints.
Proof. For any feasible policy, we consider the policy and queue state defined in Equation (40). Given that these values are independent of the queue status, we can integrate the results into Equation (29) and take the expectation. Utilizing iterated expectations, summing the resulting inequality over the time horizon, dividing both sides of Equation (47) by the horizon length, and applying Jensen's inequality, we then let the horizon tend to infinity. Combining the result with Equation (41) yields the claimed bound. □
Theorem 3. The time-averaged sum of the queue lengths is confined by a defined upper limit.

Proof. By employing iterated expectations and applying telescoping sums iteratively over the time slots, we can derive a cumulative bound. Dividing both sides of (53) by the horizon length and rearranging the terms yields the desired result. □
Theorems 2 and 3 underpin our proposed algorithm by establishing that as V increases, the computation rate approaches the optimum at a rate of O(1/V), whereas the queue length grows at a rate of O(V). This suggests that by choosing an appropriately large value for V, we can achieve a near-optimal computation rate. Furthermore, the time-averaged queue length is shown to increase linearly with V. This linear relationship implies an [O(1/V), O(V)] trade-off between the optimized computation rate and the queue length, in line with Little's Law [13], which posits that delay is proportional to the time-averaged length of the data queue. In other words, our proposed algorithm enables a trade-off between the computation rate and the average network latency.
4. Simulation Results
In this section, extensive numerical simulations are performed to assess the efficiency of our proposed algorithm. Our experiments were conducted on a computational platform with an Intel(R) Xeon(R) Gold 6148 CPU at 2.40 GHz with 20 cores and four GeForce RTX 3070 GPUs. In our simulations, we employed a free-space path-loss model to depict the wireless channel characteristics [47]; the average channel gain is determined by the antenna gain, the carrier frequency, the path-loss exponent, and the distance in meters between two nodes. The time-varying WPT and task offloading channel gains adhere to the Rayleigh fading channel model, in which the random channel fading factors follow an exponential distribution with an expectation of 1, capturing the variability inherent in wireless communication channels. For simplicity, we assume that the fading factors remain constant within any given time slot, implying that the channel gains are static within that slot. The task arrivals at the FU and the NU both follow exponential distributions with constant average rates of 1.75 and 2, respectively. All parameters are listed in Table 2.
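The channel and arrival models above can be reproduced with a short sketch. The parameter values here (antenna gain 4.11, 915 MHz carrier, exponent 2.8) are illustrative placeholders commonly used for free-space models, not necessarily those of Table 2.

```python
import math
import random

def avg_channel_gain(Ad=4.11, fc=915e6, de=2.8, d=100.0):
    """Free-space path-loss model: h_bar = Ad * (c / (4*pi*fc*d))**de,
    with antenna gain Ad, carrier frequency fc (Hz), path-loss
    exponent de, and node distance d (m)."""
    c = 3e8  # speed of light, m/s
    return Ad * (c / (4 * math.pi * fc * d)) ** de

rng = random.Random(42)
h_bar = avg_channel_gain()
# Rayleigh fading: per-slot gain h(t) = alpha(t) * h_bar, where the
# fading factor alpha(t) ~ Exp(1) is held fixed within a slot.
h_t = rng.expovariate(1.0) * h_bar
# Task arrivals at FU and NU: exponential with means 1.75 and 2.
A_fu = rng.expovariate(1.0 / 1.75)
A_nu = rng.expovariate(1.0 / 2.0)
```

Note that `random.expovariate` takes the rate λ, so a mean arrival of 1.75 corresponds to λ = 1/1.75.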
4.1. Impact of System Parameters on Algorithm Performance
Figure 3 illustrates the trends of the average task computation rate and the average task queue lengths of the FU and NU over a period of 5000 time slots. The task arrival rates for the FU and NU are set at 1.75 Mbps and 2 Mbps, respectively. Initially, the average computation rate is low, but it rapidly increases and eventually stabilizes as time progresses. This initial surge is due to the system's adjustment to the initial task queue fluctuations, which demands more intensive processing of FU tasks, resulting in increased energy consumption and a temporary reduction in the overall computation rate. Moreover, the average queue length decreases and stabilizes, reflecting the system's ability to self-regulate and reach a steady state.
Figure 4 demonstrates the average task computation rate of our proposed algorithm under different values of the control parameter V. The results show that the average task computation rates converge similarly across different V. Notably, as V increases, there is a corresponding increase in the average task computation rates. This trend is attributed to the fact that a larger V compels the algorithm to prioritize computation rates over queue stability, which is consistent with the theoretical analysis in Theorem 2. Here, the parameter V serves as a balancing factor between the task computation rate and queue stability, reflecting a trade-off that is consistent with our theoretical predictions.
Figure 5 shows the trend of the average task queue lengths of the FU and NU under different V. As V increases from 100 to 900, the task queue lengths of both the FU and the NU decline. In the user-assisted offloading paradigm, processing a task from the NU involves only a single offloading step, which is markedly more efficient than the two-step offloading of data from the FU. As V increases, the algorithm pays more attention to optimizing the computation rate; conversely, a smaller value of V implies that the system focuses more on the stability of the queues. In real-world scenarios, according to Little's Law, the queue length is directly proportional to the delay. Therefore, based on the system's delay requirements, the value of V can be tuned to achieve a balance between queue stability and the computation rate.
Figure 6 evaluates the impact of the average energy constraint on system performance with V fixed at 500. As the energy budget increases, the average task computation rate rises, while the average task queue lengths of the FU and NU decrease. This reduction in queue length and increase in computation rate are attributed to the higher energy availability for WPT, enabling the FU and NU to offload tasks more effectively. Notably, after the average energy constraint reaches 2.1 joules, the variation in the task computation rate and task queues diminishes. This observation suggests that there is an upper bound on the energy consumption of our algorithm; beyond this threshold, additional energy has minimal impact on performance.
Hence, the energy constraint, as a critical parameter, significantly influences both the data processing rates and the stability of the task queues within the system. These findings underscore the importance of energy management in optimizing system performance.
Figure 7 presents the total system energy consumption of our proposed algorithm under different values of the parameter V. Initially, the total energy consumption exhibits substantial fluctuations. However, as time progresses, the system's energy consumption stabilizes and hovers around the average energy constraint after approximately 2500 time slots. Notably, an elevated V value is correlated with higher average energy consumption, as the algorithm pays more attention to the system's computation rate and consequently incurs greater energy costs. Figure 7 thus highlights the algorithm's efficacy in managing average energy consumption, a critical feature for the sustainability of IoT networks.
In Figure 8, we evaluate the offloading power across varying bandwidths W. It is observed that all offloading powers increase as the bandwidth W escalates. Consistent with the analysis in Theorem 1, the increase in bandwidth makes the system more inclined to perform task offloading, reflected in the increase in offloading power.
4.2. Comparison with Baseline Algorithms
To evaluate the performance of our proposed algorithm, we choose the following three representative benchmarks as baseline algorithms.
- (1) All-offloading scheme: neither the FU nor the NU performs local computing, and all energy is consumed for task offloading.
- (2) No-cooperation scheme: the FU offloads tasks directly to the HAP without soliciting assistance from the NU, similar to the method in [48].
- (3) No-Lyapunov scheme: disregarding the dynamics of the task queues and the energy queue, this scheme focuses solely on maximizing the average task computation rate, similar to the Myopic method in [30]. To ensure a fair comparison, we constrain the energy consumption of each time slot.
Figure 9 shows the average task computation rates under the four schemes over a period of 5000 time slots, with the control parameter V set to 500. All schemes converge after 1000 time slots. Our proposed algorithm achieves the best task computation rate after convergence, outperforming the other three by 0.8%, 3.9%, and 4.1%, respectively. Our algorithm's key strengths lie not only in achieving the highest data processing rate but also in ensuring the stability of the system queues, preventing excessively long queues that could lead to prolonged response times and a negative user experience. The no-Lyapunov scheme, while achieving the second-highest computation rate, neglects queue stability in its pursuit of maximizing computation speed; this oversight can lead to system instability, prolonged user service times, and potential system failure. The all-offloading scheme, relying solely on edge computing, consumes more energy and thus underperforms in energy-limited scenarios. In the no-cooperation scheme, the system initially benefits from a high computation rate because the NU does not assist the FU. However, as the NU's tasks are completed and its resources sit idle, the average computation rate falls sharply; the FU's communication with the HAP is further impeded by the dual far–near effect, causing a notable decline in the system's long-term computation performance.
Figure 10 shows the impact of varying the network bandwidth W on the performance of the different schemes. As W increases, the task computation rates of all schemes rise, reflecting improved transmission efficiency for wireless power transfer and task offloading. This allows the HAP to handle more offloaded tasks, highlighting the critical role of bandwidth in system performance. Notably, our proposed scheme consistently outperforms the others across all bandwidth levels, showcasing its adaptability and robustness under varying network conditions.
Figure 11 evaluates how the distance between FU and NU affects the performance of all four schemes, with distances varying from 120 m to 160 m. We observe that as the distance increases, the computation rates for both our proposed scheme and the all-offloading scheme decrease. This suggests that proximity plays a crucial role in task offloading efficiency. In contrast, the no-cooperation scheme shows a stable computation rate, consistent with its design that excludes task offloading between FU and NU. Interestingly, the no-Lyapunov scheme performs best at a distance of about 140 m. However, its performance drops as the distance decreases, contrary to the expectation that a shorter distance would enhance task offloading from FU to NU. This unexpected trend is likely due to instances where the FU’s task queue depletes faster than new tasks arrive, leading to lower computation rates for the no-Lyapunov scheme. This highlights the importance of balancing task computation rates with queue stability in system design.
In Figure 12, we evaluate the performance of the four schemes as the task arrival rate of the NU varies. Our proposed scheme's task computation rate demonstrates a modest increase and maintains the highest computation rate as tasks arrive more rapidly, underscoring the scheme's robustness across diverse scenarios. Correspondingly, the no-cooperation scheme exhibits a more pronounced increase in task computation rate, attributable to its vigorous task processing capacity at the NU, which allows it to capitalize effectively on the higher task arrival rates.
5. Conclusions and Future Work
The joint optimization of computation offloading and resource allocation in WPMEC systems poses a significant challenge due to time-varying network environments and the time-coupling nature of energy consumption constraints. In this study, we aimed to maximize the average task computation rate in an MEC system with WPT and user collaboration. A task computation rate maximization problem was formulated that considers the uncertainty of load dynamics alongside the energy consumption constraint. We introduced an online control algorithm, named UADRA, that leverages Lyapunov optimization theory to transform the sequential decision-making problem into a series of deterministic optimization problems, one for each time slot. Extensive simulation results substantiate the effectiveness of the proposed UADRA algorithm, demonstrating a significant enhancement in average task computation performance compared with benchmark methods. The simulations also underscore the advantage of jointly considering the task computation rate and the task queues in our algorithm.
Our algorithm currently relies only on the present system status for decision-making and does not utilize historical data, which presents a significant opportunity for improvement. In the future, we plan to employ machine learning or deep learning techniques to harness historical data and build a predictive module for load and channel conditions, which could significantly improve the system's decision-making efficiency. Moreover, we plan to evaluate our algorithm in various real-world scenarios and to consider practical constraints such as different task types and Service Level Agreement (SLA) time constraints.