1. Introduction
In the last decade, the diffusion of smartphones, tablets, and other smart devices has led to an astonishing demand for ubiquitous mobile network access, and the rise of online services, such as video streaming, social network applications, and mobile gaming, has significantly increased the traffic load on wireless communications. Typically, the same popular data and content may be requested multiple times at the same location by different users/devices, which results in an unnecessary waste of backhaul capacity and spectrum resources. The main cause of these problems is video-on-demand content accessed in an asynchronous way (unlike live streaming and digital TV), so that demands concentrate on a small set of popular contents requested by a significant number of devices, often located in the same cellular coverage area [1]. To solve these problems, it is possible to reconsider the current network architecture and explore more advanced communication models and paradigms. Device-to-device (D2D) direct communication is considered one of the most promising technologies in next generation cellular networks [2].
It consists of establishing a direct communication channel between two nearby mobile users, sharing a common short-range radio coverage space, without traversing a base station (BS) or core network equipment [3]. The result is a very flexible communication model with unique advantages. First, thanks to its underlying short-range direct communication technology, D2D user equipment (UE) achieves higher transfer rates together with very limited end-to-end delays. Second, any direct proximity-based transmission that does not traverse centralized channel collection points (and potential bottlenecks) in the cellular infrastructure, such as the evolved Node B (eNB) in LTE networks, may be more resource-efficient than conventional cellular communication by saving energy and channel capacity, as well as improving spectrum utilization. Third, data are transferred locally, which helps to offload cellular traffic and alleviate backhaul congestion [4]. To enhance the efficiency of content sharing and reduce the load on the eNB, it is important to maximize the number of simultaneous D2D transmissions, each associated with an active communication channel or link, within a specific coverage area. In particular, maximizing the number of D2D links within a cellular coverage area not only improves spectrum and energy efficiency but also reduces transmission delay. However, such a maximization process poses three major challenges. First, interference between cellular and D2D users could critically affect their performance. Second, the minimum quality of service (QoS) requirements must be simultaneously guaranteed for both cellular and D2D communications; therefore, the selection of suitable active links in each time interval (or slot, in the presence of a slotted time evolution model) is fundamental to achieve the maximum number of simultaneous D2D links. Third, a highly flexible and scalable network infrastructure needs to support a large number of heterogeneous users as well as the deployment and interworking of a multiplicity of combined technologies, such as the ones characterizing the Internet of Things (IoT) and vehicular ad-hoc communications [5,6,7], so that the complexity of such an infrastructure, both in terms of variables and degrees of freedom in the associated model, makes the traditional approaches provided by optimization theory more challenging than ever [3]. Research on the scheduling of network resources also raises great concern about computational efficiency [8,9].
The D2D link scheduling problem, aiming at maximizing the number of active D2D links, consists of determining which D2D pairs can communicate simultaneously (by using direct channels) while guaranteeing that the remaining communications between cellular users are not affected. Since D2D users share spectrum resources with cellular users, interference is one of the major challenges in D2D link scheduling. Interference may not only be experienced between D2D users operating within the same area, but D2D transmission may also cause cross-layer interference phenomena involving cellular users. The interference occurring in D2D communication can be modeled by using either a graph-based model or a physics-based one. The maximum D2D links problem has been proven to be NP-hard in both models [10]. Most research is based on the graph model (e.g., [11,12,13,14,15]), but such a model is too simplistic to capture how signal strength attenuation is affected by distance for D2D users sharing the same cellular spectrum resources. On the contrary, the physics-based model adopted in this paper reflects the physical reality in a more precise and realistic way.
In real production networks, scheduling decisions must be taken online and in real time, so the available optimization strategies aiming at solving the NP-hard linear programming-based problem formulations offline are not acceptable at all. In order to satisfy the strict delay demands of online operation, the complexity of a link scheduling algorithm must be significantly reduced, possibly by applying some heuristic strategy leading to sub-optimal but still acceptable results. However, this may affect network performance, depending on the achieved distance from the optimum solution. Therefore, developing efficient and effective heuristic-based strategies to solve the link scheduling problem becomes a really challenging task when implementing D2D communication in cellular network scenarios. In recent years, machine learning has emerged as a successful approach for coping with complex problems and has been widely applied to many fields, such as hyperspectral image classification, ship detection, and wireless communications [16,17,18,19]. With the success of massively parallel computing environments, also empowered by graphics processing unit (GPU)-based acceleration frameworks, machine learning has become a successful approach for coping with complex problems in the wireless communications area. In particular, deep learning, a promising subset of machine learning, has been applied to wireless network optimization and big data processing. The purpose of deep learning is to approximate complex functions through a predefined model consisting of neuron units. In the deep learning training process, appropriate weight values are determined through calculation and tuning between neurons, a process whose goal is to extract knowledge/information, in the form of specific features, from input data. The trained model is then able to make accurate classification or prediction decisions based on such features [20]. Unlike traditional machine learning methods that heavily rely on domain experts to extract features from data, a deep learning model automatically acquires sample features through multiple hidden layers consisting of neuron units to achieve classification or prediction [21].
In this paper, we leverage the physical interference model and develop a novel machine learning-based approach for achieving the maximum number of D2D links within a cellular coverage area. We use deep neural network (DNN)-derived features from D2D link information to make time-efficient and near-optimal decisions in the operation phase through a real-time optimization strategy driven by the neural network itself. The DNN model is trained on pre-computed samples to learn the complex relationship between link information and D2D link activation. In doing this, we consider a more realistic and accurate signal-to-interference-plus-noise ratio (SINR) interference model that accounts for both cellular users and D2D users. We also conduct extensive simulations to assess the performance of the DNN-based method and highlight its effectiveness. The trained DNN can generate approximate solutions quite near the optimum, immediately achieving a satisfactory output quality. Numerical simulation results show the superiority of the DNN-driven solution in terms of the time cost of the operations involved with respect to iterative optimization algorithms.
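To make the two-phase structure of this approach concrete, the following minimal Python sketch illustrates the split between offline training and online inference; the helpers `solve_exactly` and `build_dnn` are hypothetical placeholders (the actual model is detailed in the following sections):

```python
import numpy as np

def offline_phase(instances, solve_exactly, build_dnn):
    """Offline: label instances with an exact solver, then train the DNN.
    `solve_exactly` and `build_dnn` are hypothetical helpers."""
    X = np.array([inst.link_features for inst in instances])
    y = np.array([solve_exactly(inst) for inst in instances])  # 0/1 per pair
    dnn = build_dnn()
    dnn.fit(X, y)                  # the heavy computation happens once, here
    return dnn

def online_phase(dnn, instance):
    """Online: a single forward pass replaces iterative optimization."""
    scores = dnn.predict(np.array([instance.link_features]))[0]
    return (scores > 0.5).astype(int)   # per-link activation decisions
```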
Our main contribution can be summarized as follows:
- (1)
In this paper, we propose a cutting-edge learning-based method for solving the maximum D2D links problem. Using training samples generated by a conventional method, we train a deep learning model to approximate the complex relationship between link information and link activity. Since the computational burden is transferred to the training phase, the algorithm significantly reduces the online computational overhead.
- (2)
We model a sizable D2D communication network under an SINR interference model. Furthermore, we characterize the two-layer interference (a D2D layer and a cellular layer) and then formulate a maximum active D2D links problem subject to SINR constraints.
- (3)
We adjust the parameters of the learning model depending on simulation results to get an optimal output. The experiments show that our algorithm can reduce the time spent by more than 90%.
- (4)
Through simulation results, we analyze how factors such as the SINR threshold and the coverage of the BS affect the number of active D2D links. Based on such analysis, we can efficiently schedule D2D links under different wireless network environments.
2. Related Work
Link scheduling plays a crucial role in meeting strict QoS requirements in wireless networks. Maximum link scheduling is an important sub-problem of link scheduling. The authors of [10] proved for the first time that link scheduling under an SINR model is NP-hard and proposed an algorithm for the one-shot link scheduling problem. The algorithm first separates the problem instance into disjoint link classes and then schedules each link class by using a greedy method. The work in [22] first explored a one-slot algorithm for maximizing the number of links with a constant approximation guarantee; an extended algorithm, whose computational complexity depends on the total number of links $n$, was then introduced for a minimized-length schedule. However, none of the above papers considered the ambient noise. A method based on partitioning was presented in [23] to find the independent link set with the maximum weight. In [24], a distributed greedy algorithm for maximizing the number of links subject to interference constraints was proposed. In these approaches, the improvements achieved in terms of optimization quality could result in a much higher computational complexity and, hence, an increase in overall runtime. A low-complexity scheduling scheme named DistGreedy, based on a link-conflict graph, has been proposed in [25]. The algorithm repeatedly removed the active links and blocked links in contention slots until the graph was left empty. The authors of [26] attempted to find a maximum-weighted subset of communications without spectral splitting in an individual time unit. In this model, $A$ denotes a set of communication tasks, and each request in $A$ has a demand; if the demand exceeds a given threshold, it is a heavy request, otherwise it is a light request. The algorithm considered both the heavy requests and the light requests of $A$ for determining the link schedule. In [27], a greedy schedule based on a maximum tolerance model (MTMA) was studied for maximum link scheduling in one slot. The experiments showed that the algorithm outperformed the existing algorithms by 28–64%. The authors in [28] developed a distributed greedy heuristic for a $k$-hop link schedule. Because of the interference from two layers, an algorithm for the D2D link schedule is much more complicated than the above schemes.
With the development of the IoT and the increased complexity of the network structure, link scheduling experiences new difficulties in meeting QoS requirements when using traditional optimization methods. In recent years, deep learning has been promisingly applied to wireless network optimization and big data processing [21,29,30]. In [31], the authors considered solving a resource allocation problem for D2D communication using a Q-learning-based method. In [32], the authors applied deep learning to reduce the complexity of solving a wireless network optimization problem. The authors in [33] proposed a resource allocation strategy using cooperative reinforcement learning. The target of their algorithm was to maximize the throughput of the system by selecting the proper power level for the resource blocks of the cellular user and D2D pairs. In [34], the authors adopted deep learning to solve the objective function of maximizing the weighted sum rate over $N$ D2D users and demonstrated that link scheduling does not necessarily require exact channel estimates. In [35], the authors adopted a method based on reinforcement learning to solve resource scheduling for vehicle-to-vehicle (V2V) communications based on D2D. In that article, each vehicle, regarded as an agent, made decisions by itself. Since they do not require global information, decentralized scheduling methods are characterized by a low overhead. Experiments have shown that the proposed algorithm can effectively schedule limited links with minimized interference under delay constraints. Solving the maximum active D2D links problem under interference constraints by deep learning is thus promising.
3. System Model
We consider that D2D communications occur within a single cell of a cellular system and share the system's downlink resources. In this scenario, a cellular device may suffer interference introduced by D2D communication activities, and D2D transmitters can also interfere with the eNB. Given the strong interference management capability of the eNB, link scheduling is performed centrally in order to preserve downlink connectivity for cellular device communications. In the proposed system model, D2D devices are randomly distributed under the coverage of the same eNB. As shown in Figure 1, the heterogeneous network consists of a single eNB serving cellular devices and $M$ D2D-capable devices. Let $\mathcal{D} = \{1, 2, \ldots, M\}$ denote the set of D2D users in the network. To make full use of available spectrum resources, we allow multiple D2D devices to simultaneously use the same downlink channel of cellular devices. Two D2D devices (a D2D pair) can establish at most one direct communication. According to the SINR model, the SINR involving a single D2D sender–receiver pair $(i,j)$, $i, j \in \mathcal{D}$, in the presence of a cellular device $C$ connected to an eNB operating in the same cell is:

$$\mathrm{SINR}_{ij} = \frac{P_i^d \, g_{ij}}{P^B \, g_{Bj} + N_0} \quad (1)$$
where $P_i^d$ is the transmit power of the D2D sender $i$, $g_{ij}$ denotes the propagation attenuation (link gain), modeled as $g_{ij} = d_{ij}^{-\alpha}$ ($d_{ij}$ is the distance between the D2D pair and $\alpha \geq 1$ is a constant path-loss exponent), $P^B$ is the transmission power of the eNB, $g_{Bj}$ is the link gain from the eNB, which is concurrently serving a cellular device $C$, to the D2D receiver $j$, and $N_0$ is the ambient noise. Clearly, the signal power transmitted by the eNB to $C$ is perceived at the D2D receiver as interference. Let $\gamma_C$ and $\gamma_D$ denote the minimum SINR thresholds of the cellular device and D2D device, respectively. If the condition $\mathrm{SINR}_{ij} \geq \gamma_D$ holds, the D2D receiver $j$ successfully receives a message from D2D sender $i$ (the D2D pair of devices communicates successfully). We denote by $A = \{(i_1, j_1), \ldots, (i_N, j_N)\}$ the set of the $N$ pairs that are able to successfully perform D2D communication. The notation used in this paper is shown in Table 1.
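As an illustration of Equation (1), the following minimal Python sketch computes the SINR perceived by a D2D receiver; the function name, coordinate representation, and default values of $\alpha$ and $N_0$ are illustrative assumptions, not the simulation settings used later:

```python
import math

def pair_sinr(p_d, p_b, pos_tx, pos_rx, pos_enb, alpha=3.0, noise=1e-9):
    """SINR at D2D receiver j for sender i, following Equation (1).

    p_d: D2D transmit power; p_b: eNB transmit power; positions are
    (x, y) tuples. alpha and noise are assumed example values.
    """
    def gain(a, b):
        d = math.dist(a, b)
        return d ** (-alpha)                  # g = d^(-alpha) path-loss model

    signal = p_d * gain(pos_tx, pos_rx)       # intended D2D signal
    interference = p_b * gain(pos_enb, pos_rx)  # downlink-reuse interference
    return signal / (interference + noise)

# A pair is admissible when pair_sinr(...) >= gamma_d (the D2D threshold).
```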
There are four conditions in the model: (1) any D2D terminal device can communicate with another within the same cell; (2) a node can be a sender or a receiver, but it can send to at most one receiver or receive from at most one sender; (3) a D2D receiving user can estimate the link state information (channel state information, CSI) from the received signal; and (4) the eNB controls channel resource allocation in a centralized way.
4. Problem Formulation
As mentioned above, multiple D2D communications are allowed within the same cell by sharing spectrum resources with cellular users. Therefore, the inter-layer interference between devices involved in cellular communication through the eNB and devices performing D2D data transfers, as well as the intra-layer interference between different D2D pairs, can be experienced and must be correctly managed. When the number of D2D communication links increases heavily, the interference caused by sharing the same spectrum resources will affect the reliability of cellular device communications and even reduce the number of potential D2D links. In general, a device requiring spectrum resources for cellular communication has a higher priority than a device competing to establish a D2D communication channel. Therefore, the resource scheduling problem amounts to maximizing the number of D2D links while ensuring the QoS of the cellular device, i.e., selecting a subset of $A$ with maximum cardinality under the SINR constraints. Based on the above analysis, we started from a conventional link scheduling scheme which guarantees the communication reliability of the cellular users and maximizes the active D2D links. Due to the NP-hard nature of the underlying problem, we used a DNN to approximate the above scheduling algorithm and reduce its running time.
In order to properly formulate the problem, we first need to analyze the interference in the cellular system (i.e., within the cell of interest). Here, when the pair of devices involved in direct D2D communication reuses the downlink spectrum, the signal received by a generic cellular device $C$ consists of the expected signal from the eNB, with the addition of the interference from the D2D layer and the ambient noise. In detail, by using the physical model, the SINR of the generic cellular device $C$ is defined as follows:

$$\mathrm{SINR}_{C} = \frac{P^B \, g_{BC}}{\sum_{i \in \mathcal{D}} P_i^d \, g_{iC} + N_0} \quad (2)$$

where $P^B$ is the transmission power of the eNB, $P_i^d$ is the transmission power of the D2D device $i$, $g_{iC}$ denotes the link gain from D2D sender $i$ to cellular device $C$, $g_{BC}$ is the gain of the cellular device connected to the eNB, and $N_0$ is the ambient noise. To ensure the performance of the cellular device, $\mathrm{SINR}_C \geq \gamma_C$ should be satisfied. Analogously, in the D2D layer, the signal perceived at the D2D receiver $j$ consists of the one transmitted by $i$, affected by the interference from both the cellular and D2D layers. As such, the SINR at the D2D device level is:

$$\mathrm{SINR}_{ij} = \frac{P_i^d \, g_{ij}}{P^B \, g_{Bj} + \sum_{k \in \mathcal{D},\, k \neq i} P_k^d \, g_{kj} + N_0} \quad (3)$$

where $g_{ij}$ is the channel gain from the D2D sender $i$ to receiver $j$ and $g_{Bj}$ is the channel gain from the eNB to D2D device $j$. When $\mathrm{SINR}_{ij} \geq \gamma_D$, the D2D link performs normally.
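A direct consequence of Equations (2) and (3) is a simple feasibility test for a candidate set of active links, sketched below in Python; the parameter names mirror the symbols defined above, a uniform D2D transmit power is assumed, and this illustrative check is not the scheduling algorithm itself:

```python
def schedule_is_feasible(active, P_B, P_d, g, g_B, g_C, g_BC,
                         N0, gamma_C, gamma_D):
    """Check SINR constraints (2) and (3) for a candidate set of links.

    active : list of (i, j) D2D sender-receiver pairs
    g[i][j]: gain sender i -> receiver j;  g_B[j]: gain eNB -> device j
    g_C[i] : gain sender i -> cellular device C;  g_BC: gain eNB -> C
    P_d is assumed uniform across D2D senders for simplicity.
    """
    senders = [i for (i, _) in active]

    # Cellular layer: device C must keep its SINR above gamma_C.
    sinr_c = (P_B * g_BC) / (sum(P_d * g_C[i] for i in senders) + N0)
    if sinr_c < gamma_C:
        return False

    # D2D layer: every active receiver j must meet gamma_D.
    for (i, j) in active:
        interf = P_B * g_B[j] + sum(P_d * g[k][j] for k in senders if k != i)
        if (P_d * g[i][j]) / (interf + N0) < gamma_D:
            return False
    return True
```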
To maximize the total number of admissible D2D pairs, we need to consider the following link utility maximization problem, subject to the interference constraints associated with the performance of the D2D devices and the cellular user:

$$[\mathrm{MaxL}] \quad \max \sum_{(i,j) \in A} x_{ij} \quad (4)$$

subject to

$$\sum_{j:(i,j) \in A} x_{ij} \leq y_i, \quad \forall i \in \mathcal{D} \quad (5)$$

$$\sum_{j:(j,i) \in A} x_{ji} + y_i \leq 1, \quad \forall i \in \mathcal{D} \quad (6)$$

$$P^B g_{BC} \geq \gamma_C \Big( \sum_{i \in \mathcal{D}} y_i P_i^d \, g_{iC} + N_0 \Big) \quad (7)$$

$$P_i^d \, g_{ij} + M_{ij}(1 - x_{ij}) \geq \gamma_D \Big( P^B g_{Bj} + \sum_{k \in \mathcal{D},\, k \neq i} y_k P_k^d \, g_{kj} + N_0 \Big), \quad \forall (i,j) \in A \quad (8)$$

$$x_{ij}, y_i \in \{0, 1\}$$

The objective (4) maximizes the number of active D2D links. The binary variables $x_{ij}$ and $y_i$ are associated, respectively, with the presence of an active D2D link from device $i$ to device $j$ and with the capability of D2D sender $i$ to perform D2D communication. The constraints (5) and (6) state that a D2D device can send to at most one receiver or receive from at most one sender; they also state that each device terminating a link must be D2D-capable, whereas the constraints (7) and (8) formulate, respectively, the SINR requirements of the cellular device and of the D2D devices. The use of a binary variable to control whether a linear constraint is active is a very well-known modeling trick in integer linear programming [36]. The solution of the problem [MaxL] can be obtained by using an exact algorithm or a heuristic one. Exact algorithms are typically based on branch and bound [37] and dynamic programming [38]. Though such methods can obtain an optimal solution, their computational complexity is high, and they are only suitable for small-scale problems. On the other hand, a heuristic algorithm does not aim at finding the optimal solution of the problem, but it is expected to obtain a near globally optimal solution in an acceptable time. If $x_{ij} = 1$, (8) constrains the SINR to be at least $\gamma_D$ for the pair $(i,j)$. When no D2D link is active for the pair $(i,j)$, the constraint is always satisfied for a sufficiently large $M_{ij}$. Large values of $M_{ij}$ can easily lead to numerical problems and to weak linear relaxations [39,40]. The choice of such big numbers $M_{ij}$, known as Big-M in integer programming, is used to turn some inequality constraints on or off when necessary, and it can result in difficulties when trying to solve an integer linear programming problem like [MaxL], which relies heavily on the bounds of its continuous relaxation to be solved in a computationally acceptable time. That is, an improper selection of the $M_{ij}$ values can result in a very weak continuous relaxation. Moreover, the gain values in (8) may vary significantly in magnitude and, hence, may introduce further numerical difficulties in solving the problem.
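For concreteness, a minimal sketch of [MaxL] using the open-source PuLP modeler is given below; PuLP is chosen purely for illustration (the evaluation in Section 6 solves the model with CPLEX), and the container names are our assumptions:

```python
import pulp

def build_maxl(pairs, devices, P_B, P_d, g, g_B, g_C, g_BC,
               N0, gamma_C, gamma_D, M):
    """Illustrative PuLP sketch of [MaxL] with the Big-M constraint (8).
    pairs: list of (i, j) tuples in A; M: dict of Big-M values per pair."""
    prob = pulp.LpProblem("MaxL", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", pairs, cat="Binary")    # link (i,j) active
    y = pulp.LpVariable.dicts("y", devices, cat="Binary")  # sender i active

    prob += pulp.lpSum(x[p] for p in pairs)                # objective (4)

    for i in devices:                                      # constraints (5), (6)
        prob += pulp.lpSum(x[(a, b)] for (a, b) in pairs if a == i) <= y[i]
        prob += pulp.lpSum(x[(a, b)] for (a, b) in pairs if b == i) + y[i] <= 1

    # (7): cellular QoS, linearized with the sender-activity variables y_i.
    prob += P_B * g_BC >= gamma_C * (
        pulp.lpSum(y[i] * P_d * g_C[i] for i in devices) + N0)

    # (8): D2D QoS; M[(i, j)] deactivates the constraint when x_ij = 0.
    for (i, j) in pairs:
        prob += P_d * g[i][j] + M[(i, j)] * (1 - x[(i, j)]) >= gamma_D * (
            P_B * g_B[j]
            + pulp.lpSum(y[k] * P_d * g[k][j] for k in devices if k != i)
            + N0)
    return prob, x, y
```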
6. Performance Evaluation
In our proof-of-concept evaluation scenario, the nodes are randomly placed in an area of 250 m². The channel gain is $g_{ij} = d_{ij}^{-\alpha}$, where $d_{ij}$ is the distance between D2D users $i$ and $j$. The detailed simulation parameters are given in Table 2.
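Instances for training and testing can be generated along these lines; the sketch below is our own illustrative helper (a disc-shaped deployment area is assumed) that draws random node positions and derives the gain matrix $g_{ij} = d_{ij}^{-\alpha}$:

```python
import math
import random

def random_instance(num_devices, radius=250.0, alpha=3.0):
    """Generate one instance: random positions and the gain matrix.
    radius and alpha are illustrative; see Table 2 for the actual values."""
    nodes = []
    for _ in range(num_devices):
        # Uniform placement inside a disc of the given radius.
        r = radius * math.sqrt(random.random())
        theta = 2 * math.pi * random.random()
        nodes.append((r * math.cos(theta), r * math.sin(theta)))
    # g_ij = d_ij^(-alpha); self-gain set to 0 as a placeholder.
    g = [[math.dist(a, b) ** (-alpha) if a != b else 0.0
          for b in nodes] for a in nodes]
    return nodes, g
```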
In Figure 4, we compare the time (in seconds) required to reach optimality with the DNN-based algorithm against that of the conventional integer linear programming optimization scheme solved with CPLEX (C-Solver) and of the maximum weighted links scheduling (MWLS) algorithm, a greedy approach for D2D link scheduling under SINR constraints based on [27,28]. The TensorFlow framework has been used to train the DNN-based algorithm. We determined the average computation time for each case. From Figure 4, we can see that the computation time of the DNN-based algorithm was considerably reduced when compared with those of the other optimization approaches. In addition, as the scale of the instance increases, the computation time of the other algorithms grows steeply, whereas for the DNN-based approach the growth of computation time with the problem scale is far less pronounced.
In Table 3, we compare the effect of the number of hidden layers in our solution. To keep the comparison fair, both models shared the same total number of hidden units: the two-layer model had 60 units per layer, while the three-layer model had 40 units per layer. We focused on time and accuracy as comparison metrics. In terms of time, we observed the training time and the testing/validation time of the learning model with two and three hidden layers. To estimate accuracy, we relied on a binary variable denoting the link status: 1 meant that the D2D link was activated, and 0 that it was asleep. The binary accuracy of the output link status has been used to show the performance of the model in training and testing. Table 3 shows that the three-layer structure spent more time in training and testing than the two-layer model. Since the total number of neurons was the same, the accuracy exhibited only small differences.
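The compared architectures can be reproduced with a few lines of TensorFlow/Keras; in the sketch below, hyperparameters such as `num_features` are placeholders for the actual input encoding, and the model uses the binary accuracy metric reported in Table 3:

```python
import tensorflow as tf

def build_model(num_features, num_links, hidden_layers=2, units=60):
    """Build a DNN with a configurable number of hidden layers."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(num_features,)))
    for _ in range(hidden_layers):          # e.g., 2x60 vs. 3x40 units
        model.add(tf.keras.layers.Dense(units, activation="relu"))
    # One sigmoid output per D2D link: 1 = activated, 0 = asleep.
    model.add(tf.keras.layers.Dense(num_links, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["binary_accuracy"])
    return model

# Illustrative instantiation of the two compared architectures.
two_layer = build_model(num_features=100, num_links=20,
                        hidden_layers=2, units=60)
three_layer = build_model(num_features=100, num_links=20,
                          hidden_layers=3, units=40)
```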
In Figure 5, we set 60 units per layer for both models. According to Figure 5, it is reasonable to keep the depth of the DNN limited to three hidden layers: training accuracy is better for a three-layer network while the testing time remains within acceptable bounds. The difference in accuracy between the two models can be explained by the number of weights that need to be trained, which is several times larger in the three-layer than in the two-layer model. We can also observe that the training loss declined gradually as the number of training steps increased, an effect related to the vanishing gradient.
It is critical to choose a proper batch size to effectively prevent the model from underfitting and overfitting. We compared the validation loss and training time under different batch sizes and learning rates, and the results are shown in Figure 6. The values of the learning rate and batch size were decided by an experimental comparison with a constant number of epochs. From Figure 6a, we can see that the validation loss became smaller as the batch size increased. This is because the learning model approximated the characteristics of the training set better with a bigger data size. Meanwhile, the validation loss was only marginally affected by different learning rates. As can be seen from Figure 6b, the training time decreased as the batch size increased. Moreover, when the learning rate increased, the training time first dropped and then rose. This is because the rate of gradient descent increases with the learning rate, but an excessive learning rate can lead to excessive parameter updates and hence to an increase in training time. As a result, a learning rate of about 0.12 yielded the least training time. We then investigated in detail how variations of the batch size affect training time, as shown in Table 4. Table 4 indicates that the trade-off between validation loss and training time is best at a batch size of 64.
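This comparison can be scripted as a simple grid sweep; in the sketch below, the grid values, the epoch budget, and the `make_model` factory (which returns a fresh, uncompiled-or-compiled Keras model) are illustrative assumptions rather than the exact experimental settings:

```python
import time
import tensorflow as tf

def sweep(make_model, x_train, y_train, x_val, y_val,
          batch_sizes=(16, 32, 64, 128),
          learning_rates=(0.04, 0.08, 0.12, 0.16), epochs=50):
    """Grid comparison of validation loss vs. training time."""
    results = {}
    for lr in learning_rates:
        for bs in batch_sizes:
            model = make_model()                 # fresh weights per run
            model.compile(
                optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                loss="binary_crossentropy")
            start = time.time()
            model.fit(x_train, y_train, batch_size=bs,
                      epochs=epochs, verbose=0)
            results[(lr, bs)] = (
                model.evaluate(x_val, y_val, verbose=0),  # validation loss
                time.time() - start)                      # training time (s)
    return results
```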
Figure 7 shows the accuracy of the DNN-based approach for different dataset sizes. We used the optimality rate as a metric, i.e., the probability that the result matches the label value. For example, when the optimality rate is equal to 0.8, the output generated by the DNN has an 80% probability of being the label value. We tested the accuracy of the DNN algorithm by using only the training set and then the entire data set. According to Figure 7, the optimality rate increased with the size of the training set. When the test set contained approximately 3000 instances, the output reached the best optimization result of 95% or more. At the same time, the average accuracy of the output value of the DNN algorithm can reach more than 90% of the results of a conventional optimization algorithm. We can observe that the optimality rate did not change much as the training set increased, but it gradually decreased when the total data set was used. This is because the noise present within the data grows with the data set size, and during the training process the model can learn some relationships from such noisy data. The learned model can perform well on the training set, but it cannot achieve equally accurate outputs when using all the data. In our simulation, a training set size of 3000 was the optimal choice.
Figure 8 compares the impact of the SINR threshold of the D2D receiving user on the number of D2D links that the system can activate under different base station transmit powers. As seen in Figure 8, the smaller the transmit power of the base station, the more D2D pairs the system can activate; conversely, an increase in the transmit power of the base station results in a reduction of the number of active D2D links. This is because a greater transmit power of the base station implies more interference for the D2D users, resulting in fewer admissible D2D pairs, while more D2D links can be allowed to communicate simultaneously under a smaller SINR threshold. Accordingly, the number of active D2D pairs in the cell decreased with a larger SINR threshold: since more D2D devices can receive data from others under lower SINR thresholds, the amount of D2D activity decreases as the interference from the base station and the peer D2D users grows stronger.
In Figure 9, we show the change in the number of D2D links with respect to the D2D distribution radius and the base station transmit power. It can be seen from Figure 9 that when the base station transmit power is 0.5 W or 1 W, the number of D2D links decreases slowly as the D2D device distribution range increases, because the number of potential devices involved in D2D communication grows with the distribution radius, thus causing severe interference. In this case, the transmission distance of D2D links increased, and the received signals became easily affected by other D2D devices and by the eNB activity. When the transmit power of the eNB was 1.5 W, the number of D2D links declined slightly and then started to grow slowly. The overall change was not significant because the distance between the D2D devices became larger as the distribution radius increased. During signal transmission, the number of potential D2D links grows as the distribution radius increases, because D2D equipment can be located farther from the eNB, so that the interference from the eNB is reduced.
7. Conclusions
Creating maximum simultaneous D2D link schedules to deal with increasing traffic loads is a challenging problem in modern cellular scenarios, both because of the need to reduce the load on base stations and because of its high computational complexity. To date, many link scheduling optimization algorithms have been developed, but most of them are not well suited for working online and hence cannot meet the real-time requirements of modern network infrastructures. In this paper, we used a machine learning-based strategy to maximize the number of simultaneously active D2D links without interfering with cellular communications. Based on the physical interference model, we designed a link scheduling algorithm based on a DNN to predict device activity in forming D2D links, in order to reduce computation time costs. Since the application of deep learning is still an emerging research topic in network scheduling, we empirically determined the proper operating parameters via specifically crafted experiments aimed at optimizing the model performance. The parameters chosen in this way were the number of hidden layers, the size of the training set, the batch size, and the learning rate. Simulation experiments showed that the proposed approach can reach more than 90% of the optimal result quality. At the same time, we learned that the influence of the base station power on the number of D2D links is greater than that of the SINR threshold. Through the experiments, we also learned how the number of D2D links changes in the presence of different base station coverage areas, and we analyzed the reasons behind these effects. For future work, we are going to consider further solutions, like transfer learning and reinforcement learning, for the more general case in which D2D links may be scheduled in a multi-cell scenario.