1. Introduction
In recent years, virtual reality (VR) technology has been rapidly commercialized, forming a market predicted to reach $209 billion by 2022 [1]. VR uses 360-degree panoramic videos with high resolution (higher than 4K), high frame rate (60–90 fps) and low delay (less than 20 ms) to provide an immersive environment in which the user interacts with the virtual world through a head-mounted display (HMD) [2]. Thanks to the development of VR display devices, the general public is able to experience VR on HMDs (e.g., HTC VIVE).
As the popularity of User-Generated Content (UGC) platforms (e.g., Facebook and YouTube) increases, more people would like to generate VR videos themselves with portable 360-degree cameras (e.g., GoPro OMNI, Samsung Gear 360, etc.) and share them with others through a UGC platform. Essentially, high-quality VR videos are produced and need to be uploaded to the UGC platform (uplink procedure); then, the UGC platform disseminates them to other VR viewers through applications such as VR live broadcast (downlink procedure). These procedures, especially the uplink procedure, call for the use of the 5G network because of its strong support for mobility and collaboration. VR is predicted to be a mass-deployment application of the 5G network [3]; however, the transmission efficiency of VR video during those procedures in the 5G network is one of the biggest issues for the near future.
Optimization of VR video transmission is under active exploration, but previous work has mostly focused on the downlink procedure, and a common precondition of these studies is that VR videos are transcoded into multiple representations (bit rates) on the cloud server. Alface et al. [4] presented a typical tile-based adaptive downlink scheme that partitions the panorama into independent tiles, so that the content rate can be adapted to the desired viewport of each viewer. Recently, Corbillon et al. [5] presented another downlink adaptive streaming scheme that depends on the predicted viewport and the available bandwidth. As for the underlying technology, some researchers have begun to explore the potential of delivering VR video on the downlink over the cellular network (e.g., 5G) by applying multicast [6].
The 5G heterogeneous cloud radio access network (H-CRAN) has been shown to be a promising solution for VR video transmission in the near future. Heterogeneous networks with massive densification of small cells and cloud radio access networks (CRANs) are combined in one network structure to improve spectral efficiency, resource management, and energy efficiency [7]. However, spectrum utilization and resource management remain the major topics in H-CRAN [8]. Challenges and methodologies of resource allocation (RA) in 5G H-CRAN have been reviewed in [9]. Recently, joint remote radio head (RRH) association, sub-channel assignment and power allocation in multi-tier 5G H-CRANs has been studied in [10], with the aim of improving the system throughput. In addition, the energy efficiency of non-orthogonal multiple access (NOMA) in C-RAN is studied in [11,12]. However, these studies mainly focus on downlink RA and power allocation (PA), and they also fail to take into account the quality of content (QoC) [13] during RA.
This paper addresses the VR video uploading problem in the 5G network. Optimization for VR video uploading in the 5G network is both more challenging and in greater demand, for the following reasons. First, it is not viable for a user terminal to create multiple representations because of its limited on-board computing and storage resources. Second, the uplink wireless bandwidth is more limited than the downlink bandwidth in the 5G network. Third, content-sensing based source coding must be adopted because of the high data volume of 360-degree content. Fourth, the mobility of the user equipment (UE) calls for dynamic association with the corresponding RRH/g-NB. Last but not least, delay-sensitive VR video needs to be well scheduled in the uplink procedure.
Very few studies address VR video uploading over wireless networks. However, RA for regular video uplink streaming over wireless networks has been studied for different application types. Ragaleux et al. [14] presented an LTE uplink scheduling scheme for the heterogeneous QoS (Quality of Service) requirements of multimedia traffic, which was formulated as a joint time and frequency domain packet scheduling problem. El Essaili et al. [15] presented a systematic resource allocation and transmission optimization approach for the simultaneous streaming of user-generated video content. In particular, the paper distinguished between the centralized multi-user resource allocation problem and the distributed optimization of the video content at the mobile terminal. In [16], surveillance video uplink streaming over wireless networks was investigated; to solve the global video uplink streaming problem, the authors studied both the long-term bit-rate assignment for video encoding and the real-time packet scheduling in each OFDMA frame under the real-time constraint. Quality-of-Content (QoC) based joint source and channel coding in a mobile surveillance cloud was investigated in [13], with the aim of optimizing the wireless resource usage so that more accurate human detection can be performed at the cloud server based on the received videos. To the best of our knowledge, VR video uplink streaming in 5G has not been studied in previous work, especially delay-sensitive VR video uploading in the H-CRAN scenario.
Motivated by the aforementioned challenges for VR video uploading over wireless networks, this paper studies RA for delay-sensitive VR video uploading in 5G H-CRAN. Because of the limited on-board storage resources and the delay-sensitive nature of VR video, tile-based source coding [4] is adopted and only one bit-rate representation is generated for each video tile. Nevertheless, the source coding rate of each tile is bounded by the maximum transmission delay and the transmission rate of the UE. Inspired by the previous work on QoC [13], we propose a content-sensing RA scheme for VR video transmission over 5G H-CRAN, which aims at jointly optimizing VR video source coding and uplink RA. Instead of considering max-SINR [17] based RA in H-CRAN, which consists of RRH association, PA and sub-channel allocation (SA) [18], the proposed scheme optimizes the total QoC, which is defined as a weighted function of the tile source coding rates. We then formulate the problem as a mixed-integer nonlinear problem (MINLP) [19].
Note that the problem can hardly be solved in one shot by an optimization toolbox, because it consists of multiple stages in the VR video uploading process (i.e., RRH/g-NB association, sub-channel allocation and power allocation, and tile encoding rate assignment). A three-stage algorithm is therefore proposed in this paper to solve the problem efficiently. The total bandwidth is first allocated to the g-NB groups according to the UE density of each g-NB group, where frequency is reused at each RRH. Then RRH/g-NB association, sub-channel allocation and power allocation, and tile encoding rate assignment are jointly solved by decoupling the problem into two sub-problems. The first sub-problem is to allocate the optimal source coding rate to each tile under the constraints of the maximum transmission delay and the upper bound of the transmission rate; it is solved with an optimization toolbox after the second sub-problem has been solved to obtain the upper bound of the transmission rate. The second sub-problem is to find the optimal association, PA and SA that maximize the weighted sum-rate; a two-step iterative algorithm is proposed to solve it.
The remainder of the paper is organized as follows. The system model and the saliency of tile-based VR video are introduced in Section 2. The content-sensing based resource allocation scheme and the problem formulation are described in Section 3. Our proposed algorithm for the problem is presented in Section 4. In Section 5, we evaluate the proposed scheme and algorithm through extensive simulations. Finally, Section 6 concludes the paper.
3. Content-Sensing RA Scheme and Problem Formulation
3.1. Content-Sensing RA Scheme
The different density of UEs within each g-NB (i.e., the different resource requirement of each g-NB group) and the mobility of UEs lead to an unstable system topology, which calls for a centralized RA scheduler to allocate resources dynamically. Furthermore, tiles in the same VR video probably need different bit rates to achieve reasonable perceived quality, while the source coding rate of each tile is bounded by the maximum transmission delay and the transmission rate of the UE. Essentially, the content in different tile regions should be taken into consideration during RA.
Motivated by this, we propose a novel content-sensing RA scheme for VR video uploading in 5G H-CRAN, which is depicted in Figure 4. The RA scheduler integrated at the BBU pool dynamically allocates resources to each g-NB group according to the resource requirement of the UEs within the group, and associates the UEs with a g-NB or an RRH based on the reported channel quality. The target of RA and UE association is to maximize the total utility of VR video (i.e., QoC) under the constraints of resource, maximum transmission delay of a video chunk and maximum UE power.
Each g-NB group consists of a g-NB and the RRHs under the coverage of that g-NB, which serve the UEs within the coverage of the g-NB. Furthermore, all UEs report their channel quality to the centralized RA scheduler through the g-NB (i.e., the RA scheduler integrated with the BBU pool has full knowledge of the channel side information), and are associated with the corresponding RRH or g-NB according to the scheduling results.
On the UE side, once the VR video is generated, saliency detection is executed in every RA round by the tile weight detection module. The weight of each tile region is then quantized into the range of the 5G Quality of Service (QoS) class identifier (5QI), as expressed in (7). Finally, the weight is reported to the RA scheduler through the g-NB.
Meanwhile, the RA scheduler extracts the quantized weight and uses it for RA in our scheme instead of for ensuring the QoS of bearer traffic. Finally, the encoder performs tile-based encoding, with each tile encoded at the target bit rate given by the RA results. After multiplexing, modulation and coding, the encoded video signal is transmitted through the corresponding RRH or g-NB to the BBU pool for baseband signal processing and upper-layer functions.
Note that the RA scheduler integrated at the BBU pool plays a key role in the proposed scheme. G-NB group RA, RRH/g-NB association, SA and PA, and tile encoding rate assignment are all determined centrally by the scheduling results. In order to obtain an optimal solution for the RA scheduler, the mathematical problem formulation and a three-stage algorithm for solving the problem are described in the following parts.
3.2. Problem Formulation
According to the aforementioned analysis, the objective is to maximize the total QoC for VR uploading under the constraints of resource, transmission delay of a video chunk and UE power, while the source coding rate of each tile is bounded by the maximum transmission delay and the transmission rate of the UE. Therefore, the transmission rate and transmission delay of the UE are investigated in the following.
The signal to interference and noise ratio (SINR) for the i-th UE at the z-th video chunk associated with the m-th RRH on the n-th sub-channel is calculated as follows [24].
where
The achievable data rate of the i-th UE at the z-th video chunk when associated with the m-th RRH/g-NB can be written as follows
where the binary sub-channel indicator equals 1 if the n-th sub-channel on the m-th RRH/g-NB is assigned to the i-th UE at the z-th video chunk, and 0 otherwise. Note that a UE can be associated with only one RRH or g-NB at a given time, and the association can change as the UE moves. However, to keep the system model simple, the association is assumed to remain unchanged during the transmission of each video chunk.
The transmission delay of the z-th video chunk of the i-th UE can be calculated as
where the chunk size is that of the z-th video chunk of the i-th UE, given by (12), and T is the length of a video chunk, which is fixed in this paper. The transmission rate of the i-th UE at the z-th video chunk is expressed as Equation (13), in which a binary association indicator is introduced that equals 1 if the i-th UE at the z-th video chunk is served by the m-th RRH/g-NB, and 0 otherwise.
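As a rough illustration of how the quantities above relate, the following sketch assumes that the achievable rate on each assigned sub-channel follows the Shannon capacity formula, that the chunk size is the chunk length T multiplied by the sum of the tile source coding rates, and that the transmission delay is the chunk size divided by the transmission rate. These forms and all names in the code (`subchannel_rate`, `transmission_rate`, `chunk_size`, `chunk_delay`, `bandwidth_hz`) are assumptions made for illustration only; the exact expressions are those referred to as (12) and (13).

```python
import math


def subchannel_rate(bandwidth_hz, sinr):
    """Rate (bit/s) of one sub-channel; Shannon form is an assumption here."""
    return bandwidth_hz * math.log2(1.0 + sinr)


def transmission_rate(bandwidth_hz, sinrs, assigned):
    """Sum the rates over sub-channels whose binary indicator is 1."""
    return sum(a * subchannel_rate(bandwidth_hz, s)
               for a, s in zip(assigned, sinrs))


def chunk_size(tile_rates_bps, chunk_len_s):
    """Chunk size in bits, assumed to be T times the sum of tile coding rates."""
    return chunk_len_s * sum(tile_rates_bps)


def chunk_delay(tile_rates_bps, chunk_len_s, rate_bps):
    """Transmission delay in seconds, assumed to be chunk size over rate."""
    return chunk_size(tile_rates_bps, chunk_len_s) / rate_bps


# Hypothetical numbers: three sub-channels of 180 kHz, two of them assigned.
rate = transmission_rate(180e3, [3.0, 1.0, 7.0], [1, 0, 1])
delay = chunk_delay([200e3, 250e3], 1.0, rate)
```

Under these assumptions, a delay bound on each chunk translates directly into a bound on the sum of the tile coding rates, which is why the tile rates are limited by both the maximum delay and the UE transmission rate.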
The transmission power of the i-th UE at the z-th video chunk can be defined as
Thus, the RA problem with the objective of maximizing the total VR video utility in the uplink of 5G H-CRAN, subject to the total resource constraint, the maximum transmission delay of each video chunk and the UE power constraint, can be formulated as follows:
Constraint C1 requires that the target source coding rate of each tile be selected from the set of pre-defined tile encoding rates. C2 indicates that the transmission delay of each video chunk should be less than the maximum delay, owing to the limited on-board storage and the timeliness of VR video. C3 represents the maximum per-UE transmit power constraint. Constraint C4 indicates that a UE can be served by a single RRH or the g-NB. C5 ensures that the n-th sub-channel is allocated to only one UE when the UE is associated with an RRH or g-NB. C5 and C6 restrict the sub-channels allocated to UEs to the set of sub-channels assigned to the corresponding g-NB group. The resource constraints C7 and C8 indicate that the sub-channel sets assigned to the g-NB groups are disjoint and that the sum of these sets stays within the total resource constraint, respectively.
4. Algorithm for Problem Solution
Note that OPT-1 is a mixed-integer nonlinear problem (MINLP) [19]. Moreover, the optimization consists of g-NB resource allocation, RRH/g-NB association, SA and PA, and tile encoding rate assignment. Furthermore, even each separate sub-problem needs a sophisticated algorithm to reach the optimal solution [18]. Therefore, the problem can hardly be solved by existing methods and optimization toolboxes.
In this paper, we propose a three-stage optimization algorithm to solve the problem. The flowchart of the proposed algorithm is shown in Figure 5. Specifically, in stage 1, the total bandwidth is first allocated to the g-NB groups according to the UE density of each g-NB group. Then RRH/g-NB association, SA and PA, and tile encoding rate assignment are jointly solved by decoupling the problem into two sub-problems. The first sub-problem is to allocate the optimal source coding rate to each tile under the constraints of the maximum transmission delay and the upper bound of the transmission rate; it is solved with an optimization toolbox after the second sub-problem has been solved to obtain the upper bound of the transmission rate (shown as stage 3 in the flowchart). The second sub-problem is to find the optimal association, PA and SA that maximize the weighted sum-rate; a two-step iterative algorithm is proposed to solve it (shown as stage 2 in the flowchart). The details of each stage of the proposed algorithm are described in the following subsections.
4.1. G-NB Group Resource Allocation
Note that the total resource bandwidth is allocated to the g-NB groups without overlap in order to mitigate inter-macro-cell (i.e., inter-g-NB) interference and to protect control information signals. However, the sub-channels assigned to a g-NB group can be reused not only by the g-NB but also by the RRHs within the coverage of that g-NB. Our strategy is to allocate resources dynamically according to the requirement. Essentially, the g-NB group resource allocation should be determined by the number of RRHs and the UE transmission requirement in each g-NB group. For simplicity, we assume the same density of RRHs within each g-NB (i.e., the same number of RRHs in each g-NB group). Based on this assumption, the g-NB group RA is based on the density of UEs within the corresponding group, which can be expressed as
where the two quantities denote the number of UEs within the g-th g-NB group and the total number of UEs in the system, respectively.
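A minimal sketch of density-based group allocation is given below, assuming that each g-NB group receives a share of the sub-channels proportional to its share of the UEs. The largest-remainder rounding, the contiguous index ranges and the function name `allocate_subchannels` are assumptions introduced for illustration; the source only states that the allocation follows the UE density and that the assigned sets do not overlap.

```python
def allocate_subchannels(total_subchannels, ues_per_group):
    """Split sub-channel indices among g-NB groups in proportion to UE counts.

    Returns one disjoint list of sub-channel indices per group.
    Rounding uses the largest-remainder rule (an assumption of this sketch).
    """
    total_ues = sum(ues_per_group)
    if total_ues == 0:
        raise ValueError("at least one UE is required")

    # Ideal (possibly fractional) share of each group.
    shares = [total_subchannels * u / total_ues for u in ues_per_group]
    counts = [int(s) for s in shares]

    # Hand the sub-channels lost to rounding to the largest fractional parts.
    leftover = total_subchannels - sum(counts)
    order = sorted(range(len(shares)),
                   key=lambda g: shares[g] - counts[g], reverse=True)
    for g in order[:leftover]:
        counts[g] += 1

    # Build non-overlapping contiguous index sets.
    sets, start = [], 0
    for c in counts:
        sets.append(list(range(start, start + c)))
        start += c
    return sets


# Hypothetical example: 10 sub-channels, three groups with 3, 1 and 1 UEs.
group_sets = allocate_subchannels(10, [3, 1, 1])
```

Because the sets are built from consecutive index ranges, they are disjoint by construction and together never exceed the total number of sub-channels.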
Once the g-NB group resource allocation is completed, the OPT-1 problem can be rewritten as
OPT-2 can be described as a joint RRH/g-NB association, SA, PA and source coding optimization problem. Notice that the source coding rate of each tile is determined by the tile weight (i.e., saliency) and the upper bound of the transmission rate of the UE. When the transmission delay of each video chunk is fixed (i.e., equal to the maximum value), the objective function can attain its optimal value only if each UE reaches its maximum transmission rate. Consequently, the OPT-2 problem can be rewritten as
where
represents the upper bound of the transmission rate of
i-th UE at
z-th chunk.
Note that OPT-3 is a convex problem if the upper bound of the transmission rate is known. In addition, OPT-3 attains the optimal solution only if the transmission rates of all UEs reach their maximum. Considering the weighted sum-rate fairness of the system [18], the upper bound of the transmission rate of each UE can be obtained by solving the OPT-4 problem. Finally, the joint RRH/g-NB association, sub-channel allocation and power allocation, and tile encoding rate assignment optimization problem is decoupled into two sub-problems (i.e., OPT-3 and OPT-4).
where the weight denotes the total tile weight of the z-th chunk of UE i. Note that OPT-4 is a joint RRH/g-NB association, SA and PA problem, and OPT-3 is a tile encoding rate assignment problem under the constraint of the upper-bound transmission rate. Stage 2 and Stage 3 of the proposed algorithm solve OPT-4 and OPT-3, respectively; the details are explained in the following parts.
4.2. Sub-Channel Allocation and Power Allocation
OPT-4 can be described as a weighted sum-rate maximization problem, which, even for a fixed association, has been proven to be NP-hard in [26,27]. Furthermore, OPT-4 is also an MINLP. Even for a given RRH/g-NB association and PA, finding the optimal SA among the UEs alone is difficult because of the large search space of the optimization. Consequently, exhaustive search is not practical for solving OPT-4. Instead, a two-step iterative algorithm is proposed in Stage 2, which jointly optimizes the RRH/g-NB association, SA and PA. The basic idea is that SA and PA are performed with a fixed association in the first step, while the association is updated in the second step. The details are discussed in the following subsections.
4.2.1. Sub-Channel Allocation
For an initial RRH/g-NB association and an initial PA, the SA problem can be rewritten as
For initialization, we employ path-loss based association and uniform PA. More specifically, a UE is initially associated with an RRH selected according to path loss, provided that the distance between UE i and RRH m does not exceed the predefined maximum distance between a UE and an RRH; otherwise, the UE is associated with the g-NB of the group to which it belongs. The uniform PA can be expressed as follows
Thus, for the fixed RRH/g-NB association and PA in each g-NB group, the achievable rate of each UE on each sub-channel is calculated iteratively, and in each iteration the sub-channel is assigned to the UE that has the highest weighted achievable rate on that sub-channel. Finally, all the sub-channels are assigned to UEs according to (19), where t indexes the iteration that produces each sub-channel assignment result.
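The greedy rule described above can be sketched as follows. For brevity the sketch treats a single RRH/g-NB with a precomputed rate table and does not recompute interference between iterations, which is a simplification of the procedure in the text; the function name `greedy_subchannel_allocation` and the data layout are hypothetical.

```python
def greedy_subchannel_allocation(weights, rates):
    """Assign each sub-channel to the UE with the highest weighted rate.

    weights[i]   : weight of UE i (e.g., its total tile weight)
    rates[i][n]  : achievable rate of UE i on sub-channel n under the
                   fixed association and power allocation
    Returns a dict mapping sub-channel index -> UE index.
    """
    num_ues = len(weights)
    num_subchannels = len(rates[0])
    assignment = {}
    for n in range(num_subchannels):          # one sub-channel per iteration
        best_ue = max(range(num_ues),
                      key=lambda i: weights[i] * rates[i][n])
        assignment[n] = best_ue               # exactly one UE per sub-channel
    return assignment


# Hypothetical example with two UEs and three sub-channels.
sa = greedy_subchannel_allocation(
    weights=[1.0, 2.0],
    rates=[[10.0, 4.0, 6.0],
           [3.0, 5.0, 2.0]],
)
```

In this example the second UE wins the middle sub-channel even though its raw rate there is only slightly higher, because its weight doubles the weighted rate.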
4.2.2. Power Allocation
In this step, PA is performed more precisely after the SA. Note that the initial PA is uniform. However, once the sub-channels assigned to each UE under the initial association are known, the power can be reallocated across these sub-channels. Equal Power Distribution (EPD) and the Interior Point Algorithm (IPA) are two common methods of PA. The IPA is the more attractive method because of its fast convergence and easy handling of inequality constraints [10]. In addition, IPA provides the optimal PA.
The IPA involves four phases to obtain the optimality conditions. First, the inequality constraints are transformed into equality constraints by adding slack variables. Second, the non-negativity conditions are handled implicitly by adding them to the objective function as logarithmic barrier terms. Third, the optimization problem with equality constraints is transformed into an unconstrained optimization problem. Fourth, the perturbed Karush-Kuhn-Tucker (KKT) first-order optimality conditions are solved through the Newton method [19].
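The four phases can be written out for a generic inequality-constrained problem. This is the textbook log-barrier construction rather than the specific PA problem of this paper, and the symbols f, g_j, s_j, \lambda_j and \mu are introduced here only for illustration.

```latex
% Generic problem (illustrative, not the PA problem itself)
\min_{x} \; f(x) \quad \text{s.t.} \quad g_j(x) \le 0, \qquad j = 1,\dots,J

% Phase 1: slack variables turn inequalities into equalities
g_j(x) + s_j = 0, \qquad s_j \ge 0

% Phase 2: non-negativity of s_j is absorbed by logarithmic barrier terms
\min_{x,s} \; f(x) - \mu \sum_{j=1}^{J} \ln s_j
\quad \text{s.t.} \quad g_j(x) + s_j = 0, \qquad \mu > 0

% Phase 3: Lagrangian of the equality-constrained problem (unconstrained form)
L_\mu(x, s, \lambda) = f(x) - \mu \sum_{j=1}^{J} \ln s_j
  + \sum_{j=1}^{J} \lambda_j \bigl( g_j(x) + s_j \bigr)

% Phase 4: perturbed KKT conditions, solved by Newton's method as \mu \to 0
\nabla f(x) + \sum_{j=1}^{J} \lambda_j \nabla g_j(x) = 0, \qquad
\lambda_j s_j = \mu, \qquad
g_j(x) + s_j = 0
```

The condition \lambda_j s_j = \mu is the perturbed version of complementary slackness; it follows from setting the derivative of L_\mu with respect to s_j to zero.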
EPD, in contrast, distributes the maximum transmission power of a UE equally among the sub-channels assigned to that UE. With the maximum transmission power of UE i given, the power can be allocated as follows
Note that EPD is an alternative method that is simpler and has lower complexity than the IPA. The PA solution is updated after this step. In addition, the performance of the two PA methods is compared in the simulation part.
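EPD as described above amounts to dividing the maximum UE power by the number of sub-channels assigned to that UE. The following short sketch illustrates this; the function name `equal_power_distribution` and the dictionary layout are hypothetical.

```python
def equal_power_distribution(p_max, assigned_subchannels):
    """Split the maximum UE transmit power equally over its sub-channels.

    p_max                : maximum transmission power of the UE (watts)
    assigned_subchannels : indices of the sub-channels assigned to the UE
    Returns a dict mapping sub-channel index -> allocated power.
    """
    if not assigned_subchannels:
        return {}                      # nothing assigned, nothing to allocate
    share = p_max / len(assigned_subchannels)
    return {n: share for n in assigned_subchannels}


# Hypothetical example: 0.2 W spread over sub-channels 1 and 4.
pa = equal_power_distribution(0.2, [1, 4])
```

By construction the allocated powers sum to the maximum UE power, so the per-UE power constraint holds with equality.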
4.2.3. RRH/g-NB Association
Since the initial association is based on path loss, it may not be the optimal association. RRH/g-NB association optimization is therefore performed in this step. After SA and PA, we optimize the RRH/g-NB association with the resulting SA and PA fixed, which can be formulated as
Because sub-channels are reused at each RRH within the corresponding g-NB group, RRH/g-NB association can be performed with the SA and PA results held fixed to reach the optimal association. Note that OPT-6 can be solved by setting all association indicators of a UE to zero except the one for the RRH/g-NB that achieves the best weighted achievable rate of the UE, which is given in (21). The problem can be solved step by step by setting this indicator to 1 until all the UEs are associated with an RRH/g-NB.
It should be pointed out that the method in [10] solves the problem by relaxing the binary constraint to a continuous one. This may solve the problem, but it conflicts with the practical requirement that one UE is associated with only one RRH/g-NB, and it may also lead to sub-optimal solutions. In our proposed method, we only need to calculate the weighted achievable rate of the UE across the g-NB group with the assigned sub-channels and power, and then choose the best association based on the weighted achievable rate. The details are illustrated with a simple example in Figure 6. More specifically, the solid arrow lines indicate the initial association and sub-channel allocation. UE 3 is initially associated with RRH 1 and sub-channel 4 is assigned to it. However, the weighted achievable rate of UE 3 on sub-channel 4 of RRH 2 is much higher than under the initial association; hence, the association of UE 3 is changed from RRH 1 to RRH 2 according to our strategy.
Convergence of stage 2 is judged by comparing the association of the previous iteration with the current association. If the association is unchanged for all UEs, all UEs are associated with their optimal RRH/g-NB. Otherwise, the algorithm performs SA and PA again with the current RRH/g-NB association.
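The control flow of the two-step iteration and its convergence test can be sketched as below. The SA/PA step and the association step are passed in as callables because their exact form depends on the equations of this section; the names `two_step_iteration`, `allocate`, `reassociate` and the iteration cap `max_iters` are assumptions of this sketch. The toy usage mirrors the UE 3 example, with made-up rate values.

```python
def two_step_iteration(initial_assoc, allocate, reassociate, max_iters=50):
    """Alternate SA/PA (fixed association) with association updates.

    initial_assoc : dict UE -> RRH/g-NB index (e.g., path-loss based)
    allocate      : callable(assoc) -> (sa, pa), step 1
    reassociate   : callable(assoc, sa, pa) -> new assoc, step 2
    Stops when the association is unchanged for all UEs.
    """
    assoc = dict(initial_assoc)
    sa, pa = allocate(assoc)                     # SA and PA for initial association
    for _ in range(max_iters):
        new_assoc = reassociate(assoc, sa, pa)   # best weighted rate per UE
        if new_assoc == assoc:                   # unchanged: converged
            break
        assoc = new_assoc
        sa, pa = allocate(assoc)                 # redo SA and PA
    return assoc, sa, pa


# Toy usage (hypothetical values): UE 3 starts on RRH 1 but sees a higher
# weighted achievable rate on RRH 2.
weighted_rate = {3: {1: 2.0, 2: 5.0}}


def toy_allocate(assoc):
    # Every UE keeps sub-channel 4 with a fixed power in this toy example.
    return {ue: [4] for ue in assoc}, {ue: {4: 0.1} for ue in assoc}


def toy_reassociate(assoc, sa, pa):
    # Pick, for each UE, the RRH/g-NB with the highest weighted rate.
    return {ue: max(r, key=r.get) for ue, r in weighted_rate.items()}


final_assoc, final_sa, final_pa = two_step_iteration(
    {3: 1}, toy_allocate, toy_reassociate)
```

The iteration cap is a safeguard added in this sketch; the text itself only specifies the stopping rule based on an unchanged association.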
4.3. Tile Encoding Rate Assignment
The upper-bound transmission rate of each UE can be obtained once association and sub-channel allocation are completed. For a given upper-bound transmission rate of the UE, the sub-problem OPT-3 is to assign the optimal encoding rate to each tile of the z-th chunk of the UE. Note that OPT-3 is a convex problem after relaxing the discrete tile encoding rates to continuous values. The problem can then be solved with a convex optimization toolbox (e.g., CVX); a similar problem has been studied in our previous work.