Intelligent Reflecting Surface-Aided Device-to-Device Communication: A Deep Reinforcement Learning Approach
Round 1
Reviewer 1 Report
This paper’s main goal is to improve the network’s spectrum efficiency (SE) by jointly optimizing the transmission power of both the cellular network users and the device-to-device (D2D) communication users. To do so, the authors also rely on assistance from an intelligent reflecting surface (IRS). Specifically, they first formulate an optimization problem to maximize the overall SE and prove that it is an NP problem. They then reformulate it as a reinforcement learning problem and solve it using Q-learning, deep Q-learning, and actor-critic based solutions, respectively. The simulation results show that the proposed framework outperforms other approaches in spectrum efficiency, with the actor-critic based solution performing best. The paper is well structured overall. My comments and questions are listed below:
The main idea of the approach is the use of the IRS. Is the IRS always available to, and reachable by, all users? How much extra investment would it need? It would also be preferable to explain the basic working mechanism of the IRS: for example, how it can assist cellular and D2D communication, how it can reduce co-channel interference, and how the number of reflecting elements affects performance.
On page 3, in the first contribution, SINR should be expanded as “signal-to-interference-plus-noise ratio”.
According to the paper, there are two types of users: cellular users (CUs) and D2D pairs. Is there any overlap between CUs and D2D pairs? Are they the same group of users? It is common for users of the cellular network to also use D2D communication. What happens if these two groups of users overlap?
Do all CUs and D2D pairs have access to the IRS? Some D2D-only pairs may perform only short-range communication and cannot reach the IRS. Also, in Figure 1, only a couple of users can access it, in which case the assistance from the IRS is very limited. A further question: if only a limited number of users can access the IRS, who has priority?
Does the transmission range affect the formulation in (1)-(3)? D2D pairs have only short-range communication. Two pairs may not compete or interfere with each other if they are far apart.
In the reinforcement learning formulation, why can only the D2D users take actions, and not the cellular users? Also, the action space is not made clear enough by listing (6) alone. If only D2D users take actions, do all D2D users act simultaneously? This might be hard, since they may not have a central communication link with the controller (D2D pairs rely only on short-range communication). What if they cannot reach the controller?
The simulation settings are not clear enough. Which simulation tool is used? How big is the cell area? Where are the D2D pairs and the IRS placed? As the number of pairs increases, how far apart or how close together they are matters. If the D2D pairs are far apart, there might be no interference at all. This needs to be further justified.
Last, the English writing can be further improved. Some more professional proofreading (even with online tools) is recommended.
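The transmission-range comments above (distance between D2D pairs and the resulting interference) can be illustrated with a minimal sketch. It assumes a distance-only path-loss model with no fading and no IRS; the function names, path-loss exponent, transmit powers, distances, and noise power are all hypothetical values chosen for illustration and are not taken from the manuscript.

```python
import math

def path_gain(distance_m, exponent=3.5):
    """Simplified distance-only path gain d^(-exponent); assumed model, no fading."""
    return distance_m ** (-exponent)

def spectral_efficiency(p_signal_w, d_signal_m, interferers, noise_w=1e-13):
    """Shannon spectral efficiency log2(1 + SINR) in bit/s/Hz at one receiver.

    interferers: list of (tx_power_w, distance_to_receiver_m) tuples.
    """
    signal = p_signal_w * path_gain(d_signal_m)
    interference = sum(p * path_gain(d) for p, d in interferers)
    sinr = signal / (interference + noise_w)
    return math.log2(1.0 + sinr)

# Hypothetical D2D link: 0.1 W transmitter, 10 m from its receiver.
se_near = spectral_efficiency(0.1, 10.0, [(0.1, 20.0)])    # interferer 20 m away
se_far = spectral_efficiency(0.1, 10.0, [(0.1, 2000.0)])   # interferer 2 km away
se_alone = spectral_efficiency(0.1, 10.0, [])              # no interferer at all

print(f"near: {se_near:.2f}, far: {se_far:.2f}, alone: {se_alone:.2f} bit/s/Hz")
```

Under these assumptions, the spectral efficiency with a distant interferer approaches the interference-free value, which is why the placement of the D2D pairs matters for the formulation and the simulation results.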
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
1. It is not clear what problem in the application domain the IRS (intelligent reflecting surface)-aided D2D communication technique proposed and analyzed by the authors is meant to solve.
2. The authors should present the differences from the recent techniques related to 5G and 6G.
3. The authors should compare and explain how IRS has been applied in existing 5G communication applications, and what advantages and problems there are. In addition, the authors need to present a methodology for how the IRS technique solves the problem of stream interference between upstream and downstream traffic between devices (e.g., CUs and gNBs, DUEs and DUEs) in 5G and 6G application environments.
4. The authors should analyze characteristics such as transmission characteristics, spectrum, and energy power of 5G and 6G from the IRS point of view.
5. In 5G communication, IRS aided D2D communication may be affected by meta-interference from adjacent devices. A solution for how to solve such meta-interference should be presented. See Figure 2.
6. Many of the proposed formulas and components do not differ from those of 5G. What is the novelty of the authors' own paper?
7. Please check whether the notation of (0,1] on pages 7 and 8 is correct.
8. In Figure 2, the authors should present a control method for handling flow traffic errors that occur at the BS, gi, the D2D pairs, and other links.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 3 Report
1. I wonder how the "discount factor" is computed.
2. Line 210: A pair of every D2D transmitter and receiver.
3. The rest of the paper is well written and organized.
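For context on comment 1: in the standard reinforcement learning formulation, the discount factor is a hyperparameter chosen by the designer rather than a computed quantity. The expressions below use generic textbook symbols (return G_t, reward r, learning rate \alpha, discount factor \gamma), which are assumptions here and may differ from the manuscript's notation.

```latex
% Discounted return: \gamma weights rewards that lie k steps in the future.
\[
  G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma < 1 .
\]
% Tabular Q-learning update, in which \gamma discounts the bootstrapped next-state value.
\[
  Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \Bigl[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Bigr].
\]
```

A value of \gamma close to 0 makes the agent favor immediate reward, while a value close to 1 gives long-term reward nearly equal weight.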
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 4 Report
The authors proposed an RL-based resource allocation approach for an IRS-aided underlay D2D cellular network. Their idea is very interesting and the simulation results are promising. The reviewer just suggests some corrections of minor typos as follows:
- is considered as ---> is considered: wherever the sentence is found
- for a IRS-aided ---> for an IRS-aided: at the 108th line
- interference-plus-noise ratio (SINR) ---> signal-to-interference-plus-noise ratio (SINR): at the 113th line
- without RIS ---> without IRS: at the 130th line
- scheme-1 (without IRS) ---> scheme-2 (without IRS): at the 360th line
- Scheme-1 (without IRS) ---> scheme-2 (without IRS): at the 369th line
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Thanks for the response letter. Some questions are addressed, but unfortunately many questions are left unaddressed.
For the first comment, the questions on the IRS are not all answered. Also, it is recommended to add the explanation of the IRS to the paper, not just to the response report.
For the fourth comment and response, if the IRS is in just one location, how can it be ensured that all D2D users have access, given that the area can be huge? If there are multiple locations, how are those locations selected? Also, what if D2D users compete for the use of the IRS when priority is not considered?
Response 5 is also confusing. If the range is short, then the interference also exists only at short range and is therefore limited. This will certainly affect the formulation. How can this be addressed? The response did not give an answer.
Comment 6 is not addressed either. It is not explained why only D2D users are controllers or can take actions. Action space (6) is not further explained either. Do D2D users take actions simultaneously? This might be hard, since they are short range.
For comment 7, the simulation tools are still not listed. What does it mean to say that the IRS is located between two D2D pairs? Between four users? How many IRS components are needed then? Since D2D users are mobile and dynamic, do the IRS locations also change?
Overall, no major revisions have been made to the manuscript based on the original comments. I strongly suggest that the authors take some time to address the comments carefully.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
This paper can be published in your society.
Author Response
We have addressed the issue mentioned during revision.
Thank you for your comment and suggestion for publication.
Round 3
Reviewer 1 Report
I have no further comments.