Multi-Agent Deep Reinforcement Learning Framework Strategized by Unmanned Aerial Vehicles for Multi-Vessel Full Communication Connection
Round 1
Reviewer 1 Report
The authors of this paper propose a multi-agent deep reinforcement learning framework strategized by unmanned aerial vehicles (UAVs). The UAVs evaluate and navigate the cooperation and position adjustment of multiple unmanned surface vehicles (USVs) to establish full communication connection (FCC). While ensuring FCC, the authors aim to improve IoV performance by maximizing the USVs' communication range and movement fairness while minimizing their energy consumption, an objective that cannot be explicitly expressed in closed form.
I think the article merits publication, but the authors should address the following major points:
1- In the abstract, the authors do not discuss the simulation results in detail. I recommend summarizing the key results more clearly in the abstract.
2- In particular, I recommend that the authors add a comparative table to make the comparison between their work and the existing literature clearer.
3- The system model is poorly explained; please improve that part of the paper.
4- Please justify the choice of the propagation model.
5- Equation (15) is not understandable; please clarify its derivation.
6- In the design of the action, the action variables do not correspond to the optimization variables of the formulated problem, and no explanation is given.
7- In the design of the reward, it is mentioned that the destination is reached, but the relevant constraints are not given (a minimal illustrative sketch follows this list).
8- A comparison with other published results is necessary. In addition, some of the results are misinterpreted; please revise their discussion.
9- Please study the computational complexity of the algorithm and discuss the limitations of the article.
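To make comments 6 and 7 concrete, the following is a minimal sketch of a reward that ties the stated objectives (communication range, movement fairness, energy) to explicit FCC and destination constraints. All names, weights, and signals (w_cov, fcc_satisfied, dest_reached, etc.) are hypothetical illustrations, not the authors' actual design.

```python
import numpy as np

def jain_fairness(values):
    """Jain's fairness index in [1/n, 1]; 1 means perfectly equal shares."""
    values = np.asarray(values, dtype=float)
    return values.sum() ** 2 / (len(values) * (values ** 2).sum() + 1e-12)

def reward(coverage, move_dists, energy, fcc_satisfied, dest_reached,
           w_cov=1.0, w_fair=0.5, w_energy=0.2, penalty=10.0, bonus=10.0):
    """Hypothetical per-step reward for one USV agent.

    coverage      : communication range achieved this step (to maximize)
    move_dists    : per-agent movement distances (fairness term)
    energy        : energy consumed this step (to minimize)
    fcc_satisfied : True iff the full-communication-connection constraint holds
    dest_reached  : True iff the agent has reached its destination
    """
    r = w_cov * coverage + w_fair * jain_fairness(move_dists) - w_energy * energy
    if not fcc_satisfied:   # hard constraint handled as a soft penalty
        r -= penalty
    if dest_reached:        # terminal bonus tied to the stated destination condition
        r += bonus
    return r
```

Stating the constraint terms explicitly in this form would resolve the ambiguity raised in comment 7.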
Please also cite the following papers:
M. A. Ouamri, G. Barb, D. Singh, A. B. M. Adam, M. S. A. Muthanna and X. Li, "Nonlinear Energy-Harvesting for D2D Networks Underlaying UAV With SWIPT Using MADQN," IEEE Communications Letters, vol. 27, no. 7, pp. 1804–1808, July 2023.
M. A. Ouamri, R. Alkanhel, D. Singh, E. M. El-kenaway and S. S. M. Ghoneim, "Double Deep Q-Network Method for Energy Efficiency and Throughput in a UAV-Assisted Terrestrial Network," Computer Systems Science and Engineering, vol. 46, no. 1, pp. 73–92, 2023.
Moderate English editing required.
Author Response
The response to the comments is in the attachment. Thank you.
Author Response File: Author Response.pdf
Reviewer 2 Report
This paper proposes the UST-MADRL framework, which enables UAVs to efficiently navigate the movement of USVs to establish multi-USV FCC, building on MADDPG (multi-agent deep deterministic policy gradient).
A major revision is recommended.
1. There are several MADRL algorithms available, so why was MADDPG chosen?
2. In a MADDPG structure, each agent needs to know the global state during training. Do the authors consider the required information exchange in the training process? (A sketch of this structure follows this list.)
3. For DRL, the reviewer suggests elaborating the training process and the testing process separately. Offline training is preferred, and how to design the training dataset and the loss function to guarantee generalization capability is the key issue.
4. Do the training process and the testing process share the same data sets?
5. In this model, does the action of each agent have an impact on the other agents' states?
6. In Section 3, it should be made clear whether the purpose of this paper is to maximize or minimize; please state the problem explicitly in the standard form "max ... s.t. ...".
7. Please elaborate on the rationale for the choice of parameters in this paper, or introduce supporting references.
8. It is suggested to add other RL algorithms as baselines, e.g., centralized DDPG and MADQN.
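To make comments 1–3 concrete: MADDPG follows centralized training with decentralized execution, so only the critics consume the global state, and only during training. Below is a minimal PyTorch sketch of that structure; the class names, dimensions, and batch keys (CentralCritic, LocalActor, obs_next, etc.) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Centralized critic: during training it sees the joint (global) state and
# the actions of ALL agents -- the information exchange raised in comment 2.
class CentralCritic(nn.Module):
    def __init__(self, joint_state_dim, joint_action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_state_dim + joint_action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, joint_state, joint_action):
        return self.net(torch.cat([joint_state, joint_action], dim=-1))

# Decentralized actor: at execution time each agent acts on its LOCAL
# observation only, so no global information is needed after deployment.
class LocalActor(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh())  # actions in [-1, 1]

    def forward(self, obs):
        return self.net(obs)

def critic_loss(critic, target_critic, target_actors, batch, gamma=0.99):
    """One TD-style critic update for a single agent (sketch).

    batch carries joint states "s", joint actions "a", rewards "r", a list
    of next per-agent observations "obs_next", and next joint state "s_next".
    """
    with torch.no_grad():
        # Target actors select next actions from their own observations.
        a_next = torch.cat([pi(o) for pi, o in zip(target_actors,
                                                   batch["obs_next"])], dim=-1)
        y = batch["r"] + gamma * target_critic(batch["s_next"], a_next)
    return nn.functional.mse_loss(critic(batch["s"], batch["a"]), y)
```

At deployment, each USV would query only its own LocalActor with its local observation, which is the distinction comment 2 asks the authors to clarify for the training phase.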
Author Response
The response to comments is in the attachment. Thank you so much.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The paper has been improved.
The English can still be improved.
Reviewer 2 Report
The authors have answered all my concerns. I recommend accepting it in its current form.