Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper addresses a current and interesting topic. The main issues are the lack of clear objectives and of sound contributions.
The references are poor: some cited journals are difficult to locate, and others do not point to the updated citation.
The paper also shows great overlap with the article "Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments" (reference [19]).
The improvements of this paper over that article need to be clearly stated to justify the significance of your contribution.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Traditional UAV trajectory planning methods have focused on solving planning problems in static scenes, have struggled to balance optimality and real-time performance, and have been prone to local optima. In this paper, the authors propose an improved deep reinforcement learning approach for UAV trajectory planning in dynamic scenarios. First, the authors create a problem scenario including an obstacle estimation model and formulate the UAV path planning problem as a Markov Decision Process. They transfer the MDP model to a reinforcement learning framework, design the state space, action space, and reward function, and incorporate heuristic rules into the action search strategy. Second, the authors approximate the Q-function with an extended D3QN using a prioritized experience replay mechanism, and design the network structure of the algorithm based on the TensorFlow framework. Through intensive training, they obtain reinforcement-learning-based path planning policies for static and dynamic scenes, and they use a visualized action field to analyze the planning performance. Simulations show that the proposed algorithm can solve UAV path planning problems in dynamic scenes and outperforms classical methods such as A*, RRT, and DQN in terms of planning efficiency.
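For context on the summary above: D3QN combines a dueling network head (Q decomposed into a state value and action advantages) with the double-DQN target (the online network selects the next action, the target network evaluates it). The following is an illustrative sketch of those two mechanics only, in plain Python with hypothetical example values; it is not the authors' TensorFlow implementation.

```python
# Illustrative sketch of the two ideas behind D3QN (dueling + double DQN).
# All function names and numeric values here are hypothetical examples,
# not the authors' code.

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).

    Subtracting the mean advantage keeps the decomposition identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double-DQN target: the online net picks argmax_a, the target net
    evaluates that action, which reduces overestimation bias."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]

# Example: three actions, value stream V(s)=1.0, advantage stream A(s,.)
q_values = dueling_q(1.0, [0.0, 1.0, 2.0])
# One-step target with gamma=0.9; online net prefers action 1,
# target net supplies that action's value.
target = double_dqn_target(1.0, 0.9, [0.2, 0.8], [0.5, 0.4], done=False)
```

Prioritized experience replay, also mentioned in the summary, changes only which transitions are sampled for this update (proportionally to their temporal-difference error), not the target computation itself.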
My comments:
1) Authors should more clearly emphasize the main points of novelty that distinguish their work from closely related papers on the topic.
2) Punctuation looks sloppy. For example, in Lines 129, 137, and elsewhere, the word "where" should start with a lowercase letter.
3) In general, it is unclear how the emphasis on UAVs relates to the rather simple model used by the authors. What UAV-specific characteristics are reflected in the model?
4) Line 184: what are the 4 conditions in question?
5) Formulas (8) and (9): what are $w$ and $w'$? They are not defined. Also, $\gamma$ is not defined.
6) Formula (12): what is $\epsilon_0$?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
My comments have been addressed.