Deep Reinforcement Learning with Corrective Feedback for Autonomous UAV Landing on a Mobile Platform
Round 1
Reviewer 1 Report
Good work overall. I have some comments:
1. In Fig. 5, the axis for % success is missing.
2. The trajectories provided in the results assume there is no restriction on the final approach/path taken to reach the target. In the case of moving targets like trucks or ships, the trajectory will have to be from a particular direction (e.g. a helicopter landing on a ship has to approach from the back and cannot overshoot its target). How will such constraints be handled? In such cases, the controller design has to be precise and cannot have overshoot.
3. Details of the UAV, its design, controlling mechanism, etc. are not provided. Please add them as a section. Without these details, the case study implementation does not carry the same weight. The controller PID values, etc. only make sense taken together with the UAV's performance and capabilities.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
This study developed an autonomous UAV landing approach by combining PID control method and reinforcement learning. Overall the manuscript is well written. Before proceeding to the publication stage, I encourage the authors to address the following minor comments:
1. [Second last paragraph in Introduction, lines 76-93] This paragraph is nicely written. To further improve it, could you apply principles from the field of human–computer interaction to describe the innovation of your work at a high level?
2. [Section 2.3 line 122] “RL algorithms typically require long time to stabilize.”
- Can you quantify the “long time” here? Or provide a benchmark of how long comparable algorithms take to stabilize?
3. [Section 4.1 – line 4.1] Please define or provide reference(s) for “the Gazebo simulator”.
4. [Section 4.3.2 – Testing with a moving vehicle] Can you specify the dynamic motion features of the vehicle, such as its trajectory and speed?
5. [Section 4.4 line 286] “Due to the security reason”
- Do you mean to say: “due to safety concerns”?
6. [Section 4.4 Real-world Experiments] Can you provide more information about the real-world experiment results or KPIs? For instance, landing accuracy and success rate.
7. [Section 4 – Experiments and Results] It is great to see that the authors designed and presented both software simulations and real-world experiments. Have the authors considered performing hardware-in-the-loop simulations? Similar navigation and control experiments with GNSS-based satellites were done in [1]. Is it possible to implement a similar setup to evaluate the navigation and control algorithm in such a controllable and repeatable testbed format?
[1] Peng, Y.; Scales, W.; Esswein, M.; Hartinger, M. Small satellite formation flying simulation with multi-constellation GNSS and applications to future multi-scale space weather observations. In Proceedings of the ION GNSS+, Institute of Navigation, Miami, FL, USA, 2019; pp. 2035–2047. https://doi.org/10.33012/2019.16883
[2] Sabatini, M.; Palmerini, G.B.; Gasbarri, P. A testbed for visual based navigation and control during space rendezvous operations. Acta Astronaut. 2015, 117, 184–196. https://doi.org/10.1016/j.actaastro.2015.07.026
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Thank you.