A Stealth–Distance Dynamic Weight Deep Q-Network Algorithm for Three-Dimensional Path Planning of Unmanned Aerial Helicopter
Abstract
1. Introduction
- The research aims to develop an ML algorithm for UAH/UAV path planning, with the stealth abilities of UAHs/UAVs considered in the reward function.
- The research takes into account not only the threats of radar and infrared (IR) detection but also the threat of flight operations in the path-planning simulations of UAHs/UAVs, and establishes the related models.
- The research is designed to balance evasive maneuvers and approaching maneuvers of UAHs/UAVs when making maneuver decisions during path planning.
- Considering the detection and operation threats to UAHs, a stealth–distance dynamic weight DQN (SDDW-DQN) algorithm is put forward for UAH path planning under the threat of radar detection and flight operation.
- The stealth model of the UAH and the guidance model of the flight are established in this paper, and the detection-probability formulas are presented in detail.
- A new reward function of the SDDW-DQN algorithm is designed with dynamic weights, which are influenced by both the distance to the destination and the detection probability.
2. Models
2.1. Dynamical Model of UAH
2.2. Dynamic Model and Guidance Model of Flight
- Step 1. Awaiting orders…
- Step 2. When the target has been detected by radar three consecutive times, the moving track of the target can be preliminarily identified, and the flight is launched.
- Step 3. The flight is controlled by proportional navigation (PN) while the target is detected by radar. Otherwise, the flight searches for the target itself using infrared detection and is controlled by TG when the target is detected via IR. If both radar and infrared fail to detect the target, the flight keeps flying in a straight line.
- Step 4. When the distance between the UAH and the flight is shorter than 20 m, the flight hits the target UAH. Otherwise, the flight fails to hit the UAH and executes self-destruction when the UAH has reached the destination area or the flight crashes into the ground (z ≤ 0). The full step logic is sketched as a simple state machine after this list.
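To make the four steps concrete, the following minimal sketch renders the guidance logic as a state machine. The step logic (launch after three consecutive radar detections, PN guidance under radar tracking, TG guidance under IR-only detection, straight flight when undetected, the 20 m hit distance, and the self-destruction conditions) follows the list above; the function name, variable names, and the representation of guidance modes as labels are placeholders, and the PN/TG update laws themselves are not reproduced.

```python
from enum import Enum, auto

class FlightState(Enum):
    AWAITING = auto()         # Step 1: waiting for launch orders
    LAUNCHED = auto()         # Steps 3-4: in flight toward the UAH
    HIT = auto()
    SELF_DESTRUCTED = auto()

def flight_guidance_step(state, radar_hits, radar_detected, ir_detected,
                         dist_to_uah, uah_in_destination, flight_altitude):
    """One decision step of the flight guidance logic described in Steps 1-4.
    Guidance-law updates (PN/TG) are abstracted into mode labels."""
    if state is FlightState.AWAITING:
        # Step 2: launch after three consecutive radar detections of the target.
        if radar_hits >= 3:
            return FlightState.LAUNCHED, "launch"
        return FlightState.AWAITING, "wait"

    if state is FlightState.LAUNCHED:
        # Step 4: hit if within 20 m; otherwise self-destruct when the UAH
        # reaches the destination area or the flight crashes (z <= 0).
        if dist_to_uah <= 20.0:
            return FlightState.HIT, "hit"
        if uah_in_destination or flight_altitude <= 0.0:
            return FlightState.SELF_DESTRUCTED, "self-destruct"
        # Step 3: choose the guidance mode from the current detection state.
        if radar_detected:
            return FlightState.LAUNCHED, "PN guidance"
        if ir_detected:
            return FlightState.LAUNCHED, "TG guidance"
        return FlightState.LAUNCHED, "fly straight"

    return state, "inactive"
```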
2.3. Stealth and Detection Model
2.3.1. RCS Model of a UAH
2.3.2. Infrared Radiation (IR) Model of UAH
2.3.3. Detection Probabilities
- (1) Radar detection probability
- (2) Infrared detection probability (a generic illustrative sketch of both probabilities follows this list)
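The paper's specific detection-probability formulas are not reproduced here. As a generic illustration only, the sketch below uses the well-known proportionalities that received radar power scales with RCS/R⁴ (radar range equation) and received IR irradiance falls off as 1/R²; the squashing functions, reference constants, and function names are placeholders and are not the paper's model.

```python
import numpy as np

def radar_detection_prob(rcs_m2, range_m, ref_rcs=1.0, ref_range=10_000.0, k=4.0):
    """Illustrative only: relative SNR grows with RCS and falls with R^4;
    a logistic mapping turns it into a probability."""
    snr_rel = (rcs_m2 / ref_rcs) * (ref_range / range_m) ** 4
    return float(1.0 / (1.0 + np.exp(-k * np.log10(snr_rel))))

def ir_detection_prob(ir_intensity, range_m, ref_intensity=1.0, ref_range=5_000.0):
    """Illustrative only: received IR signal falls off with 1/R^2; the
    saturation to [0, 1] and reference values are placeholders."""
    signal_rel = (ir_intensity / ref_intensity) * (ref_range / range_m) ** 2
    return float(np.clip(signal_rel / (1.0 + signal_rel), 0.0, 1.0))
```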
3. Design of SDDW-DQN Algorithm for Path Planning
3.1. Reward Function Design of SDDW-DQN
- The UAH crashes into the ground: r = −100.
- The UAH is shot down by the flight (the distance between the UAH and the flight is 20 m or shorter): r = −50.
- The UAH reaches the target within its working range (the distance between the UAH and the target is 100 m or shorter): r = 0; see the sketch after this list.
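As a hedged illustration of how these terminal rewards might combine with the stealth–distance dynamic weights described in the Introduction, the sketch below adds a shaping term whose weights shift between a distance component and a detection-probability (stealth) component. Only the three terminal values above are taken from the paper; the weighting rule, the scaling constants, the symbols w1 and w2 (w2 is referred to as the stealth weight in the Conclusions), and the function name are assumptions for illustration.

```python
import numpy as np

def sddw_reward(uah_pos, target_pos, detect_prob, crashed, shot_down, w_max=1.0):
    """Hypothetical sketch of the SDDW reward; terminal values follow the
    list above, the dynamic-weight shaping term is an illustrative guess."""
    # Terminal rewards taken directly from the list above.
    if crashed:                                  # UAH hits the ground
        return -100.0
    if shot_down:                                # flight closes within 20 m
        return -50.0
    dist = np.linalg.norm(np.asarray(target_pos) - np.asarray(uah_pos))
    if dist <= 100.0:                            # inside the working range
        return 0.0

    # Assumed dynamic weights: the higher the detection probability, the more
    # the stealth term dominates; otherwise the distance term dominates
    # (w1 + w2 = w_max).
    w2 = w_max * detect_prob                     # stealth weight
    w1 = w_max - w2                              # distance weight
    # Negative shaping reward: penalize remaining distance and exposure.
    return -(w1 * dist / 1000.0 + w2 * detect_prob)
```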
3.2. Structure and Elements of SDDW-DQN
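The network structure and hyperparameters of SDDW-DQN are specified in the paper and are not reproduced here; the novelty of the algorithm lies in the reward design, while the learning rule is a standard DQN update with experience replay and a target network. The sketch below shows such a generic update in PyTorch under assumed layer sizes, batch size, and discount factor.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Fully connected Q-network; layer sizes are illustrative, not
    necessarily the architecture used in the paper."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def dqn_update(q_net: QNetwork, target_net: QNetwork,
               optimizer: torch.optim.Optimizer,
               replay: deque, batch_size: int = 64, gamma: float = 0.99) -> None:
    """One standard DQN update; the SDDW modification only changes the
    reward signal fed into the replay buffer, not this TD update."""
    if len(replay) < batch_size:
        return
    states, actions, rewards, next_states, dones = zip(
        *random.sample(list(replay), batch_size))
    states = torch.as_tensor(np.asarray(states), dtype=torch.float32)
    next_states = torch.as_tensor(np.asarray(next_states), dtype=torch.float32)
    actions = torch.as_tensor(actions, dtype=torch.int64)
    rewards = torch.as_tensor(rewards, dtype=torch.float32)
    dones = torch.as_tensor(dones, dtype=torch.float32)

    # Q(s, a) for the actions actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped targets from the periodically synchronized target network.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q

    loss = nn.functional.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```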
4. Simulation
4.1. Environments and Parameters
- (1) Positions
- (2) Attitudes and velocities
- (3) Parameters
4.2. Assumptions
- (1) The UAH is assigned to the blue side, while the target, radar, and flight belong to the red side. Points on the UAH paths are marked with circles, and points on the flight tracks are marked with triangles.
- (2) To simplify the UAH motion model, the roll angle and pitch angle are fixed at 0°; only the yaw maneuvers of the UAH are taken into consideration.
- (3) The flight launcher is located at the same position as the radar, and both are static. During the simulation, the flight launcher fires only one flight at the UAH. If the flight fails to hit the UAH and self-destructs, the launcher no longer poses a threat to the UAH.
5. Results and Discussions
5.1. Convergence Results of Different Cases
5.2. Result Discussions of Different Cases
5.2.1. Result Discussion of Case 1
5.2.2. Result Discussion of Case 2
5.2.3. Result Discussion of Case 3
5.2.4. Result Discussion of Case 4
5.2.5. Result Discussion of Case 5
5.2.6. Result Discussion of Case 6 (Using the SDDW-DQN Algorithm)
6. Conclusions
- (1) The simulation results in this paper verify that the SDDW-DQN algorithm is capable of path planning for UAHs. Under the threat of radar detection and flight operation, the UAH can evade these threats with the SDDW-DQN algorithm and finally reach the working range.
- (2) Taking the stealth characteristics of UAHs into consideration is necessary for ensuring their survivability. The UAH can hardly survive the flight operation if the stealth weight w2 is not high enough. Therefore, it is essential to select proper values for the weights of the reward function.
- (3) Dynamic weights in the reward function help the UAH shorten its path and raise task execution efficiency while still evading flight operations. The path planned with the SDDW-DQN algorithm is both safer and shorter than that planned with the static-weight DQN algorithm.
- (4) In Figure 23, the UAH remains stealthy against detection for a long time, forcing the flight to coast on inertia for too long and finally run out of energy. This method of avoiding flight operations can be summarized as “blind”. The “blind” method can be realized through stealth design and by maneuvering far enough away from detection threats.
- (5) In Figure 26, the UAH performs evasive maneuvers and alternates between being stealthy and being exposed to radar. The flight has to keep tracking the UAH by twisting its trajectory hard, which increases drag and causes a more serious loss of energy. This second method of avoiding flight operations can be summarized as “exhaust”. The “exhaust” method can be realized through high-overload maneuvers or maneuvers that present high-stealth orientations toward detection threats at proper times.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Symbol of State Element | Meaning |
|---|---|
| | X-coordinate of UAH |
| | Y-coordinate of UAH |
| | Z-coordinate of UAH |
| | Yaw angle of UAH |
| | X-velocity of UAH |
| | Y-velocity of UAH |
| | Z-velocity of UAH |
| | Yaw rate of UAH |
| Symbol of State Element | Meaning |
|---|---|
| | X-coordinate of flight |
| | Y-coordinate of flight |
| | Z-coordinate of flight |
| | Speed of flight |
| | Yaw angle of flight |
| | Pitch angle of flight |
| | State mark of flight |
| | Number of continuous discoveries |
| | Current discovery state |
| | Time of flight |
| Symbol of State Element | Meaning |
|---|---|
| | X-axis component of thrust |
| | Y-axis component of thrust |
| | Z-axis component of thrust |
| | Z-axis moment |
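The three tables above list the UAH state elements, the flight state elements, and the thrust/moment control elements. The sketch below shows one way these elements could be packed into the observation and action representations for the DQN agent; the field names (the original symbols were not preserved here), the concatenation order, and the particular discretization of the control elements are assumptions for illustration.

```python
from dataclasses import dataclass, astuple

import numpy as np

@dataclass
class UahState:
    """UAH state elements (field names are placeholders for the lost symbols)."""
    x: float; y: float; z: float          # position
    yaw: float                            # yaw angle
    vx: float; vy: float; vz: float       # velocity components
    yaw_rate: float

@dataclass
class MissileFlightState:
    """Flight state elements."""
    x: float; y: float; z: float
    speed: float
    yaw: float; pitch: float
    state_mark: float                     # awaiting / launched / destroyed
    n_discoveries: float                  # continuous discoveries
    discovered: float                     # current discovery state
    time: float

def make_observation(uah: UahState, flight: MissileFlightState) -> np.ndarray:
    """Concatenate both state groups into one observation vector (8 + 10 = 18)."""
    return np.asarray(astuple(uah) + astuple(flight), dtype=np.float32)

# The thrust and moment elements would be mapped to a discrete action set for
# the DQN; this discretization is purely illustrative (fz ≈ m·g ≈ 49,050 N to
# balance gravity for the 5000 kg UAH, within the 0~64,000 N limit).
ACTIONS = [
    {"fx": 0.0, "fy": 0.0, "fz": 49_050.0, "mz": 0.0},        # hold
    {"fx": 15_000.0, "fy": 0.0, "fz": 49_050.0, "mz": 0.0},   # accelerate +x
    {"fx": 0.0, "fy": 0.0, "fz": 49_050.0, "mz": 10_500.0},   # yaw left
    {"fx": 0.0, "fy": 0.0, "fz": 49_050.0, "mz": -10_500.0},  # yaw right
]
```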
| Parameter | Symbol | Value |
|---|---|---|
| Weight of UAH | | 5000 kg |
| Z-rotational inertia | | 35,000 kg·m² |
| X-resistance coefficient | | 2.3 |
| Y-resistance coefficient | | 2.3 |
| Z-resistance coefficient | | 2.3 |
| Z-rotation resistance coefficient | | 2.3 |
| X-thrust force | | −15,000~15,000 N |
| Y-thrust force | | −15,000~15,000 N |
| Z-thrust force | | 0~64,000 N (max) |
| Z-torque | | −10,500~10,500 N·m |
| Working range | | 100 m |
| Parameter | Symbol | Value |
|---|---|---|
| Weight of flight | | 100 kg |
| Max lift coefficient | | 0.3 |
| Lift–drag ratio | | 10 |
| Zero-lift drag coefficient | | 0.05 |
| Proportional navigation coefficient | | 3 |
| Rocket thrust | | 12,000 N |
| Max thrust duration | | 8 s |
| Operation radius | | 20 m |