Globally Guided Deep V-Network-Based Motion Planning Algorithm for Fixed-Wing Unmanned Aerial Vehicles
Abstract
1. Introduction
- For the motion planning problem studied in this paper, a deep V-network based on the attention mechanism is adopted, together with a multi-stage, multi-scenario training strategy, to improve training efficiency and network generalization. The effectiveness of the algorithm is verified through comparative simulation experiments.
2. Related Works
- (1)
- Traditional motion planning algorithms for fixed-wing UAVs: Existing work on navigation in cluttered environments for fixed-wing UAVs often relies on traditional algorithms. The A* algorithm is employed to generate planned paths in [13,14]. The authors in [15,16] proposed improved potential field methods to help UAVs avoid obstacles in local path planning. However, the above papers treat the UAV as a mass point and neglect its kinematic model. The authors in [17] modeled a fixed-wing UAV with a first-order equation and defined a safe flight corridor based on its maneuvering characteristics and the Dubins curve to achieve obstacle avoidance. The authors in [20] established a multi-objective optimization model and proposed an improved particle swarm optimization algorithm to solve the three-dimensional path planning problem in complex environments. The work in [21] combined the ant colony algorithm, the self-organizing mapping algorithm, and the optimal Dubins trajectory to achieve task assignment and path planning for multiple UAVs. Considering the kinematic model of fixed-wing UAVs as a second-order system, Ref. [22] proposed an improved RRT* algorithm to realize 3D path planning.
- (2)
- Deep neural network-based learning methods in robotics: In recent years, with the increase in computational power, deep neural network-based learning methods have shown great potential in handling high-dimensional, complex environmental states [23,24], especially in robotics, including robotic arms [25] and wheeled and multi-legged robots [26,27,28], and have been applied in experimental real-world scenarios. For example, in [29], the authors proposed an imitation learning approach based on perceptual information to realize motion planning for UAVs, and Ref. [30] used deep learning to achieve effective 3D exploration of a vehicle in an indoor environment.
- (3)
- Deep reinforcement learning (DRL)-based motion planning for fixed-wing UAVs: Unlike the supervised learning methods mentioned above, RL methods are applicable to sequential decision and control problems, in which agents learn strategies that maximize reward by interacting with the environment. Under the assumption that the UAV flies at a fixed altitude, Li et al. [18] designed the action space to satisfy first-order fixed-wing kinematic constraints and implemented a deep Q-network to generate a safe path for a fixed-wing UAV. Based on the same model, Wu et al. [19] proposed a multi-critic delayed deep deterministic policy gradient method. Moreover, Li et al. [31] considered second-order kinematic constraints and designed a deep deterministic policy gradient algorithm to enable UAVs to complete path planning in a 3D environment with only static obstacles; Yan et al. [32] established a posture assessment model to evaluate the collision risk between followers in the case with only neighboring aircraft, and designed a DRL algorithm to implement a collision-free strategy for fixed-wing UAVs.
3. Preliminaries
3.1. Problem Formulation
3.2. Dynamics of Fixed-Wing UAV
4. Approach
4.1. Systems Description
4.2. Global Guidance Planner
4.2.1. Primitive Generation
4.2.2. Analytic Expansion
4.2.3. Cost Function
4.3. Local RL Planner
4.3.1. RL Formulation
4.3.2. Policy Network Architecture
4.4. Hybrid A* Deep V Network (HADVN) Algorithm
Algorithm 1: Hybrid A* deep V-network (HADVN) algorithm
Input: target value network, number of supervised and RL training episodes, size of mini-batch
Output: optimal value network
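The two-phase schedule implied by Algorithm 1's inputs (supervised episodes, RL episodes, mini-batch size, target value network) can be sketched as follows. This is a hedged reconstruction, not the authors' implementation: `value_net.fit`, `value_net.value`, `env.rollout`, and the transition/demonstration formats are assumed interfaces.

```python
import copy
import random

def train_hadvn_sketch(value_net, env, demos, n_sup, n_rl, batch_size,
                       gamma=0.9, target_sync=50):
    """Sketch of a two-phase value-network training schedule:
    supervised pretraining on global-planner demonstrations, then
    RL fine-tuning with a periodically synchronized target network."""
    # Phase 1: supervised pretraining on (state, value target) pairs,
    # e.g. generated from hybrid A* guidance paths.
    for _ in range(n_sup):
        batch = random.sample(demos, min(batch_size, len(demos)))
        value_net.fit(batch)
    # Phase 2: RL fine-tuning; bootstrapped targets come from a frozen
    # copy of the value network, resynchronized every `target_sync` episodes.
    target_net = copy.deepcopy(value_net)
    replay = []
    for ep in range(n_rl):
        replay.extend(env.rollout(value_net))  # (state, reward, next_state) tuples
        batch = random.sample(replay, min(batch_size, len(replay)))
        targets = [(s, r + gamma * target_net.value(s2)) for (s, r, s2) in batch]
        value_net.fit(targets)
        if (ep + 1) % target_sync == 0:
            target_net = copy.deepcopy(value_net)
    return value_net
```

The target network is the standard device for stabilizing bootstrapped value targets; the supervised warm start is what lets the global hybrid A* planner guide the RL phase.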
5. Results
5.1. Experimental Setup
5.2. Training Procedure
5.3. Qualitative Analysis
5.4. Performance Results
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Full Name
---|---
UAV | Unmanned Aerial Vehicle
HADVN | Hybrid A* Deep V-Network
RRT | Rapidly-exploring Random Tree
FOV | Field of View
ORCA | Optimal Reciprocal Collision Avoidance
SARL | Socially Attentive Reinforcement Learning
References
- Pfeiffer, M.; Schaeuble, M.; Nieto, J. From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation, Marina Bay Sands, Singapore, 29 May–3 June 2017; pp. 1527–1533. [Google Scholar]
- Dugas, D.; Nieto, J.; Siegwart, R. Navrep: Unsupervised representations for reinforcement learning of robot navigation in dynamic human environments. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation, Xi’an, China, 30 May–5 June 2021; pp. 7829–7835. [Google Scholar]
- Kästner, L.; Marx, C.; Lambrecht, J. Deep-reinforcement-learning-based semantic navigation of mobile robots in dynamic environments. In Proceedings of the 2020 IEEE 16th International Conference on Automation Science and Engineering, Hong Kong, China, 20–21 August 2020; pp. 1110–1115. [Google Scholar]
- Yan, T. Collision-avoiding flocking with multiple fixed-wing UAVs in obstacle-cluttered environments: A task-specific curriculum-based MADRL approach. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–15. [Google Scholar] [CrossRef] [PubMed]
- Klausen, K. Autonomous recovery of a fixed-wing UAV using a net suspended by two multirotor UAVs. J. Field Robot. 2018, 35, 717–731. [Google Scholar] [CrossRef]
- Liu, Z. Mission-oriented miniature fixed-wing UAV swarms: A multilayered and distributed architecture. IEEE Trans. Syst. Man Cybern. Syst. 2020, 52, 1588–1602. [Google Scholar] [CrossRef]
- Huang, S. Collision avoidance of multi unmanned aerial vehicles: A review. Annu. Rev. Control 2019, 48, 147–164. [Google Scholar] [CrossRef]
- Yasin, J.N. Unmanned aerial vehicles (UAVs): Collision avoidance systems and approaches. IEEE Access 2020, 8, 105139–105155. [Google Scholar] [CrossRef]
- Chen, H. Coordinated path-following control of fixed-wing unmanned aerial vehicles. IEEE Trans. Syst. Man Cybern. Syst. 2021, 52, 2540–2554. [Google Scholar] [CrossRef]
- Donald, B. Kinodynamic motion planning. J. ACM 1993, 40, 1048–1066. [Google Scholar] [CrossRef]
- Chen, H. Coordinated path following control of fixed-wing unmanned aerial vehicles in wind. ISA Trans. 2022, 122, 260–270. [Google Scholar] [CrossRef]
- Yan, C. Fixed-Wing UAVs flocking in continuous spaces: A deep reinforcement learning approach. Robot. Auton. Syst. 2020, 131, 103594. [Google Scholar] [CrossRef]
- Liu, Q.F. The Research of Unmanned Aerial Vehicles (UAV) Dynamic Path Planning Based on Sparse A* Algorithm and an Evolutionary Algorithm. Master’s Thesis, Nanchang Hangkong University, Nanchang, China, June 2016. [Google Scholar]
- Shi, H. Route Planning of Small Fixed-wing UAV Based on Sparse A* Algorithm. Ordnance Ind. Autom. 2021, 40, 14–18. [Google Scholar]
- Wu, Q. Application research on improved artificial potential field method in UAV path planning. J. Chongqing Univ. Technol. 2022, 36, 144–151. [Google Scholar]
- Guo, Y.C. 3D Path Planning Method for UAV Based on Improved Artificial Potential Field. J. Northwestern Polytech. Univ. 2020, 38, 977–986. [Google Scholar] [CrossRef]
- Fan, L.Y. A dense obstacle avoidance algorithm for UAVs based on safe flight corridor. J. Northwestern Polytech. Univ. 2022, 40, 1288–1296. [Google Scholar] [CrossRef]
- Li, J.; Liu, Y. Deep Reinforcement Learning based Adaptive Real-Time Path Planning for UAV. In Proceedings of the 2021 8th International Conference on Dependable Systems and Their Applications, Yinchuan, China, 5–6 August 2021; pp. 522–530. [Google Scholar]
- Wu, R. UAV path planning based on multicritic-delayed deep deterministic policy gradient. Wirel. Commun. Mob. Comput. 2022, 1–12. [Google Scholar] [CrossRef]
- Huang, C. A novel three-dimensional path planning method for fixed-wing UAV using improved particle swarm optimization algorithm. Int. J. Aerosp. Eng. 2021, 1–19. [Google Scholar] [CrossRef]
- Guo, J.H. Task allocation and path planning algorithm for multiple fixed-wing UAVs. J. Taiyuan Univ. Technol. 2023, 1, 1–10. [Google Scholar]
- Jiang, X. Global Path Planning Of Fixed-wing UAV Based On Improved RRT* Algorithm. J. Appl. Sci. Eng. 2023, 26, 1441–1450. [Google Scholar]
- Ren, W. Phase Space Graph Convolutional Network for Chaotic Time Series Learning. IEEE Trans. Ind. Inform. 2024, 20, 7576–7584. [Google Scholar] [CrossRef]
- Ren, W. An Interdigital Conductance Sensor for Measuring Liquid Film Thickness in Inclined Gas-Liquid Two-Phase Flow. IEEE Trans. Instrum. Meas. 2024, 73, 1–9. [Google Scholar] [CrossRef]
- Wang, L.; Ye, H.; Wang, Q. Learning-based 3D occupancy prediction for autonomous navigation in occluded environments. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 27 September–1 October 2021; pp. 4509–4516. [Google Scholar]
- Gu, P. Research on Obstacle-Avoiding Strategy of Hexapod Robot Based on Deep Learning. Master’s Thesis, Southwest University of Science and Technology, Nanchang, China, May 2019. [Google Scholar]
- Cai, X. EVORA: Deep Evidential Traversability Learning for Risk-Aware Off-Road Autonomy. arXiv 2023, arXiv:2311.06234. [Google Scholar]
- Yang, L. TacGNN: Learning Tactile-Based In-Hand Manipulation with a Blind Robot Using Hierarchical Graph Neural Network. IEEE Robot. Autom. Lett. 2023, 8, 3605–3612. [Google Scholar] [CrossRef]
- Tordesillas, J. Deep-panther: Learning-based perception-aware trajectory planner in dynamic environments. IEEE Robot. Autom. Lett. 2023, 8, 1399–1406. [Google Scholar] [CrossRef]
- Tao, Y.; Wu, Y.; Li, B. SEER: Safe efficient exploration for aerial robots using learning to predict information gain. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023; pp. 1235–1241. [Google Scholar]
- Li, Y.; Zhang, S.; Ye, F. A UAV path planning method based on deep reinforcement learning. In Proceedings of the 2020 IEEE USNC-CNC-URSI North American Radio Science Meeting, Montréal, QC, Canada, 5–10 July 2020; pp. 93–94. [Google Scholar]
- Yan, C. Deep reinforcement learning of collision-free flocking policies for multiple fixed-wing UAVs using local situation maps. IEEE Trans. Ind. Inform. 2021, 18, 1260–1270. [Google Scholar] [CrossRef]
- Berg, J.; Lin, M.; Manocha, D. Reciprocal Velocity Obstacles for real-time multi-agent navigation. In Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, 19–23 May 2008; pp. 1928–1935. [Google Scholar]
- Zhao, X.; Du, H.; Lu, H. Collision-Free Motion-Primitive-Based Motion Planning Algorithm for Fixed-Wing Robotic Aircraft. In Proceedings of the Advances in Guidance, Navigation and Control, Ha’erbin, China, 5–7 August 2022; pp. 5951–5960. [Google Scholar]
- Zhang, H.; Zhang, Y.; Guo, C. Path planning for fixed-wing UAVs based on expert knowledge and improved VFH in cluttered environments. In Proceedings of the 2022 IEEE 17th International Conference on Control & Automation, Naples, Italy, 27–30 June 2022; pp. 255–260. [Google Scholar]
- Zhang, T.Y. Research on Path Planning of Large UAV based on Dubins Algorithm. Master’s Thesis, Anhui Polytechnic University, Wuhu, China, June 2021. [Google Scholar]
- Chen, C.; Liu, Y.; Kreiss, S. Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning. In Proceedings of the 2019 International Conference on Robotics and Automation, Montreal, QC, Canada, 20–24 May 2019; pp. 6015–6022. [Google Scholar]
- Akmandor, N.; Li, H.; Lvov, G. Deep reinforcement learning based robot navigation in dynamic environments using occupancy values of motion primitives. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems, Kyoto, Japan, 23–27 October 2022; pp. 11687–11694. [Google Scholar]
- Fan, T. Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. Int. J. Robot. Res. 2020, 39, 856–892. [Google Scholar] [CrossRef]
- Zhou, B. Robust and Efficient Quadrotor Trajectory Generation for Fast Autonomous Flight. IEEE Robot. Autom. Lett. 2019, 4, 3529–3536. [Google Scholar] [CrossRef]
- Labbadi, M. Adaptive fractional-order nonsingular fast terminal sliding mode based robust tracking control of quadrotor UAV with Gaussian random disturbances and uncertainties. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 2265–2277. [Google Scholar] [CrossRef]
Parameters | Value
---|---
Discount factor | 0.9
Learning rate | 0.001
Number of training rounds | 300
 | 5000
 | 30,000
 | 100
 | −0.25
 | 1
 | −0.1
 | 0.05
 | 0.1
 | 0.01
 | 0.01
 | 0.01
 | 5 m
 | 1 m
 | 10 m
 | 0.25 s
 | 25 s
Methods | HADRL (without Noises) | HADRL (with Noises in x Position) | HADRL (with Noises in y Position) |
---|---|---|---|
Scenario 1 | 0.86 | 0.72 | 0.71 |
Scenario 2 | 0.79 | 0.65 | 0.63 |
Scenario 3 | 0.76 | 0.60 | 0.61 |
Methods | ORCA | SARL | HADRL |
---|---|---|---|
Scenario 1 | 0.63 | 0.78 | 0.86 |
Scenario 2 | 0.55 | 0.62 | 0.79 |
Scenario 3 | 0.43 | 0.57 | 0.76 |
Methods | ORCA | SARL | HADRL |
---|---|---|---|
Scenario 1 | 0.32 | 0.12 | 0.13 |
Scenario 2 | 0.35 | 0.13 | 0.14 |
Scenario 3 | 0.36 | 0.13 | 0.16 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Du, H.; You, M.; Zhao, X. Globally Guided Deep V-Network-Based Motion Planning Algorithm for Fixed-Wing Unmanned Aerial Vehicles. Sensors 2024, 24, 3984. https://doi.org/10.3390/s24123984