A Dual Aircraft Maneuver Formation Controller for MAV/UAV Based on the Hybrid Intelligent Agent
Abstract
1. Introduction
2. Mathematical Modeling
2.1. UAV Dynamic Model
2.2. Formation Control Targets
2.2.1. Flight Velocity Control Targets
2.2.2. Flight Distance Control Targets
3. Design of the HIAC
3.1. Desired Command Solver
3.1.1. Framework of Hybrid Intelligent Agent Based on DDPG/DDQN
3.1.2. State Space
3.1.3. Action Space
3.1.4. Reward Function
3.2. Dynamic Inversion Controller
3.3. First-Order Lag Filter
4. Simulation Validation
4.1. Simulation Design
4.2. Basic Principles of LQR
4.3. Experiment of Nominal Conditions
4.4. Monte Carlo Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| Parameters | Settings | Parameters | Settings |
|---|---|---|---|
| (s) | 50 | | 0 |
| (s) | 0.1 | | 1 |
| (lb) | 25,600 | | 6 |
| (kg) | 14,470 | (rad) | |
| (m/s²) | 9.81 | (m) | 600 |
| (ft²) | 400 | (m) | 100 |
| | 0.02 | (m/s) | 50 |
| | 0.1 | (rad) | 0.2 |
| (s) | 0.6 | (rad) | 0.2 |
| (s) | 0.5 | (m/s) | 50 |
| (s) | 0.5 | (rad) | |
| (s) | 0.3 | (rad) | |
| (s) | 0.2 | (rad) | |
| (s) | 0.2 | (rad) | |
Parameters | Settings |
---|---|
Learning Rate | 0.0001 |
Max Episode | 25,000 |
Batch Size (DDPG) | 256 |
Batch Size (DDQN) | 64 |
Discount Factor | 0.99 |
Experience Buffer Length | 1 × 10⁶ |
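As an illustration, the training settings above could be gathered into one immutable configuration object. This is a minimal sketch: the class and field names are hypothetical, and because the table does not say which networks the learning rate applies to, a single shared value is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Training settings copied from the table; field names are hypothetical."""

    learning_rate: float = 1e-4              # assumed to be shared by all networks
    max_episodes: int = 25_000
    batch_size_ddpg: int = 256               # mini-batch size for the DDPG part
    batch_size_ddqn: int = 64                # mini-batch size for the DDQN part
    discount_factor: float = 0.99
    replay_buffer_length: int = 1_000_000    # 1 x 10^6 stored transitions


config = TrainingConfig()
```

Freezing the dataclass keeps the settings from being changed by accident during a training run.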
Controller | Metric | Velocity (m/s) | Flight Path Angle (rad) | Flight Azimuth Angle (rad) | Relative Distance (m), Safe Distance [100, 600] |
---|---|---|---|---|---|
LQR | RMS | 5.6957 | 0.4737 | 0.6202 | 516.7072 |
LQR | Max. | 15.3307 | 0.8027 | 0.7833 | 710.2799 |
DDPG | RMS | 8.3953 | 0.2183 | 0.4493 | 444.1190 |
DDPG | Max. | 13.3379 | 0.5241 | 0.5315 | 610.6078 |
HIAC | RMS | 5.3647 | 0.1401 | 0.2174 | 460.0709 |
HIAC | Max. | 6.0780 | 0.3586 | 0.2391 | 552.1845 |
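The sketch below shows one common way to compute RMS and maximum-magnitude statistics from a sampled signal, and checks the tabulated maximum relative distances against the upper safe-distance bound of 600 m. The function and constant names are hypothetical. The table does not state how its statistics were computed, so the standard definitions are assumed.

```python
import math
from typing import Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root mean square of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def max_abs(samples: Sequence[float]) -> float:
    """Largest magnitude in a non-empty sequence of samples."""
    return max(abs(s) for s in samples)


# Upper safe-distance bound and "Max." relative distances, taken from the table.
SAFE_DISTANCE_UPPER_M = 600.0
MAX_RELATIVE_DISTANCE_M: Dict[str, float] = {
    "LQR": 710.2799,
    "DDPG": 610.6078,
    "HIAC": 552.1845,
}

# True where a controller's maximum relative distance exceeds the upper bound.
exceeds_upper_bound = {
    name: distance > SAFE_DISTANCE_UPPER_M
    for name, distance in MAX_RELATIVE_DISTANCE_M.items()
}
```

Under this reading, only the HIAC row keeps its maximum relative distance below 600 m. The lower bound of 100 m cannot be checked from the table because minimum distances are not reported.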
Number of Monte Carlo Simulations | X (m) | Y (m) | Z (m) | Velocity (m/s) | Flight Path Angle (rad) | Flight Azimuth Angle (rad) |
---|---|---|---|---|---|---|
100 | [−50, 550] | [−50, 550] | [−1000, 1000] | [−100, 100] | [−π/18, π/18] | [−π/18, π/18] |
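A Monte Carlo campaign over the intervals in the table could be set up as in the following minimal sketch. It rests on several assumptions that the table does not confirm: each quantity is drawn independently and uniformly from its interval, the angle bounds are read as ±π/18 rad (the π symbol appears to have been lost in extraction), and the key names and seed are hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

# Sampling intervals from the table; dictionary keys are hypothetical names.
RANGES: Dict[str, Tuple[float, float]] = {
    "x_m": (-50.0, 550.0),
    "y_m": (-50.0, 550.0),
    "z_m": (-1000.0, 1000.0),
    "velocity_mps": (-100.0, 100.0),
    "flight_path_angle_rad": (-math.pi / 18, math.pi / 18),     # assumed pi/18
    "flight_azimuth_angle_rad": (-math.pi / 18, math.pi / 18),  # assumed pi/18
}

NUM_RUNS = 100  # number of Monte Carlo simulations listed in the table


def sample_case(rng: random.Random) -> Dict[str, float]:
    """Draw one case, assuming independent uniform sampling per quantity."""
    return {name: rng.uniform(low, high) for name, (low, high) in RANGES.items()}


rng = random.Random(0)  # fixed seed (hypothetical) for repeatable draws
cases: List[Dict[str, float]] = [sample_case(rng) for _ in range(NUM_RUNS)]
```

Each entry of `cases` would then seed one simulation run. A different sampling distribution would only require changing `sample_case`.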
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, L.; Liu, Y.; Peng, Q.; Zhao, L. A Dual Aircraft Maneuver Formation Controller for MAV/UAV Based on the Hybrid Intelligent Agent. Drones 2023, 7, 282. https://doi.org/10.3390/drones7050282