Automatic Landing Control for Fixed-Wing UAV in Longitudinal Channel Based on Deep Reinforcement Learning
Abstract
1. Introduction
- (1)
- PID control forms the core of the UAV landing controller: the gains in the control model are adjusted in response to a range of landing conditions and perturbations so that the UAV's flight state converges to the desired landing conditions. Parameter tuning thereby improves the controller's stability and noise resistance.
- (2)
- The UAV flight state is discretized and fed into an incremental PID controller coupled with the DQN deep reinforcement learning framework over many training iterations. The designed reward function scores each step of training, and the optimal PID controller model is obtained once the reward (Q value) converges. Because the network only tunes the gains of an interpretable PID structure, the learned controller remains explainable (a minimal sketch of this scheme follows this list).
- (3)
- Experiments are conducted to verify the efficacy and robustness of the proposed control model, which lands the UAV automatically on a ground platform under external perturbations. The tuned control parameters are compared against conventional control methods to assess their effectiveness.
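To make the scheme in (2) concrete, here is a minimal sketch of an incremental PID controller whose gains are nudged by discrete DQN actions. All names, gain step sizes, and the action set are illustrative assumptions, not the paper's implementation:

```python
class IncrementalPID:
    """Incremental PID: outputs a control increment du(k) instead of an
    absolute command, which suits actuator-rate-limited channels."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0  # e(k-1)
        self.e2 = 0.0  # e(k-2)

    def step(self, e):
        # du(k) = Kp*[e(k)-e(k-1)] + Ki*e(k) + Kd*[e(k)-2e(k-1)+e(k-2)]
        du = (self.kp * (e - self.e1)
              + self.ki * e
              + self.kd * (e - 2.0 * self.e1 + self.e2))
        self.e2, self.e1 = self.e1, e
        return du

# Hypothetical discrete action space: each DQN action adjusts one gain.
GAIN_STEPS = [(+0.1, 0.0, 0.0), (-0.1, 0.0, 0.0),
              (0.0, +0.01, 0.0), (0.0, -0.01, 0.0),
              (0.0, 0.0, +0.05), (0.0, 0.0, -0.05)]

def apply_action(pid, action_idx):
    """Map a DQN action index to a PID-gain adjustment (gains kept >= 0)."""
    dkp, dki, dkd = GAIN_STEPS[action_idx]
    pid.kp = max(0.0, pid.kp + dkp)
    pid.ki = max(0.0, pid.ki + dki)
    pid.kd = max(0.0, pid.kd + dkd)
```

On each training step the agent observes the discretized flight state, picks one of these gain adjustments, and receives the landing reward; once the Q values converge, the tuned gains define the final controller.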
2. Fixed-Wing UAV Modeling
3. Principle of Landing Controller
3.1. Design of Conventional UAV-Landing Controller with PID-Based Guidance Law
3.2. Markov Decision Process
3.3. Principle of Reinforcement Learning Algorithm
3.4. Design of an Automatic Landing Controller Tuning Based on DQN
4. Controller and Algorithm Design
4.1. Structure of Conventional Landing Control System
4.2. Reinforcement Learning Algorithm Example
4.2.1. Intelligent Body Action Space
4.2.2. Reward Function Design
- Landing tracking accuracy
- Flight attitude optimization (a reward sketch combining both terms follows this list)
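A minimal sketch of a per-step reward combining these two terms; the weights, error definitions, and quadratic form are illustrative assumptions, not the paper's exact reward function:

```python
def landing_reward(h_err, v_err, pitch, pitch_ref, w_track=1.0, w_att=0.2):
    """Hypothetical shaped reward for one time step.

    h_err     : altitude deviation from the reference glide slope
    v_err     : airspeed deviation from the approach speed
    pitch     : current pitch angle
    pitch_ref : desired touchdown pitch angle
    """
    tracking = -w_track * (h_err ** 2 + 0.1 * v_err ** 2)  # landing tracking accuracy
    attitude = -w_att * (pitch - pitch_ref) ** 2           # flight attitude optimization
    return tracking + attitude
```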
4.2.3. Neural Network Framework
5. Simulation Experiment and Analysis
5.1. DQN-Network Algorithm Training Experiment
5.2. Controlling Operation Input Experimentation
5.3. Simulation Verification of Controller Landing Condition
5.4. Simulation with Real Atmospheric External Disturbance
5.4.1. Atmospheric Environment and Disturbance Model
5.4.2. Controller Landing Simulation Test in the Condition of External Perturbation
5.4.3. Experimental Results in an Open-Source Simulation Environment Gazebo
6. Conclusions
- (1)
- The actuator model of the UAV is established by combining its kinematics and dynamics equations. On this basis, a control model is built from conventional PID control combined with a stability augmentation system (SAS), which makes the flight control more stable and realizes automatic landing control of the UAV. A method for automatically tuning the PID parameters with deep learning, based on the DQN reinforcement learning method, was devised: the landing control problem was cast as a Markov Decision Process (MDP), a reward function was established to evaluate the landing conditions, and a neural network was built to maximize the expected return and output the action-space PID parameter values at its output layer, realizing adaptive tuning of the PID gains and optimization of the conventional controller.
- (2)
- A simulation experiment model is established by combining the UAV actuator model with the DQN-PID landing controller. Comparative tests of conventional PID control and DQN-PID control under the same working conditions show that the designed controller clearly improves convergence speed and reduces overshoot, which improves the stability and reliability of the UAV flight control, demonstrates that the algorithm has learned a good control strategy, and verifies the controller's validity for practical application.
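The convergence-speed and overshoot claims rest on standard step-response metrics; below is a minimal sketch of how they can be computed from a simulated response trace (a hypothetical helper, not the authors' evaluation code):

```python
import numpy as np

def overshoot_and_settling(t, y, y_ref, band=0.02):
    """Percent overshoot and 2%-band settling time of a response y(t)
    toward y_ref. Assumes a positive step starting from y[0]."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    step = abs(y_ref - y[0])
    overshoot = max(0.0, (y.max() - y_ref) / step) * 100.0
    outside = np.where(np.abs(y - y_ref) > band * step)[0]
    if outside.size == 0:
        t_settle = t[0]                # already inside the band
    elif outside[-1] + 1 < t.size:
        t_settle = t[outside[-1] + 1]  # first instant after the last excursion
    else:
        t_settle = float("nan")        # never settles within the trace
    return overshoot, t_settle
```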
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Actuators

| Symbol | Model | Natural frequency (rad/s) | Damping ratio | Amplitude limit (deg) | Rate limit (deg/s) |
|---|---|---|---|---|---|
|  | Second order | 50 | 0.8 | [−30, 30] | [−90, 90] |

Engine

| Symbol | Model | Frequency (rad/s) | Thrust limit (N) | Rate limit (N/s) |
|---|---|---|---|---|
|  | First order | 2.4 | [4448, 44,480] | [−6450, 8363] |
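For concreteness, a minimal simulation sketch of these two models under the stated limits, assuming explicit Euler integration with a 10 ms step (the discretization and function names are ours, not the paper's):

```python
import numpy as np

def simulate_actuator(cmd, dt=0.01, wn=50.0, zeta=0.8,
                      amp=(-30.0, 30.0), rate=(-90.0, 90.0)):
    """Second-order actuator x'' + 2*zeta*wn*x' + wn^2*x = wn^2*u,
    with amplitude (deg) and rate (deg/s) limits applied."""
    x, xdot, out = 0.0, 0.0, []
    for u in cmd:
        xddot = wn ** 2 * (u - x) - 2.0 * zeta * wn * xdot
        xdot = np.clip(xdot + xddot * dt, rate[0], rate[1])
        x = np.clip(x + xdot * dt, amp[0], amp[1])
        out.append(x)
    return np.array(out)

def simulate_engine(cmd, dt=0.01, w=2.4,
                    thrust=(4448.0, 44480.0), rate=(-6450.0, 8363.0)):
    """First-order engine lag T' = w*(u - T), with thrust (N) and rate (N/s) limits."""
    T, out = thrust[0], []
    for u in cmd:
        Tdot = np.clip(w * (u - T), rate[0], rate[1])
        T = np.clip(T + Tdot * dt, thrust[0], thrust[1])
        out.append(T)
    return np.array(out)
```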
| Hyperparameter | Value |
|---|---|
| Steps | 600 |
| Episodes | 2000 |
| Hidden layer neurons (each hidden layer) | 100 |
| Greediness | 0.90 |
| Discount factor | 0.99 |
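For orientation, these map onto a DQN setup as follows ("greediness" is the ε of the ε-greedy policy, and the discount factor is γ); the network depth, state dimension, and action count below are assumptions beyond what the table states:

```python
import torch.nn as nn

EPISODES = 2000          # training episodes
STEPS_PER_EPISODE = 600  # steps per episode
HIDDEN = 100             # neurons in each hidden layer
EPSILON = 0.90           # greediness of the epsilon-greedy policy
GAMMA = 0.99             # discount factor

class QNetwork(nn.Module):
    """Maps the flight state to Q values over the discrete gain-adjustment actions."""

    def __init__(self, state_dim=4, n_actions=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, n_actions),
        )

    def forward(self, s):
        return self.net(s)
```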
| Parameter | Position | Speed | Track Angle | Pitch Angle |
|---|---|---|---|---|
| Value | (831, 0, 50) | 70 | 3.5 | 2.71 |