Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes
Abstract
:1. Introduction
2. Attitude Dynamics Modeling for Quadrotor UAV
3. Control Design for Attitude Control
3.1. SMC Design
Algorithm 1. Design Methodology of SMC. |
Input: (1) Desired attitude angles (2) Actual attitude angles (3) Model parameters of the quadrotor Output: Control signals for the attitude dynamics model Step 1: Design of the control signal (a) Define the sliding mode surface s (b) Select the reaching law (c) Compute the control signal Step 2: Proof of the stability of the closed-loop system (a) Select a Lyapunov candidate function (b) Calculate the first-order derivative of (c) Analyze the sign of the above derivative of (d) Conclude the convergence of the attitude motion Step 3: Termination If the attitude control errors meet the requirements, conduct the algorithm termination and output the control signal . Otherwise, carry out Step 1 until convergence of the control errors. |
3.2. DDPG-SMC Design
3.2.1. The Architectural Design of DDPG-SMC
3.2.2. The Basic Principle of the DDPG Algorithm
Algorithm 2. DDPG Algorithm. |
Input: Experience replay buffer , initial critic networks’ Q-function parameters , actor networks’ policy parameters , target networks and Initialize the target network parameters: for episode = 1 to M do Initialize stochastic process to add exploration to the action. Observe initial state for time step = 1 to T do Select action Perform action and transfer to next state , then acquire the reward value and the termination signal Store the state transition data in experience replay buffer Calculate the target function: Update the critic network using the minimized loss function: Update the actor network using the policy gradient method: Update target networks: end for end for |
3.2.3. Design of the Neural Network and Parameters Related to DDPG
3.2.4. The Training Results of DDPG
4. Simulation Results
4.1. Simulation Results of SMC
4.2. Simulation Results of AFGS-SMC
4.3. Simulation Results of DDPG-SMC
4.4. Comparative Analysis of Simulation Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Grima, S.; Lin, M.; Meng, Z.; Luo, C.; Chen, Y. The application of unmanned aerial vehicle oblique photography technology in online tourism design. PLoS ONE 2023, 18, e0289653. [Google Scholar]
- Clarke, R. Understanding the drone epidemic. Comput. Law Secur. Rev. 2014, 30, 230–246. [Google Scholar] [CrossRef]
- Xu, B.; Wang, W.; Falzon, G.; Kwan, P.; Guo, L.; Chen, G.; Tait, A.; Schneider, D. Automated cattle counting using Mask R-CNN in quadcopter vision system. Comput. Electron. Agric. 2020, 171, 105300. [Google Scholar] [CrossRef]
- Idrissi, M.; Salami, M.; Annaz, F. A review of quadrotor unmanned aerial vehicles: Applications, architectural design and control algorithms. J. Intell. Robot. Syst. 2022, 104, 22. [Google Scholar] [CrossRef]
- Adiguzel, F.; Mumcu, T.V. Robust discrete-time nonlinear attitude stabilization of a quadrotor UAV subject to time-varying disturbances. Elektron. Elektrotechnika 2021, 27, 4–12. [Google Scholar] [CrossRef]
- Shen, J.; Wang, B.; Chen, B.M.; Bu, R.; Jin, B. Review on wind resistance for quadrotor UAVs: Modeling and controller design. Unmanned Syst. 2022, 11, 5–15. [Google Scholar] [CrossRef]
- Gün, A. Attitude control of a quadrotor using PID controller based on differential evolution algorithm. Expert Syst. Appl. 2023, 229, 120518. [Google Scholar] [CrossRef]
- Zhou, L.; Pljonkin, A.; Singh, P.K. Modeling and PID control of quadrotor UAV based on machine learning. J. Intell. Syst. 2022, 31, 1112–1122. [Google Scholar] [CrossRef]
- Khatoon, S.; Nasiruddin, I.; Shahid, M. Design and simulation of a hybrid PD-ANFIS controller for attitude tracking control of a quadrotor UAV. Arab. J. Sci. Eng. 2017, 42, 5211–5229. [Google Scholar] [CrossRef]
- Landry, B.; Deits, R.; Florence, P.R.; Tedrake, R. Aggressive quadrotor flight through cluttered environments using mixed integer programming. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016. [Google Scholar]
- Bouabdallah, S.; Noth, A.; Siegwart, R. PID vs LQ control techniques applied to an indoor micro quadrotor. In Proceedings of the 2004 1EEE/RSJ Internationel Conference On Intelligent Robots and Systems, Sendal, Japan, 28 September–2 October 2004. [Google Scholar]
- Miranda-Colorado, R.; Aguilar, L.T. Robust PID control of quadrotors with power reduction analysis. ISA Trans. 2020, 98, 47–62. [Google Scholar] [CrossRef] [PubMed]
- Wang, Z.; Zhao, J.; Cai, Z.; Wang, Y.; Liu, N. Onboard actuator model-based incremental nonlinear dynamic inversion for quadrotor attitude control: Method and application. Chin. J. Aeronaut. 2021, 34, 216–227. [Google Scholar] [CrossRef]
- Smeur, E.J.J.; Chu, Q.; de Croon, G.C.H.E. Adaptive incremental nonlinear dynamic inversion for attitude control of micro air vehicles. J. Guid. Control. Dyn. 2016, 39, 450–461. [Google Scholar] [CrossRef]
- da Costa, R.R.; Chu, Q.P.; Mulder, J.A. Reentry flight controller design using nonlinear dynamic inversion. J. Spacecr. Rocket. 2003, 40, 64–71. [Google Scholar] [CrossRef]
- Yang, J.; Cai, Z.; Zhao, J.; Wang, Z.; Ding, Y.; Wang, Y. INDI-based aggressive quadrotor flight control with position and attitude constraints. Robot. Auton. Syst. 2023, 159, 104292. [Google Scholar] [CrossRef]
- Wang, B.; Zhang, Y.; Zhang, W. A composite adaptive fault-tolerant attitude control for a quadrotor UAV with multiple uncertainties. J. Syst. Sci. Complex. 2022, 35, 81–104. [Google Scholar] [CrossRef]
- Huang, T.; Li, T.; Chen, C.L.P.; Li, Y. Attitude stabilization for a quadrotor using adaptive control algorithm. IEEE Trans. Aerosp. Electron. Syst. 2023, 60, 334–347. [Google Scholar] [CrossRef]
- Patnaik, K.; Zhang, W. Adaptive attitude control for foldable quadrotors. IEEE Control. Syst. Lett. 2023, 7, 1291–1296. [Google Scholar] [CrossRef]
- Chen, J.; Long, Y.; Li, T.; Huang, T. Attitude tracking control for quadrotor based on time-varying gain extended state observer. Proc. Inst. Mech. Eng. Part I J. Syst. Control. Eng. 2022, 237, 585–595. [Google Scholar] [CrossRef]
- Zheng, Z.; Su, X.; Jiang, T.; Huang, J. Robust dynamic geofencing attitude control for quadrotor systems. IEEE Trans. Ind. Electron. 2023, 70, 1861–1869. [Google Scholar] [CrossRef]
- Yang, Y.; Yan, Y. Attitude regulation for unmanned quadrotors using adaptive fuzzy gain-scheduling sliding mode control. Aerosp. Sci. Technol. 2016, 54, 208–217. [Google Scholar] [CrossRef]
- Chen, X.; Li, Y.; Ma, H.; Tang, H.; Xie, Y. A novel variable exponential discrete time sliding mode reaching law. IEEE Trans. Circuits Syst. II Express Briefs 2021, 68, 2518–2522. [Google Scholar] [CrossRef]
- Lian, S.; Meng, W.; Lin, Z.; Shao, K.; Zheng, J.; Li, H.; Lu, R. Adaptive attitude control of a quadrotor using fast nonsingular terminal sliding mode. IEEE Trans. Ind. Electron. 2022, 69, 1597–1607. [Google Scholar] [CrossRef]
- Sun, H.; Li, J.; Wang, R.; Yang, K. Attitude control of the quadrotor UAV with mismatched disturbances based on the fractional-order sliding mode and backstepping control subject to actuator faults. Fractal Fract. 2023, 7, 227. [Google Scholar] [CrossRef]
- Belgacem, K.; Mezouar, A.; Essounbouli, N. Design and analysis of adaptive sliding mode with exponential reaching law control for double-fed induction generator based wind turbine. Int. J. Power Electron. Drive Syst. 2018, 9, 1534–1544. [Google Scholar] [CrossRef]
- Mechali, O.; Xu, L.; Xie, X.; Iqbal, J. Fixed-time nonlinear homogeneous sliding mode approach for robust tracking control of multirotor aircraft: Experimental validation. J. Frankl. Inst. 2022, 359, 1971–2029. [Google Scholar] [CrossRef]
- Kelkoul, B.; Boumediene, A. Stability analysis and study between classical sliding mode control (SMC) and super twisting algorithm (STA) for doubly fed induction generator (DFIG) under wind turbine. Energy 2021, 214, 118871. [Google Scholar] [CrossRef]
- Danesh, M.; Jalalaei, A.; Derakhshan, R.E. Auto-landing algorithm for quadrotor UAV using super-twisting second-order sliding mode control in the presence of external disturbances. Int. J. Dyn. Control 2023, 11, 2940–2957. [Google Scholar] [CrossRef]
- Siddique, N.; Rehman, F.U.; Raoof, U.; Iqbal, S.; Rashad, M. Robust hybrid synchronization control of chaotic 3-cell CNN with uncertain parameters using smooth super twisting algorithm. Bull. Pol. Acad. Sci. Tech. Sci. 2023, 71, 1–8. [Google Scholar] [CrossRef]
- Chen, Y.; Cai, B.; Cui, G. The Design of Adaptive Sliding Mode Controller Based on RBFNN Approximation for Suspension Control of MVAWT; 2020 Chinese Automation Congress (CAC): Shanghai, China, 2020. [Google Scholar]
- Wang, D.; Shen, Y.; Sha, Q. Adaptive DDPG design-based sliding-mode control for autonomous underwater vehicles at different speeds. In Proceedings of the 2019 IEEE Underwater Technology (UT), Kaohsiung, Taiwan, 16–19 April 2019. [Google Scholar]
- Nicola, M.; Nicola, C.-I.; Selișteanu, D. Improvement of the control of a grid connected photovoltaic system based on synergetic and sliding mode controllers using a reinforcement learning deep deterministic policy gradient agent. Energies 2022, 15, 2392. [Google Scholar] [CrossRef]
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971. [Google Scholar]
- Mechali, O.; Xu, L.; Huang, Y.; Shi, M.; Xie, X. Observer-based fixed-time continuous nonsingular terminal sliding mode control of quadrotor aircraft under uncertainties and disturbances for robust trajectory tracking: Theory and experiment. Control. Eng. Pract. 2021, 111, 104806. [Google Scholar] [CrossRef]
- Tang, P.; Lin, D.; Zheng, D.; Fan, S.; Ye, J. Observer based finite-time fault tolerant quadrotor attitude control with actuator faults. Aerosp. Sci. Technol. 2020, 104, 105968. [Google Scholar] [CrossRef]
- Nasiri, A.; Kiong Nguang, S.; Swain, A. Adaptive sliding mode control for a class of MIMO nonlinear systems with uncertainties. J. Frankl. Inst. 2014, 351, 2048–2061. [Google Scholar] [CrossRef]
- Silver, D.; Lever, G.; Heess, N. Deterministic policy gradient algorithms. In Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014. [Google Scholar]
Parameter | Value |
---|---|
State dimension of the input layer | 9 |
Action dimension of the output layer | 3 |
Reward discount factor | 0.995 |
Minimum batch size | 128 |
Max steps per episode | 500 |
Max episodes | 200 |
Agent sample time | 0.02 s |
Experience replay buffer | 1 × 106 |
Target smooth factor | 1 × 10−3 |
Parameter | Value |
---|---|
Mass m/kg | 3.350 |
Inertia moment about obxb Jx/(kg·m2) | 0.0588 |
Inertia moment about obyb Jy/(kg·m2) | 0.0588 |
Inertia moment about obzb Jz/(kg·m2) | 0.1076 |
Lift factor b | 8.159 × 10−5 |
Drag factor d | 2.143 × 10−6 |
Distance between the center of mass and the rotation axis of any propeller l/m | 0.195 |
Performance Indicators | SMC | AFGS-SMC | DDPG-SMC |
---|---|---|---|
Convergence time (s) | 2.1 | 2.5 | 2.0 |
Steady-state errors (%) | 2 | 5 | 1 |
Chattering amplitudes (N·m) | (−0.018, 0.024) | (−0.006, 0.006) | (−0.005, 0.005) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hu, W.; Yang, Y.; Liu, Z. Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes. Drones 2024, 8, 95. https://doi.org/10.3390/drones8030095
Hu W, Yang Y, Liu Z. Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes. Drones. 2024; 8(3):95. https://doi.org/10.3390/drones8030095
Chicago/Turabian StyleHu, Wenjun, Yueneng Yang, and Zhiyang Liu. 2024. "Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes" Drones 8, no. 3: 95. https://doi.org/10.3390/drones8030095
APA StyleHu, W., Yang, Y., & Liu, Z. (2024). Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes. Drones, 8(3), 95. https://doi.org/10.3390/drones8030095