Research on PID Parameter Tuning and Optimization Based on SAC-Auto for USV Path Following
Abstract
1. Introduction
2. The Design of the DRL
2.1. Dynamic Model of USV
2.2. Path Following
2.3. SAC Algorithm
2.3.1. Training and Updating of the Actor Network
2.3.2. Training and Updating of V Networks
2.3.3. Training and Updating of Critic-Q Network
2.3.4. The Design of State, Action Space, and Reward
3. Training and Simulation Results
3.1. Network Training
3.2. Simulation Results
3.2.1. Linear Path Following
3.2.2. Curve and Polyline Path Following
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Parameters | Values |
---|---|
USV Weight | 3.273 t |
Length | 7.0 m |
Width | 1.27 m |
Draught Depth | 0.455 m |
Motor Rated Speed | 16.687°/s |
Designed speed | 1.242 m/s |
Moment of inertia | 10.0178 kg·m² |
Center of gravity | 0.2485 m |
Parameters | Values | Parameters | Values |
---|---|---|---|
| | −0.00135 | | 0.001609 |
| | −0.014508 | | −0.001034 |
| | −0.001209 | | −0.019 |
| | −0.000588 | | −0.129 |
| | −0.000564 | | 0.005719 |
| | −0.0022 | | −0.000048 |
| | 0.0015 | | −0.02429 |
| | 0.0 | | 0.0211 |
| | 0.00159 | | 0.00408 |
| | 0.000338 | | −0.003059 |
| | 0.01391 | | −0.00456 |
| | −0.00272 | | 0.00326 |
| | −0.007886 | | 0.003018 |
| | 0.000175 | | −0.002597 |
| | −0.003701 | | 0.000895 |
| | −0.000707 | | −0.001834 |
| | 0.00372 | | 0.001426 |
| | 0.00232 | | −0.001504 |
| | −0.001406 | | 0.001191 |
| | −0.000398 | | |
Agent | DDPG | SAC | SAC-Auto |
---|---|---|---|
Discount factor | 0.99 | 0.99 | 0.99 |
Hidden layer 1 | 400 | 400 | 400 |
Hidden layer 2 | 300 | 300 | 300 |
Activation function | ReLU | ReLU | ReLU |
Batch size | 100 | 100 | 100 |
Experience pool capacity | 10⁶ | 10⁶ | 10⁶ |
| | 0.0001 | 0.0001 | 0.0001 |
| | 0.001 | 0.001 | 0.001 |
| | 0.001 | 0.001 | 0.001 |
| | none | 0.2 | auto |
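
The labeled rows of the table above can be collected into a small configuration sketch. The Python below is illustrative only: the class name `AgentConfig`, its field names, and the `CONFIGS` mapping are hypothetical and do not come from the paper. The experience pool capacity is read as 10⁶, and the final unlabeled row (none / 0.2 / auto) is assumed to be the SAC entropy temperature. The three unlabeled numeric rows are left out because their meaning cannot be recovered from the table.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class AgentConfig:
    """Hyperparameters shared by the three agents (labeled table rows only)."""

    name: str
    discount_factor: float = 0.99
    hidden_sizes: Tuple[int, int] = (400, 300)  # hidden layer 1, hidden layer 2
    activation: str = "relu"
    batch_size: int = 100
    replay_capacity: int = 10**6  # experience pool capacity, read as 10^6
    # Assumption: the unlabeled last row is the entropy temperature.
    # None = no entropy term, a float = fixed value, "auto" = tuned in training.
    entropy_temperature: Union[float, str, None] = None


CONFIGS = {
    "DDPG": AgentConfig("DDPG"),
    "SAC": AgentConfig("SAC", entropy_temperature=0.2),
    "SAC-Auto": AgentConfig("SAC-Auto", entropy_temperature="auto"),
}

if __name__ == "__main__":
    for cfg in CONFIGS.values():
        print(cfg)
```

Keeping the shared values as defaults makes the only difference between the agents, under the stated assumption, explicit in the three constructor calls.
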
Agent | DDPG | SAC | SAC-Auto |
---|---|---|---|
Average return | −0.263 | −0.248 | −0.190 |
MSD | 0.116 | 0.078 | 0.05 |
Controller | Mean Transverse Error (m) | Mean Heading Deviation (°) | Average Operation Time per Step (ms) |
---|---|---|---|
SAC-Auto | 0.128 | 0.221 | 1.63 |
SAC | 0.136 | 0.376 | 1.65 |
DDPG | 0.292 | 0.888 | 1.61 |
Self-adaptive PID | 0.449 | 2.085 | 1.89 |
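
To make the comparison in the table above easier to read, the relative reductions can be worked out directly from its values. The following Python sketch only does arithmetic on the tabulated means; the names `RESULTS`, `relative_reduction`, and `compare_to` are hypothetical helpers introduced here, not part of the paper.

```python
# (mean transverse error in m, mean heading deviation in degrees),
# copied from the table above.
RESULTS = {
    "SAC-Auto": (0.128, 0.221),
    "SAC": (0.136, 0.376),
    "DDPG": (0.292, 0.888),
    "Self-adaptive PID": (0.449, 2.085),
}


def relative_reduction(baseline: float, value: float) -> float:
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline


def compare_to(reference: str = "SAC-Auto") -> dict:
    """Reduction achieved by `reference` relative to every other controller."""
    ref_err, ref_head = RESULTS[reference]
    return {
        name: (
            relative_reduction(err, ref_err),
            relative_reduction(head, ref_head),
        )
        for name, (err, head) in RESULTS.items()
        if name != reference
    }


if __name__ == "__main__":
    for name, (d_err, d_head) in compare_to().items():
        print(f"vs {name}: transverse error {d_err:.1%} lower, "
              f"heading deviation {d_head:.1%} lower")
```

Relative to the self-adaptive PID controller, for instance, this gives roughly a 71% lower mean transverse error and an 89% lower mean heading deviation for SAC-Auto, while the per-step operation times in the table differ by only a fraction of a millisecond.
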
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, L.; Xu, C.; Hao, L.; Yao, J.; Guo, R. Research on PID Parameter Tuning and Optimization Based on SAC-Auto for USV Path Following. J. Mar. Sci. Eng. 2022, 10, 1847. https://doi.org/10.3390/jmse10121847