An Efficient Multi-AUV Cooperative Navigation Method Based on Hierarchical Reinforcement Learning
Abstract
1. Introduction
2. Multi-AUV Cooperative Navigation Model
2.1. Cooperative Navigation Model
2.2. Observability and Observation Error
3. Proposed Method
3.1. Cooperative Navigation Method under the Markov Decision Framework
- (1) State Set
- (2) Action Set
- (3) Reward Function
3.2. Hierarchical Reinforcement Learning-Based Approach
3.2.1. Q-Learning
3.2.2. Abstract Actions
3.2.3. Semi-Markov Decision Process
3.3. Trajectory Planning Method Based on the Hierarchical Model
- (a) Designate the trajectories and relevant parameters for the slave AUVs;
- (b) For each slave AUV, use Equation (10) to determine the discretized system state set and Equation (11) to obtain the discretized action set;
- (c) Use the Q-learning algorithm to train the master AUV for each slave AUV; compute the instantaneous reward of the master AUV’s actions using Equation (14), and eventually obtain the optimal action-value functions;
- (d) Initialize the state of the master AUV, partition the sub-navigation processes, and randomly select one slave AUV along with its corresponding optimal action-value function;
- (e) Use Equation (16) to select and execute an optimal action. For each slave AUV, compute the action’s cost value and add it to that AUV’s total cost;
- (f) Repeat Step (e) until the sub-navigation process concludes, obtaining the total cost value for each slave AUV. Then select the AUV with the highest cumulative cost value for the next sub-navigation process;
- (g) Upon completion of the final sub-navigation process, obtain the planned trajectory for the master AUV;
- (h) The master AUV and the slave AUVs navigate along their designated trajectories. Periodically, the master and slave AUVs communicate acoustically and measure the distances between each other. The slave AUVs correct the cumulative errors in their navigation system outputs and apply the UKF filtering algorithm to the master AUV’s position data, the relative distance measurements, and their own navigation data.
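Steps (d) to (g) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `plan_master_trajectory` and the callables `observe`, `transition`, and `cost` are hypothetical placeholders for the state discretization, the master AUV motion model, and the per-slave cost value, and a plain greedy argmax over the trained Q-table stands in for the selection rule of Equation (16).

```python
import random
from typing import Callable, Dict, List, Sequence

QTable = Sequence[Sequence[float]]  # q_table[state][action], one table per slave AUV


def plan_master_trajectory(
    q_tables: Dict[str, QTable],
    observe: Callable[[str, object], int],        # (slave, master state) -> state index
    transition: Callable[[object, int], object],  # (master state, action) -> next state
    cost: Callable[[str, object, int], float],    # (slave, master state, action) -> cost
    initial_state: object,
    n_processes: int,
    steps_per_process: int,
    rng: random.Random,
) -> List[object]:
    """Hypothetical sketch of steps (d)-(g): switch Q-tables between sub-processes."""
    slaves = sorted(q_tables)
    active = rng.choice(slaves)  # step (d): start with a randomly chosen slave AUV
    state = initial_state
    trajectory = [state]
    for _ in range(n_processes):
        totals = {name: 0.0 for name in slaves}
        for _ in range(steps_per_process):
            # Step (e): greedy action from the active slave's Q-table (assumption).
            row = q_tables[active][observe(active, state)]
            action = max(range(len(row)), key=row.__getitem__)
            # Accumulate the cost of this action for every slave AUV.
            for name in slaves:
                totals[name] += cost(name, state, action)
            state = transition(state, action)
            trajectory.append(state)
        # Step (f): the slave with the highest cumulative cost drives the next process.
        active = max(slaves, key=totals.__getitem__)
    return trajectory  # step (g): planned trajectory of the master AUV
```

Switching to the slave AUV with the highest accumulated cost means the master AUV spends the next sub-navigation process favoring whichever slave was served worst in the previous one.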
4. Simulation Experiments
4.1. Algorithm Simulation Analysis
4.2. Simulation Parameter Settings
4.3. Trajectory Planning Analysis
4.4. UKF Filtering Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Slave AUV | Maximum Distance (m) | Minimum Distance (m) | Average Distance (m) |
---|---|---|---|
AUV1 | 410.096 | 15.387 | 223.938 |
AUV2 | 315.016 | 86.083 | 192.137 |
Slave AUV | Maximum Distance (m) | Minimum Distance (m) | Average Distance (m) |
---|---|---|---|
AUV1 | 5.950 | 1.737 | 4.271 |
AUV2 | 5.174 | 2.497 | 3.836 |
Parameter | Master AUV | Slave AUV |
---|---|---|
Speed measurement noise (m/s) | 0.5 | 1.5 |
Angular speed measurement noise (rad/s) | 0.1 | 0.5 |
Acoustic measurement noise (m) | 8 | 8 |
Acoustic measurement period (s) | 10 | 10 |
Action and State | Discrete Quantity | Number |
---|---|---|
Distance measurement azimuth angle (°) | [0, 10), [10, 20), …, [350, 360) | 36 |
Relative distance (m) | [0, 100), [100, 300), [300, 600), [600, 900), [900, ∞) | 5 |
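The binning in the table above can be illustrated with a short sketch. The bin edges come from the table; the helper names (`azimuth_bin`, `distance_bin`, `joint_index`) are hypothetical, and combining the two bins into a single index is an assumption made only for illustration, since the table does not say how the two quantities are split between state and action.

```python
import bisect

AZIMUTH_BIN_WIDTH_DEG = 10.0                      # 36 bins covering [0, 360)
N_AZIMUTH_BINS = 36
DISTANCE_EDGES_M = [100.0, 300.0, 600.0, 900.0]   # 5 bins, the last one open-ended
N_DISTANCE_BINS = len(DISTANCE_EDGES_M) + 1


def azimuth_bin(azimuth_deg: float) -> int:
    """Map an azimuth angle in degrees to one of the 36 ten-degree bins."""
    wrapped = azimuth_deg % 360.0
    # min() guards against floating-point wrap-around yielding exactly 360.0.
    return min(int(wrapped // AZIMUTH_BIN_WIDTH_DEG), N_AZIMUTH_BINS - 1)


def distance_bin(distance_m: float) -> int:
    """Map a relative distance in metres to one of the 5 distance bins."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # bisect_right keeps each interval closed on the left, e.g. 100 m falls in [100, 300).
    return bisect.bisect_right(DISTANCE_EDGES_M, distance_m)


def joint_index(azimuth_deg: float, distance_m: float) -> int:
    """Hypothetical flat index over the 36 x 5 = 180 (azimuth, distance) pairs."""
    return azimuth_bin(azimuth_deg) * N_DISTANCE_BINS + distance_bin(distance_m)
```

For example, an azimuth of 45° and a relative distance of 250 m fall into azimuth bin 4 and distance bin 1.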
Parameter | Symbol | Value |
---|---|---|
Error propagation factor | | 0.1 |
Acoustic measurement accuracy | | 1 |
Penalty coefficient | | 0.06 |
Parameter | Symbol | Value |
---|---|---|
Study step (learning rate) | | 0.015 |
Decay (discount) factor | | 0.9 |
Exploration rate | | 0.1 |
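As an illustration of how these three values enter tabular Q-learning, the sketch below applies the standard Q-learning update with an epsilon-greedy policy. It assumes that the "study step" is the learning rate and the "decay factor" is the discount factor; the function names and the list-of-lists Q-table layout are hypothetical, and the reward is passed in rather than computed from Equation (14).

```python
import random

ALPHA = 0.015    # study step from the table, read here as the learning rate (assumption)
GAMMA = 0.9      # decay factor from the table, read here as the discount factor (assumption)
EPSILON = 0.1    # exploration rate from the table


def epsilon_greedy(q_row, rng):
    """Pick a random action with probability EPSILON, otherwise the greedy one."""
    if rng.random() < EPSILON:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, state, action, reward, next_state):
    """Standard Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + GAMMA * max(q[next_state])
    q[state][action] += ALPHA * (td_target - q[state][action])
    return q[state][action]
```

With a learning rate this small, each update moves the stored value only 1.5% of the way toward the temporal-difference target, so many training episodes are needed before the action-value function settles.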
AUV | Speed Measurement Noise (m/s) | Angular Speed Measurement Noise (rad/s) | Acoustic Measurement Noise (m) |
---|---|---|---|
Master AUV | | | |
Slave AUV | | | |
Slave AUV | Average RMS Error, X-Axis Distance (m) | Average RMS Error, Y-Axis Distance (m) | Average Relative Error (m) |
---|---|---|---|
AUV1 | 22.20 | 184.46 | 322.04 |
AUV2 | 21.47 | 183.44 | 315.18 |
AUV3 | 22.27 | 179.80 | 314.63 |
Slave AUV | Average RMS Error, X-Axis Distance (m) | Average RMS Error, Y-Axis Distance (m) | Average Relative Error (m) |
---|---|---|---|
AUV1 | 15.75 | 46.35 | 80.37 |
AUV2 | 16.45 | 51.60 | 94.32 |
AUV3 | 14.59 | 48.64 | 80.51 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhu, Z.; Zhang, L.; Liu, L.; Wu, D.; Bai, S.; Ren, R.; Geng, W. An Efficient Multi-AUV Cooperative Navigation Method Based on Hierarchical Reinforcement Learning. J. Mar. Sci. Eng. 2023, 11, 1863. https://doi.org/10.3390/jmse11101863