Deep Learning for Efficient and Optimal Motion Planning for AUVs with Disturbances
Abstract
1. Introduction
- We solve a motion planning problem for an AUV using computationally efficient tools, namely the Deep Galerkin Method (DGM), which solves a continuous-time, continuous-state nonlinear optimal control problem. This tool is computationally efficient and departs from numerous approaches that rely on state discretization;
- The optimal control problem we focus on consists of reaching a target position as soon as possible. Since this problem cannot be solved directly with DGM, which requires a fixed time horizon, we develop a surrogate problem that can be solved using DGM and returns a control equivalent to that of our original problem. This is an advance, as it extends the range of uses of DGM to problems with an unknown time horizon;
- We take into account the effect of several disturbances in the computation of our optimal trajectory. Disturbances are frequent in the ocean, where currents and swirls affect the motion of the AUV. We model some of these disturbances and show that DGM obtains optimal trajectories that account for their effects.
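As a rough illustration of the kind of disturbances mentioned above, the sketch below implements two hypothetical planar disturbance fields, a swirl and a uniform current. The paper defines its disturbance models in Section 2.2; the functional forms, decay law, and parameter names here are assumptions for illustration only.

```python
import numpy as np

def swirl(pos, center=np.zeros(2), strength=1.0):
    """Hypothetical swirl: velocity perpendicular to the radius vector,
    with magnitude decaying away from the swirl center."""
    r = pos - center
    d = np.linalg.norm(r) + 1e-9           # avoid division by zero at the center
    tangent = np.array([-r[1], r[0]]) / d  # unit vector perpendicular to r
    return strength * tangent / (1.0 + d)  # decay with distance from the center

def current(pos, velocity=np.array([0.5, 0.0])):
    """Hypothetical uniform current: the same drift at every position."""
    return velocity

# A disturbance w(x) would enter the dynamics additively: x_dot = f(x, u) + w(x).
p = np.array([1.0, 0.0])
print(swirl(p))    # tangential: points in the +y direction at (1, 0)
print(current(p))
```

The swirl pushes the vehicle tangentially around its center, while the current adds the same drift everywhere; both perturb the trajectory the planner must compensate for.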
2. Setup Description
2.1. Underwater Navigation Model
2.2. Disturbance Models
3. Optimal Control Motion Planning
3.1. Continuous Time Optimal Control
- $t \in [0, t_f]$ is the time, where $t_f$ is the final time;
- $x(\cdot)$ is the state trajectory, where $x(t) \in \mathbb{R}^n$ is the state at time $t$;
- $u(\cdot)$ is the control trajectory, where $u(t)$ is the control at time $t$. The controls belong to the set of admissible controls $U$;
- $J$ is the cost functional to be minimized, formed by a terminal cost functional $\Phi(x(t_f))$ and a running cost functional $\int_0^{t_f} L(x(t), u(t), t)\,dt$;
- The transition function $f$ controls the state evolution, $\dot{x}(t) = f(x(t), u(t), t)$, as a function of $t$, $x(t)$, and $u(t)$;
- $x(0) = x_0$ is the initial state;
- $\Psi(x(t_f)) = 0$ is the final condition that must hold at $t = t_f$.
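The ingredients listed above (dynamics, running cost, terminal cost) can be made concrete by numerically evaluating $J$ for a given control: forward-integrate the dynamics with Euler steps and accumulate the running cost. The double-integrator-style dynamics and quadratic costs below are illustrative placeholders, not the AUV model of Section 2.

```python
import numpy as np

def evaluate_cost(f, L, Phi, x0, u_fn, tf, dt=1e-3):
    """Euler-integrate x_dot = f(x, u, t) and accumulate
    J = Phi(x(tf)) + integral of L(x, u, t) dt."""
    x, t, J = np.array(x0, dtype=float), 0.0, 0.0
    while t < tf:
        u = u_fn(x, t)
        J += L(x, u, t) * dt     # running cost contribution
        x = x + f(x, u, t) * dt  # Euler step of the dynamics
        t += dt
    return J + Phi(x)            # add the terminal cost

# Illustrative 1-D example: x_dot = u, running cost x^2 + u^2, no terminal cost.
f   = lambda x, u, t: u
L   = lambda x, u, t: float(x @ x + u @ u)
Phi = lambda x: 0.0
J = evaluate_cost(f, L, Phi, x0=[1.0], u_fn=lambda x, t: -x, tf=2.0)
print(J)  # close to (1 - e^{-4})/... i.e. roughly 0.98 for this feedback law
```

With the feedback $u = -x$ the state decays as $e^{-t}$, so the integral of $2e^{-2t}$ over $[0, 2]$ is $1 - e^{-4} \approx 0.98$, which the Euler sum approximates.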
3.1.1. Dynamic Programming Methods for Continuous Time
- The state space is discretized. Note that this means we may face the curse of dimensionality if the state dimension $n$ is large;
- The value function is estimated iteratively by approximating its derivative with respect to the state using finite differences;
- Depending on the state drift, for each state, the derivative is approximated using the backward or the forward finite-difference approximation (i.e., an upwind scheme).
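As a minimal illustration of this upwind finite-difference idea, the sketch below runs value iteration for a 1-D minimum-time problem with $\dot{x} = u$, $|u| = 1$, and the origin as target: each update takes the value from the upwind neighbor (backward cell for $u = -1$, forward cell for $u = +1$) plus the $h$ seconds needed to cross one cell. This toy problem, grid, and stopping rule are assumptions for illustration; the true value function is $V(x) = |x|$.

```python
import numpy as np

# Grid on [-1, 1]; the target (origin) sits at the middle node.
n, h = 201, 0.01
x = np.linspace(-1.0, 1.0, n)
V = np.full(n, np.inf)
V[n // 2] = 0.0  # boundary condition: zero time-to-go at the target

# Upwind value iteration: crossing one grid cell at speed |u| = 1 costs h,
# and the upwind neighbor is chosen by the sign of the drift u.
for _ in range(n):
    left  = np.r_[np.inf, V[:-1]]   # V_{i-1}: upwind neighbor for u = -1
    right = np.r_[V[1:], np.inf]    # V_{i+1}: upwind neighbor for u = +1
    V = np.minimum(V, np.minimum(left, right) + h)
    V[n // 2] = 0.0                 # re-impose the boundary condition

print(np.max(np.abs(V - np.abs(x))))  # near 0: the scheme recovers V(x) = |x|
```

Each sweep propagates correct values one cell outward from the target, which is why the number of sweeps matches the grid size; in $n$ dimensions the grid (and hence the cost) grows exponentially, which is exactly the curse of dimensionality noted above.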
3.1.2. Deep Galerkin Method
3.2. Continuous Time Surrogate Control
3.3. Surrogate OCP for Continuous Time
- The problem now has a fixed terminal time $t_f$, which is required by the DGM;
- As $t_f$ is fixed, we must change the functional $J$. Recall that our target was to move the AUV as close as possible to the origin; we achieve this by using, as a running cost, a hyperbolic tangent function that depends on the distance $d(t)$ of the AUV to the origin at each time, which is $\tanh(d(t))$. Thus, note that $\tanh(d(t)) \approx 1$ when the AUV is far from the origin, and as the AUV approaches the origin, $\tanh(d(t)) \to 0$. In other words, our surrogate cost functional penalizes being far from the origin and rewards positions of the AUV that are as close as possible to the origin.
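The hyperbolic-tangent running cost described above can be sketched directly; the unit scaling of the distance inside the $\tanh$ is an assumption (the paper may scale $d$ by a constant).

```python
import numpy as np

def surrogate_running_cost(position, alpha=1.0):
    """Hypothetical surrogate running cost: tanh of the (scaled) distance
    to the origin. Saturates near 1 far away, vanishes near the target."""
    return np.tanh(alpha * np.linalg.norm(position))

far  = surrogate_running_cost(np.array([10.0, 0.0]))   # ~1: far positions penalized
near = surrogate_running_cost(np.array([0.01, 0.0]))   # ~0: near origin barely penalized
print(far, near)
```

The saturation is what makes the cost a useful surrogate for minimum time: once the AUV is far away, the penalty is roughly constant per unit time, so minimizing the integral pushes the controller to shrink the time spent far from the target.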
4. Empirical Simulation
- First, we explain the setup we use for our simulations in Section 4.1;
- Then, we train DGM and obtain the optimal control for the surrogate problem. We study the value function and control functions obtained, as well as the training convergence, in Section 4.2;
- Afterwards, we study the performance of DGM in the surrogate problem (12) in Section 4.3;
- Finally, we study in depth how the control obtained in the surrogate problem applies to our original problem, i.e., the time minimization problem (10), in Section 4.4.
4.1. Simulation Setup
4.2. Training Results
4.3. Surrogate Problem Results
4.4. Original Problem Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Disturbance | Swirl | Current | Constant |
|---|---|---|---|
| Average $J$ DGM | | | |
| Average $J$ baseline | | | |
| $e$ | | | |
| Improvement proportion | | | |
Share and Cite
Parras, J.; Apellániz, P.A.; Zazo, S. Deep Learning for Efficient and Optimal Motion Planning for AUVs with Disturbances. Sensors 2021, 21, 5011. https://doi.org/10.3390/s21155011