Adaptive Position Control of Pneumatic Continuum Manipulator Based on MAML Meta-Reinforcement Learning
Abstract
1. Introduction
2. MAML Meta-Reinforcement Learning
Algorithm 1: MAML meta-reinforcement learning
Require: Task distribution $p(\mathcal{T})$
Require: Learning rates $\alpha$, $\beta$
1: Randomly initialize policy parameters $\theta$
2: while not done do
3: Sample a batch of tasks $\mathcal{T}_i \sim p(\mathcal{T})$
4: for all $\mathcal{T}_i$ do
5: Interact with the environment using policy $\pi_\theta$ to collect data $\mathcal{D}_i$
6: Calculate $\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(\pi_\theta)$ based on $\mathcal{D}_i$ and $\mathcal{L}_{\mathcal{T}_i}$
7: Update $\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(\pi_\theta)$
8: Interact with the environment using policy $\pi_{\theta_i'}$ to collect data $\mathcal{D}_i'$
9: end for
10: Update $\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(\pi_{\theta_i'})$
11: end while
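To make the bilevel structure of Algorithm 1 concrete, the sketch below implements the inner adaptation step (lines 5–7) and the outer meta-update (line 10) in PyTorch. It is a minimal illustration, not the authors' implementation: the paper adapts a position-control policy (apparently with a trust-region outer update, judging by the hyperparameter table in Section 4), whereas this sketch uses a toy sine-regression task family so the code stays self-contained. The network size, step sizes, and task sampler are all assumed for illustration.

```python
# Minimal MAML sketch in PyTorch: inner-loop adaptation via an explicit
# gradient step, outer-loop meta-update differentiated through the adapted
# parameters. The sine-fitting task family, network size, and step sizes
# are illustrative assumptions, not the paper's setup.
import torch

def make_task():
    # Hypothetical task family: fit y = a*sin(x + b) with random a, b.
    a = torch.empty(1).uniform_(0.5, 2.0)
    b = torch.empty(1).uniform_(0.0, 3.14)
    def sample(n=20):
        x = torch.empty(n, 1).uniform_(-3.0, 3.0)
        return x, a * torch.sin(x + b)
    return sample

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
alpha, beta = 0.01, 0.001              # inner / outer learning rates (assumed)
meta_opt = torch.optim.Adam(net.parameters(), lr=beta)
loss_fn = torch.nn.MSELoss()

for step in range(1000):               # outer loop (Algorithm 1, line 2)
    meta_opt.zero_grad()
    for _ in range(20):                # batch of sampled tasks (line 3)
        task = make_task()
        x, y = task()                  # collect data D_i (line 5)
        # Inner update theta_i' = theta - alpha * grad L (lines 6-7);
        # create_graph=True keeps the second-order dependence on theta.
        grads = torch.autograd.grad(loss_fn(net(x), y),
                                    tuple(net.parameters()), create_graph=True)
        adapted = [p - alpha * g for p, g in zip(net.parameters(), grads)]
        xq, yq = task()                # fresh data D_i' (line 8)
        # Forward pass with the adapted parameters (manual, because the
        # nn.Module still holds theta, not theta_i').
        h = torch.tanh(xq @ adapted[0].t() + adapted[1])
        loss_fn(h @ adapted[2].t() + adapted[3], yq).backward()  # line 10
    meta_opt.step()                    # meta-update of theta (line 10)
```

Replacing the regression loss with a policy-gradient surrogate, and the Adam step with a trust-region update, would recover the reinforcement-learning form of Algorithm 1.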
2.1. Inner Reinforcement Learning Loop
2.2. Outer Reinforcement Learning Loop
3. Position Control Policy Training of the PCM Based on MAML Meta-Reinforcement Learning
3.1. The Structure of the PCM
3.2. Meta-Reinforcement Learning Modeling and Policy Network Structure Design
4. Simulation Analysis
4.1. Simulation Environment
4.2. Training Overview of MAML Meta-Reinforcement Learning
4.3. Simulation Results
4.3.1. Task 1
4.3.2. Task 2
5. Experiment
5.1. Experimental Platform
5.2. Experimental Analysis
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- You, X.; Zhang, Y.; Chen, X.; Liu, X.; Wang, Z.; Jiang, H.; Chen, X. Model-free control for soft manipulators based on reinforcement learning. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 2909–2915.
- Satheeshbabu, S.; Uppalapati, N.K.; Chowdhary, G.; Krishnan, G. Open loop position control of soft continuum arm using deep reinforcement learning. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 5133–5139.
- Satheeshbabu, S.; Uppalapati, N.K.; Fu, T.; Krishnan, G. Continuous control of a soft continuum arm using deep reinforcement learning. In Proceedings of the 2020 3rd IEEE International Conference on Soft Robotics (RoboSoft), New Haven, CT, USA, 15 May–15 July 2020; pp. 497–503.
- Li, Y.; Wang, X.; Kwok, K. Towards adaptive continuous control of soft robotic manipulator using reinforcement learning. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 7074–7081.
- Centurelli, A.; Rizzo, A.; Tolu, S.; Laschi, C. Closed-loop dynamic control of a soft manipulator using deep reinforcement learning. IEEE Robot. Autom. Lett. 2022, 7, 4741–4748.
- Pertsch, K.; Lee, Y.; Lim, J.J. Accelerating reinforcement learning with learned skill priors. arXiv 2020, arXiv:2010.11944.
- Rusu, A.A.; Gómez Colmenarejo, S.; Gulcehre, C.; Desjardins, G.; Kirkpatrick, J.; Pascanu, R.; Mnih, V.; Kavukcuoglu, K.; Hadsell, R. Policy distillation. arXiv 2015, arXiv:1511.06295.
- Zhang, Y.; Yang, Q. A survey on multi-task learning. IEEE Trans. Knowl. Data Eng. 2021, 34, 5586–5609.
- Schweighofer, N.; Doya, K. Meta-learning in reinforcement learning. Neural Netw. 2003, 16, 5–9.
- Duan, Y.; Schulman, J.; Chen, X.; Bartlett, P.L.; Sutskever, I.; Abbeel, P. RL2: Fast reinforcement learning via slow reinforcement learning. arXiv 2016, arXiv:1611.02779.
- Mishra, N.; Rohaninejad, M.; Chen, X.; Abbeel, P. A simple neural attentive meta-learner. arXiv 2017, arXiv:1707.03141.
- Parisotto, E.; Ghosh, S.; Yalamanchi, S.B.; Chinnaobireddy, V.; Wu, Y.; Salakhutdinov, R. Concurrent meta reinforcement learning. arXiv 2019, arXiv:1903.02710.
- Rakelly, K.; Zhou, A.; Finn, C.; Levine, S.; Quillen, D. Efficient off-policy meta-reinforcement learning via probabilistic context variables. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 5331–5340.
- Zhang, J.; Wang, J.; Hu, H.; Chen, T.; Chen, Y.; Fan, C.; Zhang, C. MetaCURE: Meta reinforcement learning with empowerment-driven exploration. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 12600–12610.
- Fakoor, R.; Chaudhari, P.; Soatto, S.; Smola, A.J. Meta-Q-Learning. arXiv 2019, arXiv:1910.00125.
- Hu, S. Research of Meta-Reinforcement Learning and Its Application in Multi-Legged Robots. Master’s Thesis, China University of Mining and Technology, Xuzhou, China, 2023.
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1126–1135.
- Song, X.; Gao, W.; Yang, Y.; Choromanski, K.; Pacchiano, A.; Tang, Y. ES-MAML: Simple Hessian-free meta learning. arXiv 2019, arXiv:1910.01215.
- Nichol, A.; Schulman, J. Reptile: A scalable metalearning algorithm. arXiv 2018, arXiv:1803.02999.
- Gupta, A.; Mendonca, R.; Liu, Y.X.; Abbeel, P.; Levine, S. Meta-reinforcement learning of structured exploration strategies. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, QC, Canada, 3–8 December 2018; pp. 5307–5316.
- Liu, H.; Socher, R.; Xiong, C. Taming MAML: Efficient unbiased meta-reinforcement learning. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 4061–4071.
- Li, M.; Xu, G.; Gao, X.; Tan, C. A reaching skill learning method of manipulators based on meta-Q-learning and DDPG. J. Nanjing Univ. Posts Telecommun. (Nat. Sci. Ed.) 2023, 43, 96–103.
- Hao, T. Research on Meta-Optimizer and Meta-Reinforcement Learning Based on Bi-Level Optimization Meta-Learning. Master’s Thesis, Shandong University, Jinan, China, 2022.
- Schoettler, G.; Nair, A.; Ojea, J.A.; Levine, S.; Solowjow, E. Meta-reinforcement learning for robotic industrial insertion tasks. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 9728–9735.
- Peng, K. Robot Motion Control Method Based on Particle Swarm Optimization and Meta-Reinforcement Learning. Master’s Thesis, Yangzhou University, Yangzhou, China, 2022.
- Wang, L.; Zhang, Y.; Zhu, D.; Coleman, S.; Kerr, D. Supervised meta-reinforcement learning with trajectory optimization for manipulation tasks. IEEE Trans. Cognit. Dev. Syst. 2023, 16, 681–691.
| Parameters | Values |
|---|---|
| Learning rate | |
| Size of the trust-region | |
| Conjugate gradient iterations | 10 |
| Maximum line search iterations | 15 |
| Line search backtrack coefficient | 0.8 |
| Inner loop task samples | 20 |
| Number of trajectories collected per task | 20 |
| Inner loop optimization iterations | 1 |
| Outer loop task samples | 20 |
| Outer loop optimization iterations | 1000 |
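For orientation, the hyperparameters above can be collected into a single configuration block; a sketch follows. The trust-region, conjugate-gradient, and line-search entries suggest a TRPO-style outer optimizer, which the key names reflect. The names are ours, not the authors', and the two values that did not survive extraction (learning rate and trust-region size) are left as None placeholders rather than guessed.

```python
# Hypothetical configuration mirroring the hyperparameter table; key names
# are illustrative. None marks values not recoverable from this copy of the
# paper; consult the original article before reuse.
maml_trpo_config = {
    "learning_rate": None,               # inner-loop step size (not recovered)
    "trust_region_size": None,           # trust-region bound (not recovered)
    "conjugate_gradient_iters": 10,
    "max_line_search_iters": 15,
    "line_search_backtrack_coef": 0.8,
    "inner_loop_task_samples": 20,
    "trajectories_per_task": 20,
    "inner_loop_opt_iters": 1,
    "outer_loop_task_samples": 20,
    "outer_loop_opt_iters": 1000,
}
```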
| Algorithm | Training Time (min) | Maximum Absolute Error (mm) |
|---|---|---|
| MAML meta-reinforcement learning | 5 | 2.82 |
| PPO | 28 | 41.12 |
| Actor–Critic | 17 | 23.97 |
| PILCO | 47 | 4.6 |
| Algorithm | Training Time (min) | Maximum Absolute Error (mm) |
|---|---|---|
| MAML meta-reinforcement learning | 5 | 4.9 |
| PPO | 20 | 34.8 |
| Actor–Critic | 25 | 25.9 |
| PILCO | 47 | 3.9 |