Repetition-Based Approach for Task Adaptation in Imitation Learning
Abstract
1. Introduction
- An imitation learning (IL) agent is proposed that learns an optimal policy from expert-generated demonstration data. The agent encodes its knowledge into a high-dimensional task-embedding space so that this knowledge can be expanded in the later adaptation process (one plausible wiring of such an agent is sketched after this list).
- Given a new target task, a task adaptation algorithm is proposed that leverages the idea of repetition learning from neuroscience to enable the agent to broaden its knowledge without forgetting the previously learned source task. The resulting agent generalizes better and performs consistently well on both the source and target tasks.
- A set of experiments is conducted on a number of simulated tasks to evaluate the proposed task adaptation method in terms of success rate, average cumulative reward, and computational cost. The results demonstrate the effectiveness of the proposed method in comparison with existing transfer learning methods.
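For concreteness, the following is one plausible wiring of the components named in Section 4.1: a task-embedding network E, an action generator G conditioned on the embedding, and a discriminator D trained with a generic GAIL-style [57] adversarial update. This PyTorch sketch is an assumption-laden illustration, not the authors' implementation; all dimensions, layer sizes, losses, and optimizers are placeholders, and the full objective in Section 4.1.3 combines additional terms that are not shown here.

```python
# Illustrative sketch only: a task-embedding network E, an action generator G
# conditioned on the embedding, and a discriminator D trained with a GAIL-style
# adversarial loss. All sizes and hyperparameters are placeholders.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, EMBED_DIM = 39, 4, 64  # assumed sizes

class TaskEmbedding(nn.Module):  # E: state -> task-embedding vector
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, EMBED_DIM))
    def forward(self, s):
        return self.net(s)

class Generator(nn.Module):  # G: (state, embedding) -> action
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + EMBED_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, ACTION_DIM), nn.Tanh())
    def forward(self, s, z):
        return self.net(torch.cat([s, z], dim=-1))

class Discriminator(nn.Module):  # D: scores (state, action) pairs as expert vs. generated
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

E, G, D = TaskEmbedding(), Generator(), Discriminator()
opt_eg = torch.optim.Adam(list(E.parameters()) + list(G.parameters()), lr=3e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

def adversarial_step(expert_s, expert_a):
    """One adversarial update on a batch of expert (state, action) pairs."""
    ones = torch.ones(expert_s.size(0), 1)
    zeros = torch.zeros(expert_s.size(0), 1)
    # Discriminator: push expert pairs toward 1 and generated pairs toward 0.
    with torch.no_grad():
        fake_a = G(expert_s, E(expert_s))
    d_loss = bce(D(expert_s, expert_a), ones) + bce(D(expert_s, fake_a), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Embedding + generator: produce actions the discriminator labels as expert.
    g_loss = bce(D(expert_s, G(expert_s, E(expert_s))), ones)
    opt_eg.zero_grad(); g_loss.backward(); opt_eg.step()
```

During source-task training (Algorithm 1), a step such as `adversarial_step` would be applied to mini-batches sampled from the expert demonstrations.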
2. Related Work
3. Problem Formulation
4. The Proposed Agent and Adaptation Algorithm
4.1. The Proposed Agent
4.1.1. Task-Embedding Network E
4.1.2. Action Generator Network G and Discriminator Network D
4.1.3. Full Objective
Algorithm 1 Training the proposed agent on the source task.
4.2. The Proposed Task Adaptation Algorithm
Algorithm 2 The proposed adaptation algorithm.
5. Performance Evaluation
- Can the proposed IL agent provide competitive performance on the source task?
- Can the adaptation algorithm enable the agent to adapt its learned knowledge to the target task and outperform the baselines?
- By leveraging repetition learning to expand the agent's knowledge, can the adaptation algorithm reduce the deterioration of the agent's performance on the source task? (A minimal sketch of how the evaluation metrics can be computed follows this list.)
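The first two questions are answered using the success rate and average cumulative reward over evaluation rollouts. As a point of reference only, the following is a minimal rollout-evaluation sketch; it assumes a pre-0.26 Gym release with the 4-tuple `step` API and a Meta-World-style `info["success"]` flag, and the environment ID, episode count, and random policy are placeholders rather than the authors' evaluation setup.

```python
# Minimal rollout evaluation: success rate and average cumulative reward.
# Assumes a pre-0.26 Gym API (reset() -> obs, step() -> obs, reward, done, info)
# and a Meta-World-style info["success"] flag; both are assumptions.
import gym
import numpy as np

def evaluate(env, policy, episodes=100):
    returns, successes = [], 0
    for _ in range(episodes):
        obs, done, total, solved = env.reset(), False, 0.0, False
        while not done:
            obs, reward, done, info = env.step(policy(obs))
            total += reward
            solved = solved or info.get("success", 0) == 1
        returns.append(total)
        successes += int(solved)
    return successes / episodes, float(np.mean(returns)), float(np.std(returns))

# Example: a random policy on the Pendulum source task.
env = gym.make("Pendulum-v1")
rate, mean_ret, std_ret = evaluate(env, lambda obs: env.action_space.sample(), episodes=10)
print(f"success rate: {rate:.0%}, average cumulative reward: {mean_ret:.2f} ± {std_ret:.2f}")
```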
5.1. Experimental Settings
5.1.1. Simulated Tasks
5.1.2. Baselines
5.1.3. Implementation and Training Details
5.2. Results
5.2.1. Performance of the Proposed Agent on the Source Task
5.2.2. Performance of the Proposed Agent on the Target Task after Adaptation
5.2.3. Performance of the Proposed Agent on the Source Task after Adaptation
5.2.4. Computational Complexity
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
- Matas, J.; James, S.; Davison, A.J. Sim-to-real reinforcement learning for deformable object manipulation. In Proceedings of the Conference on Robot Learning, Zürich, Switzerland, 29–31 October 2018; pp. 734–743. [Google Scholar]
- Mohammed, M.Q.; Chung, K.L.; Chyi, C.S. Review of deep reinforcement learning-based object grasping: Techniques, open challenges, and recommendations. IEEE Access 2020, 8, 178450–178481. [Google Scholar] [CrossRef]
- Li, R.; Jabri, A.; Darrell, T.; Agrawal, P. Towards practical multi-object manipulation using relational reinforcement learning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Virtual, 31 May–4 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 4051–4058. [Google Scholar]
- Han, H.; Paul, G.; Matsubara, T. Model-based reinforcement learning approach for deformable linear object manipulation. In Proceedings of the 2017 13th IEEE Conference on Automation Science and Engineering (CASE), Shaanxi, China, 20–23 August 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 750–755. [Google Scholar]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing atari with deep reinforcement learning. arXiv 2013, arXiv:1312.5602. [Google Scholar]
- Jeerige, A.; Bein, D.; Verma, A. Comparison of deep reinforcement learning approaches for intelligent game playing. In Proceedings of the 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 7–9 January 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 0366–0371. [Google Scholar]
- Silver, D.; Sutton, R.S.; Müller, M. Reinforcement Learning of Local Shape in the Game of Go. In Proceedings of the IJCAI, Hyderabad, India, 6–12 January 2007; Volume 7, pp. 1053–1058. [Google Scholar]
- Ye, D.; Chen, G.; Zhang, W.; Chen, S.; Yuan, B.; Liu, B.; Chen, J.; Liu, Z.; Qiu, F.; Yu, H.; et al. Towards playing full moba games with deep reinforcement learning. Adv. Neural Inf. Process. Syst. 2020, 33, 621–632. [Google Scholar]
- Sallab, A.E.; Abdou, M.; Perot, E.; Yogamani, S. Deep reinforcement learning framework for autonomous driving. Electron. Imaging 2017, 2017, 70–76. [Google Scholar] [CrossRef]
- Kiran, B.R.; Sobh, I.; Talpaert, V.; Mannion, P.; Al Sallab, A.A.; Yogamani, S.; Pérez, P. Deep reinforcement learning for autonomous driving: A survey. IEEE Trans. Intell. Transp. Syst. 2021, 23, 4909–4926. [Google Scholar] [CrossRef]
- Osiński, B.; Jakubowski, A.; Zięcina, P.; Miłoś, P.; Galias, C.; Homoceanu, S.; Michalewski, H. Simulation-based reinforcement learning for real-world autonomous driving. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Virtual, 31 May–4 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 6411–6418. [Google Scholar]
- Zhu, M.; Wang, Y.; Pu, Z.; Hu, J.; Wang, X.; Ke, R. Safe, efficient, and comfortable velocity control based on reinforcement learning for autonomous driving. Transp. Res. Part C Emerg. Technol. 2020, 117, 102662. [Google Scholar] [CrossRef]
- Dulac-Arnold, G.; Levine, N.; Mankowitz, D.J.; Li, J.; Paduraru, C.; Gowal, S.; Hester, T. Challenges of real-world reinforcement learning: Definitions, benchmarks and analysis. Mach. Learn. 2021, 110, 2419–2468. [Google Scholar] [CrossRef]
- Kormushev, P.; Calinon, S.; Caldwell, D.G. Reinforcement learning in robotics: Applications and real-world challenges. Robotics 2013, 2, 122–148. [Google Scholar] [CrossRef]
- Argall, B.D.; Chernova, S.; Veloso, M.; Browning, B. A survey of robot learning from demonstration. Robot. Auton. Syst. 2009, 57, 469–483. [Google Scholar] [CrossRef]
- Hussein, A.; Gaber, M.M.; Elyan, E.; Jayne, C. Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 2017, 50, 1–35. [Google Scholar] [CrossRef]
- Jang, E.; Irpan, A.; Khansari, M.; Kappler, D.; Ebert, F.; Lynch, C.; Levine, S.; Finn, C. BC-z: Zero-shot task generalization with robotic imitation learning. In Proceedings of the Conference on Robot Learning, London, UK, 8–11 November 2021; pp. 991–1002. [Google Scholar]
- Zhu, Y.; Wang, Z.; Merel, J.; Rusu, A.; Erez, T.; Cabi, S.; Tunyasuvunakool, S.; Kramár, J.; Hadsell, R.; de Freitas, N.; et al. Reinforcement and imitation learning for diverse visuomotor skills. arXiv 2018, arXiv:1802.09564. [Google Scholar]
- Ratliff, N.; Bagnell, J.A.; Srinivasa, S.S. Imitation learning for locomotion and manipulation. In Proceedings of the 2007 7th IEEE-RAS International Conference on Humanoid Robots, Pittsburgh, PA, USA, 29 November–1 December 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 392–397. [Google Scholar]
- Chen, J.; Yuan, B.; Tomizuka, M. Deep imitation learning for autonomous driving in generic urban scenarios with enhanced safety. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 2884–2890. [Google Scholar]
- Codevilla, F.; Müller, M.; López, A.; Koltun, V.; Dosovitskiy, A. End-to-end driving via conditional imitation learning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 4693–4700. [Google Scholar]
- Hawke, J.; Shen, R.; Gurau, C.; Sharma, S.; Reda, D.; Nikolov, N.; Mazur, P.; Micklethwaite, S.; Griffiths, N.; Shah, A.; et al. Urban driving with conditional imitation learning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Virtual, 31 May–4 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 251–257. [Google Scholar]
- Kebria, P.M.; Alizadehsani, R.; Salaken, S.M.; Hossain, I.; Khosravi, A.; Kabir, D.; Koohestani, A.; Asadi, H.; Nahavandi, S.; Tunsel, E.; et al. Evaluating architecture impacts on deep imitation learning performance for autonomous driving. In Proceedings of the 2019 IEEE International Conference on Industrial Technology (ICIT), Melbourne, Australia, 13–15 February 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 865–870. [Google Scholar]
- Hua, J.; Zeng, L.; Li, G.; Ju, Z. Learning for a robot: Deep reinforcement learning, imitation learning, transfer learning. Sensors 2021, 21, 1278. [Google Scholar] [CrossRef] [PubMed]
- Zhao, W.; Queralta, J.P.; Westerlund, T. Sim-to-real transfer in deep reinforcement learning for robotics: A survey. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 737–744. [Google Scholar]
- Liu, Y.; Li, Z.; Liu, H.; Kan, Z. Skill transfer learning for autonomous robots and human–robot cooperation: A survey. Robot. Auton. Syst. 2020, 128, 103515. [Google Scholar] [CrossRef]
- Vithayathil Varghese, N.; Mahmoud, Q.H. A survey of multi-task deep reinforcement learning. Electronics 2020, 9, 1363. [Google Scholar] [CrossRef]
- Serra, J.; Suris, D.; Miron, M.; Karatzoglou, A. Overcoming catastrophic forgetting with hard attention to the task. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 4548–4557. [Google Scholar]
- Ebbinghaus, H. Memory: A contribution to experimental psychology. Ann. Neurosci. 2013, 20, 155. [Google Scholar] [CrossRef]
- Zhan, L.; Guo, D.; Chen, G.; Yang, J. Effects of Repetition Learning on Associative Recognition Over Time: Role of the Hippocampus and Prefrontal Cortex. Front. Hum. Neurosci. 2018, 12. [Google Scholar] [CrossRef]
- Uchihara, T.; Webb, S.; Yanagisawa, A. The effects of repetition on incidental vocabulary learning: A meta-analysis of correlational studies. Lang. Learn. 2019, 69, 559–599. [Google Scholar] [CrossRef]
- Raghu, M.; Zhang, C.; Kleinberg, J.; Bengio, S. Transfusion: Understanding transfer learning for medical imaging. Adv. Neural Inf. Process. Syst. 2019, 32, 3347–3357. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Pathak, Y.; Shukla, P.K.; Tiwari, A.; Stalin, S.; Singh, S. Deep transfer learning based classification model for COVID-19 disease. IRBM 2020, 43, 87–92. [Google Scholar] [CrossRef]
- Aslan, M.F.; Unlersen, M.F.; Sabanci, K.; Durdu, A. CNN-based transfer learning–BiLSTM network: A novel approach for COVID-19 infection detection. Appl. Soft Comput. 2021, 98, 106912. [Google Scholar] [CrossRef] [PubMed]
- Humayun, M.; Sujatha, R.; Almuayqil, S.N.; Jhanjhi, N. A Transfer Learning Approach with a Convolutional Neural Network for the Classification of Lung Carcinoma. Healthcare 2022, 10, 1058. [Google Scholar] [CrossRef] [PubMed]
- Salza, P.; Schwizer, C.; Gu, J.; Gall, H.C. On the effectiveness of transfer learning for code search. IEEE Trans. Softw. Eng. 2022, 1–18. [Google Scholar] [CrossRef]
- Sharma, M.; Nath, K.; Sharma, R.K.; Kumar, C.J.; Chaudhary, A. Ensemble averaging of transfer learning models for identification of nutritional deficiency in rice plant. Electronics 2022, 11, 148. [Google Scholar] [CrossRef]
- Campos, V.; Sprechmann, P.; Hansen, S.S.; Barreto, A.; Kapturowski, S.; Vitvitskyi, A.; Badia, A.P.; Blundell, C. Beyond Fine-Tuning: Transferring Behavior in Reinforcement Learning. In Proceedings of the ICML 2021 Workshop on Unsupervised Reinforcement Learning, Virtual, 23 July 2021. [Google Scholar]
- Nagabandi, A.; Kahn, G.; Fearing, R.S.; Levine, S. Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 7559–7566. [Google Scholar]
- Julian, R.; Swanson, B.; Sukhatme, G.; Levine, S.; Finn, C.; Hausman, K. Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning. In Proceedings of the 2020 Conference on Robot Learning, Virtual, 16–18 November 2020; Kober, J., Ramos, F., Tomlin, C., Eds.; PMLR: Maastricht, The Netherlands, 2021; Volume 155, pp. 2120–2136. Available online: https://proceedings.mlr.press/v155/ (accessed on 7 July 2022).
- Mannion, P.; Devlin, S.; Duggan, J.; Howley, E. Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning. Knowl. Eng. Rev. 2018, 33, e23. [Google Scholar] [CrossRef]
- Brys, T.; Harutyunyan, A.; Taylor, M.E.; Nowé, A. Policy Transfer Using Reward Shaping. In Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, AAMAS ’15, Istanbul, Turkey, 4–8 May 2015; International Foundation for Autonomous Agents and Multiagent Systems: Richland, SC, USA, 2015; pp. 181–188. [Google Scholar]
- Doncieux, S. Transfer learning for direct policy search: A reward shaping approach. In Proceedings of the 2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL), Osaka, Japan, 18–22 August 2013; pp. 1–6. [Google Scholar] [CrossRef]
- Taylor, M.E.; Stone, P.; Liu, Y. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning. J. Mach. Learn. Res. 2007, 8, 2125–2167. [Google Scholar]
- Gupta, A.; Devin, C.; Liu, Y.; Abbeel, P.; Levine, S. Learning invariant feature spaces to transfer skills with reinforcement learning. arXiv 2017, arXiv:1703.02949. [Google Scholar]
- Ammar, H.B.; Tuyls, K.; Taylor, M.E.; Driessens, K.; Weiss, G. Reinforcement learning transfer via sparse coding. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 4–8 June 2012; Volume 1, pp. 383–390. [Google Scholar]
- Devin, C.; Gupta, A.; Darrell, T.; Abbeel, P.; Levine, S. Learning modular neural network policies for multi-task and multi-robot transfer. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 2169–2176. [Google Scholar]
- Taylor, M.E.; Stone, P. Representation Transfer for Reinforcement Learning. In Proceedings of the AAAI Fall Symposium: Computational Approaches to Representation Change during Learning and Development, Arlington, VA, USA, 9–11 November 2007; pp. 78–85. [Google Scholar]
- Zhang, A.; Satija, H.; Pineau, J. Decoupling dynamics and reward for transfer learning. arXiv 2018, arXiv:1804.10689. [Google Scholar]
- Guo, Z.D.; Pires, B.A.; Piot, B.; Grill, J.B.; Altché, F.; Munos, R.; Azar, M.G. Bootstrap latent-predictive representations for multitask reinforcement learning. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; PMLR: Maastricht, The Netherlands, 2020; pp. 3875–3886. [Google Scholar]
- Rahmatizadeh, R.; Abolghasemi, P.; Bölöni, L.; Levine, S. Vision-based multi-task manipulation for inexpensive robots using end-to-end learning from demonstration. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 3758–3765. [Google Scholar]
- Teh, Y.; Bapst, V.; Czarnecki, W.M.; Quan, J.; Kirkpatrick, J.; Hadsell, R.; Heess, N.; Pascanu, R. Distral: Robust multitask reinforcement learning. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Espeholt, L.; Soyer, H.; Munos, R.; Simonyan, K.; Mnih, V.; Ward, T.; Doron, Y.; Firoiu, V.; Harley, T.; Dunning, I.; et al. Impala: Scalable distributed deep-rl with importance weighted actor-learner architectures. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1407–1416. [Google Scholar]
- Hessel, M.; Soyer, H.; Espeholt, L.; Czarnecki, W.; Schmitt, S.; van Hasselt, H. Multi-task deep reinforcement learning with popart. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 3796–3803. [Google Scholar]
- Ho, J.; Ermon, S. Generative Adversarial Imitation Learning. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Curran Associates, Inc.: Red Hook, NY, USA, 2016; Volume 29. [Google Scholar]
- Tian, Y.; Chen, X.; Ganguli, S. Understanding self-supervised learning dynamics without contrastive pairs. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 10268–10278. [Google Scholar]
- Chen, X.; He, K. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 20–25 June 2021; pp. 15750–15758. [Google Scholar]
- Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. OpenAI Gym. arXiv 2016, arXiv:1606.01540. [Google Scholar]
- Barto, A.G.; Sutton, R.S.; Anderson, C.W. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 1983, SMC-13, 834–846. [Google Scholar] [CrossRef]
- Yu, T.; Quillen, D.; He, Z.; Julian, R.; Hausman, K.; Finn, C.; Levine, S. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning. In Proceedings of the Conference on Robot Learning, Osaka, Japan, 30 October–1 November 2019; Volume 100, pp. 1094–1100. [Google Scholar]
- Rajeswaran, A.; Kumar, V.; Gupta, A.; Vezzani, G.; Schulman, J.; Todorov, E.; Levine, S. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations. In Proceedings of the Robotics: Science and Systems (RSS), Pittsburgh, PA, USA, 20–26 June 2018. [Google Scholar] [CrossRef]
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347. [Google Scholar]
- Riedmiller, M. Neural fitted Q iteration—First experiences with a data efficient neural reinforcement learning method. In Proceedings of the European Conference on Machine Learning, Porto, Portugal, 3–7 October 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 317–328. [Google Scholar]
- Joshi, G.; Chowdhary, G. Cross-domain transfer in reinforcement learning using target apprentice. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 7525–7532. [Google Scholar]
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar] [CrossRef]
- Weng, J.; Chen, H.; Yan, D.; You, K.; Duburcq, A.; Zhang, M.; Su, H.; Zhu, J. Tianshou: A Highly Modularized Deep Reinforcement Learning Library. arXiv 2021, arXiv:2107.14171. [Google Scholar]
- Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 2021, 22, 1–8. [Google Scholar]
Task | Size of State Space | Size of Action Space | Difficulty Level | Description |
---|---|---|---|---|
Pendulum [60] | 3 (continuous) | 1 (continuous) | Easy | Swinging up a pendulum. |
CartPole [60,61] | 4 (continuous) | 1 (continuous) | Easy | Preventing the pendulum from falling over by applying a force to the cart. |
WindowOpen [62] | 39 (continuous) | 4 (continuous) | Medium | Opening a window. |
WindowClose [62] | 39 (continuous) | 4 (continuous) | Medium | Closing a window. |
Door [63] | 39 (continuous) | 28 (continuous) | Hard | A 24-DoF hand attempts to undo the latch and swing the door open. |
Hammer [63] | 46 (continuous) | 26 (continuous) | Hard | A 24-DoF hand attempts to use a hammer to drive the nail into the board. |
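The tasks listed above come from OpenAI Gym [60,61] (Pendulum, CartPole), Meta-World [62] (WindowOpen, WindowClose), and the Adroit hand suite [63] (Door, Hammer). As an illustration only, the snippet below instantiates the first two families; it assumes the public `gym` and `metaworld` packages with Meta-World's documented ML1 interface, and the environment IDs are assumptions rather than the authors' exact configuration.

```python
# Illustrative environment setup only; package names and IDs are assumptions
# based on the public Gym and Meta-World releases, not the authors' code.
import random
import gym
import metaworld

# Classic-control tasks from OpenAI Gym [60,61].
pendulum = gym.make("Pendulum-v1")
cartpole = gym.make("CartPole-v1")  # note: the standard Gym CartPole has a discrete
                                    # action space; the continuous-action variant
                                    # reported in the table is not reproduced here.

# Meta-World manipulation tasks [62]: ML1 exposes one parameterized task family.
ml1 = metaworld.ML1("window-open-v2")
window_open = ml1.train_classes["window-open-v2"]()
window_open.set_task(random.choice(ml1.train_tasks))  # sample a goal configuration

obs = window_open.reset()
print(obs.shape)  # 39-dimensional observation, matching the table above

# The Adroit Door and Hammer tasks [63] require an additional MuJoCo-based
# package and are omitted from this sketch.
```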
Experiment | Source Task | Target Task | Difficulty Level | Description |
---|---|---|---|---|
Pendulum–CartPole | Pendulum | CartPole | Easy | A simple experiment in which both source and target tasks have small state and action spaces. |
WindowOpen–WindowClose | WindowOpen | WindowClose | Medium | Both source and target tasks have a large state space but small action space. |
Door–Hammer | Door | Hammer | Hard | A challenging experiment in which both source and target tasks have large state and action spaces. |
Metric | Method | Pendulum | WindowOpen | Door |
---|---|---|---|---|
Success rate | Proposed agent | 100% | 94% | 87% |
Success rate | PPO [64] | 100% | 97% | 91% |
Success rate | NFQI [65] | 100% | 76% | 65% |
Average cumulative reward | Proposed agent | −146.51 ± 85.24 | 1586.38 ± 229.00 | 2250.04 ± 1428.60 |
Average cumulative reward | PPO [64] | −134.77 ± 93.59 | 1827.56 ± 410.98 | 2450.42 ± 1303.48 |
Average cumulative reward | NFQI [65] | −189.01 ± 87.09 | 752.00 ± 476.77 | 1252.55 ± 1213.15 |
Metric | Method | CartPole | WindowClose | Hammer |
---|---|---|---|---|
Success rate | Proposed agent + Proposed adaptation algorithm | 100% | 83% | 82% |
Success rate | Proposed agent + Fine-tuning | 77% | 72% | 50% |
Success rate | PPO [64] + Fine-tuning | 87% | 80% | 77% |
Success rate | NFQI + TA-TL [66] | 80% | 63% | 67% |
Average cumulative reward | Proposed agent + Proposed adaptation algorithm | 13,137.42 ± 2709.57 | | |
Average cumulative reward | Proposed agent + Fine-tuning | | | |
Average cumulative reward | PPO [64] + Fine-tuning | | | |
Average cumulative reward | NFQI + TA-TL [66] | | | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cite as: Nguyen Duc, T.; Tran, C.M.; Bach, N.G.; Tan, P.X.; Kamioka, E. Repetition-Based Approach for Task Adaptation in Imitation Learning. Sensors 2022, 22, 6959. https://doi.org/10.3390/s22186959