Locally Centralized Execution for Less Redundant Computation in Multi-Agent Cooperation
Abstract
1. Introduction
2. Related Work
2.1. Centralized Training and Decentralized Execution
2.2. Multi-Agent Communication
3. Background and Problem
3.1. Cooperative Decentralized Partially Observable Markov Decision Process with Communication
3.2. Level-Based Foraging (LBF) Environment
3.3. Problem Description
4. Proposed Method
4.1. Redundant Observation Ratio
4.2. Locally Centralized Execution (LCE)
Algorithm 1 Locally Centralized Execution
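The pseudocode of Algorithm 1 is not reproduced in this extraction, so the listing below is only a minimal conceptual sketch of what a locally centralized execution step could look like, based on the idea suggested by the title and section headings: a few leader agents run the policy once for all teammates within their observation range, and the covered workers simply execute the actions they receive, so overlapping observations are not processed redundantly by every agent. All names here (Agent, lce_step, team_policy, solo_policy, observe) are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical sketch of one locally centralized execution (LCE) step.
# Assumption: a few leader agents run the (costly) policy network once for
# their whole local team, so overlapping observations are processed by the
# leader only, instead of redundantly by every covered agent.

from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: int
    position: tuple
    is_leader: bool = False
    inbox: dict = field(default_factory=dict)  # actions received from leaders


def in_range(a, b, radius=2):
    """Hypothetical visibility test: Chebyshev distance on a grid."""
    return max(abs(a.position[0] - b.position[0]),
               abs(a.position[1] - b.position[1])) <= radius


def lce_step(agents, team_policy, solo_policy, observe):
    """One execution step in which leaders decide for nearby workers.

    team_policy(observation, member_ids) -> {agent_id: action}
    solo_policy(observation) -> action
    """
    actions = {}
    for leader in (a for a in agents if a.is_leader):
        team = [a for a in agents if in_range(leader, a)]
        # One forward pass covers the whole local team: this single call
        # replaces the per-agent passes that LCE is meant to save.
        team_actions = team_policy(observe(leader), [a.agent_id for a in team])
        for member in team:
            member.inbox[leader.agent_id] = team_actions[member.agent_id]
    for agent in agents:
        if agent.inbox:
            # Covered agents follow a received instruction (arbitrarily the
            # first one if several leaders overlap here).
            actions[agent.agent_id] = next(iter(agent.inbox.values()))
            agent.inbox.clear()
        else:
            # Uncovered agents fall back to acting on their own observation.
            actions[agent.agent_id] = solo_policy(observe(agent))
    return actions
```

An agent covered by several leaders here simply follows an arbitrary one; how the actual method resolves such overlaps is exactly the kind of detail this sketch does not claim to capture.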
Algorithm 2 Leadership Shift (LS)
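Likewise, only the caption of Algorithm 2 survives. As a purely hypothetical illustration of a leadership shift, the sketch below periodically re-elects leaders by handing the role to the agents whose observation range covers the most teammates; this coverage-count criterion is an assumption made for illustration, not the paper's actual selection rule.

```python
# Hypothetical sketch of a leadership shift (LS): periodically hand the
# leader role to the agents whose observation range covers the most
# teammates. The coverage-count criterion is an illustrative assumption.

def leadership_shift(positions, num_leaders=1, radius=2):
    """Return the ids of the agents elected as leaders.

    positions: dict mapping agent id -> (x, y) grid coordinate.
    """
    def coverage(aid):
        ax, ay = positions[aid]
        return sum(1 for bid, (bx, by) in positions.items()
                   if bid != aid and max(abs(ax - bx), abs(ay - by)) <= radius)

    ranked = sorted(positions, key=coverage, reverse=True)
    return set(ranked[:num_leaders])


if __name__ == "__main__":
    # The centrally placed agent 1 covers agents 0, 2, and 3, so it wins.
    positions = {0: (0, 0), 1: (1, 1), 2: (2, 2), 3: (3, 3)}
    print(leadership_shift(positions, num_leaders=1))  # -> {1}
```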
4.3. Team Transformer (T-Trans)
4.4. Leadership Shift (LS)
5. Experiments
5.1. Comparison to Baselines
5.2. Ablation Study and the Effect of the Number of Leaders
5.3. Experimental Results on the Difficult LBF Setting
5.4. Experimental Results on Cooperative Navigation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Experimental Results in a Large Environment
References
[Table: comparison of MAIC with LCTT using one to four leaders (LCTT-1L to LCTT-4L); table body not preserved in extraction]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cite as: Bai, Y.; Sugawara, T. Locally Centralized Execution for Less Redundant Computation in Multi-Agent Cooperation. Information 2024, 15, 279. https://doi.org/10.3390/info15050279