Proactive Agent Behaviour in Dynamic Distributed Constraint Optimisation Problems
Abstract
1. Introduction
- The dynamic interaction graph method in [21] is extended to a complete graph setting whilst maintaining the multi-agent hierarchy or variable ordering that enables the execution of DCOP algorithms. This resolves the limitation in [21], where constraints or message-passing can only exist in parent–child relationships.
- A temporal experience-sharing algorithm, based on dynamic multi-agent connections, is proposed to enable information propagation across agents in different parts of the environment. This experience-sharing algorithm relies on a distance metric defined in the observation space to limit redundant experiences in an agent’s buffer due to perceptual aliasing.
- The temporal experiences are then used to train an autoregressive model to predict the future states of random variables in a constraint function, given the previous observation. This addresses the transition model requirement in Proactive Dynamic DCOP (PD-DCOP) algorithms. We demonstrate how this model can be used with traditional DCOP algorithms to facilitate proactive agent behaviour in dynamic environments.
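To make the third contribution concrete, the sketch below fits a minimal first-order autoregressive (AR(1)) model that predicts a random variable's next value from its previous observation. The closed-form least-squares fit, the function names, and the temperature trace are illustrative assumptions, not the paper's actual model or data.

```python
# Minimal AR(1) sketch: y_t ≈ a * y_{t-1} + b, fit by ordinary least squares.
# Illustrative only; the paper's model and features are not specified here.

def fit_ar1(series):
    """Fit y_t = a * y_{t-1} + b over a list of observations."""
    xs = series[:-1]  # previous observations y_{t-1}
    ys = series[1:]   # next observations y_t
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

def predict(a, b, prev):
    """Predict the next value of the random variable."""
    return a * prev + b

# Example: a hypothetical, linearly rising temperature trace.
temps = [20.0, 22.0, 24.0, 26.0, 28.0]
a, b = fit_ar1(temps)
next_temp = predict(a, b, temps[-1])  # ≈ 30.0 for this linear trace
```

A predictor of this form can stand in for the transition model that PD-DCOP algorithms otherwise require to be given in advance.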
2. Related Work
3. Background
3.1. RoboCup Rescue Simulation
3.2. Distributed Constraint Optimisation Problem
- A = {a_1, …, a_n} is a finite set of agents,
- X = {x_1, …, x_m} is a finite set of variables,
- D = {D_1, …, D_m} is a set of variable domains such that the domain of x_i is D_i,
- F = {f_1, …, f_k} is a set of constraint functions defined on X, where each f_j is defined over a subset of variables x^j ⊆ X, with f_j : D_{j_1} × … × D_{j_l} → R ∪ {⊥}, and determines the cost of value assignments of the variables in x^j as f_j(x^j), where ⊥ denotes the utility for infeasible configurations. Here, the cardinality of x^j is the arity of f_j. The global cost of the values assigned to the variables in X is F(X) = Σ_j f_j(x^j),
- α : X → A is an “onto” function that assigns the control of a variable x ∈ X to an agent α(x).
- The set of variables X is extended to include a set of random variables Y to model stochastic events in the environment (e.g., device malfunctioning, weather, and temperature).
- The domain set D is likewise extended with the event spaces Ω_y of the random variables y ∈ Y.
- h is the horizon of the environment.
- The variables in the scope of a constraint function may be a mixed set of decision variables and random variables. We denote the set of constraint functions whose scope contains a random variable as F_Y ⊆ F, where each f ∈ F_Y is associated with at most a single random variable.
- T is a set of transition functions for the random variables in Y. A transition function T_y : Ω_y × Ω_y → [0, 1] specifies the probability that a random variable changes value in future time steps.
- c is a switching cost function that assigns a cost to a decision variable for changing values between time steps. This study views this cost term as a unary constraint function. High switching costs discourage agents from frequently switching values. Hence, the agents have to be purposeful with any potential change in decision in time steps t ≥ 2.
- γ ∈ [0, 1] is a discount factor that is used to control the significance of future rewards or costs. We fix the value of γ in this study.
- p_Y is a set of priors over the random variables Y.
- The onto function α assigns only decision variables to agents, i.e., α : X → A.
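To illustrate how the switching cost and the discount factor interact, the sketch below accumulates the discounted cost of a decision variable's value trajectory; the uniform per-switch cost, the trajectory, and the function name are hypothetical.

```python
def discounted_switching_cost(trajectory, c, gamma):
    """Sum gamma**t * c over every time step t at which the decision
    variable changes value relative to the previous step."""
    total = 0.0
    for t in range(1, len(trajectory)):
        if trajectory[t] != trajectory[t - 1]:
            total += (gamma ** t) * c
    return total

# An agent that flips its decision at t = 1 and t = 3 (hypothetical trajectory).
cost = discounted_switching_cost(["b1", "b2", "b2", "b4"], c=10.0, gamma=0.9)
# 0.9 * 10 + 0.9**3 * 10 = 9.0 + 7.29 = 16.29
```

With a high c relative to the constraint costs, frequent value changes dominate the objective, which is what pushes agents to be purposeful about switching.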
4. Proposed Approach
4.1. Dynamic Multi-Agent Connections
4.2. Temporal Experience Sharing
Algorithm 1 Temporal Experience Sharing Algorithm
4.3. Temporal Experience Modelling
Algorithm 2 Temporal Experience Modelling Algorithm
5. Theoretical Properties
- In the worst case, all agents in the MAS are within communication range of one another.
- Each agent communicates its experiences with all other agents in the system.
- The communication overhead per message exchange is known and is constant.
- The number of experiences shared in each message exchange is proportional to the size of the experience buffer.
- Experience Disclosure Phase: In the worst case, each agent sends its experiences to all other agents. Hence, the communication complexity is O(|A|²).
- Experience Sharing with Request Phase: Each agent requests experiences from other agents. The communication complexity is O(|A|²) in this case.
- Experience Sharing Phase: Each agent shares experiences with other agents, resulting in O(|A|²).
- Neighbor Update Phase: Each agent sends the most recent experience to its neighbours, with a worst-case complexity of O(|A|²) since every other agent may be a neighbour.
- We assume a constant communication cost for message exchanges.
- We assume the similarity measure is computed pairwise, requiring each new experience to be compared with a subset of the experiences in the buffer.
- Simple random sampling selects a subset of experiences in the buffer.
- The merging of experiences involves searching for and removing possibly duplicate experiences. This involves iterating through the experiences in the buffer (in the worst case) and comparing them with the new experiences.
- Message-passing has a constant computational complexity, O(1).
- For each new experience, the computational complexity of comparing it with all experiences in the buffer is O(|B|), where |B| is the size of the experience buffer.
- Under simple random sampling, the complexity is linear in the size of the experience buffer, i.e., O(|B|).
- Assuming there are Q new experiences to merge, the complexity is O(Q·|B|).
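The O(Q·|B|) merge bound can be illustrated with a toy merge that counts pairwise comparisons; the equality check below is a stand-in for the actual observation-space similarity measure, and the data are hypothetical.

```python
def merge_experiences(buffer, new_experiences):
    """Merge Q new experiences into a buffer of size |B|, skipping
    duplicates. Comparisons are bounded by Q * |B| because each new
    experience is checked only against the original buffer snapshot."""
    snapshot = list(buffer)  # original contents, size |B|
    comparisons = 0
    for exp in new_experiences:
        duplicate = False
        for existing in snapshot:
            comparisons += 1
            if exp == existing:  # stand-in for the real similarity test
                duplicate = True
                break
        if not duplicate:
            buffer.append(exp)
    return buffer, comparisons

merged, n_cmp = merge_experiences([1, 2, 3], [2, 4])
# merged == [1, 2, 3, 4]; n_cmp == 5, within the Q * |B| = 6 bound
```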
6. Experimental Setup
- Unburnt
- Burning slightly
- Burning more
- Burning severely
- Not burning (water damage)
- Extinguished (minor damage)
- Extinguished (moderate damage)
- Extinguished (severe damage)
- Completely burnt
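When interfacing with the simulator, the nine fieriness levels above can be encoded as an ordered enumeration; the integer codes below simply follow the listed order and are illustrative, not necessarily the RCRS wire values.

```python
from enum import IntEnum

class Fieriness(IntEnum):
    """Building fire states in the listed order (illustrative codes)."""
    UNBURNT = 0
    BURNING_SLIGHTLY = 1
    BURNING_MORE = 2
    BURNING_SEVERELY = 3
    NOT_BURNING_WATER_DAMAGE = 4
    EXTINGUISHED_MINOR_DAMAGE = 5
    EXTINGUISHED_MODERATE_DAMAGE = 6
    EXTINGUISHED_SEVERE_DAMAGE = 7
    COMPLETELY_BURNT = 8
```

An ordered encoding makes thresholds such as "burning severely or lower" a simple integer comparison.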
6.1. Modelling the Search-and-Extinguish Problem as a D-DCOP
- Agents: The fire brigade agents in the environment shown in Figure 3 served as the agents in the DCOP. These agents interact with the environment by receiving local observations and sending commands. Agent-to-agent communication was enabled using RabbitMQ, an implementation of the AMQP messaging protocol.
- Variables: We mapped each agent to a single decision variable that determines the building selected by the agent to either move to or extinguish the building’s fire. The random variable in this setting is the temperature of buildings in the environment.
- Domain: Each agent’s domain was the set of all buildings in the environment whose fieriness was either burning severely or lower.
- Constraint functions: Constraints existed between agents that shared parent–child relationships.
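Under this mapping, each agent's domain is recomputed from its view of the buildings; a minimal sketch of that filter, with assumed data shapes and integer fieriness codes 0–3 covering unburnt up to burning severely, is:

```python
# Hypothetical building records: (building_id, fieriness_code), where codes
# 0-3 cover unburnt up to burning severely, per the ordering listed earlier.
def candidate_domain(buildings, max_fieriness=3):
    """Return the buildings an agent may select: those whose fieriness
    is 'burning severely' (code 3) or lower."""
    return [bid for bid, fieriness in buildings if fieriness <= max_fieriness]

buildings = [(101, 0), (102, 3), (103, 8), (104, 2)]
domain = candidate_domain(buildings)  # → [101, 102, 104]
```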
6.2. Model Training
7. Results and Discussion
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
DCOP | Distributed Constraint Optimisation Problem |
D-DCOP | Dynamic Distributed Constraint Optimisation Problem |
PD-DCOP | Proactive Dynamic Distributed Constraint Optimisation Problem |
MAS | Multi-Agent System |
MDP | Markov Decision Process |
DDFS | Distributed Depth First Search |
Mobed | Multi-agent Organization with Bounded Edit |
HARP | Hybrid Algorithm for Reconstructing Pseudo-trees |
RCRS | RoboCup Rescue Simulation |
COCOA | Cooperative Constraint Approximation |
DPOP | Distributed Pseudo-tree Optimisation Problem |
AMQP | Advanced Message Queuing Protocol |
LA-COCOA | Lookahead COCOA |
LA-DPOP | Lookahead DPOP |
References
- Rust, P.; Picard, G.; Ramparany, F. Using message-passing DCOP algorithms to solve energy-efficient smart environment configuration problems. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), New York, NY, USA, 9–15 July 2016; pp. 468–474.
- Yedidsion, H.; Zivan, R.; Farinelli, A. Applying max-sum to teams of mobile sensing agents. Eng. Appl. Artif. Intell. 2018, 71, 87–99.
- Rybski, P.; Stoeter, S.; Gini, M.; Hougen, D.; Papanikolopoulos, N. Effects of limited bandwidth communications channels on the control of multiple robots. In Proceedings of the 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the Next Millennium (Cat. No.01CH37180), Maui, HI, USA, 29 October–3 November 2001; Volume 1, pp. 369–374.
- Ramchurn, S.D.; Farinelli, A.; MacArthur, K.S.; Jennings, N.R. Decentralized coordination in RoboCup Rescue. Comput. J. 2010, 53, 1447–1461.
- Padhy, P.; Dash, R.K.; Martinez, K.; Jennings, N.R. A Utility-Based Sensing and Communication Model for a Glacial Sensor Network. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, Hakodate, Japan, 8–12 May 2006; Association for Computing Machinery: New York, NY, USA, 2006; pp. 1353–1360.
- Pujol-Gonzalez, M.; Cerquides, J.; Meseguer, P.; Rodríguez-Aguilar, J.A.; Tambe, M. Engineering the Decentralized Coordination of UAVs with Limited Communication Range. In Advances in Artificial Intelligence; Bielza, C., Salmerón, A., Alonso-Betanzos, A., Hidalgo, J.I., Martínez, L., Troncoso, A., Corchado, E., Corchado, J.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 199–208.
- Junges, R.; Bazzan, A.L.C. Evaluating the Performance of DCOP Algorithms in a Real World, Dynamic Problem. In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, Estoril, Portugal, 12–16 May 2008; International Foundation for Autonomous Agents and Multiagent Systems: Richland, SC, USA, 2008; pp. 599–606.
- Lezama, F.; Munoz de Cote, E.; Farinelli, A.; Soares, J.; Pinto, T.; Vale, Z. Distributed constrained optimization towards effective agent-based microgrid energy resource management. In Proceedings of the 19th EPIA Conference on Artificial Intelligence, EPIA 2019, Vila Real, Portugal, 3–6 September 2019; pp. 438–449.
- Picard, G. Trajectory Coordination based on Distributed Constraint Optimization Techniques in Unmanned Air Traffic Management. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Virtual, 9–13 May 2022; Volume 9, pp. 1065–1073.
- Fioretto, F.; Pontelli, E.; Yeoh, W. Distributed constraint optimization problems and applications: A survey. J. Artif. Intell. Res. 2018, 61, 623–698.
- Nair, R.; Varakantham, P.; Tambe, M.; Yokoo, M. Networked Distributed POMDPs: A Synthesis of Distributed Constraint Optimization and POMDPs. In Proceedings of the American Association for Artificial Intelligence, Pittsburgh, PA, USA, 9–13 July 2005; Volume 1, pp. 133–139.
- Zivan, R.; Yedidsion, H.; Okamoto, S.; Glinton, R.; Sycara, K. Distributed constraint optimization for teams of mobile sensing agents. Auton. Agents Multi-Agent Syst. 2015, 29, 495–536.
- Hoang, K.D.; Fioretto, F.; Hou, P.; Yokoo, M.; Yeoh, W.; Zivan, R. Proactive dynamic distributed constraint optimization. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Singapore, 9–13 May 2016; pp. 597–605.
- Holland, A.; O’Sullivan, B. Weighted super solutions for constraint programs. Proc. Natl. Conf. Artif. Intell. 2005, 1, 378–383.
- Jiang, J.; Lu, Z. Learning Attentional Communication for Multi-Agent Cooperation. In Advances in Neural Information Processing Systems; Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2018; Volume 31.
- Barambones, J.; Imbert, R.; Moral, C. Applicability of multi-agent systems and constrained reasoning for sensor-based distributed scenarios: A systematic mapping study on dynamic DCOPs. Sensors 2021, 21, 3807.
- Duff, S.; Harland, J.; Thangarajah, J. On proactivity and maintenance goals. Proc. Int. Conf. Auton. Agents 2006, 2006, 1033–1040.
- Hoang, K.D.; Hou, P.; Fioretto, F.; Yeoh, W.; Zivan, R.; Yokoo, M. Infinite-horizon proactive dynamic DCOPs. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, São Paulo, Brazil, 8–12 May 2017; Volume 1, pp. 212–220.
- Hoang, K.D.; Fioretto, F.; Hou, P.; Yeoh, W.; Yokoo, M.; Zivan, R. Proactive Dynamic Distributed Constraint Optimization Problems. J. Artif. Intell. Res. 2022, 74, 179–225.
- Wallace, R.J.; Freuder, E.C. Stable solutions for dynamic constraint satisfaction problems. In Proceedings of the International Conference on Principles and Practice of Constraint Programming, Pisa, Italy, 26–30 October 1998; Volume 1520, pp. 447–461.
- Agyemang, B.; Ren, F.; Yan, J. Distributed Multi-Agent Hierarchy Construction for Dynamic DCOPs in Mobile Sensor Teams. Hum.-Centric Intell. Syst. 2023, 3, 473–486.
- Nguyen, D.T.; Yeoh, W.; Lau, H.C.; Zilberstein, S.; Zhang, C. Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs. In Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014, Paris, France, 5–9 May 2014; Volume 2, pp. 1341–1342.
- Shokoohi, M.; Afsharchi, M.; Shah-Hoseini, H. Dynamic distributed constraint optimization using multi-agent reinforcement learning. Soft Comput. 2022, 26, 3601–3629.
- Xie, S.; Zhang, H.; Yu, H.; Li, Y.; Zhang, Z.; Luo, X. ET-HF: A novel information sharing model to improve multi-agent cooperation. Knowl.-Based Syst. 2022, 257, 109916.
- Sukhbaatar, S.; Szlam, A.; Fergus, R. Learning multiagent communication with backpropagation. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 2252–2260. Available online: https://dl.acm.org/doi/pdf/10.5555/3157096.3157348 (accessed on 1 May 2024).
- Pesce, E.; Montana, G. Learning multi-agent coordination through connectivity-driven communication. Mach. Learn. 2023, 112, 483–514.
- Hamadi, Y.; Bessiere, C.; Quinqueton, J. Backtracking in Distributed Constraint Networks. In Proceedings of the ECAI 98: 13th European Conference on Artificial Intelligence, Brighton, UK, 23–28 August 1998; pp. 219–223.
- Sultanik, E.A.; Lass, R.N.; Regli, W.C. Dynamic configuration of agent organizations. In Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Pasadena, CA, USA, 11–17 July 2009; pp. 305–311.
- Yeoh, W.; Varakantham, P.; Sun, X.; Koenig, S. Incremental DCOP search algorithms for solving dynamic DCOP problems. In Proceedings of the 2015 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2015, Singapore, 6–9 December 2015; Volume 2, pp. 257–264.
- Skinner, C.; Ramchurn, S. The RoboCup Rescue Simulation Platform. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, Toronto, ON, Canada, 9–14 May 2010; International Foundation for Autonomous Agents and Multiagent Systems: Richland, SC, USA, 2010; Volume 1, pp. 1647–1648.
- Sarker, A.; Choudhury, M.; Khan, M.M. A local search based approach to solve continuous DCOPs. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Virtual, 3–7 May 2021; Volume 2, pp. 1115–1123.
- Van Leeuwen, C.J.; Pawełczak, P. CoCoA: A non-iterative approach to a local search (A)DCOP Solver. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, AAAI 2017, San Francisco, CA, USA, 4–9 February 2017; pp. 3944–3950.
- Petcu, A.; Faltings, B. A Scalable Method for Multiagent Constraint Optimization. In Proceedings of the 19th International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, 30 July–5 August 2005; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2005; pp. 266–271.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Agyemang, B.; Ren, F.; Yan, J. Proactive Agent Behaviour in Dynamic Distributed Constraint Optimisation Problems. Information 2024, 15, 255. https://doi.org/10.3390/info15050255