A Hybrid Approach to Teamwork

Paulo Trigo; Helder Coelho

Paulo Trigo, Instituto Superior de Engenharia de Lisboa
Helder Coelho, Universidade de Lisboa

Abstract

In the aftermath of a large-scale disaster, agents’ decisions derive from self-interested (e.g. survival), common-good (e.g. victims’ rescue) and teamwork (e.g. fire extinction) motivations. However, decision-theoretic models find it difficult to incorporate motivations, and mental-state models find it difficult to deal with uncertainty. We present a hybrid approach, CvI-JI, that combines: i) collective ‘versus’ individual (CvI) decisions, founded on the Markov decision process (MDP) quantitative evaluation of joint actions, and ii) the joint-intentions (JI) formulation of teamwork, founded on the belief-desire-intention (BDI) architecture of general mental-state-based reasoning. Experiments show that the performance of CvI-JI improves during a policy learning process.
