I Choose You, Reinforcement Learning! Trained RL Agents For Pokémon Battles

  • Leonardo de Lellis Rossi, FEEC / Unicamp / H.IAAC
  • Bruno Souza, Unicamp / Recod.ai
  • Maurício Pereira Lopes, Unicamp
  • Ricardo Ribeiro Gudwin, H.IAAC / FEEC / Unicamp
  • Esther Luna Colombini, Unicamp / H.IAAC

Abstract


Pokémon battles offer a valuable training environment for Reinforcement Learning (RL) agents because of their inherently stochastic nature and their adaptability to deterministic settings. However, the domain currently lacks a comprehensive benchmark of basic RL agent implementations suitable for training purposes. This project fills this gap with an open-source benchmark of agents trained with classic RL methods and Deep Reinforcement Learning (DRL) techniques, fostering development in the field and easing the entry of new researchers. We also propose a Markov Decision Process (MDP) environment in which the agents are trained and validated. The agents demonstrated effective learning and achieved robust performance during training.
Keywords: Reinforcement Learning, Deep Learning, Tabular Methods, Pokémon, Benchmark
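
The paper's benchmark and MDP formulation are not reproduced here. As an illustration of the classic, tabular end of the spectrum the abstract mentions, the sketch below runs Q-learning on a toy turn-based battle MDP; the environment, its state (both sides' HP), the two moves, damage values, and rewards are hypothetical stand-ins, not the authors' formulation.

# Minimal sketch (not the authors' benchmark code): tabular Q-learning on a
# hypothetical turn-based battle MDP. States, actions, damage values, and
# rewards are illustrative stand-ins for the formulation described above.
import random
from collections import defaultdict

class ToyBattleEnv:
    """Hypothetical 1-vs-1 battle: state is (own HP, foe HP); two moves."""
    def __init__(self, max_hp=10):
        self.max_hp = max_hp

    def reset(self):
        self.own_hp, self.foe_hp = self.max_hp, self.max_hp
        return (self.own_hp, self.foe_hp)

    def step(self, action):
        # Action 0: weak but reliable move; action 1: strong but stochastic move.
        dmg = 2 if action == 0 else (5 if random.random() < 0.6 else 0)
        self.foe_hp = max(0, self.foe_hp - dmg)
        if self.foe_hp == 0:
            return (self.own_hp, self.foe_hp), 1.0, True  # win
        self.own_hp = max(0, self.own_hp - random.choice([1, 3]))  # foe acts
        done = self.own_hp == 0
        return (self.own_hp, self.foe_hp), (-1.0 if done else 0.0), done

def q_learning(env, episodes=5000, alpha=0.1, gamma=0.99, epsilon=0.1):
    # Q-table mapping each state to the value of the two actions.
    q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                action = random.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state, reward, done = env.step(action)
            # One-step temporal-difference update.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    q = q_learning(ToyBattleEnv())
    print("Greedy move at full HP:", max((0, 1), key=lambda a: q[(10, 10)][a]))

A DRL counterpart along the lines the abstract describes (e.g., a DQN as in Mnih et al., 2013) would replace the table with a network over a featurized battle state; the benchmark covers both ends of that spectrum.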

References

Arulkumaran, K., Deisenroth, M. P., Brundage, M., and Bharath, A. A. (2017). Deep Reinforcement Learning: A brief survey. IEEE Signal Processing Magazine, 34(6):26–38.

Huang, D. and Lee, S. (2019). A self-play policy optimization approach to battling Pokémon. In Proceedings of the 2019 IEEE Conference on Games (CoG), pages 1–4.

Kalose, A., Kaya, K., and Kim, A. (2018). Optimal Battle Strategy in Pokémon using Reinforcement Learning. Stanford University.

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. DeepMind Technologies, arXiv preprint arXiv:1312.5602.

Nintendo. (2024). Pokémon official site. [link]

Osband, I., Blundell, C., Pritzel, A., and Van Roy, B. (2016). Deep exploration via bootstrapped DQN. In Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc.

Rill-García, R. (2018). Reinforcement Learning for a Turn-Based Small Scale Attrition Game.

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization algorithms. arXiv preprint arXiv:1707.06347.

Simões, D., Reis, S., Lau, N., and Reis, L. P. (2020). Competitive Deep Reinforcement Learning over a Pokémon battling simulator. In 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), pages 40–45. IEEE.

Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An introduction. MIT Press, Cambridge, MA.

Watkins, C. (1989). Learning From Delayed Rewards. Ph.D. thesis, King's College, University of Cambridge.

Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist Reinforcement Learning. Machine Learning, 8:229–256.

Zhang, J., Kim, J., O’Donoghue, B., and Boyd, S. (2021). Sample efficient Reinforcement Learning with REINFORCE. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 10887–10895.
Published
30/09/2024
ROSSI, Leonardo de Lellis; SOUZA, Bruno; LOPES, Maurício Pereira; GUDWIN, Ricardo Ribeiro; COLOMBINI, Esther Luna. I Choose You, Reinforcement Learning! Trained RL Agents For Pokémon Battles. In: TRILHA DE COMPUTAÇÃO – ARTIGOS CURTOS - SIMPÓSIO BRASILEIRO DE JOGOS E ENTRETENIMENTO DIGITAL (SBGAMES), 2024. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 13-18. DOI: https://doi.org/10.5753/sbgames_estendido.2024.241242.