DOI: 10.1145/3631085.3631337

Scale-Invariant Reinforcement Learning in Real-Time Strategy Games

Published: 19 January 2024

ABSTRACT

Real-time strategy (RTS) games present a significant challenge for artificial game-playing agents because they combine several fundamental AI problems. Despite these difficulties, attempts to create autonomous agents using Deep Reinforcement Learning have been successful, with bots like AlphaStar beating even expert human players. Many RTS games include several distinct world maps with different dimensions, which may affect the agent’s observation and the representation of game states. However, most current architectures are restricted to fixed input sizes or require extensive and complex training. In this paper, we overcome these limitations by combining Grid-Wise Control with Spatial Pyramid Pooling (SPP). Specifically, we employ the encoder-decoder framework provided by the GridNet architecture and enhance the critic component of Proximal Policy Optimization (PPO) with an SPP layer. The new layer generates a standardized representation of any game state regardless of the initial observation dimensions, allowing the agent to act on any map. Our evaluation demonstrates that the proposed method improves the model’s flexibility and provides a more effective and efficient solution for training autonomous agents in multiple RTS game scenarios.
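To make the core idea concrete, the following PyTorch sketch shows how an SPP layer yields a fixed-length state representation from maps of any size, and how such a layer could sit in a PPO value head. This is an illustrative sketch, not the authors' implementation: the channel counts, pyramid levels (1, 2, 4), and the 27 input planes (the per-cell observation encoding used by Gym-μRTS [11]) are assumptions made for the example.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SpatialPyramidPooling(nn.Module):
        # Pools a variable-size feature map into a fixed-length vector by
        # applying adaptive max-pooling at several pyramid levels (He et al. [9]).
        def __init__(self, levels=(1, 2, 4)):
            super().__init__()
            self.levels = levels

        def forward(self, x):
            # x: (batch, channels, height, width) -- any height/width.
            features = []
            for level in self.levels:
                pooled = F.adaptive_max_pool2d(x, output_size=(level, level))
                features.append(pooled.flatten(start_dim=1))
            # Output length is channels * sum(l*l for l in levels),
            # independent of the input height and width.
            return torch.cat(features, dim=1)

    class Critic(nn.Module):
        # Hypothetical PPO value head: a small CNN encoder followed by SPP,
        # so one set of weights accepts observations from differently sized maps.
        def __init__(self, in_channels=27, hidden=64, levels=(1, 2, 4)):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
            )
            self.spp = SpatialPyramidPooling(levels)
            self.value = nn.Linear(hidden * sum(l * l for l in levels), 1)

        def forward(self, obs):
            return self.value(self.spp(self.encoder(obs)))

    # The same critic evaluates states from maps of different dimensions:
    critic = Critic()
    v_small = critic(torch.randn(1, 27, 8, 8))    # 8x8 map  -> shape (1, 1)
    v_large = critic(torch.randn(1, 27, 16, 16))  # 16x16 map -> shape (1, 1)

Because each pyramid level pools to a fixed output grid, the concatenated vector has the same length for every map size, which is what allows a single value head to be trained and evaluated across maps of different dimensions.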

References

  1. Per-Arne Andersen, Morten Goodwin, and Ole-Christoffer Granmo. 2018. Deep RTS: A game environment for deep reinforcement learning in real-time strategy games. In 2018 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 1–8.
  2. Marc G. Bellemare, Yavar Naddaf, Joel Veness, and Michael Bowling. 2013. The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research 47 (2013), 253–279.
  3. Yoshua Bengio, Jérôme Louradour, Ronan Collobert, and Jason Weston. 2009. Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning. 41–48.
  4. Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, et al. 2019. Dota 2 with large scale deep reinforcement learning. arXiv preprint arXiv:1912.06680 (2019).
  5. Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. arXiv preprint arXiv:1606.01540 (2016).
  6. Shaked Brody, Uri Alon, and Eran Yahav. 2021. How attentive are graph attention networks? arXiv preprint arXiv:2105.14491 (2021).
  7. Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 580–587.
  8. Lei Han, Peng Sun, Yali Du, Jiechao Xiong, Qing Wang, Xinghai Sun, Han Liu, and Tong Zhang. 2019. Grid-wise control for multi-agent reinforcement learning in video game AI. In International Conference on Machine Learning. PMLR, 2576–2585.
  9. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 9 (2015), 1904–1916.
  10. Shengyi Huang and Santiago Ontañón. 2022. A closer look at invalid action masking in policy gradient algorithms. In Proceedings of the Thirty-Fifth International Florida Artificial Intelligence Research Society Conference (FLAIRS 2022). https://doi.org/10.32473/flairs.v35i.130584
  11. Shengyi Huang, Santiago Ontañón, Chris Bamford, and Lukasz Grela. 2021. Gym-μRTS: Toward affordable full game real-time strategy games research with deep reinforcement learning. In 2021 IEEE Conference on Games (CoG). IEEE, 1–8. https://doi.org/10.1109/CoG52621.2021.9619076
  12. Vince Jankovics, Michael Garcia Ortiz, and Eduardo Alonso. 2022. Efficient entity-based reinforcement learning. arXiv preprint arXiv:2206.02855 (2022).
  13. Muhammad Junaid Khan, Syed Hammad Ahmed, and Gita Sukthankar. 2022. Transformer-based value function decomposition for cooperative multi-agent reinforcement learning in StarCraft. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Vol. 18. 113–119.
  14. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2017. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 6 (2017), 84–90.
  15. Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, and Jiawei Han. 2020. Understanding the difficulty of training transformers. arXiv preprint arXiv:2004.08249 (2020).
  16. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).
  17. Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. 2013. On the difficulty of training recurrent neural networks. In International Conference on Machine Learning. PMLR, 1310–1318.
  18. Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, and Shimon Whiteson. 2019. The StarCraft multi-agent challenge. arXiv preprint arXiv:1902.04043 (2019).
  19. John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017). http://arxiv.org/abs/1707.06347
  20. David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484–489.
  21. Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, and C. Lawrence Zitnick. 2017. ELF: An extensive, lightweight and flexible research platform for real-time strategy games. Advances in Neural Information Processing Systems 30 (2017).
  22. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
  23. Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, Richard Powell, Timo Ewalds, Petko Georgiev, et al. 2019. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 7782 (2019), 350–354.
  24. Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, et al. 2017. StarCraft II: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782 (2017).
  25. Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, et al. 2021. SCC: An efficient deep reinforcement learning agent mastering the game of StarCraft II. In International Conference on Machine Learning. PMLR, 10905–10915.
  26. Won Joon Yun, Sungwon Yi, and Joongheon Kim. 2021. Multi-agent deep reinforcement learning using attentive graph neural architectures for real-time strategy games. In 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2967–2972.

Published in

SBGames '23: Proceedings of the 22nd Brazilian Symposium on Games and Digital Entertainment, November 2023, 176 pages.

Copyright © 2023 ACM.

Publisher

Association for Computing Machinery, New York, NY, United States
