Reinforcement Learning Techniques Applied to Electronic Games on the Unity General Platform

  • Gabriel Prudencio Haddad UFU
  • Rita Maria Silva Julia UFU
  • Matheus Prado Prandini Faria UFU

Abstract


The aim of this work is to investigate the performance of player agents based on reinforcement learning, more specifically on the Q-Learning and Deep Q-Networks (DQN) algorithms, through the Unity platform. To that end, the authors implement Basic and GridWorld player agents in Unity and train them with these algorithms. To evaluate the agents' performance, a comparative analysis is carried out between them and the best player agents for these games available on the Unity platform, which are based on the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) learning techniques.
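The tabular Q-Learning algorithm mentioned in the abstract can be illustrated with a minimal sketch. This is a generic example, not code from the paper: the corridor environment, state layout, and hyperparameters (`alpha`, `gamma`) are illustrative assumptions, chosen only to show the standard update rule Q(s,a) ← Q(s,a) + α·(r + γ·maxₐ′ Q(s′,a′) − Q(s,a)).

```python
import random

def q_learning_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-Learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# Toy 1-D corridor: states 0..3, actions "left"/"right", reward 1 on reaching state 3.
n_states, actions = 4, ("left", "right")
Q = {s: {a: 0.0 for a in actions} for s in range(n_states)}

random.seed(0)
for _ in range(500):
    s = random.randrange(n_states - 1)           # start anywhere except the goal
    while s != 3:
        a = random.choice(actions)               # fully random exploration, for brevity
        s2 = max(s - 1, 0) if a == "left" else s + 1
        r = 1.0 if s2 == 3 else 0.0
        q_learning_update(Q, s, a, r, s2)
        s = s2

# After training, "right" (toward the goal) should dominate in every non-terminal state.
print(all(Q[s]["right"] > Q[s]["left"] for s in range(3)))
```

Because Q-Learning is off-policy, even this purely random behavior policy converges to the optimal action values; DQN replaces the table `Q` with a neural network that generalizes over large state spaces.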

References

Beattie, C., Leibo, J. Z., Teplyashin, D., Ward, T., Wainwright, M., Küttler, H., Lefrancq, A., Green, S., Valdés, V., Sadik, A., et al. (2016). Deepmind lab. arXiv preprint arXiv:1612.03801.

Bellemare, M. G., Naddaf, Y., Veness, J., and Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253–279.

Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., and Koltun, V. (2017). CARLA: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning, volume 78 of Proceedings of Machine Learning Research, pages 1–16.

Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., et al. (2018). Soft actor-critic algorithms and applications. arXiv preprint arXiv:1812.05905.

Juliani, A., Berges, V.-P., Vckay, E., Gao, Y., Henry, H., Mattar, M., and Lange, D. (2018). Unity: A general platform for intelligent agents. arXiv preprint arXiv:1809.02627.

Karttunen, J., Kanervisto, A., Kyrki, V., and Hautamäki, V. (2020). From video game to real robot: The transfer between action spaces. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3567–3571.

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.

Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2016). Continuous control with deep reinforcement learning. In International Conference on Learning Representations (ICLR) 2016.

Min, K., Kim, H., and Huh, K. (2019). Deep distributional reinforcement learning based high-level driving policy determination. IEEE Transactions on Intelligent Vehicles, 4(3):416–424.

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518:529–533.

Mousavi, S. S., Schukat, M., and Howley, E. (2018). Deep reinforcement learning: An overview. In Bi, Y., Kapoor, S., and Bhatia, R., editors, Proceedings of SAI Intelligent Systems Conference (IntelliSys), pages 426–440.

Russell, S. and Norvig, P. (2013). Artificial Intelligence, 3rd ed. Elsevier.

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.

Sewak, M. (2019). Deep reinforcement learning. Springer.

Internet World Stats (2021). Internet growth statistics.

Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine learning, 3(1):9–44.

Watkins, C. J. C. H. (1989). Learning from delayed rewards. PhD thesis, University of Cambridge.

Witkowski, W. (2020). Videogames are a bigger industry than movies and North American sports combined, thanks to the pandemic.
Published
2021-11-29

HADDAD, Gabriel Prudencio; JULIA, Rita Maria Silva; FARIA, Matheus Prado Prandini. Técnicas de Aprendizado por Reforço Aplicadas em Jogos Eletrônicos na Plataforma Geral do Unity. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 18., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 571-582. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2021.18285.
