Using Curriculum to Train Multisensory Foraging DRL Agents

  • Romulo F. Férrer Filho (UFC)
  • Alexandre M. M. Santos (UFC)
  • Halisson R. Rodrigues (UFC)
  • Yuri L. B. Nogueira (UFC)
  • Creto A. Vidal (UFC)
  • Joaquim B. Cavalcante-Neto (UFC)
  • Paulo B. S. Serafim (Gran Sasso Science Institute)

Abstract

Deep reinforcement learning has shown great success in developing agents that can solve complex game tasks. However, most game agents rely only on visual sensors to gather information about the environment. More recent work has shown that agents equipped with audio sensors can outperform vision-only agents. In this paper, we propose a curriculum-based training strategy to develop agents that effectively use audio as a source of information in foraging-based scenarios. First, we demonstrate that agents with both vision and hearing perform similarly to agents with only a visual sensor, indicating that the former ignore the audio. Then, we show that, by using a curriculum of gradually increasing difficulty, the agent effectively exploits the available audio information, making it more robust in scenarios where visual information is unavailable. Our results indicate that agents can be trained to effectively use audio as a source of information through a curriculum-based training strategy, improving their ability to handle more tasks than vision-only agents.
Keywords: Deep Reinforcement Learning, Game Agent, Curriculum Learning, Multisensory Agents, Foraging

Published
30/09/2024
FÉRRER FILHO, Romulo F.; SANTOS, Alexandre M. M.; RODRIGUES, Halisson R.; NOGUEIRA, Yuri L. B.; VIDAL, Creto A.; CAVALCANTE-NETO, Joaquim B.; SERAFIM, Paulo B. S. Using Curriculum to Train Multisensory Foraging DRL Agents. In: SIMPÓSIO BRASILEIRO DE JOGOS E ENTRETENIMENTO DIGITAL (SBGAMES), 23., 2024, Manaus/AM. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 456-473. DOI: https://doi.org/10.5753/sbgames.2024.241119.