An Autonomous Emotional Virtual Character: An Approach with Deep and Goal-Parameterized Reinforcement Learning

Gilzamir Ferreira Gomes; Creto Augusto Vidal; Joaquim Bento Cavalcante Neto; Yuri Lenon Barbosa Nogueira

doi:10.5753/jis.2020.751

Authors

Gilzamir Ferreira Gomes, Universidade Federal do Ceará (UFC)
Creto Augusto Vidal, Universidade Federal do Ceará (UFC)
Joaquim Bento Cavalcante Neto, Universidade Federal do Ceará (UFC)
Yuri Lenon Barbosa Nogueira, Universidade Federal do Ceará (UFC)

Keywords:

Autonomous Virtual Characters, Emotion, Motivation, Deep Reinforcement Learning

Abstract

We have developed an autonomous virtual character guided by emotions. The agent lives in a three-dimensional maze world. We found that emotion drivers can steer the behavior of a trained agent. Our approach is an instance of goal-parameterized reinforcement learning: we establish a conditioning between emotion drivers and a set of goals that determine the behavioral profile of the virtual character. We train agents that randomly assume these goals while trying to maximize a reward function based on intrinsic and extrinsic motivations. We also defined a mapping between motivation and emotion, so that the agent learns a behavior profile as a training goal. The approach was integrated with the Asynchronous Advantage Actor-Critic (A3C) algorithm. Experiments showed that it produces behaviors consistent with the objectives given to the agents and has potential for the development of believable virtual characters.
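
The goal-parameterized setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Goal` class, the two goal names, the weights, and the weighted-sum reward are all assumptions made for illustration. The sketch only shows the general pattern of sampling a goal at random, appending it to the observation so the policy is conditioned on it, and computing a reward that mixes extrinsic and intrinsic terms according to that goal.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """Hypothetical behavioral profile: one weight per reward source."""
    name: str
    extrinsic_weight: float
    intrinsic_weight: float


# Hypothetical goal set; names and weights are illustrative assumptions.
GOALS = (
    Goal("cautious", extrinsic_weight=1.0, intrinsic_weight=0.1),
    Goal("curious", extrinsic_weight=0.3, intrinsic_weight=1.0),
)


def sample_goal(rng):
    """Pick a goal uniformly at random, e.g. at the start of an episode."""
    return rng.choice(GOALS)


def goal_one_hot(goal):
    """Encode the goal as a one-hot vector over GOALS."""
    return [1.0 if g == goal else 0.0 for g in GOALS]


def conditioned_observation(observation, goal):
    """Append the goal encoding so a policy can condition on it."""
    return list(observation) + goal_one_hot(goal)


def goal_reward(goal, extrinsic, intrinsic):
    """Assumed reward form: goal-weighted sum of the two reward sources."""
    return goal.extrinsic_weight * extrinsic + goal.intrinsic_weight * intrinsic


if __name__ == "__main__":
    rng = random.Random(0)
    goal = sample_goal(rng)
    print(goal.name, conditioned_observation([0.5, 0.2], goal))
    print(goal_reward(goal, extrinsic=1.0, intrinsic=2.0))
```

In a full agent, the conditioned observation would feed the actor and critic networks, and `goal_reward` would replace the environment reward during training. Both of these connections are assumptions of this sketch.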
