Facial expression generation through speech and emotion
Abstract
Introduction: This article presents a method for converting audio files into facial expressions. Objective: To convert emotional speech recordings into matching facial expressions. Methodology: The conversion pipeline combines interpolation, C#, Unity, Google Colab, and Adobe Firefly. Results: The audio files were successfully converted into corresponding facial expressions.
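The abstract names interpolation as one ingredient but does not say what is interpolated or how. As a minimal sketch only, and not the authors' implementation (which the abstract places in C# and Unity), the following Python snippet shows one common reading: linearly blending a set of facial control weights from one expression pose to another. The control names, the weight values, and the helper `blend_expression` are all hypothetical and chosen purely for illustration.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def blend_expression(start: dict, end: dict, t: float) -> dict:
    """Interpolate every facial control of `start` toward its value in `end`."""
    t = max(0.0, min(1.0, t))  # clamp so the pose never overshoots either end
    return {name: lerp(start[name], end[name], t) for name in start}


# Hypothetical control names and weights (assumptions, not from the article).
NEUTRAL = {"mouth_corner_up": 0.0, "brow_raise": 0.0, "jaw_open": 0.0}
HAPPY = {"mouth_corner_up": 1.0, "brow_raise": 0.5, "jaw_open": 0.25}

# Pose halfway through a neutral-to-happy transition.
halfway = blend_expression(NEUTRAL, HAPPY, 0.5)
```

Stepping `t` from 0 to 1 over successive frames would yield a smooth transition between the two poses. In a Unity project the same per-value blend is available as `Mathf.Lerp`, though whether the authors rely on it is not stated.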
Keywords:
AI, audio-to-image, Unity, Face, Emotion
Published
2025-09-30
How to Cite
MELO, Arthur William Dórea; FIGUEIREDO, Caio Vasconcelos Araújo; ALMEIDA, Pedro Moreira Guerra de; FERREIRA, Renan Silva; MORAES, Guilherme Vieira; ARAUJO, Victor Flávio de Andrade. Facial expression generation through speech and emotion. In: WORKSHOP MAGICA: GAMES IN SCHOOL AND UNDERGRADUATE COURSES - BRAZILIAN SYMPOSIUM ON COMPUTER GAMES AND DIGITAL ENTERTAINMENT (SBGAMES), 14., 2025, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 432-437. DOI: https://doi.org/10.5753/sbgames_estendido.2025.14708.
