Speech Emotion Recognition Using a Convolutional Neural Network
Abstract
The rapid development of Artificial Intelligence has improved neural-network-based speech emotion recognition systems. This paper aims to develop a model, trained with a convolutional neural network, that recognizes emotions in Brazilian Portuguese (pt-BR) audio. The model was validated through tests on a dataset built from audio recordings that cover the linguistic varieties of Brazil. The tests showed unsatisfactory performance for a classifier; however, they made it possible to identify that the linguistic variation within a language can affect a model's overall performance.
References
Abdelhamid, A. A., El-Kenawy, E.-S. M., Alotaibi, B., Amer, G. M., Abdelkader, M. Y., Ibrahim, A., and Eid, M. M. (2022). Robust speech emotion recognition using CNN+LSTM based on stochastic fractal search optimization algorithm. IEEE Access, 10:49265–49284.
Abdulraheem, A., Salih, A., Abdulla, A., M.Sadeeq, M., O. M.Salim, N., Abdullah, H., Khalifa, F., and Abdullah, R. (2020). Home automation system based on IoT. Proceedings of the Technology Reports of Kansai University, 62:2453.
Atmaja, B. T., Sasou, A., and Akagi, M. (2022). Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion. Speech Communication, 140:11–28.
Bastos Germano, R. G., Pompeu Tcheou, M., da Rocha Henriques, F., and Pinto Gomes Junior, S. (2021). emoUERJ: an emotional speech database in Portuguese.
Jaihar, J., Lingayat, N., Vijaybhai, P. S., Venkatesh, G., and Upla, K. P. (2020). Smart home automation using machine learning algorithms. In Proceedings of the International Conference for Emerging Technology (INCET), pages 1–4.
Lech, M., Stolar, M., Best, C., and Robert, B. (2020). Real-time speech emotion recognition using a pre-trained image classification network: Effects of bandwidth reduction and companding. Proceedings of the Frontiers in Computer Science.
Lopez-Martin, M., Nevado, A., and Carro, B. (2020). Detection of early stages of Alzheimer’s disease based on MEG activity with a randomized convolutional neural network. Artificial Intelligence in Medicine, 107:101924.
Lu, X. (2022). Deep learning based emotion recognition and visualization of figural representation. Proceedings of Frontiers in Computer Science.
Mustaqeem and Kwon, S. (2020). A CNN-assisted enhanced audio signal processing for speech emotion recognition. Sensors, 20(1).
Rumagit, R. Y., Alexander, G., and Saputra, I. F. (2021). Model comparison in speech emotion recognition for Indonesian language. Procedia Computer Science, 179:789–797.
Singh, V. and Prasad, S. (2023). Speech emotion recognition system using gender dependent convolution neural network. Procedia Computer Science, 218:2533–2540. International Conference on Machine Learning and Data Engineering.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958.
Sun, T.-W. (2020). End-to-end speech emotion recognition with gender information. IEEE Access, 8:152423–152438.
Published
August 6, 2023
How to Cite
PEIXOTO, Guilherme de S.; LINHARES, José E. B. de S. Reconhecimento de Emoções através da Fala utilizando Rede Neural Convolucional. In: SEMINÁRIO INTEGRADO DE SOFTWARE E HARDWARE (SEMISH), 50., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 119-130. ISSN 2595-6205. DOI: https://doi.org/10.5753/semish.2023.229968.