Using BERTimbau for Emotion Classification in Portuguese
Abstract
In this work we propose fine-tuning the BERTimbau-base and BERTimbau-large models on the task of classifying sentences into 27 emotion types, using the GoEmotions dataset translated into Portuguese with machine translation tools. We compare the results of our experiments with those reported by the authors of the GoEmotions dataset and obtain a performance gain, which we attribute to the balancing algorithm used.
References
Cowen, A. S., Elfenbein, H. A., Laukka, P., and Keltner, D. (2019b). Mapping 24 emotions conveyed by brief human vocalization. American Psychologist, 74(6):698–712.
Cowen, A. S. and Keltner, D. (2017). Self-report captures 27 distinct categories of emotion bridged by continuous gradients. Proceedings of the National Academy of Sciences, 114(38):7900–7909.
Cowen, A. S. and Keltner, D. (2020). What the face displays: Mapping 28 emotions conveyed by naturalistic expression. American Psychologist, 75(3):349.
Cui, Y., Jia, M., Lin, T.-Y., Song, Y., and Belongie, S. (2019). Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9268–9277.
Demszky, D., Movshovitz-Attias, D., Ko, J., Cowen, A., Nemade, G., and Ravi, S. (2020). GoEmotions: A dataset of fine-grained emotions. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4040–4054. ACL.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Dosciatti, M. M., Ferreira, L., and Paraiso, E. (2013). Identificando emoções em textos em português do Brasil usando máquina de vetores de suporte em solução multiclasse. ENIAC-Encontro Nacional de Inteligência Artificial e Computacional. Fortaleza, Brasil.
Duarte, L., Macedo, L., and Oliveira, H. G. (2019). Exploring emojis for emotion recognition in Portuguese text. In Proceedings of the EPIA Conference on Artificial Intelligence, pages 719–730. Springer.
Ekman, P. (1992). An argument for basic emotions. Cognition & Emotion, 6(3-4):169–200.
Ekman, P. (2004). Emotions revealed. BMJ, 328(Suppl S5).
Gillioz, A., Casas, J., Mugellini, E., and Abou Khaled, O. (2020). Overview of the transformer-based models for NLP tasks. In Proceedings of the 15th Conference on Computer Science and Information Systems, pages 179–183. IEEE.
Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1):1–167.
Pereira, D. A. (2021). A survey of sentiment analysis in the Portuguese language. Artificial Intelligence Review, 54(2):1087–1115.
Plutchik, R. (1982). A psychoevolutionary theory of emotions. Social Science Information, 21(4-5):529–553.
Plutchik, R. (2003). Emotions and life: Perspectives from psychology, biology, and evolution. American Psychological Association.
Souza, F., Nogueira, R., and Lotufo, R. (2020). BERTimbau: Pretrained BERT models for Brazilian Portuguese. In Proceedings of the Brazilian Conference on Intelligent Systems, pages 403–417. Springer.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
Wood, I. and Ruder, S. (2016). Emoji as emotion tags for tweets. In Proceedings of the Emotion and Sentiment Analysis Workshop LREC2016, Portoroz, Slovenia, pages 76–79.
Zheng, W. and Jin, M. (2020). The effects of class imbalance and training data size on classifier learning: an empirical study. SN Computer Science, 1(2):1–13.