Argument Mining in Portuguese-Language Social Media Texts

Vitor Domingos Baldoino do Santos; Livia Alabarse dos Santos; Orlando B. Coelho; Renata Mendes de Araujo; Ivan Carlos Alcântara de Oliveira

doi:10.5753/stil.2024.245433

Vitor Domingos Baldoino do Santos UPM http://orcid.org/0009-0007-5746-1819
Livia Alabarse dos Santos UPM https://orcid.org/0009-0008-8409-0272
Orlando B. Coelho UPM https://orcid.org/0000-0002-8631-1090
Renata Mendes de Araujo UPM / USP https://orcid.org/0000-0002-8674-1728
Ivan Carlos Alcântara de Oliveira UPM https://orcid.org/0000-0002-6020-7535

Abstract

This paper presents the challenges and advances of research aimed at building computational solutions that support the understanding of debate on social networks in Portuguese. A fundamental basis of these solutions is the application of Argument Mining techniques. We present the strategies used to address the challenges of argument mining in social networks, in particular the use of deep learning. The results show that the selected models are effective for the tasks considered, achieving an F1-score of 0.85 for sentiment analysis, 0.97 for stance detection, and 0.76 for irony detection.

Keywords: Argument Mining, Social Networks, Computational Linguistics, Deep Learning
