When Tweets Get Viral - A Deep Learning Approach for Stance Analysis of Covid-19 Vaccines Tweets by Brazilian Political Elites
Abstract
Social media platforms are crucial for understanding public opinion about policy issues. In this regard, detecting stance in Twitter posts is a vital tool. In this study, we built a corpus of tweets from 2020 and 2021, annotated with stance towards COVID-19 vaccines and vaccination, and tested BERTimbau as a way to automatically detect stance in such tweets. Our model reached 86% accuracy on the 2020 data, 77% on the 2021 data, and 79% on the combined 2020/2021 set. Our results also highlight the time-dependent nature of the data distribution and, as a consequence, of stance classification. This research therefore also contributes to the field by shedding light on the methodological challenges of analyzing complex public policy debates over time.
Keywords:
Annotated UGC corpus, Stance classification, UGC classification, Stance Analysis in Text
Published
25/09/2023
How to Cite
BARBERIA, Lorena Guadalupe; SCHMALZ, Pedro Henrique de Santana; ROMAN, Norton Trevisan. When Tweets Get Viral - A Deep Learning Approach for Stance Analysis of Covid-19 Vaccines Tweets by Brazilian Political Elites. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 104-114. DOI: https://doi.org/10.5753/stil.2023.233961.