Stance Detection and Automatic Labeling of Twitter Users: a study of the scientific-political clash in the context of the CPI da Covid-19
Abstract
People often express their stance on social and political issues through messages posted on social media. Predicting that stance without manual labels can be challenging. Using a specific case study, the CPI da Covid-19 (Brazil's parliamentary commission of inquiry on Covid-19), this article proposes a method to detect and quantify the stance of Twitter users on a politically controversial and polarized topic. By combining computational approaches with social factors such as homophily and network structure, it was possible to automatically label 98% of the users in the datasets studied, with very little human intervention, and to categorize their positions through a stance valence score and two complementary metrics: degree of balance and engagement.