Detecção de desinformação sobre Covid-19 no Twitter

  • Ana Alice Ximenes Mota UFC
  • Wellington Franco UFC
  • César Lincoln Cavalcante Mattos UFC

Abstract


False or misleading news causes increasing damage because information spreads so easily on social networks. During the Covid-19 pandemic, which began in 2020, such news could cause panic in the population and mislead people about how to prevent the disease. This work introduces a new corpus of Portuguese-language Twitter posts about Covid-19 misinformation. In addition to the corpus, the work evaluates different textual representations and learning algorithms on the task of detecting misinformative messages. The best result, an F1-score of 89%, was obtained by an SVM classifier using the TF-IDF textual representation.
Keywords: Natural Language Processing, misinformation, Covid-19
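
The two quantities named in the abstract, TF-IDF weights and the F1-score, can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the function names, the whitespace tokenizer, the smoothed IDF variant, and the toy inputs are all assumptions chosen for illustration, and the SVM classifier itself is not implemented here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumption: weight = raw term count * smoothed IDF, where
    IDF = ln((1 + N) / (1 + df)) + 1. Other TF-IDF variants exist.
    """
    # Assumption: naive lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy messages, invented for this example only.
toy_docs = ["tea cures covid", "vaccines reduce covid risk"]
toy_vectors = tfidf(toy_docs)

# Hypothetical labels: 1 = misinformative, 0 = not misinformative.
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 0, 1])
```

In this sketch a term that occurs in every document ("covid") receives a lower weight than a term found in only one ("tea"), which is the property that makes TF-IDF vectors useful as input to a classifier such as an SVM.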

Published
2021-11-29
MOTA, Ana Alice Ximenes; FRANCO, Wellington; MATTOS, César Lincoln Cavalcante. Detecção de desinformação sobre Covid-19 no Twitter. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 13., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 172-181. DOI: https://doi.org/10.5753/stil.2021.17796.