Fake News Detection in Tweets: Challenges and Adaptations imposed by the COVID-19

Authors

  • Christiane Santana
  • Daniela Barreiro Claro, UFBA - Federal University of Bahia
  • Marlo Souza

DOI:

https://doi.org/10.5753/isys.2022.2286

Keywords:

fake news, COVID, misinformation, detection

Abstract

A large amount of misinformation has plagued citizens' lives, especially on social networks. During the coronavirus pandemic, the volume of false or inaccurate news about the virus was so exorbitant that the World Health Organization announced an infodemic. However, few resources are available to combat misinformation in a new and evolving domain such as the coronavirus pandemic. This problem is aggravated by the speed at which misinformation spreads on social networks. In this case, the lack of resources, such as generic methods, tools, and corpora on coronaviruses, hinders our ability to combat this misinformation. Our work aimed to evaluate the existing resources and their potential uses for the specific and ephemeral domain of COVID-19. In addition, we analyzed different writing styles and the need to create a mechanism for annotating a COVID dataset in Portuguese to improve detection mechanisms. Our results indicate the types of resources available to combat misinformation in the pandemic and report the F1-scores of our approaches to detecting misinformation on the Twitter social network within the COVID domain.

Published

2022-10-18

How to Cite

Santana, C., Claro, D. B., & Souza, M. (2022). Fake News Detection in Tweets: Challenges and Adaptations imposed by the COVID-19. ISys - Brazilian Journal of Information Systems, 15(1), 11:1–11:26. https://doi.org/10.5753/isys.2022.2286

Section

Special issues articles
