Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp

  • Julio C. S. Reis UFV
  • Philipe Melo UFV
  • Fabiano Belém UFMG
  • Fabricio Murai Worcester Polytechnic Institute
  • Jussara M. Almeida UFMG
  • Fabricio Benevenuto UFMG

Abstract


Digital news outlets have largely replaced traditional newspapers and television as the primary channels for information consumption in Brazil, and WhatsApp plays a crucial role in this context by disseminating specific news stories. Large public chat groups with numerous users and the ease of message forwarding have made the app popular among Brazilians as an affordable and immediate communication alternative. However, this popularity has also placed the platform at the center of misinformation campaigns. The platform’s closed architecture, protected by end-to-end encryption, makes WhatsApp content hard to investigate and fact-check, hampering efforts to combat this problem. In this work, we explore automatic ranking-based strategies and propose a “fakeness score” model to help fact-checking agencies identify fake news stories shared through images on WhatsApp. Based on the results, we design a tool and integrate it into a real system that has been used extensively for monitoring content during the 2018 Brazilian general election. Our experimental evaluation shows that, compared to the mechanisms fact-checking agencies currently use to select news stories to check, this tool can reduce by up to 40% the effort required to identify 80% of the fake news in the data.
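To make the evaluation criterion in the abstract concrete, the sketch below shows one plausible way to measure "effort to identify 80% of the fake news": rank the items by a score, walk down the ranking, and count how many items must be inspected before 80% of the fakes have been seen. This is only an illustration under stated assumptions, not the paper's implementation. The function names, the use of share counts as the baseline ordering, and all numbers in the toy data are hypothetical and made up for this example.

```python
import math


def effort_to_recall(scores, is_fake, target_recall=0.8):
    """Fraction of items inspected, in descending score order,
    before `target_recall` of all fake items has been found."""
    total_fakes = sum(is_fake)
    if total_fakes == 0:
        raise ValueError("no fake items in the data")
    # Number of fakes that must be found (small tolerance guards float error).
    needed = math.ceil(target_recall * total_fakes - 1e-9)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = 0
    for inspected, i in enumerate(order, start=1):
        found += is_fake[i]
        if found >= needed:
            return inspected / len(scores)


def relative_effort_reduction(baseline_effort, model_effort):
    """Relative saving of the model ranking over the baseline ranking."""
    return 1.0 - model_effort / baseline_effort


# Toy data, invented for illustration only (not results from the paper).
is_fake = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
# Hypothetical "fakeness scores" produced by some ranking model.
fakeness = [0.9, 0.2, 0.8, 0.7, 0.6, 0.1, 0.75, 0.3, 0.15, 0.4]
# Assumed baseline: check the most-shared images first.
share_count = [50, 90, 10, 80, 70, 60, 40, 30, 20, 5]

model_effort = effort_to_recall(fakeness, is_fake)
baseline_effort = effort_to_recall(share_count, is_fake)
reduction = relative_effort_reduction(baseline_effort, model_effort)
print(f"model effort: {model_effort:.2f}, baseline effort: {baseline_effort:.2f}")
print(f"relative effort reduction: {reduction:.1%}")
```

Measuring effort as the fraction of items inspected keeps the comparison independent of dataset size, so two orderings of the same data can be compared directly.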

Keywords: WhatsApp, Misinformation, Ranking, Fact-Checking, Fake News, Images

Published
23/10/2023
REIS, Julio C. S.; MELO, Philipe; BELÉM, Fabiano; MURAI, Fabricio; ALMEIDA, Jussara M.; BENEVENUTO, Fabricio. Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 29., 2023, Ribeirão Preto/SP. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 159–167.