ABSTRACT
Digital news outlets have largely replaced traditional newspapers and television as the primary channels for information consumption in Brazil, and WhatsApp plays a crucial role in this context by disseminating specific news stories. Large public chat groups with numerous users and the ease of message forwarding have made the app popular among Brazilians as an affordable and immediate communication alternative. However, these same features have also put the platform in a central position in the spread of misinformation campaigns. The platform’s closed architecture, protected by end-to-end encryption, makes WhatsApp content difficult to investigate and fact-check, hampering efforts to combat this problem. In this work, we explore automatic ranking-based strategies and propose a “fakeness score” model to help fact-checking agencies identify fake news stories shared through images on WhatsApp. Based on the results, we design a tool and integrate it into a real system that was used extensively to monitor content during the 2018 Brazilian general election. Our experimental evaluation shows that this tool can reduce by up to 40% the effort required to identify 80% of the fake news in the data, compared with the mechanisms fact-checking agencies currently use to select news stories to check.
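The effort measure described in the abstract (how many items a fact-checker must review before a given share of the fake stories has been found) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy labels, the toy scores, and the baseline review order are hypothetical and are not taken from the paper's data or implementation.

```python
import math
from typing import Sequence


def effort_to_reach_recall(labels_in_review_order: Sequence[int],
                           target_recall: float = 0.8) -> int:
    """Count how many items must be reviewed, in the given order,
    before `target_recall` of all fake items (label 1) have been seen."""
    total_fake = sum(labels_in_review_order)
    if total_fake == 0:
        raise ValueError("the data contains no fake items")
    # Small epsilon guards against floating-point noise in the product.
    needed = math.ceil(target_recall * total_fake - 1e-9)
    found = 0
    for reviewed, label in enumerate(labels_in_review_order, start=1):
        found += label
        if found >= needed:
            return reviewed
    return len(labels_in_review_order)


def order_by_score(labels: Sequence[int], scores: Sequence[float]) -> list:
    """Reorder labels so that items with the highest score come first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [label for _, label in ranked]


# Hypothetical toy data: 1 = fake, 0 = not fake, listed in a baseline
# review order (for example, the order an agency would otherwise follow).
baseline_labels = [0, 1, 0, 0, 1, 0, 1, 0, 1, 1]
# Hypothetical "fakeness scores" for the same items, in the same order.
fakeness_scores = [0.1, 0.9, 0.2, 0.3, 0.8, 0.4, 0.7, 0.6, 0.5, 0.95]

baseline_effort = effort_to_reach_recall(baseline_labels)
ranked_effort = effort_to_reach_recall(
    order_by_score(baseline_labels, fakeness_scores))
effort_reduction = 1 - ranked_effort / baseline_effort
```

On this toy input, a scored ranking that places fake items near the top lowers the number of reviews needed to reach 80% of the fake items. The size of the reduction here says nothing about the results reported in the paper; it only shows how such a comparison can be computed.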
Index Terms
- Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp