Automatic Detection of Misinformation in Different Scenarios: Elections in the United States and Brazil
Abstract
In this work, we investigate the potential of features for misinformation detection across different scenarios (i.e., presidential elections in the United States and in Brazil). To this end, we gathered data from these two events and computed, on both datasets, features explored in previous work. We then propose a methodology for unbiased model generation using the XGB classifier, with the performance of the generated models measured in terms of AUC. Finally, we conducted an experiment based on the Pareto frontier that allowed us to identify features that may be useful for building high-performance models to identify misinformation disseminated in different scenarios.
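The Pareto-frontier step can be illustrated with a minimal sketch. It assumes, as this summary does not spell out, that each candidate model is scored by two objectives: its AUC on the United States data and its AUC on the Brazil data. The model names and AUC values below are hypothetical and are not results from the paper.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of the models that no other model dominates."""
    return sorted(
        name
        for name, score in scores.items()
        if not any(dominates(other, score) for other in scores.values())
    )


# Hypothetical (AUC on US data, AUC on Brazil data) pairs, for illustration only.
example_scores = {
    "model_a": (0.85, 0.70),
    "model_b": (0.80, 0.82),
    "model_c": (0.78, 0.75),  # dominated by model_b on both objectives
    "model_d": (0.90, 0.60),
}

front = pareto_front(example_scores)
print(front)
```

Under these assumptions, the models that remain on the frontier are those for which no alternative does at least as well in both scenarios; the features those models use are natural candidates for further inspection.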
Keywords:
Misinformation, Fake News, Automatic Detection, Features, Elections
Published
31/07/2022
How to Cite
REIS, Julio C. S.; BENEVENUTO, Fabrício. Detecção Automática de Desinformação em Diferentes Cenários: Eleições nos Estados Unidos e no Brasil. In: BRAZILIAN WORKSHOP ON SOCIAL NETWORK ANALYSIS AND MINING (BRASNAM), 11., 2022, Niterói. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 1-12. ISSN 2595-6094. DOI: https://doi.org/10.5753/brasnam.2022.225908.