Recognition and Sharing of Textual Patterns in Fake News

Leonardo Emerson A. Alves; Jonice Oliveira; Sirius Thadeu F. da Silva

doi:10.5753/sbsc_estendido.2024.238434

Leonardo Emerson A. Alves, Universidade Federal do Rio de Janeiro (UFRJ)
Jonice Oliveira, Universidade Federal do Rio de Janeiro (UFRJ)
Sirius Thadeu F. da Silva, Universidade Federal do Rio de Janeiro (UFRJ)

Abstract

This research proposes a methodology for characterizing fake news written in Brazilian Portuguese, describing how it evolves, and identifying its patterns. The characterization is based on textual analysis of news items collected between 2013 and 2021, using natural language processing and topic modeling techniques. The main distinguishing feature of this research is that it works with an unbalanced corpus. For that reason, the approach focuses on unsupervised machine learning and uses the coherence metric of the topic models to optimize the results.

Keywords: Fake News, Textual Analysis, Natural Language Processing, Web Scraping, Topic Modeling
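
The idea of using topic coherence to choose among candidate topic models can be illustrated with a minimal sketch. The abstract does not say which coherence measure or library was used, so this example assumes the UMass co-occurrence measure and implements it with the Python standard library only. The function names (`umass_coherence`, `mean_coherence`, `pick_best_model`), the toy documents, and the candidate topics are all hypothetical and are not taken from the study.

```python
import math

def umass_coherence(top_words, documents, eps=1.0):
    """UMass coherence of one topic (assumed measure, not confirmed by the paper).

    Sums log((D(w_i, w_j) + eps) / D(w_j)) over every pair where w_j is
    ranked above w_i in the topic; D counts documents containing the words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = doc_freq(top_words[j])
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(top_words[i], top_words[j]) + eps) / d_j)
    return score

def mean_coherence(topics, documents):
    """Average coherence over all topics of one candidate model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)

def pick_best_model(candidates, documents):
    """Return the label of the candidate whose topics are most coherent on average.

    `candidates` maps a label (for example a number of topics) to a list of
    topics, each topic being its list of top words.
    """
    return max(candidates, key=lambda label: mean_coherence(candidates[label], documents))

# Hypothetical tokenized corpus, for illustration only.
DOCS = [
    ["vaccine", "virus", "health"],
    ["vaccine", "virus", "hospital"],
    ["election", "vote", "ballot"],
    ["election", "vote", "candidate"],
]

# Two hypothetical candidate models, described by their topics' top words.
CANDIDATES = {
    "coherent": [["vaccine", "virus"], ["election", "vote"]],
    "mixed": [["vaccine", "election"], ["virus", "vote"]],
}

if __name__ == "__main__":
    for label, topics in CANDIDATES.items():
        print(label, round(mean_coherence(topics, DOCS), 3))
    print("selected:", pick_best_model(CANDIDATES, DOCS))
```

Because coherence is computed from the corpus alone, this kind of selection needs no labels, which is consistent with an unsupervised setting on an unbalanced corpus. In practice the candidates would come from topic models trained with different settings, such as different numbers of topics.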
