Structural Characterization and Graph-based Detection of Fake News in Portuguese
Abstract
The production of fake news is a present-day problem. Social networks make fake news easier and cheaper to spread, so it can reach a large number of people in a short time. In this paper, we investigate graph-based approaches to characterizing and detecting fake news, drawing on widely used measures from graph theory and complex networks. Our results show that some network measures are useful for structurally characterizing fake and true news, and that machine learning solutions built on this kind of feature produce promising results.
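To make the idea of network measures as features concrete, here is a minimal sketch. The abstract does not say how texts are turned into graphs, so the construction below (an undirected graph linking consecutive words) is an assumption, as are the function names `build_word_graph` and `network_measures` and the choice of measures (density, average degree, average clustering coefficient). It illustrates the general technique and is not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations


def build_word_graph(text):
    """Assumed construction: link each pair of consecutive words
    with an undirected edge (a simple word adjacency graph)."""
    words = text.lower().split()
    adjacency = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from immediately repeated words
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency


def network_measures(adjacency):
    """Compute a few standard graph measures as a feature dictionary."""
    n = len(adjacency)
    # Each undirected edge appears in two neighbor sets.
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    avg_degree = 2 * m / n if n else 0.0

    # Local clustering: fraction of a node's neighbor pairs that are linked.
    coefficients = []
    for neighbors in adjacency.values():
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        coefficients.append(2 * links / (k * (k - 1)))
    avg_clustering = sum(coefficients) / n if n else 0.0

    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "avg_degree": avg_degree,
        "avg_clustering": avg_clustering,
    }


# Hypothetical example sentence, not taken from any corpus.
features = network_measures(build_word_graph("o governo disse que o governo vai agir"))
print(features)
```

A feature dictionary like this, computed once per news text, could then be passed to any standard classifier to separate fake from true news. Which measures actually discriminate between the two classes is an empirical question that the paper addresses.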