Identification of “Fake News” in Brazilian political context: a computational approach

Abstract


This paper presents the main results of a computational approach to analyzing Brazilian fake news in a political context. It investigates which machine learning algorithm, Support Vector Machine (SVM) or Naive Bayes, performs best at classifying Brazilian political news, written in natural language, as fake or not. The best performance was achieved by combining an SVM with an RBF kernel and a bag-of-words (BOW) representation: 80.4% accuracy, 82% precision, 76% recall, 78% F1-score, and 88% AUC. The non-probabilistic algorithms proved better at classifying fake news, and these results point to a path for future work.

Keywords: Fake News, Machine Learning, Natural Language Processing
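
As an illustration of the quantities reported in the abstract, the following is a minimal sketch of a bag-of-words representation and of the evaluation metrics (accuracy, precision, recall, F1-score, AUC). It is not the authors' code: the function names, the whitespace tokenization, and the toy labels and scores are assumptions made for this example, and no classifier or preprocessing from the paper is reproduced.

```python
from collections import Counter


def bag_of_words(texts):
    """Map each text to token counts over a shared, sorted vocabulary.

    Assumption: lowercase + whitespace tokenization, chosen for illustration.
    """
    tokenized = [text.lower().split() for text in texts]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[token] for token in vocabulary])
    return vocabulary, vectors


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fake, 0 = not fake)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the fraction of (fake, not-fake) pairs ranked correctly; ties count 0.5."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the paper's dataset.
texts = ["candidate wins election", "shocking secret candidate scandal"]
vocabulary, vectors = bag_of_words(texts)

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(vocabulary)
print(vectors)
print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

In practice the count vectors would be fed to a classifier such as an RBF-kernel SVM or Naive Bayes, and the metrics would be computed on held-out predictions rather than on hand-written labels.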

Published
2021-07-19
ALMEIDA, Laura D. de; FUZARO, Victor; V. NIETO, Falmer; SANTANA, André L. M. Identification of “Fake News” in Brazilian political context: a computational approach. In: WORKSHOP ON THE IMPLICATIONS OF COMPUTING IN SOCIETY (WICS), 2., 2021, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 78-89. ISSN 2763-8707. DOI: https://doi.org/10.5753/wics.2021.15966.