Fake news detection: a systematic literature review of machine learning algorithms and datasets
Keywords: Algorithms, datasets, accuracy, fake news, artificial intelligence
Fake news (i.e., false news created with malicious intent and a high capacity for dissemination) is a problem of great interest to society today, since it has achieved unprecedented political, economic, and social impact. Taking advantage of modern digital communication and information technologies, it is widely propagated through social media; its use is intentional, and it is challenging to identify. To mitigate the damage caused by fake news, researchers have sought to develop automated detection mechanisms, such as machine learning algorithms, together with the datasets used to develop them. This research analyzes the machine learning algorithms published in the literature for identifying fake news and the datasets used to train them. It is exploratory research with a qualitative approach, which uses a research protocol to identify the studies to be analyzed. As a result, the Stacking Method, Bidirectional Recurrent Neural Network (BiRNN), and Convolutional Neural Network (CNN) algorithms stand out, with 99.9%, 99.8%, and 99.8% accuracy, respectively. Although these accuracies are high, most of the research employed datasets from controlled environments (e.g., Kaggle) or datasets without information updated in real time from social networks. Only a few studies have been applied in social network environments, where the most significant dissemination of disinformation occurs nowadays. Kaggle was the platform with the most frequently used datasets, followed by Weibo, FNC-1, COVID-19 Fake News, and Twitter. Future research should extend beyond political news, the area that was the primary motivator for the growth of research from 2017 onward, and should explore hybrid methods for identifying fake news.
Copyright (c) 2023 Humberto Fernandes Villela, Fábio Corrêa, Jurema Suely de Araújo Nery Ribeiro, Air Rabelo, Dárlinton Barbosa Feres Carvalho
This work is licensed under a Creative Commons Attribution 4.0 International License.
JIS is free of charge for both authors and readers, and all papers published by JIS follow the Creative Commons Attribution 4.0 International (CC BY 4.0) license.