Detection of SQL Injection: A Comparative Analysis of Machine Learning and Deep Learning Algorithms

  • Tiago Martins Ferreira UNESP
  • Thiago José Lucas FATEC
  • Carlos Alexandre Carvalho Tojeiro UNESP
  • Carlos Eduardo Silva Bertazzoli UNESP
  • Alessandra de Souza Lopes UNESP
  • Ana Caroline Silva Pontara UNESP

Abstract

This study investigates the effectiveness of Machine Learning (ML) and Deep Learning (DL) models in detecting SQL Injection (SQLi) attacks. Working from a labeled dataset, we applied TF-IDF with n-grams for feature extraction and SMOTE to correct class imbalance. The models evaluated were Naive Bayes, Random Forest, Support Vector Machine (SVM), and a Bi-LSTM network. The Bi-LSTM achieved the best performance (98.15% accuracy), followed by Random Forest and SVM. The results indicate that DL-based approaches, combined with appropriate preprocessing, are effective for SQLi detection and surpass the traditional models in accuracy and generalization capability. We conclude that these techniques are promising for deployment in real-world environments, provided that computational costs and practical implementation challenges are taken into account.

Keywords: SQL Injection, Machine Learning, Deep Learning, Bi-LSTM, TF-IDF
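
The abstract names TF-IDF with n-grams and Naive Bayes but gives no implementation details, so the following is only an illustrative sketch of how such a pipeline could look, not the authors' code. The class name TfidfNaiveBayes, the helper char_ngrams, the choice of character n-grams of length 1 to 3, the smoothing constants, and the toy queries are all assumptions made for this example. SMOTE and the other models (Random Forest, SVM, Bi-LSTM) are deliberately left out.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return every character n-gram of length n_min..n_max (lowercased)."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class TfidfNaiveBayes:
    """Hypothetical minimal classifier: TF-IDF features + multinomial Naive Bayes."""

    def fit(self, texts, labels):
        docs = [Counter(char_ngrams(t)) for t in texts]
        n_docs = len(docs)
        # Document frequency: in how many queries each n-gram occurs.
        df = Counter(g for d in docs for g in d)
        # Smoothed inverse document frequency (assumed form: ln((1+N)/(1+df)) + 1).
        self.idf = {g: math.log((1 + n_docs) / (1 + c)) + 1.0
                    for g, c in df.items()}
        self.classes = sorted(set(labels))
        self.log_prior, self.log_like = {}, {}
        vocab_size = len(self.idf)
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / n_docs)
            # Accumulate the TF-IDF mass each n-gram receives within this class.
            weights = Counter()
            for d in class_docs:
                for g, w in self._tfidf(d).items():
                    weights[g] += w
            # Laplace smoothing (alpha = 1) so unseen n-grams keep nonzero mass.
            total = sum(weights.values()) + vocab_size
            self.log_like[c] = {g: math.log((weights[g] + 1.0) / total)
                                for g in self.idf}
        return self

    def _tfidf(self, counts):
        """L2-normalized TF-IDF vector; n-grams outside the vocabulary are dropped."""
        vec = {g: tf * self.idf[g] for g, tf in counts.items() if g in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        return {g: w / norm for g, w in vec.items()}

    def predict(self, text):
        vec = self._tfidf(Counter(char_ngrams(text)))
        scores = {c: self.log_prior[c]
                  + sum(w * self.log_like[c][g] for g, w in vec.items())
                  for c in self.classes}
        return max(scores, key=scores.get)


# Toy data invented for this sketch: 0 = benign query, 1 = injection attempt.
train_queries = [
    "SELECT name FROM users WHERE id = 5",
    "SELECT * FROM products WHERE price < 100",
    "UPDATE accounts SET balance = 10 WHERE id = 3",
    "' OR '1'='1' --",
    "1' UNION SELECT username, password FROM users --",
    "admin' OR 1=1 #",
]
train_labels = [0, 0, 0, 1, 1, 1]

model = TfidfNaiveBayes().fit(train_queries, train_labels)
print(model.predict("' OR 'a'='a' --"))
```

Six hand-written queries are far too few to say anything about accuracy; the sketch only shows how n-gram TF-IDF weights can feed a probabilistic classifier. Character n-grams were chosen here because injection payloads often rely on punctuation such as quotes and comment markers, but word-level n-grams would be an equally plausible reading of the abstract.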

Published
22/10/2025
FERREIRA, Tiago Martins; LUCAS, Thiago José; TOJEIRO, Carlos Alexandre Carvalho; BERTAZZOLI, Carlos Eduardo Silva; LOPES, Alessandra de Souza; PONTARA, Ana Caroline Silva. Detection of SQL Injection: A Comparative Analysis of Machine Learning and Deep Learning Algorithms. In: CONGRESSO LATINO-AMERICANO DE SOFTWARE LIVRE E TECNOLOGIAS ABERTAS (LATINOWARE), 22., 2025, Foz do Iguaçu/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 407-415. DOI: https://doi.org/10.5753/latinoware.2025.16464.