An Improved Tool for Detection of XSS Attacks by Combining CNN with LSTM

Caio Lente; Roberto Hirata Jr.; Daniel Macêdo Batista

doi:10.5753/sbseg_estendido.2021.17333

Caio Lente USP http://orcid.org/0000-0001-8473-069X
Roberto Hirata Jr. USP https://orcid.org/0000-0003-3861-7260
Daniel Macêdo Batista USP https://orcid.org/0000-0002-4865-5896

Abstract

Cross-Site Scripting (XSS) remains a significant threat to web applications. By combining Convolutional Neural Networks (CNN) with Long Short-Term Memory (LSTM) techniques, researchers have developed a deep learning system called 3C-LSTM that achieves upwards of 99.4% accuracy when predicting whether a new URL corresponds to a benign locator or an XSS attack. This paper improves on 3C-LSTM by proposing different network architectures and validation strategies and identifying the optimal structure for a more efficient, yet similarly accurate, version of 3C-LSTM. The authors identify larger batch sizes, smaller inputs, and the removal of cross-validation as the modifications that achieve a speedup of around 3.9 times in the training step.

Keywords: cross-site scripting, communication system security, machine learning, natural language processing
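
The abstract attributes the training speedup to larger batch sizes, smaller inputs, and the removal of cross-validation. As a rough illustration of why two of these reduce training work, the sketch below counts gradient updates for a cross-validated baseline and for a single-split variant with a larger batch. The function name `optimizer_steps` and every number (dataset size, epochs, batch sizes, fold count) are hypothetical and not taken from the paper. Step counts are not wall-clock time, so the ratio here does not reproduce the reported 3.9 times figure, and the effect of smaller inputs is not modeled.

```python
import math


def optimizer_steps(n_train: int, batch_size: int, epochs: int, runs: int = 1) -> int:
    """Count gradient updates: runs * epochs * ceil(n_train / batch_size)."""
    if min(n_train, batch_size, epochs, runs) < 1:
        raise ValueError("all arguments must be positive integers")
    return runs * epochs * math.ceil(n_train / batch_size)


# Hypothetical values chosen only for illustration (not from the paper).
N_SAMPLES = 100_000
EPOCHS = 5
N_TRAIN = N_SAMPLES * 9 // 10  # 90% of the data used for training in both setups

# Baseline: 10-fold cross-validation, so the model is trained 10 times.
baseline = optimizer_steps(N_TRAIN, batch_size=128, epochs=EPOCHS, runs=10)

# Variant: a single hold-out split and a 4x larger batch.
variant = optimizer_steps(N_TRAIN, batch_size=512, epochs=EPOCHS, runs=1)

print(f"baseline steps: {baseline}, variant steps: {variant}")
```

Under these assumed values, dropping the folds divides the number of training runs by the fold count, and the larger batch divides the steps per epoch by roughly the batch-size ratio. How much of that translates into elapsed time depends on hardware and per-step cost.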

References

Hydara, I., Sultan, A. B. M., Zulzalil, H., & Admodisastro, N. (2015). Current state of research on cross-site scripting (XSS)–A systematic literature review. Information and Software Technology, 58, 170-186.

Grossman, J., Fogie, S., Hansen, R., Rager, A., & Petkov, P. D. (2007). XSS attacks: cross site scripting exploits and defense. Syngress.

Boyu Zhang, “Detecting XSS attacks by combining CNN with LSTM”, IEEE Dataport, 2019. [Online]. http://dx.doi.org/10.21227/css6-ds36. Accessed: Apr. 23, 2020.

Liu, M., Zhang, B., Chen, W., & Zhang, X. (2019). A Survey of Exploitation and Detection Methods of XSS Vulnerabilities. IEEE Access, 7, 182004-182016.

Wichers, D., & Williams, J., “Owasp top-10 2017”, OWASP, 2017.

Fang, Y., Li, Y., Liu, L., & Huang, C. (2018, March). DeepXSS: Cross site scripting detection based on deep learning. 2018 International Conference on Computing and Artificial Intelligence (pp. 47-51).

Gupta, S., & Gupta, B. B. (2017). Cross-Site Scripting (XSS) attacks and defense mechanisms: classification and state-of-the-art. International Journal of System Assurance Engineering and Management, 8(1), 512-530.

Goswami, S., Hoque, N., Bhattacharyya, D. K., & Kalita, J. (2017). An Unsupervised Method for Detection of XSS Attack. IJ Network Security, 19(5), 761-775.

Vishnu, B. A., & Jevitha, K. P. (2014, October). Prediction of cross-site scripting attack using machine learning algorithms. In Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing (pp. 1-5).

Rathore, S., Sharma, P. K., & Park, J. H. (2017). XSSClassifier: An Efficient XSS Attack Detection Approach Based on Machine Learning Classifier on SNSs. JIPS, 13(4), 1014-1028.

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119).

LeCun, Y., Boser, B. E., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W. E., & Jackel, L. D. (1990). Handwritten digit recognition with a back-propagation network. In Advances in neural information processing systems (pp. 396-404).

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.

Hahnloser, R. H., Sarpeshkar, R., Mahowald, M. A., Douglas, R. J., & Seung, H. S. (2000). Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature, 405(6789), 947-951.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
