Investigating the Use of Features in Malicious URL Detection

  • Maria Azevedo Bezzera UFAM
  • Eduardo Feitosa UFAM

Abstract


Malicious URLs have become a powerful channel for criminal activities on the Internet. Although current solutions for URL verification report high accuracy rates with well-adjusted results, an important question can and should be asked: is it really possible to achieve 100% accuracy with these solutions? This paper presents an investigation of features, datasets, and URL formats, aiming to show that the results of URL validation and verification depend heavily on certain aspects and factors. Features (lexical, DNS, and others) were extracted directly from the URL, and machine learning algorithms were employed to examine their influence on the process of URL validation and verification. Four cases were prepared, and the evaluation shows that it is possible to disagree with the results of several studies from the literature.

Published
2015-11-09
BEZZERA, Maria Azevedo; FEITOSA, Eduardo. Investigando o uso de Características na Detecção de URLs Maliciosas. In: BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 15., 2015, Florianópolis. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2015. p. 100-113. DOI: https://doi.org/10.5753/sbseg.2015.20088.
