A framework for clustering and extracting information from phishing URLs
Abstract
Despite advances in prevention and mitigation mechanisms, phishing remains a threat. One reason is that phishers continuously improve their techniques. In this paper we study and characterize one of these improvements: phishers’ use of redirection chains to evade identification mechanisms and avoid takedown of the infrastructure hosting the malicious content. We propose a method to group messages and URLs into phishing campaigns, and develop a framework to identify their hosting infrastructure. We apply our method and framework to a dataset of spam and phishing messages collected from low-interactivity honeypots, and use it to explore and characterize phishing campaigns as well as their hosting infrastructure. Our results indicate that phishing campaigns are usually hosted on cloud providers, but some are hosted on devices in access networks, possibly infected end-user devices. This suggests that phishers use multiple approaches, and it motivates combating the threat on several fronts.