Monitoring and Identification of Phishing Pages
Abstract
Phishing campaigns frequently employ Web pages that mimic legitimate ones to fool victims into providing sensitive information. Despite the research community's continued efforts to address these malicious activities, phishing grows ever more sophisticated and continues to claim victims. In this paper we present a new framework for monitoring phishing Web pages that combines multiple techniques to achieve scalability and effectiveness. We also study the trade-offs involved in the complex task of training models to identify phishing Web pages. We show that representative datasets and features describing the content of phishing Web pages are crucial for building models that generalize. Applied to hundreds of thousands of daily e-mails, our framework identified about one hundred phishing pages, a reduction of three orders of magnitude, and it serves as a starting point for future phishing mitigation efforts.
