Characterization of Phishing with Large Language Models (LLMs): A Comparative Evaluation between Gemini, DeepSeek, and ChatGPT

  • Evelyn E. B. Bustamante UFU
  • Adriano M. Rocha UFU
  • Silvio E. Quincozes UFU / UNIPAMPA
  • Juliano F. Kazienko UFSM
  • Vagner E. Quincozes UFF

Abstract


The rise in phishing attacks demands robust detection and characterization strategies. Large Language Models (LLMs) show promise in this domain, but their effectiveness, particularly that of newer models, remains unexplored. This work proposes a novel LLM-based method for characterizing phishing. Gemini, DeepSeek, and ChatGPT each extracted key phishing features from a set of 1,009 emails, using standardized prompts to keep the tests consistent. The results indicate that DeepSeek stands out in robustness and achieved the best overall performance, with an F1-Score of 92.38%.
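For readers unfamiliar with the metric reported above, the F1-Score is the harmonic mean of precision and recall. The following is a minimal sketch of that computation in Python, not the authors' evaluation code: the function name, the label strings, and the five toy labels are hypothetical and chosen only for illustration.

```python
def f1_score(y_true, y_pred, positive="phishing"):
    """Return the F1-score for the `positive` class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (illustrative only, not data from the paper).
y_true = ["phishing", "phishing", "phishing", "legit", "legit"]
y_pred = ["phishing", "phishing", "legit", "phishing", "legit"]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.4f}")
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1-Score is also 2/3.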

Published
2025-09-01
BUSTAMANTE, Evelyn E. B.; ROCHA, Adriano M.; QUINCOZES, Silvio E.; KAZIENKO, Juliano F.; QUINCOZES, Vagner E. Characterization of Phishing with Large Language Models (LLMs): A Comparative Evaluation between Gemini, DeepSeek, and ChatGPT. In: WORKSHOP ON SCIENTIFIC INITIATION AND UNDERGRADUATE WORKS - BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 25., 2025, Foz do Iguaçu/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 204-215. DOI: https://doi.org/10.5753/sbseg_estendido.2025.11793.
