Generating Malware Using Large Language Models: A Study on Detectability and Security Barriers
Resumo
Large Language Models are Artificial Intelligence systems capable of processing natural language inputs to produce contextualized and coherent outputs. This work investigates their potential to generate malware from non-technical prompts. Controlled experiments were conducted with ChatGPT, Gemini, and Copilot, following attack templates from recent literature and analyzing the outputs through VirusTotal. Across 36 tests, five resulted in functional and undetectable code, representing a 13.89% success rate. These findings reveal that minimal prompt variations can bypass safety mechanisms and evade automated detection. Despite existing defenses, LLMs-based systems remain vulnerable to iterative prompt engineering, reinforcing the need for stronger semantic validation and multi-layered protection strategies.Referências
Botacin, M. (2023). Gpthreats-3: Is automatic malware generation a threat?. In 2023 IEEE Security and Privacy Workshops. 238-254. IEEE.
Cani, A., Gaudesi, M., Sanchez, E., Squillero, G., & Tonda, A. (2014). Towards automated malware creation: code generation and code integration. In Proceedings of the 29th Annual ACM Symposium on Applied Computing. 157–160. ACM.
Cardillo, A. (2025). 60 most popular AI tools ranked. Exploding Topics. [link].
Carvalho, G., Ladeira, R., & Lima, G. (2025). NoobGPT: LLMs e a geração de malwares indetectáveis. In Anais do XVI Workshop de Sistemas de Informação, 220-225. Porto Alegre: SBC.
Du, Y., Zhao, S., Ma, M., Chen, Y., & Qin, B. (2024). Analyzing the inherent response tendency of LLMs: Real-world instructions-driven jailbreak. arXiv: 2312.04127.
Gupta, M., Akiri, C., Aryal, K., Parker, E., & Praharaj, L. (2023). From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy. IEEE Access, 11, 80218–80245.
Kamath, U., Keenan, K., Somers, G., & Sorenson, S. (2024). Large Language Models: A Deep Dive: Bridging Theory and Practice. Springer Nature.
Li, X., Zhou, Z., Zhu, J., Yao, J., Liu, T., & Han, B. (2024). DeepInception: Hypnotize LLM to be Jailbreaker. In Neurips Safe Generative AI Workshop 2024.
Lindrea, B. (2024). Cryptocurrency User Persuades AI Robot Freysa to Transfer $47,000 Prize Pool. CoinTelegraph. [link].
Liu, Y., Deng, G., Xu, Z., Li, Y., Zheng, Y., Zhang, Y., … & Liu, Y. (2024). Jailbreaking chatgpt via prompt engineering: An empirical study. arXiv:2305.13860.
Pa, Y. M. P., Tanizaki, S., Kou, T., … & Matsumoto, T. (2023). An attacker’s dream? exploring the capabilities of chatgpt for developing malware. In Proceedings of the 16th cyber security experimentation and test workshop. 10-18.
Paim, K., Mansilha, R., Kreutz, D., Franco, M., & Cordeiro, W. (2025). Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks. In Anais do XXV Simpósio Brasileiro de Cibersegurança, 417-431. Porto Alegre: SBC.
Stanford University. (2024). The 2024 AI Index Report. [link].
Xu, Z., Liu, Y., Deng, G., Li, Y., & Picek, S. (2024). A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models. In Findings of the Association for Computational Linguistics ACL 2024, 7432-7449.
Yamin, M. M., Hashmi, E., & Katt, B. (2024). Combining uncensored and censored llms for ransomware generation. In International Conference on Web Information Systems Engineering, 189-202. Springer Nature Singapore.
Yao, D., Zhang, J., Harris, I. G., & Carlsson, M. (2024). Fuzzllm: A novel and universal fuzzing framework for proactively discovering jailbreak vulnerabilities in large language models. In IEEE ICASSP, 4485-4489. IEEE.
Yong, Z. X., Menghini, C., & Bach, S. H. (2023). Low-Resource Languages Jailbreak GPT-4. In Socially Responsible Language Modelling Research.
Zhu, K. (2024). Ranked: The most popular generative AI tools in 2024. Visual Capitalist. [link].
Cani, A., Gaudesi, M., Sanchez, E., Squillero, G., & Tonda, A. (2014). Towards automated malware creation: code generation and code integration. In Proceedings of the 29th Annual ACM Symposium on Applied Computing. 157–160. ACM.
Cardillo, A. (2025). 60 most popular AI tools ranked. Exploding Topics. [link].
Carvalho, G., Ladeira, R., & Lima, G. (2025). NoobGPT: LLMs e a geração de malwares indetectáveis. In Anais do XVI Workshop de Sistemas de Informação, 220-225. Porto Alegre: SBC.
Du, Y., Zhao, S., Ma, M., Chen, Y., & Qin, B. (2024). Analyzing the inherent response tendency of LLMs: Real-world instructions-driven jailbreak. arXiv: 2312.04127.
Gupta, M., Akiri, C., Aryal, K., Parker, E., & Praharaj, L. (2023). From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy. IEEE Access, 11, 80218–80245.
Kamath, U., Keenan, K., Somers, G., & Sorenson, S. (2024). Large Language Models: A Deep Dive: Bridging Theory and Practice. Springer Nature.
Li, X., Zhou, Z., Zhu, J., Yao, J., Liu, T., & Han, B. (2024). DeepInception: Hypnotize LLM to be Jailbreaker. In Neurips Safe Generative AI Workshop 2024.
Lindrea, B. (2024). Cryptocurrency User Persuades AI Robot Freysa to Transfer $47,000 Prize Pool. CoinTelegraph. [link].
Liu, Y., Deng, G., Xu, Z., Li, Y., Zheng, Y., Zhang, Y., … & Liu, Y. (2024). Jailbreaking chatgpt via prompt engineering: An empirical study. arXiv:2305.13860.
Pa, Y. M. P., Tanizaki, S., Kou, T., … & Matsumoto, T. (2023). An attacker’s dream? exploring the capabilities of chatgpt for developing malware. In Proceedings of the 16th cyber security experimentation and test workshop. 10-18.
Paim, K., Mansilha, R., Kreutz, D., Franco, M., & Cordeiro, W. (2025). Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks. In Anais do XXV Simpósio Brasileiro de Cibersegurança, 417-431. Porto Alegre: SBC.
Stanford University. (2024). The 2024 AI Index Report. [link].
Xu, Z., Liu, Y., Deng, G., Li, Y., & Picek, S. (2024). A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models. In Findings of the Association for Computational Linguistics ACL 2024, 7432-7449.
Yamin, M. M., Hashmi, E., & Katt, B. (2024). Combining uncensored and censored llms for ransomware generation. In International Conference on Web Information Systems Engineering, 189-202. Springer Nature Singapore.
Yao, D., Zhang, J., Harris, I. G., & Carlsson, M. (2024). Fuzzllm: A novel and universal fuzzing framework for proactively discovering jailbreak vulnerabilities in large language models. In IEEE ICASSP, 4485-4489. IEEE.
Yong, Z. X., Menghini, C., & Bach, S. H. (2023). Low-Resource Languages Jailbreak GPT-4. In Socially Responsible Language Modelling Research.
Zhu, K. (2024). Ranked: The most popular generative AI tools in 2024. Visual Capitalist. [link].
Publicado
08/12/2025
Como Citar
CARVALHO, Gustavo Lofrese; LADEIRA, Ricardo de la Rocha; LIMA, Gabriel Eduardo.
Generating Malware Using Large Language Models: A Study on Detectability and Security Barriers. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 22. , 2025, Porto Alegre/RS.
Anais [...].
Porto Alegre: Sociedade Brasileira de Computação,
2025
.
p. 116-122.
DOI: https://doi.org/10.5753/errc.2025.17690.