Vulnerability Detection in Smart Contracts with LLMs: A Comparison between Gemini 2.0 Flash and GPT-4

Felipe Mello Fonseca; Pedro Henrique Gonzalez; Diogo Silveira Mendonça

doi:10.5753/wblockchain.2026.23103

Felipe Mello Fonseca CEFET-RJ
Pedro Henrique Gonzalez UFRJ
Diogo Silveira Mendonça CEFET-RJ

Abstract

Automated detection of vulnerabilities in smart contracts is a significant challenge for the security of decentralized systems. This work investigates the effectiveness of Large Language Models (LLMs), in particular Gemini 2.0 Flash, at this task. The methodology is an experimental replication on the SmartBugs Curated dataset, which allows a comparative analysis with GPT-4. The results indicate that, although LLMs perform well in initial triage, they still face limitations in analyses that require deeper semantic reasoning about execution logic. We conclude that these models can support the auditing process, but validation by human experts remains essential. Future directions are also identified, such as prompt refinement and specialized analyses per vulnerability category.
