Comparative analysis of generative AIs as tools for secure programming support
Abstract
Generative Artificial Intelligence has become popular for supporting software development. However, questions remain about how effective this technology is for software security. This article investigates the use of programming assistants to support secure programming, weakness prevention, and detection of known vulnerabilities. Amazon CodeWhisperer, Codium, and Microsoft Copilot were evaluated against source code with known vulnerabilities. Although competitive in precision and recall, the assistants are not effective because their responses are not concise and their prompt usability is poor.
References
Borji, A. (2023). A categorical archive of ChatGPT failures. ArXiv (Cornell University), 5 Feb. 2023.
Braga, A., Dahab, R., Antunes, N., Laranjeiro, N., and Vieira, M. (2017). Practical evaluation of static analysis tools for cryptography: Benchmarking method and case study. In 2017 IEEE 28th International Symposium on Software Reliability Engineering (ISSRE), pages 170–181. IEEE.
Braga, A., Dahab, R., Antunes, N., Laranjeiro, N., and Vieira, M. (2019). Understanding how to use static analysis tools for detecting cryptography misuse in software. IEEE Transactions on Reliability, 68(4):1384–1403.
CodiumAI (2024). CodiumAI, [link], January.
CMU. Software Engineering Institute (SEI), CERT Coding Standard for C, [link], April.
CMU. Software Engineering Institute (SEI), CERT Oracle Coding Standard for Java, [link], April.
CWE (2023). 2023 CWE Top 25 Most Dangerous Software Weaknesses, [link], March.
Kabir, S., Udo-Imeh, D. N., Kou, B., and Zhang, T. (2024). Is Stack Overflow obsolete? An empirical study of the characteristics of ChatGPT answers to Stack Overflow questions. In Proceedings of the CHI Conference on Human Factors in Computing Systems, pages 1–17.
Microsoft Copilot (2024). Microsoft Copilot, [link], January.
Perry, N., Srivastava, M., Kumar, D., and Boneh, D. (2022). Do users write more insecure code with AI assistants? arXiv preprint, version 3.
