Source Code Vulnerability Detection and Interpretability with Language Models

  • Leonardo Silveira ITA
  • Claudio A. S. Lelis ITA
  • Cesar A. C. Marcondes ITA
  • Filipe A. N. Verri ITA

Abstract


Software vulnerability detection is crucial to prevent hostile attacks that can compromise applications and expose sensitive data. Traditional static, dynamic, and symbolic analyzers typically balance precision against computational complexity, often demanding high analytical costs or sacrificing detection accuracy. A recent alternative is the use of machine learning models, which can circumvent this trade-off by offering precise predictions with acceptable complexity. These models draw on advances in Natural Language Processing; however, most existing works focus on classification performance, with limited study of model interpretability. In this work, we fine-tuned two medium-sized language models, CodeBERT and CoTexT, for vulnerability detection in programming language code. We curated a benchmark dataset composed of vulnerable code fragments and their respective ground-truth masks, which indicate the exact tokens corresponding to vulnerabilities. We then applied two interpretability methods, Saliency and InputXGradient, which rank among the most performant interpretability techniques for text classification, to generate token-level importance heatmaps. Our evaluation shows that both methods achieve comparable precision; however, InputXGradient produces heatmaps that are substantially more interpretable. Finally, comparing the language models in terms of their ability to explain their predictions, we observed a stark contrast: CodeBERT provided more precise, concise, and intuitive explanations. Furthermore, our findings suggest that tokenizer design significantly influences the capacity of the model to learn code syntax and semantics, affecting both predictive performance and the clarity of generated interpretations.
The results of our work underscore the critical role of interpretability and tokenizer configuration in ML-based vulnerability detection, highlighting the need to choose suitable attribution methods and tokenizer settings when employing such models.
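To make the distinction between the two attribution methods concrete, the sketch below contrasts them on a deliberately tiny linear "classifier" with one scalar feature per token. The model, token values, and weights are hypothetical illustrations, not taken from the paper; for a linear score the gradient with respect to each input is simply its weight, which makes the two formulas easy to compare by hand.

```python
# Minimal sketch of Saliency vs. InputXGradient attribution.
# The linear "model" and all values here are hypothetical, for illustration only.

def model_score(x, w):
    """A toy linear classifier: score = sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def saliency(x, w):
    # For a linear model, d(score)/dx_i = w_i.
    # Saliency attributes importance as the absolute gradient, ignoring x itself.
    return [abs(wi) for wi in w]

def input_x_gradient(x, w):
    # InputXGradient multiplies the gradient by the input value, so tokens
    # with small feature values receive proportionally small attribution.
    return [xi * wi for xi, wi in zip(x, w)]

# One scalar "embedding" per token and hypothetical learned weights.
x = [0.5, 0.25, 1.0]
w = [2.0, -4.0, 1.5]

print(saliency(x, w))          # [2.0, 4.0, 1.5]
print(input_x_gradient(x, w))  # [1.0, -1.0, 1.5]
```

In practice both methods are applied to token embeddings of a fine-tuned model (e.g. via an attribution library) and the per-token scores are rendered as a heatmap over the source code; the sketch only shows why InputXGradient can concentrate attribution on the tokens actually present in the input while plain Saliency reflects model sensitivity alone.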
Published
27/10/2025
SILVEIRA, Leonardo; LELIS, Claudio A. S.; MARCONDES, Cesar A. C.; VERRI, Filipe A. N. Source Code Vulnerability Detection and Interpretability with Language Models. In: LATIN-AMERICAN SYMPOSIUM ON DEPENDABLE COMPUTING (LADC), 14., 2025, Valparaíso, Chile. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 255-271.