On the Use of Deep Graph CNN to Detect Vulnerable C Functions

José D’Abruzzo Pereira; Nuno Lourenço; Marco Vieira

José D’Abruzzo Pereira, Universidade de Coimbra
Nuno Lourenço, Universidade de Coimbra
Marco Vieira, Universidade de Coimbra

Abstract

Software vulnerabilities are a problem in most software systems. If left unchecked, they can be exploited by malicious third parties to compromise the system, with potentially hazardous consequences. Over the years, several techniques have been proposed to detect vulnerabilities automatically. Despite these efforts, such techniques usually raise a large number of false alarms, which the development team must then analyze at considerable cost. In this work, we study the viability of using a static technique, originally developed to classify malware into classes, to detect vulnerable C functions. The technique uses the Control Flow Graph (CFG) of each function, features related to the structure of the graph, and the code sequence. Unlike in the malware classification problem, we also extract features related to memory management. All features are processed by a Deep Graph Convolutional Neural Network (DGCNN). For this study, we use vulnerable and non-vulnerable functions from the open-source Linux Kernel project. Results show that this approach can achieve high recall, at the cost of low precision.

Keywords: vulnerability detection, machine learning, software security