An Empirical Study on Configuration-Related Code Weaknesses
Developers often use the C preprocessor to handle variability and portability. However, many researchers and practitioners criticize the use of preprocessor directives because of their negative effect on code understanding, maintainability, and error proneness. This negative effect may lead to configuration-related code weaknesses, which appear only when we enable or disable certain configuration options. A weakness is a type of mistake in software that, in proper conditions, could contribute to the introduction of vulnerabilities within that software. Configuration-related code weaknesses may be harder to detect and fix than weaknesses that appear in all configurations, because variability increases complexity. To address this problem, we propose a sampling-based white-box technique to detect configuration-related weaknesses in configurable systems. To evaluate our technique, we performed an empirical study with 24 popular highly configurable systems that make heavy use of the C preprocessor, such as Apache Httpd and Libssh. Using our technique, we detected 57 configuration-related weaknesses in 16 systems. In total, we found occurrences of the following five kinds of weaknesses: 30 memory leaks, 10 uninitialized variables, 9 null pointer dereferences, 6 resource leaks, and 2 buffer overflows. The corpus of these weaknesses is a valuable source to better support further research on configuration-related code weaknesses.