An Assessment of Machine Learning Algorithms and Models for Prediction of Change-Prone Java Methods

  • Paulo Roberto Farah UDESC / UFPR
  • Rogério Silva UFPR
  • Silvia Vergilio UFPR


Identifying which parts of the code are prone to change during software evolution allows developers to prioritize and allocate resources efficiently. Focusing on a smaller scope makes change management easier and allows the type of modification and its impact to be monitored. However, existing change-proneness prediction approaches focus mainly on classes. The problem is that a class aggregates characteristics of many different software attributes, while some software behaviors are more granular and better captured at the method level. Motivated by these facts, in this paper we empirically assess the performance of four machine learning algorithms for change-prone method prediction in seven open-source software projects. We derive and compare models obtained with three sets of independent variables (features): one composed of structural metrics, a second composed of evolution-based metrics, and a third that combines both kinds of metrics. The results show that Random Forest presents the best overall performance, regardless of the indicator and feature set used. The model that combines both sets of metrics outperforms the other two. Two features based on the frequency of changes in the method's evolution history are pointed out as the most important for our problem.

Keywords: software metrics, machine learning, software maintenance
FARAH, Paulo Roberto; SILVA, Rogério; VERGILIO, Silvia. An Assessment of Machine Learning Algorithms and Models for Prediction of Change-Prone Java Methods. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 37., 2023, Campo Grande/MS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 322–331.