Machine Learning Techniques for Escaped Defect Analysis in Software Testing

  • Lidia Perside Gomes Nascimento UFPE
  • Ricardo Bastos Cavalcante Prudêncio UFPE
  • Alexandre Cabral Mota UFPE
  • Audir de Araujo Paiva Filho Motorola
  • Pedro Henrique Alves Cruz Motorola
  • Daniel Cardoso Coelho Alves de Oliveira Motorola
  • Pedro Roncoli Sarmet Moreira Motorola


Software testing is crucial to ensuring the quality of software under development. Once a potential bug is identified, a Bug Report (BR) is opened with the information needed to describe and reproduce the issue. In large companies, different testing teams typically open hundreds of BRs every week, each of which has to be inspected and fixed adequately. This paper focuses on the use of Machine Learning (ML) techniques to automate Escaped Defect Analysis (EDA), an important but expensive task for improving the effectiveness of testing teams. In our work, Escaped Defects (EDs) are bugs or issues that should have been reported by a specific team but were instead found, by accident, by another team. The occurrence of EDs is risky, as it usually points to failures in the testing activities. EDA is usually performed manually by software engineers, who read the textual content of each BR to judge whether it is an ED. This is challenging and time-consuming. In our solution, the BR's content is preprocessed with text operations, and the resulting feature representation is fed to an ML classifier that returns the probability of the EDA labels. Experiments were performed on a dataset of 3767 BRs provided by Motorola Mobility Comércio de Produtos Eletrônicos Ltda. Different ML algorithms were used to build classifiers, which obtained high AUC values (usually above 0.8) in a cross-validation experiment. This result indicates a good trade-off between the number of EDs correctly identified and the number of BRs that actually have to be inspected in the EDA process.

This paper presents an ML-based approach to classifying escaped defects described in bug reports. EDs are bugs that were missed by the QA team in charge and were uncovered by a different team. To automate the identification of EDs (a costly and error-prone task), a dataset from a partner company is leveraged, text processing operators are adopted for feature engineering, and six classical ML algorithms are applied. The results show satisfactory accuracy and AUC, and the experiments indicate a good trade-off between the number of EDs correctly identified and the number of BRs that have to be inspected in the EDA.
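
To make the reported trade-off concrete, the sketch below shows one way AUC and the "EDs found versus BRs inspected" relation could be computed from classifier probabilities. It is only an illustration: the scores and labels are invented, and the function names `auc` and `recall_at_budget` are hypothetical and not taken from the paper.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the chance that a random ED outscores a random non-ED (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one ED and one non-ED")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall_at_budget(scores: List[float], labels: List[int], budget: int) -> float:
    """Fraction of all EDs found when only the `budget` highest-scored BRs are inspected."""
    total_eds = sum(labels)
    if total_eds == 0:
        raise ValueError("no EDs in the labels")
    # Rank BRs from most to least likely to be an ED.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = sum(y for _, y in ranked[:budget])
    return found / total_eds


# Invented classifier outputs for eight BRs (1 = ED, 0 = not an ED).
scores = [0.92, 0.85, 0.70, 0.64, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

print(auc(scores, labels))                  # 0.75
print(recall_at_budget(scores, labels, 4))  # 0.75: inspecting half the BRs finds 3 of 4 EDs
```

In this toy case, inspecting only the four highest-ranked BRs recovers three of the four EDs, which is the kind of saving in inspection effort that a high AUC suggests.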

Keywords: Android Releases, Android Tests, Google Tests, Quality Assurance, Robotic Arm, Software Testing, Test Automation
NASCIMENTO, Lidia Perside Gomes; PRUDÊNCIO, Ricardo Bastos Cavalcante; MOTA, Alexandre Cabral; PAIVA FILHO, Audir de Araujo; CRUZ, Pedro Henrique Alves; OLIVEIRA, Daniel Cardoso Coelho Alves de; MOREIRA, Pedro Roncoli Sarmet. Machine Learning Techniques for Escaped Defect Analysis in Software Testing. In: SIMPÓSIO BRASILEIRO DE TESTES DE SOFTWARE SISTEMÁTICO E AUTOMATIZADO (SAST), 8., 2023, Campo Grande/MS. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 47–53.