From past to future: An experience using data mining to guide tests

Érica Miranda Sousa; André Rodrigues; Nityananda Teixeira; Ismayle Sousa Santos; Mariana Salamoni Francisco; Rossana Maria de Castro Andrade; Danilo Reis Vasconcelos

Érica Miranda Sousa Universidade Federal do Ceará https://orcid.org/0000-0002-6211-6856
André Rodrigues Universidade Federal do Ceará https://orcid.org/0000-0002-3448-1143
Nityananda Teixeira Instituto Federal do Ceará https://orcid.org/0000-0002-5881-0001
Ismayle Sousa Santos Universidade Federal do Ceará https://orcid.org/0000-0001-5580-643X
Mariana Salamoni Francisco Furukawa Electric LatAm http://orcid.org/0000-0001-8350-9803
Rossana Maria de Castro Andrade Universidade Federal do Ceará https://orcid.org/0000-0002-0186-2994
Danilo Reis Vasconcelos Instituto Federal do Ceará https://orcid.org/0000-0002-1035-2275

Abstract

Errors are common during software development. Whether the methodology is agile or traditional, these errors are documented and registered in tools that allow us to manage and trace them. This data is rich in information about the product under development and the processes in use, so analyzing it can give a better view of the product’s characteristics, its faults, and how they affect its quality. This article reports on the use of Machine Learning techniques on a software defect database to identify and classify critical areas of the system, in order to support decision making by the test team, the evolution process, and production code maintenance by the developers. Overall, a set of 1045 software defect records was collected, and we identified that: (i) 63% of the defects are concentrated in 10 of the 71 existing functionalities, (ii) one functionality tends to show defects in the latest versions of our software, and (iii) the software has 4 critical functionalities that concentrate 52% of the reported defects and show recurrent defects.

Keywords: Software Testing, Software Incident Repositories, Data Mining