An Empirical Analysis of Two Mutation Testing Tools for Java

  • Ricardo Monteiro UFSJ
  • Vinicius Humberto Serapilha Durelli UFSJ
  • Marcelo Eler USP
  • Andre Endo UFSCar


The effectiveness of mutation testing depends on the mutants that are used. However, generating mutants manually is time-consuming and unwieldy, mainly because of the vast number of mutants required. Thus, many mutation tools have been developed and employed by researchers. Despite their longstanding availability, many of these tools still fall short of the mark. Specifically, tools seldom implement the complete set of mutation operators proposed in the literature, and the set that each tool does implement is heavily influenced by the target programming language, the intended audience, and the point at which mutants are generated (i.e., the phase of compilation). Consequently, different mutation tools might produce different results in terms of the mutants killed by a given test suite. We set out to examine the quality of the mutants produced by two mutation tools for Java: Major and Pit. We found that Pit generates a significantly larger number of mutants than Major. Our results suggest that, overall, the mutants generated by Pit perform slightly better than those generated by Major. However, when potentially equivalent mutants are excluded from the analysis, the mutants generated by Major outperform those generated by Pit.
Keywords: Empirical analysis, Mutation testing, Software testing, Replication study
MONTEIRO, Ricardo; DURELLI, Vinicius Humberto Serapilha; ELER, Marcelo; ENDO, Andre. An Empirical Analysis of Two Mutation Testing Tools for Java. In: SIMPÓSIO BRASILEIRO DE TESTES DE SOFTWARE SISTEMÁTICO E AUTOMATIZADO (SAST), 7., 2022, Uberlândia. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 49–58.