LCCSS: A Similarity Metric for Identifying Similar Test Code

  • Lucas Pereira da Silva UFSC
  • Patrícia Vilain UFSC

Abstract


Test code maintainability is a common concern in software testing. To be maintainable, test methods should be clearly structured, well named, and small, and, above all, test code duplication should be avoided. Several strategies exist to avoid test code duplication, such as implicit setup and delegated setup. Before these strategies can be applied, however, the duplicate code must be identified, which can be time-consuming. To address this problem, we automate the identification of duplicate test code by applying code similarity metrics. We propose a novel similarity metric, Longest Common Contiguous Start Sub-Sequence (LCCSS), which measures the similarity between pairs of tests in order to identify refactoring candidates. The most similar pairs are reported as strong candidates for refactoring through the implicit setup strategy. We also develop a framework, Róża, that can use different similarity metrics to identify test code duplication. In an experiment, both LCCSS and Simian, a clone detection tool, identified pairs of tests to be refactored through the implicit setup strategy with maximum precision at all eleven standard recall levels. Unlike Simian, however, LCCSS does not need to be calibrated for each project.
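The abstract does not define LCCSS formally, so the following is only a minimal sketch of one plausible reading of the name: the score of a pair of tests is the number of leading statements they have in common, counted from the first statement until the first mismatch. Modelling a test as a list of statement strings, the function names `lccss_length` and `rank_pairs`, and the sample tests are all assumptions made for illustration. They are not taken from the paper or from the Róża framework.

```python
from itertools import combinations


def lccss_length(test_a, test_b):
    """Count the leading statements two tests share (assumed reading of LCCSS)."""
    length = 0
    for stmt_a, stmt_b in zip(test_a, test_b):
        if stmt_a != stmt_b:
            break  # the common start ends at the first mismatch
        length += 1
    return length


def rank_pairs(tests):
    """Return (name_a, name_b, score) for every pair of tests, most similar first."""
    scored = [
        (name_a, name_b, lccss_length(tests[name_a], tests[name_b]))
        for name_a, name_b in combinations(sorted(tests), 2)
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical tests, each modelled as a list of statement strings.
tests = {
    "test_deposit": [
        "acct = Account()",
        "acct.open()",
        "acct.deposit(10)",
        "assert acct.balance == 10",
    ],
    "test_withdraw": [
        "acct = Account()",
        "acct.open()",
        "acct.withdraw(5)",
        "assert acct.balance == -5",
    ],
    "test_name": [
        "user = User('ana')",
        "assert user.name == 'ana'",
    ],
}

ranking = rank_pairs(tests)
```

Under this reading, the top-ranked pair shares its opening statements, which are the statements an implicit setup refactoring would move into a common setup method. A real tool would presumably compare normalized statements rather than raw strings. That detail is left open here.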
Keywords: implicit setup, measure, metric, refactoring, similarity, testing
Published
19 October 2020
SILVA, Lucas Pereira da; VILAIN, Patrícia. LCCSS: A Similarity Metric for Identifying Similar Test Code. In: SIMPÓSIO BRASILEIRO DE COMPONENTES, ARQUITETURAS E REUTILIZAÇÃO DE SOFTWARE (SBCARS), 14., 2020, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 91–100.