Vocabulary of Flaky Tests in Javascript
Resumo
Context: Regression testing is a software verification and validation activity in modern software engineering. In this activity, tests can fail without any implementation change, characterizing a flaky test. Flaky tests may delay the release of the software and reduce testing confidence. One way to identify flaky tests is by re-running the tests, but this has a high computational cost. An alternative to re-execution is the static analysis of the code of the test cases, iidentifying patterns related to flaky tests. Objective: The objective of this work was to identify flaky tests in Javascript applications by analyzing the source code of the test cases, without executing them. Method: A dataset was built with flaky test cases extracted from open source software hosted on GitHub and implemented in Javascript. Then, a classification model and a flakiness vocabulary were created, considering the source code of flaky tests in the Javascript language. Results: We observed good results during the execution of most classifiers using the training and validation sets, with the best result being the logistic regression algorithm. However, when classifying the test set, the performance was not good, with the best results being the linear discriminant analysis. We obtained a vocabulary related to instability with words associated with asynchronous behavior (then, await, return) and related to UI (layout, gd, plot, click). Conclusions: This work presents relevant results toward a more efficient identification of flaky tests in projects that use Javascript. Further studies are required to consolidate a reliable classification of tests regarding flakiness using the vocabulary approach.