TY  - JOUR
AU  - Lima, Jackson Antonio do Prado
AU  - Regina Vergilio, Silvia
PY  - 2023/02/07
Y2  - 2024/03/28
TI  - An Evaluation of Ranking-to-Learn Approaches for Test Case Prioritization in Continuous Integration
JF  - Journal of Software Engineering Research and Development
JA  - JSERD
VL  - 11
IS  - 1
SE  - Research Article
DO  - 10.5753/jserd.2023.2142
UR  - https://sol.sbc.org.br/journals/index.php/jserd/article/view/2142
SP  - 4:1 - 4:20
AB  - Continuous Integration (CI) is a practice adopted by most organizations that allows frequent integration of software changes, making software evolution more rapid and cost-effective. CI environments require dynamic Test Case Prioritization (TCP) approaches that adapt better to the test budgets and to the frequent addition and removal of test cases. In this sense, Ranking-to-Learn approaches have been proposed and are more suitable for CI constraints. By observing past prioritizations and guided by reward functions, they learn the best prioritization for a given commit. To contribute to improvements and direct future research, this work evaluates how far the solutions produced by these approaches are from the optimal solutions produced by a deterministic approach (ground truth). To this end, we consider two learning-based approaches: i) RETECS, which is based on Reinforcement Learning; and ii) COLEMAN, which is based on Multi-Armed Bandit. The evaluation was conducted with twelve systems, three test budgets, two reward functions, and six measures concerning fault detection effectiveness, early fault detection, test time reduction in the CI cycles, prioritization time, and accuracy. Our findings have implications for the application of the approaches and for the choice of reward function. The approaches are applicable in real scenarios and produce solutions very close to the optimal ones, respectively, in 92% and 75% of the cases. Both approaches have limitations in learning from little historical test data (a small number of CI cycles) and in dealing with a large test case set in which many failures are distributed over many test cases.
ER  - 