An Empirical Study on the Co-occurrence of Test Smells

Railana Santana; Luana Martins; Márcio Ribeiro; Ivan Machado

doi:10.5753/sast.2025.14387

Railana Santana UFBA
Luana Martins University of Salerno
Márcio Ribeiro UFAL
Ivan Machado UFBA

Abstract

Test smells rarely appear in isolation, and their co-occurrence makes refactoring test code harder: developers often need to apply several transformations, which increases complexity and effort. It is therefore essential to explore strategies that reduce the number of steps and make it practical to refactor co-occurring smells together. Although the literature documents many test smells, their joint refactoring remains unaddressed. This paper investigates the co-occurrence of test smells in test code, at both the class and the test method level, and strategies to fix them efficiently, minimizing the number of transformations while preserving test behavior. To this end, we used an automated tool to identify test smells in twenty-two open-source projects and evaluated the co-occurrence between different types of test smells. We then ranked the thirty most frequent test smell pairs and suggested ways to refactor them in an integrated manner. Our findings can help developers improve the quality of test cases, as our approach was designed with industry practice in mind, where test smells often appear simultaneously.
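As a rough illustration of the co-occurrence ranking described above, the sketch below counts how often each pair of smell types is detected in the same unit (a test class or a test method) and lists the most frequent pairs. It is a minimal sketch under stated assumptions, not the procedure used in the study: the class `SmellCooccurrence`, the methods `countPairs` and `topPairs`, and the sample detections are all hypothetical, and the abstract does not specify the detection tool's output format or the co-occurrence measure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of the study's tooling.
public class SmellCooccurrence {

    // Counts, for every unordered pair of smell types, the number of units
    // (test classes or test methods) in which both smells were detected.
    static Map<String, Integer> countPairs(List<Set<String>> smellsPerUnit) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> smells : smellsPerUnit) {
            // Sort the names so that (A, B) and (B, A) share one key.
            List<String> sorted = new ArrayList<>(new TreeSet<>(smells));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    String pair = sorted.get(i) + " + " + sorted.get(j);
                    counts.merge(pair, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Returns at most k pairs, ordered from most to least frequent.
    static List<Map.Entry<String, Integer>> topPairs(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries.subList(0, Math.min(k, entries.size()));
    }

    public static void main(String[] args) {
        // Made-up detections for four units; these are not data from the paper.
        List<Set<String>> units = new ArrayList<>();
        units.add(new HashSet<>(Arrays.asList(
                "Assertion Roulette", "Duplicate Assert", "Magic Number Test")));
        units.add(new HashSet<>(Arrays.asList("Assertion Roulette", "Duplicate Assert")));
        units.add(new HashSet<>(Arrays.asList("Assertion Roulette", "Magic Number Test")));
        units.add(new HashSet<>(Arrays.asList("Eager Test")));

        for (Map.Entry<String, Integer> e : topPairs(countPairs(units), 30)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

The same counting can be run once with one set per test class and once with one set per test method to obtain the two levels of analysis mentioned in the abstract. A unit with a single smell contributes no pairs.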

Keywords: Software testing, Test code, Unit testing, Test smells, Anti-patterns, Controlled experiments
