On the prevalence of test smells in mobile development

  • Tássio Virgínio (UFBA)
  • Márcio Ribeiro (UFAL)
  • Ivan Machado (UFBA)

Abstract


Objective: This study explores the quality of tests written in Dart, the main language for mobile application development with the Flutter framework.

Methods: We use DNose, a tool that detects 14 types of test smells in Dart code, and evaluate its precision, accuracy, recall, and F1-score. We then use the tool to analyze in detail the tests of open-source projects extracted from the language's central repository.

Results: We started from a dataset of 5,410 Dart projects, of which we were able to clone 4,154 repositories after processing. From the cloned projects, we generated a dataset containing 907,566 occurrences of test smells. Our analysis characterizes the types of test smells most frequently encountered and identifies their causes. Test smells were present in 74% of test files. The analyzed projects were also notably short on tests: 1,873 projects had one or no tests, which led us to expand the number of analyzed projects to a broader base.

Conclusion: This research provides insights into the quality of tests in projects from Dart's official repository and offers an open-source tool for detecting 14 types of test smells.

Keywords: Tests, Test Smells, Dart, Flutter
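
The abstract reports that the detection tool was evaluated with precision, accuracy, recall, and F1-score. As a reminder of how these four measures relate, the following is a minimal sketch that computes them from the counts of a confusion matrix for a single smell type. The names `ConfusionCounts` and `detection_metrics` are hypothetical, and the example counts are illustrative only; they are not results from the study, nor the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Detector output compared against a manually labelled oracle (assumed setup)."""
    tp: int  # smell reported and actually present
    fp: int  # smell reported but not present
    fn: int  # smell present but not reported
    tn: int  # no smell reported and none present


def detection_metrics(c: ConfusionCounts) -> dict:
    """Return precision, recall, accuracy and F1 for one smell type."""

    def ratio(numerator: float, denominator: float) -> float:
        # Guard against empty denominators (e.g. the detector reported nothing).
        return numerator / denominator if denominator else 0.0

    precision = ratio(c.tp, c.tp + c.fp)
    recall = ratio(c.tp, c.tp + c.fn)
    accuracy = ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    # F1 is the harmonic mean of precision and recall.
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not data from the paper).
example = ConfusionCounts(tp=8, fp=2, fn=2, tn=8)
for name, value in detection_metrics(example).items():
    print(f"{name}: {value:.2f}")
```

Reporting accuracy alongside precision and recall is useful here because smell-free test cases can dominate a sample, in which case accuracy alone would look high even for a detector that misses most smells.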

Published: 2025-09-22

VIRGÍNIO, Tássio; RIBEIRO, Márcio; MACHADO, Ivan. On the prevalence of test smells in mobile development. In: BRAZILIAN SYMPOSIUM ON SYSTEMATIC AND AUTOMATED SOFTWARE TESTING (SAST), 10., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 84-93. DOI: https://doi.org/10.5753/sast.2025.14327.