On the test smells detection: an empirical study on the JNose Test accuracy





Tests Quality, Test Evolution, Test Smells, Evidence-based Software Engineering


Several strategies support the measurement and analysis of test quality. For example, code coverage, a widely used one, makes it possible to verify that test cases cover as many source code branches as possible. Other affordable strategies for evaluating test code quality exist, such as test smells analysis. Test smells are poor design choices in the implementation of test code, and their occurrence might reduce test suite quality. Practical, large-scale identification of test smells depends on automated tool support; without it, test smells analysis could become cost-ineffective. In an earlier study, we proposed JNose Test, a tool that automatically detects test smells and analyzes test suite quality from the test smells perspective. This study extends the previous one in two directions: i) we implemented JNose-Core, an API encompassing the test smells detection rules, whose extensible architecture allows the tool to accommodate new detection rules or programming languages; and ii) we performed an empirical study to evaluate the effectiveness of JNose Test and compare it against the state-of-the-art tool, tsDetect. Results showed that the JNose-Core precision score ranges from 91% to 100%, and the recall score from 89% to 100%. JNose-Core also showed a slight improvement in the test smells detection rules compared to tsDetect for detection at the class level.
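For readers less familiar with the two accuracy metrics reported above, the following is a minimal sketch of how precision and recall can be computed for a smell detector against a manually validated oracle. It is not the evaluation code used in the study: the function name `precision_recall`, the class names, and the sample data are hypothetical and chosen only for illustration, and returning 0.0 for an empty set is an assumed convention.

```python
def precision_recall(detected, actual):
    """Return (precision, recall) of detected items against a ground-truth oracle.

    precision = true positives / everything the tool reported
    recall    = true positives / everything in the oracle
    Assumed convention: a metric with an empty denominator is reported as 0.0.
    """
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical oracle: (test class, smell) pairs confirmed by manual inspection.
oracle = {
    ("FooTest", "Assertion Roulette"),
    ("FooTest", "Empty Test"),
    ("BarTest", "Magic Number Test"),
    ("BarTest", "Duplicate Assert"),
    ("BazTest", "Sleepy Test"),
}

# Hypothetical tool output: three correct reports and one false positive.
tool_output = {
    ("FooTest", "Assertion Roulette"),
    ("FooTest", "Empty Test"),
    ("BarTest", "Magic Number Test"),
    ("BazTest", "Empty Test"),  # not in the oracle, so a false positive
}

precision, recall = precision_recall(tool_output, oracle)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example the tool reports four smells, three of which are in the oracle of five, so precision is 3/4 and recall is 3/5. A high precision means few false alarms, while a high recall means few missed smells.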








How to Cite

Virgínio, T., Martins, L., Santana, R., Cruz, A., Rocha, L., Costa, H., & Machado, I. (2021). On the test smells detection: an empirical study on the JNose Test accuracy. Journal of Software Engineering Research and Development, 9(1), 8:1 – 8:14. https://doi.org/10.5753/jserd.2021.1893



Research Article