On the test smells detection: an empirical study on the JNose Test accuracy


  • Tássio Virgínio, IFTO
  • Luana Martins, UFBA
  • Railana Santana, UFBA
  • Adriana Cruz, UFLA
  • Larissa Rocha, UFBA
  • Heitor Costa, UFLA
  • Ivan Machado, UFBA




Keywords: Tests Quality, Test Evolution, Test Smells, Evidence-based Software Engineering


Several strategies have been proposed for measuring and analyzing test quality. Code coverage is likely the most widely used one: it makes it possible to verify the ability of a test case to cover as many source code branches as possible. Although code coverage remains widely used, novel strategies have recently been employed. Test smells analysis, for example, has been introduced as an affordable strategy to evaluate test code quality. Test smells are poor design choices in implementation, and their occurrence in test code might reduce the quality of test suites. Test smells identification depends heavily on tool support; without it, the strategy could become cost-ineffective. In an earlier study, we proposed the JNose Test, a tool to analyze test suite quality from the test smells perspective. The JNose Test detects twenty-one types of test smells across software versions. This study extends the previous one in two directions: i) the test smells detection rules were extracted into an API, named JNose-Core, which provides an extensible architecture for implementing new detection rules or supporting new programming languages; and ii) we performed an empirical study to evaluate the tool's effectiveness in detecting test smells and to compare the JNose Test with the state-of-the-art tool, tsDetect. Results showed that the JNose-Core precision score ranges from 91% to 100%, and the recall score from 89% to 100%. JNose-Core also showed a slight improvement over tsDetect in its detection rules for test smells at the class level.
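
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how precision and recall can be computed when a tool's detections are compared against a manually validated oracle. The function name, the representation of a detection as a (test class, smell type) pair, and the sample data are illustrative assumptions; they are not part of the JNose Test, JNose-Core, or the study's dataset.

```python
def precision_recall(detected, oracle):
    """Return (precision, recall) of `detected` measured against `oracle`.

    Both arguments are sets of hypothetical (test_class, smell_type) pairs.
    """
    true_positives = len(detected & oracle)
    # Convention assumed here: an empty denominator yields 1.0.
    precision = true_positives / len(detected) if detected else 1.0
    recall = true_positives / len(oracle) if oracle else 1.0
    return precision, recall


# Illustrative data only; these are not results from the study.
oracle = {
    ("FooTest", "Assertion Roulette"),
    ("FooTest", "Magic Number Test"),
    ("BarTest", "Empty Test"),
    ("BarTest", "Sleepy Test"),
}
detected = {
    ("FooTest", "Assertion Roulette"),
    ("FooTest", "Magic Number Test"),
    ("BarTest", "Empty Test"),
    ("BazTest", "Ignored Test"),
}

precision, recall = precision_recall(detected, oracle)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch, precision is the share of reported smells that the oracle confirms, and recall is the share of oracle smells that the tool reports.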








How to Cite

Virgínio, T., Martins, L., Santana, R., Cruz, A., Rocha, L., Costa, H., & Machado, I. (2021). On the test smells detection: an empirical study on the JNose Test accuracy. Journal of Software Engineering Research and Development, 9(1), 8:1–8:14. https://doi.org/10.5753/jserd.2021.1893



Research Article