An Exploratory Evaluation of Continuous Feedback to Enhance Machine Learning Code Smell Detection

  • Daniel Cruz UFMG
  • Amanda Santana UFMG
  • Eduardo Figueiredo UFMG


Code smells are symptoms of poor design choices in source code. Several code smell detection tools and strategies have been proposed over the years, including ones based on machine learning algorithms. However, we lack empirical evidence on how expert feedback could improve machine-learning-based detection of code smells. This paper proposes and evaluates a conceptual strategy to improve machine-learning detection of code smells by means of continuous feedback. To evaluate the strategy, we follow an exploratory evaluation design that compares the results of smell detection before and after feedback provided by a service acting as a software expert. We focus on four code smells (God Class, Long Method, Feature Envy, and Refused Bequest) detected in 20 Java systems. Our results show that continuous feedback improves the performance of code smell detection. For the class-level code smells, God Class and Refused Bequest, we achieved average F1 improvements of 0.13 and 0.58, respectively, after 50 iterations of feedback. For the method-level code smells, Long Method and Feature Envy, the F1 improvements were 0.66 and 0.72, respectively.
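
The general shape of such a feedback loop can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The one-metric `ThresholdDetector`, the synthetic lines-of-code data from `make_instances`, and the batch size are all hypothetical stand-ins. Revealing the hidden true label plays the role of the expert service. The sketch only shows how F1 on a held-out set can be tracked while a detector is retrained after each round of feedback.

```python
import random


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


class ThresholdDetector:
    """Toy detector (an assumption): flags an instance as smelly when a
    single metric, e.g. lines of code, reaches a learned threshold."""

    def __init__(self):
        self.threshold = float("inf")

    def fit(self, values, labels):
        if not any(labels):
            # No positive example seen yet: flag nothing.
            self.threshold = float("inf")
            return
        # Pick the candidate threshold with the best F1 on the labeled data.
        candidates = sorted(set(values))
        self.threshold = max(
            candidates,
            key=lambda t: f1_score(labels, [v >= t for v in values]),
        )

    def predict(self, values):
        return [v >= self.threshold for v in values]


def make_instances(rng, n):
    """Synthetic (metric, is_smelly) pairs; purely illustrative data."""
    instances = []
    for _ in range(n):
        smelly = rng.random() < 0.3
        loc = rng.gauss(300, 80) if smelly else rng.gauss(100, 60)
        instances.append((loc, smelly))
    return instances


def feedback_loop(pool, test, seed_size=10, batch=5, iterations=50):
    """Retrain after each feedback round and record F1 on the test set.

    history[0] is the F1 before any feedback; history[i] is the F1 after
    i rounds. The 'expert' here simply reveals the hidden true label.
    """
    labeled = list(pool[:seed_size])
    remaining = list(pool[seed_size:])
    detector = ThresholdDetector()
    history = []
    for _ in range(iterations + 1):
        detector.fit([x for x, _ in labeled], [y for _, y in labeled])
        preds = detector.predict([x for x, _ in test])
        history.append(f1_score([y for _, y in test], preds))
        # Simulated expert feedback: the next batch gets its true labels.
        reviewed, remaining = remaining[:batch], remaining[batch:]
        labeled.extend(reviewed)
    return history


rng = random.Random(0)
history = feedback_loop(make_instances(rng, 300), make_instances(rng, 100))
print(f"F1 before feedback: {history[0]:.2f}, after 50 rounds: {history[-1]:.2f}")
```

Comparing `history[0]` with `history[-1]` mirrors the before-and-after comparison described above. A realistic setup would replace the toy detector with a trained classifier over many code metrics.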


CRUZ, Daniel; SANTANA, Amanda; FIGUEIREDO, Eduardo. An Exploratory Evaluation of Continuous Feedback to Enhance Machine Learning Code Smell Detection. In: CONGRESSO IBERO-AMERICANO EM ENGENHARIA DE SOFTWARE (CIBSE), 27., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 76-90. DOI: