Measuring Trivial and Non-Trivial Refactoring: A Predictive Analysis and Index Proposal
Abstract
This study investigates the relationship between trivial and non-trivial refactorings and proposes a metric for evaluating refactoring triviality. Using supervised learning, we analyzed 1.9M refactorings from 1,291 open-source projects, described by 45 code metrics. We evaluated 5 classification models and 7 regression models under various configurations. Based on these results, we propose a metric built on complexity, speed, and risk, informed by insights from 15 developers on 58 selected features. The results show that separating refactorings by triviality improves predictions and that using all features outperforms using only the features prioritized by developers. Ensemble models outperformed linear ones, and expert perceptions aligned with the model results. These findings support refactoring decisions and highlight opportunities for future research.
References
Abid, C., Gaaloul, K., Kessentini, M., and Alizadeh, V. (2022). What refactoring topics do developers discuss? A large scale empirical study using Stack Overflow. IEEE Access, 10:56362–56374.
Agnihotri, M. and Chug, A. (2020). A systematic literature survey of software metrics, code smells and refactoring techniques. Journal of Information Processing Systems, 16(4):915–934.
Akhtar, S. M., Nazir, M., Ali, A., Khan, A. S., Atif, M., and Naseer, M. (2022). A systematic literature review on software-refactoring techniques, challenges, and practices. VFAST Transactions on Software Engineering, 10(4):93–103.
Almogahed, A., Mahdin, H., Omar, M., Zakaria, N. H., Mostafa, S. A., AlQahtani, S. A., Pathak, P., Shaharudin, S. M., and Hidayat, R. (2023). A refactoring classification framework for efficient software maintenance. IEEE Access, 11:78904–78917.
AlOmar, E. A., Peruma, A., Mkaouer, M. W., Newman, C., Ouni, A., and Kessentini, M. (2021). How we refactor and how we document it? on the use of supervised machine learning algorithms to classify refactoring documentation. Expert Systems with Applications, 167:114176.
Aniche, M., Maziero, E., Durelli, R., and Durelli, V. H. (2020). The effectiveness of supervised machine learning algorithms in predicting software refactoring. IEEE Transactions on Software Engineering, 48(4):1432–1450.
Azeem, M. I., Palomba, F., Shi, L., and Wang, Q. (2019). Machine learning techniques for code smell detection: A systematic literature review and meta-analysis. Information and Software Technology, 108:115–138.
Baqais, A. and Alshayeb, M. (2020). Automatic software refactoring: a systematic literature review. Software Quality Journal, 28(2):459–502.
Bavota, G., De Lucia, A., Di Penta, M., Oliveto, R., and Palomba, F. (2015). An experimental investigation on the innate relationship between quality and refactoring. Journal of Systems and Software, 107:1–14.
Bertrand, G. (1994). Simple points, topological numbers and geodesic neighborhoods in cubic grids. Pattern Recognition Letters, 15(10):1003–1011.
Bibiano, A. C., Coutinho, D., Uchôa, A., Assunção, W. K., Garcia, A., de Mello, R., Colanzi, T. E., Tenório, D., Vasconcelos, A., Fonseca, B., et al. (2024). Enhancing recommendations of composite refactorings based on the practice. In 24th IEEE International Conference on Source Code Analysis and Manipulation (SCAM), pages 1–12. IEEE.
Bibiano, A. C., Uchôa, A., Assunção, W. K., Tenório, D., Colanzi, T. E., Vergilio, S. R., and Garcia, A. (2023). Composite refactoring: Representations, characteristics and effects on software projects. Information and Software Technology, 156:107134.
de Paulo Sobrinho, E. V., De Lucia, A., and de Almeida Maia, M. (2018). A systematic literature review on bad smells–5 w’s: which, when, what, who, where. IEEE Transactions on Software Engineering, 47(1):17–66.
Dehaghani, S. M. H. and Hajrahimi, N. (2013). Which factors affect software projects maintenance cost more? Acta Informatica Medica, 21(1):63.
Ferreira, T., Ivers, J., Yackley, J. J., Kessentini, M., Ozkaya, I., and Gaaloul, K. (2023). Dependent or Not: Detecting and Understanding Collections of Refactorings. IEEE Transactions on Software Engineering, 49(6):3344–3358.
Fowler, M. (2018). Refactoring: improving the design of existing code. Addison-Wesley Professional, Boston, MA, US, 2nd edition.
James, G., Witten, D., Hastie, T., Tibshirani, R., and Taylor, J. (2023). An introduction to statistical learning: With applications in Python. Springer Nature, New York, NY, US, 3rd edition.
Kaur, S. and Singh, P. (2019). How does object-oriented code refactoring influence software quality? research landscape and challenges. Journal of Systems and Software, 157:110394.
Kim, M., Zimmermann, T., and Nagappan, N. (2014). An empirical study of refactoring challenges and benefits at Microsoft. IEEE Transactions on Software Engineering, 40(7):633–649.
Kuhn, M. and Johnson, K. (2013). Applied Predictive Modeling. Springer, New York, NY, US.
Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22(140):1–55.
Liu, J., Jin, W., Zhou, J., Feng, Q., Fan, M., Wang, H., and Liu, T. (2024). 3erefactor: Effective, efficient and executable refactoring recommendation for software architectural consistency. IEEE Transactions on Software Engineering, pages 1–23.
Malhotra, R. and Chug, A. (2016). An empirical study to assess the effects of refactoring on software maintainability. In 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pages 110–117, Jaipur, India. IEEE.
Mens, T. and Tourwé, T. (2004). A survey of software refactoring. IEEE Transactions on Software Engineering, 30(2):126–139.
Moser, R., Abrahamsson, P., Pedrycz, W., Sillitti, A., and Succi, G. (2007). A case study on the impact of refactoring on quality and productivity in an agile team. In IFIP Central and East European Conference on Software Engineering Techniques, pages 252–266, Berlin, Germany. Springer.
Naik, P., Nelaballi, S., Pusuluri, V. S., and Kim, D.-K. (2023). Deep learning-based code refactoring: A review of current knowledge. Journal of Computer Information Systems, 64(2):314–328.
Nikolaidis, N., Mittas, N., Ampatzoglou, A., Feitosa, D., and Chatzigeorgiou, A. (2024). A metrics-based approach for selecting among various refactoring candidates. Empirical Software Engineering, 29(1):25.
Nyamawe, A. S. (2022). Mining commit messages to enhance software refactorings recommendation: A machine learning approach. Machine Learning with Applications, 9:100316.
Opdyke, W. F. (1992). Refactoring Object-Oriented Frameworks. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, IL, US.
Ouni, A., Kessentini, M., Bechikh, S., and Sahraoui, H. (2015). Prioritizing code-smells correction tasks using chemical reaction optimization. Software Quality Journal, 23(2):323–361.
Palomba, F., Zaidman, A., Oliveto, R., and De Lucia, A. (2017). An exploratory study on the relationship between changes and refactoring. In 2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC), pages 176–185, Buenos Aires, Argentina. IEEE.
Pinheiro, D., Bezerra, C., and Uchôa, A. (2024). On the effectiveness of trivial refactorings in predicting non-trivial refactorings. Journal of Software Engineering Research and Development, 12(1):5–1.
Pinheiro, D., Bezerra, C. I. M., and Uchôa, A. (2022). How do trivial refactorings affect classification prediction models? In Proceedings of the 16th Brazilian Symposium on Software Components, Architectures, and Reuse, pages 81–90, New York, NY, US. Association for Computing Machinery.
Sharma, T., Suryanarayana, G., and Samarthyam, G. (2015). Challenges to and solutions for refactoring adoption: An industrial perspective. IEEE Software, 32(6):44–51.
Silva, D., Tsantalis, N., and Valente, M. T. (2016). Why we refactor? Confessions of GitHub contributors. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 858–870, New York, NY, USA. Association for Computing Machinery.
Tan, A. J. J., Chong, C. Y., and Aleti, A. (2024). Rearrange: Effort estimation approach for software clustering-based remodularisation. Information and Software Technology, 176:107567.
Tsantalis, N., Chaikalis, T., and Chatzigeorgiou, A. (2018). Ten years of JDeodorant: Lessons learned from the hunt for smells. In 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 4–14, Campobasso, Italy. IEEE.
Zarnekow, R. and Brenner, W. (2005). Distribution of cost over the application lifecycle - A multi-case study. ECIS 2005 Proceedings, page 26.
Published
September 22, 2025
How to Cite
PINHEIRO, Darwin; BEZERRA, Carla; UCHÔA, Anderson. Measuring Trivial and Non-Trivial Refactoring: A Predictive Analysis and Index Proposal. In: CONCURSO DE TESES E DISSERTAÇÕES EM ENGENHARIA DE SOFTWARE (MESTRADO) - CONGRESSO BRASILEIRO DE SOFTWARE: TEORIA E PRÁTICA (CBSOFT), 16., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 3-17. DOI: https://doi.org/10.5753/cbsoft_estendido.2025.12268.
