Realizing Refactoring Prediction through Deep Learning

Lucas Rafael Rodrigues Pereira; Dilson Lucas Pereira; Rafael Serapilha Durelli

doi:10.5753/ise.2023.235749

Lucas Rafael Rodrigues Pereira UFLA
Dilson Lucas Pereira UFLA
Rafael Serapilha Durelli UFLA

Abstract

Refactoring is the process of changing the internal structure of software to improve its quality without modifying its behavior. Recent studies have shown that refactoring makes the code, and the system as a whole, easier to maintain and understand. In practice, however, refactoring is still little used, and expertise and intuition remain the main factors that determine when software is refactored. Before starting the refactoring process, an analysis is essential to check whether refactoring is really necessary. The present study therefore analyzes artificial intelligence techniques, specifically Deep Learning, to predict when software refactoring is needed. Deep Learning models such as CNN, RNN, LSTM and DenseLayer were analyzed and compared using precision, recall and accuracy metrics. The results showed that Machine Learning models performed better than Deep Learning models on the same data set. However, the Deep Learning models performed notably well in scenarios where the data is highly imbalanced.

Keywords: Refactoring, Deep Learning, Software Engineering
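
The abstract compares models by precision, recall and accuracy and singles out imbalanced data. The sketch below is only an illustration of why the three metrics are reported together in that setting; it is not the study's evaluation code. The labels, the two toy predictors and the function names are all hypothetical, with 1 standing for "needs refactoring" and 0 for "does not".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true, y_pred):
    """Compute the three metrics named in the abstract from raw predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    # Guard against division by zero when a model never predicts the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Hypothetical, highly imbalanced sample: 5 positives out of 100 items.
y_true = [1] * 5 + [0] * 95

# Toy predictor A (assumption): always answers "no refactoring needed".
always_negative = [0] * 100

# Toy predictor B (assumption): finds 4 of the 5 positives, with 2 false alarms.
detector = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

for name, y_pred in [("always_negative", always_negative), ("detector", detector)]:
    p, r, a = precision_recall_accuracy(y_true, y_pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

On this made-up sample the trivial predictor reaches an accuracy of 0.95 while its precision and recall are both 0, which is why accuracy alone says little when one class dominates.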
