Evaluating Machine Learning Algorithms for Software Effort Estimation from Textual Descriptions of Requirements
Abstract
Effort estimation in software development is challenging, subjective, and time-consuming, and it directly affects planning and resource allocation. This work investigated the effectiveness of Machine Learning (ML) algorithms and deep neural network architectures for automatically estimating development effort from textual requirements descriptions. Traditional ML algorithms with TF-IDF representations and BERT-based models were evaluated on six public datasets. The experiments measured model performance with the mean absolute error (MAE) in two scenarios: intra-dataset and inter-dataset. BERT-based models outperformed the traditional algorithms in the intra-dataset scenario but suffered a more pronounced performance drop in the inter-dataset scenario.
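The evaluation protocol summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "intra-dataset" means training and testing on items from the same project and "inter-dataset" means training on one project and testing on another. The class name TfidfKnnRegressor, the helper evaluate, and the toy requirement texts and effort values are all hypothetical, and a 1-nearest-neighbor regressor over TF-IDF vectors stands in for whichever traditional algorithms were actually used.

```python
import math
from collections import Counter


def mean_absolute_error(y_true, y_pred):
    """MAE: average of |actual - predicted| over all items."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


class TfidfKnnRegressor:
    """Toy stand-in for a 'traditional ML + TF-IDF' model (an assumption)."""

    def fit(self, texts, efforts):
        docs = [t.lower().split() for t in texts]
        n = len(docs)
        # Document frequency: in how many training texts each token occurs.
        df = Counter(tok for doc in docs for tok in set(doc))
        # Smoothed inverse document frequency.
        self.idf = {tok: math.log((1 + n) / (1 + c)) + 1 for tok, c in df.items()}
        self.vectors = [self._vectorize(doc) for doc in docs]
        self.efforts = list(efforts)
        return self

    def _vectorize(self, tokens):
        # Tokens unseen during fit are ignored.
        tf = Counter(tok for tok in tokens if tok in self.idf)
        vec = {tok: cnt * self.idf[tok] for tok, cnt in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        return {tok: v / norm for tok, v in vec.items()}

    def predict(self, texts):
        preds = []
        for text in texts:
            q = self._vectorize(text.lower().split())
            # Cosine similarity (vectors are already L2-normalized).
            sims = [sum(w * vec.get(tok, 0.0) for tok, w in q.items())
                    for vec in self.vectors]
            best = max(range(len(sims)), key=sims.__getitem__)
            preds.append(self.efforts[best])  # effort of the nearest neighbor
        return preds


def evaluate(train, test):
    """Fit on (text, effort) pairs in `train`; return MAE on `test`."""
    model = TfidfKnnRegressor().fit([t for t, _ in train], [e for _, e in train])
    return mean_absolute_error([e for _, e in test],
                               model.predict([t for t, _ in test]))


# Hypothetical toy data: (requirement text, effort) pairs for two projects.
project_a = [
    ("add login page with password reset", 5),
    ("fix typo in login page label", 1),
    ("add password reset email template", 3),
    ("fix typo in footer label", 1),
]
project_b = [
    ("migrate database schema to new cluster", 8),
    ("fix typo in migration script comment", 1),
]

# Intra-dataset: train and test on disjoint parts of the same project.
intra_mae = evaluate(project_a[:3], project_a[3:])
# Inter-dataset: train on one project, test on a different one.
inter_mae = evaluate(project_a, project_b)
print(f"intra-dataset MAE: {intra_mae:.2f}")
print(f"inter-dataset MAE: {inter_mae:.2f}")
```

Keeping the model behind fit and predict means the same evaluate helper could score any other estimator under both scenarios, which is the comparison the abstract describes.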
