Promise+: expanding the Promise_exp software requirements dataset

  • Bruno Silva UFMA
  • Rodrigo Nascimento UFMA
  • Luis Rivero UFMA
  • Geraldo Braz UFMA
  • Rodrigo Pereira dos Santos UNIRIO
  • Luiz E. G. Martins Unifesp
  • Davi Viana UFMA

Abstract


Classifying software requirements is one of the activities of the requirements analysis phase and is fundamental to understanding the software to be built. Performing this classification manually is difficult, time-consuming, and error-prone. For this reason, work in the literature proposes supervised machine learning algorithms to automate the task. The datasets most commonly used for this purpose are PROMISE and PROMISE_exp. However, previous studies have identified issues such as the limited number of requirements and the lack of diversity in the existing datasets. These limitations negatively affect the performance of machine learning algorithms in requirements classification. This work presents a new expansion of the requirements dataset, with requirements classified by experts, and evaluates it through the performance of six machine learning algorithms. The expansion, named Promise+, represents an increase of almost 280% over PROMISE_exp. In the binary classification task, Promise+ improved the identification of functional requirements. In the multiclass tasks, most algorithms trained with Promise+ performed better on more classes of non-functional requirements. Finally, Promise+ will be made available to the entire Software Engineering community.

Keywords: Requirements Engineering, Repository, Dataset, Machine Learning

Published
30 September 2024
SILVA, Bruno; NASCIMENTO, Rodrigo; RIVERO, Luis; BRAZ, Geraldo; SANTOS, Rodrigo Pereira dos; MARTINS, Luiz E. G.; VIANA, Davi. Promise+: expandindo a base de dados de requisitos de software Promise_exp. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 38., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 291-301. DOI: https://doi.org/10.5753/sbes.2024.3427.