Multilingual Crowd-Based Requirements Engineering Using Large Language Models

  • Arthur Pilone USP
  • Paulo Meirelles USP
  • Fabio Kon USP
  • Walid Maalej Universität Hamburg

Abstract

A central challenge in ensuring the success of software projects is aligning developers’ and users’ views. While the large amounts of user data available from social media, app store reviews, and support channels offer many benefits, it remains unclear how software development teams can use this data effectively. We present an LLM-powered approach called DeeperMatcher that helps agile teams use crowd-based requirements engineering (CrowdRE) in their issue and task management. We are currently implementing a command-line tool that enables developers to match issues with relevant user reviews. We validated our approach on an existing English dataset from a well-known open-source project. Additionally, to check how well DeeperMatcher works for other languages, we conducted a single-case mechanism experiment alongside developers of a local project that has issues and user feedback in Brazilian Portuguese. Our preliminary analysis indicates that the accuracy of our approach is highly dependent on the text embedding method used. We discuss further refinements needed for reliable crowd-based requirements engineering with multilingual support.
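
To make the idea of matching issues with user reviews concrete, the following is a minimal sketch, not DeeperMatcher's implementation. It assumes that matching is done by ranking reviews by cosine similarity to an issue, and it replaces the text embedding with a toy bag-of-words vector so the example runs with the standard library only. The names `embed`, `cosine`, `match_reviews`, the `threshold` value, and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a text embedding: a bag-of-words count vector (assumption)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def match_reviews(
    issue: str, reviews: list[str], threshold: float = 0.2
) -> list[tuple[str, float]]:
    """Return (review, score) pairs with score >= threshold, best match first."""
    issue_vec = embed(issue)
    scored = [(review, cosine(issue_vec, embed(review))) for review in reviews]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, invented for illustration only.
issue = "App crashes when uploading a photo"
reviews = [
    "The app crashes every time I try uploading a photo",
    "Love the new dark theme",
    "O aplicativo fecha quando envio uma foto",
]
matches = match_reviews(issue, reviews)
for review, score in matches:
    print(f"{score:.2f}  {review}")
```

In this sketch only the first English review is matched. The Portuguese review describes a similar problem but shares no tokens with the English issue, so the toy vectors miss it. This illustrates, in a simplified way, why the choice of embedding method matters for multilingual feedback; a multilingual sentence embedding model would be one possible replacement for `embed`.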

Keywords: Large Language Models, Foundation Models, Natural Language Processing, Requirements Engineering, User Feedback Mining

Published
30/09/2024
PILONE, Arthur; MEIRELLES, Paulo; KON, Fabio; MAALEJ, Walid. Multilingual Crowd-Based Requirements Engineering Using Large Language Models. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 38., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 679-685. DOI: https://doi.org/10.5753/sbes.2024.3646.