Industrial Practices of Requirements Engineering for ML-Enabled Systems in Brazil

  • Antonio Pedro Santos Alves, PUC-Rio
  • Marcos Kalinowski, PUC-Rio
  • Daniel Mendez, Blekinge Institute of Technology
  • Hugo Villamizar, PUC-Rio
  • Kelly Azevedo, PUC-Rio
  • Tatiana Escovedo, PUC-Rio
  • Helio Lopes, PUC-Rio

Abstract


[Context] In Brazil, 41% of companies use machine learning (ML) to some extent. However, several challenges have been reported when engineering ML-enabled systems, including unrealistic customer expectations and vagueness in ML problem specifications. The literature suggests that Requirements Engineering (RE) practices and tools may help to alleviate these issues, yet there is insufficient understanding of how RE is applied in practice and how practitioners perceive it. [Goal] This study aims to investigate the application of RE in developing ML-enabled systems in Brazil, creating an overview of current practices, perceptions, and problems in the Brazilian industry. [Method] To this end, we extracted and analyzed data from an international survey focused on ML-enabled systems, concentrating specifically on responses from practitioners based in Brazil. We analyzed the cluster of RE-related answers gathered from 72 practitioners involved in data-driven projects. We conducted quantitative statistical analyses of contemporary practices using bootstrapping with confidence intervals, and qualitative analyses of the reported problems using open and axial coding procedures. [Results] Our findings highlight distinct aspects of RE implementation in ML projects in Brazil. For instance, (i) RE-related tasks are predominantly conducted by data scientists; (ii) the most common techniques for eliciting requirements are interviews and workshop meetings; (iii) interactive notebooks are prevalent in requirements documentation; (iv) practitioners report problems that include a poor understanding of the problem to solve and of the business domain, low customer engagement, and difficulties managing stakeholders' expectations. [Conclusion] These results provide an understanding of RE-related practices in the Brazilian ML industry, helping to guide research and initiatives toward improving the maturity of RE for ML-enabled systems.
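As an illustration of the quantitative method named in the abstract, the following is a minimal sketch of a percentile bootstrap confidence interval for the share of respondents who selected a given survey option. It is not the authors' analysis script: the function name `bootstrap_proportion_ci`, the number of resamples, the confidence level, and the count of 40 positive answers are all assumptions made for the example. Only the sample size of 72 comes from the abstract.

```python
import random


def bootstrap_proportion_ci(responses, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for the share of True values in `responses`.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    rng = random.Random(seed)
    n = len(responses)
    # Resample with replacement and record the proportion in each resample.
    proportions = sorted(
        sum(rng.choices(responses, k=n)) / n for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled shares.
    lower = proportions[int((alpha / 2) * n_resamples)]
    upper = proportions[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    point = sum(responses) / n
    return point, lower, upper


# Hypothetical data: 72 respondents, 40 of whom selected some option.
answers = [True] * 40 + [False] * 32
point, lower, upper = bootstrap_proportion_ci(answers)
print(f"{point:.1%} [{lower:.1%}, {upper:.1%}]")
```

Reporting an interval rather than a single percentage shows how much an observed share could vary under resampling of the same 72 answers.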

Keywords: Requirements Engineering, Machine Learning, Survey, Brazil

Published
30 September 2024
ALVES, Antonio Pedro Santos; KALINOWSKI, Marcos; MENDEZ, Daniel; VILLAMIZAR, Hugo; AZEVEDO, Kelly; ESCOVEDO, Tatiana; LOPES, Helio. Industrial Practices of Requirements Engineering for ML-Enabled Systems in Brazil. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 38., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 224-233. DOI: https://doi.org/10.5753/sbes.2024.3371.