DOI: 10.1145/3439961.3439991

Deployment of a Machine Learning System for Predicting Lawsuits Against Power Companies: Lessons Learned from an Agile Testing Experience for Improving Software Quality

Published: 06 March 2021

Abstract

The advances in Machine Learning (ML) require software organizations to evolve their development processes in order to improve the quality of ML systems. Within the software development process, the testing stage of an ML system is especially critical, because data validation, trained-model quality evaluation, and model validation must be added to the traditional unit, integration, and system tests. In this paper, we report the lessons learned from using model testing and exploratory testing within the agile development process of an ML system that predicts lawsuit proneness in energy supply companies. Throughout the project, the Scrum agile methodology was applied, and activities were defined for both the development of the ML model and the development of the end-user application. After the testing process of the ML model, we achieved 93.89% accuracy, 95.58% specificity, 88.84% sensitivity, and 87.09% precision. Furthermore, we addressed the quality in use of the application embedding the ML model by carrying out exploratory testing. As a result, over several iterations, different types of defects were identified and corrected. Our lessons learned support software engineers who want to develop ML systems that consider both the ML model and the end-user application.
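The four model-quality figures reported in the abstract are standard binary-classification metrics. As a minimal sketch of how such figures are typically derived from a confusion matrix, the Python below computes them from raw counts. The `ConfusionMatrix` class, the `classification_metrics` function, and the example counts are hypothetical illustrations, not the authors' code or data, and treating "lawsuit-prone" as the positive class is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # positives predicted as positive (assumed: lawsuit-prone customers)
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, specificity, sensitivity, and precision as fractions.

    Assumes every denominator is non-zero, i.e. both classes occur in the
    data and the model predicts the positive class at least once.
    """
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return {
        "accuracy": (cm.tp + cm.tn) / total,     # all correct / all cases
        "specificity": cm.tn / (cm.tn + cm.fp),  # true negative rate
        "sensitivity": cm.tp / (cm.tp + cm.fn),  # true positive rate (recall)
        "precision": cm.tp / (cm.tp + cm.fp),    # correct among predicted positives
    }


# Illustrative counts only; they do not reproduce the paper's results.
example = ConfusionMatrix(tp=80, fp=10, tn=190, fn=20)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Reporting specificity and sensitivity alongside accuracy is useful when the classes are imbalanced, since accuracy alone can look high even if the minority class is predicted poorly.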

Cited By

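  • (2024) Teaching Machine Learning as Part of Agile Software Engineering. IEEE Transactions on Education 67:3 (377-386). DOI: 10.1109/TE.2023.3337343. Online publication date: Jun-2024
  • (2023) The pipeline for the continuous development of artificial intelligence models: Current state of research and practice. Journal of Systems and Software 199:C. DOI: 10.1016/j.jss.2023.111615. Online publication date: 1-May-2023
  • (2022) A software engineering perspective on engineering machine learning systems. Journal of Systems and Software 180:C. DOI: 10.1016/j.jss.2021.111031. Online publication date: 22-Apr-2022
  • (2021) Lessons Learned from Applying Requirements and Design Techniques in the Development of a Machine Learning System for Predicting Lawsuits Against Power Companies. Human Interface and the Management of Information. Information Presentation and Visualization (227-243). DOI: 10.1007/978-3-030-78321-1_18. Online publication date: 3-Jul-2021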

Published In

SBQS '20: Proceedings of the XIX Brazilian Symposium on Software Quality
December 2020
430 pages
ISBN: 9781450389235
DOI: 10.1145/3439961

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Software Processes, Methods, and Tools
  2. Verification, Validation, and Testing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • The authors thank Equatorial Energy for the financial support provided through the National Electric Energy Agency (ANEEL) Research and Development Program (R&D), PD-00037-0031/2019.

Conference

SBQS'20: 19th Brazilian Symposium on Software Quality
December 1 - 4, 2020
São Luís, Brazil

Acceptance Rates

Overall Acceptance Rate 35 of 99 submissions, 35%
