Applying DevOps to Machine Learning Processes: A Systematic Mapping
Abstract
DevOps practices have been increasingly adopted by software engineering teams to improve the stages of development. In processes that involve machine learning (ML), DevOps can also be applied to deploy machine learning models to production, a practice also known as MLOps. This systematic mapping aims to understand how DevOps has been applied to machine learning processes and which challenges are faced. Fifteen articles were selected, and most of them were found to use CI/CD practices and to propose architectures for deploying ML models. The main challenges identified are the inherent characteristics of ML models and resistance to change.