Towards a New MLOps Architecture: A Methodological Approach Driven by Business and Scientific Requirements

Diego Nogare; Ismar Frango Silveira; Leandro Augusto Silva

doi:10.5753/latinoware.2025.16455

Diego Nogare UPM http://orcid.org/0000-0003-0796-9431
Ismar Frango Silveira UPM https://orcid.org/0000-0001-8029-072X
Leandro Augusto Silva UPM https://orcid.org/0000-0002-8671-3102

Abstract

This article proposes a conceptual model for Machine Learning Operations (MLOps) pipelines that addresses current challenges across the entire lifecycle of machine learning models and meets the growing demands of both academia and industry. The model is based on a hybrid research approach that combines a rigorous analysis of the scientific literature with insights from professionals in the field, and it integrates advanced automation, robust governance, intelligent data and model management, and explainable monitoring. We explore the convergence between theory and practice, identify gaps, and propose an approach that promotes the scalability, reproducibility, and reliability of ML systems in complex and dynamic production environments. The model addresses the critical challenges of automation, data and model management, monitoring, governance, and usability, aligning research ambitions with operational needs. Applying the MLOps architecture produced measurable efficiency gains and a perceived improvement in the scalability, reproducibility, and reliability of ML systems. The deployment time of Machine Learning models fell from approximately 6 months to between 3 and 5 days, depending on the team’s maturity and the application’s purpose. Productivity and operational standardization also increased, and gains in scalability and efficiency were evidenced by the elimination of the model deployment queue, the migration of over 3,200 users to the new environment, and the publication of more than 100 Data Science models in the first few months of the new environment’s operation. Additionally, the transition to a cloud infrastructure optimized costs and financial resources compared to the previous on-premises solution, and it strengthened governance and security through the execution of standardized pipelines.

Keywords: MLOps, Methodological Architecture, Model Experimentation, Model Deployment, Model Monitoring
