Machine Learning Model: Perspectives for quality, observability, risk and continuous monitoring
Abstract
The transition of machine learning (ML) and artificial intelligence (AI) projects from experimental stages to fully operational solutions presents substantial challenges. This is especially true for applications where these technologies play a critical role, demanding high-quality, reliable, and observable ML models. This paper explores the crucial aspects of continuous monitoring of ML models and emphasizes the need for a comprehensive approach that goes beyond technical development. It argues that ensuring the reliability and robustness of deployed ML models requires a multifaceted framework encompassing data governance, model lifecycle management, and thorough team training. The paper addresses key aspects such as model quality, risk management, and the role of observability in maintaining model stability and reliability in production environments. Using Itaú Unibanco as a case study, it showcases a robust model risk management approach built on a dual monitoring system: an independent validation team oversees the riskier models, while smaller models are monitored by their own development teams. The paper concludes by emphasizing the significance of a robust Model Risk Management (MRM) framework in the evolving landscape of AI and ML, particularly as these technologies become deeply integrated into business operations, and it highlights that Itaú Unibanco's rigorous approach to model quality, observability, low risk, and continuous integration aligns with the regulatory requirements set by the Central Bank of Brazil.
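The kind of continuous monitoring and observability discussed above is commonly operationalized with distribution-stability checks on model inputs or output scores. The sketch below is a generic illustration only, not code from the paper or from Itaú Unibanco's platform: it computes the population stability index (PSI) between a reference score sample and a production window and maps it to an alert tier. The function name, thresholds, and synthetic data are illustrative assumptions.

```python
import numpy as np

def population_stability_index(expected, observed, bins=10):
    """PSI between a reference sample (e.g., validation-time scores)
    and a production sample, using quantile bins from the reference."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip production values into the reference range so every value falls in a bin.
    observed = np.clip(observed, edges[0], edges[-1])

    expected_counts, _ = np.histogram(expected, bins=edges)
    observed_counts, _ = np.histogram(observed, bins=edges)

    # Proportions with a small floor to avoid log(0) / division by zero.
    expected_pct = np.clip(expected_counts / len(expected), 1e-6, None)
    observed_pct = np.clip(observed_counts / len(observed), 1e-6, None)

    return float(np.sum((observed_pct - expected_pct) * np.log(observed_pct / expected_pct)))

# Synthetic example: compare validation-time scores with a drifted production window.
rng = np.random.default_rng(0)
reference_scores = rng.beta(2, 5, size=10_000)   # scores observed at validation time
production_scores = rng.beta(3, 4, size=10_000)  # scores from the live model

psi = population_stability_index(reference_scores, production_scores)
if psi > 0.25:    # rule-of-thumb thresholds; in practice tuned per model and risk tier
    print(f"PSI={psi:.3f}: significant drift, escalate for independent review")
elif psi > 0.10:
    print(f"PSI={psi:.3f}: moderate drift, keep under closer observation")
else:
    print(f"PSI={psi:.3f}: score distribution stable")
```

A score-level stability check of this kind is a typical first-line observability signal, since it can run continuously even before ground-truth labels arrive; performance metrics and model-specific risk indicators would complement it once outcomes are known.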