Sonora: An Autonomous Analyst for Anti-Money Laundering Based on Explainable Artificial Intelligence

  • Vlademir Lenin Donato Batista ITA
  • Paulo André Lima de Castro ITA

Abstract


This paper presents the development and evaluation of Sonora, an autonomous analyst based on explainable artificial intelligence (XAI) and designed to support Anti-Money Laundering (AML) monitoring in financial institutions. Using real-world, imbalanced data from a Brazilian financial institution, we developed a predictive model to identify suspicious cases. The evaluation proceeded in stages. First, Gradient Boosting was selected as a robust baseline over other standard classifiers. It was then benchmarked against XGBoost, an alternative chosen from a pool of state-of-the-art boosting algorithms. After hyperparameter optimization of both finalists, a threshold-based analysis confirmed that Gradient Boosting performed best under a recall-focused strategy, achieving a recall of approximately 96%. We deliberately prioritize recall over accuracy to minimize regulatory risk and to ensure that critical cases are surfaced for human review. To foster institutional trust, Sonora integrates instance-level SHAP explanations that highlight the most influential features in each classification. It acts as a decision-support tool, not a replacement for human analysts.
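
The abstract does not spell out how the threshold-based analysis was carried out. As a minimal sketch of the general idea only, the Python below sweeps candidate decision thresholds over model scores and keeps the highest one that still meets a target recall. The function names, the target value, and the tiny synthetic dataset are all assumptions made for illustration; none of them come from the paper or its data.

```python
from typing import List, Tuple


def recall_precision_at(
    y_true: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Recall and precision when every case with score >= threshold is flagged."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def highest_threshold_for_recall(
    y_true: List[int], scores: List[float], target_recall: float
) -> float:
    """Return the largest observed score that, used as threshold, meets the target recall."""
    # Walk thresholds from strictest to most permissive; recall can only grow.
    for threshold in sorted(set(scores), reverse=True):
        recall, _ = recall_precision_at(y_true, scores, threshold)
        if recall >= target_recall:
            return threshold
    raise ValueError("no threshold reaches the target recall")


# Synthetic, imbalanced toy data (hypothetical): 3 suspicious cases out of 12.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

threshold = highest_threshold_for_recall(y_true, scores, target_recall=1.0)
recall, precision = recall_precision_at(y_true, scores, threshold)
print(f"threshold={threshold:.2f} recall={recall:.2f} precision={precision:.2f}")
```

Lowering the threshold until the recall target is met trades precision for coverage: more non-suspicious cases reach human review, which matches the recall-focused priority the abstract describes.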

Published
2025-09-29
BATISTA, Vlademir Lenin Donato; CASTRO, Paulo André Lima de. Sonora: An Autonomous Analyst for Anti-Money Laundering Based on Explainable Artificial Intelligence. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 285-296. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.12402.