Supervised Machine Learning for Tax Evasion Detection: A Case Study with the Brazilian Tax Administration

  • Cleyton Andre Pires, Universidade Federal de Santa Catarina (UFSC)

Abstract


This study presents an approach to improving audit case selection at the Brazilian Tax Authority (RFB) using Artificial Intelligence techniques. We use supervised learning algorithms to predict taxpayers’ annual income and combine them with outlier detection techniques to prioritize the cases of greatest fiscal interest. Both steps draw on a comprehensive dataset of socioeconomic variables available to the Tax Administration. The methodology emphasizes model explainability, to support fairness and compliance with legal and ethical requirements. Preliminary results are promising and suggest that the model can complement the existing rule-based selection system.
Keywords: Tax Evasion, Machine Learning, Supervised Learning, Outlier Detection, Explainability, RFB
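
The abstract describes a two-step pipeline (predict income, then flag outliers) without naming the model or the outlier rule. The sketch below is only an illustration under stated assumptions: ordinary least squares stands in for the supervised learner, a median/MAD robust z-score stands in for the outlier detector, and the data, feature count, and function names are all hypothetical and synthetic. It is not the method used in the paper.

```python
import numpy as np

def fit_income_model(X, y):
    """Fit ordinary least squares with an intercept (stand-in supervised model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_income(coef, X):
    """Predict income from features using the fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def robust_z(values):
    """Robust z-score based on the median and the median absolute deviation."""
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    return (values - med) / (1.4826 * mad)

def rank_cases(declared, predicted, threshold=3.5):
    """Indices whose declared income falls unusually far below the prediction,
    ordered from most to least extreme."""
    gap = predicted - declared
    z = robust_z(gap)
    flagged = np.flatnonzero(z > threshold)
    return flagged[np.argsort(-z[flagged])]

# Synthetic data: three hypothetical standardized socioeconomic features.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
income = 50_000 + X @ np.array([12_000.0, 8_000.0, 5_000.0])
income += rng.normal(scale=2_000, size=n)

# Simulate under-reporting for three hypothetical taxpayers.
declared = income.copy()
under_reporters = np.array([7, 42, 99])
declared[under_reporters] -= 30_000

coef = fit_income_model(X, declared)
predicted = predict_income(coef, X)
priority = rank_cases(declared, predicted)
print(priority)
```

With a linear stand-in model, each feature's contribution to a prediction is simply its coefficient times its value, which gives a crude form of the explainability the abstract calls for; a nonlinear model would need a dedicated attribution method.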

Published
14/10/2024
PIRES, Cleyton Andre. Supervised Machine Learning for Tax Evasion Detection: A Case Study with the Brazilian Tax Administration. In: WORKSHOP ON DATA SCIENCE AGAINST CORRUPTION IN THE PUBLIC SECTOR (DS-COPS) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 288-294. DOI: https://doi.org/10.5753/sbbd_estendido.2024.244262.