Hierarchical Classification of Financial Transactions Through Context-Fusion of Transformer-based Embeddings and Taxonomy-aware Attention Layer
Abstract
This work proposes the Two-headed DragoNet, a Transformer-based model for hierarchical multi-label classification of financial transactions. The model uses a stack of Transformer encoder layers to generate contextual embeddings from two short textual descriptors (merchant name and business activity), followed by a Context Fusion layer and two output heads that classify each transaction according to a two-level hierarchical taxonomy (macro and micro categories). Finally, our proposed Taxonomy-aware Attention Layer corrects predictions that violate the hierarchy rules defined in the given taxonomy. In macro-category classification experiments, our proposal outperforms classical machine learning methods, achieving an F1-score of 93% on a card dataset and 95% on a current account dataset.
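To make the notion of a hierarchy-breaking prediction concrete, the following minimal sketch shows one simple way to enforce consistency between the two output heads: the micro category is chosen only among the children of the predicted macro category. This is not the Taxonomy-aware Attention Layer described above, whose internals are not specified here; it is a plain masking rule used only for illustration. The taxonomy, category names, scores, and the functions `argmax` and `predict_consistent` are all hypothetical.

```python
# Hypothetical two-level taxonomy: each micro category has exactly one macro parent.
TAXONOMY = {
    "food": ["restaurant", "supermarket"],
    "transport": ["fuel", "ride_hailing"],
}


def argmax(scores):
    """Return the key with the highest score."""
    return max(scores, key=scores.get)


def predict_consistent(macro_scores, micro_scores, taxonomy=TAXONOMY):
    """Pick the top macro category, then the top micro category among its children.

    This is a simple masking rule (an assumption for illustration),
    not the attention-based correction proposed in the paper.
    """
    macro = argmax(macro_scores)
    allowed = {name: micro_scores[name] for name in taxonomy[macro]}
    micro = argmax(allowed)
    return macro, micro


# Made-up head outputs: the unconstrained micro prediction ("fuel")
# contradicts the macro prediction ("food").
macro_scores = {"food": 0.7, "transport": 0.3}
micro_scores = {
    "restaurant": 0.30,
    "supermarket": 0.25,
    "fuel": 0.35,
    "ride_hailing": 0.10,
}

macro, micro = predict_consistent(macro_scores, micro_scores)
print(macro, micro)  # prints "food restaurant"
```

Taken independently, the two heads would output the pair (food, fuel), which is invalid under this taxonomy. Restricting the micro head to the children of the predicted macro category yields (food, restaurant) instead.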