Credit Card Fraud: A Hybrid and Unified Approach with Machine Learning and GMM
Abstract
The rapid growth of digital transactions has been accompanied by an increase in credit card fraud, causing significant financial losses. This paper proposes a hybrid fraud detection pipeline that integrates supervised and unsupervised machine learning models. Using a real-world dataset of over 560,000 transactions, we evaluated Random Forest, Logistic Regression, Neural Networks (Keras and JAX), and Gaussian Mixture Models (GMMs), applying class balancing and feature selection techniques. The JAX-based neural network achieved the highest performance (F1-score of 94.3% and AUC of 0.964), outperforming traditional approaches. This study provides a replicable and adaptable detection framework that can support financial institutions in mitigating fraud risks.
References
AZEVEDO, A. I. F. de; SANTOS, M. F. Dos. Mineração de Dados. São Paulo: Editora Campus, 2008.
BAHNSEN, A. C. et al. Example-dependent cost-sensitive decision trees. Expert Systems with Applications, v. 42, n. 19, p. 6609–6619, 2015.
BREIMAN, L. Random Forests. Machine Learning, v. 45, n. 1, p. 5–32, 2001.
CARCILLO, F. et al. Combining unsupervised and supervised learning in credit card fraud detection. IEEE Transactions on Neural Networks and Learning Systems, v. 31, n. 8, p. 2744–2757, 2019. DOI: 10.1109/TNNLS.2019.2896116
CHAWLA, N. V. et al. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, v. 16, p. 321–357, 2002.
DAL POZZOLO, A. et al. Calibrating probability with undersampling for unbalanced classification. In: 2015 IEEE Symposium Series on Computational Intelligence. IEEE, 2015. p. 159–166.
DEMPSTER, A. P.; LAIRD, N. M.; RUBIN, D. B. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, v. 39, n. 1, p. 1–38, 1977.
FIORE, U. et al. Using generative adversarial networks for improving classification effectiveness in credit card fraud detection. Information Sciences, v. 479, p. 448–455, 2019.
JURGOVSKY, S. et al. Sequence classification for credit-card fraud detection. Expert Systems with Applications, v. 100, p. 234–245, 2018.
NILSON REPORT. Global Card Fraud Losses Projected to Exceed $40 Billion by 2025. The Nilson Report, Issue 1239, 2023.
SERASA EXPERIAN. Indicador de Tentativas de Fraude – 1º Trimestre de 2025. Available at: [link]. Accessed: Jul. 2025.
Published
2025-09-17
How to Cite
SILVA, Martony Demes da; ROMA, Warleyson Costa. Credit Card Fraud: A Hybrid and Unified Approach with Machine Learning and GMM. In: WORKSHOP ON INFORMATION SYSTEMS (WSIS), 16., 2025, Rio Paranaíba/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 89-96. DOI: https://doi.org/10.5753/wsis.2025.15731.
