Equitable Diabetes Diagnosis: Tackling Ethnic and Gender Disparities

Abstract

Machine Learning (ML) has advanced disease diagnosis in healthcare, but it also raises fairness concerns, since biased models can perpetuate social inequalities. This study evaluates and mitigates bias in models that predict diabetes diagnosis. We ran experiments with ethnicity and gender as protected attributes and measured bias with three fairness metrics: Statistical Parity Difference, Equal Opportunity Difference, and Average Odds Difference. Two bias mitigation techniques, Reweighing and Prejudice Remover, improved these metrics, reducing disparities between groups while maintaining model accuracy. These findings reinforce the need to integrate fairness considerations into ML models for healthcare applications.
Keywords: machine learning, health care, diabetes, bias, fairness
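
To make the three fairness metrics and the Reweighing idea named in the abstract concrete, here is a minimal sketch in plain NumPy. It is not the authors' implementation: the function names (`fairness_metrics`, `reweighing_weights`) and the toy arrays are hypothetical, and the sign convention (unprivileged group minus privileged group) is assumed to follow the one commonly used by fairness toolkits such as AI Fairness 360. The sketch also assumes a binary label, a binary protected attribute, and that every group contains both positive and negative cases. Prejudice Remover, an in-processing regularizer, is not sketched here.

```python
import numpy as np


def _rates(y_true, y_pred):
    """Return (selection rate, true positive rate, false positive rate) for one group."""
    selection = y_pred.mean()
    tpr = y_pred[y_true == 1].mean()
    fpr = y_pred[y_true == 0].mean()
    return selection, tpr, fpr


def fairness_metrics(y_true, y_pred, privileged):
    """Compute SPD, EOD and AOD as (unprivileged - privileged) differences."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    priv = np.asarray(privileged, dtype=bool)

    sel_u, tpr_u, fpr_u = _rates(y_true[~priv], y_pred[~priv])
    sel_p, tpr_p, fpr_p = _rates(y_true[priv], y_pred[priv])

    return {
        # Statistical Parity Difference: gap in positive prediction rates.
        "spd": sel_u - sel_p,
        # Equal Opportunity Difference: gap in true positive rates.
        "eod": tpr_u - tpr_p,
        # Average Odds Difference: mean of the FPR gap and the TPR gap.
        "aod": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }


def reweighing_weights(y_true, privileged):
    """Per-sample weights w(g, y) = P(g) * P(y) / P(g, y).

    Under these weights, group membership and label become
    statistically independent in the weighted training set.
    """
    y = np.asarray(y_true)
    g = np.asarray(privileged, dtype=bool)
    weights = np.empty(len(y), dtype=float)
    for g_value in (False, True):
        for y_value in (0, 1):
            cell = (g == g_value) & (y == y_value)
            if cell.any():
                expected = (g == g_value).mean() * (y == y_value).mean()
                weights[cell] = expected / cell.mean()
    return weights


# Toy data (hypothetical): 1 = privileged group, 1 = positive diagnosis.
privileged = [1, 1, 1, 1, 0, 0, 0, 0]
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

print(fairness_metrics(y_true, y_pred, privileged))
print(reweighing_weights(y_true, privileged))
```

A value of zero for any of the three metrics indicates parity between the groups under that criterion. The weights returned by `reweighing_weights` could be passed as sample weights to a classifier that supports them, which is how a pre-processing method of this kind typically influences training.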

Published
29/09/2025
RUBACK, Lívia; FELIX, Luisa; SOARES TELES, Ariel. Equitable Diabetes Diagnosis: Tackling Ethnic and Gender Disparities. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 466-478. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247265.