Uncovering Algorithmic Fairness in Deep Learning–Based Imputation of Multivariate Clinical Time Series in Heart Failure Patients

Rayssa Muniz; Victor Lima; Paloma Saldanha; Andrea Ribeiro; Rodrigo de Paula; Martin Cadeiras; Carlo da Silva; Hítalo Silva; Paulo Rocha; Diego Pinheiro

doi:10.5753/eniac.2025.13855

Rayssa Muniz UNICAP
Victor Lima UNICAP
Paloma Saldanha UNICAP
Andrea Ribeiro UFPE
Rodrigo de Paula UFPE
Martin Cadeiras UCDAVIS
Carlo da Silva UPE
Hítalo Silva UPE
Paulo Rocha UPE
Diego Pinheiro UNICAP


Abstract

Deep learning for missing data imputation (MDI) in healthcare time series has advanced, but fairness concerns remain underexplored. Traditional evaluations focus on global error metrics (e.g., MAE), overlooking disparities across variables and protected subgroups. We analyze five state-of-the-art MDI models (BRITS, SAITS, USGAN, GPVAE, MRNN) on a heart failure dataset (PhysioNet), assessing fairness via Lorenz curves and the Gini coefficient. Results reveal that low MAE does not imply fairness. SAITS was most efficient (MAE = 0.241) but least fair (Gini = 0.615), while MRNN was less efficient (MAE = 0.672) yet fairer (Gini = 0.439). We highlight the need to balance accuracy and fairness in MDI for clinical applications.
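The abstract does not say exactly how the Lorenz curves and Gini coefficients were computed. As a minimal sketch, assume the Gini coefficient is taken over a set of non-negative error values, for example one MAE per clinical variable or per subgroup. The function names and the numbers below are hypothetical illustrations, not the authors' code or results.

```python
import numpy as np

def lorenz_curve(errors):
    """Return (population share, cumulative error share) points of the Lorenz curve."""
    x = np.sort(np.asarray(errors, dtype=float))
    if x.size == 0 or np.any(x < 0) or x.sum() == 0:
        raise ValueError("errors must be non-negative with a positive sum")
    # Cumulative share of total error held by the k smallest values, starting at 0.
    cum = np.concatenate(([0.0], np.cumsum(x) / x.sum()))
    pop = np.linspace(0.0, 1.0, x.size + 1)
    return pop, cum

def gini(errors):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pop, cum = lorenz_curve(errors)
    # Trapezoid rule over the piecewise-linear Lorenz curve.
    area = np.sum((cum[1:] + cum[:-1]) * np.diff(pop)) / 2.0
    return 1.0 - 2.0 * area

# Hypothetical per-variable MAE values (assumption, NOT taken from the paper).
# Both sets have the same mean error of 0.3 but very different spread.
equal_errors = [0.3, 0.3, 0.3, 0.3]
skewed_errors = [0.0, 0.0, 0.0, 1.2]

print(round(gini(equal_errors), 3))
print(round(gini(skewed_errors), 3))
```

Under this assumption, two models with identical average error can have very different Gini values, which is the sense in which a low MAE does not by itself imply an even distribution of error.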

References

Cao, W., Wang, D., Li, J., Zhou, H., Li, Y., and Li, L. (2018). BRITS: Bidirectional recurrent imputation for time series. In Proceedings of the 32nd International Conference on Neural Information Processing Systems.

Du, W., Côté, D., and Liu, Y. (2023). SAITS: Self-attention-based imputation for time series. Expert Systems with Applications.

Fortuin, V., Baranchuk, D., Raetsch, G., and Mandt, S. (2020). GP-VAE: Deep probabilistic time series imputation. In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics.

Liu, M., Li, S., Yuan, H., Ong, M. E. H., Ning, Y., Xie, F., Saffari, S. E., Shang, Y., Volovici, V., Chakraborty, B., and Liu, N. (2023). Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques. Artificial Intelligence in Medicine.

Meng, C., Trinh, L., Xu, N., Enouen, J., and Liu, Y. (2022). Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Scientific Reports.

Mesquita, T. P., Silva, D. M. P. F., Ribeiro, A. M. N. C., Silva, I. R. R., Bastos-Filho, C. J. A., and Monteiro, R. P. (2024). A comparative analysis of deep learning-based methods for multivariate time series imputation with varying missing rates. In 2024 IEEE Eighth Ecuador Technical Chapters Meeting (ETCM).

Miao, X., Wu, Y., Wang, J., Gao, Y., Mao, X., and Yin, J. (2021). Generative semi-supervised learning for multivariate time series imputation. Proceedings of the AAAI Conference on Artificial Intelligence.

Min, S., Asif, H., and Vaidya, J. (2025). Exploring the inequitable impact of data missingness on fairness in machine learning. IEEE Intelligent Systems, 40(3):28–38.

Obermeyer, Z., Powers, B., Vogeli, C., and Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464):447–453.

Omar, M., Soffer, S., Agbareia, R., Bragazzi, N. L., Apakama, D. U., Horowitz, C. R., Charney, A. W., Freeman, R., Kummer, B., Glicksberg, B. S., Nadkarni, G. N., and Klang, E. (2024). Socio-demographic biases in medical decision-making by large language models: A large-scale multi-model analysis. medRxiv.

Pfohl, S. R., Cole-Lewis, H., Sayres, R., Neal, D., Asiedu, M., Dieng, A., Tomasev, N., Rashid, Q. M., Azizi, S., Rostamzadeh, N., et al. (2024). A toolbox for surfacing health equity harms and biases in large language models. Nature Medicine, 30(12):3590–3600.

Russell, S. (2020). Artificial Intelligence: A Modern Approach. Pearson Series, 4th edition.

Silva, I., Moody, G., Mark, R., and Celi, L. A. (2012a). Predicting Mortality of ICU Patients: The PhysioNet/Computing in Cardiology Challenge 2012 (version 1.0.0). [link]. Accessed 23 June 2025.

Silva, I., Moody, G., Scott, D. J., Celi, L. A., and Mark, R. G. (2012b). Predicting in-hospital mortality of ICU patients: The PhysioNet/Computing in Cardiology Challenge 2012. IEEE.

Tipirneni, S. and Reddy, C. K. (2022). Self-Supervised Transformer for Sparse and Irregularly Sampled Multivariate Clinical Time-Series. ACM Trans. Knowl. Discov. Data.

Verma, S. and Rubin, J. (2018). Fairness definitions explained. In Proceedings of the International Workshop on Software Fairness, pages 1–7. ACM.

Wang, J., Du, W., Cao, W., Zhang, K., Wang, W., Liang, Y., and Wen, Q. (2024). Deep learning for multivariate time series imputation: A survey.

Du, W. (2023). PyPOTS: A Python toolbox for data mining on partially-observed time series. arXiv preprint arXiv:2305.18811.

Yoon, J., Zame, W. R., and Van Der Schaar, M. (2019). Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks. IEEE Transactions on Biomedical Engineering.
