Predicting sepsis prognosis with machine learning models trained on MIMIC-IV

  • Felipe Alexandre P. Miranda, Instituto Tecnológico de Aeronáutica (ITA)
  • Fábio Agostini A. Gomes, Instituto Tecnológico de Aeronáutica (ITA) / Irmandade da Santa Casa de Misericórdia de São Paulo
  • Sarah Negreiros de Carvalho, Instituto Tecnológico de Aeronáutica (ITA)

Abstract


Sepsis remains one of the leading causes of mortality in intensive care units (ICUs), making early prognosis prediction essential for improving clinical outcomes. This study develops and compares five machine learning models (Logistic Regression, Gradient Boosting, XGBoost, LightGBM, and Multi-Layer Perceptron) to predict 30-day mortality in 6,965 sepsis patients from the MIMIC-IV database. Our key contribution is demonstrating that a model-agnostic feature selection approach reduces the variable set from 103 to 20 clinical parameters while maintaining equivalent predictive performance (AUC > 0.87 across all models), significantly improving computational efficiency and clinical interpretability. Systematic hyperparameter optimization with algorithm-specific class-imbalance strategies showed that the boosted ensembles XGBoost and LightGBM, along with the MLP, achieved validation AUC values above 0.90. The identified minimal feature set, dominated by hemodynamic and metabolic markers, provides actionable clinical information and establishes reproducible benchmarks for sepsis mortality prediction. These results demonstrate the potential of machine learning as a practical support tool for ICU decision-making.
Keywords: machine learning, sepsis, sepsis mortality prediction, machine learning in healthcare, feature selection
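
The feature-reduction idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the selection technique, so permutation importance is used here purely as an assumed stand-in for a model-agnostic method. The data are synthetic (generated with scikit-learn, not MIMIC-IV), and the classifier, split sizes, and all variable names are illustrative choices rather than the authors' pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a 103-variable cohort (NOT MIMIC-IV data).
X, y = make_classification(
    n_samples=3000, n_features=103, n_informative=15, n_redundant=10,
    weights=[0.8, 0.2], random_state=0,
)

# Train / validation / test split, stratified on the outcome.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0
)


def fit_model(X_fit, y_fit):
    """Fit a scaled logistic regression; class_weight handles the imbalance."""
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    return model.fit(X_fit, y_fit)


# Baseline: all 103 features.
full_model = fit_model(X_train, y_train)
auc_full = roc_auc_score(y_test, full_model.predict_proba(X_test)[:, 1])

# Assumed selection step: rank features by the drop in validation AUC
# when each one is shuffled, then keep the 20 highest-ranked.
ranking = permutation_importance(
    full_model, X_val, y_val, scoring="roc_auc", n_repeats=5, random_state=0
)
top20 = np.argsort(ranking.importances_mean)[::-1][:20]

# Refit on the reduced set and compare on the held-out test split.
reduced_model = fit_model(X_train[:, top20], y_train)
auc_reduced = roc_auc_score(
    y_test, reduced_model.predict_proba(X_test[:, top20])[:, 1]
)

print(f"AUC with 103 features: {auc_full:.3f}")
print(f"AUC with 20 features:  {auc_reduced:.3f}")
```

Because permutation importance only needs a fitted estimator and a scoring function, the same ranking step could be repeated with any of the five model families named in the abstract. Ranking on the validation split and reporting on a separate test split keeps the selection from leaking into the final estimate.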

Published
29/09/2025
P. MIRANDA, Felipe Alexandre; AGOSTINI A. GOMES, Fábio; NEGREIROS DE CARVALHO, Sarah. Predicting sepsis prognosis with machine learning models trained on MIMIC-IV. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 18., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 162-173. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2025.15175.