Application of Data Mining for ICU Mortality Prediction: Balancing, Missing Data, and Classifiers

  • Jorge S. A. M. Barreto
  • Angelo C. Loula

Abstract


Severity scores provide a consolidated index of a patient's health status in the ICU. These scores are based on linear models and analyze each variable in isolation. Data mining techniques have been applied to build more complex prediction models, but prior work has not examined class imbalance and the treatment of missing data in depth. This work analyzes techniques for class balancing and missing-value imputation, combined with Random Forest (RF), Artificial Neural Network (ANN), and Logistic Regression (LR) classification models. RF achieved the best performance, with a mean AUC of 0.784 ± 0.006, sensitivity of 0.738 ± 0.002, and specificity of 0.700 ± 0.003, when missing values were replaced by default values and the model was trained on the dataset undersampled with NCL (Neighborhood Cleaning Rule).
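As a rough illustration of the quantities reported above, the following minimal sketch shows default-value imputation and the computation of sensitivity, specificity, and AUC in plain Python. It is not the authors' implementation: the variable names (`hr`, `temp`), the default values, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and it assumes the ± figures denote a standard deviation across repeated runs.

```python
from statistics import mean, stdev

def impute_defaults(record, defaults):
    """Replace missing (None) values with a per-variable default value."""
    return {k: (defaults[k] if record.get(k) is None else record[k])
            for k in defaults}

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = death)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def summarize(values):
    """Format repeated-run results as 'mean ± sample standard deviation'."""
    return f"{mean(values):.3f} ± {stdev(values):.3f}"

# Hypothetical patient record with a missing heart rate.
patient = impute_defaults({"hr": None, "temp": 37.5},
                          {"hr": 80, "temp": 36.8})

# Toy labels and predicted risks (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(patient, round(sens, 3), round(spec, 3), auc(y_true, scores))
print(summarize([0.70, 0.72, 0.74]))
```

The pairwise AUC shown here is quadratic in the number of samples, which is acceptable for a sketch; on a full ICU dataset a rank-based computation or a library routine would be the more practical choice.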


 

Published
2019-06-24
BARRETO, Jorge S. A. M.; LOULA, Angelo C. Aplicação de Mineração de Dados para Predição de Mortalidade em UTI: balanceamento, dados ausentes e classificadores. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 13., 2019, Belém. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2019.10024.