Predictive model for classifying the risk of death of COVID-19 patients using open data

  • Gustavo Rodrigues UNIPAMPA/Combate à Fraude
  • Diego Kreutz UNIPAMPA

Abstract


To reduce the subjectivity of policies governing access to ICU beds, we propose a predictor based on random forests for classifying the risk of death of COVID-19 patients. The open dataset used covers more than 600,000 patients reported through the Painel Coronavírus RS. On the test set, the model classified the chance of death with an AUC-ROC score of 0.97. These results show the predictor's potential to support decision-making in the hospital setting.
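As an illustration of the kind of pipeline the abstract describes, the following is a minimal sketch of a random forest classifier evaluated with AUC-ROC on a held-out test set. It is not the authors' code: the data are synthetic stand-ins generated with scikit-learn (not the Painel Coronavírus RS records), and the feature count, class balance, split ratio, and hyperparameters are all assumptions chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for patient records
# (assumption: label 1 = death, roughly 5% of cases).
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=0,
)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical hyperparameters; class_weight="balanced" is one common way
# to compensate for the rare positive class.
model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

# AUC-ROC is computed from predicted probabilities of the positive class,
# not from hard 0/1 predictions.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"AUC-ROC on the synthetic test split: {auc:.3f}")
```

The score printed here reflects only the synthetic data and says nothing about the 0.97 reported in the abstract.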

Published
07/06/2022
RODRIGUES, Gustavo; KREUTZ, Diego. Modelo preditivo para classificação de risco de óbito de pacientes com COVID-19 utilizando dados abertos. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 22., 2022, Teresina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 144-155. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2022.222494.