Exploratory Analysis of Electronic Health Records using Topic Modeling


  • Ivair Puerari, Federal University of Fronteira Sul
  • Denio Duarte, Federal University of Fronteira Sul
  • Guilherme Dal Bianco, Federal University of Fronteira Sul
  • Julyane Felipette Lima, Federal University of Fronteira Sul




Keywords: Topic Modeling, Electronic Health Record, ICU, LDA, Discharge, Death


The rapid growth of electronic health record (EHR) systems has increased the amount of patient information available in hospitals. This large volume of text offers an opportunity to extract previously unknown information about medical history, medication, diseases, and allergies, among others. Extracting the main topics covered by a text collection can yield valuable insights, and topic modeling approaches have been used for such information discovery and thematic topic extraction tasks. In this context, this work presents an exploratory analysis of a collection of electronic health records from an intensive care unit (ICU). The collection is split into two sub-collections: discharged patients and patients who progressed to death. We apply an approach based on Latent Dirichlet Allocation (LDA) to discover the latent topics in the collections. The analyses show that some topics, such as renal diseases, are more recurrent among deceased patients (the death collection), while others, such as diabetes, are more recurrent in the discharge collection. These results can help improve intensive care services, since the topics can guide the understanding of patterns in discharge and death situations.
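As a rough illustration of the workflow described above, the following minimal sketch fits a separate LDA model to each sub-collection using a small collapsed Gibbs sampler. It is not the authors' implementation: the toy notes, the function names (fit_lda, top_words), the hyperparameters, and the choice of Gibbs sampling with one model per sub-collection are all assumptions made for illustration, and the invented tokens are not data or results from the study.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper, not from the paper).

    docs is a list of token lists. Returns (topic_word_counts, vocab).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most frequent words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Invented, already-tokenized toy notes; NOT data from the study.
discharge_notes = [
    ["diabetes", "insulin", "glucose", "diet"],
    ["glucose", "insulin", "diabetes", "stable"],
    ["fracture", "surgery", "pain", "stable"],
    ["surgery", "pain", "fracture", "diet"],
]
death_notes = [
    ["renal", "dialysis", "creatinine", "sepsis"],
    ["sepsis", "antibiotic", "fever", "renal"],
    ["dialysis", "renal", "creatinine", "fever"],
    ["antibiotic", "sepsis", "fever", "creatinine"],
]

# One model per sub-collection, so the two topic lists can be compared by eye.
for name, notes in [("discharge", discharge_notes), ("death", death_notes)]:
    counts, vocab = fit_lda(notes, num_topics=2)
    for k, words in enumerate(top_words(counts, vocab)):
        print(f"{name} topic {k}: {', '.join(words)}")
```

On real clinical notes, tokenization, stop-word removal, and the choice of the number of topics would matter far more than in this toy setting, and a maintained topic modeling library would normally replace the hand-written sampler.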




Puerari, I., Duarte, D., Dal Bianco, G., & Felipette Lima, J. (2021). Exploratory Analysis of Electronic Health Records using Topic Modeling. Journal of Information and Data Management, 11(2). https://doi.org/10.5753/jidm.2020.2024



Regular Papers