Automated classification of cardiology diagnoses based on textual medical reports


  • J. A. O. Pedrosa Universidade Federal de Minas Gerais
  • D. M. Oliveira Universidade Federal de Minas Gerais
  • Wagner Meira Jr. Universidade Federal de Minas Gerais
  • Antonio Luiz P. Ribeiro Universidade Federal de Minas Gerais



cardiology, information extraction, machine learning, natural language processing


Automatic classification of diagnoses has been a long-standing challenge for Computer Science and related disciplines. Textual clinical reports are a rich source of data for such diagnoses, but building classification models from them is not a trivial task. The problem tackled in this work is the identification of the medical diagnoses indicated in these reports. Several methods have been proposed for this problem, but a method designed for cardiology reports written in Portuguese is still needed. In this paper we describe a method that handles the peculiarities of clinical reports, including medical terminology, and that estimates the diagnosis from raw clinical reports and a list of possible diagnoses. Experimental results show that our method achieves a high degree of accuracy, even for infrequent classes and complex databases.




