Evaluation of a method for mapping medical reports to a structured representation: a case study with upper digestive endoscopy reports

  • Daniel de Faveri Honorato (Unioeste / PTI)
  • Maria Carolina Monard (USP)
  • Huei Diana Lee (Unioeste / PTI)
  • Antonio Pietrobom Neto (Hospital Municipal de Paulínia)
  • Wu Feng Chung (Unioeste / PTI)

Abstract


Text mining requires that text data first be pre-processed so that predictive or descriptive methods can be applied. To this end, we have proposed a method that transforms information about medical findings, described in natural language, into the attribute-value format. The method can be applied in two modes: automatic, which uses only syntactic information, and semi-automatic, in which domain specialists can add semantic information. This work presents a case study, using upper digestive endoscopy reports, in which the method was applied in automatic mode. Because it ignores all semantic information, this mode represents the method's worst case; even so, the results show that the proposal is feasible.

Published
2009-07-20
HONORATO, Daniel de Faveri; MONARD, Maria Carolina; LEE, Huei Diana; PIETROBOM NETO, Antonio; CHUNG, Wu Feng. Evaluation of a method for mapping medical reports to a structured representation: a case study with upper digestive endoscopy reports. In: BRAZILIAN SYMPOSIUM ON COMPUTING APPLIED TO HEALTH (SBCAS), 9., 2009, Bento Gonçalves/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2009. p. 1977-1986. ISSN 2763-8952.