An Ontology Based Natural Language Processing Pipeline for Brazilian COVID-19 EMR

  • Raquel A. J. Gritz LNCC
  • Rafael S. Pereira LNCC
  • Henrique Matheus F. da Silva LNCC
  • Henrique G. Zatti UFMG
  • Laura E. A. Viana UFMG
  • Karol C. S. F. Navarro UFMG
  • Thalita R. Dias Hospital Universitário de Brasília (UNB)
  • Viviane S. B. Oliveira Fundação Oswaldo Cruz (Fiocruz)
  • Ricardo A. Souza UFMG
  • Vinícius A. Oliveira Fundação Oswaldo Cruz (Fiocruz)
  • Manoel Barral Netto Fundação Oswaldo Cruz (Fiocruz)
  • Fabio Porto LNCC

Abstract


COVID-19 became a pandemic that has infected more than 100 million people worldwide and has lasted for over a year. Patient visits have produced a huge amount of textual data in the form of electronic medical records (EMR). Extracting this information may be very useful for better understanding COVID-19. However, medical records typed as free text are hard to interpret, because doctors may use different terms to record the same observation. To deal with this variation, we created an ontology in Portuguese that describes the terms used in COVID-19 medical records in Brazil. In this paper, we present a brief overview of the ontology and of how we are using it as the first step of a more complex NLP task.
Keywords: COVID-19, Ontology, Natural Language Processing (NLP)
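
The idea of using an ontology to reconcile the different terms doctors type for the same observation can be illustrated with a minimal sketch. It assumes a toy dictionary in place of the real ontology: the concept labels, the Portuguese terms, and the names `TOY_ONTOLOGY`, `normalize`, `build_index`, and `annotate` are all hypothetical and are not taken from the authors' ontology or pipeline. Simple whole-word matching stands in for whatever matching strategy the paper actually uses.

```python
import re
import unicodedata

# Hypothetical toy ontology: concept label -> surface terms a doctor might type.
# These entries are illustrative only and do not come from the authors' ontology.
TOY_ONTOLOGY = {
    "fever": ["febre", "febril", "hipertermia"],
    "dyspnea": ["dispneia", "falta de ar"],
    "cough": ["tosse", "tosse seca"],
}


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Dispnéia' and 'dispneia' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(ontology: dict) -> dict:
    """Map each normalized surface term to the concept it belongs to."""
    return {
        normalize(term): concept
        for concept, terms in ontology.items()
        for term in terms
    }


def annotate(text: str, ontology: dict = TOY_ONTOLOGY) -> list:
    """Return the sorted concepts whose terms occur as whole words in the text."""
    normalized = normalize(text)
    found = set()
    for term, concept in build_index(ontology).items():
        if re.search(r"\b" + re.escape(term) + r"\b", normalized):
            found.add(concept)
    return sorted(found)


print(annotate("Paciente com Febre e falta de ar"))  # ['dyspnea', 'fever']
```

Two notes written with different wording ("falta de ar" versus "Dispnéia") resolve to the same concept here, which is the kind of normalization that lets later NLP steps work on concepts rather than raw strings.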

Published
18/07/2021
GRITZ, Raquel A. J. et al. An Ontology Based Natural Language Processing Pipeline for Brazilian COVID-19 EMR. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 15., 2021, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 97-104. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2021.15794.