A Question Answering System over Chronic Diseases and Epigenetics Knowledge
Abstract
Medical records describe patients' health conditions and help experts decide on treatments. Scientific biomedical knowledge can improve the prevention and treatment of diseases and promote innovation and discovery in health. However, healthcare professionals may have difficulty finding relevant scientific information because of limited time and the constant updating of the literature. This work proposes a Question Answering (Q&A) architecture to support more focused searches for information about chronic diseases. A user's question in natural language initiates the search for answers and promotes knowledge, as in a learning healthcare system. To evaluate the system, we use a reference collection on epigenetics and chronic diseases and calculate performance measures such as precision, recall, and F-measure.
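The evaluation measures named above can be illustrated with a minimal sketch. It assumes a set-based setting in which the system returns a set of answer or document identifiers and the reference collection lists the relevant ones; the function name `precision_recall_f1` and the identifiers `d1` to `d5` are hypothetical and are not taken from the system described here. F-measure is computed in its balanced form (F1), which is an assumption about the weighting used.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall, and balanced F-measure (F1).

    retrieved: identifiers returned by the system (hypothetical input).
    relevant:  identifiers marked relevant in the reference collection.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only: 4 items retrieved, 3 relevant, 2 in common.
p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In this toy case two of the four retrieved items are relevant (precision 1/2) and two of the three relevant items are found (recall 2/3), so F1 is 4/7.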