Experiments on Portuguese Clinical Question Answering

Oliveira, Lucas Emanuel Silva e; Schneider, Elisa Terumi Rubel; Gumiel, Yohan Bonescki; Luz, Mayara Aparecida Passaura da; Paraiso, Emerson Cabrera; Moro, Claudia

doi:10.1007/978-3-030-91699-2_10

Conference paper
First Online: 28 November 2021

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13074)

Abstract

Question answering (QA) systems aim to answer questions posed by humans in natural language. This functionality is useful across many application domains, including the biomedical and clinical ones. In the clinical context, where a growing volume of information is stored in electronic health records, answering questions about a patient's status can improve decision-making and optimize patient care. In this work, we carried out the first experiments toward a QA model for clinical texts in Portuguese. To overcome the lack of corpora for this language and context, we used a transfer learning approach supported by pre-trained attention-based models from the Transformers library. We fine-tuned the BioBERTpt model on a translated version of the SQuAD dataset. The model showed promising results when evaluated in different clinical scenarios, even though no clinical QA corpus was used in training. The developed model is publicly available to the scientific community.
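
As a rough illustration of what a model fine-tuned on SQuAD-style data produces, the sketch below shows the usual span-selection step of extractive QA: the model assigns every context token a start score and an end score, and the answer is the span with the highest combined score. This is a generic sketch, not the authors' code. The function name `best_span`, the `max_answer_len` limit, the example tokens, and the scores are all assumptions made for illustration; real scores would come from the fine-tuned model.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid answer span.

    A span is valid when end >= start and it covers at most
    max_answer_len tokens. The span score is start score + end score.
    """
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        # Only consider end positions at or after the start, within the limit.
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy example: hypothetical tokens and hand-made scores, not model output.
tokens = ["Paciente", "refere", "dor", "torácica", "há", "dois", "dias", "."]
start_scores = [0.1, 0.0, 2.5, 0.3, 0.2, 0.1, 0.0, -1.0]
end_scores = [0.0, 0.1, 0.4, 3.0, 0.1, 0.2, 0.5, -1.0]

start, end, score = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer)  # prints "dor torácica"
```

The exhaustive search is quadratic in the number of tokens, which is acceptable for short clinical snippets; production toolkits typically restrict the search to the top-k start and end candidates instead.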

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Author information

Authors and Affiliations

Pontifícia Universidade Católica do Paraná, Curitiba, Brazil
Lucas Emanuel Silva e Oliveira, Elisa Terumi Rubel Schneider, Yohan Bonescki Gumiel, Mayara Aparecida Passaura da Luz, Emerson Cabrera Paraiso & Claudia Moro
Comsentimento, NLP Lab, São Paulo, Brazil
Lucas Emanuel Silva e Oliveira

Corresponding author

Correspondence to Lucas Emanuel Silva e Oliveira.

Editor information

Editors and Affiliations

Universidade Federal de Sergipe, São Cristóvão, Brazil
André Britto
Universidade de São Paulo, São Paulo, Brazil
Karina Valdivia Delgado

Cite this paper

Oliveira, L.E.S.e., Schneider, E.T.R., Gumiel, Y.B., Luz, M.A.P.d., Paraiso, E.C., Moro, C. (2021). Experiments on Portuguese Clinical Question Answering. In: Britto, A., Valdivia Delgado, K. (eds) Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science, vol 13074. Springer, Cham. https://doi.org/10.1007/978-3-030-91699-2_10

DOI: https://doi.org/10.1007/978-3-030-91699-2_10
Published: 28 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91698-5
Online ISBN: 978-3-030-91699-2
eBook Packages: Computer Science, Computer Science (R0)
