A pipeline for tabular dataset formation from unstructured data provided by ACR Appropriateness Criteria guidelines

  • Anderson A. Eduardo HIAE
  • Rafael M. Loureiro HIAE
  • Adriano Tachibana HIAE
  • Pedro V. Netto HIAE
  • Tatiana F. de Almeida HIAE
  • André Pires HIAE

Abstract

Among data-centric technologies, clinical decision support systems (CDSS) are among the most promising. Technological advances have made them easier to implement, but maintaining the knowledge base for a CDSS remains open to improvement. Here, we argue that the ACR Appropriateness Criteria guidelines are a valuable source of open data that, combined with appropriate algorithms, can boost CDSS research. We therefore developed a pipeline that forms tabular datasets from the ACR guidelines, which are stored on a website as PDF files. We also demonstrate experimentally that this pipeline successfully retrieves the content of interest, and we discuss its best composition in terms of its component algorithms. Future research focusing on the pipeline's flexibility in the face of updates to the PDF template will help advance this work.

Published
7 June 2022
EDUARDO, Anderson A.; LOUREIRO, Rafael M.; TACHIBANA, Adriano; NETTO, Pedro V.; ALMEIDA, Tatiana F. de; PIRES, André. A pipeline for tabular dataset formation from unstructured data provided by ACR Appropriateness Criteria guidelines. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 22., 2022, Teresina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 168-177. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2022.222497.