Generating X-ray Reports Using Global Attention

  • Felipe André Zeiser UNISINOS
  • Cristiano André da Costa UNISINOS
  • Gabriel de Oliveira Ramos UNISINOS
  • Henrique C. Bohn UNISINOS
  • Ismael Santos UNISINOS
  • Bruna Donida Grupo Hospitalar Conceição
  • Ana Paula de Oliveira Brun Grupo Hospitalar Conceição
  • Nathália Zarichta Grupo Hospitalar Conceição


Medical images are routinely used for diagnosis, treatment, and clinical decision-making. A large part of a radiologist's work consists of interpreting images and producing diagnostic reports. However, these professionals carry high workloads and perform operator-dependent tasks, making them subject to errors under non-ideal conditions. With the COVID-19 pandemic, healthcare systems became overwhelmed, and this extended to the X-ray analysis process. Automatic report generation can therefore help reduce the workload of radiologists and support the diagnosis and treatment of patients with suspected COVID-19. In this article, we propose to generate suggestions for chest radiography reports, evaluating two architectures: (i) a Long Short-Term Memory (LSTM) network, and (ii) an LSTM with global attention. The most representative features are extracted from the X-ray images by an encoder based on a DenseNet121 network pre-trained on the ChestX-ray14 dataset. Experimental results on a private set of 6,650 images and reports indicate that the LSTM model with global attention yields the best results, with a BLEU-1 of 0.693, BLEU-2 of 0.496, BLEU-3 of 0.400, and BLEU-4 of 0.345. The quantitative and qualitative results demonstrate that our method can effectively suggest high-quality radiological findings and show the potential of this methodology as a tool to assist radiologists in chest X-ray analysis.
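The core of the attention-based decoder is the global attention step: at each decoding step, the LSTM hidden state is scored against every spatial feature vector produced by the DenseNet121 encoder, the scores are softmax-normalised into attention weights, and the weighted sum of the features becomes a context vector that conditions the next word. The sketch below illustrates that computation in plain Python with a dot-product alignment score; the function name, the toy feature vectors, and the choice of dot-product scoring are illustrative assumptions, not the authors' exact implementation.

```python
import math

def global_attention(decoder_hidden, encoder_features):
    """Global attention sketch: score every encoder position against the
    current decoder state, softmax-normalise the scores, and return the
    weighted sum of features (context vector) plus the weights."""
    # Dot-product alignment score for each spatial feature vector.
    scores = [sum(h * f for h, f in zip(decoder_hidden, feat))
              for feat in encoder_features]
    # Numerically stabilised softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted average of the encoder features.
    dim = len(encoder_features[0])
    context = [sum(w * feat[i] for w, feat in zip(weights, encoder_features))
               for i in range(dim)]
    return context, weights

# Toy example: three "image regions" with 2-D features and one decoder state.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [1.0, 0.0]
context, weights = global_attention(hidden, features)
```

In the full model the context vector would be concatenated with the LSTM output before the word-probability layer; here the toy dimensions simply make the weighting visible (regions aligned with the decoder state receive larger weights).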

How to Cite

ZEISER, Felipe André; COSTA, Cristiano André da; RAMOS, Gabriel de Oliveira; BOHN, Henrique C.; SANTOS, Ismael; DONIDA, Bruna; BRUN, Ana Paula de Oliveira; ZARICHTA, Nathália. Generating X-ray Reports Using Global Attention. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 19., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 809-818. ISSN 2763-9061.