ImTeNet: Image-Text Classification Network for Abnormality Detection and Automatic Reporting on Musculoskeletal Radiographs

Braz, Leodécio; Teixeira, Vinicius; Pedrini, Helio; Dias, Zanoni

doi:10.1007/978-3-030-65775-8_14

Conference paper
First Online: 20 December 2020

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12558)

Abstract

Deep learning techniques have been increasingly applied to provide more accurate results in the classification of medical images and in the classification and generation of report texts. The main objective of this paper is to investigate whether fusing features from heterogeneous modalities improves musculoskeletal abnormality detection compared with image-only and text-only classification. We propose a novel image-text classification framework, named ImTeNet, that learns relevant features from image and text information for binary classification of musculoskeletal radiographs. First, we use a caption generator model to artificially create textual data for a dataset lacking text information. Then, we apply ImTeNet, a multi-modal model that consists of two distinct networks, DenseNet-169 and BERT, which perform the image and text classification tasks respectively, and a fusion module that receives a concatenation of the feature vectors extracted from both. To evaluate the proposed approach, we use the Musculoskeletal Radiographs (MURA) dataset and compare the results with those obtained by the image and text classification schemes individually.
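The fusion step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the random vectors only stand in for encoder outputs, the text width of 768 assumes a BERT-base model (the abstract does not name the variant), and the single linear layer with a sigmoid is a hypothetical fusion head. The image width of 1664 is the size of DenseNet-169's final pooled feature vector.

```python
import math
import random

IMAGE_DIM = 1664  # width of DenseNet-169's final pooled feature vector
TEXT_DIM = 768    # assumption: hidden size of a BERT-base sentence embedding


def fuse(image_features, text_features):
    """Late fusion by concatenation: one joint vector per radiograph."""
    return list(image_features) + list(text_features)


def linear_sigmoid(features, weights, bias):
    """Hypothetical fusion head: one linear layer followed by a sigmoid."""
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Stand-ins for the two encoder outputs (random values, not real features).
image_features = [rng.gauss(0.0, 1.0) for _ in range(IMAGE_DIM)]
text_features = [rng.gauss(0.0, 1.0) for _ in range(TEXT_DIM)]

joint = fuse(image_features, text_features)

# Untrained placeholder weights, kept small so the logit stays near zero.
weights = [rng.gauss(0.0, 0.01) for _ in range(len(joint))]
p_abnormal = linear_sigmoid(joint, weights, bias=0.0)

print(len(joint), round(p_abnormal, 3))
```

In a trained system the two stand-in vectors would come from the image and text networks, and the fusion head would be learned jointly with or on top of them; the sketch only shows how the concatenated vector feeds a binary classifier.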

Notes

1. No additional datasets were used for training.
2. If the normal and abnormal classification occurrences are equal, we perform an arithmetic mean of the probabilities.
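The tie-breaking rule in note 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes that a study-level label is obtained by majority vote over per-image predictions, that each image yields a probability of abnormality, and that the decision threshold is 0.5. The function name `study_prediction` is hypothetical.

```python
def study_prediction(image_probs, threshold=0.5):
    """Aggregate per-image abnormality probabilities into one study label.

    Assumption: a majority vote over per-image labels decides the study;
    when the votes are tied, the arithmetic mean of the probabilities
    is compared with the threshold instead.
    """
    if not image_probs:
        raise ValueError("a study needs at least one image probability")

    abnormal_votes = sum(p >= threshold for p in image_probs)
    normal_votes = len(image_probs) - abnormal_votes

    if abnormal_votes != normal_votes:
        return "abnormal" if abnormal_votes > normal_votes else "normal"

    # Tie: fall back to the arithmetic mean of the probabilities.
    mean_prob = sum(image_probs) / len(image_probs)
    return "abnormal" if mean_prob >= threshold else "normal"


print(study_prediction([0.9, 0.8, 0.2]))  # clear majority
print(study_prediction([0.9, 0.2]))       # tie, mean 0.55
print(study_prediction([0.6, 0.1]))       # tie, mean 0.35
```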

Acknowledgments

The authors would like to thank FAPESP (grants #2015/11937-9, #2017/12646-3, #2017/16246-0 and #2019/20875-8), CNPq (grants #304380/2018-0 and #309330/2018-1) and CAPES for their financial support.

Author information

Authors and Affiliations

Institute of Computing, University of Campinas, Campinas, Brazil
Leodécio Braz, Vinicius Teixeira, Helio Pedrini & Zanoni Dias

Corresponding author

Correspondence to Helio Pedrini.

Editor information

Editors and Affiliations

University of São Paulo, São Paulo, Brazil
João C. Setubal
Instituto Federal de Goiás, Formosa, Goiás, Brazil
Waldeyr Mendes Silva

About this paper

Cite this paper

Braz, L., Teixeira, V., Pedrini, H., Dias, Z. (2020). ImTeNet: Image-Text Classification Network for Abnormality Detection and Automatic Reporting on Musculoskeletal Radiographs. In: Setubal, J.C., Silva, W.M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2020. Lecture Notes in Computer Science, vol 12558. Springer, Cham. https://doi.org/10.1007/978-3-030-65775-8_14

DOI: https://doi.org/10.1007/978-3-030-65775-8_14
Published: 20 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65774-1
Online ISBN: 978-3-030-65775-8
eBook Packages: Computer Science, Computer Science (R0)
