
ImTeNet: Image-Text Classification Network for Abnormality Detection and Automatic Reporting on Musculoskeletal Radiographs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12558)

Abstract

Deep learning techniques have been increasingly applied to provide more accurate results in the classification of medical images and in the classification and generation of report texts. The main objective of this paper is to investigate whether fusing features from heterogeneous modalities improves musculoskeletal abnormality detection compared with image-only and text-only classification. In this work, we propose a novel image-text classification framework, named ImTeNet, that learns relevant features from image and text information for binary classification of musculoskeletal radiographs. Initially, we use a caption generator model to artificially create textual data for a dataset lacking text information. Then, we apply ImTeNet, a multi-modal model that consists of two distinct networks, DenseNet-169 and BERT, which perform the image and text classification tasks, respectively, and a fusion module that receives a concatenation of the feature vectors extracted from both. To evaluate the proposed approach, we use the Musculoskeletal Radiographs (MURA) dataset and compare the results with those obtained by the image and text classification schemes individually.
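The fusion step described in the abstract, concatenating an image feature vector with a text feature vector and classifying the result, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_and_classify`, the single logistic unit used as the fusion classifier, the toy feature sizes, and the random weights are all assumptions made for illustration. In practice the two vectors would come from trained DenseNet-169 and BERT encoders, and the fusion module would be learned.

```python
import math
import random


def fuse_and_classify(image_features, text_features, weights, bias):
    """Concatenate two feature vectors and apply one logistic unit.

    Hypothetical stand-in for a fusion module: the real module in the
    paper may have more layers and different sizes.
    """
    # Late fusion by concatenation: image features first, then text features.
    fused = list(image_features) + list(text_features)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    # Sigmoid turns the logit into a probability of the "abnormal" class.
    return 1.0 / (1.0 + math.exp(-logit))


# Toy inputs (assumption): 8 image features and 4 text features.
rng = random.Random(0)
image_features = [rng.uniform(-1, 1) for _ in range(8)]
text_features = [rng.uniform(-1, 1) for _ in range(4)]
weights = [rng.uniform(-0.5, 0.5) for _ in range(12)]

p_abnormal = fuse_and_classify(image_features, text_features, weights, bias=0.0)
label = "abnormal" if p_abnormal >= 0.5 else "normal"
print(f"P(abnormal) = {p_abnormal:.3f} -> {label}")
```

Concatenation keeps both modalities intact and leaves it to the classifier on top to weigh them, which is why the fused input length is simply the sum of the two feature lengths.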

Notes

  1. No additional datasets were used for training.

  2. If the normal and abnormal classification occurrences are equal, we perform an arithmetic mean of the probabilities.
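The tie-breaking rule in note 2 can be sketched as follows. The surrounding procedure is an assumption, since this page does not spell it out: we suppose that each study contains several images, that each image yields a probability of abnormality, and that the study label is decided by majority vote over the per-image labels. The function name `study_label` and the 0.5 threshold are hypothetical.

```python
def study_label(abnormal_probs, threshold=0.5):
    """Combine per-image abnormality probabilities into one study label.

    Assumed procedure: majority vote over per-image labels; when the
    normal and abnormal counts are equal, fall back to the arithmetic
    mean of the probabilities, as stated in note 2.
    """
    if not abnormal_probs:
        raise ValueError("at least one probability is required")
    abnormal_votes = sum(1 for p in abnormal_probs if p >= threshold)
    normal_votes = len(abnormal_probs) - abnormal_votes
    if abnormal_votes != normal_votes:
        return "abnormal" if abnormal_votes > normal_votes else "normal"
    # Tie: decide from the mean probability instead of the vote counts.
    mean_prob = sum(abnormal_probs) / len(abnormal_probs)
    return "abnormal" if mean_prob >= threshold else "normal"


print(study_label([0.9, 0.8, 0.2]))  # clear majority of abnormal votes
print(study_label([0.9, 0.4]))       # tie, resolved by the mean (0.65)
print(study_label([0.6, 0.1]))       # tie, resolved by the mean (0.35)
```

A tie can only occur for studies with an even number of images, so the mean is used only in those cases.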

Acknowledgments

The authors would like to thank FAPESP (grants #2015/11937-9, #2017/12646-3, #2017/16246-0 and #2019/20875-8), CNPq (grants #304380/2018-0 and #309330/2018-1) and CAPES for their financial support.

Author information

Corresponding author

Correspondence to Helio Pedrini.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Braz, L., Teixeira, V., Pedrini, H., Dias, Z. (2020). ImTeNet: Image-Text Classification Network for Abnormality Detection and Automatic Reporting on Musculoskeletal Radiographs. In: Setubal, J.C., Silva, W.M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2020. Lecture Notes in Computer Science, vol 12558. Springer, Cham. https://doi.org/10.1007/978-3-030-65775-8_14

  • DOI: https://doi.org/10.1007/978-3-030-65775-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65774-1

  • Online ISBN: 978-3-030-65775-8

  • eBook Packages: Computer Science, Computer Science (R0)
