Vim-Med: a Vision Mamba-based Model for Pathology Classification in X-Ray Images

  • Gregory J. Pitthan FURG
  • Lucas B. V. Cordova FURG
  • Tatiana T. Schein FURG
  • Eduardo L. Silva FURG
  • Gustavo A. Dutra UFSM
  • Gustavo P. Almeida FURG
  • Stephanie L. Brião FURG
  • Paulo L. J. Drews-Jr FURG

Abstract


Medical diagnosis still needs better tools for identifying rare diseases and for analyzing imbalanced image data. This work presents Vim-Med, an adaptation of the Vision Mamba (Vim) architecture for pathology classification in X-ray images. The model is evaluated against other Mamba models and Transformer architectures. On the Chest X-Ray dataset, Vim-Med achieved the best F1-score, 0.888. On the NIH CRX8 dataset, Vim-Med excelled at handling rare classes, with a Macro-F1 of 0.192. Vim-Med also reached the highest inference speed, 125 FPS, and reduced training time by more than 50%. These results indicate that Vim-Med is an efficient model for classifying pathologies in X-ray images.
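
Because the abstract reports Macro-F1 as its measure for rare classes, a small sketch of that metric may help. The code below is an illustration only, not the authors' evaluation code: the function names (`per_class_f1`, `macro_f1`, `accuracy`) and the toy labels are assumptions made for this example, and it assumes single-label predictions, whereas the NIH data is multi-label. It shows why Macro-F1 is sensitive to rare classes: every class contributes equally to the average, regardless of how many samples it has.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: each class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up labels: nine "common" samples and one "rare" sample.
y_true = ["common"] * 9 + ["rare"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["common"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # 0.900
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")  # 0.474
```

In this toy case the majority-class predictor looks strong on accuracy, yet its Macro-F1 is roughly halved because the missed rare class scores an F1 of zero. This is the usual reason to report Macro-F1 on imbalanced datasets.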

Published
12/11/2025
PITTHAN, Gregory J.; CORDOVA, Lucas B. V.; SCHEIN, Tatiana T.; SILVA, Eduardo L.; DUTRA, Gustavo A.; ALMEIDA, Gustavo P.; BRIÃO, Stephanie L.; DREWS-JR, Paulo L. J. Vim-Med: a Vision Mamba-based Model for Pathology Classification in X-Ray Images. In: ESCOLA REGIONAL DE APRENDIZADO DE MÁQUINA E INTELIGÊNCIA ARTIFICIAL DA REGIÃO SUL (ERAMIA-RS), 1., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 396-399. DOI: https://doi.org/10.5753/eramiars.2025.16757.