Convolution-Vision Transformer for Automatic Lung Sound Classification

  • José Neto UFPE
  • Nicksson Arrais SiDi
  • Tiago Vinuto SiDi
  • João Lucena SiDi


Auscultation is an essential part of clinical examination: it is inexpensive, noninvasive, safe, and one of the oldest techniques used to diagnose pulmonary diseases. Various studies in the literature have proposed machine learning models for lung sound classification to overcome the limits of ear acuity and the inherent inter-listener variability. In this work, we propose a hybrid Convolution-Vision Transformer architecture that combines convolutions with Vision Transformers in a single system. We evaluate the proposed method on the ICBHI 2017 database for four-class lung sound classification, where it achieves a score of 57.36%, surpassing many state-of-the-art models.
Keywords: Graphics, Sensitivity, Databases, Pulmonary diseases, Lung, Machine learning, Ear, Auscultation, Lung Sound Classification, Vision Transformer, ICBHI dataset
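
The abstract reports a single score on the four-class ICBHI 2017 task without defining it. As a rough illustration, the sketch below assumes it is the commonly used ICBHI score, the mean of sensitivity (abnormal cycles labeled with their exact class) and specificity (normal cycles labeled normal). The function name `icbhi_score` and the class label strings are hypothetical and not taken from the paper.

```python
# Minimal sketch of the ICBHI-style score, ASSUMING the abstract's "score"
# is the average of sensitivity and specificity. Names are hypothetical.
NORMAL = "normal"  # the other three classes: "crackle", "wheeze", "both"


def icbhi_score(y_true, y_pred):
    """Return (sensitivity, specificity, score) for four-class labels."""
    normal_total = normal_hits = 0
    abnormal_total = abnormal_hits = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == NORMAL:
            normal_total += 1
            normal_hits += int(pred == truth)
        else:
            abnormal_total += 1
            # An abnormal cycle counts only if its exact class is predicted.
            abnormal_hits += int(pred == truth)
    specificity = normal_hits / normal_total if normal_total else 0.0
    sensitivity = abnormal_hits / abnormal_total if abnormal_total else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Toy labels for illustration only; these are not data from the paper.
y_true = ["normal", "normal", "crackle", "wheeze", "both", "crackle"]
y_pred = ["normal", "crackle", "crackle", "wheeze", "both", "normal"]
sensitivity, specificity, score = icbhi_score(y_true, y_pred)
```

Averaging the two rates keeps the large share of normal cycles from dominating the result, which is one reason a plain accuracy figure is usually not reported on this database.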
How to Cite

NETO, José; ARRAIS, Nicksson; VINUTO, Tiago; LUCENA, João. Convolution-Vision Transformer for Automatic Lung Sound Classification. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 35., 2022, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.