Convolution-Vision Transformer for Automatic Lung Sound Classification

José Neto; Nicksson Arrais; Tiago Vinuto; João Lucena

José Neto UFPE
Nicksson Arrais SiDi
Tiago Vinuto SiDi
João Lucena SiDi

Abstract

Auscultation is an essential part of the clinical examination: it is inexpensive, noninvasive, safe, and one of the oldest techniques used to diagnose pulmonary diseases. Several studies in the literature have proposed machine learning models for lung sound classification to overcome the limits of ear acuity and the inherent inter-listener variability. In this work, we propose a hybrid Convolution-Vision Transformer architecture that combines convolutions with Vision Transformers in a single system. We evaluate the proposed method on the ICBHI 2017 database for four-class lung sound classification, where it achieves a score of 57.36%, surpassing many state-of-the-art models.

Keywords: Graphics, Sensitivity, Databases, Pulmonary diseases, Lung, Machine learning, Ear, Auscultation, Lung Sound Classification, Vision Transformer, ICBHI dataset