Exploring Biases in Machine Learning Models for Neurodegenerative Diseases Diagnosis Through Gait and Voice Analysis

  • Ana Luísa de Bastos Chagas UFG
  • Giordana de Farias F. B. Bucci UFG
  • Juliana Paula Félix UFG / PUC Goiás
  • Rogerio Salvini UFG
  • Hugo Nascimento UFG
  • Fabrizzio Soares UFG

Abstract


This work examines potential biases in machine learning models that diagnose neurodegenerative diseases (NDDs) from gait and voice analysis. It investigates how common practices, such as windowing gait signals and indiscriminately using multiple voice samples per individual, can inflate performance estimates when samples from the same individual are treated as independent. Using two public databases, it compares scenarios in which augmented samples are treated independently with scenarios in which they are grouped by individual. The results show that independent treatment yields artificially higher performance metrics. The findings highlight the need to handle intra-individual variability properly to ensure reliable clinical decision support for NDD diagnosis.
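The contrast between independent and grouped treatment of samples can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the labels are deliberately unrelated to the features, and every name (`make_windows`, `sample_wise_folds`, `subject_wise_folds`, the 1-nearest-neighbour classifier) is a hypothetical choice made only for illustration. Each synthetic subject contributes several near-identical windows, standing in for windowed gait signals or repeated voice recordings.

```python
import random

def make_windows(n_subjects=40, windows_per_subject=10, dim=5, seed=0):
    """Synthetic data: each subject has a personal signature plus small noise.
    Labels alternate by subject ID, so they carry no real signal."""
    rng = random.Random(seed)
    samples = []  # (subject_id, features, label)
    for subject in range(n_subjects):
        label = subject % 2
        signature = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(windows_per_subject):
            features = [s + rng.gauss(0.0, 0.02) for s in signature]
            samples.append((subject, features, label))
    return samples

def predict_1nn(train, features):
    """Return the label of the closest training sample (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda s: sq_dist(s[1], features))[2]

def cross_validate(samples, folds):
    """folds: list of sets of sample indices, each used once as the test set."""
    correct = 0
    for test_idx in folds:
        train = [s for i, s in enumerate(samples) if i not in test_idx]
        for i in test_idx:
            correct += predict_1nn(train, samples[i][1]) == samples[i][2]
    return correct / len(samples)

def sample_wise_folds(samples, k=5, seed=0):
    """Independent treatment: windows are shuffled without regard to subject."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    return [set(idx[f::k]) for f in range(k)]

def subject_wise_folds(samples, k=5, seed=0):
    """Grouped treatment: all windows of a subject land in the same fold."""
    subjects = sorted({s[0] for s in samples})
    random.Random(seed).shuffle(subjects)
    fold_of = {subj: n % k for n, subj in enumerate(subjects)}
    folds = [set() for _ in range(k)]
    for i, s in enumerate(samples):
        folds[fold_of[s[0]]].add(i)
    return folds

samples = make_windows()
acc_independent = cross_validate(samples, sample_wise_folds(samples))
acc_grouped = cross_validate(samples, subject_wise_folds(samples))
print(f"sample-wise split accuracy:  {acc_independent:.2f}")
print(f"subject-wise split accuracy: {acc_grouped:.2f}")
```

Because the labels are uninformative by construction, any accuracy well above chance in the sample-wise setting comes from the classifier recognising a subject it has already seen, not from learning anything about the disease. Grouping by subject removes that shortcut.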

Published
09/06/2025
CHAGAS, Ana Luísa de Bastos; BUCCI, Giordana de Farias F. B.; FÉLIX, Juliana Paula; SALVINI, Rogerio; NASCIMENTO, Hugo; SOARES, Fabrizzio. Exploring Biases in Machine Learning Models for Neurodegenerative Diseases Diagnosis Through Gait and Voice Analysis. In: CONCURSO DE TRABALHOS DE INICIAÇÃO CIENTÍFICA - SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 25., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1-6. ISSN 2763-8987. DOI: https://doi.org/10.5753/sbcas_estendido.2025.7498.