Analyzing the Impact of Voice Data Replication on Machine Learning Models for Parkinson’s Disease Diagnosis
Abstract
This study examines the effect of voice data replication on machine learning models for Parkinson’s Disease (PD) diagnosis. Using a dataset of 80 individuals, we compare two evaluation scenarios: one that treats voice samples as independent, and one that takes the source individual into account when composing the training and test sets. Results show that treating replicated samples as independent leads to inflated performance metrics, highlighting the importance of properly handling intra-individual variability in PD diagnosis models.
Keywords:
Parkinson's disease, machine learning, data replication, intra-individual variability, diagnosis
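The two evaluation scenarios named in the abstract can be illustrated with a small synthetic sketch. It is not the authors' pipeline or data: the three recordings per individual, the eight features, the noise levels, the 1-nearest-neighbour classifier, and every function name are assumptions chosen for illustration. The class labels are deliberately unrelated to the features, so any accuracy well above chance can only come from replicates of the same individual landing on both sides of the split.

```python
import random

N_SUBJECTS = 80      # matches the cohort size stated in the abstract
N_REPLICATES = 3     # assumed number of recordings per individual
N_FEATURES = 8       # arbitrary number of synthetic acoustic features


def make_records(rng):
    """Return (subject_id, features, label) tuples; labels are unrelated to features."""
    records = []
    for subject in range(N_SUBJECTS):
        label = subject % 2  # balanced, purely arbitrary class assignment
        centre = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        for _ in range(N_REPLICATES):
            # Replicates of one subject differ only by small noise.
            features = [c + rng.gauss(0.0, 0.01) for c in centre]
            records.append((subject, features, label))
    return records


def one_nn_accuracy(train, test):
    """Accuracy of a 1-nearest-neighbour classifier (squared Euclidean distance)."""
    correct = 0
    for _, x, y in test:
        nearest = min(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[1], x)))
        correct += nearest[2] == y
    return correct / len(test)


def record_wise_split(records, rng, test_fraction=0.25):
    """Scenario 1: shuffle individual recordings, ignoring who produced them."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_wise_split(records, rng, test_fraction=0.25):
    """Scenario 2: hold out whole individuals, so no subject spans both sets."""
    subjects = sorted({r[0] for r in records})
    rng.shuffle(subjects)
    held_out = set(subjects[: int(len(subjects) * test_fraction)])
    train = [r for r in records if r[0] not in held_out]
    test = [r for r in records if r[0] in held_out]
    return train, test


def mean_accuracy(split, repeats=10, seed=0):
    """Average 1-NN accuracy over several random splits of one synthetic dataset."""
    rng = random.Random(seed)
    records = make_records(rng)
    scores = []
    for _ in range(repeats):
        train, test = split(records, rng)
        scores.append(one_nn_accuracy(train, test))
    return sum(scores) / len(scores)


record_acc = mean_accuracy(record_wise_split)
subject_acc = mean_accuracy(subject_wise_split)
print(f"record-wise split:  {record_acc:.2f}")
print(f"subject-wise split: {subject_acc:.2f}")
```

In this toy setting the record-wise score should come out far above the subject-wise one, which should stay near chance. With real data and a library such as scikit-learn, the subject-wise scenario is usually obtained with a group-aware splitter keyed on the individual's identifier.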
References
Braak, H. and Braak, E. (2000). Pathoanatomy of Parkinson’s disease. Journal of Neurology, 247:II3–II10.
Idrisoglu, A., Dallora, A. L., Anderberg, P., and Berglund, J. S. (2023). Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review. J Med Internet Res, 25:e46105.
Naranjo, L., Pérez, C. J., and Martín, J. (2016). Addressing voice recording replications for tracking Parkinson’s disease progression. Medical & Biological Engineering & Computing, 55(3):365–373. Epub 2016 May 21.
Pérez, C. (2016). Parkinson Dataset with replicated acoustic features. UCI Machine Learning Repository. DOI: 10.24432/C5701F.
Published
05/12/2024
How to Cite
CHAGAS, Ana Luísa B.; S. LOBO, Pedro L.; FELIX, Juliana P.; DO NASCIMENTO, Hugo A. D.; SALVINI, Rogerio. Analyzing the Impact of Voice Data Replication on Machine Learning Models for Parkinson’s Disease Diagnosis. In: ESCOLA REGIONAL DE INFORMÁTICA DE GOIÁS (ERI-GO), 12., 2024, Ceres/GO. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 263-264. DOI: https://doi.org/10.5753/erigo.2024.5092.