A Comparative Analysis of Denoising Methods for Deep Learning-Based Audio Event Detection in Noisy Agricultural Environments
Abstract
This study investigates whether traditional denoising improves deep learning models for Audio Event Detection (AED) in real-world noisy environments such as livestock farms. We evaluated three denoising algorithms (Spectral Subtraction, Adaptive Kalman Filter, and SD-ROM) on the aSwine dataset using state-of-the-art models, including Pretrained Audio Neural Networks (PANNs) and the Audio Spectrogram Transformer (AST). Contrary to conventional wisdom, all denoising methods proved detrimental, with the worst case showing a 65.8% decrease in mean Average Precision (mAP). We conclude that powerful models learn to be inherently noise-robust, making robust architectures a superior strategy to denoising as a preprocessing step.
Keywords:
Precision Livestock Farming, Computational Bioacoustics, End-to-End Models, Feature Distortion, Non-stationary Noise
References
Azarang, A. and Kehtarnavaz, N. (2020). A review of multi-objective deep learning speech denoising methods. Speech Communication, 122:1–10.
Boll, S. (1979). Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(2):113–120.
Chen, R., Ghobakhlou, A., and Narayanan, A. (2024). Interpreting CNN models for musical instrument recognition using multi-spectrogram heatmap analysis: a preliminary study. Frontiers in Artificial Intelligence, 7:1499913.
Ding, B., Zhang, T., Wang, C., Liu, G., Liang, J., Hu, R., Wu, Y., and Guo, D. (2024). Acoustic scene classification: A comprehensive survey. Expert Systems with Applications, 238:121902.
Ferahtia, J., Djarfour, N., Baddari, K., and Guérin, R. (2009). Application of signal dependent rank-order mean filter to the removal of noise spikes from 2D electrical resistivity imaging data. Near Surface Geophysics, 7(3):159–169.
Gong, Y., Chung, Y.-A., and Glass, J. (2021). AST: Audio Spectrogram Transformer. In Interspeech 2021, pages 571–575. ISCA.
Hardjanto, V. L. and Wahyono (2025). Audio Enhancement for Gamelan Instrument Recognition using Spectral Subtraction. Engineering, Technology & Applied Science Research, 15(2):22042–22048.
Kong, Q., Cao, Y., Iqbal, T., Wang, Y., Wang, W., and Plumbley, M. D. (2020). PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28:2880–2894.
Lostanlen, V., Salamon, J., Farnsworth, A., Kelling, S., and Bello, J. P. (2019). Robust sound event detection in bioacoustic sensor networks. PLOS ONE, 14(10):e0214168.
Pan, W., Li, H., Zhou, X., Jiao, J., Zhu, C., and Zhang, Q. (2024). Research on Pig Sound Recognition Based on Deep Neural Network and Hidden Markov Models. Sensors, 24(4):1269.
Permana Putra, F., Kartika, K., Sitti Nurfebruary, N., Misriana, M., G. S, K., and Siregar, R. H. (2024). Matlab Simulation Using Kalman Filter Algorithm to Reduce Noise in Voice Signals. Journal of Renewable Energy, Electrical, and Computer Engineering, 4(1):23–31.
Rao, G., Babu, D. R., Kanth, P. K., Vinay, B., and Nikhil, V. (2021). Reduction of Impulsive Noise from Speech and Audio Signals by using Sd Rom Algorithm. International Journal of Recent Technology and Engineering (IJRTE), 10(1):265–268.
Souza, A. M., Kobayashi, L. L., Tassoni, L. A., Garbossa, C. A. P., Ventura, R. V., and Machado De Sousa, E. P. (2025). Deep learning solutions for audio event detection in a swine barn using environmental audio and weak labels. Applied Intelligence, 55(7):668.
Turpault, N., Wisdom, S., Erdogan, H., Hershey, J. R., Serizel, R., Fonseca, E., Seetharaman, P., and Salamon, J. (2020). Improving Sound Event Detection In Domestic Environments Using Sound Separation. In DCASE Workshop 2020 - Detection and Classification of Acoustic Scenes and Events, Tokyo/Virtual, Japan.
Published
2025-09-29
How to Cite
MOREIRA SOUZA, André; MOREIRA, Guilherme Augusto; PULCINELLI, Lucas Eduardo Gulka. A Comparative Analysis of Denoising Methods for Deep Learning-Based Audio Event Detection in Noisy Agricultural Environments. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 942-948. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247812.
