Classification of Anurans Based on Vocalizations for Pervasive Environmental Monitoring
Abstract
In this work, we developed an automatic system for classifying anurans (frogs and toads) based on their vocalizations. In the first stage, each bioacoustic signal is segmented into smaller units called “syllables”. A preprocessing step, consisting of a pre-emphasis filter and a Hamming window, then prepares the signal for feature extraction. We used Mel-Frequency Cepstral Coefficients (MFCCs) to represent the acoustic signals and evaluated two classifiers: k-nearest neighbors (kNN) and support vector machines (SVM). In our experiments, we achieved a classification rate of 98.97%, which shows that MFCCs, usually used in speech recognition, can also be used to recognize anuran species. Using SVM and MFCCs, the anuran recognition rate improved by 16.09% compared with results found in the literature.
References
Akyildiz, I. F., Su, W., Sankarasubramaniam, Y., and Cayirci, E. (2002). Wireless sensor networks: a survey. Computer Networks, 38(4):393–422.
Bee, M. A. and Micheyl, C. (2008). The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it? Journal of Comparative Psychology, 122(3):235–251.
Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152, New York, USA. ACM.
Cai, J., Ee, D., Pham, B., Roe, P., and Zhang, J. (2007). Sensor Network for the Monitoring of Ecosystem: Bird Species Recognition. In 3rd International Conference on Intelligent Sensors, Sensor Networks and Information, pages 293–298. IEEE.
Carey, C., Heyer, W. R., Wilkinson, J., Alford, R. A., Arntzen, J. W., Halliday, T., Hungerford, L., Lips, K. R., Middleton, E. M., Orchard, S. A., and Rand, A. S. (2001). Amphibian Declines and Environmental Change: Use of Remote-Sensing Data to Identify Environmental Correlates. Conservation Biology, 15(4):903–913.
Clemins, P. J. (2005). Automatic classification of animal vocalizations. PhD thesis, Marquette University, Wisconsin.
Collins, J. P. and Storfer, A. (2003). Global amphibian declines: sorting the hypotheses. Diversity and Distributions, 9:89–98.
Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21–27.
Davis, S. and Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4):357–366.
Deller, J., Hansen, J., and Proakis, J. (1993). Discrete-time processing of speech signals, volume 1. IEEE.
Haddad, C. (2005). Guia Sonoro dos Anfíbios Anuros da Mata Atlântica.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA Data Mining Software: An Update.
Harma, A. (2003). Automatic identification of bird species based on sinusoidal modeling of syllables. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 545–548. IEEE.
Hu, W., Tran, V. N., Bulusu, N., Chou, C. T., Jha, S., and Taylor, A. (2005). The design and evaluation of a hybrid sensor network for cane-toad monitoring. In Fourth International Symposium on Information Processing in Sensor Networks, pages 503–508.
Huang, C.-J., Yang, Y.-J., Yang, D.-X., and Chen, Y.-J. (2009). Frog classification using machine learning techniques. Expert Systems with Applications, 36(2):3737–3743.
Márquez, R., Riva, I., Matheu, B., and Matheu, E. (2002). Sounds of Frogs and Toads of Bolivia.
Marsland, S. (2009). Machine Learning: An Algorithmic Perspective, volume 1. CRC Press, Palmerston North, New Zealand.
Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
Rabiner, L. R. and Schafer, R. W. (2007). Introduction to Digital Speech Processing. Foundations and Trends in Signal Processing, 1:1–194.
Riede, K. (1993). Monitoring biodiversity: analysis of Amazonian rainforest sounds. Ambio, 22(8):546–548.
Taylor, A., Watson, G., Grigg, G., and McCallum, H. (1996). Monitoring Frog Communities: An Application of Machine Learning. In Proceedings of the 8th Innovative Applications of Artificial Intelligence Conference.
Published
2011-07-19
How to Cite
COLONNA, Juan Gabriel; NAKAMURA, Eduardo Freire; SANTOS, Eulanda Miranda dos. Classification of Anurans Based on Vocalizations for Pervasive Environmental Monitoring. In: PROCEEDINGS OF BRAZILIAN SYMPOSIUM ON UBIQUITOUS AND PERVASIVE COMPUTING (SBCUP), 3., 2011, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 1093-1102. ISSN 2595-6183.
