Evaluating the Performance of Portuguese Audio Transcription Tools for Web Data Analysis

  • Jonatas Santos UFMG
  • Marcelo M. R. Araujo UFMG
  • Josemar Caetano UFMG
  • Yago Santos UFV
  • Julio C. S. Reis UFV
  • Ana P. C. Silva UFMG
  • Jussara M. Almeida UFMG

Abstract


In this work, we evaluate the performance of Portuguese audio transcription tools used for Web data analysis. To this end, we explore a publicly accessible dataset and analyze the tools along two main dimensions: the total number of failures and accuracy. Our findings may help guide researchers in choosing audio transcription methods for studies focused on the Portuguese language.
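The two dimensions named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that a "failure" is a missing or empty transcription and that accuracy can be approximated by the string similarity ratio from Python's standard `difflib` module. The function name `evaluate` and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def evaluate(pairs):
    """Return (failures, mean_similarity) for (reference, hypothesis) pairs.

    Assumptions (not taken from the paper): a "failure" is a missing or
    empty transcription, and "accuracy" is approximated by difflib's
    similarity ratio, which lies in [0, 1].
    """
    failures = 0
    scores = []
    for reference, hypothesis in pairs:
        # Count tools that returned nothing usable as failures.
        if not hypothesis or not hypothesis.strip():
            failures += 1
            continue
        # Compare case-insensitively; 1.0 means identical strings.
        ratio = SequenceMatcher(None, reference.lower(), hypothesis.lower()).ratio()
        scores.append(ratio)
    # Average only over transcriptions that were actually produced.
    mean_similarity = sum(scores) / len(scores) if scores else 0.0
    return failures, mean_similarity


# Hypothetical sample: one exact transcription and two failed ones.
sample = [
    ("bom dia a todos", "bom dia a todos"),
    ("bom dia a todos", ""),
    ("bom dia a todos", None),
]
failures, mean_similarity = evaluate(sample)
print(failures, mean_similarity)
```

Keeping failures separate from the similarity average is a design choice in this sketch: a tool that often returns nothing is not rewarded or penalized through the accuracy score, so the two dimensions can be read independently.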

Published
2023-08-06
SANTOS, Jonatas; ARAUJO, Marcelo M. R.; CAETANO, Josemar; SANTOS, Yago; REIS, Julio C. S.; SILVA, Ana P. C.; ALMEIDA, Jussara M. Avaliação do Desempenho de Ferramentas de Transcrição de Áudio em Português para Análise de Dados da Web. In: BRAZILIAN WORKSHOP ON SOCIAL NETWORK ANALYSIS AND MINING (BRASNAM), 12., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 228-233. ISSN 2595-6094. DOI: https://doi.org/10.5753/brasnam.2023.230512.