Evaluation of models for replay attack detection using different databases
Abstract
A replay attack is a form of speech spoofing used in attempts to defeat speaker authentication. Deep neural networks have been proposed as methods for detecting fraudulent audio. For these models to be used in real applications, good performance during training is not enough: the resulting model is also expected to perform well on databases other than the one used for training. In this work, two approaches were evaluated on three public databases, and the results indicate that the models generalize poorly.