Silva, João, Francisco de Assis Boldt, Luis A. Souza Jr, Mariella Berger, Anselmo Frizera, Alberto F. De Souza, Thiago Oliveira-Santos, and Claudine Badue. "Evaluating Transformer-Based Architectures for Simultaneous Audio Speech Transcription and Background Audio Captioning." Proceedings of the 52nd Integrated Software and Hardware Seminar, Maceió/AL, 2025. SBC, 2025, pp. 633-644.