João Silva et al. 2025. Evaluating Transformer-Based Architectures for Simultaneous Audio Speech Transcription and Background Audio Captioning. In Proceedings of the 52nd Integrated Software and Hardware Seminar, July 20, 2025, Maceió/AL, Brasil. SBC, Porto Alegre, Brasil, 633-644. DOI: https://doi.org/10.5753/semish.2025.9474.