J. Silva et al., "Evaluating Transformer-Based Architectures for Simultaneous Audio Speech Transcription and Background Audio Captioning," in Proceedings of the 52nd Integrated Software and Hardware Seminar, Maceió/AL, 2025, pp. 633-644, doi: https://doi.org/10.5753/semish.2025.9474.