Gloss-to-Text Translation for Libras and Portuguese: Evaluating Pretrained and Fine-Tuned Encoder-Decoder Models

João Pedro Tomaszewski; Brenda S. Santana; Antonielle Martins; Guilherme Corrêa (UFPel)

DOI: https://doi.org/10.5753/wics.2026.23865

Abstract

We evaluate encoder-decoder models for Gloss-to-Text translation from Brazilian Sign Language (Libras) glosses into Portuguese using a corpus derived from Libras-UFPel. The evaluated models are mT5-small, mT5-base, Flan-T5-base, and PTT5-v2-base. Experiments were conducted with 5-fold cross-validation and evaluated using BLEU and chrF. All models improved after supervised fine-tuning, with PTT5-v2-base achieving the best overall performance. The results suggest that Portuguese-specialized encoder-decoder models are a promising direction for Gloss-to-Text translation in low-resource settings.
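
The evaluation protocol named in the abstract (5-fold cross-validation scored with a character n-gram metric) can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the function names `k_fold_indices` and `simple_chrf` are hypothetical, the chrF variant is a simplified sentence-level version (whitespace ignored, uniform averaging over n-gram orders, beta = 2), and a real experiment would more likely rely on an established metric implementation.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle indices once, then yield (train, test) index lists per fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding assigns every index to exactly one fold.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def char_ngrams(text, n):
    """Count character n-grams after dropping whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: F-beta over averaged char n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Invented toy sentences, not taken from the Libras-UFPel corpus.
    reference = "eu gosto de café"
    hypothesis = "eu gosto café"
    print(f"chrF (simplified): {simple_chrf(hypothesis, reference):.2f}")
    for fold, (train, test) in enumerate(k_fold_indices(10, k=5)):
        print(f"fold {fold}: {len(train)} train / {len(test)} test")
```

In a full experiment, each fold would fine-tune a model on the training indices and score its outputs on the held-out indices, with the per-fold scores then averaged.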
