Analyzing Fine-Tuning for Cross-Language Knowledge Transfer: A Study for the Portuguese Language

  • Yuri Hughes (UFBA)
  • Marlo Souza (UFBA)

Abstract


Large Language Models (LLMs) play a central role in the modern NLP landscape due to their consistently strong results across many tasks in the literature. However, training these models carries high associated costs. Given the scarcity of data for tasks in low-resource languages, the literature has proposed combining multilingual technology with cross-lingual knowledge transfer techniques. This research explores model fine-tuning for cross-lingual knowledge transfer to Portuguese across several tasks, evaluating the trade-offs between data volume, computational resources, training time, and model performance.
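Since the study centers on fine-tuning multilingual models for cross-lingual transfer to Portuguese, the minimal sketch below illustrates the general recipe with Hugging Face Transformers: a multilingual encoder fine-tuned on a small Portuguese classification set. The model name, toy examples, and hyperparameters are illustrative assumptions and do not reflect the paper's actual experimental setup.

```python
# Hypothetical sketch: fine-tuning a multilingual encoder on a tiny
# Portuguese classification set to exploit cross-lingual transfer.
# Model choice, toy data, and hyperparameters are assumptions for
# illustration only, not the paper's reported configuration.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "bert-base-multilingual-cased"  # assumed multilingual backbone

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Placeholder Portuguese examples standing in for a real task corpus.
train_data = Dataset.from_dict({
    "text": ["O filme foi excelente.", "O serviço foi péssimo."],
    "label": [1, 0],
})

def tokenize(batch):
    # Tokenize the Portuguese text into fixed-length inputs for the encoder.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=2e-5,  # common fine-tuning default, not the paper's value
)

Trainer(model=model, args=args, train_dataset=train_data).train()
```

The same loop applies to the other tasks the abstract alludes to by swapping the task head (e.g., token classification) and the training corpus; parameter-efficient variants such as LoRA adapters can be layered on top when compute is the binding constraint.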

Published
2025-09-29

HUGHES, Yuri; SOUZA, Marlo. Analyzing Fine-Tuning for Cross-Language Knowledge Transfer: A Study for the Portuguese Language. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 16., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 693-697. DOI: https://doi.org/10.5753/stil.2025.37873.