Gender bias in GPT-3.5 turbo machine translation: evaluating the English-Portuguese language pair

Tayane Arantes Soares; Yohan Bonescki Gumiel; Rafael Junqueira; Tácio Gomes; Adriana Pagano

doi:10.5753/stil.2023.234186

Tayane Arantes Soares UFMG http://orcid.org/0009-0002-8315-4090
Yohan Bonescki Gumiel PUC-PR https://orcid.org/0000-0001-8239-2930
Rafael Junqueira UFMG
Tácio Gomes UFMG
Adriana Pagano UFMG https://orcid.org/0000-0002-3150-3503

Abstract

This study evaluated the quality of machine translations generated by GPT-3.5 turbo. We translated into Portuguese the WinoMT challenge test set, which assesses how well machine translation models render the grammatical gender of nouns denoting professions. We adapted the automatic evaluation code developed by Stanovsky et al. (2019) to evaluate the resulting translations. The results indicate that GPT-3.5 turbo tends to promote gender bias when translating professions.

Keywords: Evaluation methods for NLP tasks, Machine Translation, Large language models

References

Bender, E. M., Gebru, T., McMillan-Major, A., Shmitchell, S. (2021) On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: Proceedings of the 2021 ACM conference on fairness, accountability, and transparency. p. 610-623. https://doi.org/10.1145/3442188.3445922

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901. https://dl.acm.org/doi/abs/10.5555/3495724.3495883

Caseli, H. de M. (2017) Tradução Automática: estratégias e limitações. Domínios de Lingu@gem, v. 11, n. 5, p. 1782-1796. https://doi.org/10.14393/DL32-v11n5a2017-21

Castilho, S., Mallon, C., Meister, R., Yue, S. (2023) Do online machine translation systems care for context? What about a GPT model? In: 24th Annual Conference of the European Association for Machine Translation (EAMT 2023), 12-15 June 2023, Tampere, Finland. (In Press) https://doras.dcu.ie/28297/

Cohen, J. (1960) A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, v. 20, n. 1, p. 37–46. https://doi.org/10.1177/001316446002000104

Devinney, H., Björklund, J., Björklund, H. (2022) Theories of “Gender” in NLP Bias Research. arXiv:2205.02526 [cs].

Halliday, M. A. K. (1978) Language as social semiotic: The social interpretation of language and meaning. London: Edward Arnold.

Jakobson, R. (1959) On Linguistic Aspects of Translation. In: Brower, R. A. (ed.). On translation. Cambridge, USA: Harvard University Press. https://doi.org/10.4159/harvard.9780674731615.c18

Kocmi, T., Federmann, C. (2023). Large language models are state-of-the-art evaluators of translation quality. arXiv preprint arXiv:2302.14520. https://doi.org/10.48550/arXiv.2302.14520 https://arxiv.org/abs/2302.14520

Levesque, H. J. (2011) The Winograd schema challenge. In: AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.

Lewis, M., Lupyan, G. (2020) Gender stereotypes are reflected in the distributional structure of 25 languages. Nature Human Behaviour, v. 4, n. 10, p. 1021-1028. https://doi.org/10.1038/s41562-020-0918-6 https://www.nature.com/articles/s41562-020-0918-6

Popović, M., Castilho, S. (2019). Challenge Test Sets for MT Evaluation. In Proceedings of Machine Translation Summit XVII: Tutorial Abstracts, Dublin, Ireland. European Association for Machine Translation. https://aclanthology.org/W19-7602

Rudinger, R., Naradowsky, J., Leonard, B., Van Durme, B. (2018) Gender Bias in Coreference Resolution. arXiv:1804.09301 [cs]. https://doi.org/10.18653/v1/N18-2002 https://aclanthology.org/N18-2002

Savoldi, B., Gaido, M., Bentivogli, L., Negri, M., Turchi, M. (2021) Gender Bias in Machine Translation. Transactions of the Association for Computational Linguistics, v. 9, p. 845–874. https://doi.org/10.1162/tacl_a_00401

Stanovsky, G., Smith, N., Zettlemoyer, L. (2019). Evaluating Gender Bias in Machine Translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1679–1684, Florence, Italy. https://doi.org/10.18653/v1/P19-1164 https://aclanthology.org/P19-1164

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., Chang, K. (2018) Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods. 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Proceedings [...], volume 2 (Short Papers). https://doi.org/10.18653/v1/N18-2003 https://aclanthology.org/N18-2003