Meta4BR: Evaluating Metaphorical Fidelity in Portuguese Metaphor Translations by LLMs
Abstract
Metaphors are key elements of human communication, enabling nuanced expression and deeper conceptual understanding. Yet most research on metaphor detection focuses on English corpora and models, and the ability of Large Language Models (LLMs) to process metaphorical language in other languages remains underexplored. This study investigates whether metaphors can be preserved and annotated through translation. Focusing on Brazilian Portuguese, we propose a pipeline based on back-translation and comparison using multiple LLMs. Results from both automatic metrics and human evaluation indicate that conceptual and linguistic metaphors can be effectively translated, suggesting that metaphor resources can be bootstrapped via translation workflows. Nonetheless, challenges remain, particularly the need for extensive manual validation and the risks of cultural bias and semantic drift.
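For illustration only, the sketch below shows one way the back-translate-and-compare loop described in the abstract could be approximated with openly available components that appear in the reference list (NLLB for translation, BERTScore for the comparison step). It is not the authors' pipeline: the model choices, the example sentence, and the use of BERTScore F1 as a drift signal are assumptions made for this sketch.

```python
"""Minimal back-translation-and-compare sketch (not the authors' pipeline).

Assumptions: NLLB (NLLB-Team, 2022) stands in for the LLM translators, and
BERTScore (Zhang et al., 2020) stands in for the automatic comparison step.
"""
from transformers import pipeline
from bert_score import score

# One translation pipeline per direction, using NLLB language codes.
en_to_pt = pipeline("translation", model="facebook/nllb-200-distilled-600M",
                    src_lang="eng_Latn", tgt_lang="por_Latn")
pt_to_en = pipeline("translation", model="facebook/nllb-200-distilled-600M",
                    src_lang="por_Latn", tgt_lang="eng_Latn")

source = "Her words cut deeper than a knife."  # example linguistic metaphor
forward = en_to_pt(source)[0]["translation_text"]   # EN -> pt-BR
back = pt_to_en(forward)[0]["translation_text"]     # pt-BR -> EN (back-translation)

# Compare the back-translation with the original sentence; a low F1 may
# signal that the metaphor was paraphrased away or drifted semantically.
_, _, f1 = score([back], [source], lang="en")
print(f"pt-BR: {forward}\nback:  {back}\nBERTScore F1: {f1.item():.3f}")
```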
References
Alibaba (2025). Qwen2.5 Technical Report.
Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb):1137–1155.
Boisson, J., Espinosa-Anke, L., and Camacho-Collados, J. (2023). Construction Artifacts in Metaphor Identification Datasets. In Bouamor, H., Pino, J., and Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6581–6590, Singapore. Association for Computational Linguistics.
Boisson, J., Mehmood, A., and Camacho-Collados, J. (2025). METAPHORSHARE: A dynamic collaborative repository of open metaphor datasets. In Dziri, N., Ren, S. X., and Diao, S., editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations), pages 509–521, Albuquerque, New Mexico. Association for Computational Linguistics.
Boldarine, A. C. (2024). Uso de metáforas por aprendizes brasileiros bilíngues de inglês.
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901.
Castilho, S. and Caseli, H. M. (2024). Tradução automática: Abordagens e avaliação. In Caseli, H. M. and Nunes, M. G. V., editors, Processamento de Linguagem Natural: Conceitos, Técnicas e Aplicações em Português, chapter 23. BPLN, 3rd edition.
Cui, M., Gao, P., Liu, W., Luan, J., and Wang, B. (2025). Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study. In Chiruzzo, L., Ritter, A., and Wang, L., editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5420–5443, Albuquerque, New Mexico. Association for Computational Linguistics.
Deignan, A. (2016). From linguistic to conceptual metaphors. In The Routledge handbook of metaphor and language, pages 120–134. Routledge.
Freitag, M., Foster, G. F., Grangier, D., Ratnakar, V., Tan, Q., and Macherey, W. (2021). Experts, errors, and context: A large-scale study of human evaluation for machine translation. Trans. Assoc. Comput. Linguistics, 9:1460–1474.
Gemma-Team (2025). Gemma 3 technical report.
Glucksberg, S. and Keysar, B. (1993). How metaphors work. Metaphor and thought, 2:401–424.
Goatly, A. (1997). The language of metaphors. Routledge.
Google-DeepMind (2024). Introducing Gemini 2.0: our new AI model for the agentic era. [link]. Accessed June 10, 2025.
Guerreiro, N. M., Rei, R., Stigt, D. v., Coheur, L., Colombo, P., and Martins, A. F. T. (2024). xCOMET: Transparent machine translation evaluation through fine-grained error detection. Transactions of the Association for Computational Linguistics, 12:979–995.
Joseph, R., Liu, T., Ng, A. B., See, S., and Rai, S. (2023). NewsMet : A ‘do it all’ dataset of contemporary metaphors in news headlines. In Rogers, A., Boyd-Graber, J., and Okazaki, N., editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 10090–10104, Toronto, Canada. Association for Computational Linguistics.
Joshi, P., Santy, S., Budhiraja, A., Bali, K., and Choudhury, M. (2020). The state and fate of linguistic diversity and inclusion in the NLP world. In Jurafsky, D., Chai, J., Schluter, N., and Tetreault, J., editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6282–6293, Online. Association for Computational Linguistics.
Kabra, A., Liu, E., Khanuja, S., Aji, A. F., Winata, G., Cahyawijaya, S., Aremu, A., Ogayo, P., and Neubig, G. (2023). Multi-lingual and Multi-cultural Figurative Language Understanding. In Rogers, A., Boyd-Graber, J., and Okazaki, N., editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 8269–8284, Toronto, Canada. Association for Computational Linguistics.
Kövecses, Z. (2005). Metaphor in culture: Universality and variation. Cambridge university press.
Kövecses, Z. (2010). Metaphor, language, and culture. DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada, 26:739–757.
Krennmayr, T. and Steen, G. (2017). VU Amsterdam Metaphor Corpus, pages 1053–1071. Springer Netherlands, Dordrecht.
Labarta Postigo, M. (2023). ‘Life is baseball’ vs. ‘A vida é futebol’: la traducción de metáforas deportivas del inglés al portugués de Portugal y al portugués de Brasil. Sendebar, 34:147–161.
Lakoff, G. and Johnson, M. (2008). Metaphors we live by. University of Chicago press.
Lin, C.-Y. (2004). ROUGE: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.
Liu, T., Wang, Y., Wang, Z., and Yu, H. (2025). Exploring cognitive effort and divergent thinking in metaphor translation using eye-tracking and EEG technology. Scientific Reports, 15(1):18177.
Ma, W. (2025). Optimizing English translation processing of conceptual metaphors in big data technology texts. Advances in Continuous and Discrete Models, 2025(1):36.
Massey, G. (2021). Re-framing conceptual metaphor translation research in the age of neural machine translation: Investigating translators’ added value with products and processes. Training, Language and Culture, 5(1):37–56.
MetaAI (2024). The Llama 3 Herd of Models.
MistralAI (2024). Ministral-8B. Accessed June 9, 2025.
Morrison, J., Na, C., Fernandez, J., Dettmers, T., Strubell, E., and Dodge, J. (2025). Holistically evaluating the environmental impact of creating language models. In The 13th International Conference on Learning Representations.
NLLB-Team, Costa-jussà, M. R., Cross, J., Çelebi, O., Elbayad, M., Heafield, K., Heffernan, K., Kalbassi, E., Lam, J., Licht, D., Maillard, J., Sun, A., Wang, S., Wenzek, G., Youngblood, A., Akula, B., Barrault, L., Gonzalez, G. M., Hansanti, P., Hoffman, J., Jarrett, S., Sadagopan, K. R., Rowe, D., Spruit, S., Tran, C., Andrews, P., Ayan, N. F., Bhosale, S., Edunov, S., Fan, A., Gao, C., Goswami, V., Guzmán, F., Koehn, P., Mourachko, A., Ropers, C., Saleem, S., Schwenk, H., and Wang, J. (2022). No Language Left Behind: Scaling Human-Centered Machine Translation.
OpenAI (2024). GPT-4o System Card.
Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). Bleu: a Method for Automatic Evaluation of Machine Translation. In Isabelle, P., Charniak, E., and Lin, D., editors, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
Postigo, M. L. (2024). ChatGPT and MT-Systems: advantages and limitations when translating English to Spanish and Portuguese. Lengua y Habla, 28(0):369–390.
Ranathunga, S., Lee, E.-S. A., Prifti Skenduli, M., Shekhar, R., Alam, M., and Kaur, R. (2023). Neural machine translation for low-resource languages: A survey. ACM Computing Surveys, 55(11):1–37.
Rei, R., De Souza, J. G., Alves, D., Zerva, C., Farinha, A. C., Glushkova, T., Lavie, A., Coheur, L., and Martins, A. F. (2022). COMET-22: Unbabel-IST 2022 submission for the metrics shared task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 578–585.
Rei, R., Guerreiro, N. M., Pombal, J., van Stigt, D., Treviso, M., Coheur, L., C. de Souza, J. G., and Martins, A. (2023). Scaling up CometKiwi: Unbabel-IST 2023 submission for the quality estimation shared task. In Koehn, P., Haddow, B., Kocmi, T., and Monz, C., editors, Proceedings of the Eighth Conference on Machine Translation, pages 841–848, Singapore. Association for Computational Linguistics.
Sanchez-Bayona, E. and Agerri, R. (2024). Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation.
Sellam, T., Das, D., and Parikh, A. (2020). BLEURT: Learning robust metrics for text generation. In Jurafsky, D., Chai, J., Schluter, N., and Tetreault, J., editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7881–7892, Online. Association for Computational Linguistics.
Shao, Y., Yao, X., Qu, X., Lin, C., Wang, S., Huang, W., Zhang, G., and Fu, J. (2024). CMDAG: A Chinese metaphor dataset with annotated grounds as CoT for boosting metaphor generation. In Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., and Xue, N., editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3357–3366, Torino, Italia. ELRA and ICCL.
Shrout, P. E. and Fleiss, J. L. (1979). Intraclass correlations: uses in assessing rater reliability. Psychological bulletin, 86(2):420.
Shutova, E. (2017). Annotation of linguistic and conceptual metaphor. Handbook of linguistic annotation, pages 1073–1100.
Steen, G. (2010). A Method for Linguistic Metaphor Identification: From MIP to MIPVU.
Tiedemann, J. and Thottingal, S. (2020). OPUS-MT – building open translation services for the world. In Martins, A., Moniz, H., Fumega, S., Martins, B., Batista, F., Coheur, L., Parra, C., Trancoso, I., Turchi, M., Bisazza, A., Moorkens, J., Guerberof, A., Nurminen, M., Marg, L., and Forcada, M. L., editors, Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 479–480, Lisboa, Portugal. European Association for Machine Translation.
Tong, X., Choenni, R., Lewis, M., and Shutova, E. (2024). Metaphor Understanding Challenge Dataset for LLMs. In Ku, L.-W., Martins, A., and Srikumar, V., editors, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3517–3536, Bangkok, Thailand. Association for Computational Linguistics.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
Wang, S., Zhang, G., Wu, H., Loakman, T., Huang, W., and Lin, C. (2024). MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language. In Al-Onaizan, Y., Bansal, M., and Chen, Y.-N., editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 11343–11358, Miami, Florida, USA. Association for Computational Linguistics.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., and Rush, A. (2020). Transformers: State-of-the-Art Natural Language Processing. In Liu, Q. and Schlangen, D., editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.
Xia, M., Kong, X., Anastasopoulos, A., and Neubig, G. (2019). Generalized data augmentation for low-resource translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5786–5796.
Zayed, O., McCrae, J. P., and Buitelaar, P. (2020). Figure Me Out: A Gold Standard Dataset for Metaphor Interpretation. In Calzolari, N., Béchet, F., Blache, P., Choukri, K., Cieri, C., Declerck, T., Goggi, S., Isahara, H., Maegaard, B., Mariani, J., Mazo, H., Moreno, A., Odijk, J., and Piperidis, S., editors, Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5810–5819, Marseille, France. European Language Resources Association.
Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., and Artzi, Y. (2020). BERTScore: Evaluating Text Generation with BERT. In International Conference on Learning Representations.
Published
2025-09-29
How to Cite
STELLET, Luisa; LEITE, Isabella; ASSIS, Gabriel; PAES, Aline. Meta4BR: Evaluating Metaphorical Fidelity in Portuguese Metaphor Translations by LLMs. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 16., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 441-454. DOI: https://doi.org/10.5753/stil.2025.37845.
