Detection of texts generated by LLMs in Portuguese

Guilherme S. M. de C. Paes; Arthur Negrão de F. M. C.; Guilherme Silva; Ederson Júnior; Eduardo Luz; Pedro Silva

doi:10.5753/eniac.2025.13952

Guilherme S. M. de C. Paes UFOP
Arthur Negrão de F. M. C. UFOP
Guilherme Silva UFOP
Ederson Júnior UFOP
Eduardo Luz UFOP
Pedro Silva UFOP

Abstract

As generative Artificial Intelligence (AI) models become more accessible and widely used, concerns about their misuse have intensified. Although these models were originally developed to assist with everyday tasks, malicious use can contribute to plagiarism and the spread of misinformation. Because Large Language Models (LLMs) are recent and highly capable, the texts they generate remain difficult to detect. In this context, this work proposes a Portuguese-language dataset containing human-authored texts, AI-generated texts, and human texts rewritten by LLMs. Additionally, five classification models were developed based on architectures from the LLaMA and BERT families, along with a recurrent neural network using bidirectional LSTM layers. On the defined test set, the proposed classifiers achieved accuracies of up to 98.18% in binary classification (LLM-authored or not) and 97.7% in the three-class classification task (human, AI-generated, and AI-rewritten).
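
The relationship between the two evaluation settings can be illustrated with a minimal sketch. The label names, the toy predictions, and in particular the rule that rewritten texts count as LLM-authored in the binary task are assumptions made for illustration; the abstract does not state how the binary labels were derived.

```python
# Hypothetical label names for the three classes named in the abstract.
LABELS = ("human", "ai_generated", "ai_rewritten")


def to_binary(label: str) -> str:
    """Collapse three classes into two.

    Assumption: a human text rewritten by an LLM counts as LLM-authored.
    The source does not confirm this mapping.
    """
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    return "human" if label == "human" else "llm"


def accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data, invented for this example only (not results from the paper).
y_true = ["human", "ai_generated", "ai_rewritten", "human", "ai_rewritten"]
y_pred = ["human", "ai_rewritten", "ai_rewritten", "human", "ai_generated"]

three_class_acc = accuracy(y_true, y_pred)
binary_acc = accuracy(
    [to_binary(label) for label in y_true],
    [to_binary(label) for label in y_pred],
)

print(f"three-class accuracy: {three_class_acc:.2f}")
print(f"binary accuracy:      {binary_acc:.2f}")
```

In this toy data, the only errors are confusions between the two LLM classes, so they vanish once the classes are merged. Under the assumed mapping, this is one reason a binary score can exceed a three-class score on the same predictions.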

