Toxic Text Classification in Portuguese: Is LLaMA 3.1 8B All You Need?

Abstract

Recognizing toxic and hate speech on social media platforms is important due to the significant risks posed to users and the digital ecosystem. Current state-of-the-art models, such as BERTimbau, have set benchmarks for Portuguese text classification, yet challenges remain in accurately detecting toxic content. This paper investigates the effectiveness of fine-tuning a smaller, open-source, decoder-only model, LLaMA 3.1 8B quantized to 4 bits, for this task. We propose an iterative prompt-evolution method to optimize the model's performance. Our results show that fine-tuning raises the LLaMA model's F1-score from 0.61 to 0.75, surpassing BERTimbau in precision and matching the performance of GPT-4o mini. However, the approach depends on the quality of the language models used for prompt evolution, highlighting the need for further research to improve robustness in this area.
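The headline gain (F1 from 0.61 to 0.75) comes from parameter-efficient fine-tuning of the quantized model. Below is a minimal sketch of what 4-bit QLoRA fine-tuning of LLaMA 3.1 8B for binary toxic-text classification can look like with Hugging Face transformers and peft; the checkpoint name, LoRA hyperparameters, and prompt template are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: 4-bit QLoRA fine-tuning of LLaMA 3.1 8B for toxic/non-toxic classification.
# Checkpoint, hyperparameters, and prompt template are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed base checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # frozen base weights stored in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# QLoRA: only low-rank adapters on the attention projections are trained.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

def build_example(text: str, label: int) -> str:
    """Format one (text, label) pair as an instruction-style training string."""
    prompt = (
        "Classifique o texto a seguir como 'tóxico' ou 'não tóxico'.\n"
        f"Texto: {text}\nResposta:"
    )
    return prompt + (" tóxico" if label == 1 else " não tóxico")
```

The prompt-evolution step can be read as a simple generation-and-selection loop: a language model proposes rewrites of the current best prompt, each candidate is scored on a labelled development set, and the candidate with the highest F1 survives to the next generation. The sketch below assumes two hypothetical callables, propose_variants (the LLM that mutates prompts) and classify (the classifier returning a 0/1 label); only the loop structure reflects the idea described in the abstract.

```python
# Sketch: iterative prompt evolution driven by dev-set F1.
from sklearn.metrics import f1_score

def evolve_prompt(seed_prompt, dev_texts, dev_labels,
                  propose_variants, classify,
                  generations=5, population=8):
    best_prompt, best_f1 = seed_prompt, 0.0
    for _ in range(generations):
        # Ask an LLM to mutate/rewrite the current best prompt (hypothetical helper).
        candidates = propose_variants(best_prompt, n=population)
        for prompt in candidates + [best_prompt]:
            preds = [classify(prompt, text) for text in dev_texts]  # 0/1 predictions
            score = f1_score(dev_labels, preds)
            if score > best_f1:
                best_prompt, best_f1 = prompt, score
    return best_prompt, best_f1
```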

Keywords: Toxic Text Classification, LLaMA, LLM

References

BehnamGhader, P., Adlakha, V., Mosbach, M., Bahdanau, D., Chapados, N., and Reddy, S. (2024). LLM2Vec: Large language models are secretly powerful text encoders. [link] DOI: 10.48550/arXiv.2404.05961

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901. [link] DOI: 10.48550/arXiv.2005.14165

da Rocha Junqueira, J., Junior, C. L., Silva, F. L. V., Côrrea, U. B., and de Freitas, L. A. (2023). Albertina in action: An investigation of its abilities in aspect extraction, hate speech detection, irony detection, and question-answering. In Anais do XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 146–155. SBC. [link] DOI: 10.5753/stil.2023.234159

Dettmers, T., Pagnoni, A., Holtzman, A., and Zettlemoyer, L. (2023). QLoRA: Efficient finetuning of quantized LLMs. [link]

dos Santos, W. R. and Paraboni, I. (2023). Predição de transtorno depressivo em redes sociais: BERT supervisionado ou ChatGPT zero-shot? In Anais do XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 11–21. SBC. [link] DOI: 10.5753/stil.2023.233275

Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Yang, A., Fan, A., et al. (2024). The Llama 3 herd of models. [link] DOI: 10.48550/arXiv.2407.21783

Guo, Q., Wang, R., Guo, J., Li, B., Song, K., Tan, X., Liu, G., Bian, J., and Yang, Y. (2024). Connecting large language models with evolutionary algorithms yields powerful prompt optimizers. In The Twelfth International Conference on Learning Representations. [link]

Hammes, L. O. A. and de Freitas, L. A. (2021). Utilizando BERTimbau para a classificação de emoções em português. In Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana (STIL), pages 56–63. SBC. [link] DOI: 10.5753/stil.2021.17784

Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., Attariyan, M., and Gelly, S. (2019). Parameter-efficient transfer learning for NLP. In International conference on machine learning, pages 2790–2799. PMLR. [link]

Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. (2021). LoRA: Low-rank adaptation of large language models. [link] DOI: 10.48550/arXiv.2106.09685

Lee, C., Roy, R., Xu, M., Raiman, J., Shoeybi, M., Catanzaro, B., and Ping, W. (2024). NV-Embed: Improved techniques for training LLMs as generalist embedding models. [link] DOI: 10.48550/arXiv.2405.17428

Lehman, J., Gordon, J., Jain, S., Ndousse, K., Yeh, C., and Stanley, K. O. (2023). Evolution through large models. In Handbook of Evolutionary Machine Learning, pages 331–366. Springer. [link]

Leite, J. A., Silva, D., Bontcheva, K., and Scarton, C. (2020). Toxic language detection in social media for brazilian portuguese: New dataset and multilingual analysis. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, pages 914–924. [link]

Meyerson, E., Nelson, M. J., Bradley, H., Gaier, A., Moradi, A., Hoover, A. K., and Lehman, J. (2023). Language model crossover: Variation through few-shot prompting. arXiv preprint arXiv:2302.12170. [link] DOI: 10.48550/arXiv.2302.12170

Oliveira, A. S., Cecote, T. C., Alvarenga, J. P. R., Freitas, V. L. S., and Luz, E. J. S. (2024). Toxic speech detection in Portuguese: A comparative study of large language models. In Gamallo, P., Claro, D., Teixeira, A., Real, L., Garcia, M., Oliveira, H. G., and Amaro, R., editors, Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1, pages 108–116, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics. [link]

Oliveira, A. S., Cecote, T. C., Silva, P. H., Gertrudes, J. C., Freitas, V. L., and Luz, E. J. (2023). How good is ChatGPT for detecting hate speech in Portuguese? In Anais do XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 94–103. SBC. [link] DOI: 10.5753/stil.2023.233943

OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al. (2024). GPT-4 technical report. [link] DOI: 10.48550/arXiv.2303.08774

Pires, R., Abonizio, H., Almeida, T. S., and Nogueira, R. (2023). Sabiá: Portuguese large language models. In Naldi, M. C. and Bianchi, R. A. C., editors, Intelligent Systems, pages 226–240, Cham. Springer Nature Switzerland. [link]

Serras, F. and Finger, M. (2021). VerBERT: Automating Brazilian case law document multilabel categorization using BERT. In Anais do XIII Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 237–246, Porto Alegre, RS, Brasil. SBC. [link] DOI: 10.5753/stil.2021.17803

Souza, F., Nogueira, R., and Lotufo, R. (2020). BERTimbau: Pretrained BERT models for Brazilian Portuguese. In Intelligent Systems: 9th Brazilian Conference, BRACIS 2020, Rio Grande, Brazil, October 20–23, 2020, Proceedings, Part I 9, pages 403–417. Springer. [link]

Zheng, M., Su, X., You, S., Wang, F., Qian, C., Xu, C., and Albanie, S. (2023). Can GPT-4 perform neural architecture search? arXiv preprint arXiv:2304.10970. [link] DOI: 10.48550/arXiv.2304.10970
Published
17/11/2024
OLIVEIRA, Amanda S.; SILVA, Pedro H. L.; SANTOS, Valéria de C.; MOREIRA, Gladston; FREITAS, Vander L. S.; LUZ, Eduardo J. S.. Toxic Text Classification in Portuguese: Is LLaMA 3.1 8B All You Need?. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 15. , 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 57-66. DOI: https://doi.org/10.5753/stil.2024.245416.