Comparative Analysis of LLMs for Detecting Racism, Sexism, and Homophobia on Social Media

Guilherme Bou (UFU); Adriano Mendonça Rocha (UFU)

DOI: https://doi.org/10.5753/sbsi.2026.248649

Abstract

Research Context: The expansion of social media has increased digital interactions, and with them hate speech such as racism, sexism, and homophobia. This challenges society and platforms to develop strategies for identifying and moderating harmful content.

Scientific and/or Practical Problem: Despite advances in AI, automatic detection remains limited by linguistic nuance, cultural context, and implementation cost. The scientific challenge is evaluating model effectiveness; the practical challenge is building economical, reliable solutions that work at large scale.

Proposed Solution and/or Analysis: We conducted a comparative analysis of four LLMs (GPT-3.5-Turbo, GPT-4.0, DeepSeek-V3, and Gemini-2.0-Flash) on the task of detecting offensive social media comments. Tests on raw and preprocessed data, using standardized prompts, measured precision, cost, and execution time.

Related IS Theory: The study draws on Information Systems theories, emphasizing socio-technical, ethical, and cost–benefit aspects (Socio-technical Theory, Actor-Network Theory, Resource-Based View, Dynamic Capabilities).

Research Method: The LLMs analyzed over 2,000 comments and were compared on precision, recall, F1-score, operational cost, and processing time.

Summary of Results: GPT-4.0 achieved the highest F1-score (94.19%) but at a high cost (US$ 26.99). DeepSeek-V3 balanced performance and cost (F1-score 93.37%, US$ 0.66). Gemini-2.0-Flash was the cheapest (US$ 0.12) but showed inconsistent results.

Contributions and Impact on the IS Area: This work offers a practical framework for selecting LLMs for hate-speech detection based on accuracy, cost, and performance. It advances IS research by evaluating state-of-the-art models in real scenarios and by providing guidance for ethical and efficient content moderation.
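To make the metrics named in the research method concrete, the following Python sketch shows how precision, recall, and F1-score are conventionally computed from confusion counts, and how the reported cost and F1 figures can be compared. It is not the authors' evaluation code: the `Counts` container, the helper names, and the example counts are hypothetical. Only the values in `REPORTED` come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical confusion counts for the 'offensive' class."""
    tp: int  # offensive comments correctly flagged
    fp: int  # benign comments wrongly flagged
    fn: int  # offensive comments missed


def precision(c: Counts) -> float:
    """Share of flagged comments that are truly offensive."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Counts) -> float:
    """Share of truly offensive comments that were flagged."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Figures reported in the abstract: (F1-score in %, total cost in US$).
# Gemini-2.0-Flash is omitted because no F1-score is reported for it.
REPORTED = {
    "GPT-4.0": (94.19, 26.99),
    "DeepSeek-V3": (93.37, 0.66),
}


def f1_gap(a: str, b: str) -> float:
    """F1 difference in percentage points between models a and b."""
    return REPORTED[a][0] - REPORTED[b][0]


def cost_ratio(a: str, b: str) -> float:
    """How many times more model a cost than model b."""
    return REPORTED[a][1] / REPORTED[b][1]


if __name__ == "__main__":
    example = Counts(tp=90, fp=10, fn=5)  # made-up counts for illustration
    print(f"precision={precision(example):.4f}")
    print(f"recall={recall(example):.4f}")
    print(f"f1={f1_score(example):.4f}")
    print(f"F1 gap: {f1_gap('GPT-4.0', 'DeepSeek-V3'):.2f} points")
    print(f"cost ratio: {cost_ratio('GPT-4.0', 'DeepSeek-V3'):.1f}x")
```

With the reported figures, GPT-4.0 cost roughly 41 times as much as DeepSeek-V3 for an F1 gain of 0.82 percentage points, which is the kind of trade-off the comparison is meant to expose.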

