Toxicity and Triggers: A Case Study of Reddit Communities in Brazil

Giovana Piorino; Luiz Henrique Quevedo Lima; Adriana Silvina Pagano; Ana Paula Couto da Silva

doi:10.5753/webmedia.2025.15984

Giovana Piorino UFMG
Luiz Henrique Quevedo Lima BEON.tech
Adriana Silvina Pagano UFMG
Ana Paula Couto da Silva UFMG

Abstract

In this work, we aim to understand linguistic features that may indicate potential triggers of toxic behavior among users in Portuguese-speaking Reddit communities. Our findings show that discussions involving such triggers tend to contain a higher concentration of political terms and insults, shift topics more frequently, and make greater use of subordinating conjunctions.

Keywords: Toxicity Triggers, Discussion Trees, Linguistic Analysis

