Text Mining for Cyberbullying Detection: a Brazilian Portuguese Evaluation

  • Carolina Eberhart UNISINOS
  • Luciano Ignaczak UNISINOS
  • Márcio Garcia Martins UNISINOS

Abstract

Bullying and cyberbullying are frequently covered by the media. Although the scientific community has been evaluating text mining techniques for cyberbullying detection, few studies use datasets in Portuguese. This study aims to evaluate the application of text mining to detect Portuguese-language messages associated with cyberbullying. The study collected posts and comments from Reddit communities and extracted several features, which were used to train classifiers for cyberbullying detection. Although the results do not demonstrate that text mining can fully automate cyberbullying detection, the techniques can help moderators prioritize which messages to review.

Published
29/11/2021
EBERHART, Carolina; IGNACZAK, Luciano; MARTINS, Márcio Garcia. Text Mining for Cyberbullying Detection: a Brazilian Portuguese Evaluation. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 13., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 92-100. DOI: https://doi.org/10.5753/stil.2021.17788.