Hate Speech Detection Using Brazilian Imageboards

  • Gabriel Nascimento CEFET/RJ
  • Flavio Carvalho CEFET/RJ
  • Alexandre M. da Cunha CEFET/RJ - UFF
  • Carlos Roberto Viana CEFET/RJ
  • Gustavo P. Guedes CEFET/RJ

Abstract


As internet communication platforms have changed how people interact, hate speech and offensive language have emerged as a contemporary problem. Social networks allow users with different opinions and backgrounds to interact without face-to-face contact. This distance gives users a sense of safety when promoting hate speech, an effect that is even stronger in anonymous environments. Imageboards are sites composed of boards that aggregate different topics. On some boards, anonymous users widely promote hate speech. However, only a few works in the literature have focused on hate speech in imageboard content. This work aims to classify Brazilian Portuguese texts to detect hate speech, using data from the Brazilian 55chan imageboard to build a dataset with hate speech content. Three classifiers were trained for binary hate speech classification. The Linear Support Vector Classifier achieved the best result, with an F1-score of 0.955.
Published
29/10/2019
NASCIMENTO, Gabriel; CARVALHO, Flavio; CUNHA, Alexandre M. da; VIANA, Carlos Roberto; GUEDES, Gustavo P. Hate Speech Detection Using Brazilian Imageboards. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 25., 2019, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 325-328.