Hate Speech Detection Using Brazilian Imageboards

Gabriel Nascimento; Flavio Carvalho; Alexandre M. da Cunha; Carlos Roberto Viana; Gustavo P. Guedes

Gabriel Nascimento CEFET/RJ
Flavio Carvalho CEFET/RJ
Alexandre M. da Cunha CEFET/RJ - UFF
Carlos Roberto Viana CEFET/RJ
Gustavo P. Guedes CEFET/RJ

Abstract

With the changes in human interaction prompted by the development of online communication platforms, hate speech and offensive language have emerged as a contemporary problem. Social networks allow users with different opinions and backgrounds to interact without face-to-face contact. This distance gives users a sense of safety when promoting hate speech, an effect that is even stronger in anonymous environments. Imageboards are sites composed of multiple boards, each dedicated to a different topic. On some boards, anonymous users widely promote hate speech. However, only a few works in the literature have focused on hate speech in imageboard content. This work aims to classify Brazilian Portuguese texts to detect hate speech, using data from the Brazilian 55chan imageboard to build a dataset with hate speech content. Three classifiers were trained for binary hate speech classification. The Linear Support Vector Classifier achieved the best result, with an F1-score of 0.955.
