A Classifier Ensemble for Offensive Text Detection

  • Rogers Pelle UFRGS
  • Cleber Alcântara UFRGS
  • Viviane P. Moreira UFRGS

Abstract


Offensive posts are a persistent nuisance on many Web platforms. As a consequence, there has been growing interest in methods that identify such posts automatically. In this paper, we present Hate2Vec, an approach for detecting offensive comments on the Web. Hate2Vec relies on a classifier ensemble whose base learners are: (i) a lexicon-based classifier that leverages the semantic relatedness of word embeddings; (ii) a logistic regression classifier based on comment embeddings; and (iii) a standard bag-of-words (BOW) classifier based on unigram features. In our experiments with datasets in English and Portuguese, Hate2Vec achieved high classification quality (F-measure above 0.9) and significantly outperformed a traditional BOW classifier.
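To make the ensemble idea concrete, the following is a minimal sketch of how three base learners could be combined. The abstract does not say how the base learners' outputs are merged, so majority voting is an assumption here, and the three keyword-rule classifiers are hypothetical stand-ins rather than the lexicon, embedding, and BOW learners of Hate2Vec. The names `keyword_classifier`, `majority_vote`, and `learners` are illustrative only.

```python
from typing import Callable, Sequence

# A base learner maps a comment to 1 (offensive) or 0 (not offensive).
Classifier = Callable[[str], int]


def keyword_classifier(lexicon: set) -> Classifier:
    """Hypothetical stand-in learner: flags a comment if any token is in the lexicon."""
    def predict(text: str) -> int:
        return int(any(token in lexicon for token in text.lower().split()))
    return predict


def majority_vote(classifiers: Sequence[Classifier], text: str) -> int:
    """Return 1 if more than half of the classifiers vote 1 (assumed combination rule)."""
    votes = [clf(text) for clf in classifiers]
    return int(sum(votes) * 2 > len(votes))


# Three toy learners with different (made-up) lexicons, standing in for the
# three base learners named in the abstract.
learners = [
    keyword_classifier({"idiot", "stupid"}),
    keyword_classifier({"idiot", "moron"}),
    keyword_classifier({"stupid", "dumb"}),
]

if __name__ == "__main__":
    for comment in ["you are a stupid idiot", "what a moron", "have a nice day"]:
        print(comment, "->", majority_vote(learners, comment))
```

With an odd number of binary learners, a majority vote never ties, which is one reason it is a common default for three-member ensembles.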
Keywords: Text Classification, Hate Speech Detection
Published
16/10/2018
PELLE, Rogers; ALCÂNTARA, Cleber; MOREIRA, Viviane P. A Classifier Ensemble for Offensive Text Detection. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 24., 2018, Salvador. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 237-243.