A Classifier Ensemble for Offensive Text Detection
Abstract
Offensive posts are a constant nuisance on many Web platforms. As a consequence, there has been growing interest in methods that automatically identify such posts. In this paper, we present Hate2Vec, an approach for detecting offensive comments on the Web. Hate2Vec relies on a classifier ensemble whose base learners are: (i) a lexicon-based classifier that leverages the semantic relatedness of word embeddings; (ii) a logistic regression classifier based on comment embeddings; and (iii) a standard bag-of-words (BOW) classifier based on unigram features. In experiments on datasets in English and Portuguese, Hate2Vec achieved high classification quality (F-measure above 0.9) and significantly outperformed a traditional BOW classifier.
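As a rough illustration of the ensemble idea described above, the following minimal sketch combines three base learners by majority vote. The abstract does not state how Hate2Vec merges the base learners' outputs, so majority voting is an assumption here. The three learner functions, the toy lexicon, and the toy vocabulary are hypothetical stand-ins: in a real system they would be a lexicon expanded through word embeddings, a trained logistic regression over comment embeddings, and a trained unigram BOW model.

```python
from collections import Counter
from typing import Callable, Sequence

# A base learner maps a comment to a label: 1 = offensive, 0 = not offensive.
BaseLearner = Callable[[str], int]

# Hypothetical toy lexicon; a real one would be expanded with word embeddings.
TOY_LEXICON = {"idiot", "stupid"}

# Hypothetical toy vocabulary standing in for learned unigram features.
TOY_BOW_TERMS = {"stupid", "hate"}


def lexicon_learner(comment: str) -> int:
    """Stand-in for the lexicon-based classifier: flags any lexicon hit."""
    return int(any(tok in TOY_LEXICON for tok in comment.lower().split()))


def embedding_learner(comment: str) -> int:
    """Placeholder for logistic regression over comment embeddings.

    Assumption: a trained model would be called here; this substring rule
    only mimics a binary output so the sketch stays self-contained.
    """
    return int("hate" in comment.lower())


def bow_learner(comment: str) -> int:
    """Placeholder for the unigram bag-of-words classifier."""
    return int(any(tok in TOY_BOW_TERMS for tok in comment.lower().split()))


def majority_vote(learners: Sequence[BaseLearner], comment: str) -> int:
    """Return the label predicted by most base learners.

    With an odd number of binary learners there can be no tie.
    """
    votes = [learner(comment) for learner in learners]
    return Counter(votes).most_common(1)[0][0]


LEARNERS = (lexicon_learner, embedding_learner, bow_learner)
```

Calling `majority_vote(LEARNERS, "you are stupid")` labels the comment offensive because two of the three stand-in learners agree. A single dissenting learner, as in `"you idiot"`, is outvoted. Weighted voting or stacking would be equally plausible ways to combine the learners.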
Keywords:
Text Classification, Hate Speech Detection
Published
16/10/2018
How to Cite
PELLE, Rogers; ALCÂNTARA, Cleber; MOREIRA, Viviane P. A Classifier Ensemble for Offensive Text Detection. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 24., 2018, Salvador. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 237-243.