A Classifier Ensemble for Offensive Text Detection

Rogers Pelle (UFRGS); Cleber Alcântara (UFRGS); Viviane P. Moreira (UFRGS)

Abstract

Offensive posts are a constant nuisance on many Web platforms. As a consequence, there has been growing interest in devising methods to automatically identify such posts. In this paper, we present Hate2Vec, an approach for detecting offensive comments on the Web. Hate2Vec relies on a classifier ensemble. The base learners are: (i) a lexicon-based classifier that leverages the semantic relatedness of word embeddings; (ii) a logistic regression classifier based on comment embeddings; and (iii) a standard bag-of-words (BOW) classifier based on unigram features. Our experiments with datasets in English and Portuguese yielded high classification quality (F-measure above 0.9) and significantly outperformed a traditional BOW classifier.
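To make the ensemble idea concrete, the following minimal sketch combines three base learners by majority vote. The abstract does not state how Hate2Vec combines its base learners, so the voting rule is an assumption. The toy lexicon and the two stand-in classifiers are hypothetical placeholders as well, not the learners used in the paper.

```python
from typing import Callable, List

# A base learner maps a comment to 1 (offensive) or 0 (not offensive).
Classifier = Callable[[str], int]

# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {"idiot", "stupid"}


def lexicon_classifier(comment: str) -> int:
    """Flag a comment if any token appears in the toy lexicon."""
    tokens = (tok.strip(".,!?") for tok in comment.lower().split())
    return int(any(tok in TOY_LEXICON for tok in tokens))


def majority_vote(classifiers: List[Classifier], comment: str) -> int:
    """Return 1 if more than half of the base learners vote offensive.

    Assumption: simple majority voting; the paper's actual combination
    rule is not given in the abstract.
    """
    votes = [clf(comment) for clf in classifiers]
    return int(sum(votes) * 2 > len(votes))


# Stand-ins for the embedding-based and BOW learners (hypothetical).
def always_offensive(comment: str) -> int:
    return 1


def never_offensive(comment: str) -> int:
    return 0


ensemble = [lexicon_classifier, always_offensive, never_offensive]
label = majority_vote(ensemble, "You are an idiot!")
```

With an odd number of binary base learners the vote can never tie, which is one reason a three-member ensemble is convenient in a sketch like this.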

Keywords: Text Classification, Hate Speech Detection

Published

16/10/2018

How to Cite

PELLE, Rogers; ALCÂNTARA, Cleber; MOREIRA, Viviane P. A Classifier Ensemble for Offensive Text Detection. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 24., 2018, Salvador. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 237-243.