Positive Unlabeled Learning: Adapting NMF for text classification

  • Lucas S. S. Nunes, Universidade de Brasília
  • Thiago de P. Faleiros, Universidade de Brasília
  • Rafael G. Rossi, iFood


Because data is now generated at a rate that far exceeds human capacity to evaluate it, manually labeling data to train machine learning models is becoming increasingly impractical. This article analyzes techniques for addressing the challenges of Positive Unlabeled Learning (PUL). To this end, we propose structural adaptations to the Non-Negative Matrix Factorization (NMF) algorithm tailored to PU data, which we call NMFPUL. We compare NMFPUL with state-of-the-art techniques to assess improvements in text classification performance. Our study shows that NMFPUL consistently outperforms most baseline algorithms across diverse document collections, particularly when only a limited number of labeled documents is available.

Keywords: positive unlabeled learning, non-negative matrix factorization, text classification


