Why Ignore Content? A Guideline for Intrinsic Evaluation of Item Embeddings for Collaborative Filtering

  • Pedro R. Pires (UFSCar)
  • Bruno B. Rizzi (BTG Pactual)
  • Tiago A. Almeida (UFSCar)


With the constant growth in available information and the popularization of technology, recommender systems have to deal with an increasing number of users and items. This leads to two problems in representing items: scalability and sparsity. Therefore, many recommender systems aim to generate low-dimensional dense representations of items. Matrix factorization techniques are popular, but models based on neural embeddings have recently been proposed and are gaining ground in the literature. Their main goal is to learn dense representations with intrinsic meaning. However, most studies proposing embeddings for recommender systems ignore this property and focus only on extrinsic evaluations. This study presents a guideline for assessing the intrinsic quality of matrix factorization and neural-based embedding models for collaborative filtering, comparing the results with a traditional extrinsic evaluation. To enrich the evaluation pipeline, we suggest adapting an intrinsic evaluation task commonly employed in the Natural Language Processing literature, and we propose a novel strategy for evaluating the learned representation compared to a content-based scenario. Finally, every mentioned technique is analyzed over established recommender models, and the results show how vector representations that do not yield good recommendations can still be useful in other tasks that demand intrinsic knowledge, highlighting the potential of this perspective of evaluation.
Keywords: embeddings, intrinsic evaluation, qualitative evaluation, recommender systems, similarity tables, intruder detection, autotagging


PIRES, Pedro R.; RIZZI, Bruno B.; ALMEIDA, Tiago A. Why Ignore Content? A Guideline for Intrinsic Evaluation of Item Embeddings for Collaborative Filtering. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 30., 2024, Juiz de Fora/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 345-354. DOI: https://doi.org/10.5753/webmedia.2024.243199.

