An empirical assessment of quality metrics for diversified similarity searching


  • Camila R. Lopes Federal Institute of North of Minas Gerais
  • Lúcio F. D. Santos Federal Institute of North of Minas Gerais
  • Daniel L. Jasbick Fluminense Federal University
  • Daniel de Oliveira Fluminense Federal University
  • Marcos Bedo Fluminense Northwest Institute, Fluminense Federal University



Metric spaces, Diversified similarity searching, Result diversification, Similarity searching


A diversified similarity search retrieves elements that are simultaneously similar to a query object and representative of the different collections within the explored data. Although several methods in information retrieval, data clustering, and similarity searching have tackled the problem of adding diversity to result sets, the experimental comparison of their performance remains an open issue, mainly because the quality metrics are “borrowed” from those different research areas and carry their biases with them. In this manuscript, we investigate a series of such metrics and experimentally discuss their trends and limitations. We conclude that diversity is better addressed by a set of measures than by a single quality index, and we introduce the concept of the Diversity Features Model (DFM), which combines the viewpoints of biased metrics into a multidimensional representation. Experimental evaluations indicate that (i) DFM enables comparing different result diversification algorithms under multiple criteria, and (ii) the most suitable searching methods for a particular dataset can be identified by combining DFM with ranking aggregation and parallel coordinates maps.
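
To make the idea of combining a multidimensional metric representation with ranking aggregation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which aggregation rule is used, so a Borda count is assumed here. The function name `borda_aggregate`, the method names, and the metric values are all hypothetical and invented for illustration only, and every metric is assumed to be "higher is better".

```python
from typing import Dict, List


def borda_aggregate(features: Dict[str, List[float]]) -> List[str]:
    """Rank algorithms by summing per-metric Borda points.

    `features` maps an algorithm name to its vector of quality-metric
    values (one multidimensional point per algorithm). All metrics are
    assumed to be "higher is better" (an assumption of this sketch).
    """
    names = list(features)
    n_metrics = len(next(iter(features.values())))
    points = {name: 0 for name in names}
    for m in range(n_metrics):
        # Order algorithms from worst to best on metric m; the position
        # in that order is the number of Borda points earned.
        ordered = sorted(names, key=lambda a: features[a][m])
        for position, name in enumerate(ordered):
            points[name] += position
    # Highest total points first.
    return sorted(names, key=lambda a: points[a], reverse=True)


# Hypothetical metric vectors (illustrative values, not results from the paper).
features = {
    "method_a": [0.80, 0.30, 0.55],
    "method_b": [0.60, 0.70, 0.65],
    "method_c": [0.40, 0.50, 0.20],
}
ranking = borda_aggregate(features)
print(ranking)
```

In this toy input, `method_a` wins on the first metric but loses on the second, so no single metric settles the comparison; the aggregated ranking reflects all three dimensions at once.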




SBBD 2020 - Full papers