Per Query Subtopic Discovery for Diverse Image Retrieval
Abstract
Given the complex search tasks imposed on multimedia retrieval systems, similarity-based results often contain redundant items. Several real-world search tasks demand broad coverage of the multiple implicit subtopics of a given query. Many works have proposed clustering-based result diversification to address this problem. However, defining the number of clusters (subtopics) to be discovered is a long-standing challenge. To mitigate this problem, this work proposes a novel diverse image retrieval approach that performs unsupervised, query-adaptive subtopic discovery based on intrinsic clustering quality optimization. Our experimental analysis shows significant improvements in terms of both relevance and diversity.
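The idea of choosing the number of subtopics per query by optimizing an intrinsic clustering quality measure can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the silhouette coefficient as the quality index, k-means with farthest-first initialization as the clustering algorithm, round-robin selection across clusters as the diversification step, and all function names and toy data.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (assumed intrinsic quality index)."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            continue  # a singleton cluster contributes 0
        a = sum(same) / len(same)
        others = []
        for c in clusters:
            if c != labels[i]:
                d = [dist(p, q) for j, q in enumerate(points) if labels[j] == c]
                others.append(sum(d) / len(d))
        b = min(others)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def discover_subtopics(points, k_max):
    """Pick the cluster count in [2, k_max] that maximizes the quality index."""
    best_k, best_labels, best_score = None, None, -2.0
    for k in range(2, k_max + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if score > best_score:
            best_k, best_labels, best_score = k, labels, score
    return best_k, best_labels

def diversify(labels):
    """Round-robin over clusters; items are indexed in decreasing relevance."""
    queues = {}
    for i, l in enumerate(labels):
        queues.setdefault(l, []).append(i)
    queue_list = list(queues.values())  # ordered by each cluster's best-ranked item
    order = []
    while any(queue_list):
        for q in queue_list:
            if q:
                order.append(q.pop(0))
    return order

# Toy query result: nine images as 2-D features, listed by decreasing relevance.
points = [(0, 0), (0.1, 0), (0, 0.1),
          (5, 5), (5.1, 5), (5, 5.1),
          (10, 0), (10.1, 0), (10, 0.1)]
best_k, labels = discover_subtopics(points, k_max=5)
order = diversify(labels)
print(best_k, order)
```

In this sketch the number of subtopics is not fixed in advance; it is selected separately for each query's result set, which is the query-adaptive behavior the abstract describes. A different quality index or clustering algorithm could be substituted without changing the overall structure.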
