Unsupervised Learning Method Based on the Cartesian Product of Rankings for Image Retrieval

Lucas Pascotti Valem (UNESP); Daniel Carlos Guimarães Pedronette (UNESP)

Abstract

Despite significant advances in image search tools, defining an effective measure for modeling the similarity between images remains a challenge in Content-Based Image Retrieval (CBIR) systems. In this scenario, unsupervised similarity learning techniques, which can improve the effectiveness of image retrieval tasks without user intervention, are indispensable. This undergraduate research project presents the Cartesian Product of Ranking References (CPRR) method, which was developed for this purpose. Several experiments were conducted on four image collections, considering various visual features and a range of aspects. Beyond effectiveness, experiments were also carried out to evaluate efficiency, considering parallel and heterogeneous computing on CPU and GPU. The method achieved significant effectiveness gains, comparable to the most popular state-of-the-art results.
