Representation Learning for Image Retrieval through 3D CNN and Manifold Ranking

Lucas Barbosa de Almeida; Vanessa Helena Pereira-Ferrero; Lucas Pascotti Valem; Jurandy Almeida; Daniel Carlos Guimarães Pedronette

Lucas Barbosa de Almeida UNESP
Vanessa Helena Pereira-Ferrero UNESP
Lucas Pascotti Valem UNESP
Jurandy Almeida UNIFESP
Daniel Carlos Guimarães Pedronette UNESP

Abstract

Despite the substantial success of Convolutional Neural Networks (CNNs) on many recognition and representation tasks, such models rely heavily on large amounts of data for effective training. To improve the generalization ability of CNNs, several approaches have been proposed, including variations of data augmentation strategies. With the goal of achieving more effective retrieval results in unsupervised learning scenarios, we propose a representation learning approach that exploits a rank-based formulation to build a more comprehensive data representation. The proposed model uses 2D and 3D CNNs trained by transfer learning and fuses both representations through a rank-based formulation based on manifold learning algorithms. Our approach was evaluated in an unsupervised image retrieval scenario applied to action recognition datasets. The experimental results indicate that significant effectiveness gains can be obtained on various datasets, reaching relative gains of +56.93% in Mean Average Precision (MAP) scores.
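The general idea of fusing two representations at the rank level and scoring the result with MAP can be illustrated with a minimal sketch. This is not the manifold learning formulation used in this work: reciprocal rank fusion is used here only as a simple stand-in, and all function names, the constant `k`, and the toy feature vectors are assumptions made for illustration. In practice the two feature sets would come from the 2D and 3D CNNs.

```python
import math

def ranked_lists(features):
    """For each item, return all item indices sorted by Euclidean distance to it."""
    ranking = []
    for query in features:
        dists = [math.dist(query, other) for other in features]
        ranking.append(sorted(range(len(features)), key=lambda i: dists[i]))
    return ranking

def fuse_rankings(rank_a, rank_b, k=60):
    """Reciprocal rank fusion: a simple stand-in for the paper's manifold-based fusion."""
    fused = []
    for list_a, list_b in zip(rank_a, rank_b):
        score = {}
        for ranked in (list_a, list_b):
            for pos, item in enumerate(ranked):
                score[item] = score.get(item, 0.0) + 1.0 / (k + pos + 1)
        # Higher fused score means the item is ranked earlier.
        fused.append(sorted(score, key=lambda i: -score[i]))
    return fused

def mean_average_precision(ranking, labels):
    """MAP over all queries; labels are used only for evaluation, never for ranking."""
    average_precisions = []
    for q, ranked in enumerate(ranking):
        retrieved = [i for i in ranked if i != q]  # exclude the query itself
        hits, precisions = 0, []
        for pos, item in enumerate(retrieved, start=1):
            if labels[item] == labels[q]:
                hits += 1
                precisions.append(hits / pos)
        if precisions:
            average_precisions.append(sum(precisions) / len(precisions))
    return sum(average_precisions) / len(average_precisions)

def relative_gain(baseline, improved):
    """Relative gain in percent, e.g. of a fused MAP over a single-model MAP."""
    return 100.0 * (improved - baseline) / baseline

# Toy data (hypothetical): one feature vector per item from each model.
features_2d = [[0.0], [0.1], [5.0], [5.1]]
features_3d = [[0.0], [4.0], [0.2], [4.1]]
labels = [0, 0, 1, 1]

rank_2d = ranked_lists(features_2d)
rank_3d = ranked_lists(features_3d)
rank_fused = fuse_rankings(rank_2d, rank_3d)

map_2d = mean_average_precision(rank_2d, labels)
map_fused = mean_average_precision(rank_fused, labels)
print(f"MAP 2D: {map_2d:.4f}  MAP fused: {map_fused:.4f}")
print(f"Relative gain: {relative_gain(map_2d, map_fused):+.2f}%")
```

The sketch keeps ranking and evaluation separate: the ranked lists are built from features alone, which mirrors the unsupervised retrieval setting, while class labels enter only when computing MAP.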

Keywords: Representation learning, Training, Manifolds, Solid modeling, Three-dimensional displays, Image retrieval, Transfer learning, Manifold learning