Alignment of Local and Global Features from Multiple Layers of Convolutional Neural Network for Image Classification

Fernando Pereira dos Santos (University of São Paulo); Moacir Ponti (University of São Paulo)

DOI: https://doi.org/10.5753/sibgrapi.2019.9787

Abstract

Convolutional networks have been extensively applied to obtain feature spaces for classification tasks. Although these networks achieve high accuracy in many scenarios, typically only their top layers are explored. This raises a relevant question: are the initial layers useful in terms of discriminative ability? In this paper, we leverage the complementary description offered by these first layers. Our method consists of feature extraction from multiple layers, followed by feature selection, fusion of feature maps from the different layers, and space alignment. Through extensive experimentation with different datasets and different training strategies, our results show that local information from the first layers may significantly improve classification performance when merged with a global descriptor extracted from a top layer of the network. We report different methods for reducing the dimensionality of the local descriptors, and guidelines on how to align them so that they can be fused. Our study encourages future work on combining feature maps from multiple layers, which may be particularly relevant in transfer learning scenarios.

Keywords: feature learning, convolutional networks, fusion of multiple maps, manifold alignment
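
The pipeline outlined in the abstract (multi-layer extraction, selection, fusion, alignment) can be illustrated with a minimal NumPy sketch. The abstract does not say which selection or alignment methods the authors use, so everything here is an assumption: the function names are hypothetical, variance-based column selection stands in for feature selection, and per-block standardization followed by L2 normalization is only a crude proxy for space alignment, not manifold alignment. Random arrays take the place of real network activations.

```python
import numpy as np


def pool_maps(maps):
    """Global-average-pool feature maps (n, channels, h, w) into (n, channels)."""
    return maps.mean(axis=(2, 3))


def select_by_variance(features, k):
    """Keep the k columns with the highest variance (a stand-in selector)."""
    order = np.argsort(features.var(axis=0))[::-1]
    keep = np.sort(order[:k])
    return features[:, keep]


def standardize_and_scale(features, eps=1e-8):
    """Z-score each column, then L2-normalize each row.

    Assumption: this only puts both descriptors on a comparable scale so that
    neither dominates the concatenation; it is not manifold alignment.
    """
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + eps)
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)


def fuse(local_maps, global_features, k):
    """Pool and select local features, rescale both blocks, and concatenate."""
    local = select_by_variance(pool_maps(local_maps), k)
    return np.hstack([standardize_and_scale(local),
                      standardize_and_scale(global_features)])


# Synthetic stand-ins for activations of an early layer and a top layer.
rng = np.random.default_rng(0)
local_maps = rng.normal(size=(8, 64, 16, 16))   # 8 images, 64 channels
global_features = rng.normal(size=(8, 32))      # 8 images, 32-d descriptor

fused = fuse(local_maps, global_features, k=16)
print(fused.shape)  # (8, 48)
```

In practice the fused matrix would be passed to a classifier, with the selection and scaling statistics estimated on training data only.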
