Hybrid CNN-GNN Models in Active Sonar Imagery: an Experimental Evaluation

  • Gabriel Arruda Evangelista (UFRJ)
  • João Baptista de Oliveira e Souza Filho (UFRJ)

Abstract


Advances in sonar technology, such as Multibeam Forward Looking Sonar (MFLS), have enabled detailed underwater imaging that can be applied to tasks such as identifying mine-like objects. However, obtaining large datasets to train image recognition models remains challenging, which creates a need for smaller yet equally accurate models. Previous research proposed a hybrid model that combines Convolutional Neural Networks (CNNs) with Graph Neural Networks (GNNs) for MFLS image classification. This study refines the feature extractor of that model using Knowledge Distillation (KD) and evaluates the cost-effectiveness of the resulting pipeline against alternative solutions. The proposed method achieved an error rate of 6.42%, comparable to that of other solutions but at a lower computational cost.
Keywords: Graph-Based AI, Hybrid Systems and Metaheuristics, Vision AI

Published
17/11/2024
EVANGELISTA, Gabriel Arruda; OLIVEIRA E SOUZA FILHO, João Baptista de. Hybrid CNN-GNN Models in Active Sonar Imagery: an Experimental Evaluation. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 21., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 37-48. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2024.245038.