Semantic Segmentation for People Detection on Beach Images

Leonardo de A. Monte; Emília G. Oliveira; Filipe R. Cordeiro; Valmir Macario

doi:10.5753/eniac.2021.18295

Leonardo de A. Monte UFRPE
Emília G. Oliveira UFRPE
Filipe R. Cordeiro UFRPE
Valmir Macario UFRPE

Abstract

Our work compares a set of semantic segmentation networks applied to people detection in beach images, as part of an automatic tracking system that prevents bathers from going beyond the safe region of the sea. In our analysis, we compare the segmentation networks U-net, X-net, Linknet, and Unet++ using the pre-trained VGG-16 and VGG-19 backbones. We propose our own image dataset, composed of 300 images. The models were evaluated using the F-score metric. Our results showed that Linknet obtained the best F-score, 90.89%, and was also faster than the other networks, with no statistically significant difference.
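As an illustration of the evaluation metric, the following is a minimal sketch of a pixel-wise F-score between a predicted and a ground-truth binary mask. It assumes masks are given as rows of 0/1 labels (1 = person) and that two empty masks count as a perfect match; the function name `f_score`, the `beta` parameter, and the toy masks are hypothetical, and the abstract does not state exactly how the authors computed the metric.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (1 = person); assumed layout


def f_score(pred: Mask, truth: Mask, beta: float = 1.0) -> float:
    """Pixel-wise F-beta score between two binary masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # person pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as person
            elif t and not p:
                fn += 1          # person pixel missed
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if denom == 0 else (1 + b2) * tp / denom


# Toy 2x3 masks, invented for illustration only.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
score = f_score(pred, truth)
```

With `beta = 1` this reduces to the usual F1 score, the harmonic mean of precision and recall over pixels. A per-image score like this would typically be averaged over a test set before comparing networks.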
