OpenImages Cyclists: Expanding Generalization in Cyclist Detection on Security Cameras

Ednilza Evangelista da Silva Nardi; Bruno Padilha; Leonardo Tadashi Kamaura; João Eduardo Ferreira

doi:10.5753/sbbd.2022.224626

Ednilza Evangelista da Silva Nardi, Universidade de São Paulo, http://orcid.org/0000-0003-3733-4840
Bruno Padilha, Universidade de São Paulo
Leonardo Tadashi Kamaura, Universidade de São Paulo
João Eduardo Ferreira, Universidade de São Paulo

Abstract

Although several public datasets containing cyclists are available for training deep-learning-based detectors, they either annotate bicycles and people rather than cyclists, or offer images of limited quality and quantity. To overcome these limitations, we propose the new OpenImages Cyclists dataset, built by pre-selecting images from the OpenImages dataset and applying a new algorithm for semi-automated generation of cyclist annotations, aided by person and bicycle detectors. Training a detector on this data and applying it via transfer learning, we obtained an identification rate of about 78% for cyclist detection at the USP campus in São Paulo (Capital), higher than the 52% obtained with the MIO-TCD dataset.

Keywords: deep learning, object detection, online object detection, real-time monitoring, cyclist detection
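
The abstract does not spell out the annotation algorithm, so the following is only a rough illustration of how person and bicycle detections might be combined into candidate cyclist boxes for later human review. The pairing rule (the fraction of the bicycle box covered by a person box), the `min_overlap` threshold, and all function names are assumptions made for this sketch, not the authors' method.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (zero if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def propose_cyclists(
    persons: List[Box],
    bicycles: List[Box],
    min_overlap: float = 0.3,  # hypothetical threshold, not from the paper
) -> List[Box]:
    """Greedily pair each bicycle with the unused person covering most of it.

    Each accepted pair yields one candidate cyclist box: the smallest box
    enclosing both the person and the bicycle.
    """
    proposals: List[Box] = []
    used_persons = set()
    for bike in bicycles:
        bike_area = box_area(bike)
        if bike_area == 0:
            continue
        best_index, best_fraction = None, min_overlap
        for index, person in enumerate(persons):
            if index in used_persons:
                continue
            fraction = intersection_area(person, bike) / bike_area
            if fraction >= best_fraction:
                best_index, best_fraction = index, fraction
        if best_index is not None:
            used_persons.add(best_index)
            person = persons[best_index]
            proposals.append((
                min(person[0], bike[0]),
                min(person[1], bike[1]),
                max(person[2], bike[2]),
                max(person[3], bike[3]),
            ))
    return proposals


# Example: one rider over a bicycle, plus an unrelated pedestrian.
persons = [(10, 0, 30, 50), (200, 200, 220, 260)]
bicycles = [(5, 25, 40, 60)]
candidates = propose_cyclists(persons, bicycles)
```

In a semi-automated workflow, candidates like these would presumably be shown to an annotator for confirmation or correction rather than accepted blindly.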
