OpenImages Cyclists: Expanding Generalization in Cyclist Detection for Security Cameras
Abstract
Although several public datasets containing cyclists are available for training Deep Learning-based detectors, they either annotate bicycles and people rather than cyclists, or are limited in image quality and quantity. To overcome these limitations, we propose the new OpenImages Cyclists dataset, built by pre-selecting images from the OpenImages dataset and applying a new algorithm for semi-automated generation of cyclist annotations, aided by person and bicycle detectors. By training a detector on this data and applying transfer learning, we obtained an identification rate of about 78% for cyclist detection at USP, São Paulo - Capital Campus, compared with 52% when using the MIO-TCD dataset.
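The abstract does not detail how person and bicycle detections are combined into cyclist annotations. As a rough illustration only, the sketch below assumes one plausible rule: a bicycle box is paired with the person box that covers the largest fraction of it, and the union of the two becomes a candidate cyclist box to be confirmed by a human annotator. The function names, the box format, and the `min_overlap` threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box, clamped to zero for degenerate boxes."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (zero if they are disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def propose_cyclists(
    persons: List[Box],
    bicycles: List[Box],
    min_overlap: float = 0.3,  # assumed threshold, not from the paper
) -> List[Box]:
    """Pair each bicycle with its best-overlapping person and return union boxes.

    The returned boxes are only candidates; in a semi-automated workflow
    they would still be reviewed before being accepted as annotations.
    """
    proposals: List[Box] = []
    used_persons = set()
    for bike in bicycles:
        bike_area = area(bike)
        if bike_area == 0:
            continue
        best_index, best_score = None, 0.0
        for index, person in enumerate(persons):
            if index in used_persons:
                continue
            # Fraction of the bicycle box covered by the person box.
            score = intersection_area(person, bike) / bike_area
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None and best_score >= min_overlap:
            used_persons.add(best_index)
            person = persons[best_index]
            # Candidate cyclist box: smallest box enclosing rider and bicycle.
            proposals.append((
                min(person[0], bike[0]),
                min(person[1], bike[1]),
                max(person[2], bike[2]),
                max(person[3], bike[3]),
            ))
    return proposals
```

In this sketch, a pedestrian standing far from any bicycle produces no candidate, while a rider whose box overlaps a bicycle yields a single enclosing box. A real pipeline would take `persons` and `bicycles` from the outputs of pretrained detectors and would likely need additional filtering, for example for parked bicycles near pedestrians.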