Combining YOLO and Visual Rhythm for Vehicle Counting

  • Victor Nascimento Ribeiro USP
  • Nina S. T. Hirata USP


Video-based vehicle detection and counting play a critical role in managing transport infrastructure. Traditional image-based counting methods usually involve two main steps, initial detection and subsequent tracking, both applied to every video frame, which significantly increases the computational cost. To address this issue, this work presents an alternative, more efficient method for vehicle detection and counting. The proposed approach eliminates the tracking step and detects vehicles only in key video frames. To achieve this, we developed a system that combines YOLO, for vehicle detection, with Visual Rhythm, a technique for building spatio-temporal images that lets us focus on the frames containing useful information. The method can also be used for counting in any application that involves detecting and identifying targets moving in a single direction. Experimental analysis using real videos shows that the proposed method achieves a mean counting accuracy of around 99.15% over a set of videos, with a processing speed three times faster than tracking-based approaches.
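
As a rough illustration of the Visual Rhythm idea, the sketch below stacks the pixels under a fixed horizontal counting line from every frame into a single image with one row per frame. It then flags the frames in which that line differs from the background. The abstract does not give these details, so everything here is an assumption made for illustration: the function names, the nested-list frame representation, the synthetic video and the thresholding criterion. In particular, the thresholding is only a stand-in for the detection step and does not call YOLO.

```python
from typing import List, Sequence

# Assumed representation: a grayscale frame is a sequence of pixel rows.
Frame = Sequence[Sequence[int]]


def build_visual_rhythm(frames: Sequence[Frame], line_row: int) -> List[List[int]]:
    """Stack the pixels under a fixed horizontal line from every frame.

    Row t of the result is the counting line as seen in frame t, so the
    output has one row per frame and one column per pixel of the line.
    """
    return [list(frame[line_row]) for frame in frames]


def frames_with_activity(rhythm: List[List[int]],
                         background: int = 0,
                         threshold: int = 50) -> List[int]:
    """Indices of frames whose counting line departs from the background.

    Hypothetical criterion used only for this sketch; it is not the
    detection step described in the paper.
    """
    return [t for t, line in enumerate(rhythm)
            if any(abs(pixel - background) > threshold for pixel in line)]


def make_synthetic_video(num_frames: int = 6, height: int = 4, width: int = 5):
    """Made-up test data: a bright blob sits on row 2 in frames 2 and 3."""
    frames = [[[0] * width for _ in range(height)] for _ in range(num_frames)]
    for t in (2, 3):
        frames[t][2][1] = 255
        frames[t][2][2] = 255
    return frames


video = make_synthetic_video()
rhythm = build_visual_rhythm(video, line_row=2)
key_frames = frames_with_activity(rhythm)

print(len(rhythm), len(rhythm[0]))  # number of frames, width of the line
print(key_frames)                   # frames where something crosses the line
```

Only the frames returned by `frames_with_activity` would need to be passed to a detector, which is where the saving over per-frame detection and tracking would come from under these assumptions.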


RIBEIRO, Victor Nascimento; HIRATA, Nina S. T. Combining YOLO and Visual Rhythm for Vehicle Counting. In: WORKSHOP DE TRABALHOS DA GRADUAÇÃO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 36., 2023, Rio Grande/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 164-167. DOI: