Detecção e Classificação de Objetos Presentes em Imagens Aéreas de Drones de Ambientes Urbanos (Detection and Classification of Objects in Aerial Drone Images of Urban Environments)

  • Guilherme R. Sganderla UNIOESTE
  • Claudio Roberto M. Mauricio UNIOESTE
  • Valéria Nunes dos Santos Fundação Parque Tecnológico Itaipu
  • Fabiana Frata F. Peres UNIOESTE


With large datasets, it is possible to train a machine to perform tasks previously performed only by humans. Deep Learning and the increasingly powerful algorithms developed over time have made this possibility ever more practical. Among these algorithms is YOLO, a Convolutional Neural Network algorithm whose uses include detecting, classifying, and locating objects such as people and vehicles in images of urban environments. This work presents a model for detecting and classifying object classes common in urban environments (People, Small Vehicles, Medium Vehicles, and Large Vehicles). For this project we used a combination of three datasets of aerial drone images of urban environments (Stanford Drone Dataset, Vision Meets Drone, and The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking). The initial training of the YOLO model yielded an average accuracy of 67.2%.
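
Combining several datasets into one set of four classes requires relabeling each source annotation and writing it in a format the detector can read. The sketch below illustrates one way this could look, and it is not the authors' actual preprocessing. The class order, the `LABEL_MAP` entries, and the function name `to_yolo_line` are hypothetical. The output line follows the plain-text label layout commonly used by YOLO implementations: class index, then box center and size normalized by the image dimensions.

```python
from typing import Optional, Tuple

# Unified classes named in the abstract; the index order is an assumption.
CLASSES = ["People", "Small Vehicles", "Medium Vehicles", "Large Vehicles"]

# Hypothetical mapping from source-dataset label names to the unified classes.
# The real mapping used in the paper is not given in the abstract.
LABEL_MAP = {
    "pedestrian": "People",
    "bicycle": "Small Vehicles",
    "car": "Medium Vehicles",
    "bus": "Large Vehicles",
}


def to_yolo_line(
    label: str,
    box: Tuple[float, float, float, float],
    img_w: int,
    img_h: int,
) -> Optional[str]:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    Returns None when the source label has no unified class, so the
    annotation can be skipped.
    """
    target = LABEL_MAP.get(label.lower())
    if target is None:
        return None
    x_min, y_min, x_max, y_max = box
    # Normalize the box center and size to the [0, 1] range.
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{CLASSES.index(target)} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A car occupying pixels (100, 200) to (300, 400) in a 1000x800 image.
    print(to_yolo_line("car", (100, 200, 300, 400), 1000, 800))
```

Returning `None` for unmapped labels keeps the decision to drop an annotation explicit at the call site, which is convenient when each source dataset uses a different label vocabulary.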


K. Su, J. Li, and H. Fu, “Smart city and the applications,” in 2011 International Conference on Electronics, Communications and Control (ICECC). IEEE, 2011, pp. 1028–1031.

M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning, ser. Adaptive Computation and Machine Learning series. MIT Press, 2012. [Online]. Available:

A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese, “Learning social etiquette: Human trajectory understanding in crowded scenes,” in European Conference on Computer Vision. Springer, 2016, pp. 549–565.

P. Zhu, L. Wen, D. Du, X. Bian, Q. Hu, and H. Ling, “Vision meets drones: Past, present and future,” arXiv preprint arXiv:2001.06303, 2020.

D. Du, Y. Qi, H. Yu, Y. Yang, K. Duan, G. Li, W. Zhang, Q. Huang, and Q. Tian, “The unmanned aerial vehicle benchmark: Object detection and tracking,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 370–386.

G. Jocher, A. Stoken, J. Borovec, NanoCode012, A. Chaurasia, TaoXie, L. Changyu, A. V, Laughing, tkianai, yxNONG, A. Hogan, lorenzomammana, AlexWang1900, J. Hajek, L. Diaconu, Marc, Y. Kwon, oleg, wanghaoyang0106, Y. Defretin, A. Lohia, ml5ah, B. Milanko, B. Fineran, D. Khromov, D. Yiwei, Doug, Durgesh, and F. Ingham, “ultralytics/yolov5: v5.0 - YOLOv5-P6 1280 models, AWS, and YouTube integrations,” Apr. 2021. [Online]. Available:

S. Skansi, Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence, ser. Undergraduate Topics in Computer Science. Springer International Publishing, 2018. [Online]. Available:

M. Zhu, “Recall, precision and average precision,” 09 2004.

C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge University Press, 2008.
SGANDERLA, Guilherme R.; MAURICIO, Claudio Roberto M.; SANTOS, Valéria Nunes dos; PERES, Fabiana Frata F. Detecção e Classificação de Objetos Presentes em Imagens Aéreas de Drones de Ambientes Urbanos. In: WORKSHOP DE TRABALHOS DA GRADUAÇÃO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 34., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 223-227. DOI: