Improving Fine-Grained Vehicle Classification via Multitask Learning and Hierarchical Consistency

Gabriel E. Lima; Eduardo Santos; Eduil Nascimento Jr.; Rayson Laroca; David Menotti

doi:10.5753/sibgrapi.est.2025.38289

Gabriel E. Lima (UFPR)
Eduardo Santos (Polícia Militar do Paraná / UFPR)
Eduil Nascimento Jr. (Polícia Militar do Paraná)
Rayson Laroca (PUCPR / UFPR)
David Menotti (UFPR)


Abstract

Fine-Grained Vehicle Classification (FGVC) plays a key role in intelligent transportation systems, enabling the recognition of vehicle attributes – such as type, make, and model – from images. Such information supports vehicle identification and can complement automatic license plate recognition by enabling cross-checks and addressing cases with unreadable plates. However, existing approaches often treat these attributes independently, overlooking their hierarchical relationships and differences in task difficulty. This work-in-progress study explores the use of Multitask Learning (MTL) and hierarchical regularization to address these gaps. We evaluate seven deep learning models on a diverse dataset under three training setups: singletask learning, MTL with balanced optimization, and MTL with hierarchical regularization. Results show that MTL consistently improves classification accuracy, while incorporating hierarchical information significantly reduces semantic inconsistencies and enhances confidence calibration. In our best-performing configuration, hierarchy-violating errors dropped from 32.87% (singletask) to 4.10% (MTL with hierarchical regularization). These findings highlight the importance of modeling semantic relationships among attributes in FGVC and suggest promising directions for building more accurate and reliable classifiers. Future work will expand attribute granularity, investigate optimal task combinations, and benchmark against state-of-the-art methods.
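To make the notion of a hierarchy-violating prediction concrete, the sketch below counts predictions whose type, make, and model labels contradict one another. It is only one plausible reading of the consistency measure mentioned above, not the authors' evaluation code: the `MODEL_PARENTS` table, the label names, and the functions `is_consistent` and `violation_rate` are all hypothetical, and the sketch assumes each model belongs to exactly one make and one type.

```python
# Hypothetical hierarchy: each model implies exactly one make and one type.
# These labels are illustrative only and are not taken from the paper's dataset.
MODEL_PARENTS = {
    "Gol": ("Volkswagen", "car"),
    "Onix": ("Chevrolet", "car"),
    "CG 160": ("Honda", "motorcycle"),
}


def is_consistent(pred_type, pred_make, pred_model):
    """True when the predicted make and type match those implied by the predicted model."""
    make, vehicle_type = MODEL_PARENTS[pred_model]
    return pred_make == make and pred_type == vehicle_type


def violation_rate(predictions):
    """Fraction of (type, make, model) predictions that break the hierarchy."""
    if not predictions:
        return 0.0
    violations = sum(not is_consistent(*p) for p in predictions)
    return violations / len(predictions)


# Toy predictions, as three independent single-task heads might produce them.
preds = [
    ("car", "Volkswagen", "Gol"),        # consistent
    ("car", "Honda", "CG 160"),          # type contradicts the model
    ("car", "Chevrolet", "Gol"),         # make contradicts the model
    ("motorcycle", "Honda", "CG 160"),   # consistent
]
print(f"{violation_rate(preds):.2%}")  # prints 50.00%
```

A measure of this kind is independent of per-attribute accuracy: a prediction can be wrong yet internally consistent, which is why consistency is worth tracking separately from accuracy.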

