CNN Architecture Assessment: Exploring Depth, Width, and Kernel Size for Image Classification

  • Eduardo T. Buss (UFPel)
  • Erick Radmann (UFPel)
  • Ruhan Conceicao (UFPel)
  • Bruno Zatt (UFPel)
  • Luciano Agostini (UFPel)

Abstract

This paper investigates the impact of different convolutional neural network (CNN) architectures on image classification performance using the CIFAR-10 dataset. We evaluate variations in the number of convolutional layers, the kernel size, and the number of filters. Experiments indicate that increasing the number of filters and slightly enlarging the kernel size generally improve accuracy, while deeper models do not always yield better results. The CIFAR-10 dataset contains 60,000 color images across 10 classes, of which 45,000 are used for training, 5,000 for validation, and 10,000 for final testing.
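The three architectural knobs explored above (depth, number of filters, kernel size) directly determine a model's capacity. The abstract does not specify the exact architectures evaluated, but as an illustration, the sketch below (a hypothetical plain CNN: `depth` same-padded conv blocks, each followed by 2x2 max-pooling, capped by a dense classifier) shows how the trainable-parameter count grows with each knob:

```python
def conv_param_count(in_ch, out_ch, k):
    """Parameters of one conv layer: one k*k*in_ch weight tensor
    plus one bias per output filter."""
    return out_ch * (in_ch * k * k + 1)

def cnn_param_count(depth, filters, k, in_ch=3, num_classes=10, input_size=32):
    """Total trainable parameters of a hypothetical plain CNN:
    `depth` conv blocks (same padding, 2x2 max-pool each),
    followed by a single dense softmax classifier.
    Defaults match CIFAR-10 (32x32 RGB images, 10 classes)."""
    total = 0
    ch, size = in_ch, input_size
    for _ in range(depth):
        total += conv_param_count(ch, filters, k)
        ch = filters          # next block sees `filters` input channels
        size //= 2            # 2x2 max-pooling halves each spatial dim
    total += (ch * size * size + 1) * num_classes  # flatten + dense layer
    return total

# Widening (more filters) or enlarging kernels grows capacity quickly,
# while stacking more blocks shrinks the feature map and can even
# reduce the classifier's input size.
print(cnn_param_count(depth=2, filters=32, k=3))  # → 30634
print(cnn_param_count(depth=2, filters=64, k=3))
print(cnn_param_count(depth=2, filters=32, k=5))
print(cnn_param_count(depth=3, filters=32, k=3))
```

This kind of back-of-the-envelope accounting helps separate accuracy gains that come from added capacity from those that come from the architectural change itself.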

Published
12/11/2025
BUSS, Eduardo T.; RADMANN, Erick; CONCEICAO, Ruhan; ZATT, Bruno; AGOSTINI, Luciano. CNN Architecture Assessment: Exploring Depth, Width, and Kernel Size for Image Classification. In: ESCOLA REGIONAL DE APRENDIZADO DE MÁQUINA E INTELIGÊNCIA ARTIFICIAL DA REGIÃO SUL (ERAMIA-RS), 1., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 424-427. DOI: https://doi.org/10.5753/eramiars.2025.16782.