A Pruning-Aware Training Strategy for CNN Model Compression Applied to Image Classification
Abstract
This paper presents a novel training strategy for compressing convolutional neural network (CNN) models. Unlike conventional approaches, the proposed strategy uses a pruning-aware scheme over the CNN weights, in which pruning is applied continuously throughout the entire training process, at every mini-batch. The strategy was evaluated on a classification problem of 10,000 images belonging to 10 different classes, using the CIFAR-10 dataset. The results show that approximately 82% of the CNN parameters could be removed while maintaining high accuracy, demonstrating the effectiveness of the pruning-aware weight-removal technique for this specific application.
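This excerpt does not detail the pruning scheme itself, but the idea of pruning at every mini-batch can be illustrated with a minimal sketch of magnitude-based weight pruning interleaved with ordinary gradient steps. The function names, the plain gradient-descent update, and the magnitude criterion below are all illustrative assumptions, not the authors' actual method.

```python
import numpy as np

def prune_smallest(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    This magnitude criterion is an assumption for illustration; the paper's
    aware-pruning rule may differ.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

def train_step(weights, grad, lr=0.1, sparsity=0.82):
    """One mini-batch update followed immediately by pruning (hypothetical)."""
    weights = weights - lr * grad               # ordinary gradient step
    return prune_smallest(weights, sparsity)    # prune after every mini-batch
```

In this sketch the pruning mask is recomputed after every mini-batch, so weights zeroed early can regrow and be reconsidered later in training, which is the key difference from one-shot post-training pruning.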