Avoiding Overfitting: New Algorithms to Improve Generalization in Convolutional Neural Networks
Abstract
Deep Learning has achieved state-of-the-art results in several domains, such as image processing, natural language processing, and audio processing. To accomplish such results, it uses neural networks with several processing layers along with massive amounts of labeled data. One particular family of Deep Learning models is Convolutional Neural Networks (CNNs), which rely on convolutional layers derived from digital signal processing and are especially effective at detecting relevant features in unstructured data, such as audio and images. One way to improve CNN results is to use regularization algorithms, which make the training process harder but yield models that generalize better at inference time. The present work contributes to the area of regularization methods for CNNs, proposing new methods for different image processing tasks. This thesis presents a collection of works developed by the author during the research period, published or submitted to date, comprising: (i) a survey listing recent regularization works and highlighting the solutions and open problems of the area; (ii) a neuron dropping method applied to the tensors generated during CNN training; (iii) a variation of that method with different dropping rules, targeting different features of the tensor; and (iv) a label regularization algorithm applied to different image processing problems.
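To make the two main regularization ideas concrete, the sketch below illustrates (a) a MaxDropout-style layer that drops the most active neurons of a feature tensor during training, and (b) classic label smoothing as a representative label regularization technique. This is a minimal NumPy sketch under stated assumptions: the function names, the quantile-based dropping threshold, and the default rates are illustrative choices, not the exact formulations proposed in the thesis.

```python
import numpy as np

def maxdropout_style(x: np.ndarray, drop_rate: float = 0.3,
                     training: bool = True) -> np.ndarray:
    """MaxDropout-style regularization (sketch): instead of dropping
    neurons at random, zero out the *largest* activations of the tensor.

    The quantile-based threshold is an illustrative assumption; the
    thesis's exact dropping rule may differ.
    """
    if not training or drop_rate <= 0.0:
        return x
    # Normalize activations to [0, 1] so the threshold is scale-free.
    norm = (x - x.min()) / (x.max() - x.min() + 1e-12)
    # Keep everything below the (1 - drop_rate) quantile; drop the rest.
    keep = norm <= np.quantile(norm, 1.0 - drop_rate)
    return x * keep

def smooth_labels(one_hot: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """Classic label smoothing: soften one-hot targets toward the
    uniform distribution, a standard form of label regularization."""
    n_classes = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / n_classes
```

In a training loop, a layer like the first would be applied to intermediate feature tensors (with `training=False` at inference), while the smoothed labels would replace the hard one-hot targets in the cross-entropy loss.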