Optimizing data augmentation policies for convolutional neural networks based on classification of sickle cells

  • Matheus da Silva UFV
  • Larissa Rodrigues UFV
  • João Fernando Mari UFV

Abstract


Data augmentation is a key procedure in many image classification tasks, mainly because it mitigates the problem of small datasets. In this work, we treat the choice of data augmentation as a hyperparameter optimization problem and apply it to the classification of erythrocytes to assist in the diagnosis of sickle cell anemia. We propose an approach that uses Bayesian optimization to search for the best augmentation policies. We also developed a convolutional neural network (CNN) architecture from scratch and compared it with two classic architectures for classifying sickle cell images. Our approach selects the best data augmentation policies and then applies them when the CNN is trained in the final training stage. All experiments were validated using a stratified five-fold cross-validation procedure, and our best result reaches an accuracy of 92.54%. The results suggest that augmentation policies found through optimization yield better results than the other strategies tested: no data augmentation, several randomly defined image transformations, and random rotations only. As far as we know, our paper is the first to propose optimizing data augmentation policies for biomedical images, which may lead to a better exploration of these strategies in several fields.
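The stratified five-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the round-robin assignment are assumptions chosen only to show how each fold can keep roughly the same class proportions as the full dataset.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of sample indices with per-class counts balanced across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a near-equal share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_validation_splits(labels, k=5, seed=0):
    """Yield (train_indices, validation_indices), one pair per held-out fold."""
    folds = stratified_k_fold(labels, k, seed)
    for held_out in range(k):
        validation = sorted(folds[held_out])
        train = sorted(
            index
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for index in fold
        )
        yield train, validation


# Toy labels (hypothetical): three classes with unequal sizes.
labels = ["a"] * 20 + ["b"] * 20 + ["c"] * 10
splits = list(train_validation_splits(labels))
```

Each candidate augmentation policy could then be scored by training on `train` and measuring accuracy on `validation` for every split, with the mean over the five folds serving as the value the optimizer tries to maximize.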

Keywords: sickle cell, medical imaging, deep learning, data augmentation, Bayesian optimization

Published
07/10/2020
How to Cite

DA SILVA, Matheus; RODRIGUES, Larissa; MARI, João Fernando. Optimizing data augmentation policies for convolutional neural networks based on classification of sickle cells. In: WORKSHOP DE VISÃO COMPUTACIONAL (WVC), 16., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 46-51. DOI: https://doi.org/10.5753/wvc.2020.13479.