Braille character detection using deep neural networks for an educational robot for visually impaired people
Abstract
Teaching computer programming to visually impaired students is a challenging task that has attracted considerable interest, in part due to its specific demands. Robotics is one of the strategies adopted to support this task. Donnie, a system that uses robotics to teach programming to the visually impaired, depends critically on detecting Braille characters in a scaled-down environment. In this paper, we investigate the current state of the art in Braille letter detection based on deep neural networks. To this end, we provide a novel public dataset with 2,818 labeled images of Braille characters, classified into the letters of the alphabet, and we present a comparison among recent detection methods. The resulting Braille letter detection method can assist in teaching programming to blind students in a scaled-down physical environment. A further contribution is the proposal of EVA (Ethylene Vinyl Acetate) pieces with pins to represent Braille letters in this environment.
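For context on the 26 letter classes mentioned above: in the standard 6-dot Braille cell, dots are numbered 1–3 down the left column and 4–6 down the right, and each letter a–z corresponds to a fixed subset of raised dots. The following minimal sketch (illustrative helper names only, not code from the paper or its detection pipeline) shows how a detected dot pattern maps back to a letter label:

```python
# Standard 6-dot Braille cell: dots 1-3 form the left column (top to
# bottom) and dots 4-6 the right column. Each letter a-z is encoded
# by a fixed subset of raised dots.
BRAILLE_DOTS = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4},
    "j": {2, 4, 5}, "k": {1, 3}, "l": {1, 2, 3}, "m": {1, 3, 4},
    "n": {1, 3, 4, 5}, "o": {1, 3, 5}, "p": {1, 2, 3, 4},
    "q": {1, 2, 3, 4, 5}, "r": {1, 2, 3, 5}, "s": {2, 3, 4},
    "t": {2, 3, 4, 5}, "u": {1, 3, 6}, "v": {1, 2, 3, 6},
    "w": {2, 4, 5, 6}, "x": {1, 3, 4, 6}, "y": {1, 3, 4, 5, 6},
    "z": {1, 3, 5, 6},
}

# Inverse lookup table: frozenset of raised dots -> letter.
DOTS_TO_LETTER = {frozenset(d): ch for ch, d in BRAILLE_DOTS.items()}

def decode_cell(raised_dots):
    """Return the letter for a set of raised dots, or None if unknown."""
    return DOTS_TO_LETTER.get(frozenset(raised_dots))
```

Because every letter has a distinct dot pattern, the inverse lookup is well defined; for example, `decode_cell({1})` yields `"a"` and `decode_cell({2, 4, 5, 6})` yields `"w"`.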
References
M. Konecki, N. Ivković, and M. Kaniski, "Making programming education more accessible for visually impaired," in 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), May 2016, pp. 887–890.
A. Hadwen-Bennett, S. Sentance, and C. Morrison, "Making programming accessible to learners with visual impairments: A literature review," International Journal of Computer Science Education in Schools, vol. 2, no. 2, 2018.
S. Ludi and T. Reichlmayr, "The use of robotics to promote computing to pre-college students with visual impairments," ACM Trans. Comput. Educ., vol. 11, no. 3, pp. 20:1–20:20, Oct. 2011. [Online]. Available: http://doi.acm.org/10.1145/2037276.2037284
R. Dorsey, C. H. Park, and A. Howard, "Developing the capabilities of blind and visually impaired youth to build and program robots," 2014.
S. L. Ludi, L. Ellis, and S. Jordan, "An accessible robotics programming environment for visually impaired users," in Proceedings of the 16th international ACM SIGACCESS conference on Computers & accessibility. ACM, 2014, pp. 237–238.
R. P. Barros, A. M. F. Burlamaqui, S. O. de Azevedo, S. T. de Lima Sa, L. M. G. Goncalves, and A. A. R. S. da Silva Burlamaqui, "Cardbot assistive technology for visually impaired in educational robotics: Experiments and results," IEEE Latin America Transactions, vol. 15, no. 3, pp. 517–527, March 2017.
G. H. M. Marques, D. C. Einloft, A. C. P. Bergamin, J. A. Marek, R. G. Maidana, M. B. Campos, I. H. Manssour, and A. M. Amory, "Donnie robot: Towards an accessible and educational robot for visually impaired people," in 2017 Latin American Robotics Symposium (LARS) and 2017 Brazilian Symposium on Robotics (SBR), Nov 2017, pp. 1–6.
S. Isayed and R. Tahboub, "A review of optical Braille recognition," 2015 2nd World Symposium on Web Applications and Networking, WSWAN 2015, pp. 1–6, 2015.
B. Nugroho, I. Ardiyanto, and H. A. Nugroho, “Review of optical braille recognition using camera for image acquisition,” in 2018 2nd International Conference on Biomedical Engineering (IBIOMED). IEEE, 2018, pp. 106–110.
G. Morgavi and M. Morando, “A neural network hybrid model for an optical braille recognitor,” in International Conference on Signal, Speech and Image Processing, vol. 2002, 2002.
L. Wong, W. Abdulla, and S. Hussmann, "A software algorithm prototype for optical recognition of embossed braille," in Proceedings - International Conference on Pattern Recognition, vol. 2, 2004, pp. 586–589.
S. Zhang and K. Yoshino, "A braille recognition system by the mobile phone with embedded camera," in Second International Conference on Innovative Computing, Information and Control (ICICIC 2007), Sep. 2007, pp. 223–223.
J. Li and X. Yan, "Optical braille character recognition with support vector machine classifier," in 2010 International Conference on Computer Application and System Modeling (ICCASM), vol. 12. IEEE, 2010, pp. V12-219–V12-222.
M. Waleed, “Braille identification system using artificial neural networks,” Tikrit Journal of Pure Science, vol. 22, no. 2, 2017.
T. Li, X. Zeng, and S. Xu, “A deep learning method for Braille recognition,” Proceedings - 2014 6th International Conference on Computational Intelligence and Communication Networks, CICN 2014, pp. 1092–1095, 2014.
Y. Shimomura, H. Kawabe, H. Nambo, and S. Seto, “Construction of restoration system for old books written in braille,” in International conference on management science and engineering management. Springer, 2017, pp. 469–477.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Computer Vision (ICCV), 2017 IEEE International Conference on. IEEE, 2017, pp. 2980–2988.
S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Advances in neural information processing systems, 2015, pp. 91–99.
R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, Jun 2014. [Online]. Available: http://dx.doi.org/10.1109/CVPR.2014.81
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779–788.
C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818–2826.
F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size," arXiv preprint arXiv:1602.07360, 2016.
A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
T. Lin, M. Maire, S. J. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: common objects in context," CoRR, vol. abs/1405.0312, 2014. [Online]. Available: http://arxiv.org/abs/1405.0312