Using syntactic methods and LSTM to the recognition of objects visual patterns
Abstract
In this paper, we propose a new approach to representing and recognizing the visual patterns of objects using syntactic methods. We capture relevant information from an object and associate it with symbols of an alphabet. We then derive a string from the object and input it to a Long Short-Term Memory (LSTM) network. The idea is to train the LSTM on the visual patterns of objects encoded in these strings. We conducted an experiment using aerial images of soybean crops captured by an Unmanned Aerial Vehicle (UAV) and reached an average F-measure of 91%.
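The symbol-assignment and string-derivation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features are extracted or how the alphabet is built, so the 2-D descriptors, the three-symbol alphabet, the centroid values, and every function name below are hypothetical. The sketch assumes each local descriptor is mapped to the symbol of its nearest centroid, and that the resulting string is converted to padded integer indices, a common input form for an LSTM embedding layer.

```python
from math import dist

# Hypothetical alphabet: each symbol stands for one group of local descriptors.
# The centroid values are made up for illustration only.
CENTROIDS = {
    "a": (0.1, 0.1),
    "b": (0.9, 0.1),
    "c": (0.5, 0.9),
}


def nearest_symbol(descriptor, centroids=CENTROIDS):
    """Return the symbol whose centroid is closest (Euclidean) to the descriptor."""
    return min(centroids, key=lambda sym: dist(descriptor, centroids[sym]))


def object_to_string(descriptors, centroids=CENTROIDS):
    """Derive a string by mapping each descriptor, in order, to a symbol."""
    return "".join(nearest_symbol(d, centroids) for d in descriptors)


def string_to_indices(word, alphabet):
    """Map symbols to 1-based integer indices; 0 is reserved for padding."""
    lookup = {sym: i + 1 for i, sym in enumerate(alphabet)}
    return [lookup[ch] for ch in word]


def pad(seq, length, pad_value=0):
    """Right-pad (or truncate) a sequence to a fixed length."""
    return (seq + [pad_value] * length)[:length]


# Hypothetical descriptors extracted from one object, in a fixed traversal order.
descriptors = [(0.0, 0.2), (0.8, 0.0), (0.6, 1.0), (0.2, 0.0)]

word = object_to_string(descriptors)
indices = pad(string_to_indices(word, sorted(CENTROIDS)), 6)

print(word, indices)  # prints: abca [1, 2, 3, 1, 0, 0]
```

In a full pipeline, sequences like `indices` would be batched and passed through an embedding layer into an LSTM classifier. That part is omitted here because the abstract gives no architectural details.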