Study of Convolutional Neural Networks applied to Image Stereo Matching

  • João Pedro Poloni Ponce UFABC
  • Ricardo Suyama UFABC

Abstract


Stereo images are formed from two or more sources that capture the same scene, making it possible to infer the depth of the scene under analysis. Convolutional neural networks have been shown to be a viable way to process these images because they find the correspondence between the images quickly. This raises questions about how structural parameters, such as kernel size, stride, and pooling policy, influence the performance of the network. To this end, this work sought to reproduce an article on the topic and to explore the influence of these parameters on the error rate and loss of the neural model. The results obtained reveal improvements. The parameters also had a notable influence on training time: even with a GPU, training times at the two extremes differed by a factor of six.
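
As a rough illustration of why kernel size, stride, and pooling matter, the sketch below computes how each choice changes the spatial size of a feature map, using the standard output-size formula floor((n + 2p - k) / s) + 1. The function names, the 37-pixel input patch, and the three layer stacks are assumptions chosen for illustration; they are not the configurations evaluated in this work.

```python
def conv_output_size(input_size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after one convolution or pooling layer (floor convention)."""
    if input_size + 2 * padding < kernel_size:
        raise ValueError("kernel is larger than the padded input")
    return (input_size + 2 * padding - kernel_size) // stride + 1


def stack_output_size(input_size: int, layers: list[tuple[int, int]]) -> int:
    """Apply a sequence of (kernel_size, stride) layers without padding."""
    size = input_size
    for kernel_size, stride in layers:
        size = conv_output_size(size, kernel_size, stride)
    return size


# Hypothetical layer stacks, not taken from the paper.
configs = {
    "four 3x3 convs, stride 1": [(3, 1)] * 4,
    "two 5x5 convs, stride 1": [(5, 1)] * 2,
    "3x3 conv, 2x2 pool (stride 2), 3x3 conv": [(3, 1), (2, 2), (3, 1)],
}

for name, layers in configs.items():
    print(f"{name}: 37 -> {stack_output_size(37, layers)}")
```

With a 37-pixel input, the first two stacks both yield a 29-pixel map, while the stack with pooling yields 15. Smaller maps mean fewer activations per layer, which is one plausible reason structural choices affect training time.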

Published
07/11/2020
PONCE, João Pedro Poloni; SUYAMA, Ricardo. Study of Convolutional Neural Networks applied to Image Stereo Matching. In: WORKSHOP DE TRABALHOS DA GRADUAÇÃO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 33., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 175-178. DOI: https://doi.org/10.5753/sibgrapi.est.2020.13005.