Study of Convolutional Neural Networks applied to Image Stereo Matching
Abstract—Stereo images are formed from two or more sources that capture the same scene, making it possible to infer the depth of the scene under analysis. The use of convolutional neural networks to compute the correspondence between these images has proven to be a viable alternative because of their speed. This raises questions about the influence of structural parameters, such as kernel size, stride, and pooling policy, on the performance of the neural network. To that end, this work sought to reproduce an article on the topic and to explore the influence of the parameters above on the error rate and loss of the neural model. The results obtained reveal improvements. The influence of the parameters on the training time of the models was also notable: even using a GPU, the training time between the maximum and minimum parameter limits differed by a factor of six.
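The structural parameters studied here interact directly through the convolution output-size formula, floor((n + 2p - k) / s) + 1: larger kernels, larger strides, and more aggressive pooling all shrink the feature maps, which is one reason they change both accuracy and training time so strongly. As a minimal illustration (not the network from the reproduced article), the following sketch computes a naive 2-D convolution and max pooling in plain numpy so the effect of kernel size, stride, and pooling window on the output dimensions can be checked by hand:

```python
import numpy as np

def conv2d_output_size(n, k, s, p=0):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def conv2d(img, kernel, stride=1):
    """Naive 'valid' 2-D convolution (cross-correlation, as in CNNs)."""
    k = kernel.shape[0]
    out_h = conv2d_output_size(img.shape[0], k, stride)
    out_w = conv2d_output_size(img.shape[1], k, stride)
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.random.rand(9, 9)
f = conv2d(img, np.ones((3, 3)) / 9, stride=2)  # 9x9 -> 4x4
print(f.shape)            # (4, 4)
print(max_pool(f).shape)  # (2, 2)
```

Each halving of the feature-map side roughly quarters the work of the following layer, which is consistent with the large training-time spread reported between the parameter extremes.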