Dense 3D Indoor Scene Reconstruction from Spherical Images
Abstract
Image-based techniques for 3D scene reconstruction are popular and support a number of secondary applications. Traditional approaches require several captures to cover a whole environment because of the narrow field of view (FoV) of pinhole-based (perspective) cameras. This paper summarizes the main contributions of the Ph.D. thesis of the same title, which addresses the 3D scene reconstruction problem using omnidirectional (spherical or 360°) cameras, which have a 360° × 180° FoV. Although spherical imagery benefits from the full FoV, it is also challenging because of the inherent distortions involved in capturing and representing such images, which may compromise the use of many well-established image processing and computer vision algorithms. The thesis introduces novel methodologies for estimating dense depth maps from two or more uncalibrated and temporally unordered 360° images. It also presents a framework for inferring depth from a single spherical image. We validate our approaches using both synthetic data and computer-generated imagery, showing competitive results compared with other state-of-the-art methods.