Fast and Robust 3D Reconstruction Solution from Permissive Open-Source Code

Authors

V. G. de M. Lyra, A. H. M. Pinto, G. C. R. Lima, J. P. Lima, V. Teichrieb, J. P. Quintino, F. Q. B. da Silva, A. L. M. Santos, H. Pinho

DOI:

https://doi.org/10.5753/jis.2021.2065

Keywords:

reconstruction, photogrammetry, permissive license, batch, texture

Abstract

With broader access to faster computers and more powerful cameras, 3D reconstruction of objects has become a topic of growing research interest and public demand. It is widely applied to building virtual environments, creating object models, and related tasks. One technique for obtaining 3D data is photogrammetry, which maps objects and scenes using only images. However, the process is computationally expensive and can be very time-consuming for large datasets. This paper proposes a robust, efficient reconstruction pipeline with low runtime in batch processing, built from permissively licensed code, so it can even be commercialized without keeping the source open. We combine an improved structure-from-motion algorithm with a recurrent multi-view stereo reconstruction, and use the Point Cloud Library for normal estimation, surface reconstruction, and texture mapping. We compare our results with state-of-the-art techniques on public benchmarks and on our own datasets. The results show a 69.4% decrease in average execution time with high reconstruction quality, although more images are needed to achieve a complete reconstruction.
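
As a rough illustration of the PCL-based stages mentioned in the abstract (normal estimation followed by surface reconstruction), the C++ sketch below chains PCL's NormalEstimationOMP with its Poisson reconstruction class. The input file name, the neighborhood size, the octree depth, and the choice of Poisson itself are assumptions made for illustration; the paper's actual surface method and parameters may differ, and the texture-mapping step is omitted here.

// Minimal sketch: PCL normal estimation + Poisson surface reconstruction.
// File names and parameters are illustrative assumptions, not the paper's configuration.
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/io/ply_io.h>
#include <pcl/common/io.h>
#include <pcl/search/kdtree.h>
#include <pcl/features/normal_3d_omp.h>
#include <pcl/surface/poisson.h>

int main()
{
    // Dense cloud produced by the multi-view stereo stage (hypothetical file).
    pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
    if (pcl::io::loadPCDFile("dense_cloud.pcd", *cloud) < 0)
        return -1;

    // Estimate per-point normals with a k-d tree neighborhood (k = 20 is an assumption).
    // Note: Poisson expects consistently oriented normals; orientation handling is skipped here.
    pcl::NormalEstimationOMP<pcl::PointXYZ, pcl::Normal> ne;
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
    ne.setInputCloud(cloud);
    ne.setSearchMethod(tree);
    ne.setKSearch(20);
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
    ne.compute(*normals);

    // Merge points and normals, since the surface step needs oriented points.
    pcl::PointCloud<pcl::PointNormal>::Ptr oriented(new pcl::PointCloud<pcl::PointNormal>);
    pcl::concatenateFields(*cloud, *normals, *oriented);

    // Poisson surface reconstruction; depth controls the octree resolution.
    pcl::Poisson<pcl::PointNormal> poisson;
    poisson.setDepth(9);
    poisson.setInputCloud(oriented);
    pcl::PolygonMesh mesh;
    poisson.reconstruct(mesh);

    // Save the mesh; texture mapping would follow as a separate step.
    return pcl::io::savePLYFile("mesh.ply", mesh) == 0 ? 0 : -1;
}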

Published

2021-11-16

How to Cite

LYRA, V. G. de M.; PINTO, A. H. M.; LIMA, G. C. R.; LIMA, J. P.; TEICHRIEB, V.; QUINTINO, J. P.; SILVA, F. Q. B. da; SANTOS, A. L. M.; PINHO, H. Fast and Robust 3D Reconstruction Solution from Permissive Open-Source Code. Journal on Interactive Systems, Porto Alegre, RS, v. 12, n. 1, p. 206–218, 2021. DOI: 10.5753/jis.2021.2065. Available at: https://sol.sbc.org.br/journals/index.php/jis/article/view/2065. Accessed: 20 Apr. 2024.

Issue

Vol. 12 No. 1 (2021)

Section

Regular Paper
