ABSTRACT
Research on the Fish Tank Virtual Reality (FTVR) technique commonly relies on dedicated sensors (e.g., infrared cameras and LEDs mounted on glasses) to estimate the user’s eye position. However, estimating the face position with an ordinary RGB camera is becoming increasingly accessible. In this work, we explore publicly available facial-feature detection software to bring the FTVR technique to everyday 3D applications on consumer notebooks, without requiring extra devices. We introduce the Parallax Engine, a solution that can be easily added to any Unity application. It supports two parallax-related visualization options: 1) a monoscopic FTVR mode (FishTank), which locks the virtual camera of the 3D environment to the laptop’s screen, and 2) a 2D parallax mode (Parallax2DoF), which allows horizontal and vertical displacement of the 3D scene camera. For facial-feature detection, the Parallax Engine uses a standardized interface that can receive input from different methods and currently supports three options: Google’s MediaPipe, dlib, and PoseNet. We evaluated the proposed solution with five users performing tasks under different combinations of visualization and facial-feature detection options, aiming to understand how suitable it is for end users. Despite some detection failures from dlib, results showed good overall acceptance of both the FishTank and Parallax2DoF visualization options.
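The abstract describes a standardized interface that accepts a face position from any detector (MediaPipe, dlib, or PoseNet) and drives the Parallax2DoF mode by shifting the virtual camera horizontally and vertically. A minimal sketch of that idea follows; this is not the authors' code, and all names here (`FacePosition`, `parallax_2dof_offset`, `max_shift`) are illustrative assumptions, not the Parallax Engine's actual API.

```python
# Hedged sketch of a detector-agnostic face-position interface driving a
# 2DoF parallax camera offset. Any face detector that can report a
# normalized face position could feed this mapping.

from dataclasses import dataclass


@dataclass
class FacePosition:
    """Normalized image coordinates: (0, 0) = top-left, (1, 1) = bottom-right."""
    x: float
    y: float


def parallax_2dof_offset(face: FacePosition, max_shift: float = 0.1):
    """Map a normalized face position to a horizontal/vertical camera
    displacement (in scene units). The camera moves opposite to the head
    so the scene appears to stay fixed behind the screen, producing the
    motion-parallax effect."""
    dx = -(face.x - 0.5) * 2.0 * max_shift  # head right -> camera left
    dy = (face.y - 0.5) * 2.0 * max_shift   # head down -> camera down
    return dx, dy
```

In a Unity integration, values like `dx, dy` would be applied to the scene camera's local position each frame; the FishTank mode would instead keep the camera locked to the screen and adjust the projection.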
Parallax Engine: Head Controlled Motion Parallax Using Notebooks’ RGB Camera