Special Section on SIBGRAPI 2021

Spatially and color consistent environment lighting estimation using deep neural networks for mixed reality
Introduction
Consistent environment lighting is a crucial component of real-time mixed reality applications. A divergence between the lighting of real and virtual objects is a significant cause of immersion loss and of a perceived reduction in graphical quality [1]. Plausible mixed reality lighting can be accomplished by acquiring the lighting of the real environment and adapting the virtual environment to matching lighting properties [2]. Most lighting recovery approaches have relied on intrusive tools to measure the environment lighting, requiring considerable user effort and scenario preparation. Consequently, these solutions have limited applicability in XR systems based on real-time visualization. An alternative to direct measurement is to estimate the lighting indirectly from the available environment information. Despite recent advances in computer vision and inverse rendering [3], estimating the environment lighting without specialized equipment and under strict time constraints remains a challenging problem [4]. This work aims to recognize the user's environment lighting through a model that learns the scene's inherent characteristics regarding lighting and illumination, thereby estimating an environment lighting capable of generating plausible XR environments. A challenging aspect resides in the fact that lighting estimation is an ill-posed problem, yielding no solution or multiple solutions for a given input [5].
We use machine learning techniques and a specialized dataset to overcome the complex aspects of lighting estimation, learning from information readily available in mixed reality applications: an RGB image of the environment taken from an egocentric point of view.
We advance state-of-the-art lighting estimation methods by predicting the real-world environment lighting with a convolutional neural network that works in the wild, without assumptions about the scene's geometry and without special measurement devices. Our method works in a variety of environments, including indoor and outdoor scenes, and does not require any user intervention in the scene. Our custom-designed CNN architecture learns a latent-space representation of the environment lighting, allowing an efficient representation of the scene illumination. This representation is used to estimate the environment lighting encoded in a spherical harmonics basis. We also present a framework to create a mixed-reality-view, an image that mimics the user's egocentric view in an XR environment.
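As an illustrative sketch only (the paper's actual CNN architecture, layer sizes, and names are not reproduced here), an encoder that maps an egocentric RGB view to a latent code and regresses 3 × 9 spherical harmonics coefficients could be structured as follows in PyTorch:

```python
import torch
import torch.nn as nn

class SHLightingNet(nn.Module):
    """Hypothetical sketch of an encoder-regressor: an RGB view is encoded
    into a latent lighting code, from which 3 x 9 spherical harmonics
    coefficients (9 per color channel) are regressed. All layer sizes and
    names here are assumptions for illustration."""

    def __init__(self, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),          # latent lighting code
        )
        self.head = nn.Linear(latent_dim, 27)   # 9 SH coefficients per RGB channel

    def forward(self, x):
        return self.head(self.encoder(x)).view(-1, 3, 9)
```

The latent bottleneck forces the network to summarize the scene's illumination before the coefficients are regressed, which is the role the latent space plays in the architecture described above.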
Fig. 1 illustrates examples where virtual objects are illuminated by our method. The composition of real and virtual objects yields a plausible and realistic XR environment.
The main contributions of our work are:
- Automatic, end-to-end method to estimate the environment lighting in XR applications.
- A CNN architecture that learns a latent space of the environment lighting.
- A methodology to generate egocentric mixed-reality-views from HDR panoramas.
- Real-time lighting estimation that makes no assumptions about the XR scene.
The lighting estimation model developed in this work can be employed in most XR applications, increasing the user's immersion by providing lighting consistency. Its applicability is not restricted to mixed reality; other applications also benefit from it, including real-time video and photo editing with consistent illumination, real-time relighting of pictures, and inverse lighting design [6].
Related work
Many related works address the lighting estimation task under different assumptions or strategies. In the following subsections, we group them into categories and compare them with our proposed solution. In addition, we highlight the limitations and restrictions of prior works concerning XR applications where appropriate.
A CNN method for environment lighting estimation based on spherical harmonics functions
Our goal is to recognize the real-world environment lighting and transfer this lighting information to virtual environments, allowing a more convincing lighting composition for XR experiences. We explore spherical harmonics functions to encode the environment lighting into a compact and expressive representation. This strategy can represent smooth arbitrary area lighting and is not limited to a few point or directional light sources [51].
Our model is based on a convolutional neural network
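To illustrate the spherical harmonics encoding, the following NumPy sketch projects an equirectangular HDR panorama onto the first nine real SH basis functions per color channel, yielding the 3 × 9 coefficient representation. The function names and discretization are ours, not the paper's:

```python
import numpy as np

def sh_basis(d):
    """First 9 real spherical harmonics evaluated at unit directions d (..., 3)."""
    x, y, z = d[..., 0], d[..., 1], d[..., 2]
    return np.stack([
        0.282095 * np.ones_like(x),           # Y_0^0
        0.488603 * y,                         # Y_1^-1
        0.488603 * z,                         # Y_1^0
        0.488603 * x,                         # Y_1^1
        1.092548 * x * y,                     # Y_2^-2
        1.092548 * y * z,                     # Y_2^-1
        0.315392 * (3 * z**2 - 1),            # Y_2^0
        1.092548 * x * z,                     # Y_2^1
        0.546274 * (x**2 - y**2),             # Y_2^2
    ], axis=-1)

def project_panorama_to_sh(pano):
    """Project an equirectangular HDR panorama (H, W, 3) onto 3 x 9 SH coefficients."""
    h, w, _ = pano.shape
    theta = (np.arange(h) + 0.5) / h * np.pi        # polar angle per row
    phi = (np.arange(w) + 0.5) / w * 2 * np.pi      # azimuth per column
    phi, theta = np.meshgrid(phi, theta)
    d = np.stack([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)], axis=-1)
    # solid angle of each pixel: sin(theta) * dtheta * dphi
    dw = np.sin(theta) * (np.pi / h) * (2 * np.pi / w)
    basis = sh_basis(d)                             # (H, W, 9)
    # c[k, j] = sum over pixels of L_k * Y_j * dOmega, per color channel k
    return np.einsum('hwk,hwj,hw->kj', pano, basis, dw)
```

For a constant unit-radiance panorama, only the DC coefficient survives (≈ 0.282095 · 4π), which is a quick sanity check on the projection.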
Learning from HDR panoramas
In this section, we describe the complete pipeline to process the input HDR environment panorama into mixed-reality-views and the corresponding environment lighting. The mixed-reality-view (MRV) is a low-dynamic-range (LDR) color image similar to a photograph taken from a camera located in the HMD capturing an egocentric view of the user’s environment. Spherical harmonics coefficients encode an area light model that represents the environment lighting. Those data are used for training our
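A pipeline of this kind, extracting a pinhole-camera view from the equirectangular panorama and converting HDR radiance to an LDR image, can be sketched as below. The sampling conventions, field of view, and tone-mapping operator are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def hdr_to_ldr(hdr, exposure=1.0, gamma=2.2):
    """Simple exposure + gamma tone mapping from HDR radiance to 8-bit LDR."""
    ldr = np.clip(hdr * exposure, 0.0, 1.0) ** (1.0 / gamma)
    return (ldr * 255).astype(np.uint8)

def egocentric_view(pano, yaw_deg=0.0, fov_deg=90.0, out_hw=(256, 256)):
    """Sample a pinhole-camera view from an equirectangular panorama (H, W, 3).
    The camera looks along `yaw_deg` on the horizon (zero pitch/roll assumed)."""
    h, w, _ = pano.shape
    h_out, w_out = out_hw
    f = 0.5 * w_out / np.tan(np.radians(fov_deg) / 2)   # focal length in pixels
    u, v = np.meshgrid(np.arange(w_out) - w_out / 2 + 0.5,
                       np.arange(h_out) - h_out / 2 + 0.5)
    yaw = np.radians(yaw_deg)
    # camera rays (x right, y up, z forward), rotated by yaw about the up axis
    x = np.cos(yaw) * u + np.sin(yaw) * f
    z = -np.sin(yaw) * u + np.cos(yaw) * f
    y = -v
    n = np.sqrt(x**2 + y**2 + z**2)
    lon = np.arctan2(x, z)                      # azimuth in [-pi, pi]
    lat = np.arccos(np.clip(y / n, -1, 1))      # polar angle in [0, pi]
    col = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
    row = np.clip((lat / np.pi * h).astype(int), 0, h - 1)
    return pano[row, col]
```

Composing the two, `hdr_to_ldr(egocentric_view(pano, yaw))` produces an LDR image resembling a photograph taken from the user's viewpoint, which matches the role of the mixed-reality-view described above.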
Results, experiments and performance
In this section, we show the results of our method and discuss the XR applications that are made possible by our lighting estimation method.
Conclusions
In this work, we introduced a new real-time environment lighting model that is able to compute plausible estimated environment lighting for XR applications directly from mixed-reality-views without prior constraints. Unlike previous approaches, we neither rely on constraints on the scene geometry and lighting settings nor require the use of light probes.
The environment lighting produced is encoded as 3 × 9 spherical harmonic coefficients (9 for each color channel) predicted by a new deep neural
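Given such 3 × 9 coefficients, diffuse shading of a virtual object reduces to the classic spherical harmonics irradiance evaluation of Ramamoorthi and Hanrahan, convolving the lighting coefficients with a fixed per-band kernel. A minimal sketch, with function names of our own choosing:

```python
import numpy as np

# per-band convolution weights turning radiance SH coefficients into irradiance
A = np.array([np.pi] + [2 * np.pi / 3] * 3 + [np.pi / 4] * 5)

def sh9(n):
    """First 9 real spherical harmonics at a unit normal n = (x, y, z)."""
    x, y, z = n
    return np.array([0.282095,
                     0.488603 * y, 0.488603 * z, 0.488603 * x,
                     1.092548 * x * y, 1.092548 * y * z,
                     0.315392 * (3 * z**2 - 1),
                     1.092548 * x * z, 0.546274 * (x**2 - y**2)])

def irradiance(coeffs, normal):
    """coeffs: (3, 9) SH lighting; normal: unit surface normal. Returns RGB irradiance."""
    return coeffs @ (A * sh9(normal))
```

For a unit-radiance ambient environment (only the DC coefficient set), the irradiance at any normal is π, the expected analytic value, which makes the formula easy to validate.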
CRediT authorship contribution statement
Bruno Augusto Dorta Marques: Conceptualization, Methodology, Software, Investigation, Writing. Esteban Walter Gonzalez Clua: Conceptualization, Supervision. Anselmo Antunes Montenegro: Investigation, Writing. Cristina Nader Vasconcelos: Conceptualization, Supervision.
Declaration of Competing Interest
One or more of the authors of this paper have disclosed potential or pertinent conflicts of interest, which may include receipt of payment, either direct or indirect, institutional support, or association with an entity in the biomedical field which may be perceived to have potential conflict of interest with this work. For full disclosure statements refer to https://doi.org/10.1016/j.cag.2021.08.007. This research was supported by CAPES, NVIDIA, CNPq and FAPERJ.
Acknowledgments
This research has been supported by the following Brazilian research agencies: CAPES, CNPq and FAPERJ. We would also like to thank NVIDIA Corp., USA, for providing GPUs and funding this work.
References (65)

- et al. Inverse lighting design for interior buildings integrating natural and artificial sources. Comput Graph (2012)
- et al. Reciprocal shading for mixed reality. Comput Graph (2012)
- et al. Separating corneal reflections for illumination estimation. Neurocomputing (2008)
- et al. Deep spherical harmonics light probe estimator for mixed reality games. Comput Graph (2018)
- et al. All-weather model for sky luminance distribution: preliminary configuration and validation. Sol Energy (1993)
- et al. Research on convolutional neural network based on improved ReLU piecewise activation function. Procedia Comput Sci (2018)
- et al. Perceptual issues in augmented reality revisited
- et al. Classification of illumination methods for mixed reality
- et al. A survey of inverse rendering problems
- et al. A survey on image-based approaches of synthesizing objects
- A signal-processing framework for inverse rendering
- Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography
- Dynamic HDR environment capture for mixed reality
- Differential irradiance caching for fast high-quality light transport between virtual and real worlds
- The shading probe: fast appearance acquisition for mobile AR
- Inverse lighting and photorealistic rendering for augmented reality. Vis Comput
- DeepLight: learning illumination for unconstrained mobile mixed reality
- Learning lightprobes for mixed reality illumination
- Learning to estimate indoor lighting from 3D objects
- ElasticFusion: real-time dense SLAM and light source estimation. Int J Robot Res
- 3D high dynamic range dense visual SLAM and its application to real-time object re-lighting
- Efficient and robust radiance transfer for probeless photorealistic augmented reality
- Real-time photometric registration from arbitrary geometry
- Emptying, refurnishing, and relighting indoor spaces. ACM Trans Graph (Proc SIGGRAPH Asia 2016)
- Intrinsic3D: high-quality 3D reconstruction by joint appearance and geometry optimization with spatially-varying lighting
- State of the art on monocular 3D face reconstruction, tracking, and applications
- A morphable model for the synthesis of 3D faces
- Realistic inverse lighting from a single 2D image of a face, taken under unknown and complex lighting
- Efficient and robust inverse lighting of a single face image using compressive sensing
- Lighting design for portraits with a virtual light stage
- Occlusion-aware 3D morphable models and an illumination prior for face image analysis. Int J Comput Vis
- Single image portrait relighting. ACM Trans Graph