Virtual Environment Generation with Artificial Intelligence

  • Paulo Abner Mesquita, Instituto Atlântico
  • Leonardo Rocha, Instituto Atlântico

Abstract


The creation of immersive 360° Virtual Reality (VR) environments remains a significant bottleneck. This pilot study explores leveraging generative artificial intelligence to create panoramic VR environments, represented as cubemaps, directly from textual descriptions. Inspired by recent text-to-3D methodologies but adapted for panoramic output, we propose a framework that uses pre-trained text-to-image diffusion models, specifically Stable Diffusion for creative inpainting and generation, potentially combined with monocular depth estimation for cross-face consistency. The goal is to generate the six faces of a cubemap representing a coherent 360° view suitable for VR applications, emphasizing the creative potential of producing unique environments from simple text prompts. This paper outlines the adapted methodology, discusses the advantages of cubemaps, compares them to other projection techniques (spherical, equirectangular), and highlights the potential for democratizing VR panorama creation for artists and other creative professionals seeking a tool that puts computational power at the service of their creativity.
Keywords: Virtual Reality, Generative AI, Stable Diffusion, Cubemap, Inpainting
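
The sketch below illustrates the kind of pipeline the abstract describes: a naive per-face generation loop built on a pre-trained Stable Diffusion checkpoint via the Hugging Face diffusers library. It is not the authors' implementation; the model identifier, prompt, and per-face prompt suffixes are illustrative assumptions, and the paper's framework additionally relies on inpainting and monocular depth estimation to keep adjacent faces consistent, which this simple loop does not attempt.

    # Minimal sketch, not the paper's pipeline: naively generate the six faces
    # of a cubemap from a text prompt with a pre-trained Stable Diffusion model
    # through the Hugging Face diffusers library.
    import torch
    from diffusers import StableDiffusionPipeline

    FACES = ["front", "right", "back", "left", "up", "down"]

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # assumed checkpoint; any SD model works
        torch_dtype=torch.float16,
    ).to("cuda")

    prompt = "a misty redwood forest at dawn, photorealistic"
    generator = torch.Generator("cuda").manual_seed(42)

    for face in FACES:
        # One independent 512x512 image per cubemap face. The framework described
        # in the paper additionally uses inpainting and monocular depth estimation
        # so that adjacent faces line up across cubemap seams.
        image = pipe(
            f"{prompt}, {face}-facing view of the same scene",
            height=512,
            width=512,
            generator=generator,
        ).images[0]
        image.save(f"cubemap_{face}.png")

The six saved images can then be loaded as a skybox (cubemap texture) in a VR engine, which is what motivates the cubemap representation discussed in the paper.
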

References

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10684–10695, 2022.

L. Höllein, A. Cao, A. Owens, J. Johnson, and M. Nießner. Text2Room: Extracting textured 3D meshes from 2D text-to-image models. 2023.

A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever. Learning transferable visual models from natural language supervision. 2021.

M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin. Emerging properties in self-supervised vision transformers. 2021.

P. Esser, S. Kulal, A. Blattmann, R. Entezari, J. Müller, H. Saini, Y. Levi, D. Lorenz, A. Sauer, F. Boesel, D. Podell, T. Dockhorn, Z. English, K. Lacey, A. Goodwin, Y. Marek, and R. Rombach. Scaling rectified flow transformers for high-resolution image synthesis. 2024.

B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. 2020.

T. Wu, Y.-J. Yuan, L.-X. Zhang, J. Yang, Y.-P. Cao, L.-Q. Yan, and L. Gao. Recent advances in 3D Gaussian Splatting. 2024.

B. Poole, A. Jain, J. T. Barron, and B. Mildenhall. DreamFusion: Text-to-3D using 2D diffusion. 2022.

S. W. Han and D. Y. Suh. A 360-degree panoramic image inpainting network using a cube map. Computers, Materials and Continua, 66(1):213–228, 2020. DOI: 10.32604/cmc.2020.012223.

B. Martinson. Stumbling into virtual worlds: How resolution affects users' immersion in virtual reality and implications for virtual reality in therapeutic applications. Undergraduate Honors Thesis No. 732, 2022.

Jasperai. Flux.1-dev-controlnet-upscaler. 2024. Accessed 2025-01-16.
Published
30/09/2025
MESQUITA, Paulo Abner; ROCHA, Leonardo. Virtual Environment Generation with Artificial Intelligence. In: SIMPÓSIO DE REALIDADE VIRTUAL E AUMENTADA (SVR), 27., 2025, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 214-219.