AirSnip: Gesture-Based AR Screenshot Enhanced by GenAI
Abstract
This paper presents an interactive tool for image capture in Augmented Reality (AR) environments, based on a custom mid-air hand gesture that mimics the behavior of a snipping tool. The system lets users delimit a rectangular area in space with the gesture, captures the real-world view through the Meta Quest 3's passthrough camera, and generates a 3D virtual frame containing the resulting image. These frames can be manipulated with natural gestures (e.g., push, pinch) and are enhanced with personalized descriptions generated by a Large Language Model (LLM), Google Gemini, with audio output through Text-to-Speech (TTS). Built with Unity and the Meta XR SDK, the prototype offers a seamless and intuitive interface for storytelling, education, and spatial memory. Although user studies are left for future work, the tool demonstrates the technical feasibility of integrating gesture recognition, spatial rendering, and Generative AI in a unified AR experience.
References
Arena, F.; Collotta, M.; Pau, G.; Termine, F. An Overview of Augmented Reality. Computers 2022, 11, 28. DOI: 10.3390/computers11020028
R. Phursule, K. Sirpor, P. Virmalwar, S. Zadbuke and P. Avachat, "Augmented Reality Snipping Tool," 2023 4th International Conference for Emerging Technology (INCET), Belgaum, India, 2023, pp. 1-4. DOI: 10.1109/INCET57972.2023.10170651
Lucas S. Figueiredo, Mariana Pinheiro, Edvar Vilar Neto, Thiago Chaves, Veronica Teichrieb. Sci-Fi Gestures Catalog. 15th Human-Computer Interaction (INTERACT), Sep 2015, Bamberg, Germany, pp. 395-411. DOI: 10.1007/978-3-319-22668-2_30. hal-01599861
M. Chen, A. Monroy-Hernández and M. Sra, "SceneAR: Scene-based Micro Narratives for Sharing and Remixing in Augmented Reality," 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Bari, Italy, 2021, pp. 294-303. DOI: 10.1109/ISMAR52148.2021.00045
S. Lyu, "The Application of Generative AI in Virtual Reality and Augmented Reality," Journal of Industrial Engineering & Applied Science, vol. 2, no. 6, pp. 1-9, Dec. 2024.
Tafadzwa Joseph Dube, Yuan Ren, Hannah Limerick, I. Scott MacKenzie, and Ahmed Sabbir Arif. 2022. Push, Tap, Dwell, and Pinch: Evaluation of Four Mid-air Selection Methods Augmented with Ultrasonic Haptic Feedback. Proc. ACM Hum.-Comput. Interact. 6, ISS, Article 565 (December 2022), 19 pages. DOI: 10.1145/3567718
Published
30/09/2025
How to Cite
OLIVEIRA, Robson; SIMÕES, Francisco. AirSnip: Gesture-Based AR Screenshot Enhanced by GenAI. In: WORKSHOP DE TRABALHOS EM ANDAMENTO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 38., 2025, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 180-183.
