Exploring Visual and Multimodal Interaction in NCL Authoring

  • Paulo Victor Borges (PUC-Rio)
  • Daniel de S. Moraes (PUC-Rio)
  • Joel dos Santos (CEFET-RJ)
  • Débora C. Muchaluat-Saade (UFF)
  • Sérgio Colcher (PUC-Rio)

Abstract

This paper introduces two tools for interactive multimedia authoring with the Nested Context Language (NCL): (i) a visual extension that supports traditional mouse and keyboard interaction and (ii) a multimodal extension that incorporates gesture recognition and voice commands. Both tools were implemented as Visual Studio Code extensions and aim to streamline the editing process, making it more intuitive and accessible. We evaluate the usability and acceptance of both tools in an experiment in which developers carried out three tasks for creating and manipulating spatial regions in hypermedia documents. By exploring the potential of multimodal interfaces, this work sets the stage for more efficient and user-friendly document editing.

Keywords: Authoring, LLMs, NCL, Code Generation, Visual Studio Code

Published
14 October 2024
BORGES, Paulo Victor; MORAES, Daniel de S.; DOS SANTOS, Joel; MUCHALUAT-SAADE, Débora C.; COLCHER, Sérgio. Exploring Visual and Multimodal Interaction in NCL Authoring. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 30., 2024, Juiz de Fora/MG. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 153-161. DOI: https://doi.org/10.5753/webmedia.2024.243157.
