Development and Evaluation of an Air Writing and Drawing System with an Electronic Pen and VLMs

  • Luma T. L. de Souza IFES
  • Rafael A. D. Caldeira IFES
  • Sérgio D. C. Leal IFES
  • Maria Clara P. de Souza IFES
  • Thiago M. Paixão IFES
  • Richard J. M. G. Tello IFES

Abstract


This project was motivated by the need to explore new forms of human–computer interaction and to expand the resources available for writing and drawing in digital environments. To this end, a pen with a bluish LED at its tip was designed to produce a luminous point. The pen body was 3D printed and houses an ESP32 microcontroller with Bluetooth, which enables integration with the computer. A camera captures the movements performed in the air with the pen, and Vision–Language Models (VLMs) recognize both written words and drawn images. For drawings, the system can also generate an enhanced version by using the image description as a reference; this functionality is not explored in the present study. Finally, different models were compared using 12 test words (6 in Portuguese and 6 in English) and 5 drawings from different classes. The best-performing models were Gemini 2.5 Flash, with 84% accuracy in image recognition, and Perplexity.ai's GPT-4.1-based model, with 88.3% accuracy in word recognition.
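
The abstract does not say how the luminous point is located in the camera frames, so the following is only a minimal sketch of one plausible approach: threshold each frame for bluish pixels, take their centroid as the pen tip, and accumulate the positions into a binary canvas that could later be rendered as an image for a VLM. The threshold values and all function names (`is_bluish`, `locate_led`, `build_trajectory`, `rasterize`) are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel 0-255
Frame = List[List[Pixel]]      # frame[row][col]
Point = Tuple[float, float]    # (x, y) in pixel coordinates


def is_bluish(pixel: Pixel, min_blue: int = 200, margin: int = 60) -> bool:
    """Assumed rule: a bright blue channel that clearly dominates red."""
    r, _g, b = pixel
    return b >= min_blue and b - r >= margin


def locate_led(frame: Frame) -> Optional[Point]:
    """Return the centroid of all bluish pixels, or None if none are found."""
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if is_bluish(pixel):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def build_trajectory(frames: List[Frame]) -> List[Point]:
    """Collect one pen-tip position per frame, skipping frames without the LED."""
    trajectory = []
    for frame in frames:
        point = locate_led(frame)
        if point is not None:
            trajectory.append(point)
    return trajectory


def rasterize(trajectory: List[Point], width: int, height: int) -> List[List[int]]:
    """Mark each trajectory point on a blank canvas (1 = ink, 0 = background)."""
    canvas = [[0] * width for _ in range(height)]
    for x, y in trajectory:
        col, row = int(round(x)), int(round(y))
        if 0 <= col < width and 0 <= row < height:
            canvas[row][col] = 1
    return canvas
```

In a real setup the frames would come from the camera and the canvas would be converted to an image before being sent to a recognition model; a practical version would likely also connect consecutive points with line segments and use a color space less sensitive to lighting than raw RGB.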

Published
2025-10-16
SOUZA, Luma T. L. de; CALDEIRA, Rafael A. D.; LEAL, Sérgio D. C.; SOUZA, Maria Clara P. de; PAIXÃO, Thiago M.; TELLO, Richard J. M. G. Development and Evaluation of an Air Writing and Drawing System with an Electronic Pen and VLMs. In: REGIONAL SCHOOL OF INFORMATICS OF ESPÍRITO SANTO (ERI-ES), 10., 2025, Espírito Santo/ES. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 100-109. DOI: https://doi.org/10.5753/eries.2025.16034.