Human-AI Heuristic Evaluation: Uncovering usability insights of an LLM Chatbot Interface for Personalized Learning in Autism

  • Yuri P. S. Zaidan PUCRS
  • Eric F. Monteiro PUCRS
  • Duncan Ruiz PUCRS
  • Afonso Sales PUCRS
  • Milene Silveira PUCRS

Abstract


The rise of Large Language Models (LLMs) and Generative AI (GenAI) offers new possibilities for personalized learning but also introduces usability challenges, especially in applications designed to customize the learning experience for individuals with autism spectrum disorder (ASD). To assess whether an LLM-powered chatbot interface is effective for ASD curriculum personalization, we propose a heuristic usability inspection of the interface carried out by both human experts and an AI agent. Preliminary results reveal human and AI-driven usability insights that point towards a frictionless experience for GenAI educational interventions.

Published
24/11/2025
ZAIDAN, Yuri P. S.; MONTEIRO, Eric F.; RUIZ, Duncan; SALES, Afonso; SILVEIRA, Milene. Human-AI Heuristic Evaluation: Uncovering usability insights of an LLM Chatbot Interface for Personalized Learning in Autism. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 36., 2025, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1647-1656. DOI: https://doi.org/10.5753/sbie.2025.12579.