Investigating the Use of Intelligent Tutors Based on Large Language Models: Automated generation of Business Process Management questions using the Revised Bloom's Taxonomy
Abstract
Building assessment artifacts is a complex task, because producing suitable assessments by hand requires deep knowledge of both the subject area being assessed and the cognitive processes involved in learning. Using Large Language Models (LLMs) as the basis of Intelligent Tutoring Systems can help with this task. This work experiments with the LLMs GPT-3.5-Turbo and Llama-2 as sources for the automatic generation of assessment questions. The experiment applied Prompt Engineering techniques to generate questions for the Business Process Management (BPM) discipline. The results show that both models are able to generate questions appropriate to the BPM context. They also show that Llama-2 produced questions better suited to the desired cognitive level when it received both the context and a template of the question to be generated, while a similar result was observed for GPT-3.5-Turbo when it received only the context.
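The two prompting conditions mentioned in the abstract (context only, and context plus a question template) can be illustrated with a minimal sketch. It is not the authors' actual prompt: the function name `build_prompt`, the instruction wording, and the example context and template are all hypothetical. Only the six level names come from the Revised Bloom's Taxonomy itself.

```python
# Illustrative sketch only: the prompt wording and all names here are
# assumptions, not the prompts used in the experiment.

# Cognitive process levels of the Revised Bloom's Taxonomy, lowest to highest.
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]


def build_prompt(context, level, question_template=None):
    """Assemble a question-generation prompt for a given Bloom level.

    With question_template=None this is the "context only" condition;
    passing a template gives the "context plus template" condition.
    """
    if level not in BLOOM_LEVELS:
        raise ValueError(f"Unknown Bloom level: {level!r}")

    parts = [
        "You are a tutor for a Business Process Management (BPM) course.",
        f"Context:\n{context}",
        f"Write one assessment question at the '{level}' level "
        "of the Revised Bloom's Taxonomy.",
    ]
    if question_template is not None:
        parts.append(f"Follow this question template:\n{question_template}")
    return "\n\n".join(parts)


# Hypothetical inputs, for demonstration only.
context = "A business process is a set of activities that jointly realize a business goal."
context_only = build_prompt(context, "Analyze")
with_template = build_prompt(
    context, "Analyze", question_template="Compare <concept A> and <concept B>."
)
print(with_template)
```

The resulting string would then be sent to whichever model is under test. Keeping the template optional lets the same function produce both conditions, so that the only difference between them is the presence of the template.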