LLM-Powered Educational Conversational Agent for Open Educational Resources

Renan Zafalon da Silva; Paulo Cesar Ramos Pinho; Ulian Gabriel Alff Ramires; Raymundo Carlos Machado Ferreira Filho; Tiago Thompen Primo

doi:10.5753/sbsi.2026.248280

Renan Zafalon da Silva UFPel
Paulo Cesar Ramos Pinho UFPel
Ulian Gabriel Alff Ramires UFPel
Raymundo Carlos Machado Ferreira Filho IFSul
Tiago Thompen Primo UFPel

Abstract

Research Context: Educational repositories gather diverse Open Educational Resources (OER), yet sparse metadata and inconsistent terminology reduce findability. Large Language Models (LLMs) with retrieval-augmented generation (RAG) can bridge vocabulary gaps by capturing semantic similarity, thereby improving recall and user experience.

Scientific and/or Practical Problem: The national OER repository (ProEdu) relies on a purely lexical search engine. As a result, it handles synonyms, paraphrases, and domain shifts poorly, which lowers recall and produces inconsistent rankings.

Proposed Solution and/or Analysis: We develop a prototype of an Educational Conversational Agent (ECA) that integrates retrieval and response generation. We evaluate three pipelines: the existing ProEdu lexical search; a field-weighted Elasticsearch (ES) index; and a semantic RAG pipeline that uses Sentence-Transformers (all-MiniLM-L6-v2) embeddings, a FAISS (Facebook AI Similarity Search) index, a lightweight reranking mechanism, and Llama for text generation.

Related IS Theory: We assert that AI-enhanced repositories lower search barriers and help educators identify suitable materials effectively. Furthermore, the conversational interface reduces cognitive load by presenting verified sources in context.

Research Method: We ran a comparative evaluation with 22 interdisciplinary prompts across ten domains. For each prompt, we established gold-standard datasets, formulated standardized queries, and calculated precision, recall, and F1-score.

Summary of Results: The Llama/FAISS pipeline achieves the best balance between coverage and relevance, driven by high recall. ES attains a similar F1 through higher precision but lower recall. ProEdu performs poorly on F1. Error analysis shows that semantic retrieval excels at cross-vocabulary matches and multi-facet intents.

Contributions and Impact to the IS area: We deliver a replicable benchmark for large-scale OER search (prompts, metrics, code) and a pragmatic architecture that combines semantic RAG and ES to balance recall and precision on cost-efficient infrastructure. Prompt templates and evaluation scripts support adoption.
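
The evaluation described in the abstract compares each pipeline's retrieved resources against a gold-standard set per prompt. The following is a minimal sketch of that computation, not the authors' evaluation script: the function names, the resource identifiers, and the choice of macro-averaging over prompts are assumptions made for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for a single prompt.

    `retrieved` holds the resource IDs a pipeline returned;
    `relevant` holds the gold-standard IDs for the same prompt.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)

    # Guard against empty result lists and empty gold sets.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(per_prompt_scores):
    """Average (precision, recall, F1) triples over prompts.

    Macro-averaging is an assumption here; the abstract does not
    state how per-prompt scores were aggregated.
    """
    n = len(per_prompt_scores)
    if n == 0:
        return 0.0, 0.0, 0.0
    return tuple(sum(s[i] for s in per_prompt_scores) / n for i in range(3))


if __name__ == "__main__":
    # Hypothetical gold sets and pipeline outputs for two prompts.
    gold = {
        "prompt-1": {"oer-1", "oer-2", "oer-3", "oer-4"},
        "prompt-2": {"oer-7", "oer-8"},
    }
    run = {
        "prompt-1": ["oer-1", "oer-2", "oer-9"],
        "prompt-2": ["oer-7", "oer-8", "oer-5", "oer-6"],
    }
    scores = [precision_recall_f1(run[p], gold[p]) for p in gold]
    print(macro_average(scores))
```

A pipeline that returns many candidates tends to gain recall at the expense of precision, and one that returns few does the opposite. F1 rewards a balance between the two, which is why two pipelines with different precision and recall profiles can end up with similar F1 values.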
