Automatic Generation of Questions in Brazilian Portuguese Using the PTT5 and FLAN-T5 Models

Tiago Felipe V. Braga; Bruno Cardoso Coutinho; Hilário Tomaz Alves de Oliveira

doi:10.5753/stil.2024.245392

Tiago Felipe V. Braga IFES http://orcid.org/0009-0003-8351-513X
Bruno Cardoso Coutinho IFES https://orcid.org/0000-0002-4183-7865
Hilário Tomaz Alves de Oliveira IFES https://orcid.org/0000-0003-0643-7206

Abstract

This paper compares the pre-trained neural models PTT5 and FLAN-T5 on automatic question generation for Brazilian Portuguese. Two datasets, Pirá and FairyTaleQA, were used to evaluate the models' ability to generate questions in two scenarios: (i) given only the context and (ii) given the context and the expected answer. The generated questions were assessed with the ROUGE-L and BERTScore measures and with an additional analysis based on GPT-4o. The results show that the PTT5_Large model consistently outperformed the other models: according to the GPT-4o evaluation, 93.06% of the questions it generated were valid on Pirá and 82.32% on FairyTaleQA.

Keywords: Question Generation, Natural Language Processing, Brazilian Portuguese, PTT5, FLAN-T5, Pre-trained Language Models, Pirá, FairyTaleQA, ROUGE-L, BERTScore, GPT-4o, Transformers, Machine Learning, Artificial Intelligence
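
As an illustration of the ROUGE-L measure named in the abstract, the following is a minimal sketch of how an LCS-based score between a reference question and a generated question could be computed. It is not the evaluation code used in the paper: the lowercased whitespace tokenization, the balanced F1 variant, the function names `lcs_length` and `rouge_l`, and the example sentences are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Extend the subsequence found for the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the best result seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L precision, recall, and F1 (assumption: whitespace tokens, beta = 1)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    lcs = lcs_length(ref_tokens, cand_tokens)
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    f1 = 0.0 if lcs == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example pair (not taken from either dataset).
scores = rouge_l(
    reference="Qual é a capital do Brasil ?",
    candidate="Qual a capital do Brasil ?",
)
print(scores)
```

Because ROUGE-L rewards only in-order token overlap, a valid paraphrase of the reference question can score low. This is one reason an evaluation might pair it with an embedding-based measure such as BERTScore.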
