Automatic Generation of Questions in Brazilian Portuguese Using PTT5 and FLAN-T5 Models
Abstract
This paper presents a comparative analysis of the pre-trained PTT5 and FLAN-T5 neural models for automatic question generation in Brazilian Portuguese. Two datasets, Pirá and FairyTaleQA, were used to evaluate the models' ability to generate questions under two scenarios: (i) using only the context and (ii) using the context together with the expected answer. The generated questions were assessed with the ROUGE-L and BERTScore metrics, complemented by a GPT-4o-based analysis. The results show that the PTT5-Large model consistently outperformed the other models, producing 93.06% valid questions on Pirá and 82.32% on FairyTaleQA according to the GPT-4o evaluation.
References
Carmo, D., Piau, M., Campiotti, I., Nogueira, R., and Lotufo, R. (2020). Ptt5: Pre-training and validating the t5 model on brazilian portuguese data. arXiv preprint arXiv:2008.09144. DOI: 10.48550/arXiv.2008.09144
Chen, J., Lin, H., Han, X., and Sun, L. (2024). Benchmarking large language models in retrieval-augmented generation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 17754–17762. DOI: 10.48550/arXiv.2309.01431
Chung, H. W., Hou, L., Longpre, S., Zoph, B., Tay, Y., Fedus, W., Li, Y., Wang, X., Dehghani, M., Brahma, S., et al. (2024). Scaling instruction-finetuned language models. Journal of Machine Learning Research, 25(70):1–53. DOI: 10.48550/arXiv.2210.11416
da Rocha Junqueira, J., Corrêa, U. B., and Freitas, L. (2024). Transformer models for brazilian portuguese question generation: An experimental study. In The International FLAIRS Conference Proceedings, volume 37. DOI: 10.32473/flairs.37.1.135334
Kurdi, G., Leo, J., Parsia, B., Sattler, U., and Al-Emari, S. (2020). A systematic review of automatic question generation for educational purposes. International Journal of Artificial Intelligence in Education, 30:121–204. DOI: 10.1007/s40593-019-00186-y
Leite, B. and Lopes Cardoso, H. (2022). Neural question generation for the portuguese language: A preliminary study. In EPIA Conference on Artificial Intelligence, pages 780–793. Springer. DOI: 10.1007/978-3-031-16474-3_63
Leite, B., Osório, T. F., and Cardoso, H. L. (2024). Fairytaleqa translated: Enabling educational question and answer generation in less-resourced languages. arXiv preprint arXiv:2406.04233. DOI: 10.48550/arXiv.2406.04233
Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics. [link]
Mulla, N. and Gharpure, P. (2023). Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications. Progress in Artificial Intelligence, 12(1):1–32. DOI: 10.1007/s13748-023-00295-9
Oliveira, H. G., Caetano, I., Matos, R., and Amaro, H. (2023). Generating and ranking distractors for multiple-choice questions in portuguese. In SLATE, pages 4–1. DOI: 10.4230/OASIcs.SLATE.2023.4
Paschoal, A. F., Pirozelli, P., Freire, V., Delgado, K. V., Peres, S. M., José, M. M., Nakasato, F., Oliveira, A. S., Brandão, A. A., Costa, A. H., et al. (2021). Pirá: A bilingual portuguese-english dataset for question-answering about the ocean. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 4544–4553. DOI: 10.48550/arXiv.2202.02398
Puri, R., Spring, R., Shoeybi, M., Patwary, M., and Catanzaro, B. (2020). Training question answering models from synthetic data. In Webber, B., Cohn, T., He, Y., and Liu, Y., editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5811–5826, Online. Association for Computational Linguistics. DOI: 10.48550/arXiv.2002.09599
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140):1–67. DOI: 10.48550/arXiv.1910.10683
Team, G., Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivière, M., Kale, M. S., Love, J., et al. (2024). Gemma: Open models based on gemini research and technology. arXiv preprint arXiv:2403.08295. DOI: 10.48550/arXiv.2403.08295
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al. (2023). Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971. DOI: 10.48550/arXiv.2302.13971
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, volume 30. DOI: 10.48550/arXiv.1706.03762
Wagner Filho, J. A., Wilkens, R., Idiart, M., and Villavicencio, A. (2018). The brwac corpus: A new open resource for brazilian portuguese. In Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018). [link]
Xu, Y., Wang, D., Yu, M., Ritchie, D., Yao, B., Wu, T., Zhang, Z., Li, T., Bradford, N., Sun, B., Hoang, T., Sang, Y., Hou, Y., Ma, X., Yang, D., Peng, N., Yu, Z., and Warschauer, M. (2022). Fantastic questions and where to find them: FairytaleQA – an authentic dataset for narrative comprehension. In Muresan, S., Nakov, P., and Villavicencio, A., editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 447–460, Dublin, Ireland. Association for Computational Linguistics. DOI: 10.18653/v1/2022.acl-long.34
Zhang, R., Guo, J., Chen, L., Fan, Y., and Cheng, X. (2021). A review on question generation from natural language text. ACM Trans. Inf. Syst., 40(1). DOI: 10.1145/3468889
Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., and Artzi, Y. (2019). Bertscore: Evaluating text generation with bert. arXiv preprint arXiv:1904.09675. DOI: 10.48550/arXiv.1904.09675
