Evaluating ChatGPT's Ability to Carry Out Natural Deduction Proofs in Propositional Logic

  • Francisco Leonardo Batista Martins UFC
  • Augusto César Araújo de Oliveira UFC
  • Davi Romero de Vasconcelos UFC
  • Maria Viviane de Menezes UFC

Abstract


The use of conversational agents (chatbots) in education has sparked growing interest among researchers, educators, and educational institutions. These systems can comprehend and process large quantities of data and offer individualized support to students. However, they can also generate incorrect responses in some tasks, such as logical reasoning. This paper evaluates the ability of the conversational agent ChatGPT to solve Natural Deduction exercises in propositional logic, and seeks to determine whether ChatGPT is a suitable tool for this task. To this end, experiments are conducted using a database of Natural Deduction exercises. The study aims to contribute to the understanding of the capabilities and limitations of conversational agents in logical reasoning.

Published
2023-11-06
MARTINS, Francisco Leonardo Batista; OLIVEIRA, Augusto César Araújo de; VASCONCELOS, Davi Romero de; MENEZES, Maria Viviane de. Avaliando a habilidade do ChatGPT de realizar provas de Dedução Natural em Lógica Proposicional. In: BRAZILIAN SYMPOSIUM ON COMPUTERS IN EDUCATION (SBIE), 34., 2023, Passo Fundo/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 1282-1292. DOI: https://doi.org/10.5753/sbie.2023.234658.