How effective is an LLM-based Data Analysis Automation Tool? A Case Study with ChatGPT's Data Analyst

  • Beatriz A. de Miranda, Universidade Federal de Campina Grande
  • Claudio E. C. Campelo, Universidade Federal de Campina Grande


Artificial Intelligence (AI) tools are becoming integral to analytical processes. This paper evaluates the potential of Large Language Models (LLMs) in data analysis, focusing on ChatGPT's Data Analyst from OpenAI. To assess its effectiveness, we conducted a structured experiment applying the tool to 36 questions spanning descriptive, diagnostic, predictive, and prescriptive analyses. The study revealed an overall efficiency rate of 86.11%, with robust performance in the descriptive and diagnostic categories but reduced efficacy in the more complex predictive and prescriptive tasks. By discussing the strengths and limitations of a state-of-the-art LLM-based tool in aiding data scientists, this study aims to serve as a milestone for future developments in the field, particularly as a reference for the open-source community.
Keywords: Data Analysis Automation, ChatGPT's Data Analyst, Case Study, Large Language Models


