Scaling Laws for Text-to-SQL: A Study on the Relationship Between Language Model Size and Performance

Abstract


Although large language models achieve strong results on the Text-to-SQL task, their high computational cost limits adoption by small businesses. This study evaluates the feasibility of small language models as an alternative, analyzing the trade-off between model size and performance across the open-source Qwen2.5 family, in variants ranging from 0.5B to 32B parameters. Experiments were conducted on the Spider benchmark and on a database containing information about Brazilian companies, to assess the effectiveness of the approach in a real-world application context. The results show that the 3B model offers the best balance between cost and performance, while the 14B and 32B models, although more costly, deliver superior performance.

Keywords: Language Model, Text-to-SQL, Scaling Laws
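As a purely illustrative sketch of the zero-shot Text-to-SQL setup evaluated in studies like this one, a prompt for an instruction-tuned model can be assembled from the database schema and the user's question. The template below is an assumption for illustration, not the paper's exact prompt:

```python
def build_text_to_sql_prompt(schema_ddl: str, question: str) -> str:
    """Assemble a zero-shot Text-to-SQL prompt: schema context plus question."""
    return (
        "Given the following database schema:\n"
        f"{schema_ddl}\n\n"
        "Translate this question into a single SQL query:\n"
        f"{question}\nSQL:"
    )

# Hypothetical schema and question, loosely inspired by the paper's
# Brazilian-companies use case (table and column names are invented).
schema = "CREATE TABLE companies (id INT, name TEXT, state TEXT);"
question = "How many companies are registered in each state?"
prompt = build_text_to_sql_prompt(schema, question)
print(prompt)
```

The resulting string would then be sent to the language model (e.g., a Qwen2.5 instruct variant), and the generated text parsed as SQL before execution.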

Published: 2025-09-29

SILVA, Letícia O.; SILVA, Paulo H. C.; SILVA, Fabrício A. Scaling Laws for Text-to-SQL: A Study on the Relationship Between Language Model Size and Performance. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 140-153. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247042.