Data-Efficient Tabular Classification with Transformer-Based Small Language Models

  • Mario Haddad-Neto (Fundação Paulo Feitoza)
  • Diógenes Silva (Fundação Paulo Feitoza)
  • Ítalo Caliari (Fundação Paulo Feitoza)
  • Hendrio Bragança (Fundação Paulo Feitoza)

Abstract


The application of deep learning to tabular data remains an ongoing challenge, with tree-based models such as XGBoost consistently outperforming neural network methods in most real-world scenarios. Recent advances in Large Language Models (LLMs) highlight their ability to generalize to new tasks via few-shot or one-shot prompting, yet questions remain about their utility in structured, tabular domains, particularly for resource-efficient Small Language Models (SLMs). In this work, we systematically evaluate the effectiveness of both large and small transformer-based language models for tabular classification tasks using few-shot strategies. Our approach investigates input serialization schemes and prompt engineering to maximize performance in low-data regimes. Experiments on benchmark datasets, including Diabetes, Heart Failure, and German Credit Risk, demonstrate that well-designed SLMs can approach, and occasionally match, the performance of much larger models, substantially reducing the need for extensive labeled data and retraining. However, further advances are required for language models to consistently rival the strongest tree-ensemble baselines. Our findings support the idea that SLMs, when properly prompted, offer a promising, flexible, and label-efficient alternative for automating and streamlining machine learning pipelines on tabular data.
Keywords: Large Language Models, Small Language Models, Few-shot Learning, Data Classification, Transformers, Tabular Data
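
To make the few-shot strategy described in the abstract concrete, the sketch below illustrates one plausible realization: each tabular row is serialized into a short "feature is value" sentence, a handful of labeled rows are concatenated as demonstrations, and the resulting prompt is completed by a language model. The serialization template, the task instruction, the Diabetes feature subset, and the query_slm() stub are all illustrative assumptions; the paper's exact prompts and model interface may differ.

# Minimal sketch of few-shot tabular classification via text serialization.
# Assumptions: a "feature is value" template and a hypothetical query_slm()
# backend; the paper's actual prompt design and SLM interface may differ.
from typing import Dict, List


def serialize_row(row: Dict[str, object]) -> str:
    """Turn one tabular row into a natural-language description."""
    return ", ".join(f"{col} is {val}" for col, val in row.items()) + "."


def build_few_shot_prompt(task: str,
                          shots: List[Dict[str, object]],
                          shot_labels: List[str],
                          query: Dict[str, object]) -> str:
    """Concatenate labeled demonstrations followed by the unlabeled query row."""
    parts = [task]
    for row, label in zip(shots, shot_labels):
        parts.append(f"{serialize_row(row)}\nAnswer: {label}")
    parts.append(f"{serialize_row(query)}\nAnswer:")
    return "\n\n".join(parts)


def query_slm(prompt: str) -> str:
    """Hypothetical call to a small language model; plug in any backend here."""
    raise NotImplementedError("Replace with an actual SLM completion call.")


if __name__ == "__main__":
    task = "Predict whether the patient has diabetes. Answer Yes or No."
    shots = [{"Glucose": 148, "BMI": 33.6, "Age": 50},
             {"Glucose": 85, "BMI": 26.6, "Age": 31}]
    shot_labels = ["Yes", "No"]
    new_patient = {"Glucose": 137, "BMI": 43.1, "Age": 33}
    prompt = build_few_shot_prompt(task, shots, shot_labels, new_patient)
    print(prompt)  # inspect the serialized few-shot prompt
    # prediction = query_slm(prompt)  # would return "Yes" or "No"

In this reading, label efficiency comes from reusing a frozen model: adapting to a new dataset only changes the serialized demonstrations and instruction, not the model weights.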

Published
10/11/2025
HADDAD-NETO, Mario; SILVA, Diógenes; CALIARI, Ítalo; BRAGANÇA, Hendrio. Data-Efficient Tabular Classification with Transformer-Based Small Language Models. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 31., 2025, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 167-175. DOI: https://doi.org/10.5753/webmedia.2025.15193.