Text2Graph: Combining Lightweight LLMs and GNNs for Efficient Text Classification in Label-Scarce Scenarios
Abstract
Large Language Models (LLMs) have become effective zero-shot classifiers, but their high computational requirements and environmental costs limit their practicality for large-scale annotation in high-performance computing (HPC) environments. To support more sustainable workflows, we present Text2Graph, an open-source Python package that provides a modular implementation of existing text-to-graph classification approaches. The framework lets users combine LLM-based partial annotation with Graph Neural Network (GNN) label propagation in a flexible manner, making it straightforward to swap components such as feature extractors, edge construction methods, and sampling strategies. We benchmark Text2Graph in a zero-shot setting on five datasets spanning topic classification and sentiment analysis, comparing multiple variants against other zero-shot text classification approaches. In addition to reporting performance, we provide detailed estimates of energy consumption and carbon emissions, showing that graph-based propagation achieves competitive results at a fraction of the energy and environmental cost.
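The workflow the abstract describes (an LLM annotates only part of the corpus, then labels are propagated over a text-similarity graph) can be illustrated with a minimal sketch. Everything below is hypothetical: the `llm_annotate` stub, the hand-built edge list, and the propagation loop are illustrations of the general technique, not the Text2Graph API.

```python
def llm_annotate(doc):
    # Stand-in for a zero-shot LLM call; here a trivial keyword rule.
    if "great" in doc or "love" in doc:
        return "positive"
    if "bad" in doc or "hate" in doc:
        return "negative"
    return None  # abstain: the label will come from graph propagation

docs = [
    "great movie, love it",   # 0
    "truly great acting",     # 1
    "what a bad plot",        # 2
    "hate this film",         # 3
    "wonderful and fun",      # 4  (LLM abstains)
    "terrible and boring",    # 5  (LLM abstains)
]

# Hypothetical similarity edges (in practice, e.g. kNN over embeddings).
edges = {0: [1, 4], 1: [0, 4], 2: [3, 5], 3: [2, 5], 4: [0, 1], 5: [2, 3]}

labels = {i: llm_annotate(d) for i, d in enumerate(docs)}
classes = ["positive", "negative"]

# One-hot scores for LLM-labeled nodes; zeros for abstained nodes.
scores = {i: [1.0 if labels[i] == c else 0.0 for c in classes]
          for i in labels}

# Simple iterative label propagation; LLM-labeled nodes stay clamped.
for _ in range(10):
    new = {}
    for i, nbrs in edges.items():
        if labels[i] is not None:
            new[i] = scores[i]
        else:
            new[i] = [sum(scores[j][k] for j in nbrs) / len(nbrs)
                      for k in range(len(classes))]
    scores = new

pred = {i: classes[max(range(len(classes)), key=lambda k: scores[i][k])]
        for i in scores}
```

In the paper's pipeline a GNN plays the role of this propagation step, so the smoothing above only conveys the overall division of labor: a few expensive LLM calls seed the graph, and a cheap graph pass labels the rest.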
Keywords:
Sentiment analysis, Energy consumption, Costs, Annotations, Large Language Models (LLMs), High-performance computing, Text categorization, Zero-shot learning, Feature extraction, Graph Neural Networks (GNNs), Text-to-Graph, Sustainable AI
Published
28/10/2025
How to Cite
SARCINELLI, João Lucas Luz Lima; MARCACINI, Ricardo Marcondes.
Text2Graph: Combining Lightweight LLMs and GNNs for Efficient Text Classification in Label-Scarce Scenarios. In: WORKSHOP ON LIGHTWEIGHT EFFICIENT DEEP LEARNING IN HPC ENVIRONMENTS (LEANDL-HPC) - INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 37., 2025, Bonito/MS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 124-130.
