Graph Condensation for Text Classification

  • René Vieira Santin (USP)
  • Diego Minatel (USP)
  • Nícolas Roque dos Santos (USP)
  • Solange Oliveira Rezende (USP)

Abstract

TextGCN is a graph-based model that achieves strong text classification performance by effectively capturing corpus-level relationships between documents and words. However, its scalability is limited by the high computational cost of processing large graphs, particularly in resource-constrained environments. One way to address this limitation is Graph Condensation (GCond), which generates much smaller synthetic graphs while preserving key information. GCond has achieved performance comparable to that of the original graphs when combined with other graph neural network architectures, such as GCN, SGC, and GraphSAGE, but its application to TextGCN remains unexplored in the literature. This paper addresses this gap by proposing the integration of TextGCN and GCond for scalable text classification. We experimentally evaluated our proposal on three benchmark datasets using three metrics (accuracy, memory usage, and training time) and compared performance on the full original graphs against various levels of graph reduction, with the smallest condensed graph retaining only 0.02% of the nodes. The experimental results indicate that the reduced graphs perform similarly to the originals: in the best case, the most condensed synthetic graph generated by GCond was up to 20 times faster to train and consumed approximately 315,000 times less memory than the original graph, with only a two-percentage-point drop in accuracy.
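For context on the condensation step mentioned in the abstract: GCond's core mechanism is gradient matching, i.e., a small set of synthetic node features (together with a learned adjacency) is optimized so that a model's training gradients on the condensed graph track its gradients on the original graph. The sketch below illustrates only that gradient-matching idea in plain NumPy, using a logistic-regression surrogate on toy features; it deliberately omits the learned adjacency, TextGCN's document-word graph, and all other specifics of the paper, and every name, size, and hyperparameter in it is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "original" data: 1,000 node feature vectors with binary labels.
n, d, n_syn = 1000, 16, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)

# Condensed set: 10 learnable feature vectors with fixed, balanced labels
# (1% of the original nodes, in the spirit of GCond's reduction ratios).
X_syn = 0.1 * rng.normal(size=(n_syn, d))
y_syn = np.tile([0.0, 1.0], n_syn // 2)

def logreg_grad(w, Xb, yb):
    """Gradient of the mean logistic loss w.r.t. the weights w."""
    p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
    return Xb.T @ (p - yb) / len(yb)

def match_loss(X_s, probes):
    """Average squared distance between real and synthetic gradients."""
    return np.mean([np.sum((logreg_grad(w, X_s, y_syn)
                            - logreg_grad(w, X, y)) ** 2) for w in probes])

probes = rng.normal(size=(20, d))  # fixed weights, used only for evaluation
loss_before = match_loss(X_syn, probes)

lr = 0.1
for _ in range(500):
    w = rng.normal(size=d)         # fresh random model, as in gradient matching
    p = 1.0 / (1.0 + np.exp(-(X_syn @ w)))
    diff = logreg_grad(w, X_syn, y_syn) - logreg_grad(w, X, y)
    # Analytic gradient of ||g_syn - g_real||^2 w.r.t. the synthetic features.
    r, s = p - y_syn, p * (1.0 - p)
    gX = (2.0 / n_syn) * (np.outer(r, diff) + np.outer((X_syn @ diff) * s, w))
    X_syn -= lr * gX               # move synthetic features to match gradients

loss_after = match_loss(X_syn, probes)
print(f"gradient-matching loss: {loss_before:.4f} -> {loss_after:.4f}")
```

After optimization, a model trained only on the tiny synthetic set produces gradients close to those obtained on the full data, which is what lets condensed graphs stand in for the originals during training.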
Published
29/09/2025
SANTIN, René Vieira; MINATEL, Diego; SANTOS, Nícolas Roque dos; REZENDE, Solange Oliveira. Graph Condensation for Text Classification. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 35., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 306-320. ISSN 2643-6264.