NFT Collection Classification: A Multimodal Approach Integrating Metadata and Textual Embeddings with Supervised Learning
Abstract
The NFT market has grown rapidly, giving rise to collections whose semantic diversity and structural variation complicate automatic classification. This paper proposes a supervised learning approach that classifies NFT collections under three feature scenarios: (i) statistical data on transactions, (ii) embeddings generated from the textual descriptions of collections, and (iii) the combination of both. The dataset was collected from OpenSea, one of the largest NFT marketplaces. Eight textual preprocessing techniques were tested, combined with two embedding models, four balancing strategies, and five classifiers. The best performance was achieved in the combined scenario, with an accuracy of 75.7% and an F1-score of 75.4%. This outperformed the embeddings-only configuration (accuracy of 67.9%, F1-score of 68.5%) and the statistical-only configuration (accuracy of 61.9%, F1-score of 60.3%). Although the class distribution was moderately balanced, the F1-score was adopted to better reflect model performance across categories. The results show that integrating structured and semantic data substantially improves classification effectiveness, reinforcing the potential of this multimodal approach for curation, recommendation, and market analysis in the NFT ecosystem.
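The three feature scenarios can be illustrated with a minimal sketch. It assumes that the combined scenario is a plain concatenation of standardized transaction statistics with the description embedding; the abstract does not state how the two sources are fused, so this fusion rule, the function names `zscore_columns` and `build_features`, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list of equal-length numeric rows."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    # Fall back to 1.0 so constant columns do not divide by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mu) / sigma for value, mu, sigma in zip(row, mus, sigmas)]
        for row in rows
    ]


def build_features(stats_rows, embedding_rows, scenario):
    """Return one feature vector per collection for the chosen scenario.

    Assumption: "combined" means simple concatenation (early fusion).
    """
    if scenario == "statistical":
        return zscore_columns(stats_rows)
    if scenario == "embeddings":
        return [list(vec) for vec in embedding_rows]
    if scenario == "combined":
        scaled = zscore_columns(stats_rows)
        return [s + list(e) for s, e in zip(scaled, embedding_rows)]
    raise ValueError(f"unknown scenario: {scenario}")


# Toy data: two hypothetical transaction statistics per collection
# and a 3-dimensional stand-in for a text embedding.
stats = [[10.0, 200.0], [30.0, 400.0], [20.0, 300.0]]
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

combined = build_features(stats, embeddings, "combined")
```

Each resulting vector could then be passed to any of the supervised classifiers under comparison. Standardizing the statistics before concatenation is a design choice made here so that features on large scales do not dominate the embedding dimensions.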
Published
29/09/2025
How to Cite
RIBEIRO, Samuel de Oliveira; NASCIMENTO, Markesley Ramos do; GOMES, Dayan Ramos; VILLELA, Saulo Moraes; GONÇALVES, Glauber Dias. NFT Collection Classification: A Multimodal Approach Integrating Metadata and Textual Embeddings with Supervised Learning. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 35., 2025, Fortaleza/CE. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 242-255. ISSN 2643-6264.
