A High-Spatial Resolution Dataset and Few-shot Deep Learning Benchmark for Image Classification
Abstract
This paper presents a high-spatial-resolution dataset of remote sensing images of the Brazilian Cerrado for land use and land cover classification. The Biome Cerrado Dataset (CerraData) is a large database created from 150 scenes of the CBERS-4A satellite. Images were created by merging the near-infrared, green, and blue bands. Each scene was then pan-sharpened with its respective panchromatic band, resulting in a final spatial resolution of two meters. A total of 2.5 million tiles of 256x256 pixels were derived from these scenes, of which 50 thousand were labeled. We also conducted a few-shot learning experiment using a training set of only 100 samples, 11 deep neural networks (DNNs), and two traditional machine learning (ML) algorithms, namely support vector machine (SVM) and random forest (RF). Results show that DenseNet-161 was the best-performing DNN, but its performance can be improved if it is used only as a feature extractor, leaving the classification task to the traditional ML algorithms. However, as the size of the training set decreases, smarter approaches are needed. The labeled subset of CerraData and the source code developed to support this study are available online: https://github.com/ai41uc/CerraData-code-data.
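As a rough illustration of the tiling step, the sketch below splits a three-band scene into 256x256 tiles with NumPy. It is not the authors' pipeline: the function name `tile_scene`, the non-overlapping layout, the discarding of partial tiles at the scene edges, and the toy scene size are all assumptions made for this example.

```python
import numpy as np


def tile_scene(scene: np.ndarray, tile_size: int = 256) -> np.ndarray:
    """Split an (H, W, C) scene into non-overlapping square tiles.

    Assumption: pixels that do not fill a complete tile at the bottom
    and right edges are discarded. Returns an array of shape
    (n_tiles, tile_size, tile_size, C), ordered row by row.
    """
    height, width, channels = scene.shape
    rows, cols = height // tile_size, width // tile_size

    # Crop to the largest area that holds a whole number of tiles.
    cropped = scene[: rows * tile_size, : cols * tile_size]

    # Split both spatial axes into (tile index, pixel-within-tile) pairs,
    # then bring the two tile-index axes to the front.
    tiles = cropped.reshape(rows, tile_size, cols, tile_size, channels)
    tiles = tiles.swapaxes(1, 2)
    return tiles.reshape(rows * cols, tile_size, tile_size, channels)


# Hypothetical toy scene standing in for a pan-sharpened NIR-G-B composite.
scene = np.arange(1000 * 700 * 3, dtype=np.int64).reshape(1000, 700, 3)
tiles = tile_scene(scene)
print(tiles.shape)  # (6, 256, 256, 3)
```

Here the 1000x700 toy scene yields 3 x 2 complete tiles; a real scene would be read from a raster file with a geospatial library, which is outside the scope of this sketch.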