Getting Started with Semantic Segmentation in PyTorch Using SMP

João Fernando Mari; Leandro Henrique Furtado Pinto Silva; Mauricio Cunha Escarpinati; André Ricardo Backes

João Fernando Mari, UFV
Leandro Henrique Furtado Pinto Silva, UFV / UFU
Mauricio Cunha Escarpinati, UFU
André Ricardo Backes, UFSCar

Abstract

Semantic segmentation is a core task in computer vision, essential for applications that require detailed scene understanding, such as medical imaging, precision agriculture, and remote sensing. Recent advances in deep learning have significantly improved segmentation performance, particularly through encoder-decoder architectures combined with transfer learning. This tutorial provides a practical introduction to semantic segmentation using the Segmentation Models PyTorch (SMP) library, a widely adopted framework that combines state-of-the-art architectures with pretrained encoders behind an accessible interface. We give an overview of key concepts, supported model architectures, loss functions, evaluation metrics, and training strategies, and we implement the training and evaluation procedures in native PyTorch to keep them transparent and flexible. To reinforce these concepts, we present two case studies: binary segmentation with the 38-Cloud dataset and multiclass segmentation with the DeepGlobe dataset. Each case study covers model configuration, preprocessing, and performance evaluation in a real-world application. All tutorial materials, including source code and reproducible experiments, will be made publicly available. The goal is to equip participants with the practical knowledge needed to design, train, and evaluate semantic segmentation models effectively across a variety of domains.

Keywords: Precision agriculture, Deep learning, Semantic segmentation, Computational modeling, Tutorials, Computer architecture, Predictive models, Libraries, Remote sensing