Soybean Weeds Segmentation Using VT-Net: A Convolutional-Transformer Model

  • Lucas Silva, FURG
  • Paulo Drews, FURG
  • Rodrigo de Bem, FURG


The use of machine learning and computer vision in agriculture has grown significantly in the last few years, allowing for higher precision and efficiency in several processes. In this context, the present work develops a neural network for segmenting weeds in images of soybean crops. We propose a new hybrid model that combines convolutional neural networks (CNNs) with a vision transformer (ViT), an architecture built on the self-attention mechanism. Importantly, we also extend the well-known DeepWeeds dataset with segmentation labels, mitigating the lack of publicly available training data in the literature. We compare our hybrid model with state-of-the-art Transformer segmentation networks, such as BEiT and Mask2Former. Our approach obtains results equivalent to theirs while employing fewer layers. To the best of our knowledge, this work is the first to use a hybrid convolutional model with a pure ViT backbone for the segmentation of soybean weeds.
SILVA, Lucas; DREWS, Paulo; BEM, Rodrigo de. Soybean Weeds Segmentation Using VT-Net: A Convolutional-Transformer Model. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 36., 2023, Rio Grande/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 127-132.