Analysis of the Impact of Synthetic Data on Segmentation Models for Adenomatous Polyps in Colonoscopy

  • Lucas Lima Neves CEIA
  • Adalberto Ferreira Barbosa Junior CEIA
  • Ricardo Augusto Pereira Franco UFG

Abstract


Deep learning systems that support the early diagnosis of colorectal cancer during colonoscopy are commonly limited by the scarcity of annotated data. This study analyzes the impact of adding synthetic data to the training of ten encoders of different architectures for polyp segmentation. The results show consistent improvements whose magnitude varies with the size and complexity of the network: the gain in Global IoU ranged from 1.5% to 4.8%, while the Dice coefficient increased by 2.7% to 5.6%. The findings reinforce the effectiveness of synthetic data for expanding training sets and show how different architectures interact with generated data to improve diagnostic support systems.
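For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them for binary polyp masks. It is an illustration only, not the authors' evaluation code: it assumes that "Global IoU" means pooling true positives, false positives, and false negatives over all images before taking the ratio, and the function names `confusion_counts` and `global_iou_and_dice` are hypothetical.

```python
from typing import Sequence

# A binary mask as nested sequences: 1 = polyp pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> tuple[int, int, int]:
    """Count (true positives, false positives, false negatives) for one mask pair."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def global_iou_and_dice(preds: Sequence[Mask], targets: Sequence[Mask]) -> tuple[float, float]:
    """Pool pixel counts over the whole set, then compute IoU and Dice once.

    Assumption: "global" means dataset-level pooling rather than a per-image mean.
    """
    tp = fp = fn = 0
    for pred, target in zip(preds, targets):
        a, b, c = confusion_counts(pred, target)
        tp, fp, fn = tp + a, fp + b, fn + c
    if tp + fp + fn == 0:
        # No polyp pixels predicted or annotated anywhere: treat as perfect agreement.
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


# Two tiny 2x2 predictions compared against their ground-truth masks.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [1, 1]]]
iou, dice = global_iou_and_dice(preds, targets)
print(f"IoU={iou:.2f} Dice={dice:.2f}")  # IoU=0.60 Dice=0.75
```

On the same pooled counts, Dice equals 2·IoU / (1 + IoU), so it is never lower than IoU. The abstract does not say whether its gains are absolute percentage points or relative changes, so the two ranges should not be converted into one another with this identity.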

Published
1 June 2026
NEVES, Lucas Lima; BARBOSA JUNIOR, Adalberto Ferreira; FRANCO, Ricardo Augusto Pereira. Análise do Impacto de Dados Sintéticos para Modelos Segmentadores de Pólipos Adenomatosos em Colonoscopia. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 26., 2026, Ouro Preto/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 1277-1288. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2026.21742.