Methodology of Data Generation Applied to the Generation of Colonoscopy Exam Images
Abstract
Colonoscopy exams can detect adenomatous polyps, which are the early stage of the vast majority of colorectal cancer cases. However, polyps vary widely in size and shape, which makes them difficult to detect even for state-of-the-art machine learning models. One reason for this difficulty is the scarcity of available data for the task. To address this issue, this paper proposes a methodology for generating artificial colonoscopy images. Three generative image models (Guided Diffusion, PFGM++, and Improved Diffusion) are analyzed, and the generated images are processed and evaluated. The results show satisfactory performance in generating colonoscopy exam images for all three models. The best model achieves an FID of 33.89 and an SSIM of 0.2573, the best result among the models evaluated here and those reported in the literature.
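As a rough illustration of what the two reported metrics measure, the Python sketch below computes a Fréchet distance between two feature sets and a simplified SSIM between two images. It is not the paper's evaluation code. It assumes features have already been extracted (FID proper uses Inception-v3 activations, per Heusel et al., 2017), and it computes SSIM over the whole image as a single window rather than as the mean over local windows used by Wang et al. (2004). The function names `frechet_distance` and `global_ssim` are hypothetical.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is a 2-D array with one row per image and one column per
    feature (assumed to be precomputed; at least two feature columns).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two same-shaped grayscale images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilizing constants with the usual K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Lower Fréchet distance means the generated feature distribution is closer to the real one, while SSIM approaches 1 as two images become structurally identical.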
Keywords:
colonoscopy, polyps, colorectal cancer, generative models, data generation
References
Baidoun, F., Elshiwy, K., Elkeraie, Y., Merjaneh, Z., Khoudari, G., Sarmini, M. T., Gad, M., Al-Husseini, M., and Saad, A. (2021). Colorectal cancer epidemiology: Recent trends and impact on outcomes. Current Drug Targets, 22(9):998–1009.
Bernal, J., Sánchez, F. J., Fernández-Esparrach, G., Gil, D., Rodríguez, C., and Vilariño, F. (2015). Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Computerized Medical Imaging and Graphics, 43:99–111. Epub 2015 Mar 20.
Dhariwal, P. and Nichol, A. (2021). Diffusion models beat gans on image synthesis.
Elhmadany, M., Elmadah, I., and Abdelmunim, H. (2024). Instance segmentation on distributed deep learning big data cluster. Journal of Big Data, 11.
Fagereng, J. A., Thambawita, V., Storås, A. M., Parasa, S., de Lange, T., Halvorsen, P., and Riegler, M. A. (2022). Polypconnect: Image inpainting for generating realistic gastrointestinal tract images with polyps.
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Klambauer, G., and Hochreiter, S. (2017). Gans trained by a two time-scale update rule converge to a nash equilibrium. CoRR, abs/1706.08500.
Jha, D., Smedsrud, P. H., Riegler, M. A., Halvorsen, P., de Lange, T., Johansen, D., and Johansen, H. D. (2020). Kvasir-seg: A segmented polyp dataset. In International Conference on Multimedia Modeling, pages 451–462. Springer.
Lan, L., You, L., Zhang, Z., Fan, Z., Zhao, W., Zeng, N., Chen, Y., and Zhou, X. (2020). Generative adversarial networks and its applications in biomedical informatics. Frontiers in Public Health, 8:164.
Marques, A. F., Marques, K. F., Beraldo, M. N. M. d. S., Lima, T. B., Sassaki, L. Y., and Beraldo, R. F. (2023). Inteligência artificial na colonoscopia no rastreio do câncer colorretal: revisão de literatura. Brazilian Journal of Health Review, 6(4):18764–18774.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill, Inc., New York, NY, USA.
Nazeri, K., Ng, E., Joseph, T., Qureshi, F. Z., and Ebrahimi, M. (2019). Edgeconnect: Generative image inpainting with adversarial edge learning.
Nichol, A. and Dhariwal, P. (2021). Improved denoising diffusion probabilistic models.
Pishva, A. K., Thambawita, V., Torresen, J., and Hicks, S. A. (2023). Repolyp: A framework for generating realistic colon polyps with corresponding segmentation masks using diffusion models. In 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), pages 47–52.
Siegel, R. L., Giaquinto, A. N., and Jemal, A. (2024). Cancer statistics, 2024. CA: A Cancer Journal for Clinicians, 74(1):12–49. Erratum in: CA Cancer J Clin. 2024 Mar-Apr;74(2):203. DOI: 10.3322/caac.21830.
Silva, J., Histace, A., Romain, O., Dray, X., and Granado, B. (2014). Toward embedded detection of polyps in wce images for early diagnosis of colorectal cancer. International Journal of Computer Assisted Radiology and Surgery, 9(2):283–293. Epub 2013 Sep 15.
Thambawita, V., Salehi, P., Sheshkal, S. A., Hicks, S. A., Hammer, H. L., Parasa, S., Lange, T. d., Halvorsen, P., and Riegler, M. A. (2022). Singan-seg: Synthetic training data generation for medical image segmentation. PLOS ONE, 17(5):e0267976.
Viscaino, M., Torres Bustos, J., Muñoz, P., Auat Cheein, C., and Cheein, F. A. (2021). Artificial intelligence for the early detection of colorectal cancer: A comprehensive review of its advantages and misconceptions. World Journal of Gastroenterology, 27(38):6399–6414.
Waisberg, E., Ong, J., Kamran, S. A., Masalkhi, M., Paladugu, P., Zaman, N., Lee, A. G., and Tavakkoli, A. (2024). Generative artificial intelligence in ophthalmology. Survey of Ophthalmology, Epub ahead of print. S0039-6257(24)00044-4.
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612.
Wu, Z., Li, Y., Zhang, Y., Hu, H., Wu, T., Liu, S., Chen, W., Xie, S., and Lu, Z. (2020). Colorectal cancer screening methods and molecular markers for early detection. Technology in Cancer Research & Treatment, 19:1533033820980426.
Xu, Y., Liu, Z., Tian, Y., Tong, S., Tegmark, M., and Jaakkola, T. (2023). Pfgm++: Unlocking the potential of physics-inspired generative models.
Zeng, Y., Fu, J., Chao, H., and Guo, B. (2021). Aggregated contextual transformations for high-resolution image inpainting.
Published
2024-12-05
How to Cite
CASTRO, André Cerqueira; NEVES, Lucas Lima; GONÇALVES PAIVA, Heitor Sardinha; PEREIRA FRANCO, Ricardo Augusto. Methodology of Data Generation Applied to the Generation of Colonoscopy Exam Images. In: REGIONAL SCHOOL ON INFORMATICS OF GOIÁS (ERI-GO), 12., 2024, Ceres/GO. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 109-118. DOI: https://doi.org/10.5753/erigo.2024.4797.
