Data Augmentation for Medical Image Segmentation: A Comparative Analysis of Traditional Techniques and Synthetic Data Generation

Abstract


Deep learning has been widely applied to medical image segmentation, which aims to delineate structures in images so that physicians can more easily identify unusual patterns and anomalies. Training segmentation models requires large amounts of data, which are difficult to collect because of privacy concerns and the limited representation of pathologies. Data Augmentation (DA) mitigates this problem by expanding the dataset, either by applying transformations to the original samples or by creating new samples with generative methods. Despite the extensive use of DA techniques, their relative effectiveness for medical image segmentation is still poorly understood. This work presents a method for evaluating the impact of DA on medical image segmentation models, comparing traditional augmentation techniques with diffusion models that generate synthetic data.
Keywords: Data augmentation, Medical Imaging, Diffusion models, Segmentation
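
To make the idea of transformation-based augmentation concrete, the following is a minimal sketch, not the authors' pipeline. It assumes images and masks are NumPy arrays whose first two axes are height and width; the function names `augment_pair` and `expand_dataset` and the choice of transforms (horizontal flip, 90-degree rotation) are illustrative assumptions. The point it shows is that, for segmentation, every geometric transform must be applied identically to the image and to its mask.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply the same random flip and 90-degree rotation to an image and its mask.

    Hypothetical helper for illustration; assumes axes 0 and 1 are height and width.
    """
    # Random horizontal flip, applied to both arrays or to neither.
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)
        mask = np.flip(mask, axis=1)
    # Random rotation by k * 90 degrees, with the same k for both arrays.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return image.copy(), mask.copy()


def expand_dataset(images, masks, copies_per_sample, seed=0):
    """Return the original pairs plus `copies_per_sample` augmented copies of each."""
    rng = np.random.default_rng(seed)
    out_images, out_masks = list(images), list(masks)
    for image, mask in zip(images, masks):
        for _ in range(copies_per_sample):
            aug_image, aug_mask = augment_pair(image, mask, rng)
            out_images.append(aug_image)
            out_masks.append(aug_mask)
    return out_images, out_masks
```

Calling `expand_dataset(images, masks, copies_per_sample=3)` on a list of N pairs returns 4N pairs. Rotations by 90 or 270 degrees swap height and width, so non-square inputs would need resizing or padding before batching.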

Published
29/09/2025
UCHIDA, Mariana Aya S.; DE AGUIAR, Erikson J.; TRAINA-JR, Caetano; TRAINA, Agma J. M. Data Augmentation for Medical Image Segmentation: A Comparative Analysis of Traditional Techniques and Synthetic Data Generation. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 830-836. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247731.