Improving Deep Learning Super-Resolution Models for Computed Tomography Images Using Fine-Tuning

Ramon Rodrigues Morello (UFES); Bruno Légora Souza da Silva (UFES); Thaís Pedruzzi do Nascimento (UFES)

DOI: https://doi.org/10.5753/sbcas.2026.21253

Abstract

This work analyzes the effect of fine-tuning on the performance of two super-resolution models, Real-ESRGAN and the Hybrid Attention Transformer (HAT), when enhancing low-dose computed tomography images. Results on the LoDoPaB-CT dataset show consistent improvements, with average gains of 60.03% in PSNR, 92.50% in SSIM, and 13.52% in PI for the HAT architecture, and of 51.02% in PSNR, 42.11% in SSIM, and 32.42% in PI for Real-ESRGAN. Reduced artifacts and better preservation of relevant structures are also observed. Training HAT took approximately 23.59 times longer than training Real-ESRGAN. A GitHub repository is available for reproducibility.
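As a rough illustration of how a percentage gain in a metric such as PSNR can be obtained, the sketch below computes PSNR for a pre-trained and a fine-tuned output against a reference and reports the relative change. The abstract does not say how the gains were calculated, so the relative-change formula, the function names `psnr` and `relative_gain`, and the toy pixel values are all assumptions made for this example, not the authors' evaluation code.

```python
import math


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


def relative_gain(before, after):
    """Percentage change of a metric relative to its value before fine-tuning.

    Assumption: gains are taken as (after - before) / before. For a metric
    where lower is better, the sign convention would have to be flipped.
    """
    return 100.0 * (after - before) / before


# Hypothetical toy data: a 4-pixel "image" normalized to [0, 1].
reference = [0.0, 0.5, 1.0, 0.5]
pretrained_output = [0.2, 0.3, 0.8, 0.7]      # larger errors
finetuned_output = [0.05, 0.45, 0.95, 0.55]   # smaller errors

psnr_before = psnr(reference, pretrained_output)
psnr_after = psnr(reference, finetuned_output)
gain = relative_gain(psnr_before, psnr_after)

print(f"PSNR before: {psnr_before:.2f} dB")
print(f"PSNR after:  {psnr_after:.2f} dB")
print(f"Relative gain: {gain:.2f}%")
```

In practice the per-image values would be averaged over the test set before (or after) taking the relative change; which of the two the paper uses is not stated here.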
