Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques

Davide Clode da Silva; Marina Musse Bernardes; Nathália Giacomini Ceretta; Gabriel Vaz de Souza; Gabriel Fonseca Silva; Rafael Heitor Bordini; Soraia Raupp Musse

doi:10.5753/sibgrapi.est.2024.31651

Davide Clode da Silva PUCRS
Marina Musse Bernardes PUCRS
Nathália Giacomini Ceretta PUCRS
Gabriel Vaz de Souza PUCRS
Gabriel Fonseca Silva PUCRS
Rafael Heitor Bordini PUCRS
Soraia Raupp Musse PUCRS

Abstract

Machine learning has significantly advanced healthcare by aiding in disease prevention and treatment identification. However, accessing patient data can be challenging due to privacy concerns and strict regulations. Generating synthetic, realistic data offers a potential solution for overcoming these limitations, and recent studies suggest that fine-tuning foundation models can produce such data effectively. In this study, we explore the potential of foundation models for generating realistic medical images, particularly chest X-rays, and assess how their performance improves with fine-tuning. We propose using a Latent Diffusion Model, starting with a pre-trained foundation model and refining it through various configurations. Additionally, we perform experiments with input from a medical professional to assess the realism of the images produced by each trained model.

