Text-to-Image Generation Tools: A Survey and NSFW Content Analysis

Erick L. Figueiredo (UFV); Daniel L. Fernandes (UFV); Julio C. S. Reis (UFV)

DOI: https://doi.org/10.5753/webmedia_estendido.2023.235611

Abstract

This study investigates the main Artificial Intelligence (AI) tools for generating images from text prompts, known as “Text-to-Image” tools. Free tools available on the Web were collected and evaluated for their ability to generate inappropriate (i.e., NSFW) content. The work emphasizes the importance of a solid ethical foundation in implementing these tools, given the risks of disseminating inappropriate information. The results provide a compilation of the identified tools, along with an analysis of the content they generate.

Keywords: Text-to-Image Models, Tools, Neural Networks, Computer Vision, AI, Ethics
