Text-to-Image Generation Tools: A Survey and NSFW Content Analysis

  • Erick L. Figueiredo UFV
  • Daniel L. Fernandes UFV
  • Julio C. S. Reis UFV

Abstract


This study surveys the main Artificial Intelligence (AI) tools that generate images from text prompts, known as “Text-to-Image” tools. Free tools available on the Web were collected and evaluated for their ability to generate inappropriate (i.e., NSFW) content. The work emphasizes the importance of a solid ethical foundation when implementing these tools, given the risks of disseminating inappropriate content. The results comprise a compilation of the identified tools and an analysis of the content they generate.
Keywords: Text-to-Image Models, Tools, Neural Networks, Computer Vision, AI, Ethics

Published
23/10/2023
FIGUEIREDO, Erick L.; FERNANDES, Daniel L.; REIS, Julio C. S. Text-to-Image Generation Tools: A Survey and NSFW Content Analysis. In: CONCURSO DE TRABALHOS DE INICIAÇÃO CIENTÍFICA - SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 29., 2023, Ribeirão Preto/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 59-62. ISSN 2596-1683. DOI: https://doi.org/10.5753/webmedia_estendido.2023.235611.