Systematic review of the guiding theories of Visual Transformers


Transformers have emerged as a powerful architecture in Artificial Intelligence, revolutionizing many natural language processing and image processing tasks. This article presents a comprehensive analysis of the historical evolution of Transformers, highlighting their self-attention mechanism and concluding with the more recent Visual Transformer model. We explore the main contributions of the key works in the field leading up to the development of Visual Transformers. A systematic review helps us better understand how this model works and what the key topics of each approach are.

Keywords: self-attention, transformers, NLP, visual


S. JUNIOR, Joelson; LUCCA, Giancarlo; BOTTERO, Diego; DIMURO, Graçaliz P.; SANTOS, Helida. Revisão sistematica das teorias norteadoras do Visual Transformers. In: WORKSHOP-ESCOLA DE INFORMÁTICA TEÓRICA (WEIT), 7., 2023, Rio Grande/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 87-94. DOI: