MultiMagNet: Uma Abordagem Não Determinística na Escolha de Múltiplos Autoencoders para Detecção de Imagens Contraditórias

  • Gabriel R. Machado (IME)
  • Eugênio Silva (UEZO)
  • Ronaldo R. Goldschmidt (IME)

Abstract

Studies show that machine learning algorithms can be induced into misclassification when presented with adversarial images. A recent work introduced a detection method that incorporates a non-deterministic component, since non-determinism makes it harder for an attacker to map the defense's behavior. That approach, however, has since been defeated by systematically applied attacks that managed to abstract the essence of the defense's behavior. This paper therefore proposes a detection method that amplifies the effect of non-determinism by combining multiple random components. Experimental results confirm the robustness of the proposed method against state-of-the-art attacks.
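The core idea the abstract describes is drawing, at each query, a random subset from a pool of autoencoders and combining their verdicts. Below is a minimal sketch of that idea, assuming a MagNet-style reconstruction-error test; the names (make_stub_autoencoder, detect, the threshold values) are illustrative placeholders rather than the authors' implementation, and the stub autoencoders merely stand in for real trained models.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def make_stub_autoencoder(strength):
    # Placeholder for a trained autoencoder: shrinks the input toward its
    # mean so the example runs without a deep learning framework. A real
    # pool would hold trained models (e.g., Keras models with .predict).
    def reconstruct(x):
        return (1.0 - strength) * x + strength * x.mean()
    return reconstruct

# Pool of candidate autoencoders, each paired with a reconstruction-error
# threshold that, in practice, would be calibrated on legitimate images.
pool = [make_stub_autoencoder(s) for s in (0.1, 0.2, 0.3, 0.4, 0.5)]
thresholds = [0.05, 0.08, 0.12, 0.15, 0.20]  # illustrative values only

def detect(x, n_choose=3):
    # Draw a fresh random subset of autoencoders for every query; this
    # per-query redraw is the non-deterministic component that makes the
    # defense's behavior hard for an attacker to map.
    idx = rng.choice(len(pool), size=n_choose, replace=False)
    votes = 0
    for i in idx:
        err = np.mean((x - pool[i](x)) ** 2)  # reconstruction error (MSE)
        votes += err > thresholds[i]          # one vote per autoencoder
    return votes > n_choose // 2              # majority vote flags the image

x = rng.random(784)  # stand-in for a flattened 28x28 input image
print("flagged as adversarial:", detect(x))
```

Because the subset is redrawn on every call, two identical queries can take different decision paths, which is the property the abstract argues makes it harder for systematically applied attacks to abstract the defense's behavior.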

Published
25/10/2018
MACHADO, Gabriel R.; SILVA, Eugênio; GOLDSCHMIDT, Ronaldo R. MultiMagNet: Uma Abordagem Não Determinística na Escolha de Múltiplos Autoencoders para Detecção de Imagens Contraditórias. In: SIMPÓSIO BRASILEIRO DE SEGURANÇA DA INFORMAÇÃO E DE SISTEMAS COMPUTACIONAIS (SBSEG), 18., 2018, Natal. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 281-294. DOI: https://doi.org/10.5753/sbseg.2018.4259.