MultiMagNet: A Non-Deterministic Approach for Choosing Multiple Autoencoders to Detect Adversarial Images

  • Gabriel R. Machado (IME)
  • Eugênio Silva (UEZO)
  • Ronaldo R. Goldschmidt (IME)

Abstract


Studies show that machine learning algorithms can be induced to misclassify adversarial images. Recent research proposed a detection method that incorporates a non-deterministic component, which makes it harder for attackers to track the defense's behavior. However, that method has been bypassed by systematic attacks that succeeded in reproducing the defense's responses. This paper therefore proposes a detection method that randomly selects multiple components for each input, strengthening the non-deterministic effect. Experimental results demonstrate the robustness of the proposed method against state-of-the-art adversarial attacks.
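To make the mechanism summarized above concrete, the Python sketch below shows one way a pool of reconstruction-based detectors could be randomly sampled for each input and combined by majority vote. It is an illustrative approximation only, not the authors' MultiMagNet implementation: the class and function names (ReconstructionDetector, detect), the thresholds, and the voting rule are assumptions; in the paper the components are autoencoders, as the title indicates.

import numpy as np

class ReconstructionDetector:
    """One autoencoder-like reconstruction function plus its error threshold."""
    def __init__(self, reconstruct_fn, threshold):
        self.reconstruct_fn = reconstruct_fn  # maps an image to its reconstruction
        self.threshold = threshold            # assumed to be calibrated on clean images

    def flags(self, image):
        # Mean squared reconstruction error; a large error suggests the image
        # lies off the manifold of clean data and may be adversarial.
        error = np.mean((image - self.reconstruct_fn(image)) ** 2)
        return error > self.threshold

def detect(image, pool, k, rng):
    """Randomly pick k detectors from the pool and take a majority vote."""
    chosen = rng.choice(len(pool), size=k, replace=False)
    votes = sum(pool[i].flags(image) for i in chosen)
    return votes > k // 2

# Toy usage with identity-like "reconstructions", only to show the control flow.
rng = np.random.default_rng(0)
pool = [ReconstructionDetector(lambda x: 0.95 * x, t)
        for t in rng.uniform(0.001, 0.01, size=7)]
image = rng.random((28, 28))
print("flagged as adversarial?", detect(image, pool, k=3, rng=rng))

Because the subset of components changes with every query, an attacker cannot deterministically reproduce the defense's response, which is the non-deterministic effect the abstract refers to.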

References

Bojarski, M., Testa, D. D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L. D., Monfort, M., Muller, U., Zhang, J., Zhang, X., Zhao, J., and Zieba, K. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.

Carlini, N. and Wagner, D. (2017a). Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. In 10th ACM Workshop on Artificial Intelligence and Security, page 12, Dallas, TX.

Carlini, N. and Wagner, D. (2017b). MagNet and "Efficient Defenses Against Adversarial Attacks" are not robust to adversarial examples. arXiv preprint arXiv:1711.08478.

Carlini, N. and Wagner, D. (2017c). Towards Evaluating the Robustness of Neural Networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP 2017), pages 39–57.

Ding, L., Fang, W., Luo, H., Love, P. E., Zhong, B., and Ouyang, X. (2018). A deep hybrid learning model to detect unsafe behavior: integrating convolution neural networks and long short-term memory. Automation in Construction, 86:118–124.

Gong, Z., Wang, W., and Ku, W.-S. (2017). Adversarial and clean data are not twins. arXiv preprint arXiv:1704.04960.

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. [link].

Goodfellow, I., McDaniel, P., and Papernot, N. (2018). Making machine learning robust against adversarial inputs. Communications of the ACM, 61(7):56–66.

Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and Harnessing Adversarial Examples. In International Conference on Learning Representations (ICLR’14), pages 1–11.

He, W., Wei, J., Chen, X., Carlini, N., and Song, D. (2017). Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong. In 11th USENIX Workshop on Offensive Technologies (WOOT '17), Vancouver, Canada.

Karpathy, A. (2014). What I learned from competing against a ConvNet on ImageNet. Available at [link]. Accessed September 2, 2018.

Klarreich, E. (2016). Learning securely. Communications of the ACM, 59(11):12–14.

Kurakin, A., Goodfellow, I., and Bengio, S. (2016a). Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533.

Kurakin, A., Goodfellow, I., and Bengio, S. (2016b). Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236.

Labati, R. D., Muñoz, E., Piuri, V., Sassi, R., and Scotti, F. (2018). Deep-ECG: Convolutional neural networks for ECG biometric recognition. Pattern Recognition Letters.

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324.

LeCun, Y., Kavukcuoglu, K., and Farabet, C. (2010). Convolutional Networks and Applications in Vision. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, pages 253–256. IEEE.

Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2017). Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv preprint arXiv:1706.06083.

Meng, D. and Chen, H. (2017). MagNet: a two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 135–147. ACM.

Metzen, J. H., Genewein, T., Fischer, V., and Bischoff, B. (2017). On detecting adversarial perturbations. arXiv preprint arXiv:1702.04267.

Moosavi-Dezfooli, S.-M., Fawzi, A., and Frossard, P. (2016). Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582.

Papernot, N., Faghri, F., Carlini, N., Goodfellow, I., Feinman, R., Kurakin, A., Xie, C., Sharma, Y., Brown, T., Roy, A., Matyasko, A., Behzadan, V., Hambardzumyan, K., Zhang, Z., Juang, Y.-L., Li, Z., Sheatsley, R., Garg, A., Uesato, J., Gierke, W., Dong, Y., Berthelot, D., Hendricks, P., Rauber, J., and Long, R. (2018). cleverhans v2.1.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768.

Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z. B., and Swami, A. (2016a). The limitations of deep learning in adversarial settings. In Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P 2016), pages 372–387.

Papernot, N., McDaniel, P., Wu, X., Jha, S., and Swami, A. (2016b). Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. In Proceedings of the 2016 IEEE Symposium on Security and Privacy (SP 2016), pages 582–597.

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2013). Intriguing properties of neural networks. In International Conference on Learning Representations, pages 1–10.

Xu, W., Evans, D., and Qi, Y. (2018). Feature squeezing: Detecting adversarial examples in deep neural networks. In Network and Distributed System Security Symposium (NDSS 2018).
Published
2018-10-25
MACHADO, Gabriel R.; SILVA, Eugênio; GOLDSCHMIDT, Ronaldo R. MultiMagNet: Uma Abordagem Não Determinística na Escolha de Múltiplos Autoencoders para Detecção de Imagens Contraditórias. In: BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 18., 2018, Natal. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 281-294. DOI: https://doi.org/10.5753/sbseg.2018.4259.