An Evolutionary, Gradient-Free, Query-Efficient, Black-Box Algorithm for Generating Adversarial Instances in Deep Convolutional Neural Networks
Abstract
1. Introduction
2. Related Work
3. Threat Model
- The attacker is unaware of the network’s design, settings, or training data.
- The attacker has no access to intermediate values in the target model.
- The attacker can only use a black-box function to query the target model.
1. Black-box,
2. Untargeted,
3. ℓ∞-norm bounded.
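This threat model reduces to two primitives: a black-box query that returns only the model's output scores, and a projection that keeps every candidate inside the ℓ∞ ball around the original image. A minimal sketch of that interface (function names are illustrative, not from the paper):

```python
import numpy as np

def project_linf(x_adv, x_orig, epsilon):
    """Clip a candidate back into the l-infinity ball of radius epsilon
    around the original image, then into the valid pixel range [0, 1]."""
    x_adv = np.clip(x_adv, x_orig - epsilon, x_orig + epsilon)
    return np.clip(x_adv, 0.0, 1.0)

def is_adversarial(query_fn, x_adv, true_label):
    """Untargeted success test: one black-box query, compare the top label.
    query_fn returns only output scores -- no gradients, no internals."""
    return int(np.argmax(query_fn(x_adv))) != true_label
```

The attacker's query budget is then simply the number of `query_fn` calls made before `is_adversarial` first returns true.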
4. QuEry Attack
4.1. Initialization
4.2. Mutation
4.3. Crossover
Algorithm 1: QuEry Attack
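The initialization, mutation, and crossover operators above combine into a standard evolutionary loop. The sketch below is an illustrative gradient-free attack in that spirit — a population initialized inside the ε-ball, fitness measured by black-box queries, elitist selection, uniform crossover, and Gaussian mutation — not the paper's exact operators:

```python
import numpy as np

rng = np.random.default_rng(0)

def attack(query_fn, x_orig, true_label, epsilon=12 / 255,
           pop_size=20, generations=100, mutation_std=0.05):
    """Illustrative gradient-free evolutionary search for an untargeted,
    l-infinity-bounded adversarial example (not QuEry Attack's exact operators)."""
    # Initialization: random perturbations inside the epsilon ball.
    pop = [np.clip(x_orig + rng.uniform(-epsilon, epsilon, x_orig.shape), 0, 1)
           for _ in range(pop_size)]
    for _ in range(generations):
        outs = [query_fn(ind) for ind in pop]          # one query per individual
        # Fitness: the true-class score; lower means closer to misclassification.
        order = np.argsort([o[true_label] for o in outs])
        pop = [pop[i] for i in order]
        if int(np.argmax(outs[order[0]])) != true_label:
            return pop[0]                              # success: label flipped
        elites = pop[:pop_size // 2]
        children = []
        for _ in range(pop_size - len(elites)):
            # Crossover: uniform pixel-wise mix of two elite parents.
            p1, p2 = rng.choice(len(elites), 2, replace=False)
            mask = rng.random(x_orig.shape) < 0.5
            child = np.where(mask, elites[p1], elites[p2])
            # Mutation: small Gaussian noise.
            child = child + rng.normal(0.0, mutation_std, x_orig.shape)
            # Keep the candidate inside the epsilon ball and pixel range.
            child = np.clip(child, x_orig - epsilon, x_orig + epsilon)
            children.append(np.clip(child, 0, 1))
        pop = elites + children
    return pop[0]  # best effort if no success within the budget
```

Note that the loop needs only output scores from `query_fn`, and the query count grows by one per individual per generation, which is why selection pressure (keeping the elite half) matters for query efficiency.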
5. Experiments and Results
5.1. Attacking Defenses
5.1.1. Attacking Non-Differentiable Transformations
5.1.2. Attacking Robust Models
5.2. Transferability
6. Discussion and Concluding Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
ImageNet

| Model | ε | QuEry Attack ASR | QuEry Attack Queries | Square ASR | Square Queries | AdversarialPSO ASR | AdversarialPSO Queries |
|---|---|---|---|---|---|---|---|
| Inception-v3 | 24 | 100% | 1 | 100% | 5 | 98.5% | 51 |
| Inception-v3 | 18 | 100% | 2 | 100% | 8 | 98.5% | 69 |
| Inception-v3 | 12 | 100% | 22 | 95.5% | 32 | 97% | 102 |
| Inception-v3 | 6 | 99.5% | 276 | 99.0% | 263 | 95% | 285 |
| ResNet-50 | 24 | 100% | 1 | 99.5% | 5 | 98.5% | 51 |
| ResNet-50 | 18 | 100% | 1 | 100% | 5 | 98.5% | 69 |
| ResNet-50 | 12 | 100% | 13 | 100% | 26 | 97% | 102 |
| ResNet-50 | 6 | 100% | 211 | 99.0% | 248 | 95% | 285 |
| VGG-16-BN | 24 | 100% | 1 | 100% | 5 | 98.5% | 51 |
| VGG-16-BN | 18 | 100% | 1 | 100% | 5 | 98.5% | 69 |
| VGG-16-BN | 12 | 100% | 1 | 100% | 5 | 97% | 102 |
| VGG-16-BN | 6 | 100% | 77 | 100% | 86 | 95% | 285 |

CIFAR10

| Model | ε | QuEry Attack ASR | QuEry Attack Queries | Square ASR | Square Queries | AdversarialPSO ASR | AdversarialPSO Queries |
|---|---|---|---|---|---|---|---|
| Inception-v3 | 24 | 100% | 1 | 89.5% | 5 | 97.5% | 31 |
| Inception-v3 | 18 | 100% | 2 | 92.5% | 17 | 96.0% | 41 |
| Inception-v3 | 12 | 97.5% | 20 | 94.5% | 77 | 95.0% | 54 |
| Inception-v3 | 6 | 91.0% | 428 | 95.5% | 776 | 94.5% | 223 |
| ResNet-50 | 24 | 100% | 2 | 92.0% | 8 | 97.5% | 31 |
| ResNet-50 | 18 | 100% | 8 | 91.0% | 23 | 96.0% | 41 |
| ResNet-50 | 12 | 100% | 96 | 87.0% | 110 | 95.0% | 54 |
| ResNet-50 | 6 | 99.0% | 565 | 87.0% | 449 | 94.5% | 223 |
| VGG-16-BN | 24 | 100% | 1 | 89.5% | 5 | 97.5% | 31 |
| VGG-16-BN | 18 | 99.0% | 2 | 87.0% | 14 | 96.0% | 41 |
| VGG-16-BN | 12 | 98.0% | 60 | 87.0% | 86 | 95.0% | 54 |
| VGG-16-BN | 6 | 95.5% | 741 | 88.5% | 890 | 94.5% | 223 |

MNIST

| Model | ε | QuEry Attack ASR | QuEry Attack Queries | Square ASR | Square Queries | AdversarialPSO ASR | AdversarialPSO Queries |
|---|---|---|---|---|---|---|---|
| Conv Net | 80 | 100% | 5 | 86.0% | 14 | 76% | 2675 |
| Conv Net | 60 | 93.5% | 72 | 93.0% | 77 | 99% | 292 |
Conv Block

| Layer | Layer type |
|---|---|
| 1 | Convolution |
| 2 | BatchNorm2d |
| 3 | ReLU |

Hyperparameters

| Hyperparameter | Value |
|---|---|
| Epochs | 300 |
| Batch size | 64 |
| Optimizer | |
| Learning rate | |
| Weight decay | 1 × 10⁻⁶ |

CNN Architecture

| Layer | Layer type | No. channels | Filter size | Stride |
|---|---|---|---|---|
| 1 | Conv Block | 32 | | 1 |
| 2 | Max Pooling | N/A | | 2 |
| 3 | Conv Block | 64 | | 1 |
| 4 | Max Pooling | N/A | | 2 |
| 5 | Conv Block | 128 | | 1 |
| 6 | Max Pooling | N/A | | 2 |
| 7 | Dropout | N/A | N/A | N/A |
| 8 | Fully Connected | 128 | N/A | N/A |
| 9 | Fully Connected | 10 | N/A | N/A |
| Defense | ε | CIFAR10 ASR | CIFAR10 Queries | ImageNet ASR | ImageNet Queries |
|---|---|---|---|---|---|
| JPEG Compression | 18 | 100% | 2 | 100% | 2 |
| JPEG Compression | 12 | 98.0% | 13 | 97.5% | 14 |
| JPEG Compression | 6 | 93.5% | 287 | 99.5% | 311 |
| Bit-Depth Reduction | 18 | 100% | 2 | 100% | 2 |
| Bit-Depth Reduction | 12 | 99.5% | 5 | 100% | 27 |
| Bit-Depth Reduction | 6 | 96.0% | 72 | 99.5% | 145 |
| Spatial Smoothing | 18 | 100% | 1 | 100% | 1 |
| Spatial Smoothing | 12 | 100% | 1 | 100% | 2 |
| Spatial Smoothing | 6 | 99.0% | 6 | 99.5% | 144 |
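Two of the input-transformation defenses above are simple to state. The sketch below gives illustrative NumPy versions of bit-depth reduction (feature squeezing) and spatial smoothing via a median filter; it is not the implementation used in the experiments, and the parameter choices (`bits`, kernel size `k`) are assumptions:

```python
import numpy as np

def bit_depth_reduction(x, bits=4):
    """Quantize pixel values in [0, 1] down to 2**bits levels, removing
    the low-order bits that small adversarial perturbations often live in."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def spatial_smoothing(x, k=3):
    """Median-filter a 2-D image over k x k neighborhoods (edge-padded),
    suppressing isolated adversarial pixels while preserving edges."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.empty_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(xp[i:i + k, j:j + k])
    return out
```

Because neither transformation is differentiable, gradient-based attacks struggle against them, whereas a query-only attack simply treats the transformation-plus-model pipeline as one black box.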
| Dataset | Model | ε | ASR | Queries |
|---|---|---|---|---|
| ImageNet | Wide ResNet-50-2 | 12 | 98.0% | 98 |
| ImageNet | Wide ResNet-50-2 | 6 | 93.5% | 1187 |
| CIFAR10 | Wide ResNet-70-16 | 18 | 94.5% | 156 |
| CIFAR10 | Wide ResNet-70-16 | 12 | 85.0% | 487 |
| Source Model → Target Model | ε | TSR |
|---|---|---|
| Inception-v3 → ResNet-50 | 24 | 67.0% |
| Inception-v3 → ResNet-50 | 18 | 47.5% |
| Inception-v3 → ResNet-50 | 12 | 33.5% |
| Inception-v3 → ResNet-50 | 6 | 13.5% |
| ResNet-50 → VGG-16-BN | 24 | 90.0% |
| ResNet-50 → VGG-16-BN | 18 | 81.5% |
| ResNet-50 → VGG-16-BN | 12 | 61.5% |
| ResNet-50 → VGG-16-BN | 6 | 28.5% |
| VGG-16-BN → Inception-v3 | 24 | 59.0% |
| VGG-16-BN → Inception-v3 | 18 | 46.5% |
| VGG-16-BN → Inception-v3 | 12 | 30.0% |
| VGG-16-BN → Inception-v3 | 6 | 12.5% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lapid, R.; Haramaty, Z.; Sipper, M. An Evolutionary, Gradient-Free, Query-Efficient, Black-Box Algorithm for Generating Adversarial Instances in Deep Convolutional Neural Networks. Algorithms 2022, 15, 407. https://doi.org/10.3390/a15110407