ELAA: An Ensemble-Learning-Based Adversarial Attack Targeting Image-Classification Model
Abstract
1. Introduction
- (1) Building on AutoAttacker, a reinforcement-learning-based black-box adversarial-attack framework, a new black-box adversarial-example attack model is proposed: ELAA, an ensemble-learning-based adversarial-example attack model targeting image classification. Adversarial examples can be generated without knowing internal information about the attacked network, such as its structure and weights.
- (2) The bagging method is adopted for ensemble learning of the reinforcement-learning-based base learners, and combining their outputs by voting effectively reinforces the strengths of each base learner.
- (3) Taking an attack on the classical image-classification model ResNet as an example, the experimental results show a significant attack effect: the attack success rate with ensemble learning is about 35% higher than that of a single learning model, and the attack success rate of ELAA is 15% higher than that of any baseline method.
2. Related Works
3. Proposed Method
3.1. Basic Idea
3.2. Assumptions and Definitions
3.3. Overview of Proposed Model
3.4. Base Learner with Reinforcement Learning Agent
Algorithm 1: Actor-Critic in the proposed model
Input: iteration budget T, time step t, discount factor γ, learning-rate hyperparameters α_θ (policy network) and α_w (value network)
Process:
    Initialize the observation of the state s_1
    for t = 1, 2, ..., T:
        a_t = Actor(s_t; θ)
        V(s_t) = Critic(s_t; w)
        Update the TD error by δ_t = r_t + γ V(s_{t+1}; w) - V(s_t; w)
        Update the Critic by w ← w + α_w δ_t ∇_w V(s_t; w)
        Update the Actor by θ ← θ + α_θ δ_t ∇_θ log π(a_t | s_t; θ)
        Update the state by s_t ← s_{t+1}
    end for
Output: the trained policy parameters θ
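As a concrete illustration, the following is a minimal sketch of the one-step Actor-Critic update in Algorithm 1 on a toy discrete MDP. The environment, state/action sizes, and learning rates here are illustrative stand-ins, not values from the paper; in ELAA the environment step would instead perturb an image and query the target classifier for a reward.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 16, 4
theta = np.zeros((n_states, n_actions))  # actor: policy logits per state
w = np.zeros(n_states)                   # critic: state-value estimates
gamma, alpha_theta, alpha_w = 0.99, 0.1, 0.1

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(s, a):
    # Stand-in environment transition; a real attack would apply a
    # perturbation action and query the target model here.
    s_next = (s + a) % n_states
    r = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, r

s = 0
for t in range(1000):                       # iteration budget T
    pi = softmax(theta[s])
    a = rng.choice(n_actions, p=pi)         # a_t = Actor(s_t; theta)
    s_next, r = step(s, a)
    delta = r + gamma * w[s_next] - w[s]    # TD error delta_t
    w[s] += alpha_w * delta                 # critic update
    grad_log = -pi                          # grad of log pi(a|s) w.r.t.
    grad_log[a] += 1.0                      # the logits theta[s]
    theta[s] += alpha_theta * delta * grad_log  # actor update
    s = s_next                              # state update
```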
3.5. Ensemble Model Based on Bagging
Algorithm 2: Bagging algorithm in the proposed model
Input: dataset X = {(x_1, y_1), ..., (x_m, y_m)}, where x_j is an image and y_j is the label of x_j; base-learner algorithm L; number of iterations n
Process:
    for i = 1, 2, ..., n:
        Sample randomly from the training set X using bootstrapping to obtain D_i, a subset of the training set X
        Train base learner h_i with RL (algorithm L) on dataset D_i
    end for
Output: the results of the base learners combined into a final result by a plurality-voting strategy, H(x) = argmax_y Σ_{i=1}^{n} 1(h_i(x) = y)
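For clarity, here is a minimal sketch of the bagging scheme in Algorithm 2: bootstrap-sample the training set, train one base learner per sample, and combine predictions by plurality voting. `train_base_learner` stands in for the RL-based base-learner training routine and is a placeholder, not the paper's implementation; labels are assumed hashable.

```python
import numpy as np
from collections import Counter

def bootstrap_indices(n, rng):
    # Draw n indices with replacement: one bootstrap sample D_i of X.
    return rng.integers(0, n, size=n)

def bagging(X, y, train_base_learner, n_learners=5, seed=0):
    # X, y are numpy arrays; each learner is trained on its own
    # bootstrap sample, as in the for-loop of Algorithm 2.
    rng = np.random.default_rng(seed)
    learners = []
    for _ in range(n_learners):
        idx = bootstrap_indices(len(X), rng)
        learners.append(train_base_learner(X[idx], y[idx]))
    return learners

def plurality_vote(learners, x):
    # Each base learner h_i votes; the most common prediction wins,
    # i.e. H(x) = argmax_y sum_i 1(h_i(x) = y).
    votes = [learner(x) for learner in learners]
    return Counter(votes).most_common(1)[0][0]
```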
4. Experiments and Analysis
4.1. Target Model and Datasets
4.2. Experimental Results and Analysis
4.2.1. Performance of Ensemble Learning of ELAA on MNIST
4.2.2. Performance of Ensemble Learning of ELAA on CIFAR-10
4.2.3. Comparison of Attack Performance between ELAA and AutoAttacker
4.2.4. Adversarial Examples
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Zhong, N.; Qian, Z.; Zhang, X. Undetectable adversarial examples based on microscopical regularization. In Proceedings of the 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021; pp. 1–6.
- Athalye, A.; Engstrom, L.; Ilyas, A.; Kwok, K. Synthesizing robust adversarial examples. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 284–293.
- Wu, L.; Zhu, Z.; Tai, C.; E, W. Understanding and enhancing the transferability of adversarial examples. arXiv 2018, arXiv:1802.09707.
- Bhambri, S.; Muku, S.; Tulasi, A.; Buduru, A.B. A survey of black-box adversarial attacks on computer vision models. arXiv 2019, arXiv:1912.01667.
- Chen, X.; Weng, J.; Deng, X.; Luo, W.; Lan, Y.; Tian, Q. Feature Distillation in Deep Attention Network Against Adversarial Examples. IEEE Trans. Neural Netw. Learn. Syst. 2021.
- Inkawhich, N.; Liang, K.J.; Carin, L.; Chen, Y. Transferable perturbations of deep feature distributions. arXiv 2020, arXiv:2004.12519.
- Yuan, X.; He, P.; Zhu, Q.; Li, X. Adversarial examples: Attacks and defenses for deep learning. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 2805–2824.
- Arcos-Garcia, A.; Alvarez-Garcia, J.A.; Soria-Morillo, L.M. Evaluation of deep neural networks for traffic sign detection systems. Neurocomputing 2018, 316, 332–344.
- Yang, X.; Liu, W.; Zhang, S.; Liu, W.; Tao, D. Targeted attention attack on deep learning models in road sign recognition. IEEE Internet Things J. 2020, 8, 4980–4990.
- Kurakin, A.; Goodfellow, I.J.; Bengio, S. Adversarial examples in the physical world. In Artificial Intelligence Safety and Security; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018; pp. 99–112.
- Lee, M.; Kolter, Z. On physical adversarial patches for object detection. arXiv 2019, arXiv:1906.11897.
- Chen, S.T.; Cornelius, C.; Martin, J.; Chau, D.H.P. Shapeshifter: Robust physical adversarial attack on faster r-cnn object detector. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Bilbao, Spain, 13–17 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 52–68.
- Zolfi, A.; Kravchik, M.; Elovici, Y.; Shabtai, A. The translucent patch: A physical and universal attack on object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 15232–15241.
- Thys, S.; Van Ranst, W.; Goedemé, T. Fooling automated surveillance cameras: Adversarial patches to attack person detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 15–20 June 2019.
- Xiao, Z.; Gao, X.; Fu, C.; Dong, Y.; Gao, W.; Zhang, X.; Zhou, J.; Zhu, J. Improving transferability of adversarial patches on face recognition with generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 11845–11854.
- Mingxing, D.; Li, K.; Xie, L.; Tian, Q.; Xiao, B. Towards multiple black-boxes attack via adversarial example generation network. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual Event, China, 20–24 October 2021; pp. 264–272.
- Dong, Y.; Cheng, S.; Pang, T.; Su, H.; Zhu, J. Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 9536–9548.
- Co, K.T.; Muñoz-González, L.; de Maupeou, S.; Lupu, E.C. Procedural noise adversarial examples for black-box attacks on deep convolutional networks. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 275–289.
- Jia, S.; Song, Y.; Ma, C.; Yang, X. Iou attack: Towards temporally coherent black-box adversarial attack for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 6709–6718.
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199.
- Baluja, S.; Fischer, I. Learning to attack: Adversarial transformation networks. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 4–6 February 2018; Volume 32.
- Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572.
- Huang, Z.; Zhang, T. Black-box adversarial attack with transferable model-based embedding. arXiv 2019, arXiv:1911.07140.
- Laidlaw, C.; Feizi, S. Functional adversarial attacks. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
- Ma, X.; Li, B.; Wang, Y.; Erfani, S.M.; Wijewickrema, S.; Schoenebeck, G.; Song, D.; Houle, M.E.; Bailey, J. Characterizing adversarial subspaces using local intrinsic dimensionality. arXiv 2018, arXiv:1801.02613.
- Chen, P.Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 15–26.
- Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 39–57.
- Wierstra, D.; Schaul, T.; Glasmachers, T.; Sun, Y.; Peters, J.; Schmidhuber, J. Natural evolution strategies. J. Mach. Learn. Res. 2014, 15, 949–980.
- Salimans, T.; Ho, J.; Chen, X.; Sidor, S.; Sutskever, I. Evolution strategies as a scalable alternative to reinforcement learning. arXiv 2017, arXiv:1703.03864.
- Ilyas, A.; Engstrom, L.; Athalye, A.; Lin, J. Black-box adversarial attacks with limited queries and information. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 2137–2146.
- Li, Y.; Li, L.; Wang, L.; Zhang, T.; Gong, B. Nattack: Learning the distributions of adversarial examples for an improved black-box attack on deep neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 3866–3876.
- Ilyas, A.; Engstrom, L.; Madry, A. Prior convictions: Black-box adversarial attacks with bandits and priors. arXiv 2018, arXiv:1807.07978.
- Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z.B.; Swami, A. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, Abu Dhabi, United Arab Emirates, 2–6 April 2017; pp. 506–519.
- Hang, J.; Han, K.; Chen, H.; Li, Y. Ensemble adversarial black-box attacks against deep learning systems. Pattern Recognit. 2020, 101, 107184.
- Tsingenopoulos, I.; Preuveneers, D.; Joosen, W. AutoAttacker: A reinforcement learning approach for black-box adversarial attacks. In Proceedings of the 2019 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Stockholm, Sweden, 17–19 June 2019; pp. 229–237.
- Perolat, J.; Malinowski, M.; Piot, B.; Pietquin, O. Playing the game of universal adversarial perturbations. arXiv 2018, arXiv:1809.07802.
- Wang, Z.; Wang, Y.; Wang, Y. Fooling Adversarial Training with Inducing Noise. arXiv 2021, arXiv:2111.10130.
- Wang, X.; Yang, Y.; Deng, Y.; He, K. Adversarial training with fast gradient projection method against synonym substitution based text attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 13997–14005.
- García, J.; Majadas, R.; Fernández, F. Learning adversarial attack policies through multi-objective reinforcement learning. Eng. Appl. Artif. Intell. 2020, 96, 104021.
- Sun, Y.; Wang, S.; Tang, X.; Hsieh, T.Y.; Honavar, V. Adversarial attacks on graph neural networks via node injections: A hierarchical reinforcement learning approach. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 673–683.
- Yang, C.; Kortylewski, A.; Xie, C.; Cao, Y.; Yuille, A. Patchattack: A black-box texture-based attack with reinforcement learning. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 681–698.
- Sarkar, S.; Mousavi, S.; Babu, A.R.; Gundecha, V.; Ghorbanpour, S.; Shmakov, A.K. Measuring Robustness with Black-Box Adversarial Attack using Reinforcement Learning. In Proceedings of the NeurIPS ML Safety Workshop, Virtual, 9 December 2022.
- Chaubey, A.; Agrawal, N.; Barnwal, K.; Guliani, K.K.; Mehta, P. Universal adversarial perturbations: A survey. arXiv 2020, arXiv:2005.08087.
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
- Guo, C.; Gardner, J.; You, Y.; Wilson, A.G.; Weinberger, K. Simple Black-box Adversarial Attacks. In Proceedings of the 36th International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; Chaudhuri, K., Salakhutdinov, R., Eds.; Volume 97, pp. 2484–2493.
- Tu, C.C.; Ting, P.; Chen, P.Y.; Liu, S.; Zhang, H.; Yi, J.; Hsieh, C.J.; Cheng, S.M. Autozoom: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 742–749.
| Algorithms | Model-Free | Policy-Based | Value-Based | On-Policy | Off-Policy |
|---|---|---|---|---|---|
| Q-Learning | ✓ | | ✓ | | ✓ |
| Sarsa | ✓ | | ✓ | ✓ | |
| Policy-Gradient | ✓ | ✓ | | | |
| Deep Q Network | ✓ | | ✓ | | ✓ |
| Actor-Critic | ✓ | ✓ | ✓ | | |
| Method | SR (Success Rate) | QA (Average Queries) | L2 (Average Distortion) |
|---|---|---|---|
| SimBA-CB | 100% | 322 | 2.04 |
| SimBA-DCT | 100% | 353 | 2.21 |
| AutoZOOM-BiLIN | 100% | 85.6 | 1.99 |
| CompleteRandom | 69.6% | 161.2 | 3.89 |
| ELAA | 100% | 83.91 | 1.53 |