Universal Adversarial Training Using Auxiliary Conditional Generative Model-Based Adversarial Attack Generation
Abstract
1. Introduction
- To propose the use of an auxiliary generative model for adversarial training purposes;
- To enhance model robustness by adopting the AC-GAN architecture and using it to generate adversarial samples for adversarial training;
- To present experimental results for AC-GAN-based adversarially trained models and compare their robustness against attacks with that of models adversarially trained using other methods.
2. Background Research
2.1. Conditional Generative Adversarial Networks and AC-GANs
2.2. Adversarial Attacks
- Fast Gradient Sign Method (FGSM) [1]: FGSM generates an adversarial example by computing the gradient of the loss with respect to each input pixel and perturbing each pixel in the direction that maximizes the loss (a minimal sketch of this family of gradient-based attacks follows this list).
- Projected Gradient Descent (PGD) [9]: this attack is an extension of FGSM; while FGSM takes a single gradient step to compute the perturbation, PGD improves attack efficacy by taking multiple smaller steps and projecting the perturbed input back into the allowed perturbation set after each step.
- Momentum Iterative Method (MIM) [36]: this attack uses a momentum term to “boost” adversarial attacks, i.e., it accumulates gradients across iterations to stabilize the update direction and avoid poor local maxima.
- Basic Iterative Method (BIM) [21]: BIM extends the idea of FGSM by applying the attack iteratively with smaller step sizes.
- Unrestricted Adversarial Examples [19,20]: the authors who proposed this idea took a different approach to generating adversarial examples; instead of perturbing existing inputs, a GAN learns to create adversarial examples from scratch by searching its latent space for samples that fool the model under attack. Because our architecture is based on their proposed solution, this alternative is discussed in detail in the following subsection.
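The gradient-based attacks above share a common core. The following is a minimal sketch, assuming a PyTorch image classifier `model` with inputs in [0, 1] and a cross-entropy loss; the function names and hyperparameter values are illustrative choices, not those used in our experiments:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """FGSM: one step of size eps in the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

def iterative_attack(model, x, y, eps=0.03, alpha=0.007, steps=10,
                     mu=0.0, random_start=False):
    """Iterative variants: BIM (mu=0, no random start), PGD (random start),
    and MIM (mu > 0 adds momentum that accumulates past gradients)."""
    x_adv = x.clone().detach()
    if random_start:  # PGD starts from a random point inside the eps-ball
        x_adv = (x_adv + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    g = torch.zeros_like(x)  # momentum accumulator (used when mu > 0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # MIM normalizes the gradient by its L1 norm before accumulating;
        # assumes (N, C, H, W) image batches
        g = mu * g + grad / grad.abs().sum(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        x_adv = x_adv.detach() + alpha * g.sign()
        # Project back into the L-infinity ball of radius eps around x
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```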
2.3. Adversarial Training
2.4. Unrestricted Adversarial Examples
2.5. Mixup Data Augmentation
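Mixup [89] trains on convex combinations of pairs of training examples and their one-hot labels, with the mixing coefficient drawn from a Beta distribution. A minimal sketch, with alpha = 0.2 as an illustrative choice rather than the value used in our experiments:

```python
import numpy as np

def mixup_batch(x1, y1, x2, y2, alpha=0.2):
    """Mixup (Zhang et al. [89]): blend two batches of images and
    their one-hot labels with a Beta-distributed coefficient."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```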
3. Methodology
3.1. Auxiliary Classifier GAN to Generate “Unrestricted Adversarial Examples”
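For reference, below is a sketch of the standard AC-GAN objective from Odena et al. [57], on which this section builds; the modifications we make for adversarial training are described in Section 3.2 and are not reflected here.

```latex
% Standard AC-GAN objective (Odena et al. [57]).
% S: source (real/fake), C: class label, X: real or generated image.
\begin{aligned}
L_S &= \mathbb{E}\left[\log P(S = \mathrm{real} \mid X_{\mathrm{real}})\right]
     + \mathbb{E}\left[\log P(S = \mathrm{fake} \mid X_{\mathrm{fake}})\right] \\
L_C &= \mathbb{E}\left[\log P(C = c \mid X_{\mathrm{real}})\right]
     + \mathbb{E}\left[\log P(C = c \mid X_{\mathrm{fake}})\right]
\end{aligned}
```

The discriminator is trained to maximize $L_S + L_C$, while the generator is trained to maximize $L_C - L_S$.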
3.2. Modifying AC-GAN Architecture for Adversarial Training
3.3. Augmenting Model Training with Generated Images
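As a rough illustration of this step, the following sketch augments each classifier training batch with class-conditional samples from the trained auxiliary generator; the `generator(z, labels)` signature, batch sizes, and mixing strategy are our assumptions, not the paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def augmented_training_step(classifier, optimizer, x_clean, y_clean,
                            generator, n_gen, n_classes, z_dim):
    """One training step on clean data plus generator-produced samples.
    Assumes (our convention) that generator(z, labels) returns images
    whose correct label is the conditioning class."""
    z = torch.randn(n_gen, z_dim)
    y_gen = torch.randint(0, n_classes, (n_gen,))
    with torch.no_grad():              # generator is frozen at this stage
        x_gen = generator(z, y_gen)
    x = torch.cat([x_clean, x_gen])    # union of clean and generated batches
    y = torch.cat([y_clean, y_gen])
    optimizer.zero_grad()
    loss = F.cross_entropy(classifier(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```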
4. Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2015, arXiv:1412.6572.
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2014, arXiv:1312.6199.
- Lin, Z.; Shi, Y.; Xue, Z. IDSGAN: Generative Adversarial Networks for Attack Generation against Intrusion Detection. In Advances in Knowledge Discovery and Data Mining; Springer: Cham, Switzerland, 2022; Volume 13282, pp. 79–91.
- Abdoli, S.; Hafemann, L.G.; Rony, J.; Ayed, I.B.; Cardinal, P.; Koerich, A.L. Universal Adversarial Audio Perturbations. arXiv 2020, arXiv:1908.03173.
- Carlini, N.; Wagner, D. Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. arXiv 2018, arXiv:1801.01944.
- Kang, D.; Sun, Y.; Brown, T.; Hendrycks, D.; Steinhardt, J. Transfer of Adversarial Robustness Between Perturbation Types. arXiv 2019, arXiv:1905.01034.
- Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z.B.; Swami, A. Practical Black-Box Attacks against Machine Learning. arXiv 2017, arXiv:1602.02697.
- Wang, W.; Wang, R.; Wang, L.; Wang, Z.; Ye, A. Towards a Robust Deep Neural Network in Texts: A Survey. arXiv 2021, arXiv:1902.07285.
- Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv 2019, arXiv:1706.06083.
- Mahdy, A.M.S.; Lotfy, K.; El-Bary, A.A. Use of optimal control in studying the dynamical behaviors of fractional financial awareness models. Soft Comput. 2022, 26, 3401–3409.
- Khader, M.M.; Swetlam, N.H.; Mahdy, A.M.S. The Chebyshev Collection Method for Solving Fractional Order Klein-Gordon Equation. WSEAS Trans. Math. 2014, 13, 31–38.
- Bai, T.; Luo, J.; Zhao, J.; Wen, B.; Wang, Q. Recent Advances in Adversarial Training for Adversarial Robustness. arXiv 2021, arXiv:2102.01356.
- Huang, R.; Xu, B.; Schuurmans, D.; Szepesvari, C. Learning with a Strong Adversary. arXiv 2016, arXiv:1511.03034.
- Pang, T.; Xu, K.; Du, C.; Chen, N.; Zhu, J. Improving Adversarial Robustness via Promoting Ensemble Diversity. arXiv 2019, arXiv:1901.08846.
- Tramèr, F.; Kurakin, A.; Papernot, N.; Goodfellow, I.; Boneh, D.; McDaniel, P. Ensemble Adversarial Training: Attacks and Defenses. arXiv 2020, arXiv:1705.07204.
- Athalye, A.; Carlini, N.; Wagner, D. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. arXiv 2018, arXiv:1802.00420.
- Carlini, N.; Wagner, D. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. arXiv 2017, arXiv:1705.07263.
- A Complete List of All Adversarial Example Papers. Available online: https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html (accessed on 14 September 2022).
- Brown, T.B.; Carlini, N.; Zhang, C.; Olsson, C.; Christiano, P.; Goodfellow, I. Unrestricted Adversarial Examples. arXiv 2018, arXiv:1809.08352.
- Song, Y.; Shu, R.; Kushman, N.; Ermon, S. Constructing Unrestricted Adversarial Examples with Generative Models. arXiv 2018, arXiv:1805.07894.
- Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial examples in the physical world. arXiv 2017, arXiv:1607.02533.
- Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z.B.; Swami, A. The Limitations of Deep Learning in Adversarial Settings. arXiv 2015, arXiv:1511.07528.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661.
- Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2014, arXiv:1312.6114.
- Goodfellow, I. NIPS 2016 Tutorial: Generative Adversarial Networks. arXiv 2017, arXiv:1701.00160.
- Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784.
- Odena, A.; Olah, C.; Shlens, J. Conditional Image Synthesis with Auxiliary Classifier GANs. arXiv 2017, arXiv:1610.09585.
- Kang, M.; Shim, W.; Cho, M.; Park, J. Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training. arXiv 2021, arXiv:2111.01118.
- Kang, M.; Park, J. ContraGAN: Contrastive Learning for Conditional Image Generation. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, BC, Canada, 6–12 December 2020; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 21357–21369.
- Miyato, T.; Koyama, M. cGANs with Projection Discriminator. arXiv 2018, arXiv:1802.05637.
- Papers with Code—ArtBench-10 (32 × 32) Benchmark (Conditional Image Generation). Available online: https://paperswithcode.com/sota/conditional-image-generation-on-artbench-10 (accessed on 11 August 2022).
- Adversarial Example Using FGSM | TensorFlow Core. Available online: https://www.tensorflow.org/tutorials/generative/adversarial_fgsm (accessed on 13 September 2022).
- Carlini, N.; Wagner, D. Towards Evaluating the Robustness of Neural Networks. arXiv 2017, arXiv:1608.04644.
- Spall, J.C. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 1992, 37, 332–341.
- Uesato, J.; O’Donoghue, B.; Kohli, P.; Oord, A. Adversarial Risk and the Dangers of Evaluating Against Weak Attacks. arXiv 2018, arXiv:1802.05666.
- Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; Li, J. Boosting Adversarial Attacks with Momentum. arXiv 2018, arXiv:1710.06081.
- Deng, J.; Chen, S.; Dong, L.; Yan, D.; Wang, R. Transferability of Adversarial Attacks on Synthetic Speech Detection. arXiv 2022, arXiv:2205.07711.
- Xu, Y.; Zhong, X.; Yepes, A.J.; Lau, J.H. Grey-Box Adversarial Attack and Defence for Sentiment Classification. arXiv 2021, arXiv:2103.11576.
- Le, T.; Wang, S.; Lee, D. MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models. arXiv 2020, arXiv:2009.01048.
- Zhang, X.; Zhang, J.; Chen, Z.; He, K. Crafting Adversarial Examples for Neural Machine Translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 1967–1977.
- Biggio, B.; Corona, I.; Maiorca, D.; Nelson, B.; Srndic, N.; Laskov, P.; Giacinto, G.; Roli, F. Evasion Attacks against Machine Learning at Test Time. In Machine Learning and Knowledge Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2013; Volume 7908, pp. 387–402.
- Dalvi, N.; Domingos, P.; Mausam; Sanghai, S.; Verma, D. Adversarial classification. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 22–25 August 2004; Association for Computing Machinery: New York, NY, USA, 2004; pp. 99–108.
- Lowd, D.; Meek, C. Adversarial learning. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, Chicago, IL, USA, 21–24 August 2005; Association for Computing Machinery: New York, NY, USA, 2005; pp. 641–647.
- Martins, N.; Cruz, J.M.; Cruz, T.; Henriques Abreu, P. Adversarial Machine Learning Applied to Intrusion and Malware Scenarios: A Systematic Review. IEEE Access 2020, 8, 35403–35419.
- Experimental Security Research of Tesla Autopilot. Available online: https://keenlab.tencent.com/en/whitepapers/Experimental_Security_Research_of_Tesla_Autopilot.pdf (accessed on 26 July 2023).
- Rigaki, M.; Elragal, A. Adversarial Deep Learning Against Intrusion Detection Classifiers. In Proceedings of the ST-152 Workshop on Intelligent Autonomous Agents for Cyber Defence and Resilience, Prague, Czech Republic, 18–20 October 2017.
- Wang, Z. Deep Learning-Based Intrusion Detection With Adversaries. IEEE Access 2018, 6, 38367–38384.
- Apruzzese, G.; Colajanni, M.; Ferretti, L.; Marchetti, M. Addressing Adversarial Attacks against Security Systems Based on Machine Learning. In Proceedings of the 2019 11th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 28–31 May 2019; Volume 900, pp. 1–18.
- Yang, K.; Liu, J.; Zhang, C.; Fang, Y. Adversarial Examples Against the Deep Learning Based Network Intrusion Detection Systems. In Proceedings of the MILCOM 2018—2018 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 29–31 October 2018; pp. 559–564.
- Wu, D.; Fang, B.; Wang, J.; Liu, Q.; Cui, X. Evading Machine Learning Botnet Detection Models via Deep Reinforcement Learning. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–6.
- Balaji, Y.; Goldstein, T.; Hoffman, J. Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets. arXiv 2019, arXiv:1910.08051.
- Ding, G.W.; Sharma, Y.; Lui, K.Y.C.; Huang, R. MMA Training: Direct Input Space Margin Maximization through Adversarial Training. arXiv 2020, arXiv:1812.02637.
- Cheng, M.; Lei, Q.; Chen, P.-Y.; Dhillon, I.; Hsieh, C.-J. CAT: Customized Adversarial Training for Improved Robustness. arXiv 2020, arXiv:2002.06789.
- Zhang, W.; Li, D.; Min, X.; Zhai, G.; Guo, G.; Yang, X.; Ma, K. Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop. arXiv 2022, arXiv:2210.00933.
- Yin, X.; Kolouri, S.; Rohde, G.K. GAT: Generative Adversarial Training for Adversarial Example Detection and Robust Classification. arXiv 2022, arXiv:1905.11475.
- Catak, E.O.; Sivaslioglu, S.; Sahinbas, K. A Generative Model based Adversarial Security of Deep Learning and Linear Classifier Models. arXiv 2020, arXiv:2010.08546.
- Sami, M.; Mobin, I. A Comparative Study on Variational Autoencoders and Generative Adversarial Networks. In Proceedings of the 2019 International Conference of Artificial Intelligence and Information Technology (ICAIIT), Yogyakarta, Indonesia, 13–15 March 2019; pp. 1–5.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875.
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond Empirical Risk Minimization. arXiv 2018, arXiv:1710.09412.
- Zhang, L.; Deng, Z.; Kawaguchi, K.; Ghorbani, A.; Zou, J. How Does Mixup Help with Robustness and Generalization? arXiv 2021, arXiv:2010.04819.
- Nicolae, M.-I.; Sinn, M.; Tran, M.N.; Buesser, B.; Rawat, A.; Wistuba, M.; Zantedeschi, V.; Baracaldo, N.; Chen, B.; Ludwig, H.; et al. Adversarial Robustness Toolbox v1.0.0. arXiv 2019, arXiv:1807.01069.
- Papernot, N.; Faghri, F.; Carlini, N.; Goodfellow, I.; Feinman, R.; Kurakin, A.; Xie, C.; Sharma, Y.; Brown, T.; Roy, A.; et al. Technical Report on the CleverHans v2.1.0 Adversarial Examples Library. arXiv 2018, arXiv:1610.00768.
- Zhang, H.; Chen, H.; Song, Z.; Boning, D.; Dhillon, I.S.; Hsieh, C.-J. The Limitations of Adversarial Training and the Blind-Spot Attack. arXiv 2019, arXiv:1901.04684.
- Ojha, U.; Li, Y.; Lu, J.; Efros, A.A.; Jae Lee, Y.; Shechtman, E.; Zhang, R. Few-shot Image Generation via Cross-domain Correspondence. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; IEEE: New York City, NY, USA, 2021; pp. 10738–10747.
Loss Functions | Unrestricted Adv. Model | Our Model |
---|---|---|
… | Yes | Yes |
… | Yes | No |
… | Yes | Yes |
… | No | Yes |
Attacks | Clean Model | Madry’s Model | Unrest. Adv. Model | Our Model (w/o Mixup) | Our Model |
---|---|---|---|---|---|
Clean | 97.1% | 98.7% | 98.5% | 98.4% | 98.4% |
PGD | 13.5% | 99.9% | 98.7% | 98.6% | 98.3% |
FGSM | 13.4% | 98.8% | 98.4% | 98.4% | 98.8% |
Unrestricted | 13.9% | 91.1% | 99.9% | 99.1% | 99.1% |
SPSA | 37.7% | 76.9% | 90.3% | 94.4% | 99.8% |
MIM | 4.9% | 50.0% | 92.9% | 90.2% | 94.0% |
BIM | 9.3% | 75.5% | 89.5% | 92.9% | 98.5% |
Model | Average | Variance |
---|---|---|
Madry’s Model | 83.43% | 3 × 10⁻² |
Unrestricted Adv. Model | 94.70% | 2 × 10⁻³ |
Proposed Model | 97.48% | 7.2 × 10⁻⁴ |
Attacks | Clean Model | Madry’s Model | Unrest. Adv. Model | Our Model (w/o Mixup) | Our Model |
---|---|---|---|---|---|
Clean | 96.4% | 95.5% | 91.2% | 94.2% | 96.8% |
SPSA | 26.4% | 70.5% | 88.5% | 90.5% | 91.7% |
MIM | 10.3% | 50.2% | 92.9% | 94.1% | 94.5% |
BIM | 7.4% | 70.3% | 88.6% | 90.0% | 93.1% |