Towards Adversarial Attacks for Clinical Document Classification
Abstract
1. Introduction
- We compare the effectiveness of different black-box adversarial attacks on the robustness of a CNN model for document classification on long clinical texts.
- We evaluate the effectiveness of using class label names as adversarial examples (AEs), either by concatenating them to the unstructured text or by editing them wherever they appear in the text.
- We propose a novel defense technique based on feature selection and filtering to enhance the robustness of the CNN model.
- We evaluate the robustness of the proposed approach on clinical document classification.
2. Related Works
- Model knowledge determines if the adversary has access to the model information (white-box attack) or if the model is unknown and inaccessible to the adversary (black-box attack).
- Target type determines the aim of the adversary. If the attack can alter the output prediction to a specific class, it is called a targeted attack, whereas an untargeted attack tries to fool the DL model into making any incorrect prediction.
- Semantic granularity refers to the level to which the perturbations are applied. In other words, AEs are generated by perturbing sentences (sentence-level), words (word-level) or characters (character-level).
2.1. Adversarial Attack Strategies
2.2. Adversarial Defense Strategies
3. Method
3.1. Problem Formulation
3.2. Concatenation Adversaries
- Adding perturbation words at random locations: In this attack, we add a varying number of copies of a specific perturbation word at random locations in the input document. If w_p denotes the added word and x = [x_1, …, x_n] the input tokens, the adversarial input is x_adv = [x_1, …, w_p, …, x_n], with the location of each w_p chosen at random.
- Adding perturbation words at the beginning: In this attack, the aim is to prepend a varying number of copies of a specific perturbation word to each input document, so the adversarial input is x_adv = [w_p, …, w_p, x_1, …, x_n].
- Adding perturbation words at the end: This attack appends a varying number of copies of a specific perturbation word to each document, giving x_adv = [x_1, …, x_n, w_p, …, w_p].
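The three concatenation attacks above can be sketched as follows. This is an illustrative token-level implementation, not the paper's exact code; the function and parameter names are ours.

```python
import random

def concat_attack(tokens, word, n, mode="random", seed=0):
    """Return a copy of `tokens` with n copies of `word` inserted.

    mode "begin" prepends, "end" appends, and "random" inserts each
    copy at a randomly chosen position, mirroring the
    Concat-Begin / Concat-End / Concat-Random attacks.
    """
    adv = list(tokens)
    if mode == "begin":
        return [word] * n + adv
    if mode == "end":
        return adv + [word] * n
    rng = random.Random(seed)
    for _ in range(n):
        # insertion index may equal len(adv), i.e. append at the end
        adv.insert(rng.randint(0, len(adv)), word)
    return adv

doc = "patient presents with a mass in the left lung".split()
print(concat_attack(doc, "breast", 3, mode="end"))
```

Because the model is only queried on the perturbed input, this stays a black-box attack: no gradients or model internals are needed.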
3.3. Edit Adversaries
- Synthetic perturbation: In this attack, AEs are generated by character-level perturbations of words from a specific list: swapping two neighboring characters, randomly deleting one character, randomly shuffling all characters, and randomly shuffling all characters except the first and last. Each of these perturbations is performed on all selected words at the same time.
- Replacing strategy: This attack is a targeted attack in which the selected words in all input documents are replaced with a specific perturbation word that steers the prediction toward the targeted class.
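The four synthetic character perturbations can be sketched as below. The helper names are ours and the random choices are illustrative; the paper's exact perturbation procedure may differ.

```python
import random

rng = random.Random(0)

def swap(w):
    # swap one randomly chosen pair of adjacent characters
    if len(w) < 2:
        return w
    i = rng.randrange(len(w) - 1)
    return w[:i] + w[i + 1] + w[i] + w[i + 2:]

def delete(w):
    # randomly delete one character
    if len(w) < 2:
        return w
    i = rng.randrange(len(w))
    return w[:i] + w[i + 1:]

def fully_random(w):
    # shuffle all characters
    chars = list(w)
    rng.shuffle(chars)
    return "".join(chars)

def middle_random(w):
    # shuffle all characters except the first and last
    if len(w) <= 3:
        return w
    mid = list(w[1:-1])
    rng.shuffle(mid)
    return w[0] + "".join(mid) + w[-1]

def perturb_doc(text, targets, edit):
    # apply the chosen edit to every occurrence of each target word
    return " ".join(edit(t) if t in targets else t for t in text.split())

print(perturb_doc("invasive carcinoma of the breast", {"breast"}, middle_random))
```

Edits such as Middle Random exploit the fact that a word-level vocabulary maps the scrambled token to an unknown-word embedding, even though a human reader still recognizes it.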
3.4. Defense Strategy
3.5. Evaluation Metrics
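The results tables report micro and macro F1. For reference, a pure-Python sketch of both metrics for single-label multi-class predictions (our own helper, not the paper's evaluation code):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (micro F1, macro F1) for multi-class predictions.

    Micro F1 pools TP/FP/FN over all classes (it equals accuracy in
    the single-label case); macro F1 averages per-class F1, so rare
    classes weigh as much as common ones.
    """
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    def f1(t, f_p, f_n):
        return 2 * t / (2 * t + f_p + f_n) if t else 0.0

    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro

micro, macro = f1_scores(["breast", "lung", "lung", "skin"],
                         ["breast", "lung", "skin", "skin"])
print(round(micro, 4), round(macro, 4))  # → 0.75 0.7778
```

The gap between the two scores in the appendix tables reflects the strong class imbalance of the clinical dataset: an attack that flips rare classes hurts macro F1 far more than micro F1.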
4. Experimental Setup
4.1. Data
4.2. Target Model
4.3. Adversarial Attack
- Using these names is considered a black-box attack as the adversary does not need to know the CNN model details.
- As we will see later, filtering these names during the model training does not impact the overall model accuracy.
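Such a filtering step can be sketched as below. The label list is a hypothetical subset of the 25 class label names; the paper's actual preprocessing may differ.

```python
import re

# hypothetical subset of the 25 class label names in the dataset
LABEL_NAMES = ["breast", "lung", "kidney", "sarcoma",
               "leukemia/lymphoma", "adrenal gland", "bile duct"]

def filter_label_names(text, labels=LABEL_NAMES):
    """Remove every occurrence of a class label name from the text.

    Longer names are matched first so that "adrenal gland" is removed
    as a unit instead of leaving "gland" behind; the word boundaries
    keep the match from firing inside longer words.
    """
    pattern = (r"\b(?:"
               + "|".join(re.escape(l) for l in
                          sorted(labels, key=len, reverse=True))
               + r")\b")
    cleaned = re.sub(pattern, " ", text, flags=re.IGNORECASE)
    return " ".join(cleaned.split())  # collapse leftover whitespace

print(filter_label_names("Invasive carcinoma of the breast and adrenal gland"))
```

Filtering the label names during training forces the model to rely on other clinical features, which is why concatenating those names at test time no longer moves the prediction.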
4.3.1. Concatenation Adversaries
- Concat-Random: For each selected perturbation word, 1, 2, 3, 5, 10, 20 or 40 words are randomly added to all input documents. For instance, Concat-Random-Breast-1 means randomly adding one “breast” perturbation word to the documents.
- Concat-Begin: For each selected perturbation word, 1, 2, 3, 5, 10, 20 or 40 words are added at the beginning of all input documents. For instance, Concat-Begin-Breast-1 denotes adding one “breast” perturbation word at the beginning of all documents.
- Concat-End: For each selected perturbation word, 1, 2, 3, 5, 10, 20 or 40 words are appended at the end of all input documents. For instance, Concat-End-Breast-1 means adding one “breast” perturbation word at the end of the documents.
4.3.2. Edit Adversaries
- Edit-Synthetic: In this attack, we perturb the letters of all the selected words (breast, leukemia/lymphoma, and sarcoma) whenever they appear in the document text. Different approaches are applied to edit the targeted tokens: swapping two neighboring characters (Swap), randomly deleting one character (Delete), randomly shuffling all characters (Fully Random), or randomly shuffling all characters except the first and last (Middle Random).
- Edit-Replacing: In this attack, all class label names (the 25 different labels) are replaced with one of the target words (breast, leukemia/lymphoma, or sarcoma) whenever they appear in the unstructured text. For instance, Edit-Replacing-Breast means all class label names that appear in the input document text are replaced with the word “breast” as the perturbed word.
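A minimal sketch of the Edit-Replacing attack follows. The label list here is a hypothetical subset; the paper replaces all 25 class label names.

```python
import re

# hypothetical subset of the 25 class label names in the dataset
LABELS = ["lung", "kidney", "brain", "colon", "thyroid"]

def edit_replacing(text, target, labels=LABELS):
    """Replace every class label name in `text` with `target`.

    Word boundaries keep the pattern from matching inside longer
    words; IGNORECASE covers capitalized mentions.
    """
    pattern = r"\b(?:" + "|".join(map(re.escape, labels)) + r")\b"
    return re.sub(pattern, target, text, flags=re.IGNORECASE)

print(edit_replacing("metastasis from lung to brain", "breast"))
# → metastasis from breast to breast
```

Because every competing label mention becomes the target word, this is a targeted attack: the document's strongest label cues now all point at the target class.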
4.4. Defense
5. Results
5.1. Concatenation Adversaries
5.2. Edit Adversaries
5.3. Defense
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A. TCGA Dataset
Appendix B. Adversarial Attack
Word | Occurrence |
---|---|
duct | 1542 |
bile | 1012 |
gland | 2589 |
adrenal | 1786 |
lymphoma | 90 |
leukemia | 3 |
neck | 2817 |
head | 356 |
Appendix B.1. Concatenation Adversaries
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-End-Breast-1 | 0.8003 | 0.8156 |
Concat-End-Breast-3 | 0.2335 | 0.2029 |
Concat-End-Breast-5 | 0.1486 | 0.0915 |
Concat-End-Breast-20 | 0.1486 | 0.0915 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-End-sarcoma-1 | 0.9520 | 0.9172 |
Concat-End-sarcoma-3 | 0.7594 | 0.6818 |
Concat-End-sarcoma-5 | 0.6156 | 0.5506 |
Concat-End-sarcoma-20 | 0.6156 | 0.5506 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-End-lymphoma-1 | 0.9520 | 0.9091 |
Concat-End-lymphoma-3 | 0.7932 | 0.7367 |
Concat-End-lymphoma-5 | 0.6824 | 0.6203 |
Concat-End-lymphoma-20 | 0.6824 | 0.6203 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Begin-Breast-1 | 0.9198 | 0.9157 |
Concat-Begin-Breast-3 | 0.2461 | 0.2337 |
Concat-Begin-Breast-5 | 0.1682 | 0.1332 |
Concat-Begin-Breast-20 | 0.1682 | 0.1332 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Begin-sarcoma-1 | 0.9615 | 0.9157 |
Concat-Begin-sarcoma-3 | 0.7429 | 0.6666 |
Concat-Begin-sarcoma-5 | 0.6211 | 0.5684 |
Concat-Begin-sarcoma-20 | 0.6211 | 0.5684 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Begin-lymphoma-1 | 0.9638 | 0.9289 |
Concat-Begin-lymphoma-3 | 0.7862 | 0.7262 |
Concat-Begin-lymphoma-5 | 0.6863 | 0.6209 |
Concat-Begin-lymphoma-20 | 0.6863 | 0.6209 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Random-Breast-1 | 0.8066 | 0.8240 |
Concat-Random-Breast-10 | 0.5660 | 0.6006 |
Concat-Random-Breast-20 | 0.4049 | 0.3992 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Random-lymphoma-1 | 0.9520 | 0.9105
Concat-Random-lymphoma-10 | 0.9033 | 0.8567 |
Concat-Random-lymphoma-20 | 0.8381 | 0.7924 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Concat-Random-sarcoma-1 | 0.4049 | 0.3992
Concat-Random-sarcoma-10 | 0.8585 | 0.8051 |
Concat-Random-sarcoma-20 | 0.7720 | 0.7148 |
Appendix B.2. Edit Adversaries
Appendix C. Defense
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9544 | 0.9240 |
Edit-Replacing-Breast | 0.9583 | 0.9369 |
Edit-Replacing-Lung | 0.9583 | 0.9369 |
Edit-Replacing-Kidney | 0.9583 | 0.9369 |
Edit-Replacing-Brain | 0.9583 | 0.9369 |
Edit-Replacing-colon | 0.9583 | 0.9369 |
Edit-Replacing-uterus | 0.9583 | 0.9369 |
Edit-Replacing-thyroid | 0.9583 | 0.9369 |
Edit-Replacing-prostate | 0.9583 | 0.9369 |
Edit-Replacing-head and neck | 0.9583 | 0.9369 |
Edit-Replacing-skin | 0.9583 | 0.9369 |
Edit-Replacing-bladder | 0.9583 | 0.9369 |
Edit-Replacing-liver | 0.9583 | 0.9369 |
Edit-Replacing-stomach | 0.9583 | 0.9369 |
Edit-Replacing-cervix | 0.9583 | 0.9369 |
Edit-Replacing-ovary | 0.9583 | 0.9369 |
Edit-Replacing-sarcoma | 0.9583 | 0.9369 |
Edit-Replacing-adrenal gland | 0.9583 | 0.9369 |
Edit-Replacing-pancreas | 0.9583 | 0.9369 |
Edit-Replacing-oesophagus | 0.9583 | 0.9369 |
Edit-Replacing-testes | 0.9583 | 0.9369 |
Edit-Replacing-thymus | 0.9583 | 0.9369 |
Edit-Replacing-melanoma | 0.9583 | 0.9369 |
Edit-Replacing-leukemia/lymphoma | 0.9583 | 0.9369 |
Edit-Replacing-bile duct | 0.9583 | 0.9369 |
References
- Köksal, Ö.; Akgül, Ö. A Comparative Text Classification Study with Deep Learning-Based Algorithms. In Proceedings of the 2022 9th International Conference on Electrical and Electronics Engineering (ICEEE), Alanya, Turkey, 29–31 March 2022; IEEE: New York, NY, USA, 2022; pp. 387–391.
- Varghese, M.; Anoop, V. Deep Learning-Based Sentiment Analysis on COVID-19 News Videos. In Proceedings of the International Conference on Information Technology and Applications, Lisbon, Portugal, 20–22 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 229–238.
- Affi, M.; Latiri, C. BE-BLC: BERT-ELMO-Based deep neural network architecture for English named entity recognition task. Procedia Comput. Sci. 2021, 192, 168–181.
- Zhang, W.E.; Sheng, Q.Z.; Alhazmi, A.; Li, C. Adversarial attacks on deep-learning models in natural language processing: A survey. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 11, 1–41.
- Alawad, M.; Yoon, H.J.; Tourassi, G.D. Coarse-to-fine multi-task training of convolutional neural networks for automated information extraction from cancer pathology reports. In Proceedings of the 2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Las Vegas, NV, USA, 4–7 March 2018; pp. 218–221.
- Olthof, A.W.; van Ooijen, P.M.A.; Cornelissen, L.J. Deep Learning-Based Natural Language Processing in Radiology: The Impact of Report Complexity, Disease Prevalence, Dataset Size, and Algorithm Type on Model Performance. J. Med. Syst. 2021, 45.
- Wang, Y.; Bansal, M. Robust machine comprehension models via adversarial training. arXiv 2018, arXiv:1804.06473.
- Suya, F.; Chi, J.; Evans, D.; Tian, Y. Hybrid batch attacks: Finding black-box adversarial examples with limited queries. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Boston, MA, USA, 12–14 August 2020; pp. 1327–1344.
- Yala, A.; Barzilay, R.; Salama, L.; Griffin, M.; Sollender, G.; Bardia, A.; Lehman, C.; Buckley, J.M.; Coopey, S.B.; Polubriaginof, F.; et al. Using Machine Learning to Parse Breast Pathology Reports. bioRxiv 2016.
- Buckley, J.M.; Coopey, S.B.; Sharko, J.; Polubriaginof, F.C.G.; Drohan, B.; Belli, A.K.; Kim, E.M.H.; Garber, J.E.; Smith, B.L.; Gadd, M.A.; et al. The feasibility of using natural language processing to extract clinical information from breast pathology reports. J. Pathol. Inform. 2012, 3, 23.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Gao, S.; Alawad, M.; Young, M.T.; Gounley, J.; Schaefferkoetter, N.; Yoon, H.J.; Wu, X.C.; Durbin, E.B.; Doherty, J.; Stroup, A.; et al. Limitations of Transformers on Clinical Text Classification. IEEE J. Biomed. Health Inform. 2021, 25, 3596–3607.
- Chakraborty, A.; Alam, M.; Dey, V.; Chattopadhyay, A.; Mukhopadhyay, D. Adversarial Attacks and Defences: A Survey. arXiv 2018, arXiv:1810.00069.
- Long, T.; Gao, Q.; Xu, L.; Zhou, Z. A survey on adversarial attacks in computer vision: Taxonomy, visualization and future directions. Comput. Secur. 2022, 121, 102847.
- Simoncini, W.; Spanakis, G. SeqAttack: On adversarial attacks for named entity recognition. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 308–318.
- Araujo, V.; Carvallo, A.; Aspillaga, C.; Parra, D. On adversarial examples for biomedical NLP tasks. arXiv 2020, arXiv:2004.11157.
- Jin, D.; Jin, Z.; Zhou, J.T.; Szolovits, P. Is BERT really robust? A strong baseline for natural language attack on text classification and entailment. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 8018–8025.
- Gao, J.; Lanchantin, J.; Soffa, M.L.; Qi, Y. Black-box generation of adversarial text sequences to evade deep learning classifiers. In Proceedings of the 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA, 24 May 2018; IEEE: New York, NY, USA, 2018; pp. 50–56.
- Yuan, L.; Zheng, X.; Zhou, Y.; Hsieh, C.J.; Chang, K.W. On the Transferability of Adversarial Attacks against Neural Text Classifier. arXiv 2020, arXiv:2011.08558.
- Pei, W.; Yue, C. Generating Content-Preserving and Semantics-Flipping Adversarial Text. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, Nagasaki, Japan, 30 May–3 June 2022; pp. 975–989.
- Finlayson, S.G.; Kohane, I.S.; Beam, A.L. Adversarial Attacks Against Medical Deep Learning Systems. CoRR 2018, abs/1804.05296. Available online: http://xxx.lanl.gov/abs/1804.05296 (accessed on 1 December 2022).
- Mondal, I. BBAEG: Towards BERT-based biomedical adversarial example generation for text classification. arXiv 2021, arXiv:2104.01782.
- Zhang, R.; Zhang, W.; Liu, N.; Wang, J. Susceptible Temporal Patterns Discovery for Electronic Health Records via Adversarial Attack. In Proceedings of the International Conference on Database Systems for Advanced Applications, Taipei, Taiwan, 11–14 April 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 429–444.
- Sun, M.; Tang, F.; Yi, J.; Wang, F.; Zhou, J. Identify susceptible locations in medical records via adversarial attacks on deep predictive models. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 793–801.
- Xu, H.; Ma, Y.; Liu, H.C.; Deb, D.; Liu, H.; Tang, J.L.; Jain, A.K. Adversarial attacks and defenses in images, graphs and text: A review. Int. J. Autom. Comput. 2020, 17, 151–178.
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199.
- Wang, W.; Park, Y.; Lee, T.; Molloy, I.; Tang, P.; Xiong, L. Utilizing Multimodal Feature Consistency to Detect Adversarial Examples on Clinical Summaries. In Proceedings of the 3rd Clinical Natural Language Processing Workshop, Online, 19 November 2020; pp. 259–268.
- Belinkov, Y.; Bisk, Y. Synthetic and natural noise both break neural machine translation. arXiv 2017, arXiv:1711.02173.
- Alawad, M.; Gao, S.; Qiu, J.; Schaefferkoetter, N.; Hinkle, J.D.; Yoon, H.J.; Christian, J.B.; Wu, X.C.; Durbin, E.B.; Jeong, J.C.; et al. Deep transfer learning across cancer registries for information extraction from pathology reports. In Proceedings of the 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Chicago, IL, USA, 19–22 May 2019; IEEE: New York, NY, USA, 2019; pp. 1–4.
- Gao, S.; Alawad, M.; Schaefferkoetter, N.; Penberthy, L.; Wu, X.C.; Durbin, E.B.; Coyle, L.; Ramanathan, A.; Tourassi, G. Using case-level context to classify cancer pathology reports. PLoS ONE 2020, 15, e0232840.
Model | Micro F1 (Beginning) | Micro F1 (End) | Micro F1 (Random) | Macro F1 (Beginning) | Macro F1 (End) | Macro F1 (Random)
---|---|---|---|---|---|---
Baseline | 0.9623 | 0.9623 | 0.9623 | 0.9400 | 0.9400 | 0.9400
Concat-Breast-3 | 0.2461 | 0.2335 | 0.7193 | 0.2337 | 0.2029 | 0.7501 |
Concat-Sarcoma-3 | 0.7429 | 0.7594 | 0.9261 | 0.6666 | 0.6818 | 0.8794 |
Concat-lymphoma-3 | 0.7862 | 0.7932 | 0.9465 | 0.7262 | 0.7367 | 0.9028 |
Attack | Number of Documents |
---|---|
Baseline-breast | 134 out of 1272 |
Concat-Random-Breast-1 | 359 |
Concat-Random-Breast-10 | 671 |
Concat-Random-Breast-20 | 878 |
Baseline-sarcoma | 31 out of 1272 |
Concat-Random-sarcoma-1 | 61 |
Concat-Random-sarcoma-10 | 196 |
Concat-Random-sarcoma-20 | 312 |
Baseline-lymphoma | 6 out of 1272 |
Concat-Random-lymphoma-1 | 22 |
Concat-Random-lymphoma-10 | 90 |
Concat-Random-lymphoma-20 | 179 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9623 | 0.9400 |
Swap | 0.9230 | 0.8815 |
Delete | 0.9230 | 0.8815 |
Fully Random | 0.9230 | 0.8815 |
Middle Random | 0.9230 | 0.8815 |
Edit-Replacing-Breast | 0.3774 | 0.4209 |
Edit-Replacing-Sarcoma | 0.7987 | 0.7366 |
Edit-Replacing-Lymphoma | 0.8373 | 0.7648 |
Model | Micro F1 (Beginning) | Micro F1 (End) | Micro F1 (Random) | Macro F1 (Beginning) | Macro F1 (End) | Macro F1 (Random)
---|---|---|---|---|---|---
Baseline | 0.9544 | 0.9544 | 0.9544 | 0.9240 | 0.9240 | 0.9240
Concat-Breast-3 | 0.9544 | 0.9544 | 0.9544 | 0.9240 | 0.9240 | 0.9243 |
Concat-Sarcoma-3 | 0.9544 | 0.9544 | 0.9544 | 0.9240 | 0.9243 | 0.9243 |
Concat-lymphoma-3 | 0.9544 | 0.9544 | 0.9544 | 0.9240 | 0.9240 | 0.9243 |
Attack | Micro F1 | Macro F1
---|---|---
Baseline | 0.9544 | 0.9240 |
Swap | 0.9583 | 0.9369 |
Delete | 0.9583 | 0.9369 |
Fully Random | 0.9583 | 0.9369 |
Middle Random | 0.9583 | 0.9369 |
Edit-Replacing-Breast | 0.9583 | 0.9369 |
Edit-Replacing-Sarcoma | 0.9583 | 0.9369 |
Edit-Replacing-Lymphoma | 0.9583 | 0.9369 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fatehi, N.; Alasad, Q.; Alawad, M. Towards Adversarial Attacks for Clinical Document Classification. Electronics 2023, 12, 129. https://doi.org/10.3390/electronics12010129