1. Introduction
Smart healthcare systems are the result of the fourth industrial revolution (I4.0) that shifted the world’s attention to automation and digitization [
1]. Automation and data exchange are the essential goals of I4.0. They enable the development of smart factories, such as smart healthcare, via the integration of IoT, big data, AI, Cloud computing, and Digital Twins (DT). In order to design a smart healthcare system (i.e., Healthcare 4.0), self-sustaining wireless and proactive online learning systems are important [
2]. An important characteristic of the first category (i.e., self-sustaining wireless) is automation, and of the second an active data exchange and analysis. DT can play a crucial role as an enabler of the design of smart healthcare systems. Smart health can be defined as an intelligent and context-aware evolution of remote healthcare services that uses advanced technologies, such as wearable monitoring devices, mobile devices, sensors, and actuators, to collect data for remote support [
3]. DTs can be considered to be the building blocks of the metaverse, a virtual world in which DTs interact as physical entities do in the real world [
4].
DTs, which are replicas of physical entities (e.g., healthcare assets), can be used for monitoring and testing, to suggest changes and improvements, and to support decision-making. Also, the ability to reflect, mimic, and predict the status of physical systems in real time makes DT a promising technology for smart healthcare systems [
5]. DT consists of three components: the real (physical) space, the virtual space and a communication medium between the two spaces [
6]. Professional health carers can use DT to simulate physical objects and run experiments on the virtual copies before performing actual actions [
7]. In the creation of a replica of a physical healthcare entity, the Internet of Things (IoT) can play an important role in data acquisition and the exchange of tasks. IoT enables smart machines to communicate with one another and helps build a DT of a person’s health information via wearable devices [
8]. Smart healthcare systems use Internet of Medical Things (IoMT) devices to actively monitor medical traffic. The healthcare traffic is received by an artificial intelligence (AI)-enabled framework for disease prediction and remote health monitoring [
9]. AI/ML algorithms can be combined with DT for data processing to make health predictions, detect early warning signs, suggest optimal treatment and personalize medicine [
10]. For ML to model human DT, continuous interaction between the physical and digital spaces to obtain updated data is important [
2]. In the end, AI and IoT can help DT to perform the following tasks: prediction, optimization, detection, and dynamic decision-making [
1].
1.1. Motivations
Although AI/ML, as a key enabling technology for smart healthcare, has many benefits, these are offset by shortcomings such as issues with security, privacy, unfairness, and trustworthiness. The privacy of patients’ data makes the adoption of AI/ML for healthcare challenging. The management of smart healthcare systems relies on centralized AI/ML models located in a data center or the cloud. This centralized architecture could lead to scalability issues [
9]. It can also make AI/ML a valuable target for security attacks. Additionally, ML algorithms trained on biased datasets (e.g., patients with certain socioeconomic backgrounds) may fail to produce accurate results [
11]. A wrong prediction in diagnostics may lead to life-changing decisions for a patient [
3].
Explainability of AI (XAI) and Interpretability of ML (IML), or rather the lack thereof, are among the major obstacles to a full deployment of AI-enabled DT for smart healthcare systems. There is a recent trend toward the use of complex ML-based models, such as deep learning models and fully connected neural networks. Although these complex models have been proven to achieve better performance, they are often perceived as black-box models. The opposite of black-box models are white-box models, which are also known as human-interpretable models. Although there is no formal consensus on the definition of explainability and interpretability of AI/ML, they can be defined as “methods and models that make the behaviors and predictions of machine learning systems understandable to humans” [
12]. Interpretability and explainability are growing fields in AI/ML, and this increased interest has resulted in a number of studies on different aspects of interpretable/explainable ML, from medical diagnoses to decision-making in the justice and education systems. Although the two terms are often used interchangeably in the literature, they have different meanings. Interpretability enables humans to predict what is going to happen, whereas explainability enables humans to explain what is happening [
13]. The ability to interpret an ML model enables decision makers (e.g., experts or non-experts) to debug, update, and ultimately trust it [
14]. Additionally, ML interpretability can be either global or local. While global interpretation methods are useful for understanding the general mechanisms or debugging a model, local interpretation methods explain individual predictions [
12,
15].
In the context of the cybersecurity domain, there is a lack of studies that discuss the explainability and interpretability of AI/ML-enabled DT for smart healthcare systems. Existing studies focus on XAI for healthcare from the health carers’ perspective. However, several recent studies have shown that the explainability and interpretability of AI/ML can be defined on the basis of what to explain (e.g., data, features, or decisions), how to explain it, when to explain it (e.g., at the design stage), and who to explain it to (users, health carers, or designers) [
16]. Similarly, ref. [
3] states that the 6W questions, Why, Who, What, Where, When, and How, need to be evaluated to design an explainable security system. The explainability and interpretability of AI/ML are very important to healthcare practitioners as they add a high level of accountability and trust and can help cybersecurity administrators debug systems. However, adversaries can also use XAI explanations to understand how a black-box model works.
1.2. Contributions
Based on these considerations, this paper reviews the security of XAI-enabled DT for smart healthcare to answer the following research questions:
RQ1 Do existing studies in the field of XAI-enabled DT for healthcare systems consider the robustness of the adopted XAI algorithms?
RQ2 What does the XAI/IML of DT-based smart healthcare systems mean from the perspective of cybersecurity experts?
RQ3 How can we improve the robustness of XAI-enabled DT for smart healthcare frameworks?
RQ4 What are the potential adversarial attacks against XAI-based healthcare systems?
As in [
17], the differences between this paper and related papers in the literature are highlighted in
Table 1, which shows the extent to which XAI security, DT security, healthcare systems, and security are covered. In the table, the term “partial” indicates partial coverage of a topic. To the best of the author’s knowledge, this paper is the first of its kind that attempts to explore the security of the XAI part of XAI-enabled DT for smart healthcare systems. It discusses the importance of the explainability of XAI-enabled DT healthcare systems from a cybersecurity perspective and provides an in-depth examination of the security of XAI in healthcare systems that use DT. To summarize, the main contributions of the research are as follows:
It defines the explainability and interpretability of AI/ML-enabled DT for smart healthcare systems from a cybersecurity perspective.
It provides a survey of important and relevant research that discusses the security of XAI-enabled DT healthcare systems.
It proposes a framework of XAI/IML-enabled DT for healthcare systems.
It presents a simulation of potential adversarial attacks against the XAI part of a DT healthcare system.
The rest of the paper is organized as follows:
Section 2 discusses related work in the field of XAI, DT, and smart healthcare. A systematic literature review is presented in
Section 3.
Section 4 defines the explainability of AI-enabled DT for smart healthcare systems from a cybersecurity perspective.
Section 5 proposes a DT framework architecture.
Section 6 presents the results and analysis of a potential adversarial attack against an XAI model.
Section 7 presents the limitations of the study and discusses the results of the experiments.
Section 8 concludes the paper and discusses future work.
2. Related Work
Researchers have experimented with the creation of virtual and simulated systems (i.e., mirrored systems) of physical systems. However, the systems that were created lacked a continuous connection and real-time data exchange, which are essential elements in the “twinning” of digital to physical systems [
29]. The twinning is a result of seamless interaction, communication, and synchronization between the DT, the physical twin and the surrounding environments. Authors in [
22] state that mirror systems can be classified into three types: (1) a Digital Model (DM), which is an isolated system that has no connection to the real world; (2) a Digital Shadow (DS), which has automatic one-way communication between the physical and virtual spaces; and (3) a DT, which has automatic two-way communication between the two spaces. A combination of technologies, such as big data, AI, cloud computing, and IoT, allows the communication and interaction between a physical entity and its digital twin [
29]. The DT concept was introduced by Michael Grieves in 2002 [
30,
31]. Since then, interest in DT has grown steadily. Although most studies focus on the application of DT in the context of industry [
32], there is increasing interest in studying its potential in healthcare applications.
The Internet of Things (IoT) is an important technology for smart healthcare systems. It provides long-range communication between different entities that can be used for data collection. The Medical Internet of Things (MIoT) is a variant of IoT that has been deployed with DT in healthcare systems. Firouzi et al. [
10] discuss how Wearable IoT (WIoT) has improved healthcare systems during the COVID-19 pandemic. WIoT devices enable the remote tracking of patients and the monitoring of their health for an early diagnosis. However, security and privacy issues brought up by the use of IoT are obstacles that need to be addressed. Yang et al. [
33] carried out a survey on security research, threats, and open issues in MIoT. Taimoor et al. [
34] also discuss the challenges of fourth-generation healthcare systems in terms of security and privacy. The authors state that although IoT and AI can improve healthcare systems, issues remain regarding their security and privacy limitations.
Another key technology that can empower DT in healthcare systems is AI/ML. The majority of AI-enabled DTs in healthcare involve human digital twins. To date, research has focused on digitally twinning some aspects of human biology [
35]. It has not yet been possible to create fully functional replicas of humans. The large amount of data collected by IoT sensors needs to be analyzed in order to generate a digital twin. Thus, AI/ML is considered to be the brain of DT [
6]. Learning, reasoning, and self-correction are three human skills that AI tries to replicate digitally. AI learns from collected data by drawing domain-specific conclusions from numerical data faster than humans can. The ability to select the best option to achieve a desired goal is an example of the digital reasoning ability of AI. The third skill of AI, self-correction, is the process of learning from decisions that are made repeatedly [
1]. AI/ML algorithms or technologies enable DT to perform the following tasks:
Prediction: Predicting diseases.
Detection: Detecting early signs.
Optimization: Monitoring disease progress.
One of the promising ML algorithms that has frequently been discussed in the healthcare field is Federated Learning (FL). Federated learning introduces a new distributed interactive AI concept for smart healthcare, as it allows a number of hospitals to participate in AI training while maintaining data privacy, or to locally train their own models while sharing only parameters with other hospitals. One of the biggest concerns about FL is that it involves data transfer, mobile application usage, and remote access to healthcare information. Also, it is vulnerable to communication cybersecurity attacks, such as jamming and Distributed Denial of Service (DDoS) [
25,
26]. Consequently, FL improves the privacy of healthcare, but it increases the attack surface. Also, the black-box nature of FL is one of the challenges that hinder its adoption in healthcare systems. Designing explainable FL could be a future research direction.
Among the biggest challenges for the full deployment of AI/ML are the explainability and interpretability of AI/ML. Helping human decision-making is one of the ultimate goals of using AI/ML. To enable this, AI/ML needs to be able to produce a detailed rationale for its decisions that can facilitate interaction with humans. This is known as AI explainability. Kobayashi et al. [
5] state that XAI is important for accurate predictions. The authors discuss the importance of complex ML for DT-enabled systems in prediction and system update. They define explainability as allowing users to understand the most important factors in the model’s prediction and interpretability as ensuring that the model predictions can be understood by non-technical users. Authors in [
36] studied the importance of XAI and IML for AI applications designed for healthcare and medical diagnosis. They conclude that combining existing model-interpretability libraries and XAI frameworks with other clinical factors helps humans understand “black-box” AI.
The security and privacy of AI/ML are among the challenges that make health carers hesitant about fully adopting AI/ML-enabled systems. The robustness of ML-based models against adversarial attacks has recently become a subject of increased interest in the research community [
37]. Although ML-based models have been widely used to automate different types of systems, these models are vulnerable to well-crafted, small perturbations. The vulnerability of ML has been examined by a large number of studies, where authors develop frameworks for evaluating algorithms [
38,
39], launching attacks against ML models, and designing countermeasures [
40]. Also, authors in [
38,
39,
41] propose frameworks for evaluating the security of ML and envisioning different attack scenarios against ML algorithms. These frameworks suggest the following steps: (1) identify potential attacks against ML models using a popular taxonomy; (2) simulate these attacks to evaluate the resilience of ML models, assuming that the adversary’s attacks are implemented according to their goals, knowledge, and capabilities/resources; and (3) investigate possible defense strategies against these attacks. Defending against adversarial attacks is challenging because these attacks are non-intrusive in nature. Thus, designing proactive models rather than traditional reactive models is a necessity for AI/ML-enabled healthcare systems. This has motivated researchers to formulate different attack scenarios against machine learning algorithms and classification models and to propose countermeasures [
42].
Designing XAI-enabled systems is challenging because interpretability and accuracy are two competing concepts. Senevirathna et al. [
3] add that explainability is regarded as a third property constraining performance and security. Simplification and generalization are the main concerns of interpretability, whereas accuracy favors nuance and exception [
14]. There are two approaches that are widely used for interpretations of ML models: regression analysis and rule-based ML [
12]. These models provide descriptions of a class as well as predictions [
43]. On the one hand, linear regression models can be interpreted by analyzing the model structure or a weighted sum of features. For example, the weights can be interpreted as the effects that the features have on the prediction. On the other hand, rule-based ML models, such as decision trees [
44] or decision sets [
43], interpret a learned structure (e.g., IF-THEN) to understand how the model makes predictions. Some recent studies have attempted to make complex models interpretable; for example, refs. [
36,
45] have visualized the features of CNN and [
46] the important features of a random forest. However, in high-dimensional scenarios, linear regression, decision trees, or complex models may become uninterpretable [
12]. Additionally, a decision-set type of rule-based ML has some shortcomings. Understanding all of the possible conditions that must be satisfied is difficult and limits interpretability. This is especially so in multi-class classification [
14] or complex AI/ML-enabled healthcare systems [
15].
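To make the contrast between the two interpretation styles concrete, the following minimal sketch (using scikit-learn and a generic tabular dataset, not the healthcare data discussed later) reads the weights of a linear model and prints the IF-THEN structure of a shallow decision tree; the dataset choice and hyperparameters are illustrative assumptions.
```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Regression-style interpretation: each weight is read as the effect the
# corresponding feature has on the predicted outcome.
linear = LinearRegression().fit(X, y)
weights = sorted(zip(X.columns, linear.coef_), key=lambda t: abs(t[1]), reverse=True)
print("Top weighted features:", weights[:5])

# Rule-based interpretation: the learned IF-THEN structure of a shallow tree
# can be printed and inspected directly.
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```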
Although an increasing number of methods are being developed that explain how algorithms work and reach their final decisions (e.g., human-centered AI design), there is a lack of studies that focus on designing methods for the security of AI explainability. To design an explainable AI-based system, the researcher needs to answer the following questions: What to explain? Who to explain to? How to explain? And why to explain? The answers to these questions regarding XAI for cybersecurity will differ according to the target audience [
16]. Some studies suggest overcoming XAI’s shortcomings by removing humans from the loop, which is dangerous in the healthcare domain. Others propose using less complex AI models, which may lead to systems that do not provide the desired results. Thus, the degree of explainability is another important question to ask. Taimoor et al. [
34] state that XAI research includes feature engineering, developing and testing algorithms, risk and opportunities, ethical considerations, and trust. We believe that robustness to adversarial examples needs to be included. Authors in [
47] state that IoT and AI are not secure solutions. The number of publications that survey DT in terms of definitions, characteristics, development, and application has increased, but there is a lack of studies that focus on the security of DT-based healthcare systems. Designing a simplified approximation model to make complex models interpretable to cybersecurity administrators/experts is one of the goals of this paper. The level of AI explainability/ML interpretability exposed to users should be evaluated, as it may make the system vulnerable to adversarial attacks. Thus, this paper focuses on exploring the security of AI explainability as part of smart healthcare systems that use DT.
4. XAI from Cybersecurity Perspectives
One of the research questions that this paper is trying to answer is (RQ2) what does the XAI/IML of DT-based smart healthcare systems mean from the perspective of cybersecurity experts? To answer this question, two points are considered: (1) whether there is any difference between AI’s explainability and interpretability, and (2) why it is important to define whether the XAI/IML is designed for security purposes. First, the explainability and interpretability of AI/ML are often used interchangeably, but some recent studies show that they are not the same. Thus, as a first step toward understanding the XAI/IML of DT-based smart healthcare systems from a cybersecurity perspective, explainability and interpretability need to be defined clearly. Kobayashi et al. [
5] define XAI as the ability of an AI system to provide reasoning behind its actions (i.e., predictions and decisions), whereas IML is the process of explaining the relationship between a model’s input and output variables to the decision-making process. Also, Fischer et al. [
13] state that interpretability enables humans to predict what is going to happen, but explainability enables humans to explain what is happening. This shows that it is very important for cybersecurity administrators/analysts to distinguish between the two, as interpretability (a.k.a. pre-model explainability) is crucial at the design stage, whereas explainability is crucial after an attack and at the debugging stage. However, it may not be important for doctors or nurses to learn about the correlation between input and output samples in the prediction process; they may be more interested in the reasoning behind the predictions.
The second important point to be taken into consideration when addressing this research question is that one of the two categories of adversarial attacks against AI/ML is an exploratory attack, where an adversary tries to learn some of the characteristics of AI/ML. In the case of XAI/IML, adversaries may take advantage of the explainability and interpretability of an XAI/IML. In other words, as the adversaries’ knowledge increases, the effectiveness of the attacks also increases. This idea supports the statement discussed in [
3] about the importance of defining the stakeholders involved in the XAI/IML to improve accountability. The authors state that by answering the 6W questions, Why, Who, What, Where, When, and How, appropriate explanations would be generated.
Table 3, inspired by [
3,
51], shows the steps for designing XAI. For smart healthcare systems, stakeholders can be categorized into the following groups: creators (i.e., developers, testers, or cybersecurity experts), system operators who manage systems on a daily basis, and end users (i.e., doctors or nurses). The granularity of explanations should be different for these groups. For instance, system operators need a level of explanation that is higher than that needed by end users and slightly lower than that needed by the creators. Another reason for identifying XAI stakeholders is that XAI supports not only decision-making but also security and accountability. Thus, the purpose of adopting XAI differs across stakeholders.
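As a purely illustrative sketch (the group names, granularity levels, and purposes below are hypothetical and not taken from Table 3), such a stakeholder-to-granularity mapping could be encoded as follows:
```python
from enum import Enum

class Stakeholder(Enum):
    CREATOR = "developer / tester / cybersecurity expert"
    SYSTEM_OPERATOR = "manages the system on a daily basis"
    END_USER = "doctor / nurse"

# Coarse granularity levels (higher = more explanation detail exposed).
EXPLANATION_POLICY = {
    Stakeholder.CREATOR:         {"granularity": 3, "purpose": "debugging, security, accountability"},
    Stakeholder.SYSTEM_OPERATOR: {"granularity": 2, "purpose": "daily operation and monitoring"},
    Stakeholder.END_USER:        {"granularity": 1, "purpose": "decision support"},
}

def explanation_policy(who: Stakeholder) -> dict:
    """Return the explanation granularity and purpose for a stakeholder group."""
    return EXPLANATION_POLICY[who]

print(explanation_policy(Stakeholder.SYSTEM_OPERATOR))
```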
To summarize, the requirements for designing XAI/IML-enabled DT for smart healthcare systems from the cybersecurity experts’ perspective differ because the objective, granularity, and methods used for explanation differ. Also, the level of expertise required is different and thus calls for distinct explanations [
52]. Answering the 6W questions is important; existing literature often discusses XAI, in general, without differentiating the stakeholders. For example, ref. [
53] states that visual-based XAI methods could be useful for users with low AI/ML domain knowledge, whereas some of the existing XAI tools or libraries require a significant amount of technical knowledge.
6. Adversarial Attacks Against XAI
The aim of this section is to answer (RQ4) What are the potential adversarial attacks against XAI-enabled DT for healthcare systems? Despite the fact that XAI can play an important role in improving the trustworthiness, transparency, and security of DT-based healthcare systems, their explainability can also be used to compromise the system. It is important to consider vulnerability to cyber attacks for both the AI models deployed and the explainability part [
59]. Employing XAI increases the attack surface against smart healthcare. Falsifying the explainability can be a target for an attacker [
3]. Adversaries can modify explanations (i.e., post hoc) without affecting the model’s prediction, which may cause a stakeholder to make the wrong decision [
3]. Thus, designing proactive XAI-enabled DT for healthcare systems rather than traditional reactive systems is a necessity in an adversarial environment, where the arms race between system designers and adversaries is never-ending. Since reacting to detected attacks will never prevent future attacks, proactively anticipating adversaries’ activities enables the development of suitable defense methods before an attack occurs [
39]. This has motivated us to formulate an attack scenario against the XAI part of DT-based healthcare systems using existing proposed frameworks [
15,
38,
39,
60].
A taxonomy proposed in [
38,
39,
61] was adapted and updated to simulate a potential attack scenario against XAI algorithms, considering the following four axes (a minimal encoding of these axes is sketched in code after the list):
The first axis, which is the attack influence, concerns an adversary’s capability to influence an XAI’s predictions/explanations. The influence is causative if an adversary misleads the deployed XAI by contaminating (poisoning) the training dataset via the injection of carefully crafted samples (a.k.a. adversarial examples) into it. On the other hand, the influence is exploratory if an adversary gains knowledge about the deployed XAI to cause mis-explanation at the testing phase without influencing the training data.
The second axis describes the type of security violation committed by an adversary. The security violation can be regarded as an integrity violation if it enables an adversary to alter an XAI model’s explanation. Also, the attack can violate the XAI model’s availability if it creates a denial of service, where it prevents legitimate users from accessing the system’s explanation. Additionally, the security violation can be regarded as a privacy violation if it allows an adversary to gain unauthorized access to the XAI’s explanation.
The third axis, which was adapted from [
62], specifies the attackers’ target. Attackers can either target the explainability or prediction of the XAI model or both. In
I-attacks, attackers attempt to attack only the explanation without affecting the prediction of a deployed classifier. In
CI-attacks, attackers attempt to concurrently compromise the integrity of the classifier and explanation.
The fourth axis of the taxonomy refers to the specificity of an attack. It indicates how specific an adversary’s goal is. The attack specificity can be either targeted or indiscriminate, depending on whether the attack (1) causes the XAI model to misclassify a single or few instances, or (2) undermines the model’s performance on a larger set of instances.
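As referenced above, a minimal encoding of these four axes could look as follows; the enumeration names and example values are illustrative rather than a normative API.
```python
from enum import Enum

class Influence(Enum):          # first axis
    CAUSATIVE = "poison the training data with crafted samples"
    EXPLORATORY = "probe the deployed XAI at test time"

class Violation(Enum):          # second axis
    INTEGRITY = "alter the XAI model's explanation"
    AVAILABILITY = "deny legitimate users access to the explanation"
    PRIVACY = "gain unauthorized access to the explanation"

class Target(Enum):             # third axis
    I_ATTACK = "explanation only"
    CI_ATTACK = "classifier and explanation concurrently"

class Specificity(Enum):        # fourth axis
    TARGETED = "a single or few instances"
    INDISCRIMINATE = "a larger set of instances"

# An attack is then described by one value per axis, e.g. one possible reading
# of the label-flipping scenario examined later in this section:
label_flipping = (Influence.CAUSATIVE, Violation.INTEGRITY,
                  Target.CI_ATTACK, Specificity.TARGETED)
print(label_flipping)
```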
The existing literature on adversarial ML models provides different attack examples and defense methods for both adversarial attack types (causative and exploratory). Ref. [
15] presents a taxonomy of potential attacks against ML models (see
Table 4).
6.1. Threat Models
Threat modeling, which involves defining an adversary’s goal, knowledge, and capability [
38,
39,
61], is an important step toward identifying potential attack scenarios. The attacker’s goal can be based on the type of security violation, on the attack target, and on the attack specificity. For instance, the adversary’s goal could be to violate the XAI models’ integrity by manipulating a specific instance to cause a wrong explanation. An attacker’s level of knowledge about the XAI models varies and may include perfect knowledge (having full access to XAI’s explanation), limited knowledge (having limited access to XAI’s explanation), or zero knowledge (not having access to XAI’s explanation). An attacker’s capability can enable them to either influence training data (causative attack) or testing data (exploratory attack).
6.2. A Potential Attack Scenario
Here, an experiment that depicts a possible scenario of an adversarial attack against the XAI part of a DT-enabled healthcare system is discussed. One of the most common types of causative attack is a poisoning attack, in which an adversary contaminates the training dataset to affect the ML models’ output [
38]. A label-flipping attack, which is an example of a causative attack, was chosen for the experiment. In a label-flipping attack, an adversary flips the labels of some samples and then injects them into the training data. Different methods have been used to perform this attack in the literature, and the easiest is to randomly flip the labels of some samples that may be used for retraining. In [
63], it was shown that randomly flipping about 40% of the training data’s labels decreased the prediction accuracy of the deployed classifier. However, as the experiment in this current paper focuses on the XAI model, the model’s explanation was used to select samples to be flipped.
The settings of the attack scenario are as follows: the adversary’s goal is to violate the integrity and privacy of the XAI model, the attack target is CI, and the attack specificity can be either targeted or indiscriminate. Regarding the adversary’s capability, it is assumed that the adversary is capable of influencing the training data. The adversary’s knowledge is assumed to be perfect (white-box setting). Within such a scenario, a potential attack strategy is as follows (a code sketch of this strategy follows the list):
Adversaries can use the model’s explanation that either has been inferred through probing or gained via unauthorized access (privacy violation).
Depending on the knowledge that the adversary gains, they can select samples that need to be flipped in the training dataset.
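A hedged sketch of this strategy is given below; the helper name, the column names (following the Stroke Prediction Dataset used in Section 6.3), and the exact selection rule are illustrative assumptions rather than the precise procedure used in the experiments.
```python
import pandas as pd

def select_samples_to_flip(train_df: pd.DataFrame,
                           feature_importance: dict,
                           label_col: str = "stroke",
                           n_flip: int = 25) -> pd.Index:
    """Pick negative samples whose top-feature values contradict the positive class."""
    # Step 1: read the most important feature out of the leaked/probed explanation.
    top_feature = max(feature_importance, key=feature_importance.get)
    # Step 2: choose negatives with extreme values of that feature (e.g., very
    # young patients when the top feature is age), so the flipped labels inject
    # evidence that contradicts the original explanation.
    negatives = train_df[train_df[label_col] == 0]
    return negatives.sort_values(top_feature).index[:n_flip]

# Usage (the importance dictionary would come from the leaked explanation):
# idx = select_samples_to_flip(train_df, {"age": 0.42, "bmi": 0.18})
# train_df.loc[idx, "stroke"] = 1   # poisoned labels injected into the training data
```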
6.3. Experimental Settings
This section presents and discusses the experimental settings, results, and evaluation. Experiments were run on a Linux Ubuntu 18.04 LTS operating system with an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz × 12 and 983.4 GB of memory. A model built in (
https://www.kaggle.com/code/joshuaswords/predicting-a-stroke-shap-lime-explainer-eli5/notebook (accessed on 6 April 2024)) was used for implementation and a Stroke Prediction Dataset (
https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset (accessed on 6 April 2024)) was used. The dataset contains 5110 unique patient IDs with 12 attributes, such as gender, age, disease, and smoking status. Each row in the dataset, which provides relevant information about a patient, is labeled with one of two classes: 1 if the patient has had a stroke and 0 if not. Different ML algorithms were used for the experiments in this section.
6.3.1. Evaluation Metrics
The following evaluation metrics have been used to measure the models’ performance: accuracy, recall, precision, and F1 score. These metrics, along with their descriptions, are defined in
Table 5 [
15,
64,
65].
6.3.2. Models’ Design
The experiments conducted in this section focused on using the adopted models to predict whether a patient is likely to suffer a stroke based on 11 clinical features. The same settings used by the author of the adapted models were followed. First, the dataset was processed to find missing values and explore the features. Then, the dataset was split into training and testing sets for model preparation. Also, SMOTE (Synthetic Minority Over-sampling Technique) was used to balance the dataset. Three algorithms, Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR), were selected for the classification task. After building the models, 10-fold cross-validation was used for evaluation. The results in
Table 6,
Table 7 and
Table 8 show that Random Forest performed the best, with 88% accuracy.
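The following condensed sketch reproduces this setup under stated assumptions: the Kaggle CSV file name, the median imputation of bmi, and the one-hot encoding of categorical features are assumptions rather than details reported above, and exact scores will differ from those in the tables.
```python
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

df = pd.read_csv("healthcare-dataset-stroke-data.csv")      # assumed file name
df["bmi"] = df["bmi"].fillna(df["bmi"].median())             # handle missing values
X = pd.get_dummies(df.drop(columns=["id", "stroke"]))        # one-hot encode categoricals
y = df["stroke"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)   # balance classes
X_bal = pd.DataFrame(X_bal, columns=X_train.columns)         # keep column names for the XAI step

models = {
    "RF": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X_bal, y_bal, cv=10, scoring="accuracy")
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")

rf = models["RF"].fit(X_bal, y_bal)   # retained for the explanation step below
```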
6.3.3. Models’ Explainability
To explain the results of the selected model (i.e., RF), a feature-importance analysis and three explanation methods were used. First, the RF feature importance was plotted, and the result shows that age is by far the most important feature used by the model (see
Figure 3).
Then, three popular XAI frameworks, namely, SHAP [
66], LIME [
67], ELI5 [
68] were used. SHAP (SHapley Additive exPlanations) is a tool for determining the contribution of each feature to the model’s prediction. LIME (Local Interpretable Model-agnostic Explanations) was used to interpret the prediction of the model on a single instance. ELI5 stands for “explain like I am 5” and aims to explain the prediction of any model. The results in
Figure 4,
Table 9 and
Table 10 show that SHAP and ELI5 (i.e., global explainers) rank age and then bmi as the most important features. On the other hand, LIME, which is a local explainer, ranks gender and bmi as the most important features.
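A condensed sketch of these explanation steps is shown below, reusing `rf`, `X_bal`, and `X_test` from the previous sketch; exact SHAP output shapes and library APIs vary across versions, so this is illustrative rather than the exact code used in the experiments.
```python
import numpy as np
import pandas as pd
import shap
import eli5
from lime.lime_tabular import LimeTabularExplainer

# (1) Built-in RF feature importance (global).
print(pd.Series(rf.feature_importances_, index=X_bal.columns)
        .sort_values(ascending=False).head())

# (2) SHAP global ranking: mean absolute SHAP value per feature for the stroke class.
sv = shap.TreeExplainer(rf).shap_values(X_test.iloc[:200])
sv_pos = sv[1] if isinstance(sv, list) else sv[..., 1]      # shape depends on SHAP version
print(pd.Series(np.abs(sv_pos).mean(axis=0), index=X_test.columns)
        .sort_values(ascending=False).head())

# (3) LIME local explanation of a single prediction.
lime_exp = LimeTabularExplainer(X_bal.values, feature_names=list(X_bal.columns),
                                class_names=["no stroke", "stroke"], mode="classification")
print(lime_exp.explain_instance(X_test.iloc[0].values, rf.predict_proba,
                                num_features=5).as_list())

# (4) ELI5 global weights.
print(eli5.format_as_text(eli5.explain_weights(rf, feature_names=list(X_bal.columns))))
```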
6.3.4. A Simulation of Potential Adversarial Attack
To simulate a label flipping attack as discussed in
Section 6.2, labels amounting to 10% of class 1 (25 out of 250) were flipped by considering the most important feature (i.e., age): labels of patients aged 2 were flipped to 1, thus labeling 2-year-old patients as being at risk of suffering a stroke. The performance of the three ML models was then evaluated on the contaminated dataset. The results in
Table 11,
Table 12 and
Table 13 show that the performance of SVM and LR was slightly degraded, but it remained almost the same for RF. However, the results of the four explanation methods show that the label-flipping attack affected the explanations. Although the literature suggests that about 40% of the training data labels need to be flipped to decrease accuracy, flipping only 10% degraded the performance and explanations of the adopted models in this paper.
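The following sketch illustrates how such a flip-and-retrain experiment could be reproduced, reusing `df`, `rf`, `X_bal`, `X_test`, and the imports from the earlier sketches; the age filter and the cap of 25 flipped labels follow the textual description above and are assumptions about the exact selection rule.
```python
# Flip the labels of very young, stroke-negative patients (the paper flips
# 2-year-olds; at most 25 labels), then retrain and re-explain.
poisoned = df.copy()
flip_idx = poisoned[(poisoned["age"] <= 2) & (poisoned["stroke"] == 0)].index[:25]
poisoned.loc[flip_idx, "stroke"] = 1

Xp = pd.get_dummies(poisoned.drop(columns=["id", "stroke"]))
yp = poisoned["stroke"]
Xp_tr, Xp_te, yp_tr, yp_te = train_test_split(Xp, yp, test_size=0.2,
                                              stratify=yp, random_state=42)
Xp_bal, yp_bal = SMOTE(random_state=42).fit_resample(Xp_tr, yp_tr)
Xp_bal = pd.DataFrame(Xp_bal, columns=Xp_tr.columns)

rf_poisoned = RandomForestClassifier(random_state=42).fit(Xp_bal, yp_bal)
print("Accuracy on the clean test split:", rf_poisoned.score(X_test, y_test))

# Compare the global feature-importance rankings before and after poisoning.
clean_rank = pd.Series(rf.feature_importances_, index=X_bal.columns).sort_values(ascending=False)
poisoned_rank = pd.Series(rf_poisoned.feature_importances_,
                          index=Xp_bal.columns).sort_values(ascending=False)
print(clean_rank.head(), poisoned_rank.head(), sep="\n\n")
```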
Figure 5 shows that the feature importance order has changed, as BMI is replaced by avg_glucose_level. Also,
Figure 6 depicts an order change in the feature importance for the SHAP explainer: gender becomes the second most important feature instead of BMI. The LIME explainer (
Table 14) ranks gender and work_type as the most important features, which is different from the original ranking. Moreover, work_type replaces BMI in the ELI5 explanation (
Table 15).
7. Discussion
One of the challenges in designing an adversary-aware XAI is that explainability is much more difficult to measure than accuracy or performance, as the level of an explanation is subjective [
69]. Also, ref. [
59] adds that the format of the explanation output is as crucial as the XAI output itself and could have a strong influence on certain users. For example, a text-based explanation is commonly used in the Natural Language Processing (NLP) field, whereas visualized explanation approaches are used in healthcare. Also, the explainability of the deployed model can reveal valuable information to adversaries. To attack ML-based models that do not provide an explanation, adversaries need to probe the model to collect information. However, XAI produces helpful information for adversaries. One of the solutions proposed in the literature is to provide stakeholders with the minimum level of explanation that enables them to achieve their goals. Model explanations should be categorized according to the stakeholders’ privileges. The models’ explanations (i.e., pre-model and in-model) need to be encrypted, and the model should not reveal detailed information about features and algorithms. It can provide users with a warning message that consists of a prediction description and an explanation of the hazards and consequences (i.e., post-model) [
62]. However, users with higher privileges can query the model to access the encrypted detailed explanation. Another defensive method is encrypting the explanation, so that only stakeholders with a high level of privileges can access it. Moreover, delaying the explanation availability can disrupt the adversaries’ attack process [
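A hypothetical sketch of this privilege-based, encrypted release of explanations is given below; the role names, message text, and the use of Fernet symmetric encryption are illustrative choices, not part of the proposed framework.
```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()            # held by the XAI service, not by end users
vault = Fernet(key)

# Detailed (pre-/in-model) explanation stored only in encrypted form.
detailed_explanation = b"feature importances: age=0.42, bmi=0.18, hypertension=0.09"
encrypted_explanation = vault.encrypt(detailed_explanation)

HIGH_PRIVILEGE_ROLES = {"creator", "system_operator"}

def release_explanation(role: str, prediction: str) -> str:
    """Release either the decrypted detail or a post-model warning message."""
    if role in HIGH_PRIVILEGE_ROLES:
        return vault.decrypt(encrypted_explanation).decode()
    # Low-privilege end users only see a prediction description and a warning.
    return (f"Prediction: {prediction}. Please review the flagged risk with a "
            f"specialist; detailed model explanations are restricted.")

print(release_explanation("end_user", "high stroke risk"))
print(release_explanation("system_operator", "high stroke risk"))
```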
3].
The proposed human-in-the-loop layer between layer 3 and layer 5 is needed for verification. The explanation can be used by system administrators to detect adversarial examples (i.e., an explanation-based security approach). For example, if the deployed XAI predicts that a patient has high blood pressure, the explanation (i.e., the feature importance) needs to be evaluated. Also, using an ensemble of XAI tools for adversarial example detection can help with the verification task before layer 5 by alerting administrators in the case of an explanation disagreement. For this solution to work, system designers and medical professionals need to design explanation messages based on the XAI features. Some features indicate certain types of disease; consequently, a sudden change in such features, or a disagreement between the feature importance and the explanation message, is regarded as a sign of an adversarial attack against the XAI’s explanation.
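One possible realization of this disagreement check, assuming a scikit-learn model and using impurity-based versus permutation-based importances as two "explainers" (the threshold value is an illustrative choice), is sketched below:
```python
import pandas as pd
from scipy.stats import kendalltau
from sklearn.inspection import permutation_importance

def explanation_disagreement(model, X_val, y_val, threshold=0.5) -> bool:
    """Flag a possible adversarial attack when two explainers disagree on feature ranks."""
    imp_a = pd.Series(model.feature_importances_, index=X_val.columns)
    perm = permutation_importance(model, X_val, y_val, n_repeats=5, random_state=0)
    imp_b = pd.Series(perm.importances_mean, index=X_val.columns)
    tau, _ = kendalltau(imp_a.rank(), imp_b.rank())   # rank correlation of the two rankings
    return tau < threshold                            # True -> disagreement, alert the administrator

# Usage with the model and held-out split from the earlier sketches:
# if explanation_disagreement(rf, X_test, y_test):
#     print("Explanation disagreement detected - possible adversarial activity.")
```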
Grey-box and white-box attacks, such as the label-flipping attack presented earlier, can be successful against XAI. Designing an adversary-aware XAI-based DT healthcare system that considers the adaptability of the deployed model is important, as an increase in the false positive and false negative rates can lead to life-threatening consequences. Also, “the life-critical nature of healthcare applications demands that the developed ML systems should be safe and robust and should remain robust over time” [
70]. Consequently, the DT framework proposed in this current paper uses an ensemble of XAI tools to help detect adversarial examples by considering the disagreement between the deployed classifiers and using the detected adversarial examples to update the framework (i.e., adaptability). Another important point that needs to be considered at an early stage of designing an XAI DT-enabled healthcare system is involving humans in data labeling as data generated by artifacts (i.e., IoT, cameras, robots) lacks context. Manual data labeling is about adding extra information or metadata to a piece of data [
16]. Metadata increases the explainability of AI and helps with feature engineering, detecting incorrect labels, and detecting adversarial examples.
Additionally, one of the promising solutions for protecting privacy in smart-health systems, such as XAI-enabled DT, is federated learning (FL) [
71]. It enables models to be trained in a distributed training fashion by averaging the local model updates from multiple IoMT networks without accessing the local data [
9]. As a result, potential risks of gaining unauthorized access to the dataset used to train the models can be mitigated. However, this does not protect against the disclosure of explanations.
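A minimal sketch of this federated averaging idea (plain NumPy, no specific FL framework assumed; the hospital counts and parameter values are illustrative) is as follows:
```python
import numpy as np

def federated_average(local_weight_sets, client_sizes):
    """FedAvg-style weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(local_weight_sets, client_sizes))

# Example: three hospitals with different amounts of local data; only the model
# parameters (not the raw patient records) leave each IoMT network.
hospital_weights = [np.array([0.20, 1.10]), np.array([0.40, 0.90]), np.array([0.30, 1.00])]
hospital_sizes = [500, 1200, 800]
print(federated_average(hospital_weights, hospital_sizes))
```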