4.2.1. Current Trends of Game-Theory Approaches in APT Detection and Prevention
The current trends of game-theory approaches for addressing APTs are summarized in
Table 6. Overall, the approaches were diverse and provided significant coverage of the different APT studies. Most of the studies focused on the objective of APT detection through increased awareness, including [
26,
32,
33,
34,
36,
38,
39,
40,
56,
57,
58,
59,
60,
61,
62,
63,
64,
65,
66,
67,
68,
69,
70,
71,
72,
73,
74,
75,
76,
77,
78,
79,
80,
81,
82,
83,
84,
85,
86,
87,
88,
89,
90]. Most of them adopted multi-stage game processes to model the adversarial interactions between the attackers and the defenders (or multiple entities) and differentiated between different levels (or phases) and timing (or circumstances) [
56,
57,
58,
63,
68,
71,
73,
81,
84,
87,
88,
90].
In addition, some studies adopted mixed strategies to solve these game processes and reach an optimal payoff equilibrium, depending upon the perspective of the game state (defender or attacker point-of-view) by improving the possible outcomes through randomizing the moves made [
56,
57,
81,
84,
90]; however, some studies employed pure strategies when the best payoffs were the only options determined to achieve the maximum profit or the best outcome [
87,
88].
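The difference between pure and mixed strategies described above can be illustrated with a minimal sketch for a 2×2 zero-sum attacker–defender game. The payoff numbers and the function name `solve_2x2_zero_sum` are illustrative assumptions and are not taken from any of the reviewed studies; the indifference conditions used are the standard textbook ones for 2×2 zero-sum games.

```python
from fractions import Fraction


def solve_2x2_zero_sum(payoff):
    """Solve a 2x2 zero-sum game given the row player's payoff matrix.

    Returns (p, q, value), where p is the probability that the row player
    (e.g., the defender) plays row 0, q is the probability that the column
    player (e.g., the attacker) plays column 0, and value is the row
    player's expected payoff at equilibrium.
    """
    (a, b), (c, d) = payoff

    # A pure-strategy equilibrium (saddle point) exists when maximin == minimax.
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        i = row_mins.index(maximin)
        j = col_maxs.index(minimax)
        return Fraction(1 - i), Fraction(1 - j), Fraction(maximin)

    # Otherwise each player randomizes so that the opponent is indifferent.
    denom = Fraction(a - b - c + d)
    p = Fraction(d - c) / denom
    q = Fraction(d - b) / denom
    value = Fraction(a * d - b * c) / denom
    return p, q, value


# Hypothetical defender payoffs: rows = monitor host A / host B,
# columns = attacker targets host A / host B.
p, q, value = solve_2x2_zero_sum([[2, -3], [-1, 1]])
print(p, q, value)
```

In this toy game no pure strategy is stable, so the defender randomizes; when a saddle point exists, the same function returns a pure strategy (a probability of 0 or 1).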
Furthermore, many reviewed articles focused on identifying and detecting a potential vulnerability in their study domain (or industrial application) against APTs [
33,
38,
56,
58,
67,
71,
72,
75,
78,
79,
80,
83,
85,
88,
90,
91,
94,
95], whereas some more recent ones concurrently provided improvement techniques on their APT detection method to overcome these vulnerabilities [
33,
72,
88,
90,
94]. While pure and mixed strategies were among those most used by researchers throughout the years (
Figure 6), they also tended to be unrealistic and impractical because APTs are stealthy and human-driven (e.g., irrational, unpredictable, and adaptive). Therefore, researchers have adopted different strategies to further enhance their game-theory approaches against APTs.
Before the 2000s, a prominent approach was the Gestalt strategy, which was inspired by the psychological concept that a being is more than the sum of its experiences [
96]. Such a strategy allowed for optimizing the equilibrium of a system with multiple phases (temporal) and stages (spatial) simultaneously in a way that was determined elegantly and holistically [
38,
58]. The strategy was also related to multi-layered games, and it required solving each game optimally, given the results of the other games, without the need to analyze one large game [
33]. These multiple games or combinations of sub-games could be structural variants of games (i.e., signaling games and FlipIt games) that ensured that plug-and-play solutions were widely available.
Furthermore, several studies adopted hybrid strategies to address high-risk conditions in APTs [
63,
71,
83,
94]. This strategy involved judging the availability of multiple information sources (i.e., multiple levels and phases), ensuring different probabilities of choice [
63] and managing the diversity of capabilities and opportunities [
71], as well as belief systems [
94]. In addition, when a critical resource was finite (e.g., time or cost), hybrid strategies offered a vital advantage with respect to the timing of their deployment: for instance, moves could be made at fixed intervals (a “periodic” strategy) or at exponentially distributed intervals (an “exponential” strategy). Under such conditions, specific decision-making could be costly over time and disincentivize adverse intentions [
83]. Moreover, the hybrid strategy also provided diversity in resolving APT problems when conflicting information or uncertainty was present [
94].
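The two timing strategies named above can be sketched as move schedules. This is a minimal illustration that assumes, as in FlipIt-style games, that a "periodic" strategy moves at fixed intervals and an "exponential" strategy draws the gaps between moves from an exponential distribution; the function names and parameters are hypothetical.

```python
import random


def periodic_moves(period, horizon):
    """Move times at fixed intervals: period, 2*period, ... up to the horizon."""
    if period <= 0:
        raise ValueError("period must be positive")
    times = []
    k = 1
    while k * period <= horizon:
        times.append(k * period)
        k += 1
    return times


def exponential_moves(rate, horizon, seed=0):
    """Move times whose gaps are exponentially distributed with the given rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


print(periodic_moves(2.0, 7.0))
print(exponential_moves(0.5, 20.0, seed=1))
```

Both schedules have the same average gap when `rate = 1 / period`, but the periodic schedule can be anticipated by an opponent who observes one move, whereas exponentially distributed gaps are memoryless.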
In recent years, Stackelberg’s strategy was among the most popular strategies adopted in game-theory approaches to address APT problems [
56,
70,
72,
82,
90,
95]. It is based on the strategic leadership model in economics, where a leader and a follower compete on quantity by moving sequentially (sometimes referred to as the “market leader”). The purpose was to focus on minimizing potential loss or maximizing the payoff [
70,
72] when faced with an asymmetric information structure (i.e., stealthy attacks) by using randomized policies [
56,
72,
90] and optimally allocating critical resources and judgments [
82,
90]. Moreover, the presence of a leader allowed for the discovery of high rewards that would not affect the equilibrium [
56].
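The leader–follower structure can be illustrated with a minimal sketch of a two-target security game in which the defender (leader) commits to a randomized coverage and the attacker (follower) observes it and best-responds. The payoff model, the grid search, and the name `stackelberg_coverage` are illustrative assumptions and do not reproduce the formulation of any reviewed study.

```python
from fractions import Fraction


def stackelberg_coverage(att_gain, def_loss, steps=1000):
    """Grid-search the leader's coverage probability for target 0.

    Target 1 receives the remaining coverage. An attack succeeds only on an
    uncovered target, yielding att_gain[i] to the attacker and costing the
    defender def_loss[i]. Returns (coverage of target 0, defender utility).
    """
    best = None
    for k in range(steps + 1):
        c0 = Fraction(k, steps)
        cover = (c0, 1 - c0)
        # Follower: expected gain of attacking each target under this commitment.
        att_u = [(1 - cover[i]) * att_gain[i] for i in range(2)]
        top = max(att_u)
        # Leader's utility for each follower best response; ties are broken
        # in the leader's favour (the usual "strong" Stackelberg convention).
        candidates = [-(1 - cover[i]) * def_loss[i]
                      for i in range(2) if att_u[i] == top]
        def_u = max(candidates)
        if best is None or def_u > best[1]:
            best = (c0, def_u)
    return best


# Hypothetical values: target 0 is four times as valuable as target 1.
print(stackelberg_coverage(att_gain=[4, 1], def_loss=[4, 1]))
```

The leader ends up protecting the more valuable target more often, but not always, so that the follower gains nothing by switching targets.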
Considering the game processes of the game-theory approaches adopted against the APTs, seven distinct process types were identified from the reviewed articles (
Figure 7): static, dynamic, Markov, Bayesian, signaling, Gaussian, and stochastic (refer to
Section 2.2.3 for details). Among them, the Bayesian and stochastic game processes were frequently adopted, accounting for about 29% and 23% of the reviewed articles, respectively. Such a condition was the norm when addressing APTs, which dealt with incomplete information and uncertain game states in an adversarial situation.
Moreover, this trend was followed by Markov and signaling game processes, which accounted for about 25% of the reviewed articles. This reflected the tendency of research articles to incorporate such game processes where specific situations depended on previous states (decision chains) or triggers (signals) to effectively overcome APTs. Nevertheless, dynamic game types remained a prominent game-theory approach when dealing with APTs, since those variations (Bayesian, Markov, signaling, and stochastic) were introduced to fit particular security needs and requirements.
Moreover, several value-adding concepts featured prominently in the final 48 articles considered in this review. For instance, bounded rationality was adopted by several researchers [
34,
36,
57,
76,
82,
86], which imitates the decision-making of agents who have limited rationality and information available in a given adverse situation (both in capacity and time) in order to make satisfactory choices. Furthermore, some studies [
36,
59,
66,
76,
91] used prospect theory to imitate the actual behaviors of two opposing parties in a variety of application domains. These were among the approaches that considered the practical behaviors of the involved parties when operating with limited information (or uncertainty) and overcoming (or valuing) risk aversion.
Another perspective of the study involved deterring adverse situations by being deceptive—called defense deception. Defense deception utilizes several intuitive techniques to mislead attackers. For instance, lying costs were optimized to determine the privacy of partial information (i.e., revealing cues via “cheap-talking”) [
65,
77], thereby, reducing the misalignment between different incentives [
65,
81], providing an alternative perspective that tilts the information asymmetry (or cognitive bias) [
39,
86,
90,
94], encouraging more conservative behaviors instead of aggressive ones [
86,
94], and discovering novel rules to achieve a better trade-off between security and usability [
39,
86,
90].
Several researchers adopted differential games to sufficiently describe and analyze the dynamic process of adversarial conditions (i.e., attack versus defense). This condition is particularly critical when operating with incomplete information since capturing the system information is complex and may cause many uncertain factors that result in random changes to the system state or strategy [
61,
66,
87].
Some researchers described it as a state-evolution model [
66,
70], where random factors influence and change in intensity [
87]. Differential games also rely on optimal control principles to attain either the global [
70,
80] or the saddle-point [
87] of the system equilibrium. Typically, control and payoff functions are integrated to describe the algorithmic selection of real-time responses [
87], particularly when associated with shifting resource vulnerabilities on a variety of attack surfaces (cf. moving target defense with different spatial dimensions and informational elements [
88]).
Huang et al. [
61] constructed a multi-stage and multi-state Markov differential game model to analyze real-time attack–defense behaviors and resolve the persistent defense decision-making challenge by calculating the strategy control function over time, thus, allowing for a more guiding role in the timeliness of the decision-making. Furthermore, a differential dynamical system was introduced to protect cloud storage systems [
66] and enterprise group systems [
80] while mitigating loss and capturing the time-variance in confrontations between the attackers and defenders.
Moreover, Yang et al. [
64] and Yang et al. [
70] modeled an APT repair problem as a differential Nash-equilibrium game (the attacker attempted to maximize the potential benefits, while the defender mitigated the potential losses) using an epidemic model based on three practical situations: time-varying communication, the lateral movements of APTs, and the changes of attack–repair strategies over time. Subsequently, the key to solving a differential game was deriving the optimal system by using the associated Hamiltonian principles and determining the Nash equilibrium conditions for the game [
79,
80].
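The Hamiltonian-based solution procedure mentioned above can be written out for a generic two-player differential game. The following is a sketch of the standard open-loop Nash necessary conditions, not the specific epidemic model of the cited studies; the symbols (state x, attacker control u_A, defender control u_D, dynamics f, running payoffs g_i, costates λ_i, horizon T) are notational assumptions, and each player is taken to maximize its own payoff (the defender's payoff being its negative loss).

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), u_A(t), u_D(t), t\bigr), \quad x(0) = x_0, \\
J_i(u_A, u_D) &= \int_0^T g_i\bigl(x(t), u_A(t), u_D(t), t\bigr)\,dt, \quad i \in \{A, D\}, \\
H_i(x, u_A, u_D, \lambda_i, t) &= g_i(x, u_A, u_D, t) + \lambda_i^{\top} f(x, u_A, u_D, t), \\
u_A^*(t) &\in \arg\max_{u_A} H_A\bigl(x^*(t), u_A, u_D^*(t), \lambda_A(t), t\bigr), \\
u_D^*(t) &\in \arg\max_{u_D} H_D\bigl(x^*(t), u_A^*(t), u_D, \lambda_D(t), t\bigr), \\
\dot{\lambda}_i(t) &= -\frac{\partial H_i}{\partial x}\bigl(x^*(t), u_A^*(t), u_D^*(t), \lambda_i(t), t\bigr), \quad \lambda_i(T) = 0.
\end{aligned}
```

The state equation, the two maximization conditions, and the two costate equations together form the "optimal system"; a strategy pair that satisfies it is a candidate open-loop Nash equilibrium.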
It is worth noting that, among the game-theory approaches adopted for the defense against and detection of APTs, AI techniques were rarely adopted, even in recent years. Only two of the 48 final articles reviewed adopted AI techniques to address specific sub-problems of the game-theory approaches. For example, Laszka et al. [
62] adopted a simulated annealing algorithm to find a near-optimal detector configuration to mitigate the attackers’ action in an intelligent traffic control game model.
Li et al. [
95] incorporated a deep reinforcement-learning technique based on convolutional neural networks and information-rich features to proactively address APTs. Nevertheless, both AI techniques required data availability and were applied within a very narrow scope of their respective uses, which can be counter-intuitive in the APT scenario. As game theory provides a framework capable of proactive defense and of detecting the uncertainty caused by APTs [
39], AI technique integration with game theory could potentially enhance the system’s capability while mitigating adverse conditions, thus, providing a fertile topic for future investigation.
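As an illustration of how a metaheuristic can serve a game-theoretic sub-problem, the following minimal sketch applies simulated annealing to a toy detector-configuration problem. It is not the model of Laszka et al.: the loss function, the damage and cost values, and all names are hypothetical, and the attacker's response is collapsed into a fixed loss for brevity.

```python
import math
import random


def simulated_annealing(loss, initial, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimize loss(config) over binary vectors by flipping one bit per step."""
    rng = random.Random(seed)
    current = list(initial)
    current_loss = loss(current)
    best, best_loss = list(current), current_loss
    temperature = t0
    for _ in range(steps):
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = 1 - candidate[i]  # switch one detector on or off
        cand_loss = loss(candidate)
        delta = cand_loss - current_loss
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_loss = candidate, cand_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
        temperature *= cooling
    return best, best_loss


# Hypothetical toy loss: undetected damage per location plus a fixed
# deployment cost for every active detector.
DAMAGE = [5.0, 1.0, 4.0, 0.5, 3.0]
COST = 2.0


def toy_loss(config):
    undetected = sum(d * (1 - c) for d, c in zip(DAMAGE, config))
    return undetected + COST * sum(config)


print(simulated_annealing(toy_loss, [0, 0, 0, 0, 0]))
```

In a full game model, `loss` would instead evaluate the attacker's best response to each candidate configuration, which is what makes exhaustive search expensive and a near-optimal heuristic attractive.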
4.2.2. Challenges and Benefits of Adopting Game-Theory Approaches in APT Detection and Prevention
The challenges of game-theory approaches for addressing APTs are summarized in
Table 7. The major challenge found among the considered articles was the asymmetricity of the informational structure between an attacker and a defender. Such a situation allowed for risk or uncertainty to be managed and modeled as probability distributions instead of real-value payoffs [
73,
92]. Such asymmetricity could also be dependent upon the types of devices [
32], the types of attacker influence [
59,
60], and the dynamic interactions between different stages or phases of a modeled game structure [
38,
73,
89]. However, this asymmetric information focused on the context of the detection, which was related to the weighting or the valuation of compromising those informational structures, which is typically costly at the operational and tactical levels of an organization.
Another challenge to asymmetricity in informational structures that is also influential in the context of defense against APTs was generally identified as defense deception. Such situations involved the fusion of different data sources [
63,
75,
84,
94]; the withholding of specifics about data (differential privacy) [
77]; delay tactics by repairing [
64]; a multi-level interdependent mechanism that validates/justifies another [
33,
34,
71,
84,
94]; the exposure of a potential perpetrator by leaking some evidence [
65] or a stage-wise judicial decision [
39,
81]; the restriction of resources to starve a potential threat [
72]; and the corroboration of interaction levels with other factors [
76,
80,
85,
95]. These defense-deception mechanisms demonstrated the level of complexity involved in defending against APTs, where the disadvantages of the attackers in the form of missing or incomplete information are used for the defender’s advantage via passive [
63,
65,
75,
77,
84,
94] or proactive responses [
33,
34,
39,
64,
71,
72,
76,
80,
81,
84,
85,
94,
95].
Furthermore, other studies addressed the challenge of asymmetricity in the informational structures by focusing on optimal defense strategies that mirrored the attackers [
56,
67,
78,
80]; characterizing the best response [
34,
72,
85,
86,
88]; delaying the attackers by employing a honeypot (decoy mechanism) [
36,
68,
76,
81,
90]; managing the expected loss through dynamic recovery [
66,
70] or cyber-insurance [
82]; tagging data to identify suspicious information flows [
69]; characterizing the signature or profile of the attack [
40,
75,
78]; and managing misconceptions (i.e., suspicions of a nonexistent attack) [
74,
90].
This challenge involved subjective aspects of the threat, where both the defender and attacker could adopt progressive, aggressive, or conservative strategies, depending on the real-time situation at various stages of the game model. Such modeling has successfully defended against APTs but requires careful consideration of the trade-offs between practicality and performance, particularly when mitigating insider threats (where they could access privileged information for financial gain) [
66].
Other challenges that consider the trade-offs between the two metrics were cost offshoot and micro-management, where unexpected or compounding effects of the latter were caused by the former. In contrast, the latter influenced the affinity of the former. For example, the attacker’s cost overestimation could be the reason for such a condition due to the defender’s uncertain scanning intervals [
91,
93] or balancing defense decision-making against the state randomness of system security [
61]. Furthermore, the defender could also vary its strategies (i.e., periodic strategy) and estimate the compromise probability based on control incentives [
58].
Another condition was the integration of security vulnerability quantification, involving an attack tree, the common vulnerability scoring system (CVSS), and game-theory approaches to provide objective evaluations while anticipating and preparing countermeasures against an adversary [
26]. In addition to complementing the limitations of the approaches, managing them requires specific calibrations relative to the identified vulnerability to avoid the over-fitting of anticipated countermeasures, which could lead to a predictable routine.
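One way to picture the combination of an attack tree with CVSS scores is to aggregate leaf-level likelihoods up the tree. The sketch below is illustrative only: mapping a CVSS base score to a success likelihood by dividing by 10, the assumption of independent steps, and the dictionary layout are all assumptions rather than the method of the cited study.

```python
import math


def success_probability(node):
    """Aggregate attack-success likelihood over an AND/OR attack tree."""
    kind = node["type"]
    if kind == "leaf":
        # Assumption: treat the CVSS base score (0-10) as a likelihood in [0, 1].
        return node["cvss"] / 10.0
    children = [success_probability(child) for child in node["children"]]
    if kind == "and":
        # Every step must succeed (steps assumed independent).
        return math.prod(children)
    if kind == "or":
        # At least one alternative must succeed.
        return 1.0 - math.prod(1.0 - p for p in children)
    raise ValueError(f"unknown node type: {kind}")


# Hypothetical tree: (phishing AND privilege escalation) OR exposed service.
tree = {
    "type": "or",
    "children": [
        {"type": "and", "children": [
            {"type": "leaf", "cvss": 5.0},
            {"type": "leaf", "cvss": 8.0},
        ]},
        {"type": "leaf", "cvss": 2.0},
    ],
}
print(success_probability(tree))
```

The resulting likelihood could then feed the attacker's payoff in a game model, which is where the calibration concern raised above becomes relevant.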
Other challenges were related to the adoption of a periodic strategy, where the time between consecutive moves in the game-theory model was constant; therefore, the player “utility” of the game (attacker or defender) relies on the gains and costs of the available resources over time, which requires characterizing and anticipating optimal strategies to fend off the threat [
83]. In contrast, random disturbances and stochastic times between consecutive moves in the game-theory model require maximizing the defense effectiveness and minimizing the cost.
This situation required efficient management that could be sensitive to any perturbation [
61,
87]. Another aspect of the sensitivity that is challenging when related to APTs is choosing an optimal defense strategy when the strategy space is massive and computationally exhaustive due to the sheer number of interactive elements within the system (i.e., the IoT, transportation, and network node security) [
62,
79,
94,
95]. Such a situation is similar to the cost-offshoot challenge and could further perpetuate the APT detection and defense complexity.
In contrast, several benefits have been gained by adopting game-theory approaches for addressing APTs as summarized in
Table 8. One major benefit gained from game-theory approaches was through their stability and reliability in the intended system, where the defense performance could be efficiently determined [
26,
38,
87] while capturing the essence of the coordinated attacks [
57] and providing locally asymptotically stable points of the game states [
87,
93].
Furthermore, the timeliness of decision-making and the scope of the application were improved by Huang et al. [
61] through the application of the Markov decision-making method in a continuous multi-dimensional phase space with problem variations while considering payoff discounting. Furthermore, Zhu and Rass [
38] offered a reasoning approach that systematized the whole system based on local security assessments and defined scores that could be tailored to specific contexts, leading to enhanced timeliness and a substantial increase in reliability.
In another study, decision probability played a crucial role in addressing an APT, as the computational performance was advantageous when the size of the strategy space was computationally expensive [
62], the decisions were multi-layered [
65], and random disturbances were considered [
87]. In other words, it was advantageous to identify and determine the best strategies to overcome the threat before it occurred [
61,
62].
The general pattern of detecting a malicious entity can be to fuse data from different sources to compute a comprehensive payoff and optimally allocate constrained secure resources [
63], allocate the available repair (or recovery) resources to mitigate potential losses [
64,
79], predict risk compensation via the effect of insurance [
73,
82], select suitable defense timing [
56,
75,
76,
88], tag sensitive information flows [
69], or actively expose vulnerabilities (e.g., a honeypot) [
36] while conducting an objective assessment on the expected state of the target system [
33,
40,
64,
66,
75,
79].
An interdependent model of trust management decisions that involve multi-layered optimization provided structural reliability when multiple or dynamic games were considered to cater to the diverse possibilities of APTs [
33] and did not always strive to eliminate leakage when deception cues could be used as deterrents [
65]. Hu et al. [
34] introduced a more generalized approach that involved different social players by measuring and quantifying their rational degrees and simulating their growth to reflect the randomness and inertia of the population’s social behaviors for realistic attackers and defenders.
Seo and Kim [
90] investigated defender-deception methods (e.g., honeypots and decoys that induced cognitive bias and induction) that were formulated for the scenario and attributed to the secure dominant organizational share. They presented an optimal strategy that minimized the performance degradation and maximized efficiency while constructing a deceptive container-management plan that yielded the highest defense and the lowest cost for defenders with limited utilities.
Another approach to decision probability incorporated novel metrics to improve the benefits of being reliable and objective in APT defense. For instance, Wang et al. [
67] introduced the concept of defense effectiveness that quantified the impact of a defense strategy against an attack strategy when both sides had reached a balanced state, which was based on the prior belief and payoff of the defender when selecting the optimal defense strategy. Another study by Horák et al. [
68] modeled the uncertainty of the defender using beliefs that mapped onto the probability distribution over the subsets of the possible security states.
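The belief-based modeling of defender uncertainty can be illustrated with a plain Bayesian update over security states. This is a simplified sketch: it tracks beliefs over individual states rather than subsets of states, and the state names, prior, and alert likelihoods are hypothetical.

```python
from fractions import Fraction


def update_belief(prior, likelihood, observation):
    """Return the posterior belief over states after one observation.

    prior: dict mapping state -> probability.
    likelihood: dict mapping state -> dict mapping observation -> probability.
    """
    unnormalized = {s: prior[s] * likelihood[s][observation] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical numbers: the defender initially believes the system is
# compromised with probability 1/10 and then observes an alert.
prior = {"secure": Fraction(9, 10), "compromised": Fraction(1, 10)}
likelihood = {
    "secure": {"alert": Fraction(1, 10), "quiet": Fraction(9, 10)},
    "compromised": {"alert": Fraction(8, 10), "quiet": Fraction(2, 10)},
}
posterior = update_belief(prior, likelihood, "alert")
print(posterior["compromised"])  # 8/17
```

A single alert raises the defender's belief in a compromise from 1/10 to 8/17, which is the quantity a belief-based defense strategy would condition on.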
Zhang et al. [
72] provided a response characterization in more general settings where the challenges faced by the defender and attacker were a continuous convex optimization and a fractional-knapsack problem, respectively, resulting in an effortless determination of the response strategy. These conditions were also applicable when resources were limited [
67,
68,
72]. Furthermore, Ye et al. [
77] adopted differential privacy techniques that preserved the privacy of the systems in networks, mitigating utility gains of the attackers while retaining the system performance under various conditions.
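The fractional-knapsack structure attributed to the attacker's problem above is easy to solve greedily, which is why the response can be determined with little effort. The sketch below is the generic textbook algorithm with hypothetical item names and values; it is not the specific formulation of Zhang et al.

```python
from fractions import Fraction


def fractional_knapsack(items, budget):
    """Greedy solution of the fractional knapsack problem.

    items: list of (name, value, cost) with positive costs.
    Returns (plan, total_value), where plan maps each chosen name to the
    fraction of that item taken.
    """
    plan = {}
    total = Fraction(0)
    remaining = Fraction(budget)
    # Take items in decreasing order of value per unit cost.
    ranked = sorted(items, key=lambda it: Fraction(it[1], it[2]), reverse=True)
    for name, value, cost in ranked:
        if remaining <= 0:
            break
        fraction = min(Fraction(1), remaining / cost)
        plan[name] = fraction
        total += fraction * value
        remaining -= fraction * cost
    return plan, total


# Hypothetical attack targets: (name, expected gain, effort required).
targets = [("a", 60, 10), ("b", 100, 20), ("c", 120, 30)]
plan, gain = fractional_knapsack(targets, budget=50)
print(plan, gain)  # c is taken only partially; total gain is 240
```

Because fractions of an item are allowed, sorting by value-to-cost ratio is optimal, unlike the 0/1 knapsack problem, which requires dynamic programming or search.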
Some studies focused on the benefits of reliability and stability while providing good frameworks for decision probability when addressing APTs. For example, Merlevede et al. [
83] assumed the strategies chosen by both sides (attacker and defender) were from restricted strategy spaces and between exponential and periodic strategies, which was practical for the incentive design for time-based security decisions. Furthermore, Nisioti et al. [
84] proposed a Bayesian Cyber-Investigation Game (BCIG) that assumed a probabilistic distribution calculated from past incident reports and adopted an anti-forensic technique on the side of the defender to increase the collected benefits across a wide range of investigations while decreasing the costs.
Furthermore, a multidimensional transition of attack and detection surfaces was analyzed by Tan et al. [
88] and Wan et al. [
94], where the characteristics of stealth interactions (stochastic, aggressive, and conservative) were represented, thus, allowing for a generalized and objective view of the trends in the state transitions of a network system. In addition, Liu et al. [
89] quantitatively analyzed the safety in cyber–physical interactions using a weighted, colored Petri net and attack models that calculated the attack weights by using a threat-propagation matrix as well as a security-state vector. Explainability was also integrated into dynamic and persistent risk-assessment schemes with resource-allocation mechanisms [
95]. Security awareness could also significantly accelerate security monitoring, analysis, and comprehension.
Another benefit was found via utility change, such as the competitive strategy profile proposed by Li and Yang [
66], where the necessity system guided the defender to search for an admissible set of strategies and randomly outperform the generated strategy profiles. This situation allowed for the dynamic recovery strategy to be competitive and practical when compared to relying on the dynamic attack strategy alone. Furthermore, in a one-shot game model (where interactions only happened once), a trigger strategy (choosing a certain strategy at the beginning and adapting later) was adopted to address the main priority of the target system (maximize defenses or mitigate losses) [
71], or the strategy incorporated bounded rationality to influence the attacker’s appraisal [
36,
76].
In a system of centralized or distributed connectivity of the defenders (i.e., IoT devices and fog computing), the optimal incentive-compatible insurance contract insured half of the defender’s losses, which were quantitatively determined by the loss parameters of the device ownership [
73], and cyber-insurance-enabled security provided service coverage [
82]. Such conditions have allowed for an acceptable level of economic/financial losses [
40,
73,
82].
However, a study by Halabi et al. [
56] focused on integrating a new layer of robustness in the defense architecture designed to increase its tolerance of sophisticated attacks resulting from APTs. Furthermore, an insider threat’s advantage in assisting APTs could be deterred or mitigated by optimizing the initial defensive mechanism via modeling the organizational culture (i.e., limiting the information that a defender could have and diversifying its utility functions) [
85].
4.2.3. Implications and Converging Topics for Future Work in Defense and Detection against APTs
As the trends of APT detection and prevention have converged with game-theory approaches, several implications were observed. These implications included optimized protective performance, a full-fledged simulation tool for security scenarios, a security-as-service paradigm, a form of trust framework, the prioritization of repair over protective details, the conversion of deception into protection, and the encouragement of richer quality interactions between the attacker and the defender.
Several studies focused on optimizing the protective performance of a game-theory approach against APTs. In one study, the cloud storage system’s data protection level and the defender utility were improved by learning faster and being more resistant to APT attackers who chose an attack policy based on the estimated defense learning scheme [
60]. In another article, a game-theory-based vulnerability quantification method allowed for the objective calculation of the security vulnerabilities of a network system (e.g., social IoTs [
26] or moving-target defense (MTD) [
75,
88]) while anticipating and preparing for countermeasures against adversarial attacks [
26,
34,
38].
A differential game model was developed to analyze dynamic, continuous, and real-time attack–defense processes to predict a multi-stage continuous attack–defense process [
61,
67]. Repeated defense actions were used for employee awareness training based on information gathered from APT incidents and enhanced model flexibility [
38]. Some studies investigated the spatiotemporal aspects of an attack [
75,
88], making this an essential addition to the attack-surface transformation process.
Game theory with the consideration of time-evolved states [
70] and Bayesian game theory to infer incomplete information regarding an attacker’s behaviors were used to determine optimal defense strategies [
89]. These approaches provided a more comprehensive, dynamic, and practical approach to addressing APT attacks.
Some studies encouraged a game-theory approach as a simulation tool for recreating the security scenario with subjective attacking behaviors [
57,
91]. Such a perspective allowed for evaluating optimal risk mitigation strategies based on the available information while easily addressing adversary modeling issues [
36,
91]. This condition was particularly useful in defending against APTs because uncertainty exists in the attacker’s capabilities, incentives, and induced damages.
Using matrix games with distributed payoffs, where the game is in discrete time for one player but continuous time for the other, has allowed for the natural mitigation of APTs [
57,
92]. In addition, a physical understanding of the infrastructure and theoretical methods can be combined to create a practical solution; define appropriate model parameters, proper categories, and representative definitions; and design suitable payoff modeling [
57]. Due to the probability-weighting distortion, a subjective attacker tends to overestimate the attack cost and therefore attacks less frequently in cumulative prospect theory (CPT)-based detection games, which improves the data protection and cloud utility [
59].
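The probability-weighting distortion can be made concrete with the Prelec weighting function, w(p) = exp(-(-ln p)^α), which is one common choice in the prospect-theory literature. Whether the cited study uses exactly this form and which value of α applies are assumptions here; the function name is hypothetical.

```python
import math


def prelec_weight(p, alpha):
    """Subjective weight of an objective probability p (Prelec form).

    alpha = 1 gives w(p) = p (no distortion); 0 < alpha < 1 overweights
    small probabilities and underweights large ones.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if p == 0.0:
        return 0.0
    if p == 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


for p in (0.05, 0.5, 0.9):
    print(p, round(prelec_weight(p, alpha=0.5), 3))
```

A subjective player who evaluates outcomes with w(p) instead of p ranks the same actions differently from an expected-utility maximizer, which is the mechanism behind the shifted attack frequency described above.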
Furthermore, the existence of Bayesian–Nash equilibrium strategies has been proven under bounded rationality. At the same time, changing the strategy selection and utility, improving the detection rate, and increasing the comprehension of adversarial behaviors in a grid system [
76] and the IoT [
36] have also been addressed.
Using the security-as-service paradigm, the best contract design was investigated for a cloud-enabled internet of controlled things (IoCT). Optimal contract design was determined based on cloud security quality, where payoff compatibility and contract penalty were utilized, alongside the payoff of the cloud service providers when optimizing the security utility [
32].
In one study, a game-theory approach based on the FlipIn framework was adopted to design incentive-compatible, welfare-maximizing cyber-insurance contracts, and this offered a theoretical foundation for the quantitative assessment of cyber-risks, the development of cross-layer defense mechanisms, the design of cyber-insurance policies [
73], and the development of the pricing problem as an optimal control problem via a hierarchical dynamic game framework [
82].
Moreover, a game-theory model of cyber attacks on traffic control was introduced to provide a theoretical foundation for planning and improving the performance of delivered services, as well as for implementing countermeasures against the risks posed by cyber attacks on transportation networks and infrastructures (e.g., traffic signal tampering [
62] and the internet of vehicles [
56]).
A unique take on countering APTs was provided in the form of a trust framework, where vulnerabilities and risks were passively identified by integrating a trustful system or a set of procedures as part of the game-theory elements. A framework of trust built on incentives and costs for system control was incorporated. This allowed for continuous decision-making, a better understanding of strategic trust, and multi-layer security [
33,
58]. In addition, such a framework improved the level of data protection and had a faster learning speed for strategic defense selection [
95].
A combination of different data sources (e.g., network protocols and log documents) was used to precisely calculate the payoff of a game-theory approach. This situation allowed the Nash equilibrium to be computed in order to detect the possibility of a malicious attack while maintaining the target system functions and providing effective protection [
63]. Another approach via differential privacy was designed to resist attacks regardless of the attackers’ rationale and to increase the complexity of attack formulations, thereby, giving administrators more time to build defense policies [
77].
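A standard building block of differential privacy is the Laplace mechanism, which adds noise with scale sensitivity/ε to a numeric answer. The sketch below shows that generic mechanism only; it is not the scheme of the cited study, and the query (a count of vulnerable hosts), the parameter values, and the function name are hypothetical.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential variables with mean
    # `scale` follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


rng = random.Random(42)  # seeded only to make the sketch reproducible
# Hypothetical query: number of vulnerable hosts reported to an observer.
print(laplace_mechanism(true_value=10.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller ε means more noise and therefore less information for an attacker probing the system, at the price of less accurate answers for legitimate use.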
An effective dynamic-recovery (DR) strategy to mitigate the total loss of a cloud defender in the face of an APT campaign was investigated by Li and Yang [
66]. The concept introduced a competitive strategy profile that outperformed other randomly generated strategies and enhanced the APT defense capabilities [
69] particularly in situations where insiders with privileged access could facilitate the APT campaign for financial gain [
81]. Moreover, an organization subjected to an APT could flexibly divide a long repair time into several shorter repair periods.
The corresponding potential repair strategy in this time horizon was realized by estimating its expected state. Although the APT repair game was open-loop and lacked flexibility, the organization could handle the APT in a closed-loop manner for the most part and mitigate its potential loss even further [
64]. Another study by Yang et al. [
79] formulated a model based on a data backup-and-recovery system (DBARS) when defending against APTs by proactively seeking out and eliminating the compromised portion of a system via evolution, leading to a potentially cost-effective real-time solution [
69].
Many studies focused on making deception an advantageous situation for the protector. However, the techniques for detecting deception in cybersecurity should not always aim to eliminate leakage, as revealing specific cues to deception could serve as a deterrent [
65]. Such a condition could be used to design the detection mechanisms for implementing online policies without requiring iterative numerical computations. In some situations, game representation and algorithmic design encumbered the scalability of the solution [
68] and its interoperability [
40]. Legitimate system users could also be compromised.
The use of defensive deception could also generate uncertainties for attackers and motivate them to take more conservative behaviors [
39]. A belief concept was introduced as a proactive defensive response to provide a probabilistic detection system, achieve a better payoff rate, and prevent effectual reconnaissance. However, the strict resources could characterize their behavior as an adaptive strategy instead [
72]. Moreover, a hypergame was proposed as a valuable model to analyze the effects of adversarial perturbations and stochastic conditions to better understand cyber attackers and defenders in control systems [
74,
86].
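The belief concept mentioned above can be illustrated with a generic Bayesian update. The following is a minimal sketch, not the model used in the cited studies: the function name `update_belief`, the prior, and the signal likelihoods are all hypothetical values chosen for illustration. It shows how a defender's belief that an observed party is an attacker could be revised after each suspicious signal.

```python
def update_belief(prior, p_signal_if_attacker, p_signal_if_benign):
    """Posterior probability that the observed party is an attacker (Bayes' rule)."""
    evidence = prior * p_signal_if_attacker + (1.0 - prior) * p_signal_if_benign
    if evidence == 0.0:
        # The signal is impossible under both hypotheses; keep the prior.
        return prior
    return prior * p_signal_if_attacker / evidence


# Assumed prior belief that a given user is an attacker (hypothetical value).
belief = 0.05

# Each observation is (P(signal | attacker), P(signal | benign)); values are illustrative.
observations = [(0.6, 0.1), (0.7, 0.2), (0.5, 0.05)]
for p_att, p_ben in observations:
    belief = update_belief(belief, p_att, p_ben)

print(f"Belief after {len(observations)} signals: {belief:.3f}")
```

A defender could compare the resulting belief against a detection threshold, which is one simple way to turn such an update into a probabilistic detection rule.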
The concepts of the motive and deterrence thresholds were introduced to assess the average motive of the insider population and the adequacy of the honeypots [
81,
85]. Considering the incomplete and deceptive nature of the organizational environment and information vulnerability, researchers constructed proactive deception strategies based on the organizational domain and simulated their related improvements in deception efficiency, which was intended to minimize performance degradation and maximize security [
90].
Finally, richer quality interactions between the attacker and the defender have been realized in several studies. The real-world interactions between cyber attackers and defenders were realistically modeled to predict the differences in their behaviors, strategies, and tactics under various conditions [
71,
78]. In some cases, the computational overhead increased, which demanded a higher observation cost of vulnerable resources [
78]. Increasing the paralysis threshold (the point at which a group cannot continue interacting) within a specific range could facilitate a short-term, high-intensity interaction.
In addition, effective strategies should be implemented as early as possible to achieve dominance and affect the network states. This condition suggested that obtaining an equilibrium strategy was challenging when interaction strategies were mutually restrictive [
80]. The research emphasized the importance of considering the timing of security decisions (exponential and periodic) and the effect of the passage of time on the valuation of a resource in security policy-making, so that an attack could be disincentivized and the information asymmetry between the attacker and the defender overcome [
83].
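The role of move timing can be sketched with a small simulation in the style of a stealthy-takeover game, in which each player's move silently reclaims a shared resource. This is an illustrative sketch under stated assumptions rather than the model of the cited study: the function names, the tie-breaking rule, the horizon, and the mean intervals are all hypothetical. It compares a periodic and an exponential defender schedule against a periodic attacker by measuring the fraction of time the defender controls the resource.

```python
import random


def move_times(strategy, mean_interval, horizon, rng):
    """Return increasing move times up to `horizon` for a periodic or exponential player."""
    times = []
    if strategy == "periodic":
        t = rng.uniform(0, mean_interval)  # random phase hides the schedule
        while t <= horizon:
            times.append(t)
            t += mean_interval
    elif strategy == "exponential":
        t = rng.expovariate(1.0 / mean_interval)
        while t <= horizon:
            times.append(t)
            t += rng.expovariate(1.0 / mean_interval)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return times


def defender_control_fraction(def_times, att_times, horizon):
    """Fraction of [0, horizon] during which the defender owns the resource.

    Assumption: the defender owns the resource at time 0, and each move
    transfers ownership to the player who made it.
    """
    events = sorted([(t, "D") for t in def_times] + [(t, "A") for t in att_times])
    owner, last, defended = "D", 0.0, 0.0
    for t, who in events:
        if owner == "D":
            defended += t - last
        owner, last = who, t
    if owner == "D":
        defended += horizon - last
    return defended / horizon


rng = random.Random(0)  # fixed seed so runs are repeatable
horizon, runs = 1000.0, 200
for strategy in ("periodic", "exponential"):
    total = 0.0
    for _ in range(runs):
        defender = move_times(strategy, 5.0, horizon, rng)
        attacker = move_times("periodic", 8.0, horizon, rng)
        total += defender_control_fraction(defender, attacker, horizon)
    print(strategy, round(total / runs, 3))
```

Move costs are omitted here for brevity; subtracting a per-move cost from each player's control time would give a payoff that trades protection against effort.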
The importance of using anti-forensic techniques was emphasized in a forensic investigation of real-world scenarios, which considered additional parameters, assumed multiple attacker types at each decision point, and combined other optimization methods [
84]. Another aspect presented by Mi et al. [
87] provided a reference for selecting an optimal defense strategy (or increasing it beyond the limit) while ensuring its advantage, maximizing defense effectiveness at a minimum cost, and minimizing loss when the defense was not possible. Moreover, mitigation of the uncertainty perceived by both the attacker and the defender led to higher resilience and higher expected utility [
94].
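The trade-off between defense effectiveness and cost can be made concrete with a simple expected-utility comparison. The sketch below is a generic illustration and not the formulation of Mi et al.: the `Defense` class, the candidate defenses, their costs, and their blocking probabilities are hypothetical. Each defense is scored by the expected loss it leaves unmitigated plus its own cost, and the defense with the highest net utility is selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defense:
    name: str
    cost: float        # cost of deploying the defense
    block_prob: float  # assumed probability that the attack is stopped


def expected_net_utility(defense, attack_loss):
    """Negative of (expected residual loss + deployment cost)."""
    return -(1.0 - defense.block_prob) * attack_loss - defense.cost


def best_defense(defenses, attack_loss):
    """Pick the defense with the highest expected net utility."""
    return max(defenses, key=lambda d: expected_net_utility(d, attack_loss))


# Hypothetical candidate defenses, including the option of doing nothing.
candidates = [
    Defense("none", cost=0.0, block_prob=0.0),
    Defense("patch", cost=10.0, block_prob=0.5),
    Defense("isolate", cost=40.0, block_prob=0.9),
]

for loss in (10.0, 30.0, 100.0):
    print(loss, best_defense(candidates, loss).name)
```

With these illustrative numbers, the preferred defense shifts toward the costlier option as the potential loss grows, which mirrors the idea that a stronger defense is only worthwhile when the stakes justify its cost.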
From the identified implications of game-theory applications for combating cybersecurity threats, several converging topics against APTs were noted and summarized as follows:
Improving the protective performance of a game-theory approach through methods such as learning faster and being more resistant to attacks; quantifying vulnerabilities; and anticipating and/or preparing for countermeasures.
Analyzing dynamic, continuous, and real-time attack–defense processes to predict multi-stage attack–defense processes and improve awareness of future attacks (e.g., through employee training).
Using game theory as a simulation tool to recreate security scenarios with subjective attacking behaviors that are practical and realistic, consider spatiotemporal aspects of attacks, infer incomplete information about attacker behavior, and evaluate optimal risk mitigation and defense strategies.
Investigating optimal contract design, designing incentive-compatible and welfare-maximizing cyber-insurance contracts, and formulating the pricing problem as an optimal control problem through a hierarchical dynamic game framework.
Applying game theory to optimize the security of cyber–physical systems (CPS) and transportation systems by considering various factors, such as the attacker behavior, system constraints, and the interdependence of components.
Using game theory to optimize the security of social networks by considering the influence of users on each other’s behaviors and the strategic interaction between users and a network administrator.
Smartphones are among the most pervasive information and communication technologies, and their use has increased dramatically over the last decade. As a result, smartphone usage has faced threats at all stages of application utility, from application downloads implanted with malicious code, to application installation or usage that contains malicious programs, and even to uninstalled applications that leave behind malicious residual code [
97]. In addition, smartphones offer a plethora of features, such as inertial sensors, positioning sensors, ambient sensors, telephony services, telecommunications, and other utilities, that provide a continuous flow of information and an accurate description of a user’s routines and behaviors, thereby enabling an attacker to generate a highly specific and successful APT campaign [
4].
As such, smartphone security is a critical topic that requires the attention of academia and industry alike. In general, mobile APTs are defined as sophisticated attacks in mobile-device environments where social engineering has been used to leak data using features that are innate for information management (e.g., sensors and services). However, the threat in mobile devices is still nascent and challenging to assess, as identifying an attacker is difficult due to the following reasons [
2]: (i) high accessibility; (ii) various initial points of access; and (iii) jurisdictional limitations that are relatively low-entry and high-reward. As such, there is a significant possibility that a broader diversity of attacker avatars exists, from nation-state bad actors motivated by national interests to savvy individuals focused on personal gain.
In addition, attack procedures associated with the established threat actors in the mobile environment were also related to threat actors in the PC environment [
9]. This condition allowed threat actors to move freely between PC and mobile environments to achieve their goals. Therefore, it is possible to improve the understanding of the current cyber-threat environment using traditional cyber-attribution methods that employ complex evaluations of both their technical and socio-political attributes [
2,
9]. Furthermore, since mobile APTs can originate from diverse regions across countries with different regulations, jurisdictional limitations can hinder cross-border cyber-crime investigations and impede the collection of evidence. However, given the rapid growth of mobile devices in fields where massive volumes of data are constantly generated, the converging topics on game-theory approaches could offer a suitable solution for addressing mobile APTs.
Public datasets on mobile-based APTs are nonetheless scarce, which may impede research progress regarding the detection of and defense against new generations of APT attacks (e.g., using mobile, IoT, and other smart devices). Recent solutions have involved the adoption of the situational-awareness (SA) model, also known as the observe–orient–decide–act (OODA) framework, which mitigates APTs by conceptually monitoring the fingerprinting of mobile device behaviors [
16].
Regarding another aspect, Al-Kadhimi et al. [
98] provided a solution to improve the awareness of APT detection on smartphones based on the correlation of the MITRE Framework and an attack tree, called the fingerprint for mobile-sensor APT detection framework (FORMAP). Similarly, Jabar et al. [
99] proposed a framework for mobile APT detection based on device behavior (SHOVEL), and this study demonstrated the impacts of APT attacks on user behavior when self-adaptive, auto-predictive, and auto-reflective considerations were present in their decision-making. As such, this direction could be ideal for future game-theory-based research endeavors.