Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component

Liu, Shixuan; Pu, Tianle; Zeng, Li; Wang, Yunfei; Cheng, Haoxiang; Liu, Zhong

doi:10.3390/math12172766

Open AccessArticle

Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component

by

Shixuan Liu

^1,*,†

,

Tianle Pu

^1,†,

Li Zeng

^1,†,

Yunfei Wang

²,

Haoxiang Cheng

¹

and

Zhong Liu

¹

Laboratory for Big Data and Decision, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

²

National Key Laboratory of Information Systems Engineering, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Mathematics 2024, 12(17), 2766; https://doi.org/10.3390/math12172766

Submission received: 26 July 2024 / Revised: 23 August 2024 / Accepted: 5 September 2024 / Published: 6 September 2024

Download

Browse Figures

Versions Notes

Abstract

:

Tackling the intricacies of network dismantling in complex systems poses significant challenges. This task has relevance across various practical domains, yet traditional approaches focus primarily on singular metrics, such as the number of nodes in the Giant Connected Component (GCC) or the average pairwise connectivity. In contrast, we propose a unique metric that concurrently targets nodes with the highest degree and reduces the GCC size. Given the NP-hard nature of optimizing this metric, we introduce MaxShot, an innovative end-to-end solution that leverages graph representation learning and reinforcement learning. Through comprehensive evaluations on both synthetic and real-world datasets, our method consistently outperforms leading benchmarks in accuracy and efficiency. These results highlight MaxShot’s potential as a superior approach to effectively addressing the network dismantling problem.

Keywords:

complex networks; network dismantling; graph representation learning; reinforcement learning

MSC:

90C27; 05C85; 90-10

1. Introduction

Network dismantling is a pivotal issue in complex network science, and has captivated a broad array of researchers [1,2]. This pursuit involves identifying the smallest set of nodes whose removal would significantly impair or completely disable the functionality of the network [3]. The implications of resolving this problem are vast, spanning numerous practical applications. A primary application lies in cybersecurity [4,5], where identifying and disabling key nodes within a network can prevent the spread of malware or neutralize coordinated cyber threats. In social network analysis, dismantling techniques are used to assess the resilience of social structures and to counteract the spread of misinformation by disrupting influential users [6]. In addition, network dismantling finds relevance in infrastructure resilience [7], where critical nodes in transportation or utility grids may be strengthened or redundancy introduced to prevent catastrophic failures. Another example is drug discovery [8], where identifying and targeting key protein interactions within a cellular network can lead to novel treatments. Each of these applications underscores the profound impact of effective network dismantling across various fields.

To quantify network functionality, an appropriate measure must be defined based on the specific application scenario. Network connectivity is often regarded as a key indicator due to the necessity for most network applications to operate in a connected environment [9]. Common measures of connectivity include the number of connected components [10], pairwise connectivity [11], the size of the giant connected component (GCC) [12], and the shortest path length between specific nodes [13].

The size of the GCC is particularly significant because of its relevance to both the optimal attack problem, which aims to minimize the GCC, and the optimal spreading problem under linear threshold spreading dynamics [14]. Consequently, previous research on network dismantling has mainly focused on reducing the number of nodes in the GCC. However, this approach often neglects the importance of high-degree nodes. By concentrating on a single objective, such strategies may overlook the critical role that highly connected nodes in GCCs play in maintaining network integrity and functionality.

Approaches that solely focus on the size of the GCC, such as FINDER [14,15], exhibit several notable drawbacks. Researchers have observed a slow start in its strategy dismantling process and a tendency to initially target peripheral nodes for removal, which is suboptimal for real-world scenarios where eliminating critical nodes in GCCs could be more effective. Targeting critical nodes early on has the advantage of potentially creating a fragile chain-like structure with direct paths, leading to quicker network disintegration. Thus, the initial focus on peripheral nodes by GCC-only strategies reduces their overall efficacy by delaying emergence of the critical structural vulnerabilities needed for efficient dismantling.

In the Highest Degree Algorithm (HDA) [16], in each iteration the node with the highest degree is systematically removed from the network. This removal is followed by updating the node degrees, and the process is repeated until the network is entirely cycle-free.

Inspired by the observation of slow starting which typically occurs in the GCC-only dismantling method and building upon the principles of HDA methods, we introduce a dual metric approach that evaluates network dismantling based on the size of the GCC and the maximum degree within the GCC. Additionally, recognizing that minimizing the area under the curve while removing the least number of nodes is an NP-hard problem, we enhance computational efficiency by reformulating this problem as a Markov Decision Process (MDP), for which we propose a novel algorithm, MaxShot. Our algorithm harnesses graph representation learning and reinforcement learning to develop heuristic strategies aimed at optimizing the dual metric in the dismantling process.

Extensive experiments have been conducted across both synthetic graphs and real-world datasets, with the latter comprising tens of thousands of nodes and edges. The results demonstrate that the proposed MaxShot model generally outperforms existing methods while exhibiting a considerable speed advantage.

In summary, our contributions can be summarized as follows:

We present a novel dual metric that simultaneously considers the size of the GCC and its maximum degree during the network dismantling procedure. To tackle the optimization challenge, we propose an end-to-end learning method called MaxShotwhich facilitates the training of an agent on synthetic graph data, enabling direct application to real-world networks.
Extensive experiments have been conducted to assess the performance of our model. The findings demonstrate that MaxShot surpasses current state-of-the-art methods in terms of both accuracy and computational efficiency.

The rest of the paper is structured as follows: Section 2 reviews related work; Section 3 covers the relevant preliminaries; Section 4 and Section 5 delve into our MaxShot and the experimental setup, respectively; finally, Section 6 summarizes our findings and suggests potential directions for future research.

2. Related Works

Within the sphere of complex network analysis, the exploration and enhancement of network robustness and resilience via the process of network dismantling has emerged as a critical domain of scholarly inquiry. Research into complex network disintegration now transcends traditional methods, encompassing machine learning and reinforcement learning methodologies. This paper sequentially addresses the current state of research in these three areas of network dismantling.

Traditional Network Dismantling. Traditional methods for network disintegration primarily rely on percolation theory within network science, which studies the vulnerability of networks when nodes or edges are removed. Specifically, a suite of metrics are employed to assess the pivotal roles of individual nodes. Commonly used metrics include node degree, betweenness, and closeness [17], alongside centrality indices such as betweenness centrality [18], closeness centrality [19], and PageRank [20]. Additionally, metrics such as components [10], pairwise connectivity [11], and the largest connected component size [12] are also commonly used. Notably, some papers consider dual metrics, such as identifying the highest-betweenness node within the GCC [21]. However, these conventional tactics are often limited by the requirement of prior knowledge about the network’s structural attributes, restricting their efficacy in the context of intricate networks.

Machine Learning-based approaches. In the domain of machine learning, methodologies such as deep reinforcement learning and algorithmic learning-based techniques present innovative frameworks for the identification of pivotal entities in network disintegration processes. Illustrative of these advancements are the FINDER [14,15] and CoreGQN [22] approaches, which harness synthetic network architectures and self-play strategies to train models that demonstrate superior performance over conventional tactics. In addition, GDM [23] effectively dismantles large-scale social, infrastructure, and technological networks by identifying patterns above the topological level, and can also quantify system risk and detect early warning signs of systemic collapse. These methodologies provide expedited and scalable solutions to the NP-hard challenges inherent in network science.

Reinforcement Learning-based approaches. Reinforcement learning approaches include deep reinforcement learning, multi-agent reinforcement learning, and model-based reinforcement learning. Deep reinforcement learning, exemplified by the DQN algorithm [24], integrates the powerful representational capabilities of deep learning with the decision-making prowess of reinforcement learning, enabling models to make optimal decisions within complex network entities. The MiniKey algorithm aims to utilize the minimal number of nodes required to sustain optimal network performance [14]. In parallel, SmartCore leverages reinforcement learning to uncover heuristic strategies, significantly reducing the cumulative size of the 2-core [25]. Model-based reinforcement learning approaches such as PILCO [26] make decisions by learning the dynamics of the environment, which promotes efficient data utilization and accuracy, although they encounter challenges such as limited generalization, high modeling complexity, and increased training costs. Multi-agent reinforcement learning, as seen in Nash Q-Learning [27] and NFSP [28], involves multiple agents engaged in a collaborative effort to dismantle networks; it offers advantages such as swift decision-making and diverse strategies, while contending with issues such as uncertain cooperative mechanisms and limited information acquisition.

3. Preliminaries and Notations

3.1. Network Dismantling

In a network

G (V, E)

, where V represents the set of nodes and E is the set of edges, the Giant Connected Component (GCC) is characterized as the largest connected component that contains a significant proportion of the nodes in the entire network [29]. A pivotal assumption in network theory posits that only interconnected subnetworks can preserve their operational integrity [29]. Consequently, the GCC not only offers crucial insights into the network’s overall structure [30] but is also instrumental in determining the system’s robustness and resilience in response to perturbations [29].

Network dismantling aims to identify a subset of nodes

S \in V

whose removal leads to the fragmentation of the network such that the GCC size does not surpass a predefined threshold C. Denoting

C (G §)

as the size of the GCC in

G §

, where

v \in S

implies

v \notin G §

, S then qualifies as a C-dismantling set if and only if

C (G §) \leq C

. The network dismantling problem can be formalized as an optimization problem with the objective of minimizing

| S |

while ensuring

C (G §) \leq C

, where

| S |

signifies the count of nodes in subset S and C is a designated threshold.

3.2. Graph Neural Networks

Graph Neural Networks (GNNs) consist of a neural network architecture specifically designed to process graph-structured data [31,32,33,34]. Considering a graph

G = (V, E)

, each node

v \in V

is associated with a feature vector

X_{v}

. The objective of a GNN is to learn a function f that maps the input graph G and its node features

{X_{v}}_{v \in V}

to a set of output predictions or refined node representations. Letting

H_{v}^{(l)}

denote the representation of node v at layer l of the GNN, the transition from layer

l - 1

to layer l is governed by the following update rule:

H_{v}^{(l)} \leftarrow \underset{\forall s \in N (v), \forall e \in E (s, v)}{Aggregate} (\{Extract (H_{s}^{(l - 1)}; H_{v}^{(l - 1)}, e)\}) (1)

where

N (v)

denotes the set of nodes that have direct edges to v and

E (s, v)

represents the set of edges from node s to node v. The functions

Extract (\cdot)

and

Aggregate (\cdot)

are pivotal to the operation of GNNs; the

Extract (\cdot)

function involves isolating pertinent information from neighboring nodes by utilizing the target node’s previous layer representation

H_{v}^{(l - 1)}

and the edge

E (s, v)

to distill insights from

H_{s}^{(l - 1)}

. The

Aggregate (\cdot)

function is responsible for integrating this neighborhood information, for which it is possible to employ straightforward methods such as summation or averaging or more sophisticated techniques such as pooling.

3.3. Reinforcement Learning

Reinforcement Learning (RL) involves training an agent to make sequential decisions within an environment with the aim of optimizing cumulative rewards or attaining specific goals [35]. The essence of reinforcement learning involves modeling decision-making scenarios, commonly through a Markov Decision Process (MDP) when the environment is fully observable [36]. An MDP is characterized by the tuple

M = (S, A, P, R, γ)

, where S denotes the state space, A the action space, P the transition probability distribution, R the reward function, and

γ

the discount factor [36]. The transition probability is defined as

p (s_{t + 1} | s_{1}, a_{1}, s_{2}, a_{2}, \dots, s_{t}, a_{t}) = p (s_{t + 1} | s_{t}, a_{t})

.

At each time step t, the agent, situated in state

s_{t} \in S

, selects the optimal action

a_{t} \in A

based on the policy

π (a_{t} | s_{t})

. Subsequently, the environment transitions to a new state according to the state transition function

p (s_{t + 1} | s_{t}, a_{t})

and provides a reward

r_{t} \in R

to the agent. The agent’s goal is to maximize the cumulative reward

G = \sum_{t = 0}^{T} γ r_{t}

[37]. The policy

π

denotes the agent’s strategy, which maps states to actions; different policies yield unique paths of exploration. Value functions are employed to assess the desirability of states and actions, including the state value function and the action value function [36]. The state value function is formulated as follows:

v_{π} (s) = E [R_{t} | s_{t} = s] = \sum_{a} π (a | s) \sum_{s, r} p (s^{'}, r | s, a) [r + γ v_{π} (s^{'})]

where

p (s^{'}, r | s, a)

represents the transition function from a state–action pair to the subsequent state–reward pair and

v_{π} (s^{'})

is the value of the subsequent state

s^{'}

. This equation is known as the Bellman equation for

v_{π} (s)

, and encapsulates the relationship between the current state value and future state values. The state–action value function, denoted as

q_{π} (s, a)

, quantifies the expected return for all feasible decision sequences that initiate from state s and follow action a according to policy

π

. It is defined as follows:

q_{π} (s, a) = E [R_{t} | s_{t} = s, a_{t} = a] = \sum_{s^{'}, r} p (s^{'}, r | s, a) [r + γ max_{a^{'}} q_{π} (s^{'}, a^{'})]

4. Methodology

In this section, we introduce our innovative MaxShotframework, crafted to efficiently target the removal of nodes with the highest degrees within the GCC. To enhance clarity, the acronyms used in this paper are listed in Table 1. The MaxShot framework conceptualizes the network dismantling task as an MDP:

State ( $s \in S$ ): Encapsulates the current size of the remaining GCC in the graph.
Action ( $a \in A$ ): Involves selecting and removing a node from the active GCC.
Reward ( $r \in R$ ): Defined as the relative change in a specific dual metric calculated before and after node removal

$score (G_{t}) : = \frac{| GCC (G_{t}) |}{| G_{t} |} \times \frac{max \deg (GCC (G_{t}))}{| G_{t} |}; r_{t} = - (score (G_{t}) - score (G_{t - 1}))$

(1)

where $| \cdot |$ denotes the graph size. As we want to minimize the score, and as RL seeks to maximize the cumulative reward, there is a negative mark.
Terminal State: This occurs when the GCC is completely eliminated.

Next, we delve into the MaxShot’s architecture, elaborate on the training methodology, and analyze its computational complexity.

4.1. Architecture of MaxShot

Figure 1 depicts the architectural outline of the MaxShot framework. The MaxShot algorithm proposed herein utilizes a fundamental encoder–decoder framework. In conventional encoding strategies, nodes and graphs are frequently represented using manually crafted features such as global or local degree distributions, motif counts, and similar metrics. These traditional approaches are typically customized on a case-by-case basis, and can often fall short of optimal performance outcomes.

In the encoding phase, we employ GraphSAGE [38] as our feature extraction engine, targeting the entire graph. Converting intricate network topologies and node-specific details into a unified dense vector space enhances both representation and learning capabilities. GraphSAGE offers superior scalability and efficiency through its neighborhood sampling strategy and supports inductive learning, making it well-suited for large and dynamic graphs, whereas graph isomorphism networks and graph attention networks may face limitations in computational overhead and adaptability without similar methods. Furthermore, its inductive learning capacity ensures that it can generalize to unseen nodes, which is a crucial attribute given the dynamic nature of many networks, including in network dismantling scenarios. To further amplify the model’s representational power, we introduce a virtual node concept that effectively embodies global graph characteristics. Because GraphSAGE’s parameters remain robust regardless of graph size, this virtual node approach seamlessly extends to dynamic graphs, thereby enhancing model adaptability.

In the decoding phase, Multi-Layer Perceptrons (MLPs) equipped with the ReLU activation function are utilized to transform the encoded state and action representations into scalar Q-values, which represent potential long-term returns. This approach effectively translates action node vectors and their associated graph vectors into Q-values. The Q-value serves as a critical metric for action selection. The agent employs this heuristic in a greedy iterative manner, always choosing the node with the highest Q-value. This process continues until the network is transformed into an acyclic structure, guaranteeing the removal of all cycles.

4.2. Training Algorithms

The computation of the Q-score is carried out by the encoder–decoder architecture, parameterized by

θ_{f}

for the encoder and

θ_{g}

for the decoder. In our approach to training this model, we implemented the Double DQN method as delineated in [39], which aims to fine-tune these parameters by performing gradient descent on the sampled experience tuples

(s, a, r, s^{'})

. One significant advantage of the Double DQN methodology is its mitigation of the overestimation bias typically associated with traditional DQN. It leverages distinct dual neural networks for the separate tasks of action selection and action value evaluation, resulting in more precise estimation of the Q-values. This improvement translates into enhanced stability and faster convergence rates during the training phase, ultimately leading to superior performance metrics, especially in complex operational contexts such as network dismantling.

The goal of our training objective revolves around the minimization of the loss function, characterized as follows:

l o s s = E_{(s, a, r, s^{'}) \sim U n i f (B)} [{(r + γ \hat{Q} (s^{'}, arg max_{a^{'}} Q (s^{'}, a^{'})) - Q (s, a))}^{2}] .

(2)

In this study, state–action–reward–next state tuples

(s, a, r, s^{'})

are sampled uniformly at random from the replay buffer

B = {h_{1}, h_{2}, . . ., h_{t}}

, where each

h_{t} = (s_{t}, a_{t}, r_{t}, s_{t + 1})

. The target network, denoted as

\hat{Q}

, undergoes parameter updates from the Q network every C intervals, and its parameters remain static between updates. For training, synthetic graphs are generated; the training episodes consist of sequentially removing nodes from a graph until the GCC becomes null. An episode’s trajectory encompasses a sequence of state–action transitions

(s_{0}, a_{0}, r_{0}, s_{1}, r_{1}, s_{2}, . . ., s_{T})

. An

ϵ

-greedy policy is followed during training, beginning with

ϵ

at 1.0 and gradually reducing it to 0.01 over a span of 10,000 episodes, achieving a balance between exploration and exploitation. During the inference phase, nodes are removed considering the highest Q-scores until reaching the terminal state. After completing each episode, the loss is minimized by applying stochastic gradient descent on randomly sampled mini-batches from the replay buffer. The full training methodology is elucidated in Algorithm 1.

Algorithm 1 Training Procedure of MaxShot

1:: Initialize experience replay buffer B
2:: Initialize the parameters for GraphSage and MLP $θ = {θ_{f}, θ_{g}}$ to parameterize the state-action value function $Q (\cdot, \cdot; θ)$
3:: Parameterize target Q function with cloned weights $\hat{θ} = θ$
4:: for episode = 1 to N do
5:: Generate a graph G from the BA model
6:: Initialize the state to an empty sequence $s_{1} = {}$
7:: for $t = 1$ to T do
8:: Select a node for removal based on $a_{t} = arg {max}_{a} Q (s_{t}, a; θ)$ with $ϵ$ -greedy
9:: Remove node $a_{t}$ from current graph $G^{'}$ and receive reward $r_{t}$
10:: Update state sequence $s_{t + 1} = s_{t} \cup a_{t}$
11:: if $t > n$ then
12:: Store transition $(s_{t}, a_{t}, r_{t}, s_{t + 1})$ into the buffer B
13:: Sample random a batch of transitions $(s_{k}, a_{k}, r_{k}, s_{k + 1})$ from B
14:: Set $y_{k} = \{\begin{matrix} r_{k}; For terminal step k + 1 \\ r_{k} + γ Q (s^{'}, arg {max}_{a^{'}} Q (s^{'}, a^{'}; θ); \hat{θ}); Otherwise \end{matrix}$
15:: Optimize $θ$ to minimize $| | y_{k} - Q (s_{k}, a_{k}; θ) {| |}_{2}$
16:: Every C steps, update $\hat{θ} \leftarrow θ$
17:: end if
18:: end for
19:: end for

4.3. Computational Complexity Analysis

The time complexity of the MaxShot algorithm can be succinctly captured by the expression

O (T | E | t)

, where T represents the number of layers within the GraphSAGE architecture,

| E |

denotes the comprehensive count of edges present in the given graph, and t accounts for the cumulative number of nodes that are sequentially removed until the GCC is entirely eradicated. By leveraging advanced sparse matrix representations to model the graph structure, MaxShot is remarkably proficient at managing the immense and intricate graphs that typically arise in real-world applications. This proficiency highlights the model’s inherent scalability and robust performance, ensuring that it is well-suited for the demanding and large-scale computational tasks encountered in contemporary data-driven environments.

5. Experiments

5.1. Settings

We validate the efficacy of the proposed MaxShot model against several widely-used algorithms: HDA, HBA, HCA, and HPRA on simulated graphs. We consider two types of graph models for synthetic date: Barabási–Albert (BA) graphs and Stochastic Block Model (SBM) graphs. Specifically, we utilized the BA network model (where

m = 4

) to create 100 synthetic graphs for each of the following node ranges: 30–50, 50–100. Each SBM graph comprises three communities. The sizes of these communities, as detailed in Table 2 and Table 3, were between 30 to 50 nodes, resulting in overall network sizes ranging from 90 to 150 nodes. We maintained an inter-community connection probability of 0.02, reinforcing dense intra-community connections and sparse inter-community connectivity to reflect typical community structures. To ensure robust results, we generated 100 graphs for each community size. These graphs provide a comprehensive evaluation across various scales of simulated networks.

To evaluate performance on real-world networks, we chose HDA, CI, MinSum, CoreHD, BPD, and GND as our reference methods. Four real-world datasets were selected from SNAP Datasets to evaluate the performance of our MaxShot model, as shown in Table 4. The details of these benchmark methods are elaborated below:

High-Degree Algorithm (HDA) [40]: Removes nodes from the network based on their number of connections (degree), prioritizing those with the highest degree and persisting until the network is devoid of cycles.
High-Betweenness Algorithm (HBA) [41]: Targets nodes with the highest betweenness centrality, which measures the number of shortest paths passing through a node.
High PageRank Removal Algorithm (HPRA) [42]: Targets nodes distinguished by their superior PageRank scores, akin to a popularity score.
High Closeness Algorithm (HCA) [43]: Removes nodes with high closeness centrality, which are those most central to the network (closest to all other nodes), maximizing the increase in distances within the network and causing the greatest fragmentation.
Collective Influence (CI) [44]: Prioritizes nodes for removal based on a concept called collective influence, which combines local information about the node degree and a measure of global network influence to identify key nodes whose removal maximally disrupts the network.
MinSum [9]: Estimates the influence of nodes by minimizing the total weight (or cost) of nodes’ influence spread across the network.
CoreHD (High-Degree Core) [45]: Focuses on nodes within the k-core of the network (a subgraph in which each node has at least k connections) and repeatedly removes nodes with the highest degree, aiming to effectively collapse the core structure.
Belief Propagation Decimation (BPD) [46]: Utilizes the principles of belief propagation to iteratively update the probabilities associated with node states.
Generalized Network Dismantling (GND) [47]: Leverages generalized optimization techniques that consider various structural and dynamical properties for effective network dismantling.

The training trajectories spanned 50,000 episodes, with a replay memory that retained up to 20,000 of the latest transitions. To gauge model efficacy, we evaluated it after every 300 episodes using a dataset comprising 100 synthetic graphs, each mirroring the dimensions of the training graphs. We then recorded the mean performance metrics obtained during these evaluations. The hyperparameters of MaxShot are shown in Table 5. All experimental tasks were conducted on a platform provided by Huawei Cloud, equipped with a 32 GB Nvidia GeForce Tesla A100 GPU.

5.2. Results on Synthetic Dataset

Unlike the traditional dismantling methods of other GCC-only strategies, MaxShot selects the node with the highest degree while reducing the GCC size. We comprehensively evaluated the performance of MaxShot using two metrics: the GCC size and our proposed innovative dual metric proposed, which is shown in Equation (1). Figure 2 and Figure 3 respectively display the test results of MaxShot and other baseline methods, including HDA, HBA, HCA, and HPRA, on the GCC size and the maximum degree of GCC size for 100 BA graphs. Table 2 and Table 3 present the test results on GCC size and the maximum degree within the GCC, respectively, for the SBM graphs. Based on these figures, MaxShot not only surpasses other baseline methods in terms of the traditional metric of GCC size but also demonstrates impressive performance in terms of the maximum degree of GCC size. The results on SBM graphs further indicate that our model effectively identifies key nodes within the GCC, which is particularly useful in scenarios characterized by clustered structures.

5.3. Results on Real-World Dataset

To further demonstrate the performance of MaxShot, we conducted experiments on four real-world datasets. The results are displayed in Table 6 and Table 7, showing the accumulated maximum degree of GCC size and the accumulated GCC size, respectively. From these results, it is evident that MaxShot consistently outperforms other methods on the Crime, HI-II-14, and Enron datasets. In particular, its performance is significantly superior on the HI-II-14 and Enron datasets. This advantage could be attributed to the introduction of the maximum degree, which effectively mitigates the early slow-down issue of traditional disintegration strategies while also maintaining a small area under the curve. Additionally, Figure 4 compares the Pareto frontiers of various methods on real-world datasets. The Pareto frontier represents the set of solutions for which no other feasible solution can be found that would improve one objective without increasing at least one other objective. In this context, MaxShot successfully addresses both objectives simultaneously.

5.4. Other Analysis of MaxShot

5.4.1. Convergence of MaxShot

We visualized the GCC size and the maximum degree GCC size of MaxShoton a validation set of 100 BA graphs with the same distribution during the training process, as shown in Figure 5 and Figure 6, respectively. It is not difficult to see from these figures that MaxShotis able to maintain optimization of the GCC size while simultaneously optimizing the maximum degree GCC size.

5.4.2. Running Time of Different Methods

While the task of network dismantling requires effective dismantling, it is also necessary to pay attention to the run time. Table 8 and Table 9 present the running times of various methods on synthetic and real-world datasets. Compared to other methods, MaxShot has a significant advantage in run time, especially on large-scale real-world datasets.

6. Conclusions

In this paper, we introduce MaxShot, a cutting-edge algorithm that integrates graph representation learning with reinforcement learning to address the network dismantling challenge by minimizing a targeted dual metric that prioritizes high-degree nodes within the Giant Connected Component (GCC). Leveraging a sophisticated encoder–decoder architecture, MaxShot effectively translates graph structures into dense representations using GraphSAGE, then applies Double DQN to refine the node selection process.

Extensive experiments conducted on both synthetic and real-world datasets demonstrate that MaxShot surpasses existing state-of-the-art methods in performance. Moreover, MaxShot exhibits remarkable computational efficiency, achieving faster run times on several datasets. The incorporation of Double DQN enhances the decision-making process, leading to more strategic and effective node removals.

Looking forward, the success of MaxShot signals a significant advancement in harnessing graph neural networks and reinforcement learning for network dismantling and related optimization tasks. Future research will explore the adaptability of the MaxShotapproach to a variety of graph-based problems and aim to integrate additional graph-level features to further enhance its performance. Additionally, including information about the clustered properties of the network in the RL state and developing efficient ways to represent it could potentially boost performance and increase real-world practicality.

Author Contributions

Conceptualization, S.L.; Methodology S.L.; Formal analysis, S.L.; investigation, T.P.; resources, T.P.; data curation, L.Z.; visualization, L.Z.; Writing—original draft, Y.W., H.C. and Z.L.; Writing—review and editing, Y.W., H.C. and Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no involvement in the design of the study, in the collection, analysis, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

References

Vespignani, A. Twenty years of network science. Nature 2018, 558, 528–529. [Google Scholar] [CrossRef] [PubMed]
Gosak, M.; Markovič, R.; Dolenšek, J.; Rupnik, M.S.; Marhl, M.; Stožer, A.; Perc, M. Network science of biological systems at different scales: A review. Phys. Life Rev. 2018, 24, 118–135. [Google Scholar] [CrossRef] [PubMed]
Boccaletti, S.; Latora, V.; Moreno, Y.; Chavez, M.; Hwang, D.U. Complex networks: Structure and dynamics. Phys. Rep. 2006, 424, 175–308. [Google Scholar] [CrossRef]
Cavelty, M.D.; Wenger, A. Cyber security meets security politics: Complex technology, fragmented politics, and networked science. Contemp. Secur. Policy 2020, 41, 5–32. [Google Scholar] [CrossRef]
Veksler, V.D.; Buchler, N.; Hoffman, B.E.; Cassenti, D.N.; Sample, C.; Sugrim, S. Simulations In Cyber-Security: A Review of Cognitive Modeling of Network Attackers, Defenders, and Users. Front. Psychol. 2018, 9, 691. [Google Scholar] [CrossRef]
Wandelt, S.; Lin, W.; Sun, X.; Zanin, M. From random failures to targeted attacks in network dismantling. Reliab. Eng. Syst. Saf. 2022, 218, 108146. [Google Scholar] [CrossRef]
Pastor-Satorras, R.; Vespignani, A. Immunization of complex networks. Phys. Rev. 2002, 65, 036104. [Google Scholar] [CrossRef]
Siegelin, M.D.; Plescia, J.; Raskett, C.M.; Gilbert, C.A.; Ross, A.H.; Altieri, D.C. Global targeting of subcellular heat shock protein-90 networks for therapy of glioblastoma. Mol. Cancer Ther. 2010, 9, 1638–1646. [Google Scholar] [CrossRef]
Braunstein, A.; Dall’Asta, L.; Semerjian, G.; Zdeborová, L. Network dismantling. Proc. Natl. Acad. Sci. USA 2016, 113, 12368–12373. [Google Scholar] [CrossRef]
Addis, B.; Summa, M.D.; Grosso, A. Identifying critical nodes in undirected graphs: Complexity results and polynomial algorithms for the case of bounded treewidth. Discret. Appl. Math. 2013, 161, 2349–2360. [Google Scholar] [CrossRef]
Detecting critical nodes in sparse graphs. Comput. Oper. Res. 2009, 36, 2193–2200. [CrossRef]
Li, H.; Shang, Q.; Deng, Y. A Generalized Gravity Model For Influential Spreaders Identification In Complex Networks. Chaos Solitons Fractals 2021, 143, 110456. [Google Scholar] [CrossRef]
Fan, C.; Zeng, L.; Ding, Y.; Chen, M.; Sun, Y.; Liu, Z. Learning to Identify High Betweenness Centrality Nodes from Scratch: A Novel Graph Neural Network Approach. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 559–568. [Google Scholar]
Zeng, L.; Fan, C.; Chen, C. Leveraging Minimum Nodes for Optimum Key Player Identification in Complex Networks: A Deep Reinforcement Learning Strategy with Structured Reward Shaping. Mathematics 2023, 11, 3690. [Google Scholar] [CrossRef]
Fan, C.; Zeng, L.; Sun, Y.; Liu, Y.Y. Finding key players in complex networks through deep reinforcement learning. Nat. Mach. Intell. 2020, 2, 317–324. [Google Scholar] [CrossRef]
Crucitti, P.; Latora, V.; Marchiori, M.; Rapisarda, A. Error and attack tolerance of complex networks. Phys. Stat. Mech. Its Appl. 2004, 340, 388–394. [Google Scholar] [CrossRef]
Valdez, L.D.; Shekhtman, L.; Rocca, C.E.L.; Zhang, X.; Buldyrev, S.; Trunfio, P.A.; Braunstein, L.A.; Havlin, S. Cascading failures in complex networks. J. Complex Netw. 2020, 8, cnaa013. [Google Scholar] [CrossRef]
Moore, T.J.; Cho, J.H.; Chen, I.R. Network Adaptations under Cascading Failures for Mission-Oriented Networks. IEEE Trans. Netw. Serv. Manag. 2019, 16, 1184–1198. [Google Scholar] [CrossRef]
Weinbrenner, L.T.; Vandré, L.; Coopmans, T.; Gühne, O. Aging and Reliability of Quantum Networks. Phys. Rev. A 2023, 109, 052611. [Google Scholar] [CrossRef]
Perez, I.A.; Porath, D.B.; Rocca, C.E.L.; Braunstein, L.A.; Havlin, S. Critical behavior of cascading failures in overloaded networks. Phys. Rev. E 2024, 109, 034302. [Google Scholar] [CrossRef]
Nguyen, Q.; Pham, H.D.; Cassi, D.; Bellingeri, M. Conditional attack strategy for real-world complex networks. Phys. Stat. Mech. Its Appl. 2019, 530, 121561. [Google Scholar] [CrossRef]
Fan, C.; Zeng, L.; Feng, Y.; Cheng, G.; Huang, J.; Liu, Z. A novel learning-based approach for efficient dismantling of networks. Int. J. Mach. Learn. Cybern. 2020, 11, 2101–2111. [Google Scholar] [CrossRef]
Grassia, M.; Domenico, M.D.; Mangioni, G. Machine learning dismantling and early-warning signals of disintegration in complex systems. Nat. Commun. 2021, 12, 5190. [Google Scholar] [CrossRef]
Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef] [PubMed]
Pu, T.; Zeng, L.; Chen, C. Deep Reinforcement Learning for Network Dismantling: A K-Core Based Approach. Mathematics 2024, 12, 1215. [Google Scholar] [CrossRef]
Deisenroth, M.P.; Rasmussen, C.E. PILCO: A Model-Based and Data-Efficient Approach to Policy Search. In Proceedings of the International Conference on Machine Learning, Washington, DC, USA, 28 June–2 July 2011; pp. 465–472. [Google Scholar]
Hu, J.; Wellman, M.P. Nash q-learning for general-sum stochastic games. J. Mach. Learn. Res. 2003, 4, 1039–1069. [Google Scholar]
Heinrich, J.; Silver, D. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv 2016, arXiv:1603.01121. [Google Scholar]
Kitsak, M.; Ganin, A.A.; Eisenberg, D.A.; Krapivsky, P.L.; Krioukov, D.; Alderson, D.L.; Linkov, I. Stability of a giant connected component in a complex network. Phys. Rev. E 2018, 97, 012309. [Google Scholar] [CrossRef]
Dorogovtsev, S.N.; Mendes, J.F.F.; Samukhin, A.N. Giant strongly connected component of directed networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2001, 64, 025101. [Google Scholar] [CrossRef]
Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
Ying, R.; He, R.; Chen, K.; Eksombatchai, P.; Hamilton, W.L.; Leskovec, J. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018. [Google Scholar]
Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; Berg, R.V.; Welling, M. Modeling Relational Data with Graph Convolutional Networks; Springer: Cham, Switzerland, 2018. [Google Scholar]
Hu, Z.; Dong, Y.; Wang, K.; Chang, K.W.; Sun, Y. GPT-GNN: Generative Pre-Training of Graph Neural Networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020. [Google Scholar]
Joshi, D.J.; Kale, I.; Gandewar, S.; Korate, O.; Patwari, D.; Patil, S. Reinforcement learning: A survey. In Proceedings of the Machine Learning and Information Processing, Hyderabad, India, 28–29 November 2020; pp. 297–308. [Google Scholar]
Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in Deep Reinforcement Learning: A Survey. Inf. Fusion 2022, 85, 1–22. [Google Scholar] [CrossRef]
Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive Representation Learning on Large Graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1024–1034. [Google Scholar]
Hasselt, H.V. Double Q-learning. Adv. Neural Inf. Process. Syst. 2010, 23, 2613–2621. [Google Scholar]
Hooshmand, F.; Mirarabrazi, F.; MirHassani, S. Efficient benders decomposition for distance-based critical node detection problem. Omega 2020, 93, 102037. [Google Scholar] [CrossRef]
Carmi, S.; Havlin, S.; Kirkpatrick, S.; Shavitt, Y.; Shir, E. A model of Internet topology using k-shell decomposition. Proc. Natl. Acad. Sci. USA 2007, 104, 11150–11154. [Google Scholar] [CrossRef]
Wandelt, S.; Sun, X.; Feng, D.; Zanin, M.; Havlin, S. A comparative analysis of approaches to network-dismantling. Sci. Rep. 2018, 8, 13513. [Google Scholar] [CrossRef]
Bavelas, A. Communication patterns in task-oriented groups. J. Acoust. Soc. Am. 1950, 22, 725–730. [Google Scholar] [CrossRef]
Morone, F.; Makse, H.A. Influence maximization in complex networks through optimal percolation. Nature 2015, 524, 65–68. [Google Scholar] [CrossRef] [PubMed]
Zdeborová, L.; Zhang, P.; Zhou, H.J. Fast and simple decycling and dismantling of networks. Sci. Rep. 2016, 6, 37954. [Google Scholar] [CrossRef]
Mugisha, S.; Zhou, H.J. Identifying optimal targets of network attack by belief propagation. Phys. Rev. E 2016, 94, 012305. [Google Scholar] [CrossRef]
Ren, X.L.; Gleinig, N.; Helbing, D.; Antulov-Fantulin, N. Generalized network dismantling. Proc. Natl. Acad. Sci. USA 2019, 116, 6554–6559. [Google Scholar] [CrossRef]

Figure 1. MaxShot Framework overview. The MaxShot framework employs a standard encoder–decoder architecture. The encoder, denoted as f and parameterized by GraphSAGE, transforms the raw graph input into a compact embedding. This embedding is subsequently processed by a linear decoder g to produce the Q-value for node selection. After node removal, the modified graph is compared with the original input graph to generate a reward, which is used to optimize both f and g using the Double DQN algorithm. This iterative process continues until performance converges.

Figure 2. Accumulated GCC size across different methods on BA graphs of different scales.

Figure 3. Accumulated maximum degree of GCC size across different methods on BA graphs of different scales.

Figure 4. Comparison of Pareto frontiers across different methods on real-world datasets. (a) Crime, (b) HI-II-14, (c) Digg, (d) Enron.

Figure 5. GCC size of training convergence curve for MaxShoton BA graph.

Figure 6. Maximum degree of GCC size of training convergence curve for MaxShoton BA graph.

Table 1. Acronyms used in the paper.

Acronyms	Meaning
GCC	Giant Connected Component
HDA	Highest Degree Algorithm
HBA	High-Betweenness Algorithm
HPRA	High PageRank Removal Algorithm
HCA	High Closeness Algorithm
CI	Collective Influence
BPD	Belief Propagation Decimation
RL	Reinforcement learning
MDP	Markov Decision Process
DQN	Deep Q-Learning
GNN	Graph Neural Networks
MLP	Multi-Layer Perceptron

Table 2. Accumulated maximum degree of GCC size across different methods on SBM graphs of different scales.

Size	30–50	50–100	100–200	200–300
HDA	0.0449 ± 0.0055	0.0267 ± 0.0047	0.0143 ± 0.0026	0.0086 ± 0.0009
HBA	0.0458 ± 0.0057	0.0270 ± 0.0048	0.0142 ± 0.0027	0.0086 ± 0.0009
HCA	0.0463 ± 0.0054	0.0284 ± 0.0048	0.0156 ± 0.0027	0.0097 ± 0.0010
HPRA	0.0450 ± 0.0055	0.0267 ± 0.0047	0.0140 ± 0.0027	0.0085 ± 0.0009
MaxShot	0.0444 ± 0.0055	0.0262 ± 0.0047	0.0139 ± 0.0027	0.0085 ± 0.0011

Table 3. Accumulated GCC size across different methods on SBM graphs of different scales.

Size	30–50	50–100	100–200	200–300
HDA	0.2967 ± 0.0174	0.2718 ± 0.0197	0.2571 ± 0.0124	0.2474 ± 0.0098
HBA	0.2978 ± 0.0159	0.2702 ± 0.0175	0.2514 ± 0.0117	0.2500 ± 0.0093
HCA	0.2936 ± 0.0178	0.2696 ± 0.0194	0.2534 ± 0.0135	0.2431 ± 0.0104
HPRA	0.2995 ± 0.0173	0.2774 ± 0.0204	0.2647 ± 0.0144	0.2560 ± 0.0115
MaxShot	0.2943 ± 0.0185	0.2676 ± 0.0196	0.2525 ± 0.0128	0.2421 ± 0.0097

Table 4. Dataset statistics and descriptions for various network datasets.

Data	#Nodes	#Edges	Diameter	Description
HI-II-14	4165	13,087	11	• Human interaction data from Space II.
Digg	29,652	84,781	12	• Interactions on the social news platform, Digg.
Enron	33,696	180,811	11	• Million-scale Email communication patterns within Enron Corporation.
Epinion	75,879	508,837	14	• Trust network from the online social network, Epinions, illustrating user interactions.

Table 5. Hyperparameters and their respective settings for the MaxShot framework.

Name	Value	Description
Learning rate	0.0001	The learning rate used by the Adam optimizer
Embedding dimension	64	Dimensions of node embedding vector
Layer iterations	5	Number of GraphSAGE layers
Q-learning steps	3	Number of Q-learning steps
Batch size	64	Number of mini-batch training samples

Table 6. Accumulated maximum degree of GCC size across different methods on four real-world datasets.

Method/Datasets	Crime	HI-II-14	Digg	Enron
HDA	0.11228	0.05757	0.08786	0.04527
CI	0.00708	0.00941	0.00169	0.00211
MinSum	0.00737	0.00928	0.00129	0.00165
BPD	0.00733	0.00953	0.00129	0.00212
CoreHD	0.00722	0.00951	0.00137	0.00227
GND	0.00695	0.00987	0.00163	0.00200
FINDER	0.00691	0.01137	0.00239	0.00278
MaxShot	0.00682	0.00577	0.00141	0.00024

Table 7. Accumulated GCC size across different methods on four real-world datasets.

Method/Datasets	Crime	HI-II-14	Digg	Enron
HDA	0.42099	0.27923	0.35407	0.38061
CI	0.36765	0.24695	0.28057	0.19005
MinSum	0.36644	0.24611	0.25162	0.16330
BPD	0.36595	0.24795	0.25176	0.19202
CoreHD	0.36927	0.24766	0.25398	0.19888
GND	0.37612	0.25405	0.26840	0.16785
FINDER	0.36655	0.29092	0.34271	0.24017
MaxShot	0.36467	0.24206	0.28325	0.04742

Table 8. Running times across different methods on synthetic datasets.

Method/Data Size	BA (30–50)	BA (50–100)	SBM (30–50)	SBM (50–100)	SBM (100–200)	SBM (200–300)
HDA	0.09 ± 0.04	0.19 ± 0.07	0.0010 ± 0.0005	0.0023 ± 0.0009	0.0087 ± 0.0030	0.0201 ± 0.0046
HBA	2.81 ± 1.12	15.79 ± 9.06	0.0664 ± 0.0274	0.3659 ± 0.1879	2.3348 ± 1.3352	5.9396 ± 2.4096
HCA	3.56 ± 1.47	18.91 ± 10.81	0.0286 ± 0.0109	0.1621 ± 0.0812	1.2341 ± 0.6042	7.3033 ± 3.5941
HPRA	8.80 ± 2.53	27.39 ± 12.07	0.0712 ± 0.0203	0.2408 ± 0.1165	0.7886 ± 0.2706	2.2153 ± 0.5425
MaxShot	0.02 ± 0.01	0.05 ± 0.02	0.0221 ± 0.0092	0.0485 ± 0.0138	0.1826 ± 0.0680	0.3564 ± 0.0691

Table 9. Running times across different methods on six real-world datasets.

Method/Dataset	HI-II-14	Digg	Enron	Epinions
HDA	0.78	117.23	139.30	311.70
CI	1.96	113.96	135.42	835.78
MinSum	2.03	113.32	134.82	876.25
CoreHD	2.03	112.73	136.72	893.14
BPD	2.02	114.24	136.08	895.85
GND	2.02	115.22	136.67	864.12
MaxShot	0.02	0.11	2.38	3.15

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, S.; Pu, T.; Zeng, L.; Wang, Y.; Cheng, H.; Liu, Z. Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component. Mathematics 2024, 12, 2766. https://doi.org/10.3390/math12172766

AMA Style

Liu S, Pu T, Zeng L, Wang Y, Cheng H, Liu Z. Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component. Mathematics. 2024; 12(17):2766. https://doi.org/10.3390/math12172766

Chicago/Turabian Style

Liu, Shixuan, Tianle Pu, Li Zeng, Yunfei Wang, Haoxiang Cheng, and Zhong Liu. 2024. "Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component" Mathematics 12, no. 17: 2766. https://doi.org/10.3390/math12172766

APA Style

Liu, S., Pu, T., Zeng, L., Wang, Y., Cheng, H., & Liu, Z. (2024). Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component. Mathematics, 12(17), 2766. https://doi.org/10.3390/math12172766

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Reinforcement Learning-Based Network Dismantling by Targeting Maximum-Degree Nodes in the Giant Connected Component

Abstract

1. Introduction

2. Related Works

3. Preliminaries and Notations

3.1. Network Dismantling

3.2. Graph Neural Networks

3.3. Reinforcement Learning

4. Methodology

4.1. Architecture of MaxShot

4.2. Training Algorithms

4.3. Computational Complexity Analysis

5. Experiments

5.1. Settings

5.2. Results on Synthetic Dataset

5.3. Results on Real-World Dataset

5.4. Other Analysis of MaxShot

5.4.1. Convergence of MaxShot

5.4.2. Running Time of Different Methods

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI