Dynamic Community Detection Based on Evolutionary DeepWalk
Abstract
:1. Introduction
- High-dimensional characteristics of the network. In real-world complex networks, the number of nodes is usually tens of thousands. The representation of the network in computer applications mainly uses two basic structures: Adjacency list and adjacency matrix. The storage mode of the adjacency list and matrix are not conducive to computer calculation and need large storage [12]. Additionally, the network is mostly sparse, which will cause a waste of storage and computing resources.
- Dynamic characteristics of the network. The nodes and edges in the network often change over time. The dynamic changes of the network will lead to corresponding changes in the structure of the network, but most of the nodes and edges in the network have not changed. How to balance the historical structure information and the current structure information in the network has become a major challenge.
- According to the idea of evolutionary clustering, we propose the concept of evolutionary DeepWalk, and extend the application of static graph embedding algorithm DeepWalk to the field of dynamic graph analysis;
- We propose a method of dynamic community detection based on evolutionary DeepWalk, which combines evolutionary DeepWalk with the clustering algorithm;
- We compare our algorithm with five algorithms on ten datasets, including six artificial dynamic networks and four real-world dynamic networks. The experimental results demonstrate that DEDW improves accuracy, showing that network embedding is promising for a dynamic community detection.
2. Related Work
2.1. Graph Embedding
2.2. Community Detection
3. The DeDW Algorithm
3.1. Problem Description and Symbol Definition
3.2. Evolutionary DeepWalk
- Nodes Addition. At time t, if a new node i is added, there will be a new feature vector CFit in the feature vector of CFt. To use Equation (2), we must ensure that the dimension of HFt and CFt are n × d. HFt does not have node i, so we need to deal with HFt. Our method is to add zero vectors to the position of the corresponding new node i as the reserved position in HFt, so that there is a position of the corresponding new node i vector when calculating ComFt.
- Nodes Deletion. (b). At time t, if there is a node i leaving the network, the eigenvector of node i does not exist in the eigenvector of CFt. It only needs to delete the eigenvector HFit in HFt to keep the same number of nodes and dimensions of HFt and CFt, which is convenient for calculating ComFt.
Algorithm 1: evolutionary DeepWalk (EDW) | |
Input: Historical Features matrix HFt; Current Features matrix CFt | |
Output: Comprehensive Features matrix ComFt | |
1 | compute AddNodes form Equation (3) //get nodes which are added |
2 | compute DelNodes from Equation (4) //get nodes which are deleted |
3 | if AddNodes ≠ Ø do //Nodes Addition |
4 | HFt. Add( ∗ AddNodes) |
5 | end if |
6 | if DelNodes ≠ Ø do //Nodes Deletion |
7 | HFt. Del(DelNodes) |
8 | end if |
9 | ComFt = αHFt + (1 − α) CFt |
10 | return ComFt |
3.3. Dynamic Community Detection
Algorithm 2: Dynamic community detection framework based on EDW (DEDW) | |
Input: Dynamic network G = {g1, ⋯, gt, ⋯,gT}; Weight parameter α | |
Output: Whole community structure C = {C1, C2, ⋯, CT} | |
1 | C = Ø |
2 | if t = 1 do // deal with the initial network |
3 | HF2 = ComF1 = HF1 = CF1 = DeepWalk(g1) |
4 | C1 = K-Means(ComF1) |
5 | C.append(C1) |
6 | end if |
7 | fort = 2 to T do |
8 | CFt = DeepWalk(gt) |
9 | HFt = ComFt−1 |
10 | ComFt = EvoGE(CFt, HFt) |
11 | Ct = K-Means(ComFt) |
12 | C.append(Ct) |
13 | end for |
14 | return C |
- Firstly, we perform community detection on g1. Then, we use graph embedding to obtain the comprehensive feature matrix ComF1 and then use the clustering algorithm to cluster the feature matrix to obtain the community result at t = 1.
- Then, observing g2, we can see that the network only adds a new node i in t = 2. First, we use DeepWalk to embed g2 to obtain the current embedding feature matrix CF2 and use ComF1 as the historical embedding feature matrix HF2 in this step. Since there is no corresponding feature vector of node i in HF2, it is necessary to judge the change of nodes in g2. According to the row number of feature vectors, we can get the embedding features vector of node i added in CF2, so we can set the node i vectors with 0 vectors as the features in the corresponding position of HF2. Then, we calculate ComF2 according to Equation (2) and use the clustering algorithm for ComF2 to obtain the community structure of g2 when t = 2.
- Finally, we try to deal with the changing of nodes and their relationship in g3. As shown in Figure 2, the main change in g3 is that node j leaves the current network, and the connections between this node and other nodes are eliminated at the same time. Firstly, we perform graph embedding on g3 to obtain the current embedding feature matrix CF3. Since there is no feature vector of node j in CF3, the relevant information of node j needs to be deleted from HF3. Then, we use the Equation (2) to obtain ComF3 and use the clustering algorithm to obtain the community information of g3.
- The time complexity of DeepWalk;
- The time complexity of evolutionary graph embedding;
- The time complexity of K-means;
- The number of temporal snapshot networks.
4. Experimental Analysis
4.1. Datasets and Baseline Methods
4.1.1. Datasets
4.1.2. Baseline Methods
4.2. Network Evaluation
- Analyzing the influence of the parameter α in EvoGE on the community detection results.
- Analyzing the average performance of each algorithm on artificial and real-world datasets.
4.2.1. Parameter Analysis
4.2.2. Overall Network Analysis
- When μ ≤ 0.5, DeepWalk achieves the best results. It has more outstanding processing performance in dealing with relatively simple networks than other dynamic graph embedding algorithms and community detection algorithms.
- When μ ≥ 0.6, DEDW has the best performance. It can be inferred that our method has a good dynamic community partition effect for the complex networks.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Duan, X.Y.; Yuan, G.; Meng, F.R. Dynamic Community Detection: A Survey. J. Front. Comput. Sci. 2021, 15, 612–630. [Google Scholar] [CrossRef]
- Wang, Y.; Piao, C.; Liu, C.H.; Zhou, C.; Tang, J. Modeling User Interests with Online Social Network Influence by Memory Augmented Sequence Learning. IEEE Trans. Netw. Sci. Eng. 2021, 8, 541–554. [Google Scholar] [CrossRef]
- Bugnon, L.A.; Yones, C.; Milone, D.H.; Stegmayer, G. Deep Neural Architectures for Highly Imbalanced Data in Bioinformatics. IEEE Trans. Neural Netw. Learning Syst. 2020, 31, 2857–2867. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Ma, X.; Sun, P.; Gong, M. An Integrative Framework of Heterogeneous Genomic Data for Cancer Dynamic Modules Based on Matrix Decomposition. IEEE/ACM Trans. Comput. Biol. Bioinf. 2022, 19, 305–316. [Google Scholar] [CrossRef] [PubMed]
- Zhuang, H.; Sun, Y.; Tang, J.; Zhang, J.; Sun, X. Influence Maximization in Dynamic Social Networks. In Proceedings of the 13th IEEE International Conference on Data Mining, Dallas, TX, USA, 8–11 December 2013; pp. 1313–1318. [Google Scholar] [CrossRef]
- Tang, J.; Fong, A.C.M.; Wang, B.; Zhang, J. A Unified Probabilistic Framework for Name Disambiguation in Digital Library. IEEE Trans. Knowl. Data Eng. 2012, 24, 975–987. [Google Scholar] [CrossRef]
- Newman, M.E.J. Detecting Community Structure in Networks. Eur. Phys. J. B. 2004, 38, 321–330. [Google Scholar] [CrossRef]
- Calderoni, F.; Brunetto, D.; Piccardi, C. Communities in Criminal Networks: A Case Study. Soc. Netw. 2017, 48, 116–125. [Google Scholar] [CrossRef]
- Taya, F.; de Souza, J.; Thakor, N.V.; Bezerianos, A. Comparison Method for Community Detection on Brain Networks from Neuroimaging Data. Appl. Netw. Sci. 2016, 1, 8. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Rezaeimehr, F.; Moradi, P.; Ahmadian, S.; Qader, N.N.; Jalili, M. TCARS: Time- and Community-Aware Recommendation System. Future Gener. Comput. Syst. 2018, 78, 419–429. [Google Scholar] [CrossRef]
- Soundarajan, S.; Hopcroft, J. Using Community Information to Improve the Precision of Link Prediction Methods. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 December 2012; pp. 607–608. [Google Scholar] [CrossRef]
- Liu, Q.D. Research on Community Detection Based on Network Embedding. Ph.D. Dissertation, Lanzhou University, Lanzhou, China, 2018. [Google Scholar]
- Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online Learning of Social Representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710. [Google Scholar] [CrossRef] [Green Version]
- Qi, Z.W.; Wang, J.H.; Yue, K.; Qiao, S.J.; Li, J. Methods and Applications of Graph Embedding: A Survey. Act. Electron. Sinica 2020, 48, 808–818. [Google Scholar] [CrossRef]
- Cao, Y.; Dong, Y.M.; Wu, S.Q.; Chen, H.H.; Qian, J.B.; Pan, S.L. Dynamic Network Representation Learning: A Review. Act. Electron. Sinica 2020, 48, 2047–2059. [Google Scholar] [CrossRef]
- Grover, A.; Leskovec, J. Node2vec: Scalable Feature Learning for Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar] [CrossRef] [Green Version]
- Wang, D.; Cui, P.; Zhu, W. Structural Deep Network Embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234. [Google Scholar]
- Goyal, P.; Kamra, N.; He, X.; Liu, Y. DynGEM: Deep Embedding Method for Dynamic Graphs. In Proceedings of the 3rd International Workshop on Representation Learning for Graphs, Melbourne, Australia, 19–25 August 2017; pp. 1–8. [Google Scholar] [CrossRef]
- Goyal, P.; Chhetri, S.R.; Canedo, A. Dyngraph2vec: Capturing Network Dynamics Using Dynamic Graph Representation Learning. Knowl.-Based Syst. 2020, 187, 104816. [Google Scholar] [CrossRef]
- Cavallari, S.; Zheng, V.W.; Cai, H.; Chang, K.C.C.; Cambria, E. Learning Community Embedding with Community Detection and Node Embedding on Graphs. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 377–386. [Google Scholar] [CrossRef]
- Jia, Y.; Zhang, Q.; Zhang, W.; Wang, X. CommunityGAN: Community Detection with Generative Adversarial Nets. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 784–794. [Google Scholar] [CrossRef] [Green Version]
- Chakrabarti, D.; Kumar, R.; Tomkins, A. Evolutionary Clustering. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, 20–23 August 2006; pp. 554–560. [Google Scholar] [CrossRef]
- Lin, Y.R.; Chi, Y.; Zhu, S.; Sundaram, H.; Tseng, B.L. Facetnet: A Framework for Analyzing Communities and Their Evolutions in Dynamic Networks. In Proceedings of the 17th International Conference on World Wide Web, Beijing, China, 23–27 April 2008; pp. 685–694. [Google Scholar] [CrossRef]
- Ma, X.; Li, D.; Tan, S.; Huang, Z. Detecting Evolving Communities in Dynamic Networks Using Graph Regularized Evolutionary Nonnegative Matrix Factorization. Phys. A. 2019, 530, 121279. [Google Scholar] [CrossRef]
- Agarwal, P.; Verma, R.; Agarwal, A.; Chakraborty, T. DyPerm: Maximizing Permanence for Dynamic Community Detection. In Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, Melbourne, Australia, 15–18 May 2018; pp. 437–449. [Google Scholar] [CrossRef] [Green Version]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. In Proceedings of the 16th International Conference on Learning Representations, Scottsdale, AZ, USA, 2–4 May 2013; pp. 1–12. [Google Scholar] [CrossRef]
- Cai, H.; Zheng, V.W.; Chang, K.C.C. A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications. IEEE Trans. Knowl. Data Eng. 2018, 30, 1616–1637. [Google Scholar] [CrossRef] [Green Version]
- Cui, P.; Wang, X.; Pei, J.; Zhu, W. A Survey on Network Embedding. IEEE Trans. Knowl. Data Eng. 2019, 31, 833–852. [Google Scholar] [CrossRef] [Green Version]
- Greene, D.; Doyle, D.; Cunningham, P. Tracking the Evolution of Communities in Dynamic Social Networks. In Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, 9–11 August 2010; pp. 176–183. [Google Scholar] [CrossRef] [Green Version]
- Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast Unfolding of Communities in Large Networks. J. Stat. Mech. 2008, 2008, P10008. [Google Scholar] [CrossRef] [Green Version]
- Goyal, P.; Chhetri, S.R.; Mehrabi, N.; Ferrara, E.; Canedo, A. DynamicGEM: A Library for Dynamic Graph Embedding Methods. arXiv 2018. [Google Scholar] [CrossRef]
Symbols | Descriptions |
---|---|
G | dynamic networks |
g | temporal snapshot network |
t | time snapshot |
T | the number of temporal snapshot networks |
m | the number of edges |
n | the number of nodes |
d | the dimension of embedding vector |
C | community aggregation of dynamic network |
HFit | the embedding vector of node i in historical embedding matrix at time t |
CFit | the embedding vector of node i in current embedding matrix at time t |
ComFt | the comprehensive features matrix of network embedding at time t |
Dataset Name | Descriptions |
---|---|
PS | The face-to-face communication network of 242 teachers and students in a primary school. |
CW | A contact network of 145 employees in an office building in France. |
HS2011 | The network of 126 teachers and students in Marseille high school in France within a certain 4-days period in December 2011 |
HS2012 | The contact network of 180 teachers and students of Marseille High School in France within a 7-day period in November 2012. |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Qu, S.; Du, Y.; Zhu, M.; Yuan, G.; Wang, J.; Zhang, Y.; Duan, X. Dynamic Community Detection Based on Evolutionary DeepWalk. Appl. Sci. 2022, 12, 11464. https://doi.org/10.3390/app122211464
Qu S, Du Y, Zhu M, Yuan G, Wang J, Zhang Y, Duan X. Dynamic Community Detection Based on Evolutionary DeepWalk. Applied Sciences. 2022; 12(22):11464. https://doi.org/10.3390/app122211464
Chicago/Turabian StyleQu, Song, Yuqing Du, Mu Zhu, Guan Yuan, Jining Wang, Yanmei Zhang, and Xiangyu Duan. 2022. "Dynamic Community Detection Based on Evolutionary DeepWalk" Applied Sciences 12, no. 22: 11464. https://doi.org/10.3390/app122211464
APA StyleQu, S., Du, Y., Zhu, M., Yuan, G., Wang, J., Zhang, Y., & Duan, X. (2022). Dynamic Community Detection Based on Evolutionary DeepWalk. Applied Sciences, 12(22), 11464. https://doi.org/10.3390/app122211464