Gain and Pain in Graph Partitioning: Finding Accurate Communities in Complex Networks †
Abstract
:1. Introduction and Literature Review
1.1. Main Contributions
- (1)
- We synergistically integrate two highly effective community quality functions: (i) the connectivity-based metric introduced in [21] and (ii) the Max-Min Modularity [16]. Specifically, we employ the connectivity-based metric and the heuristic algorithm proposed in [21] to establish a novel complementary graph associated with the Max-Min Modularity and to obtain a high-quality initial solution to the Max-Min Modularity maximization problem.
- (2)
- We develop our community detection problem as an integer programming formulation of a revised Max-Min Modularity maximization problem (extending and improving the approach previously described in [15]).
- (3)
- To mitigate time complexity, by deriving analytical insights, we demonstrate the time efficiency of the employed heuristic method, thereby validating the efficiency of obtaining the refined complementary graph. Besides, we present a more compact yet equivalent model to the original integer formulation and demonstrate through theoretical arguments that the proposed sparse model yields the same optimal solution but with significantly fewer variables and constraints.
- (4)
- We design a bound for controlling the degree to which the algorithm is supposed to delve into optimally solving the model and when it should obtain assistance from the heuristic method. More specifically, taking the promising communities generated by the heuristic method as the initial solution to the sparse counterpart of the proposed Max-Min Modularity maximization problem’s integer model, we devise a procedure for balancing between heuristic and exact optimization based on the size of the given network. This facilitates the detection of communities even in extensive networks.
- (5)
- Through various experiments, we showcase the success of our proposed algorithm.
1.2. Paper Organization
2. Methodology
2.1. Heuristic but Fast
Algorithm 1: Creating Initial Communities (CIC) |
- 1.
- Sorting nodes by degree: sorting n nodes based on their degrees takes time.
- 2.
- Constructing : we iterate through the n nodes. For each node v, we need to check its distance to all nodes in . In other words, for each node v, we should verify that it is at least steps away from all nodes in . To do this, we can use Breadth-First Search (BFS) starting from v. BFS runs in time, where m is the number of edges in Graph G.
Algorithm 2: Updating Communities (UC) |
* //small constant ϵ guarantees that running time for this stage remains polynomial [32,33]. |
2.2. Toward Exactness
2.3. The Proposed Hybrid Approach
Algorithm 3: The proposed algorithm for community discovery |
3. Evaluation of Accuracy
3.1. Normalized Mutual Information (NMI)
3.2. Experiments
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Jalali, M.; Tsotsalas, M.; Wöll, C. MOFSocialNet: Exploiting Metal-Organic Framework Relationships via Social Network Analysis. Nanomaterials 2022, 12, 704. [Google Scholar] [CrossRef] [PubMed]
- Jiang, L.; Shi, L.; Liu, L.; Yao, J.; Yousuf, M.A. User interest community detection on social media using collaborative filtering. Wirel. Netw. 2019, 28, 1169–1175. [Google Scholar] [CrossRef]
- Hu, L.; Zhang, J.; Pan, X.; Yan, H.; You, Z.H. HiSCF: Leveraging higher-order structures for clustering analysis in biological networks. Bioinformatics 2021, 37, 542–550. [Google Scholar] [CrossRef] [PubMed]
- Zhao, D.; Li, J.; Jiang, Z. Effects of link perturbation on network modularity for community detections in complex network systems. Mod. Phys. Lett. B 2021, 35, 2150214. [Google Scholar] [CrossRef]
- Gawande, N.; Ghosh, S.; Halappanavar, M.; Tumeo, A.; Kalyanaraman, A. Towards scaling community detection on distributed-memory heterogeneous systems. Parallel Comput. 2022, 111, 102898. [Google Scholar] [CrossRef]
- Ramakrishna, R.; Scaglione, A. Grid-graph signal processing (grid-GSP): A graph signal processing framework for the power grid. IEEE Trans. Signal Process. 2021, 69, 2725–2739. [Google Scholar] [CrossRef]
- Magnani, M.; Hanteer, O.; Interdonato, R.; Rossi, L.; Tagarelli, A. Community detection in multiplex networks. ACM Comput. Surv. (CSUR) 2021, 54, 1–35. [Google Scholar] [CrossRef]
- Girvan, M.; Newman, M.E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 2002, 99, 7821–7826. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Zaïane, O.R.; Goebel, R. Detecting communities in social networks using max-min modularity. In Proceedings of the 2009 SIAM International Conference on Data Mining, Sparks, NV, USA, 30 April–2 May 2009; pp. 978–989. [Google Scholar]
- Boudebza, S.; Cazabet, R.; Azouaou, F.; Nouali, O. OLCPM: An online framework for detecting overlapping communities in dynamic social networks. Comput. Commun. 2018, 123, 36–51. [Google Scholar] [CrossRef]
- Shen, H.W.; Cheng, X.Q.; Guo, J.F. Quantifying and identifying the overlapping community structure in networks. J. Stat. Mech. Theory Exp. 2009, 2009, P07042. [Google Scholar] [CrossRef]
- Devi, J.C.; Poovammal, E. An analysis of overlapping community detection algorithms in social networks. Procedia Comput. Sci. 2016, 89, 349–358. [Google Scholar] [CrossRef]
- Newman, M.E. Fast algorithm for detecting community structure in networks. Phys. Rev. E 2004, 69, 066133. [Google Scholar] [CrossRef]
- Raghavan, U.N.; Albert, R.; Kumara, S. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 2007, 76, 036106. [Google Scholar] [CrossRef] [PubMed]
- Ferdowsi, A.; Chenary, M.D. Toward an Optimal Solution to the Network Partitioning Problem. In Proceedings of the 2023 18th Conference on Computer Science and Intelligence Systems (FedCSIS), Warsaw, Poland, 17–20 September 2023; pp. 111–117. [Google Scholar]
- Ferdowsi, A.; Khanteymoori, A. Discovering communities in networks: A linear programming approach using max-min modularity. In Proceedings of the 2021 16th Conference on Computer Science and Intelligence Systems (FedCSIS), Sofia, Bulgaria, 2–5 September 2021; pp. 329–335. [Google Scholar]
- Schaeffer, S.E. Graph clustering. Comput. Sci. Rev. 2007, 1, 27–64. [Google Scholar] [CrossRef]
- Cheikh, S.; Sara, B.; Sara, Z. A Hybrid Heuristic Community Detection Approach. In Proceedings of the 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA), Novi Sad, Serbia, 24–26 August 2020; pp. 1–7. [Google Scholar]
- Gao, Y.; Yu, X.; Zhang, H. Overlapping community detection by constrained personalized PageRank. Expert Syst. Appl. 2021, 173, 114682. [Google Scholar] [CrossRef]
- Sahu, S.; Rani, T.S. A neighbour-similarity based community discovery algorithm. Expert Syst. Appl. 2022, 206, 117822. [Google Scholar] [CrossRef]
- Ferdowsi, A.; Dehghan Chenary, M.; Khanteymoori, A. Tscda: A dynamic two-stage community discovery approach. Soc. Netw. Anal. Min. 2022, 12, 46. [Google Scholar] [CrossRef]
- Chakraborty, T.; Dalmia, A.; Mukherjee, A.; Ganguly, N. Metrics for community analysis: A survey. ACM Comput. Surv. (CSUR) 2017, 50, 54. [Google Scholar] [CrossRef]
- Radicchi, F.; Castellano, C.; Cecconi, F.; Loreto, V.; Parisi, D. Defining and identifying communities in networks. Proc. Natl. Acad. Sci. USA 2004, 101, 2658–2663. [Google Scholar] [CrossRef]
- Wei, Y.C.; Cheng, C.K. Towards efficient hierarchical designs by ratio cut partitioning. In Proceedings of the 1989 IEEE International Conference on Computer-Aided Design. Digest of Technical Papers, Santa Clara, CA, USA, 5–9 November 1989; pp. 298–301. [Google Scholar]
- Shi, J.; Malik, J. Normalized cuts and image segmentation. Dep. Pap. (CIS) 2000, emph22, 888–905. [Google Scholar]
- Flake, G.W.; Lawrence, S.; Giles, C.L. Efficient identification of web communities. In Proceedings of the sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, 20–23 August 2000; pp. 150–160. [Google Scholar]
- Newman, M.E. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA 2006, 103, 8577–8582. [Google Scholar] [CrossRef]
- Alozie, G.U.; Arulselvan, A.; Akartunalı, K.; Pasiliao, E.L., Jr. Efficient methods for the distance-based critical node detection problem in complex networks. Comput. Oper. Res. 2021, 131, 105254. [Google Scholar] [CrossRef]
- Hamilton, W.L. Graph representation learning. Synth. Lect. Artifical Intell. Mach. Learn. 2020, 14, 1–159. [Google Scholar]
- Orman, G.K.; Labatut, V. A comparison of community detection algorithms on artificial networks. In Proceedings of the International Conference on Discovery Science, Porto, Portugal, 3–5 October 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 242–256. [Google Scholar]
- Ferdowsi, A.; Abhari, A. Generating high-quality synthetic graphs for community detection in social networks. In Proceedings of the 2020 Spring Simulation Conference, Fairfax, VA, USA, 18 May 2020; pp. 1–10. [Google Scholar]
- Arya, V.; Garg, N.; Khandekar, R.; Meyerson, A.; Munagala, K.; Pandit, V. Local search heuristics for k-median and facility location problems. SIAM J. Comput. 2004, 33, 544–562. [Google Scholar] [CrossRef]
- Gupta, A.; Tangwongsan, K. Simpler analyses of local search algorithms for facility location. arXiv 2008, arXiv:0809.2554. [Google Scholar]
- Ferdowsi, A. An Integer Programming Approach Reinforced by a Message-passing Procedure for Detecting Dense Attributed Subgraphs. In Proceedings of the 2022 17th Conference on Computer Science and Intelligence Systems (FedCSIS), Sofia, Bulgaria, 4–7 September 2022; pp. 569–576. [Google Scholar]
- Ghosh, S.; Halappanavar, M.; Tumeo, A.; Kalyanaraman, A.; Lu, H.; Chavarria-Miranda, D.; Khan, A.; Gebremedhin, A. Distributed louvain algorithm for graph community detection. In Proceedings of the 2018 IEEE international parallel and distributed processing symposium (IPDPS), Vancouver, BC, Canada, 21–25 May 2018; pp. 885–895. [Google Scholar]
- Aloise, D.; Caporossi, G.; Hansen, P.; Liberti, L.; Perron, S.; Ruiz, M. Modularity maximization in networks by variable neighborhood search. Graph Partitioning Graph Clust. 2012, 588, 113. [Google Scholar]
- Xie, J.R.; Wang, B.H. Modularity-like objective function in annotated networks. Front. Phys. 2017, 12, 128903. [Google Scholar] [CrossRef]
- Lancichinetti, A.; Fortunato, S.; Radicchi, F. Benchmark graphs for testing community detection algorithms. Phys. Rev. E 2008, 78, 046110. [Google Scholar] [CrossRef] [PubMed]
- Zachary, W.W. An information flow model for conflict and fission in small groups. J. Anthropol. Res. 1977, 33, 452–473. [Google Scholar] [CrossRef]
- Gil-Mendieta, J.; Schmidt, S. The political network in Mexico. Soc. Netw. 1996, 18, 355–381. [Google Scholar] [CrossRef]
- Lusseau, D.; Schneider, K.; Boisseau, O.J.; Haase, P.; Slooten, E.; Dawson, S.M. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol. 2003, 54, 396–405. [Google Scholar] [CrossRef]
- Guimera, R.; Danon, L.; Diaz-Guilera, A.; Giralt, F.; Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 2003, 68, 065103. [Google Scholar] [CrossRef] [PubMed]
- Mahajan, A.; Kaur, M. Various approaches of community detection in complex networks: A glance. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 2016, 8, 35–41. [Google Scholar] [CrossRef]
- Meghanathan, N. A greedy algorithm for neighborhood overlap-based community detection. Algorithms 2016, 9, 8. [Google Scholar] [CrossRef]
- Batagelj, V.; Mrvar, A. Pajek Datasets. 2009. Available online: http://vlado.fmf.uni-lj.si/pub/networks/data/ (accessed on 20 May 2024).
- Cangelosi, A.; Parisi, D. A neural network model of caenorhabditis elegans: The circuit of touch sensitivity. Neural Process. Lett. 1997, 6, 91–98. [Google Scholar] [CrossRef]
- Mrvar, A.; Batagelj, V. Pajek: Programs for Analysis and Visualization of Very Large Networks: Reference Manual: List of Commands with Short Explanation Version 5.10; Mrvar, A., Ed.; University of Ljubljana: Ljubljana, Slovenia, 2020. [Google Scholar]
- Chand, S.; Mehta, S. Community detection using nature inspired algorithm. In Hybrid Intelligence for Social Networks; Springer: Berlin/Heidelberg, Germany, 2017; pp. 47–76. [Google Scholar] [CrossRef]
- Leskovec, J.; Kleinberg, J.; Faloutsos, C. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data (TKDD) 2007, 1, 2. [Google Scholar] [CrossRef]
- Mkhitaryan, K.; Mothe, J.; Haroutunian, M. Detecting communities from networks: Comparison of algorithms on real and synthetic networks. Int. J. Inf. Theor. Appl. 2019, 26, 231–267. [Google Scholar]
- Yang, J.; Leskovec, J. Defining and evaluating network communities based on ground-truth. Knowl. Inf. Syst. 2015, 42, 181–213. [Google Scholar] [CrossRef]
- Shi, X.; Lu, H.; He, Y.; He, S. Community detection in social network with pairwisely constrained symmetric non-negative matrix factorization. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, France, 25–28 August 2015; pp. 541–546. [Google Scholar]
- Yip, K.Y.; Cheung, D.W.; Ng, M.K. Harp: A practical projected clustering algorithm. IEEE Trans. Knowl. Data Eng. 2004, 16, 1387–1397. [Google Scholar] [CrossRef]
- Danon, L.; Diaz-Guilera, A.; Duch, J.; Arenas, A. Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005, 2005, P09008. [Google Scholar] [CrossRef]
- Rosvall, M.; Bergstrom, C.T. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. USA 2008, 105, 1118–1123. [Google Scholar] [CrossRef] [PubMed]
- Xu, G.; Guo, J.; Yang, P. TNS-LPA: An Improved Label Propagation Algorithm for Community Detection Based on Two-Level Neighbourhood Similarity. IEEE Access 2020, 9, 23526–23536. [Google Scholar] [CrossRef]
Function | Definition | |
---|---|---|
IC | Internal density [23] | |
Edge inside [23] | ||
Average degree [23] | ||
EC | Expansion [23] | |
Cut Ratio [24] | ||
MIE | Conductance [25] | |
Normalized Cut [25] | ||
Out-Degree Fraction [26] | ||
Flake-ODF [26] |
ID | Network | n | m |
---|---|---|---|
1 | Zachary’s karate club [39] | 34 | 78 |
2 | Mexican Politicians [40] | 35 | 117 |
3 | Dolphin network [41] | 62 | 159 |
4 | Les Miserables [41] | 77 | 254 |
5 | p53 protein [42] | 104 | 226 |
6 | Books about U.S. politics [43] | 105 | 441 |
7 | American college football [8] | 115 | 613 |
8 | Citation graph drawing [44] | 311 | 640 |
9 | USAir97 [45] | 332 | 2126 |
10 | C. Elegans [46] | 453 | 2025 |
11 | Erdos collaboration [47] | 472 | 1314 |
12 | Electronic circuit [48] | 512 | 819 |
13 | Email-Eu-core [49] | 986 | 16,064 |
14 | Amazon network [50,51] | 334,863 | 925,872 |
15 | LiveJournal social network [51,52] | 3,997,962 | 34,681,189 |
Network ID | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
0 | 0 | 0 | 2 | 2 | 2 | 2 | 3 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | |
2 | 2 | 3 | 3 | 3 | 3 | 5 | 6 | 8 | 8 | 11 | 15 | 18 | 25 | 29 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ferdowsi, A.; Dehghan Chenary, M. Gain and Pain in Graph Partitioning: Finding Accurate Communities in Complex Networks. Algorithms 2024, 17, 226. https://doi.org/10.3390/a17060226
Ferdowsi A, Dehghan Chenary M. Gain and Pain in Graph Partitioning: Finding Accurate Communities in Complex Networks. Algorithms. 2024; 17(6):226. https://doi.org/10.3390/a17060226
Chicago/Turabian StyleFerdowsi, Arman, and Maryam Dehghan Chenary. 2024. "Gain and Pain in Graph Partitioning: Finding Accurate Communities in Complex Networks" Algorithms 17, no. 6: 226. https://doi.org/10.3390/a17060226
APA StyleFerdowsi, A., & Dehghan Chenary, M. (2024). Gain and Pain in Graph Partitioning: Finding Accurate Communities in Complex Networks. Algorithms, 17(6), 226. https://doi.org/10.3390/a17060226