Near-Data Source Graph Partitioning
Abstract
:1. Introduction
2. Related Works
3. Near-Data Graph Partitioning
4. Implementation
Algorithm 1 NDGP_Random |
|
Algorithm 2 NDGP_Greedy |
|
5. Experiments
5.1. Experiment Platform
5.2. Benchmarking Algorithms and Data Set
5.3. Experiment Design and Results
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sakr, S.; Pardede, E. Graph Data Management: Techniques and Applications; Information Science Reference; IGI Publishing: Hershey, PA, USA, 2011. [Google Scholar]
- Rathore, M.M.U.; Gul, M.J.J.; Paul, A.; Khan, A.A.; Ahmad, R.W.; Rodrigues, J.; Bakiras, S. Multilevel Graph-Based Decision Making in Big Scholarly Data: An Approach to Identify Expert Reviewer, Finding Quality Impact Factor, Ranking Journals and Researchers. IEEE Trans. Emerg. Top. Comput. 2018, 9, 280–292. [Google Scholar] [CrossRef]
- Xue, Z.; Chen, J.X.; Zhao, Y.; Medvar, B.; Knepper, M.A. Data Integration in Physiology Using Bayes’ Rule and Minimum Bayes’ Factors: Deubiquitylating Enzymes in the Renal Collecting Duct. Physiol. Genom. 2016, 49, 151–159. [Google Scholar] [CrossRef]
- Chang, F.; Zhang, B.; Zhao, Y.; Wu, S.; Yoshigoe, K. Overlapping Community Detecting Based on Complete Bipartite Graphs in Micro-bipartite Network Bi-EgoNet. IEEE Access 2019, 7, 91488–91498. [Google Scholar] [CrossRef]
- Ullah, F.; Srivastava, G.; Ullah, S.; Yoshigoe, K.; Zhao, Y. NIDS-VSB: Network Intrusion Detection System for VANET Using Spark-Based Big Data Optimization and Transfer Learning. IEEE Trans. Consum. Electron. 2024, 70, 1798–1809. [Google Scholar] [CrossRef]
- Schulz, C.; Strash, D. Graph partitioning: Formulations and applications to big data. In Encyclopedia of Big Data Technologies; Springer: Cham, Switzerland, 2018; pp. 1–7. [Google Scholar]
- Sardianos, C.; Papadatos, G.B.; Varlamis, I. Optimizing Parallel Collaborative Filtering Approaches for Improving Recommendation Systems Performance. Information 2019, 10, 155. [Google Scholar] [CrossRef]
- Sardianos, C.; Varlamis, I.; Eirinaki, M. Scaling Collaborative Filtering to Large-Scale Bipartite Rating Graphs Using Lenskit and Spark. In Proceedings of the 2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService), Redwood City, CA, USA, 6–9 April 2017. [Google Scholar]
- Mcsherry, F. Spectral Partitioning of Random Graphs. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, Newport Beach, CA, USA, 8–11 October 2002. [Google Scholar]
- Chang, F.; Zhang, B.; Wu, S.; Zhao, Y.L.; Li, B.; Maimaitiriyimu, J. OCDAD: An Overlapping Community Detecting Algorithm using Attention Degree in Directed Ex-EgoNet. In Proceedings of the 2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Fukuoka, Japan, 5–8 August 2019; pp. 442–448. [Google Scholar]
- Chang, F.; Zhang, B.; Li, H.; Huang, M.; Li, B.; Zhao, Y. Discovering overlapping communities in ego-nets using friend intimacy. J. Intell. Fuzzy Syst. Appl. Eng. Technol. 2019, 36, 5167–5175. [Google Scholar] [CrossRef]
- Zhao, Y.P.; Dai, X.; Chen, Y.; Zhang, C.; Chen, L.; Zhao, Y. Bilevel fuzzy clustering via adaptive similarity graphs fusion. Inf. Sci. 2024, 662, 120281. [Google Scholar] [CrossRef]
- Mathur, R.P.; Sharma, M. Graph-Based Application Partitioning Approach for Computational Offloading in Mobile Cloud Computing. Recent Adv. Comput. Sci. Commun. (Former. Recent Patents Comput. Sci.) 2021, 14, 92–99. [Google Scholar] [CrossRef]
- Martella, C.; Logothetis, D.; Loukas, A.; Siganos, G. Spinner: Scalable graph partitioning in the cloud. In Proceedings of the 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, CA, USA, 19–22 April 2017; pp. 1083–1094. [Google Scholar]
- Schloegel, K.; Karypis, G.; Kumar, V. Graph Partitioning for High Performance Scientific Simulations; Army High Performance Computing Research Center: Minneapolis, MN, USA, 2000; pp. 491–541. [Google Scholar]
- Bichot, C.E.; Siarry, P. Graph Partitioning; WILEY-ISTE: Hoboken, NJ, USA, 2011. [Google Scholar]
- Walshaw, C.; Cross, M. JOSTLE: Parallel multilevel graph-partitioning software—An overview. Mesh Partitioning Tech. Domain Decompos. Tech. 2007, 10, 27–58. [Google Scholar]
- Buluç, A.; Meyerhenke, H.; Safro, I.; Sanders, P.; Schulz, C. Recent advances in graph partitioning. In Algorithm Engineering; Springer: Cham, Switzerland, 2016; pp. 117–158. [Google Scholar]
- Wu, Z.; Karimi, H.; Dang, C. An approximation algorithm for graph partitioning via deterministic annealing neural network. Neural Netw. 2019, 117, 191–200. [Google Scholar] [CrossRef]
- Li, X.; Zhang, M.; Chen, K.; Wu, Y.; Qian, X.; Zheng, W. 3-d partitioning for large-scale graph processing. IEEE Trans. Comput. 2020, 70, 111–127. [Google Scholar] [CrossRef]
- Sanders, P.; Schulz, C. KaHIP v2. 00–Karlsruhe High Quality Partitioning User Guide. arXiv 2013, arXiv:1311.1714. [Google Scholar]
- Karypis, G.; Kumar, V. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 1998, 20, 359–392. [Google Scholar] [CrossRef]
- Chevalier, C.; Pellegrini, F. PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 2008, 34, 318–331. [Google Scholar] [CrossRef]
- LaSalle, D.; Karypis, G. Multi-threaded graph partitioning. In Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, Cambridge, MA, USA, 20–24 May 2013; pp. 225–236. [Google Scholar]
- Holtgrewe, M.; Sanders, P.; Schulz, C. Engineering a scalable high quality graph partitioner. In Proceedings of the 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Atlanta, GA, USA, 19–23 April 2010; pp. 1–12. [Google Scholar]
- Meyerhenke, H. Shape optimizing load balancing for MPI-parallel adaptive numerical simulations. Graph Partitioning Graph Clust. 2012, 588, 67. [Google Scholar]
- Meyerhenke, H.; Sanders, P.; Schulz, C. Parallel graph partitioning for complex networks. IEEE Trans. Parallel Distrib. Syst. 2017, 28, 2625–2638. [Google Scholar] [CrossRef]
- Zhao, Y.; Yoshigoe, K.; Xie, M.; Zhou, S.; Seker, R.; Bian, J. Lightgraph: Lighten communication in distributed graph-parallel processing. In Proceedings of the 2014 IEEE International Congress on Big Data, Anchorage, AK, USA, 27 June–2 July 2014; pp. 717–724. [Google Scholar]
- Zhao, Y.; Yoshigoe, K.; Xie, M.; Bian, J.; Xiong, K. L-PowerGraph: A lightweight distributed graph-parallel communication mechanism. J. Supercomput. 2018, 76, 1850–1879. [Google Scholar] [CrossRef]
- Huang, J.; Abadi, D.J. Leopard: Lightweight edge-oriented partitioning and replication for dynamic graphs. Proc. VLDB Endow. 2016, 9, 540–551. [Google Scholar] [CrossRef]
- Zhang, H.; Lin, L.; Xu, L.; Wang, X. Graph partition based privacy-preserving scheme in social networks. J. Netw. Comput. Appl. 2021, 195, 103214. [Google Scholar] [CrossRef]
- Zhang, S.; Jiang, Z.; Hou, X.; Guan, Z.; Yuan, M.; You, H. An Efficient and Balanced Graph Partition Algorithm for the Subgraph-Centric Programming Model on Large-scale Power-law Graphs. In Proceedings of the 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), Washington, DC, USA, 7–10 July 2020. [Google Scholar]
- Moussawi, A.E.; Seghouani, N.B.; Bugiotti, F. A Graph Partitioning Algorithm for Edge or Vertex Balance. In Database and Expert Systems Applications; Springer: Cham, Switzerland, 2020. [Google Scholar]
- Yin, D.; Wang, S.; Ouyang, Y. ViCTS: A novel network partition algorithm for scalable agent-based modeling of mass evacuation. Comput. Environ. Urban Syst. 2020, 80, 101452. [Google Scholar] [CrossRef]
- Yan, J.; Xu, H.; Li, N.; Zhang, Z. Large-Scale Emulation Network Topology Partition Based on Community Detection With the Weight of Vertex Similarity. Comput. J. 2023, 66, 1817–1828. [Google Scholar] [CrossRef]
- Wiegele, A.; Zhao, S. SDP-based bounds for graph partition via extended ADMM. arXiv 2021, arXiv:2105.09075. [Google Scholar] [CrossRef]
- Chen, J.; Qian, X. Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Vancouver, BC, Canada, 25–19 March 2023; Volume 2. [Google Scholar]
- Liu, C.; Peng, Z.; Zheng, W.; Zou, L. FSM: A Fine-Grained Splitting and Merging Framework for Dual-Balanced Graph Partition. Proc. VLDB Endow. 2024, 17, 2378–2391. [Google Scholar] [CrossRef]
- Siguenza-Torres, A.; Wieder, A.; Meng, Z.; Narvaez Rivas, S.; Gao, M.; Grossi, M.; Du, X.; Bortoli, S.; Cai, W.; Knoll, A. ENHANCE: Multilevel Heterogeneous Performance-Aware Re-Partitioning Algorithm For Microscopic Vehicle Traffic Simulation. ACM Trans. Model. Comput. Simul. 2024; just accepted. [Google Scholar] [CrossRef]
- Liu, P.; Cai, P.; Li, C.; Chen, H. AVPS: Automatic Vertical Partitioning for Dynamic Workload. In Advanced Intelligent Computing Technology and Applications, Proceedings of the 20th International Conference, ICIC 2024, Tianjin, China, 5–8 August 2024; Springer: Singapore, 2024; pp. 146–157. [Google Scholar]
- Gonzalez, J.E.; Low, Y.; Gu, H.; Bickson, D.; Guestrin, C. Powergraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), Hollywood, CA, USA, 10–18 October 2012; pp. 17–30. [Google Scholar]
- Ren, H.; Wu, B. OLPGP: An Optimized Label Propagation-Based Distributed Graph Partitioning Algorithm. In International Conference on Data Mining and Big Data, Proceedings of the 7th International Conference, DMBD 2022, Beijing, China, 21–24 November 2022; Springer: Singapore, 2022; pp. 120–133. [Google Scholar]
- Gonzalez, J.E.; Xin, R.S.; Dave, A.; Crankshaw, D.; Franklin, M.J.; Stoica, I. {GraphX}: Graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), Broomfield, CO, USA, 6–8 October 2014; pp. 599–613. [Google Scholar]
- Yang, J.; Leskovec, J. Defining and Evaluating Network Communities Based on Ground-Truth. In Proceedings of the 2012 IEEE 12th International Conference on Data Mining, Brussels, Belgium, 10–13 December 2012; pp. 745–754. [Google Scholar] [CrossRef]
- Karypis, G.; Kumar, V. METIS: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-Reducing Orderings of Sparse Matrices; Technical report; University Digital Conservancy: Minneapolis, MN, USA, 1997. [Google Scholar]
- Li, X.; Pang, Y.; Zhao, C.; Liu, Y.; Dong, Q. A new multi-level algorithm for balanced partition problem on large scale directed graphs. Adv. Aerodyn. 2021, 3, 23. [Google Scholar] [CrossRef]
Algorithm | Characteristics | Application Illustration | Category |
---|---|---|---|
PageRank | Iterative, high communication | Importance ranking | Direct Acyclic Graph |
SSSP (shortest path) | Iterative, medium communication | Decision making | Direct Acyclic Graph |
Triangle Counting | Single step, medium communication | Clustering coefficient | General |
Graph | Description | Type | # of Vertices | # of Edges | Graph Density () | Average Degree | Memory Size (GB) |
---|---|---|---|---|---|---|---|
soc-LiveJournal | Friednship social network | Directed | 4,847,571 | 68,993,773 | 0.59 | 14 | 10.3 |
Social networks | Directed | 41,652,230 | 1,468,365,182 | 0.08 | 35 | 231.2 | |
BFS | Facebook social networks | Directed | 61,876,615 | 336,776,269 | 0.02 | 5 | 51.6 |
com-Orkut [44] | Orkut online social network | Undirected | 3,072,441 | 1,171,850,837 | 2.48 | 76 | 22.1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chang, F.; Guo, H.; Ullah, F.; Wang, H.; Zhao, Y.; Zhang, H. Near-Data Source Graph Partitioning. Electronics 2024, 13, 4455. https://doi.org/10.3390/electronics13224455
Chang F, Guo H, Ullah F, Wang H, Zhao Y, Zhang H. Near-Data Source Graph Partitioning. Electronics. 2024; 13(22):4455. https://doi.org/10.3390/electronics13224455
Chicago/Turabian StyleChang, Furong, Hao Guo, Farhan Ullah, Haochen Wang, Yue Zhao, and Haitian Zhang. 2024. "Near-Data Source Graph Partitioning" Electronics 13, no. 22: 4455. https://doi.org/10.3390/electronics13224455
APA StyleChang, F., Guo, H., Ullah, F., Wang, H., Zhao, Y., & Zhang, H. (2024). Near-Data Source Graph Partitioning. Electronics, 13(22), 4455. https://doi.org/10.3390/electronics13224455