Design and Optimization of a Distributed File System Based on RDMA
Abstract
1. Introduction
2. Background and Motivation
2.1. Comparison of RDMA Technology and TCP
2.2. Asymmetry of RDMA Unilateral Communication
3. Design of a Lightweight Distributed File System Based on RDMA Communication Mode
3.1. Design of a Lightweight Distributed File System Based on RDMA Communication Mode
3.2. FastDFS Communication Process Optimization Scheme Based on RDMA
- (1) Determining when to obtain results from the server;
- (2) Determining the size of the result obtained from the server at a time.
- (3) Download the location information into storage;
- (4) Download the content of the file.
4. Experimental Analysis
4.1. Test Environment
4.2. Comparative Test and Result Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Category | Configuration |
---|---|
CPU | Intel® Core™ i7-8300HQ CPU @ 2.30 GHz |
RAM | 32.0 GB |
Disk | 500 GB |
Compiler | GCC 4.47 |
OS | Windows 10 64-bit |
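
ID | IP Address | Configuration | Role |
---|---|---|---|
A | 192.168.156.61 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Tracker 1 |
B | 192.168.156.62 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Tracker 1 |
C | 192.168.156.63 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Group 1 Storage 1 |
D | 192.168.156.64 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Group 1 Storage 2 |
E | 192.168.156.65 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Group 2 Storage 3 |
F | 192.168.156.66 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Group 2 Storage 4 |
G | 192.168.156.67 | CentOS7/2 GB RAM/20 G Disk/CPU/100 M | Client |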
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
He, Q.; Gao, P.; Zhang, F.; Bian, G.; Zhang, W.; Li, Z. Design and Optimization of a Distributed File System Based on RDMA. Appl. Sci. 2023, 13, 8670. https://doi.org/10.3390/app13158670