An Efficient Approach to Consolidating Job Schedulers in Traditional Independent Scientific Workflows
Abstract
1. Introduction
2. Materials and Methods
2.1. Job History Information
2.2. Status before Integration
2.3. Deployment of the Integrated Cluster
2.3.1. Configuration of the Integrated Cluster
2.3.2. Configuration of Submission Node
2.3.3. Configuration of the Job Management Node
3. Results and Discussion
3.1. Characterization of Jobs before and after Integration
3.2. Analyzing Submitted Jobs after Integration
3.2.1. Job Submission Trends
3.2.2. Characteristics of Preempted Jobs
3.2.3. Time Loss along Preemption
4. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Item | Description |
---|---|
AcctGroup | Group information for submitted jobs |
AcctGroupUser | User information for submitted jobs |
CMD | Executed command |
CommittedTime | The number of seconds of wall clock time that the job has been allocated to a machine |
CumulativeSlotTime | Cumulative number of seconds the job has been allocated to a machine |
JobCurrentStartDate | Time at which the job most recently began running |
JobStartDate | Time at which the job first began running |
NumJobStarts | An integer count of the number of times the job started executing |
QDate | Time at which the job was submitted to the job queue |
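As a minimal sketch of how the attributes in the table above could be combined, the Python snippet below derives a few per-job quantities from a single history record. The `JobRecord` class, its field names, and the sample values are hypothetical and used only for illustration. The definitions are assumptions rather than the authors' stated method: waiting time is taken as `JobStartDate - QDate`, and the gap between `CumulativeSlotTime` and `CommittedTime` is read as slot time that was not committed (for example, run attempts that were evicted).

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical container for a subset of the job history attributes."""
    q_date: int                  # QDate (epoch seconds)
    job_start_date: int          # JobStartDate (epoch seconds)
    job_current_start_date: int  # JobCurrentStartDate (epoch seconds)
    num_job_starts: int          # NumJobStarts
    committed_time: int          # CommittedTime (seconds)
    cumulative_slot_time: int    # CumulativeSlotTime (seconds)


def waiting_time(job: JobRecord) -> int:
    """Assumed definition: seconds between submission and first start."""
    return job.job_start_date - job.q_date


def was_restarted(job: JobRecord) -> bool:
    """A job that started more than once was interrupted at least once."""
    return job.num_job_starts > 1


def uncommitted_time(job: JobRecord) -> int:
    """Assumed definition: slot time not counted as committed, clamped at zero."""
    return max(0, job.cumulative_slot_time - job.committed_time)


# Made-up sample record, not taken from the paper's data.
sample = JobRecord(
    q_date=1_546_300_800,
    job_start_date=1_546_301_400,
    job_current_start_date=1_546_305_000,
    num_job_starts=2,
    committed_time=3_000,
    cumulative_slot_time=5_400,
)

print(waiting_time(sample), was_restarted(sample), uncommitted_time(sample))
```

Keeping each derived quantity in its own small function makes it easy to swap in a different definition, for instance measuring waiting time against `JobCurrentStartDate` instead.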
Year | Number of Jobs Processed, Group A | Number of Jobs Processed, Group B | Total WallTime (s), Group A | Total WallTime (s), Group B | Total WaitingTime (s), Group A | Total WaitingTime (s), Group B |
---|---|---|---|---|---|---|
2018 | 242,963 | 2,788,872 | 647,731,692 | 5,342,505,530 | 201,578,262 | 15,587,662,341 |
2019 | 447,944 | 1,799,363 | 4,450,001,236 | 4,015,125,597 | 2,625,918,725 | 6,370,466,789 |
Change | +84.37% | −35.48% | +587.01% | −24.85% | +1202.68% | −59.13% |
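The last row of the table appears to be the relative change from 2018 to 2019. The short Python sketch below recomputes it from the two yearly rows, assuming the usual definition (2019 − 2018) / 2018 × 100. The function name `percent_change` and the dictionary layout are illustrative choices; the numbers are copied from the table.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (2018 value, 2019 value) for each column of the table.
table = {
    "jobs_A": (242_963, 447_944),
    "jobs_B": (2_788_872, 1_799_363),
    "walltime_A": (647_731_692, 4_450_001_236),
    "walltime_B": (5_342_505_530, 4_015_125_597),
    "waiting_A": (201_578_262, 2_625_918_725),
    "waiting_B": (15_587_662_341, 6_370_466_789),
}

changes = {name: percent_change(y2018, y2019) for name, (y2018, y2019) in table.items()}

for name, value in changes.items():
    # Explicit sign, two decimals, to match the table's formatting.
    print(f"{name}: {value:+.2f}%")
```

Under this definition the recomputed values agree with the percentages listed in the table when rounded to two decimals.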
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Kong, B.; Ryu, G.; Bae, S.; Noh, S.-Y.; Yoon, H. An Efficient Approach to Consolidating Job Schedulers in Traditional Independent Scientific Workflows. Appl. Sci. 2020, 10, 1455. https://doi.org/10.3390/app10041455