BIOS-Based Server Intelligent Optimization
Abstract
1. Introduction
- (1) To improve the performance of Kunpeng processor-based servers, we propose a performance tuning framework that dynamically adjusts the BIOS configuration. It monitors server workload information to identify the running scenario and applies optimizations based on the results of static tuning or empirical configuration.
- (2) At the static configuration tuning stage, finding a near-optimal BIOS configuration is modeled as starting from a feasible initial configuration and repeatedly adjusting it to obtain an improvement. Based on this model, we propose a joint BIOS optimization algorithm that uses a deep Q-network, combining reinforcement learning with nearest neighbor search.
- (3) With the proposed optimization algorithm, we significantly improve the memory bandwidth in memory-intensive scenarios. To further evaluate the proposed static tuning method, we compare it with two metaheuristic methods: the genetic algorithm and the particle swarm optimization algorithm. The proposed algorithm is more stable and has a lower probability of causing server downtime.
- (4) We have also carried out optimization work in other load scenarios and found that, in some of them, performance indicators are no longer the critical optimization indicators.
2. Dynamic Tuning Framework
3. Workload Scenario Recognition
3.1. Scenario Preparation
3.2. Data Processing and Scenario Recognition
4. Workload Scenario Optimization
4.1. Markov Model for BIOS Control Optimization
4.2. Deep Q-Network
4.3. Joint BIOS Optimization Algorithm Using DQN
4.3.1. Environment Design
- The state of the system at the next moment is obtained through end-to-end testing. First, the absolute BIOS configuration at the next moment is calculated from the action predicted at the current moment; then the server is tested to obtain a performance evaluation. Finally, the two parts are merged to form the state.
- An instant reward is obtained. Unlike a typical reinforcement learning task, BIOS control optimization has no specific target state; the desired behavior is that the algorithm reaches better server performance quickly while retaining the ability to escape local optima. For the STREAM test scenario, the goal is to adjust the configuration so that the memory score rises rapidly while higher scores can still be found. The reward function is therefore defined in terms of the memory score.
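The environment step described in the two items above can be sketched in Python as follows. This is only an illustration under stated assumptions: the single-bit-flip action encoding, the `run_benchmark` callback, the `toy_benchmark` stand-in, and in particular the reward (relative score gain, with a fixed penalty on downtime) are hypothetical placeholders and are not the paper's reward function.

```python
from typing import Callable, Optional, Tuple

Config = Tuple[int, ...]
State = Tuple[Config, float]  # (absolute BIOS bits, performance score)


def step(config: Config, prev_score: float, action: int,
         run_benchmark: Callable[[Config], Optional[float]],
         ) -> Tuple[State, float, bool]:
    """Apply one action and return (next_state, reward, downtime)."""
    # 1. Absolute configuration at the next moment. Assumption: an
    #    action flips exactly one configuration bit.
    bits = list(config)
    bits[action] ^= 1
    next_config = tuple(bits)

    # 2. End-to-end test. Assumption: the callback returns None when
    #    the server goes down under this configuration.
    score = run_benchmark(next_config)
    downtime = score is None
    if downtime:
        score = 0.0

    # 3. Merge configuration and performance evaluation into the state.
    next_state = (next_config, score)

    # Placeholder reward (hypothetical): relative gain, penalty on downtime.
    reward = -1.0 if downtime else (score - prev_score) / max(prev_score, 1.0)
    return next_state, reward, downtime


def toy_benchmark(config: Config) -> Optional[float]:
    """Hypothetical stand-in for a STREAM run on the real server."""
    if config[0] == 1 and config[1] == 1:  # pretend this combination crashes
        return None
    return 1000.0 + 100.0 * sum(config[2:])


state, reward, down = step((0, 0, 0, 0), 1000.0, 2, toy_benchmark)
```

On real hardware, `run_benchmark` would write the registers, run the workload, and read back the score; keeping it as a callback lets the same step logic be exercised against a simulator first.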
4.3.2. Agent Decision-Making and Learning
Algorithm 1 BIOS optimization algorithm based on DQN (Part 1).
Input: $N$: initial capacity of the replay memory,
$M$: number of final exploration frames for optimization,
$T$: number of optimization steps per iteration,
$s_1$: state corresponding to the initial configuration
1 Initialize replay memory $D$ to capacity $N$;
2 Initialize Q-network with random weights $\theta$;
3 Initialize environment;
4 For episode = 1, M do
5 Initialize environment state $s_1$;
6 For t = 1, T do
7 Calculate $Q(s_t, a; \theta)$ in state $s_t$;
8 With probability $\varepsilon$ select a random action $a_t$,
otherwise, select $a_t = \arg\max_a Q(s_t, a; \theta)$;
9 Execute action $a_t$ in the environment,
obtain a reward $r_t$, the next state $s_{t+1}$, and whether server downtime occurred;
10 If server downtime occurred: store the same transition $(s_t, a_t, r_t, s_{t+1})$ in $D$ multiple times;
11 else: store one transition $(s_t, a_t, r_t, s_{t+1})$ in $D$;
12 Sample a random minibatch of transitions $(s_j, a_j, r_j, s_{j+1})$ from $D$;
13 Update parameters $\theta$ in the Q-network with Formula (5);
14 Every $C$ steps reset $\hat{Q} = Q$;
15 End For
16 End For
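The control flow of Algorithm 1 can be sketched as below. This is a toy illustration, not the authors' implementation: it swaps the deep network for a linear Q-function so that it runs with the standard library only, and the 8-bit configuration, the `evaluate` stand-in for the server test, the reward, the repeat count `DOWNTIME_COPIES`, ending an episode on downtime, and the per-episode decay of the exploration rate are all assumptions made for the example.

```python
import random
from collections import deque

N_BITS = 8            # toy configuration width (assumption)
DOWNTIME_COPIES = 4   # hypothetical repeat count for downtime transitions


def evaluate(bits):
    """Toy stand-in for the end-to-end test: returns (score, downtime)."""
    if bits[0] == 1 and bits[1] == 1:   # pretend this combination crashes
        return 0.0, True
    return sum(bits[2:]) / (N_BITS - 2), False


def q_values(weights, state):
    """Linear Q-function with one weight row per action."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def train(N=2000, M=50, T=20, gamma=0.9, lr=0.005, batch=32, C=50, seed=0):
    rng = random.Random(seed)
    D = deque(maxlen=N)                                   # replay memory
    W = [[rng.gauss(0.0, 0.01) for _ in range(N_BITS + 1)]
         for _ in range(N_BITS)]                          # Q weights
    W_target = [row[:] for row in W]                      # target weights
    eps, steps, best = 1.0, 0, 0.0
    for episode in range(M):
        bits = [0] * N_BITS                               # initial configuration
        score, _ = evaluate(bits)
        s = bits + [score]                                # state = bits + score
        for t in range(T):
            q = q_values(W, s)
            if rng.random() < eps:                        # explore
                a = rng.randrange(N_BITS)
            else:                                         # exploit
                a = max(range(N_BITS), key=q.__getitem__)
            bits = bits[:]
            bits[a] ^= 1                                  # action flips one bit
            new_score, down = evaluate(bits)
            r = -1.0 if down else new_score - score       # placeholder reward
            s_next = bits + [new_score]
            # Downtime transitions are stored several times so that they
            # are sampled more often from the replay memory.
            for _ in range(DOWNTIME_COPIES if down else 1):
                D.append((s, a, r, s_next, down))
            if len(D) >= batch:
                for sj, aj, rj, sj1, dj in rng.sample(list(D), batch):
                    y = rj if dj else rj + gamma * max(q_values(W_target, sj1))
                    delta = y - sum(w * x for w, x in zip(W[aj], sj))
                    # Gradient step on the squared TD error for action aj.
                    W[aj] = [w + lr * delta * x for w, x in zip(W[aj], sj)]
            steps += 1
            if steps % C == 0:                            # sync target weights
                W_target = [row[:] for row in W]
            if down:                                      # assumed: episode ends
                break
            s, score = s_next, new_score
            best = max(best, score)
        eps = max(0.1, eps * 0.995)
    return W, best


weights, best_score = train()
```

Storing downtime transitions more than once raises their share of each minibatch, which is one plausible way to make the agent learn to avoid configurations that bring the server down.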
4.3.3. State Control Optimization Algorithm
5. Experimental Results
5.1. Workload Scenario Recognition Results
5.2. Operating Scenario Performance Optimization Experiment
5.2.1. Simulation
5.2.2. Measured Experiment
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Serial Number | Feature Name | Data Type |
---|---|---|
1 | Cpu_cycles | int |
2 | L1_cache_refill | int |
3 | L1d_cache_refill | int |
4 | L1d_cache | int |
5 | L1d_tlb_refill | int |
6 | L1i_cache | int |
7 | L2d_cache | int |
8 | L2d_cache_refill | int |
9 | L1d_tlb | int |
10 | Prf_req | int |
11 | Hit_on_prf | int |
12 | Mem_stall_l1miss | int |
13 | Mem_stall_l2miss | int |
14 | Atc_cmd | int |
15 | Flux_rd | int |
16 | Flux_wr | int |
17 | Prc_cmd | int |
18 | Rnk_chg | int |
19 | Rw_chg | int |
20 | Fluxid_wcmd | int |
21 | Fluxid_rcmd | int |
22 | Bnk_chg | int |
23 | Rd_cpipe | int |
24 | Rx_ops_num | int |
25 | Rx_outer | int |
26 | Rx_sccl | int |
27 | Retry | int |
28 | Tx_snp_outer | int |
29 | Hac_smmu_transaction | int |
30 | Net_smmu_transaction | int |
31 | Net_smmu_l2_ltb_hit | int |
32 | Net_smmu_ltb_miss | int |
33 | Pcie_smmu_transaction | int |
34 | Pcie_smmu_trans_table_walk_access | int |
35 | Pcie_smmu_context_bank_cache_miss | int |
36 | Pcie_smmu_ltb_miss | int |
37 | Pcie_smmu_l1_tlb | int |
38 | Pcie_smmu_l2_ltb_hit | int |
Scenario Type | Benchmarking Tools |
---|---|
I/O-intensive scenario | FIO, IOZone |
Network-intensive scenario | iPerf3 |
CPU-intensive scenario | Sysbench-CPU |
Memory-intensive scenario | STREAM |
Idle scenario | None |
Register Name | Number of Configuration Bits Selected for Optimization |
---|---|
L3T_STATIC_CTRL | 8 |
L3T_DYNAMIC_CTRL | 16 |
L3T_DYNAMIC_AUCTRL0 | 8 |
L3T_DYNAMIC_AUCTRL1 | 21 |
L3T_PREFETCH | 8 |
HHA_DIR_CTRL | 15 |
HHA_FUNC_DIS | 17 |
HHA_TOTEMNUM | 11 |
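
To make the size of the search space concrete, the register table above can be written as a bit layout. The sketch below is an illustration only: the flat bit-vector encoding and the helper `split_config` are assumptions, and the order of the registers simply follows the table.

```python
# Number of configuration bits selected per register, as listed in the table.
REGISTER_BITS = {
    "L3T_STATIC_CTRL": 8,
    "L3T_DYNAMIC_CTRL": 16,
    "L3T_DYNAMIC_AUCTRL0": 8,
    "L3T_DYNAMIC_AUCTRL1": 21,
    "L3T_PREFETCH": 8,
    "HHA_DIR_CTRL": 15,
    "HHA_FUNC_DIS": 17,
    "HHA_TOTEMNUM": 11,
}

TOTAL_BITS = sum(REGISTER_BITS.values())  # 104 bits in total


def split_config(bits):
    """Split a flat bit list into per-register bit lists (hypothetical helper)."""
    if len(bits) != TOTAL_BITS:
        raise ValueError(f"expected {TOTAL_BITS} bits, got {len(bits)}")
    parts, pos = {}, 0
    for name, width in REGISTER_BITS.items():
        parts[name] = bits[pos:pos + width]
        pos += width
    return parts


parts = split_config([0] * TOTAL_BITS)
```

If every selected bit could be set independently, the 104 bits would span 2^104 configurations, which is why exhaustive testing is not an option and a guided search is needed.
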
Hyperparameters | Value |
---|---|
minibatch size | 32 |
replay memory size | 10,000 |
discount factor γ | 0.90 |
learning rate α | 0.005 |
initial exploration rate ε | 1 |
exploration rate decay | 0.995 |
final exploration rate ε | 0.1 |
final exploration frame M | 2000 |
final step T | 100 |
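
The exploration schedule implied by these hyperparameters can be sketched as follows, assuming the values 1, 0.995, and 0.1 in the table act as the initial exploration rate, a multiplicative decay factor, and the minimum rate. Whether the decay is applied per step or per episode is not stated here, so the function below simply counts decay applications; the function names are hypothetical.

```python
def epsilon_at(n, eps_start=1.0, decay=0.995, eps_min=0.1):
    """Exploration rate after n decay applications, clipped at eps_min."""
    return max(eps_min, eps_start * decay ** n)


def decays_until_floor(eps_start=1.0, decay=0.995, eps_min=0.1):
    """Smallest n at which the unclipped rate no longer exceeds eps_min."""
    n = 0
    while eps_start * decay ** n > eps_min:
        n += 1
    return n


floor_reached_after = decays_until_floor()
```

Under these assumptions the rate reaches its floor after 460 decay applications, since 0.995^459 is still slightly above 0.1 and 0.995^460 is slightly below it.
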
Methods | Main Parameter Settings | Best Performance Score (Average Memory Rate, MB/s) | Server Downtime Probability during the Experiment |
---|---|---|---|
Prior Knowledge | - | 263,000 | - |
Joint BIOS optimization algorithm using DQN (including the training process) | Iter = 50 | 290,400 | 4.1% |
Genetic algorithm | Iter = 100, popsize = 20 | 281,000 | 8.1% |
Genetic algorithm | Iter = 100, popsize = 100 | 289,700 | 20.3% |
Binary particle swarm algorithm | Popsize = 100 | 276,000 | 39.8% |
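
The scores in the comparison table are easier to read as gains over the prior-knowledge configuration. The short calculation below is a reading aid only; it takes the prior-knowledge score of 263,000 MB/s as the baseline, which is an assumption about how the comparison is meant to be read, and the dictionary keys are shortened labels.

```python
BASELINE = 263_000  # prior-knowledge score in MB/s, taken from the table

# Best scores (MB/s) and downtime probabilities (%) from the table.
RESULTS = {
    "DQN joint optimization": (290_400, 4.1),
    "Genetic algorithm, popsize 20": (281_000, 8.1),
    "Genetic algorithm, popsize 100": (289_700, 20.3),
    "Binary particle swarm": (276_000, 39.8),
}


def relative_gain(score, baseline=BASELINE):
    """Improvement over the baseline, in percent."""
    return (score - baseline) / baseline * 100.0


gains = {name: round(relative_gain(score), 2)
         for name, (score, _downtime) in RESULTS.items()}
```

Read this way, the DQN-based method gains about 10.4% over the baseline, close to the 10.2% of the larger genetic algorithm run, while its reported downtime probability is roughly a fifth of that run's (4.1% versus 20.3%).
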
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qi, X.; Yang, J.; Zhang, Y.; Xiao, B. BIOS-Based Server Intelligent Optimization. Sensors 2022, 22, 6730. https://doi.org/10.3390/s22186730