Compiler Optimization Parameter Selection Method Based on Ensemble Learning
Abstract
1. Introduction
- (1) For program optimization parameter search problems, an improved adaptive multiobjective PSO (AMPSO) was proposed to explore the parameter optimization search space. Compared with the default standard optimization level -O3 of GCC, we obtained 1.39× and 1.36× speedups on two platforms in the training phase of the model.
- (2) We proposed a program feature representation method, FCR, which uses mixed static and dynamic features to represent programs. Based on FCR, the optimal feature subset of the original features was applied to the ensemble learning process to minimize the training time cost.
- (3) Program features and the optimal optimization parameters constitute the sample data of the ELOPS model. We used the sample data to build a statistical model and then integrated that model into the compiler framework, where ELOPS performs heuristic selection of optimization parameters and drives the compilation process. Given a new target program, ELOPS extracts the program features as input and effectively predicts the optimal compiler optimization parameters. We achieved 1.29× and 1.26× speedups on two platforms in the prediction phase of the model, while saving considerable search time.
2. Related Work
2.1. Analytic Models
2.2. Predictive Models
3. Modeling the Program Optimization Parameters Selection Problem
3.1. Definition of Optimization Parameters
3.2. Program Transformation Optimization Parameter Selection Model
4. Optimization Space Search Algorithm for Optimized Parameters
4.1. Genetic Algorithm
Algorithm 1 Optimization parameter selection algorithm based on GA
1: Input: Hotspot function h, optimization parameter vector q
2: Output: The best optimization parameter vector and its corresponding program running time
3: Algorithm:
4: Step 1: Initialization: read the input parameters, set the population size N and the largest genetic generation M, and set the generation counter k = 1. Then generate the initial population with N individuals, where each individual corresponds to an optimization parameter vector.
5: Step 2: For every chromosome in the current generation, calculate the fitness value. The fitness function Fitness(T) is the reciprocal of the summed compilation and execution time of the function when the corresponding optimization parameter is adopted: Fitness(T) = 1 / (T_compile + T_execute).
6: Step 3: Selection: choose the individuals that participate in crossover using a roulette-wheel selection method combined with an elite strategy. First, a small proportion of elite individuals from the current generation pass directly into the next generation; the remaining individuals are then selected by the roulette method.
7: Step 4: Crossover: perform the arithmetic crossover operation with crossover probability P_c.
8: Step 5: Mutation: mutate individuals in the population with mutation probability P_m.
9: Step 6: Termination check: when the genetic generation reaches the maximum generation M, or the best individual does not change for a number of consecutive generations, the algorithm terminates and goes to Step 7; otherwise set k = k + 1 and go to Step 2.
10: Step 7: Output the optimization parameter vector and the program running time corresponding to the individual with the highest fitness in the current population.
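To make Algorithm 1 concrete, the following is a minimal Python sketch of the GA search loop. It is a sketch under stated assumptions, not the paper's implementation: measure(params) is a hypothetical stand-in for compiling and running the hotspot function under a parameter vector, and the population size, probabilities, and early-stopping patience are illustrative defaults rather than the paper's settings.

```python
import random

def measure(params):
    # Hypothetical stand-in so the sketch runs: a synthetic "time".
    # Replace with a real compile-and-run of the hotspot function under params.
    return 1.0 + sum((p - 0.5) ** 2 for p in params)

def fitness(params):
    # Step 2: reciprocal of the summed compilation and execution time.
    return 1.0 / measure(params)

def ga_search(bounds, N=50, M=100, p_c=0.8, p_m=0.05, elite_frac=0.05, patience=10):
    # Step 1: initial population; each individual is one parameter vector.
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(N)]
    best, best_fit, stall = None, float("-inf"), 0
    for _ in range(M):                                   # Step 6 caps generations at M
        fits = [fitness(ind) for ind in pop]             # Step 2
        ranked = sorted(zip(fits, pop), key=lambda t: t[0], reverse=True)
        if ranked[0][0] > best_fit:
            best_fit, best, stall = ranked[0][0], ranked[0][1][:], 0
        else:
            stall += 1                                   # Step 6: best unchanged
        if stall >= patience:
            break
        # Step 3: elite strategy + roulette-wheel selection.
        n_elite = max(1, int(elite_frac * N))
        nxt = [ind[:] for _, ind in ranked[:n_elite]]
        total = sum(fits)

        def roulette():
            r, acc = random.uniform(0, total), 0.0
            for f, ind in zip(fits, pop):
                acc += f
                if acc >= r:
                    return ind
            return pop[-1]

        while len(nxt) < N:
            a, b = roulette()[:], roulette()[:]
            if random.random() < p_c:                    # Step 4: arithmetic crossover
                w = random.random()
                a = [w * u + (1 - w) * v for u, v in zip(a, b)]
            if random.random() < p_m:                    # Step 5: mutation
                i = random.randrange(len(a))
                a[i] = random.uniform(*bounds[i])
            nxt.append(a)
        pop = nxt
    # Step 7: best vector found and its corresponding running time.
    return best, 1.0 / best_fit

# Example: search 4 optimization parameters, each ranging over [0, 1].
best, t = ga_search([(0.0, 1.0)] * 4)
print(best, t)
```

Here bounds holds one (low, high) range per optimization parameter in q, and the function returns the best vector and its measured time, matching Step 7.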
4.2. Multiobjective PSO Based on Adaptive Constraints
4.2.1. Standard PSO Algorithm
4.2.2. Adaptive Multiobjective Particle Swarm Optimization
Algorithm 2 Optimization parameter selection algorithm based on AMPSO
1: Input: Test program P, optimization parameter number n, evolutionary parameters (inertia weight and learning factors), population size N, Pareto nondominated solution set size S, maximum number of iterations M.
2: Output: The best optimization parameter vector and its corresponding program running time.
3: Algorithm:
4: Step 1: Define the range of values Ω for each optimization parameter, and initialize the population positions and velocities within the given search field.
5: Step 2: Calculate the objective function value of each particle in the population, i.e., the program execution time, together with its constraint violation degree.
6: Step 3: Select all feasible solutions and construct the current Pareto nondominated solution set based on the Pareto domination relation. If the current nondominated solution set is larger than the given size S, the crowding distance (CD) algorithm is used to trim it to size S.
7: Step 4: If the current Pareto nondominated solution set is empty, select the particle with the smallest constraint violation degree as the global optimal position. Otherwise, the solution with the largest CD in the Pareto nondominated solution set is selected as the global optimal position.
8: Step 5: Update the individual optimal position of each particle: if one particle is feasible and the other is infeasible, the feasible particle is preferred. If both particles are infeasible, the particle with the smaller constraint violation degree is preferred. If both particles are feasible, the nondominated particle is chosen based on the Pareto domination relation.
9: Step 6: Update the positions and velocities of all particles according to AMPSO.
10: Step 7: Determine whether the maximum number of iterations M has been reached; if not, return to Step 2; otherwise, the loop ends and the best optimization parameter vector and its corresponding minimum program running time are output.
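A compact Python sketch of the AMPSO loop follows, under stated assumptions: evaluate(x) is a hypothetical stand-in returning an objective vector plus a constraint violation degree (0 means feasible), the Step 5 rules follow standard feasibility-based constraint handling, and w, c1, c2, and the archive size are illustrative values rather than the paper's tuned evolutionary parameters.

```python
import random

def evaluate(x):
    # Hypothetical stand-in so the sketch runs: two synthetic objectives
    # (e.g., run time and a second cost) plus a constraint violation degree.
    # Replace with a real compile-and-run measurement of the test program P.
    f1 = sum((xi - 1.0) ** 2 for xi in x)
    f2 = sum(abs(xi) for xi in x)
    g = max(0.0, x[0] + x[1] - 3.0)           # 0.0 means feasible
    return (f1, f2), g

def dominates(fa, fb):
    # Pareto domination between objective vectors (minimization).
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def crowding_distances(objs):
    # Crowding distance (CD) of each member of a nondominated front.
    n, m = len(objs), len(objs[0])
    cd = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        cd[order[0]] = cd[order[-1]] = float("inf")
        span = (objs[order[-1]][k] - objs[order[0]][k]) or 1.0
        for j in range(1, n - 1):
            cd[order[j]] += (objs[order[j + 1]][k] - objs[order[j - 1]][k]) / span
    return cd

def ampso(bounds, N=40, S=20, M=100, w=0.6, c1=1.6, c2=1.6):
    dim = len(bounds)
    # Step 1: initialize positions and velocities inside the search field.
    x = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(N)]
    v = [[0.0] * dim for _ in range(N)]
    pbest = [xi[:] for xi in x]
    pfit = [evaluate(xi) for xi in x]
    archive = []                               # Pareto nondominated solution set
    for _ in range(M):                         # Step 7 caps the iterations at M
        fit = [evaluate(xi) for xi in x]       # Step 2: objectives + violation
        # Step 3: feasible nondominated solutions enter the archive, trimmed by CD.
        for xi, (f, g) in zip(x, fit):
            if g == 0.0 and not any(dominates(af, f) for _, af in archive):
                archive = [(ax, af) for ax, af in archive if not dominates(f, af)]
                archive.append((xi[:], f))
        if len(archive) > S:
            cd = crowding_distances([af for _, af in archive])
            keep = sorted(range(len(archive)), key=lambda i: cd[i], reverse=True)[:S]
            archive = [archive[i] for i in keep]
        # Step 4: gbest = largest-CD archive member, or least-violating particle.
        if archive:
            cd = crowding_distances([af for _, af in archive])
            gbest = archive[max(range(len(archive)), key=lambda i: cd[i])][0]
        else:
            gbest = x[min(range(N), key=lambda i: fit[i][1])]
        # Step 5: feasibility rules for updating each particle's personal best.
        for i in range(N):
            (f_new, g_new), (f_old, g_old) = fit[i], pfit[i]
            if ((g_new == 0.0 and g_old > 0.0)
                    or (g_new > 0.0 and g_old > 0.0 and g_new < g_old)
                    or (g_new == 0.0 and g_old == 0.0 and dominates(f_new, f_old))):
                pbest[i], pfit[i] = x[i][:], fit[i]
        # Step 6: standard velocity/position update, clamped to the bounds.
        for i in range(N):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (w * v[i][d] + c1 * r1 * (pbest[i][d] - x[i][d])
                                       + c2 * r2 * (gbest[d] - x[i][d]))
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
    return archive                             # nondominated parameter vectors

# Example: search 3 optimization parameters, each ranging over [0, 2].
front = ampso([(0.0, 2.0)] * 3)
print(len(front), front[0][1] if front else None)
```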
5. Optimized Parameter Prediction Based on Ensemble Learning
5.1. Data Preprocessing
5.2. Stacking Ensemble Prediction Model
Algorithm 3 Optimization parameter selection based on the Stacking ensemble model
1: Input: Dataset D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}; level-1 learning algorithms L_1, ..., L_T; level-2 learning algorithm L.
2: Output: Program optimization parameter classification.
3: Algorithm:
4: for t = 1, ..., T do
5:   h_t = L_t(D)   // train each level-1 learner on the original dataset
6: end for
7: D' = ∅
8: for i = 1, ..., m do
9:   for t = 1, ..., T do
10:     z_it = h_t(x_i)   // level-1 predictions become level-2 features
11:   end for
12:   D' = D' ∪ {((z_i1, ..., z_iT), y_i)}
13: end for
14: h' = L(D')   // train the level-2 learner on the new dataset
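The following is a minimal scikit-learn sketch of the same stacking procedure, assuming a synthetic stand-in dataset for the (program features, best parameter class) samples; the four level-1 learners mirror the DT/LR/BN/NN families compared in Section 6.3.3, with GaussianNB standing in for the Bayesian network. Unlike the pseudocode above, it builds D' from out-of-fold predictions, a common refinement that reduces leakage into the level-2 learner.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in dataset for the (program features -> best optimization
# parameter class) samples used to train the model.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)

# Level-1 learners L_1..L_T; DT/LR/BN/NN mirror the model families in Section 6,
# with GaussianNB standing in for the Bayesian network.
level1 = [DecisionTreeClassifier(random_state=0),
          LogisticRegression(max_iter=1000),
          GaussianNB(),
          MLPClassifier(max_iter=1000, random_state=0)]

# Build the level-2 dataset D': out-of-fold predictions z_it are used here
# instead of the in-sample predictions of the pseudocode.
Z = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in level1])

for m in level1:                 # then refit each level-1 learner on all of D
    m.fit(X, y)

meta = LogisticRegression(max_iter=1000).fit(Z, y)     # level-2 learner L

def predict(X_new):
    # h'(x) = L(h_1(x), ..., h_T(x))
    z = np.column_stack([m.predict(X_new) for m in level1])
    return meta.predict(z)

print(predict(X[:5]), y[:5])
```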
5.3. Program Feature Extraction
5.4. Evaluating Indicator
- The evaluating indicator is the speedup relative to the compiler's standard optimization level: speedup = T_base / T_ELOPS, where T_base expresses the program running time using the compiler's standard optimization parameters and T_ELOPS denotes the running time obtained through the ELOPS model proposed in this paper.
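As an illustrative calculation (assumed numbers, not measured data): a program that runs in 10.0 s under the standard optimization parameters and in 7.75 s with the parameters predicted by ELOPS yields a speedup of 10.0 / 7.75 ≈ 1.29×, the scale of the average prediction-phase improvement reported in Section 1.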
6. Experiment Analysis
6.1. Experiment Environment
6.2. Experiment Design
6.3. Experimental Results
6.3.1. Comparison of GA and AMPSO Search Performance
- (1) The AMPSO algorithm does not include the commonly used genetic operations such as crossover and mutation. AMPSO uses the velocity and position information of each individual in the solution space to update individuals, which makes its computational complexity significantly smaller than that of GA (a schematic comparison follows this list).
- (2) The AMPSO algorithm gives the particles a "memory": the next generation of solutions can inherit more information from their predecessors through their own learning or by learning from other particles. As a result, the AMPSO algorithm can converge to the optimum solution faster.
- (3) The AMPSO algorithm has a distinctive information-sharing mechanism. In GA, chromosomes share information with each other, so the whole population moves toward the optimal region at a relatively uniform speed. In AMPSO, information flows one way: only the best particle transmits information to the other particles.
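To make points (1) and (3) concrete, here is a schematic, illustrative Python comparison of the per-generation work (not the paper's implementation): GA runs selection, crossover, and mutation for every offspring, with roulette selection scanning the fitness array each time, while PSO performs only two vector updates per particle, blending the particle's own memory (pbest) with the single broadcast gbest.

```python
import random

def fitness(ind):
    return 1.0  # placeholder; see the Algorithm 1 sketch for a real fitness

def roulette(pop, fits):
    # Roulette selection scans the cumulative fitness: O(N) per parent drawn.
    r, acc = random.uniform(0, sum(fits)), 0.0
    for f, ind in zip(fits, pop):
        acc += f
        if acc >= r:
            return ind
    return pop[-1]

def ga_generation(pop, p_c=0.8, p_m=0.05):
    # GA: evaluate, then select/cross/mutate every offspring.
    fits = [fitness(ind) for ind in pop]
    nxt = []
    while len(nxt) < len(pop):
        a, b = roulette(pop, fits)[:], roulette(pop, fits)[:]
        if random.random() < p_c:                      # crossover operator
            t = random.random()
            a = [t * u + (1 - t) * w for u, w in zip(a, b)]
        if random.random() < p_m:                      # mutation operator
            a[random.randrange(len(a))] = random.random()
        nxt.append(a)
    return nxt

def pso_generation(x, v, pbest, gbest, w=0.6, c1=1.6, c2=1.6):
    # PSO: no genetic operators; each particle blends its own memory (pbest)
    # with the single broadcast gbest (one-way information flow).
    for i in range(len(x)):
        for d in range(len(x[i])):
            v[i][d] = (w * v[i][d]
                       + c1 * random.random() * (pbest[i][d] - x[i][d])
                       + c2 * random.random() * (gbest[d] - x[i][d]))
            x[i][d] += v[i][d]
```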
6.3.2. Comparison of Feature Selection
6.3.3. Prediction Performance Analysis
- The performance of different machine learning models
- Comparison with state-of-the-art prediction models
6.3.4. Offline Learning and Online Prediction Time
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Benacchio, T.; Bonaventura, L.; Altenbernd, M.; Cantwell, C.D.; Düben, P.D.; Gillard, M.; Giraud, L.; Göddeke, D.; Raffin, E.; Teranishi, K.; et al. Resilience and fault tolerance in high-performance computing for numerical weather and climate prediction. Int. J. High Perform. Comput. Appl. 2021, 35, 285–311.
- Gómez-Gónzalez, E.; Núñez, N.O.; Caro, C.; García-Martín, M.L.; Fernández-Afonso, Y.; de la Fuente, J.M.; Balcerzyk, M.; Ocaña, M. Dysprosium and Holmium Vanadate Nanoprobes as High-Performance Contrast Agents for High-Field Magnetic Resonance and Computed Tomography Imaging. Inorg. Chem. 2021, 60, 152–160.
- Vermaas, J.V.; Sedova, A.; Baker, M.B.; Boehm, S.; Rogers, D.M.; Larkin, J.; Glaser, J.; Smith, M.D.; Hernandez, O.; Smith, J.C. Supercomputing pipelines search for therapeutics against COVID-19. Comput. Sci. Eng. 2021, 23, 7–16.
- Chetverushkin, B.N.; Olkhovskaya, O.G.; Tsigvintsev, I.P. Numerical solution of high-temperature gas dynamics problems on high-performance computing systems. J. Comput. Appl. Math. 2021, 390, 113374.
- Chen, Y.; Pan, F.; Holzer, J.; Rothberg, E.; Ma, Y.; Veeramany, A. A High Performance Computing Based Market Economics Driven Neighborhood Search and Polishing Algorithm for Security Constrained Unit Commitment. IEEE Trans. Power Syst. 2021, 36, 292–302.
- Xu, L.; Lux, T.; Chang, T.; Li, B.; Hong, Y.; Watson, L.; Butt, A.; Yao, D.; Cameron, K. Prediction of high-performance computing input/output variability and its application to optimization for system configurations. Qual. Eng. 2021, 33, 318–334.
- Asri, M.; Malhotra, D.; Wang, J.; Biros, G.; John, L.K.; Gerstlauer, A. Hardware accelerator integration tradeoffs for high-performance computing: A case study of GEMM acceleration in N-Body methods. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2035–2048.
- Santana, A.; Freitas, V.; Castro, M.; Pilla, L.L.; Méhaut, J.F. ARTful: A model for user-defined schedulers targeting multiple high-performance computing runtime systems. Softw. Pract. Exp. 2021, 51, 1622–1638.
- Kumar, V.; Sharma, D.K.; Mishra, V.K. Cheval M: A GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams. J. Supercomput. 2021, 77, 6936–6960.
- Gai, K.; Qin, X.; Zhu, L. An energy-aware high performance task allocation strategy in heterogeneous fog computing environments. IEEE Trans. Comput. 2021, 70, 626–639.
- Gill, A.; Lalith, M.; Poledna, S.; Hori, M.; Fujita, K.; Ichimura, T. High-Performance Computing Implementations of Agent-Based Economic Models for Realizing 1:1 Scale Simulations of Large Economies. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2101–2114.
- Yang, Q.; Yang, H.; Lv, D.; Yu, R.; Li, E.; He, L.; Chen, Q.; Chen, H.; Guo, T. High-Performance Organic Synaptic Transistors with an Ultrathin Active Layer for Neuromorphic Computing. ACS Appl. Mater. Interfaces 2021, 13, 8672–8681.
- Li, J.; Zhang, X.; Han, L.; Ji, Z.; Dong, X.; Hu, C. OKCM: Improving parallel task scheduling in high-performance computing systems using online learning. J. Supercomput. 2021, 77, 5960–5983.
- Mohammed, A.; Eleliemy, A.; Ciorba, F.M.; Kasielke, F.; Banicescu, I. An approach for realistically simulating the performance of scientific applications on high performance computing systems. Future Gener. Comput. Syst. 2020, 111, 617–633.
- Morales-Hernández, M.; Sharif, M.B.; Gangrade, S.; Dullo, T.T.; Kao, S.-C.; Kalyanapu, A.; Ghafoor, S.K.; Evans, K.J.; Madadi-Kandjani, E.; Hodges, B.R. High-performance computing in water resources hydrodynamics. J. Hydroinform. 2020, 22, 1217–1235.
- Ogilvie, W.F.; Petoumenos, P.; Wang, Z.; Leather, H. Minimizing the cost of iterative compilation with active learning. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, Austin, TX, USA, 4–8 February 2017; pp. 245–256.
- Zhao, J.; Li, Y.Y.; Zhao, R.C. “Black magic” of polyhedral compilation. J. Softw. 2018, 29, 2371–2396.
- Wang, Z.; O’Boyle, M. Machine Learning in Compiler Optimization. Proc. IEEE 2018, 106, 1879–1901.
- Agakov, F.; Bonilla, E.; Cavazos, J.; Franke, B.; Fursin, G.; O’Boyle, M.F.P.; Thomson, J.; Toussaint, M.; Williams, C.K.I. Using Machine Learning to Focus Iterative Optimization. In Proceedings of the International Symposium on Code Generation and Optimization, New York, NY, USA, 26–29 March 2006; pp. 295–305.
- Colucci, A.; Juhasz, D.; Mosbeck, M.; Marchisio, A.; Rehman, S.; Kreutzer, M.; Nadbath, G.; Jantsch, A.; Shafique, M. MLComp: A Methodology for Machine Learning-based Performance Estimation and Adaptive Selection of Pareto-Optimal Compiler Optimization Sequences. In Proceedings of the IEEE/ACM Design, Automation and Test in Europe Conference, Grenoble, France, 1–5 February 2021; pp. 108–113.
- Park, E.; Cavazos, J.; Pouchet, L.-N.; Bastoul, C.; Cohen, A.; Sadayappan, P. Predictive Modeling in a Polyhedral Optimization Space. Int. J. Parallel Program. 2013, 41, 704–750.
- Pekhimenko, G.; Brown, A.D. Efficient program compilation through machine learning techniques. In Software Automatic Tuning; Springer: New York, NY, USA, 2011.
- Kulkarni, S.; Cavazos, J. Mitigating the Compiler Optimization Phase-ordering Problem using Machine Learning. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Tucson, AZ, USA, 19–26 October 2012; pp. 147–162.
- Liu, H.; Zhao, R.; Nie, K. Using Ensemble Learning to Improve Automatic Vectorization of Tensor Contraction Program. IEEE Access 2018, 6, 47112–47124.
- Mohammed, M.; Mwambi, H.; Mboya, I.B.; Elbashir, M.K.; Omolo, B. A stacking ensemble deep learning approach to cancer type classification based on TCGA data. Sci. Rep. 2021, 11, 15626.
- Park, E.; Kulkarni, S.; Cavazos, J. An Evaluation of Different Modeling Techniques for Iterative Compilation. In Proceedings of the 14th International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, Taipei, Taiwan, 9–14 October 2011; pp. 65–74.
- Yuki, T.; Renganarayanan, L.; Rajopadhye, S.; Anderson, C.; Eichenberger, A.E.; O’Brien, K. Automatic Creation of Tile Size Selection Models. In Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization, Toronto, ON, Canada, 24–28 April 2010; pp. 190–199.
- Schkufza, E.; Sharma, R.; Aiken, A. Stochastic Optimization of Floating-point Programs with Tunable Precision. ACM SIGPLAN Not. 2014, 49, 53–64.
- Purini, S.; Jain, L. Finding Good Optimization Sequences Covering Program Space. ACM Trans. Archit. Code Optim. 2013, 9, 1–23.
- Nie, K.; Zhou, Q.; Qian, H.; Pang, J.; Xu, J.; Li, Y. Parallel Region Reconstruction Technique for Sunway High-Performance Multi-core Processors. In Proceedings of the 7th International Conference of Pioneering Computer Scientists, Engineers and Educators (ICPCSEE), Taiyuan, China, 17–20 September 2021; pp. 163–179.
- Ashouri, A.H.; Killian, W.; Cavazos, J.; Palermo, G.; Silvano, C. A Survey on Compiler Autotuning using Machine Learning. ACM Comput. Surv. 2019, 51, 1–42.
- Ashouri, A.H.; Palermo, G.; Cavazos, J.; Silvano, C. Automatic Tuning of Compilers Using Machine Learning. In Springer Briefs in Applied Sciences and Technology; Springer: Cham, Switzerland, 2018.
- Kong, J.; Nie, K.; Zhou, Q.; Xu, J.; Han, L. Thread Private Variable Access Optimization Technique for Sunway High-Performance Multi-core Processors. In Proceedings of the 7th International Conference of Pioneering Computer Scientists, Engineers and Educators (ICPCSEE), Taiyuan, China, 17–20 September 2021; pp. 180–189.
- Pouchet, L.-N.; Bastoul, C.; Cohen, A.; Vasilache, N. Iterative Optimization in the Polyhedral Model: Part I, One-dimensional Time. In Proceedings of the International Symposium on Code Generation and Optimization, San Jose, CA, USA, 11–14 March 2007; pp. 144–156.
- Nobre, R.; Martins, L.G.A.; Cardoso, J.M.P. Use of Previously Acquired Positioning of Optimizations for Phase Ordering Exploration. In Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems, Sankt Goar, Germany, 1–3 June 2015; pp. 58–67.
- Fang, S.; Xu, W.; Chen, Y.; Eeckhout, L.; Temam, O.; Chen, Y.; Wu, C.; Feng, X. Practical Iterative Optimization for the Data Center. ACM Trans. Archit. Code Optim. 2015, 12, 49–60.
- Fursin, G.; Kashnikov, Y.; Memon, A.W.; Chamski, Z.; Temam, O.; Namolaru, M.; Yom-Tov, E.; Mendelson, B.; Zaks, A.; Courtois, E.; et al. Milepost GCC: Machine Learning Enabled Self-tuning Compiler. Int. J. Parallel Program. 2011, 39, 296–327.
- Fursin, G.; Miranda, C.; Temam, O.; Namolaru, M.; Yom-Tov, E.; Zaks, A.; Mendelson, B.; Bonilla, E.; Thomson, J.; Leather, H.; et al. Milepost GCC: Machine Learning Based Research Compiler; GCC Summit: Ottawa, ON, Canada, 2008; pp. 1–13.
- Ashouri, A.H.; Bignoli, A.; Palermo, G.; Silvano, C.; Kulkarni, S.; Cavazos, J. MiCOMP: Mitigating the Compiler Phase-Ordering Problem using Optimization Sub-Sequences and Machine Learning. ACM Trans. Archit. Code Optim. 2017, 14, 1–28.
- Wang, Z.; Tournavitis, G.; Franke, B.; O’Boyle, M.F.P. Integrating Profile-driven Parallelism Detection and Machine-learning-based Mapping. ACM Trans. Archit. Code Optim. 2014, 11, 1–26.
- Cummins, C.; Petoumenos, P.; Wang, Z.; Leather, H. Synthesizing Benchmarks for Predictive Modeling. In Proceedings of the International Symposium on Code Generation and Optimization, Austin, TX, USA, 4–8 February 2017; pp. 86–99.
- Ashouri, A.H.; Mariani, G.; Palermo, G.; Park, E.; Cavazos, J.; Silvano, C. COBAYN: Compiler Autotuning Framework using Bayesian Networks. ACM Trans. Archit. Code Optim. 2016, 13, 1–25.
- Leather, H.; Bonilla, E.; O’Boyle, M. Automatic Feature Generation for Machine Learning-based Optimizing Compilation. In Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization, Seattle, WA, USA, 22–25 March 2009; pp. 81–91.
- Ding, Y.; Ansel, J.; Veeramachaneni, K.; Shen, X.; O’Reilly, U.-M.; Amarasinghe, S. Autotuning Algorithmic Choice for Input Sensitivity. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, 13–17 June 2015; pp. 379–390.
- Martins, L.G.A.; Nobre, R.; Cardoso, J.M.P.; Delbem, A.C.B.; Marques, E. Clustering-Based Selection for the Exploration of Compiler Optimization Sequences. ACM Trans. Archit. Code Optim. 2016, 13, 1–28.
- Kulkarni, P.A.; Whalley, D.B.; Tyson, G.S.; Davidson, J.W. Practical Exhaustive Optimization Phase Order Exploration and Evaluation. ACM Trans. Archit. Code Optim. 2009, 6, 1–36.
- Ballal, P.A.; Sarojadevi, H.; Harsha, P.S. Compiler Optimization: A Genetic Algorithm Approach. Int. J. Comput. Appl. 2015, 112, 9–13.
- Kumar, T.S.; Sakthivel, S.; Kumar, S.S. Optimizing Code by Selecting Compiler Flags using Parallel Genetic Algorithm on Multicore CPUs. Int. J. Eng. Technol. 2014, 6, 544–551.
- Li, Y.Y.; Zhao, J.; Pang, J.M. Split tiling design and implementation in the polyhedral model. Chin. J. Comput. 2020, 43, 1038–1051.
- Torres-Madroñero, J.L.; Nieto-Londoño, C.; Sierra-Perez, J. Hybrid Energy Systems Sizing for the Colombian Context: A Genetic Algorithm and Particle Swarm Optimization Approach. Energies 2020, 13, 5648.
- Cao, Y.; Fan, X.; Guo, Y.; Li, S.; Huang, H. Multiobjective optimization of injection-molded plastic parts using entropy weight, random forest, and genetic algorithm methods. J. Polym. Eng. 2020, 40, 360–371.
- Nie, K.; Zhou, Q.; Qian, H.; Pang, J.; Xu, J.; Li, X. Loop selection for multilevel nested loops using a genetic algorithm. Math. Probl. Eng. 2021, 2021, 6643604.
- Al-Janabi, S.; Alkaim, A. A novel optimization algorithm (Lion-AYAD) to find optimal DNA protein synthesis. Egypt. Inform. J. 2022, 23, 271–290.
- Al-Janabi, S.; Mohammad, M.; Al-Sultan, A. A new method for prediction of air pollution based on intelligent computation. Soft Comput. 2020, 24, 661–680.
- Al-Janabi, S.; Alkaim, A.F.; Adel, Z. An Innovative synthesis of deep learning techniques (DCapsNet & DCOM) for generation electrical renewable energy from wind energy. Soft Comput. 2020, 24, 10943–10962.
- Liu, H.; Zhao, R.; Wang, Q.; Li, Y. ALIC: A Low Overhead Compiler Optimization Prediction Model. Wirel. Pers. Commun. 2018, 103, 809–829.
- SPEC CPU2006: SPEC CPU2006 Benchmark Suite. Available online: http://www.spec.org/cpu/ (accessed on 12 October 2021).
- Bailey, D.H.; Barszcz, E.; Barton, J.T.; Browning, D.S.; Carter, R.L.; Dagum, L.; Fatoohi, R.A.; Frederickson, P.O.; Lasinski, T.A.; Schreiber, R.S.; et al. The NAS Parallel Benchmarks. Int. J. Supercomput. Appl. 1991, 5, 63–73.
Program | Description | Code Lines
---|---|---
SWE | Shallow water equation | 12,000
OpenCFD | An open CFD code with a high-order accuracy method | 13,000
FDM | Seismic 3D forward modeling computing program | 2847
WRF | The Weather Research and Forecasting model | 860,000
GKUA | Gas-kinetic unified algorithm applied to gas flow simulations | 14,000
Benchmark | AMPSO (Platform I) | GA (Platform I) | AMPSO (Platform II) | GA (Platform II)
---|---|---|---|---
401.bzip2 | 1.51 | 1.42 | 1.49 | 1.38 |
429.mcf | 1.48 | 1.35 | 1.49 | 1.34 |
445.gobmk | 1.41 | 1.48 | 1.34 | 1.38 |
456.hmmer | 1.38 | 1.25 | 1.36 | 1.27 |
462.libquantum | 1.47 | 1.35 | 1.46 | 1.34 |
464.h264ref | 1.38 | 1.21 | 1.36 | 1.17 |
473.astar | 1.35 | 1.29 | 1.24 | 1.30 |
410.bwaves | 1.46 | 1.37 | 1.35 | 1.29 |
416.gamess | 1.37 | 1.25 | 1.35 | 1.26 |
433.milc | 1.57 | 1.45 | 1.36 | 1.24 |
435.gromacs | 1.46 | 1.37 | 1.29 | 1.28 |
437.leslie3d | 1.31 | 1.24 | 1.34 | 1.24 |
444.namd | 1.35 | 1.25 | 1.40 | 1.25 |
447.dealII | 1.38 | 1.31 | 1.34 | 1.28 |
453.povray | 1.45 | 1.29 | 1.44 | 1.23 |
454.calculix | 1.47 | 1.35 | 1.48 | 1.26 |
459.GemsFDTD | 1.36 | 1.37 | 1.30 | 1.25 |
470.lbm | 1.08 | 1.11 | 1.12 | 1.13 |
481.wrf | 1.23 | 1.17 | 1.34 | 1.19 |
482.sphinx3 | 1.41 | 1.05 | 1.29 | 1.11 |
Average | 1.39 | 1.30 | 1.36 | 1.26 |
Benchmark | IG (Platform I) | CS (Platform I) | RF (Platform I) | AMPSO (Platform I) | IG (Platform II) | CS (Platform II) | RF (Platform II) | AMPSO (Platform II)
---|---|---|---|---|---|---|---|---
401.bzip2 | 1.31 | 1.19 | 1.15 | 1.51 | 1.35 | 1.34 | 1.21 | 1.49 |
429.mcf | 1.33 | 1.13 | 1.10 | 1.48 | 1.33 | 1.26 | 1.24 | 1.49 |
445.gobmk | 1.29 | 1.16 | 1.06 | 1.41 | 1.24 | 1.09 | 1.08 | 1.34 |
456.hmmer | 1.29 | 1.25 | 1.27 | 1.38 | 1.18 | 1.16 | 1.17 | 1.29 |
462.libquantum | 1.34 | 1.14 | 1.12 | 1.47 | 1.34 | 1.13 | 1.12 | 1.46 |
464.h264ref | 1.23 | 1.05 | 1.03 | 1.38 | 1.29 | 1.17 | 1.18 | 1.36 |
473.astar | 1.23 | 1.23 | 1.21 | 1.35 | 1.21 | 1.19 | 1.17 | 1.24 |
410.bwaves | 1.33 | 1.22 | 1.21 | 1.46 | 1.23 | 1.21 | 1.22 | 1.35 |
416.gamess | 1.25 | 1.14 | 1.14 | 1.37 | 1.26 | 1.14 | 1.12 | 1.35 |
433.milc | 1.34 | 1.35 | 1.37 | 1.57 | 1.31 | 1.24 | 1.19 | 1.36 |
435.gromacs | 1.34 | 1.23 | 1.22 | 1.46 | 1.29 | 1.27 | 1.27 | 1.36 |
437.leslie3d | 1.23 | 1.19 | 1.17 | 1.31 | 1.26 | 1.16 | 1.14 | 1.34 |
444.namd | 1.32 | 1.30 | 1.29 | 1.35 | 1.34 | 1.29 | 1.27 | 1.40 |
447.dealII | 1.24 | 1.23 | 1.22 | 1.38 | 1.21 | 1.18 | 1.19 | 1.34 |
453.povray | 1.36 | 1.23 | 1.21 | 1.45 | 1.29 | 1.23 | 1.21 | 1.44 |
454.calculix | 1.32 | 1.25 | 1.24 | 1.47 | 1.32 | 1.21 | 1.19 | 1.48 |
459.GemsFDTD | 1.31 | 1.29 | 1.31 | 1.36 | 1.27 | 1.25 | 1.23 | 1.30 |
470.lbm | 1.03 | 1.01 | 1.01 | 1.08 | 1.09 | 1.07 | 1.06 | 1.12 |
481.wrf | 1.35 | 1.29 | 1.18 | 1.23 | 1.28 | 1.09 | 1.10 | 1.34 |
482.sphinx3 | 1.33 | 1.09 | 1.08 | 1.41 | 1.21 | 1.15 | 1.13 | 1.29 |
Average | 1.29 | 1.20 | 1.18 | 1.39 | 1.27 | 1.19 | 1.17 | 1.36 |
Benchmark | Platform | Model | Precision | Recall | F1
---|---|---|---|---|---
SPEC 2006 | Platform I | DT | 0.482 | 0.257 | 0.335
 | | LR | 0.549 | 0.209 | 0.303
 | | BN | 0.528 | 0.259 | 0.359
 | | NN | 0.546 | 0.269 | 0.366
 | Platform II | DT | 0.483 | 0.257 | 0.335
 | | LR | 0.550 | 0.211 | 0.305
 | | BN | 0.527 | 0.256 | 0.357
 | | NN | 0.536 | 0.268 | 0.365
Benchmark | Platform | Model | Precision | Recall | F1
---|---|---|---|---|---
SPEC 2006 | Platform I | DT + BN | 0.483 | 0.257 | 0.335
 | | LR + BN | 0.686 | 0.273 | 0.391
 | | DT + NN | 0.482 | 0.256 | 0.334
 | | LR + NN | 0.710 | 0.247 | 0.366
 | Platform II | DT + BN | 0.484 | 0.256 | 0.335
 | | LR + BN | 0.691 | 0.272 | 0.390
 | | DT + NN | 0.480 | 0.254 | 0.332
 | | LR + NN | 0.711 | 0.245 | 0.364
Parameter rank | KNN (Platform I) | SVM (Platform I) | ELOPS (Platform I) | GCC (Platform I) | KNN (Platform II) | SVM (Platform II) | ELOPS (Platform II) | GCC (Platform II)
---|---|---|---|---|---|---|---|---
Best parameter | 0.63 | 0.65 | 0.71 | 0.15 | 0.61 | 0.63 | 0.72 | 0.16 |
Second best parameter | 0.63 | 0.64 | 0.65 | 0.15 | 0.59 | 0.63 | 0.65 | 0.26 |
Third best parameter | 0.14 | 0.14 | 0.12 | 0.23 | 0.09 | 0.10 | 0.11 | 0.24 |
Fourth best parameter | 0.08 | 0.04 | 0.01 | 0.23 | 0.07 | 0.06 | 0.08 | 0.17 |
Fifth best parameter | 0.06 | 0.04 | 0.04 | 0.14 | 0.03 | 0.03 | 0.03 | 0.06 |
Sixth best parameter | 0.03 | 0.01 | 0.01 | 0.05 | 0.02 | 0.02 | 0.01 | 0.03 |
Seventh best parameter | 0.03 | 0.02 | 0.01 | 0.05 | 0.02 | 0.01 | 0.01 | 0.03 |
Benchmark | Model | Precision | Recall | F1 | AUC
---|---|---|---|---|---
IS | KNN | 0.682 | 0.267 | 0.329 | 0.732
 | SVM | 0.749 | 0.289 | 0.331 | 0.735
 | ELOPS | 0.828 | 0.359 | 0.378 | 0.885
SP | KNN | 0.783 | 0.217 | 0.331 | 0.699
 | SVM | 0.789 | 0.299 | 0.327 | 0.721
 | ELOPS | 0.817 | 0.389 | 0.366 | 0.883
SWE | KNN | 0.762 | 0.243 | 0.366 | 0.688
 | SVM | 0.749 | 0.278 | 0.334 | 0.721
 | ELOPS | 0.833 | 0.369 | 0.388 | 0.888
WRF | KNN | 0.773 | 0.258 | 0.317 | 0.688
 | SVM | 0.819 | 0.277 | 0.316 | 0.722
 | ELOPS | 0.831 | 0.371 | 0.391 | 0.887
Phase | Step | KNN (Platform I) | SVM (Platform I) | ELOPS (Platform I) | KNN (Platform II) | SVM (Platform II) | ELOPS (Platform II)
---|---|---|---|---|---|---|---
Offline training | Data collection | 11 d | 11 d | 12 d | 11 d | 12 d | 13 d
 | Model construction | 57 s | 55 s | 69 s | 63 s | 70 s | 76 s
Online prediction | Feature extraction | 11 s | 15 s | 12 s | 12 s | 16 s | 14 s
 | Model prediction | 0.79 s | 0.84 s | 0.36 s | 0.98 s | 0.97 s | 0.45 s