Batch-Wise Permutation Feature Importance Evaluation and Problem-Specific Bigraph for Learn-to-Branch
Abstract
1. Introduction
- To measure feature importance, a batch-wise PFI (BPFI) evaluation method is proposed for learn-to-branch, which permutes features within only one batch in the forward pass. The GCNN model is augmented into BPFI-GCNN by adding a shuffling switch, thereby allowing fragmented processing of the branching samples.
- Based on the results of the BPFI evaluation, a reduced bigraph representation is proposed for each specific benchmark problem to reduce model complexity. The proposed representation is shown to outperform the original in most cases in both branching accuracy and solution efficiency.
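The batch-wise permutation behind the first contribution can be sketched in a few lines. Below is an illustrative NumPy version that assumes a flat per-batch feature matrix; the actual BPFI-GCNN shuffles bigraph node features inside the GCNN forward pass, and the function and argument names here are hypothetical:

```python
import numpy as np

def forward_with_switch(batch, feature_idx=None, rng=None):
    """Toy forward pass with a BPFI 'shuffling switch'.

    batch: (n_samples, n_features) array of variable features.
    feature_idx: index of the feature to permute within this one batch,
                 or None to leave the batch untouched (switch off).
    """
    if feature_idx is not None:  # switch on: permute one feature column
        if rng is None:
            rng = np.random.default_rng(0)
        batch = batch.copy()  # never mutate the caller's batch
        batch[:, feature_idx] = rng.permutation(batch[:, feature_idx])
    # stand-in for the GCNN: a fixed linear scorer over the features
    weights = np.ones(batch.shape[1])
    return batch @ weights
```

Because the permutation is confined to a single batch, the evaluation needs only one extra forward pass per feature and seed, with no retraining.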
2. Background
2.1. Machine Learning Based Branch-and-Bound
2.2. The Bigraph Representation for State Embedding
2.3. Refined Problem-Specific Branch-and-Bound
3. Preliminaries
3.1. Benchmark Problems
- 1. The SC problem can be formulated as follows:
- 2. The CA problem can be formulated as follows:
- 3. The MIS problem can be formulated as follows:
- 4. The CFL problem can be formulated as follows:
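The formulations themselves did not survive extraction. For orientation, a textbook statement of the set covering problem (not necessarily the exact notation used in the paper) is:

```latex
\begin{aligned}
\min_{x}\ & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.}\ & \sum_{j :\, i \in S_j} x_j \ge 1, \quad i = 1, \dots, m, \\
& x_j \in \{0, 1\}, \quad j = 1, \dots, n,
\end{aligned}
```

where $S_j$ denotes the set of elements covered by column $j$ at cost $c_j$.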
3.2. Metrics
- 1. The branching accuracy is described by four metrics, i.e., the percentage of times the decision has the highest strong branching score (acc@1), one of the three highest (acc@3), one of the five highest (acc@5), and one of the ten highest (acc@10) strong branching scores.
- 2. The solution efficiency is described by two metrics, i.e., the 1-shifted geometric mean of the solving times in seconds (Time) and final node counts of instances (Nodes).
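Both metric families are straightforward to compute; a minimal sketch (function names are illustrative, not taken from the paper's code):

```python
import numpy as np

def acc_at_k(pred_scores, sb_scores, k):
    """Percentage of samples where the model's chosen variable is among
    the k variables with the highest strong branching scores."""
    hits = 0
    for pred, sb in zip(pred_scores, sb_scores):
        chosen = int(np.argmax(pred))          # model's branching decision
        top_k = np.argsort(sb)[::-1][:k]       # k best SB variables
        hits += int(chosen in top_k)
    return 100.0 * hits / len(pred_scores)

def shifted_geomean(values, shift=1.0):
    """1-shifted geometric mean, as used for the Time and Nodes metrics."""
    values = np.asarray(values, dtype=float)
    return np.exp(np.mean(np.log(values + shift))) - shift
```

The 1-shift keeps very small solving times from dominating the geometric mean, which is the usual convention in MILP benchmarking.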
4. Methodology
4.1. Batch-Wise Permutation Feature Importance
- Train and evaluate the model to obtain a baseline performance score A.
- Evaluate the model on a modified test dataset with feature i shuffled, computing the performance scores A_i^(1), …, A_i^(N) for N different shuffling seeds.
- The PFI of feature i is computed as the drop in performance after shuffling, i.e., the baseline score A minus the mean of the N shuffled scores.
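The three steps above reduce to a short loop; a minimal sketch, where the `eval_fn` interface is an assumption made for illustration:

```python
import numpy as np

def batch_pfi(eval_fn, feature_idx, n_seeds=5):
    """Permutation feature importance as a performance drop.

    eval_fn(feature_idx, seed) should return the model's performance
    score (e.g. acc@1) with the given feature shuffled, or the baseline
    score when feature_idx is None.
    """
    baseline = eval_fn(None, None)
    shuffled = [eval_fn(feature_idx, seed) for seed in range(n_seeds)]
    return baseline - float(np.mean(shuffled))  # large drop = important feature
```

A PFI near zero then marks a feature as a candidate for removal from the bigraph representation.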
4.2. Problem-Specific Bipartite Graph Representation
5. Computer Experiments
5.1. Experimental Framework
- 1. Generate instances that include the four benchmark problems, i.e., set covering, combinatorial auction, maximum independent set, and capacitated facility location.
- 2. Sample the branching decision data during the B&B solution of MILP instances with SCIP 7.0 [5], obtaining branching sample datasets for training, validation, and testing.
- 3. Train the GCNN model with the full bigraph representation, with the shuffling switch turned off.
- 4. With the shuffling switch turned on, perform the BPFI evaluation, reduce the bigraph representation for each benchmark problem, and train a GCNN with each reduced bigraph representation. As features are reduced, the GCNN also requires fewer parameters, thus decreasing in size.
- 5. Test and compare the branching accuracy of the trained models, including the full GCNN and the BPFI-GCNN.
- 6. Evaluate and compare the MILP solution efficiency of the ML-B&B models by embedding the trained GCNNs into SCIP's B&B solution process.
5.2. BPFI Evaluation and the Resulting BPFI-GCNN
5.3. Comparison of Branching Accuracy
5.4. Comparison of Problem-Solving Efficiency
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
MILP | Mixed-integer linear programming
ML | Machine learning
B&B | Branch-and-bound
GCNN | Graph convolutional neural network
BPFI | Batch-wise permutation feature importance
PFI | Permutation feature importance
SC | Set covering
CA | Combinatorial auction
MIS | Maximum independent set
CFL | Capacitated facility location
SB | Strong branching
IL | Imitation learning
RL | Reinforcement learning
DGL | Deep graph library
Appendix A
Tensor | Feature | Description
---|---|---
C | obj_cos_sim | Cosine similarity with objective.
 | bias | Bias value, normalized with constraint coefficients.
 | is_tight | Tightness indicator in LP solution.
 | dualsol_val | Dual solution value, normalized.
 | c_age | LP age, normalized with total number of LPs.
E | e_coef | Constraint coefficient, normalized per constraint.
V | type | Type (binary, integer, impl. integer, continuous) as one-hot encoding.
 | v_coef | Objective coefficient, normalized.
 | has_lb | Lower bound indicator.
 | has_ub | Upper bound indicator.
 | sol_is_at_lb | Solution value equals lower bound.
 | sol_is_at_ub | Solution value equals upper bound.
 | sol_frac | Solution value fractionality.
 | basis_status | Simplex basis status (lower, basic, upper, zero) as one-hot encoding.
 | reduced_cost | Reduced cost, normalized.
 | v_age | LP age, normalized.
 | sol_val | Solution value.
 | inc_val | Value in incumbent.
 | avg_inc_val | Average value in incumbents.
References
- Chen, B.; Zhu, D.; Wang, Y.; Zhang, P. An Approach to Combine the Power of Deep Reinforcement Learning with a Graph Neural Network for Routing Optimization. Electronics 2022, 11, 368.
- Jin, L.; Li, S.; La, H.M.; Luo, X. Manipulability optimization of redundant manipulators using dynamic neural networks. IEEE Trans. Ind. Electron. 2017, 64, 4710–4720.
- Cosic, A.; Stadler, M.; Mansoor, M.; Zellinger, M. Mixed-integer linear programming based optimization strategies for renewable energy communities. Energy 2021, 237, 121559.
- Holmström, K.; Göran, A.O.; Edvall, M.M. User's Guide for TOMLAB/CPLEX v12.1. Tomlab Optim. Retrieved 2009, 1, 2017.
- Gamrath, G.; Anderson, D.; Bestuzheva, K.; Chen, W.K.; Eifler, L.; Gasse, M.; Gemander, P.; Gleixner, A.; Gottwald, L.; Halbig, K.; et al. The SCIP Optimization Suite 7.0. 2020. Available online: http://www.optimization-online.org/DB_HTML/2020/03/7705.html (accessed on 28 May 2022).
- Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual. 2020. Available online: https://www.gurobi.com (accessed on 28 May 2022).
- Land, A.; Doig, A. An automatic method of solving discrete programming problems. Econometrica 1960, 28, 497–520.
- Dhiman, P.; Kukreja, V.; Manoharan, P.; Kaur, A.; Kamruzzaman, M.; Dhaou, I.B.; Iwendi, C. A Novel Deep Learning Model for Detection of Severity Level of the Disease in Citrus Fruits. Electronics 2022, 11, 495.
- Zarpellon, G.; Jo, J.; Lodi, A.; Bengio, Y. Parameterizing branch-and-bound search trees to learn branching policies. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 3931–3939.
- Alvarez, A.M.; Louveaux, Q.; Wehenkel, L. A Supervised Machine Learning Approach to Variable Branching in Branch-and-Bound; Technical Report; Université de Liège: Liège, Belgium, 2014.
- Khalil, E.; Le Bodic, P.; Song, L.; Nemhauser, G.; Dilkina, B. Learning to branch in mixed integer programming. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
- Hansknecht, C.; Joormann, I.; Stiller, S. Cuts, primal heuristics, and learning to branch for the time-dependent traveling salesman problem. arXiv 2018, arXiv:1805.01415.
- Alvarez, A.M.; Louveaux, Q.; Wehenkel, L. A machine learning-based approximation of strong branching. INFORMS J. Comput. 2017, 29, 185–195.
- Gasse, M.; Chételat, D.; Ferroni, N.; Charlin, L.; Lodi, A. Exact combinatorial optimization with graph convolutional neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 15554–15566.
- Balcan, M.F.; Dick, T.; Sandholm, T.; Vitercik, E. Learning to branch. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 344–353.
- Peng, C.; Liao, B. Heavy-Head Sampling for Fast Imitation Learning of Machine Learning Based Combinatorial Auction Solver. Neural Process. Lett. 2022, 1–14.
- Gupta, P.; Gasse, M.; Khalil, E.; Mudigonda, P.; Lodi, A.; Bengio, Y. Hybrid models for learning to branch. Adv. Neural Inf. Process. Syst. 2020, 33, 18087–18097.
- Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 2010, 26, 1340–1347.
- Linderoth, J.T.; Savelsbergh, M.W. A Computational Study of Search Strategies for Mixed Integer Programming. INFORMS J. Comput. 1999, 11, 173–187.
- Li, Y.; Li, J.; Pang, J. A Graph Attention Mechanism-Based Multiagent Reinforcement-Learning Method for Task Scheduling in Edge Computing. Electronics 2022, 11, 1357.
- Bengio, Y.; Lodi, A.; Prouvost, A. Machine learning for combinatorial optimization: A methodological tour d'horizon. Eur. J. Oper. Res. 2021, 290, 405–421.
- He, H.; Daume, H., III; Eisner, J.M. Learning to search in branch and bound algorithms. Adv. Neural Inf. Process. Syst. 2014, 27, 3293–3301.
- Mazyavkina, N.; Sviridov, S.; Ivanov, S.; Burnaev, E. Reinforcement learning for combinatorial optimization: A survey. Comput. Oper. Res. 2021, 134, 105400.
- Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489.
- Arnold, F.; Sörensen, K. What makes a VRP solution good? The generation of problem-specific knowledge for heuristics. Comput. Oper. Res. 2019, 106, 280–288.
- Khachay, M.; Ukolov, S.; Petunin, A. Problem-Specific Branch-and-Bound Algorithms for the Precedence Constrained Generalized Traveling Salesman Problem. In International Conference on Optimization and Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 136–148.
- Kudriavtsev, A.; Khachay, D.; Ogorodnikov, Y.; Ren, J.; Shao, S.C.; Zhang, D.; Khachay, M. The Shortest Simple Path Problem with a Fixed Number of Must-Pass Nodes: A Problem-Specific Branch-and-Bound Algorithm. In International Conference on Learning and Intelligent Optimization; Springer: Cham, Switzerland, 2021; pp. 198–210.
- Balas, E.; Ho, A. Set covering algorithms using cutting planes, heuristics, and subgradient optimization: A computational study. In Combinatorial Optimization; Springer: Berlin/Heidelberg, Germany, 1980; pp. 37–60.
- Leyton-Brown, K.; Pearson, M.; Shoham, Y. Towards a universal test suite for combinatorial auction algorithms. In Proceedings of the 2nd ACM Conference on Electronic Commerce, Minneapolis, MN, USA, 17–20 October 2000; pp. 66–76.
- Bergman, D.; Cire, A.A.; Van Hoeve, W.J.; Hooker, J. Decision Diagrams for Optimization; Springer: Berlin/Heidelberg, Germany, 2016; Volume 1.
- Cornuéjols, G.; Sridharan, R.; Thizy, J.M. A comparison of heuristics and relaxations for the capacitated plant location problem. Eur. J. Oper. Res. 1991, 50, 280–297.
- Wang, M.; Zheng, D.; Ye, Z.; Gan, Q.; Li, M.; Song, X.; Zhou, J.; Ma, C.; Yu, L.; Gai, Y.; et al. Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. arXiv 2019, arXiv:1909.01315.
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
- Burges, C.J. From ranknet to lambdarank to lambdamart: An overview. Learning 2010, 11, 81.
Problem | Accuracy Level | TREES | LMART | Full GCNN | BPFI-GCNN
---|---|---|---|---|---
Set Covering | acc@1 | 46.5 ± 1.3 | 49.6 ± 3.0 | 61.7 ± 1.4 | 63.9 ± 1.2
 | acc@3 | 64.8 ± 1.0 | 66.5 ± 3.7 | 78.4 ± 0.9 | 79.3 ± 1.4
 | acc@5 | 75.8 ± 2.3 | 73.9 ± 5.0 | 87.0 ± 1.8 | 87.9 ± 1.4
 | acc@10 | 88.6 ± 2.5 | 84.3 ± 6.0 | 95.7 ± 0.9 | 95.2 ± 0.7
Combinatorial Auction | acc@1 | 39.6 ± 4.7 | 45.8 ± 3.0 | 52.4 ± 1.1 | 55.9 ± 1.8
 | acc@3 | 61.1 ± 5.6 | 65.9 ± 2.2 | 75.8 ± 1.0 | 76.3 ± 1.4
 | acc@5 | 74.0 ± 5.6 | 85.9 ± 0.8 | 85.9 ± 0.8 | 85.6 ± 0.9
 | acc@10 | 88.7 ± 3.8 | 86.0 ± 1.8 | 94.9 ± 0.7 | 95.1 ± 0.2
Maximum Independent Set | acc@1 | 26.1 ± 3.5 | 34.3 ± 9.8 | 29.4 ± 26.9 | 53.3 ± 0.8
 | acc@3 | 35.9 ± 4.9 | 47.4 ± 8.6 | 40.8 ± 36.6 | 68.2 ± 0.9
 | acc@5 | 40.4 ± 5.2 | 53.0 ± 7.6 | 45.8 ± 38.9 | 74.1 ± 1.3
 | acc@10 | 45.2 ± 5.3 | 58.7 ± 9.8 | 53.3 ± 39.2 | 81.3 ± 1.9
Capacitated Facility Location | acc@1 | 55.8 ± 2.1 | 62.2 ± 2.3 | 67.2 ± 2.2 | 71.1 ± 1.0
 | acc@3 | 88.7 ± 2.4 | 90.7 ± 0.9 | 91.2 ± 0.8 | 92.6 ± 0.2
 | acc@5 | 95.7 ± 1.9 | 97.7 ± 0.6 | 96.8 ± 0.8 | 98.9 ± 0.2
 | acc@10 | 100.0 ± 0.0 | 100.0 ± 0.0 | 100.0 ± 0.0 | 100.0 ± 0.0
Problem | Type | Model | Node | Time | T-Stats (p-Value)
---|---|---|---|---|---
Set Covering | Easy | Full GCNN | 1.00r ± 9.3% | 1.00r ± 5.3% | NA
 | | TREES | 1.45r ± 20.1% | 1.17r ± 10.9% | 26.72 (0.0)
 | | LMART | 1.42r ± 17.8% | 0.80r ± 8.1% | −26.37 (0.0)
 | | BPFI-GCNN | 0.99r ± 6.8% | 0.98r ± 4.3% | −0.36 (0.7)
 | Medium | Full GCNN | 1.00r ± 16.6% | 1.00r ± 15.7% | NA
 | | TREES | 1.35r ± 14.8% | 1.53r ± 12.6% | 22.50 (0.0)
 | | LMART | 1.71r ± 27.6% | 0.99r ± 22.7% | 0.52 (0.6)
 | | BPFI-GCNN | 0.89r ± 9.0% | 0.95r ± 7.9% | −4.98 (0.0)
 | Hard | Full GCNN | 1.00r ± 31.6% | 1.00r ± 29.4% | NA
 | | TREES | 0.87r ± 13.7% | 1.17r ± 9.3% | 1.93 (0.1)
 | | LMART | 1.42r ± 20.3% | 1.04r ± 17.7% | 0.10 (0.9)
 | | BPFI-GCNN | 0.75r ± 8.2% | 0.79r ± 6.7% | −4.04 (0.0)
Combinatorial Auction | Easy | Full GCNN | 1.00r ± 12.3% | 1.00r ± 8.2% | NA
 | | TREES | 1.28r ± 29.7% | 1.11r ± 16.7% | 7.03 (0.0)
 | | LMART | 1.33r ± 24.6% | 0.71r ± 9.0% | −30.69 (0.0)
 | | BPFI-GCNN | 0.99r ± 13.2% | 1.02r ± 8.6% | 2.19 (0.0)
 | Medium | Full GCNN | 1.00r ± 14.2% | 1.00r ± 11.5% | NA
 | | TREES | 4.0r ± 117.9% | 4.01r ± 111.7% | 8.29 (0.0)
 | | LMART | 2.30r ± 34.6% | 1.16r ± 28.7% | 7.99 (0.0)
 | | BPFI-GCNN | 1.01r ± 12.0% | 0.99r ± 8.9% | −1.64 (0.1)
 | Hard | Full GCNN | 1.00r ± 19.1% | 1.00r ± 16.6% | NA
 | | TREES | 8.20r ± 58.8% | 11.01r ± 58.8% | 15.22 (0.0)
 | | LMART | 4.11r ± 73.2% | 3.01r ± 71.9% | 10.38 (0.0)
 | | BPFI-GCNN | 0.97r ± 14.3% | 0.93r ± 13.2% | −5.33 (0.0)
Maximum Independent Set | Easy | Full GCNN | 1.00r ± 160.4% | 1.00r ± 102.9% | NA
 | | TREES | 0.55r ± 71.1% | 0.73r ± 36.8% | −7.46 (0.0)
 | | LMART | 0.45r ± 108.9% | 0.41r ± 36.5% | −9.51 (0.0)
 | | BPFI-GCNN | 0.20r ± 53.7% | 0.36r ± 16.1% | −9.62 (0.0)
 | Medium | Full GCNN | 1.00r ± 83.2% | 1.00r ± 82.1% | NA
 | | TREES | 1.93r ± 17.7% | 3.20r ± 11.6% | 5.55 (0.0)
 | | LMART | 0.61r ± 85.7% | 0.51r ± 77.2% | −4.36 (0.0)
 | | BPFI-GCNN | 0.13r ± 94.9% | 0.20r ± 81.4% | −7.75 (0.0)
 | Hard | Full GCNN | 1.00r ± 45.2% | 1.00r ± 24.2% | NA
 | | TREES | 0.76r ± 9.3% | 1.29r ± 2.4% | 4.26 (0.0)
 | | LMART | 1.03r ± 29.7% | 1.06r ± 9.0% | 1.55 (0.1)
 | | BPFI-GCNN | 0.76r ± 36.4% | 0.79r ± 30.2% | −2.49 (0.01)
Capacitated Facility Location | Easy | Full GCNN | 1.00r ± 23.5% | 1.00r ± 16.3% | NA
 | | TREES | 1.15r ± 24.3% | 1.50r ± 17.5% | 24.42 (0.0)
 | | LMART | 1.10r ± 22.7% | 0.88r ± 16.6% | −4.37 (0.0)
 | | BPFI-GCNN | 0.99r ± 21.7% | 0.97r ± 15.1% | −1.70 (0.09)
 | Medium | Full GCNN | 1.00r ± 15.0% | 1.00r ± 13.8% | NA
 | | TREES | 1.18r ± 18.4% | 1.47r ± 16.6% | 19.6 (0.0)
 | | LMART | 1.00r ± 17.6% | 0.89r ± 14.9% | −7.25 (0.0)
 | | BPFI-GCNN | 0.99r ± 14.3% | 0.96r ± 12.8% | −1.46 (0.15)
 | Hard | Full GCNN | 1.00r ± 16.0% | 1.00r ± 14.6% | NA
 | | TREES | 1.07r ± 16.3% | 1.28r ± 16.1% | 12.49 (0.0)
 | | LMART | 0.87r ± 13.9% | 0.86r ± 13.3% | −8.64 (0.0)
 | | BPFI-GCNN | 0.90r ± 11.7% | 0.89r ± 12.0% | −6.85 (0.0)
Share and Cite
Niu, Y.; Peng, C.; Liao, B. Batch-Wise Permutation Feature Importance Evaluation and Problem-Specific Bigraph for Learn-to-Branch. Electronics 2022, 11, 2253. https://doi.org/10.3390/electronics11142253