Quantitative and Qualitative Comparison of Decision-Map Techniques for Explaining Classification Models
Abstract
1. Introduction
- We propose a suite of metrics to quantitatively evaluate decision-map techniques.
- We conduct a comprehensive comparison of existing decision-map techniques, both quantitatively (by comparing the aforementioned metrics) and qualitatively (by visually comparing the obtained decision-map images).
- We propose a workflow to guide the selection of the most suitable decision-map technique for a given dataset based on a set of desirable requirements.
2. Related Works
2.1. Preliminaries
2.2. Overall Workflow of Decision Map
- Train a classifier f on a dataset D. This is the classifier whose decision map we next want to visualize.
- Construct a direct projection P and an inverse projection P⁻¹ using the dataset D.
- Project D to create a 2D scatter plot P(D).
- Sample the extent of P(D) on a uniform pixel grid I.
- Backproject all pixels of I to the data space using P⁻¹.
- Use f to predict the labels of the backprojected points.
- Color I according to the predicted labels.
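The steps above can be sketched in a few lines of Python. This is an illustrative, minimal implementation only: it uses PCA as both the direct projection P and the inverse projection P⁻¹ (since PCA is trivially invertible), logistic regression as the classifier f, and a synthetic blobs dataset; the actual DBM/SDBM/DeepView techniques use more sophisticated projections.

```python
# Minimal sketch of the generic decision-map pipeline, assuming PCA as the
# invertible projection pair (P, P^-1) and logistic regression as f.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=500, n_features=10, centers=5, random_state=0)

f = LogisticRegression(max_iter=1000).fit(X, y)  # train classifier f on D
P = PCA(n_components=2).fit(X)                   # direct projection P (and P^-1)
X2 = P.transform(X)                              # 2D scatter plot P(D)

G = 100                                          # uniform G x G pixel grid I
xs = np.linspace(X2[:, 0].min(), X2[:, 0].max(), G)
ys = np.linspace(X2[:, 1].min(), X2[:, 1].max(), G)
gx, gy = np.meshgrid(xs, ys)
grid2d = np.c_[gx.ravel(), gy.ravel()]

grid_nd = P.inverse_transform(grid2d)            # backproject pixels via P^-1
labels = f.predict(grid_nd)                      # predict labels with f
decision_map = labels.reshape(G, G)              # color I by predicted label
print(decision_map.shape)
```

Rendering `decision_map` with a categorical colormap (e.g., `matplotlib.pyplot.imshow`) then yields the decision-map image.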
2.3. Decision Boundary Maps (DBMs)
2.4. Supervised Decision Boundary Maps (SDBMs)
2.5. DeepView (DV)
2.6. Limitations
3. Evaluation Method
3.1. Global Metrics
3.2. Local Metrics
3.3. Datasets
3.4. Classifiers
4. Comparison Results
4.1. Global Metrics of Real-World Datasets
4.2. Interpreting Local Metrics on a Synthetic Dataset
- Distance to the nearest data N: We expected this distance to be small for pixels close to the actual projections of data points and larger for pixels far away from these points. In other words, we expected the 2D distance (to the projections of the data points) to mimic the nD distance (to the actual data points).
- Distance to the decision boundary B: Similar to the above, we expected that the points close to the decision boundaries (in 2D) would also be close to the decision boundaries (in nD), and vice versa.
- Class stability map S: Ideally, we would have liked most pixels to have high stability values, especially those close to the decision boundaries, which are precisely the regions of most interest when studying a decision map.
- Gradient map G: We expected a smooth gradient map without any discontinuities or peaks. Ideally, we also would have liked low gradients close to the decision boundaries (for the same reason mentioned above for class stability).
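Two of the local metrics above admit compact implementations. The sketch below is illustrative only: it computes a distance-to-nearest-data map N by brute-force nD distances, and a gradient map G via finite differences; the paper's exact definitions may differ in normalization and sampling.

```python
# Illustrative local metrics for a decision map (assumed definitions).
import numpy as np

def nearest_data_map(grid_nd, X):
    """N: for each backprojected pixel, nD distance to its closest data point."""
    # pairwise distances via broadcasting: shape (num_pixels, num_points)
    d = np.linalg.norm(grid_nd[:, None, :] - X[None, :, :], axis=-1)
    return d.min(axis=1)

def gradient_map(img):
    """G: magnitude of the finite-difference gradient of a per-pixel map."""
    gy, gx = np.gradient(img)
    return np.sqrt(gx**2 + gy**2)

# Toy example: 4 backprojected pixels in 3D, 2 data points.
grid_nd = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
X = np.array([[0., 0, 0], [2, 2, 0]])
N = nearest_data_map(grid_nd, X)
G = gradient_map(N.reshape(2, 2))
print(N)  # nD distance from each pixel to its closest data point
```

High values of N flag map regions far from any training data (extrapolation areas), while discontinuities in G flag places where the inverse projection behaves erratically.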
4.3. Analyzing Local Metrics on Real-World Datasets
4.3.1. Decision Maps
4.3.2. Smoothness
4.3.3. Class Stability Map
4.3.4. Distance to Decision Boundary
4.3.5. Distance to the Nearest Training Data
4.4. Computational Efficiency
5. Discussion
5.1. Decision Maps for Deep Learning Variations
5.2. Workflow to Guide the Selection of a Decision-Map Technique
- If the user has already chosen a projection function P, they should select the DBM, as this is the only method that can accommodate a predefined P, and proceed to step 4.
- If the user does not have a specific P, the next key aspect to consider is computational efficiency. If speed is important and the data to be visualized are large, DV should be excluded from consideration, and the workflow proceeds to step 4.
- If the user does not have a specific P and computational efficiency is not a concern, one should consider if smoothness is important. If yes, DV should be excluded, and one can proceed to step 4.
- If more than a single classifier–decision map combination remains to be chosen from, global quality metrics can be used to select the optimal one. The key metrics here are the global metrics defined in Section 3.1, which can be computed for any combination of direct projection, inverse projection, and classifier. Note that some of these metrics are not available when the selected projection P cannot project new data, as is the case for non-parametric projections without out-of-sample support, such as t-SNE.
- Finally, visualizations of local metrics can be used to gain more trust in and/or a better understanding of the behavior of the chosen decision map, as described in the scenarios in Section 4.
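The selection steps above can be encoded as a simple filter over the three technique families. The function below is a paraphrase of the workflow for illustration, not an official API; the flag names are my own.

```python
# Illustrative encoding of the decision-map selection workflow.
def select_candidates(has_fixed_projection: bool,
                      speed_matters_on_large_data: bool,
                      smoothness_matters: bool) -> list:
    # Step 1: only DBM can accommodate a predefined projection P.
    if has_fixed_projection:
        return ["DBM"]
    candidates = ["DBM", "SDBM", "DV"]
    # Steps 2-3: DV is excluded when speed or smoothness is a priority.
    if speed_matters_on_large_data or smoothness_matters:
        candidates.remove("DV")
    # Step 4: rank the remaining candidates using the global quality metrics.
    return candidates

print(select_candidates(True, False, False))   # ['DBM']
print(select_candidates(False, True, False))   # ['DBM', 'SDBM']
```

In practice, step 4 would then score each remaining candidate with the global metrics of Section 3.1 and pick the best-scoring combination.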
5.3. What Decision Maps Really Are
5.4. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Dataset | Type | Training Samples | Test Samples | Dimensionality | No. of Classes
---|---|---|---|---|---
Synthetic Blobs | Synthetic | 1000 | 500 | 100 | 5
HAR | Time Series | 5000 | 2352 | 561 | 6
MNIST | Image | 5000 | 5000 | 784 | 10
FashionMNIST | Image | 5000 | 5000 | 784 | 10
Reuters Newswire | Text | 5000 | 2432 | 5000 | 6
Wang, Y.; Machado, A.; Telea, A. Quantitative and Qualitative Comparison of Decision-Map Techniques for Explaining Classification Models. Algorithms 2023, 16, 438. https://doi.org/10.3390/a16090438