Evaluating High-Variance Leaves as Uncertainty Measure for Random Forest Regression
Abstract
1. Introduction
2. Results
2.1. Predictive Performance
2.2. Area under the Confidence–Oracle Error Curve after Removing 50% of the Most Uncertain Predictions
2.3. Decline in When Omitting the Least Certain Predictions
3. Discussion
4. Materials and Methods
4.1. Data Acquisition
4.2. Data Preparation
4.3. Machine Learning Setup
4.4. Predictive Quality
4.5. Uncertainty Assessment
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Area under the confidence–oracle error | |
HVL | High-variance leaf |
MSE | Mean squared error |
QSAR | Quantitative structure–activity relationship |
RF | Random forest |
SDEP | Standard deviation of ensemble predictions |
UM | Uncertainty measure |
Dataset | RDKit descriptors, train MSE | RDKit descriptors, train R² | RDKit descriptors, test MSE | RDKit descriptors, test R² | ECFPs, train MSE | ECFPs, train R² | ECFPs, test MSE | ECFPs, test R² |
---|---|---|---|---|---|---|---|---|
F7 | 0.049 | 0.948 | 0.277 | 0.705 | 0.077 | 0.918 | 0.303 | 0.676 |
IL4 | 0.027 | 0.912 | 0.146 | 0.520 | 0.036 | 0.883 | 0.133 | 0.563 |
MMP2 | 0.084 | 0.905 | 0.465 | 0.474 | 0.102 | 0.885 | 0.389 | 0.560 |
O60674 | 0.133 | 0.909 | 0.742 | 0.495 | 0.139 | 0.905 | 0.597 | 0.594 |
O14965 | 0.106 | 0.938 | 0.610 | 0.642 | 0.104 | 0.939 | 0.449 | 0.736 |
P03372 | 0.097 | 0.945 | 0.489 | 0.720 | 0.120 | 0.931 | 0.497 | 0.715 |
P04150 | 0.085 | 0.936 | 0.444 | 0.664 | 0.110 | 0.917 | 0.445 | 0.664 |
P06401 | 0.070 | 0.947 | 0.399 | 0.696 | 0.089 | 0.932 | 0.373 | 0.715 |
P11229 | 0.087 | 0.951 | 0.500 | 0.717 | 0.129 | 0.927 | 0.522 | 0.705 |
P12931 | 0.100 | 0.947 | 0.546 | 0.712 | 0.113 | 0.940 | 0.478 | 0.747 |
P16581 | 0.108 | 0.937 | 0.620 | 0.640 | 0.140 | 0.919 | 0.539 | 0.688 |
P17252 | 0.109 | 0.928 | 0.539 | 0.642 | 0.125 | 0.917 | 0.495 | 0.671 |
P18089 | 0.103 | 0.932 | 0.606 | 0.598 | 0.122 | 0.919 | 0.640 | 0.575 |
P19327 | 0.096 | 0.924 | 0.565 | 0.553 | 0.107 | 0.915 | 0.481 | 0.620 |
P21554 | 0.095 | 0.955 | 0.556 | 0.736 | 0.115 | 0.945 | 0.513 | 0.757 |
P24530 | 0.068 | 0.958 | 0.368 | 0.770 | 0.077 | 0.952 | 0.306 | 0.809 |
P25929 | 0.081 | 0.955 | 0.473 | 0.739 | 0.091 | 0.950 | 0.386 | 0.787 |
P28335 | 0.088 | 0.923 | 0.519 | 0.541 | 0.109 | 0.904 | 0.499 | 0.560 |
P28482 | 0.074 | 0.915 | 0.404 | 0.532 | 0.077 | 0.911 | 0.393 | 0.546 |
P35968 | 0.109 | 0.934 | 0.648 | 0.608 | 0.119 | 0.928 | 0.544 | 0.671 |
P41594 | 0.113 | 0.910 | 0.664 | 0.473 | 0.133 | 0.894 | 0.561 | 0.555 |
P42345 | 0.095 | 0.963 | 0.534 | 0.790 | 0.105 | 0.959 | 0.450 | 0.823 |
P47871 | 0.079 | 0.929 | 0.451 | 0.591 | 0.097 | 0.912 | 0.421 | 0.619 |
P49146 | 0.071 | 0.954 | 0.399 | 0.741 | 0.096 | 0.938 | 0.423 | 0.726 |
P61169 | 0.087 | 0.910 | 0.498 | 0.486 | 0.100 | 0.897 | 0.417 | 0.570 |
Q05397 | 0.084 | 0.919 | 0.484 | 0.534 | 0.109 | 0.895 | 0.471 | 0.546 |
Q16602 | 0.073 | 0.968 | 0.439 | 0.808 | 0.100 | 0.956 | 0.416 | 0.818 |
P24941 | 0.115 | 0.940 | 0.661 | 0.655 | 0.136 | 0.929 | 0.581 | 0.697 |
Q92731 | 0.097 | 0.926 | 0.496 | 0.619 | 0.127 | 0.902 | 0.516 | 0.604 |
TETRAH | 0.036 | 0.967 | 0.190 | 0.822 | 0.082 | 0.923 | 0.291 | 0.727 |
DELANEY | 0.073 | 0.983 | 0.413 | 0.906 | 0.349 | 0.921 | 1.278 | 0.709 |
FREESOLV | 0.281 | 0.981 | 1.539 | 0.896 | 1.351 | 0.909 | 4.516 | 0.694 |
Median | 0.088 | 0.938 | 0.497 | 0.649 | 0.109 | 0.919 | 0.474 | 0.682 |
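As an illustration of the two metrics reported in the table above, the following is a minimal sketch of how MSE and R² could be computed for one train or test split. It assumes the common definition of R² as one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the evaluated responses; the source does not state which R² variant was used, so this is an assumption. The function names and the toy values are hypothetical and are not taken from the article.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted responses."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred):
    """R^2 as 1 - SS_res / SS_tot (assumed variant, see the lead-in).

    SS_tot is taken around the mean of the evaluated responses themselves.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical toy values, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 2.0, 2.5, 4.0]

print(f"MSE = {mse(y_obs, y_hat):.3f}")
print(f"R2  = {r2(y_obs, y_hat):.3f}")
```

Under this definition a lower MSE and a higher R² both indicate a better fit, which matches the pattern in the table, where train values are consistently better than test values.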
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dutschmann, T.-M.; Baumann, K. Evaluating High-Variance Leaves as Uncertainty Measure for Random Forest Regression. Molecules 2021, 26, 6514. https://doi.org/10.3390/molecules26216514