A Hybrid Docking and Machine Learning Approach to Enhance the Performance of Virtual Screening Carried out on Protein–Protein Interfaces
Abstract
1. Introduction
2. Results
2.1. Virtual Screening Performance of GOLD Scoring Functions
2.2. Post-Docking Derivatization of SASA Descriptors
2.3. Virtual Screening Performance of SASA Descriptors
2.4. Building and Validating Machine Learning Models Using SASA Descriptors
2.5. Enrichment Factors Estimation of Machine Learning Models
3. Discussion
4. Materials and Methods
4.1. Rescoring with GOLD Scoring Functions and Consensus Approach
4.2. Solvent Accessible Surface Area (SASA) Calculations and Derivatization of SASA Descriptors
4.3. Machine Learning Models Generation
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
References
Target Name | PDB IDs | Target ChEMBL ID | N actives | N inactives | Training Set (N actives) | Training Set (N inactives) | Test Set (N actives) | Test Set (N inactives) |
---|---|---|---|---|---|---|---|---|
Bromodomain Adjacent to Zinc Finger Domain 2B (BAZ2B) | 4XUA, 5E73 | CHEMBL1741220 | 6852 | 49,457 | 4794 | 34,598 | 2055 | 14,827 |
Apoptosis regulator Bcl-2 | 2O21, 4LVT | CHEMBL4860 | 1788 | 49,082 | 1240 | 34,343 | 531 | 14,718 |
Apoptosis regulator Bcl-xL | 3INQ, 3WIZ | CHEMBL4625 | 971 | 49,190 | 674 | 34,420 | 289 | 14,751 |
BRD4 bromodomain 1 (BRD4-1) | 5D3L, 5KU3 | CHEMBL1163125 | 847 | 981 | 592 | 687 | 253 | 294 |
CREB-binding protein (CREBBP) | 5EIC, 5MMG | CHEMBL5747 | 1360 | 48,781 | 910 | 34,425 | 390 | 13,896 |
HIV Integrase (HIV IN) | 4CFD, 4CHO | CHEMBL2366505 | 905 | 1232 | 610 | 855 | 261 | 366 |
Inhibitor of apoptosis protein 3 (XIAP) | 1TFT, 5C3H | CHEMBL4198 | 1145 | 49,351 | 793 | 34,528 | 340 | 14,798 |
Induced myeloid leukemia cell differentiation protein Mcl-1 | 5FC4, 5MES | CHEMBL4361 | 1455 | 49,112 | 995 | 34,147 | 426 | 14,635 |
E3 ubiquitin-protein ligase Mdm2 | 4ODF, 4ZFI | CHEMBL5023 | 2227 | 4351 | 1559 | 3040 | 668 | 1303 |
Menin | 5DB2, 6B41 | CHEMBL2093861 | 705 | 31,510 | 491 | 22,020 | 211 | 9437 |

Target | PDB ID | Surflex AUC | Surflex BEDROC | GoldScore AUC | GoldScore BEDROC | ChemScore AUC | ChemScore BEDROC | ASP AUC | ASP BEDROC | ChemPLP AUC | ChemPLP BEDROC | ChemScore RDS AUC | ChemScore RDS BEDROC | Consensus Scoring AUC | Consensus Scoring BEDROC |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BAZ2B | 4XUA | 0.491 | 0.120 | 0.489 | 0.127 | 0.505 | 0.144 | 0.496 | 0.139 | 0.489 | 0.127 | 0.503 | 0.132 | 0.495 | 0.133 |
BAZ2B | 5E73 | 0.482 | 0.120 | 0.482 | 0.118 | 0.507 | 0.140 | 0.493 | 0.132 | 0.487 | 0.123 | 0.506 | 0.139 | 0.493 | 0.127 |
Bcl-2 | 2O21 | 0.857 | 0.641 | 0.860 | 0.640 | 0.839 | 0.588 | 0.883 | 0.595 | 0.853 | 0.711 | 0.590 | 0.209 | 0.798 | 0.548 |
Bcl-2 | 4LVT | 0.886 | 0.701 | 0.877 | 0.700 | 0.864 | 0.690 | 0.887 | 0.694 | 0.869 | 0.739 | 0.665 | 0.368 | 0.886 | 0.722 |
Bcl-xL | 3INQ | 0.844 | 0.523 | 0.799 | 0.463 | 0.753 | 0.408 | 0.786 | 0.411 | 0.776 | 0.513 | 0.555 | 0.188 | 0.770 | 0.435 |
Bcl-xL | 3WIZ | 0.823 | 0.500 | 0.821 | 0.521 | 0.792 | 0.510 | 0.816 | 0.461 | 0.823 | 0.614 | 0.541 | 0.174 | 0.805 | 0.573 |
BRD4-1 | 5D3L | 0.696 | 0.672 | 0.778 | 0.777 | 0.725 | 0.808 | 0.801 | 0.818 | 0.793 | 0.858 | 0.671 | 0.852 | 0.799 | 0.849 |
BRD4-1 | 5KU3 | 0.426 | 0.254 | 0.652 | 0.600 | 0.531 | 0.516 | 0.678 | 0.674 | 0.632 | 0.572 | 0.503 | 0.530 | 0.625 | 0.643 |
CREBBP | 5EIC | 0.604 | 0.100 | 0.627 | 0.114 | 0.598 | 0.123 | 0.675 | 0.159 | 0.643 | 0.134 | 0.546 | 0.082 | 0.664 | 0.155 |
CREBBP | 5MMG | 0.662 | 0.160 | 0.654 | 0.134 | 0.579 | 0.113 | 0.631 | 0.138 | 0.613 | 0.119 | 0.540 | 0.084 | 0.620 | 0.135 |
HIV IN | 4CFD | 0.627 | 0.604 | 0.590 | 0.356 | 0.462 | 0.401 | 0.614 | 0.565 | 0.595 | 0.476 | 0.451 | 0.335 | 0.567 | 0.533 |
HIV IN | 4CHO | 0.631 | 0.639 | 0.617 | 0.425 | 0.473 | 0.371 | 0.620 | 0.589 | 0.614 | 0.537 | 0.455 | 0.335 | 0.571 | 0.521 |
XIAP | 1TFT | 0.888 | 0.767 | 0.851 | 0.636 | 0.823 | 0.582 | 0.811 | 0.493 | 0.845 | 0.572 | 0.752 | 0.535 | 0.806 | 0.563 |
XIAP | 5C3H | 0.892 | 0.704 | 0.676 | 0.341 | 0.726 | 0.492 | 0.765 | 0.502 | 0.778 | 0.632 | 0.739 | 0.551 | 0.735 | 0.626 |
Mcl-1 | 5FC4 | 0.585 | 0.170 | 0.636 | 0.228 | 0.657 | 0.257 | 0.658 | 0.212 | 0.642 | 0.243 | 0.595 | 0.223 | 0.659 | 0.241 |
Mcl-1 | 5MES | 0.59 | 0.150 | 0.630 | 0.205 | 0.602 | 0.167 | 0.596 | 0.152 | 0.605 | 0.166 | 0.524 | 0.130 | 0.605 | 0.168 |
Mdm2 | 4ODF | 0.64 | 0.842 | 0.712 | 0.869 | 0.633 | 0.791 | 0.706 | 0.832 | 0.681 | 0.822 | 0.531 | 0.649 | 0.661 | 0.833 |
Mdm2 | 4ZFI | 0.57 | 0.600 | 0.657 | 0.744 | 0.535 | 0.442 | 0.594 | 0.539 | 0.554 | 0.400 | 0.435 | 0.242 | 0.566 | 0.492 |
Menin | 5DB2 | 0.548 | 0.104 | 0.534 | 0.136 | 0.525 | 0.099 | 0.533 | 0.124 | 0.536 | 0.122 | 0.527 | 0.087 | 0.532 | 0.129 |
Menin | 6B41 | 0.517 | 0.065 | 0.521 | 0.104 | 0.478 | 0.066 | 0.509 | 0.092 | 0.527 | 0.093 | 0.469 | 0.059 | 0.489 | 0.076 |

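For readers unfamiliar with the two metrics reported per scoring function, the following is a minimal sketch of how ROC AUC and BEDROC can be computed from a ranked hit list. It is not the authors' implementation. The function names `roc_auc` and `bedroc`, the input convention (a list of 0/1 labels already sorted from best to worst score, with no ties) and the default `alpha = 20.0` are all assumptions made for illustration; the BEDROC expression follows the bounded, rescaled form of the robust initial enhancement described by Truchon and Bayly.

```python
import math


def roc_auc(ranked_labels):
    """ROC AUC for labels ordered best score first (1 = active, 0 = inactive).

    Assumes the list is already ranked and contains no tied scores.
    """
    n_act = sum(ranked_labels)
    n_inact = len(ranked_labels) - n_act
    if n_act == 0 or n_inact == 0:
        raise ValueError("need at least one active and one inactive")
    inactives_seen = 0
    pairs_won = 0
    for label in ranked_labels:
        if label:
            # Each inactive ranked below this active is a correctly ordered pair.
            pairs_won += n_inact - inactives_seen
        else:
            inactives_seen += 1
    return pairs_won / (n_act * n_inact)


def bedroc(ranked_labels, alpha=20.0):
    """BEDROC for labels ordered best score first; alpha is an assumed default."""
    n_total = len(ranked_labels)
    n_act = sum(ranked_labels)
    if n_act == 0 or n_act == n_total:
        raise ValueError("need at least one active and one inactive")
    ratio = n_act / n_total
    # Exponentially weighted sum over the 1-based ranks of the actives.
    sum_exp = sum(
        math.exp(-alpha * rank / n_total)
        for rank, label in enumerate(ranked_labels, start=1)
        if label
    )
    # Expected per-active weight if actives were spread uniformly over the ranks.
    random_term = (1.0 / n_total) * (1.0 - math.exp(-alpha)) / (
        math.exp(alpha / n_total) - 1.0
    )
    rie = sum_exp / (n_act * random_term)
    # Best and worst achievable RIE, used to rescale the result into [0, 1].
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


if __name__ == "__main__":
    # Hypothetical ranked list: two actives near the top, one further down.
    ranked = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
    print(f"AUC    = {roc_auc(ranked):.3f}")
    print(f"BEDROC = {bedroc(ranked):.3f}")
```

Because BEDROC weights early ranks exponentially, a target can show a modest AUC alongside a high BEDROC when a subset of actives is concentrated at the very top of the list, which is one way to read rows where the two columns diverge.
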
Target | PDB ID | Tree | Bagged Forest | Random Forest | Bayesian | SVM | Logistic Regression | Neural Net | Neural Net (Bagging) |
---|---|---|---|---|---|---|---|---|---|
BAZ2B | 4XUA | 0.543 ± 0.013 | 0.500 ± 0.000 | 0.601 ± 0.145 | 0.602 ± 0.001 | 0.758 ± 0.005 | 0.550 ± 0.018 | 0.543 ± 0.009 | 0.651 ± 0.024 |
BAZ2B | 5E73 | 0.534 ± 0.010 | 0.500 ± 0.000 | 0.631 ± 0.132 | 0.602 ± 0.001 | 0.772 ± 0.002 | 0.535 ± 0.016 | 0.542 ± 0.004 | 0.671 ± 0.021 |
Bcl-2 | 2O21 | 0.965 ± 0.011 | 0.943 ± 0.001 | 0.999 ± 0.117 | 0.984 ± 0.000 | 0.990 ± 0.000 | 0.973 ± 0.000 | 0.992 ± 0.001 | 0.999 ± 0.000 |
Bcl-2 | 4LVT | 0.948 ± 0.004 | 0.940 ± 0.001 | 0.998 ± 0.038 | 0.984 ± 0.000 | 0.988 ± 0.000 | 0.979 ± 0.000 | 0.989 ± 0.001 | 0.999 ± 0.000 |
Bcl-xL | 3INQ | 0.971 ± 0.032 | 0.907 ± 0.004 | 0.997 ± 0.129 | 0.983 ± 0.001 | 0.991 ± 0.001 | 0.980 ± 0.000 | 0.987 ± 0.001 | 0.999 ± 0.001 |
Bcl-xL | 3WIZ | 0.973 ± 0.019 | 0.906 ± 0.007 | 0.998 ± 0.011 | 0.977 ± 0.000 | 0.999 ± 0.001 | 0.975 ± 0.000 | 0.988 ± 0.005 | 0.999 ± 0.000 |
BRD4-1 | 5D3L | 0.962 ± 0.028 | 0.972 ± 0.007 | 0.968 ± 0.013 | 0.936 ± 0.001 | 0.952 ± 0.001 | 0.917 ± 0.001 | 0.947 ± 0.013 | 0.999 ± 0.003 |
BRD4-1 | 5KU3 | 0.972 ± 0.049 | 0.966 ± 0.005 | 0.957 ± 0.008 | 0.924 ± 0.001 | 0.927 ± 0.001 | 0.900 ± 0.001 | 0.928 ± 0.010 | 0.999 ± 0.003 |
CREBBP | 5EIC | 0.635 ± 0.042 | 0.500 ± 0.000 | 0.887 ± 0.178 | 0.794 ± 0.001 | 0.938 ± 0.003 | 0.766 ± 0.000 | 0.766 ± 0.011 | 0.938 ± 0.033 |
CREBBP | 5MMG | 0.714 ± 0.070 | 0.500 ± 0.000 | 0.883 ± 0.153 | 0.746 ± 0.009 | 0.946 ± 0.004 | 0.734 ± 0.014 | 0.760 ± 0.022 | 0.934 ± 0.028 |
HIV IN | 4CFD | 0.970 ± 0.008 | 0.920 ± 0.017 | 0.936 ± 0.004 | 0.827 ± 0.004 | 0.895 ± 0.002 | 0.781 ± 0.002 | 0.831 ± 0.024 | 0.999 ± 0.005 |
HIV IN | 4CHO | 0.931 ± 0.055 | 0.928 ± 0.009 | 0.929 ± 0.019 | 0.832 ± 0.014 | 0.896 ± 0.003 | 0.770 ± 0.001 | 0.833 ± 0.052 | 0.999 ± 0.005 |
XIAP | 1TFT | 0.972 ± 0.089 | 0.974 ± 0.038 | 0.998 ± 0.127 | 0.981 ± 0.000 | 0.997 ± 0.000 | 0.984 ± 0.000 | 0.994 ± 0.005 | 0.999 ± 0.000 |
XIAP | 5C3H | 0.985 ± 0.006 | 0.984 ± 0.007 | 0.998 ± 0.116 | 0.981 ± 0.003 | 0.998 ± 0.000 | 0.987 ± 0.000 | 0.997 ± 0.000 | 0.999 ± 0.000 |
Mcl-1 | 5FC4 | 0.752 ± 0.068 | 0.578 ± 0.006 | 0.930 ± 0.143 | 0.811 ± 0.002 | 0.835 ± 0.002 | 0.817 ± 0.001 | 0.842 ± 0.010 | 0.956 ± 0.008 |
Mcl-1 | 5MES | 0.780 ± 0.089 | 0.583 ± 0.003 | 0.921 ± 0.147 | 0.819 ± 0.002 | 0.877 ± 0.004 | 0.817 ± 0.003 | 0.843 ± 0.010 | 0.957 ± 0.014 |
Mdm2 | 4ODF | 0.943 ± 0.003 | 0.938 ± 0.006 | 0.977 ± 0.077 | 0.932 ± 0.000 | 0.966 ± 0.000 | 0.942 ± 0.000 | 0.952 ± 0.005 | 0.999 ± 0.003 |
Mdm2 | 4ZFI | 0.952 ± 0.009 | 0.937 ± 0.005 | 0.977 ± 0.059 | 0.926 ± 0.002 | 0.962 ± 0.000 | 0.942 ± 0.000 | 0.956 ± 0.006 | 0.999 ± 0.002 |
Menin | 5DB2 | 0.500 ± 0.000 | 0.500 ± 0.000 | 0.823 ± 0.129 | 0.758 ± 0.014 | 0.903 ± 0.004 | 0.662 ± 0.002 | 0.680 ± 0.028 | 0.969 ± 0.038 |
Menin | 6B41 | 0.538 ± 0.012 | 0.500 ± 0.000 | 0.828 ± 0.242 | 0.760 ± 0.025 | 0.897 ± 0.008 | 0.617 ± 0.080 | 0.679 ± 0.028 | 0.971 ± 0.033 |

Target | PDB ID | Tree | Bagged Forest | Random Forest | Bayesian | SVM | Logistic Regression | Neural Net | Neural Net (Bagging) |
---|---|---|---|---|---|---|---|---|---|
BAZ2B | 4XUA | 0.534 | 0.5 | 0.537 | 0.53 | 0.521 | 0.532 | 0.53 | 0.54 |
BAZ2B | 5E73 | 0.524 | 0.5 | 0.546 | 0.525 | 0.54 | 0.52 | 0.538 | 0.553 |
Bcl-2 | 2O21 | 0.966 | 0.949 | 0.987 | 0.978 | 0.979 | 0.972 | 0.981 | 0.985 |
Bcl-2 | 4LVT | 0.948 | 0.948 | 0.992 | 0.982 | 0.976 | 0.974 | 0.988 | 0.991 |
Bcl-xL | 3INQ | 0.964 | 0.91 | 0.989 | 0.982 | 0.983 | 0.985 | 0.99 | 0.986 |
Bcl-xL | 3WIZ | 0.979 | 0.916 | 0.989 | 0.975 | 0.981 | 0.986 | 0.988 | 0.993 |
BRD4-1 | 5D3L | 0.926 | 0.926 | 0.918 | 0.897 | 0.915 | 0.893 | 0.917 | 0.918 |
BRD4-1 | 5KU3 | 0.901 | 0.901 | 0.911 | 0.886 | 0.908 | 0.887 | 0.906 | 0.927 |
CREBBP | 5EIC | 0.631 | 0.5 | 0.769 | 0.726 | 0.724 | 0.744 | 0.75 | 0.772 |
CREBBP | 5MMG | 0.689 | 0.5 | 0.691 | 0.712 | 0.732 | 0.721 | 0.746 | 0.788 |
HIV IN | 4CFD | 0.822 | 0.827 | 0.801 | 0.74 | 0.805 | 0.742 | 0.784 | 0.832 |
HIV IN | 4CHO | 0.792 | 0.807 | 0.791 | 0.73 | 0.816 | 0.714 | 0.753 | 0.824 |
XIAP | 1TFT | 0.97 | 0.972 | 0.996 | 0.979 | 0.99 | 0.981 | 0.994 | 0.997 |
XIAP | 5C3H | 0.973 | 0.976 | 0.994 | 0.974 | 0.994 | 0.985 | 0.995 | 0.995 |
Mcl-1 | 5FC4 | 0.742 | 0.586 | 0.854 | 0.783 | 0.776 | 0.813 | 0.843 | 0.859 |
Mcl-1 | 5MES | 0.755 | 0.573 | 0.845 | 0.786 | 0.715 | 0.809 | 0.835 | 0.846 |
Mdm2 | 4ODF | 0.947 | 0.947 | 0.958 | 0.922 | 0.957 | 0.946 | 0.95 | 0.963 |
Mdm2 | 4ZFI | 0.941 | 0.94 | 0.949 | 0.919 | 0.954 | 0.943 | 0.951 | 0.961 |
Menin | 5DB2 | 0.5 | 0.5 | 0.674 | 0.622 | 0.614 | 0.62 | 0.646 | 0.646 |
Menin | 6B41 | 0.536 | 0.5 | 0.669 | 0.601 | 0.608 | 0.633 | 0.656 | 0.637 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Singh, N.; Villoutreix, B.O. A Hybrid Docking and Machine Learning Approach to Enhance the Performance of Virtual Screening Carried out on Protein–Protein Interfaces. Int. J. Mol. Sci. 2022, 23, 14364. https://doi.org/10.3390/ijms232214364