Hybrid Approach to Identifying Druglikeness Leading Compounds against COVID-19 3CL Protease
Abstract
1. Introduction
Literature Review
2. Results and Discussion
2.1. Exploratory Data Analysis
2.2. Evaluation of the Proposed Model
2.3. Comparative Analysis
2.4. ADMET Analysis
2.5. Molecular Docking
3. Materials and Methods
3.1. Module A: Data Preparation
3.1.1. Targeting the Replicating Enzyme
3.1.2. Dataset
3.1.3. Data Preprocessing
3.2. Module B: QSAR Modeling
3.2.1. Exploratory Data Analysis (EDA)
- The molecular weight (MW) should be less than 500 Da
- The octanol-water partition coefficient (LogP) should be less than 5
- The number of hydrogen-bond donors (NumHDonors) should be less than 5
- The number of hydrogen-bond acceptors (NumHAcceptors) should be less than 10
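The four criteria above can be checked mechanically once the descriptors are known. The following is a minimal sketch, not the authors' code: the `Descriptors` class and `lipinski_violations` function are hypothetical names, and the strict inequalities follow the list as written (Lipinski's original formulation allows up to 5 donors and up to 10 acceptors). The two example records use MW, NHD and NHA values reported in the ADMET table for compounds 187460 and 212454, with the consensus log Po/w standing in for LogP, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float          # molecular weight in Da
    logp: float        # octanol-water partition coefficient
    h_donors: int      # NumHDonors
    h_acceptors: int   # NumHAcceptors


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the criteria from the list above that the compound fails."""
    checks = {
        "MW < 500": d.mw < 500,
        "LogP < 5": d.logp < 5,
        "NumHDonors < 5": d.h_donors < 5,
        "NumHAcceptors < 10": d.h_acceptors < 10,
    }
    return [name for name, passed in checks.items() if not passed]


# Values read from the ADMET table; using consensus log Po/w as LogP is an assumption.
compound_187460 = Descriptors(mw=324.58, logp=3.06, h_donors=0, h_acceptors=3)
compound_212454 = Descriptors(mw=585.19, logp=3.73, h_donors=0, h_acceptors=6)

print(lipinski_violations(compound_187460))  # []
print(lipinski_violations(compound_212454))  # ['MW < 500']
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to see which descriptor excludes a compound.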
3.2.2. Feature Extraction
3.2.3. Extra Tree Regressor-Based Ensemble Model
3.3. Module C: ADMET Analysis
3.4. Module D: Molecular Docking
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Descriptor | Test Statistic | p-Value | Alpha | Interpretation |
---|---|---|---|---|
LogP | 440 | 0.4892 | 0.05 | Same distribution (fail to reject H0) |
MW | 232 | 0.0023 | 0.05 | Different distribution (reject H0) |
NumHAcceptors | 214.5 | 0.0009 | 0.05 | Different distribution (reject H0) |
NumHDonors | 157 | 0.00002 | 0.05 | Different distribution (reject H0) |
pIC50 | 0 | 1.37 × 10⁻⁹ | 0.05 | Different distribution (reject H0) |
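The table above reports a test statistic, a p-value and a decision at alpha = 0.05 for each descriptor. As a rough illustration of how such a row could be produced, the sketch below assumes the comparison is a two-sided Mann-Whitney U test between two groups of compounds; the test name, the function names and the sample values are all assumptions, not taken from the article. The p-value uses the normal approximation without tie or continuity correction, so it is approximate.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p), where U is the smaller of the two U statistics.
    No tie or continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    # U1 counts pairs where the x value exceeds the y value; ties count as half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


def interpret(p, alpha=0.05):
    """Map a p-value to the wording used in the Interpretation column."""
    if p > alpha:
        return "Same distribution (fail to reject H0)"
    return "Different distribution (reject H0)"


# Hypothetical, fully separated groups (not data from the article).
group_a = [4.1, 4.3, 4.6, 4.8, 4.9]
group_b = [6.2, 6.5, 6.9, 7.1, 7.4]
u, p = mann_whitney_u(group_a, group_b)
print(u, interpret(p))
```

Under this assumption, a statistic of 0 means that every value in one group lies below every value in the other, which is the pattern shown in the pIC50 row.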
ChEMBL ID | Molecular Formula | PubChem ID |
---|---|---|
CHEMBL187460 | C19H20O3 | 160254 |
CHEMBL190743 | C17H10INO2S | 11796320 |
CHEMBL212218 | C14H7Cl2F3N2O6S | 2799606 |
CHEMBL212454 | C18H8Cl6O6S | 2774892 |
CHEMBL222234 | C10H6BrNO3 | 16203681 |
CHEMBL222628 | C9H5ClN2O2S | 16203796 |
CHEMBL222735 | C13H10ClNO3 | 16204324 |
CHEMBL222769 | C16H9Cl2NO3 | 16203797 |
CHEMBL222840 | C10H6ClNO3 | 7230550 |
CHEMBL222893 | C14H8ClNO2S | 2800273 |
CHEMBL225515 | C14H9ClN2O2 | 16204322 |
CHEMBL358279 | C20H14N2O3 | 515964 |
CHEMBL363535 | C18H12O3 | 114917 |
CHEMBL365134 | C17H10BrNO2S | 11667869 |
CHEMBL426898 | C14H8ClNO3 | 16204318 |
Regression Model | R-Squared | MSE | RMSE |
---|---|---|---|
Extra Tree Regressor | 0.73 | 0.005 | 0.074 |
Gradient Boosting Regressor | 0.62 | 0.006 | 0.078 |
XGBoost Regressor | 0.59 | 0.008 | 0.089 |
Support Vector Regressor | 0.59 | 0.006 | 0.078 |
Decision Tree Regressor | 0.58 | 0.008 | 0.092 |
Random Forest Regressor | 0.52 | 0.008 | 0.089 |
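The three columns in the table above are standard regression metrics. A minimal sketch of how they are computed from observed and predicted values follows; the function name and the toy numbers are illustrative assumptions, not the authors' evaluation code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R-squared, MSE and RMSE for paired observed/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


# Toy values chosen so the result is easy to verify by hand.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)
```

Because RMSE is the square root of MSE, the two columns should agree up to rounding; for example, an RMSE of 0.074 corresponds to an MSE of about 0.0055, which rounds to the 0.005 shown for the Extra Tree Regressor.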
ChEMBL ID | Physicochemical Properties | Lipophilicity | Water Solubility | Pharmacokinetics | Drug-Likeness | Medicinal Chemistry |
---|---|---|---|---|---|---|
187460 | MW = 324.58 g/mol; TPSA = 43.37 Å²; NHA = 3; NHD = 0 | Consensus log Po/w = 3.06 | Moderately soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −5.57 cm/s | Yes | Synthetic accessibility = 5.25 |
190743 | MW = 442.42 g/mol; TPSA = 69.47 Å²; NHA = 2; NHD = 0 | Consensus log Po/w = 2.43 | Moderately soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −6.91 cm/s | Yes | Synthetic accessibility = 5.36 |
212218 | MW = 459.18 g/mol; TPSA = 136.16 Å²; NHA = 9; NHD = 0 | Consensus log Po/w = 3.58 | Moderately soluble | GI absorption = Low; BBB permeant = No; skin permeation (log Kp) = −5.64 cm/s | Yes | Synthetic accessibility = 3.13 |
212454 | MW = 585.19 g/mol; TPSA = 86.04 Å²; NHA = 6; NHD = 0 | Consensus log Po/w = 3.73 | Poorly soluble | GI absorption = Low; BBB permeant = No; skin permeation (log Kp) = −6.33 cm/s | No | Synthetic accessibility = 8.41 |
222234Y | MW = 276.13 g/mol; TPSA = 51.21 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = −0.82 | Highly soluble | GI absorption = High; BBB permeant = No; skin permeation (log Kp) = −10.08 cm/s | Yes | Synthetic accessibility = 5.02 |
222628 | MW = 246.71 g/mol; TPSA = 59.44 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = −1.05 | Highly soluble | GI absorption = High; BBB permeant = No; skin permeation (log Kp) = −9.62 cm/s | Yes | Synthetic accessibility = 4.63 |
222735Y | MW = 280.81 g/mol; TPSA = 35.53 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = 0.73 | Very soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −8.19 cm/s | Yes | Synthetic accessibility = 5.74 |
222769 | MW = 350.28 g/mol; TPSA = 43.37 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = 0.58 | Very soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −9.57 cm/s | Yes | Synthetic accessibility = 6.00 |
222840 | MW = 231.68 g/mol; TPSA = 51.21 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = −0.90 | Highly soluble | GI absorption = High; BBB permeant = No; skin permeation (log Kp) = −9.85 cm/s | Yes | Synthetic accessibility = 4.74 |
222893Y | MW = 305.86 g/mol; TPSA = 58.39 Å²; NHA = 3; NHD = 0 | Consensus log Po/w = 1.04 | Very soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −7.90 cm/s | Yes | Synthetic accessibility = 5.75 |
225515Y | MW = 287.81 g/mol; TPSA = 26.30 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = 0.06 | Very soluble | GI absorption = High; BBB permeant = No; skin permeation (log Kp) = −9.53 cm/s | Yes | Synthetic accessibility = 5.50 |
358279 | MW = 360.57 g/mol; TPSA = 80.47 Å²; NHA = 3; NHD = 1 | Consensus log Po/w = 2.06 | Soluble | GI absorption = High; BBB permeant = No; skin permeation (log Kp) = −6.45 cm/s | Yes | Synthetic accessibility = 4.55 |
363535Y | MW = 301.48 g/mol; TPSA = 51.21 Å²; NHA = 3; NHD = 0 | Consensus log Po/w = 2.29 | Soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −5.95 cm/s | Yes | Synthetic accessibility = 5.35 |
365134 | MW = 393.40 g/mol; TPSA = 69.47 Å²; NHA = 2; NHD = 0 | Consensus log Po/w = 2.05 | Soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −7.30 cm/s | Yes | Synthetic accessibility = 6.47 |
426898Y | MW = 289.80 g/mol; TPSA = 43.37 Å²; NHA = 4; NHD = 0 | Consensus log Po/w = 0.57 | Very soluble | GI absorption = High; BBB permeant = Yes; skin permeation (log Kp) = −8.23 cm/s | Yes | Synthetic accessibility = 5.60 |
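One way to turn the ADMET table above into a shortlist is to filter on the pharmacokinetic and drug-likeness columns. The sketch below keeps compounds with high GI absorption that are also flagged as drug-like. The selection rule is an assumption made for illustration; the article text available here does not state the exact criterion. IDs are written without the trailing "Y" that some rows carry.

```python
# (ChEMBL ID, GI absorption, drug-likeness) transcribed from the ADMET table.
admet = [
    ("187460", "High", "Yes"), ("190743", "High", "Yes"), ("212218", "Low", "Yes"),
    ("212454", "Low", "No"), ("222234", "High", "Yes"), ("222628", "High", "Yes"),
    ("222735", "High", "Yes"), ("222769", "High", "Yes"), ("222840", "High", "Yes"),
    ("222893", "High", "Yes"), ("225515", "High", "Yes"), ("358279", "High", "Yes"),
    ("363535", "High", "Yes"), ("365134", "High", "Yes"), ("426898", "High", "Yes"),
]

# Assumed rule: keep compounds with high GI absorption that pass drug-likeness.
shortlist = [cid for cid, gi, drug_like in admet if gi == "High" and drug_like == "Yes"]
excluded = [cid for cid, gi, drug_like in admet if cid not in shortlist]

print(len(shortlist), "kept;", "excluded:", excluded)
```

With this rule, 13 of the 15 compounds remain, and the two excluded IDs (212218 and 212454) are the same two that do not appear in the docking table.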
Protein (PDB ID) | ChEMBL ID | Ligand ID (PubChem) | Binding Affinity (kcal/mol) |
---|---|---|---|
7JSU | 187460 | 160254 | −8.0 |
7JSU | 190743 | 11796320 | −6.7 |
7JSU | 222234 | 16203681 | −5.4 |
7JSU | 222628 | 16203796 | −5.4 |
7JSU | 222735 | 16204324 | −6.6 |
7JSU | 222769 | 16203797 | −7.3 |
7JSU | 222840 | 7230550 | −5.3 |
7JSU | 222893 | 2800273 | −6.5 |
7JSU | 225515 | 16204322 | −7.0 |
7JSU | 358279 | 515964 | −8.4 |
7JSU | 363535 | 114917 | −7.6 |
7JSU | 365134 | 11667869 | −7.8 |
7JSU | 426898 | 16204318 | −6.6 |
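Docking scores such as those in the table above are usually read as "more negative means stronger predicted binding". A short sketch of ranking the ligands on that basis follows; the dictionary simply transcribes the table, and the variable names are illustrative.

```python
# ChEMBL ID -> binding affinity in kcal/mol, transcribed from the docking table.
binding_affinity = {
    "187460": -8.0, "190743": -6.7, "222234": -5.4, "222628": -5.4,
    "222735": -6.6, "222769": -7.3, "222840": -5.3, "222893": -6.5,
    "225515": -7.0, "358279": -8.4, "363535": -7.6, "365134": -7.8,
    "426898": -6.6,
}

# Sort ascending so the most negative (strongest predicted) affinity comes first.
ranked = sorted(binding_affinity.items(), key=lambda item: item[1])
top_three = ranked[:3]

for chembl_id, affinity in top_three:
    print(f"CHEMBL{chembl_id}: {affinity} kcal/mol")
```

On these values the three best-scoring ligands are 358279 (−8.4), 187460 (−8.0) and 365134 (−7.8).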
Bit Position | Description |
---|---|
0–114 | Presence of chemical atoms |
115–262 | Presence of the described chemical ring systems |
263–326 | Presence of simple atom pairs |
327–415 | Presence of simple atom nearest neighbors |
416–459 | Presence of detailed atom neighborhoods |
460–712 | Presence of simple SMARTS patterns |
713–880 | Presence of complex SMARTS patterns |
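The bit ranges above partition an 881-bit fingerprint (positions 0 to 880) into seven sections, a layout that appears to match the PubChem substructure fingerprint. When inspecting which bits a model relies on, it can help to map a bit index back to its section. The helper below is a hypothetical sketch with short section labels paraphrased from the table.

```python
from bisect import bisect_right

# First bit of each section, in ascending order, taken from the table.
SECTION_STARTS = [0, 115, 263, 327, 416, 460, 713]
SECTION_NAMES = [
    "chemical atoms",
    "chemical ring systems",
    "simple atom pairs",
    "simple atom nearest neighbors",
    "detailed atom neighborhoods",
    "simple SMARTS patterns",
    "complex SMARTS patterns",
]
LAST_BIT = 880


def section_for_bit(bit: int) -> str:
    """Return the fingerprint section that contains the given bit position."""
    if not 0 <= bit <= LAST_BIT:
        raise ValueError(f"bit position must be in 0..{LAST_BIT}, got {bit}")
    # bisect_right finds how many section starts are <= bit.
    return SECTION_NAMES[bisect_right(SECTION_STARTS, bit) - 1]


print(section_for_bit(114))  # chemical atoms
print(section_for_bit(115))  # chemical ring systems
print(section_for_bit(880))  # complex SMARTS patterns
```

Storing only the section start positions keeps the lookup to a single binary search and avoids repeating both ends of each range.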
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Aqeel, I.; Bilal, M.; Majid, A.; Majid, T. Hybrid Approach to Identifying Druglikeness Leading Compounds against COVID-19 3CL Protease. Pharmaceuticals 2022, 15, 1333. https://doi.org/10.3390/ph15111333