Automatic Identification of Analogue Series from Large Compound Data Sets: Methods and Applications
Abstract
1. Introduction
2. Fundamental Concepts and Algorithms
2.1. Molecular Similarity
2.2. Bemis–Murcko Scaffolds and Cyclic Skeletons
2.3. Matched Molecular Pairs
2.4. Fragment-and-Index Approach: From Matched Molecular Pairs to Series and Scaffolds
3. Methodological Developments Related to the MMP Concept and Scaffold Identification
3.1. SAR Transfer and SAR Matrix
3.2. Networks and Analogue Series-Based Scaffolds
3.3. Compound-Core Relationships
3.4. Scaffold-Based Approaches
4. Exemplary Applications
4.1. Analogue Screening and Virtual Analogues
4.2. Structure–Activity Relationships and Property Cliffs
4.3. Virtual Screening and ADMET Prediction
5. Exemplary SAR Analysis with CCR-Based Approaches
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
ADMET | absorption, distribution, metabolism, elimination, toxicity |
ASBS | analogue series-based scaffold |
BMMSG | bipartite matched molecular series graph |
CCR | compound-core relationship |
CReM | chemically reasonable mutations |
CSN | chemical space networks |
MMP | matched molecular pair |
MMPA | matched molecular pair analysis |
MMS | matched/matching molecular series |
QSA(P)R | quantitative structure–activity (property) relationships |
RECAP | retrosynthetic combinatorial analysis procedure |
SAR | structure–activity relationships |
SRP | single R-group polymorphisms |
References
- Wawer, M.; Bajorath, J. Local Structural Changes, Global Data Views: Graphical Substructure-Activity Relationship Trailing. J. Med. Chem. 2011, 54, 2944–2951.
- Stumpfe, D.; Dimova, D.; Bajorath, J. Computational Method for the Systematic Identification of Analog Series and Key Compounds Representing Series and Their Biological Activity Profiles. J. Med. Chem. 2016, 59, 7667–7676.
- Naveja, J.J.; Vogt, M.; Stumpfe, D.; Medina-Franco, J.L.; Bajorath, J. Systematic Extraction of Analogue Series from Large Compound Collections Using a New Computational Compound–Core Relationship Method. ACS Omega 2019, 4, 1027–1032.
- Wermuth, C.G.; Aldous, D.; Raboisson, P.; Rognan, D. The Practice of Medicinal Chemistry; Academic Press: Cambridge, MA, USA, 2015.
- Agrafiotis, D.K.; Shemanarev, M.; Connolly, P.J.; Farnum, M.; Lobanov, V.S. SAR Maps: A New SAR Visualization Technique for Medicinal Chemists. J. Med. Chem. 2007, 50, 5926–5937.
- Zhang, B.; Hu, Y.; Bajorath, J. AnalogExplorer: A New Method for Graphical Analysis of Analog Series and Associated Structure–activity Relationship Information. J. Med. Chem. 2014, 57, 9184–9194.
- Maynard, A.T.; Roberts, C.D. Quantifying, Visualizing, and Monitoring Lead Optimization. J. Med. Chem. 2015, 59, 4189–4201.
- Shanmugasundaram, V.; Zhang, L.; Kayastha, S.; de la Vega de León, A.; Dimova, D.; Bajorath, J. Monitoring the Progression of Structure–Activity Relationship Information during Lead Optimization. J. Med. Chem. 2015, 59, 4235–4244.
- Naveja, J.J.; Medina-Franco, J.L. Finding Constellations in Chemical Space Through Core Analysis. Front. Chem. 2019, 7.
- Gaulton, A.; Hersey, A.; Nowotka, M.; Bento, A.P.; Chambers, J.; Mendez, D.; Mutowo, P.; Atkinson, F.; Bellis, L.J.; Cibrián-Uhalte, E.; et al. The ChEMBL Database in 2017. Nucleic Acids Res. 2016, 45, D945–D954.
- Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B.A.; Thiessen, P.A.; Yu, B.; et al. PubChem in 2021: New Data Content and Improved Web Interfaces. Nucleic Acids Res. 2020, 49, D1388–D1395.
- Bemis, G.W.; Murcko, M.A. The Properties of Known Drugs. 1. Molecular Frameworks. J. Med. Chem. 1996, 39, 2887–2893.
- Schuffenhauer, A.; Ertl, P.; Roggo, S.; Wetzel, S.; Koch, M.A.; Waldmann, H. The Scaffold Tree – Visualization of the Scaffold Universe by Hierarchical Scaffold Classification. J. Chem. Inf. Model. 2006, 47, 47–58.
- Hussain, J.; Rea, C. Computationally Efficient Algorithm to Identify Matched Molecular Pairs (MMPs) in Large Data Sets. J. Chem. Inf. Model. 2010, 50, 339–348.
- Maggiora, G.; Vogt, M.; Stumpfe, D.; Bajorath, J. Molecular Similarity in Medicinal Chemistry. J. Med. Chem. 2013, 57, 3186–3204.
- Willett, P.; Barnard, J.M.; Downs, G.M. Chemical Similarity Searching. J. Chem. Inf. Comput. Sci. 1998, 38, 983–996.
- Bender, A.; Glen, R.C. Molecular Similarity: A Key Technique in Molecular Informatics. Org. Biomol. Chem. 2004, 2, 3204.
- Sheridan, R.P.; Hunt, P.; Culberson, J.C. Molecular Transformations as a Way of Finding and Exploiting Consistent Local QSAR. J. Chem. Inf. Model. 2006, 46, 180–192.
- Warner, D.J.; Griffen, E.J.; St-Gallay, S.A. WizePairZ: A Novel Algorithm to Identify, Encode, and Exploit Matched Molecular Pairs with Unspecified Cores in Medicinal Chemistry. J. Chem. Inf. Model. 2010, 50, 1350–1357.
- Xu, Y.; Johnson, M. Algorithm for Naming Molecular Equivalence Classes Represented by Labeled Pseudographs. J. Chem. Inf. Comput. Sci. 2001, 41, 181–185.
- Gleeson, P.; Bravi, G.; Modi, S.; Lowe, D. ADMET Rules of Thumb II: A Comparison of the Effects of Common Substituents on a Range of ADMET Parameters. Bioorg. Med. Chem. 2009, 17, 5906–5919.
- Brown, N.; Jacoby, E. On Scaffolds and Hopping in Medicinal Chemistry. Mini-Rev. Med. Chem. 2006, 6, 1217–1229.
- Wassermann, A.M.; Haebel, P.; Weskamp, N.; Bajorath, J. SAR Matrices: Automated Extraction of Information-Rich SAR Tables from Large Compound Data Sets. J. Chem. Inf. Model. 2012, 52, 1769–1776.
- Kenny, P.W.; Sadowski, J. Structure Modification in Chemical Databases; Wiley: Hoboken, NJ, USA, 2005; pp. 271–285.
- Leach, A.G.; Jones, H.D.; Cosgrove, D.A.; Kenny, P.W.; Ruston, L.; MacFaul, P.; Wood, J.M.; Colclough, N.; Law, B. Matched Molecular Pairs as a Guide in the Optimization of Pharmaceutical Properties; a Study of Aqueous Solubility, Plasma Protein Binding and Oral Exposure. J. Med. Chem. 2006, 49, 6672–6682.
- Wassermann, A.M.; Dimova, D.; Iyer, P.; Bajorath, J. Advances in Computational Medicinal Chemistry: Matched Molecular Pair Analysis. Drug Dev. Res. 2012, 73, 518–527.
- Kramer, C.; Fuchs, J.E.; Whitebread, S.; Gedeck, P.; Liedl, K.R. Matched Molecular Pair Analysis: Significance and the Impact of Experimental Uncertainty. J. Med. Chem. 2014, 57, 3786–3802.
- Tyrchan, C.; Evertsson, E. Matched Molecular Pair Analysis in Short: Algorithms, Applications and Limitations. Comput. Struct. Biotechnol. J. 2017, 15, 86–90.
- Dalke, A.; Hert, J.; Kramer, C. mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets. J. Chem. Inf. Model. 2018, 58, 902–910.
- Haubertin, D.Y.; Bruneau, P. A Database of Historically-Observed Chemical Replacements. J. Chem. Inf. Model. 2007, 47, 1294–1302.
- Fuchs, J.E.; Wellenzohn, B.; Weskamp, N.; Liedl, K.R. Matched Peptides: Tuning Matched Molecular Pair Analysis for Biopharmaceutical Applications. J. Chem. Inf. Model. 2015, 55, 2315–2323.
- Bradley, A.R.; Wall, I.D.; Green, D.V.S.; Deane, C.M.; Marsden, B.D. OOMMPPAA: A Tool To Aid Directed Synthesis by the Combined Analysis of Activity and Structural Data. J. Chem. Inf. Model. 2014, 54, 2636–2646.
- Bradley, A.R.; Wall, I.D.; von Delft, F.; Green, D.V.S.; Deane, C.M.; Marsden, B.D. WONKA: Objective Novel Complex Analysis for Ensembles of Protein–Ligand Structures. J. Comput. Aided Mol. Des. 2015, 29, 963–973.
- Geppert, T.; Beck, B. Fuzzy Matched Pairs: A Means To Determine the Pharmacophore Impact on Molecular Interaction. J. Chem. Inf. Model. 2014, 54, 1093–1102.
- Lukac, I.; Zarnecka, J.; Griffen, E.J.; Dossetter, A.G.; St-Gallay, S.A.; Enoch, S.J.; Madden, J.C.; Leach, A.G. Turbocharging Matched Molecular Pair Analysis: Optimizing the Identification and Analysis of Pairs. J. Chem. Inf. Model. 2017, 57, 2424–2436.
- Naveja, J.J.; Pilón-Jiménez, B.A.; Bajorath, J.; Medina-Franco, J.L. A General Approach for Retrosynthetic Molecular Core Analysis. J. Cheminf. 2019, 11.
- de la Vega de León, A.; Bajorath, J. Matched Molecular Pairs Derived by Retrosynthetic Fragmentation. Med. Chem. Commun. 2014, 5, 64–67.
- Lewell, X.Q.; Judd, D.B.; Watson, S.P.; Hann, M.M. RECAP – Retrosynthetic Combinatorial Analysis Procedure: A Powerful New Technique for Identifying Privileged Molecular Fragments with Useful Applications in Combinatorial Chemistry. J. Chem. Inf. Comput. Sci. 1998, 38, 511–522.
- Hu, Y.; de la Vega de León, A.; Zhang, B.; Bajorath, J. Matched Molecular Pair-based Data Sets for Computer-aided Medicinal Chemistry. F1000Research 2014, 3, 36.
- Hu, X.; Hu, Y.; Vogt, M.; Stumpfe, D.; Bajorath, J. MMP-Cliffs: Systematic Identification of Activity Cliffs on the Basis of Matched Molecular Pairs. J. Chem. Inf. Model. 2012, 52, 1138–1145.
- Leach, A.; Lukac, I.; Zarnecka, J.; Dossetter, A.; Griffen, E. Matched Molecular Pair Analysis. In Comprehensive Medicinal Chemistry III; Elsevier: Amsterdam, The Netherlands, 2017; pp. 221–252.
- de la Vega de León, A.; Hu, Y.; Bajorath, J. Systematic Identification of Matching Molecular Series and Mapping of Screening Hits. Mol. Inf. 2014, 33, 257–263.
- Gupta-Ostermann, D.; Wawer, M.; Wassermann, A.M.; Bajorath, J. Graph Mining for SAR Transfer Series. J. Chem. Inf. Model. 2012, 52, 935–942.
- Zhang, B.; Wassermann, A.M.; Vogt, M.; Bajorath, J. Systematic Assessment of Compound Series with SAR Transfer Potential. J. Chem. Inf. Model. 2012, 52, 3138–3143.
- Gupta-Ostermann, D.; Shanmugasundaram, V.; Bajorath, J. Neighborhood-Based Prediction of Novel Active Compounds from SAR Matrices. J. Chem. Inf. Model. 2014, 54, 801–809.
- Gupta-Ostermann, D.; Bajorath, J. The ‘SAR Matrix’ Method and Its Extensions for Applications in Medicinal Chemistry and Chemogenomics. F1000Research 2014, 3, 113.
- Gupta-Ostermann, D.; Hirose, Y.; Odagami, T.; Kouji, H.; Bajorath, J. Follow-up: Prospective Compound Design Using the ‘SAR Matrix’ Method and Matrix-derived Conditional Probabilities of Activity. F1000Research 2015, 4, 75.
- Yoshimori, A.; Horita, Y.; Tanoue, T.; Bajorath, J. Method for Systematic Analogue Search Using the Mega SAR Matrix Database. J. Chem. Inf. Model. 2019, 59, 3727–3734.
- Zhang, B.; Hu, Y.; Bajorath, J. SAR Transfer across Different Targets. J. Chem. Inf. Model. 2013, 53, 1589–1594.
- Hu, Y.; Bajorath, J. SAR Matrix Method for Large-Scale Analysis of Compound Structure–Activity Relationships and Exploration of Multitarget Activity Spaces. In Methods in Molecular Biology; Springer: New York, NY, USA, 2018; pp. 339–352.
- Yoshimori, A.; Bajorath, J. The SAR Matrix Method and an Artificially Intelligent Variant for the Identification and Structural Organization of Analog Series, SAR Analysis, and Compound Design. Mol. Inf. 2020, 39, 2000045.
- Free, S.M.; Wilson, J.W. A Mathematical Contribution to Structure-Activity Studies. J. Med. Chem. 1964, 7, 395–399.
- Yoshimori, A.; Tanoue, T.; Bajorath, J. Integrating the Structure–Activity Relationship Matrix Method with Molecular Grid Maps and Activity Landscape Models for Medicinal Chemistry Applications. ACS Omega 2019, 4, 7061–7069.
- Agrafiotis, D.K.; Wiener, J.J.M.; Skalkin, A.; Kolpak, J. Single R-Group Polymorphisms (SRPs) and R-Cliffs: An Intuitive Framework for Analyzing and Visualizing Activity Cliffs in a Single Analog Series. J. Chem. Inf. Model. 2011, 51, 1122–1131.
- Maggiora, G.M.; Bajorath, J. Chemical Space Networks: A Powerful New Paradigm for the Description of Chemical Space. J. Comput. Aided Mol. Des. 2014, 28, 795–802.
- Zwierzyna, M.; Vogt, M.; Maggiora, G.M.; Bajorath, J. Design and Characterization of Chemical Space Networks for Different Compound Data Sets. J. Comput. Aided Mol. Des. 2014, 29, 113–125.
- Zhang, B.; Vogt, M.; Maggiora, G.M.; Bajorath, J. Design of Chemical Space Networks Using a Tanimoto Similarity Variant Based upon Maximum Common Substructures. J. Comput. Aided Mol. Des. 2015, 29, 937–950.
- Dimova, D.; Stumpfe, D.; Hu, Y.; Bajorath, J. Analog Series-based Scaffolds: Computational Design and Exploration of a New Type of Molecular Scaffolds for Medicinal Chemistry. Future Sci. OA 2016, 2, FSO149.
- Bajorath, J. Improving the Utility of Molecular Scaffolds for Medicinal and Computational Chemistry. Future Med. Chem. 2018, 10, 1645–1648.
- Wassermann, A.M.; Bajorath, J. Directed R-Group Combination Graph: A Methodology To Uncover Structure–Activity Relationship Patterns in a Series of Analogues. J. Med. Chem. 2012, 55, 1215–1226.
- Hu, Y.; Zhang, B.; Vogt, M.; Bajorath, J. AnalogExplorer2 – Stereochemistry Sensitive Graphical Analysis of Large Analog Series. F1000Research 2015, 4, 1031.
- Medina-Franco, J.L.; Petit, J.; Maggiora, G.M. Hierarchical Strategy for Identifying Active Chemotype Classes in Compound Databases. Chem. Biol. Drug Des. 2006, 67, 395–408.
- Koch, M.A.; Schuffenhauer, A.; Scheck, M.; Wetzel, S.; Casaulta, M.; Odermatt, A.; Ertl, P.; Waldmann, H. Charting Biologically Relevant Chemical Space: A Structural Classification of Natural Products (SCONP). Proc. Natl. Acad. Sci. USA 2005, 102, 17272–17277.
- Agrafiotis, D.K.; Wiener, J.J.M. Scaffold Explorer: An Interactive Tool for Organizing and Mining Structure-Activity Data Spanning Multiple Chemotypes. J. Med. Chem. 2010, 53, 5002–5011.
- Wetzel, S.; Klein, K.; Renner, S.; Rauh, D.; Oprea, T.I.; Mutzel, P.; Waldmann, H. Interactive Exploration of Chemical Space with Scaffold Hunter. Nat. Chem. Biol. 2009, 5, 581–583.
- Wilkens, S.J.; Janes, J.; Su, A.I. HierS: Hierarchical Scaffold Clustering Using Topological Chemical Graphs. J. Med. Chem. 2005, 48, 3182–3193.
- Varin, T.; Schuffenhauer, A.; Ertl, P.; Renner, S. Mining for Bioactive Scaffolds with Scaffold Networks: Improved Compound Set Enrichment from Primary Screening Data. J. Chem. Inf. Model. 2011, 51, 1528–1538.
- Kruger, F.; Stiefl, N.; Landrum, G.A. rdScaffoldNetwork: The Scaffold Network Implementation in RDKit. J. Chem. Inf. Model. 2020, 60, 3331–3335.
- Madariaga-Mazón, A.; Naveja, J.J.; Medina-Franco, J.L.; Noriega-Colima, K.O.; Martinez-Mayorga, K. DiaNat-DB: A Molecular Database of Antidiabetic Compounds from Medicinal Plants. RSC Adv. 2021, 11, 5172–5178.
- Sterling, T.; Irwin, J.J. ZINC 15 – Ligand Discovery for Everyone. J. Chem. Inf. Model. 2015, 55, 2324–2337.
- Polishchuk, P. CReM: Chemically Reasonable Mutations Framework for Structure Generation. J. Cheminf. 2020, 12.
- Gómez-Bombarelli, R.; Wei, J.N.; Duvenaud, D.; Hernández-Lobato, J.M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Adams, R.P.; Aspuru-Guzik, A. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Cent. Sci. 2018, 4, 268–276.
- Yoshimori, A.; Bajorath, J. Deep SAR Matrix: SAR Matrix Expansion for Advanced Analog Design Using Deep Learning Architectures. Future Drug Discov. 2020, 2, FDD36.
- Yoshimori, A.; Hu, H.; Bajorath, J. Adapting the DeepSARM Approach for Dual-target Ligand Design. J. Comput. Aided Mol. Des. 2021, 35, 587–600.
- Miyao, T.; Bajorath, J. Exploring Ensembles of Bioactive or Virtual Analogs of X-ray Ligands for Shape Similarity Searching. J. Comput. Aided Mol. Des. 2018, 32, 759–767.
- Kunimoto, R.; Miyao, T.; Bajorath, J. Computational Method for Estimating Progression Saturation of Analog Series. RSC Adv. 2018, 8, 5484–5492.
- Vogt, M.; Yonchev, D.; Bajorath, J. Computational Method to Evaluate Progress in Lead Optimization. J. Med. Chem. 2018, 61, 10895–10900.
- Yonchev, D.; Vogt, M.; Stumpfe, D.; Kunimoto, R.; Miyao, T.; Bajorath, J. Computational Assessment of Chemical Saturation of Analogue Series under Varying Conditions. ACS Omega 2018, 3, 15799–15808.
- Yonchev, D.; Vogt, M.; Bajorath, J. Compound Optimization Monitor (COMO) Method for Computational Evaluation of Progress in Medicinal Chemistry Projects. Future Drug Discov. 2019, 1, FDD15.
- Yonchev, D.; Vogt, M.; Bajorath, J. From SAR Diagnostics to Compound Design: Development Chronology of the Compound Optimization Monitor (COMO) Method. Mol. Inf. 2020, 39, 2000046.
- Yonchev, D.; Bajorath, J. DeepCOMO: From Structure-activity Relationship Diagnostics to Generative Molecular Design Using the Compound Optimization Monitor Methodology. J. Comput. Aided Mol. Des. 2020, 34, 1207–1218.
- Ertl, P. Identification of Bioisosteric Substituents by a Deep Neural Network. J. Chem. Inf. Model. 2020, 60, 3369–3375.
- Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T. The Rise of Deep Learning in Drug Discovery. Drug Discov. Today 2018, 23, 1241–1250.
- Blaschke, T.; Engkvist, O.; Bajorath, J.; Chen, H. Memory-assisted Reinforcement Learning for Diverse Molecular De Novo Design. J. Cheminf. 2020, 12.
- Blaschke, T.; Arús-Pous, J.; Chen, H.; Margreitter, C.; Tyrchan, C.; Engkvist, O.; Papadopoulos, K.; Patronov, A. REINVENT 2.0: An AI Tool for De Novo Drug Design. J. Chem. Inf. Model. 2020, 60, 5918–5922.
- Arús-Pous, J.; Patronov, A.; Bjerrum, E.J.; Tyrchan, C.; Reymond, J.L.; Chen, H.; Engkvist, O. SMILES-based Deep Generative Scaffold Decorator for De-novo Drug Design. J. Cheminf. 2020, 12.
- Takeuchi, K.; Kunimoto, R.; Bajorath, J. Global Assessment of Substituents on the Basis of Analogue Series. J. Med. Chem. 2020, 63, 15013–15020.
- Takeuchi, K.; Kunimoto, R.; Bajorath, J. R-group Replacement Database for Medicinal Chemistry. Future Sci. OA 2021, 7, FSO742.
- Liu, T.; Lin, Y.; Wen, X.; Jorissen, R.N.; Gilson, M.K. BindingDB: A Web-accessible Database of Experimentally Determined Protein-ligand Binding Affinities. Nucleic Acids Res. 2007, 35, D198–D201.
- Wishart, D.S.; Feunang, Y.D.; Guo, A.C.; Lo, E.J.; Marcu, A.; Grant, J.R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z.; et al. DrugBank 5.0: A Major Update to the DrugBank Database for 2018. Nucleic Acids Res. 2017, 46, D1074–D1082.
- Wassermann, A.M.; Bajorath, J. Large-scale Exploration of Bioisosteric Replacements on the Basis of Matched Molecular Pairs. Future Med. Chem. 2011, 3, 425–436.
- Wassermann, A.M.; Bajorath, J. Identification of Target Family Directed Bioisosteric Replacements. MedChemComm 2011, 2, 601–606.
- Wassermann, A.M.; Bajorath, J. Chemical Substitutions That Introduce Activity Cliffs Across Different Compound Classes and Biological Targets. J. Chem. Inf. Model. 2010, 50, 1248–1256.
- Hu, Y.; Bajorath, J. Chemical Transformations That Yield Compounds with Distinct Activity Profiles. ACS Med. Chem. Lett. 2011, 2, 523–527.
- Hu, Y.; Bajorath, J. Structural and Activity Profile Relationships Between Drug Scaffolds. AAPS J. 2015, 17, 609–619.
- Bajorath, J. Large-scale SAR Analysis. Drug Discov. Today Technol. 2013, 10, e419–e426.
- Kunimoto, R.; Dimova, D.; Bajorath, J. Application of a New Scaffold Concept for Computational Target Deconvolution of Chemical Cancer Cell Line Screens. ACS Omega 2017, 2, 1463–1468.
- Naveja, J.J.; Medina-Franco, J.L. Consistent Cell-selective Analog Series as Constellation Luminaries in Chemical Space. Mol. Inf. 2020, 39, 2000061.
- Maggiora, G.M. On Outliers and Activity Cliffs – Why QSAR Often Disappoints. J. Chem. Inf. Model. 2006, 46, 1535.
- Stumpfe, D.; de la Vega de León, A.; Dimova, D.; Bajorath, J. Advancing the Activity Cliff Concept, Part II. F1000Research 2014, 3, 75.
- Medina-Franco, J.L. Activity Cliffs: Facts or Artifacts? Chem. Biol. Drug Des. 2013, 81, 553–556.
- Medina-Franco, J.L. Scanning Structure–Activity Relationships with Structure–Activity Similarity and Related Maps: From Consensus Activity Cliffs to Selectivity Switches. J. Chem. Inf. Model. 2012, 52, 2485–2493.
- Schneider, N.; Lewis, R.A.; Fechner, N.; Ertl, P. Chiral Cliffs: Investigating the Influence of Chirality on Binding Affinity. ChemMedChem 2018, 13, 1315–1324.
- Hu, Y.; Furtmann, N.; Bajorath, J. Extension of Three-dimensional Activity Cliff Information through Systematic Mapping of Active Analogs. RSC Adv. 2015, 5, 43006–43015.
- Stumpfe, D.; Hu, H.; Bajorath, J. Evolving Concept of Activity Cliffs. ACS Omega 2019, 4, 14360–14368.
- Hu, H.; Bajorath, J. Increasing the Public Activity Cliff Knowledge Base with New Categories of Activity Cliffs. Future Sci. OA 2020, 6, FSO472.
- Stumpfe, D.; Hu, H.; Bajorath, J. Introducing a New Category of Activity Cliffs with Chemical Modifications at Multiple Sites and Rationalizing Contributions of Individual Substitutions. Bioorg. Med. Chem. 2019, 27, 3605–3612.
- Kanetaka, H.; Koseki, Y.; Taira, J.; Umei, T.; Komatsu, H.; Sakamoto, H.; Gulten, G.; Sacchettini, J.C.; Kitamura, M.; Aoki, S. Discovery of InhA Inhibitors with Anti-mycobacterial Activity through a Matched Molecular Pair Approach. Eur. J. Med. Chem. 2015, 94, 378–385.
- Fu, L.; Yang, Z.Y.; Yang, Z.J.; Yin, M.Z.; Lu, A.P.; Chen, X.; Liu, S.; Hou, T.J.; Cao, D.S. QSAR-assisted-MMPA to Expand Chemical Transformation Space for Lead Optimization. Brief. Bioinform. 2021.
- Kramer, C.; Ting, A.; Zheng, H.; Hert, J.; Schindler, T.; Stahl, M.; Robb, G.; Crawford, J.J.; Blaney, J.; Montague, S.; et al. Learning Medicinal Chemistry Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) Rules from Cross-Company Matched Molecular Pairs Analysis (MMPA). J. Med. Chem. 2017, 61, 3277–3292.
- Keefer, C.E.; Chang, G.; Kauffman, G.W. Extraction of Tacit Knowledge from Large ADME Data Sets Via Pairwise Analysis. Bioorg. Med. Chem. 2011, 19, 3739–3749.
- Awale, M.; Riniker, S.; Kramer, C. Matched Molecular Series Analysis for ADME Property Prediction. J. Chem. Inf. Model. 2020, 60, 2903–2914.
- Koutsoukas, A.; Chang, G.; Keefer, C.E. In-Silico Extraction of Design Ideas Using MMPA-by-QSAR and its Application on ADME Endpoints. J. Chem. Inf. Model. 2018, 59, 477–485.
- Fu, L.; Liu, L.; Yang, Z.J.; Li, P.; Ding, J.J.; Yun, Y.H.; Lu, A.P.; Hou, T.J.; Cao, D.S. Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis. J. Chem. Inf. Model. 2019, 60, 63–76.
MMP Definition | Concept | Advantages | Disadvantages | References |
---|---|---|---|---|
Transformation-based | Only bonds matching a transformation can be cut. | Computationally efficient. Chemically meaningful transformations are studied. | Limited to a set of predefined transformations. Only pairwise comparisons. | [21,24,25,30] |
MCS-based | Topological identification of the maximum common substructure between molecules. | Exhaustive. Can extract specific transformations. | High computational complexity. | [18,19] |
Fragmentation-based (exhaustive) | Every acyclic single bond can be cut. Two molecules form an MMP if they can be reduced to a common substructure of significant size. | Computationally efficient for large databases using the fragment and index approach. No predefined transformations limit the algorithm. Compounds can be organized in analogue series. Yields scaffolds and transformations. | Chemical feasibility of the generated cuts and transformation is not considered. Inefficient for pairwise comparisons. Algorithmic limitations on core structures are imposed. Systematic fragmentation can be time consuming for some large molecules. | [14] |
Fragmentation-based using retrosynthetic rules | Bonds are cut according to retrosynthetic rules. Two molecules form an MMP if they can be reduced to a common substructure of significant size. | Computationally efficient for large databases using the fragment and index approach. Chemically meaningful core structures shared by MMPs. Compounds can be organized in analogue series. Hierarchical organization of analogue series is possible. | Limited to the list of retrosynthetic rules. Inefficient for pairwise comparisons. Algorithmic limitations on core structures are imposed. | [2,3,36,37] |
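
The fragmentation-based rows of the table can be illustrated with a minimal Python sketch of the index step of a fragment-and-index scheme. This is not the implementation from any cited work. The compound names, the hand-written single-cut fragmentations, the `heavy_atoms` helper, and the size rule (core at least twice as large as the substituent) are all assumptions made for illustration. A real workflow would generate the fragmentations with a cheminformatics toolkit, either by cutting every acyclic single bond or by applying retrosynthetic rules.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, hand-written single-cut fragmentations: (core, substituent).
# "[*]" marks the attachment point left behind by the cut.
FRAGMENTATIONS = {
    "toluene": [("[*]c1ccccc1", "[*]C")],
    "phenol": [("[*]c1ccccc1", "[*]O")],
    "anisole": [("[*]c1ccccc1", "[*]OC"), ("[*]Oc1ccccc1", "[*]C")],
    "phenetole": [
        ("[*]c1ccccc1", "[*]OCC"),
        ("[*]Oc1ccccc1", "[*]CC"),
        ("[*]COc1ccccc1", "[*]C"),
    ],
}

# Assumed size rule (not taken from the source): the shared core must have
# at least this many times the heavy atoms of the exchanged substituent.
MIN_CORE_TO_SUBSTITUENT_RATIO = 2


def heavy_atoms(fragment):
    """Crude heavy-atom count; valid only for the one-letter atoms used here."""
    return sum(ch.isalpha() for ch in fragment)


def build_index(fragmentations, ratio=MIN_CORE_TO_SUBSTITUENT_RATIO):
    """Index step: map each sufficiently large core to {compound: substituent}."""
    index = defaultdict(dict)
    for compound, cuts in fragmentations.items():
        for core, substituent in cuts:
            if heavy_atoms(core) >= ratio * heavy_atoms(substituent):
                index[core][compound] = substituent
    return index


def matched_pairs(index):
    """Every pair of compounds stored under the same core forms an MMP."""
    pairs = []
    for core, members in index.items():
        for a, b in combinations(sorted(members), 2):
            pairs.append((a, b, core, f"{members[a]}>>{members[b]}"))
    return pairs


index = build_index(FRAGMENTATIONS)
mmps = matched_pairs(index)
for pair in mmps:
    print(pair)
```

Because compounds are looked up by core key rather than compared pairwise, all compounds stored under one core can be read directly as a candidate analogue series, and the core itself serves as the series scaffold. In this toy data set the same compound pair can appear under more than one core, which is why practical methods impose additional constraints on the core structures.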
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Naveja, J.J.; Vogt, M. Automatic Identification of Analogue Series from Large Compound Data Sets: Methods and Applications. Molecules 2021, 26, 5291. https://doi.org/10.3390/molecules26175291