md_harmonize: A Python Package for Atom-Level Harmonization of Public Metabolic Databases
Abstract
1. Introduction
2. Materials and Methods
2.1. Compound and Metabolic Reaction Data
2.2. Matrix Representation of a Compound Structure
2.3. Mapping Matrix between Two Compound Structures
2.4. Backtracking Algorithm to Generate One-to-One Atom Mapping
2.5. Shortest Distance between Any Two Atoms in a Compound Structure
2.6. Shortest Distance to the R Groups in a Compound Structure
2.7. Implementation Details of the md_harmonize Python Package
3. Results
3.1. md_harmonize Package Overview
3.2. md_harmonize Package Interface
3.3. Optimization of Substructure Detection
3.4. Application of md_harmonize to Compound Harmonization across Public Databases
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Integer | Bond Type |
---|---|
1 | Single |
2 | Double |
3 | Triple |
4 | Aromatic |
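
The integer codes in the table above lend themselves to a small enumeration. The following is a minimal sketch of how such codes could populate a symmetric bond matrix for a compound; it is not taken from md_harmonize itself. The names `BondType` and `bond_matrix`, and the use of 0 for "no bond", are assumptions made for illustration.

```python
from enum import IntEnum


class BondType(IntEnum):
    """Integer codes for bond types, as listed in the table above."""
    SINGLE = 1
    DOUBLE = 2
    TRIPLE = 3
    AROMATIC = 4


def bond_matrix(n_atoms, bonds):
    """Build a symmetric n_atoms x n_atoms matrix of bond codes.

    `bonds` is an iterable of (i, j, BondType) tuples with 0-based atom
    indices. A 0 entry means "no bond" (an assumption of this sketch).
    """
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, bond in bonds:
        matrix[i][j] = matrix[j][i] = int(bond)
    return matrix


# Heavy atoms of acetaldehyde (C-C=O): atoms 0 and 1 are carbon, atom 2 is oxygen.
acetaldehyde = bond_matrix(3, [(0, 1, BondType.SINGLE), (1, 2, BondType.DOUBLE)])
```

Using an `IntEnum` keeps the codes readable in code while still allowing them to be stored as plain integers in the matrix.
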
Database | HMDB Extracted | md_harmonize Detected | Overlap |
---|---|---|---|
KEGG | 6814 | 8644 | 5358 |
MetaCyc | 2652 | 7271 | 1868 |
Category | KEGG | MetaCyc |
---|---|---|
Invalid references | 232 (15.93%) | 111 (14.15%) |
Inconsistent formulas | 793 (54.46%) | 557 (71.05%) |
Other | 431 (29.61%) | 116 (14.80%) |
Total | 1456 | 784 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jin, H.; Moseley, H.N.B. md_harmonize: A Python Package for Atom-Level Harmonization of Public Metabolic Databases. Metabolites 2023, 13, 1199. https://doi.org/10.3390/metabo13121199