Retention Time Prediction with Message-Passing Neural Networks
Abstract
1. Introduction
2. Materials and Methods
3. Results and Discussion
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Xue, J.; Guijas, C.; Benton, H.P.; Warth, B.; Siuzdak, G. METLIN MS2 molecular standards database: A broad chemical and biological resource. Nat. Methods 2020, 17, 953–954.
2. Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B.A.; Thiessen, P.A.; Yu, B.; et al. PubChem 2019 update: Improved access to chemical data. Nucleic Acids Res. 2019, 47, D1102–D1109.
3. Djoumbou-Feunang, Y.; Pon, A.; Karu, N.; Zheng, J.; Li, C.; Arndt, D.; Gautam, M.; Allen, F.; Wishart, D.S. CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification. Metabolites 2019, 9, 23.
4. Dührkop, K.; Fleischauer, M.; Ludwig, M.; Aksenov, A.A.; Melnik, A.V.; Meusel, M.; Dorrestein, P.C.; Rousu, J.; Böcker, S. SIRIUS 4: A rapid tool for turning tandem mass spectra into metabolite structure information. Nat. Methods 2019, 16, 299–302.
5. Ruttkies, C.; Neumann, S.; Posch, S. Improving MetFrag with statistical learning of fragment annotations. BMC Bioinform. 2019, 20, 14.
6. Witting, M.; Böcker, S. Current status of retention time prediction in metabolite identification. J. Sep. Sci. 2020, 43, 1746–1754.
7. Haddad, P.R.; Taraji, M.; Szücs, R. Prediction of Analyte Retention Time in Liquid Chromatography. Anal. Chem. 2021, 93, 228–256.
8. Aalizadeh, R.; Nika, M.C.; Thomaidis, N.S. Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants. J. Hazard. Mater. 2019, 363, 277–285.
9. Aicheler, F.; Li, J.; Hoene, M.; Lehmann, R.; Xu, G.W.; Kohlbacher, O. Retention Time Prediction Improves Identification in Nontargeted Lipidomics Approaches. Anal. Chem. 2015, 87, 7698–7704.
10. Amos, R.I.J.; Haddad, P.R.; Szucs, R.; Dolan, J.W.; Pohl, C.A. Molecular modeling and prediction accuracy in Quantitative Structure-Retention Relationship calculations for chromatography. TrAC Trends Anal. Chem. 2018, 105, 352–359.
11. Bach, E.; Szedmak, S.; Brouard, C.; Böcker, S.; Rousu, J. Liquid-chromatography retention order prediction for metabolite identification. Bioinformatics 2018, 34, i875–i883.
12. Bonini, P.; Kind, T.; Tsugawa, H.; Barupal, D.K.; Fiehn, O. Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics. Anal. Chem. 2020, 92, 7515–7522.
13. Boswell, P.G.; Schellenberg, J.R.; Carr, P.W.; Cohen, J.D.; Hegeman, A.D. Easy and accurate high-performance liquid chromatography retention prediction with different gradients, flow rates, and instruments by back-calculation of gradient and flow rate profiles. J. Chromatogr. A 2011, 1218, 6742–6749.
14. Bouwmeester, R.; Martens, L.; Degroeve, S. Comprehensive and Empirical Evaluation of Machine Learning Algorithms for Small Molecule LC Retention Time Prediction. Anal. Chem. 2019, 91, 3694–3703.
15. Bruderer, T.; Varesio, E.; Hopfgartner, G. The use of LC predicted retention times to extend metabolites identification with SWATH data acquisition. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 2017, 1071, 3–10.
16. Cao, M.S.; Fraser, K.; Huege, J.; Featonby, T.; Rasmussen, S.; Jones, C. Predicting retention time in hydrophilic interaction liquid chromatography mass spectrometry and its use for peak annotation in metabolomics. Metabolomics 2015, 11, 696–706.
17. Codesido, S.; Randazzo, G.M.; Lehmann, F.; González-Ruiz, V.; García, A.; Xenarios, I.; Liechti, R.; Bridge, A.; Boccard, J.; Rudaz, S. DynaStI: A Dynamic Retention Time Database for Steroidomics. Metabolites 2019, 9, 85.
18. Creek, D.J.; Jankevics, A.; Breitling, R.; Watson, D.G.; Barrett, M.P.; Burgess, K.E.V. Toward Global Metabolomics Analysis with Hydrophilic Interaction Liquid Chromatography-Mass Spectrometry: Improved Metabolite Identification by Retention Time Prediction. Anal. Chem. 2011, 83, 8703–8710.
19. Falchi, F.; Bertozzi, S.M.; Ottonello, G.; Ruda, G.F.; Colombano, G.; Fiorelli, C.; Martucci, C.; Bertorelli, R.; Scarpelli, R.; Cavalli, A.; et al. Kernel-Based, Partial Least Squares Quantitative Structure-Retention Relationship Model for UPLC Retention Time Prediction: A Useful Tool for Metabolite Identification. Anal. Chem. 2016, 88, 9510–9517.
20. Feng, C.; Xu, Q.; Qiu, X.; Jin, Y.; Ji, J.; Lin, Y.; Le, S.; She, J.; Lu, D.; Wang, G. Evaluation and application of machine learning-based retention time prediction for suspect screening of pesticides and pesticide transformation products in LC-HRMS. Chemosphere 2021, 271, 129447.
21. Kitamura, R.; Kawabe, T.; Kajiro, T.; Yonemochi, E. The development of retention time prediction model using multilinear gradient profiles of seven pharmaceuticals. J. Pharm. Biomed. Anal. 2021, 198, 114024.
22. Parinet, J. Predicting reversed-phase liquid chromatographic retention times of pesticides by deep neural networks. Heliyon 2021, 7, e08563.
23. Pasin, D.; Mollerup, C.B.; Rasmussen, B.S.; Linnet, K.; Dalsgaard, P.W. Development of a single retention time prediction model integrating multiple liquid chromatography systems: Application to new psychoactive substances. Anal. Chim. Acta 2021, 1184, 339035.
24. Rojas, C.; Aranda, J.F.; Jaramillo, E.P.; Losilla, I.; Tripaldi, P.; Duchowicz, P.R.; Castro, E.A. Foodinformatic prediction of the retention time of pesticide residues detected in fruits and vegetables using UHPLC/ESI Q-Orbitrap. Food Chem. 2021, 342, 128354.
25. Liapikos, T.; Zisi, C.; Kodra, D.; Kademoglou, K.; Diamantidou, D.; Begou, O.; Pappa-Louisi, A.; Theodoridis, G. Quantitative Structure Retention Relationship (QSRR) Modelling for Analytes’ Retention Prediction in LC-HRMS by Applying Different Machine Learning Algorithms and Evaluating Their Performance. J. Chromatogr. B 2022, 1191, 123132.
26. Domingo-Almenara, X.; Guijas, C.; Billings, E.; Montenegro-Burke, J.R.; Uritboonthai, W.; Aisporna, A.E.; Chen, E.; Benton, H.P.; Siuzdak, G. METLIN small molecule dataset for machine learning-based retention time prediction. Nat. Commun. 2019, 10, 5811.
27. Osipenko, S.; Bashkirova, I.; Sosnin, S.; Kovaleva, O.; Fedorov, M.; Nikolaev, E.; Kostyukevich, Y. Machine learning to predict retention time of small molecules in nano-HPLC. Anal. Bioanal. Chem. 2020, 412, 7767–7776.
28. Kensert, A.; Bouwmeester, R.; Efthymiadis, K.; Van Broeck, P.; Desmet, G.; Cabooter, D. Graph Convolutional Networks for Improved Prediction and Interpretability of Chromatographic Retention Data. Anal. Chem. 2021, 93, 15633–15641.
29. Yang, Q.; Ji, H.; Fan, X.; Zhang, Z.; Lu, H. Retention time prediction in hydrophilic interaction liquid chromatography with graph neural network and transfer learning. J. Chromatogr. A 2021, 1656, 462536.
30. Yang, Q.; Ji, H.; Lu, H.; Zhang, Z. Prediction of Liquid Chromatographic Retention Time with Graph Neural Networks to Assist in Small Molecule Identification. Anal. Chem. 2021, 93, 2200–2206.
31. Fedorova, E.S.; Matyushin, D.D.; Plyushchenko, I.V.; Stavrianidi, A.N.; Buryak, A.K. Deep learning for retention time prediction in reversed-phase liquid chromatography. J. Chromatogr. A 2022, 1664, 462792.
32. Bouwmeester, R.; Martens, L.; Degroeve, S. Generalized Calibration Across Liquid Chromatography Setups for Generic Prediction of Small-Molecule Retention Times. Anal. Chem. 2020, 92, 6571–6578.
33. Stanstrup, J.; Neumann, S.; Vrhovsek, U. PredRet: Prediction of Retention Time by Direct Mapping between Multiple Chromatographic Systems. Anal. Chem. 2015, 87, 9421–9428.
34. Boswell, P.G.; Schellenberg, J.R.; Carr, P.W.; Cohen, J.D.; Hegeman, A.D. A study on retention “projection” as a supplementary means for compound identification by liquid chromatography-mass spectrometry capable of predicting retention with different gradients, flow rates, and instruments. J. Chromatogr. A 2011, 1218, 6732–6741.
35. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.
36. Ju, R.; Liu, X.; Zheng, F.; Lu, X.; Xu, G.; Lin, X. Deep Neural Network Pretrained by Weighted Autoencoders and Transfer Learning for Retention Time Prediction of Small Molecules. Anal. Chem. 2021, 93, 15651–15658.
37. Osipenko, S.; Botashev, K.; Nikolaev, E.; Kostyukevich, Y. Transfer learning for small molecule retention predictions. J. Chromatogr. A 2021, 1644, 462119.
38. Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural Message Passing for Quantum Chemistry. arXiv 2017, arXiv:1704.01212.
39. Wu, Z.; Ramsundar, B.; Feinberg, E.N.; Gomes, J.; Geniesse, C.; Pappu, A.S.; Leswing, K.; Pande, V. MoleculeNet: A Benchmark for Molecular Machine Learning. arXiv 2017, arXiv:1703.00564.
40. Tang, B.; Kramer, S.T.; Fang, M.; Qiu, Y.; Wu, Z.; Xu, D. A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility. J. Cheminform. 2020, 12, 15.
41. McGill, C.; Forsuelo, M.; Guan, Y.; Green, W.H. Predicting Infrared Spectra with Message Passing Neural Networks. J. Chem. Inf. Model. 2021, 61, 2594–2609.
42. Withnall, M.; Lindelöf, E.; Engkvist, O.; Chen, H. Building attention and edge message passing neural networks for bioactivity and physical–chemical property prediction. J. Cheminform. 2020, 12, 1.
43. Xing, G.; Sresht, V.; Sun, Z.; Shi, Y.; Clasquin, M.F. Coupling Mixed Mode Chromatography/ESI Negative MS Detection with Message-Passing Neural Network Modeling for Enhanced Metabolome Coverage and Structural Identification. Metabolites 2021, 11, 772.
44. Cho, K.; van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv 2014, arXiv:1406.1078.
45. Kim, S.; Thiessen, P.A.; Cheng, T.; Zhang, J.; Gindulyte, A.; Bolton, E.E. PUG-View: Programmatic access to chemical annotations integrated in PubChem. J. Cheminform. 2019, 11, 56.
46. Ramsundar, B. Molecular machine learning with DeepChem. Abstr. Pap. Am. Chem. Soc. 2018, 255, 1.
47. Chollet, F.C. Keras. 2015. Available online: https://keras.io (accessed on 30 August 2022).
48. Wishart, D.S.; Feunang, Y.D.; Marcu, A.; Guo, A.C.; Liang, K.; Vázquez-Fresno, R.; Sajed, T.; Johnson, D.; Li, C.; Karu, N.; et al. HMDB 4.0: The human metabolome database for 2018. Nucleic Acids Res. 2018, 46, D608–D617.
49. Hu, M.; Müller, E.; Schymanski, E.L.; Ruttkies, C.; Schulze, T.; Brack, W.; Krauss, M. Performance of combined fragmentation and retention prediction for the identification of organic micropollutants by LC-HRMS. Anal. Bioanal. Chem. 2018, 410, 1931–1941.

Model | MAE, s (validation / test) | MedAE, s (validation / test) | MAPE, % (validation / test) | RMSE, s (validation / test) | R² (validation / test) |
---|---|---|---|---|---|
MPNN | 32.1 ± 0.6 / 31.5 ± 0.1 | 16.2 ± 0.2 / 16.0 ± 0.2 | 4.1 ± 0.1 / 4.0 ± 0.01 | 62.8 ± 1.9 / 60.5 ± 0.3 | 0.872 ± 0.008 / 0.879 ± 0.001 |
1D-CNN [31] | 34.7 ± 1.2 | 18.7 ± 1.3 | 4.3 ± 0.1 | 65.5 ± 2.7 | No data |
GNN [30] | 39.87 | 25.24 | 5 | No data | No data |
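
The five metrics reported in the table above can be computed from paired measured and predicted retention times. The sketch below assumes the usual textbook definitions of MAE, MedAE, MAPE, RMSE and R²; the exact implementation behind the reported values is not given here, and the function name `regression_metrics` and the toy retention times are illustrative only, not data from the study.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MedAE, MAPE (%), RMSE and R2 for retention times in seconds."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    abs_err = np.abs(err)
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "MAE": abs_err.mean(),
        "MedAE": np.median(abs_err),
        "MAPE": 100.0 * np.mean(abs_err / np.abs(y_true)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy example (hypothetical retention times, in seconds).
rt_true = [300.0, 450.0, 600.0, 750.0]
rt_pred = [310.0, 440.0, 630.0, 750.0]
metrics = regression_metrics(rt_true, rt_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MedAE is less sensitive than MAE to a small number of badly predicted compounds, which is one reason the two are often reported side by side.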

Model | RIKEN Retip (validation / test) | FEM_long (validation / test) | Eawag_XBridgeC18 (validation / test) | LIFE_new (validation / test) | LIFE_old (validation / test) |
---|---|---|---|---|---|
MPNN Transfer Learning | 34.7 ± 3.3 / 38.2 ± 3.2 | 214.4 ± 25.6 / 204.6 ± 23.0 | 79.5 ± 10.3 / 80.9 ± 6.5 | 23.3 ± 3.8 / 22.1 ± 3.3 | 16.9 ± 2.9 / 16.9 ± 2.0 |
MPNN From Scratch | 56.2 ± 13.6 / 57.2 ± 13.2 | 299.4 ± 51.0 / 317.7 ± 25.8 | 137.1 ± 24.7 / 135.8 ± 14.1 | 37.5 ± 17.2 / 39.1 ± 15.2 | 27.2 ± 8.3 / 23.2 ± 2.1 |
1D-CNN Transfer Learning [31] | 32.4 ± 3.0 | No data | No data | 23.6 ± 5.1 | 15.5 ± 2.8 |
GNN Transfer Learning [30] | No data | 235.01 | 112.78 | 29.38 | 17.10 |
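
One way to read the table above is as the relative drop in test-set error when the MPNN is fine-tuned by transfer learning instead of being trained from scratch. The sketch below copies only the mean test values of the two MPNN rows (the ± spreads are ignored, which is a simplification); the helper `relative_reduction` is a hypothetical name introduced for illustration.

```python
# Mean test-set errors from the table above: (transfer learning, from scratch).
test_error = {
    "RIKEN Retip": (38.2, 57.2),
    "FEM_long": (204.6, 317.7),
    "Eawag_XBridgeC18": (80.9, 135.8),
    "LIFE_new": (22.1, 39.1),
    "LIFE_old": (16.9, 23.2),
}


def relative_reduction(transfer, scratch):
    """Fraction by which the error drops when moving from scratch to transfer."""
    return (scratch - transfer) / scratch


reductions = {
    name: relative_reduction(transfer, scratch)
    for name, (transfer, scratch) in test_error.items()
}

for name, value in reductions.items():
    print(f"{name}: {100 * value:.1f}% lower test error with transfer learning")
```

Because the spreads of the from-scratch runs are large for some datasets, these ratios of means are only a rough indication of the benefit.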

Model | RIKEN Retip (validation / test) | FEM_long (validation / test) | Eawag_XBridgeC18 (validation / test) | LIFE_new (validation / test) | LIFE_old (validation / test) |
---|---|---|---|---|---|
MPNN Transfer Learning | 22.2 ± 2.4 / 25.0 ± 3.5 | 93.5 ± 14.9 / 125.2 ± 12.0 | 56.0 ± 4.7 / 57.4 ± 9.3 | 12.2 ± 1.1 / 9.7 ± 3.6 | 9.9 ± 3.1 / 9.5 ± 0.7 |
MPNN From Scratch | 38.0 ± 9.5 / 40.8 ± 12.4 | 162.4 ± 51.2 / 193.1 ± 41.3 | 105.3 ± 20.7 / 115.1 ± 18.1 | 20.4 ± 21.5 / 24.0 ± 22.2 | 18.11 ± 8.6 / 19.5 ± 3.6 |
1D-CNN Transfer Learning [31] | 22.2 ± 3.6 | No data | No data | 14.7 ± 4.7 | 11.8 ± 4.2 |
GNN Transfer Learning [30] | No data | 94.66 | 83.79 | 15.16 | 12.88 |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Osipenko, S.; Nikolaev, E.; Kostyukevich, Y. Retention Time Prediction with Message-Passing Neural Networks. Separations 2022, 9, 291. https://doi.org/10.3390/separations9100291