TumFlow: An AI Model for Predicting New Anticancer Molecules
Abstract
1. Introduction
2. Results and Discussion
2.1. Generation Starting from the NCI-60 Dataset
2.2. Generation Starting from Clinically Adopted Anti-Melanoma Molecules
2.3. Benchmarking
2.4. Code Implementation
3. Materials and Methods
- (i) Lower-case symbols for scalars, indexes, and assignments to random variables, e.g., n and x;
- (ii) Italic upper-case symbols for sets and single random variables, e.g., A and X;
- (iii) Bold lower-case symbols for vectors and assignments to vectors of random variables;
- (iv) Bold upper-case symbols for matrices, tensors, and vectors of random variables;
- (v) The position within a tensor or vector is denoted by numeric subscripts in square brackets, where “:” between two indices a and b indicates the positions from a to b. The solitary use of the colon symbol “:” represents all positions;
- (vi) Calligraphic symbols for domains;
- (vii) When it is clear from the context, the random variables are omitted from probability expressions.
3.1. Data Sources and Data Preprocessing
3.2. TumFlow
3.2.1. Prediction of the GI50 Scores
3.2.2. New Molecule Generation
- (i) In the first approach, the starting point consists of the molecules in the training set with the highest antitumoral efficacy, i.e., antitumoral molecules tested in vitro in the NCI-60 project;
- (ii) In the second approach, the starting point consists of nine molecules, reported in Table S1, that are known for their efficacy in the clinical treatment of melanoma.
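As an illustration of how the starting points of the first approach might be chosen, the following minimal sketch ranks a toy set of molecules by efficacy and keeps the best ones as seeds. It is not the TumFlow implementation: the function name `select_seed_molecules`, the record layout (SMILES string, GI50 value), the cutoff `k`, and the toy values are all hypothetical. It also assumes that the score is a GI50 concentration, so that a lower value means a more potent molecule; if a negative-log score were used instead, the sort order would have to be reversed.

```python
from typing import List, Tuple


def select_seed_molecules(records: List[Tuple[str, float]], k: int = 3) -> List[str]:
    """Return the SMILES strings of the k most potent molecules.

    Assumption: each record is (SMILES, GI50 concentration), so a lower
    value means higher efficacy. Reverse the sort for negative-log scores.
    """
    # Sort ascending by GI50 so the most potent molecules come first.
    ranked = sorted(records, key=lambda rec: rec[1])
    return [smiles for smiles, _ in ranked[:k]]


# Toy records with made-up GI50 values, for illustration only.
toy_training_set = [
    ("CCO", 5.0e-5),
    ("c1ccccc1", 2.0e-6),
    ("CC(=O)O", 8.0e-4),
    ("CCN", 1.0e-7),
]

seeds = select_seed_molecules(toy_training_set, k=2)
print(seeds)
```

In the second approach the same role would be played by a fixed, hand-picked list (the nine molecules of Table S1) rather than by a ranking over the training set.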
4. Conclusions and Future Prospects
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Rigoni, D.; Yaddehige, S.; Bianchi, N.; Sperduti, A.; Moro, S.; Taccioli, C. TumFlow: An AI Model for Predicting New Anticancer Molecules. Int. J. Mol. Sci. 2024, 25, 6186. https://doi.org/10.3390/ijms25116186