MOTA: Network-Based Multi-Omic Data Integration for Biomarker Discovery
Abstract
1. Introduction
2. Methods
2.1. Framework of MOTA
2.2. Partial Correlation Calculation Using Graphical LASSO
2.3. Canonical Correlation Calculation Using Regularized Generalized Canonical Correlation Analysis (rgCCA)
2.4. MOTA Score Calculation
2.5. Multi-Omic Datasets
3. Results
3.1. Ranking Disease-Associated Metabolites
3.2. Ranking Disease-Associated Genes
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References

Datasets | Omic Studies (No. of Features) | Serum: HCC | Serum: CIRR | Tissue: HCC | Tissue: CIRR |
---|---|---|---|---|---|
TU Datasets | Metabolomics (66); Glycomics (82); Proteomics (100) | 39 | 48 | | |
GU1 Datasets | Metabolomics (53); Glycomics (82); Proteomics (101) | 40 | 44 | | |
GU2 Datasets | Metabolomics (3672); mRNA profiling (27,523); miRNA profiling (2543) | | | 37 | 24 |

Feature | p-Value | Rank (p-Value) | MOTA Score | Rank (MOTA Score) |
---|---|---|---|---|
tyrosine | 0.42 | 36 | 11.29 | 1 |
alpha tocopherol | 0.85 | 50 | 10.24 | 2 |
pyroglutamic acid | 0.01 | 4 | 8.96 | 3 |
glycine | 0.01 | 5 | 8.62 | 4 |
ethanolamine | 0.00 | 1 | 8.34 | 5 |
phenylalanine | 0.01 | 2 | 7.92 | 6 |
citric acid | 0.13 | 16 | 7.42 | 7 |
threitol | 0.08 | 12 | 7.27 | 8 |
tyramine | 0.95 | 53 | 7.23 | 9 |
aspartic acid | 0.08 | 13 | 7.18 | 10 |
ribitol/arabitol | 0.06 | 10 | 7.08 | 11 |
creatinine | 0.02 | 7 | 7.01 | 12 |
malic acid | 0.22 | 20 | 7.00 | 13 |
proline | 0.45 | 38 | 7.00 | 14 |
lactulose | 0.26 | 23 | 6.43 | 15 |
linoleic acid | 0.02 | 6 | 6.42 | 16 |
hydroxybenzyl alcohol | 0.34 | 33 | 6.40 | 17 |
malonic acid | 0.26 | 24 | 6.34 | 18 |
xanthine | 0.29 | 29 | 6.30 | 19 |
sorbose | 0.01 | 3 | 6.26 | 20 |
myo-inositol | 0.31 | 30 | 6.23 | 21 |
stearic acid | 0.08 | 11 | 6.20 | 22 |
diglycerol | 0.21 | 19 | 6.18 | 23 |
lauric acid | 0.06 | 8 | 6.18 | 24 |

Rank | GU1 Cohort | TU Cohort | GU1+TU Cohort | No. of Overlaps |
---|---|---|---|---|
Ranking using Student t-Test (p-Value) | | | | |
1 | ethanolamine | glutamic acid | ethanolamine | 2 |
2 | phenylalanine | lactic acid | sorbose | |
3 | sorbose | alpha tocopherol | citric acid | |
4 | pyroglutamic acid | valine | isoleucine | |
5 | glycine | ethanolamine | threitol | |
6 | linoleic acid | alpha-D-glucosamine 1-phosphate | ribose | |
7 | creatinine | norvaline | malic acid | |
8 | lauric acid | citric acid | phenylalanine | |
9 | ribitol/arabitol | norleucine | stearic acid | |
10 | threitol | sorbose | trans-aconitic acid | |
Ranking using iDINGO | | | | |
1 | linoleic acid | norvaline | valine | 2 |
2 | isoleucine | cystine | ethanolamine | |
3 | leucine | sorbose | butanediol | |
4 | proline | tagatose | ribose | |
5 | ethanolamine | isoleucine | glycine | |
6 | valine | trans-3-hydroxy-L-proline | sorbose | |
7 | glutamic acid | N,N-dimethyl-1-4-phenylenediamine | tyrosine | |
8 | sorbose | cholesterol | malic acid | |
9 | aspartic acid | butanediol | isoleucine | |
10 | glycine | arachidic acid | tagatose | |
Ranking using MOTA | | | | |
1 | tyrosine | alpha tocopherol | alpha tocopherol | 4 |
2 | alpha tocopherol | tyrosine | ethanolamine | |
3 | pyroglutamic acid | ethanolamine | glycine | |
4 | glycine | creatinine | lactic acid | |
5 | ethanolamine | tyramine | creatinine | |
6 | phenylalanine | mimosine | tyrosine | |
7 | citric acid | lactic acid | cholesterol | |
8 | threitol | cholesterol | tyramine | |
9 | tyramine | threitol | citric acid | |
10 | aspartic acid | ribose | isoleucine |
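
The "No. of Overlaps" column can be checked directly from the three top-10 lists of each method. The sketch below is illustrative only and is not taken from the authors' code: it assumes that an overlap means a metabolite appearing in the top 10 of all three columns (GU1, TU, and GU1+TU), an interpretation that happens to match the reported counts. The names `TOP10` and `shared_across_cohorts` are hypothetical, and the metabolite names are transcribed from the table with case and spacing normalized.

```python
# Top-10 metabolites per ranking method and cohort, transcribed from the table above.
TOP10 = {
    "t-test": {
        "GU1": ["ethanolamine", "phenylalanine", "sorbose", "pyroglutamic acid",
                "glycine", "linoleic acid", "creatinine", "lauric acid",
                "ribitol/arabitol", "threitol"],
        "TU": ["glutamic acid", "lactic acid", "alpha tocopherol", "valine",
               "ethanolamine", "alpha-D-glucosamine 1-phosphate", "norvaline",
               "citric acid", "norleucine", "sorbose"],
        "GU1+TU": ["ethanolamine", "sorbose", "citric acid", "isoleucine",
                   "threitol", "ribose", "malic acid", "phenylalanine",
                   "stearic acid", "trans-aconitic acid"],
    },
    "iDINGO": {
        "GU1": ["linoleic acid", "isoleucine", "leucine", "proline",
                "ethanolamine", "valine", "glutamic acid", "sorbose",
                "aspartic acid", "glycine"],
        "TU": ["norvaline", "cystine", "sorbose", "tagatose", "isoleucine",
               "trans-3-hydroxy-L-proline", "N,N-dimethyl-1-4-phenylenediamine",
               "cholesterol", "butanediol", "arachidic acid"],
        "GU1+TU": ["valine", "ethanolamine", "butanediol", "ribose", "glycine",
                   "sorbose", "tyrosine", "malic acid", "isoleucine", "tagatose"],
    },
    "MOTA": {
        "GU1": ["tyrosine", "alpha tocopherol", "pyroglutamic acid", "glycine",
                "ethanolamine", "phenylalanine", "citric acid", "threitol",
                "tyramine", "aspartic acid"],
        "TU": ["alpha tocopherol", "tyrosine", "ethanolamine", "creatinine",
               "tyramine", "mimosine", "lactic acid", "cholesterol",
               "threitol", "ribose"],
        "GU1+TU": ["alpha tocopherol", "ethanolamine", "glycine", "lactic acid",
                   "creatinine", "tyrosine", "cholesterol", "tyramine",
                   "citric acid", "isoleucine"],
    },
}


def shared_across_cohorts(cohort_lists):
    """Return the sorted metabolites found in every cohort's top-10 list.

    Assumption: "overlap" means membership in all lists, compared
    case-insensitively.
    """
    name_sets = [{name.lower() for name in names} for names in cohort_lists.values()]
    return sorted(set.intersection(*name_sets))


if __name__ == "__main__":
    for method, cohort_lists in TOP10.items():
        shared = shared_across_cohorts(cohort_lists)
        print(f"{method}: {len(shared)} shared metabolites: {shared}")
```

Under this assumption the sketch gives 2, 2, and 4 shared metabolites for the t-test, iDINGO, and MOTA rankings, in agreement with the table. A pairwise GU1 versus TU comparison would give 5 for MOTA (threitol is shared by GU1 and TU but is absent from the GU1+TU top 10), which is why the all-three reading is used here.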

 | DESeq2: GO Term | DESeq2: FDR (p-Value) | iDINGO: GO Term | iDINGO: FDR (p-Value) | MOTA: GO Term | MOTA: FDR (p-Value) |
---|---|---|---|---|---|---|
No. of GO Terms with FDR < 0.05 | 0 | | 7 | | 10 | |
Gene | chromatin organization | 1.0 (1.03 × 10⁻⁴) | chromatin assembly | 0.014 (1.73 × 10⁻⁶) | extracellular matrix organization | 1.27 × 10⁻⁵ (8.0 × 10⁻¹⁰) |
 | kidney development | 1.0 (2.31 × 10⁻⁴) | nucleosome assembly | 0.014 (8.69 × 10⁻⁶) | extracellular structure organization | 1.90 × 10⁻⁵ (2.4 × 10⁻⁹) |
 | renal system development | 1.0 (2.88 × 10⁻⁴) | nucleosome organization | 0.016 (3.93 × 10⁻⁶) | positive regulation of protein kinase B signaling | 7.56 × 10⁻⁴ (1.43 × 10⁻⁷) |
 | nucleosome assembly | 1.0 (3.28 × 10⁻⁴) | chromatin assembly or disassembly | 0.019 (3.48 × 10⁻⁶) | cell chemotaxis | 1.48 × 10⁻³ (6.56 × 10⁻⁷) |
 | urogenital system development | 1.0 (4.46 × 10⁻⁴) | DNA packaging | 0.214 (6.73 × 10⁻⁶) | elastic fiber assembly | 1.58 × 10⁻³ (5.98 × 10⁻⁷) |

Top k | DESeq2 | iDINGO | MOTA |
---|---|---|---|
Top 10 | 1 (ID3) | 0 | 2 (FGFR2, PDGFRA) |
Top 50 | 3 (ID3, PDGFRA, HIST1H3B) | 1 (PDGFRA) | 6 (FGFR2, PDGFRA, ID3, CDH1, SMO, EPAS1) |
Top 100 | 4 (ID3, PDGFRA, HIST1H3B, HSP90AB1) | 3 (PDGFRA, CSF1R, SMO) | 8 (FGFR2, PDGFRA, ID3, CDH1, SMO, EPAS1, HIST1H3B, HSP90AB1) |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Fan, Z.; Zhou, Y.; Ressom, H.W. MOTA: Network-Based Multi-Omic Data Integration for Biomarker Discovery. Metabolites 2020, 10, 144. https://doi.org/10.3390/metabo10040144