A Sparse and Low-Rank Regression Model for Identifying the Relationships Between DNA Methylation and Gene Expression Levels in Gastric Cancer and the Prediction of Prognosis
Abstract
1. Introduction
2. Materials and Methods
2.1. Methods
- There are only a few expression heterogeneous factors related to tumor subtypes that influence differential gene expression levels. Thus, L is a low-rank matrix. Then, we can obtain gene clusters according to the matrix L.
- The gene expression level is only affected by a small fraction of DNA methylation sites. This implies that the coefficient matrix B should be sparse.
- For fixed B, the optimization problem becomes:
- For fixed L, the optimization problem becomes:
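The two sub-problems above suggest an alternating scheme. The sketch below is a minimal illustration, not the authors' implementation. Because the equations are missing from this copy, it assumes an objective of the form 0.5‖Y − XB − L‖²_F + ρ‖B‖₁ + λ‖L‖_*, with an ℓ1 penalty encoding the sparsity of B and a nuclear-norm penalty encoding the low rank of L. The names `fit_sparse_low_rank`, `rho`, `lam`, `n_outer`, and `n_inner` are hypothetical. For fixed B, the L-update is singular value thresholding. For fixed L, the B-update is a lasso problem, approximated here by a few proximal-gradient steps rather than coordinate descent.

```python
import numpy as np


def soft_threshold(A, t):
    """Element-wise soft-thresholding: shrink each entry toward zero by t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def svt(A, t):
    """Singular value thresholding: soft-threshold the singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft_threshold(s, t)) @ Vt


def objective(Y, X, B, L, rho, lam):
    """Assumed objective: squared loss + l1 penalty on B + nuclear norm of L."""
    resid = Y - X @ B - L
    nuclear = np.linalg.svd(L, compute_uv=False).sum()
    return 0.5 * np.sum(resid ** 2) + rho * np.abs(B).sum() + lam * nuclear


def fit_sparse_low_rank(Y, X, rho, lam, n_outer=50, n_inner=20):
    """Alternate between the L-update (fixed B) and the B-update (fixed L)."""
    n, p = X.shape
    q = Y.shape[1]
    B = np.zeros((p, q))
    L = np.zeros((n, q))
    # Step size 1 / (largest singular value of X)^2 keeps each
    # proximal-gradient step from increasing the objective.
    step = 1.0 / max(np.linalg.norm(X, 2) ** 2, 1e-12)
    history = []
    for _ in range(n_outer):
        # Fixed B: exact minimizer over L via singular value thresholding.
        L = svt(Y - X @ B, lam)
        # Fixed L: approximate lasso solve on the residual Y - L.
        R = Y - L
        for _ in range(n_inner):
            grad = X.T @ (X @ B - R)
            B = soft_threshold(B - step * grad, step * rho)
        history.append(objective(Y, X, B, L, rho, lam))
    return B, L, history
```

Under these assumptions the recorded objective values in `history` should not increase from one outer iteration to the next, which gives a simple convergence check. In practice, `rho` and `lam` would need to be tuned, for example by cross-validation.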
2.2. Synthetic Data
- For the methylation effects, each methylation site is drawn independently from a binomial distribution with probability p = 0.25, and the resulting values form the matrix X. The coefficient matrix B is sparse: 2% of its entries are non-zero, and these are drawn from a standard Gaussian distribution. Let G = XB denote the methylation effect.
- EH: The covariance matrix is generated by , with and . Here, K is the number of hidden factors. The random variable was drawn from . Let
- Let
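The methylation-effect part of the simulation can be sketched as follows. This is a minimal illustration under stated assumptions: the matrix dimensions and the number of binomial trials are not recoverable from the text, so `n_samples`, `n_sites`, `n_genes`, and `n_trials` are hypothetical parameters, and the function name `simulate_methylation_effect` is made up for this example. Only p = 0.25, the 2% density of B, and the standard Gaussian non-zero entries come from the description above. The expression heterogeneity and noise terms are not simulated here.

```python
import numpy as np


def simulate_methylation_effect(n_samples, n_sites, n_genes,
                                p=0.25, density=0.02, n_trials=1, seed=0):
    """Draw X, a sparse B, and the methylation effect G = XB."""
    rng = np.random.default_rng(seed)
    # Methylation matrix: independent binomial draws with probability p.
    # n_trials=1 (Bernoulli) is an assumption, not stated in the text.
    X = rng.binomial(n_trials, p, size=(n_samples, n_sites)).astype(float)
    # Sparse coefficient matrix: a `density` fraction of entries is non-zero,
    # each drawn from a standard Gaussian distribution.
    B = np.zeros((n_sites, n_genes))
    n_nonzero = int(round(density * B.size))
    idx = rng.choice(B.size, size=n_nonzero, replace=False)
    B.flat[idx] = rng.standard_normal(n_nonzero)
    # Methylation effect on expression.
    G = X @ B
    return X, B, G


# Example with hypothetical dimensions.
X, B, G = simulate_methylation_effect(n_samples=100, n_sites=200, n_genes=50)
```

Fixing `seed` makes the draw reproducible, which helps when comparing methods on the same synthetic data set.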
2.3. Gastric Cancer Data
3. Results
3.1. Synthetic Results Demonstrated Our Model Benefited High Dimensional Data
3.2. Gastric Cancer Transcription and Methylation Datasets Filtration
3.3. Application of Our Model to Gastric Cancer Dataset
3.4. 29 Genes Were Identified as Gastric Cancer-Associated Methylation-Driven Genes
3.5. Methylation Markers Associated with Prognosis of GC
3.6. Exploration of the Subtype-Associated CpG Sites
3.7. Association between the Eight Subtype-Associated CpG Sites and Prognosis
3.8. Identifying Four Subtype-Specificity Prognosis-Related Sites
3.9. Determining the Influential Power of the Subtype-Specificity Prognosis-Related Sites on Expression Levels—Important Prognostic Markers and Regulation of Gene Expression Factors
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Gene Name | GO | KEGG |
---|---|---|
FEZF1 | negative regulation of transcription from RNA polymerase II promoter | None |
KLHL35 | protein binding | None |
NOL6 | rRNA processing, tRNA export from nucleus | Ribosome biogenesis in eukaryotes |
PPP1R14A | regulation of protein dephosphorylation, cellular response to drug, phosphatase inhibitor activity | Vascular smooth muscle contraction |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Y.; Xu, L.; Ai, D. A Sparse and Low-Rank Regression Model for Identifying the Relationships Between DNA Methylation and Gene Expression Levels in Gastric Cancer and the Prediction of Prognosis. Genes 2021, 12, 854. https://doi.org/10.3390/genes12060854