Genotype Pattern Mining for Pairs of Interacting Variants Underlying Digenic Traits
Abstract
1. Introduction
2. Materials and Methods
3. Results
3.1. AMD Dataset
3.2. Opioid Dataset
3.3. Schizophrenia Dataset
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Source | Chi-Square | df | p |
---|---|---|---|
rs1918760 main | 2.329 | 2 | 0.3121 |
rs6136667 main | 7.388 | 2 | 0.0249 |
interaction | 12.592 | 4 | 0.0135 |
Total table | 22.309 | 8 | 0.0002 |
Group | rs1918760 Genotype | rs6136667 Genotype 1 | rs6136667 Genotype 2 | rs6136667 Genotype 3 |
---|---|---|---|---|
Cases | 1 | 0 | 1 | 4 |
Cases | 2 | 1 | 14 | 39 |
Cases | 3 | 1 | 16 | 65 |
Controls | 1 | 0 | 1 | 4 |
Controls | 2 | 1 | 0 | 45 |
Controls | 3 | 1 | 15 | 86 |
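The chi-square partition table can be related to the genotype counts above with a short calculation. The sketch below is an illustration, not the authors' code. It assumes the tabulated statistics are likelihood-ratio chi-squares (G²), which is an inference from the fact that the two main effects and the interaction add up to the total, and it assumes the interaction term is obtained by subtraction. The function names `g_squared` and `partition` are hypothetical.

```python
import math

# Genotype-pair counts copied from the table above.
# Rows: rs1918760 genotypes 1-3; columns: genotypes 1-3 of the other variant.
CASES = [[0, 1, 4], [1, 14, 39], [1, 16, 65]]
CONTROLS = [[0, 1, 4], [1, 0, 45], [1, 15, 86]]


def g_squared(case_counts, control_counts):
    """Likelihood-ratio chi-square for a (categories x 2) case-control table."""
    n_case, n_ctrl = sum(case_counts), sum(control_counts)
    n = n_case + n_ctrl
    g = 0.0
    for a, b in zip(case_counts, control_counts):
        for observed, group_total in ((a, n_case), (b, n_ctrl)):
            if observed > 0:  # empty cells contribute 0 (limit of x * ln x)
                expected = (a + b) * group_total / n
                g += observed * math.log(observed / expected)
    return 2.0 * g


def partition(cases, controls):
    """Split the total G^2 into two main effects and an interaction remainder."""
    flat = lambda t: [x for row in t for x in row]    # all 9 genotype pairs
    rows = lambda t: [sum(row) for row in t]          # row-variant margins
    cols = lambda t: [sum(col) for col in zip(*t)]    # column-variant margins
    total = g_squared(flat(cases), flat(controls))
    row_main = g_squared(rows(cases), rows(controls))
    col_main = g_squared(cols(cases), cols(controls))
    return {
        "row_main": row_main,
        "col_main": col_main,
        "interaction": total - row_main - col_main,  # assumed: by subtraction
        "total": total,
    }


if __name__ == "__main__":
    for name, value in partition(CASES, CONTROLS).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the four printed statistics should agree closely with the chi-square column of the partition table. Most of the total comes from the genotype pair (2, 2), which is seen in 14 cases and in no controls.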
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Okazaki, A.; Horpaopan, S.; Zhang, Q.; Randesi, M.; Ott, J. Genotype Pattern Mining for Pairs of Interacting Variants Underlying Digenic Traits. Genes 2021, 12, 1160. https://doi.org/10.3390/genes12081160