Large-Scale Meta-Longitudinal Microbiome Data with a Known Batch Factor
Abstract
1. Introduction
2. Results
2.1. Initial Inspection of Batch Factor and Whether It Is Statistically Significant
2.2. Biological Impact of Batch Factor on Longitudinal Differential Abundance Test
2.3. Biological Impact of Batch Factor on Functional Profiling Data Analyses
2.4. Biological Impact of Batch Factors on Network Modules of Disease-Associated Functional Enrichment Pathways
2.5. Performance of Batch Identification and Removal Methods When There Are Either Balanced or Unbalanced Sample Sizes for a Known Batch Factor in Simulations
3. Discussion
4. Online Methods
4.1. Batch-Correction Methods
4.2. Group Difference Detection Methods for Longitudinal Trajectory Tests
4.3. Pre-Processing Bioinformatics Procedures
4.4. Experimental Design for Longitudinal Microbiome Time Series Data with Two Different Treatment Groups
Supplementary Materials
Author Contributions
Funding
Conflicts of Interest
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Oh, V.-K.S.; Li, R.W. Large-Scale Meta-Longitudinal Microbiome Data with a Known Batch Factor. Genes 2022, 13, 392. https://doi.org/10.3390/genes13030392