Thanos: An R Package for the Gene-Centric Analysis of Functional Potential in Metagenomic Samples
Abstract
:1. Introduction
2. Materials and Methods
2.1. Importing Depths Files
2.2. Running HMMER
2.3. Aggregation and Normalization
2.4. Visualization
2.5. Parallelization
2.6. Dependencies
3. Results
3.1. MAGs Workflow: Sulfur Metabolism by Taxonomy
3.2. Contigs Workflow: Prevalence of Glycolysis
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
MAG | Metagenome-Assembled Genome |
OTU | Operational Taxonomic Unit |
HMM | Hidden Markov Model |
KEGG | Kyoto Encyclopedia of Gene Elements |
References
- Zhang, L.; Chen, F.; Zeng, Z.; Xu, M.; Sun, F.; Yang, L.; Bi, X.; Lin, Y.; Gao, Y.; Hao, H.; et al. Advances in metagenomics and its application in environmental microorganisms. Front. Microbiol. 2021, 12, 766364. [Google Scholar] [CrossRef]
- Afzaal, M.; Saeed, F.; Shah, Y.A.; Hussain, M.; Rabail, R.; Socol, C.T.; Hassoun, A.; Pateiro, M.; Lorenzo, J.M.; Rusu, A.V.; et al. Human gut microbiota in health and disease: Unveiling the relationship. Front. Microbiol. 2022, 13, 999001. [Google Scholar] [CrossRef]
- Borroni, D.; Paytuví-Gallart, A.; Sanseverino, W.; Gómez-Huertas, C.; Bonci, P.; Romano, V.; Giannaccare, G.; Rechichi, M.; Meduri, A.; Oliverio, G.W.; et al. Exploring the healthy eye microbiota niche in a multicenter study. Int. J. Mol. Sci. 2022, 23, 10229. [Google Scholar] [CrossRef]
- Rocha-de Lossada, C.; Mazzotta, C.; Gabrielli, F.; Papa, F.T.; Gómez-Huertas, C.; García-López, C.; Urbinati, F.; Rachwani-Anil, R.; García-Lorente, M.; Sánchez-González, J.M.; et al. Ocular surface microbiota in naïve keratoconus: A multicenter validation study. J. Clin. Med. 2023, 12, 6354. [Google Scholar] [CrossRef] [PubMed]
- Sunagawa, S.; Acinas, S.G.; Bork, P.; Bowler, C.; Coordinators, T.O.; Eveillard, D.; Gorsky, G.; Guidi, L.; Iudicone, D.; Karsenti, E.; et al. Tara Oceans: Towards global ocean ecosystems biology. Nat. Rev. Microbiol. 2020, 18, 428–445. [Google Scholar] [CrossRef] [PubMed]
- Yang, C.; Chowdhury, D.; Zhang, Z.; Cheung, W.K.; Lu, A.; Bian, Z.; Zhang, L. A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data. Comput. Struct. Biotechnol. J. 2021, 19, 6301–6314. [Google Scholar] [CrossRef]
- Konopka, A. What is microbial community ecology? ISME J. 2009, 3, 1223–1230. [Google Scholar] [CrossRef]
- McMurdie, P.J.; Holmes, S. phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 2013, 8, e61217. [Google Scholar] [CrossRef]
- Franzosa, E.A.; McIver, L.J.; Rahnavard, G.; Thompson, L.R.; Schirmer, M.; Weingart, G.; Lipson, K.S.; Knight, R.; Caporaso, J.G.; Segata, N.; et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods 2018, 15, 962–968. [Google Scholar] [CrossRef] [PubMed]
- Yue, X.L.; Xu, L.; Cui, L.; Fu, G.Y.; Xu, X.W. Metagenome-based analysis of carbon-fixing microorganisms and their carbon-fixing pathways in deep-sea sediments of the southwestern Indian Ocean. Mar. Genom. 2023, 70, 101045. [Google Scholar] [CrossRef]
- Milanese, A.; Mende, D.R.; Paoli, L.; Salazar, G.; Ruscheweyh, H.J.; Cuenca, M.; Hingamp, P.; Alves, R.; Costea, P.I.; Coelho, L.P.; et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 2019, 10, 1014. [Google Scholar] [CrossRef] [PubMed]
- Seemann, T. Prokka: Rapid prokaryotic genome annotation. Bioinformatics 2014, 30, 2068–2069. [Google Scholar] [CrossRef] [PubMed]
- Prokka: Rapid Prokaryotic Genome Annotation. Available online: https://github.com/tseemann/prokka (accessed on 17 June 2024).
- Cantalapiedra, C.P.; Hernández-Plaza, A.; Letunic, I.; Bork, P.; Huerta-Cepas, J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol. Biol. Evol. 2021, 38, 5825–5829. [Google Scholar] [CrossRef] [PubMed]
- EggNOG-Mapper: Genome-Wide Functional Annotation. Available online: http://eggnog-mapper.embl.de/ (accessed on 17 June 2024).
- Kanehisa, M.; Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30. [Google Scholar] [CrossRef] [PubMed]
- KEGG: Kyoto Encyclopedia of Genes and Genomes. Available online: https://www.genome.jp/kegg/ (accessed on 17 June 2024).
- Gentleman, R.C.; Carey, V.J.; Bates, D.M.; Bolstad, B.; Dettling, M.; Dudoit, S.; Ellis, B.; Gautier, L.; Ge, Y.; Gentry, J.; et al. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol. 2004, 5, R80. [Google Scholar] [CrossRef] [PubMed]
- Eddy, S.R. Accelerated profile HMM searches. PLoS Comput. Biol. 2011, 7, e1002195. [Google Scholar] [CrossRef] [PubMed]
- Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Use R!), 2nd ed.; Springer: Cham, Switzerland, 2016; 276p. [Google Scholar]
- Kang, D.D.; Li, F.; Kirton, E.; Thomas, A.; Egan, R.; An, H.; Wang, Z. MetaBAT 2: An adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 2019, 7, e7359. [Google Scholar] [CrossRef] [PubMed]
- MetaBAT: A Robust Statistical Framework for Reconstructing Genomes from Metagenomic Data. Available online: https://bitbucket.org/berkeleylab/metabat/src/master/ (accessed on 17 June 2024).
- wwood/CoverM: Read Coverage Calculator for Metagenomics. Available online: https://github.com/wwood/CoverM (accessed on 17 June 2024).
- Hyatt, D.; Chen, G.L.; Locascio, P.F.; Land, M.L.; Larimer, F.W.; Hauser, L.J. Prodigal: Prokaryotic gene recognition and translation initiation site identification. BMC Bioinform. 2010, 11, 119. [Google Scholar] [CrossRef] [PubMed]
- Prodigal Gene Prediction Software. Available online: https://github.com/hyattpd/Prodigal (accessed on 17 June 2024).
- Krakau, S.; Straub, D.; Gourlé, H.; Gabernet, G.; Nahnsen, S. nf-core/mag: A best-practice pipeline for metagenome hybrid assembly and binning. NAR Genom. Bioinform. 2022, 4, lqac007. [Google Scholar] [CrossRef]
- Ewels, P.A.; Peltzer, A.; Fillinger, S.; Patel, H.; Alneberg, J.; Wilm, A.; Garcia, M.U.; Di Tommaso, P.; Nahnsen, S. The nf-core framework for community-curated bioinformatics pipelines. Nat. Biotechnol. 2020, 38, 276–278. [Google Scholar] [CrossRef]
- nf-core/mag: Assembly and Binning of Metagenomes. Available online: https://nf-co.re/mag/2.5.4 (accessed on 17 June 2024).
- Bracher, A.; Verghese, J. The nucleotide exchange factors of Hsp70 molecular chaperones. Front. Mol. Biosci. 2015, 2, 10. [Google Scholar] [CrossRef] [PubMed]
- Delaney, J.M. A grpE mutant of Escherichia coli is more resistant to heat than the wild-type. J. Gen. Microbiol. 1990, 136, 797–801. [Google Scholar] [CrossRef] [PubMed]
- Chaumeil, P.A.; Mussig, A.J.; Hugenholtz, P.; Parks, D.H. GTDB-Tk v2: Memory friendly classification with the genome taxonomy database. Bioinformatics 2022, 38, 5315–5316. [Google Scholar] [CrossRef] [PubMed]
- Parks, D.H.; Chuvochina, M.; Rinke, C.; Mussig, A.J.; Chaumeil, P.A.; Hugenholtz, P. GTDB: An ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 2022, 50, D785–D794. [Google Scholar] [CrossRef]
- R: A Language and Environment for Statistical Computing. Available online: https://www.R-project.org (accessed on 17 June 2024).
- Sunagawa, S.; Coelho, L.; Chaffron, S.; Kultima, J.; Labadie, K.; Salazar, G.; Djahanschiri, B.; Zeller, G.; Mende, D.; Alberti, A.; et al. Structure and function of the global ocean microbiome. Science 2015, 348, 1261359. [Google Scholar] [CrossRef]
- Leinonen, R.; Akhtar, R.; Birney, E.; Bower, L.; Cerdeno-Tárraga, A.; Cheng, Y.; Cleland, I.; Faruque, N.; Goodgame, N.; Gibson, R.; et al. The european nucleotide archive. Nucleic Acids Res. 2011, 39, D28–D31. [Google Scholar] [CrossRef] [PubMed]
- Goldberger, R.F.; Deeley, R.G.; Mullinix, K.P. Regulation of gene expression in prokaryotic organisms. In Advances in Genetics; Elsevier: Amsterdam, The Netherlands, 1976; Volume 18, pp. 1–67. [Google Scholar] [CrossRef]
- Allen, K.J.; Lepp, D.; McKellar, R.C.; Griffiths, M.W. Examination of stress and virulence gene expression in Escherichia coli O157:H7 using targeted microarray analysis. Foodborne Pathog. Dis. 2008, 5, 437–447. [Google Scholar] [CrossRef] [PubMed]
- Echtenkamp, P.L.; Wilson, D.B.; Shuler, M.L. Cell cycle progression in Escherichia coli B/r affects transcription of certain genes: Implications for synthetic genome design. Biotechnol. Bioeng. 2009, 102, 902–909. [Google Scholar] [CrossRef]
- Côté, J.P.; French, S.; Gehrke, S.S.; MacNair, C.R.; Mangat, C.S.; Bharat, A.; Brown, E.D. The Genome-Wide Interaction Network of Nutrient Stress Genes in Escherichia coli. mBio 2016, 7, e01714-16. [Google Scholar] [CrossRef]
- Pearson, W.R. An introduction to sequence similarity (“homology”) searching. Curr. Protoc. Bioinform. 2013, 42, 3.1.1–3.1.8. [Google Scholar] [CrossRef]
- Koonin, E.V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 2005, 39, 309–338. [Google Scholar] [CrossRef] [PubMed]
- Kaper, J.B.; Nataro, J.P.; Mobley, H.L. Pathogenic Escherichia coli. Nat. Rev. Microbiol. 2004, 2, 123–140. [Google Scholar] [CrossRef] [PubMed]
- Brockhurst, M.A.; Harrison, E.; Hall, J.P.J.; Richards, T.; McNally, A.; MacLean, C. The ecology and evolution of pangenomes. Curr. Biol. 2019, 29, R1094–R1103. [Google Scholar] [CrossRef] [PubMed]
- Gally, D.L.; Stevens, M.P. Microbe Profile: Escherichia coli O157:H7—Notorious relative of the microbiologist’s workhorse. Microbiology 2017, 163, 1–3. [Google Scholar] [CrossRef] [PubMed]
- Kanehisa, M.; Sato, Y. KEGG Mapper for inferring cellular functions from protein sequences. Protein Sci. 2020, 29, 28–35. [Google Scholar] [CrossRef] [PubMed]
- Liu, P.; Zou, S.; Zhang, H.; Liu, Q.; Song, Z.; Huang, Y.; Hu, X. Genome-resolved metagenomics provides insights into the microbial-mediated sulfur and nitrogen cycling in temperate seagrass meadows. Front. Mar. Sci. 2023, 10, 1245288. [Google Scholar] [CrossRef]
- Wang, J.; Jia, H. Metagenome-wide association studies: Fine-mining the microbiome. Nat. Rev. Microbiol. 2016, 14, 508–522. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhao, Z.; Marotta, F.; Wu, M. Thanos: An R Package for the Gene-Centric Analysis of Functional Potential in Metagenomic Samples. Microorganisms 2024, 12, 1264. https://doi.org/10.3390/microorganisms12071264
Zhao Z, Marotta F, Wu M. Thanos: An R Package for the Gene-Centric Analysis of Functional Potential in Metagenomic Samples. Microorganisms. 2024; 12(7):1264. https://doi.org/10.3390/microorganisms12071264
Chicago/Turabian StyleZhao, Zhe, Federico Marotta, and Min Wu. 2024. "Thanos: An R Package for the Gene-Centric Analysis of Functional Potential in Metagenomic Samples" Microorganisms 12, no. 7: 1264. https://doi.org/10.3390/microorganisms12071264
APA StyleZhao, Z., Marotta, F., & Wu, M. (2024). Thanos: An R Package for the Gene-Centric Analysis of Functional Potential in Metagenomic Samples. Microorganisms, 12(7), 1264. https://doi.org/10.3390/microorganisms12071264