Pangenomics in Microbial and Crop Research: Progress, Applications, and Perspectives
Abstract
1. Introduction
2. Pangenome: Concept and Types
3. Importance of Pangenome
4. Structural Variations Are Crucial for within-Species Diversity
5. Pangenome Construction: Basic Approaches and Critical Factors
6. Software/Tools for Pangenome Analysis
7. Applications of Pangenomics in Biological Research
7.1. Finding Novel Genes
7.2. Revealing Niche-Specific Fitness
7.3. Evolution, Domestication and Breeding History
7.4. Elucidating Host-Pathogen Interactions
7.5. Explaining Heterosis
7.6. Facilitating Taxonomic Identification
7.7. Strengthening Proteogenomics
7.8. Advancing Reverse Vaccinology
8. Conclusions and Future Perspectives
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Bohra, A.; Jha, U.C.; Godwin, I.; Varshney, R.K. Genomic Interventions for Sustainable Agriculture. Plant Biotechnol. J. 2020, 18, 2388–2405.
2. Heather, J.M.; Chain, B. The Sequence of Sequencers: The History of Sequencing DNA. Genomics 2016, 107, 1–8.
3. Varshney, R.K.; Nayak, S.N.; May, G.D.; Jackson, S.A. Next-Generation Sequencing Technologies and Their Implications for Crop Genetics and Breeding. Trends Biotechnol. 2009, 27, 522–530.
4. Varshney, R.K.; Bohra, A.; Yu, J.; Graner, A.; Zhang, Q.; Sorrells, M.E. Designing Future Crops: Genomics-Assisted Breeding Comes of Age. Trends Plant Sci. 2021, 26, 631–649.
5. Tettelin, H.; Masignani, V.; Cieslewicz, M.J.; Donati, C.; Medini, D.; Ward, N.L.; Angiuoli, S.V.; Crabtree, J.; Jones, A.L.; Durkin, A.S.; et al. Genome Analysis of Multiple Pathogenic Isolates of Streptococcus Agalactiae: Implications for the Microbial “pangenome”. Proc. Natl. Acad. Sci. USA 2005, 102, 13950–13955.
6. Golicz, A.A.; Batley, J.; Edwards, D. Towards Plant Pangenomics. Plant Biotechnol. J. 2016, 14, 1099–1105.
7. Vernikos, G.; Medini, D.; Riley, D.R.; Tettelin, H. Ten Years of Pangenome Analyses. Curr. Opin. Microbiol. 2015, 23, 148–154.
8. Medini, D.; Donati, C.; Tettelin, H.; Masignani, V.; Rappuoli, R. The Microbial Pan-Genome. Curr. Opin. Genet. Dev. 2005, 15, 589–594.
9. Khan, A.W.; Garg, V.; Roorkiwal, M.; Golicz, A.A.; Edwards, D.; Varshney, R.K. Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement. Trends Plant Sci. 2020, 25, 148–158.
10. Saxena, R.K.; Edwards, D.; Varshney, R.K. Structural Variations in Plant Genomes. Brief. Funct. Genom. 2014, 13, 296–307.
11. Carlos Guimaraes, L.; Benevides de Jesus, L.; Vinicius Canario Viana, M.; Silva, A.; Thiago Juca Ramos, R.; de Castro Soares, S.; Azevedo, V. Inside the Pangenome - Methods and Software Overview. Curr. Genom. 2015, 16, 245–252.
12. Marschall, T.; Marz, M.; Abeel, T.; Dijkstra, L.; Dutilh, B.E.; Ghaffaari, A.; Kersey, P.; Kloosterman, W.P.; Mäkinen, V.; Novak, A.M.; et al. Computational Pangenomics: Status, Promises and Challenges. Brief. Bioinform. 2018, 19, 118–135.
13. Zhao, Q.; Feng, Q.; Lu, H. Pangenome Analysis Highlights the Extent of Genomic Variation in Cultivated and Wild Rice. Nat. Genet. 2018, 50, 278–284.
14. Golicz, A.A.; Bayer, P.E.; Barker, G.C.; Edger, P.P.; Kim, H.; Martinez, P.A.; Chan, C.K.K.; Severn-Ellis, A.; McCombie, W.R.; Parkin, I.A.; et al. The Pangenome of an Agronomically Important Crop Plant Brassica Oleracea. Nat. Commun. 2016, 7, 1–8.
15. Hirsch, C.N.; Foerster, J.M.; Johnson, J.M.; Sekhon, R.S.; Muttoni, G.; Vaillancourt, B.; Peñagaricano, F.; Lindquist, E.; Pedraza, M.A.; Barry, K. Insights into the Maize Pangenome and Pantranscriptome. Plant Cell 2014, 26, 121–135.
16. Li, Y.H.; Zhou, G.; Ma, J.; Jiang, W.; Jin, L.G.; Zhang, Z.; Guo, Y.; Zhang, J.; Sui, Y.; Zheng, L.; et al. De Novo Assembly of Soybean Wild Relatives for Pangenome Analysis of Diversity and Agronomic Traits. Nat. Biotechnol. 2014, 32, 1045–1052.
17. Vernikos, G.S. A Review of Pangenome Tools and Recent Studies. In The Pangenome: Diversity, Dynamics and Evolution of Genomes; The Pangenome; Springer: Cham, Switzerland, 2020.
18. Baker, M. Structural Variation: The Genome’s Hidden Architecture. Nat. Methods 2012, 9, 133–137.
19. Springer, N.M.; Ying, K.; Fu, Y.; Ji, T.; Yeh, C.T.; Jia, Y.; Wu, W.; Richmond, T.; Kitzman, J.; Rosenbaum, H.; et al. Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content. PLoS Genet. 2009, 5, e1000734.
20. Ellis, J.; Dodds, P.; Pryor, T. Structure, Function and Evolution of Plant Disease Resistance Genes. Curr. Opin. Plant Biol. 2000, 3, 278–284.
21. Tao, Y.; Zhao, X.; Mace, E.; Henry, R.; Jordan, D. Exploring and Exploiting Pangenomics for Crop Improvement. Mol. Plant 2019, 12, 156–169.
22. Ashikawa, I.; Hayashi, N.; Yamane, H.; Kanamori, H.; Wu, J.; Matsumoto, T.; Ono, K.; Yano, M. Two Adjacent Nucleotide-Binding Site–Leucine-Rich Repeat Class Genes Are Required to Confer Pikm-Specific Rice Blast Resistance. Genetics 2008, 180, 2267–2276.
23. Lin, Z.; Li, X.; Shannon, L.M.; Yeh, C.T.; Wang, M.L.; Bai, G.; Peng, Z.; Li, J.; Trick, H.N.; Clemente, T.E.; et al. Parallel Domestication of the Shattering1 Genes in Cereals. Nat. Genet. 2012, 44, 720–724.
24. Yang, Q.; Li, Z.; Li, W.; Ku, L.; Wang, C.; Ye, J.; Li, K.; Yang, N.; Li, Y.; Zhong, T.; et al. CACTA-like Transposable Element in ZmCCT Attenuated Photoperiod Sensitivity and Accelerated the Post Domestication Spread of Maize. Proc. Natl. Acad. Sci. USA 2013, 110, 16969–16974.
25. Fu, H.; Dooner, H.K. Intraspecific Violation of Genetic Colinearity and Its Implications in Maize. Proc. Natl. Acad. Sci. USA 2002, 99, 9573–9578.
26. Qin, P.; Lu, H.; Du, H.; Wang, H.; Chen, W.; Chen, Z.; He, Q.; Ou, S.; Zhang, H.; Li, X.; et al. Pangenome Analysis of 33 Genetically Diverse Rice Accessions Reveals Hidden Genomic Variations. Cell 2021, 184, 3542–3558.e16.
27. Schatz, M.C.; Maron, L.G.; Stein, J.C.; Wences, A.H.; Gurtowski, J.; Biggers, E.; Lee, H.; Kramer, M.; Antoniou, E.; Ghiban, E.; et al. Whole Genome de Novo Assemblies of Three Divergent Strains of Rice, Oryza Sativa, Document Novel Gene Space of Aus and Indica. Genome Biol. 2014, 15, 506.
28. Montenegro, J.D.; Golicz, A.A.; Bayer, P.E.; Hurgobin, B.; Lee, H.; Chan, C.K.K.; Visendi, P.; Lai, K.; Doležel, J.; Batley, J.; et al. The Pangenome of Hexaploid Bread Wheat. Plant J. 2017, 90, 1007–1013.
29. Hurgobin, B.; Golicz, A.A.; Bayer, P.E.; Chan, C.K.K.; Tirnaz, S.; Dolatabadian, A.; Schiessl, S.V.; Samans, B.; Montenegro, J.D.; Parkin, I.A.; et al. Homoeologous Exchange Is a Major Cause of Gene Presence/Absence Variation in the Amphidiploid Brassica Napus. Plant Biotechnol. J. 2018, 16, 1265–1274.
30. Hu, Z.; Sun, C.; Lu, K.C.; Chu, X.; Zhao, Y.; Lu, J.; Shi, J.; Wei, C. EUPAN Enables Pangenome Studies of a Large Number of Eukaryotic Genomes. Bioinformatics 2017, 33, 2408–2409.
31. Jayakodi, M.; Schreiber, M.; Stein, N.; Mascher, M. Building Pangenome Infrastructures for Crop Plants and Their Use in Association Genetics. DNA Res. 2021, 28, dsaa030.
32. Wang, W.; Mauleon, R.; Hu, Z. Genomic Variation in 3010 Diverse Accessions of Asian Cultivated Rice. Nature 2018, 557, 43–49.
33. Gan, X.; Stegle, O.; Behr, J.; Steffen, J.G.; Drewe, P.; Hildebrand, K.L.; Lyngsoe, R.; Schultheiss, S.J.; Osborne, E.J.; Sreedharan, V.T.; et al. Multiple Reference Genomes and Transcriptomes for Arabidopsis Thaliana. Nature 2011, 477, 419–423.
34. Zapata, L.; Ding, J.; Willing, E.M.; Hartwig, B.; Bezdan, D.; Jiao, W.B.; Patel, V.; James, G.V.; Koornneef, M.; Ossowski, S.; et al. Chromosome-Level Assembly of Arabidopsis Thaliana Ler Reveals the Extent of Translocation and Inversion Polymorphisms. Proc. Natl. Acad. Sci. USA 2016, 113, E4052–E4060.
35. Adams, K.L.; Wendel, J.F. Polyploidy and Genome Evolution in Plants. Curr. Opin. Plant Biol. 2005, 8, 135–141.
36. Cao, M.D.; Nguyen, S.H.; Ganesamoorthy, D.; Elliott, A.G.; Cooper, M.A.; Coin, L.J. Scaffolding and Completing Genome Assemblies in Real-Time with Nanopore Sequencing. Nat. Commun. 2017, 8, 14515.
37. Parra, G.; Bradnam, K.; Korf, I. CEGMA: A Pipeline to Accurately Annotate Core Genes in Eukaryotic Genomes. Bioinformatics 2007, 23, 1061–1067.
38. Simão, F.A.; Waterhouse, R.M.; Ioannidis, P.; Kriventseva, E.V.; Zdobnov, E.M. BUSCO: Assessing Genome Assembly and Annotation Completeness with Single-Copy Orthologs. Bioinformatics 2015, 31, 3210–3212.
39. Tranchant-Dubreuil, C.; Rouard, M.; Sabot, F. Plant Pangenome: Impacts on Phenotypes and Evolution. Annu. Rev. Plant Biol. 2018, 15, 453–478.
40. Xiao, J.; Zhang, Z.; Wu, J.; Yu, J. A Brief Review of Software Tools for Pangenomics. Genom. Proteom. Bioinform. 2015, 13, 73–76.
41. Lerat, E.; Daubin, V.; Moran, N.A. From Gene Trees to Organismal Phylogeny in Prokaryotes: The Case of the Gammaproteobacteria. PLoS Biol. 2003, 1, 101–109.
42. Laing, C.; Buchanan, C.; Taboada, E.N.; Zhang, Y.; Kropinski, A.; Villegas, A.; Thomas, J.E.; Gannon, V.P. Pangenome Sequence Analysis Using Panseq: An Online Tool for the Rapid Analysis of Core and Accessory Genomic Regions. BMC Bioinform. 2010, 11, 461.
43. Lukjancenko, O.; Thomsen, M.C.; Voldby, L.M.; Ussery, D.W. PanFunPro: Pangenome Analysis Based on FUNctionalPROfiles. F1000Research 2013, 2, 265.
44. Contreras-Moreira, B.; Vinuesa, P. GET_HOMOLOGUES, a Versatile Software Package for Scalable and Robust Microbial Pangenome Analysis. Appl. Environ. Microbiol. 2013, 79, 7696–7701.
45. Benedict, M.N.; Henriksen, J.R.; Metcalf, W.W.; Whitaker, R.J.; Price, N.D. ITEP: An Integrated Toolkit for Exploration of Microbial Pangenomes. BMC Genom. 2014, 15, 8.
46. Zhao, Y.; Jia, X.; Yang, J.; Ling, Y.; Zhang, Z.; Yu, J.; Wu, J.; Xiao, J. PanGP: A Tool for Quickly Analyzing Bacterial Pangenome Profile. Bioinformatics 2014, 30, 1297–1299.
47. Zhao, Y.; Wu, J.; Yang, J.; Sun, S.; Xiao, J.; Yu, J. PGAP: Pangenomes Analysis Pipeline. Bioinformatics 2012, 28, 416–418.
48. Brittnacher, M.J.; Fong, C.; Hayden, H.S.; Jacobs, M.A.; Radey, M.; Rohmer, L. PGAT: A Multistrain Analysis Resource for Microbial Genomes. Bioinformatics 2011, 27, 2429–2430.
49. Blom, J.; Albaum, S.P.; Doppmeier, D. EDGAR: A Software Framework for the Comparative Analysis of Prokaryotic Genomes. BMC Bioinform. 2009, 10, 1–14.
50. Snipen, L.; Liland, K.H. Micropan: An R-Package for Microbial Pangenomics. BMC Bioinform. 2015, 16, 79.
51. Marcus, S.; Lee, H.; Schatz, M.C. SplitMEM: A Graphical Algorithm for Pangenome Analysis with Suffix Skips. Bioinformatics 2014, 30, 3476–3483.
52. Ozer, E.A. ClustAGE: A Tool for Clustering and Distribution Analysis of Bacterial Accessory Genomic Elements. BMC Bioinform. 2018, 19, 150.
53. Thakur, S.; Guttman, D.S. A De-Novo Genome Analysis Pipeline (DeNoGAP) for Large-Scale Comparative Prokaryotic Genomics Studies. BMC Bioinform. 2016, 17, 260.
54. Treangen, T.J.; Ondov, B.D.; Koren, S.; Phillippy, A.M. The Harvest Suite for Rapid Core-Genome Alignment and Visualization of Thousands of Intraspecific Microbial Genomes. Genome Biol. 2014, 15, 524.
55. Sahl, J.W.; Caporaso, J.G.; Rasko, D.A.; Keim, P. The Large-Scale Blast Score Ratio (LS-BSR) Pipeline: A Method to Rapidly Compare Genetic Content between Bacterial Genomes. PeerJ 2014, 2014, e332.
56. Kulsum, U.; Kapil, A.; Singh, H.; Kaur, P. NGSPanPipe: A Pipeline for Pangenome Identification in Microbial Strains from Experimental Reads. Adv. Exp. Med. Biol. 2018, 1052, 39–49.
57. Clarke, T.H.; Brinkac, L.M.; Inman, J.M.; Sutton, G.; Fouts, D.E. PanACEA: A Bioinformatics Tool for the Exploration and Visualization of Bacterial Pan-Chromosomes. BMC Bioinform. 2018, 19, 1–6.
58. Ernst, C.; Rahmann, S. PanCake: A Data Structure for Pangenomes. German Conference on Bioinformatics. Schloss Dagstuhl-Leibniz-Zent. Inform. Ger. Dagstuhl Publ. 2013, 35–45.
59. Yuvaraj, I.; Sridhar, J.; Michael, D.; Sekar, K. PanGeT: Pangenomics Tool. Gene 2017, 600, 77–84.
60. Chaudhari, N.M.; Gautam, A.; Gupta, V.K.; Kaur, G.; Dutta, C.; Paul, S. PanGFR-HM: A Dynamic Web Resource for Pan-Genomic and Functional Profiling of Human Microbiome with Comparative Features. Front. Microbiol. 2018, 9, 2322.
61. Abudahab, K.; Prada, J.M.; Yang, Z.; Bentley, S.D.; Croucher, N.J.; Corander, J.; Aanensen, D.M. PANINI: Pangenome Neighbour Identification for Bacterial Populations. Microb. Genom. 2018, 5, 4.
62. Santos, A.R.; Barbosa, E.; Fiaux, K.; Zurita-Turk, M.; Chaitankar, V.; Kamapantula, B.; Abdelzaher, A.; Ghosh, P.; Tiwari, S.; Barve, N.; et al. PANNOTATOR: An Automated Tool for Annotation of Pangenomes. Genet. Mol. Res. 2013, 12, 2982–2989.
63. Fouts, D.E.; Brinkac, L.; Beck, E.; Inman, J.; Sutton, G. PanOCT: Automated Clustering of Orthologs Using Conserved Gene Neighborhood for Pan-Genomic Analysis of Bacterial Strains and Closely Related Species. Nucleic Acids Res. 2012, 40, e172.
64. Hennig, A.; Bernhardt, J.; Nieselt, K. Pan-Tetris: An Interactive Visualisation for Pangenomes. BMC Bioinform. 2015, 16, 1–11.
65. Sheikhizadeh, S.; Schranz, M.E.; Akdel, M.; de Ridder, D.; Smit, S. PanTools: Representation, Storage and Exploration of Pan-Genomic Data. Bioinformatics 2016, 32, 487–493.
66. Pedersen, T.L.; Nookaew, I.; Wayne Ussery, D.; Månsson, M. PanViz: Interactive Visualization of the Structure of Functionally Annotated Pangenomes. Bioinformatics 2017, 33, 1081–1082.
67. Pantoja, Y.; Pinheiro, K.; Veras, A.; Araújo, F.; de Sousa, L.; Guimarães, L.C.; Silva, A.; Ramos, R.T. PanWeb: A Web Interface for Pan-Genomic Analysis. PLoS ONE 2017, 12, e0178154.
68. Ding, W.; Baumdicker, F.; Neher, R.A. PanX: Pangenome Analysis and Exploration. Nucleic Acids Res. 2018, 46, e5.
69. Liu, Y.Y.; Chiou, C.S.; Chen, C.C. PGAdb-Builder: A Web Service Tool for Creating Pangenome Allele Database for Molecular Fine Typing. Sci. Rep. 2016, 6, 36213.
70. Thorpe, H.A.; Bayliss, S.C.; Sheppard, S.K.; Feil, E.J. Piggy: A Rapid, Large-Scale Pangenome Analysis Tool for Intergenic Regions in Bacteria. Gigascience 2018, 7, 1–11.
71. Lees, J.A.; Galardini, M.; Bentley, S.D.; Weiser, J.N.; Corander, J. Pyseer: A Comprehensive Tool for Microbial Pangenome-Wide Association Studies. Bioinformatics 2018, 34, 4310–4312.
72. Jandrasits, C.; Dabrowski, P.W.; Fuchs, S.; Renard, B.Y. Seq-Seq-Pan: Building a Computational Pangenome Data Structure on Whole Genome Alignment. BMC Genom. 2018, 19, 47.
73. Ozer, E.A.; Allen, J.P.; Hauser, A.R. Characterization of the Core and Accessory Genomes of Pseudomonas Aeruginosa Using Bioinformatic Tools Spine and AGEnt. BMC Genom. 2014, 15, 737.
74. Chaudhari, N.M.; Gupta, V.K.; Dutta, C. BPGA- an Ultra-Fast Pangenome Analysis Pipeline. Sci. Rep. 2016, 6, 24373.
75. Cheng, G.; Lu, Q.; Ma, L.; Zhang, G.; Xu, L.; Zhou, Z. BGDMdocker: A Docker a Workflow Base on Docker for Analysis and Visualization Pangenome and Biosynthetic Gene Clusters of Bacterial. PeerJ 2017, 30, e3948.
76. Silva de Oliveira, M.; Thyeska Castro Alves, J.; Henrique Caracciolo Gomes de Sá, P.; Veras, A.A.D.O. PAN2HGENE–Tool for Comparative Analysis and Identifying New Gene Products. PLoS ONE 2021, 16, e0252414.
77. Danilevicz, M.F.; Fernandez, C.G.T.; Marsh, J.I.; Bayer, P.E.; Edwards, D. Plant Pangenomics: Approaches, Applications and Advancements. Curr. Opin. Plant Biol. 2020, 54, 18–25.
78. Beier, S.; Thomson, N.R. Panakeia - A Universal Tool for Bacterial Pangenome Analysis. bioRxiv 2021.
79. Duan, Z.; Qiao, Y.; Lu, J.; Lu, H.; Zhang, W.; Yan, F.; Sun, C.; Hu, Z.; Zhang, Z.; Li, G.; et al. HUPAN: A Pangenome Analysis Pipeline for Human Genomes. Genome Biol. 2019, 20, 1–11.
80. Bosi, E.; Fani, R.; Fondi, M. Defining Orthologs and Pangenome Size Metrics. Methods Mol. Biol. 2015, 1231, 191–202.
81. Othoum, G.; Bougouffa, S.; Bokhari, A. Mining Biosynthetic Gene Clusters in Virgibacillus Genomes. BMC Genom. 2019, 20, 696.
82. Othoum, G.; Prigent, S.; Derouiche, A. Comparative Genomics Study Reveals Red Sea Bacillus with Characteristics Associated with Potential Microbial Cell Factories (MCFs). Sci. Rep. 2019, 9, 19254.
83. Kant, R.; Rintahaka, J.; Yu, X.; Sigvart-Mattila, P.; Paulin, L.; Mecklin, J.P.; Saarela, M.; Palva, A.; von Ossowski, I. A Comparative Pangenome Perspective of Niche-Adaptable Cell-Surface Protein Phenotypes in Lactobacillus Rhamnosus. PLoS ONE 2014, 9, e102762.
84. McInerney, J.; McNally, A.; O’Connell, M. Why Prokaryotes Have Pangenomes. Nat. Microbiol. 2017, 2, 17040.
85. Vos, M.; Eyre-Walker, A. Are Pangenomes Adaptive or Not? Nat. Microbiol. 2017, 2, 1576.
86. Vos, M.; Hesselman, M.C.; Te Beek, T.A.; van Passel, M.W.; Eyre-Walker, A. Rates of Lateral Gene Transfer in Prokaryotes: High but Why? Trends Microbiol. 2015, 23, 598–605.
87. Livingstone, P.G.; Morphew, R.M.; Whitworth, D.E. Genome Sequencing and Pangenome Analysis of 23 Corallococcus Spp. Strains Reveal Unexpected Diversity, with Particular Plasticity of Predatory Gene Sets. Front. Microbiol. 2018, 9, 3187.
88. Huang, X.; Kurata, N.; Wang, Z.X.; Wang, A.; Zhao, Q.; Zhao, Y.; Liu, K.; Lu, H.; Li, W.; Guo, Y.; et al. A Map of Rice Genome Variation Reveals the Origin of Cultivated Rice. Nature 2012, 490, 497–501.
89. Zhang, B.; Zhu, W.; Diao, S. The Poplar Pangenome Provides Insights into the Evolutionary History of the Genus. Commun. Biol. 2019, 2, 215.
90. Barchi, L.; Rabanus-Wallace, M.T.; Prohens, J.; Toppino, L.; Padmarasu, S.; Portis, E.; Rotino, G.L.; Stein, N.; Lanteri, S.; Giuliano, G. Improved Genome Assembly and Pan-genome Provide Key Insights on Eggplant Domestication and Breeding. Plant J. 2021, 107, 579–596.
91. Monat, C.; Sabot, F. Pangenomics in Crop Plants; Population Genomics; Springer: Cham, Switzerland, 2020.
92. Lei, L.; Goltsman, E.; Goodstein, D.; Wu, G.A.; Rokhsar, D.S.; Vogel, J.P. Plant Pangenomics Comes of Age. Annu. Rev. Plant Biol. 2021, 72, 411–435.
93. Della Coletta, R.; Qiu, Y.; Ou, S.; Hufford, M.B.; Hirsch, C.N. How the Pangenome Is Changing Crop Genomics and Improvement. Genome Biol. 2021, 22, 3.
94. Bayer, P.E.; Petereit, J.; Danilevicz, M.F.; Anderson, R.; Batley, J.; Edwards, D. The Application of Pangenomics and Machine Learning in Genomic Selection in Plants. Plant Genome 2021, 14, e20112.
95. Hu, B.; Xie, G.; Lo, C.; Starkenburg, S.R.; Chain, P.S.G. Pathogen Comparative Genomics in the Next-Generation Sequencing Era: Genome Alignments, Pangenomics and Metagenomics. Brief. Funct. Genom. 2011, 10, 322–333.
96. Casa-Esperón, E. Horizontal Transfer and the Evolution of Host-Pathogen Interactions. Int. J. Evol. Biol. 2012, 2012, 679045.
97. Perna, N.T.; Plunkett, G.; Burland, V.; Mau, B.; Glasner, J.D.; Rose, D.J.; Mayhew, G.F.; Evans, P.S.; Gregor, J.; Kirkpatrick, H.A.; et al. Genome Sequence of Enterohaemorrhagic Escherichia Coli O157: H7. Nature 2001, 409, 529–533.
98. Rasko, D.A.; Rosovitz, M.J.; Myers, G.S.; Mongodin, E.F.; Fricke, W.F.; Gajer, P.; Crabtree, J.; Sebaihia, M.; Thomson, N.R.; Chaudhuri, R.; et al. The Pangenome Structure of Escherichia coli: Comparative Genomic Analysis of E. coli Commensal and Pathogenic Isolates. J. Bacteriol. 2008, 190, 6881–6893.
99. Badet, T.; Oggenfuss, U.; Abraham, L. A 19-Isolate Reference-Quality Global Pangenome for the Fungal Wheat Pathogen Zymoseptoria Tritici. BMC Biol. 2020, 18, 12.
100. Plissonneau, C.; Hartmann, F.E.; Croll, D. Pangenome Analyses of the Wheat Pathogen Zymoseptoria Tritici Reveal the Structural Basis of a Highly Plastic Eukaryotic Genome. BMC Biol. 2018, 16, 5.
101. Agarwal, G.; Gitaitis, R.D.; Dutta, B. Pangenome of Novel Pantoea Stewartii Subsp. Indologenes Reveals Genes Involved in Onion Pathogenicity and Evidence of Lateral Gene Transfer. Microorganisms 2021, 9, 1761.
102. Gonzalez, V.; Aventin, N.; Centeno, E.; Puigdomenech, P. High Presence/Absence Gene Variability in Defense-Related Gene Clusters of Cucumis Melo. BMC Genom. 2013, 14, 1–8.
103. Shen, J.; Araki, H.; Chen, L.; Chen, J.Q.; Tian, D. Unique Evolutionary Mechanism in R-Genes under the Presence/Absence Polymorphism in Arabidopsis Thaliana. Genetics 2006, 172, 1243–1250.
104. Winzer, T.; Gazda, V.; He, Z.; Kaminski, F.; Kern, M.; Larson, T.R.; Li, Y.; Meade, F.; Teodor, R.; Vaistij, F.E.; et al. A Papaver Somniferum 10-Gene Cluster for Synthesis of the Anticancer Alkaloid Noscapine. Science 2012, 336, 1704–1708.
105. Swanson-Wagner, R.A.; Eichten, S.R.; Kumari, S.; Tiffin, P.; Stein, J.C.; Ware, D.; Springer, N.M. Pervasive Gene Content Variation and Copy Number Variation in Maize and Its Undomesticated Progenitor. Genome Res. 2010, 20, 1689–1699.
106. Rouli, L.; Merhej, V.; Fournier, P.E.; Raoult, D. The Bacterial Pangenome as a New Tool for Analysing Pathogenic Bacteria. New Microbes New Infect. 2015, 7, 72–85.
107. de Souza, G.A.; Arntzen, M.Ø.; Wiker, H.G. MSMSpdbb: Providing Protein Databases of Closely Related Organisms to Improve Proteomic Characterization of Prokaryotic Microbes. Bioinformatics 2010, 26, 698–699.
108. Caputo, A.; Fournier, P.E.; Raoult, D. Genome and Pangenome Analysis to Classify Emerging Bacteria. Biol. Direct 2019, 14, 5.
109. Naz, K.; Naz, A.; Ashraf, S.T.; Rizwan, M.; Ahmad, J.; Baumbach, J.; Ali, A. PanRV: Pangenome-Reverse Vaccinology Approach for Identifications of Potential Vaccine Candidates in Microbial Pangenome. BMC Bioinform. 2019, 20, 1–10.
110. Dalsass, M.; Brozzi, A.; Medini, D.; Rappuoli, R. Comparison of Open-Source Reverse Vaccinology Programs for Bacterial Vaccine Antigen Discovery. Front. Immunol. 2019, 10, 113.
Software/Tool | Description/Role | URL Link | References |
---|---|---|---|
PanSeq | Extracts regions unique to a genome, identifies SNPs, and builds input files for phylogeny programs. | https://lfz.corefacility.ca/panseq/ (accessed on 17 September 2021) | [42] |
PanFunPro | Homology detection and pairwise genome analysis of the pan/core genome. | https://zenodo.org/record/7583#.YTR36p0zY2w (accessed on 17 September 2021) | [43] |
GET_HOMOLOGUES | Clusters protein and nucleotide sequences into homologous groups and analyzes overlapping sets of proteins. | http://www.eead.csic.es/compbio/soft/gethoms.php (accessed on 17 September 2021) | [44] |
ITEP | Used for sequence alignment, metabolic analysis, clustering, and protein prediction. | https://price.systemsbiology.net/itep (accessed on 17 September 2021) | [45] |
PanGP | Used for large-scale bacterial pangenome profile analysis with sampling algorithms. | https://pangp.zhaopage.com/ (accessed on 17 September 2021) | [46] |
PGAP | Detection of homologous genes, orthologous genes, and SNPs; phylogenetic studies; pangenome plotting; and functional annotation. | http://pgap.sf.net (accessed on 17 September 2021) | [47] |
PGAT | Compares gene content and sequence across multiple microbial genomes to identify SNPs. | http://nwrce.org/pgat (accessed on 17 September 2021) | [48] |
EDGAR | Performs homology analyses with a specific cutoff and generates Venn diagrams and interactive synteny plots. | https://bio.tools/edgar_genomics (accessed on 17 September 2021) | [49] |
Micropan | Allows integration of pangenome and additional analyses within a single programming language environment. | R package “micropan” (accessed on 17 September 2021) | [50] |
SplitMem | Graph-based pangenome analysis software that uses a de Bruijn graph. | https://sourceforge.net/projects/splitmem/ (accessed on 17 September 2021) | [51] |
ClustAGE | Focuses on the accessory genomic dimension of the pangenome. | http://vfsmspineagent.fsm.northwestern.edu/cgi-bin/clustage.cgi (accessed on 17 September 2021) | [52] |
DeNoGAP | Helps in gene prediction, protein classification, and orthology search. | https://github.com/DSGlab/DeNoGAP (accessed on 17 September 2021) | [53] |
EUPAN | The first tool to analyze eukaryotic pangenomes and identify core and accessory gene datasets. | http://cgm.sjtu.edu.cn/eupan/index.html (accessed on 17 September 2021) | [30] |
Harvest | A suite of three modules: Parsnp (core-genome analysis), Gingr (output visualization), and Harvest Tools (meta-analysis). | https://www.cbcb.umd.edu/software/harvest (accessed on 17 September 2021) | [54] |
LS-BSR | Calculates a score ratio per coding sequence within a pangenome dataset using BLAST. | https://github.com/jasonsahl/LS-BSR (accessed on 17 September 2021) | [55] |
NGSPanPipe | Identifies the pangenome from short reads; its output is compatible with other pangenome analysis tools. | https://github.com/Biomedinformatics/NGSPanPipe (accessed on 17 September 2021) | [56] |
PanACEA | Identifies genomic regions that are phylogenetically dissimilar. | https://github.com/JCVenterInstitute/PanACEA (accessed on 17 September 2021) | [57] |
PanCake | Useful for clustering homologous genes and analyzing the core/accessory genome. | https://pypi.org/project/pancake/ (accessed on 17 September 2021) | [58] |
PanGeT | Pangenome analysis based on comparison at the genome and proteome levels. | http://pranag.physics.iisc.ernet.in/PanGeT/ (accessed on 17 September 2021) | [59] |
PanGFR-HM | Genome-based analysis of genomic/functional diversity and phylogeny among human-associated microbial genomes. | http://www.bioinfo.iicb.res.in/pangfr-hm/ (accessed on 17 September 2021) | [60] |
PANINI | Rapid online visualization and analysis of the core and accessory genome evolutionary signal. | http://panini.pathogen.watch (accessed on 17 September 2021) and code at http://gitlab.com/cgps/panini (accessed on 17 September 2021) | [61] |
PANNOTATOR | Ensures quality and standards for functional genome annotation among different strains. | http://bnet.egr.vcu.edu/iioab/agenote.php (accessed on 17 September 2021) | [62] |
PanOCT | A graph-based ortholog clustering tool for closely related prokaryotic genomes. | ftp://ftp.ncbi.nih.gov/blast/executables/release/ (accessed on 17 September 2021) | [63] |
Pan-Tetris | Interactive and dynamic visual inspection of gene occurrences in a pangenome table. | http://bit.ly/1vVxYZT (accessed on 17 September 2021) | [64] |
PanTools | Annotating pangenomes, adding sequences, grouping genes, retrieving genomic regions, and querying the pangenome. | http://www.bif.wur.nl (accessed on 17 September 2021) | [65] |
PanViz | Visualizes pangenomic data from a range of data formats and maps genes from an existing pangenome. | https://github.com/thomasp85/PanViz (accessed on 17 September 2021) | [66] |
PanWeb | A graphical interface for pangenome analyses generated with the PGAP software. | http://www.computationalbiology.ufpa.br/panweb (accessed on 17 September 2021) | [67] |
PanX | Identifies orthologous gene clusters in pangenomes, visualizes presence/absence patterns, and identifies SNPs. | https://pangenome.org/ (accessed on 17 September 2021) | [68] |
PGAdb-Builder | Used to construct a pangenome allele database (PGAdb). | http://wgmlstdb.imst.nsysu.edu.tw/ (accessed on 17 September 2021) | [69] |
PGAP-X | Analyzes genome diversity and visualizes genome structure and gene content to understand evolution. | http://pgapx.ybzhao.com/ (accessed on 17 September 2021) | [22] |
Piggy | Detects highly divergent (“switched”) intergenic regions (IGRs) upstream of genes in the pangenome. | https://github.com/harry-thorpe/piggy (accessed on 17 September 2021) | [70] |
Pyseer | Helpful in microbial genome-wide association studies for identifying potential genetic variation. | https://github.com/mgalardini/pyseer (accessed on 17 September 2021) | [71] |
Seq-seq-pan | Sequentially aligns sequences to build a pangenome data structure and a whole-genome alignment. | https://gitlab.com/rki_bioinformatics (accessed on 17 September 2021) | [72] |
Spine and AGEnt | Spine finds the core genome from a group of genomic sequences; AGEnt finds the accessory genome in draft genomic sequences. | http://vfsmspineagent.fsm.northwestern.edu/index_age.html (accessed on 17 September 2021) | [73] |
BPGA | Pangenome profile analysis, pangenome sequence extraction, exclusive gene family analysis, atypical GC content analysis, and species phylogenetic analysis. | http://sourceforge.net/projects/bpgatool/ (accessed on 17 September 2021) | [74] |
BGDMdocker | Pangenome analysis, visualization, clustering, and genome annotation. | https://www.docker.com/whatisdocker (accessed on 17 September 2021) | [75] |
PAN2HGENE | Identifies new gene products, which alters the behavior of the α value in the pangenome without altering the original genomic sequence. | https://sourceforge.net/projects/pan2hgene-software (accessed on 17 September 2021) | [76] |
PATO | Identifies the core and accessory genomes and helps characterize population structure, annotate pathogenic features, and create gene sharedness networks. | https://github.com/irycisBioinfo/PATO (accessed on 17 September 2021) | [77] |
Panakeia | Analyzes synteny and multiple structural patterns of the pangenome, aiding studies of biological diversity and evolution. | https://github.com/BioSina/Panakeia (accessed on 17 September 2021) | [78] |
HUPAN | Developed for pangenome analysis of humans/mammals. | http://cgm.sjtu.edu.cn/hupan/ (accessed on 17 September 2021) and https://github.com/SJTU-CGM/HUPAN (accessed on 17 September 2021) | [79] |
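
Several of the tools listed above report core and accessory genomes. The partition they rely on can be illustrated with a minimal Python sketch. It is not the method of any listed tool, and it assumes that gene clusters have already been computed and that "core" means present in every genome; many pipelines use a softer threshold. The function name `partition_pangenome`, the `presence` mapping, and the toy cluster and genome IDs are hypothetical.

```python
from typing import Dict, List, Set


def partition_pangenome(
    presence: Dict[str, Set[str]], n_genomes: int
) -> Dict[str, List[str]]:
    """Split gene clusters into core, accessory, and unique sets.

    `presence` maps a gene-cluster ID to the set of genome IDs carrying it.
    Assumption: core = present in all genomes, unique = present in exactly
    one genome, accessory = everything in between.
    """
    parts: Dict[str, List[str]] = {"core": [], "accessory": [], "unique": []}
    for cluster, genomes in sorted(presence.items()):
        count = len(genomes)
        if count == 0:
            continue  # a cluster seen in no genome carries no information
        if count == n_genomes:
            parts["core"].append(cluster)
        elif count == 1:
            parts["unique"].append(cluster)
        else:
            parts["accessory"].append(cluster)
    return parts


# Hypothetical toy presence/absence data for three genomes.
toy = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2"},
    "geneC": {"g3"},
}
result = partition_pangenome(toy, n_genomes=3)
print(result)
```

In this toy input, `geneA` falls in the core set, `geneB` in the accessory set, and `geneC` in the unique set. Real pipelines differ mainly in how the clusters are built and in the threshold used for "core".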
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Aggarwal, S.K.; Singh, A.; Choudhary, M.; Kumar, A.; Rakshit, S.; Kumar, P.; Bohra, A.; Varshney, R.K. Pangenomics in Microbial and Crop Research: Progress, Applications, and Perspectives. Genes 2022, 13, 598. https://doi.org/10.3390/genes13040598