OHDLF: A Method for Selecting Orthologous Genes for Phylogenetic Construction and Its Application in the Genus Camellia
Abstract
1. Introduction
2. Materials and Methods
2.1. Analysis Data Source
2.2. Inference of Whole-Genome Duplication
2.3. Transcriptome Assembly and Analysis
2.4. Pipeline for OHDLF
2.5. Calculation of Species Divergence Times
3. Results and Discussion
3.1. Current Challenges in Using Whole-Genome Transcripts for Accurate Phylogenetic Tree of the Genus Camellia
3.2. Development of the OHDLF Workflow for Filtering Orthologous Genes in Camellia Species
3.3. Phylogenetic Trees Constructed Using Orthogroup Data from the OHDLF Workflow
3.4. The Accuracy of Phylogenetic Tree Construction Using the OHDLF Pipeline and Its Scope of Application
3.5. The Time Tree of Family Theaceae
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cai, J.; Lu, C.; Cui, Y.; Wang, Z.; Zhang, Q. OHDLF: A Method for Selecting Orthologous Genes for Phylogenetic Construction and Its Application in the Genus Camellia. Genes 2024, 15, 1404. https://doi.org/10.3390/genes15111404