Article

Intraspecific Comparative Analysis Reveals Genomic Variation of Didymella arachidicola and Pathogenicity Factors Potentially Related to Lesion Phenotype

Institute of Plant Protection, Henan Key Laboratory of Crop Pest Control, International Joint Research Laboratory for Crop Protection of Henan, Key Laboratory of Integrated Pest Management on Crops in Southern Region of North China, Henan Academy of Agricultural Sciences, Zhengzhou 450000, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Biology 2023, 12(3), 476; https://doi.org/10.3390/biology12030476
Submission received: 29 January 2023 / Revised: 16 March 2023 / Accepted: 17 March 2023 / Published: 21 March 2023
(This article belongs to the Section Genetics and Genomics)

Simple Summary

Didymella arachidicola, the causal agent of peanut web blotch, leads to severe defoliation at the late growth stage of peanuts, and eventually to significant yield losses of up to 30%. The biology, ecology, and taxonomy of this phytopathogenic fungus have been well studied; however, no study has focused on its genomic variation and pathogenic phenotype. Herein, we report the first chromosome-scale genome assembly of D. arachidicola, which provides a reliable baseline for further comparative studies of plant pathogenic fungi. Combined with genome re-sequencing, we reveal the genomic variation within the D. arachidicola population and comprehensively analyze the pathogenicity-related genes to preliminarily explain their roles in forming different lesion phenotypes of peanut web blotch. This work sets a genomic foundation and an adaptive landscape of D. arachidicola for understanding its genomic diversity and adaptive evolution, as well as the correlation between genotype and phenotype underlying the evolutionary force.

Abstract

Didymella arachidicola is one of the most important fungal pathogens, causing foliar disease and leading to severe yield losses of peanuts (Arachis hypogaea L.) in China. Two main lesion phenotypes of peanut web blotch have been identified as reticulation type (R type) and blotch type (B type). As no satisfactory reference genome is available, the genomic variations and pathogenicity factors of D. arachidicola remain to be revealed. In the present study, we collected 41 D. arachidicola isolates from 26 geographic locations across China (33 for R type and 8 for B type). The chromosome-scale genome of the most virulent isolate (YY187) was assembled as a reference using PacBio and Hi-C technologies. In addition, we re-sequenced 40 isolates from different sampling sites. Genome-wide alignments showed high similarity among the genomic sequences from the 40 isolates, with an average mapping rate of 97.38%. An average of 3242 SNPs and 315 InDels were identified in the genomic variation analysis, which revealed an intraspecific polymorphism in D. arachidicola. The comparative analysis of the most and least virulent isolates generated an integrated gene set containing 512 differential genes. Moreover, 225 genes individually or simultaneously harbored hits in CAZy-base, PHI-base, DFVF, etc. Compared with the R type reference, the differential gene sets from all B type isolates identified 13 shared genes potentially related to lesion phenotype. Our results reveal the intraspecific genomic variation of D. arachidicola isolates and pathogenicity factors potentially related to different lesion phenotypes. This work sets a genomic foundation for understanding the mechanisms behind genomic diversity driving different pathogenic phenotypes of D. arachidicola.

1. Introduction

Peanut web blotch (PWB), caused by Didymella arachidicola, is one of the most yield-limiting foliar diseases of peanut [1]. It was first observed in Texas, United States, in 1972 [2], and was then gradually reported in Liaoning, Shandong, Shaanxi, and Henan provinces of China in the 1980s–1990s [3]. PWB occurs in the middle–late growth periods of peanuts, causes severe defoliation, and results in significant yield losses [3,4]. Due to the economic importance of peanuts, most studies have focused on the biology, ecology, and control methods of PWB, as well as the taxonomy of its pathogens [5,6,7,8]. Based on modern phylogeny, which combines genomics and morphology, the causal fungus of PWB, previously described as Phoma arachidicola and Peyronellaea arachidicola [6,11], has recently been reclassified as D. arachidicola [9,10]. However, the mechanisms underlying its pathogenesis and virulence remain to be elucidated.
Phytopathogenic fungi have adopted diverse lifestyles, including obligate biotrophic, hemibiotrophic, and necrotrophic [12,13]. Biotrophic fungi have a narrow host range and derive nutrients from living host cells, while necrotrophic fungi have a broad host range and derive nutrients from dead host cells that are killed by secreted hydrolytic enzymes and toxins. Notably, hemibiotrophic fungi have a narrow host range and can establish an initial biotrophic phase, then switch to a necrotrophic phase [12,14,15]. Shifts in fungal lifestyle have been reported within the same fungal species isolated from different host plants [16] or different host cultivars [17,18], and between different fungal species of the same genus isolated from different host plants [19,20]. We have previously found two different leaf lesion phenotypes of PWB in China: reticulation type (R type) and blotch type (B type) (unpublished work). Isolates of R type invade the epidermal cells and extend over the adaxial leaf surface. Isolates of B type lead to chlorosis or cell death (necrosis), and eventually produce the corresponding spot symptoms on both the adaxial and abaxial leaf surfaces. The different lifestyles of fungi are closely connected to their secretome (secreted effectors, carbohydrate-active enzymes, transporters, etc.), whose components are secreted throughout infection, from penetration to tissue destruction [19,20,21]. Comparing variations in the secretome will facilitate our understanding of how fungi interact with their hosts.
Next-generation sequencing technologies have been widely used to study plant pathogens and have greatly facilitated the understanding of fungal evolution and virulence factors [22,23,24,25,26,27]. The lifestyle of fungal species in Dothideomycetes has evolved through four major transitions from non-phytopathogenic to phytopathogenic, and the pathogenicity-related genes carried by different genera vary considerably [28]. Fungal virulence has been shown to be associated with the number of certain secreted proteins, including CAZy proteins [29]. Large-scale comparative genomic analysis has been confirmed to be a powerful tool for investigating the genetic variation of phytopathogenic fungi [30,31]. Based on a comparison of 26 Phytophthora sojae isolates, a genomic landscape including single-nucleotide polymorphisms (SNPs), insertions or deletions (InDels), copy number variations (CNVs), and core RxLR effectors that are assumed to be essential for infection has been identified [32]. Genomic differences and gene family expansions in the pathogenicity-related genes have been reported from a comparative genomic analysis of three Phytophthora capsici isolates [33]. Although the increasing number of genomic analyses provides insight into the genetic mechanisms underlying different aggressiveness levels of fungal pathogens, most have focused on the evolutionary traits and genetic variation at the intraspecific levels of plant fungal pathogens [30,33,34,35].
To date, only two draft genome sequences of D. arachidicola have been reported, with genome sizes of 34.11 and 47.30 Mb [36,37]. Notably, in one of the draft genome reports (our previous study), the causal fungus of PWB was named Peyronellaea arachidicola [37], which was reclassified as D. arachidicola in 2020 [10]. Multiple genomic comparisons at the intraspecific level are expected to reveal the genomic adaptation of D. arachidicola and provide a molecular foundation for understanding its pathogenic mechanism. Therefore, in this study we completed a high-quality, chromosome-scale genome assembly of a local D. arachidicola isolate and re-sequenced 40 isolates from 26 geographical locations across China. The present work provides a valuable foundation for understanding the genomic basis of pathogenicity and the adaptive evolution of D. arachidicola. Furthermore, our results set a basis for further studies on functional genes related to lesion phenotype.

2. Materials and Methods

2.1. Fungal Isolates and Pathogenicity Assays

All the isolates of D. arachidicola were obtained from symptomatic leaves collected from 26 different geographic locations across the main peanut production areas in China (Supplementary Figure S1). D. arachidicola isolates were single spore-derived, and their hyphae (2–3 days old) were soaked in cryogenic vials with sterile 20% glycerin solution for storage at −80 °C. For the pathogenicity test, all isolates were transferred to oat agar plates to induce conidia production (22 °C in the dark for seven days, then 22 °C under a photoperiod of 12 h black light and 12 h dark). Eight-leaf stage seedlings of peanut cultivar Yuhua 22 were chosen for inoculation. The leaflets of peanut seedlings were inoculated by spraying either spore suspension as the treatment or sterile water as a control. Three replicates were set for each treatment. The inoculated seedlings were incubated at 25 °C with a relative humidity of 90% and a photoperiod of 10 h light and 14 h dark. The diseased leaflets were recorded and photographed at 14 days post-inoculation (dpi), blade and lesion areas were measured using ImageJ v1.8.0, and the pathogenicity was evaluated based on a nine-level grading method [38]. The differences in disease index were compared in R software v4.2.2. Bartlett’s test was used to check the homogeneity of variances; the Wilcoxon test was used for unpaired two-sample comparison with p values adjusted by the Benjamini method to control the false discovery rate (Figure 1A), and the Mann–Whitney U test was used for comparison of two independent groups (Figure 1B). The standard error of the mean was calculated in Microsoft Office Excel with the formula ‘=STDEV.S(sample)/SQRT(COUNT(sample))’.
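The standard-error formula quoted above can be mirrored in a few lines of code. The following is a minimal Python sketch, not the authors' workflow (they used Excel and R); the function name `standard_error` and the replicate values are illustrative assumptions, not data from the study.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample SD (n - 1 denominator) / sqrt(n).

    Equivalent to the spreadsheet expression STDEV.S(sample)/SQRT(COUNT(sample)).
    """
    if len(sample) < 2:
        raise ValueError("at least two observations are required")
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical disease-index values for three replicates (not study data).
replicates = [26.0, 28.0, 30.0]
sem = standard_error(replicates)
```

`statistics.stdev` uses the n − 1 denominator, which matches Excel's `STDEV.S`; `statistics.pstdev` would correspond to `STDEV.P` instead.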

2.2. Genomic DNA, RNA Extraction, and Whole-Genome Sequencing

All the isolates were grown on Potato Dextrose Agar (PDA) medium for three days at 25 °C with a photoperiod of 8 h light and 16 h dark. Mycelium was harvested, freeze-dried in liquid nitrogen for 3 h, and stored at −80 °C. Genomic DNA and total RNA of D. arachidicola isolates were extracted from mycelia using the DNeasy plant mini kit (QIAGEN, Venlo, The Netherlands) and TRI reagent (Sigma Aldrich, St. Louis, MO, USA), respectively. The quality of genomic DNA and total RNA was checked using 1% agarose gel electrophoresis and quantified with a NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific Inc., Waltham, MA, USA). RNA integrity was assessed using the RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA). De novo genome sequencing was performed on isolate YY187: Single Molecule Real-Time (SMRT) cell libraries were prepared and sequenced on the PacBio Sequel platform (PacBio, Menlo Park, CA, USA).

2.3. Genome Assembly and Hi-C Analysis

Raw PacBio polymerase reads were processed by SMRT analysis package v2.3.0 to filter out low-quality reads (readScore < 0.8), remove adapters, and extract subreads with a length greater than 1000 bp. The high-quality subreads were corrected and assembled using Canu v1.5 with the parameter ‘correctedErrorRate’ set to 0.045 [39]. Genome completeness and assembly quality were assessed using BUSCO v5.4.4 [40,41].
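The two subread filters described above (read score and minimum length) can be expressed as a simple predicate. This is an illustrative Python sketch only; the actual filtering was done by the SMRT analysis package, and the `Subread` record and function names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subread:
    """Hypothetical in-memory record for one PacBio subread."""
    name: str
    sequence: str
    read_score: float  # score of the parent polymerase read


MIN_READ_SCORE = 0.8  # reads with readScore < 0.8 are discarded
MIN_LENGTH = 1000     # subreads must be longer than 1000 bp


def keep_subread(subread):
    """Return True if the subread passes both thresholds from the text."""
    return (subread.read_score >= MIN_READ_SCORE
            and len(subread.sequence) > MIN_LENGTH)


def filter_subreads(subreads):
    """Keep only high-quality subreads, preserving input order."""
    return [sr for sr in subreads if keep_subread(sr)]
```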
Genomic DNA extraction and quality and quantity assessment for isolate YY187 were performed as described above. A Hi-C sequencing library was constructed, its concentration and insert size were determined using Qubit v2.0 and Agilent 2100, and it was then subjected to paired-end 150 bp sequencing on the Illumina HiSeq platform (Illumina, San Diego, CA, USA). Illumina clean reads were mapped to the assembled genome of D. arachidicola using BWA with default parameters [42]. Valid and invalid interaction pairs of unique mapped read pairs were filtered and assessed using HiC-Pro v2.10 [43]. Scaffolds/contigs were clustered into chromosome groups, and then those scaffolds/contigs within each chromosome group were ordered and oriented using LACHESIS [44] with the parameters ‘-CLUSTER_MAX_LINK_DENSITY 2 -CLUSTER_MIN_RE_SITES 5 -ORDER_MIN_N_RES_IN_TRUNK 5 -ORDER_MIN_N_RES_IN_SHREDS 5 -CLUSTER_NONINFORMATIVE_RATIO 2’.

2.4. Genome Annotation

The repeat library was constructed using LTR_FINDER v1.05 [45], MITE-HUNTER [46], RepeatScout v1.0.5 [47], and PILER-DF v2.4 [48], classified using PASTEClassifier [49], and then merged with the Repbase database [50]. Accordingly, all the possible repeat elements (REs) were detected by RepeatMasker v4.0.6 [51].
Protein-coding genes were predicted by combining three strategies: ab initio, homology-based, and RNA-Seq-based prediction. The ab initio prediction was carried out using Genscan v1.0 [52], Augustus v2.4 [53], GlimmerHMM v3.0.4 [54], GeneID v1.4 [55], and SNAP v2006-07-28 [56]. The homology-based prediction was performed by GeMoMa v1.3.1 [57]. RNA-Seq data were mapped to reference transcripts using Hisat2 v2.0.4 and Stringtie v1.2.3 [58]. Transcriptome assembly was conducted based on Unigene sequences that were predicted using the PASA v2.0.2 [59] + TransDecoder v2.0 pipeline. The gene sets were integrated into a non-redundant gene set using EVM v1.1.1 [60]. Transfer RNAs (tRNAs) were predicted by tRNAscan-SE [61], and ribosomal RNAs (rRNAs) and other non-coding RNAs (ncRNAs) were predicted using Infernal v1.1 [62] based on the Rfam database [63].
The predicted gene set was used for gene annotation based on the functional databases KOG [64], KEGG [65], Swiss-Prot (2015_01), TrEMBL [66], and Nr [67] using BLAST v2.2.29 [68]. The hits from the Nr database search were further annotated against the GO [71] and Pfam (27.0) [72] databases using Blast2GO v2.5 [69] and Hmmer v3.0 [70], respectively. In addition, Hmmer v3.0 [70] was used to annotate carbohydrate enzymes, membrane transport proteins, and cytochrome P450 proteins based on the carbohydrate-active enzymes (CAZy) database, transporter classification database (TCDB), and cytochrome P450 engineering database (CYPED), respectively [73,74,75]. The predicted gene set was also annotated against the pathogen–host interactions (PHI) database [76] and the database of fungal virulence factors (DFVF) [77] using BLAST v2.2.29 [68].
To identify protein subcellular localization, the protein sequences of all the predicted genes were analyzed, and proteins containing signal peptides were detected by SignalP v4.0 [78] with the parameter ‘-f long -g png’. Proteins predicted by TMHMM with default parameters [79] to contain transmembrane domains were then removed, and the remaining proteins were considered candidate secreted proteins. EffectorP [80] was used to further analyze the secreted proteins to predict effector proteins.
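The secretome selection described above amounts to a set difference: proteins with a signal peptide minus proteins with a transmembrane domain. A minimal Python sketch follows, assuming the two tools' outputs have already been parsed into collections of protein identifiers; the function name and the IDs in the usage example are hypothetical.

```python
def candidate_secreted(signal_peptide_ids, transmembrane_ids):
    """Return IDs that carry a signal peptide but no transmembrane domain.

    Both arguments are iterables of protein identifiers, assumed to have been
    parsed beforehand from the signal-peptide and transmembrane predictions.
    """
    return set(signal_peptide_ids) - set(transmembrane_ids)


# Toy usage with made-up identifiers.
with_signal = {"protA", "protB", "protC"}
with_tm_domain = {"protB", "protZ"}
secreted = candidate_secreted(with_signal, with_tm_domain)
```

The resulting set is what would be passed on to effector prediction.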

2.5. Genome Re-Sequencing and Variations Calling

Genomic DNA extraction of 40 isolates of D. arachidicola and quantity/quality determination were performed as described above. Paired-end libraries with an insert size of 350 bp for Illumina sequencing were constructed, and PE150 sequencing was performed on an Illumina HiSeq X-Ten platform. The obtained raw reads were filtered to remove adapters, low-quality reads in which more than 50% of the bases had a quality score of less than 20 (Phred-like score), and paired-end reads with >10% ‘N’ bases. The clean reads of re-sequenced isolates were aligned to the reference genome (de novo assembly of isolate YY187) by BWA-MEM v0.7.12 [81]. Duplicate reads were removed from alignments using samtools v1.7 [82].
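The read-level quality criteria above can be sketched as a predicate on one read pair. This Python example is an assumption-laden illustration, not the pipeline actually used: it assumes Phred scores are already decoded to integers, that a pair is discarded if either mate fails, and that the ‘N’ threshold is applied per mate; adapter trimming is omitted. All function names are hypothetical.

```python
def fraction(flags):
    """Fraction of True values in an iterable of booleans (0.0 if empty)."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def read_is_clean(sequence, phred_scores,
                  min_quality=20, max_low_quality=0.5, max_n=0.10):
    """Apply the two thresholds from the text to a single read."""
    low_quality = fraction(q < min_quality for q in phred_scores)
    n_content = fraction(base in "Nn" for base in sequence)
    # Discard when MORE than 50% of bases are low quality or >10% are N.
    return low_quality <= max_low_quality and n_content <= max_n


def pair_is_clean(mate1, mate2):
    """Each mate is a (sequence, phred_scores) tuple; both must pass."""
    return read_is_clean(*mate1) and read_is_clean(*mate2)
```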
Single-nucleotide polymorphisms (SNPs) and insertions and deletions (InDels) within the 40 isolates were called using the HaplotypeCaller module in GATK [83], and variants meeting the following filter expression were removed: QD < 2.0 || MQ < 40.0 || FS > 60.0 || QUAL < 30.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0 -clusterSize 2 -clusterWindowSize 5. SNP and InDel annotations were performed using default parameters based on the reference genome by snpEff v4.1 [84]. Synonymous and non-synonymous SNPs and InDels in exons were further filtered. In addition, genes with non-synonymous SNPs and InDels in exons were targeted and used for searching genes with potential functional differences, which were further annotated against databases of Nr [67], SwissProt (2015_01), GO [71], COG [64], and KEGG [65] using BLAST v2.2.29 [68].
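The hard-filter expression above is a logical OR of threshold tests on per-site annotations. The Python sketch below only illustrates that logic on a dictionary of annotation values; it is not a substitute for GATK, it ignores the cluster-window options, and it assumes that a missing annotation does not cause a site to fail. The function name `fails_hard_filter` is hypothetical.

```python
# (annotation key, test that returns True when the site should be filtered)
HARD_FILTERS = [
    ("QD", lambda v: v < 2.0),
    ("MQ", lambda v: v < 40.0),
    ("FS", lambda v: v > 60.0),
    ("QUAL", lambda v: v < 30.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(annotations):
    """Return True if any threshold in the filter expression is violated.

    `annotations` maps annotation names to numeric values; keys that are
    absent or None are skipped (an assumption of this sketch).
    """
    for key, is_bad in HARD_FILTERS:
        value = annotations.get(key)
        if value is not None and is_bad(value):
            return True
    return False
```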

2.6. Population Phylogeny Analysis

The SNPs identified by GATK were further filtered: only SNPs with a minor allele frequency greater than 5% and less than 20% missing data were considered high-quality SNPs. A phylogenetic tree was reconstructed based on the alignment of high-quality SNPs, using the maximum likelihood method with the GTR + G + I model in MEGA X [85] with 1000 bootstrap replicates. The population structure within 40 isolates was inferred based on the high-quality SNPs using ADMIXTURE v1.22 [86], with the putative number of sub-populations (K value) ranging from 1 to 10. The optimal number of sub-populations was assessed using five-fold cross-validation.
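The high-quality SNP criteria above (minor allele frequency above 5%, missing data below 20%) can be illustrated per site as follows. This is a minimal Python sketch under stated assumptions: each isolate contributes a single biallelic call coded 0 (reference), 1 (alternate), or None (missing). The function name is hypothetical and the study itself does not state which tool applied this filter.

```python
def is_high_quality_snp(calls, min_maf=0.05, max_missing=0.20):
    """Decide whether one SNP site passes the MAF and missingness thresholds.

    `calls` holds one entry per isolate: 0, 1, or None for a missing call.
    """
    total = len(calls)
    observed = [c for c in calls if c is not None]
    if total == 0 or not observed:
        return False
    missing_rate = (total - len(observed)) / total
    alt_frequency = sum(observed) / len(observed)
    maf = min(alt_frequency, 1.0 - alt_frequency)
    return maf > min_maf and missing_rate < max_missing


def filter_sites(sites):
    """Keep site IDs whose calls pass; `sites` maps site ID -> list of calls."""
    return [site for site, calls in sites.items() if is_high_quality_snp(calls)]
```

Both comparisons are strict, following the wording "greater than 5%" and "less than 20%".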

2.7. Identification of Potential Pathogenicity Factors

Genes with non-synonymous SNPs and InDels occurring in exons were considered the specific differential gene set of each isolate. As the reference genome was that of the most virulent isolate, the differential gene sets of the least virulent isolates of the two lesion phenotypes were selected to identify pathogenicity-related genes. Each differential gene set was individually compared with the functional annotations of the reference genome, including CAZy-base, TCDB, PHI-base, CYPED, and DFVF annotations, and the predictions of secreted proteins and effector proteins, using VCFtools v.0.1.15 [87]. The differential gene sets of all B type isolates were then merged and compared to identify shared differential genes as pathogenicity factors potentially related to different lesion phenotypes.
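Finding the differential genes shared by every B type isolate is a set intersection across per-isolate gene sets. The Python sketch below shows the idea with toy data; the function name and the gene identifiers are hypothetical, and the study's own comparison was not necessarily implemented this way.

```python
def shared_differential_genes(gene_sets_by_isolate):
    """Return the genes present in the differential gene set of every isolate.

    `gene_sets_by_isolate` maps an isolate name to an iterable of gene IDs.
    """
    gene_sets = [set(genes) for genes in gene_sets_by_isolate.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Toy example: isolate names are from the study, gene IDs are made up.
toy_sets = {
    "SD278": {"geneA", "geneB", "geneC"},
    "SC291": {"geneB", "geneC", "geneD"},
    "LN297": {"geneC", "geneB"},
}
shared = shared_differential_genes(toy_sets)
```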

3. Results

3.1. Pathogenicity Varied among D. arachidicola Isolates

The pathogenicity of the isolates on peanut seedlings differed at 14 dpi. The isolates YY187 and SC291 had the highest and lowest pathogenicity, with disease indices of 28.33 and 7.56, respectively (Figure 1A). The control treatment (seedlings inoculated with sterile water) showed no symptoms during the experimental period. The leaf lesion phenotypes of all the isolates were recorded at 14 dpi: 33 isolates were reticulation type (R type) and 8 were blotch type (B type). The isolates YY187 and SD278 were the most virulent of the R and B types, respectively (Figure 1A). In addition, the average disease index of R type isolates was significantly higher than that of B type isolates (Figure 1B).

3.2. Genome Assembly and Annotation

Isolate YY187 was sequenced and assembled as a reference genome of D. arachidicola. A total of 7.86 Gb filtered subreads with an average length of 8.09 kb were generated from the PacBio Sequel platform. The assembled genome size was 47.35 Mb, comprising 26 scaffolds, with an N50 length of 2.17 Mb and a G + C content of 56.37% (Table 1). A total of 16,629 genes were predicted in this study (Supplementary Table S1).
The Hi-C approach was further used to determine the state of genome folding of isolate YY187. Approximately two million paired-end reads (6.01 Gb) were obtained with a G + C content of 51.95% and a Q30 of 94.35%. The ratios of mapped reads and unique mapped read pairs were 96.14% and 73.06%, respectively. Among the unique mapped read pairs, valid interaction pairs (8.83 Mb) accounted for 60.32%. Hi-C assembly located 36.03 Mb of genomic sequence (accounting for 76.09% of the total sequence length) on chromosomes. Moreover, the chromosomal ordered and oriented sequences accounted for 100% of the total sequence length and number, respectively. The Hi-C assembled results were further cut into 20 kb bins, and the interaction intensity between each pair of bins was used to construct a heat map. Finally, the preliminary PacBio assembly was scaffolded using the Hi-C technique into a chromosome-level assembly containing 18 chromosomes (Figure 2, Supplementary Figure S2).
The predicted genes were annotated based on multiple functional databases (KOG, KEGG, Swiss-Prot, TrEMBL, Nr, GO, and Pfam), and 15,767 genes (approximately 94.82% of the total) showed at least one hit in these databases (Supplementary Table S2). In total, 9682 protein-coding genes were annotated from the GO database. The ‘cell part’, ‘cell’, ‘membrane’, and ‘organelle’ terms were the most prevalent GO terms associated with cellular components; meanwhile, ‘catalytic activity’ and ‘metabolic process’ were the largest categories of genes associated with molecular function and biological processes, respectively (Supplementary Figure S3). In addition, 7676 protein-coding genes were annotated by the KOG database. The functional category ‘general function prediction only’ was the most abundant, followed by ‘posttranslational modification, protein turnover, chaperones’ and ‘amino acid transport and metabolism’ (Supplementary Figure S4). The annotation based on the KEGG database found 4739 protein-coding genes, demonstrating that ‘biosynthesis of amino acids’ and ‘carbon metabolism’ were the two categories most associated with metabolism (Supplementary Figure S5). The density of annotated genes on each chromosome is shown in Figure 2, visualized using the gene location tool in TBtools [88].
Additional annotations were based on CAZy-base, TCDB, CYPED, PHI-base, and DFVF. A total of 902 genes were annotated as CAZymes, including 35.32% glycoside hydrolases (GHs), 19.90% carbohydrate esterases (CEs), 18.00% glycosyl transferases (GTs), 14.02% auxiliary activities (AAs), 9.25% carbohydrate-binding modules (CBMs), and 3.48% polysaccharide lyases (PLs) (Supplementary Figures S6A and S7A). A total of 405 and 558 genes were annotated as membrane transport proteins and cytochrome P450 proteins in the reference genome, respectively (Supplementary Figure S6A). A total of 5028 potential pathogenic genes were assigned to different categories in PHI-base (containing information on experimentally proven genes in bacteria, fungi, and oomycetes), most of which were related to reduced virulence (40.93%) and unaffected virulence (30.65%) (Supplementary Figures S6A and S7B). As a supplement and validation, 2996 genes were annotated as known fungal virulence factors in the DFVF database, among which 240 genes were exclusive to the DFVF database when compared with the genes annotated in PHI-base (Supplementary Figure S6).
In total, 1906 proteins with signal peptides and 3990 transmembrane proteins were predicted in the genome of D. arachidicola isolate YY187. After removing proteins with a transmembrane structure from those with a signal peptide, the remaining 1292 proteins were identified as potentially secreted proteins. Furthermore, 335 secreted proteins (accounting for 25.93% of the total predicted secreted proteins) belonged to CAZymes, among which PHI-base annotated 169 secreted proteins as “reduced virulence”, “loss of pathogenicity”, “effector (plant avirulence determinant)”, etc. In total, 144 of the potential CAZymes were additionally annotated by DFVF as known fungal virulence factors, 131 of which were also annotated in PHI-base. Moreover, 174 effector proteins associated with pathogen–plant host interactions were further predicted by EffectorP from the potentially secreted proteins (Supplementary Figure S6B).

3.3. Genomic Variation and Phylogeny of D. arachidicola

Genome re-sequencing of 40 isolates (all except the reference isolate) was performed to investigate the genomic variations of D. arachidicola at the intraspecific level. The clean reads of the 40 isolates were mapped to the reference genome of isolate YY187, with GC content and mapping rate ranging from 50.95% to 53.03% and from 90.13% to 99.66%, respectively (Supplementary Table S3). A total of 158,962 SNPs and 12,771 InDels were identified (Supplementary Table S4). For all the isolates, the ratio of transitions to transversions (Ti/Tv) of SNPs ranged from 2.91 to 3.47, and the numbers of insertions and deletions ranged from 86 to 267 and 101 to 229, respectively (Supplementary Table S3). Genomic variations were further categorized according to the regions in which they occurred (intergenic regions, upstream or downstream regions, and exons or introns), which showed that all 40 isolates had similar variations (Supplementary Table S3). In addition, most variations (59.62% of SNPs and 59.00% of InDels) were located in gene upstream and downstream regions, and 19.21% of SNPs and 14.03% of InDels were located in exon regions (Figure 3A). The number of non-synonymous SNPs in exons ranged from 224 to 507, and the number of InDels detected in exons ranged from 22 to 62 (Supplementary Table S5). An intraspecific comparison of variations among all the isolates is further shown in Supplementary Figure S8. Among the variation types in the CDS regions of each isolate, synonymous and non-synonymous SNPs accounted for the largest proportion of SNPs, while frameshift mutations were the primary type of InDel (Figure 3B). Genes with non-synonymous SNPs and InDels occurring in exons were identified, with the gene numbers ranging from 167 to 420 and 19 to 51, respectively (Supplementary Table S5).
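For readers unfamiliar with the Ti/Tv statistic reported above, the following Python sketch shows how it is computed from a list of SNPs: transitions are purine-to-purine (A↔G) or pyrimidine-to-pyrimidine (C↔T) changes, and all other substitutions are transversions. The function names and the example SNP list are illustrative assumptions, not part of the study's pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snps):
    """Compute transitions / transversions for (ref, alt) base pairs."""
    substitutions = [(r, a) for r, a in snps if r.upper() != a.upper()]
    transitions = sum(1 for r, a in substitutions if is_transition(r, a))
    transversions = len(substitutions) - transitions
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Toy example: two transitions and one transversion.
example_ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])
```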
A phylogenetic analysis was performed to investigate the relationships within the D. arachidicola population. The SNP calling was based on the reference assembly of isolate YY187, and 8154 SNPs were retained as high-quality SNPs after further filtering (Supplementary Table S4); these were used to construct a maximum likelihood phylogenetic tree (Figure 4). The phylogenetic tree included 40 isolates from 26 geographical regions of China, and the results indicated four sub-populations (S1, S2, S3, and S4), while no correlations were found between the clustering of isolates and geography for the D. arachidicola population. The eight B type isolates were distributed across clusters S1, S3, and S4, indicating no correlation between the phenotype of different lesions and geographical regions. In addition, the putative number of sub-populations was assumed from 1 to 10 and assessed using five-fold cross-validation. At K = 4, D. arachidicola isolates were divided into four groups that showed the same grouping results as our phylogenetic analysis.

3.4. Identification of Pathogenicity Factors

According to the results of pathogenicity assays, the most virulent isolate was YY187, the least virulent R type isolate was SC289, and the least virulent B type isolate was SC291. Based on the genome re-sequencing, 3,552,798 clean reads of isolate SC289 and 3,510,557 clean reads of isolate SC291 were mapped to the reference genome (isolate YY187) with mapping rates of 98.36% and 98.70%, respectively (Supplementary Table S3). By variation calling and annotation, 315 and 420 differential genes with non-synonymous SNPs, and 43 and 41 differential genes with InDels, were further annotated (Supplementary Table S5). Differential genes were searched with BLAST against general functional databases, which generated two differential gene sets containing 266 and 331 functionally annotated genes for isolates SC289 and SC291, respectively (Supplementary Table S6). The two differential gene sets were further integrated into one gene set containing 512 differential genes (85 genes were shared by the two differential gene sets), which we called an integrated gene set (Supplementary Table S6) and used for subsequent analysis.
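The size of the integrated gene set follows from inclusion–exclusion on the two differential gene sets: genes counted in both sets are subtracted once. The short Python check below uses only the counts reported above; the function name is illustrative.

```python
def union_size(size_a, size_b, shared):
    """Size of the union of two sets given their sizes and overlap."""
    if shared > min(size_a, size_b):
        raise ValueError("overlap cannot exceed the smaller set")
    return size_a + size_b - shared


# SC289: 266 genes, SC291: 331 genes, 85 shared by both sets.
integrated = union_size(266, 331, 85)
```

With these counts the union contains 512 genes, matching the reported size of the integrated gene set.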
By comparison with the reference genome annotations in CAZy-base, TCDB, CYPED, PHI-base, and DFVF, pathogenicity-related genes were further identified among the 512 genes of the integrated gene set: 35 CAZyme genes, one membrane transport protein gene, 29 cytochrome P450 monooxygenase genes, 193 pathogen–host interaction protein genes, and 126 known fungal virulence factors (Supplementary Table S6). It is well understood that plant pathogens secrete effector proteins to facilitate the infection of their hosts [89,90,91]. By comparing the integrated gene set to the predicted secreted proteins of the reference genome, 30 genes were identified as encoding secreted proteins, among which three were further identified as effector proteins (Supplementary Table S6).

3.5. Pathogenicity Factors Potentially Related to Different Lesion Phenotypes

Eight B type isolates (SD278, HuB260, SX282, SX283, LN297, SC290, SC291, and SC293) were compared with the reference isolate YY187 (R type) to identify the potential pathogenicity factors related to different lesion phenotypes. In total, 13 differential genes were shared by all B type isolates (Table 2). Those genes were mainly located on chromosomes 3 and 11, with only one on Chr1 (Figure 5). Eight genes had hits in PHI-base with either “reduced virulence”, “unaffected pathogenicity”, or “loss of pathogenicity” annotations, five of which also had hits in DFVF annotations (Table 2). The single gene on Chr1 was also annotated in CYPED. One of the three genes on Chr3 was annotated as a CAZyme. On Chr11, two secreted proteins were found, and one of them (EVM0007819.1) was further predicted to be an effector protein (244 amino acids).

4. Discussion

Inoculation of D. arachidicola isolates on peanut seedlings showed that pathogenicity varied among the 41 isolates from different regions of China (Figure 1A). Isolates of the two lesion phenotypes, reticulation type (R type) and blotch type (B type), showed significant differences in pathogenicity (Figure 1B). Notably, the lesion phenotype exhibited no correlation with geographic location. Different lesion phenotypes can be found within the same region; for example, isolates SD277 and SD279 (R type) and SD278 (B type) all originated from Shandong province, China. Previous studies have reported that the adaptability and pathogenicity of fungal pathogens can be affected by plant cultivars and environmental factors such as temperature, light conditions, and humidity [34]. Pathogenic fungal species that infect a narrow range of plant species are said to show ‘host specificity’ [92] and have been extensively used to study pathogen–host interactions [93,94,95]. It has been shown that fungal pathogens utilize diverse life strategies to interact with host plants, while plants use immune systems to respond to pathogenic infection [12,96]. In order to colonize host plants, fungal pathogens either directly invade epidermal cells, or extend hyphae over, between, or through plant cells, which corresponds to the different lesion phenotypes in our study (Figure 1C). Given the limited expansion and higher invasion ability of B type isolates, they might employ a more effective strategy for interfering with PAMP (pathogen-associated molecular pattern)-triggered immunity (PTI) and for promoting effector-triggered susceptibility (ETS). However, the causes of the different lesion phenotypes of D. arachidicola, especially at the genetic level, need to be further elucidated.
Genome re-sequencing yielded a high mapping rate of clean reads to the reference, approximately 97–99% for most isolates and 92–94% for the remainder (Supplementary Table S3). Current bioinformatic pipelines usually discard unmapped reads, although some studies have shown that meaningful biological information (e.g., missing genes) can be recovered by exploring them further [97,98]. However, this has rarely been reported in research on fungal pathogens. Considering the high homology among D. arachidicola isolates at the intraspecific level, the lower mapping rates may reflect sequencing quality more than sequence differences. Nevertheless, comparing PacBio whole-genome assemblies of representative isolates with the reference could better explain the unmapped reads and warrants further investigation. Variant calling indicated homogeneous polymorphism at the intraspecific level of D. arachidicola. The 40 isolates showed similar numbers of SNPs and InDels; moreover, the proportions of variations located in introns, exons, and other genomic regions did not differ significantly among isolates. Genomic variations were located mainly in the upstream and downstream regions (59.31%) and exon regions (16.62%) of genes, a distribution similar to that reported for three pathotypes of the rust fungus Puccinia striiformis [99] but different from that of Phytophthora sojae, in which a high proportion of SNPs and InDels were located in intergenic regions [32]. Upstream and downstream regions have been studied mainly for their cis-regulatory elements, such as promoters and enhancers, which regulate gene expression and function [100,101].
Thus, most of the variations identified in our study were distributed in non-coding regions and might contribute to phenotypic diversity within the species by affecting the function of cis-regulatory elements. Variations in coding regions, especially exons, can directly affect the transcription and translation of proteins. The SNP-based phylogenetic relationships among isolates did not correlate with their geographical origins or virulence, suggesting more diverse evolution at the intraspecific level of the D. arachidicola population. Because the latitude of origin, peanut cultivar, and environmental factors can all affect the adaptability of D. arachidicola isolates, broader sample collection and further investigation of the related environmental factors will contribute to a better understanding of the evolution of D. arachidicola.
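The regional distribution of variations discussed above is, in essence, a proportion of annotated variants per genomic region category. The Python sketch below shows that calculation under the assumption that per-category counts are available from a variant annotation summary. The category names, the counts, and the function name `region_proportions` are invented for illustration and do not reproduce the study's figures.

```python
def region_proportions(counts):
    """Convert per-region variant counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no variants to summarize")
    return {region: 100.0 * n / total for region, n in counts.items()}


# Hypothetical variant counts per region category (illustrative only).
counts = {
    "upstream_downstream": 600,
    "exon": 150,
    "intron": 50,
    "intergenic": 200,
}

proportions = region_proportions(counts)
# Print categories from the most to the least represented.
for region, pct in sorted(proportions.items(), key=lambda kv: -kv[1]):
    print(f"{region}: {pct:.2f}%")
```

Computing the same proportions separately for each isolate would allow the between-isolate comparison mentioned in the text.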
By comparing the differential gene sets of all B type isolates, 13 shared differential genes were identified as pathogenicity factors potentially related to the different lesion phenotypes of D. arachidicola (Table 2). Among these genes, EVM0007100.1 had hits in both PHI-base and DFVF under the name GAS1, which has been reported to be essential for efficient colonization of the host surface [102] by affecting the formation of appressoria and penetration into the epidermal cell layer [103]. As a homolog of GAS1, EVM0007100.1 is annotated as a CAZyme with GH31 (glycoside hydrolase family 31) activity. Previous studies have found various CAZymes to be important virulence factors, among which glycoside hydrolases (GHs) are the most abundant class studied so far [104]. The GH families in phytopathogenic fungi and oomycetes reported to have potential roles in cell wall penetration, expansion of the fungus in the host, stealth pathogenicity, and related processes are concentrated in the GH5-7, GH10-13, GH18, GH45, and GH74 families [105]. Our results suggest that members of the GH31 family may also function as CAZymes involved in penetration and invasion by phytopathogenic fungi.
Earlier studies have reported that plant pathogens use secreted proteins termed ‘effectors’, a broad class of cytotoxic, virulence-promoting, or resistance-eliciting molecules released from pathogen cells, to facilitate infection by suppressing plant defense reactions and manipulating plant cell physiology [12,91]. The gene EVM0007819.1 identified in this work was predicted to encode an effector protein and was functionally annotated as necrosis- and ethylene-inducing protein 1 (Nep1), which belongs to the Nep1-like proteins (NLPs). NLPs are widespread among microbes such as bacteria, fungi, and oomycetes and act as toxin-like virulence factors that induce tissue necrosis [106,107]. Studies have shown that plant pathogens with necrotrophic or hemibiotrophic lifestyles generally express multiple NLPs during infection or during the transition from the biotrophic to the necrotrophic phase [108,109]. EVM0015380.1 was functionally annotated as Kpp6, a gene known to encode a mitogen-activated protein (MAP) kinase [110]. As fungal MAP kinases play essential roles in controlling key virulence factors, Kpp6 has been proposed to be crucial for penetration of the plant epidermis [110,111]. In addition, the genes EVM0000257.1, EVM0000544.1, EVM0002645.1, and EVM0013695.1 were annotated against PHI-base as putative transcription factors (TFs) [112], which are essential regulators of gene expression and play critical roles in signal transduction pathways and, in turn, in phenotypic evolution [113,114].

5. Conclusions

This work assembled the chromosome-scale genome of a local isolate of D. arachidicola as the first reference and re-sequenced the genomes of 40 D. arachidicola isolates from 26 geographical locations across China. Our analysis revealed that pathogenicity varied among the isolates and that, with respect to lesion phenotype, B type isolates were less virulent than R type isolates, although there were no correlations among pathogenicity, lesion phenotype, and geographical origin. Furthermore, we investigated the genomic variations of D. arachidicola and characterized its intraspecific polymorphism. Thirteen differential genes potentially related to lesion phenotype were identified, which should help clarify the correlation between genotype and phenotype underlying the pathogenic mechanism. Our study lays a genomic foundation for studying the adaptive evolution of D. arachidicola and provides a molecular background for further functional study of the genes potentially related to the different lesion phenotypes.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biology12030476/s1. Figure S1: Sampling sites of the 41 D. arachidicola isolates used in this study in China. The map was modified based on map GS(2019)1673; Figure S2: Heat map of chromosomal interaction based on Hi-C assembly. “LG” means Lachesis group; Figure S3: Category statistics for GO annotation of D. arachidicola genes; Figure S4: Category statistics for KOG annotation of D. arachidicola genes; Figure S5: Category statistics for KEGG annotation of D. arachidicola genes; Figure S6: The functional annotations of D. arachidicola. (A) The number of genes that have hits in different databases. (B) Venn diagram based on annotated genes in CAZy-base, PHI-base, DFVF, and genes of predicted secreted proteins; the numbers in parentheses represent the number of predicted effector proteins; Figure S7: Percentage of genes belonging to different categories. (A) Percentage of genes that have hits in different families of CAZyme. (B) Percentage of genes with different PHI phenotypes; Figure S8: Intraspecific comparison of variations of D. arachidicola isolates. Results visualized by TBtools v1.098769; Table S1: The gene numbers predicted by different software; Table S2: The statistics of gene annotation; Table S3: Summary of re-sequenced isolates, mapping, and variants relative to the reference genome; Table S4: All identified SNPs, InDels, and filtered high-quality SNPs; Table S5: The statistics of differential genes; Table S6: The integrated gene set that shows the pathogenicity-related genes. Table S7: The repository and accession numbers of the datasets presented in this study.

Author Contributions

Conceptualization, S.L. and Z.W.; formal analysis, S.L., M.G. and T.L.; funding acquisition, S.L. and Z.W.; investigation, S.L., M.G., S.S., W.F. and H.Z.; resources, S.L., M.G., W.F., X.C. and J.Z.; visualization, S.L.; supervision, S.L. and Z.W.; writing—original draft preparation, S.L.; writing—review and editing, S.L., Z.W. and T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Fund for Outstanding Young Scholars from Henan Academy of Agricultural Sciences (Grant No. 2022YQ05), Innovation and Creativity Project of Henan Academy of Agricultural Sciences (Grant No. 2020CX23), and the Major Science and Technology Project of Henan Province (Grant No. 201300111000).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The whole-genome project of D. arachidicola was deposited at DDBJ/ENA/GenBank under Bioproject PRJNA562378 and the accession numbers can be found in the article/Supplementary Table S7.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tomilin, B.A. Opredelitel’ Gribov Roda Mycosphaerella Johansen; ‘Nauka’ Publishing House: Leningrad, Russia, 1979.
  2. Pettit, R.E.; Philley, G.L.; Smith, D.H.; Taber, R.A. Peanut Web Blotch: II Symptoms and Host Range of Pathogen. Peanut Sci. 1986, 13, 27–30.
  3. Fu, J.F.; Wang, D.Z.; Zhou, R.J.; Yang, F.Y.; Su, W.N. Occurrence and epidemic dynamics of peanut web blotch disease in Liaoning Province. Chin. J. Oil Crop Sci. 2013, 35, 80–83.
  4. Xu, X.J.; Cui, F.G.; Shi, Y.M.; Xu, M.X.; Bi, G.J. Research on the peanut web blotch in China. Acta Phytopathol. Sin. 1995, 22, 70–74.
  5. Lancaster, S.H.; Jordan, D.L.; York, A.C.; Wilcut, J.W.; Brandenburg, R.L.; Monks, D.W. Interactions of Late-Season Morningglory (Ipomoea spp.) Management Practices in Peanut (Arachis hypogaea). Weed Technol. 2005, 19, 803–808.
  6. Aveskamp, M.M.; de Gruyter, J.; Woudenberg, J.H.C.; Verkley, G.J.M.; Crous, P.W. Highlights of the Didymellaceae: A polyphasic approach to characterise Phoma and related pleosporalean genera. Stud. Mycol. 2010, 65, 1–60.
  7. Xie, J.H.; Lin, Y.; Zang, C.Q.; Pei, X.; Liang, C.H. Peanut net spot disease: Selection of chemical control agents and compound synergistic formulas. Chin. Agric. Sci. Bull. 2021, 23, 101–105.
  8. Man-Lin, X.; Xia, Z.; Ju-Xiang, W.; Jing, Y.; Zhi-Qing, G.; Ying, L.; Hong-Feng, X.; De-Yun, T.; Ru-Jun, Z.; Yu-Cheng, C. Evaluation of peanut accessions resistance against Phoma arachidicola and relationship between disease-resistance and yield loss. Chin. J. Oil Crop Sci. 2021, 43, 731–735.
  9. Chen, Q.; Jiang, J.R.; Zhang, G.Z.; Cai, L.; Crous, P.W. Resolving the Phoma enigma. Stud. Mycol. 2015, 82, 137–217.
  10. Hou, L.; Groenewald, J.; Pfenning, L.; Yarden, O.; Crous, P.; Cai, L. The phoma-like dilemma. Stud. Mycol. 2020, 96, 309–396.
  11. Marasas, W.F.O.; Pauer, G.D.; Boerema, G.H. A serious leaf blotch disease of groundnuts (Arachis hypogaea L.) in southern Africa caused by Phoma arachidicola sp. nov. Phytophylactica 1974, 6, 195–202.
  12. Lo Presti, L.; Lanver, D.; Schweizer, G.; Tanaka, S.; Liang, L.; Tollot, M.; Zuccaro, A.; Reissmann, S.; Kahmann, R. Fungal Effectors and Plant Susceptibility. Annu. Rev. Plant Biol. 2015, 66, 513–545.
  13. Wu, B.; Cox, M.P. Comparative genomics reveals a core gene toolbox for lifestyle transitions in Hypocreales fungi. Environ. Microbiol. 2021, 23, 3251–3264.
  14. Amselem, J.; Cuomo, C.A.; van Kan, J.A.L.; Viaud, M.; Benito, E.P.; Couloux, A.; Coutinho, P.M.; de Vries, R.P.; Dyer, P.S.; Fillinger, S.; et al. Genomic Analysis of the Necrotrophic Fungal Pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genet. 2011, 7, e1002230.
  15. De Silva, N.I.; Lumyong, S.; Hyde, K.D.; Bulgakov, T.; Phillips, A.J.L.; Yan, J.Y. Mycosphere Essays 9: Defining biotrophs and hemibiotrophs. Mycosphere 2016, 7, 545–559.
  16. Leuchtmann, A.; Schardl, C. The Epichloë Endophytes of Grasses and the Symbiotic Continuum. Mycology 2005, 23, 475–503.
  17. Redman, R.S.; Freeman, S.; Clifton, D.R.; Morrel, J.; Brown, G.; Rodriguez, R.J. Biochemical Analysis of Plant Protection Afforded by a Nonpathogenic Endophytic Mutant of Colletotrichum magna. Plant Physiol. 1999, 119, 795–804.
  18. Redman, R.S.; Roossinck, M.J.; Maher, S.; Rodriguez, R.J. Field performance of cucurbit and tomato plants colonized with a nonpathogenic mutant of Colletotrichum magna (teleomorph: Glomerella magna; Jekins and Winstead). Symbiosis 2002, 32, 55–70.
  19. Hill, R.; Buggs, R.J.; Vu, D.T.; Gaya, E. Lifestyle Transitions in Fusarioid Fungi are Frequent and Lack Clear Genomic Signatures. Mol. Biol. Evol. 2022, 39, msac085.
  20. O’Connell, R.J.; Thon, M.R.; Hacquard, S.; Amyotte, S.G.; Kleemann, J.; Torres, M.F.; Damm, U.; Buiate, E.A.; Epstein, L.; Alkan, N.; et al. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses. Nat. Genet. 2012, 44, 1060–1065.
  21. Muszewska, A.; Stepniewska-Dziubinska, M.M.; Steczkiewicz, K.; Pawlowska, J.; Dziedzic, A.; Ginalski, K. Fungal lifestyle reflected in serine protease repertoire. Sci. Rep. 2017, 7, 9147.
  22. Atwell, S.; Corwin, J.A.; Soltis, N.E.; Subedy, A.; Denby, K.J.; Kliebenstein, D.J. Whole genome resequencing of Botrytis cinerea isolates identifies high levels of standing diversity. Front. Microbiol. 2015, 6, 996.
  23. King, R.; Urban, M.; Hammond-Kosack, M.C.U.; Hassani-Pak, K.; Hammond-Kosack, K.E. The completed genome sequence of the pathogenic ascomycete fungus Fusarium graminearum. BMC Genom. 2015, 16, 544.
  24. Landi, L.; Pollastro, S.; Rotolo, C.; Romanazzi, G.; Faretra, F.; Angelini, R.M.D.M. Draft Genomic Resources for the Brown Rot Fungal Pathogen Monilinia laxa. Mol. Plant-Microbe Interact. 2019, 33, 145–148.
  25. Hong, C.P.; Moon, S.; Yoo, S.-I.; Noh, J.-H.; Ko, H.-G.; Kim, H.A.; Ro, H.-S.; Cho, H.; Chung, J.-W.; Lee, H.-Y.; et al. Functional Analysis of a Novel ABL (Abnormal Browning Related to Light) Gene in Mycelial Brown Film Formation of Lentinula edodes. J. Fungi 2020, 6, 272.
  26. Meile, L.; Peter, J.; Puccetti, G.; Alassimone, J.; McDonald, B.A.; Sánchez-Vallet, A. Chromatin Dynamics Contribute to the Spatiotemporal Expression Pattern of Virulence Genes in a Fungal Plant Pathogen. mBio 2020, 11, e02343-20.
  27. Hudec, C.; Biessy, A.; Novinscak, A.; St-Onge, R.; Lamarre, S.; Blom, J.; Filion, M. Comparative Genomics of Potato Common Scab-Causing Streptomyces spp. Displaying Varying Virulence. Front. Microbiol. 2021, 12, 716522.
  28. Yu, C.; Diao, Y.; Lu, Q.; Zhao, J.; Cui, S.; Xiong, X.; Lu, A.; Zhang, X.; Liu, H. Comparative Genomics Reveals Evolutionary Traits, Mating Strategies, and Pathogenicity-Related Genes Variation of Botryosphaeriaceae. Front. Microbiol. 2022, 13, 800981.
  29. Marcet-Houben, M.; Villarino, M.; Vilanova, L.; De Cal, A.; van Kan, J.; Usall, J.; Gabaldón, T.; Torres, R. Comparative Genomics Used to Predict Virulence Factors and Metabolic Genes among Monilinia Species. J. Fungi 2021, 7, 464.
  30. Gramaje, D.; Berlanas, C.; Martínez-Diz, M.; Diaz-Losada, E.; Antonielli, L.; Beier, S.; Gorfer, M.; Schmoll, M.; Compant, S. Comparative Genomic Analysis of Dactylonectria torresensis Strains from Grapevine, Soil and Weed Highlights Potential Mechanisms in Pathogenicity and Endophytic Lifestyle. J. Fungi 2020, 6, 255.
  31. Miyauchi, S.; Kiss, E.; Kuo, A.; Drula, E.; Kohler, A.; Sánchez-García, M.; Morin, E.; Andreopoulos, B.; Barry, K.W.; Bonito, G.; et al. Large-scale genome sequencing of mycorrhizal fungi provides insights into the early evolution of symbiotic traits. Nat. Commun. 2020, 11, 5125.
  32. Zhang, X.; Liu, B.; Zou, F.; Shen, D.; Yin, Z.; Wang, R.; He, F.; Wang, Y.; Tyler, B.M.; Fan, W.; et al. Whole Genome Re-sequencing Reveals Natural Variation and Adaptive Evolution of Phytophthora sojae. Front. Microbiol. 2019, 10, 2792.
  33. Lee, J.-H.; Siddique, M.I.; Kwon, J.-K.; Kang, B.-C. Comparative Genomic Analysis Reveals Genetic Variation and Adaptive Evolution in the Pathogenicity-Related Genes of Phytophthora capsici. Front. Microbiol. 2021, 12, 694136.
  34. Guttman, D.S.; McHardy, A.C.; Schulze-Lefert, P. Microbial genome-enabled insights into plant–microorganism interactions. Nat. Rev. Genet. 2014, 15, 797–813.
  35. Zeng, Z.; Sun, H.; Vainio, E.J.; Raffaello, T.; Kovalchuk, A.; Morin, E.; Duplessis, S.; Asiegbu, F.O. Intraspecific comparative genomics of isolates of the Norway spruce pathogen (Heterobasidion parviporum) and identification of its potential virulence factors. BMC Genom. 2018, 19, 220.
  36. Zhang, X.; Xu, M.; Wu, J.; Dong, W.; Chen, D.; Wang, L.; Chi, Y. Draft Genome Sequence of Phoma arachidicola Wb2 Causing Peanut Web Blotch in China. Curr. Microbiol. 2019, 76, 200–206.
  37. Li, S.; Xue, X.; Gao, M.; Wang, N.; Cui, X.; Sang, S.; Fan, W.; Wang, Z. Genome Resource for Peanut Web Blotch Causal Agent Peyronellaea arachidicola Strain YY187. Plant Dis. 2021, 105, 1177–1178.
  38. Guo, X.Q.; Li, X.; Zhao, Z.Q.; Li, X.; Ju, Q.; Jiang, X.J.; Qu, M.J. The research of different fungicides on control effects against peanut leaf spot and yield increase to peanut. J. Peanut Sci. 2014, 43, 56–60.
  39. Koren, S.; Walenz, B.P.; Berlin, K.; Miller, J.R.; Bergman, N.H.; Phillippy, A.M. Canu: Scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017, 27, 722–736.
  40. Manni, M.; Berkeley, M.R.; Seppey, M.; Simão, F.A.; Zdobnov, E.M. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol. Biol. Evol. 2021, 38, 4647–4654.
  41. Manni, M.; Berkeley, M.R.; Seppey, M.; Zdobnov, E.M. BUSCO: Assessing Genomic Data Quality and Beyond. Curr. Protoc. 2021, 1, e323.
  42. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009, 25, 1754–1760.
  43. Servant, N.; Varoquaux, N.; Lajoie, B.R.; Viara, E.; Chen, C.-J.; Vert, J.-P.; Heard, E.; Dekker, J.; Barillot, E. HiC-Pro: An optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015, 16, 259.
  44. Burton, J.N.; Adey, A.; Patwardhan, R.P.; Qiu, R.; Kitzman, J.O.; Shendure, J. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 2013, 31, 1119–1125.
  45. Xu, Z.; Wang, H. LTR_FINDER: An efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007, 35, W265–W268.
  46. Han, Y.; Wessler, S.R. MITE-Hunter: A program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 2010, 38, e199.
  47. Price, A.L.; Jones, N.C.; Pevzner, P.A. De novo identification of repeat families in large genomes. Bioinformatics 2005, 21, i351–i358.
  48. Edgar, R.C.; Myers, E.W. PILER: Identification and classification of genomic repeats. Bioinformatics 2005, 21, i152–i158.
  49. Wicker, T.; Sabot, F.; Hua-Van, A.; Bennetzen, J.L.; Capy, P.; Chalhoub, B.; Flavell, A.; Leroy, P.; Morgante, M.; Panaud, O.; et al. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 2007, 8, 973–982.
  50. Jurka, J.; Kapitonov, V.V.; Pavlicek, A.; Klonowski, P.; Kohany, O.; Walichiewicz, J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 2005, 110, 462–467.
  51. Chen, N. Using Repeat Masker to Identify Repetitive Elements in Genomic Sequences. Curr. Protoc. Bioinform. 2004, 5, 4–10.
  52. Burge, C.; Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 1997, 268, 78–94.
  53. Stanke, M.; Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 2003, 19, ii215–ii225.
  54. Majoros, W.H.; Pertea, M.; Salzberg, S.L. TigrScan and GlimmerHMM: Two open source ab initio eukaryotic gene-finders. Bioinformatics 2004, 20, 2878–2879.
  55. Blanco, E.; Parra, G.; Guigó, R. Using geneid to Identify Genes. Curr. Protoc. Bioinform. 2007, 18, e56.
  56. Korf, I. Gene finding in novel genomes. BMC Bioinform. 2004, 5, 59.
  57. Keilwagen, J.; Wenk, M.; Erickson, J.L.; Schattat, M.H.; Grau, J.; Hartung, F. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 2016, 44, e89.
  58. Pertea, M.; Kim, D.; Pertea, G.M.; Leek, J.T.; Salzberg, S.L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protoc. 2016, 11, 1650–1667.
  59. Campbell, M.A.; Haas, B.J.; Hamilton, J.P.; Mount, S.M.; Buell, C.R. Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis. BMC Genom. 2006, 7, 327.
  60. Haas, B.J.; Salzberg, S.L.; Zhu, W.; Pertea, M.; Allen, J.E.; Orvis, J.; White, O.; Buell, C.R.; Wortman, J.R. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008, 9, R7.
  61. Lowe, T.M.; Eddy, S. tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucleic Acids Res. 1997, 25, 955–964.
  62. Nawrocki, E.P.; Eddy, S.R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 2013, 29, 2933–2935.
  63. Nawrocki, E.P.; Burge, S.W.; Bateman, A.; Daub, J.; Eberhardt, R.; Eddy, S.; Floden, E.; Gardner, P.P.; Jones, T.A.; Tate, J.; et al. Rfam 12.0: Updates to the RNA families database. Nucleic Acids Res. 2015, 43, D130–D137.
  64. Tatusov, R.L.; Galperin, M.Y.; Natale, D.A.; Koonin, E.V. The COG database: A tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 2000, 28, 33–36.
  65. Kanehisa, M.; Goto, S.; Kawashima, S.; Okuno, Y.; Hattori, M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004, 32, D277–D280.
  66. Boeckmann, B.; Bairoch, A.; Apweiler, R.; Blatter, M.; Esteicher, A.; Gasteiger, E. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003, 31, 365–370.
  67. Deng, Y.Y.; Li, J.Q.; Wu, S.F.; Zhu, Y.P.; Chen, Y.W.; He, F.C. Integrated nr database in protein annotation system and its localization. Comput. Eng. 2016, 32, 71–72.
  68. Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25, 3389–3402.
  69. Conesa, A.; Götz, S.; García-Gómez, J.M.; Terol, J.; Talón, M.; Robles, M. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005, 21, 3674–3676.
  70. Eddy, S.R. Profile hidden Markov models. Bioinformatics 1998, 14, 755–763.
  71. Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene ontology: Tool for the unification of biology. Nat. Genet. 2000, 25, 25–29.
  72. Finn, R.D.; Coggill, P.; Eberhardt, R.Y.; Eddy, S.R.; Mistry, J.; Mitchell, A.L.; Potter, S.C.; Punta, M.; Qureshi, M.; Sangrador-Vegas, A.; et al. The Pfam protein families database: Towards a more sustainable future. Nucleic Acids Res. 2016, 44, D279–D285.
  73. Saier, M.H.; Tran, C.V.; Barabote, R.D. TCDB: The Transporter Classification Database for membrane transport protein analyses and information. Nucleic Acids Res. 2006, 34, D181–D186.
  74. Fischer, M.; Knoll, M.; Sirim, D.; Wagner, F.; Funke, S.; Pleiss, J. The Cytochrome P450 Engineering Database: A navigation and prediction tool for the cytochrome P450 protein family. Bioinformatics 2007, 23, 2015–2017.
  75. Cantarel, B.L.; Coutinho, P.M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. The Carbohydrate-Active EnZymes database (CAZy): An expert resource for Glycogenomics. Nucleic Acids Res. 2009, 37, D233–D238.
  76. Winnenburg, R.; Baldwin, T.K.; Urban, M.; Rawlings, C.; Köhler, J.; Hammond-Kosack, K.E. PHI-base: A new database for pathogen host interactions. Nucleic Acids Res. 2006, 34, D459–D464.
  77. Lu, T.; Yao, B.; Zhang, C. DFVF: Database of fungal virulence factors. Database 2012, 2012, bas032.
  78. Petersen, T.N.; Brunak, S.; von Heijne, G.; Nielsen, H. SignalP 4.0: Discriminating signal peptides from transmembrane regions. Nat. Methods 2011, 8, 785–786.
  79. Krogh, A.; Larsson, B.; von Heijne, G.; Sonnhammer, E.L. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 2001, 305, 567–580.
  80. Sperschneider, J.; Gardiner, D.M.; Dodds, P.N.; Tini, F.; Covarelli, L.; Singh, K.B.; Manners, J.M.; Taylor, J.M. EffectorP: Predicting fungal effector proteins from secretomes using machine learning. New Phytol. 2015, 210, 743–761.
  81. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013, arXiv:1303.3997v2.
  82. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
  83. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303.
  84. Cingolani, P.; Platts, A.; Wang, L.L.; Coon, M.; Nguyen, T.; Wang, L.; Land, S.J.; Lu, X.; Ruden, D.M. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 2012, 6, 80–92.
  85. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
  86. Alexander, D.H.; Novembre, J.; Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009, 19, 1655–1664.
  87. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
  88. Chen, C.J.; Chen, H.; Zhang, Y.; Thomas, H.R.; Frank, M.H.; He, Y.H.; Xia, R. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol. Plant 2020, 13, 1194–1202.
  89. Gawehns, F.; Ma, L.; Bruning, O.; Houterman, P.M.; Boeren, S.; Cornelissen, B.J.C.; Rep, M.; Takken, F.L.W. The effector repertoire of Fusarium oxysporum determines the tomato xylem proteome composition following infection. Front. Plant Sci. 2015, 6, 967.
  90. Sánchez-Vallet, A.; Fouché, S.; Fudal, I.; Hartmann, F.E.; Soyer, J.L.; Tellier, A.; Croll, D. The Genome Biology of Effector Gene Evolution in Filamentous Plant Pathogens. Annu. Rev. Phytopathol. 2018, 56, 21–40.
  91. Jones, D.A.B.; Rozano, L.; Debler, J.W.; Mancera, R.L.; Moolhuijzen, P.M.; Hane, J.K. An automated and combinative method for the predictive ranking of candidate effector proteins of fungal plant pathogens. Sci. Rep. 2021, 11, 19731. [Google Scholar] [CrossRef]
  92. Li, J.; Cornelissen, B.; Rep, M. Host-specificity factors in plant pathogenic fungi. Fungal Genet. Biol. 2020, 144, 103447. [Google Scholar] [CrossRef]
  93. Daverdin, G.; Rouxel, T.; Gout, L.; Aubertot, J.-N.; Fudal, I.; Meyer, M.; Parlange, F.; Carpezat, J.; Balesdent, M.-H. Genome Structure and Reproductive Behaviour Influence the Evolutionary Potential of a Fungal Phytopathogen. PLOS Pathog. 2012, 8, e1003020. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  94. Larkan, N.J.; Lydiate, D.J.; Parkin, I.A.P.; Nelson, M.N.; Epp, D.J.; Cowling, W.A.; Rimmer, S.R.; Borhan, M.H. The B rassica napus blackleg resistance gene LepR3 encodes a receptor-like protein triggered by the L eptosphaeria maculans effector AVRLM 1. New Phytol. 2013, 197, 595–605. [Google Scholar] [CrossRef] [PubMed]
  95. Schmidt, S.M.; Lukasiewicz, J.; Farrer, R.; Dam, P.; Bertoldo, C.; Rep, M. Comparative genomics of Fusarium oxysporum f. sp. melonis reveals the secreted protein recognized by the Fom-2 resistance gene in melon. New Phytol. 2016, 209, 307–318.
  96. Jones, J.D.G.; Dangl, J.L. The plant immune system. Nature 2006, 444, 323–329.
  97. Gouin, A.; Legeai, F.; Nouhaud, P.; Whibley, A.; Simon, J.-C.; Lemaitre, C. Whole-genome re-sequencing of non-model organisms: Lessons from unmapped reads. Heredity 2014, 114, 494–501.
  98. Laine, V.N.; Gossmann, T.I.; van Oers, K.; Visser, M.E.; Groenen, M.A.M. Exploring the unmapped DNA and RNA reads in a songbird genome. BMC Genom. 2019, 20, 19.
  99. Kiran, K.; Rawal, H.C.; Dubey, H.; Jaswal, R.; Bhardwaj, S.C.; Prasad, P.; Pal, D.; Devanna, B.N.; Sharma, T.R. Dissection of genomic features and variations of three pathotypes of Puccinia striiformis through whole genome sequencing. Sci. Rep. 2017, 7, 42419.
  100. Wittkopp, P.J.; Kalay, G. Cis-regulatory elements: Molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 2012, 13, 59–69.
  101. Andersson, R.; Sandelin, A. Determinants of enhancer and promoter activities of regulatory elements. Nat. Rev. Genet. 2020, 21, 71–87.
  102. Caracuel, Z.; Martínez-Rocha, A.L.; Di Pietro, A.; Madrid, M.P.; Roncero, M.I.G. Fusarium oxysporum gas1 Encodes a Putative β-1,3-Glucanosyltransferase Required for Virulence on Tomato Plants. Mol. Plant-Microbe Interact. 2005, 18, 1140–1147.
  103. Schirawski, J.; Böhnert, H.U.; Steinberg, G.; Snetselaar, K.; Adamikowa, L.; Kahmann, R. Endoplasmic Reticulum Glucosidase II Is Required for Pathogenicity of Ustilago maydis. Plant Cell 2005, 17, 3532–3543.
  104. Zhao, Z.; Liu, H.; Wang, C.; Xu, J.-R. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi. BMC Genom. 2013, 14, 274.
  105. Rafiei, V.; Vélëz, H.; Tzelepis, G. The Role of Glycoside Hydrolases in Phytopathogenic Fungi and Oomycetes Virulence. Int. J. Mol. Sci. 2021, 22, 9359.
  106. Pemberton, C.L.; Salmond, G.P.C. The Nep1-like proteins-a growing family of microbial elicitors of plant necrosis. Mol. Plant Pathol. 2004, 5, 353–359.
  107. Qutob, D.; Kemmerling, B.; Brunner, F.; Küfner, I.; Engelhardt, S.; Gust, A.A.; Luberacki, B.; Seitz, H.U.; Stahl, D.; Rauhut, T.; et al. Phytotoxicity and Innate Immune Responses Induced by Nep1-Like Proteins. Plant Cell 2006, 18, 3721–3744.
  108. Seidl, M.F.; Van Den Ackerveken, G. Activity and Phylogenetics of the Broadly Occurring Family of Microbial Nep1-like Proteins. Annu. Rev. Phytopathol. 2019, 57, 367–386.
  109. Pirc, K.; Hodnik, V.; Snoj, T.; Lenarčič, T.; Caserman, S.; Podobnik, M.; Böhm, H.; Albert, I.; Kotar, A.; Plavec, J.; et al. Nep1-like proteins as a target for plant pathogen control. PLOS Pathog. 2021, 17, e1009477.
  110. Brachmann, A.; Schirawski, J.; Müller, P.; Kahmann, R. An unusual MAP kinase is required for efficient penetration of the plant surface by Ustilago maydis. EMBO J. 2003, 22, 2199–2210.
  111. Román, E.; Arana, D.M.; Nombela, C.; Monge, R.A.; Pla, J. MAP kinase pathways as regulators of fungal virulence. Trends Microbiol. 2007, 15, 181–190.
  112. Son, H.; Seo, Y.-S.; Min, K.; Park, A.R.; Lee, J.; Jin, J.-M.; Lin, Y.; Cao, P.; Hong, S.-Y.; Kim, E.-K.; et al. A Phenome-Based Functional Analysis of Transcription Factors in the Cereal Head Blight Fungus, Fusarium graminearum. PLOS Pathog. 2011, 7, e1002310.
  113. Wagner, G.P.; Lynch, V.J. The gene regulatory logic of transcription factor evolution. Trends Ecol. Evol. 2008, 23, 377–385.
  114. Shelest, E. Transcription Factors in Fungi: TFome Dynamics, Three Major Families, and Dual-Specificity TFs. Front. Genet. 2017, 8, 53.
Figure 1. Pathogenicity assays and lesion phenotypes of 41 D. arachidicola isolates. (A) Disease index of peanut seedlings infected by each of the 41 D. arachidicola isolates at 14 dpi; CON denotes non-infected peanut seedlings. Letters on the bars denote significant differences in disease index at the 0.05 level; error bars denote the standard error of the mean. (B) Average disease index of peanut seedlings infected by isolates of different lesion phenotypes at 14 dpi. “**” denotes a significant difference at the 0.05 level; error bars denote the standard error of the mean. (C) Adaxial (top) and abaxial (bottom) leaf surfaces showing the different lesion phenotypes.
Figure 2. Circos map of the genome of D. arachidicola isolate YY187. Circles from inside to outside show (a) the chromosomes of the D. arachidicola genome (scale marks in Mb); (b) gene density on each chromosome, color-coded from blue (low) to yellow (high); and (c) GC content, color-coded from yellow (low) to green (high). The map was drawn with Advanced Circos in TBtools v1.098769.
Figure 3. Genome-wide distribution and types of variations (identified by comparison with the reference genome of isolate YY187) among all D. arachidicola isolates. (A) Percentage distribution of SNPs and InDels across genomic regions. Each bar shows the average percentage of variations in a given genomic region across all isolates; error bars show the standard error of the mean. (B) Proportion of the different variation types in the CDS regions of D. arachidicola isolates.
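The bars and error bars described for Figure 3A (and for the disease index in Figure 1) are a mean with its standard error across isolates. A minimal sketch of that calculation follows, assuming the sample standard deviation is used; the function name `mean_and_sem` and the input values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers.

    Assumption: SEM = sample standard deviation / sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical per-isolate percentages of SNPs falling in one genomic region.
toy_percentages = [10.0, 12.0, 14.0]
toy_mean, toy_sem = mean_and_sem(toy_percentages)
print(f"mean = {toy_mean:.2f}%, SEM = {toy_sem:.2f}%")
```

In a plot, `toy_mean` would set the bar height and `toy_sem` the half-length of the error bar for that genomic region.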
Figure 4. Phylogenetic analysis of the D. arachidicola population based on high-quality SNPs. SNPs of 40 isolates were called with GATK against the genome of isolate YY187; because YY187 served as the reference, it is absent from the tree. The maximum likelihood (ML) tree was constructed in MEGA X with 1000 bootstrap replicates and then visualized and edited in FigTree v1.4.4. Bootstrap values are shown above the branches. The sub-populations of D. arachidicola are shown in different colors. Isolates of the B type are labeled in blue font.
Figure 5. Chromosomal distribution of the 13 differential genes. Green lines represent gene distribution on each chromosome. G1–G13 denote the 13 genes, whose chromosomal locations are marked with blue lines. The figure was drawn with TBtools v1.098769.
Table 1. Summary of genome assembly and annotation in isolate YY187.
Feature | Value
Scaffold number | 26
Total scaffold length (Mb) | 47.35
Scaffold N50 (Mb) | 2.17
Scaffold N90 (Mb) | 1.39
Maximum scaffold length (Mb) | 6.56
GC content (%) | 56.37
Chromosomal located sequence length (Mb) | 36.03
Number of chromosomes | 18
Percentage of repeats (%) | 9.89
Number of predicted genes | 16,629
Average gene length (Kb) | 1.99
Average CDS number | 2.47
Number of non-coding RNAs | tRNA (286), rRNA (81), other ncRNA (95)
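Table 1 reports scaffold N50 and N90 without defining them. Under the usual definition, Nxx is the length of the scaffold at which the cumulative length, counted from the longest scaffold downward, first reaches xx% of the total assembly length. A minimal sketch of that calculation follows; the function name `nxx` and the scaffold lengths are hypothetical, not the YY187 assembly.

```python
def nxx(lengths, fraction):
    """Return the Nxx statistic for a list of scaffold lengths.

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; the result is the length of the scaffold at which the
    running total first reaches `fraction` of the assembly size.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    # Only reachable through floating-point rounding at fraction == 1.
    return min(lengths)


# Hypothetical scaffold lengths in Mb (illustrative only).
toy_scaffolds = [7, 5, 4, 3, 1]
print("N50 =", nxx(toy_scaffolds, 0.5))
print("N90 =", nxx(toy_scaffolds, 0.9))
```

Because N90 requires covering more of the assembly with progressively shorter scaffolds, it can never exceed N50, which matches the ordering of the two values in Table 1.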
Table 2. Summary of pathogenicity factors potentially related to different lesion phenotypes.
Gene ID | Brief Description | GO ID | Pfam Annotation
EVM0000257.1 a | - | GO:0000981; GO:0003677; GO:0005634; GO:0006351; GO:0006355; GO:0008270 | Fungal Zn(2)-Cys(6) binuclear cluster domain
EVM0000544.1 a | Hypothetical protein | GO:0003700; GO:0008270 | AAA domain
EVM0002645.1 a, b | Carminomycin 4-O-methyltransferase | GO:0008171 | O-methyltransferase domain
EVM0002993.1 | - | GO:0003676 | Zinc-finger double domain
EVM0004472.1 a, b | 3-oxoacyl-[acyl-carrier-protein] reductase | GO:0004316; GO:0006633; GO:0051287; GO:0102131; GO:0102132 | Enoyl-(Acyl carrier protein) reductase
EVM0007100.1 a, b, c | Alpha-glucosidase | GO:0004553; GO:0005975; GO:0030246 | Glycosyl hydrolases family 31
EVM0007819.1 a, b, f | NPP1 family protein | - | Necrosis inducing protein (NPP1)
EVM0009176.1 | Isopenicillin N synthase family oxygenase | GO:0016491; GO:0046872; GO:0055114 | non-haem dioxygenase in morphine synthesis N-terminal
EVM0009252.1 e | Agmatinase | GO:0016813; GO:0046872 | Arginase family
EVM0011493.1 | Hypothetical protein | - | Heterokaryon incompatibility protein Het-C
EVM0012163.1 | Hypothetical protein | GO:0004843; GO:0006511; GO:0016579; GO:0036459 | Ubiquitin carboxyl-terminal hydrolase
EVM0013695.1 a | AAA family ATPase | GO:0005524 | ATPase family associated with various cellular activities (AAA)
EVM0015380.1 a, b, d | Hypothetical protein | GO:0004672; GO:0005524; GO:0006468 | Protein kinase domain
The letters a, b, c, and d after a gene ID denote proteins with hits in PHI-base, DFVF, CAZy-base, and CYPED, respectively; e denotes a secreted protein; f denotes an effector protein.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Li, S.; Wang, Z.; Gao, M.; Li, T.; Cui, X.; Zu, J.; Sang, S.; Fan, W.; Zhang, H. Intraspecific Comparative Analysis Reveals Genomic Variation of Didymella arachidicola and Pathogenicity Factors Potentially Related to Lesion Phenotype. Biology 2023, 12, 476. https://doi.org/10.3390/biology12030476
