Article

Genetic Diversity and Genome-Wide Association Study of Pleurotus pulmonarius Germplasm

1
Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China
2
State Key Laboratory of Efficient Utilization of Arid and Semi-Arid Arable Land in Northern China, Beijing 100081, China
3
Lin’an District Agricultural and Forestry Technology Extension Center, Hangzhou 311300, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Agriculture 2024, 14(11), 2023; https://doi.org/10.3390/agriculture14112023
Submission received: 10 September 2024 / Revised: 4 November 2024 / Accepted: 8 November 2024 / Published: 11 November 2024
(This article belongs to the Special Issue Genetics and Breeding of Edible Mushroom)

Abstract

Pleurotus pulmonarius is prized by consumers for its distinct flavor, strong aroma, and dense, crispy texture. Although China has extensive germplasm resources for P. pulmonarius, only a limited number of cultivars are commercially available. A comprehensive evaluation and detailed analysis of P. pulmonarius germplasm, alongside the exploration of superior germplasm resources, are essential for developing new varieties. In this study, we resequenced the genomes of 47 P. pulmonarius strains collected nationwide, identifying a total of 4,430,948 single nucleotide polymorphism (SNP) loci. After filtering based on minor allele frequency and data integrity, 181,731 high-quality SNP markers were retained. Phylogenetic analysis grouped the strains into six clusters, with strains from similar geographical regions clustering together. Most CBS strains formed a single cluster; cultivated varieties exhibited higher genetic similarity, whereas wild strains displayed greater diversity. Principal component analysis (PCA) and population structure analyses, using the same SNP markers, corroborated the phylogenetic findings. DNA fingerprinting, derived from 369 core SNPs, further underscored the genetic diversity among strains. Significant morphological variation was observed, with strains in groups ZP, CBS, and WHLJ exhibiting notably higher yields and cap widths compared to other groups. Correlation analysis revealed associations among various phenotypes, while genome-wide association study (GWAS) identified multiple SNP markers within candidate genes linked to agronomic traits, most of which were controlled by multiple genes. This research offers a molecular-level characterization and evaluation of P. pulmonarius germplasm resources, providing a scientific basis for enriching available germplasm and advancing breeding materials.

1. Introduction

Pleurotus pulmonarius, a member of the class Agaricomycetes and subclass Agaricomycetidae, is one of the main species within the oyster mushroom group. According to the China Edible Fungi Association, P. pulmonarius production in China reached 636,700 tons in 2022. China has abundant P. pulmonarius germplasm resources, with wild populations found in regions such as Jiangsu, Zhejiang, Guangdong, Fujian, Henan, and Guangxi. The successful cultivation of P. pulmonarius was first achieved in Luoyuan County, Fuzhou City, in 1998, leading to its rapid and widespread adoption thereafter [1]. Known for its high nutritional value, P. pulmonarius is rich in protein, amino acids, sugars, unsaturated fatty acids, vitamins, folic acid, and essential trace elements. Its distinct aroma, firm yet tender texture, and mild flavor make it highly desirable to consumers.
Studying and assessing the genetic diversity of P. pulmonarius germplasm is essential for breeding applications, providing a foundation for identifying superior germplasm and selecting parental lines with optimal genetic backgrounds. Mating type analysis of 30 P. pulmonarius cultivars revealed only four A factors and three B factors, indicating high genetic similarity among cultivars. The A factor controls nuclear binding and binucleation, while the B factor regulates hyphal fusion and nuclear migration [2]. Clonal propagation through tissue culture is commonly practiced, occasionally leading to the misidentification of distinct strains under the same name. Therefore, accurate variety identification and genetic relationship analysis between cultivated and wild strains are crucial for ensuring high-quality production and advancing scientific research. With the rapid development of molecular biology, molecular markers have become invaluable for analyzing interspecies genetic differences. In addition to traditional morphological identification and antagonistic reactions, molecular markers such as RFLP, RAPD, ISSR, AFLP, and SNPs are widely used to analyze the genetic diversity and phylogenetic relationships of P. pulmonarius germplasm [3]. However, the earlier marker systems (RFLP, RAPD, ISSR, and AFLP) often face limitations in marker quantity, reproducibility, and stability. In contrast, single-nucleotide polymorphisms (SNPs) are abundant, stable, and suitable for automated, high-throughput applications, making them the most widely used genetic markers today. As primary genotyping markers, SNPs are extensively applied in identifying edible fungi and analyzing genetic diversity across strains, including species like Lentinus edodes [4,5,6], Flammulina filiformis [7,8], and Agaricus bisporus [9]. SNPs are also critical in analyzing P. pulmonarius germplasm diversity, identifying hybrids, and constructing genetic linkage maps.
Genome resequencing enables the acquisition of SNP markers evenly distributed across the genome [10]. For example, SNP markers have been used to identify laccase genes in Lentinus edodes [6]. Okuda et al. constructed a genetic linkage map for P. pulmonarius using 150 single-spore hybrid populations and 300 AFLP markers along with two mating-type factors and sporeless traits. This map comprises 12 linkage groups, covering a total length of 971 cM [11]. Vidal-Diez et al. recently assembled and annotated high-quality genomes of P. pulmonarius strains ss2 and ss5. Each genome consists of 23 scaffolds, with sizes ranging from 5.55 Mb to 11 kb (ss2) and 5.06 Mb to 21 kb (ss5), and N50 values of 3.2 Mb and 3.4 Mb, respectively, providing valuable reference genomic data for developing genome-wide SNP markers [12].
Genome-wide association study (GWAS) is commonly employed to identify genes associated with agronomic traits across the entire genome [13,14,15,16]. However, applications of GWASs in edible fungi remain limited. Li et al. used 297 genome-wide markers to genotype 89 cultivated varieties of L. edodes. Among these, 43 markers were significantly associated with four phenotypic traits, leading to the identification of 97 nearby genes and establishing a foundation for molecular breeding in L. edodes [17]. Corner et al. classified the genus Pleurotus into three categories based on hyphal types [18]. Morphological characteristics in edible fungi are highly susceptible to environmental influences [19]. Using a GWAS, Yu identified that DNA damage repair and protein translation processes may play roles in L. edodes responses to cadmium (Cd) stress, providing valuable genomic and population data for functional genomics and artificial breeding research [20]. Zhang applied bulked-segregant analysis (BSA) and an extreme-phenotype GWAS (XP-GWAS) to map cap color traits in Pleurotus cornucopiae and identified the tyrosinase-encoding gene PcTYR through comparative transcriptome analysis [21]. Traditional QTL analysis primarily relies on genetic diversity among parent populations, and its detection efficiency varies between populations. As it often involves multiple genes, accurately pinpointing candidate genes is challenging. In contrast, GWAS enhances allele diversity and improves mapping accuracy. Advances in sequencing technology and mutation analysis tools have significantly increased marker density, while the incorporation of mixed linear models with random and fixed effects has substantially reduced false-positive results.
In this study, the genomes of 47 P. pulmonarius strains collected from various regions across the country were resequenced to detect genome-wide genetic polymorphisms. SNP markers were screened to construct an SNP fingerprint for the strains. Using these genome-wide SNP markers, a phylogenetic tree was developed to analyze the genetic relationships among the strains. Additional population genetic analyses, including population structure, principal component analysis, and linkage disequilibrium, were performed. Fruiting body-related traits of the strains were measured, and the phenotypic data were statistically analyzed. Finally, GWAS between SNP markers and phenotypic traits was conducted to identify SNP markers with significant influence on the traits. This study provides a theoretical foundation for future screening of superior germplasms, gene mining for trait variation, and targeted breeding efforts.

2. Materials and Methods

2.1. Experimental Materials

The tested strains were all stored in the China Center for Mushroom Spawn Standards and Control (CCMSSC) (Table 1).

2.2. DNA Extraction and Sequencing

Genomic DNA from the tested strains was extracted from mycelia grown on PDA (Difco™ Potato Dextrose Agar) plates using the cetyltrimethylammonium bromide (CTAB) method [22]. DNA concentration and purity were assessed with a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Sequencing libraries of 350 bp fragments were constructed and sequenced on the Illumina/BGI platform (Beijing BioMarker Technology Co., Ltd., Beijing, China), generating 150 bp paired-end reads.

2.3. Identification and Screening of SNP

Single nucleotide polymorphisms (SNPs) were primarily detected using the GATK software (v4.0) toolkit [23]. Redundant reads were filtered with samtools (v1.9) based on localization results of clean reads aligned to the reference genome to ensure accuracy [24]. GATK’s HaplotypeCaller algorithm was employed to detect SNP and InDel mutations, generating gVCF files for each sample, followed by joint genotyping across groups. Rigorous filtering was applied to ensure reliability. The main filtering parameters included using vcfutils.pl in bcftools to exclude SNPs within 5 bp of an InDel and adjacent InDels within 10 bp; limiting variations to a maximum of two in any 5 bp window; and excluding variants with a Phred quality score (QUAL) below 30, a quality-to-depth ratio (QD) below 2.0, a mapping quality (MQ) below 40, and a Fisher Strand value (FS) above 60. Default GATK parameters were used for additional filtering. These steps ensured the accurate and reliable detection of SNPs and InDels.
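The four site-level thresholds listed above can be expressed as a single predicate. The following is a minimal sketch, not the authors' pipeline: the `Variant` class, its field names, and the example records are hypothetical, and the InDel-proximity and window-based filters are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Site-level annotations named in the text (hypothetical container)."""
    qual: float  # Phred-scaled variant quality (QUAL)
    qd: float    # quality-to-depth ratio (QD)
    mq: float    # mapping quality (MQ)
    fs: float    # Fisher Strand value (FS)


def passes_hard_filters(v: Variant) -> bool:
    """Keep a variant unless QUAL < 30, QD < 2.0, MQ < 40, or FS > 60."""
    return v.qual >= 30 and v.qd >= 2.0 and v.mq >= 40 and v.fs <= 60


# Hypothetical records for illustration only (not data from the study).
good = Variant(qual=812.4, qd=24.1, mq=60.0, fs=1.3)
strand_biased = Variant(qual=812.4, qd=24.1, mq=60.0, fs=75.0)

print(passes_hard_filters(good))           # True
print(passes_hard_filters(strand_biased))  # False
```

Values exactly at a threshold are retained in this sketch, following the wording "below" and "above" in the text.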

2.4. Population Genetic Analysis and Construction of Fingerprints

In the analysis, all selected high-quality SNP markers were used for population genetic analysis. MEGA X software [25] was applied to construct phylogenetic trees for each sample using the neighbor-joining method with the p-distance calculation model and 1000 bootstrap replications. Population genetic indicators were calculated using VCFtools (v0.1.15) [26]. Admixture software assessed population structure [27], while EIGENSOFT [28] performed principal component analysis to cluster the samples. Genetic diversity analysis was conducted using PLINK software [29]. A DNA fingerprint was created to identify markers suitable for genetic typing based on specific criteria: markers were evenly distributed across the genome, exhibited 100% completeness (no missing data), had a minor allele frequency (MAF) above 20%, a polymorphic information content (PIC) above 0.35, conformed to Hardy–Weinberg equilibrium (p-value > 0.01), and showed no nearby mutations within 100 bp of the marker [30].
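The p-distance underlying the neighbor-joining tree is the proportion of compared sites at which two strains differ. A minimal sketch follows, assuming one base call per strain per SNP site and skipping sites where either call is missing; both the input representation and the missing-data handling are assumptions, and the example sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of compared sites at which two strains differ.

    Sites where either call is missing (None) are skipped; this
    pairwise-deletion choice is an assumption of the sketch.
    """
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no sites in common")
    return differing / compared


# Hypothetical calls at eight SNP sites (not data from the study).
strain_a = ["A", "C", "G", "T", "A", None, "G", "C"]
strain_b = ["A", "T", "G", "T", "G", "C", "G", "C"]

# Two of the seven compared sites differ.
print(p_distance(strain_a, strain_b))
```

A matrix of such pairwise distances is the input from which a neighbor-joining algorithm builds the tree.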

2.5. Cultivation Test and Trait Screening

The 47 tested strains were cultivated in polycarbonate plastic bottles (diameter: 70 mm, height: 90 mm, volume: 280 mL) filled with 180 g (wet weight) of substrate. The substrate consisted of cottonseed shell (94%), wheat bran (5%), and gypsum (1%) with approximately 65% water content. The bottles were sterilized for 2 h at 0.15 MPa and 126 °C. Six replicate bottles were inoculated for each strain. Once fully colonized, the bottles were transferred to a fruiting room maintained at 16 °C to 18 °C, with 80% to 90% relative humidity and a 12 h photoperiod (300 to 350 lx). CO₂ concentration was kept below 1000 ppm through ventilation.
Phenotypic data were collected for various agronomic traits, including the period from inoculation to primordia generation (PPG) and the period from inoculation to harvesting (PIH) for each bottle. Measurements of mature fruiting bodies included cap length (CL), cap width (CW), stipe length (SL), and stipe diameter (SD). Yield of fruiting bodies per bottle (YFB) was also recorded. All strains were cultivated under the same climate-controlled conditions. Broad-sense heritability ($H^2$) was calculated across tester lines as $H^2 = \sigma_G^2 / (\sigma_G^2 + \sigma_e^2 / n)$, where $\sigma_G^2$ represents the genotypic variance, $\sigma_e^2$ represents the residual variance, and $n$ is the number of testers (indicate heterokaryon set for which this was the case) ($n = 42$). The phenotypic data were recorded individually and analyzed statistically. One-way analysis of variance and correlation analysis were performed using SPSS software (v27.0) [31]. The best linear unbiased prediction (BLUP) [32] method was used to estimate trait values for association analysis. Descriptive statistics were calculated for each phenotype, including the number of samples after removing missing values (SamNum), mean (Mean), standard deviation (SD), median (Median), minimum (Min), maximum (Max), range (Range), and coefficient of variation (CV). This comprehensive approach was designed to assess the growth and phenotypic characteristics of P. pulmonarius strains under controlled cultivation conditions, enabling a detailed analysis of mushroom production traits.
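As a worked illustration of the heritability formula (genotypic variance divided by the sum of genotypic variance and residual variance over n) and of the coefficient of variation, the sketch below uses hypothetical variance components and trait values. Estimating the variance components themselves is not shown, and the use of the sample standard deviation for CV is an assumption.

```python
import statistics


def broad_sense_heritability(var_g, var_e, n):
    """H^2 = var_g / (var_g + var_e / n), as defined in the text."""
    if n <= 0:
        raise ValueError("n must be positive")
    return var_g / (var_g + var_e / n)


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical variance components (not estimates from the study),
# with n = 42 as stated in the text.
h2 = broad_sense_heritability(var_g=4.0, var_e=84.0, n=42)
print(round(h2, 3))  # 4 / (4 + 84/42) = 4 / 6 -> 0.667

# Hypothetical trait values in arbitrary units.
print(round(coefficient_of_variation([20.0, 30.0, 40.0]), 2))  # 33.33
```

Because the residual variance is divided by n, the same variance components yield a higher heritability as n grows.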

2.6. Genome-Wide Association Study

SNP data were filtered based on a minor allele frequency (MAF ≥ 0.05) and data integrity (INT ≥ 0.8). Phenotypic data were used to construct frequency distribution histograms. GEMMA software [33] employing the linear mixed model (LMM) was used for GWAS, represented by the model:
$$y = W\alpha + x\beta + \mu + e$$
The analysis generated p-values, which were used to create Manhattan and QQ plots. A threshold was set by dividing the number of effective markers by quality control criteria, with regions meeting −log10(p) ≥ 5 designated as candidate intervals. Functional annotation was applied to genes identified within these candidate regions to clarify their potential biological roles. All experimental steps were completed in 2023.
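The candidate-selection step can be sketched as a conversion of each p-value to −log10(p) followed by a comparison against the cutoff of 5. The function names, SNP identifiers, and p-values below are hypothetical; the sketch does not reproduce GEMMA output parsing or the grouping of significant SNPs into intervals.

```python
import math

THRESHOLD = 5.0  # -log10(p) cutoff stated in the text


def neg_log10(p):
    """Return -log10(p) for a p-value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p)


def significant_snps(pvalues, threshold=THRESHOLD):
    """Return SNP ids with -log10(p) >= threshold, most significant first.

    `pvalues` maps a SNP id to its association p-value.
    """
    hits = []
    for snp, p in pvalues.items():
        score = neg_log10(p)
        if score >= threshold:
            hits.append((score, snp))
    return [snp for score, snp in sorted(hits, reverse=True)]


# Hypothetical association results (not values from the study).
example = {"snp_a": 3.2e-7, "snp_b": 4.0e-4, "snp_c": 8.0e-6}
print(significant_snps(example))  # ['snp_a', 'snp_c']
```

A cutoff of 5 on this scale corresponds to p ≤ 1 × 10⁻⁵.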

3. Results

3.1. Quality Statistics of Sequencing Data and Discovery of SNPs

Whole-genome resequencing of P. pulmonarius strains yielded clean reads ranging from 6,119,174 to 11,448,795, with an average of 7,998,114 reads per strain. The Q30 quality score ranged from 88.4% to 95.81%, with an average of 92.66%. GC content varied from 43.58% to 50.35%, averaging 47.33%. In total, 112.15 Gbp of clean data was generated across the 47 strains, with an alignment rate to the reference genome of 75.02%, an average coverage depth of 41×, and genome coverage of 89.84% (at least one base covered). These metrics satisfied the requirements for sequence analysis.
Using the GATK toolkit, 4,430,948 SNPs were detected between the samples and the reference genome (NCBI: ASM1298053v1). The largest number of SNPs was observed between cultivated varieties ZP1 and ZP13 (234,642 SNPs), while the smallest number was observed between ZP2 and ZP12 (2151 SNPs). Analysis of SNP distribution across the genome showed that 65.51% of SNPs were located within gene-coding regions, 13.97% were within 5 kb upstream of genes, and 5.95% were in intronic regions. Synonymous coding mutations accounted for 87.13% of the total, whereas nonsynonymous coding mutations constituted 12.69%. Other types of mutations, such as SNPs within 5 kb downstream of genes, accounted for 10.59% (Figure 1).

3.2. Population Genetic Analysis Based on Genome-Wide SNP Markers

3.2.1. Phylogenetic Relationship

After filtering, a total of 181,731 high-quality SNP markers were obtained. Phylogenetic analysis revealed distinct clusters among the tested P. pulmonarius strains. Cultivated varieties (ZP8, ZP11, ZP10, ZP7, ZP3, ZP6, ZP4, ZP12, ZP2, ZP9, and CBS3) grouped closely within Cluster I, indicating high genetic similarity among cultivars (Figure 2). In contrast, two other cultivated varieties, ZP5 and ZP13, clustered with a wild strain from Yunnan in a separate branch within Cluster I, suggesting that these cultivars may have been domesticated from wild strains originating in Yunnan. Most strains collected from CBS clustered in Cluster II, except for wild strain WSC5 from Sichuan and cultivated strain ZP1, which also fell into this cluster but on different branches.
The wild strains exhibited relatively high genetic diversity, with strains from similar geographical regions clustering together. For instance, wild strains from Liaoning, Jilin, and Heilongjiang grouped in Cluster III, while those from Sichuan were predominantly found in Clusters IV and V, and those from Hubei in Cluster VI.
PCA illustrated the clustering pattern of the samples, confirming the results of the phylogenetic analysis (Figure 3, Table S1). The two-dimensional plot of PC1 and PC3, as well as the three-dimensional plot of PC1, PC2, and PC3, depicted the spatial distribution of each strain based on genetic variation. PC1 accounted for 36.58% of the variation, PC2 explained 12%, and PC3 accounted for 6.52%. The analysis revealed four distinct clusters among the strains. With the exception of ZP1 and ZP5, most strains and CBS3 formed a cohesive group characterized by PC1 values ranging from 0.2125 to 0.2122, closely matching the branches observed in the phylogenetic tree. Strains such as WYN8, WYN7, WYN3, WYN4, and WYN2 exhibited PC1 values between 0.1507 and 0.1368, while ZP5, ZP13, WYN6, ZP1, WSC5, WYN5, WYN1, and WSX ranged from 0.0505 to 0.0491, aligning closely with branches in Clusters I and II in the phylogenetic tree.
Wild strains, represented by Clusters III, IV, V, and VI from the phylogenetic tree, showed a broader distribution range of PC1, spanning from −0.1236 to −0.1385. The PCA results closely aligned with the phylogenetic tree, reflecting high genetic similarity among cultivated varieties and significant variability and divergence among wild strains.

3.2.2. Genetic Diversity Analysis

Polymorphic markers among strains within each group exhibited average polymorphic information content (PIC) ranging from 0.226 to 0.327 (Table 2). Among domestic wild strains, the average minor allele frequency (MAF) of Sichuan wild strains was notably low at 0.202, with predominant alleles more widely represented. The average Nei diversity index for this group was only 0.303. In contrast, wild strains from Liaoning and Jilin exhibited higher average MAF values of 0.328 and 0.326, respectively, with Nei diversity index averages of 0.552 and 0.550, respectively. Further comparison showed that average heterozygosity values were notably high in Jilin wild strains (0.606) and Liaoning wild strains (0.598), along with their respective PIC values of 0.326 and 0.327, indicating significantly greater genetic diversity in Northeast China compared to Sichuan. Wild strains from Yunnan and Hubei displayed lower average PIC values of 0.260 and 0.281, respectively. The genetic characteristics of CBS strains were similar to those of wild strains from Yunnan, with relatively lower diversity among the tested strains. Cultivated strains exhibited an average MAF of 0.207, an average Nei diversity index of 0.287, observed heterozygosity of 0.435, and an average PIC of 0.226. These values indicated lower genetic diversity in cultivated strains compared to wild strains.
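For a single biallelic SNP, observed heterozygosity and Nei's gene diversity can be computed from genotype calls as sketched below. The text does not state which estimator was used, so the small-sample correction factor 2n/(2n − 1) is an assumption, as is the coding of genotypes as alternate-allele counts (0, 1, 2); the example calls are hypothetical. Without the correction, gene diversity at a biallelic locus cannot exceed 0.5, so a correction of this kind is one possible reason for group averages above 0.5.

```python
def allele_frequency(genotypes):
    """Alternate-allele frequency from calls coded 0, 1, or 2."""
    return sum(genotypes) / (2 * len(genotypes))


def observed_heterozygosity(genotypes):
    """Fraction of strains carrying a heterozygous call (coded 1)."""
    return sum(1 for g in genotypes if g == 1) / len(genotypes)


def nei_gene_diversity(genotypes, unbiased=True):
    """Gene diversity 1 - sum(p_i^2) at one biallelic locus.

    With unbiased=True the value is multiplied by 2n / (2n - 1),
    where n is the number of strains (an assumed correction).
    """
    n = len(genotypes)
    p = allele_frequency(genotypes)
    h = 1.0 - (p * p + (1.0 - p) ** 2)
    if unbiased:
        h *= 2 * n / (2 * n - 1)
    return h


# Hypothetical calls for four strains at one SNP (not study data).
calls = [0, 1, 1, 2]
print(observed_heterozygosity(calls))                  # 0.5
print(nei_gene_diversity(calls, unbiased=False))       # 0.5
print(round(nei_gene_diversity(calls), 3))             # 0.571
```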

3.2.3. Population Structure

Based on the 181,731 high-quality SNP markers, Admixture software was used to analyze the population structure of the samples. The analysis considered K values ranging from 1 to 10 to determine optimal clustering. At K = 2, the cross-validation error rate was minimized at 0.514, categorizing the 47 tested strains into two main clusters, with 17 strains showing ambiguous classification (Figure 4, Table S2). The first cluster included 11 cultivated strains previously identified along with CBS3, while the second cluster predominantly consisted of 10 wild strains from Sichuan, 5 from Northeast China, and 4 from Hubei. This analysis underscored the genetic differentiation between cultivated varieties and specific wild strains, highlighting substantial genetic divergence.
Additionally, at K = 4, the cross-validation error rate was lower than at K = 3, with a rate of 0.548, and the strains were distinctly divided into four clusters. Compared to clustering at K = 2, strains CBS1, CBS2, CBS5, and CBS4 formed a separate cluster, while ZP1 and WSC were also categorized into their own cluster.
A total of 369 candidate core SNP markers were identified, facilitating the construction of a DNA fingerprint map to differentiate the 47 P. pulmonarius strains (Figure 5, Table S3). The fingerprint map clearly depicted the close genetic relationships and divergence among the strains. The minor allele frequency was reported as 0.416 (Table S4). These core SNPs exhibited high quality, even distribution across the genome, and strong discriminatory power among strains. The map illustrated that cultivated strains had high genetic similarity with minimal SNP variation, whereas wild strains exhibited greater diversity with numerous variations, indicating substantial genetic differences among wild strains.
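Three of the core-marker criteria given in Section 2.4 (no missing data, MAF above 20%, PIC above 0.35) can be checked per SNP as in the following sketch. It assumes biallelic markers coded as alternate-allele counts, uses the standard PIC formula reduced to the two-allele case, and omits the Hardy–Weinberg, genome-spacing, and nearby-mutation criteria; function names and example calls are hypothetical.

```python
def minor_allele_frequency(calls):
    """MAF from calls coded 0, 1, 2 (None marks a missing call)."""
    observed = [c for c in calls if c is not None]
    p = sum(observed) / (2 * len(observed))
    return min(p, 1.0 - p)


def pic_biallelic(maf):
    """PIC for two alleles: 1 - (p^2 + q^2) - 2 * p^2 * q^2."""
    p, q = maf, 1.0 - maf
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


def is_core_candidate(calls):
    """Apply the completeness, MAF > 0.20, and PIC > 0.35 criteria."""
    if any(c is None for c in calls):
        return False
    maf = minor_allele_frequency(calls)
    return maf > 0.20 and pic_biallelic(maf) > 0.35


# Hypothetical calls across eight strains (not study data).
balanced = [0, 0, 1, 1, 2, 2, 1, 0]     # MAF 0.4375
skewed = [0, 0, 0, 0, 0, 1, 1, 2]       # MAF 0.25
incomplete = [0, 1, None, 2, 1, 0, 2, 1]

print(is_core_candidate(balanced))    # True
print(is_core_candidate(skewed))      # False
print(is_core_candidate(incomplete))  # False
```

Under the two-allele formula, PIC peaks at 0.375 when both alleles are equally frequent, and PIC above 0.35 requires a MAF of roughly 0.35 or more, so for biallelic markers the PIC criterion is the stricter of the two. This is consistent with the reported mean MAF of 0.416 for the core SNPs.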

3.3. Evaluation of Agronomic Traits in P. pulmonarius

A total of 42 P. pulmonarius strains were capable of producing fruiting bodies (Table S5). The frequency distribution indicated continuous variation for nearly all seven agronomic traits, suggesting that these traits are quantitative and controlled by multiple genes (Figure 6). Since the period from inoculation to primordia generation (PPG) represents the earliest time point for primordia formation, it was excluded from the ANOVAs. The ANOVA results revealed significant variation among strains for the remaining six traits (α = 0.05) (Table S6).
The boxplot clearly illustrates the variation for each trait (Figure 7). The most pronounced differences among strains were observed for the trait YFB, while PPG exhibited the least variation. However, it is worth noting that there were four significant outliers for PPG, indicating that although most strains showed a similar period from inoculation to primordia formation, a few strains had notably different times.
The average period from inoculation to primordium generation (PPG) was approximately 35 days (Table 3). Notably, strains ZP3, ZP4, WHLJ, and CBS5 exhibited faster primordium formation, occurring in as few as 31 days, while ZP2 required a longer period of 64 days. The broad-sense heritability for stipe diameter (SD) and yield of fruiting bodies (YFB) was relatively high, indicating that the variations observed for these traits were primarily influenced by genetic factors. The average period from inoculation to harvest (PIH) was around 42 days; however, strains ZP4, WHLJ, and CBS3 required only 36 days, while WSC5 took 79 days to reach harvest.
Cap size varied significantly among the 42 strains. For example, strain CBS3 had a cap length (CL) of 47.85 mm and a cap width (CW) of 50.99 mm, while strain WJL1 had a CL of only 20.32 mm and a CW of 23.29 mm. Stipe length (SL) also varied greatly, with WSC7 having the shortest length at 10.73 mm and CBS5 the longest at 45.77 mm. Stipe diameter (SD) showed notable variation as well, with WSC5 having the smallest diameter at 5.5 mm, while ZP9 had the largest diameter at 25.56 mm. In terms of YFB, strain ZP3 achieved the highest fruiting body weight at 42.76 g per bottle. Cultivated strains generally produced higher fruiting body weights with more considerable variation, as indicated by a coefficient of variation of 72.32%.
Significant differences were detected for CW, SD, and YFB among the groups. Strains in groups ZP, CBS, and WHLJ exhibited significantly larger yields and cap widths, while strains in groups WHB, WJL, and WLN showed significantly larger stipe diameters compared to other groups (Figure 8). No significant differences were observed for the other four traits among the groups.
Correlation analysis revealed a significant positive correlation between PPG and PIH (correlation coefficient = 0.77). Cap length and width were also significantly positively correlated (correlation coefficient = 0.91). Additionally, YFB showed a positive correlation with CL, CW, and SL, decreasing in strength in that order. Conversely, YFB exhibited a negative correlation with SD (Figure 9, Table S7). The negative correlation between YFB and SD suggests that a thicker stipe may adversely affect overall quality. Furthermore, harvesting time was negatively correlated with cap length and width, indicating that later fruiting resulted in smaller cap sizes.
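The text does not specify the type of correlation coefficient; the sketch below assumes Pearson's product-moment coefficient and uses hypothetical trait values to show how a pairwise coefficient such as the one reported for cap length and cap width could be computed.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-strain means in mm (not data from the study).
cap_length = [22.0, 30.0, 35.0, 41.0, 46.0]
cap_width = [25.0, 31.0, 39.0, 42.0, 50.0]

r = pearson_r(cap_length, cap_width)
print(round(r, 2))
```

A coefficient near +1 indicates that the two traits increase together, while a negative coefficient, as reported between YFB and SD, indicates that one tends to decrease as the other increases.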

3.4. GWAS Analysis Results

In this GWAS, significant SNPs across various scaffolds were identified for key traits (Figure 10, Table 4). On scaffold SJKF01000008.1, SNP rs_474319 explained only 0.02% of the variance in cap length (CL), indicating a minimal effect. Conversely, SNP rs_338157 on scaffold SJKF01000016.1 had a strong influence on cap width (CW) with a proportion of variance explained (PVE) of 40.45%. Scaffolds SJKF01000006.1 and SJKF01000003.1 also showed high PVEs for CL with SNPs rs_3071735 (47.30%) and rs_3643242 (48.76%). For stipe diameter (SD), SNP rs_3145491 on scaffold SJKF01000005.1 showed a smaller effect (PVE 4.29%). In stipe length (SL), SNPs rs_499031 and rs_2285326 on scaffolds SJKF01000007.1 and SJKF01000005.1, respectively, made significant contributions with PVEs of 28.29% and 32.72%. The highest impact was observed on scaffold SJKF01000002.1, where SNP rs_4651725 influenced yield with a PVE of 65.72%.
To avoid false positives, the long PPG periods of four strains (ZP1, ZP2, WSC5, and WYN1) and the PIH periods of five strains (WSC5, ZP2, WYN1, WJL1, and WLN1) were excluded as outliers in the GWASs. After their exclusion, no significant SNP loci were identified for traits PPG and PIH.

4. Discussion

4.1. Genetic Diversity Analysis of P. pulmonarius Based on SNP

Traditional germplasm identification methods are characterized by long identification cycles, high costs, and susceptibility to environmental factors. Therefore, DNA molecular markers are crucial for germplasm identification at the molecular level. Compared to other molecular markers, SNP-based diversity analysis provides a more accurate and detailed interpretation of phenotypic differences at a lower cost, making it the most widely applied molecular marker [34,35]. SNPs, which represent the most common types of nucleotide sequence polymorphisms—including insertions and deletions—are abundant throughout the genome. These co-dominant markers are highly reproducible and suitable for genetic diversity and population structure mapping [36,37,38,39,40]. Previous studies have shown that SNPs are closely associated with individual traits, where different SNP positions may affect gene expression and phenotype through various mechanisms. The functional impact depends on whether the nucleotide variation affects protein structure or function.
Despite its relatively short cultivation history in China, P. pulmonarius holds promising economic potential and market prospects. Assessing genetic diversity in germplasm resources is essential for resource identification, new gene discovery, and breeding of new varieties. Current challenges in P. pulmonarius breeding include issues such as misidentification of varieties, insufficient research on breeding practices, and limited depth in genetic studies. Additionally, some P. pulmonarius varieties suffer from degeneration and poor disease resistance after several years of introduction. Understanding the genetic relationships and population structure of P. pulmonarius strains from different regions of China at the genomic level is essential for addressing these challenges.
To advance the breeding and application of P. pulmonarius germplasm resources, this study conducted whole-genome resequencing of 47 strains collected nationwide. A total of 181,731 high-quality SNP loci were identified after filtering, indicating substantial genetic variation among the strains. The significant number and proportion of SNPs make these strains well suited for subsequent genetic evolution and GWAS analyses. These SNPs were used for DNA identity verification, phylogenetic relationship analysis, and genetic diversity assessment. Phylogenetic analysis revealed that most currently cultivated P. pulmonarius germplasm originates from the same source, exhibiting close genetic relationships and potential synonymy issues. In contrast, wild germplasm showed greater genetic diversity. Strains from the same geographic origin were grouped into the same cluster, a pattern also observed in other mushroom species [41]. The results of principal component analysis and population structure analysis were consistent with the phylogenetic tree, underscoring the reliability of SNP markers in assessing genetic relatedness in P. pulmonarius strains.
DNA fingerprinting, based on molecular markers or specific sequences, was used to assess genetic diversity and effectively differentiate strains. DNA fingerprinting has been widely applied in genetic diversity analyses of plants and microbes [42], providing valuable tools and technical support for germplasm identification and SNP-based fingerprinting methods [43]. In this study, a fingerprint atlas was constructed using 369 core SNP loci, offering unique genetic identification for each strain. These results provide a foundation for studies on genetic diversity, quality control, and molecular breeding.

4.2. Evaluation of Agronomic Traits

P. pulmonarius, one of the most widely cultivated edible fungi in China, faces challenges due to chaotic germplasm and limited genetic diversity in cultivars. These limitations stem from tissue separation, monosporal and polysporal hybridization, long-term artificial selection, and strain introductions across cultivation areas. These factors complicate the effective utilization and innovation of P. pulmonarius germplasm resources, restricting available breeding materials. Furthermore, unlike green plants, edible fungi exhibit lower morphological differentiation, and oyster mushroom fruiting body morphology is highly influenced by environmental conditions. Therefore, optimizing multiple traits is essential to maximize productivity when selecting superior varieties, necessitating the evaluation of germplasm based on multiple traits rather than a single trait [44].
In this study, we conducted comprehensive statistical analyses on seven key traits related to mushroom production in 47 P. pulmonarius strains. According to the ANOVA results, all seven traits were identified as quantitative traits primarily determined by genetic variation. The high broad-sense heritability values indicated that genetic factors are the primary contributors to the observed variation. Correlation analyses revealed significant associations, such as between the period of primordia generation and the period from inoculation to harvesting, as well as between cap length and cap width. Strains from Groups ZP, CBS, and WHLJ generally exhibited superior agronomic traits compared to others, suggesting their potential as breeding materials. The outliers observed for the traits PPG and PIH may be attributed to the low mycelial activity of certain strains, underscoring the need for improved strain preservation to ensure a robust analysis and evaluation system for P. pulmonarius germplasm. These findings provide valuable insights for optimizing breeding and cultivation strategies for P. pulmonarius.

4.3. Genome-Wide Association Study

GWASs rely on a comprehensive set of variants covering the entire genome. Early association studies used low-throughput markers such as RFLP, AFLP, and SSR; with advances in technology, the focus has shifted to higher-throughput and more stable SNPs, which now dominate the field. A GWAS identifies specific alleles that contribute to desirable traits, which facilitates genetic improvement. While GWASs have been applied extensively in plants and animals to uncover genes associated with agronomic traits, their use in edible fungi remains limited. In this study, 47 P. pulmonarius strains from diverse geographic origins, showing significant phenotypic variation in agronomic traits, were used for a GWAS. All seven traits were identified as polygenic, that is, controlled by multiple genes. The linear mixed model (LMM) implemented in GEMMA helped control false positives, and the analysis identified key SNPs and candidate genes associated with these traits.
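As a point of reference, the univariate linear mixed model fitted by GEMMA can be written as follows. This is the general form given in the GEMMA documentation; which covariates were included in this study is not stated in this section, so W is shown generically.

```latex
\mathbf{y} = \mathbf{W}\boldsymbol{\alpha} + \mathbf{x}\beta + \mathbf{u} + \boldsymbol{\epsilon},
\qquad
\mathbf{u} \sim \mathrm{MVN}_n\!\left(\mathbf{0},\, \lambda\tau^{-1}\mathbf{K}\right),
\qquad
\boldsymbol{\epsilon} \sim \mathrm{MVN}_n\!\left(\mathbf{0},\, \tau^{-1}\mathbf{I}_n\right)
```

Here y is the vector of phenotypes for n strains, W the covariate matrix (including an intercept) with coefficients \alpha, x the genotype vector of the tested SNP with effect \beta, K the relatedness matrix, and \lambda the ratio between the two variance components. Each SNP is tested against the null hypothesis \beta = 0, and the random effect u absorbs similarity due to relatedness and population structure, which is how the model limits false positives.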
For instance, SNP rs_474319 on scaffold SJKF01000008.1, though statistically significant, had a low proportion of variance explained (PVE) at 0.02% for cap length, indicating a minimal impact on this trait. In contrast, SNPs like rs_338157 on SJKF01000016.1, rs_3071735 on SJKF01000006.1, and rs_3643242 on SJKF01000003.1 exhibited high PVE values (ranging from 40.45% to 48.76%), underscoring their significant roles in the genetic control of cap width and cap length. Similarly, SNPs associated with stipe length, such as rs_499031 on scaffold SJKF01000007.1 and rs_2285326 on SJKF01000005.1, showed high PVEs (28.29% and 32.72%, respectively), highlighting their substantial influence on this trait. These findings suggest that the genetic architecture of traits like cap length, cap width, and stipe length is shaped by SNPs distributed across multiple scaffolds, reflecting the complex genetic regulatory networks underlying these phenotypic characteristics.
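One common way to express the PVE of a single SNP is sketched below. It assumes additive genotype coding (0, 1, 2) and Hardy-Weinberg proportions; the estimator actually used for Table 4 is not described in this section, so this is an illustration rather than the authors' formula.

```latex
\mathrm{PVE}_j
= \frac{\operatorname{Var}(x_j \beta_j)}{\operatorname{Var}(y)}
\approx \frac{2 f_j (1 - f_j)\, \beta_j^2}{s_y^2}
```

Here \beta_j is the estimated effect of SNP j, f_j its minor allele frequency, and s_y^2 the sample variance of the trait. Under this definition the PVE depends on effect size and allele frequency, whereas the p-value depends on the effect relative to its standard error, so the two measures need not rank SNPs in the same order.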
Additionally, SNPs on scaffold SJKF01000005.1 were associated with both stipe diameter (SD) and stipe length (SL), and SNPs on scaffold SJKF01000008.1 were associated with cap length (CL), cap width (CW), and yield of fruiting bodies (YFB), suggesting pleiotropic effects. The observed phenotypic correlations were thus consistent with genetic correlations, which provides useful guidance for future breeding programs and molecular improvement strategies. Notably, SNP rs_4651725, associated with YFB, exhibited a high PVE of 65.72% despite a relatively modest −log10(p) value of 5.29, which makes it less prominent in the Manhattan plot. The two measures capture different things. The p-value reflects statistical significance, which can be weakened by the limited sample size, the discontinuous phenotypic expression in the micro-cultivation system, and the lack of strong natural selection. The PVE quantifies the proportion of phenotypic variance explained by the SNP, and the large value here might result from the high coefficient of variation of YFB among strains. Thus, an SNP that only marginally exceeds the significance threshold may still show a substantial PVE.
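The relationship between the p-values and the −log10(p) scores plotted in Figure 10 can be checked with a short script. The sketch below uses three SNPs named in the discussion, with values copied from Table 4; the function names and the constant `SUGGESTIVE_LINE` are illustrative choices, not part of the original analysis.

```python
import math

# (trait, SNP ID, p-value, PVE in %) for SNPs named in the discussion (Table 4).
HITS = [
    ("CL", "rs_474319", 6.03e-6, 0.02),
    ("CW", "rs_3071735", 3.44e-7, 47.30),
    ("YFB", "rs_4651725", 5.17e-6, 65.72),
]

# Dashed line at -log10(p) = 5 in the Manhattan plots (Figure 10).
SUGGESTIVE_LINE = 5.0


def neg_log10(p_value: float) -> float:
    """Return -log10(p), the y-axis value of a Manhattan plot."""
    return -math.log10(p_value)


def rank_by_significance(hits):
    """Sort hits from most to least significant (largest -log10(p) first)."""
    return sorted(hits, key=lambda hit: neg_log10(hit[2]), reverse=True)


if __name__ == "__main__":
    for trait, snp, p_value, pve in rank_by_significance(HITS):
        score = neg_log10(p_value)
        side = "above" if score > SUGGESTIVE_LINE else "below"
        print(f"{trait:>3} {snp:<11} -log10(p) = {score:.2f} ({side} the line), PVE = {pve:.2f}%")
```

Ranking by −log10(p) places rs_3071735 first and rs_4651725 second, even though rs_4651725 has the largest PVE of the three, which illustrates why the two measures are reported side by side.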
After outliers were removed, no significant SNP associations were found for the period from inoculation to harvesting (PIH) or the period from inoculation to primordia generation (PPG). To detect SNPs associated with these two traits, future experiments should use a larger sample and repeated trials so that outliers can be identified and handled reliably.
To date, GWASs have been used in various species, including staple crops like rice [45], model organisms such as Arabidopsis thaliana [46], and other plants like sorghum [47]. Their combination with genomics, transcriptomics, and metabolomics has provided a deeper understanding of species biology and genetics. This multidisciplinary approach helps clarify the genetic basis of complex traits. Looking ahead, GWAS technology is expected to continue evolving, with promising applications in the field of edible fungi. Future developments are likely to enhance its effectiveness in understanding genetic structures, optimizing breeding strategies, and contributing to the development of improved fungal varieties.

5. Conclusions

In this study, resequencing of 47 P. pulmonarius strains identified 4,430,948 SNPs. Analysis using 181,731 SNP markers revealed six distinct subpopulations, and a DNA fingerprint was developed using 369 core markers. Genetic diversity analysis showed minimal variation within cultivated strains from the same area but significant diversity among wild strains from different regions. The GWAS identified key SNPs and candidate genes associated with five phenotypic traits, providing valuable insights for targeted breeding of improved edible fungus strains.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agriculture14112023/s1, Table S1: PCA data; Table S2: K value; Table S3: Core SNP; Table S4: Core SNP filtering; Table S5: Phenotypic statistics of 47 Pleurotus pulmonarius strains; Table S6: ANOVA among different groups; Table S7: Correlation analysis.

Author Contributions

Formal analysis, Q.L. and X.Y.; Resources, W.G.; Visualization, Y.Y.; Writing—original draft, Q.L. and X.Y.; Writing—review and editing, W.G.; Supervision and project administration, W.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Lin’an District People’s Government and Chinese Academy of Agricultural Sciences Collaborative Project (NH202244), the Jiangxi Facility Vegetable Project (JXNK202303-06), and the Beijing Innovation Consortium of Agriculture Research System (BAIC03).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article or Supplementary Material.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhu, J.; Liu, X.; Xie, B.; Liu, F.; Deng, Y.; Xiong, F. Analysis of germplasm resources of Pleurotus geesteranus by ITS-RFLP. J. Fujian Agric. For. Univ. (Nat. Sci. Ed.) 2009, 38, 186–191.
2. Liu, X.; Ye, L.; Zhang, L.; Xie, B.; Wu, X. Mating type analyses of cultivated Pleurotus pulmonarius in China. Mycosystema 2021, 40, 3109–3117.
3. Chen, G.G.; Lin, Y.; Liu, X.R.; Zhao, G.H.; Chen, J. Recent advances in genetics and breeding of Pleurotus pulmonarius. Chin. J. Trop. Crops 2017, 38, 1377–1381.
4. Lee, H.Y.; Moon, S.; Shim, D.; Hong, C.P.; Lee, Y.; Koo, C.D.; Chung, J.W.; Ryu, H. Development of 44 Novel Polymorphic SSR Markers for Determination of Shiitake Mushroom (Lentinula edodes) Cultivars. Genes 2017, 8, 109.
5. Saito, T.; Sakuta, G.; Kobayashi, H.; Ouchi, K.; Inatomi, S. Genetically Independent Tetranucleotide to Hexanucleotide Core Motif SSR Markers for Identifying Lentinula edodes Cultivars. Mycobiology 2019, 47, 466–472.
6. Kim, K.H.; Ka, K.H.; Kang, J.H.; Kim, S.; Lee, J.W.; Jeon, B.K.; Yun, J.K.; Park, S.R.; Lee, H.J. Identification of Single Nucleotide Polymorphism Markers in the Laccase Gene of Shiitake Mushrooms (Lentinula edodes). Mycobiology 2015, 43, 75–80.
7. Liu, X.B.; Feng, B.; Li, J.; Yan, C.; Yang, Z.L. Genetic diversity and breeding history of Winter Mushroom (Flammulina velutipes) in China uncovered by genomic SSR markers. Gene 2016, 591, 227–235.
8. Liu, X.B.; Li, J.; Yang, Z.L. Genetic diversity and structure of core collection of winter mushroom (Flammulina velutipes) developed by genomic SSR markers. Hereditas 2018, 155, 3.
9. Wang, L.; Gao, W.; Wang, Q.; Qu, J.; Zhang, J.; Huang, C. Identification of commercial cultivars of Agaricus bisporus in China using genome-wide microsatellite markers. J. Integr. Agric. 2019, 18, 580–589.
10. Rostoks, N.; Ramsay, L.; Mackenzie, K.; Cardle, L.; Bhat, P.R.; Roose, M.L.; Svensson, J.T.; Stein, N.; Varshney, R.K.; Marshall, D.F.; et al. Recent history of artificial outcrossing facilitates whole-genome association mapping in elite inbred crop varieties. Proc. Natl. Acad. Sci. USA 2006, 103, 18656–18661.
11. Okuda, Y.; Murakami, S.; Matsumoto, T. A genetic linkage map of Pleurotus pulmonarius based on AFLP markers, and localization of the gene region for the sporeless mutation. Genome 2009, 52, 438–446.
12. Vidal-Diez, D.U.G.; Lee, Y.Y.; Stajich, J.E.; Schwarz, E.M.; Hsueh, Y.P. Genomic analyses of two Italian oyster mushroom Pleurotus pulmonarius strains. G3 2021, 11, jkaa007.
13. Breseghello, F.; Sorrells, M.E. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 2006, 172, 1165–1177.
14. Huang, X.; Wei, X.; Sang, T.; Zhao, Q.; Feng, Q.; Zhao, Y.; Li, C.; Zhu, C.; Lu, T.; Zhang, Z.; et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 2010, 42, 961–967.
15. Olsen, K.M.; Halldorsdottir, S.S.; Stinchcombe, J.R.; Weinig, C.; Schmitt, J.; Purugganan, M.D. Linkage disequilibrium mapping of Arabidopsis CRY2 flowering time alleles. Genetics 2004, 167, 1361–1369.
16. Risch, N.; Merikangas, K. The future of genetic studies of complex human diseases. Science 1996, 273, 1516–1517.
17. Li, C.; Gong, W.; Zhang, L.; Yang, Z.; Nong, W.; Bian, Y.; Kwan, H.S.; Cheung, M.K.; Xiao, Y. Association Mapping Reveals Genetic Loci Associated with Important Agronomic Traits in Lentinula edodes, Shiitake Mushroom. Front. Microbiol. 2017, 8, 237.
18. Cardoso, W.S.; Soares, F.; Queiroz, P.V.; Tavares, G.P.; Santos, F.A.; Sufiate, B.L.; Kasuya, M.; de Queiroz, J.H. Minimum cocktail of cellulolytic multi-enzyme complexes obtained from white rot fungi via solid-state fermentation. 3 Biotech 2018, 8, 46.
19. Corner, E.J.H. The Agaric Genera Lentinus, Panus, and Pleurotus, with Particular Reference to Malaysian Species; J. Cramer: Vaduz, Liechtenstein, 1981; p. 169. ISBN 85192445.
20. Yu, H.; Zhang, L.; Shang, X.; Peng, B.; Li, Y.; Xiao, S.; Tan, Q.; Fu, Y. Chromosomal genome and population genetic analyses to reveal genetic architecture, breeding history and genes related to cadmium accumulation in Lentinula edodes. BMC Genom. 2022, 23, 120.
21. Zhang, Y.; Huang, C.; van Peer, A.F.; Sonnenberg, A.; Zhao, M.; Gao, W. Fine Mapping and Functional Analysis of the Gene PcTYR, Involved in Control of Cap Color of Pleurotus cornucopiae. Appl. Environ. Microbiol. 2022, 88, e0217321.
22. Huang, X.; Duan, N.; Xu, H.; Xie, T.N.; Xue, Y.R.; Liu, C.H. CTAB-PEG DNA Extraction from Fungi with High Contents of Polysaccharides. Mol. Biol. 2018, 52, 718–726.
23. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303.
24. Danecek, P.; Bonfield, J.K.; Liddle, J.; Marshall, J.; Ohan, V.; Pollard, M.O.; Whitwham, A.; Keane, T.; McCarthy, S.A.; Davies, R.M.; et al. Twelve years of SAMtools and BCFtools. Gigascience 2021, 10, giab008.
25. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
26. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
27. Alexander, D.H.; Novembre, J.; Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009, 19, 1655–1664.
28. Price, A.L.; Patterson, N.J.; Plenge, R.M.; Weinblatt, M.E.; Shadick, N.A.; Reich, D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006, 38, 904–909.
29. Slifer, S.H. PLINK: Key Functions for Data Analysis. Curr. Protoc. Hum. Genet. 2018, 97, e59.
30. Wang, Y.; Lv, H.; Xiang, X.; Yang, A.; Feng, Q.; Dai, P.; Li, Y.; Jiang, X.; Liu, G.; Zhang, X. Construction of a SNP Fingerprinting Database and Population Genetic Analysis of Cigar Tobacco Germplasm Resources in China. Front. Plant Sci. 2021, 12, 618133.
31. Valeri, L.; VanderWeele, T.J. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychol. Methods 2013, 18, 137–150.
32. Mucha, A.; Wierzbicki, H. Linear models for breeding values prediction in haplotype-assisted selection - An analysis of QTL-MAS Workshop 2011 Data. BMC Proc. 2012, 6 (Suppl. S2), S11.
33. Zhou, X.; Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 2012, 44, 821–824.
34. Trick, M.; Long, Y.; Meng, J.; Bancroft, I. Single nucleotide polymorphism (SNP) discovery in the polyploid Brassica napus using Solexa transcriptome sequencing. Plant Biotechnol. J. 2009, 7, 334–346.
35. Schlotterer, C. The evolution of molecular markers--just a matter of fashion? Nat. Rev. Genet. 2004, 5, 63–69.
36. Cabezas, J.A.; Ibanez, J.; Lijavetzky, D.; Velez, D.; Bravo, G.; Rodriguez, V.; Carreno, I.; Jermakow, A.M.; Carreno, J.; Ruiz-Garcia, L.; et al. A 48 SNP set for grapevine cultivar identification. BMC Plant Biol. 2011, 11, 153.
37. Gao, W.; Qu, J.; Zhang, J.; Sonnenberg, A.; Chen, Q.; Zhang, Y.; Huang, C. A genetic linkage map of Pleurotus tuoliensis integrated with physical mapping of the de novo sequenced genome and the mating type loci. BMC Genom. 2018, 19, 18.
38. An, H.; Lee, H.Y.; Shim, D.; Choi, S.H.; Cho, H.; Hyun, T.K.; Jo, I.H.; Chung, J.W. Development of CAPS Markers for Evaluation of Genetic Diversity and Population Structure in the Germplasm of Button Mushroom (Agaricus bisporus). J. Fungi 2021, 7, 375.
39. Yang, Q.; Zhang, J.; Shi, X.; Chen, L.; Qin, J.; Zhang, M.; Yang, C.; Song, Q.; Yan, L. Development of SNP marker panels for genotyping by target sequencing (GBTS) and its application in soybean. Mol. Breed. 2023, 43, 26.
40. Gao, W.; Weijn, A.; Baars, J.J.; Mes, J.J.; Visser, R.G.; Sonnenberg, A.S. Quantitative trait locus mapping for bruising sensitivity and cap color of Agaricus bisporus (button mushrooms). Fungal Genet. Biol. 2015, 77, 69–81.
41. Gu, M.; Chen, Q.; Zhang, Y.; Zhao, Y.; Wang, L.; Wu, X.; Zhao, M.; Gao, W. Evaluation of Genetic Diversity and Agronomic Traits of Germplasm Resources of Stropharia rugosoannulata. Horticulturae 2024, 10, 213.
42. Tian, H.L.; Wang, F.G.; Zhao, J.R.; Yi, H.M.; Wang, L.; Wang, R.; Yang, Y.; Song, W. Development of maizeSNP3072, a high-throughput compatible SNP array, for DNA fingerprinting identification of Chinese maize varieties. Mol. Breed. 2015, 35, 136.
43. Yang, J.; Zhang, J.; Han, R.; Zhang, F.; Mao, A.; Luo, J.; Dong, B.; Liu, H.; Tang, H.; Zhang, J.; et al. Target SSR-Seq: A Novel SSR Genotyping Technology Associate with Perfect SSRs in Genetic Analysis of Cucumber Varieties. Front. Plant Sci. 2019, 10, 531.
44. Gao, W.; Baars, J.; Maliepaard, C.; Visser, R.; Zhang, J.; Sonnenberg, A. Multi-trait QTL analysis for agronomic and quality characters of Agaricus bisporus (button mushrooms). AMB Express 2016, 6, 67.
45. Huang, X.; Zhao, Y.; Wei, X.; Li, C.; Wang, A.; Zhao, Q.; Li, W.; Guo, Y.; Deng, L.; Zhu, C.; et al. Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm. Nat. Genet. 2011, 44, 32–39.
46. Atwell, S.; Huang, Y.S.; Vilhjalmsson, B.J.; Willems, G.; Horton, M.; Li, Y.; Meng, D.; Platt, A.; Tarone, A.M.; Hu, T.T.; et al. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 2010, 465, 627–631.
47. Morris, G.P.; Ramu, P.; Deshpande, S.P.; Hash, C.T.; Shah, T.; Upadhyaya, H.D.; Riera-Lizarazu, O.; Brown, P.J.; Acharya, C.B.; Mitchell, S.E.; et al. Population genomic and genome-wide association studies of agroclimatic traits in sorghum. Proc. Natl. Acad. Sci. USA 2013, 110, 453–458.
Figure 1. SNP proportions at different genomic locations: INTERGENIC represents intergenic regions, INTRON represents intronic regions, CDS represents coding sequence regions, UPSTREAM/DOWNSTREAM represent SNP sites located within 5 kb upstream/downstream of genes, UTR_5_PRIME/UTR_3_PRIME represent SNP sites in the 5′ UTR and 3′ UTR of genes, SPLICE_SITE_ACCEPTOR/SPLICE_SITE_DONOR represent splice site mutations (within the first 2 bp of an exon), SPLICE_SITE_REGION represents splice site region mutations (1−3 bp variation in exons or 3−8 bp variation in introns), START_GAINED represents gained start codons (in non-coding regions). SNP mutation types within CDS regions annotation: START_LOST represents lost start codons, SYNONYMOUS_START/NON_SYNONYMOUS_START represent synonymous/non-synonymous start codon mutations, SYNONYMOUS_CODING represents synonymous coding mutations, NON_SYNONYMOUS_CODING represents non-synonymous coding mutations, STOP_GAINED/STOP_LOST represent gained/lost stop codons.
Figure 2. Phylogenetic tree of 47 P. pulmonarius strains based on SNPs. The Roman numerals (I–VI) indicate different clusters. Strain names of each group are listed in Table 1.
Figure 3. Principal component analysis of 47 P. pulmonarius strains based on SNPs. The samples are plotted in two dimensions (A) and three dimensions (B); PC1, PC2, and PC3 represent the first, second, and third principal components, respectively. Each point represents a sample, and each color represents a group.
Figure 4. Population structure at different K values. (A) Clustering results for K values ranging from 1 to 10. (B) Cross-validation of the clustering results, with the number of subgroups (K value) preset from 1 to 10. The optimal number of clusters is determined by identifying the valley in the cross-validation error rate.
Figure 5. Screening of core tags and construction of SNP fingerprint atlas. SNP fingerprint atlas for 47 strains: Each row in the fingerprint atlas corresponds to a specific SNP, and the columns represent different strains. Yellow indicates C/C genotype, green indicates A/A, blue indicates T/T, and purple indicates G/G. The missing data are represented by gray, while the heterozygosity check points between strains are represented by white.
Figure 6. Histograms showing the frequency distribution of 7 agronomic traits. The x-axis represents the value range of the traits, while the y-axis indicates the frequency. The top left to the bottom right are the frequency distribution histograms of cap length (CL), cap width (CW), the period from inoculation to harvesting (PIH), the period from inoculation to primordia generation (PPG), stipe diameter (SD), stipe length (SL), and yield of fruiting body per bottle (YFB).
Figure 7. Boxplots describing the variation in each trait. The line inside each box represents the median, and the whiskers extend to the minimum and maximum values. Red crosses indicate outliers. The y-axis indicates the value range of the agronomic traits, and the x-axis lists the traits, from left to right: cap length (CL), cap width (CW), the period from inoculation to harvesting (PIH), the period from inoculation to primordia generation (PPG), stipe diameter (SD), stipe length (SL), and yield of fruiting body per bottle (YFB).
Figure 8. Variations among groups in agronomic traits: (A) variations among groups in cap width (CW); (B) variations among groups in stipe diameter (SD); (C) variations among groups in the yield of fruiting body per bottle (YFB). The x-axis represents different groups, and the y-axis represents the range of trait values.
Figure 9. Heat map of the phenotypic correlation analysis. Rows (top to bottom) and columns (left to right) follow the same order: the period from inoculation to primordia generation (PPG), the period from inoculation to harvesting (PIH), cap length (CL), cap width (CW), stipe length (SL), stipe diameter (SD), and yield of fruiting body per bottle (YFB). Sample size = 42; * and ** indicate significance at the 0.05 and 0.01 levels, respectively.
Figure 10. Manhattan plot (left) and quantile–quantile (Q-Q) plot (right) of SNP p-values in the whole population. In the Manhattan plot, the y-axis shows the observed SNP −log10(p) and the x-axis shows the SNP positions across the 20 scaffolds. Horizontal lines show the genome-wide significance threshold; the green dashed line represents −log10(p) = 5. In the Q-Q plot, the y-axis and x-axis represent the observed and expected SNP −log10(p), respectively. From top to bottom: (A) cap length (CL), (B) cap width (CW), (C) stipe diameter (SD), (D) stipe length (SL), and (E) yield of fruiting body per bottle (YFB).
Table 1. Tested strains and their origins.
| Sample Name | Group | Sample Name for Analysis | Name Interpretation |
|---|---|---|---|
| CCMSSC00313 | ZP ¹ | ZP1 | Cultivation |
| CCMSSC00493 | ZP | ZP2 | Cultivation |
| CCMSSC00494 | ZP | ZP3 | Cultivation |
| CCMSSC00498 | ZP | ZP4 | Cultivation |
| CCMSSC00499 | ZP | ZP5 | Cultivation |
| CCMSSC03815 | ZP | ZP6 | Cultivation |
| CCMSSC03836 | ZP | ZP7 | Cultivation |
| CCMSSC03886 | ZP | ZP8 | Cultivation |
| CCMSSC03897 | ZP | ZP9 | Cultivation |
| CCMSSC04423 | ZP | ZP10 | Cultivation |
| CCMSSC04537 | ZP | ZP11 | Cultivation |
| R08026 | ZP | ZP12 | Cultivation |
| R08027 | ZP | ZP13 | Cultivation |
| CCMSSC01106 | WSC | WSC1 | Sichuan wild |
| CCMSSC01108 | WSC | WSC2 | Sichuan wild |
| CCMSSC01109 | WSC | WSC3 | Sichuan wild |
| CCMSSC01110 | WSC | WSC4 | Sichuan wild |
| CCMSSC01111 | WSC | WSC5 | Sichuan wild |
| CCMSSC01114 | WSC | WSC6 | Sichuan wild |
| CCMSSC01116 | WSC | WSC7 | Sichuan wild |
| CCMSSC01118 | WSC | WSC8 | Sichuan wild |
| CCMSSC01119 | WSC | WSC9 | Sichuan wild |
| CCMSSC01120 | WSC | WSC10 | Sichuan wild |
| CCMSSC01123 | WSC | WSC11 | Sichuan wild |
| CCMSSC04199 | WHLJ | WHLJ | Heilongjiang wild |
| CCMSSC04458 | CBS | CBS1 | CBS |
| CCMSSC04459 | CBS | CBS2 | CBS |
| CCMSSC04460 | CBS | CBS3 | CBS |
| CCMSSC04461 | CBS | CBS4 | CBS |
| CCMSSC04462 | CBS | CBS5 | CBS |
| CCMSSC01301 | WYN | WYN1 | Yunnan wild |
| CCMSSC04584 | WYN | WYN2 | Yunnan wild |
| CCMSSC04585 | WYN | WYN3 | Yunnan wild |
| CCMSSC04586 | WYN | WYN4 | Yunnan wild |
| CCMSSC04587 | WYN | WYN5 | Yunnan wild |
| CCMSSC04588 | WYN | WYN6 | Yunnan wild |
| CCMSSC04592 | WYN | WYN7 | Yunnan wild |
| CCMSSC04593 | WYN | WYN8 | Yunnan wild |
| CCMSSC04591 | WSX | WSX | Shaanxi wild |
| CCMSSC04594 | WLN | WLN1 | Liaoning wild |
| CCMSSC04595 | WLN | WLN2 | Liaoning wild |
| CCMSSC04597 | WJL | WJL1 | Jilin wild |
| CCMSSC04598 | WJL | WJL2 | Jilin wild |
| CCMSSC04599 | WHB | WHB1 | Hubei wild |
| CCMSSC04600 | WHB | WHB2 | Hubei wild |
| CCMSSC04601 | WHB | WHB3 | Hubei wild |
| CCMSSC04602 | WHB | WHB4 | Hubei wild |

¹ The cultivated strains are labeled with the format ‘ZP’ followed by a numerical code, while the wild strains are labeled with ‘W’ followed by the abbreviation of the collection region and a numerical code.
Table 2. Analysis table of population genetic diversity of P. pulmonarius.
| Group | Average MAF | Nei Diversity Index | Observed Heterozygous Number | Polymorphism Information Content |
|---|---|---|---|---|
| CBS | 0.229 | 0.200–0.667 (0.361) ¹ | 0.200–1.000 (0.243) | 0.164–0.375 (0.265) |
| WHB | 0.255 | 0.250–0.667 (0.399) | 0.250–1.000 (0.409) | 0.195–0.375 (0.281) |
| WJL | 0.326 | 0.500–0.667 (0.550) | 0.500–1.000 (0.606) | 0.305–0.375 (0.326) |
| WLN | 0.328 | 0.500–0.667 (0.552) | 0.500–1.000 (0.598) | 0.305–0.375 (0.327) |
| WSC | 0.202 | 0.091–0.556 (0.303) | 0.091–1.000 (0.293) | 0.083–0.375 (0.238) |
| WYN | 0.231 | 0.125–0.667 (0.343) | 0.125–1.000 (0.296) | 0.110–0.375 (0.260) |
| ZP | 0.207 | 0.077–0.533 (0.287) | 0.077–1.000 (0.435) | 0.071–0.375 (0.226) |

¹ The number in parentheses represents the average value, and the range outside the parentheses represents the minimum and maximum.
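The per-SNP statistics summarized in Table 2 can be illustrated with the standard formulas for a biallelic marker. The sketch below is a reading of those statistics under stated assumptions, not the pipeline used for the table: it assumes two allele copies per strain, the sample-size-corrected (unbiased) form of Nei's gene diversity, and the usual polymorphism information content (PIC) formula. The function names are illustrative.

```python
def allele_frequencies(minor_count: int, n_alleles: int):
    """Return (p, q) for a biallelic SNP given the minor-allele count."""
    p = minor_count / n_alleles
    return p, 1.0 - p


def nei_diversity(minor_count: int, n_alleles: int) -> float:
    """Unbiased Nei gene diversity: n / (n - 1) * (1 - p^2 - q^2)."""
    p, q = allele_frequencies(minor_count, n_alleles)
    return n_alleles / (n_alleles - 1) * (1.0 - p * p - q * q)


def pic(minor_count: int, n_alleles: int) -> float:
    """Polymorphism information content: 1 - p^2 - q^2 - 2 p^2 q^2."""
    p, q = allele_frequencies(minor_count, n_alleles)
    return 1.0 - p * p - q * q - 2.0 * p * p * q * q


if __name__ == "__main__":
    # Five strains (10 allele copies) with a single copy of the minor allele.
    print(f"Nei = {nei_diversity(1, 10):.3f}, PIC = {pic(1, 10):.3f}")
    # Two strains (4 allele copies) with two copies of each allele.
    print(f"Nei = {nei_diversity(2, 4):.3f}, PIC = {pic(2, 4):.3f}")
```

Under these assumptions, a SNP carried once among the five CBS strains gives a Nei index of 0.200 and a PIC of 0.164, and a SNP at equal allele frequencies in a two-strain group gives 0.667 and 0.375. These values coincide with the corresponding minima and maxima in Table 2, which is consistent with (but does not prove) the use of these formulas.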
Table 3. P. pulmonarius phenotypic descriptive statistical table.
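| Trait | SamNum ¹ | Mean | Sd ² | Median | Min ³ | Max ⁴ | Range | CV ⁵ |
|---|---|---|---|---|---|---|---|---|
| CL | 42 | 30.75 | 5.06 | 31.36 | 20.32 | 47.85 | 27.53 | 16.45% |
| CW | 42 | 33.81 | 6.33 | 33.29 | 23.29 | 50.99 | 27.70 | 18.72% |
| SD | 42 | 13.80 | 4.01 | 13.33 | 5.50 | 25.56 | 20.07 | 29.05% |
| SL | 42 | 27.47 | 7.47 | 27.03 | 10.73 | 45.77 | 35.04 | 27.18% |
| PPG | 42 | 34.86 | 6.54 | 33 | 31 | 64 | 33 | 18.76% |
| PIH | 42 | 42.40 | 10.37 | 38 | 36 | 79 | 43 | 24.45% |
| YFB | 42 | 15.30 | 11.06 | 10.41 | 2.15 | 42.76 | 40.61 | 72.32% |

¹ SamNum: number of samples after removing missing values; ² Sd: standard deviation; ³ Min: minimum; ⁴ Max: maximum; ⁵ CV: coefficient of variation.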
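The Range and CV columns of Table 3 follow directly from the other columns: the range is the maximum minus the minimum, and the CV is the standard deviation divided by the mean. A minimal sketch of that arithmetic is shown below, using the rounded values printed in the table; the names `ROWS`, `cv_percent`, and `value_range` are illustrative, and recomputed figures can differ from the table in the last digit because the table was presumably derived from unrounded data.

```python
# (trait, mean, sd, minimum, maximum) as printed in Table 3.
ROWS = [
    ("CL", 30.75, 5.06, 20.32, 47.85),
    ("CW", 33.81, 6.33, 23.29, 50.99),
    ("SD", 13.80, 4.01, 5.50, 25.56),
    ("SL", 27.47, 7.47, 10.73, 45.77),
    ("PPG", 34.86, 6.54, 31.0, 64.0),
    ("PIH", 42.40, 10.37, 36.0, 79.0),
    ("YFB", 15.30, 11.06, 2.15, 42.76),
]


def cv_percent(mean: float, sd: float) -> float:
    """Coefficient of variation, expressed as a percentage of the mean."""
    return sd / mean * 100.0


def value_range(minimum: float, maximum: float) -> float:
    """Spread between the largest and smallest observation."""
    return maximum - minimum


if __name__ == "__main__":
    for trait, mean, sd, minimum, maximum in ROWS:
        print(
            f"{trait:>3}: range = {value_range(minimum, maximum):.2f}, "
            f"CV = {cv_percent(mean, sd):.2f}%"
        )
```

The output makes the contrast in Table 3 easy to see: yield (YFB) has by far the largest CV, while cap length (CL) has the smallest.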
Table 4. Proportion of variance explained (PVE%) and minor allele frequency (MAF) per significant SNP detected in whole population sampling.
| Trait | Scaffold | p-Value | −log10p | Allele | MAF ¹ | SNP ID, PVE ² |
|---|---|---|---|---|---|---|
| CL ³ | SJKF01000008.1 | 6.03 × 10⁻⁶ | 5.22 | A/G | 0.39 | rs_4743190.02% |
| CW | SJKF01000016.1 | 3.92 × 10⁻⁶ | 5.41 | A/C | 0.41 | rs_33815740.45% |
| CW | SJKF01000008.1 | 3.36 × 10⁻⁶ | 5.47 | A/G | 0.39 | rs_4743190.38% |
| CW | SJKF01000011.1 | 6.11 × 10⁻⁶ | 5.21 | T/C | 0.22 | rs_9496380.30% |
| CW | SJKF01000010.1 | 2.40 × 10⁻⁶ | 5.62 | C/T | 0.41 | rs_19341080.43% |
| CW | SJKF01000006.1 | 3.44 × 10⁻⁷ | 6.46 | T/A | 0.17 | rs_307173547.30% |
| CW | SJKF01000003.1 | 8.07 × 10⁻⁶ | 5.09 | T/C | 0.2 | rs_364324248.76% |
| SD | SJKF01000005.1 | 2.34 × 10⁻⁶ | 5.63 | A/G | 0.21 | rs_31454914.29% |
| SL | SJKF01000013.1 | 7.60 × 10⁻⁶ | 5.12 | G/T | 0.28 | rs_45947619.54% |
| SL | SJKF01000007.1 | 3.04 × 10⁻⁶ | 5.52 | G/A | 0.13 | rs_49903128.29% |
| SL | SJKF01000004.1 | 1.14 × 10⁻⁶ | 5.94 | C/T | 0.12 | rs_131032620.78% |
| SL | SJKF01000004.1 | 7.99 × 10⁻⁶ | 5.1 | T/C | 0.43 | rs_22246994.55% |
| SL | SJKF01000005.1 | 9.00 × 10⁻⁶ | 5.05 | G/A | 0.27 | rs_228532632.72% |
| YFB | SJKF01000008.1 | 7.24 × 10⁻⁶ | 5.14 | T/G | 0.17 | rs_8854850.64% |
| YFB | SJKF01000002.1 | 5.17 × 10⁻⁶ | 5.29 | A/C | 0.27 | rs_465172565.72% |

¹ MAF: minor allele frequency; ² PVE: phenotypic variance explained; ³ CL: cap length; CW: cap width; SD: stipe diameter; SL: stipe length; YFB: yield of fruiting body per bottle.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Li, Q.; Ying, X.; Yang, Y.; Gao, W. Genetic Diversity and Genome-Wide Association Study of Pleurotus pulmonarius Germplasm. Agriculture 2024, 14, 2023. https://doi.org/10.3390/agriculture14112023
