1. Introduction
Maize (Zea mays L.) is considered one of the most important and oldest cultivated plant species. One theory suggests that the most likely ancestor of maize is teosinte (Zea mays ssp. parviglumis). Genetic loci such as tb1 (teosinte branched 1) and tga1 (teosinte glume architecture 1) have played a crucial role in the transformation of teosinte into modern maize [1,2,3]. Currently, maize, along with wheat and rice, is among the most economically important cereal crops [4]. The range of maize cultivation is very wide, spanning from 50° N to 40° S. Maize for grain is grown on approximately 197 million hectares worldwide, which places it second only to wheat in cultivated area. In comparison, the area under wheat cultivation is 216 million hectares, while rice is grown on 165 million hectares [5]. The annual world maize grain production currently amounts to 1137 million tons, significantly exceeding the production of rice and wheat [5]. Over the past quarter century, maize production has more than doubled, driven by both significant yield increases and expansion of the cultivated area [6].
This intensive increase in maize yields would not have been possible without biological advancements. This progress can be described as an ecological method of agricultural intensification, involving the genetic enhancement of plants [7]. The search for new genes of economic significance is a task for modern plant breeding, including resistance breeding.
Currently, the priorities for maize breeding involve obtaining varieties with higher utility value, including increased yield and improved nutritional, feed, and technological quality of the harvested crop [8]. It is also important to enhance plant resistance to both biotic and abiotic stresses [9]. Biotic stresses include infection by pathogens, among them the fungal pathogen that causes maize smut. This disease affects all aerial parts of maize [10] and is caused by the fungus Ustilago maydis. The source of infection is spores, which overwinter in or on the soil [11,12]. Maize smut occurs almost everywhere maize is grown (e.g., USA, China, Poland). However, the fungus thrives best in dry and warm conditions (between 26 °C and 39 °C). This disease inhibits maize growth and reduces yields, leading to significant economic losses [13]. The most characteristic symptom is the presence of galls. Large galls are typically observed on stems and ears, whereas much smaller tumors are present on leaves and tassels. Chlorosis (yellowing of plant tissue) may also occur. In addition, an infected maize ear has altered nutritional value [14]: it contains significantly more protein, and in particular higher levels of the amino acid lysine [15]. Unfortunately, such maize is unsuitable for sale because most of the kernels undergo complete degradation. In some countries, smut is used for vaccine production [16]. Resistance breeding is conducted to reduce the incidence of fungal diseases, and it relies on a broad range of molecular genetics techniques, primarily applied in two areas. The first area involves selection decisions based on DNA nucleotide sequence analysis, while the second focuses on increasing genetic diversity in breeding populations through genetic modifications [17,18]. This creates attractive prospects for achieving biological progress and opens new possibilities for the utilization of maize as well as other crops [19].
The introduction of molecular tools and rapid advances in next-generation sequencing (NGS) have enabled the sequencing of the genomes of many crop species, including maize. The most common NGS techniques to date include 454 pyrosequencing [20], Solexa technology (Illumina, San Diego, CA, USA), the SOLiD platform (Life Technologies Corporation, Carlsbad, CA, USA), Polonator (Harvard University, Cambridge, MA, USA), and the HeliScope Single Molecule Sequencer (Helicos BioSciences, Cambridge, MA, USA). These technologies provide cost-effective genome-wide sequencing and support applications such as chromatin immunoprecipitation, mutation mapping, polymorphism detection, and non-coding RNA sequence detection [21]. Sequencing methods such as restriction site-associated DNA (RAD) [22], multiplexed shotgun genotyping (MSG) [23], and bulk segregant RNA sequencing (BSR-Seq) allow for the identification of a large number of markers and a detailed examination of many loci in a small number of samples. Methods built on the Illumina platform led to the development of genotyping-by-sequencing (GBS) [24] and diversity arrays technology sequencing (DArTseq) [25]. The DArTseq technology was used in the present study to identify candidate genes associated with maize resistance to smut.
The DArT platform offers analyses based on the NGS-DArTseq technology [26,27]. DArTseq analysis generates two datasets: the first contains dominant markers (SilicoDArT), while the second includes codominant markers with identified single-nucleotide polymorphisms (SNPs). At least three times as many dominant markers can be obtained using DArTseq compared to the conventional DArT method [28].
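As a rough illustration of the two DArTseq datasets described above, the following sketch encodes dominant markers as presence/absence scores and codominant SNP markers as allele dosages, and computes simple quality summaries for each. The marker names, the scores, and the 0/1/2 dosage coding are hypothetical choices made for this example; they are not the DArT platform's file format or the data of the present study.

```python
from typing import List, Optional

Scores = List[Optional[int]]  # None marks a missing call

# Hypothetical scores for four genotypes.
# Dominant (SilicoDArT-like) markers: 1 = fragment present, 0 = fragment absent.
silico = {
    "silico_A": [1, 0, 1, None],
    "silico_B": [0, 0, 1, 1],
}
# Codominant SNP markers: 0 = reference homozygote, 1 = heterozygote,
# 2 = alternate homozygote (an assumed coding, chosen for illustration).
snp = {
    "snp_A": [0, 2, 1, 2],
    "snp_B": [2, 2, None, 2],
}


def call_rate(scores: Scores) -> float:
    """Fraction of genotypes with a non-missing call."""
    return sum(s is not None for s in scores) / len(scores)


def presence_frequency(scores: Scores) -> float:
    """Share of called genotypes in which a dominant marker is present."""
    called = [s for s in scores if s is not None]
    return sum(called) / len(called)


def minor_allele_frequency(dosages: Scores) -> float:
    """Minor allele frequency of a biallelic SNP from 0/1/2 dosages."""
    called = [d for d in dosages if d is not None]
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


if __name__ == "__main__":
    for name, scores in silico.items():
        print(name, call_rate(scores), presence_frequency(scores))
    for name, dosages in snp.items():
        print(name, call_rate(dosages), minor_allele_frequency(dosages))
```

A dominant marker cannot distinguish heterozygotes from homozygotes carrying the fragment, which is why only a presence frequency is computed for it, whereas the codominant coding supports an allele frequency.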
These technologies allow for the identification of genomic regions associated with various phenotypic traits, including disease resistance, which is crucial for characterizing and manipulating these regions. The emergence of new genome sequencing technologies, along with novel computational methods, has also led to the sequencing of the maize reference genome. The extensive genotypic data obtained through NGS can be used for association mapping. Genome-wide association studies (GWASs) have thus become a powerful methodology for investigating genetic variation and identifying associations between traits and underlying genetic variability using historical recombination events [29]. Association mapping involves searching for genotype–phenotype correlations in unrelated individuals using dedicated statistical methods [30,31,32]. It can yield high-quality markers for marker-assisted selection (MAS) and allows specific markers to be identified within a broad spectrum of genetic resources. Functional markers closely associated with a trait reflect gene polymorphisms that directly cause phenotypic variability. The potential of association mapping arises from the higher resolution that can be achieved by exploiting the many recombination events accumulated over the history of germplasm development [33].
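The idea of searching for genotype–phenotype correlations can be sketched with a single-marker test. The example below is a minimal sketch under stated assumptions, not the statistical model used in this study: it measures the squared correlation between marker scores and a trait and obtains a p-value by permutation. The data are hypothetical, and the test ignores population structure and kinship, which real association analyses must model.

```python
import math
import random
from statistics import mean


def r_squared(marker, trait):
    """Squared Pearson correlation between marker scores and trait values."""
    mx, my = mean(marker), mean(trait)
    sxy = sum((x - mx) * (y - my) for x, y in zip(marker, trait))
    sxx = sum((x - mx) ** 2 for x in marker)
    syy = sum((y - my) ** 2 for y in trait)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker or constant trait carries no signal
    return sxy * sxy / (sxx * syy)


def permutation_p_value(marker, trait, n_perm=2000, seed=1):
    """Share of trait shufflings whose r^2 is at least the observed r^2."""
    rng = random.Random(seed)
    observed = r_squared(marker, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so ties are not lost to floating-point rounding.
        if r_squared(marker, shuffled) >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


# Hypothetical data: 0/1 marker scores and smut infection (%) for 12 genotypes.
marker = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
infection = [1.0, 0.0, 2.5, 1.5, 0.5, 2.0, 9.0, 12.5, 8.0, 11.0, 10.5, 7.5]

p = permutation_p_value(marker, infection)
minus_log10_p = -math.log10(p)

if __name__ == "__main__":
    print(f"r^2 = {r_squared(marker, infection):.3f}")
    print(f"p = {p:.4f}, -log10(p) = {minus_log10_p:.2f}")
```

A permutation test is used here only because it needs no distributional assumptions and no external libraries; dedicated association-mapping software typically fits mixed models instead.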
For several years, maize breeding has been supported globally by useful molecular markers, significantly impacting yield increases not only in the USA but also in other countries. This offers tremendous potential for enhancing the productivity and value of maize germplasm [34,35]. Maize, like barley and rice, is one of the most thoroughly studied cereal species in terms of its genetics. It contains over 32,000 genes on ten chromosomes, with a genome size of 2.3 Gb. A hallmark of the maize genome is its high polymorphism. Many loci have several active alleles, and repeated DNA sequences, which include a significant proportion of retrotransposons and transposons, make up approximately 58% of the genome. Gene-coding regions account for only 7.5% of the entire maize genome [36].
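To put these proportions into absolute terms, the short calculation below converts the percentages quoted above into base pairs. It assumes a decimal gigabase (1 Gb = 10^9 bp) and uses exactly 32,000 genes as a round lower figure, so the per-gene value is an upper bound rather than a measured quantity.

```python
GENOME_BP = 2_300_000_000   # 2.3 Gb, as quoted in the text
GENE_COUNT = 32_000         # "over 32,000 genes": treated here as a lower bound

# Integer arithmetic keeps the results exact.
coding_bp = GENOME_BP * 75 // 1000      # 7.5% of the genome
repeat_bp = GENOME_BP * 58 // 100       # approximately 58% of the genome

# Upper bound on mean coding sequence per gene, since the gene count
# is at least GENE_COUNT.
mean_coding_bp_per_gene = coding_bp / GENE_COUNT

if __name__ == "__main__":
    print(f"coding regions:  {coding_bp / 1e6:.1f} Mb")
    print(f"repeated DNA:    {repeat_bp / 1e6:.1f} Mb")
    print(f"coding per gene: at most {mean_coding_bp_per_gene:.0f} bp on average")
```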
Recent literature reports indicate that corn smut remains a threat in various parts of the world and causes severe yield decreases. Ramazanov et al. [37] reported that in Azerbaijan the infection of corn plants by smut caused yield losses of 43.19% in 2022 and 60.08% in 2023. Converted into income losses, these amounted to 64.55% in 2022 and 90.99% in 2023. In Hungary, similar losses due to the infection of corn hybrids by smut were recorded by Radocz et al. [38]. Similar studies were published in 2020 in the United States by Muller et al. [39], who estimated annual corn yield losses caused by diseases in 2016–2019 in 26 states; the estimated loss per hectare was USD 138.13. A study conducted in the Antalya region of Turkey found that the yield loss due to smut infestation of maize ranged from 20.70% to 45.50% depending on the variety [40]. Another study in Turkey recorded yield losses of 23.10–41.40% due to smut infestation of maize plantations, again depending on the variety [41]. In view of the above, research on maize resistance to smut is fully justified.
It can be hypothesized that the use of the latest molecular biology techniques to identify genes for smut resistance will reduce the time and costs required to breed new resistant maize varieties. Furthermore, elucidating the role of these genes in the immune response will allow for the development of a sustainable and cost-effective control strategy for Ustilago maydis in maize crops.
Therefore, the aim of this study was to identify molecular markers (SilicoDArT and SNP) linked to candidate genes responsible for maize smut resistance, using next-generation sequencing, association mapping, and physical mapping. Identifying specific markers and characterizing the associated candidate genes related to maize resistance to smut will greatly improve the process of breeding new resistant varieties.
3. Discussion
Since the mid-1990s, many research centers worldwide have been conducting intensive studies on the structure and function of the maize genome using advanced biotechnological and molecular biology techniques. As a result of extensive breeding experiments, phenotypic observations, and genetic analyses, many quantitative trait loci (QTLs) associated with traits such as yield and resistance to abiotic and biotic stresses have been identified. The priority for all breeders is to obtain high-yielding and disease-resistant maize varieties [42]. The present study analyzed maize genotypes, both phenotypically and genotypically, to identify molecular markers linked to candidate genes responsible for maize smut resistance. This disease is caused by Ustilago maydis, a plant pathogenic fungus that causes tumors on all aerial parts of its host, maize (Zea mays). The formation of these prominent symptoms is associated with a comprehensive reprogramming of the host’s physiology, cell morphology, and organ development [42]. An important characteristic of U. maydis relevant for its development as a model system in fungal cell biology is its biphasic life cycle. The fungus initially grows as a saprophytic haploid yeast. Upon encountering an appropriate host surface, the perception of a compatible pheromone signal induces filament formation, leading to the fusion of two compatible cells [43]. The resulting dikaryon represents the pathogenic stage of U. maydis and grows strictly in a filamentous form [44]. The ability to induce filamentation and penetration structures (appressoria) in vitro was instrumental for establishing U. maydis as a model system in fungal cell biology [45]. U. maydis was among the first plant fungal pathogens whose genome was sequenced.
In this study, we focused on phenotypic observations of the extent of smut caused by U. maydis infection in 122 maize genotypes. Observations were conducted in two locations: Smolice and Kobierzyce. Maize smut infection was observed significantly more often in Kobierzyce than in Smolice. The large number of completely resistant plants in the experiment conducted in Smolice resulted in higher skewness, kurtosis, and coefficients of variation than in the experiment conducted in Kobierzyce. The broad-sense heritability for maize smut estimated across environments was 59.7%. Higher smut infection rates in Kobierzyce were influenced, among other factors, by environmental conditions, as this location had higher temperatures and precipitation than Smolice. An analysis of variance indicated that the main effects of genotype and location, as well as the genotype × location interaction, were significant for maize smut. According to Juroszek and von Tiedemann [46], high temperatures and high humidity increase maize infection by Ustilago maydis.
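The effect of many completely resistant plants on the distribution statistics mentioned above can be illustrated with a small sketch. The two samples below are hypothetical infection scores, not the trial data, and the heritability function uses a commonly applied entry-mean formula, H² = σ²G / (σ²G + σ²GE/l + σ²e/(l·r)), which is an assumption here and not necessarily the estimator used in this study.

```python
from statistics import mean, stdev


def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    m = mean(values)
    return sum((v - m) ** k for v in values) / len(values)


def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def entry_mean_heritability(var_g, var_ge, var_e, n_locations, n_reps):
    """Broad-sense heritability on an entry-mean basis (assumed formula)."""
    return var_g / (var_g + var_ge / n_locations + var_e / (n_locations * n_reps))


# Hypothetical infection scores: one site dominated by zeros (fully resistant
# plants), one site with infection spread across genotypes.
site_many_zeros = [0, 0, 0, 0, 0, 0, 0, 1, 2, 10]
site_spread = [2, 4, 5, 6, 7, 8, 9, 10, 12, 14]

if __name__ == "__main__":
    for name, data in [("many zeros", site_many_zeros), ("spread", site_spread)]:
        print(name,
              round(skewness(data), 2),
              round(excess_kurtosis(data), 2),
              round(coefficient_of_variation(data), 1))
    # Hypothetical variance components, two locations, four replicates.
    print(round(entry_mean_heritability(6.0, 4.0, 8.0, 2, 4), 3))
```

In the zero-dominated sample, a few infected genotypes form a long right tail, which raises skewness and kurtosis, while the low mean inflates the coefficient of variation.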
Due to genotype–environment interactions, phenotypic analysis alone may be insufficient to identify maize genotypes resistant to smut. Soto et al. [47] argued that in the era of technological advancement, traditional methods used in breeding are insufficient. In response to this challenge, contemporary breeding programs employ high-throughput plant genome analysis techniques to develop improved varieties, including those of maize [48]. This genomics-oriented approach provides information about coding regions (genes), which reveal details about protein structure, as well as about intergenic regions [49]. With advancements in high-throughput DNA sequencing methods, which enable the sequencing of entire genomes and transcriptomes, a new level of research quality has emerged for many plant species, including maize [50,51]. The introduction of next-generation sequencing (NGS) methods has enabled the discovery of nucleotide sequences in plants other than model organisms with small genomes, such as Arabidopsis thaliana. In recent years, many researchers have attempted to identify molecular markers functionally associated with important traits in maize. Bocianowski et al. [52] used NGS technology and association mapping to identify markers associated with the heterosis effect in maize. Using the same methods, Sobiech et al. [53] identified markers associated with maize plant resistance to Fusarium.
In our research, out of the 61 selected markers, 10 were highly statistically significant (LOD > 2.3) and showed a significant association with plant resistance to maize smut in both locations (Smolice and Kobierzyce). The locations of the selected markers were determined using BLAST, and the associated candidate genes were identified. Among the 10 identified markers, 3 SilicoDArT (24016548, 2504588, 4578578) and 3 SNP (4779579, 2467511, 4584208) markers were localized within genes. Three of these six genes encode well-characterized proteins that might play a role in the resistance response to maize smut infection: (1) ATPase family AAA domain-containing 3 (ATAD3) proteins; (2) enhanced downy mildew 2 (EDM2); and (3) lutein deficient 5, chloroplastic (CYP97A3). Additionally, SNP 4779579, linked to the enhanced downy mildew 2 gene, distinguished between resistant and susceptible genotypes. A characteristic product of 559 bp was present in all maize smut-resistant genotypes under field conditions.
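The selection step described above, keeping only markers that exceed the LOD threshold in both locations, can be sketched as follows. The sketch assumes that LOD is computed as −log10 of the association p-value, which the text does not state explicitly; the marker names and p-values are hypothetical.

```python
import math

LOD_THRESHOLD = 2.3  # threshold quoted in the text


def lod(p_value: float) -> float:
    """LOD score under the assumption LOD = -log10(p)."""
    return -math.log10(p_value)


# Hypothetical association p-values per marker and location.
results = {
    "marker_1": {"Smolice": 0.0004, "Kobierzyce": 0.0020},
    "marker_2": {"Smolice": 0.0300, "Kobierzyce": 0.0010},
    "marker_3": {"Smolice": 0.0049, "Kobierzyce": 0.0051},
}


def significant_everywhere(per_location, threshold=LOD_THRESHOLD):
    """True only if the marker passes the threshold in every location."""
    return all(lod(p) > threshold for p in per_location.values())


selected = [m for m, locs in results.items() if significant_everywhere(locs)]

if __name__ == "__main__":
    for marker, locs in results.items():
        scores = {loc: round(lod(p), 2) for loc, p in locs.items()}
        print(marker, scores)
    print("selected:", selected)
```

Under this assumption, LOD > 2.3 corresponds to p below roughly 0.005, so marker_3 passes in one location and narrowly fails in the other, which shows why requiring significance in both locations is a stricter filter than a single-site test.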
According to Gordon [54], ATAD3 (ATPase family AAA domain-containing protein 3) proteins are newly discovered mitochondrial membrane proteins in Arabidopsis thaliana. Studies in metazoans have indicated that ATAD3A localizes to mitochondria–ER contact sites and is involved in a variety of processes required for proper mitochondrial function. However, the role of ATAD3 proteins in plants is less well defined. ATAD3 proteins in A. thaliana underwent two gene duplication events, resulting in two clades, both of which are required for plant viability. Research conducted by Zelman [55] indicated that the activity of ATPase family AAA domain-containing protein 3 is linked to plant responses to abiotic stress. Minsoo [56] identified three homologous ATAD3 proteins, involved in mitochondrial nucleoid organization, as interacting with suppressor of hot1-4 1 (SHOT1). Importantly, disrupting ATAD3 function leads to impaired nucleoids, a decreased accumulation of complex I, and improved heat tolerance. These proteins therefore influence plant tolerance to abiotic stresses such as high temperature and may also play a significant role in regulating the plant immune response to biotic stress.
Regarding the second gene significantly linked to maize smut resistance, EDM2, McDowell [57] highlighted its multifaceted role beyond the immune response. Mutation of EDM2 also causes pleiotropic effects, influencing flowering time and leaf cellular development, which indicates a broad regulatory function [58,59]. The first clues to the molecular function of EDM2 were provided by its protein sequence, which contains a nuclear localization signal, a methyltransferase domain, and plant homeodomain (PHD) fingers associated with epigenetic regulation [60]. Further evidence of an epigenetic role emerged from a yeast two-hybrid screen, which identified interactions between EDM2 and a small family of chromatin remodeling factors [61].
The third important gene associated with maize smut resistance is lutein deficient 5, chloroplastic (CYP97A3). As reported by Niu et al. [62], this gene encodes a cytochrome P450 enzyme. The CYP97A3 enzyme, together with CYP97C1, catalyzes hydroxylation of the β- and ε-rings of α-carotene to produce lutein. Lutein, a dihydroxy derivative of α-carotene (β,ε-carotene), is the most abundant carotenoid in photosynthetic plant tissues, where it plays important roles in the structure and function of light-harvesting complex II.
Because corn smut continues to cause severe grain yield losses [63,64,65], scientists are increasingly undertaking research on the genetic background of resistance to this disease [66]. Zou et al. [64] showed that the phytohormone methyl jasmonate (MeJA) can induce plant defense against microbial pathogens, including Ustilago maydis. Other authors [67] consider salicylic acid (SA) and jasmonic acid (JA) to be important defense hormones. Fungal pathogens can activate defense responses associated with JA. Moreover, as a plant hormone, SA can interact with various hormone-related signaling pathways to activate the immune response and disease resistance of plants [68]. Therefore, it is likely that the gene encoding lutein deficient 5, chloroplastic, identified in this study, is involved in the immune response to stress induced by Ustilago maydis, because its product participates in the synthesis of lutein, which, like salicylic acid and jasmonic acid, has a strong antioxidant effect.
The association mapping used in this study has proven to be a promising approach compared to traditional mapping. It enabled the identification of candidate genes associated with maize resistance to smut. The literature reports discussed above suggest that three of these genes (ATAD3, EDM2, and CYP97A3) could be involved in the resistance response of maize to smut infection. To date, two main types of association mapping have been characterized: genome-wide association mapping (GWAM) and candidate gene association mapping (CGAM). The GWAM approach analyzes genetic variability across the entire genome to identify association signals for various complex traits, while CGAM correlates DNA polymorphisms in selected candidate genes with the trait of interest [69,70]. There are many examples of successful applications of association analysis in cereals, particularly in maize. Recently, GWAM has become a powerful tool for analyzing the genetic architecture of complex traits in various crop species [71]. Initially, association mapping performed in maize [72] did not consider population structure. This shortcoming was addressed by Pritchard, who included population structure in his study on maize in 2001 [73].
The current study demonstrated the utility of field, molecular, bioinformatics, and statistical analyses for identifying candidate genes associated with maize smut resistance. Moreover, methods for identifying candidate genes that could be used for selecting genotypes with desirable traits were proposed. This approach will enable cost savings compared to traditional methods of developing maize varieties.