1. Introduction
The shikimate pathway is used by bacteria, archaea, fungi, algae, some protozoans, and plants for the biosynthesis of the aromatic amino acids phenylalanine (Phe), tyrosine (Tyr), and tryptophan (Trp), as well as a vast array of phenolic-derived compounds [1]. This pathway exhibits one of the largest metabolic fluxes in the biosphere, and in plants, it is estimated that this route channels up to 30% of the photosynthetically fixed C [2]. Starting from the central C metabolites phosphoenolpyruvate and erythrose-4-phosphate within plastids, the shikimate pathway results in the synthesis of chorismate, the last common precursor for the three aromatic amino acids, as well as other compounds of paramount importance, such as vitamins K1 and B9. Chorismate is mainly used as a substrate by the enzyme anthranilate synthase (AS; EC 4.1.3.27) for the synthesis of Trp, or by the enzyme chorismate mutase (CM; EC 5.4.99.5) for the synthesis of prephenate, the common precursor of Phe and Tyr. The alternative utilization of chorismate by these two enzymes is finely regulated by intracellular levels of aromatic amino acids and, in some cases, by other amino acids or downstream metabolites. In plants, Trp generally functions as an inhibitor of AS and an activator of CM [3,4]. Conversely, Phe and Tyr function as inhibitors of CM activity and activators of AS [5]. Additionally, a characteristic feature of angiosperms is the existence of cytosolic CMs involved in the extra-plastidial synthesis of Phe, which are insensitive to regulation by amino acids [4,6].
Previous works have addressed the phylogenetic distribution of CMs, showing the existence of several clades that partially correlate with their allosteric regulation. Briefly, these studies establish the existence of a distinct group that includes CMs from green algae, a clade corresponding to the cytosolic enzymes from angiosperms, and another clade containing the angiosperm plastidic isoforms [5,7]. Finally, an additional clade encompassing CMs from basal plant lineages, such as lycophytes or mosses, was also reported [5]. None of the aforementioned analyses included CMs from conifers or other gymnosperms, likely due to the delayed completion of sequencing projects for these plant species, which have extremely complex and repetitive megagenomes.
Parallel studies have also explored structural aspects of plant CMs, focusing particularly on the residues involved in the binding of amino acid effectors. For instance, Westfall et al. (2014) [3] used 3D modeling and site-directed mutagenesis to demonstrate that in AtCM1, Gly213 is essential for allosteric regulation, while Gly149 contributes to determining the specificity of the effector amino acids. The importance of these analyses is highlighted by the conservation of Gly213 in CMs within the phylogenetic clade encompassing plastidial regulated CMs, in contrast with its absence in the clade of cytosolic CMs, which are insensitive to inhibition by Phe or Tyr.
Conifers, owing to their significant vertical growth and exposure to multiple mechanical stresses, require substantial lignin synthesis and, therefore, large amounts of its precursor, Phe. In previous work, we studied the biochemistry and molecular regulation of this crucial process, using Pinus pinaster (maritime pine) as a model species. In particular, we extensively examined the enzymes prephenate aminotransferase and arogenate dehydratase, which are responsible for synthesizing Phe from prephenate, the product of the reaction catalyzed by CM [8,9,10,11]. Additionally, we contributed to elucidating a regulation model of the Phe synthesis pathway, involving MYB- and NAC-type transcription factors [12,13,14].
The overarching objective of this work was to investigate the role of the CM enzyme in controlling the metabolic flux of phenolic compounds in conifers, a plant group that is reliant on substantial synthesis of these compounds. In this regard, here we have analyzed the CM family in maritime pine, comprising two members with plastidial localization, reinforcing the absence of a cytosolic Phe synthesis pathway in conifers. Furthermore, we investigated the biochemical regulation of these enzymes and elucidated how this metabolic step correlates at the transcriptional level with other key steps of this pathway.
2. Methods
2.1. Growth Conditions
Nicotiana benthamiana seeds were planted and allowed to grow for 6 weeks. Initially, seeds were planted in pots for 1–2 weeks (depending on the size), then each seedling was separated into individual pots for 3–4 weeks. N. benthamiana seedlings were grown at 25 °C in a plant chamber with a long-day photoperiod (16 h of light and 8 h of darkness).
Seeds from maritime pine were soaked in distilled water for 24 h with continuous aeration, then germinated in a plastic tray filled with vermiculite as the substrate. The seedlings were grown in a controlled environment growth room, maintained at 25 °C and 50/70% relative humidity, with a 16/8 h photoperiod, and were watered twice weekly with distilled water.
2.2. Maximum Likelihood Phylogeny Analysis
CM phylogenetic analysis was conducted with 135 sequences corresponding to multiple species from different plant clades and green algae. The CM from Saccharomyces cerevisiae was used as an outgroup to root the tree. Sequences were aligned using MUSCLE [15], and tree topology was inferred by maximum likelihood with PhyML [16,17]. Bootstrapping was performed with 1000 replicates. The phylogenetic trees were drawn using MEGA 11 [18].
2.3. cDNA Cloning
Full-length cDNAs of PpCM1 and PpCM2 were obtained from total RNA of Pinus pinaster seedlings through reverse transcription-PCR, using iScript Reverse Transcription Supermix (Bio-Rad, Hercules, CA, USA) and the corresponding primer pairs listed in Table S1. Subsequently, the obtained cDNAs were subcloned into the pJET1.2 vector (Thermo Fisher Scientific, Waltham, MA, USA). Sequenced full-length cDNAs were PCR amplified using forward and reverse oligonucleotides that featured attB1 and attB2 sites at their 5′ and 3′ ends. These primer pairs are also listed in Table S1. The resulting products were cloned into the pDONR207 vector, and then into the pDEST17 vector, using the BP Clonase® II and LR Clonase® II enzyme mixes (Thermo Fisher Scientific, Waltham, MA, USA), respectively.
2.4. RNA Extraction, Reverse Transcription, and Quantitative Real-Time PCR
2.4.1. RNA Extraction
Samples of developing xylem from compression and opposite wood were obtained as described by Villalobos et al., 2012 [19]. Samples were processed to extract total RNA in accordance with the procedure by Canales et al., 2012 [20]. Approximately 100 mg of each powdered sample was used in the extraction. To eliminate any genomic DNA contaminants from the RNA samples, 5 units of RQ1 RNase-Free DNase (Promega, Madison, WI, USA) were applied for 25 min at 37 °C. The purity of the total RNA, as indicated by the 260/280 and 260/230 ratios, was assessed using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), which was also used to quantify the samples. Furthermore, the quality of the total RNA was evaluated through agarose gel electrophoresis.
2.4.2. cDNA Synthesis and qPCR Analyses
For cDNA synthesis, 0.5 μg of total RNA was used with the qPCRBIO cDNA synthesis kit (PCR Biosystems, London, UK), in accordance with the manufacturer’s guidelines. The qPCR analysis used the specific primers listed in Table S1. The qPCR reaction mixture was prepared as follows: 5 μL of 2× SsoFast™ EvaGreen® Supermix (Bio-Rad, Hercules, CA, USA), 0.5 pmol of the forward primer, 0.5 pmol of the reverse primer, and 10 ng of the amplified cDNA. The qPCR reactions were conducted in a CFX384 thermal cycler (Bio-Rad, Hercules, CA, USA) under the following conditions: an initial denaturation step of 3 min at 95 °C (1 cycle), followed by 50 cycles of 1 s denaturation at 95 °C and 5 s annealing/extension at 60 °C. A subsequent melting curve analysis was performed over a temperature range from 60 °C to 95 °C.
The raw fluorescence data from each reaction were analyzed using the MAK2 model, which does not assume a specific amplification efficiency for qPCR assays, as described by Boggy and Woolf, 2008 [21]. The initial target concentration (D0 parameter) for each gene was determined using the MAK2 model and the qpcR package in the R environment, as outlined by Ritz and Spiess, 2008 [22]. Subsequently, these values were normalized to two reference genes, PpActin and PpEF1 (Table S1).
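To illustrate the quantification step, the following minimal Python sketch (not the R workflow used in this study) simulates an amplification curve with the MAK2 recursion of Boggy and Woolf, D_n = D_{n−1} + k·ln(1 + D_{n−1}/k), and normalizes a target D0 to reference genes. The function names and the example values are hypothetical, and the use of the geometric mean of the two reference D0 values is an assumption, since the normalization formula is not specified above.

```python
import math


def mak2_curve(d0, k, cycles, background=0.0):
    """Simulate fluorescence per cycle with the MAK2 recursion.

    d0: initial target amount (the D0 parameter), k: MAK2 rate constant,
    background: constant fluorescence offset (hypothetical parameter).
    """
    d = d0
    curve = []
    for _ in range(cycles):
        # MAK2: D_n = D_(n-1) + k * ln(1 + D_(n-1) / k)
        d = d + k * math.log(1.0 + d / k)
        curve.append(d + background)
    return curve


def normalize_to_references(target_d0, reference_d0s):
    """Divide a target D0 by the geometric mean of the reference D0 values.

    Assumption: geometric-mean normalization; the text only states that
    values were normalized to two reference genes.
    """
    log_mean = sum(math.log(v) for v in reference_d0s) / len(reference_d0s)
    return target_d0 / math.exp(log_mean)


if __name__ == "__main__":
    curve = mak2_curve(d0=1e-6, k=1.0, cycles=40)
    print("fluorescence after 40 cycles:", curve[-1])
    # Hypothetical D0 values for a target gene and two reference genes.
    print("relative expression:", normalize_to_references(8.0, [2.0, 8.0]))
```

When D0 is much smaller than k, the recursion approximately doubles the product in each early cycle, and amplification slows progressively as product accumulates, which is why no fixed efficiency has to be assumed.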
2.5. Transcriptome Libraries
In silico expression of PpCMs was studied using different transcriptome databases. These include the Sustainpine seedling (http://v22.popgenie.org/microdisection/, accessed on 15 April 2024) [23] and developing xylem [19] databases.
2.6. Subcellular Localization of CMs
Pine PpCMs were cloned into the pGWB5 vector using Gateway technology to create complete PpCM proteins tagged with GFP at their C-termini, under the regulation of the CaMV 35S promoter. The resulting plasmids were introduced into Agrobacterium tumefaciens strain C58C1 via electroporation. GFP-tagged PpCM proteins were expressed by agroinfiltration of Nicotiana benthamiana leaves at an OD600 of 0.5, following established procedures [24]. In all experiments, the silencing suppressor p19 [25] was co-expressed. GFP fluorescence was assessed 48 h post-agroinfiltration using a Leica Stellaris 8 confocal microscope (Leica, Wetzlar, Germany), with excitation/emission at 488/680–700 nm for chloroplast autofluorescence and at 488/505–525 nm for GFP detection. Excitation was provided by a 488 nm argon ion laser. The expression of the corresponding proteins was confirmed by western blot.
2.7. Protein Extraction, SDS-PAGE, and Immunodetection
Total proteins were extracted from plant material using an extraction buffer (100 mM Tris buffer at pH 8.0, 10% (v/v) glycerol, 1% (w/v) sodium dodecyl sulfate (SDS), 2 mM EDTA, and 0.1% (v/v) beta-mercaptoethanol). Approximately 100 mg of frozen plant powder was reconstituted in 150 μL of the extraction buffer at room temperature. Intact chloroplast isolation was performed using a CPISO chloroplast isolation kit (Merck, Darmstadt, Germany) and a Percoll® gradient.
After centrifugation at 20,000× g for 10 min at 4 °C, 75 μL of the supernatant was collected and mixed with 25 μL of 4× Laemmli buffer, followed by denaturation at 100 °C for 5 min.
For the immunodetection of transiently expressed CMs within the protein extracts, 25 μg of the total proteins was separated using sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Western blot analysis was conducted following standard procedures. Transgenic proteins were detected by exploiting the GFP tag present in the construct, utilizing a specific commercial antibody (GFP (B-2) sc-9996, mouse monoclonal antibody, Santa Cruz Biotechnology, Santa Cruz, CA, USA) at a dilution of 1:1000.
2.8. Recombinant Expression and Purification of CM Enzymes in Escherichia coli
DNA sequences coding for both PpCM1 and PpCM2 were subcloned into the pDEST17 vector using Gateway technology. In both cases, the sequence coding for the putative chloroplast transit peptide was not included. These sequences, coding for residues 1–58 of PpCM1 and residues 1–58 of PpCM2, were determined by comparison with multiple CM sequences from plants, using the ChloroP [26] and TargetP [27] algorithms.
pDEST17 expression constructs were transformed into Escherichia coli BL21-CodonPlus-RIL® (Agilent Technologies, Santa Clara, CA, USA) and cultured in 500 mL of LB, supplemented with 100 µg of ampicillin and 34 µg of chloramphenicol, at 37 °C until A600 of 0.6 was reached. Protein expression was induced by adding a final concentration of 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) to the cultures, which were subsequently incubated for 5 h at 20 °C with gentle shaking at 100 rpm. Cells were pelleted by centrifugation at 6000× g and frozen until further purification. Poly-His-tagged recombinant proteins were purified using Protino Ni-TED 2000 Packed nickel resin columns (Macherey-Nagel, Duren, Germany) and subjected to buffer exchange to a 50 mM Tris buffer at pH 8.0 using Sephadex G-25 M resin (PD-10 Columns; GE Healthcare, Chicago, IL, USA).
2.9. Co-Immunoprecipitation Analysis
DNA sequences coding for both PpCM1 and PpCM2 were subcloned into the pGWB11 (C-terminal FLAG tag) and pGWB5 (C-terminal GFP tag) vectors, respectively, using Gateway technology. PpCM2-GFP and PpCM1-FLAG proteins were expressed through agroinfiltration in Nicotiana benthamiana leaves, as described in Section 2.6. The empty vector pGWB6 (N-terminal GFP tag) was used as a control.
Co-immunoprecipitation was performed using ChromoTek GFP-Trap® Agarose (ChromoTek, Planegg-Martinsried, Germany). Briefly, soluble proteins were extracted from 250 mg of N. benthamiana leaves in an extraction buffer containing 50 mM Tris-HCl, pH 7.5; 150 mM NaCl; 10% glycerol; 2 mM EDTA, pH 8; 1% Triton X-100; and 1% protease inhibitor cocktail P-9599 (Merck KGaA, Darmstadt, Germany). After centrifugation at 20,000× g for 15 min at 4 °C, 100 µL was saved as input for western blot analysis, and the remaining supernatants (2 mg of total protein extract) were mixed with 25 μL of GFP-Trap A beads and incubated for 2 h at 4 °C with end-over-end rocking. After incubation, the beads were washed three times with wash buffer (extraction buffer). Proteins bound to the beads were resuspended in 80 µL of 2× Laemmli sample buffer and heated at 100 °C for 10 min to dissociate immunocomplexes from the beads. Total (input), immunoprecipitated (IP), and co-immunoprecipitated (CoIP) proteins were separated by 10% SDS-PAGE and analyzed by western blot using anti-Flag (1:1000; OctA-Probe H5, sc-166355, Santa Cruz Biotechnology, Santa Cruz, CA, USA) or anti-GFP (1:1000; sc-9996, Santa Cruz Biotechnology, Santa Cruz, CA, USA) antibodies. Appropriate peroxidase-conjugated secondary antibodies were used: m-IgGk BP-HRP, sc-516102 (Santa Cruz Biotechnology, Santa Cruz, CA, USA) for anti-Flag (1:5000) and anti-mouse IgG-peroxidase, A9044 (Merck, Darmstadt, Germany) for anti-GFP (1:5000).
2.10. CM Activity Assay
CM reactions consisted of a final volume of 80 µL containing 50 mM Tris buffer pH 8, varying concentrations of chorismate, and 0.3 µg of the corresponding purified CM enzyme. CM assays were conducted in a plate reader at 30 °C, using UV-transparent 96-well plates (UV-Star®, Greiner Bio-One GmbH, Frickenhausen, Germany), by tracking the disappearance of chorismate, which results in a decrease in absorbance at 274 nm (ε = 2630 M−1 cm−1). Chorismate (C1259) was obtained from Merck KGaA (Darmstadt, Germany). To identify putative amino acid effectors, reactions were conducted as described above, containing 50 mM Tris buffer pH 8, 1 mM chorismate, and 0.3 µg of the CM enzyme, and were individually supplemented with 2 mM of each amino acid. Assays showing a minimum 1.5-fold increase or reduction compared to controls were selected for further analysis. Similarly, the effect on CM activity of decreasing concentrations of the amino acids Phe, Tyr, and Trp was determined.
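The conversion from the recorded absorbance decrease to a reaction rate follows the Beer-Lambert law, v = −(dA/dt)/(ε·l). The Python sketch below shows this calculation and the 1.5-fold screening criterion under stated assumptions: the function names are hypothetical, the optical path length of an 80 µL reaction in a 96-well plate is not given in the text and must be supplied by the user, and a "1.5-fold reduction" is interpreted here as an activity ratio of 1/1.5 or lower.

```python
EPSILON_CHORISMATE = 2630.0  # M^-1 cm^-1 at 274 nm, value given in the text


def specific_activity(slope_abs_per_min, path_cm, volume_l=80e-6, enzyme_ug=0.3):
    """Return chorismate consumption in nmol min^-1 per ug of enzyme.

    slope_abs_per_min: slope of A274 versus time (negative when chorismate
    is consumed). path_cm: optical path length (assumption: plate- and
    volume-dependent, not stated in the text).
    """
    # Beer-Lambert: concentration change per minute, in mol L^-1 min^-1
    conc_rate = -slope_abs_per_min / (EPSILON_CHORISMATE * path_cm)
    nmol_per_min = conc_rate * volume_l * 1e9
    return nmol_per_min / enzyme_ug


def passes_effector_screen(activity, control_activity, fold=1.5):
    """True if activity differs from the control by at least `fold`.

    Assumption: a reduction counts when the ratio is <= 1 / fold.
    """
    ratio = activity / control_activity
    return ratio >= fold or ratio <= 1.0 / fold


if __name__ == "__main__":
    # Hypothetical reading: A274 falls by 0.0263 per minute, 1 cm path length.
    print(specific_activity(-0.0263, path_cm=1.0))
    print(passes_effector_screen(activity=3.0, control_activity=2.0))
```

Keeping the path length as an explicit argument avoids silently assuming 1 cm, which would overestimate or underestimate the rate for a partially filled well.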
4. Discussion
In our previous studies, we analyzed the regulation of the enzyme prephenate aminotransferase (PAT), which converts prephenate into arogenate, and the enzyme arogenate dehydratase (ADT), which catalyzes the synthesis of Phe from arogenate, in Pinus pinaster [8,10,11,14]. This species serves as a well-accepted model for conifers, a group of plants with a complex secondary metabolism requiring a high rate of synthesis of compounds derived from aromatic amino acids to maintain their growth and development.
Here, we investigate the regulation of the enzyme CM in the same species, with the aim of obtaining a comprehensive understanding of the metabolic regulation of Phe synthesis starting from chorismate. The CM reaction is an essential step that determines the metabolic flux towards the synthesis of Phe and Tyr, as well as multiple derived compounds. This pathway competes with the alternative use of chorismate towards other important metabolic destinations, including the synthesis of Trp and other compounds, such as vitamins K1 and B9 [5]. The alternative channeling of chorismate towards different metabolic destinations represents a branching point, regulated through positive and negative feedback loops, which has been extensively studied in plant CMs, as well as in the enzymes of microorganisms, including bacteria and yeast [5,31].
In general, CM enzymes are inhibited by the amino acids derived from prephenate, Phe and Tyr, and activated by Trp. In contrast, the enzyme anthranilate synthase, which catalyzes the first step of Trp synthesis from chorismate, is activated by Phe and Tyr and inhibited by Trp (reviewed by Maeda and Dudareva, 2012 [5]). Several works have delved into this general model, evidencing different types of regulation for the enzyme chorismate mutase in various plant species and cellular compartments. Notably, angiosperms possess cytosolic CM enzymes whose activity, in the cases studied, appears to be insensitive to aromatic amino acids. Regarding plastidial enzymes, alternative regulatory mechanisms exist, as illustrated by the case of Arabidopsis, in which the AtCM1 enzyme is activated by Trp and inhibited by Tyr and Phe, while the AtCM3 enzyme is activated by Trp, His, and Cys [3]. Although these processes mainly occur in chloroplasts, recent works have shown that in some plant groups, the synthesis of aromatic amino acids can also partially take place in the cytosol [32].
In this work, we report the existence of two coding sequences for the enzyme chorismate mutase, PpCM1 and PpCM2, in Pinus pinaster. In parallel, we also identified CM sequences in other species of conifers and gymnosperms, which were also found to contain two CM genes in their genomes. Through sequence alignments, in silico analysis, and transient expression studies of the fusion proteins PpCM1-GFP and PpCM2-GFP in Nicotiana benthamiana leaves, we concluded that both isoforms are specifically located in plastids, indicating that, unlike angiosperms, Pinus pinaster, and probably all conifers, lack cytosolic CM enzymes. This finding is of particular importance, since the cytosolic pathway for Phe synthesis described in petunia, and proposed in Arabidopsis, requires the presence of chorismate mutase activity in the cytosol, in addition to the activities of prephenate dehydratase and phenylpyruvate aminotransferase. Interestingly, the article by Qian et al. 2019 [32], which describes the aforementioned cytosolic pathway, specifically points to the non-existence of prephenate dehydratase activity in the cytosol in Pinus pinaster. Taken together, these data strongly indicate that conifers, and likely all gymnosperms, lack the capacity for cytosolic Phe synthesis. Furthermore, most ancestral land plants, such as mosses, liverworts, ferns, or lycophytes, apparently lack cytosolic CMs, with the striking exception of Selaginella moellendorffii, suggesting that the cytosolic Phe synthesis pathway is also absent in the non-seed plants which first colonized the land. The absence of this pathway also implies a lower capacity for the interconversion between Phe and Tyr in the cytosol, which has been suggested to be important for the catabolism of these amino acids [33]. However, the lineages of plants apparently lacking the cytosolic pathway coincide with those having Phe-hydroxylase activity in the cytosol, thus allowing Phe/Tyr interconversion [34].
Taken together, our work and the literature suggest that the existence of cytosolic CMs correlates with the cytosolic synthesis of Phe, a novel feature of angiosperms. To delve into the evolutionary appearance of this route, we identified and analyzed the sequences of CM enzymes from basal groups of angiosperms. In these analyses, we determined that Amborella trichopoda, the most primitive of existing angiosperms, has, like conifers, two plastidial CMs but lacks cytosolic isoforms. However, the existence of cytosolic CM enzymes in other basal angiosperms considered less primitive, such as Nymphaea colorata or Liriodendron tulipifera, allows us to delimit the evolutionary point at which the cytosolic synthesis of Phe may have arisen.
Through phylogenetic analysis, we have shown that CMs from seed plants are distributed in two clades (Figure 1). Clade-II includes allosterically regulated plastidic CMs from angiosperms and gymnosperms, such as PpCM2, as well as the complete set of CMs from ancient plant lineages such as mosses, ferns, or lycophytes. Clade-III mainly includes cytosolic isoforms of angiosperms, which lack regulation by amino acids [3], together with a group of gymnosperm plastidic CMs, including PpCM1.
We have demonstrated that PpCM1 and PpCM2 are activated by Trp and inhibited by Phe and Tyr, indicating that allosteric regulation is not exclusive to CMs from clade-II. This result is not surprising, considering that both pine CMs retain the residue Gly213, which, as described by Westfall et al. (2014) [3] in AtCM1, is decisive for conferring allosteric sensitivity. In contrast, the occurrence of deregulated enzymes has been associated with the loss of Gly213. These data suggest that the appearance of seed plants was followed by the emergence of a new group of CMs that, in the case of angiosperms, subsequently gave rise to the current deregulated cytosolic enzymes, requiring the loss of both the signal peptide for plastid targeting and the residue Gly213.
To delve deeper into this point, we checked the presence of Gly213 in the set of sequences used to build the phylogenetic tree. While in most of these sequences there is a concurrent loss of both plastidial localization and Gly213, we observed that members of the paraphyletic group of basal eudicots, represented here by Aquilegia coerulea and Eschscholzia californica, lack the plastid signal peptide but retain Gly213, likely preserving their allosteric sensitivity. The analysis of this plant group suggests that during plant evolution, some CM enzymes lost their plastid localization and, subsequently, their allosteric sensitivity. A possible interpretation is that the loss of this regulation is associated with the cytosolic localization, where competition with anthranilate synthase for the synthesis of Trp does not occur. On the other hand, the levels of Phe and/or Tyr would predictably be lower, due to lower synthesis and immediate consumption by various cytosolic routes, such as the synthesis of phenylpropanoids.
The existence of two CM isoforms in pine, as observed in other gymnosperms, suggests the existence of alternative and likely non-overlapping physiological roles. Our expression analyses show that in pine seedlings, the two isoforms present markedly different expression patterns. PpCM1 expression is preferential in cotyledon mesophyll cells, hypocotyl pith, and the vascular tissues of developing roots, while PpCM2 is preferentially expressed in vascular tissues from young needles, vascular cells of cotyledons, and hypocotyls. We also analyzed the expression of PpCM1 and PpCM2 in the highly lignified compression wood (CW) versus opposite wood (OW), which is characterized by lower levels of this polymer. Although the level of expression of both isoforms in CW is similar, the induction of PpCM2 relative to OW (5.2-fold) is much higher than that of PpCM1 (1.4-fold). The induction of PpCM2 could be associated with a higher rate of Phe synthesis to support the massive synthesis of lignin, distinctive of CW, but also of the vascular tissues of seedlings. Supporting this hypothesis, PpCM2 is more resistant to Phe inhibition than PpCM1, and retains considerable activity at high levels of Phe (30–35% at 2 mM), enabling this enzyme to generate a higher rate of Phe synthesis for lignification, consistent with the presence of two Phe-insensitive type ADTs, such as PpADT-A and PpADT-D [11], in the same tissue. This hypothesis is also supported by our observation in previous investigations that the expression of PpCM2, but not PpCM1, is significantly reduced in plants silenced for PpMYB8, a central transcription factor in the coordination of Phe and lignin synthesis and a transcriptional activator of PpADT-A and PpADT-D [14]. Furthermore, our co-immunoprecipitation analysis revealed the in vivo formation of heterodimers between PpCM1 and PpCM2, indicating a potential functional interaction between these isoforms. This finding suggests a cooperative role for these isoforms in certain physiological contexts, possibly contributing to the fine-tuning of Phe synthesis and metabolic regulation within plastids. While the formation of heterodimers among CM isoforms has not been previously reported, this discovery opens new avenues for understanding the molecular mechanisms underlying metabolic regulation in plants, particularly in the context of aromatic amino acid biosynthesis.
Although the role of the cytosolic Phe pathway in angiosperms remains unclear, its involvement in plant biotic and abiotic stress responses has been proposed [32]. The non-existence of this route in gymnosperms and other more ancestral groups of plants suggests that its evolutionary appearance would be related to specific environmental adaptation mechanisms developed later during the evolution of angiosperms.
Finally, our results demonstrate that although pine CM activation by Trp is limited (3.1-fold for PpCM1 and 1.7-fold for PpCM2), the EC50 values indicate that these enzymes show extraordinary sensitivity towards this effector, compared to previously characterized enzymes from other plants. To fully understand the importance of this aspect, future work, focused on the allosteric regulation of the pine anthranilate synthase enzyme, is necessary.