1. Introduction
Rocket is the commonly used name to refer to two different species of baby-leaf salad crops, belonging to the Brassicaceae family: Eruca sativa (L.) Cav. (2n = 22) and Diplotaxis tenuifolia (L.) DC. (2n = 22).
Eruca sativa, also referred to as cultivated or annual garden rocket, has been used for human consumption for millennia but can be a problematic weed, as happens in China [1].
D. tenuifolia, also referred to as wild or perennial rocket, is an invasive weed in Europe, the USA, and Argentina, and is particularly noxious in Australia, where some states attempt to control its spread [2].
Nevertheless, both species are edible and tasty, and the search for new healthy food products by modern consumers has promoted their worldwide production and commercialization, particularly as ready-to-eat, pre-packaged salads.
The most evident distinctive morphological traits that differentiate the two species are the white petals and simple silique of cultivated garden rocket vs. the yellow petals and septate silique of wild rocket [3] (Figure 1).
Other differences, e.g., in seed size (much smaller in wild rocket), number of allowed successive cuttings, response to abiotic factors, and other aspects that determine some specificities of their production and commercialization, were thoroughly reviewed by [4].
The two species produce glucosinolates and multiple other secondary metabolites, notably antioxidants. A comprehensive comparative analysis of the research carried out on both rocket species was performed by [5]. It covers the variability and levels of these two kinds of compounds, as well as nitrate, crude fiber, total minerals, and carbohydrates, their variation depending on environmental and storage conditions, and their effects on human health. A recent similar review, which includes the characterization of sensory properties and consumer preferences, was published by [6].
A wide range of information specifically focused on wild rocket (D. tenuifolia) biology, biochemical and nutraceutical properties, farming practices, crop protection, and industrial processing can be accessed in the review published by [7].
As previously noted by [8], the available statistical data still do not consider the two rocket species separately. Nevertheless, the general perception is that the relative production and commercialization of the rocket species are characterized by the increasing importance and share of D. tenuifolia, the production area of which attained 4800 ha in Italy [9].
As the importance of D. tenuifolia as a baby-leaf crop grows, plant breeding and germplasm conservation activities regarding this species become more pressing. Multiple companies are presently dedicated to breeding, producing, and commercializing new D. tenuifolia cultivars, although the number of collected and conserved genotypes is still low. A brief consultation of the International Minor Leafy Vegetables Database (https://ecpgr.cgn.wur.nl/LVintro/minorlv/con_spec.htm, accessed on 1 November 2022) shows that D. tenuifolia (plus Diplotaxis sp. and spp.) accounts for 90 records vs. 663 records for E. sativa (syn. E. vesicaria). Similarly, a consultation of the Genebank Information System of the IPK, Gatersleben, Germany (https://gbis.ipk-gatersleben.de/gbis2i, accessed on 1 November 2022) identifies 11 accessions of D. tenuifolia and 1 accession of Diplotaxis sp. vs. 145 accessions of E. sativa.
Among the main objectives of the collaborative project REMIRucula is the establishment of a germplasm collection of D. tenuifolia, including some accessions of E. sativa, at the Instituto Nacional de Investigação Agrária e Veterinária (INIAV), Oeiras, Lisboa, Portugal. A representative number of accessions is to be tested for their response to a specific isolate of Hyaloperonospora sp., collected from a D. tenuifolia plant naturally infected with downy mildew, assessed for their genetic variability and genetic relationships using DNA markers, and unequivocally identified at the molecular level using specific DNA markers.
The use of DNA markers in Diplotaxis studies has been relatively limited and has relied on Inter Simple Sequence Repeat (ISSR) and Random Amplified Polymorphic DNA (RAPD) markers, mainly focused on the assessment of interspecific relationships [10] and the identification of interspecific hybrids [11].
Recently, the genetic relationships among a large group of D. tenuifolia accessions of the novel germplasm collection, including some E. sativa accessions as the outgroup, were assessed using RAPD and ISSR markers [3]. These molecular analysis techniques discriminated between the accessions of both species and allowed the genetic diversity within each species and the genetic relationships among the respective accessions to be assessed.
Herein, we describe the first wild rocket (D. tenuifolia (L.) DC.) genome assembly via next generation sequencing, the retrieval of several hundred simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) loci, and the use of 14 of these markers for the unequivocal molecular characterization and identification of 90 accessions: 87 of D. tenuifolia and 3 of E. sativa, the latter used as the outgroup.
3. Discussion
In a previous assessment of the genetic relationships among a large set of accessions of the same germplasm collection using RAPD and ISSR markers, only in one case did the genetic similarity (Dice coefficient) between two Diplotaxis accessions fall below 0.7, and in 95% of the cases this parameter was over 0.8 [3].
This level of genetic similarity agrees with the results obtained in our lab during the last few decades whenever randomly amplified DNA markers (RAPD, ISSR, AFLP) were used to assess the genetic similarity and genetic relationships within multiple plant species, e.g., pear [13], apple [14], fig [15], or common beans [16].
In the present study, as expected, the calculated genetic similarity values based on SSR markers were relatively low. This is a consequence of the hyper-polymorphism of these markers, which makes them highly useful for the unequivocal identification and discrimination of individuals, and for establishing or categorically excluding close genetic relationships between individuals (e.g., paternity tests), but not suitable for quantifying these relationships. This limitation is clear in the present study (Table S3), as the calculated genetic similarity between accessions of the same species often fell well below 0.8, reaching values as low as 0.3 (e.g., accessions 12 and 16 of D. tenuifolia).
On the other hand, the intraspecific (D. tenuifolia) genetic similarity values calculated from the SNP-CAPS markers were higher, with the lowest value (0.556) registered in eight cases (Table S5).
In comparison with the results obtained in our lab using randomly amplified markers [3], the low discriminating power of the SNP-CAPS marker analyses is, apparently, a consequence of the low number of analyzed loci. This situation could be improved by assessing a larger number of SNP loci or by complementing the results with those of other molecular marker analyses. We chose the second option, and the combination of the results of the SNP-CAPS and SSR marker analyses allowed the identification of specific molecular fingerprints for most of the accessions (Table S4; Figure 6).
The seven cases of non-discrimination among D. tenuifolia accessions deserve a detailed analysis and discussion, as the identification of specific fingerprints for all accessions of the germplasm collection is one of the main objectives of this work.
Accession 1 was provided in 2015 identified as a commercial variety by a wild rocket producer, while accession 98 was provided in 2019 as the same cultivar by the breeding company.
Accessions 160 and 161 appeared as identical in our analysis. However, they were supposed to correspond to two different cultivars provided by the same rocket producer. For that reason, an accurate comparative analysis of multiple phenotypic traits of these two accessions will be performed, new samples will be requested from the original providers, and a new molecular analysis will be carried out. The necessary correction will then be introduced in the germplasm collection data.
Accessions 17 and 24 were provided in different years by the same donor, under the same name. These accessions were assumed as an internal control of the performed analyses.
Accessions 10 and 19 were provided, respectively, by a wild rocket producer and a breeding company as the same cultivar, which is relatively resistant to downy mildew.
Accessions 18 and 165 were provided by two different wild rocket producers, one in Portugal and the other in the USA, as the same downy mildew-resistant cultivar.
Accession 39 was registered (in 2019) in the germplasm collection as a trial sample of a new cultivar provided by a breeding company. Accession 164 was received later (in 2020) and identified as a commercially available downy mildew-resistant cultivar. Both accessions are among the very few that, in our previous studies [3], exhibited strong resistance to the Hyaloperonospora isolate used.
Accessions 23, 38, and 158 were provided by three different donors: the first by a wild rocket leaf producer, the second as a new cultivar under trial (identified by a company code), and the third as a well-identified cultivar highly resistant to downy mildew. In fact, the three accessions were also among the very few D. tenuifolia accessions that exhibited downy mildew resistance in our previous studies [3]. This relatively rare phenotypic trait, shared by the three accessions, also helped to confirm them as the same cultivar, despite the different providers.
Except for the molecular identity of accessions 160 and 161, which needs further explanation, the other cases of accessions that exhibit the same molecular pattern reinforce our assumption that the set of molecular markers used in this study can be further utilized for the unequivocal identification and registration of all accessions of the germplasm collection.
4. Materials and Methods
4.1. Plant Germplasm Accessions
A set of 87 D. tenuifolia and 3 E. sativa accessions of the germplasm collection (INIAV), previously tested for their interaction with the isolate D5 of Hyaloperonospora sp., the causal agent of downy mildew disease [3], were selected for unequivocal molecular characterization and identification by specific SSR and SNP patterns.
4.2. Plant Growing Conditions
The seeds were quickly washed with tap water and common detergent and immersed for 1 min in a disinfection solution containing 0.5% SDS and 10% commercial bleach. After thorough washing with distilled water until the disinfection solution was completely removed, the seeds were transferred to Petri dishes containing three layers of filter paper saturated with tap water. During the following days, the seedlings were transferred, as they germinated, in groups of 3 to small pots (9 cm diameter) containing a mix of 50% peat and 50% perlite. Five pots per accession were transferred to a glass greenhouse, and 3 weeks later only one well-established plant was left to grow per pot.
4.3. DNA Extraction
4.3.1. DNA Extraction for Molecular Characterization
The genomic DNA extraction for molecular marker analysis was performed as described in [17] with minor modifications. One leaf from each of 5 plants of each accession was removed, washed with tap water, and wiped dry with paper. The midribs were removed using a scalpel and the leaves were ground under liquid nitrogen in a mortar with a pestle. The obtained fine powder was transferred to a microfuge tube containing 500 µL of extraction buffer (250 mM Tris-HCl, pH 8.0, 25 mM EDTA, 1% SDS) until the final volume of the suspension reached approximately 750 µL. RNase A (20 µg mL−1) was added, and the tube was transferred to a water bath at 65 °C for 15 min. Once the tube had cooled to room temperature, 1 volume of phenol:chloroform:isoamyl alcohol (25:24:1) was added and, after successive inversions for 1 min, the tube was centrifuged at 13,000 rpm for 3 min. The upper phase was transferred to a new tube and extracted with 1 volume of chloroform:isoamyl alcohol (24:1) as described in the previous step. This second extraction was repeated at least once, until the interface appeared completely transparent. The final upper phase was transferred to another tube, mixed with 3 volumes of cold absolute ethanol to precipitate the DNA, and kept at −20 °C. After centrifugation at 13,000 rpm, the DNA pellet was dried and slowly resuspended over two days in 50 µL of TE0.1 (10 mM Tris, 0.1 mM EDTA) in a refrigerator.
4.3.2. DNA Extraction for NGS Sequencing
All described procedures were carried out under cold conditions. Five leaves from accession 7 were excised, prepared, and ground as described in the previous section. The obtained fine powder was resuspended in 25 mL of nuclei isolation buffer (50 mM Tris-HCl pH 8.0, 1 M sucrose, 25 mM MgCl2, 100 mM KCl, and 2% Triton X-100). After very mild agitation for 1 min, the suspension was filtered through a stainless-steel sieve (mesh size 75 µm) with the addition of a further 10 mL of isolation buffer. The filtrate was divided equally into two sterilized 15 mL Falcon tubes and centrifuged at 30× g in a bench centrifuge for 5 min. The supernatants were transferred to two new Falcon tubes and centrifuged again at 1100× g for 5 min. The supernatants were discarded, and a small amount of the nuclei-enriched pellets was quickly collected using a micropipette tip and mixed with 10 µL of a DAPI solution on a glass microscope slide for quality analysis under UV microscopy (Olympus Vanox AHBT3). The pellets were resuspended in 1 mL of DNA extraction buffer and the DNA was extracted as described above.
4.4. Quantification and Quality Evaluation of the Extracted DNA
The integrity of the extracted DNA and possible contamination with RNA were assessed by agarose gel (1.4%) electrophoresis. The DNA concentration was estimated in the same gels by comparison with different known amounts of genomic DNA extracted from Pisum sativum roots, which do not contain chlorophyll or other pigments usually present in leaf samples that can bias the spectrophotometry results. A more accurate quantification was then obtained by UV spectrophotometry (NanoDrop One; Thermofisher, Waltham, MA, USA). The respective values, which can be biased by the fluorescence of the remaining leaf pigments (chlorophyll) and the presence of other contaminants, were accepted if they fell within the concentration limits established in agarose gels. The amplifiability of the DNA samples was assessed by RAPD-PCR, performed as previously described in [3].
4.5. NGS Sequencing
The genomic DNA, precipitated with three volumes of absolute ethanol, was centrifuged, washed with 75% ethanol, and centrifuged again, and the pellet was left to dry for two hours at room temperature. After slow resuspension in autoclaved milli-Q water, the DNA integrity and purity were assessed as described above. After quantification using UV spectrophotometry (NanoDrop One), the DNA was sent to STAB VIDA Lda, the company contracted for next generation sequencing on an Illumina HiSeq platform.
4.6. Primer Design and Synthesis
All primers were designed, their parameters calculated, and their possible self- or pair-annealing assessed using the FastPCR 6.7 software [18]. Common, non-labeled primers were synthesized by Eurofins Genomics (Ebersberg, Germany). The fluorescently labeled primers were ordered from STAB VIDA (Lisboa, Portugal).
4.7. Simple Sequence Repeat (SSR) Marker Analysis
The SSR loci (~500 bp sequences containing an SSR motif) were identified manually by a random search for microsatellite motifs among the sequence contigs. The fragment analysis of the SSR markers amplified with fluorescent primers was performed on a 3730XL Genetic Analyzer platform using GeneScan™ 500 LIZ™ (Thermofisher, Waltham, MA, USA) as the dye size standard. The resulting data were analyzed using the Peak Scanner™ Software v. 1.0 (Applied Biosystems, Thermofisher, Waltham, MA, USA).
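The SSR loci in this work were found by manual inspection of the contigs. For readers who wish to screen contigs automatically, the following is a minimal sketch of a regular-expression search for perfect microsatellites. It is not the procedure used in this study: the function name `find_ssrs` and the minimum repeat thresholds (6 for di-, 5 for tri-, and 4 for tetranucleotide motifs) are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length to the minimum number of tandem copies.
    The default thresholds are illustrative assumptions, not values taken
    from the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n_min in sorted(min_repeats.items()):
        # One captured unit followed by at least (n_min - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n_min - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run such as AAAAAA, not an SSR here
            if unit_len == 4 and motif[:2] == motif[2:]:
                continue  # e.g., AGAG is already reported as an AG repeat
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy contig fragment carrying an (AG)8 and a (GAT)5 microsatellite.
    contig = "GGCCTT" + "AG" * 8 + "CCGTA" + "GAT" * 5 + "TTCA"
    for start, motif, count in find_ssrs(contig):
        print(start, motif, count)
```

Start positions are 0-based. Candidate loci found in this way would still need flanking sequence suitable for primer design, as was required for the manually selected loci.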
The amplification of the SSR (microsatellite) markers was performed in 30 µL reactions, starting with an initial denaturation at 94 °C for 1 min 30 s, followed by 35 cycles of 30 s denaturation at 94 °C, 30 s annealing at a temperature specific to each primer pair, and 1 min extension at 72 °C, and ending with a final extension at 72 °C for 10 min.
The PCR products were analyzed by 3% agarose gel electrophoresis, and the best-amplified markers were selected for further, more accurate analysis across the studied accessions. The amplifications were then repeated with the same primer pairs, with the forward primer labeled with a fluorochrome. Half of each amplified product was analyzed by agarose gel electrophoresis, and the second half of the approved samples was sent to STAB VIDA for fragment analysis by capillary polyacrylamide gel electrophoresis.
4.8. Single Nucleotide Polymorphism (SNP) Marker Analysis
Five hundred SNP loci were identified using the Geneious Prime v.2021.2.5 software (Dotmatics, Boston, MA, USA). Seven-nucleotide sequences, consisting of the identified SNP and the three nucleotides on each side of it, were analyzed with the NEBcutter V2.0 software [19] to identify restriction enzymes that differentially recognize the alternative SNP alleles.
Nineteen SNP markers harboring a TaqI restriction site encompassing the polymorphic nucleotide were selected for further work and amplified using the same protocol used for the SSR markers. Fifteen microliters of the amplified products were analyzed by 3% agarose gel electrophoresis. The remaining 15 µL of well amplified samples were then cut with the TaqI restriction enzyme, and the samples were analyzed as CAPS (cleaved amplified polymorphic sequences) markers in 3% agarose gels.
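The selection logic described above can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the NEBcutter workflow used in this study: it takes the three nucleotides on each side of a SNP and the two alternative alleles, and reports the allele (if exactly one) that creates a TaqI recognition site (TCGA) spanning the polymorphic position. The function name `taqi_caps_candidate` is hypothetical.

```python
TAQI_SITE = "TCGA"  # TaqI recognition sequence; the enzyme cuts after the T


def taqi_caps_candidate(left, alleles, right):
    """Return the allele whose presence creates a TaqI site over the SNP.

    left, right: the nucleotides flanking the SNP (three on each side in
    the seven-nucleotide windows described in the text).
    alleles: the two alternative nucleotides at the SNP position.
    Returns None when neither or both alleles produce a site, because the
    SNP is then not usable as a TaqI CAPS marker.
    """
    snp_pos = len(left)
    cut_alleles = []
    for allele in alleles:
        window = (left + allele + right).upper()
        # A 4-nt site covers the SNP if it starts 0 to 3 bases upstream of it.
        starts = range(max(0, snp_pos - 3), snp_pos + 1)
        if any(window[i:i + 4] == TAQI_SITE for i in starts):
            cut_alleles.append(allele)
    return cut_alleles[0] if len(cut_alleles) == 1 else None


if __name__ == "__main__":
    # ATT[C/T]GAA: only the C allele gives ...TCGA..., so it is cut by TaqI.
    print(taqi_caps_candidate("ATT", "CT", "GAA"))
```

In a CAPS assay, accessions carrying the cut allele show two smaller bands after digestion, whereas the uncut allele leaves the amplicon intact, provided that the amplicon contains no additional TaqI sites.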
4.9. Data Analysis
The detailed analysis of the sequence contigs and respective reads was performed using the software Tablet 1.21.02.08 (The James Hutton Institute, Aberdeen, Scotland, UK) [20].
The NTSYS-pc program [21] was used for cluster analysis. The genetic similarity between the accessions was calculated using the Dice coefficient [22] by pairwise comparisons based on the proportion of common fragments, according to the following equation: similarity = 2Nab/(Na + Nb), where Nab is the number of scored amplification products simultaneously present in accessions ‘a’ and ‘b’, Na is the number of amplification products scored in accession ‘a’, and Nb is the number of amplification products scored in accession ‘b’. The unweighted pair-group method with arithmetic averages (UPGMA) was used to calculate the cophenetic matrix used for dendrogram construction. The cophenetic correlation coefficient (r) was calculated by comparing the similarity matrix with the UPGMA-produced cophenetic matrix, which is graphically represented as a dendrogram.
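As a worked illustration of the equation above, the following minimal sketch computes the Dice coefficient from binary presence/absence profiles. It is not the NTSYS-pc implementation; the function names, the example profiles, and the convention of returning 0.0 when neither accession has any scored fragment are assumptions made for this example.

```python
def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2*Nab / (Na + Nb) for two binary band profiles.

    Each profile lists 1 (fragment present) or 0 (fragment absent) for the
    same ordered set of scored fragments.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of fragments")
    n_ab = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    n_a = sum(1 for x in profile_a if x)
    n_b = sum(1 for y in profile_b if y)
    if n_a + n_b == 0:
        return 0.0  # assumed convention: no fragments scored in either profile
    return 2 * n_ab / (n_a + n_b)


def similarity_matrix(profiles):
    """Pairwise Dice similarities for a dict mapping accession -> profile."""
    names = list(profiles)
    return {
        (p, q): dice_similarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


if __name__ == "__main__":
    # Hypothetical profiles: Nab = 3, Na = 4, Nb = 4, so similarity = 6/8.
    acc_a = [1, 1, 0, 1, 0, 1]
    acc_b = [1, 0, 0, 1, 1, 1]
    print(dice_similarity(acc_a, acc_b))  # 0.75
```

UPGMA corresponds to average-linkage hierarchical clustering applied to the distances 1 − similarity, so a matrix built in this way could be passed to any average-linkage implementation to obtain a dendrogram comparable to the one described in the text.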
5. Conclusions
The first genome assembly for wild rocket (Diplotaxis tenuifolia (L.) DC.) described here, together with the information provided on multiple SSR (microsatellite) and SNP loci, constitutes a major research tool for the scientific community engaged in genetic and genomic studies on rocket species.
The combined use of genome-specific SSR and SNP-CAPS allowed the identification of specific molecular patterns for almost all analyzed D. tenuifolia accessions, the identification and confirmation of cases of synonymy, and the clear discrimination from the E. sativa accessions.
The SSR and SNP-CAPS markers, tested and validated in this study, will be used for the unequivocal identification of all present and upcoming accessions of the Portuguese germplasm collection of D. tenuifolia, and for the initial research steps towards the genome location of downy mildew-resistance genes in this species.