Communication

A First Insight into a Draft Genome of Silver Sillago (Sillago sihama) via Genome Survey Sequencing

1
Guangdong Research Center on Reproductive Control and Breeding Technology of Indigenous Valuable Fish Species, Fisheries College, Guangdong Ocean University, Zhanjiang 524088, China
2
Southern Marine Science and Engineering Guangdong Laboratory, Zhanjiang 524025, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Animals 2019, 9(10), 756; https://doi.org/10.3390/ani9100756
Submission received: 20 September 2019 / Accepted: 27 September 2019 / Published: 1 October 2019
(This article belongs to the Section Aquatic Animals)


Simple Summary

Silver sillago (Sillago sihama Forsskål) is distributed alongshore from the Indian Ocean to the West Pacific. Owing to its delicate quality, rich seafood taste, and high nutritional value, S. sihama is an attractive seafood in China. However, the main supply of this fish is from wild capture. The lack of genetic and genomic data for S. sihama has led to limited improvement in its breeding programs. In this study, we conducted a genomic survey of S. sihama using next-generation sequencing technology to investigate its genomic profile. We obtained useful data, such as genome size, sequence repeat ratio, heterozygosity ratio, and the genome sequences, which might accelerate the breeding and culturing programs for S. sihama.

Abstract

Sillago sihama has high economic value and is one of the most attractive aquaculture species in China. Despite its economic importance, its genome has barely been studied. In this study, we conducted a first genomic survey of S. sihama using next-generation sequencing (NGS). In total, 45.063 Gb of high-quality sequence data were obtained. Based on the 17-mer frequency distribution, the genome size was estimated to be 508.50 Mb. The sequence repeat ratio was calculated to be 21.25%, and the heterozygosity ratio was 0.92%. Reads were assembled into 1,009,363 contigs, with an N50 length of 1362 bp, and then into 814,219 scaffolds, with an N50 length of 2173 bp. The average guanine and cytosine (GC) content was 45.04%. Dinucleotide repeats (56.55%) were the dominant form of simple sequence repeats (SSRs).

1. Introduction

Silver sillago (Sillago sihama Forsskål) is distributed alongshore from the Indian Ocean to the West Pacific [1]. When adult sillago are scared, they bury themselves in the sand [2]. Polychaete worms, amphipods, small prawns (Penaeus), and shrimps are the main source of food for sillago [3]. This fish is found along the southern seashore of China [4]. Owing to its delicate quality, rich seafood taste, and high nutritional value [5], S. sihama is an attractive seafood in China. However, the main supply of this fish is from wild capture [6]. Studies of this species have mainly focused on salinity tolerance [7], population dynamics [8], discrimination among species of the genus Sillago [9], and phylogenetic relationships within the genus Sillago [10]. These studies found that S. sihama tolerates lower salinities [7], but that, under excessive exploitation, wild populations of S. sihama had diminished in size and shifted toward younger ages [8]. The lack of genetic and genomic data for S. sihama has led to limited improvement in its breeding programs [11]. It is therefore necessary to study the genome size and genome characteristics of S. sihama, which will provide genetic and genomic resources.
High-throughput next-generation sequencing (NGS) is currently the main approach for genomic surveys and is an important and efficient strategy for generating genetic and genomic information [12,13,14,15]. A genomic survey can boost progress in gene finding and phylogenetic analysis, and in understanding genetic variety, genome structure, and genetic improvement of advantageous characteristics [16,17,18,19], which could also accelerate breeding and culturing progress of S. sihama.

2. Materials and Methods

2.1. Specimen Materials

Specimens of S. sihama were obtained from Guangdong Ocean University Breeding Base. Two S. sihama specimens, named specimen 1 (female) and specimen 2 (male), were subjected to genome sequencing. All animal experiments were conducted in accordance with the guidelines and approval of the Animal Research and Ethics Committees of the Institute of Aquatic Economic Animals of Guangdong Ocean University (201903003).

2.2. DNA Extraction, Library Construction, and Sequencing

Genomic DNA was extracted from a S. sihama muscle sample using the SDS (sodium dodecyl sulfate) method [20] and randomly fragmented using a Covaris ultrasonic shearing device. Fragments with a length of ~350 bp were used to construct two paired-end DNA libraries, which were then sequenced on the Illumina HiSeq X Ten platform with a read length of 2 × 150 bp, following the manufacturer’s protocol. Reads containing adapters or contamination and low-quality reads were removed, and the remaining clean reads were used for all subsequent analyses. Entire read sets were deposited in the Short Read Archive (SRA) databank (http://www.ncbi.nlm.nih.gov/sra/), and are available under the accession number PRJNA545388.

2.3. Genome Size Estimation and Identification of Heterozygosity Ratio and Repeat Ratio

The genome size of S. sihama was estimated from the K-mer frequency of the clean reads (k = 17); the 17-mer frequency (depth) distribution was consistent with a Poisson distribution. From the distribution of 17-mer depth, we obtained the peak depth value, which represents both the mean and the variance of the corresponding Poisson distribution [21,22]. The K-mer depth distribution of the clean reads was calculated, and the genome size estimated, using Jellyfish (v2.2.4) [23]. Because the K-mer depth distribution can be affected by heterozygosity and repetitive sequences in the genome, the genome size estimate was subsequently revised. We also inferred the heterozygosity ratio and repeat ratio from the K-mer analysis.
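The K-mer depth distribution described above can be illustrated with a minimal pure-Python sketch. This is not the Jellyfish workflow used in the study; the function names, the `min_depth` cut-off for ignoring low-depth (likely erroneous) K-mers, and the decision to skip K-mers containing N are all assumptions made for illustration, and the sketch does not merge reverse-complement K-mers.

```python
from collections import Counter


def kmer_depth_histogram(reads, k=17):
    """Count every k-mer in the reads and summarise the counts.

    Returns (histogram, total_kmers), where histogram[d] is the number of
    distinct k-mers observed exactly d times and total_kmers is the number
    of k-mer occurrences counted.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # assumption: skip k-mers with ambiguous bases
                counts[kmer] += 1
    histogram = dict(Counter(counts.values()))
    total_kmers = sum(counts.values())
    return histogram, total_kmers


def peak_depth(histogram, min_depth=2):
    """Return the depth with the most distinct k-mers, ignoring depths below min_depth."""
    candidates = {d: n for d, n in histogram.items() if d >= min_depth}
    if not candidates:
        raise ValueError("no depth at or above min_depth in histogram")
    return max(candidates, key=candidates.get)
```

On real data the histogram would come from tens of billions of K-mers, so a dedicated counter is needed in practice; the sketch only shows what the depth distribution and its peak mean.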

2.4. Sequence Assembly and Analysis of Guanine and Cytosine (GC) Content

Genome sequence assembly was performed using the de Bruijn graph algorithm available in SOAPdenovo (v2.04) [24,25]. Contigs were realigned using all clean reads and scaffolds were constructed step by step using diversified insert size paired-ends [26]. A K-mer size of 41 was set as the default assembly parameter. GC content along the assembled sequence was calculated from the proportion of GC out of the total number of bases in the sequencing data [27].
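As a rough illustration of the GC content calculation, the following sketch computes the percentage of G and C bases in a sequence. The function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption; the text above only states that GC is taken as a proportion of the total number of bases.

```python
def gc_content(seq):
    """Return the GC percentage of a nucleotide sequence.

    Assumption: only A, C, G and T count toward the denominator, so
    ambiguous bases such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    called = gc + seq.count("A") + seq.count("T")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / called
```

For example, `gc_content("ATGCGC")` counts four G or C bases out of six.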

2.5. Identification of Simple Sequence Repeats (SSRs)

In order to identify simple sequence repeat (SSR) markers, SSRs were searched in the assembled scaffolds using SR search software [28]. The minimum base number for SSR identification of di-, tri-, tetra-, penta-, and hexa-nucleotides was 12 [29].
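The search criteria above can be sketched with a regular expression. This is an illustrative stand-in, not the software used in the study: the function names are hypothetical, and it is assumed that the 12-base minimum translates into a minimum repeat count of ceil(12 / motif length) and that motifs which are themselves repeats of a shorter unit (for example ACAC) should be reported only under the shorter motif.

```python
import re


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ACAC' or 'AA')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_len=12):
    """Find di- to hexanucleotide SSRs spanning at least min_len bases.

    Returns a list of (start, motif, repeat_count) tuples, grouped by motif
    length and ordered by position within each group.
    """
    seq = seq.upper()
    hits = []
    for k in range(2, 7):
        min_repeats = -(-min_len // k)  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits
```

Applied to `"TTT" + "AC" * 6 + "CCC" + "ATG" * 4 + "C"`, the sketch reports one dinucleotide and one trinucleotide repeat and discards the redundant ACAC and ATGATG readings of the same regions.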

3. Results

3.1. Genome Sequencing and Sequence Quality Estimation

The 350 bp insert libraries were sequenced, generating a total of 54.837 Gb (specimen 1) and 54.452 Gb (specimen 2) of raw reads (Table 1). After filtering and correction, a total of 45.063 Gb (specimen 1) and 38.583 Gb (specimen 2) of clean reads were obtained, with an error rate of approximately 0.03% for both samples. The Q20 values were both above 95%, while the Q30 values were both above 90%. A set of 5000 random clean reads from each specimen was used as query sequences with BLAST (the Basic Local Alignment Search Tool) against the Nucleotide Sequence Database from NCBI (National Center for Biotechnology Information), and the result showed that there was no contamination from other species (Table S1). We present specimen 1 in the main text and specimen 2 in the Supplementary Materials, because differences in survey data between the two specimens were very small.
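For readers unfamiliar with Q20 and Q30, the following sketch relates a Phred quality score Q to its error probability via the standard definition P = 10^(−Q/10), so Q20 corresponds to 99% accuracy and Q30 to 99.9%. The function names are hypothetical, and treating Q20 as the fraction of bases with a score of at least 20 is an assumption about how the statistic was computed.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def fraction_at_or_above(qualities, threshold):
    """Fraction of base qualities that are >= threshold (e.g. 20 for Q20, 30 for Q30)."""
    if not qualities:
        raise ValueError("no quality values supplied")
    return sum(1 for q in qualities if q >= threshold) / len(qualities)
```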

3.2. Genome Size, Ratio of Heterozygosity and Repeats

K-mer analysis was performed on all of the clean data. For the 17-mer frequency distribution (Figure 1, specimen 2 in Figure S1), the number of K-mers was 36,648,430,961 and the peak depth was 70×. The estimated genome size was 523.55 Mb, which was calculated via the following formula:
Genome size = K−mer num/Peak depth
which was based on the output of Jellyfish (v2.2.4) [23]. Then, the genome size was revised by excluding the K-mer error, via the following formula:
Revised genome size = Genome size × (1−Error Rate),
giving a revised genome size of 508.50 Mb. The sequence repeat ratio of the S. sihama genome was 21.25% and the heterozygosity ratio was 0.92% (Table 2, specimen 2 in Table S2).
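The arithmetic behind these two formulas can be checked with the reported values. The K-mer error rate is not stated in the text, so the sketch below back-calculates the rate implied by the reported genome sizes (about 2.9%); this implied value is an inference for illustration, not a figure reported by the authors, and the function names are hypothetical.

```python
KMER_NUMBER = 36_648_430_961   # reported number of 17-mers
PEAK_DEPTH = 70                # reported peak depth
REPORTED_REVISED_MB = 508.50   # reported revised genome size (Mb)


def estimate_genome_size(kmer_number, peak_depth):
    """Genome size = K-mer number / peak depth."""
    return kmer_number / peak_depth


def revise_genome_size(genome_size, error_rate):
    """Revised genome size = genome size x (1 - error rate)."""
    return genome_size * (1 - error_rate)


genome_size_mb = estimate_genome_size(KMER_NUMBER, PEAK_DEPTH) / 1e6

# Back-calculated, not reported: the error rate that reproduces 508.50 Mb.
implied_error_rate = 1 - REPORTED_REVISED_MB / genome_size_mb

print(f"Estimated genome size: {genome_size_mb:.2f} Mb")
print(f"Implied K-mer error rate: {implied_error_rate:.4f}")
```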

3.3. Genome Assembly

With 41 bp K-mers, de novo assembly was performed using all of the clean reads. Scaffolds totalling 568,556,466 bp were obtained, with a scaffold N50 of 2173 bp (Table 3, specimen 2 in Table S3). The N50 (N90) of the contigs or scaffolds was obtained by sorting all sequences from longest to shortest and summing their lengths in that order; when the cumulative length reached 50% (90%) of the total length of all contigs or scaffolds, the length of the last sequence added was taken as the N50 (N90) [15].
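The N50 and N90 definition above translates directly into a few lines of code. The sketch below is illustrative only; the function name and the `fraction` parameter are hypothetical.

```python
def nx_length(lengths, fraction=0.5):
    """Return the Nx length (N50 for fraction=0.5, N90 for fraction=0.9).

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the length of the sequence at which the running total first
    reaches the given fraction of the total length is returned.
    """
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
```

For five sequences of lengths 10, 8, 6, 4, and 2 (total 30), the running total reaches 15 at the second sequence, so the N50 is 8; it reaches 27 at the fourth sequence, so the N90 is 4.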

3.4. GC Content

The GC content of the S. sihama genome and average sequencing depth were plotted along the assembled sequence (Figure 2, specimen 2 in Figure S2). The density points were only concentrated in the 30–65% range, and the average GC content was 45.04%.

3.5. Identification of SSR

The total number of identified SSRs was 149,257 (Table 4, specimen 2 in Table S4). Dinucleotide repeats were dominant (56.55%), followed by trinucleotide repeats (33.78%), tetranucleotide repeats (7.61%), pentanucleotide repeats (1.47%), and hexanucleotide repeats (0.58%) (Figure 3, specimen 2 in Figure S3).
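As a consistency check, the percentages above can be recomputed from the per-class SSR counts listed in Table 4. The variable names are illustrative.

```python
# SSR counts per motif class for specimen 1, taken from Table 4.
ssr_counts = {
    "di": 84_406,
    "tri": 50_420,
    "tetra": 11_361,
    "penta": 2_200,
    "hexa": 870,
}

total_ssrs = sum(ssr_counts.values())
percentages = {
    motif: round(100 * count / total_ssrs, 2)
    for motif, count in ssr_counts.items()
}

for motif, pct in percentages.items():
    print(f"{motif}-nucleotide repeats: {pct:.2f}%")
```

The counts sum to the reported total of 149,257 and reproduce the reported percentages.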

4. Discussion

In recent years, with the development of NGS technology [30], efficient approaches, such as faster sequencing, longer reads, and cost reduction [31], have been provided for researchers to cope with a wide range of questions from newly-found and non-model species, such as Procambarus clarkii [19], Sillago sinica [4], and Pelteobagrus fulvidraco [32]. Moreover, the estimation of genome size by the K-mer method using genome survey sequences makes genome size estimation available for non-model species, without any prior knowledge [15]. According to the K-mer (k = 17) analysis, the genome size of S. sihama was ~508.50 Mb. The genome size of S. sihama was close to the size of S. sinica (534 Mb) [4] and Gambusia affinis (598.7 Mb) [33], but smaller than Oryzias latipes (700.4 Mb) [34], P. fulvidraco (714 Mb) [32], and Oreochromis niloticus (1.082 Gbp) [35]. The genome size of Sillaginidae is relatively small, as a result of lower number of repetitive sequences in the Sillaginidae genome [4].
For the genome assembly, if the heterozygosity rate is higher than 0.5%, it is difficult to assemble, and if higher than 1%, it is even more difficult [23]. We found that the heterozygosity rate of S. sihama was ~0.92%. The repeat rate of S. sihama genomic sequences was ~21.25%. The characteristics of the S. sihama genome might impact the accuracy of genome size estimation. This was the reason that revision of genome size was performed. Before the appearance of a more efficient de novo assembly method, a reference genome was necessary for a good genome assembly [19].
In our study, the scaffold N50 was 2173 bp and the contig N50 was 1362 bp (Table 3). De novo assemblies obtained from NGS technologies tend to be highly fragmented, whereas a good genome assembly requires a contig N50 > 30 kb and a scaffold N50 > 250 kb [36]. A reference genome onto which short reads can be mapped is therefore needed to obtain a good genome assembly [19]. Our study produced a first draft genome and stands as a useful reference for further studies on whole genome sequencing of S. sihama.

5. Conclusions

In this study, the first draft genome of S. sihama was presented. The genome size of S. sihama was ~508.50 Mb, with 814,219 scaffolds and a scaffold N50 length of 2173 bp. The genome size of S. sihama was close to that of S. sinica (534 Mb), which is very closely related to S. sihama [4], indicating that the result of this study is credible. Given the N50 values obtained for contigs and scaffolds in this study, there are still improvements to be made in research on the genome of S. sihama.

Supplementary Materials

The following are available online at https://www.mdpi.com/2076-2615/9/10/756/s1. Table S1: Top 5 similar species compared in the Nucleotide Sequence Database of NCBI. Table S2: Estimation of S. sihama (specimen 2) genome based on K-mer statistics. Table S3: Statistics of S. sihama (specimen 2) assembled genome sequences. Table S4: SSR distribution statistics of S. sihama (specimen 2). Figure S1: K-mer (k=17) analysis for estimation of the genome size of S. sihama (specimen 2). Figure S2: GC content and average sequencing depth of S. sihama (specimen 2) genome data used for assembly. Figure S3: Ratio of different SSRs in S. sihama (specimen 2).

Author Contributions

Conceptualization, Z.L. and H.C.; formal analysis, C.T., D.J., and H.C.; investigation, Z.L., X.L., and Y.W.; resources, Y.H. and C.Z.; writing—original draft preparation, Z.L. and H.C.; writing—review and editing, G.L. and C.T.; project administration, Z.L.

Funding

This study was supported by grants from the National Natural Science Foundation of China (Nos. 41706174 and 31702326), the Fund of Southern Marine Science and Engineering Guangdong Laboratory (Zhanjiang) (ZJW-2019-06), and the Marine Fishery Science and Technology Promotion Projects of Guangdong Province (No. Yue Cai Nong 2017 [17]), Guangdong Ocean University. Funding was also received from the Department of Education of Guangdong Province (2018KQNCX111) and the Program for Scientific Research Start-Up Funds of Guangdong Ocean University (R19026).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kaga, T. Phylogenetic systematics of the family Sillaginidae (Percomorpha: Order Perciformes). Zootaxa 2013, 3642, 100–105.
  2. Du, T.; Huang, Y. Biological characteristics and indoor culture experiment in Sillago sihama. J. Aquac. 2009, 30, 1–3. (In Chinese)
  3. Gunn, J.S.; Milward, N.E. The food, feeding habits and feeding structures of the whiting species Sillago sihama. J. Fish Biol. 1985, 26, 411–427.
  4. Xu, S.; Xiao, S.; Zhu, S.; Zeng, X.; Luo, J.; Liu, J.; Gao, T.; Chen, N. A draft genome assembly of the Chinese sillago (Sillago sinica), the first reference genome for Sillaginidae fishes. GigaScience 2018, 7, 1–8.
  5. Huang, Y.; Huang, H.; Du, T.; Huang, J. Analysis and Evaluation of Main Nutritive Composition in the Muscle of Wild Sillago sihama. J. Guangdong Ocean Univ. 2015, 35, 9–14.
  6. Guo, Y.S.; Wang, Z.D.; Yan, C.Z.; Zhang, Y.L.; Zheng, J.N.; Xu, Y.M.; Du, T.; Liu, C.W. Isolation and characterization of microsatellite DNA loci from Sillago sihama. J. Genet. 2014, 93, e32–e36.
  7. Lee, C.S.; Hu, F.; Hirano, R. Salinity Tolerance of Fertilized Eggs and Larval Survival in the Fish Sillago sihama. Mar. Ecol. Prog. Ser. 1981, 4, 169–174.
  8. Lu, Z.B.; Chen, X.; Du, J.G. The population dynamics and parameter of growth and mortality of Sillago sihama in the Minnan–Taiwan fishing grounds. Mar. Fish Res. 2008, 29, 47–53.
  9. Pan, X.Z.; Gao, T.X. Sagittal otolith shape used in the discrimination of fishes of the genus Sillago in China. Acta Zootaxonomica Sin. 2010, 35, 799–805. (In Chinese)
  10. Xue, T.Q.; Du, N.; Gao, T.X. Phylogenetic relationships of 4 Sillaginidae species based on partial sequences of COI and Cytochrome b gene. Period. Ocean Univ. China 2010, 40, 91–98.
  11. Tian, C.; Li, Z.; Dong, Z.; Huang, Y.; Du, T.; Chen, H.; Jiang, D.; Deng, S.; Zhang, Y.; Wanida, S.; et al. Transcriptome Analysis of Male and Female Mature Gonads of Silver Sillago (Sillago sihama). Genes 2019, 10, 129.
  12. Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C. Multiplexed microsatellite recovery using massively parallel sequencing. Mol. Ecol. Resour. 2011, 11, 1060–1067.
  13. Nybom, H.; Weising, K.; Rotter, B. DNA fingerprinting in botany-past, present, future. Investig. Genet. 2014, 5, 1–35.
  14. Kralova-Hromadova, I.; Minarik, G.; Bazsalovicsova, E.; Mikulicek, P.; Oravcova, A.; Palkova, L.; Hanzelova, V. Development of microsatellite markers in Caryophyllaeus laticeps (Cestoda: Caryophyllidea), monozoic fish tapeworm, using next-generation sequencing approach. Parasitol. Res. 2015, 114, 721–726.
  15. Lu, M.; An, H.; Li, L. Genome Survey Sequencing for the Characterization of the Genetic Background of Rosa roxburghii Tratt and Leaf Ascorbate Metabolism Genes. PLoS ONE 2016, 11, e0147530.
  16. Barchi, L.; Lanteri, S.; Portis, E.; Acquadro, A.; Valè, G.; Toppino, L.; Rotino, G.L. Identification of SNP and SSR markers in eggplant using RAD tag sequencing. BMC Genom. 2011, 12, 1–9.
  17. Rowe, H.C.; Renaut, S.; Guggisberg, A. RAD in the realm of next-generation sequencing technologies. Mol. Ecol. 2011, 20, 3499–3502.
  18. Xu, P.; Xu, S.; Wu, X.; Tao, Y.; Wang, B.; Wang, S.; Qin, D.; Lu, Z.; Li, G. Population genomic analyses from low-coverage RAD-Seq data: A case study on the non-model cucurbit bottle gourd. Plant J. 2014, 77, 430–442.
  19. Shi, L.; Yi, S.; Li, Y. Genome survey sequencing of red swamp crayfish Procambarus clarkii. Mol. Biol. Rep. 2018, 45, 799–806.
  20. Natarajan, V.P.; Zhang, X.; Morono, Y.; Inagaki, F.; Wang, F. A Modified SDS-Based DNA Extraction Method for High Quality Environmental DNA from Seafloor Environments. Front. Microbiol. 2016, 7, 986.
  21. Kim, E.B.; Fang, X.; Fushan, A.A.; Huang, Z.; Lobanov, A.V.; Han, L.; Marino, S.M.; Sun, X.; Turanov, A.A.; Yang, P.; et al. Genome sequencing reveals insights into physiology and longevity of the naked mole rat. Nature 2011, 479, 223–227.
  22. Zhang, G.; Fang, X.; Guo, X.; Li, L.; Luo, R.; Xu, F.; Yang, P.; Zhang, L.; Wang, X.; Qi, H.; et al. The oyster genome reveals stress adaptation and complexity of shell formation. Nature 2012, 490, 49–54.
  23. Marcais, G.; Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 2011, 27, 764–770.
  24. Li, R.; Zhu, H.; Ruan, J.; Qian, W.; Fang, X.; Shi, Z.; Li, Y.; Li, S.; Shan, G.; Kristiansen, K.; et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20, 265–272.
  25. Luo, R.; Liu, B.; Xie, Y.; Li, Z.; Huang, W.; Yuan, J.; He, G.; Chen, Y.; Pan, Q.; Liu, Y.; et al. SOAPdenovo2: An empirically improved memory-efficient short-read de novo assembler. GigaScience 2012, 1, 1–6.
  26. Zhou, W.; Hu, Y.; Sui, Z.; Fu, F.; Wang, J.; Chang, L.; Guo, W.; Li, B. Genome Survey Sequencing and Genetic Background Characterization of Gracilariopsis lemaneiformis (Rhodophyta) Based on Next-Generation Sequencing. PLoS ONE 2013, 8, e69909.
  27. Algorithm of GC content. Available online: https://www.cnblogs.com/Datapotumas/p/6306186.html (accessed on 7 March 2018).
  28. Wang, J.L.; Zhu, M.X.; Xu, M.H.; Chen, S.L.; Zhang, F.Q. Analysis on SSR in Sinoswertia tetraptera Base on RAD-seq. Bull. Bot. Res. 2017, 37, 447–452.
  29. Temnykh, S.; DeClerck, G.; Lukashova, A.; Lipovich, L.; Cartinhour, S.; McCouch, S. Computational and Experimental Analysis of Microsatellites in Rice (Oryza sativa L.): Frequency, Length Variation, Transposon Associations, and Genetic Marker Potential. Genome Res. 2001, 11, 1441–1452.
  30. Liu, L.; Li, Y.; Li, S.; Hu, N.; He, Y.; Pong, R.; Lin, D.; Lu, L.; Law, M. Comparison of next-generation sequencing systems. J. Biomed. Biotechnol. 2012, 251364, 1–11.
  31. Van Dijk, E.L.; Auger, H.; Jaszczyszyn, Y.; Thermes, C. Ten years of next-generation sequencing technology. Trends Genet. 2014, 30, 418–426.
  32. Gong, G.; Dan, C.; Xiao, S.; Guo, W.; Huang, P.; Xiong, Y.; Wu, J.; He, Y.; Zhang, J.; Li, X.; et al. Chromosomal-level assembly of yellow catfish genome using third-generation DNA sequencing and Hi-C analysis. GigaScience 2018, 7, 1–9.
  33. Hoffberg, S.L.; Troendle, N.J.; Glenn, T.C.; Mahmud, O.; Louha, S.; Chalopin, D.; Bennetzen, J.L.; Mauricio, R. A High-Quality Reference Genome for the Invasive Mosquitofish Gambusia affinis Using a Chicago Library. G3 Gen. Genomes Genet. 2018, 8, 1855–1861.
  34. Kasahara, M.; Naruse, K.; Sasaki, S.; Nakatani, Y.; Qu, W.; Ahsan, B.; Yamada, T.; Nagayasu, Y.; Doi, K.; Kasai, Y.; et al. The medaka draft genome and insights into vertebrate genome evolution. Nature 2007, 447, 714–719.
  35. Conte, M.A.; Gammerdinger, W.J.; Bartie, K.L.; Penman, D.J.; Kocher, T.D. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions. BMC Genom. 2017, 18, 341.
  36. Hamilton, J.P.; Buell, C.R. Advances in plant genome sequencing. Plant J. 2012, 70, 177–190.
Figure 1. K-mer (k = 17) analysis for estimation of the genome size of S. sihama (specimen 1).
Figure 2. GC content and average sequencing depth of S. sihama (specimen 1) genome data used for assembly. For the spot graphs, the x-axis is GC content and the y-axis is sequencing depth. For the bar graphs, the x-axis is sequencing depth distribution and the y-axis is GC content distribution.
Figure 3. Ratio of different SSRs in S. sihama (specimen 1).
Table 1. Statistics of S. sihama genome sequencing data.

| Library | Insert Size (bp) | Raw Base (bp) | Effective Rate (%) | Clean Base (bp) | Error Rate (%) | Q20 ¹ (%) | Q30 ² (%) | GC Content (%) |
|---|---|---|---|---|---|---|---|---|
| Specimen 1 | 350 | 54,836,979,600 | 99.98 | 45,063,446,400 | 0.03 | 95.93 | 90.81 | 45.03 |
| Specimen 2 | 350 | 54,451,684,200 | 99.74 | 38,583,415,200 | 0.03 | 95.75 | 90.44 | 45.36 |

¹ Q20: The ratio of data with accuracy above 99% in total data. ² Q30: The ratio of data with accuracy above 99.9% in total data.
Table 2. Estimation of S. sihama (specimen 1) genome based on K-mer statistics.

| Identity | K-mer | K-mer Depth | K-mer Number | Genome Size (Mbp) | Revised Genome Size (Mbp) | Heterozygous Ratio (%) | Repeat (%) |
|---|---|---|---|---|---|---|---|
| Specimen 1 | 17 | 70 | 36,648,430,961 | 523.55 | 508.50 | 0.92 | 21.25 |
Table 3. Statistics of S. sihama (specimen 1) assembled genome sequences.

| | Identity | Total Length (bp) | Total Number | Max Length (bp) | N50 Length (bp) | N90 Length (bp) |
|---|---|---|---|---|---|---|
| Contig | Specimen 1 | 559,219,807 | 1,009,363 | 46,417 | 1362 | 171 |
| Scaffold | Specimen 1 | 568,556,466 | 814,219 | 72,953 | 2173 | 219 |
Table 4. Simple Sequence Repeat (SSR) distribution statistics for S. sihama (specimen 1).

| Statistics | Di- | Tri- | Tetra- | Penta- | Hexa- |
|---|---|---|---|---|---|
| SSR number | 84,406 | 50,420 | 11,361 | 2200 | 870 |
| Percentage | 56.55% | 33.78% | 7.61% | 1.47% | 0.58% |

Share and Cite

MDPI and ACS Style

Li, Z.; Tian, C.; Huang, Y.; Lin, X.; Wang, Y.; Jiang, D.; Zhu, C.; Chen, H.; Li, G. A First Insight into a Draft Genome of Silver Sillago (Sillago sihama) via Genome Survey Sequencing. Animals 2019, 9, 756. https://doi.org/10.3390/ani9100756
