cPlot: Contig-Plotting Visualization for the Analysis of Short-Read Nucleotide Sequence Alignments
Abstract
1. Introduction
2. Results and Discussion
2.1. Dataset
2.2. Identifying Optimal k-mer Size for Improved Performance
2.3. Accuracy of Contig Plotting
2.4. Comparative Results
3. Materials and Methods
3.1. Algorithm
3.1.1. Basic Read Alignment
3.1.2. Optimal Read Alignment
3.1.3. Query Sequence Rearrangement
3.2. Application
3.2.1. Web-Based Interface for Contig Plotting
3.2.2. Input Data Required for Contig Plotting
3.2.3. Job Management
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Category | Sequences |
---|---|
Reference sequence | Rhodomonas salina (NC_009573.1) |
Query sequence | Cryptophyta plastid 8 1 |

Name | Length (bp) | Mapped Contigs (bp) | Similarity (%) |
---|---|---|---|
NC_001137.3_T0 | 72,100 | 72,098 | 99.997 |
NC_001137.3_T1 | 72,100 | 72,098 | 99.997 |
NC_001137.3_T2 | 72,100 | 72,098 | 99.997 |
NC_001137.3_T3 | 72,100 | 72,098 | 99.997 |
NC_001137.3_T4 | 72,100 | 72,097 | 99.995 |
NC_001137.3_T5 | 72,100 | 72,097 | 99.995 |
NC_001137.3_T6 | 72,100 | 72,098 | 99.997 |
NC_001137.3_T7 | 72,174 | 72,171 | 99.995 |
Total | 576,874 | 576,855 | 99.996 |
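
The figures in the table above can be cross-checked with a short script. This is only a sketch under an assumption inferred from the numbers themselves, not stated in the source: that Similarity (%) is mapped bases divided by sequence length, times 100, truncated (not rounded) to three decimals. The helper names `similarity_milli_percent` and `format_similarity` are hypothetical and used purely for illustration.

```python
# Rows copied from the table above: (name, length in bp, mapped contigs in bp).
ROWS = [
    ("NC_001137.3_T0", 72_100, 72_098),
    ("NC_001137.3_T1", 72_100, 72_098),
    ("NC_001137.3_T2", 72_100, 72_098),
    ("NC_001137.3_T3", 72_100, 72_098),
    ("NC_001137.3_T4", 72_100, 72_097),
    ("NC_001137.3_T5", 72_100, 72_097),
    ("NC_001137.3_T6", 72_100, 72_098),
    ("NC_001137.3_T7", 72_174, 72_171),
]


def similarity_milli_percent(length_bp, mapped_bp):
    """Similarity in thousandths of a percent, truncated.

    Assumed definition (inferred from the table, not given by the source):
    mapped / length * 100, cut off after three decimals. Integer arithmetic
    avoids floating-point rounding surprises.
    """
    return (mapped_bp * 100_000) // length_bp


def format_similarity(length_bp, mapped_bp):
    """Render the truncated similarity as a string such as '99.997'."""
    whole, frac = divmod(similarity_milli_percent(length_bp, mapped_bp), 1000)
    return f"{whole}.{frac:03d}"


# Column totals, to compare against the "Total" row.
total_length = sum(length for _, length, _ in ROWS)
total_mapped = sum(mapped for _, _, mapped in ROWS)

for name, length, mapped in ROWS:
    print(name, length, mapped, format_similarity(length, mapped))
print("Total", total_length, total_mapped,
      format_similarity(total_length, total_mapped))
```

Under this assumption the recomputed values agree with every row of the table, including the Total row; with ordinary rounding instead of truncation, the 99.995 and 99.996 entries would each come out 0.001 higher.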
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ji, M.; Kan, Y.; Kim, D.; Jung, J.; Yi, G. cPlot: Contig-Plotting Visualization for the Analysis of Short-Read Nucleotide Sequence Alignments. Int. J. Mol. Sci. 2022, 23, 11484. https://doi.org/10.3390/ijms231911484