Developments in Algorithms for Sequence Alignment: A Review
Abstract
1. Introduction
2. Pairwise Sequence Alignment
2.1. Divide and Conquer
2.2. Bounded Dynamic Programming
2.3. Scoring System of Pairwise Sequence Alignment
3. Multiple Sequence Alignment
3.1. Star Alignment Strategy
3.2. Progressive Alignment Strategy
4. Defects of Heuristic Algorithms and Countermeasures
4.1. Consistency Objective Function
4.2. Iterative Refinement
5. Quality Estimation of Multiple Sequence Alignment Software
5.1. Estimation Based on Reference Alignment
| | Structural Benchmark | Simulated Sequences | Commonality-Based |
|---|---|---|---|
| Scalability | Low | High | High |
| Pre-Built Alignment | Yes | Yes | No |
| Scoring Methods | Sum-of-pairs score and true column score | Sum-of-pairs score and true column score | Multiple overlap score and heads-or-tails score |
| Dependency | Protein structure | Probabilistic model | None |
| Test Sets | Fixed | Configurable | Not limited |
| Drawbacks | Limited by the diversity of benchmarks | Adopted model may have defects | Tested programs may share the same mistakes |
| Examples | BAliBASE [91], Pfam [95], HOMSTRAD [96] | ROSE [97], INDELible [98], Dawg [99] | MUMSA [100] |
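
The two reference-based scores named in the table can be illustrated with a minimal Python sketch. It assumes the commonly used definitions: the sum-of-pairs (SP) score is the fraction of residue pairs aligned in the reference that are also aligned in the tested alignment, and the true column (TC) score is the fraction of reference columns reproduced exactly. The function names, the `-` gap symbol, the choice to skip reference columns holding fewer than two residues, and the toy alignments are illustrative assumptions, not taken from any benchmark suite.

```python
from itertools import combinations


def columns(alignment, gap="-"):
    """Convert an alignment (equal-length gapped strings) into columns.

    Each column is a frozenset of (sequence index, residue index) pairs,
    so residues are identified by position rather than by letter.
    """
    counters = [0] * len(alignment)
    cols = []
    for j in range(len(alignment[0])):
        col = set()
        for i, row in enumerate(alignment):
            if row[j] != gap:
                col.add((i, counters[i]))
                counters[i] += 1
        cols.append(frozenset(col))
    return cols


def aligned_pairs(cols):
    """Collect every pair of residues that share a column."""
    pairs = set()
    for col in cols:
        pairs.update(combinations(sorted(col), 2))
    return pairs


def sp_score(candidate, reference):
    """Fraction of reference residue pairs recovered by the candidate."""
    ref_pairs = aligned_pairs(columns(reference))
    cand_pairs = aligned_pairs(columns(candidate))
    return len(ref_pairs & cand_pairs) / len(ref_pairs)


def tc_score(candidate, reference):
    """Fraction of reference columns reproduced exactly by the candidate.

    Assumption: reference columns with fewer than two residues are skipped.
    """
    ref_cols = [c for c in columns(reference) if len(c) >= 2]
    cand_cols = set(columns(candidate))
    return sum(c in cand_cols for c in ref_cols) / len(ref_cols)


# Toy data (hypothetical): the candidate shifts one G by a column.
reference = ["AC-GT", "ACAGT", "A--GT"]
candidate = ["ACG-T", "ACAGT", "A-G-T"]
print(sp_score(candidate, reference))  # 0.8
print(tc_score(candidate, reference))  # 0.75
```

In this toy case a single misplaced residue costs two of the ten reference pairs but a whole column out of four, which shows why the TC score is the stricter of the two measures.
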
5.2. Estimation Based on the Commonality among Alignments by Different Software
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
References
- Zou, Q.; Lin, G.; Jiang, X.; Liu, X.; Zeng, X. Sequence clustering in bioinformatics: An empirical study. Brief. Bioinform. 2018, 21, 1–10.
- Lewin, H.A.; Robinson, G.E.; Kress, W.J.; Baker, W.J.; Coddington, J.; Crandall, K.A.; Durbin, R.; Edwards, S.V.; Forest, F.; Gilbert, M.T.P.; et al. Earth BioGenome Project: Sequencing life for the future of life. Proc. Natl. Acad. Sci. USA 2018, 115, 4325–4333.
- Wong, K.M.; Suchard, M.A.; Huelsenbeck, J.P. Alignment Uncertainty and Genomic Analysis. Science 2008, 319, 473–476.
- Phillips, A.; Janies, D.; Wheeler, W. Multiple Sequence Alignment in Phylogenetic Analysis. Mol. Phylogenet. Evol. 2000, 16, 317–330.
- Rost, B.; Sander, C. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins: Struct. Funct. Bioinform. 1994, 19, 55–72.
- Fukuda, H.; Tomii, K. DeepECA: An end-to-end learning framework for protein contact prediction from a multiple sequence alignment. BMC Bioinform. 2020, 21, 10.
- Hu, G.; Feng, J.; Xiang, X.; Wang, J.; Salojärvi, J.; Liu, C.; Wu, Z.; Zhang, J.; Liang, X.; Jiang, Z.; et al. Two divergent haplotypes from a highly heterozygous lychee genome suggest independent domestication events for early and late-maturing cultivars. Nat. Genet. 2022, 54, 73–83.
- Chowdhury, B.; Garai, G. A review on multiple sequence alignment from the perspective of genetic algorithm. Genomics 2017, 109, 419–431.
- Chatzou, M.; Magis, C.; Chang, J.-M.; Kemena, C.; Bussotti, G.; Erb, I.; Notredame, C. Multiple sequence alignment modeling: Methods and applications. Brief. Bioinform. 2016, 17, 1009–1023.
- Needleman, S.B.; Wunsch, C.D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 1970, 48, 443–453.
- Hirschberg, D.S. A linear space algorithm for computing maximal common subsequences. Commun. ACM 1975, 18, 341–343.
- Söding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 2005, 21, 951–960.
- Rabiner, L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 1989, 77, 257–286.
- Eddy, S.R. Hidden Markov models. Curr. Opin. Struct. Biol. 1996, 6, 361–365.
- Lemoine, F.; Blassel, L.; Voznica, J.; Gascuel, O. COVID-Align: Accurate online alignment of hCoV-19 genomes using a profile HMM. Bioinformatics 2021, 37, 1761–1762.
- Durbin, R.; Eddy, S.R.; Krogh, A.; Mitchison, G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Cambridge University Press: Cambridge, UK, 1998.
- Eddy, S.R. Profile hidden Markov models. Bioinformatics 1998, 14, 755–763.
- Shen, C.; Zaharias, P.; Warnow, T. MAGUS+eHMMs: Improved multiple sequence alignment accuracy for fragmentary sequences. Bioinformatics 2022, 38, 918–924.
- Katoh, K.; Frith, M. Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics 2012, 28, 3144–3146.
- Smith, T.F.; Waterman, M.S. Identification of common molecular subsequences. J. Mol. Biol. 1981, 147, 195–197.
- Lipman, D.J.; Pearson, W.R. Rapid and Sensitive Protein Similarity Searches. Science 1985, 227, 1435–1441.
- Pearson, W.R.; Lipman, D.J. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 1988, 85, 2444–2448.
- Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
- Karp, R.M.; Rabin, M.O. Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 1987, 31, 249–260.
- Delcher, A.L.; Kasif, S.; Fleischmann, R.D.; Peterson, J.; White, O.; Salzberg, S.L. Alignment of whole genomes. Nucleic Acids Res. 1999, 27, 2369–2376.
- Marçais, G.; Delcher, A.L.; Phillippy, A.; Coston, R.; Salzberg, S.; Zimin, A. MUMmer4: A fast and versatile genome alignment system. PLOS Comput. Biol. 2018, 14, e1005944.
- Weiner, P. Linear pattern matching algorithms. In Proceedings of the 14th Annual Symposium on Switching and Automata Theory (Swat 1973), Iowa City, IA, USA, 15–17 October 1973; pp. 1–11.
- Manber, U.; Myers, G. Suffix Arrays: A New Method for On-Line String Searches. SIAM J. Comput. 1993, 22, 935–948.
- Ferragina, P.; Manzini, G. Opportunistic data structures with applications. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, Redondo Beach, CA, USA, 12–14 November 2000; pp. 390–398.
- Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 2018, 34, 3094–3100.
- Moshiri, N. ViralMSA: Massively scalable reference-guided multiple sequence alignment of viral genomes. Bioinformatics 2021, 37, 714–716.
- Katoh, K.; Misawa, K.; Kuma, K.; Miyata, T. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30, 3059–3066.
- Etminan, N.; Parvinnia, E.; Sharifi-Zarchi, A. FAME: Fast and memory efficient multiple sequences alignment tool through compatible chain of roots. Bioinformatics 2020, 36, 3662–3668.
- Liu, H.; Zou, Q.; Xu, Y. A novel fast multiple nucleotide sequence alignment method based on FM-index. Brief. Bioinform. 2022, 23, bbab519.
- Smirnov, V.; Warnow, T. MAGUS: Multiple sequence Alignment using Graph clUStering. Bioinformatics 2021, 37, 1666–1672.
- Edgar, R.C. MUSCLE v5 enables improved estimates of phylogenetic tree confidence by ensemble bootstrapping. bioRxiv 2021.
- Spouge, J.L. Speeding up Dynamic Programming Algorithms for Finding Optimal Lattice Paths. SIAM J. Appl. Math. 1989, 49, 1552–1566.
- Korf, R.E. Depth-first iterative-deepening: An optimal admissible tree search. Artif. Intell. 1985, 27, 97–109.
- Ranwez, V.; Harispe, S.; Delsuc, F.; Douzery, E.J.P. MACSE: Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons. PLoS ONE 2011, 6, e22594.
- Li, W.-H.; Wu, C.I.; Luo, C.C. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 1985, 2, 150–174.
- Schwartz, R.M.; Dayhoff, M.O. Matrices for Detecting Distant Relationships. In Atlas of Protein Sequences; National Biomedical Research Foundation: Washington, DC, USA, 1978; pp. 353–359.
- Jones, D.T.; Taylor, W.R.; Thornton, J.M. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 1992, 8, 275–282.
- Henikoff, S.; Henikoff, J.G. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 1992, 89, 10915–10919.
- Ríos, S.; Fernandez, M.F.; Caltabiano, G.; Campillo, M.; Pardo, L.; Gonzalez, A. GPCRtm: An amino acid substitution matrix for the transmembrane region of class A G Protein-Coupled Receptors. BMC Bioinform. 2015, 16, 206.
- Vingron, M.; Waterman, M.S. Sequence alignment and penalty choice: Review of concepts, case studies and implications. J. Mol. Biol. 1994, 235, 1–12.
- Korotkov, E.V.; Suvorova, Y.M.; Kostenko, D.O.; Korotkova, M.A. Multiple alignment of promoter sequences from the Arabidopsis thaliana L. Genome. Genes 2021, 12, 135.
- Pugacheva, V.; Korotkov, A.; Korotkov, E. Search of latent periodicity in amino acid sequences by means of genetic algorithm and dynamic programming. Stat. Appl. Genet. Mol. Biol. 2016, 15, 381–400.
- Korotkov, E.; Korotkova, M.A. Search for regions with periodicity using the random position weight matrices in the C. elegans genome. Int. J. Data Min. Bioinform. 2017, 18, 331–354.
- Zou, Q.; Guo, M.Z.; Wang, X.K.; Zhang, T.T. An algorithm for DNA multiple sequence alignment based on center star method and keyword tree. Tien Tzu Hsueh Pao/Acta Electron. Sin. 2009, 37, 1746–1750.
- Li, W.; Jaroszewski, L.; Godzik, A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 2001, 17, 282–283.
- Zou, Q.; Hu, Q.; Guo, M.; Wang, G. HAlign: Fast multiple similar DNA/RNA sequence alignment based on the centre star strategy. Bioinformatics 2015, 31, 2475–2481.
- Su, W.; Liao, X.; Lu, Y.; Zou, Q.; Peng, S. Multiple Sequence Alignment Based on a Suffix Tree and Center-Star Strategy: A Linear Method for Multiple Nucleotide Sequence Alignment on Spark Parallel Framework. J. Comput. Biol. 2017, 24, 1230–1242.
- Dong, G.; Fu, X.; Li, H.; Li, J. An accurate algorithm for multiple sequence alignment in MapReduce. J. Comput. Methods Sci. Eng. 2018, 18, 283–295.
- Barton, G.; Sternberg, M. A strategy for the rapid multiple alignment of protein sequences: Confidence levels from tertiary structure comparisons. J. Mol. Biol. 1987, 198, 327–337.
- Sokal, R.; Michener, C. A statistical method for evaluating systematic relationships. Univ. Kans. Sci. Bull. 1958, 38, 1409–1438.
- Katoh, K.; Toh, H. PartTree: An algorithm to build an approximate tree from a large number of unaligned sequences. Bioinformatics 2007, 23, 372–374.
- Blackshields, G.; Sievers, F.; Shi, W.; Wilm, A.; Higgins, D.G. Sequence embedding for fast construction of guide trees for multiple sequence alignment. Algorithms Mol. Biol. 2010, 5, 21.
- Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T.J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J.; et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 2011, 7, 539.
- Lassmann, T. Kalign 3: Multiple sequence alignment of large datasets. Bioinformatics 2020, 36, 1928–1929.
- Boyce, K.; Sievers, F.; Higgins, D.G. Simple chained guide trees give high-quality protein multiple sequence alignments. Proc. Natl. Acad. Sci. USA 2014, 111, 10556–10561.
- Yamada, K.D.; Tomii, K.; Katoh, K. Application of the MAFFT sequence alignment program to large data—reexamination of the usefulness of chained guide trees. Bioinformatics 2016, 32, 3246–3251.
- Tan, G.; Gil, M.; Löytynoja, A.P.; Goldman, N.; Dessimoz, C. Simple chained guide trees give poorer multiple sequence alignments than inferred trees in simulation and phylogenetic benchmarks. Proc. Natl. Acad. Sci. USA 2015, 112, E99–E100.
- Boyce, K.; Sievers, F.; Higgins, D.G. Reply to Tan et al.: Differences between real and simulated proteins in multiple sequence alignments. Proc. Natl. Acad. Sci. USA 2015, 112, E101.
- Löytynoja, A. Phylogeny-aware alignment with PRANK. Methods Mol. Biol. 2014, 1079, 155–170.
- Löytynoja, A.; Vilella, A.J.; Goldman, N. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics 2012, 28, 1684–1691.
- Maiolo, M.; Zhang, X.; Gil, M.; Anisimova, M. Progressive multiple sequence alignment with indel evolution. BMC Bioinform. 2018, 19, 331.
- Maiolo, M.; Gatti, L.; Frei, D.; Leidi, T.; Gil, M.; Anisimova, M. ProPIP: A tool for progressive multiple sequence alignment with Poisson Indel Process. BMC Bioinform. 2021, 22, 518.
- Zou, Q.; Shan, X.; Jiang, Y. A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band. Phys. Procedia 2012, 33, 322–327.
- Feng, D.-F.; Doolittle, R.F. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 1987, 25, 351–360.
- Gotoh, O. Heuristic Alignment Methods. Methods Mol. Biol. 2014, 1079, 29–43.
- Notredame, C.; Holm, L.; Higgins, D.G. COFFEE: An objective function for multiple sequence alignments. Bioinformatics 1998, 14, 407–422.
- Notredame, C.; Higgins, D.; Heringa, J. T-coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000, 302, 205–217.
- Do, C.B.; Mahabhashyam, M.S.; Brudno, M.; Batzoglou, S. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 2005, 15, 330–340.
- Wallace, I.M.; O’Sullivan, O.; Higgins, D.G.; Notredame, C. M-Coffee: Combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006, 34, 1692–1699.
- Berger, M.P.; Munson, P.J. A novel randomized iterative strategy for aligning multiple protein sequences. Bioinformatics 1991, 7, 479–484.
- Liu, K.; Raghavan, S.; Nelesen, S.; Linder, C.R.; Warnow, T. Rapid and Accurate Large-Scale Coestimation of Sequence Alignments and Phylogenetic Trees. Science 2009, 324, 1561–1564.
- Liu, K.; Warnow, T.J.; Holder, M.; Nelesen, S.M.; Yu, J.; Stamatakis, A.P.; Linder, C.R. SATé-II: Very Fast and Accurate Simultaneous Estimation of Multiple Sequence Alignments and Phylogenetic Trees. Syst. Biol. 2012, 61, 90.
- Hirosawa, M.; Totoki, Y.; Hoshida, M.; Ishikawa, M. Comprehensive study on iterative algorithms of multiple sequence alignment. Bioinformatics 1995, 11, 13–18.
- Gotoh, O. A weighting system and algorithm for aligning many phylogenetically related sequences. Bioinformatics 1995, 11, 543–551.
- Deorowicz, S.; Debudaj-Grabysz, A.; Gudyś, A. FAMSA: Fast and accurate multiple sequence alignment of huge protein families. Sci. Rep. 2016, 6, 33964.
- Zhan, Q.; Fu, Y.; Jiang, Q.; Liu, B.; Peng, J.; Wang, Y. SpliVert: A Protein Multiple Sequence Alignment Refinement Method Based on Splitting-Splicing Vertically. Protein Pept. Lett. 2020, 27, 295–302.
- Altschul, S.F. Gap costs for multiple sequence alignment. J. Theor. Biol. 1989, 138, 297–309.
- Lipman, D.J.; Altschul, S.F.; Kececioglu, J.D. A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 1989, 86, 4412–4415.
- Ranwez, V. Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment. PLoS ONE 2016, 11, e0160043.
- Ortuño, F.M.; Valenzuela, O.; Rojas, F.; Pomares, H.; Florido, J.P.; Urquiza, J.M.; Rojas, I. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: Structural information, non-gaps percentage and totally conserved columns. Bioinformatics 2013, 29, 2112–2121.
- Vega, C.G.Z.; Nebro, A.J.; García-Nieto, J.; Aldana-Montes, J.F. M2Align: Parallel multiple sequence alignment with a multi-objective metaheuristic. Bioinformatics 2017, 33, 3011–3017.
- Narayan, B.; Jeevitesh, M. Evolutionary computation approach to enhance protein multiple sequence alignments. Res. Sq. 2022. Available online: https://www.researchsquare.com/article/rs-1236304/v1 (accessed on 26 March 2022).
- Notredame, C. SAGA: Sequence alignment by genetic algorithm. Nucleic Acids Res. 1996, 24, 1515–1524.
- Iantorno, S.; Gori, K.; Goldman, N.; Gil, M.; Dessimoz, C. Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment. In Multiple Sequence Alignment Methods; Russell, D.J., Ed.; Humana Press: Totowa, NJ, USA, 2014; pp. 59–73.
- Aniba, M.R.; Poch, O.; Thompson, J.D. Issues in bioinformatics benchmarking: The case study of multiple sequence alignment. Nucleic Acids Res. 2010, 38, 7353–7363.
- Thompson, J.D.; Koehl, P.; Ripp, R.; Poch, O. BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark. Proteins: Struct. Funct. Bioinform. 2005, 61, 127–136.
- Thompson, J.D.; Plewniak, F.; Poch, O. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 1999, 27, 2682–2690.
- Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
- Roshan, U.; Livesay, D.R. Probalign: Multiple sequence alignment using partition function posterior probabilities. Bioinformatics 2006, 22, 2715–2721.
- Mistry, J.; Chuguransky, S.; Williams, L.; Qureshi, M.; Salazar, G.A.; Sonnhammer, E.L.L.; Tosatto, S.C.E.; Paladin, L.; Raj, S.; Richardson, L.J.; et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 2021, 49, D412–D419.
- Mizuguchi, K.; Deane, C.; Blundell, T.L.; Overington, J. HOMSTRAD: A database of protein structure alignments for homologous families. Protein Sci. 1998, 7, 2469–2471.
- Stoye, J.; Evers, D.; Meyer, F. Generating benchmarks for multiple sequence alignments and phylogenetic reconstructions. Proc. Int. Conf. Intell. Syst. Mol. Biol. 1997, 5, 303–306.
- Fletcher, W.; Yang, Z. INDELible: A Flexible Simulator of Biological Sequence Evolution. Mol. Biol. Evol. 2009, 26, 1879–1888.
- Cartwright, R.A. DNA assembly with gaps (Dawg): Simulating sequence evolution. Bioinformatics 2005, 21, iii31–iii38.
- Lassmann, T.; Sonnhammer, E.L.L. Automatic assessment of alignment quality. Nucleic Acids Res. 2005, 33, 7120–7128.
- Landan, G.; Graur, D. Heads or Tails: A Simple Reliability Check for Multiple Sequence Alignments. Mol. Biol. Evol. 2007, 24, 1380–1383.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chao, J.; Tang, F.; Xu, L. Developments in Algorithms for Sequence Alignment: A Review. Biomolecules 2022, 12, 546. https://doi.org/10.3390/biom12040546