Article

Investigation of the Molecular Epidemiology and Evolution of Circulating Severe Acute Respiratory Syndrome Coronavirus 2 in Thailand from 2020 to 2022 via Next-Generation Sequencing

by Jiratchaya Puenpa 1, Vorthon Sawaswong 2, Pattaraporn Nimsamer 2, Sunchai Payungporn 2, Patthaya Rattanakomol 1, Nutsada Saengdao 3, Jira Chansaenroj 1, Ritthideach Yorsaeng 1, Kamol Suwannakarn 3 and Yong Poovorawan 1,4,*
1 Center of Excellence in Clinical Virology, Department of Pediatrics, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
2 Center of Excellence in Systems Microbiology, Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
3 Department of Microbiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
4 FRS(T), The Royal Society of Thailand, Sanam Sueapa, Dusit, Bangkok 10300, Thailand
* Author to whom correspondence should be addressed.
Viruses 2023, 15(6), 1394; https://doi.org/10.3390/v15061394
Submission received: 9 June 2023 / Revised: 16 June 2023 / Accepted: 16 June 2023 / Published: 19 June 2023
(This article belongs to the Special Issue Applications of Next-Generation Sequencing in Virus Discovery 2.0)

Abstract

Coronavirus disease 2019 (COVID-19) is an infectious condition caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), which surfaced in Thailand in early 2020. The current study investigated the SARS-CoV-2 lineages circulating in Thailand and their evolutionary history. Complete genome sequencing of 210 SARS-CoV-2 samples collected from collaborating hospitals and the Institute of Urban Disease Control and Prevention over two years, from December 2020 to July 2022, was performed using next-generation sequencing technology. Multiple lineage introductions were observed before the emergence of the B.1.1.529 omicron variant, including B.1.36.16, B.1.351, B.1.1, B.1.1.7, B.1.524, AY.30, and B.1.617.2. The B.1.1.529 omicron variant was subsequently detected between January 2022 and June 2022. The evolutionary rate for the spike gene of SARS-CoV-2 was estimated to be between 0.87 × 10⁻³ and 1.71 × 10⁻³ substitutions per site per year. There was a substantial prevalence of the predominant mutations C25672T (L94F), C25961T (T190I), and G26167T (V259L) in the ORF3a gene during the Thailand outbreaks. Complete genome sequencing can enhance the prediction of future variant changes in viral genomes, which is crucial to ensuring that vaccine strains are protective against worldwide outbreaks.

1. Introduction

Over the past few decades, RNA viruses belonging to the Coronaviridae family have repeatedly caused life-threatening illnesses in the human population due to zoonotic spillover. In 2002, the severe acute respiratory syndrome coronavirus originated in Guangdong, China, and gave rise to a pandemic of atypical pneumonia, resulting in 8437 confirmed cases and 813 deaths [1]. In 2012, the Middle East respiratory syndrome coronavirus, first reported in Saudi Arabia, caused severe respiratory illness and death in 27 countries, with 858 confirmed fatalities [2]. More recently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan, China, in late December 2019, and the resulting COVID-19 outbreak was declared a pandemic in March 2020 [3]. As of 31 May 2023, the number of confirmed COVID-19 cases exceeded 760 million worldwide, including almost 7 million fatal cases [4]. The COVID-19 pandemic has continuously and progressively imposed a severe burden on economic and public health systems.
SARS-CoV-2 is an enveloped, positive-sense single-stranded RNA virus with a large genome of approximately 27–32 kb [5]. In contrast to other RNA viruses, coronaviruses harbor the proofreading activity of nonstructural protein 14 (nsp14ExoN), promoting the replication fidelity of their RNA-dependent RNA polymerase (RdRp), leading to a low mutation rate [6,7]. Continuous large-scale circulation of SARS-CoV-2 and an inequitable distribution of vaccines and antiviral drugs have resulted in stochastic intra- and inter-transmission events at the population level, which have driven a degree of viral adaptation and escape from host immunity. To maintain essential genetic information while adapting its molecular ligand to fit the host milieu, genetic diversification of the virus mainly occurs in the spike (S) region, resulting in broadened tissue tropism and host range and waning host immune defense [8,9]. The emergence of new variants is characterized by alteration of the spike gene, and variants of concern (VOCs) are classified based on consequences, i.e., increased transmission, reduced vaccine and antiviral effectiveness, reduced treatment efficacy, and altered clinical disease presentation [10]. To date, the world has been confronted with five VOCs—alpha, beta, gamma, delta, and more recently, omicron—the common sublineages of which are BA.1 to BA.5. Over ten million SARS-CoV-2 genome data entries collected worldwide are available from the Global Initiative on Sharing All Influenza Data (GISAID; http://www.gisaid.org, accessed on 15 January 2023) [11]. The study of the evolution of the virus, alongside the investigation of host-viral interaction, is valuable for improving coping strategies and health policies and predicting the evolutionary trajectory of the virus.
As of 24 May 2023, the time of writing, there had been 4,738,988 confirmed COVID-19 cases in Thailand and a total of 34,053 reported deaths [12]. Since January 2020, Thailand has experienced five COVID-19 waves, and the omicron variant is dominating the ongoing fifth wave. Although >25 million members of the population have received a third vaccine dose, COVID-19 remains a significant public health burden, affecting both fully vaccinated and vulnerable populations.
The present study investigated the molecular epidemiological trends and evolutionary history of SARS-CoV-2 in Thailand from December 2020 to July 2022. The genetic traits of Thai sequence variants and their phylogenetic relationships with other globally published variants were comprehensively analyzed via full-genome sequence comparisons.

2. Materials and Methods

2.1. Sample Collection and Processing

A total of 210 full-length SARS-CoV-2 genomes were successfully sequenced from individuals diagnosed with COVID-19 in Thailand from December 2020 to July 2022 (waves two to five). Of these, 63 sequences were detected before the B.1.1.529 omicron variant began to predominate. They were collected from various regions in Thailand, including Bangkok, Samut Sakhon, Lopburi, Narathiwat, and Yala, from December 2020 to December 2021. The remaining 147 sequences were collected during B.1.1.529 omicron variant predominance between January 2022 and July 2022 and were obtained exclusively from patients in Bangkok.
All nasopharyngeal samples included in the study were collected from collaborating hospitals and the Institute of Urban Disease Control and Prevention. These samples had tested positive for SARS-CoV-2 in routine multiplex real-time reverse transcription polymerase chain reaction (RT-PCR) assays, as described previously [13]. A magLEAD 12gC instrument (Precision System Science, Chiba, Japan) was used to extract nucleic acid from a 200-μL aliquot of supernatant in accordance with the manufacturer’s instructions, and the extracts were then analyzed at our laboratory.

2.2. Genomic Sequencing of SARS-CoV-2

To investigate the molecular epidemiology and evolution of SARS-CoV-2, further analysis of qRT-PCR-positive samples with CT values below 25 was conducted using next-generation sequencing (NGS). The Celemics comprehensive respiratory virus panel (Celemics Inc., Incheon, Republic of Korea) was used to sequence and identify complete SARS-CoV-2 genomes. Briefly, library preparation involved mixing 25 ng of extracted RNA with an RNA fragment buffer mix to facilitate fragmentation. First-strand cDNA was then synthesized using a first-strand synthesis master mix. The first-strand cDNA underwent double-stranded cDNA construction via incubation at 16 °C for 60 min with a 2nd-strand synthesis-1 mix, followed by a 2nd-strand synthesis-2 mix at 25 °C for 15 min. The double-stranded cDNA was cleaned, repaired, and added to poly(A) tail oligomers in a 5 ERA buffer mix. After multiple incubation steps at different temperatures, the A-tailed DNA was ligated with adaptors in a ligation reaction mix at 20 °C for 15 min. The ligated DNA was purified using CeleMag cleanup beads, amplified, and transformed into an adaptor-ligated library using CLM polymerase and UDI primers, in accordance with the manufacturer’s instructions. The constructed DNA library was assessed for quantity and quality via automated capillary gel electrophoresis (QIAxcel; Qiagen, Hilden, Germany) to ensure the presence of 200- to 400-bp DNA fragments. The DNA libraries were then subjected to NGS using the Illumina NextSeq 500 system with the mid/high-output kit v2.5 (300 cycles). The resulting FASTQ data were trimmed, assembled, and analyzed using the Celemics Virus Verifier pipeline, facilitating the identification and generation of consensus sequences for the SARS-CoV-2 genome. Any nucleotide gaps found in the assembled SARS-CoV-2 sequences were filled by incorporating nucleotide sequences obtained from conventional RT-PCR-derived Sanger sequencing using primers specifically designed for those gaps.

2.3. Phylogenetic Analysis and Evolutionary Dynamics

The complete genome sequences acquired were compared with publicly available sequence data from GISAID. The sequence dataset was constructed with BioEdit v7.2.6 software [14] and aligned using CLUSTAL W at the European Bioinformatics Institute web server [15]. The diversity of SARS-CoV-2 lineages was analyzed with the maximum-likelihood phylogenetic method available in the MEGA program (v7) [16]. The Kimura two-parameter model with a gamma distribution (Γ) was selected as the substitution model in the analyses. The statistical consistency of tree nodes was determined via the bootstrap method (1000 random samplings).
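The Kimura two-parameter (K2P) model treats transitions and transversions as separate classes of change. As a rough illustration only, the closed-form pairwise K2P distance can be sketched in Python as follows. The function name and the toy sequences are hypothetical, and the sketch omits the gamma rate correction and the maximum-likelihood tree search that MEGA performs.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Sites containing gaps or ambiguity codes are skipped. No gamma
    correction is applied, so this is simpler than the K2P+G model
    named in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

For closely related genomes the distance is close to the raw proportion of differing sites; the correction matters more as sequences diverge.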
A time-scaled phylogenetic tree for complete genome sequences was reconstructed with the BEAST version 1.10.4 program [17]. An uncorrelated lognormal prior distribution of nucleotide substitution rates among lineages and three independent Markov chain Monte Carlo (MCMC) procedures were used for Bayesian phylogenetic analyses. The general time-reversible model with a 4-category gamma-distributed rate variation across sites was used as the nucleotide substitution model. Bayesian MCMC analysis was run for 120 million steps and sampled every 300 steps from the posterior distribution. Tracer version 1.7.1 (http://tree.bio.ed.ac.uk/software/tracer/, accessed on 7 January 2023) was used to assess the convergence of all parameters (an adequate effective sample size (ESS) of >200). The posterior trees were summarized as a maximum clade credibility (MCC) tree using the TreeAnnotator v1.10.4 tool (http://beast.bio.ed.ac.uk/treeannotator, accessed on 7 January 2023) after discarding the first 10% as burn-in, and then visualized in FigTree.
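To make the sampling settings above concrete, the following Python sketch computes how many logged states remain per run after burn-in and shows one simplified way to estimate an effective sample size (ESS), the convergence statistic reported by Tracer. Both function names are hypothetical, and the ESS estimator is a textbook-style approximation, not the routine implemented in Tracer.

```python
def retained_samples(chain_length: int, sample_every: int, burn_in_percent: int) -> int:
    """Number of logged states left after discarding the burn-in."""
    total = chain_length // sample_every
    return total - total * burn_in_percent // 100


def effective_sample_size(trace: list[float]) -> float:
    """Simplified ESS: n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first non-positive autocorrelation. This runs in
    quadratic time, so it is only meant for short illustrative traces.
    """
    n = len(trace)
    if n < 2:
        raise ValueError("trace must contain at least two samples")
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return float(n)  # constant trace: no autocorrelation to estimate
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Settings taken from the text: 120 million steps, sampled every 300, 10% burn-in
per_run = retained_samples(120_000_000, 300, 10)
```

With these settings each run retains 360,000 logged states, so an ESS threshold of 200 asks only that a small fraction of them behave like independent draws.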

2.4. Nucleotide Sequence Accession IDs

Genome sequences generated in this study were deposited in the GISAID database (https://www.gisaid.org, accessed on 10 March 2023). Accession IDs are available in Supplementary Table S1.

3. Results

3.1. Divergence and Amino Acid Variations in SARS-CoV-2 Strains Detected before the Predominance of the B.1.1.529 Omicron Variant

The SARS-CoV-2 outbreaks in Thailand before the emergence of the B.1.1.529 omicron variant were classified into four waves. The first wave occurred from March 2020 to April 2020, the second from late December 2020 to January 2021, the third from April 2021 to July 2021, and the fourth from August 2021 to December 2021 [18]. A total of 63 sequences were collected prior to the predominance of the B.1.1.529 omicron variant. Of these, 14 sequences were sampled during the second wave, with 12 belonging to lineage B.1.36.16 and one each belonging to B.1.351 and B.1.1 lineages. There were 22 sequences from the third wave outbreak, with the alpha variant (19 sequences) being the most common, followed by lineage B.1.524 (3 sequences). During the fourth wave, 27 sequences were collected, with 17 belonging to lineage AY.30 and 10 belonging to lineage B.1.617.2.
Phylogenetic analysis revealed the dynamic nature of the epidemic in Thailand, and molecular changes in the SARS-CoV-2 genome were detected before the B.1.1.529 omicron variant predominated (Figure 1). Several disparate lineages were identified, with an initial lineage B (clade L) linked to early Bangkok cases dating from February 2019, including lineage A (clade S) and lineage B.1 (clades G, GH, and GR). All the lineages in the first epidemic wave except lineage B (clade L) probably emerged before April 2020. Lineage B.1.36.16 (clade GH) was found in July and August 2020 and was established near the beginning of the second epidemic wave. Most SARS-CoV-2 collected from the third epidemic wave belonged to lineage B.1.1.7 (clade GRY/alpha), with a few belonging to lineage B.1.524 (clade G). The third epidemic wave’s divergence time estimate for lineage B.1.1.7 (clade GRY/alpha) was December 2020. Phylogenetic analysis in the current study indicated that two lineages dominated the fourth epidemic wave: AY.30 (clade GK/delta) and B.1.617.2 (clade GK/delta). We estimated that interpersonal transmission of the fourth wave lineage began in January 2021. Its spread was sustained in April 2021 for lineage AY.30 (clade GK/delta) and in May 2021 for lineage B.1.617.2 (clade GK/delta).
Due to the error-prone nature of viral RNA genome replication, we analyzed crucial amino acid replacements in SARS-CoV-2 proteins from the samples acquired in this study from the second wave to the fourth epidemic wave. The 63 SARS-CoV-2 sequences identified in this study were combined with 67 published Thai samples to obtain a dataset of 130 sequences. The positions of amino acid substitutions in SARS-CoV-2 proteins and their relative frequencies in the entire set of 130 genomes were aligned and compared to the first isolate identified in December 2019, Wuhan-Hu-1 (Figure 2). Comparative analysis of the SARS-CoV-2 sequences revealed amino acid changes in all genome samples, most of which were scattered in nonstructural proteins. The nsp3, nsp14, and nsp2 viral proteins changed at 38, 22, and 17 amino acid positions, respectively. In S, N, and M, there were a total of 46, 24, and 10 amino acid position changes in the structural proteins, respectively. There were only three amino acid position changes in the E protein. There were minimal changes in nsp5, nsp7, nsp8, nsp9, nsp10, nsp16, and ORF6, and these changes were present in approximately 0.8–1.5% of the genome samples. Some amino acid substitutions in nsp3, nsp12, nsp13, M, S, ORF3a, ORF7a, ORF8, and N were present in >20% of the genomes sampled.
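The relative frequencies described above amount to counting, for each substitution, the share of genomes that differ from the reference at that position. A minimal Python sketch of that tally is given below. It assumes the protein sequences are already aligned to the reference with equal length, and the function name and toy sequences are hypothetical rather than part of the study's workflow.

```python
from collections import Counter


def substitution_frequencies(reference: str, samples: list[str]) -> dict[str, float]:
    """Percentage of samples carrying each amino acid substitution.

    Keys use the usual notation: reference residue, 1-based position,
    observed residue (for example 'F2L'). Unknown residues ('X') and
    gaps ('-') are not counted as substitutions.
    """
    counts: Counter[str] = Counter()
    for seq in samples:
        if len(seq) != len(reference):
            raise ValueError("each sample must be aligned to the reference")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa and aa not in "X-":
                counts[f"{ref_aa}{pos}{aa}"] += 1
    return {sub: 100 * n / len(samples) for sub, n in counts.items()}


# Toy example (hypothetical peptide fragments, not study data)
freqs = substitution_frequencies("MFVFL", ["MFVFL", "MFVFL", "MLVFL", "MLVFL"])

# In a dataset of 130 genomes, one or two carriers correspond to these percentages
one_carrier = round(100 * 1 / 130, 1)
two_carriers = round(100 * 2 / 130, 1)
```

On this reading, the 0.8–1.5% range reported for the least variable proteins is consistent with changes seen in only one or two of the 130 genomes.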

3.2. Evaluation of the Evolutionary History of SARS-CoV-2 in Thailand

To investigate the evolutionary history of SARS-CoV-2 outbreaks in Thailand, a phylogenetic analysis of the spike sequence samples obtained and the SARS-CoV-2 reference sequence Wuhan-Hu-1 (accession NC_045512) was conducted. Relationships between the Thailand SARS-CoV-2 variants and the dates of their emergence in Thailand are shown in Figure 3. The nucleotide substitution rate for the sampled population was estimated to be 1.24 × 10⁻³ (95% highest posterior density interval 0.87 × 10⁻³ to 1.71 × 10⁻³) substitutions per site per year. The estimated time to the most recent common ancestor (tMRCA) of SARS-CoV-2 was 2.7 years before the most recent strain analyzed. The tMRCAs for the omicron sublineages BA.1 and BA.2 were approximately 0.8 and 0.6 years, respectively. The tMRCA for the omicron sublineages BA.4 and BA.5 was as recent as 0.2 years.
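The per-site rate and the tMRCA values above can be translated into more intuitive quantities. The Python sketch below makes two assumptions that are not stated in the text: the spike coding sequence is taken as 3822 nucleotides (positions 21563 to 25384 in the NC_045512 annotation), and the most recent strain is given a hypothetical collection date of 31 July 2022, since the text only reports July 2022.

```python
from datetime import date, timedelta

SPIKE_LENGTH_NT = 3822          # assumed: S coding sequence length in NC_045512
RATE_MEAN = 1.24e-3             # substitutions per site per year, from the text
RATE_HPD = (0.87e-3, 1.71e-3)   # 95% interval, from the text


def substitutions_per_year(rate: float, length: int = SPIKE_LENGTH_NT) -> float:
    """Expected substitutions per year across a region of the given length."""
    return rate * length


def tmrca_to_date(most_recent: date, years_before: float) -> date:
    """Approximate calendar date of a tMRCA given as years before the newest tip."""
    return most_recent - timedelta(days=round(years_before * 365.25))


mean_per_year = substitutions_per_year(RATE_MEAN)
hpd_per_year = tuple(substitutions_per_year(r) for r in RATE_HPD)

# Hypothetical tip date: the text states only that sampling ended in July 2022
root_date = tmrca_to_date(date(2022, 7, 31), 2.7)
```

Under these assumptions the mean rate corresponds to roughly five substitutions per year across the spike gene, and a tMRCA of 2.7 years places the root in late 2019, in line with the reported emergence of the virus.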
In Thailand, the Sinovac-CoronaVac vaccine was initially approved for use in late February 2021 (Figure 4). Following the outbreak of the third wave with the Alpha variant, the AstraZeneca vaccine and Sinopharm were administered to the Thai population in June 2021. During the fourth wave outbreak with the Delta variant, approximately 10% of the Thai population had received full vaccination, and the Pfizer-BioNTech vaccine was first used in Thailand in August 2021. The Thai population received the Moderna vaccine for the first time in November 2021. During the fifth wave outbreak with the Omicron variant, Thailand achieved a fully vaccinated rate of over 70%.

3.3. SARS-CoV-2 Omicron Sublineage BA.1 Genetic Characterization

The 63 Thailand B.1.1.529 omicron sequences identified in the present study were combined with 145 publicly available SARS-CoV-2 genome sequences identified worldwide, resulting in a comprehensive dataset of 208 sequences. All sequences in the present study in the B.1.1.529/BA.1 lineage were collected between January 2022 and May 2022. Analysis using clade-defining sequences (https://clades.nextstrain.org/, accessed on 7 January 2023) identified the Thailand sequences as 7 sublineages from the parent lineage B.1.1.529/BA.1 (Figure 5).
To investigate the mutation profile of the SARS-CoV-2 omicron BA.1 variant in the Thailand dataset, the 63 viral sequences identified in the current study and the dataset of 2951 viral sequences downloaded from the GISAID database were analyzed. The sequences were analyzed using the Nextclade Webtool to identify the most common mutations and characteristics of the Thailand dataset [19]. The majority of sequences (n = 44) were classified as sublineage BA.1.1 and shared the R346K (G22599A) substitution in the spike protein. Mutations in sublineage BA.1.1 (C2470T, C14805T, T19632C, and A26530G) were dominant, with a frequency > 50% in the Thailand genomes. With regard to the BA.1 variant from Thailand, T2019C (M585T in ORF1a), C2470T, G6850T, and G23628A (S689N in S) were present at frequencies > 10% (Table 1).
C14117T and A26530G were present at high frequencies (>60%) in Thai viral genomes in the sublineage BA.1.1.15. G2894A and G26167T mutations dominated (>10%) in Thai genomes of sublineage BA.1.1.18. C4113T, C5672T, and A26530G mutations (>40%), followed by T851C, C10605T, C12084T, G15850A, and G28436T (<10%), were present in Thailand sublineages BA.1.17/BA.1.17.2. In this study, one Thailand strain (EPI_ISL_12176269) was identified as sublineage BA.1.22, and it was detected in March 2022. This strain contained six genetic variations: A3301T (ORF1a:L1012F), G11417T (ORF1a:V3718F), C15738T, C17285T (ORF1b:S1273L), C20719T, and C27494T (ORF7a:P34L). All strains in the sublineage BA.1.22 shared the unique mutations G3182A (ORF1a:E973K) and G5515T.

3.4. SARS-CoV-2 Omicron Sublineage BA.2 Genetic Characterization

BA.2 and its sublineages accounted for 29.5% of all variants among the sequenced samples. In this study, 62 omicron sublineage BA.2 variants obtained in Thailand were analyzed for the period from January 2022 to June 2022. In phylogenetic analysis, 53% (33/62) were classified as sublineage BA.2, 23% (14/62) as BA.2.10, 16% (10/62) as BA.2.9, 5% (3/62) as BA.2.27, and 3% (2/62) as BA.2.3 (Figure 6).
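The proportions above follow directly from the sublineage counts. As a quick check, a short Python sketch using the counts quoted in the text (variable names are illustrative) reproduces them:

```python
# Counts of BA.2 sublineages among the Thailand sequences, as given in the text
ba2_counts = {"BA.2": 33, "BA.2.10": 14, "BA.2.9": 10, "BA.2.27": 3, "BA.2.3": 2}
TOTAL_SEQUENCED = 210  # all genomes sequenced in the study

ba2_total = sum(ba2_counts.values())

# Share of each sublineage within BA.2, rounded to whole percentages
within_ba2 = {lineage: round(100 * n / ba2_total) for lineage, n in ba2_counts.items()}

# Share of BA.2 and its sublineages among all sequenced samples
ba2_share = round(100 * ba2_total / TOTAL_SEQUENCED, 1)
```

The counts sum to 62, and 62 of 210 genomes gives the 29.5% share stated for BA.2 and its sublineages.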
To further characterize the genomes of SARS-CoV-2 omicron BA.2 and its sublineages in the Thailand viral population, an analysis of sequence variants across the entire viral genome was conducted, comparing them to the Wuhan-Hu-1 strain (MN908947). The mutations C241T, T22882G (S:N440K), and C23854A (S:N764K) were present at high frequencies (>80%) in the genomes of Thailand viral sublineage BA.2, followed by C7471T and C25416T (>40%) (Table 2). The BA.2.27 sublineage, primarily identified in Thailand, has been detected in several other countries, including the United Kingdom, France, the United States, and India. Within the BA.2.27 sublineage, mutations C241T and C10198T were present at frequencies > 80% in the 5′UTR and ORF1a regions, whereas mutations C17745T, C19610T (ORF1b:T2048I), C25672T (ORF3a:L94F), and G28739T (N:A156S) were present at frequencies < 10%. The BA.2.3 sublineage has also been identified in several Asian countries, primarily the Philippines, Japan, and South Korea. The BA.2.3 sublineage exhibited dominant mutations (C241T and A21222G) with frequencies > 75% in the Thai population. Mutations C832T and T7282C were also present at frequencies > 20%, and mutations in the ORF1b region were present at frequencies > 8%.
The BA.2.9 sublineage characterized by the H78Y mutation in ORF3a was predominantly circulating in Europe, with an exceptionally high prevalence observed in Denmark. That sublineage, which shares the V1393A mutation in ORF1a, was most commonly detected in Thailand but has also been identified in Germany, Israel, Japan, and Denmark. Among the BA.2.9/BA.2.9.5 sublineages, Thailand variants have frequencies > 3% and are located in ORF1a, ORF1b, ORF9b, and the spike protein. In the Thailand sublineage BA.2.10, the mutations T7813C and C25961T were present at frequencies > 10%.

3.5. SARS-CoV-2 Omicron Sublineage BA.4 and BA.5 Genetic Characterization

The omicron sublineages BA.4 and BA.5 comprised variants that were detected in Thailand during June 2022 and July 2022. A phylogenetic tree based on complete genome sequences was constructed to investigate genetic relationships between Thailand’s BA.4 and BA.5 variants and global BA.4 and BA.5 variants (Figure 7). The complete genomes of the Thailand BA.4 and BA.5 variants were compared with a set of 110 SARS-CoV-2 genomes publicly available from GISAID. In the tree, 7/210 sequences (3.3%) were categorized as sublineage BA.4, and 15/210 (7.1%) were categorized as sublineage BA.5. The Thailand BA.5 sequences were categorized into four subtypes: BA.5.2, BA.5.2.1, BA.5.2.22, and BA.5.2.26, as determined by an analysis using clade-defining sequences available at (https://clades.nextstrain.org/, accessed on 7 January 2023). In Thailand, BA.4 and its sublineages exhibited high frequencies (>70%) of C241T, G6680A (ORF1a:A2139T), A22786C (Spike:R408S), T22882G (Spike:N440K), and T24163C mutations (Table 3). The most frequent mutations (>80%) present in Thailand BA.5 and its sublineages were C16616A (ORF1b:T1050N), A18163G (ORF1b:I1566V), T22882G (Spike:N440K), C23854A (Spike:N764K), and C26270T (E:T9I).
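The paired notation used above, such as T22882G (Spike:N440K), links a genome position to a codon within a gene. The Python sketch below shows that arithmetic for genes translated from a single contiguous reading frame. The gene start coordinates are assumptions taken from the NC_045512 annotation rather than from the text, the function names are hypothetical, and ORF1a/ORF1b are left out because their residue numbering needs extra handling.

```python
import re

# Assumed 1-based start positions of coding sequences in NC_045512
GENE_STARTS = {"S": 21563, "ORF3a": 25393, "E": 26245, "N": 28274}


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'T22882G' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def codon_number(gene: str, nt_position: int) -> int:
    """1-based codon (amino acid) number for a genome position within a gene."""
    start = GENE_STARTS[gene]
    if nt_position < start:
        raise ValueError("position lies upstream of the gene start")
    return (nt_position - start) // 3 + 1


# Mutations quoted in the text, mapped back to their residue numbers
_, pos_n440k, _ = parse_mutation("T22882G")
spike_residue = codon_number("S", pos_n440k)
envelope_residue = codon_number("E", parse_mutation("C26270T")[1])
```

Under these coordinates the residue numbers recovered for the spike and envelope mutations agree with the annotations N440K and T9I given in the text; tools such as Nextclade perform this mapping, together with the amino acid translation, automatically.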

4. Discussion

In this study, the genomic variation and molecular phylogeny of 210 SARS-CoV-2 strains identified in Thailand from December 2020 to July 2022 were characterized using complete genome sequences. Classification analysis identified 31 distinct SARS-CoV-2 lineages in the samples. Similar findings have also been reported in populations in Malaysia [20], Hong Kong [21], and India [22]. Among the 31 different lineages identified in this study, seven were detected before the emergence of the B.1.1.529 omicron variant: B.1.36.16, B.1.351, B.1.1, B.1.1.7, B.1.524, AY.30, and B.1.617.2. Previous studies indicate that Thailand experienced its first COVID-19 wave between March 2020 and April 2020, during which the prevalent lineages identified were A, B, and B.1 [13]. Before the B.1.1.529 omicron variant became predominant, the majority of lineages belonged to three groups: B.1.36.16 (second wave), alpha (third wave), and delta (fourth wave). The first occurrence of BA.1 in Thailand was recorded on 24 November 2021 in a woman who had traveled to Africa (GISAID identifier EPI_ISL_7398758). The prevalence of BA.1 peaked between January 2022 and February 2022, and BA.2 became prevalent in the following months [18]. The variants BA.4 and BA.5 were detected in our samples during June 2022 and July 2022. After full vaccination coverage in Thailand exceeded 70%, the number of SARS-CoV-2 infections decreased.
Based on Bayesian analyses with the tip-dating method, the rate of evolutionary change in the SARS-CoV-2 spike region was 1.24 × 10⁻³ substitutions per site per year, which is concordant with previous studies [23,24,25,26]. This rate of change is comparable to that observed in other human coronaviruses [27], but it is nearly three times higher than the reported mutation rate of human influenza B [28]. A previous study examined the global evolution rate of SARS-CoV-2 during the early stages of the outbreak and reported an estimated mean nucleotide mutation rate ranging from 1.79 × 10⁻³ to 1.83 × 10⁻³ substitutions per site per year [29]. In another study, it was suggested that the incubation period, serial interval, and generation time of SARS-CoV-2 have progressively decreased with the emergence of each new VOC [30].
The SARS-CoV-2 genome has been undergoing rapid evolution throughout the pandemic, with evidence suggesting that mutations in the genome affect the virus’s virulence [31]. The current study identified distinctive genomic patterns of synonymous and missense variants linked to the distribution of lineages in Thailand. The genetic variations observed in the Thailand isolates predominantly occurred within nonstructural proteins. Previous studies have also reported similar findings [32]. The ORF3a gene encodes a protein crucial in modulating inflammation, antiviral responses, and apoptosis processes [33]. In the present study, a notable prevalence of the dominant mutations C25672T (L94F), C25961T (T190I), and G26167T (V259L) was observed within this gene in the Thailand isolates.
Although most structural proteins remained conserved, the spike protein exhibited multiple mutations, notably the dominant variant carrying the D614G mutation, which is frequently associated with enhanced viral infectivity [34]. The spike protein is widely recognized for facilitating infection via interaction with the angiotensin-converting enzyme 2 (ACE2) receptor on the surface of human host cells [35,36]. In the current study, the dominant mutations in the spike gene at R408S, N440K, and N764K were observed at a frequency of >70% in Thailand isolates. In previous studies, multiple mutations were identified in the receptor-binding domain of VOCs that enhanced ACE2 binding affinity and facilitated evasion of antibody binding [37,38]. For example, the K417N, L452R, E484K, F486V, and N501Y mutations, present in most VOCs, were also detected in the samples isolated in Thailand in the current study. The S689N mutation in the spike gene, which first appeared in unassigned variants in May 2020, was also detected in the BA.1 lineage. The S689N mutation was detected in multiple other variants, including B.1.1.7, B.1.258.11, B.1.351, and B.1.617.2.
The N protein maintains the genome structure inside the viral envelope and is also involved in viral assembly and budding [39,40]. Its high degree of conservation has led to its utilization for diagnostics and the investigation of it as a target for new vaccines [41,42]. R203K/G204R substitutions in the N protein have been linked to enhanced SARS-CoV-2 infectivity, fitness, and virulence [43,44]. In the present study, the A156S, A398V, and D399Y mutations in the N protein were detected at frequencies exceeding 3%.
The current study has some limitations. First, we sequenced only the complete genomes of SARS-CoV-2-positive specimens with high viral loads, which could be associated with specific SARS-CoV-2 genotypes. Second, the study focused solely on phylogeny and molecular characteristics; therefore, inferences about the antigenicity of new SARS-CoV-2 variants were limited. Third, the genetic sequence data used in the study were not from samples that were randomly selected for sequencing; hence, they may not be representative of the SARS-CoV-2 circulating throughout Thailand. Lastly, the study did not investigate correlations between different SARS-CoV-2 variants and clinical features, thus missing an opportunity to identify potential changes in clinical manifestations associated with emerging SARS-CoV-2 variants.
In summary, the present study highlighted the changing SARS-CoV-2 variants in epidemic waves in Thailand and identified unique genomic patterns that may be associated with the severity of COVID-19. The occurrence of some mutations can significantly affect the evolutionary trajectory of the epidemic and the dissemination of genetically diverse variations. Continued molecular surveillance, including complete genome sequencing, is crucial with respect to identifying emerging SARS-CoV-2 variants early. This will enable us to reduce the overall burden of COVID-19 and guide research on SARS-CoV-2 vaccines and therapeutic targets.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v15061394/s1. Table S1: GISAID Accession Numbers.

Author Contributions

Conceptualization, J.P., S.P. and Y.P.; methodology, V.S., P.N. and N.S.; software, J.P. and P.R.; validation, J.C., R.Y. and K.S.; formal analysis, V.S., P.N. and S.P.; investigation, K.S.; resources, P.R.; data curation, J.P., S.P., K.S. and Y.P.; writing – original draft preparation, J.P. and P.R.; writing – review and editing, J.P. and Y.P.; visualization, S.P. and Y.P.; supervision, S.P., K.S. and Y.P.; project administration, Y.P.; funding acquisition, Y.P. All authors have read and agreed to the published version of the manuscript.

Funding

The research was financially supported by the Health Systems Research Institute, the National Research Council of Thailand, the Center of Excellence in Clinical Virology, Chulalongkorn University, King Chulalongkorn Memorial Hospital, the MK Restaurant Group and Aunt Thongkam Foundation, and the BJC Big C Foundation. The Rachadapisek Sompote Fund of Chulalongkorn University awarded postdoctoral fellowships to Jiratchaya Puenpa.

Institutional Review Board Statement

This study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of the Faculty of Medicine, Chulalongkorn University, Thailand (approval number IRB178/64). All information and patient identifiers were anonymized to protect patient confidentiality.

Informed Consent Statement

Patient consent was waived by the institutional review board of the Ethics Committee for human research because the samples were anonymized.

Data Availability Statement

Genome sequences generated in this study were deposited in the GISAID database (https://www.gisaid.org, accessed on 7 January 2023). Accession IDs are available in Supplementary Table S1.

Acknowledgments

We greatly appreciate all participants for helping and supporting this study. We thank all the staff from the Center of Excellence in Clinical Virology, Faculty of Medicine, Chulalongkorn University, for their help with the experiment.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhong, N.S.; Zheng, B.J.; Li, Y.M.; Poon, L.L.M.; Xie, Z.H.; Chan, K.H.; Li, P.H.; Tan, S.Y.; Chang, Q.; Xie, J.P.; et al. Epidemiology and cause of severe acute respiratory syndrome (SARS) in Guangdong, People’s Republic of China, in February, 2003. Lancet 2003, 362, 1353–1358.
  2. Middle East Respiratory Syndrome Coronavirus (MERS-CoV). Available online: www.who.int/health-topics/middle-east-respiratory-syndrome-coronavirus-mers#tab=tab_1 (accessed on 2 June 2023).
  3. WHO Director-General’s Opening Remarks at the Media Briefing on COVID-19. 11 March 2020. Available online: www.who.int/director-general/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020 (accessed on 5 June 2023).
  4. WHO Coronavirus (COVID-19) Dashboard. Available online: covid19.who.int/ (accessed on 2 June 2023).
  5. Wang, M.Y.; Zhao, R.; Gao, L.J.; Gao, X.F.; Wang, D.P.; Cao, J.M. SARS-CoV-2: Structure, Biology, and Structure-Based Therapeutics Development. Front. Cell. Infect. Microbiol. 2020, 10, 587269. [Google Scholar] [CrossRef] [PubMed]
  6. Baddock, H.T.; Brolih, S.; Yosaatmadja, Y.; Ratnaweera, M.; Bielinski, M.; Swift, L.P.; Cruz-Migoni, A.; Fan, H.; Keown, J.R.; Walker, A.P.; et al. Characterization of the SARS-CoV-2 ExoN (nsp14ExoN-nsp10) complex: Implications for its role in viral genome stability and inhibitor identification. Nucleic Acids Res. 2022, 50, 1484–1500. [Google Scholar] [CrossRef] [PubMed]
  7. Robson, F.; Khan, K.S.; Le, T.K.; Paris, C.; Demirbag, S.; Barfuss, P.; Rocchi, P.; Ng, W.L. Coronavirus RNA Proofreading: Molecular Basis and Therapeutic Targeting. Mol. Cell 2020, 79, 710–727. [Google Scholar] [CrossRef] [PubMed]
  8. Jaimes, J.A.; André, N.M.; Chappie, J.S.; Millet, J.K.; Whittaker, G.R. Phylogenetic Analysis and Structural Modeling of SARS-CoV-2 Spike Protein Reveals an Evolutionary Distinct and Proteolytically Sensitive Activation Loop. J. Mol. Biol. 2020, 432, 3309–3325. [Google Scholar] [CrossRef]
  9. Wrobel, A.G.; Benton, D.J.; Roustan, C.; Borg, A.; Hussain, S.; Martin, S.R.; Rosenthal, P.B.; Skehel, J.J.; Gamblin, S.J. Evolution of the SARS-CoV-2 spike protein in the human host. Nat. Commun. 2022, 13, 1178. [Google Scholar] [CrossRef]
  10. Tracking SARS-CoV-2 Variants. Available online: www.who.int/en/activities/tracking-SARS-CoV-2-variants/ (accessed on 15 April 2023).
  11. Full Genome Tree Derived from All Outbreak Sequences. Available online: www.epicov.org/epi3/frontend# (accessed on 2 June 2023).
  12. COVID-19 Situation, Thailand. 24 May 2023. Available online: cdn.who.int/media/docs/default-source/searo/thailand/2023_05_24_tha-sitrep-264-covid-19.pdf?sfvrsn=cc6f41de_1 (accessed on 2 June 2023).
  13. Puenpa, J.; Suwannakarn, K.; Chansaenroj, J.; Nilyanimit, P.; Yorsaeng, R.; Auphimai, C.; Kitphati, R.; Mungaomklang, A.; Kongklieng, A.; Chirathaworn, C.; et al. Molecular epidemiology of the first wave of severe acute respiratory syndrome coronavirus 2 infection in Thailand in 2020. Sci. Rep. 2020, 10, 16602. [Google Scholar] [CrossRef]
  14. Hall, T.A. BioEdit: A User-Friendly Biological Sequence Alignment Editor and Analysis Program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 1999, 41, 95–98. [Google Scholar]
  15. Madeira, F.; Park, Y.M.; Lee, J.; Buso, N.; Gur, T.; Madhusoodanan, N.; Basutkar, P.; Tivey, A.R.N.; Potter, S.C.; Finn, R.D.; et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019, 47, W636–W641. [Google Scholar] [CrossRef] [Green Version]
  16. Tamura, K.; Stecher, G.; Peterson, D.; Filipski, A.; Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 2013, 30, 2725–2729. [Google Scholar] [CrossRef] [Green Version]
  17. Suchard, M.A.; Lemey, P.; Baele, G.; Ayres, D.L.; Drummond, A.J.; Rambaut, A. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 2018, 4, vey016. [Google Scholar] [CrossRef] [Green Version]
  18. Puenpa, J.; Rattanakomol, P.; Saengdao, N.; Chansaenroj, J.; Yorsaeng, R.; Suwannakarn, K.; Thanasitthichai, S.; Vongpunsawad, S.; Poovorawan, Y. Molecular characterisation and tracking of severe acute respiratory syndrome coronavirus 2 in Thailand, 2020–2022. Arch. Virol. 2023, 168, 26. [Google Scholar] [CrossRef]
  19. Aksamentov, I.; Roemer, C.; Hodcroft, E.; Neher, R. Nextclade: Clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 2021, 6, 3773. [Google Scholar] [CrossRef]
  20. Tan, K.K.; Tan, J.Y.; Wong, J.E.; Teoh, B.T.; Tiong, V.; Abd-Jamil, J.; Nor’e, S.S.; Khor, C.S.; Johari, J.; Yaacob, C.N. Emergence of B.1.524(G) SARS-CoV-2 in Malaysia during the third COVID-19 epidemic wave. Sci. Rep. 2021, 11, 22105. [Google Scholar] [CrossRef]
  21. Gu, H.; Xie, R.; Adam, D.C.; Tsui, J.L.; Chu, D.K.; Chang, L.D.J.; Cheuk, S.S.Y.; Gurung, S.; Krishnan, P.; Ng, D.Y.M. Genomic epidemiology of SARS-CoV-2 under an elimination strategy in Hong Kong. Nat. Commun. 2022, 13, 736. [Google Scholar] [CrossRef]
  22. Joshi, M.; Puvar, A.; Kumar, D.; Ansari, A.; Pandya, M.; Raval, J.; Patel, Z.; Trivedi, P.; Gandhi, M.; Pandya, L. Genomic Variations in SARS-CoV-2 Genomes From Gujarat: Underlying Role of Variants in Disease Epidemiology. Front. Genet. 2021, 12, 586569. [Google Scholar] [CrossRef]
  23. Li, X.; Zai, J.; Zhao, Q.; Nie, Q.; Li, Y.; Foley, B.T.; Chaillon, A. Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2. J. Med. Virol. 2020, 92, 602–611. [Google Scholar] [CrossRef]
  24. Li, X.; Wang, W.; Zhao, X.; Zai, J.; Zhao, Q.; Li, Y.; Chaillon, A. Transmission dynamics and evolutionary history of 2019-nCoV. J. Med. Virol. 2020, 92, 501–511. [Google Scholar] [CrossRef]
  25. Duchene, S.; Featherstone, L.; Haritopoulou-Sinanidou, M.; Rambaut, A.; Lemey, P.; Baele, G. Temporal signal and the phylodynamic threshold of SARS-CoV-2. Virus Evol. 2020, 6, veaa061. [Google Scholar] [CrossRef]
  26. Nie, Q.; Li, X.; Chen, W.; Liu, D.; Chen, Y.; Li, H.; Li, D.; Tian, M.; Tan, W.; Zai, J. Phylogenetic and phylodynamic analyses of SARS-CoV-2. Virus Res. 2020, 287, 198098. [Google Scholar] [CrossRef]
  27. Cotten, M.; Watson, S.J.; Zumla, A.I.; Makhdoom, H.Q.; Palser, A.L.; Ong, S.H.; Al Rabeeah, A.A.; Alhakeem, R.F.; Assiri, A.; Al-Tawfiq, J.A.; et al. Spread, circulation, and evolution of the Middle East respiratory syndrome coronavirus. mBio 2014, 5, e01062-13. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  28. Nobusawa, E.; Sato, K. Comparison of the mutation rates of human influenza A and B viruses. J. Virol. 2006, 80, 3675–3678. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  29. Shen, S.; Zhang, Z.; He, F. The phylogenetic relationship within SARS-CoV-2s: An expanding basal Glade. Mol. Phylogenet. Evol. 2021, 157, 107017. [Google Scholar] [CrossRef] [PubMed]
  30. Xu, X.; Wu, Y.; Kummer, A.G.; Zhao, Y.; Hu, Z.; Wang, Y.; Liu, H.; Ajelli, M.; Yu, H. Assessing changes in incubation period, serial interval, and generation time of SARS-CoV-2 variants of concern: A systematic review and meta-analysis. medRxiv 2023. [Google Scholar] [CrossRef]
  31. Zhang, L.; Jackson, C.B.; Mou, H.; Ojha, A.; Peng, H.; Quinlan, B.D.; Rangarajan, E.S.; Pan, A.; Vanderheiden, A.; Suthar, M.S.; et al. SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity. Nat. Commun. 2020, 11, 6013. [Google Scholar] [CrossRef]
  32. Laha, S.; Chakraborty, J.; Das, S.; Manna, S.K.; Biswas, S.; Chatterjee, R. Characterizations of SARS-CoV-2 mutational profile, spike protein stability and viral transmission. Infect. Genet. Evol. 2020, 85, 104445. [Google Scholar] [CrossRef]
  33. Zhang, J.; Ejikemeuwa, A.; Gerzanich, V.; Nasr, M.; Tang, Q.; Simard, J.M.; Zhao, R.Y. Understanding the Role of SARS-CoV-2 ORF3a in Viral Pathogenesis and COVID-19. Front. Microbiol. 2022, 13, 854567. [Google Scholar] [CrossRef]
  34. Korber, B.; Fischer, W.M.; Gnanakaran, S.; Yoon, H.; Theiler, J.; Abfalterer, W.; Hengartner, N.; Giorgi, E.E.; Bhattacharya, T.; Foley, B.; et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell 2020, 182, 812–827.e19. [Google Scholar] [CrossRef]
  35. Guan, W.J.; Ni, Z.Y.; Hu, Y.; Liang, W.H.; Ou, C.Q.; He, J.X.; Liu, L.; Shan, H.; Lei, C.L.; Hui, D.S.C.; et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. 2020, 382, 1708–1720. [Google Scholar] [CrossRef]
  36. Chu, D.K.W.; Pan, Y.; Cheng, S.M.S.; Hui, K.P.Y.; Krishnan, P.; Liu, Y.; Ng, D.Y.M.; Wan, C.K.C.; Yang, P.; Wang, Q.; et al. Molecular Diagnosis of a Novel Coronavirus (2019-nCoV) Causing an Outbreak of Pneumonia. Clin. Chem. 2020, 66, 549–555. [Google Scholar] [CrossRef] [Green Version]
  37. Starr, T.N.; Greaney, A.J.; Hilton, S.K.; Ellis, D.; Crawford, K.H.D.; Dingens, A.S.; Navarro, M.J.; Bowen, J.E.; Tortorici, M.A.; Walls, A.C.; et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell 2020, 182, 1295–1310.e20. [Google Scholar] [CrossRef]
  38. Greaney, A.J.; Starr, T.N.; Gilchuk, P.; Zost, S.J.; Binshtein, E.; Loes, A.N.; Hilton, S.K.; Huddleston, J.; Eguia, R.; Crawford, K.H.D.; et al. Complete Mapping of Mutations to the SARS-CoV-2 Spike Receptor-Binding Domain that Escape Antibody Recognition. Cell Host Microbe 2021, 29, 44–57.e9. [Google Scholar] [CrossRef]
  39. Gao, T.; Gao, Y.; Liu, X.; Nie, Z.; Sun, H.; Lin, K.; Peng, H.; Wang, S. Identification and functional analysis of the SARS-CoV-2 nucleocapsid protein. BMC Microbiol. 2021, 21, 58. [Google Scholar] [CrossRef]
  40. Wu, W.; Cheng, Y.; Zhou, H.; Sun, C.; Zhang, S. The SARS-CoV-2 nucleocapsid protein: Its role in the viral life cycle, structure and functions, and use as a potential target in the development of vaccines and diagnostics. Virol. J. 2023, 20, 6. [Google Scholar] [CrossRef]
  41. Diao, B.; Wen, K.; Zhang, J.; Chen, J.; Han, C.; Chen, Y.; Wang, S.; Deng, G.; Zhou, H.; Wu, Y. Accuracy of a nucleocapsid protein antigen rapid test in the diagnosis of SARS-CoV-2 infection. Clin. Microbiol. Infect. 2021, 27, 289.e1–289.e4. [Google Scholar] [CrossRef]
  42. Matchett, W.E.; Joag, V.; Stolley, J.M.; Shepherd, F.K.; Quarnstrom, C.F.; Mickelson, C.K.; Wijeyesinghe, S.; Soerens, A.G.; Becker, S.; Thiede, J.M.; et al. Nucleocapsid vaccine elicits spike-independent SARS-CoV-2 protective immunity. bioRxiv 2021. [Google Scholar] [CrossRef]
  43. Wu, H.; Xing, N.; Meng, K.; Fu, B.; Xue, W.; Dong, P.; Tang, W.; Xiao, Y.; Liu, G.; Luo, H.; et al. Nucleocapsid mutations R203K/G204R increase the infectivity, fitness, and virulence of SARS-CoV-2. Cell Host Microbe 2021, 29, 1788–1801.e6. [Google Scholar] [CrossRef]
  44. Mourier, T.; Shuaib, M.; Hala, S.; Mfarrej, S.; Alofi, F.; Naeem, R.; Alsomali, A.; Jorgensen, D.; Subudhi, A.K.; Rached, F.B.; et al. SARS-CoV-2 genomes from Saudi Arabia implicate nucleocapsid mutations in host response and increased viral load. Nat. Commun. 2022, 13, 601. [Google Scholar] [CrossRef]
Figure 1. Time-scaled phylogenetic tree of 96 complete SARS-CoV-2 genomes (nt positions 56–29,739, 29,684 bp) detected before the B.1.1.529 omicron variant predominated. Shown is a maximum clade credibility tree constructed from 10,000 trees sampled from the posterior distribution with mean node ages. Clades described in GISAID are identified (S, L, V, G, GH, GR, GRY, and GK). Several lineages predominantly represent outbreaks in Thailand, and posterior probability support is given.
Figure 2. Amino acid mutations in the 130 SARS-CoV-2 genomes analyzed in the study (nt positions 56–29,739, 29,684 bp), compared to the Wuhan-Hu-1 (accession NC_045512) reference strain. The percentage frequency of all amino acid positions in the 130 genomes is shown on the y-axis. NSP, nonstructural protein; M, membrane protein; S, spike protein; N, nucleoprotein; ORF, open reading frame encoding the accessory protein.
Figure 3. Time-scaled phylogenetic tree of complete spike sequences (nt positions 21,566–25,387, 3831 bp) of SARS-CoV-2 variants. Shown is a maximum clade credibility tree constructed from 10,000 trees sampled from the posterior distribution with mean node ages. Several lineages predominantly represent outbreaks in Thailand, and posterior probability support is given.
Figure 4. Timeline of the COVID-19 vaccination in Thailand.
Figure 5. Unrooted phylogenetic analyses of SARS-CoV-2 omicron sublineage BA.1 variant based on full genome sequences (nt positions 202–29,745, 29,544 bp). Bootstrap values for key nodes are shown as percentages of 1000 replicates. All SARS-CoV-2 omicron sublineage BA.1 variants identified in this study are represented and labeled. Scale bars represent the number of substitutions per site.
Figure 6. Unrooted phylogenetic analyses of SARS-CoV-2 omicron sublineage BA.2 variants based on full genome sequences (nt positions 218–29,686, 29,469 bp). Bootstrap values for key nodes are shown as percentages of 1000 replicates. All SARS-CoV-2 omicron sublineage BA.2 variants identified in this study are represented and labeled. Scale bars represent the number of substitutions per site.
Figure 7. Unrooted phylogenetic analyses of SARS-CoV-2 omicron sublineages BA.4 and BA.5 variants based on full genome sequences (nt positions 201–29,698, 29,496 bp). Bootstrap values for key nodes are shown as percentages of 1000 replicates. All SARS-CoV-2 omicron BA.4 and BA.5 variants identified in this study are represented and labeled. Scale bars represent the number of substitutions per site.
Table 1. Comparison of missense and synonymous mutation frequency profiles of SARS-CoV-2 omicron sublineage BA.1 in Thailand datasets (nt length 29,544 bp).

| Lineage | Gene | nt Position | aa Position | Genome Count | Frequency (%) |
|---|---|---|---|---|---|
| BA.1 (n = 823) | ORF1ab | T2019C | M585T | 135 | 16.40 |
| | | C2470T | | 89 | 10.81 |
| | | G5515T | | 5 | 0.61 |
| | | G6850T | | 146 | 17.74 |
| | | C15952T | | 3 | 0.36 |
| | S | G23628A | S689N | 99 | 12.03 |
| | | C26936T | | 58 | 7.05 |
| BA.1.1 (n = 1419) | ORF1ab | C2470T | | 1374 | 96.83 |
| | | G3692A | V1143I | 8 | 0.56 |
| | | G3896T | V1211F | 21 | 1.48 |
| | | G6109A | | 75 | 5.29 |
| | | C11750T | L3829F | 11 | 0.78 |
| | | G12661A | | 116 | 8.17 |
| | | C14805T | | 735 | 51.80 |
| | | G18433A | D1656N | 35 | 2.47 |
| | | T19632C | | 744 | 52.43 |
| | ORF3a | G25634A | C81Y | 8 | 0.56 |
| | E | G26428T | V62F | 12 | 0.85 |
| | M | A26530G | D3G | 1060 | 74.70 |
| | N | C28838T | R189C | 29 | 2.04 |
| BA.1.1.5 (n = 79) | ORF1ab | C14117T | T217M | 51 | 64.56 |
| | S | C21597T | S12F | 6 | 7.59 |
| | M | A26530G | D3G | 65 | 82.28 |
| BA.1.1.8 (n = 73) | ORF1ab | G2894A | D877N | 10 | 13.70 |
| | ORF3a | G26167T | V259L | 9 | 12.33 |
| BA.1.15.1 (n = 26) | M | A26530G | D3G | 12 | 46.15 |
| BA.1.16.1 (n = 139) | ORF1ab | G1806A | G514E | 13 | 9.35 |
| | | C6401T | P2046S | 18 | 12.95 |
| | M | A26530G | D3G | 127 | 91.37 |
| | N | G29162A | D297N | 4 | 2.88 |
| | | C29274T | T334I | 4 | 2.88 |
| BA.1.17 (n = 377) | ORF1ab | T851C | Y196H | 15 | 3.98 |
| | | C4113T | A1283V | 176 | 46.68 |
| | | C5672T | P1803S | 156 | 41.38 |
| | | C10605T | P3447L | 12 | 3.18 |
| | | C12084T | T3940I | 13 | 3.45 |
| | | G15850A | D795N | 30 | 7.96 |
| | M | A26530G | D3G | 322 | 85.41 |
| | N | G28436T | A55S | 5 | 1.33 |
| BA.1.20 (n = 15) | ORF1ab | C15830T | A788V | 4 | 26.67 |
| | M | A26530G | D3G | 5 | 33.30 |
| BA.1.22 (n = 89) | ORF1ab | G11083T | L3606F | 7 | 7.87 |
| | | C15928T | P821S | 6 | 6.74 |
| | N | C29466T | A398V | 7 | 7.87 |
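Table 2. Comparison of missense and synonymous mutation frequency profiles of SARS-CoV-2 omicron sublineage BA.2 isolates in the Thailand dataset (nt length 29,469 bp).

| Lineage | Gene | nt Position | aa Position | Genome Count | Frequency (%) |
|---|---|---|---|---|---|
| BA.2 (n = 1460) | 5′ UTR | C241T | | 1274 | 87.26 |
| | ORF1ab | C6196T | | 194 | 13.29 |
| | | C7471T | | 610 | 41.78 |
| | | C854T | P197S | 26 | 1.78 |
| | | C3653T | L1130F | 26 | 1.78 |
| | | C3686T | H1141Y | 64 | 4.38 |
| | | C4893T | T1543I | 32 | 2.19 |
| | | A4916G | I1551V | 10 | 0.68 |
| | | C6401T | P2046S | 45 | 3.08 |
| | | G7798T | K2511N | 12 | 0.82 |
| | | C10789T | | 16 | 1.10 |
| | | C11109T | A3615V | 26 | 1.78 |
| | | G14188A | A241T | 42 | 2.88 |
| | | C15240T | | 194 | 13.29 |
| | | G15451A | G662S | 13 | 0.89 |
| | | C16362T | | 88 | 6.03 |
| | | A19133C | E1889A | 15 | 1.03 |
| | ORF3a | A25411G | I7V | 17 | 1.16 |
| | | C25613T | S74F | 21 | 1.44 |
| | S | C22120A | F186L | 15 | 1.03 |
| | | G22632A | R357K | 17 | 1.16 |
| | | T22882G | N440K | 1270 | 86.99 |
| | | C23280T | T573I | 49 | 3.36 |
| | | C23854A | N764K | 1418 | 97.12 |
| | | T25224C | I1221T | 25 | 1.71 |
| | | C25416T | | 582 | 39.86 |
| | N | G29468T | D399Y | 44 | 3.01 |
| BA.2.27 (n = 262) | 5′ UTR | C241T | | 222 | 84.73 |
| | ORF1ab | C10198T | | 242 | 92.37 |
| | | C12403T | | 58 | 22.14 |
| | | C17745T | | 20 | 7.63 |
| | | C19610T | T2048I | 12 | 4.58 |
| | ORF3a | C25672T | L94F | 15 | 5.73 |
| | N | G28739T | A156S | 10 | 3.82 |
| BA.2.3 (n = 454) | 5′ UTR | C241T | | 402 | 88.55 |
| | ORF1ab | C832T | | 98 | 21.59 |
| | | T7282C | | 144 | 31.72 |
| | | C14267T | T267M | 39 | 8.59 |
| | | C18508T | L1681F | 37 | 8.15 |
| | | A21222G | | 358 | 78.85 |
| BA.2.9 (n = 519) | 5′ UTR | C241T | | 451 | 86.90 |
| | ORF1ab | G1820A | G519S | 78 | 15.03 |
| | | A2442C | E726A | 36 | 6.94 |
| | | T4443C | V1393A | 87 | 16.76 |
| | | C5051T | P1596S | 58 | 11.18 |
| | | C5672T | P1803S | 42 | 8.09 |
| | | C12789T | T4175I | 32 | 6.17 |
| | | A14109G | I214M | 38 | 7.32 |
| | | A15553G | N696D | 13 | 2.50 |
| | | T16494C | | 36 | 6.94 |
| | | C18457T | P1664S | 11 | 2.12 |
| | ORF9b | A28389T | N36Y | 28 | 5.39 |
| | S | T21752A | W64R | 59 | 11.37 |
| | | T22882G | N440K | 448 | 86.32 |
| | | G24348T | S929I | 10 | 1.93 |
| BA.2.10 (n = 925) | 5′ UTR | C241T | | 871 | 94.16 |
| | ORF1ab | C2676T | P804L | 15 | 1.62 |
| | | A4457G | I1398V | 21 | 2.27 |
| | | T7813C | | 262 | 28.32 |
| | | C17528T | T1354I | 71 | 7.68 |
| | ORF3a | C25961T | T190I | 120 | 12.97 |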
Table 3. Comparison of missense and synonymous mutation frequency profiles of SARS-CoV-2 omicron sublineage BA.4 and BA.5 variants in the Thailand dataset (nt length 29,496 bp).

| Lineage | Gene | nt Position | aa Position | Genome Count | Frequency (%) |
|---|---|---|---|---|---|
| BA.4 (n = 183) | 5′ UTR | C241T | | 150 | 81.97 |
| | ORF1ab | G6680A | A2139T | 158 | 86.34 |
| | | T15521A | F685Y | 9 | 4.92 |
| | S | A22786C | R408S | 154 | 83.70 |
| | | T22882G | N440K | 134 | 73.22 |
| | | T24163C | | 160 | 87.43 |
| BA.5.2 (n = 1758) | 5′ UTR | C241T | | 1406 | 79.98 |
| | ORF1ab | C823T | | 140 | 7.96 |
| | | C5497T | | 712 | 40.50 |
| | | C13551T | | 82 | 4.66 |
| | | T16023C | | 716 | 40.73 |
| | | C16616A | T1050N | 1708 | 97.16 |
| | | A18163G | I1566V | 1617 | 91.98 |
| | S | T22882G | N440K | 1412 | 80.32 |
| | | C23854A | N764K | 1596 | 90.78 |
| | E | C26270T | T9I | 1342 | 76.34 |
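As a reading aid for the Frequency column in the tables above, the following minimal sketch assumes that each frequency is the genome count divided by the lineage total (n), expressed as a percentage. The tables do not state this definition explicitly, so it is an assumption, although it agrees with the sampled rows. The function name `frequency_percent` is hypothetical and not part of the study's pipeline.

```python
def frequency_percent(genome_count: int, total_genomes: int) -> float:
    """Percentage of genomes in a lineage that carry a given mutation.

    Assumption: frequency = genome_count / total_genomes * 100,
    rounded to two decimals as in the tables.
    """
    if total_genomes <= 0:
        raise ValueError("total_genomes must be positive")
    return round(100 * genome_count / total_genomes, 2)


# Rows copied from the tables: (lineage, nt change, genome count, lineage n)
rows = [
    ("BA.1", "T2019C", 135, 823),
    ("BA.2", "C241T", 1274, 1460),
    ("BA.5.2", "C16616A", 1708, 1758),
]

for lineage, nt_change, count, n in rows:
    print(f"{lineage} {nt_change}: {frequency_percent(count, n):.2f}%")
```

For these three rows the computed values match the tabulated frequencies (16.40, 87.26, and 97.16).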
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
