Review

“Superwobbling” and tRNA-34 Wobble and tRNA-37 Anticodon Loop Modifications in Evolution and Devolution of the Genetic Code

1 School of Biological Sciences, University of New England, Biddeford, ME 04005, USA
2 Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
* Author to whom correspondence should be addressed.
Life 2022, 12(2), 252; https://doi.org/10.3390/life12020252
Submission received: 19 January 2022 / Revised: 25 January 2022 / Accepted: 1 February 2022 / Published: 8 February 2022
(This article belongs to the Section Origin of Life)

Abstract
The genetic code evolved around the reading of the tRNA anticodon on the primitive ribosome, and tRNA-34 wobble and tRNA-37 modifications coevolved with the code. We posit that EF-Tu, the closing mechanism of the 30S ribosomal subunit, methylation of wobble U34 at the 5-carbon and suppression of wobbling at the tRNA-36 position were partly redundant and overlapping functions that coevolved to establish the code. The genetic code devolved during the evolution of mitochondria to reduce the size of the tRNAome (all of the tRNAs of an organism or organelle). “Superwobbling” or four-way wobbling describes a major mechanism for shrinking the mitochondrial tRNAome. In superwobbling, unmodified wobble tRNA-U34 can recognize all four codon wobble bases (A, G, C and U), allowing a single unmodified tRNA-U34 to read a 4-codon box. During code evolution, U34 modification by methylation at the 5-carbon position appears to have been essential to suppress superwobbling in 2-codon sectors. As expected, at the base of code evolution, tRNA-37 modifications were mostly related to the identity of the adjacent tRNA-36 base. Modifications of tRNA-37 help maintain the translation frame during elongation.

1. Introduction

This review was written to support an interpretation of a confluence of recent and older data. We attempt to bring some simplicity, order and concept to what may seem, at first, like overwhelming complexity and confusion. The genetic code evolved in columns around the structure of the tRNA anticodon. Genetic code columns represent the middle position of the anticodon (tRNA-35), which is and was the easiest anticodon position to read. Initially, tRNA-34 and tRNA-36 were wobble positions, but wobbling was suppressed at tRNA-36, in part, by tRNA-37 modifications. Appreciation of tRNA anticodon loop structure and reading helps to explain genetic code structure and the evolution of tRNA modifications that affect reading of the anticodon.
Notably, “superwobbling” or four-way wobbling in evolution of the mitochondria has been described and supported by detailed tRNA modification data [1,2,3,4,5]. Phylogenetics indicates pathways of evolution of Archaea, ancient Bacteria, derived Bacteria and Eukarya [5,6]. Evolution of the mitochondria from a bacterial endosymbiont is fundamental to understand evolution of Eukarya [7,8,9,10,11]. Superwobbling indicates the importance of ancient wobble U34 methylation-based modifications at the 5-carbon position. In the mitochondrion, unmodified wobble U34 can potentially read wobble codons ending in A, G, C and U to translate an entire 4-codon sector of the code using a single tRNA species [1,2,5]. At the base of genetic code evolution, however, it appears that tRNA-U34 may often or always have been modified, in part, to suppress superwobbling and to allow evolution of 2-codon sectors [3,4,12,13]. Recent tRNA modification data support this idea. To our knowledge, the relationship of superwobbling to initial genetic code evolution has, for the most part, not been discussed (but see [14]). We posit that 5-carbon U34 methylation-based wobble modifications were essential for the initial evolution of the genetic code.
Similarly, tRNA wobble adenosine deamination to inosine (tRNA-A34→I34) modifications appear fundamental to the later evolution and enrichment of the code [15,16,17,18,19]. I34, generally, can read wobble codons A, C and U, and the I34 modification is associated with the suppression of synonymous G34 anticodons. G34 is favoured in Archaea and, for the most part, in Bacteria [15]. Put another way, when the I34 wobble modification occurs, the corresponding G34 tRNA anticodon is rarely if ever present. In addition, the introduction of tRNAs with unnatural G34 anticodons in 4-codon boxes can be toxic in Eukaryotes [15,20]. In Bacteria, A34→I34 modification is mostly found for the Arg anticodon (ACG→ICG). By contrast, in Eukarya, the A34→I34 wobble modification is found for Leu (AAG→IAG), Ile (AAU→IAU), Val (AAC→IAC), Ser (AGA→IGA), Pro (AGG→IGG), Thr (AGU→IGU), Ala (AGC→IGC) and, as in Bacteria, Arg (ACG→ICG). Interestingly, in Eukarya, Gly occupies a 4-codon box but does not utilize the A34→I34 modification. We offer two possible explanations below. Because of wobble ambiguity, the A34→I34 modification can only occur in 3- or 4-codon sectors of the genetic code. Some Bacteria encode A34 in 4-codon sectors other than Arg, but, in most of these cases, A34 does not appear to be converted to inosine [16,17]. Because of superwobbling in 4-codon sectors, the A34→I34 modification is not utilized in mitochondria [5]. In response to oxidative and starvation stress, Eukaryotes utilize endonuclease V to cleave I34 tRNAs to stall translation [21].
Bacteria utilize G34→Q34 modifications (Q for queuosine) [5,22,23,24,25,26]. These modifications are found in Eukaryotes, mitochondria and Bacteria but not in Archaea. In Archaea, the queuosine-related modification archaeosine, which involves a homologous enzyme, is found at the G15-position of tRNAs. In humans, queuine is a necessary coenzyme supplied by diet and generated by symbiotic enteric bacteria. Q34 modifications cause more balanced reading of NAU and NAC codons, so the lack of queuosine modifications slows translation [22,23]. Queuosine modifications are only found in column 3 of the genetic code (GUN→QUN anticodons).
Modifications of the anticodon loop tRNA-37 position, just 3′ to the anticodon, also appear to be of importance [15,19,26,27]. Modifications at tRNA-37 tend to be bulky next to an anticodon U36 or A36 and may help to stabilize intrinsically weaker anticodon-codon interactions. Modifications of tRNA-37 limit frameshifting during translation [27,28,29,30]. Positions tRNA-35 and -36 are rarely modified and are generally read by Watson-Crick pairing to their mRNA codon. We posit that modifications of tRNA-37 help to delimit the anticodon, stabilize base pairing at position 36, stabilize the anticodon-codon interaction, suppress frameshifting [30] and perform other roles, for instance, recognition by aaRS enzymes to charge the cognate tRNA [31]. We find that, as expected, at the base of genetic code evolution, tRNA-37 modifications primarily depend on the adjacent tRNA-36 base, which corresponds to genetic code rows 1–4.
A new tRNA database helps to follow the current trends in the literature [32]. Older databases are also useful [33,34,35,36]. Updated modification data for tRNAs were essential to understand how tRNA modifications affect translation. Some tRNA modifications (i.e., cm5U34-based, t6A37 and m1G37) appear to be as old as the genetic code and, probably, were coevolved with the code and necessary for its initial establishment. Analysis of tRNA modifications at the tRNA-34 and -37 positions strongly supports the hypothesis that the genetic code evolved around the reading of the tRNA anticodon [37,38,39].
The archaeal genetic code is simplest and closest to the code that was present at LUCA (the last universal common (cellular) ancestor). We consider LUCA to be the first membrane-enclosed cells with intact DNA genomes. Pyrococcus furiosus is a reasonable reference organism for an ancient Archaeon and an approximation of LUCA [40,41]. The code is simpler in older bacterial species such as Thermus thermophilus, compared to more derived Bacteria, such as Escherichia coli and α-Proteobacteria. It appears that the mitochondria were derived from an α-proteobacterium (Rickettsiales) [5,6,7,10,42,43]. The eukaryotic cytosolic code was derived from Archaea with contributions from an α-proteobacterial endosymbiont. Thus, the genetic code can be mostly traced, along with relevant tRNA modification data, through the evolution of life on Earth [19]. Currently, tRNA modification data are missing for ancient Bacteria, such as Thermus thermophilus. At the time of writing, sequences of only ~5 modified Thermus thermophilus tRNAs have been reported out of a total of about 47 tRNAs. At the time of writing, no Thermus thermophilus tRNA with a modified or unmodified U34 has yet been reported [32]. Combining these missing data with the analysis in this paper would be a useful contribution.
Aminoacyl-tRNA synthetases (aaRS) attach cognate amino acids to the 3′-ends of tRNAs [31,37,44]. Evolution of aaRS enzymes has been described in detail. The aaRS enzymes belong to two incompatible folding classes, I and II, with structural subclasses A–E. The class II aaRS GlyRS-IIA was refolded into a class I aaRS (probably a primitive ValRS-IA). In addition to their incompatible fold, class I aaRS have an in-phase N-terminal extension relative to class II aaRS. The class II aaRS mounts the enzyme active site on a surface of antiparallel β-sheets. By contrast, the class I aaRS mounts the enzyme active site at the C-terminal ends of a set of parallel β-sheets. GlyRS-IIA (glycine aminoacyl-tRNA synthetase; class II; structural subclass A) is the root of all aaRS enzymes. In ancient Archaea, GlyRS-IIA is a sequence homolog of ValRS-IA and IleRS-IA. Tracing the evolution of aaRS enzymes describes the evolution of the genetic code. The genetic code evolved from Archaea to ancient Bacteria to more derived Bacteria. Eukarya are a fusion of multiple Archaea and multiple Bacteria, probably involving a number of endosymbionts and/or other large horizontal gene transfers [6,10,45]. We find that a simple narrative for the evolution of life on Earth is obtained by comparing genetic codes, tRNA-34 and tRNA-37 modifications, aaRS and tRNAome data from a small number of reference organisms.

2. Evolution of the Genetic Code around the tRNA Anticodon

In Figure 1, the Saccharomyces cerevisiae tRNAPhe anticodon loop is shown (PDB 1EHZ) [46]. In Figure 1A, the linear modified sequence is shown. In Figure 1B, the folded structure is indicated. Figure 1C–E are three orientations of the anticodon loop structure including part of the anticodon stem. The genetic code evolved around the structure of the tRNA anticodon. The anticodon triplet is tRNA positions 34, 35 and 36. Position tRNA-34 is the wobble position at which diverse wobble contacts to mRNA codons are allowed, adjusted and tuned in evolution. Position tRNA-35 is the central position, which represents genetic code columns and is the easiest position for the translation system to read. Position tRNA-36 represents genetic code rows 1–4. Generally, the tRNA-35 and -36 positions are read during translation as Watson-Crick base pairs versus the mRNA codon. As in Saccharomyces cerevisiae tRNAPhe, tRNA-35 and -36 are generally unmodified.
A detailed and rational model for pre-LUCA evolution of the genetic code has been published [37,38,39]. The genetic code is highly structured and more simply structured in Archaea than in other organisms. Most evolution is in code columns, which represent the tRNA-35 base. For instance, in column 1 (tRNA-35A), related hydrophobic amino acids Val, Met, Ile and Leu are found, and these chemically similar amino acids are added to their cognate tRNAs by ValRS-IA, MetRS-IA, IleRS-IA and LeuRS-IA, which are closely related aaRS class IA enzymes. Similarly, in column 2 (tRNA-35G), amino acids Thr, Pro and Ser are found. Thr and Ser are closely related amino acids, and ThrRS-IIA, ProRS-IIA and SerRS-IIA are closely related aaRS class IIA enzymes. The code is proposed to have evolved through stages. Initially, both tRNA anticodon positions 34 and 36 were wobble positions, at which only 2-assignments (purine versus pyrimidine) were possible. Wobbling was suppressed at position 36 by evolution of EF-Tu, the 16S rRNA “latch” (i.e., G530~A1492 and A1493; Thermus thermophilus numbering) [47,48] and modifications of anticodon loop position 37. Suppression of wobbling at position 36 allowed the code to expand from 8-amino acids (complexity 2 × 4) to a maximum complexity of 32-assignments (complexity 2 × 4 × 4). Because of fidelity mechanisms, the standard genetic code froze at 20-amino acids plus stops.
The primordial sequence of the 7-nt anticodon loop was close to 32-CU/BNNAA-38 (/ indicates a U-turn; B = G, C or U (not A); N = A, G, C or U). In Figure 1, four bases (30, 31, 39 and 40) that are normally part of the anticodon stem are also shown. The G30 = m5C40 base pair is evident. The expected A31 = Ψ39 base pair was disrupted by the pseudouridine rearrangement, perhaps to adjust the conformation and dynamics of the loop. Typically, the loop includes a U-turn after U33. A U-turn is a U-shaped turn in the RNA backbone [49]. The U-turn loop conformation is important to present the 3-nt anticodon (tRNA-34, -35 and -36). The Cm32~A38 H-bond can be characterized as a weak reverse Hoogsteen pair Cm32 (O2)→A38 (N6). This interaction is thought to regulate the U-turn geometry and dynamics of the anticodon loop [19,26]. The yW37 (wybutosine) modification of G is a bulky modification that is thought to stabilize interactions of the A36 anticodon base with its cognate codon and also to suppress frameshifting during translation.

3. Evolution of Life on Earth

A simple narrative for evolution of life on Earth is proposed in which LUCA evolved to Archaea [41,50]. As a reference organism that is close to LUCA, we propose Pyrococcus furiosus, which has a tRNAome that is very similar in sequence to tRNAPri (a primordial tRNA) [40]. We propose that Archaea evolved to ancient Bacteria, such as Thermus thermophilus. We selected Thermus thermophilus because it has a simple but intact tRNAome. Unfortunately, the reported tRNA modification data for Thermus thermophilus are not complete at the time of writing. As a model organism for more derived Bacteria, we relied mostly on Escherichia coli. If data were available, we would incorporate the closest bacterial relative of the eukaryotic mitochondria. Escherichia coli, however, appears to be a reasonable model, albeit with several differences from the endosymbiont that became the mitochondria. We support the hypothesis that eukaryotic mitochondria were derived from an α-proteobacterial endosymbiont within an Asgard archaeon [11,51]. Eukaryotes, however, arose as a complex set of genetic fusions of multiple Archaea and multiple Bacteria. For the purposes of this paper, we trace tRNA U34, A34→I34 and G34→Q34 modifications through evolution. We discuss maintenance of the Ile-Met sector. Maintenance of 1-codon sectors (i.e., for Met and Trp) in evolution was difficult and was abandoned during evolution of mitochondria [5]. We consider modifications of anticodon position 37 [19,52]. We combine these data with evolution of aaRS enzymes and analyses of tRNAomes. To our knowledge, these issues have largely not been raised or have not been integrated in this manner in published papers. We consider our presentation to be highly informative to describe the major advances in evolution of the genetic code through the natural biological history of Earth.

4. Ancient Archaea

In this paper, we present or approximate the genetic codes of several reference organisms including some related data. Figure 2 shows an approximation of the Pyrococcus furiosus genetic code. Because of missing tRNA modification data, some information has been taken from or inferred from other Archaea. At the time of writing, significant tRNA modification data are available for Pyrococcus furiosus, Methanocaldococcus jannaschii, Methanococcus maripaludis, Sulfolobus acidocaldarius and Haloferax volcanii [3,4,12]. The genetic code is presented as a 64-assignment code. Codon sequence surrounds the table. Anticodon data are enriched with tRNA modification data mostly for the wobble base (tRNA-34). The amino acid and structural class (class I or II; structural subclasses A–E) of the aminoacyl tRNA synthetase (aaRS) enzymes are included. Anticodons that are not utilized in an organism or domain may be shown in red with strikethrough. To follow the narrative of this paper, all of these data are necessary to consider in order to compare genetic codes relevant to the generation of Eukarya and mitochondria.
First of all, A34, in which A is unmodified, is rarely or never allowed in Archaea [15]. Rather, in Archaea, G34 appears to always be utilized. As a wobble base, G34 has the advantage of pairing with codon wobble U, as a G~U wobble pair, or else with codon wobble C, as a Watson-Crick G = C pair. At the base of code evolution, U34 appears to be seldom or never left unmodified; specifically, it carries a methylation-based modification at the 5-carbon of U34 (cm5U34-based modifications). For the precise chemistry of tRNA modifications, please refer to the Modomics Database [26,33,34,35,36]. We propose that cm5U-based modifications (i.e., cnm5U in Pyrococcus furiosus) suppress superwobbling, which is observed for 4-codon sectors in mitochondrial tRNAs [1,2,5]. A cnm5U34 tRNA, therefore, is likely confined to read codon wobble A and G. Superwobbling, by contrast, would allow unmodified U34 to read A, G, C and U, which would prevent evolution of 2-codon sectors. The evolution of 2-codon genetic code sectors (i.e., for columns 1, 3 and 4), therefore, required cm5U-based modifications.
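The wobble rules summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not a model from the cited literature: the label "U*" is a placeholder introduced here for any cm5U-based (5-carbon modified) U34, the function name codons_read is hypothetical, and the His/Gln box of column 3 is used simply as a familiar example of a 4-codon box split into two 2-codon sectors.

```python
# Watson-Crick complements used for anticodon positions 35 and 36.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon wobble bases read by each tRNA-34 state, as summarized in the text.
# "U*" is a placeholder label (an assumption of this sketch) for a
# cm5U-based, 5-carbon modified U34.
WOBBLE_READS = {
    "G": {"U", "C"},             # G~U wobble pair or G=C Watson-Crick pair
    "C": {"G"},                  # C34 reads only codon wobble G
    "U*": {"A", "G"},            # modified U34: superwobbling suppressed
    "U": {"A", "G", "C", "U"},   # unmodified U34: superwobbling
}


def codons_read(n34, n35, n36):
    """Return the set of codons (5'->3') read by anticodon 34-35-36."""
    first = COMPLEMENT[n36]    # codon position 1 pairs with tRNA-36
    second = COMPLEMENT[n35]   # codon position 2 pairs with tRNA-35
    return {first + second + third for third in WOBBLE_READS[n34]}


his = codons_read("G", "U", "G")              # His anticodon GUG
gln_modified = codons_read("U*", "U", "G")    # Gln anticodon, modified U34
gln_unmodified = codons_read("U", "U", "G")   # hypothetical unmodified U34

print("His codons:", sorted(his))
print("Gln codons (modified U34):", sorted(gln_modified))
print("His codons also read by unmodified U34:", sorted(his & gln_unmodified))
```

Under these assumed rules, the modified U34 anticodon stays within the two Gln codons, whereas a hypothetical unmodified U34 in the same anticodon would also read both His codons, which is the collision that the 5-carbon modification is proposed to prevent.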
Furthermore, 1-codon sectors were difficult to evolve and maintain. Consider the Ile/Met 4-codon sector, in which Met occupies a 1-codon (AUG) sector. We posit that the 4-codon Ile/Met sector was originally a 4-codon Ile sector that Met invaded, eliminating the Ile UAU anticodon [37,38,39]. In Archaea and Bacteria, Ile utilizes a CAU anticodon. In some Archaea, C34 is modified to 2-agmatidine (agm2C) to read codon AUA (Ile) but not codon AUG (Met) [4,53,54,55]. Note that a cnm5UAU anticodon would read both AUA (Ile) and AUG (Met), causing miscoding. Met utilizes two tRNAs, tRNAMet (i.e., CmAU) for elongation and tRNAiMet (i.e., unmodified CAU) for initiation. A very similar strategy is utilized to maintain the 1-codon Met box in most or all prokaryotes [26,53,56,57,58,59]. The Trp 1-codon sector (UGG) is read by the Trp anticodon CCA that is specific for codon UGG. The UCA anticodon is not utilized, because Trp shares a 2-codon box with a stop codon (UGA) that is recognized by a protein release factor that binds to the mRNA UGA stop codon to terminate translation on the ribosome [60]. Anticodon cnm5UCA would read codons UGA and UGG, causing miscoding and suppressing translation stops. This explains why Trp utilizes anticodon CCA, rather than cm5UCA, to read codon UGG.
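The miscoding argument for the Ile/Met and Trp/stop boxes can be checked with a small sketch. The labels "U*" (5-carbon modified U34) and "C*" (an agm2C-type modified C34) are placeholders introduced for this example, the helper names are hypothetical, and the reading sets simply restate the rules given in the text; the codon meanings are the standard assignments for these two boxes.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Wobble reading sets restated from the text. "U*" and "C*" are placeholder
# labels (assumptions of this sketch), not standard nomenclature.
WOBBLE_READS = {
    "G": "UC",
    "C": "G",
    "C*": "A",    # 2-agmatidine-type C34: reads AUA but not AUG
    "U*": "AG",   # 5-carbon modified U34
}

# Standard assignments for the two boxes discussed in the text.
CODON_MEANING = {
    "AUU": "Ile", "AUC": "Ile", "AUA": "Ile", "AUG": "Met",
    "UGU": "Cys", "UGC": "Cys", "UGA": "Stop", "UGG": "Trp",
}


def meanings_read(n34, n35, n36):
    """Return the set of meanings decoded by anticodon 34-35-36."""
    prefix = COMPLEMENT[n36] + COMPLEMENT[n35]
    return {CODON_MEANING[prefix + third] for third in WOBBLE_READS[n34]}


def miscodes(n34, n35, n36):
    """True if one anticodon would read codons with different meanings."""
    return len(meanings_read(n34, n35, n36)) > 1


for name, anticodon in [
    ("modified UAU", ("U*", "A", "U")),
    ("agm2C-type CAU", ("C*", "A", "U")),
    ("CAU", ("C", "A", "U")),
    ("modified UCA", ("U*", "C", "A")),
    ("CCA", ("C", "C", "A")),
]:
    print(name, sorted(meanings_read(*anticodon)), miscodes(*anticodon))
```

In this sketch only the two modified-U34 anticodons span two meanings (Ile plus Met, and Trp plus stop), consistent with the explanation given above for why UAU and UCA anticodons are avoided.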
GlnRS-IB was a eukaryotic innovation that was transferred from Eukarya to Archaea and Bacteria by horizontal gene transfer [51,61]. Some archaeal and bacterial species, therefore, lack GlnRS-IB and instead use GluRS-IB to convert tRNAGln to Glu-tRNAGln. In these organisms, an amidotransferase converts Glu-tRNAGln to Gln-tRNAGln for translation [62,63]. So, GlnRS-IB in Archaea and Bacteria was a later acquisition in evolution (i.e., perhaps ~1.5 to 2.5 billion years ago). In Archaea, GluRS-IB, LysRS-IE and GlnRS-IB (from Eukarya) are closely related aaRS enzymes [37,38,39]. In some cases, the historic structural subclassifications for aaRS are deceptive. LysRS-IE is more closely related to GluRS-IB and GlnRS-IB than any of these three aaRS enzymes are to CysRS-IB. Similarly, AspRS-IIB, AsnRS-IIB and HisRS-IIA are reasonably closely related aaRS enzymes. We posit that a pre-LUCA AspRS-IIA evolved to AspRS-IIB to suppress tRNA charging errors, before evolution of AsnRS-IIB from AspRS-IIB. These homologies create a striped pattern of aaRS relatedness in column 3, indicative of the mode by which column 3 sectored [37,38,39]. The striped pattern in Archaea is somewhat disrupted by evolution of LysRS-IIB in Bacteria to replace archaeal LysRS-IE.

5. Ancient Bacteria

As a model organism for an ancient Bacterium, we selected Thermus thermophilus (Figure 3). Unfortunately, to date, much of the tRNA modification data for Thermus thermophilus is missing, so the analysis we present may be refined in the future. Although data are currently missing, we posit a 5-carbon cm5U34-based modification to suppress superwobbling and to support the existence of 2-codon genetic code sectors. In column 4, the Arg 4-codon sector may be an intermediate in evolution of the A34→I34 modification. Thermus thermophilus tRNAArg encodes anticodon ACG and lacks a tRNA with a GCG anticodon. Thermus thermophilus, however, appears to lack the enzyme expected to convert A34→I34 (tRNA adenosine deaminase). Currently, we do not know whether an unknown modification of A34 is present in Thermus thermophilus. If present, an unmodified tRNAArg (UCG) would read the entire 4-codon box. Modified anticodon cm5UCG would be expected to read CGA and CGG Arg codons. Anticodon CCG reads the CGG Arg codon. Precisely how Thermus thermophilus reads the Arg 4-codon box, therefore, does not appear to be currently reported. It is possible that Thermus thermophilus represents an intermediate stage in evolution of the Arg (ACG→ICG) anticodon present in most Bacteria [15].
In column 1, the Ile/Met sector is maintained in much the same manner as in Archaea, although, using a slightly different modification. In Thermus thermophilus, tRNA lysidine (34) synthetase (TilS) is present, so it appears Thermus thermophilus utilizes the 2-lysidine Ile (k2CAU) modification [26,53,56,57,58]. The 2-lysidine modification is chemically similar to the 2-agmatidine modification in Archaea. 2-lysidine is utilized to read Ile codon (AUA) but not Met codon (AUG). The UAU anticodon is not utilized, because cm5U34 would read both codons AUA (Ile) and AUG (Met). The elongator tRNAMet (CAU) has a lightly modified C34 (i.e., CmAU). As in Archaea, the initiator tRNAiMet (CAU) is unmodified.
In column 3, Thermus thermophilus utilizes a type II tRNATyr, with a longer V-loop (14-nt; the primordial length of the type II V-loop) [64]. Thermus thermophilus TyrRS-IC interacts with the V-loop tip as a determinant in Tyr placement to form Tyr-tRNATyr. Although the corresponding tRNAs have not been analyzed for modifications, Thermus thermophilus encodes enzymes for queuosine modification of column 3 tRNAs. Bacterial LysRS-IIB replaces archaeal LysRS-IE. LysRS-IIB is derived in evolution from AspRS-IIB, probably by duplication and repurposing of the gene copy [37]. So, even when an aaRS enzyme is replaced by a very different aaRS in evolution (i.e., LysRS-IE (Archaea)→LysRS-IIB (Bacteria)), evolution of the replacement aaRS may arise within the same column (column 3). Replacement of archaeal LysRS-IE with bacterial LysRS-IIB breaks the striped pattern observed for the simpler archaeal genetic code (compare Figure 2 and Figure 3, column 3). We posit that Archaea, which have a simpler genetic code, are older organisms than Bacteria (compare Figure 2 and Figure 3) [41,65]. Thermus thermophilus has a GlyRS-IIA and a ProRS-IIA that lacks an editing active site, similar to GlyRS-IIA and ProRS-IIA in Archaea. Later in bacterial evolution, GlyRS-IID and ProRS-IIA (i.e., sometimes with an added editing active site) evolved. More derived Bacteria utilize CmoA and CmoB enzymes to generate the cmo5U modification found in 4-codon sectors in columns 1 and 2 of the Escherichia coli genetic code (i.e., Val, Ser, Pro, Thr and Ala) (Figure 4). Thermus thermophilus lacks a detectable CmoA or CmoB homolog. Some Rickettsiales utilize CmoA and CmoB, but many do not. In mitochondria, unmodified U34 (superwobbling) is utilized to read 4-codon sectors. Also, CmoA and CmoB were probably missing in the bacterial endosymbiont that became the mitochondria.

6. Derived Bacteria

Because of available tRNA modification data, our model organism for a more derived Bacterium is generally Escherichia coli (Figure 4) [32]. In this regard, we would prefer to also show full information for the nearest relative of the α-proteobacterial species (i.e., Rickettsiales) that became the mitochondria, but we cannot identify these data. Also, because of horizontal gene transfers, a modern Rickettsiales might not be an apt comparison to the mitochondria. We posit that the 5-carbon of U34 is often modified in Bacteria to suppress superwobbling and to maintain 2-codon sectors. Modifications of tRNA-34 tend to evolve in columns, as might be expected for enzymes that bind the tRNA anticodon to add a modification. Columns represent the central position tRNA-35 of the anticodon.
Interestingly, in columns 1 and 2, the cmo5U34 modification is found in tRNAs encoding Val, Ser, Pro, Thr and Ala [26,66,67]. The cmo5U34 modification, therefore, is found in 4-codon sectors and was expected to read codons ending in wobble A, G and U but not C. However, a single tRNAPro (cmo5UGG) supports viability of Salmonella, indicating that cmo5U34 anticodons can potentially read the entire Pro 4-codon box. In Bacillus subtilis, tRNALeu (UAG), in which U34 appears to be unmodified, may utilize superwobbling [32].
In column 4, tRNAArg (ACG→ICG), encoded A34 is modified to inosine (I34) by deamination [15,16,17]. Interestingly, tRNAArg (GCG), which is favoured in Archaea, is not utilized. When A34 is converted to I34, the corresponding G34 anticodon is not utilized. Anticodon I34 reads codon wobble bases U, C and A but not G. To read the 4-codon Arg box, tRNAArg (ICG), (mnm5UCG) and (CCG) are utilized. TRNAArg (mnm5UCG) probably reads codons CGA and CGG. Also, in column 4, GlyRS-IIA may be replaced with GlyRS-IID in some derived Bacteria (i.e., Escherichia coli). In α-Proteobacteria, GlyRS-IIA is utilized, as in Thermus thermophilus and Archaea. Not surprisingly, GlyRS-IID is utilized in plant chloroplasts (i.e., from Cyanobacteria), although GlyRS-IIA, not GlyRS-IID, is utilized in the plant mitochondria [51].
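As a rough check on how the three tRNAArg species named above could jointly cover the Arg 4-codon box, the following sketch lists which anticodons read each CGN codon. It assumes the reading sets stated in the text (I34 reads U, C and A; mnm5U34 reads A and G, which the text describes only as probable; C34 reads G). The label "U*" for the modified U34 and the helper name codons_read are illustrative choices, not established nomenclature.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Reading sets restated from the text; "U*" stands in for mnm5U34 and its
# A/G reading is described in the text only as probable.
WOBBLE_READS = {"I": "UCA", "U*": "AG", "C": "G", "G": "UC"}


def codons_read(n34, n35, n36):
    """Return the set of codons (5'->3') read by anticodon 34-35-36."""
    prefix = COMPLEMENT[n36] + COMPLEMENT[n35]
    return {prefix + third for third in WOBBLE_READS[n34]}


ARG_BOX = ["CGU", "CGC", "CGA", "CGG"]

# The three tRNAArg anticodons named in the text for the CGN box.
TRNA_ARG = {
    "ICG": ("I", "C", "G"),
    "mnm5UCG": ("U*", "C", "G"),
    "CCG": ("C", "C", "G"),
}

# For each codon, list the tRNAs whose anticodon can read it.
coverage = {
    codon: sorted(name for name, ac in TRNA_ARG.items() if codon in codons_read(*ac))
    for codon in ARG_BOX
}

for codon, readers in coverage.items():
    print(codon, readers)
```

Under these assumptions every codon in the box has at least one reader, I34 alone leaves CGG unread, and CGA and CGG are each read by two tRNAs, which fits the idea that reading of a 4-codon box is shared among isoacceptors.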
In column 1, the Ile/Met 4-codon sector is essentially as described for Archaea and ancient Bacteria. Ile anticodon GAU reads codons AUU and AUC. Ile anticodon k2CAU (k2C for 2-lysidine modification of C) reads codon AUA (Ile) but not AUG (Met) [26,53,56]. Anticodon UAU is not utilized because even a cm5UAU would read both AUA (Ile) and AUG (Met) causing miscoding. Met utilizes tRNAMet (m5CAU) (elongator Met) and tRNAiMet (unmodified CAU) (initiator Met). Maintaining 1-codon sectors presents problems. For instance, in mitochondria, Ile and Met occupy 2-codon sectors to minimize the size of the tRNAome and its supporting proteome [5].
In column 3, queuosine modification for G34 (G34→Q34) is utilized [24,25,26]. Interestingly, the G34→Q34 column 3 modification is passed forward into the eukaryotic cytosol and also into mitochondria. All G34 anticodons in column 3 are modified G34→Q34. There can be further modification of queuosine to glutamyl-queuosine (tRNAAsp (gluQGUC)). As in Thermus thermophilus, tRNATyr is a type II tRNA with a longer V-loop. As expected, this bacterial feature of tRNATyr goes forward to the mitochondria but not the eukaryotic cytosol. LysRS-IIB is utilized in most Bacteria in place of archaeal LysRS-IE. E. coli appears to lack tRNALys (CUU). Apparently, tRNALys (mnm5s2UUU) reads both Lys codons AAA and AAG, as expected.

7. Mitochondria

Mitochondria evolved from an α-proteobacterial endosymbiont, perhaps a Rickettsiales. The genetic code for human mitochondria is shown in Figure 5 [5]. Because of human health issues, better tRNA modification data are available for human mitochondrial tRNAs than for most Eukarya. Furthermore, human mitochondria utilize only 22 tRNAs, so humans, vertebrates and animals have a significantly reduced mitochondrial tRNAome. We believe the data shown in Figure 5 are essentially complete and accurate.
The main strategy for shrinking the mitochondrial tRNAome is “superwobbling” or 4-way wobbling, in which a single unmodified U34 tRNA reads an entire 4-codon box [1,2,5]. This strategy is used for all 4-codon boxes, including 4-codon boxes encoding Leu, Val, Ser, Pro, Thr, Ala, Arg and Gly (beige shading in Figure 5). In column 3, G34→Q34 modifications are utilized (light green shading in Figure 5). 2-codon boxes with U34 utilize a modified U34, as expected, to restrict superwobbling, which would cause miscoding. Evolution of specific modifications generally aligns in columns, as expected. Human mitochondria include no 1-codon sectors (i.e., to encode Met and Trp) [5]. Instead, atypically, 2-codon sectors are utilized for Ile, Met and Trp. Because a stop codon (UGA) was lost in forming a Trp 2-codon sector, the loss was compensated by converting AGG and AGA, which in Bacteria are Arg codons, into mitochondrial stop codons. Human mitochondria do not import GlnRS-IB. Instead, GluRS-IB is utilized to synthesize Glu-tRNAGln, which is converted to Gln-tRNAGln by an amidotransferase. The bacterial mitochondrial ancestor did not encode GlnRS-IB, which was a eukaryotic innovation transferred to Archaea and Bacteria by horizontal gene transfers [51]. Archaeal Pyrococcus furiosus also lacks GlnRS-IB and uses a similar tRNAGln charging strategy. Mitochondria utilize LysRS-IIB, which was derived initially from a bacterial source. Not all mitochondrial and chloroplast tRNAomes, tRNA modifications and collections of aaRS enzymes are the same, so human mitochondria are an example without complete generality.
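The arithmetic behind the 22-tRNA mitochondrial tRNAome can be laid out as a short tally. This is a bookkeeping sketch only: the eight 4-codon boxes and the Ile, Met and Trp 2-codon sectors are taken from the text, while the remaining 2-codon boxes and the UAA and UAG stops are filled in from the standard vertebrate mitochondrial code. The "without superwobbling" figure is a hypothetical comparison that assumes two tRNAs per 4-codon box (for example a G34 tRNA plus a modified U34 tRNA).

```python
# 4-codon boxes listed in the text, each read by one superwobbling tRNA.
FOUR_CODON_BOXES = [
    "Leu(CUN)", "Val", "Ser(UCN)", "Pro", "Thr", "Ala", "Arg", "Gly",
]

# 2-codon boxes of the vertebrate mitochondrial code; Ile, Met and Trp are
# named in the text, the others are filled in from the standard table.
TWO_CODON_BOXES = [
    "Phe", "Leu(UUR)", "Ile", "Met", "Tyr", "His", "Gln",
    "Asn", "Lys", "Asp", "Glu", "Cys", "Trp", "Ser(AGY)",
]

# UAA and UAG remain stops; AGA and AGG are reassigned as stops in the text.
STOP_CODONS = ["UAA", "UAG", "AGA", "AGG"]

# One tRNA per box of either size.
trna_count = len(FOUR_CODON_BOXES) + len(TWO_CODON_BOXES)

# Sanity check: sense codons plus stops should account for all 64 codons.
codon_count = (
    4 * len(FOUR_CODON_BOXES) + 2 * len(TWO_CODON_BOXES) + len(STOP_CODONS)
)

# Hypothetical comparison with two tRNAs per 4-codon box (no superwobbling).
trna_count_without_superwobbling = (
    2 * len(FOUR_CODON_BOXES) + len(TWO_CODON_BOXES)
)

print(trna_count, codon_count, trna_count_without_superwobbling)
```

With one tRNA per box the tally matches the 22 tRNAs quoted above and accounts for all 64 codons; the hypothetical two-tRNA-per-box alternative would need 30.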

8. The Eukaryotic Cytosol

In the eukaryotic cytosol, the genetic code reflects the fusion of an Asgard archaeon and the α-proteobacterial endosymbiont that became the mitochondria [6,8,9,10,11] (Figure 6). A major feature in evolution of the eukaryotic cytosol is the expansion of the A34→I34 strategy (beige shading in Figure 6). All 4-codon sectors except that encoding glycine utilize the A34→I34 modification and, also, suppression of the corresponding G34 anticodon [15]. We suspect that the Gly 4-codon sector did not adopt the A34→I34 modification strategy because of evolutionary pressures to adjust rates of translation. It appears that the Gly GCC anticodon may have been better balanced with the mnm5UCC and CCC anticodons. Although Escherichia coli does not do this, some Bacteria encode A34 in 4-codon sectors other than Arg (ACG→ICG), but, generally, in these cases, A34 does not appear to be converted to inosine [15,17]. To prevent miscoding, the A34→I34 modification strategy can only occur in 3-codon (Ile) or 4-codon sectors, because I34 recognizes codon wobble bases U, C and A.
In column 1, the Ile/Met 4-codon sector underwent some eukaryotic cytosol-specific changes. The Ile anticodon AAU→IAU modification is utilized, allowing the reading of Ile codons AUU, AUC and AUA. Also, in Eukaryotes, anticodon UAU→ΨAΨ (Ψ for pseudouridine) can be used to read codon AUA (Ile) but not AUG (Met) [32]. In Prokaryotes, generally, UAU is not utilized even with modification (Figure 2, Figure 3 and Figure 4). In column 3, G34 is modified to Q34 or a modified Q34 (i.e., galactosyl- or mannosyl-queuosine) [24,25]. Because queuosine in column 3 is a bacterial innovation, the eukaryotic cytosol takes on significant bacterial characteristics in the genetic fusion(s) that resulted in eukaryogenesis. LysRS-IIB is another bacterial innovation that is utilized in the eukaryotic cytosol. Apparently, LysRS-IE, derived from an Asgard archaeal partner in the fusion, was rejected. GlyRS-IIA could be derived from an Asgard archaeon, an α-proteobacterium or by horizontal gene transfer from another archaeal or bacterial source.
The eukaryotic cytosol does not utilize the cmo5U34 modification found in some Bacteria but not others (columns 1 and 2; compare Figure 4 and Figure 6). Probably, the cmo5U34 modification was absent in the bacterial endosymbiont that became the mitochondria. We posit that optimal balanced reading of 4-codon boxes may be tuned by coevolution of tRNA sequences and anticodon modifications. Therefore, the cmo5U34 modification may be more compatible paired with synonymous G34 anticodons, as observed in Escherichia coli for Val, Ser, Pro, Thr and Ala (Figure 4). By contrast, in Eukarya, the ncm5U34 modification may be more compatible paired with isoacceptor I34 anticodons (Figure 6). This could help explain why Gly utilizes anticodons GCC (rather than ICC, which does not appear to be utilized), ncm5UmCC and CCC anticodons in Eukarya (Figure 6). The ncm5UmCC Gly anticodon probably is restricted to read Gly codons GGG and GGA.

9. Sources of Eukaryotic and Mitochondrial aaRS Enzymes

Table 1 reflects work in progress toward understanding how human cytoplasmic and mitochondrial aaRS enzymes may have evolved through the complex genetic fusion(s) that generated Eukarya [51]. The story is tangled because of (1) (sometimes multiple) horizontal gene transfers; (2) multiple archaeal and bacterial contributions to the eukaryotic genetic make-up; (3) eukaryotic genetic innovations; and (4) coevolution of cytosolic and mitochondrial tRNAs and aaRS enzymes. A recent paper describes molecular events associated with eukaryogenesis [11]. Generally, cytosolic tRNAs are thought to have archaeal origins and mitochondrial tRNAs probably have an α-proteobacterial origin. Interestingly, tracing mitochondrial aaRS to α-proteobacterial origins has been challenging, indicating many diverse bacterial contributions to Eukarya evolution [61,68,69]. In plants, several aaRS enzymes are co-targeted to the mitochondria and the chloroplasts, and chloroplast aaRS, in some cases, appear to have been derived from a cyanobacterial source [69]. Also, there are apparent discrepancies relating to the proteobacterial sourcing of mitochondrial aaRS [61,68,69]. A full and reliable accounting of the sourcing of aaRS enzymes in eukaryotic cytosols (i.e., animals and plants) and in mitochondria and chloroplasts does not yet appear to be available. Also, the nearest apparent bacterial relatives of most mitochondrial and chloroplast aaRS have not been unambiguously reported [51].
Mitochondrial aaRS enzymes are encoded within the eukaryotic cell nucleus. For two aaRS, the gene encoding the cytoplasmic aaRS and the mitochondrial aaRS is the same (GlyRS-IIA (GARS) and LysRS-IIB (KARS)). In most cases, by contrast, separate genes encoding the cytoplasmic and mitochondrial aaRS are utilized (Table 1). Mitochondrial aaRS enzymes are expected to include a mitochondrial targeting sequence. We conclude the following. Seven cytosolic eukaryotic aaRS enzymes appear to be bacterial in origin: AlaRS-IID (AARS), ArgRS-ID (RARS), AspRS-IIB (DARS), IleRS-IA (IARS), LysRS-IIB (KARS), ThrRS-IIA (TARS) and ValRS-IA (VARS). In the cases in which there are separate cytoplasmic and mitochondrial aaRS genes, the cytoplasmic aaRS gene is likely to have an archaeal origin and the mitochondrial gene invariably appears to have a bacterial origin (i.e., AsnRS-IIB (NARS and NARS2), GluRS-IB (EPRS and EARS2), HisRS-IIA (HARS and HARS2), LeuRS-IA (LARS and LARS2), MetRS-IA (MARS and MARS2), PheRS-IICα and PheRS-IICβ (FARSA, FARSB and FARS2), ProRS-IIA (EPRS and PARS2), SerRS-IIA (SARS and SARS2), TrpRS-IC (WARS and WARS2) and TyrRS-IC (YARS and YARS2)). In human cells, EPRS is a hybrid gene encoding both GluRS-IB and ProRS-IIA. Twelve cytosolic aaRS enzymes appear to have an archaeal origin: AsnRS-IIB (NARS), CysRS-IB (CARS), GluRS-IB (EPRS), GlyRS-IIA (GARS), HisRS-IIA (HARS), LeuRS-IA (LARS), MetRS-IA (MARS), PheRS-IICα/β (FARSA and FARSB), ProRS-IIA (EPRS), SerRS-IIA (SARS), TrpRS-IC (WARS) and TyrRS-IC (YARS). The CARS gene appears to have split into cytosolic CARS and mitochondrial CARS2 by gene duplication and divergence. As noted above, GlnRS-IB is not imported into human mitochondria. In the eukaryotic cytosol, GlnRS-IB appears to be a eukaryotic innovation that was transferred to Bacteria and Archaea by multiple horizontal gene transfers [51,61].
Some cytosolic aaRS genes appear to have undergone multiple horizontal gene transfers. Examples include AlaRS-IID (AARS), AsnRS-IIB (NARS), ArgRS-ID (RARS), CysRS-IB (CARS), HisRS-IIA (HARS), MetRS-IA (MARS), ProRS-IIA (EPRS) and TyrRS-IC (YARS). Because of complex genetics, horizontal gene transfers and divergent evolution, there may be significant differences when comparing eukaryotic cytosols, mitochondria and chloroplasts from very different species. It appears that survival of the first eukaryotes may have required multiple and complex horizontal gene transfers and/or multiple endosymbioses.

10. TRNA Modifications Are as Old as LUCA

We consider Pyrococcus furiosus to be a reasonable reference organism for LUCA. Pyrococcus furiosus includes an Elp3 homolog that may encode the tRNA-U34 cm5U methylase that initiates the cnm5U34 modification (Figure 2). The Elp3 enzyme class is as ancient as LUCA. These enzymes utilize S-adenosylmethionine, an iron-sulfur complex, acetyl coenzyme A and radical intermediates to methylate the 5-carbon of U34 [70,71,72]. The cm5U34 reaction appears to include multiple steps and cooperation of the S-adenosylmethionine and the lysine acetyltransferase homology (coenzyme A-binding) active sites. S-adenosylmethionine is converted to a 5′-deoxyadenosyl radical. Acetyl-CoA is bound in the lysine acetyltransferase homology domain. An acetyl radical may then be formed and attached at the C5 position of U34. Figure 7 shows the related Escherichia coli methylase RlmN, which modifies the 2-carbon of tRNA-A37 [73,74]. The RlmN images were selected because they better emphasize some properties of these ancient enzymes. The image in Figure 7B is a detail of Figure 7A shown in a different orientation. The (β−α)6 partial barrel that binds S-adenosylmethionine was derived from a (β−α)8 TIM barrel (TIM for triose phosphate isomerase). The partial barrel domain is identified by six parallel β-strands with intervening α-helices in an open barrel shape. These ancient enzymes include a linked lysine acetyltransferase homology active site. The coenzyme A-binding region of the lysine acetyltransferase homology domain is identified in the image by antiparallel β-sheets (Figure 7A). Because Elp3 homologs are older than LUCA, TIM barrels, S-adenosylmethionine, Fe4-S4 cages, lysine acetyltransferases, coenzyme A and cm5U34-based modifications must be older than LUCA [75,76]. We posit that cm5U34-based tRNA modifications, which were required to form 2-codon genetic code sectors, were required to evolve the genetic code, which must also be older than LUCA.
Because modifications of the tRNA-37 position were important or essential to read the tRNA-36 position, we posit that t6A37 and m1G37 modifications are likely older than LUCA (see below).

11. TRNA-37 Modifications

To gain potential insights into tRNA-A37 and -G37 modifications, we visualized the genetic code for Archaea along with reported tRNA-37 modifications (Figure 8). We strongly support the idea that Archaea are the most ancient organisms on Earth and the most similar to LUCA [41,50,65]. Because of missing data, we combined results for tRNA-37 modifications from a number of archaeal species. We conclude the following. At the base of genetic code evolution, the major determinant of tRNA-37 modifications was the identity of the tRNA-36 base. As a result, similar or identical tRNA-37 modifications tend to cluster in genetic code rows (rows 1–4). This result makes sense because tRNA-36 and tRNA-37 are adjacent bases. The bulkiest ancient tRNA-37 modifications (i.e., t6A37 and hn6A37) are associated with tRNA-U36 (row 3), indicating that U36 may have required stabilization during early code evolution. TRNA-m1G37 modifications appear important or essential for reading tRNA-A36 (row 1) [27]. Of course, in principle, the identity of tRNA-37 could relate to the reading of the first codon position in mRNA instead of the tRNA-36 position, but we do not favor this idea. It appears to us that mRNA evolution generally chased tRNA evolution and that the genetic code evolved around the tRNA anticodon and the anticodon-delimiting base tRNA-37. Also, tRNA-37 modifying enzymes can read the tRNA-36 base directly but not the complementary codon base. Throughout row 3 (tRNA-U36), tRNA-37 t6A, hn6A and ms2hn6A are found. One exception is tRNAiMet, for which the anticodon loop is unmodified. From this comparison, it appears to us that tRNA-37 modifications may be more important for supporting translation elongation than for supporting initiation. Further discrimination of tRNAIle (CAU), tRNAMet (CAU) and tRNAiMet (CAU) is evident in the acceptor stems of the tRNAs [37].
According to tRNA anticodon preference rules, the genetic code evolved around the tRNA anticodon. At the wobble position tRNA-34, G was favored over C/U. At anticodon positions tRNA-35 and tRNA-36, the preference rules are C>G>U>>>A, and preferences are much stronger for the tRNA-36 position, which, early in code evolution, was a wobble position [37,38,39]. In keeping with these rules, unmodified tRNA-A37 appears favorable for row 4 (tRNA-C36), and C is the most favored tRNA-36 base (Figure 8). Although data are missing, it appears that tRNA-37 modifications can also be absent for row 2 (tRNA-G36). By contrast, in Archaea, row 3 (tRNA-U36) appears to be the most heavily modified for tRNA-37. We posit that tRNA-t6A37 may be among the most ancient row 3 modifications. Notably, t6A37 and hn6A37 are large N-6 modifications of A37 that may be important for stabilization of tRNA-U36 during translation elongation [27]. Row 1, tRNA-A36, was the last row to fill during evolution of the genetic code. Row 1 is modified for tRNA-37. We posit that tRNA-m1G37 may be the most ancient row 1 modification. Because m1G37 (row 1) appears to be a smaller modification than t6A or hn6A37 (row 3), we posit that tRNA-A36 may have been easier to stabilize than tRNA-U36 after suppression of tRNA-36 wobbling (i.e., by EF-Tu, 30S ribosomal closing and tRNA-37 modifications). Also, there is the difference in the identity of the t6A37 and m1G37 bases. Removing the tRNA-m1G37 modification increases the frameshifting of a near-cognate tRNA in the ribosome P-site [30].
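The preference rules above can be turned into a simple ordering, sketched here in Python. The sketch assumes a purely lexicographic ranking in which tRNA-36 dominates, then tRNA-35, then tRNA-34; it treats C34 and U34 as tied and places A34 last. These tie-breaking choices and the function names are assumptions made for illustration, and the ranks are ordinal only (they do not capture the much weaker standing of A indicated by ">>>").

```python
from itertools import product

# Ordinal ranks (lower is more favored) taken from the rule C > G > U >>> A
# for anticodon positions 35 and 36.
RANK_35_36 = {"C": 0, "G": 1, "U": 2, "A": 3}

# Wobble position 34: G favored over C/U; placing A last is an assumption.
RANK_34 = {"G": 0, "C": 1, "U": 1, "A": 2}

def preference_key(anticodon):
    """Sort key for an anticodon written 34-35-36, e.g. 'GCC'."""
    n34, n35, n36 = anticodon
    # tRNA-36 is compared first because its preferences are described
    # as much stronger than those at tRNA-35.
    return (RANK_35_36[n36], RANK_35_36[n35], RANK_34[n34])

def rank_anticodons():
    """Return all 64 anticodons ordered from most to least favored."""
    all_anticodons = ["".join(t) for t in product("ACGU", repeat=3)]
    return sorted(all_anticodons, key=preference_key)

ranked = rank_anticodons()
print(ranked[0])   # most favored under these assumptions
print(ranked[-1])  # least favored under these assumptions
```

Under this assumed ordering the top-ranked anticodon is GCC (Gly), in agreement with the statement that Gly (GCC) is the most favored anticodon.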
Preference rules for the tRNA anticodon may also partially explain why the glycine 4-codon sector did not evolve the A34→I34 modification in Eukaryotes. According to anticodon preference rules, Gly (GCC) is the most favored anticodon in the genetic code [37,38,39]. This may partly explain why the unmodified GCC anticodon was favored over a modified ICC anticodon for the glycine 4-codon sector in Eukarya. Consideration of anticodon preference rules appears to reinforce our model for evolution of the genetic code, our interpretations of tRNA anticodon loop modifications and our hypothesis that the genetic code evolved around the reading of the tRNA anticodon on the primitive pre-LUCA ribosome.

12. Partial Redundancy and Overlap in Translation Functions

Because of their ancient origin and central importance to life, translation systems evolved, very early, overlapping, partly redundant and mutually reinforcing functions. Such redundancy and overlap are observed in: (1) translational fidelity and frame maintenance; (2) tRNA sequence and modification; and (3) aaRS enzyme selectivity in tRNA charging. Because translation systems were central to life and evolution of the genetic code, functional redundancy and, also, backed-up, resilient functions were necessary to evolve stable systems. On the ribosome, translational accuracy and maintenance of the translation frame appear to be partially reinforcing systems. Specifically, translational accuracy and frame maintenance involve: (1) EF-Tu GTPase; (2) the 16S rRNA “latch” (30S ribosomal subunit closing mechanism); (3) an mRNA bend between the P-site and A-site codons; and (4) modifications of the tRNA-37 base [30,47,48]. EF-Tu is the most important factor in translational accuracy. EF-Tu binds the aminoacylated tRNA (aa-tRNA) and docks it on the ribosome. If the tRNA anticodon-mRNA codon interaction is cognate, EF-Tu hydrolyzes GTP to close the conformation of the ribosome 30S subunit (also referred to as closing the 16S rRNA latch). Once the latch is closed, EF-Tu releases the cognate A-site aa-tRNA to accommodate into the peptidyl transferase center for peptide bond formation. Accommodation requires a surprisingly large motion of the 3′-end of the aa-tRNA. Figure 9 shows a detail of a catalytic ribosome structure (PDB 5IBB) with the P-site (peptidyl-site) and A-site (aminoacyl-site) tRNAs [77,78]. To avoid confusion, only the decoding center is shown in the image, not the peptidyl transferase center, and only the anticodon loops of the P-site and A-site tRNAs are shown. The 16S rRNA latch (G530~A1492 and A1493; Thermus thermophilus numbering) is shown in its closed conformation. The mRNA bends between the P-site and A-site codons.
The bend (or “kink”) orients the 3′-ends of the tRNAs in the peptidyl transferase center, but the bend also separates the P-site and A-site tRNA anticodons in the decoding center [79,80,81]. Separation of the P-site and A-site anticodons in the decoding center has multiple effects. First, the bend in the mRNA prevents collision of the two anticodon loops. Notably, without the bend, A-site tRNA-37 might collide with the P-site tRNA. Second, separation of the P-site and A-site tRNAs helps the tRNAs to maintain the translation frame by acting as ratchet pawls. Closing the latch maintains the accuracy of translation by confirming the codon-anticodon interaction but also helps to maintain the frame. Modifications at the tRNA-37 position help to delineate the A-site anticodon and to maintain the translation reading frame. Notably, mutations that disable tRNA-37 modifications can cause slippage of the translation frame [30]. Bulky 37 modifications are associated most strongly with U36 (row 3) and A36 (row 1) anticodons, indicating that, among other features, tRNA-37 modifications help to read otherwise less stable codon–anticodon interactions (Figure 8) [27].
The tRNA anticodon loop has a highly specialized sequence with modifications that affect anticodon readout and loop dynamics (Figure 1). Also, the anticodon loop is a target for multiple interactions with modifying enzymes and the cognate aaRS. Thus, any particular sequence or modification can have multiple purposes and interactions. Mutations, therefore, can have complex and unanticipated effects. In the 7-nt anticodon loop, the anticodon immediately follows a U-turn, which in turn follows a U (position 33). The primordial tRNA anticodon loop sequence was close to 32-CU/BNNAA-38 (/ indicates a U-turn; B indicates G, C or U (not A); N indicates any base) [37,38,39,82]. Modifications are common at positions 32, 34, 37 and 38 [19,26,27]. A weak interaction (i.e., a C~A reverse Hoogsteen pair) is often observed between positions 32 and 38. The C32~A38 interaction may help to preserve the U-turn loop conformation that is important to maintain the codon-anticodon interaction. So, tRNA anticodon loop modifications, sequences and dynamics are evolved features that affect translational accuracy and output. We consider anticodon loop features to be complex, with overlapping inputs and outputs (i.e., sequences and modifications) that are evolved for different species and for individual tRNAs.
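The primordial loop consensus 32-CU/BNNAA-38 can be written as a pattern and checked against candidate 7-nt loop sequences. The following Python sketch is illustrative only: the function name and the example loop sequences are hypothetical, and the pattern tests unmodified base identity, not modifications or structure.

```python
import re

# 32-CU/BNNAA-38: C32, U33, B34 (G, C or U; not A), N35, N36, A37, A38.
# The "/" in the consensus marks the U-turn and is not a sequence character.
PRIMORDIAL_LOOP = re.compile(r"CU[GCU][ACGU]{2}AA")

def matches_primordial_loop(loop):
    """Return True if a 7-nt loop (positions 32-38) fits the consensus."""
    rna = loop.upper().replace("T", "U")  # accept DNA-style input as well
    return PRIMORDIAL_LOOP.fullmatch(rna) is not None

# Hypothetical loop carrying a GCC (Gly) anticodon at positions 34-36.
print(matches_primordial_loop("CUGCCAA"))  # True
# Hypothetical loop with A at the wobble position 34, excluded by B.
print(matches_primordial_loop("CUACGAA"))  # False
```

Using `fullmatch` restricts the check to exactly seven nucleotides, so longer or shorter strings are rejected instead of being matched in part.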
Matching a cognate tRNA to its cognate aaRS is also a problem with multiple inputs [31]. Notably, aaRS enzymes may read: (1) the discriminator base (XCCA-3’; X is the discriminator); (2) the acceptor stem; (3) the anticodon loop; (4) the tRNA elbow (where the D loop and the T loop interact); (5) expanded V-loops in type II tRNAs; and (6) tRNA modifications. We posit that aaRS recognition of their cognate tRNA, therefore, is a product of multiple partially overlapping determinants and anti-determinants. Table 1 indicates how cognate tRNAs and aaRS enzymes may have been sorted after genetic fusion of multiple Archaea and multiple Bacteria to form Eukarya.

13. Conclusions

We strongly support the model that the genetic code evolved around the reading of the tRNA anticodon on the primitive pre-LUCA ribosome [37,38,39]. Analyses of modifications at the tRNA-34 and -37 anticodon loop positions support this concept. Suppression of wobbling at the tRNA-36 position was essential to evolve the code.
Some of the conclusions of this paper are shown schematically in Figure 10. The presentation in this paper was partly organized around work of others [19,26]. We wished to expand the previous presentations to make it easier for non-experts in tRNA modification and anticodon readout to shape a detailed understanding. We also wanted to emphasize the problem of code evolution and devolution in mitochondria as an evolutionary milestone that helps explain ancient pre-LUCA evolution and also eukaryogenesis [5]. Figure 10 indicates that, in outline, evolution of life on Earth was simple with a small number of main branches. We advocate for the model that LUCA evolved first to Archaea. Archaea gave rise to Bacteria [41,50,65]. Fusion of an Asgard archaeon and an α-proteobacterium (i.e., Rickettsiales) gave rise to Eukarya, with division and establishment of separate and partly overlapping translation systems for the eukaryotic cytosol and the mitochondria [6,10,45]. Many other archaeal and bacterial genetic inputs were likely during eukaryogenesis, but, at the time of writing, these other gene transfers are less completely understood (Table 1) [51].
We consider analysis of the evolution of genetic codes and tRNA-34 modifications through Earth’s history to support our narrative (Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6). The simplest genetic code is that of Archaea (Figure 2), indicating that Archaea is closest to LUCA [41,50,65]. Generally, unmodified A34 is not allowed in Archaea, and only G34 is utilized. This fact alone indicates how genetic code degeneracy evolved. Degeneracy evolved through natural processes of the evolution of the reading of the tRNA anticodon on the primitive ribosome. To evolve the genetic code, universal or near universal cm5U34-based modifications were necessary to suppress superwobbling (4-way wobbling) and thus to support evolution of 2-codon genetic code sectors. Lacking 2-codon sectors, the genetic code would have been limited to a maximum of 16 amino acids.
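The limit of 16 assignments follows from simple counting, which the short Python sketch below makes explicit. It assumes that superwobbling collapses each 4-codon box (codons sharing the first two bases) into one assignment, and that suppressing superwobbling allows, at most, pyrimidine versus purine discrimination at the third codon position. The variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
codons = ["".join(c) for c in product(BASES, repeat=3)]

# With 4-way wobbling, a box is defined by the first two codon bases only.
boxes = {codon[:2] for codon in codons}

# With pyrimidine (Y = U, C) versus purine (R = A, G) discrimination at the
# third position, each box can split into at most two 2-codon sectors.
sectors = {(codon[:2], "Y" if codon[2] in "UC" else "R") for codon in codons}

print(len(codons))   # 64 codons
print(len(boxes))    # 16 four-codon boxes: at most 16 assignments
print(len(sectors))  # 32 two-codon sectors: upper bound once boxes can split
```

The count of 32 is only an upper bound on sectors; the standard code uses fewer assignments because many boxes remain unsplit and stop codons occupy some sectors.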
Translation systems evolved through ancient bacteria to more derived bacteria. To date, much tRNA modification data remains unreported for Thermus thermophilus. These missing data, once available, will enhance this discussion. More derived Bacteria are genetically diverse with many innovations. In some derived bacteria, G34 anticodons in 4-codon boxes pair with the cmo5U34 modification (Val, Ser, Pro, Thr and Ala), unmodified UAG (Leu) and mnm5UCC (Gly) (Figure 4). The emergence of the A34→I34 modification is relevant. The A34→I34 innovation is associated with suppression of the otherwise preferred G34 anticodon (Figure 6). The A34→I34 modification expanded in Eukarya. In 3- and 4-codon boxes, I34 anticodons may partner with particular U34 modifications (i.e., ncm5U34 and mcm5U34, in Eukarya). The G34→Q34 (Q for queuosine) modification in genetic code column 3 arose in derived Bacteria and was transmitted to the eukaryotic cytosol and to mitochondria.
Tracing the Ile/Met 4-codon sector through evolution is instructive. Maintaining 1-codon sectors for Met and Trp in the genetic code required proteome support. Probably, for this reason, mitochondria abandoned 1-codon sectors (Figure 5) to simplify the tRNAome and its supporting proteome [5]. In prokaryotes, we posit that Met invaded a 4-codon Ile sector during genetic code evolution, suppressing use of the UAU anticodon and resulting in C34 modifications to read Ile (i.e., CAU→agm2CAU and k2CAU). The 2-agmatidine modification of C34 found in Archaea and the related 2-lysidine modification in Bacteria read codon AUA (Ile) but not codon AUG (Met). In Eukarya, the Ile anticodon modification (UAU→ΨAΨ) arose, rescuing Ile anticodon UAU.
We posit that 4-codon sectors of the genetic code were balanced using different evolved strategies in different organisms to utilize, generally, three isoacceptor tRNAs to read four codons. This balance was mostly achieved by adjusting use of G34 or A34-derived and U34 anticodons. In Archaea, G34 and cm5U34-based anticodons (i.e., cnm5U34) were utilized (Figure 2). In some derived Bacteria, G34 and cmo5U34 anticodons were partnered for columns 1 and 2 of the code (4-codon sectors). In column 4, anticodon ICG partners with mnm5UCG to encode Arg, and GCC partners with mnm5UCC to encode Gly (Figure 4). According to anticodon preference rules, Gly (GCC) is expected to be the most favored anticodon in the genetic code. Gly (GCC) is associated with unmodified tRNA-A37 in Archaea (Figure 8), possibly reflecting the preferred status of the GCC anticodon. In Eukarya, diverse strategies were evolved for balancing 3- and 4-codon sectors (Figure 6). Very clearly, anticodons that are not utilized in organisms are very important for maintaining balanced reading of tRNAs (Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6). In mitochondria, 4-codon sectors utilize a single tRNA with unmodified U34 to read the entire 4-codon box, indicating that small mitochondrial genome size was more important than optimization of balancing multiple tRNAs for the most rapid and efficient translation of the 4-codon sectors (Figure 5).
We posit that the genetic code evolved around the reading of the tRNA anticodon on the primitive pre-LUCA ribosome. Analysis of tRNA wobble modifications strongly supports the idea that the genetic code evolved around the reading of the anticodon wobble position. Code degeneracy arose from wobbling at the 34 and 36 positions, as previously described [37,38,39]. Wobbling limits coding to pyrimidine-purine discrimination, so only two assignments were possible at a tRNA wobble position. Thus, evolving 1-codon sectors posed difficulties with miscoding and anticodon ambiguity. TRNA-37 modifications evolved to help lock down the anticodon tRNA-36 position, in part, to suppress wobbling at position 36. Also, wobbling at tRNA-36 was suppressed by evolution of EF-Tu and the 16S rRNA latch (Figure 8 and Figure 9). Analysis of how the genetic code devolved in evolution of the mitochondria strongly supports these views. We do not find the concept of late wobbling evolution to be credible [14,83]. We posit that the genetic code evolved and sectored largely around the reading of tRNA wobble positions.
Column 3 of the genetic code is split entirely into 2-codon sectors. We have posited that initially column 3 was divided into alternating 2-codon Asp and Glu sectors [37,38,39]. Our model explains the striped pattern of related aaRS enzymes in Archaea column 3 (Figure 2). According to our model for code evolution, tRNA-U34 modification (i.e., cm5U34) may have been necessary to suppress superwobbling at tRNA-U34 and to achieve the 8-amino acid fractionation of the code. According to our model, therefore, cm5U34-based modifications may have been necessary to achieve a genetic code including 8 amino acids. Alternatively, only tRNAs with 34-GU-35 (Asp) and 34-CU-35 (Glu) may have initially been utilized. In this case, C34 may have required modification to read mRNA wobble 3A. We conclude that tRNA wobble modifications appear to have been necessary as early as the 8-amino acid stage of genetic code evolution.
The model we support for evolution of life on Earth is a fairly well-accepted model (Figure 10). The analysis we present, therefore, appears to be straightforward and reasonable. Our work with the initial evolution of the genetic code is also very consistent with our current analysis [37,38,39]. As noted, the analyses that we present will be enhanced by the acquisition of additional tRNA modification data.
We imagine eukaryogenesis proceeding through a tense evolutionary bottleneck from FECA to LECA (first to last eukaryotic common ancestors). It appears to us that eukaryogenesis was tortured, involving many endosymbiotic and other large horizontal gene transfer events, only some of which resulted in identified eukaryotic organelles. Apparently, contributions were made to the process by many archaeal and many bacterial genes and, also, the genetic fusions were balanced by many compensating eukaryotic innovations [11]. The FECA to LECA bottleneck is reflected in the evolution of aaRS enzymes through eukaryogenesis (Table 1) [51]. Clearly, genes were transferred between many different organisms, including the horizontal transfer of the gene encoding GlnRS-IB from Eukarya to Archaea and Bacteria.

14. Future Work

Specific goals for future work include: (1) obtain additional tRNA modification data (i.e., for Pyrococcus furiosus and Thermus thermophilus); (2) improve the data underlying Table 1 by obtaining optimal aaRS enzyme evolutionary sourcing for (a) animals, (b) plants, (c) mitochondria and (d) chloroplasts; (3) improve the description of evolution of tRNA-34 modifications and modification enzymes; and (4) improve the description of evolution of tRNA-37 modifications and modification enzymes. These additional data would enhance the narrative presented here. Mitochondria were an older acquisition than chloroplasts in evolution of Eukaryotes. A more detailed model for the more recent evolution of chloroplasts (i.e., tRNAs, tRNA modifications, aaRS enzymes and genetic code), therefore, would enhance the understanding of the acquisition of mitochondria and the evolution of Eukaryotes through endosymbiosis.

Author Contributions

Z.F.B. and L.L. reviewed the literature and drafted the manuscript. Z.F.B. drew the initial figures, which were corrected by Z.F.B. and L.L. Z.F.B. and L.L. revised the manuscript. Z.F.B. and L.L. conceived the article structure and content. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Alkatib, S.; Scharff, L.B.; Rogalski, M.; Fleischmann, T.T.; Matthes, A.; Seeger, S.; Schottler, M.A.; Ruf, S.; Bock, R. The contributions of wobbling and superwobbling to the reading of the genetic code. PLoS Genet. 2012, 8, e1003076. [Google Scholar] [CrossRef] [Green Version]
  2. Rogalski, M.; Karcher, D.; Bock, R. Superwobbling facilitates translation with reduced tRNA sets. Nat. Struct. Mol. Biol. 2008, 15, 192–198. [Google Scholar] [CrossRef]
  3. Wolff, P.; Villette, C.; Zumsteg, J.; Heintz, D.; Antoine, L.; Chane-Woon-Ming, B.; Droogmans, L.; Grosjean, H.; Westhof, E. Comparative patterns of modified nucleotides in individual tRNA species from a mesophilic and two thermophilic archaea. RNA 2020, 26, 1957–1975. [Google Scholar] [CrossRef]
  4. Phillips, G.; de Crecy-Lagard, V. Biosynthesis and function of tRNA modifications in Archaea. Curr. Opin. Microbiol. 2011, 14, 335–341. [Google Scholar] [CrossRef]
  5. Suzuki, T.; Yashiro, Y.; Kikuchi, I.; Ishigami, Y.; Saito, H.; Matsuzawa, I.; Okada, S.; Mito, M.; Iwasaki, S.; Ma, D.; et al. Complete chemical structures of human mitochondrial tRNAs. Nat. Commun. 2020, 11, 4269. [Google Scholar] [CrossRef]
  6. Eme, L.; Spang, A.; Lombard, J.; Stairs, C.W.; Ettema, T.J.G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 2017, 15, 711–723, Erratum in Nat. Rev. Microbiol. 2018, 16, 120. [Google Scholar] [CrossRef] [PubMed]
  7. Roger, A.J.; Munoz-Gomez, S.A.; Kamikawa, R. The Origin and Diversification of Mitochondria. Curr. Biol. 2017, 27, R1177–R1192. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  8. Eme, L.; Ettema, T.J.G. The eukaryotic ancestor shapes up. Nature 2018, 562, 352–353. [Google Scholar] [CrossRef]
  9. Spang, A.; Eme, L.; Saw, J.H.; Caceres, E.F.; Zaremba-Niedzwiedzka, K.; Lombard, J.; Guy, L.; Ettema, T.J.G. Asgard archaea are the closest prokaryotic relatives of eukaryotes. PLoS Genet. 2018, 14, e1007080. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  10. Martijn, J.; Vosseberg, J.; Guy, L.; Offre, P.; Ettema, T.J.G. Deep mitochondrial origin outside the sampled alphaproteobacteria. Nature 2018, 557, 101–105. [Google Scholar] [CrossRef]
  11. Gogoi, J.; Bhatnagar, A.; Ann, K.J.; Pottabathini, S.; Singh, R.; Mazeed, M.; Kuncha, S.K.; Kruparani, S.P.; Sankaranarayanan, R. Switching a conflicted bacterial DTD-tRNA code is essential for the emergence of mitochondria. Sci. Adv. 2022, 8, eabj7307. [Google Scholar] [CrossRef] [PubMed]
  12. Yu, N.; Jora, M.; Solivio, B.; Thakur, P.; Acevedo-Rocha, C.G.; Randau, L.; de Crecy-Lagard, V.; Addepalli, B.; Limbach, P.A. tRNA Modification Profiles and Codon-Decoding Strategies in Methanocaldococcus jannaschii. J. Bacteriol. 2019, 201. [Google Scholar] [CrossRef] [Green Version]
  13. Grosjean, H.; Gaspin, C.; Marck, C.; Decatur, W.A.; de Crecy-Lagard, V. RNomics and Modomics in the halophilic archaea Haloferax volcanii: Identification of RNA modification genes. BMC Genom. 2008, 9, 470. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Yarus, M. Crick Wobble and Superwobble in Standard Genetic Code Evolution. J. Mol. Evol. 2021, 89, 50–61. [Google Scholar] [CrossRef] [PubMed]
  15. Srinivasan, S.; Torres, A.G.; Ribas de Pouplana, L. Inosine in Biology and Disease. Genes 2021, 12, 600.
  16. Rafels-Ybern, A.; Torres, A.G.; Camacho, N.; Herencia-Ropero, A.; Roura Frigole, H.; Wulff, T.F.; Raboteg, M.; Bordons, A.; Grau-Bove, X.; Ruiz-Trillo, I.; et al. The Expansion of Inosine at the Wobble Position of tRNAs, and Its Role in the Evolution of Proteomes. Mol. Biol. Evol. 2019, 36, 650–662.
  17. Rafels-Ybern, A.; Torres, A.G.; Grau-Bove, X.; Ruiz-Trillo, I.; Ribas de Pouplana, L. Codon adaptation to tRNAs with Inosine modification at position 34 is widespread among Eukaryotes and present in two Bacterial phyla. RNA Biol. 2018, 15, 500–507.
  18. Torres, A.G.; Pineyro, D.; Filonava, L.; Stracker, T.H.; Batlle, E.; Ribas de Pouplana, L. A-to-I editing on tRNAs: Biochemical, biological and evolutionary implications. FEBS Lett. 2014, 588, 4279–4286.
  19. El Yacoubi, B.; Bailly, M.; de Crecy-Lagard, V. Biosynthesis and function of posttranscriptional modifications of transfer RNAs. Annu. Rev. Genet. 2012, 46, 69–95.
  20. Pernod, K.; Schaeffer, L.; Chicher, J.; Hok, E.; Rick, C.; Geslain, R.; Eriani, G.; Westhof, E.; Ryckelynck, M.; Martin, F. The nature of the purine at position 34 in tRNAs of 4-codon boxes is correlated with nucleotides at positions 32 and 38 to maintain decoding fidelity. Nucleic Acids Res. 2020, 48, 6170–6183.
  21. Vik, E.S.; Nawaz, M.S.; Strom Andersen, P.; Fladeby, C.; Bjoras, M.; Dalhus, B.; Alseth, I. Endonuclease V cleaves at inosines in RNA. Nat. Commun. 2013, 4, 2271.
  22. Muller, M.; Legrand, C.; Tuorto, F.; Kelly, V.P.; Atlasi, Y.; Lyko, F.; Ehrenhofer-Murray, A.E. Queuine links translational control in eukaryotes to a micronutrient from bacteria. Nucleic Acids Res. 2019, 47, 3711–3727.
  23. Tuorto, F.; Legrand, C.; Cirzi, C.; Federico, G.; Liebers, R.; Muller, M.; Ehrenhofer-Murray, A.E.; Dittmar, G.; Grone, H.J.; Lyko, F. Queuosine-modified tRNAs confer nutritional control of protein translation. EMBO J. 2018, 37.
  24. Tuorto, F.; Lyko, F. Genome recoding by tRNA modifications. Open Biol. 2016, 6.
  25. Vinayak, M.; Pathak, C. Queuosine modification of tRNA: Its divergent role in cellular machinery. Biosci. Rep. 2009, 30, 135–148.
  26. Agris, P.F.; Eruysal, E.R.; Narendran, A.; Vare, V.Y.P.; Vangaveti, S.; Ranganathan, S.V. Celebrating wobble decoding: Half a century and still much is new. RNA Biol. 2018, 15, 537–553.
  27. Berg, M.D.; Brandl, C.J. Transfer RNAs: Diversity in form and function. RNA Biol. 2021, 18, 316–339.
  28. Bjork, G.R.; Durand, J.M.; Hagervall, T.G.; Leipuviene, R.; Lundgren, H.K.; Nilsson, K.; Chen, P.; Qian, Q.; Urbonavicius, J. Transfer RNA modification: Influence on translational frameshifting and metabolism. FEBS Lett. 1999, 452, 47–51.
  29. Bjork, G.R.; Wikstrom, P.M.; Bystrom, A.S. Prevention of translational frameshifting by the modified nucleoside 1-methylguanosine. Science 1989, 244, 986–989.
  30. Hoffer, E.D.; Hong, S.; Sunita, S.; Maehigashi, T.; Gonzalez, R.L., Jr.; Whitford, P.C.; Dunham, C.M. Structural insights into mRNA reading frame regulation by tRNA modification and slippery codon-anticodon pairing. eLife 2020, 9, e51898.
  31. Perona, J.J.; Gruic-Sovulj, I. Synthetic and editing mechanisms of aminoacyl-tRNA synthetases. Top. Curr. Chem. 2014, 344, 1–41.
  32. Sajek, M.P.; Wozniak, T.; Sprinzl, M.; Jaruzelska, J.; Barciszewski, J. T-psi-C: User friendly database of tRNA sequences and structures. Nucleic Acids Res. 2020, 48, D256–D260.
  33. Boccaletto, P.; Machnicka, M.A.; Purta, E.; Piatkowski, P.; Baginski, B.; Wirecki, T.K.; de Crecy-Lagard, V.; Ross, R.; Limbach, P.A.; Kotter, A.; et al. MODOMICS: A database of RNA modification pathways. 2017 update. Nucleic Acids Res. 2018, 46, D303–D307.
  34. Machnicka, M.A.; Milanowska, K.; Osman Oglou, O.; Purta, E.; Kurkowska, M.; Olchowik, A.; Januszewski, W.; Kalinowski, S.; Dunin-Horkawicz, S.; Rother, K.M.; et al. MODOMICS: A database of RNA modification pathways-2013 update. Nucleic Acids Res. 2013, 41, D262–D267.
  35. Czerwoniec, A.; Dunin-Horkawicz, S.; Purta, E.; Kaminska, K.H.; Kasprzak, J.M.; Bujnicki, J.M.; Grosjean, H.; Rother, K. MODOMICS: A database of RNA modification pathways. 2008 update. Nucleic Acids Res. 2009, 37, D118–D121.
  36. Dunin-Horkawicz, S.; Czerwoniec, A.; Gajda, M.J.; Feder, M.; Grosjean, H.; Bujnicki, J.M. MODOMICS: A database of RNA modification pathways. Nucleic Acids Res. 2006, 34, D145–D149.
  37. Lei, L.; Burton, Z.F. Evolution of the genetic code. Transcription 2021, 12, 28–53.
  38. Lei, L.; Burton, Z.F. Evolution of Life on Earth: tRNA, Aminoacyl-tRNA Synthetases and the Genetic Code. Life 2020, 10, 21.
  39. Kim, Y.; Opron, K.; Burton, Z.F. A tRNA- and Anticodon-Centric View of the Evolution of Aminoacyl-tRNA Synthetases, tRNAomes, and the Genetic Code. Life 2019, 9, 37.
  40. Pak, D.; Du, N.; Kim, Y.; Sun, Y.; Burton, Z.F. Rooted tRNAomes and evolution of the genetic code. Transcription 2018, 9, 137–151.
  41. Long, X.; Xue, H.; Wong, J.T. Descent of Bacteria and Eukarya From an Archaeal Root of Life. Evol. Bioinform. Online 2020, 16, 1176934320908267.
  42. Youle, R.J. Mitochondria-Striking a balance between host and endosymbiont. Science 2019, 365, eaaw9855.
  43. Lopez-Garcia, P.; Eme, L.; Moreira, D. Symbiosis in eukaryotic evolution. J. Theor. Biol. 2017, 434, 20–33.
  44. Kaiser, F.; Krautwurst, S.; Salentin, S.; Haupt, V.J.; Leberecht, C.; Bittrich, S.; Labudde, D.; Schroeder, M. The structural basis of the genetic code: Amino acid recognition by aminoacyl-tRNA synthetases. Sci. Rep. 2020, 10, 12647.
  45. Zaremba-Niedzwiedzka, K.; Caceres, E.F.; Saw, J.H.; Backstrom, D.; Juzokaite, L.; Vancaester, E.; Seitz, K.W.; Anantharaman, K.; Starnawski, P.; Kjeldsen, K.U.; et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 2017, 541, 353–358.
  46. Shi, H.; Moore, P.B. The crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution: A classic structure revisited. RNA 2000, 6, 1091–1105.
  47. Loveland, A.B.; Demo, G.; Grigorieff, N.; Korostelev, A.A. Ensemble cryo-EM elucidates the mechanism of translation fidelity. Nature 2017, 546, 113–117.
  48. Loveland, A.B.; Demo, G.; Korostelev, A.A. Cryo-EM of elongating ribosome with EF-Tu*GTP elucidates tRNA proofreading. Nature 2020, 584, 640–645.
  49. Quigley, G.J.; Suddath, F.L.; McPherson, A.; Kim, J.J.; Sneden, D.; Rich, A. The molecular structure of yeast phenylalanine transfer RNA in monoclinic crystals. Proc. Natl. Acad. Sci. USA 1974, 71, 2146–2150.
  50. Lei, L.; Burton, Z.F. Early Evolution of Transcription Systems and Divergence of Archaea and Bacteria. Front. Mol. Biosci. 2021, 8, 1134.
  51. Furukawa, R.; Nakagawa, M.; Kuroyanagi, T.; Yokobori, S.I.; Yamagishi, A. Quest for Ancestors of Eukaryal Cells Based on Phylogenetic Analyses of Aminoacyl-tRNA Synthetases. J. Mol. Evol. 2017, 84, 51–66.
  52. Agris, P.F.; Narendran, A.; Sarachan, K.; Vare, V.Y.P.; Eruysal, E. The Importance of Being Modified: The Role of RNA Modifications in Translational Fidelity. Enzymes 2017, 41, 1–50.
  53. Suzuki, T.; Numata, T. Convergent evolution of AUA decoding in bacteria and archaea. RNA Biol. 2014, 11, 1586–1596.
  54. Satpati, P.; Bauer, P.; Aqvist, J. Energetic tuning by tRNA modifications ensures correct decoding of isoleucine and methionine on the ribosome. Chemistry 2014, 20, 10271–10275.
  55. Mandal, D.; Kohrer, C.; Su, D.; Russell, S.P.; Krivos, K.; Castleberry, C.M.; Blum, P.; Limbach, P.A.; Soll, D.; RajBhandary, U.L. Agmatidine, a modified cytidine in the anticodon of archaeal tRNA(Ile), base pairs with adenosine but not with guanosine. Proc. Natl. Acad. Sci. USA 2010, 107, 2872–2877.
  56. Sonawane, K.D.; Sambhare, S.B. The influence of hypermodified nucleosides lysidine and t(6)A to recognize the AUA codon instead of AUG: A molecular dynamics simulation study. Integr. Biol. 2015, 7, 1387–1395.
  57. Nakanishi, K.; Bonnefond, L.; Kimura, S.; Suzuki, T.; Ishitani, R.; Nureki, O. Structural basis for translational fidelity ensured by transfer RNA lysidine synthetase. Nature 2009, 461, 1144–1148.
  58. Grosjean, H.; Bjork, G.R. Enzymatic conversion of cytidine to lysidine in anticodon of bacterial isoleucyl-tRNA--an alternative way of RNA editing. Trends Biochem. Sci. 2004, 29, 165–168.
  59. Soma, A.; Ikeuchi, Y.; Kanemasa, S.; Kobayashi, K.; Ogasawara, N.; Ote, T.; Kato, J.; Watanabe, K.; Sekine, Y.; Suzuki, T. An RNA-modifying enzyme that governs both the codon and amino acid specificities of isoleucine tRNA. Mol. Cell 2003, 12, 689–698.
  60. Burroughs, A.M.; Aravind, L. The Origin and Evolution of Release Factors: Implications for Translation Termination, Ribosome Rescue, and Quality Control Pathways. Int. J. Mol. Sci. 2019, 20, 1981.
  61. Brindefalk, B.; Viklund, J.; Larsson, D.; Thollesson, M.; Andersson, S.G. Origin and evolution of the mitochondrial aminoacyl-tRNA synthetases. Mol. Biol. Evol. 2007, 24, 743–756.
  62. Bhaskaran, H.; Perona, J.J. Two-step aminoacylation of tRNA without channeling in Archaea. J. Mol. Biol. 2011, 411, 854–869.
  63. Perona, J.J. Two-step pathway to aminoacylated tRNA. Structure 2005, 13, 1397–1398.
  64. Yaremchuk, A.; Kriklivyi, I.; Tukalo, M.; Cusack, S. Class I tyrosyl-tRNA synthetase has a class II mode of cognate tRNA recognition. EMBO J. 2002, 21, 3829–3840.
  65. Kim, K.M.; Caetano-Anolles, G. The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. BMC Evol. Biol. 2012, 12, 13.
  66. Nasvall, S.J.; Chen, P.; Bjork, G.R. The wobble hypothesis revisited: Uridine-5-oxyacetic acid is critical for reading of G-ending codons. RNA 2007, 13, 2151–2164.
  67. Nasvall, S.J.; Chen, P.; Bjork, G.R. The modified wobble nucleoside uridine-5-oxyacetic acid in tRNAPro(cmo5UGG) promotes reading of all four proline codons in vivo. RNA 2004, 10, 1662–1673.
  68. Brindefalk, B.; Ettema, T.J.; Viklund, J.; Thollesson, M.; Andersson, S.G. A phylometagenomic exploration of oceanic alphaproteobacteria reveals mitochondrial relatives unrelated to the SAR11 clade. PLoS ONE 2011, 6, e24457.
  69. Brandao, M.M.; Silva-Filho, M.C. Evolutionary history of Arabidopsis thaliana aminoacyl-tRNA synthetase dual-targeted proteins. Mol. Biol. Evol. 2011, 28, 79–85.
  70. Abbassi, N.E.; Biela, A.; Glatt, S.; Lin, T.Y. How Elongator Acetylates tRNA Bases. Int. J. Mol. Sci. 2020, 21, 8209.
  71. Lin, T.Y.; Abbassi, N.E.H.; Zakrzewski, K.; Chramiec-Glabik, A.; Jemiola-Rzeminska, M.; Rozycki, J.; Glatt, S. The Elongator subunit Elp3 is a non-canonical tRNA acetyltransferase. Nat. Commun. 2019, 10, 625.
  72. Glatt, S.; Zabel, R.; Kolaj-Robin, O.; Onuma, O.F.; Baudin, F.; Graziadei, A.; Taverniti, V.; Lin, T.Y.; Baymann, F.; Seraphin, B.; et al. Structural basis for tRNA modification by Elp3 from Dehalococcoides mccartyi. Nat. Struct. Mol. Biol. 2016, 23, 794–802.
  73. Schwalm, E.L.; Grove, T.L.; Booker, S.J.; Boal, A.K. Crystallographic capture of a radical S-adenosylmethionine enzyme in the act of modifying tRNA. Science 2016, 352, 309–312.
  74. Blue, T.C.; Davis, K.M. Computational Approaches: An Underutilized Tool in the Quest to Elucidate Radical SAM Dynamics. Molecules 2021, 26, 2590.
  75. Martin, W.F.; Weiss, M.C.; Neukirchen, S.; Nelson-Sathi, S.; Sousa, F.L. Physiology, phylogeny, and LUCA. Microb. Cell 2016, 3, 582–587.
  76. Weiss, M.C.; Sousa, F.L.; Mrnjavac, N.; Neukirchen, S.; Roettger, M.; Nelson-Sathi, S.; Martin, W.F. The physiology and habitat of the last universal common ancestor. Nat. Microbiol. 2016, 1, 16116.
  77. Rozov, A.; Demeshkina, N.; Westhof, E.; Yusupov, M.; Yusupova, G. New Structural Insights into Translational Miscoding. Trends Biochem. Sci. 2016, 41, 798–814.
  78. Rozov, A.; Westhof, E.; Yusupov, M.; Yusupova, G. The ribosome prohibits the G*U wobble geometry at the first position of the codon-anticodon helix. Nucleic Acids Res. 2016, 44, 6434–6441.
  79. Keedy, H.E.; Thomas, E.N.; Zaher, H.S. Decoding on the ribosome depends on the structure of the mRNA phosphodiester backbone. Proc. Natl. Acad. Sci. USA 2018, 115, E6731–E6740.
  80. Demeshkina, N.; Jenner, L.; Westhof, E.; Yusupov, M.; Yusupova, G. A new understanding of the decoding principle on the ribosome. Nature 2012, 484, 256–259.
  81. Selmer, M.; Dunham, C.M.; Murphy, F.V., IV; Weixlbaumer, A.; Petry, S.; Kelley, A.C.; Weir, J.R.; Ramakrishnan, V. Structure of the 70S ribosome complexed with mRNA and tRNA. Science 2006, 313, 1935–1942.
  82. Burton, Z.F. The 3-Minihelix tRNA Evolution Theorem. J. Mol. Evol. 2020, 88, 234–242.
  83. Yarus, M. Evolution of the Standard Genetic Code. J. Mol. Evol. 2021, 89, 19–44.
Figure 1. The Saccharomyces cerevisiae tRNAPhe anticodon loop (PDB 1EHZ). (A) The linear sequence is shown. The anticodon (Ac) is indicated (3 blue dots). (B) The folded loop structure is shown. / indicates a U-turn. (C–E) Three views of the anticodon loop are shown. The anticodon is indicated in C (blue dots). Blue dashed lines indicate H-bonds. Colors: (beige) carbon; (blue) nitrogen; (red) oxygen; (orange) phosphorus. Abbreviations: (Cm) 2′-O-methylcytidine; (Gm) 2′-O-methylguanosine; (yW) wybutosine (modification of G); (Ψ) pseudouridine; (m5C) 5-methylcytidine.
Figure 2. The genetic code in Archaea (i.e., Pyrococcus furiosus). Genetic code columns (tRNA-35) are labelled 1–4. The leftmost table column gives row designations. Row 1–4 numbers indicate the tRNA-36 base. Codon bases (1st, 2nd, 3rd) are shaded pale yellow. TRNA-34 bases are indicated with modifications in bold type. Amino acids and aaRS structural classes and subclasses are shown (i.e., Phe-IIC indicates tRNAPhe is charged by PheRS-IIC) (aa-aaRS). GAA/AAA indicates anticodon (Ac) data. Anticodon GAA reads codons UUU and UUC, and anticodon AAA is not utilized. Color highlighting is meant to emphasize particular table features and evolution of aaRS enzymes through Earth’s history in Figures 2–6. Data were modeled on Pyrococcus furiosus but tRNA modification data are not complete, so some data were inferred or utilized from other Archaea. Color shading is meant to be largely consistent in Figures 2–6.
Figure 3. The genetic code in ancient Bacteria (i.e., Thermus thermophilus). GAA/AAA indicates anticodon GAA is utilized and AAA is not, to encode Phe. QGUA/AUA indicates the G34→Q34 modification and AUA is not utilized. LysRS-IIB is a bacterial innovation. cm2UAA for Leu indicates that the precise 5-carbon U modification to suppress superwobbling is not currently reported for Thermus thermophilus. Some tRNA modification data were inferred by identifying enzymes in Thermus thermophilus. It is not clear to us at the time of writing how the Arg 4-codon box is read.
Figure 4. The genetic code in derived Bacteria (i.e., Escherichia coli). Innovations include: (1) ProRS-IIA takes on additional bacterial features; (2) Arg ACG→ICG/GCG is utilized (Thermus thermophilus appears to lack tRNA adenosine deaminase); and (3) GlyRS-IIA can be replaced in some Bacteria by GlyRS-IID. As in Thermus thermophilus, LysRS-IIB and type II tRNATyr are utilized. This table is based on incomplete tRNA modification data. Escherichia coli appears not to utilize Lys anticodon CUU.
Figure 5. The genetic code in human mitochondria. A major strategy to shrink the mitochondrial tRNAome was superwobbling (beige shading). In mitochondria, Met, Ile and Trp utilize 2-codon sectors. The distribution of stop codons has changed. GlnRS-IB is not imported into human mitochondria. G34→Q34 modifications are utilized in column 3. τ indicates taurine modifications. Many unused anticodons were not struck out in this figure (except in column 1). It appears that the human mitochondrial code may be completely and accurately reported [5].
Figure 6. The genetic code in the eukaryotic cytosol (i.e., human). Shading and symbols are as in Figure 2, Figure 3, Figure 4 and Figure 5. ΨUAΨ indicates ΨAΨ (Ψ for pseudouridine).
Figure 7. Elp3 (tRNA-cm5U34 methyl transferase) is an ancient enzyme. The Elp3 homolog RlmN (tRNA-m2A37) methylase is shown. (A) A view of the RlmN structure. (B) A detail and rotated view. β-sheets are yellow. The Fe4S4 cage is indicated. A 5′-deoxyadenosine (5AD) radical is formed from S-adenosylmethionine (space-filling representation). The radical reaction mechanism of RlmN methylase involves a covalent intermediate linking Cys355 and m2A37. In Archaea, Elp3 may function somewhat differently. Enzymes of this class include an S-adenosylmethionine methylase domain and a lysine acetyl transferase homology domain that binds acetyl coenzyme A.
Figure 8. TRNA-37 modifications in Archaea. The tRNA-34 and tRNA-37 modifications are indicated in bold type. TRNA-37 modifications track the tRNA-36 position (rows 1–4). Row 1 (light blue) and 3 (light green) numbers are shaded for emphasis.
Figure 9. The decoding center of the Thermus thermophilus ribosome during peptide bond synthesis [78]. Colors: (grey) P-site tRNA anticodon loop; (beige) A-site tRNA anticodon loop; (sea green) the “latch”; and (red) mRNA. A bend in the mRNA that separates the P-site and A-site codons and anticodons is indicated (red arrow). Codon positions (1, 2 and 3) and the 5′→3′ directionality of the mRNA are indicated.
Figure 10. Evolution of tRNA-34 wobble modifications. Superwobbling in mitochondria indicates that cm5U34-based modifications were necessary to generate 2-codon sectors to evolve the LUCA code. Red strikethrough indicates that an anticodon is not utilized. Ψ indicates pseudouridine. In mitochondria, 2-codon sectors are utilized to encode Ile, Met and Trp. HGT indicates horizontal gene transfer. Not all anticodon strike-outs are listed for superwobbling in mitochondria.
Table 1. Human aaRS enzymes (and genes) in the cytosol and mitochondria. PMW indicates Parvarchaeota, Micrarchaeota, and Woesearchaeota [51]. The mitochondria utilize GluRS-IB to generate Glu-tRNAGln and a transamidase to generate Gln-tRNAGln for translation. Abbreviations: (Cyto) cytoplasmic; (Mito) mitochondrial.
| aaRS | Cyto | Cyto/Mito | Mito |
|---|---|---|---|
| AlaRS-IID | AARS (Bacteria) | | AARS2 (Bacteria) |
| ArgRS-ID | RARS (Bacteria) | | RARS2 (Bacteria) |
| AsnRS-IIB | NARS (Archaea) | | NARS2 (Bacteria) |
| AspRS-IIB | DARS (Deinococcus-Thermus; Bacteria) | | DARS2 (Bacteria) |
| CysRS-IB | CARS (Archaea) | | CARS2 (Archaea) |
| GlnRS-IB | QARS (Eukarya) | | Transamidation |
| GluRS-IB | EPRS (PMW; Archaea) | | EARS2 (Bacteria) |
| GlyRS-IIA | | GARS (Euryarchaeota; Archaea) | |
| HisRS-IIA | HARS (Archaea) | | HARS2 (Bacteria) |
| IleRS-IA | IARS (Lentisphaera; Bacteria) | | IARS2 (Bacteria) |
| LeuRS-IA | LARS (PMW; Archaea) | | LARS2 (Bacteria) |
| LysRS-IIB | | KARS (Bacteria) | |
| MetRS-IA | MARS (Archaea) | | MARS2 (Bacteria) |
| PheRS-IIC | FARSA + FARSB (Euryarchaeota?; Archaea) | | FARS2 (Bacteria) |
| ProRS-IIA | EPRS (Archaea) | | PARS2 (Bacteria) |
| SerRS-IIA | SARS (TACK; Archaea) | | SARS2 (Bacteria) |
| ThrRS-IIA | TARS (i.e., Gemmatimonadetes?; Bacteria) | | TARS2 (Bacteria) |
| TrpRS-IC | WARS (PMW; Archaea) | | WARS2 (Bacteria) |
| TyrRS-IC | YARS (Archaea) | | YARS2 (Bacteria) |
| ValRS-IA | VARS (Deltaproteobacteria?; Bacteria) | | VARS2 (Bacteria) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lei, L.; Burton, Z.F. “Superwobbling” and tRNA-34 Wobble and tRNA-37 Anticodon Loop Modifications in Evolution and Devolution of the Genetic Code. Life 2022, 12, 252. https://doi.org/10.3390/life12020252
