Order and Complexity in the RNA World
Abstract
1. Defining Progress in an RNA World
2. A “Random Walk” through the System
- (1)
- The starting point may be any unit of the system.
- (2)
- In each step, the move preferably occurs to the closest adjacent unit (nearest neighbor).
- (3)
- If rule 2 leads to a unit that is already part of the given path, choose the unit at the next-shortest distance. This rule applies repeatedly until the next step leads to a “fresh” unit that has not been part of the path so far.
- (4)
- The random walk stops after N steps, with N being the total number of units in the system.
- (5)
- The random walk is repeated again and again over a certain time window to account for dynamic reorientation processes (molecular dynamics) in the system.
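The five rules above can be sketched in a few lines of Python. This is only an illustrative reading of the rules, not the author's implementation: the function name `nearest_neighbor_walk`, the representation of units as coordinate tuples, and the Euclidean distance measure are all assumptions made for this example.

```python
import math
import random


def nearest_neighbor_walk(positions, start=None, rng=random):
    """Return the order in which the units are visited (hypothetical helper)."""
    n_units = len(positions)
    if start is None:
        start = rng.randrange(n_units)  # rule 1: any unit may be the start
    path = [start]
    visited = {start}
    while len(path) < n_units:  # rule 4: stop once all N units are on the path
        here = positions[path[-1]]
        # Rules 2 and 3: the closest unit that is not yet part of the path.
        nxt = min(
            (i for i in range(n_units) if i not in visited),
            key=lambda i: math.dist(here, positions[i]),
        )
        path.append(nxt)
        visited.add(nxt)
    return path


# Four units on a line (assumed toy coordinates).
units = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
print(nearest_neighbor_walk(units, start=0))  # [0, 1, 2, 3]
print(nearest_neighbor_walk(units, start=1))  # [1, 0, 2, 3]
```

Rule 5 would correspond to calling the function again and again while the coordinates in `positions` change over time; the sketch leaves that outer loop to the caller.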
3. Defining the Order of the System—Reciprocal Sequential Entropy
- (1)
- The number n of different types of monomer units (e.g., four in case of conventional RNA).
- (2)
- The relative contributions ri of all given monomer types i in solution (with r1 + r2 + r3 + …+ rn = 1). In case of RNA and for equal concentrations of all bases, we obtain r1 = r2 = r3 = r4 = 0.25.
- (3)
- The average number of monomers M between two chains in the random walk (in number of units).
- (4)
- The average chain length L (in number of units).
- (5)
- The predictability pk of a given chain unit in accordance with a (partially) defined chain sequence. Note that the index “k” generally does not coincide with the index “i”. Instead, k = 1 denotes the most likely unit to follow, k = 2 the second-most likely, etc. Completely defined chains therefore lead to p1 = 1 and p2 = p3 = p4 = … = pn = 0, completely random chains to p1 = p2 = p3 = … = pn = 1/n. The values for pk are a measure of the average degree of definition of the chain sequences.
- (6)
- The relative accessibility aj of unit j of the chains in solution (with aj = 0 for a completely inaccessible segment, and a1 + a2 + a3 + … + aL = 1). This parameter set accounts for the average chain conformation and its preferred contact sites with monomer units in solution.
- (7)
- The total number N of units in the system and on the pathway of the random walk.
- (1)
- Pure monomer. If no chains are present and the system consists of monomer units only, the average chain length is L = 1 and all units are fully random with p = 1. In this case, Equation (3) reduces to Sr = k N ln n. For four different monomer varieties, the corresponding entropy contribution for one mol of units (N = 6.022 ∙ 10^23) amounts to Sr = 11.526 J/(K∙mol), which is equivalent to the mixing entropy of the four different units (with 0.25 mol of each monomer) in the same overall volume. This leads to a relatively large entropy contribution Sr and, correspondingly, to a small sequential order 1/Sr; this situation is represented by point 1 in Figure 1.
- (2)
- Crystals. If we induce crystallization, e.g., from a mixed monomer solution by removal of the solvent, we expect to obtain separate crystals of pure monomer types 1, 2, 3, …, n next to each other. In the random walk model, each crystal is equivalent to a very long chain (with a very large length L), while no monomer is present (M = 0). At the same time, we have to consider n = 1 for the dominant part of the random walk, since only a single type of monomer is found within each individual crystal. Given this, Equation (3) simplifies to Sr = k (2N/L) ln L, an entropy term that reflects the variability of the possible contact points between the different crystals in the course of the random walk. In effect, this leads to a very low sequential entropy Sr and, correspondingly, to a high degree of order given by 1/Sr. This situation could be assigned to point 2 in Figure 1.
- (3)
- Random chains, no monomer. In the case of random chains (p = 1) in the complete absence of monomers (M = 0), Equation (3) turns into Sr = k (2N/L) ln L + k N ln n, a term largely dominated by the mixing entropy of the N units, as all of them are random over the full pathway. Due to the variability of the chain contacts, the entropy contribution is slightly larger than in case 1 and therefore leads to a slightly lower sequential order 1/Sr. This situation is generally referred to as the asphalt problem [18] and would correspond to point 3 (Figure 1).
- (4)
- Defined chains, no monomer. In a system without any residual monomer (M = 0) and with fully defined chains of common sequence (p = 0), Equation (3) reduces to Sr = k (2N/L) ln L, an entropy term of the same form as in case 2, but this time reflecting the possible contact points between the chains in the course of the random walk. Depending on the average chain length L, the resulting entropy contribution Sr is significantly smaller than in case 1. This situation reflects the product of an extremely successful evolution with a high sequential order 1/Sr, as indicated by point 4 in Figure 1.
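The limiting expressions quoted in cases 1 to 4 can be checked numerically. The following Python sketch evaluates only those limiting forms, not the full Equation (3); the function names and the example chain length L = 100 are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant k in J/K
N_A = 6.02214076e23  # number of units in one mol


def s_mixing(n_units, n_types):
    """Case 1: Sr = k N ln n."""
    return K_B * n_units * math.log(n_types)


def s_contacts(n_units, chain_length):
    """Cases 2 and 4: Sr = k (2N/L) ln L."""
    return K_B * (2 * n_units / chain_length) * math.log(chain_length)


def s_random_chains(n_units, chain_length, n_types):
    """Case 3: Sr = k (2N/L) ln L + k N ln n."""
    return s_contacts(n_units, chain_length) + s_mixing(n_units, n_types)


L_EXAMPLE = 100  # assumed average chain length, for illustration only
print(f"case 1:      {s_mixing(N_A, 4):.3f} J/(K mol)")
print(f"cases 2, 4:  {s_contacts(N_A, L_EXAMPLE):.3f} J/(K mol)")
print(f"case 3:      {s_random_chains(N_A, L_EXAMPLE, 4):.3f} J/(K mol)")
```

For one mol of units and n = 4, the first line reproduces the value of about 11.526 J/(K∙mol) given in case 1. With the assumed chain length, the contact term of cases 2 and 4 is far smaller, and case 3 is slightly larger than case 1, in line with the ordering described above.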
4. Defining the Complexity of the System—The Size of the Reproducing Algorithm
- (a)
- The algorithm could create a list of all n^L possible permutations for a chain of length L, and then assign to each an integer between 0 and N/L, the upper limit being defined by the average number of chains in the total system. With a large number of possible permutations, these values would require the largest part of the code. Therefore, the value for the complexity can be approximated by Equation (5): c1 ≈ n^L log2 (N/L) (in bit). The precise value for c could be larger, as the short code for the creation of the list of permutations must be accounted for, but it could also be smaller, as many zeros or small values on the list of integers could be compressed in an optimized coding.
- (b)
- The algorithm could contain a list of every single sequence of every single chain in the system. Since all chains would add up to an overall length of N, the complexity could then be approximated by Equation (6): c2 ≈ N log2 n (in bit).
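The two coding strategies (a) and (b) can be compared with a short Python sketch. It rests on two assumptions: strategy (a) is taken to cost roughly log2 (N/L) bit for each of the n^L list entries, as read off the description above, and the few bytes of surrounding program code are neglected. The function names are hypothetical.

```python
import math


def c1_bits(n_types, chain_length, n_units):
    """Strategy (a): one counter per possible sequence of length L (assumed form)."""
    entries = n_types ** chain_length  # n^L possible sequences
    return entries * math.log2(n_units / chain_length)


def c2_bits(n_types, n_units):
    """Strategy (b): every unit of every chain is stored explicitly."""
    return n_units * math.log2(n_types)


def complexity_bits(n_types, chain_length, n_units):
    """The shorter of the two encodings approximates the complexity."""
    return min(c1_bits(n_types, chain_length, n_units), c2_bits(n_types, n_units))


N = 10**6  # assumed system size, for illustration only
for L in (5, 20):
    print(L, c1_bits(4, L, N), c2_bits(4, N), complexity_bits(4, L, N))
```

With these assumed numbers, the counter list of strategy (a) is shorter for short chains (L = 5), whereas for longer chains (L = 20) the term n^L grows so quickly that the explicit listing of strategy (b) becomes the smaller encoding.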
- (1)
- Pure monomer. We assume that no chains are present and the system consists of monomer units of equal relative contributions (r1 = r2 = r3 = … = rn = 1/n). Under these circumstances, the choice of the next monomer in the random walk is fully reproduced by a suitable random number generator that produces the integers 1, 2, 3, …, n with equal probability. This random number generator could be a subroutine that is called N times in a loop. If run repeatedly for an infinite number of times, this program would fully reproduce the statistics of a corresponding infinite number of consecutive random walks. The overall size of the program code could be limited to a few bytes, corresponding to a very low degree of complexity. This situation is represented by point 1 in Figure 1.
- (2)
- Crystals. If we induce crystallization, e.g., from a mixed monomer solution by removal of the solvent, we expect to obtain separate crystals of pure monomer types 1, 2, 3, …, n next to each other. Simulating the random walk, the program would run the random number generator once to define the type i of the starting crystal. It would then assume a random walk through the same units i for an average length determined by the average size of the crystals. After that, the type of the following crystal i’ would be determined by the random generator, and so on, until the full total length N is reached. Again, if this routine is repeated a very large number of times, the statistics of the results would be identical to those for random walks in the real system. The size of the program code would only be slightly larger than in case 1, so the complexity of this situation is still very low. In Figure 1, it could be assigned to point 2.
- (3)
- Completely random chains, no monomer. In the case of random chains (p = 1) in the complete absence of monomers (M = 0), the random walk may initially resemble the result of case 1 (pure monomer). However, with an increasing number of random walks over time, the statistical result of the corresponding sequences will reflect the given sequences of the chains. Even though they have formed randomly, they will determine the total statistics of an infinite set of random walks over time. Therefore, an algorithm that is meant to reproduce these statistics must contain the sequence of every single given chain, together with a subroutine making the random decisions on where to start and where to connect from chain to chain. This means that its code necessarily contains all sequences and hence is determined either by c1 in Equation (5) or by c2 in Equation (6), depending on which result is smaller. Some additional code is required for the hopping between chains. In any case, the result will be significantly larger than in cases 1 and 2. The system may have formed randomly; nevertheless, its state is quite complex. In Figure 1, this situation would correspond to point 3.
- (4)
- Completely defined chains, no monomer. In a system without any residual monomer (M = 0) and with fully defined chains of a given sequence (p = 0), the random walk will follow the defined sequence (or parts of it) repeatedly. The starting unit and the connecting positions between the chains are the only random points of the pathway. Correspondingly, the reproducing algorithm would have to include the defined sequence of the chain together with an occasional call of the random number generator. Hence, the number of bits or bytes necessary for this algorithm depends on the length L of the defined chain and is slightly larger than c = L ∙ log2 n (in bit). This system’s degree of complexity may be smaller than in case 3 (since N is generally larger than L), but it definitely exceeds that of cases 1 and 2 (point 4 in Figure 1).
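To make the notion of a "reproducing algorithm" more concrete, the programs described for cases 1, 2 and 4 might look like the following Python sketch. All function names are hypothetical, the crystal run length is fixed at its average value for simplicity, and the walk through a defined chain is assumed to run from a random entry point to the end of the stored sequence; none of these choices is prescribed by the text.

```python
import random


def pure_monomer_walk(n_units, n_types, rng):
    """Case 1: one call of the random number generator per unit."""
    return [rng.randint(1, n_types) for _ in range(n_units)]


def crystal_walk(n_units, n_types, mean_crystal_size, rng):
    """Case 2: draw a crystal type, repeat it, then draw the next type."""
    walk = []
    while len(walk) < n_units:
        unit_type = rng.randint(1, n_types)
        # Assumption: every run has exactly the average crystal size.
        run = min(mean_crystal_size, n_units - len(walk))
        walk.extend([unit_type] * run)
    return walk


def defined_chain_walk(n_units, sequence, rng):
    """Case 4: replay the one stored sequence from random entry points."""
    walk = []
    while len(walk) < n_units:
        entry = rng.randrange(len(sequence))  # random contact position
        walk.extend(sequence[entry:])
    return walk[:n_units]


rng = random.Random(0)  # fixed seed so that repeated runs are comparable
print(pure_monomer_walk(12, 4, rng))
print(crystal_walk(12, 4, 4, rng))
print(defined_chain_walk(12, [1, 2, 3, 4, 2], rng))
```

The first two functions need no stored data beyond a few parameters, which corresponds to the very low complexity of cases 1 and 2. The third must carry the full `sequence` of length L, and a program for case 3 would have to carry the sequences of all chains, which is what drives its code size toward c1 or c2.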
5. Model Calculations
6. Summary and Outlook
Supplementary Materials
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Woese, C. The Genetic Code; Harper and Row: New York, NY, USA, 1967; pp. 179–195.
- Crick, F.H.C. The origin of the genetic code. J. Mol. Biol. 1968, 38, 367–379.
- Orgel, L.E. Evolution of the genetic apparatus. J. Mol. Biol. 1968, 38, 381–393.
- Kruger, K.; Grabowski, P.J.; Zaug, A.J.; Sands, J.; Gottschling, D.E.; Cech, T.R. Self-splicing RNA: Autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 1982, 31, 147–157.
- Guerrier-Takada, C.; Gardiner, K.; Marsh, T.; Pace, N.; Altman, S. The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 1983, 35, 849–857.
- Gilbert, W. The RNA world. Nature 1986, 319, 618.
- Cech, T.R. The RNA worlds in context. Cold Spring Harb. Perspect. Biol. 2012, 4, a006742.
- Neveu, M.; Kim, H.J.; Benner, S.A. The “strong” RNA world hypothesis: Fifty years old. Astrobiology 2013, 13, 391–403.
- Lehman, N. The RNA World: 4,000,000,050 years old. Life 2015, 5, 1583–1586.
- Higgs, P.; Lehman, N. The RNA world: Molecular cooperation at the origins of life. Nat. Rev. Genet. 2015, 16, 7–17.
- Pressman, A.; Blanco, C.; Chen, I.A. The RNA World as a Model System to Study the Origin of Life. Curr. Biol. 2015, 25, R953–R963.
- Orgel, L.E. Prebiotic chemistry and the origin of the RNA world. Crit. Rev. Biochem. Mol. Biol. 2004, 39, 99–123.
- Atkins, J.F.; Gesteland, R.F.; Cech, T. The RNA World: The Nature of Modern RNA Suggests a Prebiotic RNA World; Cold Spring Harbor Laboratory Press: Plainview, NY, USA, 2006.
- Johnston, W.K.; Unrau, P.J.; Lawrence, M.S.; Glasner, M.E.; Bartel, D.P. RNA-catalyzed RNA polymerization: Accurate and general RNA-templated primer extension. Science 2001, 292, 1319–1325.
- Robertson, M.P.; Joyce, G.F. The origins of the RNA world. Cold Spring Harb. Perspect. Biol. 2012, 4, a003608.
- Mayer, C. Life in the context of order and complexity. Life 2020, 10, 5.
- Mayer, C. Spontaneous formation of functional structures in messy environments. Life 2022, 12, 720.
- Benner, S.A.; Kim, H.J.; Carrigan, M.A. Asphalt, water, and the prebiotic synthesis of ribose, ribonucleotides, and RNA. Acc. Chem. Res. 2011, 45, 2025–2034.
- Higgs, P. Chemical evolution and the evolutionary definition of life. J. Mol. Evol. 2017, 84, 225–235.
- Higgs, P. The effect of limited diffusion and wet-dry cycling on reversible polymerization reactions: Implications for prebiotic synthesis of nucleic acids. Life 2016, 6, 24.
- Deamer, D.; Dworkin, J.P.; Sandford, S.A.; Bernstein, M.P.; Allamandola, L.J. The first cell membranes. Astrobiology 2002, 2, 371–381.
- Deamer, D. The role of lipid membranes in life’s origin. Life 2017, 7, 5.
- Damer, B.; Deamer, D. Coupled phases and combinatorial selection in fluctuating hydrothermal pools: A scenario to guide experimental approaches to the origin of cellular life. Life 2015, 5, 872–887.
- Wächtershäuser, G. Before enzymes and templates: Theory of surface metabolism. Microbiol. Rev. 1988, 52, 452–484.
- Hazen, R.M.; Sverjensky, D.A. Mineral surfaces, geochemical complexities, and the origin of life. Cold Spring Harb. Perspect. Biol. 2010, 2, a002162.
- Vanchurin, V. Towards a theory of machine learning. Mach. Learn. Sci. Technol. 2021, 2, 035012.
- Vanchurin, V.; Wolf, Y.I.; Katsnelson, M.O.; Koonin, E.V. Towards a theory of evolution as multilevel learning. Proc. Natl. Acad. Sci. USA 2022, 119, e2120037119.
- Vanchurin, V.; Wolf, Y.I.; Koonin, E.V.; Katsnelson, M.O. Thermodynamics of evolution and the origin of life. Proc. Natl. Acad. Sci. USA 2022, 119, e2120042119.
- Kolmogorov, A.N. On tables of random numbers. Sankhya. Ser. 1963, 25, 369–375.
- Kolmogorov, A.N. On tables of random numbers. Theor. Comp. Sci. 1998, 207, 387–395.
- Kolmogorov, A.N. Logical basis for information theory and probability theory. IEEE Trans. Inform. Theor. 1968, 14, 662–664.
- Li, M.; Vitányi, P. Preliminaries. In An Introduction to Kolmogorov Complexity and Its Applications, Texts in Computer Science; Springer: New York, NY, USA, 2008.
- Chaitin, G.J. On the simplicity and speed of programs for computing of infinite sets of natural numbers. J. Assoc. Comp. Machin. 1969, 16, 407–422.
- Kozyrev, S.V. Genome as a functional program. Lobachevskii J. Math. 2020, 41, 2326–2331.
- Kauffman, S.A. The Origins of Order: Self-Organization and Selection in Evolution; Oxford University Press: Oxford, UK, 1993.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mayer, C. Order and Complexity in the RNA World. Life 2023, 13, 603. https://doi.org/10.3390/life13030603