Next Article in Journal
Role of Mineral Surfaces in Prebiotic Chemical Evolution. In Silico Quantum Mechanical Studies
Next Article in Special Issue
Dynamical Task Switching in Cellular Computers
Previous Article in Journal
Unevolved De Novo Proteins Have Innate Tendencies to Bind Transition Metals
Previous Article in Special Issue
Real-World Synthetic Biology: Is It Founded on an Engineering Approach, and Should It Be?
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Thermodynamics of Duplication Thresholds in Synthetic Protocell Systems

by
Bernat Corominas-Murtra
Institute of Science and Technology Austria, Am Campus 1, A-3400 Klosterneuburg, Austria
Submission received: 31 October 2018 / Revised: 8 January 2019 / Accepted: 9 January 2019 / Published: 15 January 2019
(This article belongs to the Special Issue Synthetic Biology: From Living Computers to Terraformation)

Abstract

:
Understanding the thermodynamics of the duplication process is a fundamental step towards a comprehensive physical theory of biological systems. However, the immense complexity of real cells obscures the fundamental tensions between energy gradients and entropic contributions that underlie duplication. The study of synthetic, feasible systems reproducing part of the key ingredients of living entities but overcoming major sources of biological complexity is of great relevance to deepen the comprehension of the fundamental thermodynamic processes underlying life and its prevalence. In this paper an abstract—yet realistic—synthetic system made of small synthetic protocell aggregates is studied in detail. A fundamental relation between free energy and entropic gradients is derived for a general, non-equilibrium scenario, setting the thermodynamic conditions for the occurrence and prevalence of duplication phenomena. This relation sets explicitly how the energy gradients invested in creating and maintaining structural—and eventually, functional—elements of the system must always compensate the entropic gradients, whose contributions come from changes in the translational, configurational, and macrostate entropies, as well as from dissipation due to irreversible transitions. Work/energy relations are also derived, defining lower bounds on the energy required for the duplication event to take place. A specific example including real ternary emulsions is provided in order to grasp the orders of magnitude involved in the problem. It is found that the minimal work invested over the system to trigger a duplication event is around ~ 10 13 J , which results, in the case of duplication of all the vesicles contained in a liter of emulsion, in an amount of energy around ~ 1 kJ . Without aiming to describe a truly biological process of duplication, this theoretical contribution seeks to explicitly define and identify the key actors that participate in it.

1. Introduction

How living beings have been able to overcome the entropic forces to develop increasingly complex individuals which, in turn, maintain their functionality is an open question and one of the hardest problems of modern science [1,2,3,4,5,6,7,8,9,10,11,12]. The exhaustive analysis of the energy flows in real living entities collides with the extreme complexity of even the simplest bacteria. Therefore, one must first set what are the physical defining properties of living beings and, then, try to attack the problem by cutting it into pieces. Each piece should incorporate a key or several key features, simple enough to accept a rigorous analysis, but complex enough to shed light to certain facets of the problem. The later integration of all pieces, however, will likely be much more than building a puzzle, for it is clear that the cross dependencies between all the building blocks will introduce an additional layer of complexity.
Following this philosophy, we will focus here on two crucial properties of living beings, according to accepted definitions of life discussed among scholars [13,14,15,16,17]. Specifically, we will concentrate on systems able to: ( i ) Capture material resources and turn them into building blocks by the use of externally provided free energy—and eventually undergo a duplication cycle and ( i i ) Keep its components together and distinguish itself from the environment. It is assumed that the compartment contains the metabolic and information system—if any. Our simplified system, thus, will lack two crucial features of living beings, namely, ( i i i ) To process and transmit inheritable information to progeny and ( i v ) To undergo Darwinian evolution through variation of the copied inheritable information and a successive selection of the better progeny. We will thus focus on the thermodynamic properties of the duplication process, and we will skip all the complexity arising from other phenomena. It is worth to recall here that this kind of approach, where the essential physics of the duplication problem is addressed has a long history, dating back to the late thirties of the 20th century, with the highly influential works of N. Rashevsky [1].
In contrast to the usual top-down approaches followed in biology, we will address this problem using a bottom-up approach. In such kind of approaches to life-related phenomena, physical building blocks and chemical processes are externally assembled and triggered, creating artificial, synthetic entities that mimic some of the crucial properties of living beings. Consistently, this approach has been named Artificial Life [16,18,19,20]. Artificial cells, or, protocells are usually composed by emulsions [21] made of mixtures of lipids, precursors, and water [15,16,19,20,22,23,24,25,26,27,28]. The foundation of this approach is based on three main starting points: First, it provides a framework where energy imbalances trigger the emergence of cell like aggregates [21], second, it is possible to externally drive simplified metabolic reactions [15,16,26,28], and, third, it uses the same type of building blocks—mainly lipids—that compose an important part of the structure of most of the living organisms [29]. Crucial to our aims, it is worth to remark a couple of recent results: First, numerical approaches have shown that duplication dynamics as a consequence of energy imbalances due to geometrical frustration is expected in those systems, if properly driven out of equilibrium [30]. Second, recent experiments succeeded in duplicating real artificial protocells through a specific oil-in-water droplet system with replicating information templates [31]. This result is certainly remarkable, but our approach does exclude the role of any information/replication dynamics. In doing so, we explore how far can we go by just taking into account general stability properties and energy imbalances to explain and characterize the duplication process. The work presented here runs in parallel to an interesting complementary approach taken in [32], where the kinetics involved in the duplication events of synthetic systems was studied in detail.
In this paper we will work with a generic emulsion system [21]. We will make use of the well understood free energy landscape of such systems, where the contributions coming from aggregate geometry and size have been long studied [21,33,34], as well as the non-trivial contributions of the entropic terms [35,36]. The impact of a changing energy landscape—which eventually can favour a duplication event—will be studied from a generic non-equilibrium situation making use of modern methods arising from the emerging field of Stochastic Thermodynamics [37,38,39,40,41,42]. Within this framework, the evolution of the system can be studied following the individual trajectories in the phase space and, importantly, exact relations between energy and work can be obtained, even in out of equilibrium cases. In addition, relations between energy, entropy and information arise naturally [43,44].
The remainder of the paper is organized as follows: In the next section, we describe the thermodynamics of the abstract emulsion system in detail. We derive its free energy landscape, Section 2.1, the equilibrium distributions, Section 2.2, and the detailed balance condition over transitions, Section 2.3. Next, in Section 2.4, we expose the generic protocol that drives the system towards the occurrence of a duplication event. We end the section where the system is presented by exploring the orders of magnitude involved in these kind of systems, Section 2.5. Here we analyze the quantitative values of the thermodynamic functionals presented generically in the previous sections for a real microemulsion system. The thermodynamic analysis of the duplication thresholds is the core of section III. First, we derive a general relation for duplication probabilities, Section 3.1. Then, in Section 3.2, we explore the consequences of this result for a system evolving in a quasi-static fashion. Section 3.3 generalizes the previous equilibrium approach by providing an exact equality between probabilities of duplication thresholds in a specific non-equilibrium scenario, in which the relaxation process that may eventually lead to duplication happens between two states which may not be in equilibrium. This equivalence leads us to define general duplication scenarios and derive the general conditions of duplication, as well as the amount of work invested over the system to trigger a duplication event, and the conditions for the perpetuation of the duplication cycle. Section 3.4 refers to the free energy/entropy relations for the perpetuation of the duplication cycle in time. The final section is devoted to discuss the implications of the presented results. The whole paper is aimed to be self-contained and details of the derivations are provided in the Appendix A to make it understandable to non-specialized audiences.

2. The System

Our system is conceived as being an abstract emulsion in a kind of reaction tank of volume V syst connected to a heat reservoir at inverse temperature β = 1 T —we set k B = 1 . Let X = ( X 1 , , X L ) , where X i is a specific kind of lipid species populating the system and Y = ( Y 1 , , Y P ) , where Y i is a specific kind of precursor/surfactant species populating the system. Let X tot = ( X 1 , tot , , X L , tot ) , Y tot = ( Y 1 , tot , , Y P , tot ) the total amount of molecules of the different species of lipids and precursors that lie in aqueous solution inside our volume. We refer to X tot , Y tot as the boundary conditions. As we shall see, they may change in time, under the action of an external protocol.
Due to the hydrophobic/hydrophilic nature of the surfactant molecules, we assume that (part of them) tend to aggregate in spheroidal compartments. Surfactants are supposed to populate the surface of the aggregates. No assumptions are made on the specific nature of the membranes or the interior of the aggregates, leaving the discussion always in a general plane. A state σ n of our system is described by a 3-tuple:
σ n σ n ( X a , Y a , n ) ,
where X a = ( X 1 , a , , X L , a ) and Y a = ( Y 1 , a , , Y P , a ) are the amount of lipids and precursors forming aggregates, respectively, and n the number of aggregates present in the volume. In general, and if no confusion can arise, we refer to a given state as σ n instead of σ n ( X a , Y a , n ) for notational simplicity. We keep the label subscript “ n ” accounting for the number of aggregates only for notational convenience. When we introduce time dependence, we write σ n t σ n ( X a ( t ) , Y a ( t ) , n ( t ) ) . Not all molecules will be part of the aggregates. Therefore, we must account for these molecules in bulk. Consistently, given a state σ n ( X a , Y a , n ) occuring under the boundary conditions X tot , Y tot , we wil have that X b = X tot X a = ( X 1 , b , , X L , b ) and Y b = Y tot Y a = ( Y 1 , b , , Y P , b ) are the amount of lipids and precursors in bulk, respectively.
A macrostate or coarse-grained state σ ˜ n is defined as the 4-tuple:
σ ˜ n σ ˜ n ( X tot , Y tot , n , p ( σ n | σ ˜ n ) ) ,
where p ( σ n | σ ˜ n ) is the probability distribution of finding σ n as a particular realization of this macrostate. This macrostate can be realized through any state containing X tot , Y tot and n protocellular aggregates following the distribution p ( σ n | σ ˜ n ) . In case of time dependence we write σ ˜ n t σ ˜ n t ( X tot ( t ) , Y tot ( t ) , n , p ( σ n t | σ ˜ n t ) ) .

2.1. Gibbs Free Energy Landscape

The thermodynamic landscape of our system is given by the Gibbs free energy of the state σ n ,
G ( σ n ) G X tot , Y tot ( σ n ) .
The Gibbs free energy is always defined over states of the system and depends on both the state σ n and the boundary conditions X tot , Y tot . Therefore, the same state will have energy changes if the boundary conditions change. Each macrostate has a uniquely defined free energy functional. For notational simplicity, we drop the subscript X tot , Y tot , if no confusion arises.
The complex nature of these type of emulsions results in a free energy functional with several blocks, which we construct step by step. First, we focus on the free energy contribution of a single protocellular aggregate, containing X lipids, Y , and precursors, G a :
G a ( X i , Y i ) = i L Δ μ X i X i + i P Δ μ Y i Y i + G geo ,
where Δ μ X i and Δ μ Y i are the changes in chemical potential when moving lipids and surfactants from bulk into the i-th aggregate, and G geo is a geometric term expressing shape and surface contributions to the free energy of the aggregate. This geometric term accounts for the membrane properties of the system, and is computed according to the existence of a minimum energy configuration or perfect protocellular aggregate, which can be directly computed as the optimal packing from the knowledge to the sizes and geometries of the precursor molecules. The geometrical term thus reads:
G geo = γ A + α A + κ A ( H H 0 ) 2 d A ,
where γ is the surface tension, α the compressibility coefficient, and κ the elastic bending modulus of the lipid membrane. The integral is the second order expansion of the contribution of the Helfrisch Hamiltonian to the overall free energy, being H the curvature of the membrane—as a function of some coordinates parametrizing the membrane surface—of the current aggregate and H 0 the curvature of the perfect aggregate. The integral is computed over the whole area of the membrane, A [33,34].
Once we have properly characterized the free energies of a single aggregate, we proceed to construct the free energy of the whole state σ n . The next task will be to compute the entropy for a system in the state σ n = σ n ( X a , Y a , n ) under the boundary conditions X tot , Y tot . To compute the entropy of such state, we apply directly Boltzmann’s definition over the amount of configurations the state σ n can adopt, Ω X tot , Y tot ( σ n ) [45]:
S ( σ n ) = log Ω X tot , Y tot ( σ n ) .
Clearly, S ( σ n ) S X tot , Y tot ( σ n ) . However, we do not write this dependence explicitly for the sake of readability, if no confusion can arise. This entropic term has two contributions, the translational entropy and the configurational entropy. We start with the translational contribution. We consider that the system of n indistinguishable aggregates has 3 n degrees of freedom and that each aggregate diffuse around within a volume V syst = n V a and that m is an appropriate length scale for such a diffusive process. Accordingly, one has that the amount of configurations provided by the translational term is:
1 n ! n V a m n .
We emphasize that, in the approach take here, m has been chosen as a typical volume unit whose purpose is to render the argument of the logarithm dimensionless—for a deeper discussion on the choice of the right length scale see [35,36]. For each configuration described above, we must account for the potential degeneracy of states, or, in other words, the amount of configurations given by the amount of molecules in bulk and forming the aggregates. For each chemical species, e.g., the i-th lipid, this amount of configurations is
X i , tot X i , a .
Therefore, assuming that there are no cross dependencies among the different chemical species, one has that the amount of configurations of molecules in bulk and aggregates is:
i L k P X i , tot X i , a Y i , tot Y i , a .
Considering these two contributions, the entropy term reads:
S ( σ n ) = log 1 n ! n V a v m n i L k P X i , tot X i , a Y i , tot Y i , a .
The overall entropy of the state σ n = σ n ( X a , Y a , n ) , under the boundary conditions given by X tot , Y tot , S ( σ n ) = log Ω X tot , Y tot ( σ n ) , is:
S ( σ n ) = n log V a v m + + i L log X i , tot X i , a + i P log Y i , tot Y i , a ,
where we used the fact the log ( a b ) = log a + log b and the Stirling approximation for the factorial for the first term, namely log n ! n log n n . Collecting all the above ingredients, we have that the Gibbs free energy of the system in the state σ n = σ n ( X a , Y a , n ) under boundary conditions X tot , Y tot becomes:
G ( σ n ) = i L μ X i X i , tot + i P μ Y i Y i , tot + + i n G a ( X i , Y i ) T S ( σ n ) ,
with the standard chemical potentials μ X i and μ Y i of lipids and precursors, respectively.

2.2. Helmholtz Free Energy

Let the system be subject to the boundary conditions X tot , Y tot . In equilibrium, the probability that the system is in the particular state σ n , belonging to the macrostate σ ˜ n is given by the Boltzmann distribution, p ( σ n | σ ˜ n ) [45]:
p ( σ n | σ ˜ n ) = e β G ( σ n ) Z ( σ ˜ n ) ,
being Z ( σ ˜ n ) the partition function, namely:
Z ( σ ˜ n ) = σ n σ ˜ n e β G ( σ n ) .
Accordingly, the Helmholtz free energy of the macrostate σ ˜ n , F ( σ ˜ n ) is:
F ( σ ˜ n ) = log Z ( σ ˜ n ) = G σ ˜ n 1 β H ( σ ˜ n ) ,
being σ ˜ n the average over all states of the macrostate and H ( σ ˜ n ) the entropy of the macrostate, namely:
H ( σ ˜ n ) = σ n σ ˜ n p ( σ n | σ ˜ n ) log p ( σ n | σ ˜ n ) ,
where p ( σ n | σ n ˜ ) is now defined as:
p ( σ n | σ ˜ n ) = e β ( G ( σ n ) + F ( σ ˜ n ) ) .
We point out that we will refer to a given probability distribution associated to a macrostate σ ˜ n either as p ( σ n | σ ˜ n ) or p | σ ˜ n , indistinctly. We finally recall that we assume that the equilibrium distribution macrostate σ ˜ n is such that:
argmin σ n σ ˜ n G X tot , Y tot ( σ n ) = argmin σ G X tot , Y tot ( σ ) ,
where we emphasized the dependency on the boundary conditions X tot , Y tot only for clarity. In words, we assume that the equilibrium distribution is defined around the absolute minimum of Gibbs free energies, and that such a minimum is unique.

2.3. Detailed Balance Condition in Duplication

The process of duplication/fusion of aggregates is of special interest for us, since it is the basis of duplication. It is assumed to satisfy the following transition rates between states:
σ n ( X a , Y a , n ) k n + n σ n + 1 ( X a , Y a , n + 1 ) σ n ( X a , Y a , n ) k n n σ n 1 ( X a , Y a , n 1 ) ,
where the kinetic constants relate as:
k n = k n + e β δ G ( σ n , σ n + 1 ) ,
where δ G ( σ n , σ n + 1 ) G ( σ n + 1 ) G ( σ n ) . Detailed balance condition is also assumed for any other transition between states. Therefore, for any two states σ n σ ˜ n and σ n + 1 σ ˜ n + 1 , thanks to the detailed balance condition given in Equation (8) and assumed for all transitions, one has that, between two arbitrary states σ , σ :
p ( σ σ ) p ( σ σ ) k n k n + = e β δ G ( σ , σ ) .
Importantly, we recall that the functional G must be computed under the same boundary conditions X tot , Y tot in any evaluation of the difference, i.e.,:
δ G ( σ , σ ) = G ( σ ) G ( σ ) G X tot , Y tot ( σ ) G X tot , Y tot ( σ ) .

2.4. The Driving Protocol

Let us assume that at time t = 0 the system is in contact to a thermal reservoir at inverse temperature β , and in an equilibrium macrostate σ ˜ n 0 , that is—see Figure 1):
p ( σ n 0 | σ ˜ n 0 ) = 1 Z ( σ ˜ n 0 ) e β G ( σ n 0 ) .
From this moment on, we run a protocol that changes the energy landscape, without separating the system from the heath bath neither changing the whole system’s volume, V syst . This protocol runs from t = 0 to t = τ —see Figure 1b,c. For example, suppose that we add new lipids and that we switch on a light that triggers a metabolic reaction that transforms lipids into precursors, thereby creating new surfactants. We call this protocol ψ ( t ) . In general, it will affect the L + P variables of our system. Therefore, the protocol ψ ( t ) consists on a list of—maybe interdependent—protocols:
ψ ( t ) = ( ϕ 1 ( t ) , , ϕ L ( t ) , ϕ L + 1 ( t ) , , ϕ L + P ( t ) ) ,
where the first L elements ϕ 1 ( t ) , , ϕ L ( t ) explicit the action of the protocol on the lipids X 1 , , X L abundance and the last P elements ϕ L + 1 ( t ) , , ϕ L + P ( t ) explicit the change due to the protocol on the precursors Y 1 , , X P abundance. Let us be more specific on the action of the protocol. Assume that at time t the boundary conditions of our system are given by X tot ( t ) , Y tot ( t ) . The application of the protocol a for short time interval [ t , t + δ ] to the boundary conditions, denoted by δ ψ ( X tot ( t ) , Y tot ( t ) ) will lead the boundary conditions to change as:
δ ψ ( X tot ( t ) , Y tot ( t ) ) = ( X 1 ( t ) + δ ϕ 1 ( t ) , , Y 1 ( t ) + δ ϕ L + 1 ( t ) ϕ L + 1 ( t ) , ) .
The above transformation of the boundary conditions will lead the system to change its macrostate, from σ ˜ t to σ ˜ t + δ . This transition can be done through a set of stochastic trajectories, which will be referred to as Σ [ t , t + δ ] . At τ the system will be at the macrostate σ ˜ n + 1 τ and we will stop the protocol—see Figure 1d—letting the system relax towards an equilibrium state, achieved at time τ —see Figure 1d. The distribution of states p | σ ˜ n + 1 τ is assumed to obey the standard equilibrium Boltzmann statistics:
p ( σ n + 1 τ | σ ˜ n + 1 τ ) = 1 Z ( σ ˜ n + 1 τ ) e β G ( σ n + 1 τ ) .
We assume that a duplication event has taken place in the time interval [ τ δ , τ ] and that the relaxation process happening at the interval [ τ , τ ] does not imply a change in the number of protocell aggregates. We remind that the whole process takes place in contact to a heat reservoir with inverse temperature β and at a constant volume V syst .

2.5. An Example: Ternary Emulsions

To grasp the orders of magnitude involved in our problem, we take a particular example of the above general system, in line to the one described in [26,27]. From this example, we perform a rough estimation of the orders of magnitude involved in the computation of the free energies of a single aggregate. For the sake of readability and extension, the computations provided here are not as detailed as in the other parts of the paper. We refer the interested reader to [21,26,27,46,47,48] for the detailed discussions on the orders of magnitude and potential experimental set ups.
Suppose that we have a Winsor type IV ternary emulsion made of a single lipid, X decanoic acid anhydride, ( C 9 H 19 - CO - O - OC - C 9 H 19 ), a single precursor, Y decanoic acid, ( C 9 H 19 - COOH ), and water. Equation (1) now reads:
G a ( X , Y ) = Δ μ X X + Δ μ Y Y + G geo .
Δ μ Y can be calculated from their partition coefficient—i.e., the fraction of lipids found in bulk solution as opposed to the aggregates. Estimations give this value to be around 14% [46]. If k Y + / k Y is the ratio between precursor molecules going from bulk to aggregates and precursor molecules going from aggregates to bulk, this reads:
k Y + k Y 0.14 .
Therefore, using Equation (8), and setting β = 1 / k B T , one can approximate the energy gain of moving a decanoic acid molecule from bulk to aggregate, Δ μ Y , as
Δ μ Y log ( 0.14 ) k B T ,
where k B is the Boltzmann constant, k B 1.38 × 10 23 J / K At T = 300 K , and N A being the Avogadro number, the above equation leads to:
N A Δ μ Y 4.9 kJ / mol .
Since the decanoic acid anhydride ( C 9 H 19 - CO - O - OC - C 9 H 19 ) has two hydrophobic chains, we set
Δ μ X = 2 Δ μ Y 9.8 kJ / mol ,
which in turn evaluates to a partition coefficient of ∼2%. For the geometric term given by Equation (2), we make the assumption that γ , α κ , therefore the contribution of the Helfrisch hamiltonian will not be taken into account. The surface tension and the compressibility parameters, γ , α can be estimated as γ 45.9 mN / m and α 5.80 × 10 45 Nm 3 [26]. We assume a spherical lipid core of X c precursor molecules, whose individual molecular volume V X = 0.54 nm 3 . Thus, the spherical core of the aggregate has a radius, R core ( X c ) , of:
R core ( X c ) 3 V X 4 π X c 1 / 3 .
The whole aggregate, including the surface molecules, displays a radius, R ( X c ) , of
R ( X c ) R core ( X c ) + t ,
where t is the length of the tail of the surfactant molecules, which is considered constant. The optimal number of surfactant molecules, Y ( X c ) for this amount of molecules in the core of the aggregate is then computed as:
Y ( X c ) = 4 π a 0 R 2 ( X c ) .
We assume that the tail length of the surfactants is around t = 1.4 nm and that their effective head area a 0 = 25 Å 2 [21]. The typical radius of oil droplets is around 100 nm leading to a volume of 4.1 × 10 6 nm 3 , i.e., ~ 0.004 femtoliter, which—assuming a typical water-to-oil ratio of 10:1—gives a system volume of 0.044 femtoliter per droplet. Therefore, a milliliter of emulsion has an order of magnitude of 10 13 oil droplets. From the ratio of precursor to droplet volume, it follows that X c 7.62 × 10 6 . With an optimal packing number of surfactants Y ( X c ) computed from Equation (10) and a partition coefficient of 14%, one can estimate a total of Y c = 5.7 × 10 5 surfactant molecules. With these values, a rough estimation of the orders of magnitude of the free energy of a single aggregate whose packing is optimal, G a ( X , Y ) , is given by:
| G a ( X , Y ) | 10 13 J .
This example gives us an orientation about the energy scales involved in our problem.

3. Duplication Thresholds

We proceed now to explore under which circumstances the application of the protocol results into a duplication event. The goal is to obtain an inequality which, when observed, a duplication event is expected to take place. This will be related to the amount of work performed from the protocol. We perform the analysis from a quasi-static approach and from a more general non-equilibrium approach. First of all, we derive a general condition for the transition probabilities among macrostates, which does not require, a priori, equilibrium conditions.

3.1. Transition Probabilities between Macrostates

Now we take a close analysis on the process happening in the interval [ t , t + δ ] , where 0 < t < τ . We drop the indices n because, we consider transitions between any two states, and, by now, there will not be necessarily a duplication event in consideration. At time t the Gibbs free energy landscape suffers a change imposed by the protocol ψ ( t ) . The initial state, σ ˜ t , is therefore perturbed and may no longer be necessarily in equilibrium. The system then relaxes to σ ˜ t + δ , which may not be in equilibrium, too. The boundary conditions ( X tot , Y tot ) are considered constant after the change imposed by the protocol at time t until time t + δ . The probability of jumping from macrostate σ ˜ t to macrostate σ ˜ t + δ is given by:
p ( σ ˜ t σ ˜ t + δ ) = σ t σ t + δ p ( σ t | σ ˜ t ) p ( σ t σ t + δ ) .
Thanks to the detailed balance condition, one can rewrite the backwards transition as:
p ( σ ˜ t + δ σ ˜ t ) = σ t σ t + δ p ( σ t | σ ˜ t ) p ( σ t σ t + δ ) g ( t , t + δ ) ,
where g ( t , t + δ ) is a function that depends on the states, σ t , σ t + δ which, written in a suitable form for further developments, reads:
g ( t , t + δ ) e β δ G ( t , t + δ ) e ln p ( σ t | σ ˜ t ) p ( σ t + δ | σ ˜ t + δ ) .
Now, we derive the probability that we chose a given trajectory σ n τ δ to σ n + 1 τ from the ensemble Σ [ t , t + δ ] of trajectories that go from σ ˜ t to σ ˜ t + δ —see Figure 2. This probability distribution is referred to as p t + δ , and is defined as:
p t + δ ( σ t , σ t + δ ) p ( σ t | σ ˜ t + δ ) p ( σ t σ t + δ ) p ( σ ˜ t σ ˜ t + δ ) .
Conversely, we can define the backwards version of the above probability distribution, namely, p t + δ as:
p t + δ ( σ t + δ , σ t ) p ( σ t + δ | σ ˜ t + δ ) p ( σ t + δ σ t ) p ( σ ˜ t + δ σ ˜ t ) .
The probability p t + δ accounts for the probability of a given trajectory in case of time reversal of the protocol action. It is straightforward to check that both p t + δ and p t + δ are well defined probability distributions—i.e., that they sum up to 1. With the above defined distributions, the above computations lead to:
p ( σ ˜ t + δ σ ˜ t ) p ( σ ˜ t σ ˜ t + δ ) = σ t σ t + δ p t + δ ( σ t , σ t + δ ) g ( t , t + δ ) .
The above equation is the average over all paths of the last element of the sum, namely:
p ( σ ˜ t + δ σ ˜ t ) p ( σ ˜ t σ ˜ t + δ ) = e β δ G ( t , t + δ ) e ln p ( σ t | σ ˜ t ) p ( σ t + δ | σ ˜ t + δ ) Σ [ t , t + δ ] ,
where the brackets denote average over all the microscopic trajectories Σ [ t , t + δ ] between states σ t σ t + δ that realize the transition from macrostate σ ˜ t to macrostate σ ˜ t + δ .

3.2. Quasi-Static Approach

The first exploration corresponds to the situation in which the transitions triggered by the protocol ψ ( t ) are performed quasistatically, that is: They are so slow that all the trajectories Σ [ 0 , τ ] can be considered a succession of equilibrium states. Applying the general relation given by Equation (14) we arrive, after cancellations, at:
p ( σ ˜ t σ ˜ t + δ ) p ( σ ˜ t + δ σ ˜ t ) = Z ( σ ˜ t + δ ) Z ( σ ˜ t ) ,
and we then recover, as expected, the equilibrium relation for the backwards and forwards probabilities:
p ( σ ˜ t σ ˜ t + δ ) p ( σ ˜ t + δ σ ˜ t ) = e β δ F ( t , t + δ ) ,
where δ F ( t , t + δ ) is the increase of the Helmholtz free energies—see Equation (7)—through the interval [ t , t + δ ] :
δ F ( t , t + δ ) = F ( σ ˜ t + δ ) F ( σ ˜ t ) .
As stated in the description of the protocol ψ ( t ) , we assume that a duplication event has taken place in the time interval [ τ δ , τ ] . To study this case, we recover the subscripts n , n + 1 accounting for the number of aggregates in the system. This will imply that, at time τ δ , we had the system in a macrostate σ ˜ n τ δ and that at time τ the system transitioned towards a macrostate state σ ˜ n + 1 τ . For that to happen spontaneously, we need that:
p ( σ ˜ n τ δ σ ˜ n + 1 τ ) > p ( σ ˜ n + 1 τ σ ˜ n τ δ ) .
and, according to Equation (15), one needs that δ F ( t , t + δ ) < 0 , which implies:
F ( σ ˜ n + 1 τ ) < F ( σ ˜ n τ δ ) .
If we take a closer look to the structure of the Gibbs free energy given in Equation (4), we can refine the above condition. Indeed, since the boundary conditions X tot , Y tot do not change during the time interval [ τ δ , τ ) , the contributions to the change of the free energies will only correspond to the free energies of the aggregates, due to size and frustration given in Equation (1), and their associated entropic terms, given by the Shannon entropies of the macrostate and the translational and configurational entropies of the states, given in Equation (3). After cancellations, we arrive to:
β δ G a < δ S ( τ δ , τ ) ,
where the increase on the average free energies δ G a is defined from Equation (1) as:
δ G a i n + 1 G a ( X i , Y i ) σ ˜ n + 1 τ i n G a ( X i , Y i ) σ ˜ n τ δ ,
being the averages taken over the whole set of states belonging to σ ˜ n + 1 τ and σ ˜ n τ δ , respectively, and the entropic gradient δ S ( τ δ , τ ) is defined as:
δ S ( τ δ , τ ) = δ S ( τ δ , τ ) + δ H ( τ δ , τ ) .
where δ S ( τ δ , τ ) is the increase of configurational and translational entropies for each state, as given in Equation (3):
δ S ( τ δ , τ ) = S ( σ n + 1 τ ) σ ˜ n + 1 τ S ( σ n τ δ ) σ ˜ n τ δ .
And δ H ( τ δ , τ ) the increase on Shannon entropies, namely:
δ H ( τ δ , τ ) = H ( σ ˜ n + 1 τ ) H ( σ ˜ n τ δ ) .
Knowing the evolution of free energies, and assuming the quasi-static approach, one can easily compute the amount of work performed by the protocol ψ ( t ) to trigger a duplication event. Indeed, in the quasi static approach, the amount of work δ w ( τ δ , τ ) invested over the system can be identified with the Helmholtz free energy gradients, namely:
δ w ( τ δ , τ ) = δ F ( τ δ , τ ) .
In consequence, the amount of work performed over the system along the protocol, W ψ is, assuming a continuous approach ( δ F ( t , t + δ ) d F ( t ) ):
W ψ = 0 τ d F ( t ) = Δ F ( 0 , τ ) ,
as expected in the case of equilibrium transformations [49].

3.3. Non Equilibrium Approach

We now explore a more general situation, in which the states visited along the trajectory are not necessarily in equilibrium, and thus, an extra amount of dissipated heat is expected, deforming the energy/work relations derived in the previous section [37,38,50]—see Figure 3. Our approach does not consider explicitly other sources of non-equilibrium behaviour, and is focused on the exploration of the potentials under the assumption that the final and initial states may not be equilibrium ones. In particular, the hypothesis of detailed balance between different states of the system is always assumed to hold at the level of microscopic transitions.
Specifically, let us consider the case in which at time t the boundary conditions X tot , Y tot , suffer a sudden change imposed by the protocol ψ ( t ) . We observe that the change induced by the protocol to the boundary conditions implies a change on the free energy landscape described by the Gibbs free energy in Equation (4). The initial state, σ ˜ t , is therefore perturbed and is not necessarily considered in equilibrium. We consider that, in this irreversible destabilization of the system, an amount of entropy like:
β Q ψ ( t ) ,
is produced, due to the non-equilibrium transformation that is associated to the perturbation of the system after the application of the protocol. This part will not be studied in detail, since it plays no role in the duplication process. The system then moves to σ ˜ t + δ , which may not be in equilibrium, too. As above, the boundary conditions ( X tot , Y tot ) are considered constant in the interval [ t , t + δ ] after the change imposed by the protocol at time t.
If we do not assume a priori that the starting distribution p ( σ n t | σ ˜ t ) is an equilibrium one, we see that, under the application of Equation (14) we reach a more general relation—see Appendix A for details:
p ( σ ˜ t σ ˜ t + δ ) ) p ( σ ˜ t + δ σ ˜ t ) e β ( δ G ( t , t + δ ) Σ [ t , t + δ ] ) + δ H ( t , t + δ ) ,
where δ G ( t , t + δ ) Σ [ t , t + δ ] is the increase of Gibbs free energy averaged over all trajectories Σ [ t , t + δ ] from σ ˜ t to σ ˜ t + δ . Unfortunately, the above inequality only can give necessary but not sufficient conditions for duplication. The derivation of an exact equivalence for a restricted range of situations—yet involving many non-equilibrium cases—is the objective of the next subsection.

3.3.1. Free Energy Structure

To achieve an exact relation between forwards and backwards probabilities of duplication, we need to develop some equivalences involving information-theoretic measures. These relations are derived from the exploration of the structure of δ G ( t , t + δ ) . It is important to highlight that, since we do not assume we are in equilibrium, one cannot use the identity F ( σ ˜ t ) = G σ ˜ t β 1 H ( σ ˜ t ) anymore. Again, we focus our efforts in the study of the time interval [ τ δ , τ ] , where we assume that a duplication event has taken place. As above, we recover the subscripts n , n + 1 accounting for the number of aggregates in the system. We remind that this implies that, at time τ δ we had the system in a macrostate state σ ˜ n τ δ and that at time τ the system transitioned towards a macrostate state σ ˜ n + 1 τ .
First relation. Observe that we can decouple the general term δ G ( τ δ , τ ) Σ [ τ δ , τ ] as follows: Let p | σ ˜ t be the probability distribution of the actual macrostate σ ˜ t , and p | σ ˜ * t be the equilibrium distribution associated to the equilibrium macrostate σ ˜ t , under the conditions imposed by the protocol at time t. That is, the probability distribution that would correspond to σ ˜ t if it where in equilibrium, σ ˜ t = σ ˜ * t . In other words, we have an equilibrium distribution p ( σ t | σ ˜ * t ) e β G ( σ t ) , sharing the support (let p be a probability distribution defined over the set X and let X X such that X = { x i X : p ( x i ) > 0 } . X is the support set of the probability distribution p. In words, the support set is the set of elements whose probability is larger than 0) set with the actual, possibly non-equilibrium, distribution p | σ ˜ t . After rearrangements one finds that—see Appendix A for details:
β G ( τ δ ) σ ˜ n τ δ = H ( σ ˜ n τ δ ) + D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) + F ( σ ˜ n * t ) ,
where D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) is the Kullback-Leibler divergence or information gain between distributions p | σ ˜ n τ δ and p | σ ˜ n * τ δ , defined as [51]:
D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) = σ n τ δ p ( σ n τ δ | σ ˜ n τ δ ) log p ( σ n τ δ | σ ˜ n τ δ ) p ( σ n τ δ | σ ˜ n * τ δ ) .
The Kullback-Leibler divergence is a non-negative measure D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) 0 , and
D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) = 0 ,
only in the case of p | σ ˜ n τ δ = p | σ ˜ n * τ δ . In other words, as expected, if transitions are performed between equilibrium states, no contributions arise from this term. In analogy to the equilibrium Helmholtz free energy—see Equation (7)—on can define a non-equilibrium Helmholtz free energy of the non-equilibrium macrostate σ ˜ t , F ( σ ˜ t ) , as follows [43,52,53]:
F ( σ ˜ t ) β G ( t ) σ ˜ t H ( σ ˜ t ) ,
where the average is over all the states of the macrostate σ ˜ t . If one assumes that the transitions between states obey the Markov property—see Appendix A for details—one can define the increase on non-equilibrium Helmholtz free energies, δ F ( τ δ , τ ) , as:
δ F ( τ δ , τ ) = F ( σ ˜ n + 1 τ ) F ( σ ˜ n τ δ ) ,
where F ( σ ˜ n + 1 τ ) and F ( σ ˜ n τ δ ) are defined following Equation (21). Now, thanks to Equation (20), one arrives at:
δ F ( τ δ , τ ) = δ D ( τ δ , τ ) + δ F * ( τ δ , τ ) ,
being δ D ( τ δ , τ ) the increase in the Kullback-Leibler divergence between τ δ and τ , namely:
δ D ( τ δ , τ ) = D ( p | σ ˜ n + 1 τ | | p | σ ˜ n + 1 * τ ) D ( p | σ ˜ n τ δ | | p | σ ˜ n * τ δ ) ,
and δ F * ( τ δ , τ ) the increase on Helmholtz free energy of the equilibrium macrostates associated to σ ˜ n + 1 τ and σ ˜ n τ δ , respectively. The sign of δ D ( τ δ , τ ) and its absolute value are important to understand the extent of the dissipative role of this term. Using the log-sum inequality [51] one is lead to—see details in the Appendix A:
δ D ( τ δ , τ ) 0 .
Again, using the log-sum inequality, one can prove that the above inequality becomes an equality only if the transitions are among equilibrium states—as expected, i.e., p | σ ˜ n + 1 τ = p | σ ˜ n + 1 * τ and p | σ ˜ n τ δ = p | σ ˜ n * τ δ . If this is not the case,
| δ D ( τ δ , τ ) | > 0 .
Second relation. Recognizing that Equation (9) implies that:
ln p ( σ n + 1 τ σ n τ δ ) p ( σ n τ δ σ n + 1 τ ) = δ β G ( τ δ , τ ) ,
one can rewrite δ G ( τ δ , τ ) as:
β δ G ( τ δ , τ ) = log p ( σ n + 1 τ σ n τ δ ) p ( σ n τ δ σ n + 1 τ ) Σ [ τ δ , τ ] ,
where the average is computed over all trajectories Σ [ τ δ , τ ] from macrostate σ ˜ n τ δ to σ ˜ n + 1 τ . Furthermore, markoviantity in the transition probabilities implies that:
p ( σ n + 1 τ | σ ˜ n + 1 τ δ ) = σ n τ δ p ( σ n τ δ | σ ˜ n τ δ ) p ( σ n τ δ σ n + 1 τ ) p ( σ ˜ n τ δ σ ˜ n + 1 τ δ ) = σ n τ δ p τ ( σ n τ δ , σ n + 1 τ ) .
Now we develop the part of Equation (25). From the definition of p τ ( σ n τ δ , σ n + 1 τ ) given in Equation (12), and averaging directly, one arrives at:
= σ n τ δ σ n + 1 τ p τ ( σ n τ δ , σ n + 1 τ ) log p ( σ n + 1 τ σ n τ δ ) p ( σ n τ δ σ n + 1 τ ) .
After some algebra, and using Equation (21), one arrives to a relation involving the global forward and backwards probabilities—see Appendix A for details:
δ F ( τ δ , τ ) = D ( p τ | | p τ ) + log p ( σ ˜ n + 1 τ σ ˜ n τ δ ) p ( σ ˜ n τ δ σ ˜ n + 1 τ ) ,
where p τ is the backwards probability of trajectories, defined in Equation (12). Specifically, D ( p τ | | p τ ) reads:
D ( p τ | | p τ ) = σ n τ δ σ n + 1 τ p τ ( σ n τ δ , σ n + 1 τ ) × × log p τ ( σ n τ δ , σ n + 1 τ ) p τ ( σ n τ δ , σ n + 1 τ ) .
If one assumes that there is no dissipation in the trajectory itself, and that the transition between states is performed in a quasi-stationary way, then D ( p τ | | p τ ) = 0 . We highlight that this is true as long as the trajectories are balanced and no currents are present in the system. In general, one has that, due to the non-negativity of the Kullback-Leibler divergence:
D ( p τ | | p τ ) 0 .

3.3.2. Non-Equilibrium Duplication Thresholds and Work Relations

Equation (26) encodes the relation between forward and backwards duplication probabilities. Indeed, exponentiating, one arrives directly at:
p ( σ ˜ n τ δ σ ˜ n + 1 τ ) p ( σ ˜ n + 1 τ σ ˜ n τ δ ) = e β δ F ( τ δ , τ ) D ( p τ | | p τ ) .
The above relation gives us an exact relation between duplication and fusion probabilities in a general class of non-equilibrium cases. In consequence, Equation (28) provides a necessary and sufficient condition for the triggering of a duplication event after the application of the protocol ψ ( t ) . The above equation leads to the following duplication threshold:
F ( σ ˜ n 1 τ ) < F ( σ ˜ n τ δ ) D ( p τ | | p τ ) .
If we notice, as we did in the quasi-static case, that we can impose that δ G = δ G a , where δ G a is the average increase on the free energy in the aggregates due only to size and frustration, as given in Equation (17):
β δ G a < δ S ( τ δ , τ ) + D ( p τ | | p τ ) ,
and δ S ( τ δ , τ ) refers to the entropic contributions as described in Equation (18). Equations (29) and (30) explicitly show how the tension between entropic gradients and free energy gains controls the duplication process. This provides a nice, hands-on example of the imbalance between entropy and free energy gains that create structure and order that biology needs to overcome in order to endure.
From the order of magnitudes analysis of Section 2.5, we can roughly estimate the numerical values involved in these inequalities in the case of ternary mixtures containing decanoic acid anhydride, ( C 9 H 19 - CO - O - OC - C 9 H 19 ), a single precursor, Y decanoic acid, ( C 9 H 19 - COOH ), and water. As we outlined, the amount of aggregates in one milliliter of emulsion is ~ 10 13 . Therefore, if we consider that just before the duplication the extra free energy was exactly the free energy of the aggregate at the optimal packing, we have that δ G a G a / n . Since, from Equation (11), we know that G a 10 13 , we have that:
δ G a 10 26 J / aggregate .
From that, considering T = 300 K, we have that β = ( k B T ) 1 10 21 J 1 , where k B is the Boltzmann constant. Therefore, from Equation (30) one can estimate the minimum (statistical) entropy gradient as:
δ S ( τ δ , τ ) + D ( p τ | | p τ ) 10 5 nats .
We recall that this is entropy excludes the contribution of the heat generated in non-equilibrium transitions.
We now revise the work relations in this general non-equilibrium case. The non-equilibrium work performed over the system is:
W ψ = Δ F ( 0 , τ ) .
From the definition of work invested over the system, one can derive the minimum work invested to trigger a duplication event. Indeed, let us suppose now that we take as the final point the equilibrium macrostate σ ˜ n + 1 τ , with probability distribution p ( σ n + 1 τ | σ ˜ n + 1 τ ) . As we said in the description of the protocol, the action of ψ ( t ) stops at t = τ and then the systems relaxes towards an equilibrium in a quasi-static way. One can in consequence, calculate the minimal amount of required work invested over the system through the protocol to trigger a duplication event, to be named W ψ :
W ψ = Δ F ( 0 , τ ) .
With the above relation, one can compute the amount of dissipated work, W ψ diss , due to non-equilibrium loses:
W ψ diss = W ψ W ψ .
In consequence, from the definition of the non-equilibrium free energy given in Equation (22), one can find an exact expression for the dissipated work:
W ψ diss = 0 τ d D ( t ) d t d t .
We observe that, consistently, W ψ diss 0 , due to inequality (24). We now retake the exploration of the orders of magnitude involved in our problem by using again the example of the ternary mixture presented in Section 2.5. The free energy of a single aggregate will determine, by construction, the minimum (non-dissipated) work needed to invest into the system to trigger the formation of an aggregate. Therefore, thanks to Equation (11), we can bound numerically the order of magnitude of this work:
W ψ > G a 10 13 J .
From this, and knowing from Section 2.5 that the amount of oil vesicles is around 10 13 in a milliliter of microemulsion, we can conclude, under the assumption that a linear increase of volume to accommodate new protocells does not impact dramatically in the energy landscape, that the amount of work we need to invest to duplicate the amount of protocells contained initially in a liter of emulsion is lower bounded as:
W ψ ( 1 L ) 1 kJ .
Finally, we can compute the amount of entropy produced throughout the whole process, S ψ , by collecting the entropic terms, and adding the entropy produced by the destabilization of the system after each application of the protocol, β Q ( t ) :
S ψ = Δ H ( 0 , τ ) + 0 τ D ( p t | | p t ) d t + 0 τ β Q ψ ( t ) .
Recall, again, that D ( p τ | | p τ ) 0 . We remind that here we did not specify the formal shape of the last term, corresponding to the heat produced within the non-equilibrium trajectories that destabilize the system right after the application of the protocol. We warn the reader that the potential relations between the dissipated work δ D ( τ δ , τ ) and the entropy produced through the relaxation process, D ( p τ | | p τ ) are not studied here, but can contain relevant information for the conditions of the duplication process. Similar relations are studied in the context of the analysis of the structure of the second law [54], thermodynamics of computation [55], and work/energy relations in coarse grained approaches [56].

3.4. The Perpetuation of the Duplication Process

A crucial condition for the emergence of synthetic-living entities is the capacity for perpetuating the duplication process. In the framework derived above we very briefly revise the key ingredients for this successive duplications to be maintained. Let us suppose that at time t the system contains n ( t ) aggregates in equilibrium, i.e., σ ˜ n ( t ) t . Following Equation (29), there will be a duplication if, under the action of the protocol ψ ( t ) , if ( n ( t ) N ) ( τ > 0 ) , the following condition holds
F ( σ ˜ n ( t ) + 1 t + τ ) < F ( σ ˜ n ( t ) t + τ ) D ( p t + τ | | p t + τ ) .
If we plug Equation (30) we obtain a criteria that explicitly relates the free energy increases of the aggregates, due to frustration and size, to the entropic gradients. In that context, the duplication process will go on as long as, ( n ( t ) N ) ( τ > 0 ) , the following condition holds:
β δ G a < δ S ( t + τ δ , t + τ ) + D ( p t + τ | | p t + τ ) ,
where δ G a is the average increase of free energies of the aggregates due to size and frustration from n ( t ) to n ( t ) + 1 in the time interval [ t + τ δ , t + τ ] and δ S ( t + τ δ , t + τ ) the increase of the other entropic components, as defined in Equation (18). Inequalities (33) and (34) are the inequalities that the protocol must trigger to ensure the continuation of the duplication process. They may be called inequalities for prevalence. In summary, they tell us that the process can continue if the action of the protocol is able to trigger an imbalance between entropic contributions and free energy gradients favouring the equilibrium state containing n ( t ) + 1 aggregates as:
Δ (aggregate free energy) < Δ (entropic gradients)
In Figure 4 we described a potential trajectory of a successive duplication process. We therefore derived a specific example of the race between entropy and free energy increases that enables the perpetuation of the duplication process. Other circumstances must be taken into account. For example, the volume of the system should increase in proportion to the aggregate number, in order to keep the concentration of chemical species inside the ranges in which the system remains in the phase where aggregates are formed. A significant change of this concentrations could result into a change on the phase of the system, where the preferred structures could no longer be spherical aggregates. Other circumstances, such as the specific application of the protocol, could also interfere the duplication process.

4. Discussion

In this paper we explored in depth the thermodynamics of duplication thresholds in a generic emulsion system made of an arbitrary set of lipid and precursor species. This feasible, yet artificial system enables us to overcome the tremendous complexity of the duplication process in actual living entities, such as cells. The thermodynamic landscape has been carefully constructed, accounting for the contributions due to surface tension, volume of the aggregates, entropic contributions and total amount of chemical species within the systems, all summarized in the definition of the Gibbs free energy of the state, Equation (4). An abstract protocol is proposed, driving the system away from the equilibrium state and resulting, eventually, in a duplication event. We approached the problem from the equilibrium framework, assuming that the process is a succession of equilibrium states, and from a non-equilibrium perspective, where the visited states may not be equilibrium ones.
Fundamental relations involving free energies and duplication probabilities, Equation (28), duplication thresholds, Equations (29) and (30), necessary work to be invested over the system by the protocol to trigger a duplication event, Equation (31), dissipated work, Equation (32) or the conditions for the perpetuation of the duplication cycle, Equations (33) and (34) have been derived. These relations invoke the explicit energy landscape provided by the free energies and set the abstract conditions for a duplication process to be triggered and, eventually maintained. It is worth to emphasize that they show explicitly the structure of the race between entropic forces and free energy gains to generate structure and preserve it. The synthetic approach, therefore, enabled us to convey a very detailed picture of the thermodynamical tensions involved in the process of creation and perdurability of living entities.
Further explorations should target more systematically specific systems, with quantitatively testable observables. The study of specific systems should also include the conditions of feasibility, in terms of microemulsion phases, of the aggregate duplication, avoiding transitions to non-aggregate phases, possible in emulsion systems. In the same line, a rigorous exploration of the orders of magnitude involved in the abstract relations derived above would add a necessary layer towards the quantification and, eventually, empirical test of the above predictions. Complementarily, the exploration of the constraints imposed by different protocol strategies could shed light to the potential prebiotic scenarios, where possibly circadian cycles play a crucial role in creating free energy sources driving the system towards imbalance, destabilization, and duplication. In addition, more complex free energy landscapes allowing bilayer membranes, more realistic when compared to biological structures than the single layer approach used here, could refine the triggering points for duplication events to occur. In a different direction, an in depth study of the dissipation within the trajectories themselves—assumed to observe detailed balance in the above developments—would generalize the approach, making it more realistic and providing predictions on dissipated heat which could be presumable testable. Finally, the interesting relations involving dissipation and information measures could be explored to be the seed of further developments linking information and duplication processes, in line to the results exposed in [31], and, perhaps, clear the conditions for the emergence of inheritable information—thus the appearance of differentiated traits between elements of the system—intrinsically linked to the duplication process and, in the long term, trigger darwinian dynamics.

Funding

This research received no external funding.

Acknowledgments

The author is grateful to the criticisms and comments provided by Rudolf Hanel, Edouard Hannezo, Artemy Kolchinsky and Kepa Ruiz-Mirazo. The author also wants to thank Harold Fellermann for early discussions on the manuscript.

Conflicts of Interest

The author declare no conflicts of interest.

Appendix A. Derivation Details

In this appendix we will use a simplified notation in order to emphasize only the technical details. We will consider the following scenario: The system is at t = 0 in a given macrostate A 0 and the change on the boundary conditions makes it to jump to macrostate A, whose states are denoted by “x”, following a probability distribution p ( x ) . The system relaxes until macrostate B, whose states are denoted by y, and follows a probability distribution q ( y ) . Given a macrostate A with a distribution p ( x ) , there is a macrostate with the same support but in equilibrium, A * , whose probability distribution will be denoted by p * ( x ) , and follows a Boltzmann-like statistics:
p * ( x ) = 1 Z x e β G ( x ) ,
where Z x is the normalization constant and G ( x ) . We define B * and q * ( y ) in a totally analogous way, defining the Gibbs free energy with the new boundary conditions induced by the application of the protocol. We will assume that the backwards and forwards transition probabilities obey equilibrium detailed balance:
p ( x y ) p ( y x ) = e β δ G ( x , y ) , where δ G ( x , y ) = G ( y ) G ( x ) .
Derivation of Equation (19). Let
p ( B A ) p ( A B ) = e β δ G ( x , y ) e ln p ( x ) q ( y ) A B ,
where A B denotes that the average is performed through all trajectories from A B . Accordingly,
1 = e β δ G ( x , y ) ln p ( B A ) p ( A B ) ln p ( x ) q ( y ) A B .
The Taylor expansion of the exponential ensures that e x 1 + x , so, if we know that e x = 1 , then 1 + x 1 , so x 0 :
β δ G A B ln p ( B A ) p ( A B ) ln p ( x ) q ( y ) A B 0 .
Finally, the equality ln p ( x ) q ( y ) A B = H ( B ) H ( A ) follows directly from basic probability reasoning. Therefore, rearranging that and the above equation, one is led to Equation (19).
Derivation of Equation (24). Given two distributions p q with the same support set, the log-sum inequality states that:
x p ( x ) log p ( x ) q ( x ) x p ( x ) log x p ( x ) x q ( x ) ,
with equality only in the case in which p = q . In our case, if we assume that after the transition there can be a relaxation period—i.e., we approach q * , we have that:
D ( q | | q * ) x y p ( y ) p ( x | y ) log y p ( y ) p ( x | y ) y p * ( y ) p ( x | y ) x , y p ( y ) p ( x | y ) log p ( y ) p ( x | y ) p * ( y ) p ( x | y ) = x p ( x ) log p ( x ) p * ( x ) = D ( p | | p * ) .
Leading to δ D = D ( q | | q * ) D ( p | | p * ) 0 .
Derivation of Equation (20). Given an arbitrary distribution p ( x ) and an equilibrium one p * ( x ) = 1 Z x e β G ( x ) with the same support, one can write
β G ( x ) A = x p ( x ) log p * ( x ) log Z x ,
now the A denotes that the average is computed over the states of the macorstate A. Identifying log Z x as the equilibrium Helmholtz free energy corresponding to p * ( x ) , F x = log Z x , and developing the cross entropy term x p ( x ) log p * ( x ) as follows:
x p ( x ) log p * ( x ) = x p ( x ) log p ( x ) + D ( p | | p * ) ,
one obtains the desired result.
Derivation of Equation (26). Let f ( x , y ) = p ( x ) p ( x y ) / p ( A B ) and u ( y , x ) = p ( y ) p ( y x ) / p ( B A ) . Now we rewrite the increase on free energy as:
β δ G A B = x , y f ( x , y ) log p ( x y ) p ( y x ) ,
whose second term can be written as:
x , y f ( x , y ) log p ( x ) p ( A B ) / ( p ( x ) p ( A B ) ) p ( x y ) q ( y ) p ( B A ) / ( q ( y ) p ( B A ) ) p ( y x ) .
Rearranging and using the definition of f ( x , y ) and u ( y , x ) one arrives at:
H ( B ) H ( A ) D ( f | | u ) + log p ( B A ) p ( A B ) ,
where H ( A ) , H ( B ) are the Shannon entropies of macrostates A and B, and D ( f | | u ) the Kullback-Leibler divergence between f and u.

References

  1. Rashevky, N. Mathematical Biophysics: Physico-Mathematical Foundations of Biology; University of Chicago Press: Chicago, IL, USA, 1938. [Google Scholar]
  2. Schrödinger, E. What Is Life? The Physical Aspect of the Living Cell; Cambridge University Press: Cambridge, UK, 1944. [Google Scholar]
  3. Morowitz, H. Energy Flow in Biology; Academic Press: New York, NY, USA, 1968. [Google Scholar]
  4. Kauffman, S.A. The Origins of Order: Self-Organization and Selection in Evolution; Oxford University Press: New York, NY, USA, 1993. [Google Scholar]
  5. Szathmáry, E.; Smith, J.M. The Major Transitions in Evolution; WH Freeman Spektrum: Oxford, UK, 1995. [Google Scholar]
  6. Schuster, P. How does complexity arise in evolution. Complexity 1999, 2, 22–30. [Google Scholar] [CrossRef]
  7. Deamer, D.W. The first living systems: A bioenergetic perspective. Microbiol. Mol. Biol. Rev. 1997, 61, 239. [Google Scholar] [PubMed]
  8. Lane, N. The Vital Question: Energy, Evolution, and the Origins of Complex Life; Norton and Co.: New York, NY, USA, 2015. [Google Scholar]
  9. Smith, E. Thermodynamics of natural selection I: Energy flow and the limits on organization. J. Theor. Biol. 2008, 252, 185. [Google Scholar] [CrossRef] [PubMed]
  10. Smith, E. Thermodynamics of natural selection II: Chemical Carnot Cycles. J. Theor. Biol. 2008, 252, 198. [Google Scholar] [CrossRef] [PubMed]
  11. Smith, E. Thermodynamics of natural selection III: Landauer’s principle in computation and chemistry. J. Theor. Biol. 2008, 252, 213. [Google Scholar] [CrossRef]
  12. England, J.L. Statistical Physics of Self-Replication. J. Chem. Phys. 2013, 139, 121923. [Google Scholar] [CrossRef]
  13. Kauffman, S. Molecular autonomous agents. Philos. Trans. R. Soc. A 2003, 361, 1089. [Google Scholar] [CrossRef]
  14. Ganti, T. The Principles of Life; Oxford University Press: Oxford, UK, 2003. [Google Scholar]
  15. Solé, R.V.; Rasmussen, S.; Bedau, M.A. Towards the artificial cell. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2007, 362, 1723–1925. [Google Scholar]
  16. Rasmussen, S.; Bedau, M.A.; Chen, L.; Deamer, D.; Krakauer, D.; Packard, N.; Stadler, P. (Eds.) Proto-Cells: Bridging Nonliving and Living Matter; MIT Press: Cambridge, MA, USA, 2008. [Google Scholar]
  17. Ruiz-Mirazo, K.; Briones, C.; de la Escosura, A. Prebiotic systems chemistry: New perspectives for the origins of life. Chem. Rev. 2014, 114, 285. [Google Scholar] [CrossRef]
  18. Bachmann, P.A.; Luisi, P.L.; Lang, J. Autocatalytic self-replicating micelles as models for prebiotic structures. Nature 1992, 357, 57–59. [Google Scholar] [CrossRef]
  19. Deamer, D. A giant step toward artificial life? Trends Biotechnol. 2005, 23, 336–338. [Google Scholar] [CrossRef] [PubMed]
  20. Solé, R. Synthetic transitions: Towards a new synthesis. Philos. Trans. R. Soc. B 2016, 371, 20150438. [Google Scholar] [CrossRef] [PubMed]
  21. Fennell Evans, D.; Wennerstrøm, H. The Colloidal Domain: Where Physics, Chemistry, Biology, and Technology Meet; VCH Publishers: New York, NY, USA, 1994. [Google Scholar]
  22. Hanczyc, M.M.; Toyota, T.; Ikegami, T.; Packard, N.; Sugawara, T. Fatty Acid Chemistry at the Oil-Water Interface: Self-Propelled Oil Droplets. J. Am. Chem. Soc. 2007, 129, 9386–9391. [Google Scholar] [CrossRef] [PubMed]
  23. Mansy, S.S.; Schrum, J.P.; Krishnamurthy, M.; Tobé, S.; Treco, D.A.; Szostak, J.W. Template-directed synthesis of a genetic polymer in a model protocell. Nature 2008, 454, 122–125. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  24. Toyota, T.; Maru, N.; Hanczyc, M.M.; Ikegami, T.; Sugawara, T. Self-Propelled Oil Droplets Consuming Fuel Surfactant. J. Am. Chem. Soc. 2009, 131, 5012–5013. [Google Scholar] [CrossRef] [PubMed]
  25. Attwater, J.; Wochner, A.; Pinheiro, V.B.; Coulson, A.; Holliger, P. Ice as a protocellular medium for RNA replication. Nat. Commun. 2010, 1, 76. [Google Scholar] [CrossRef] [PubMed]
  26. Fellermann, H.; Corominas-Murtra, B.; Hansen, P.L.; Ipsen, J.H.; Solé, R.V.; Rasmussen, S. Non-Equilibrium Thermodynamics of Self-Replicating Protocells. arXiv, 2015; arXiv:1503.04683. [Google Scholar]
  27. Corominas-Murtra, B.; Fellermann, H.; Solé, R. Protocell cycles as thermodynamic cycles. In The Interplay between Thermodynamics and Computation in Natural and Artificial Systems; Wolpert, D.H., Kempes, C., Grochow, J.A., Stadler, P.F., Eds.; Santa Fe Institute Press: Santa Fe, NM, USA, 2019; accepted, to appear. [Google Scholar]
  28. Gutiérrez JM, P.; Hinkley, T.; Taylor, J.W.; Yanev, K.; Cronin, L. Evolution of oil droplets in a chemorobotic platform. Nat. Commun. 2014, 5, 5571. [Google Scholar] [CrossRef] [Green Version]
  29. Mouritsen, O. Life-As a Matter of Fat: The Emerging Science of Lipidomics; Springer: Berlin, Germany, 2005. [Google Scholar]
  30. Zwicker, D.; Seyboldt, R.; Weber, C.A.; Hyman, A.A.; Jüicher, F. Growth and division of active droplets provides a model for protocells. Nat. Phys. 2017, 13, 408–413. [Google Scholar] [CrossRef]
  31. Taylor, J.W.; Eghtesadi, S.A.; Points, L.J.; Liu, T.; Cronin, L. Autonomous model protocell division driven by molecular replication. Nat. Commun. 2017, 8, 237. [Google Scholar] [CrossRef]
  32. Mavelli, F.; Ruiz-Mirazo, K. Theoretical conditions for the stationary reproduction of model protocells. Integr. Biol. 2012, 5, 324. [Google Scholar] [CrossRef]
  33. Helfrich, W. Elastic properties of lipid bilayers: Theory and possible experiments. Zeitschrift für Naturforschung C 1973, 28, 693. [Google Scholar] [CrossRef]
  34. Seifert, U. Configurations of Fluid Membranes and Vesicles. Adv. Phys. 1997, 46, 13–137. [Google Scholar] [CrossRef]
  35. Reiss, H.; Ellerby, H.M.; Manzanares, J.A. Configurational entropy of microemulsions: The fundamental length scale. J. Chem. Phys. 1993, 99, 9930–9937. [Google Scholar] [CrossRef] [Green Version]
  36. Reiss, H.; Kegel, W.K.; Groenewold, J. Length Scale for Configurational Entropy in Microemulsions? Berichte der Bunsengesellschaft 1996, 100, 279–295. [Google Scholar] [CrossRef]
  37. Jarzynski, C. Nonequilibrium Equality for Free Energy Differences. Phys. Rev. Lett. 1997, 78, 2690. [Google Scholar] [CrossRef]
  38. Crooks, G.E. Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. Phys. Rev. E 1999, 60, 2721. [Google Scholar] [CrossRef]
  39. Gómez-Marín, P.J.; Van den Broeck, C. Lower bounds on dissipation upon coarse graining. Phys. Rev. E 2008, 78, 011107. [Google Scholar] [CrossRef]
  40. Parrondo, J.M.R.; Van den Broeck, C.; Kawai, R. Entropy production and the arrow of time. New J. Phys. 2009, 11, 073008. [Google Scholar] [CrossRef] [Green Version]
  41. Roldán, E. Irreversibility and Dissipation in Microscopic Systems; Springer: Berlin, Germany, 2014. [Google Scholar]
  42. Seifert, U. Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Progress Phys. 2012, 75, 126001. [Google Scholar] [CrossRef] [Green Version]
  43. Parrondo, J.M.; Horowitz, J.M.; Sagawa, T. Thermodynamics of information. Nat. Phys. 2015, 11, 131. [Google Scholar] [CrossRef]
  44. Kolchinsky, A.; Wolpert, D.H. Semantic information, agency, and nonequilibrium statistical physics. Interface Focus 2018, 8, 0041. [Google Scholar] [CrossRef] [PubMed]
  45. Pathria, R.K. Statistical Mechanics; Butterworth-Heinemann: Oxford, UK, 1996. [Google Scholar]
  46. Bachmann, P.A.; Luisi, P.L.; Lang, P.L.J. Self-replicating micelles: Aqueous micelles and enzymatically driven reactions in reverse micelles. J. Am. Chem. Soc. 1991, 113, 8204. [Google Scholar] [CrossRef]
  47. Kaganer, V.M.; Möhwald, H.; Dutta, P. Structure and phase transitions in Langmuir monolayers. Rev. Mod. Phys. 1999, 71, 779. [Google Scholar] [CrossRef]
  48. Caschera, P.; Rasmussen, S.; Hanczyc, M.M. An Oil Droplet Division-Fusion Cycle. ChemPlusChem 2013, 78, 52–54. [Google Scholar] [CrossRef]
  49. Fermi, E. Thermodynamics; Dover: New York, NY, USA, 1956. [Google Scholar]
  50. De Groot, S.R.; Mazur, P. Non-Equilibrium Thermodynamics; Dover: New York, NY, USA, 1969. [Google Scholar]
  51. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley and Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  52. Gaveau, B.; Schulman, L. A general framework for non-equilibrium phenomena: The master equation and its formal consequences. Phys. Lett. A 1997, 229, 347–353. [Google Scholar] [CrossRef]
  53. Esposito, M.; Van den Broeck, C. Second law and Landauer principle far from equilibrium. Europhys. Lett. 2011, 95, 40004. [Google Scholar] [CrossRef] [Green Version]
  54. Esposito, M.; Van der Broeck, C. The Three Faces of the Second Law: I. Master Equation Formulation. Phys. Rev. E 2010, 82, 011143. [Google Scholar] [CrossRef]
  55. Kolchinsky, A.; Wolpert, D.H. Dependence of dissipation on the initial distribution over states. J. Stat. Mech. 2017, 2017, 083202. [Google Scholar] [CrossRef] [Green Version]
  56. Gomez-Marín, A.; Parrondo, J.; Van den Broeck, C. The footprints of irreversibility. EPL 2008, 82, 50002. [Google Scholar] [CrossRef]
Figure 1. Schematic characterization of the role of the protocol ψ ( t ) . (a) At time t = 0 the system is in contact to a thermal reservoir at inverse temperature β , and in an equilibrium state containing 1 aggregate. (b,c) The protocol starts by increasing the number of lipids and precursors and providing energy that may trigger chemical reactions. The action of the protocol is depicted by the red arrow. This process may change the energy landscape provided by Equation (4) and thereby destabilize the structure of the aggregates, eventually creating more and more frustration in the surface. (d) At time τ the energy gradients favour the duplication and the protocol stops. (e) The system relaxes towards an equilibrium state containing 2 aggregates.
Figure 1. Schematic characterization of the role of the protocol ψ ( t ) . (a) At time t = 0 the system is in contact to a thermal reservoir at inverse temperature β , and in an equilibrium state containing 1 aggregate. (b,c) The protocol starts by increasing the number of lipids and precursors and providing energy that may trigger chemical reactions. The action of the protocol is depicted by the red arrow. This process may change the energy landscape provided by Equation (4) and thereby destabilize the structure of the aggregates, eventually creating more and more frustration in the surface. (d) At time τ the energy gradients favour the duplication and the protocol stops. (e) The system relaxes towards an equilibrium state containing 2 aggregates.
Life 09 00009 g001
Figure 2. Trajectories between macrostates. At time τ δ the macrostate containing 1 aggregate receives the action of the protocol and transits to a macrostate containing 2 aggregates. This transition can performed through any of the trajectories connecting the states that realize one or the other macrostate. The ensemble of trajectories connecting these two macrostates is called Σ [ τ δ , τ ] . Forward trajectories are depicted with solid lines. Dashed lines correspond to time reversal trajectories, i.e., trajectories obtained through the protocol running under time reversal.
Figure 2. Trajectories between macrostates. At time τ δ the macrostate containing 1 aggregate receives the action of the protocol and transits to a macrostate containing 2 aggregates. This transition can performed through any of the trajectories connecting the states that realize one or the other macrostate. The ensemble of trajectories connecting these two macrostates is called Σ [ τ δ , τ ] . Forward trajectories are depicted with solid lines. Dashed lines correspond to time reversal trajectories, i.e., trajectories obtained through the protocol running under time reversal.
Life 09 00009 g002
Figure 3. Irreversible action of the Protocol ψ . (a) At t = 0 we have a macrostate σ ˜ n 0 in equilibrium and the protocol induces a change in the boundary conditions that destabilizes the system eventually making it to jump to a non equilibrium state, producing an amount of entropy β Q ψ ( t ) . Then the system relaxes—maybe not completely—until the next action of the protocol until there is a stable division event and the system relaxes completely. (b) Details of the transition, with the thermodynamic quantities involved. The jump experienced by the system from its previous state is D ( p | σ ˜ t | | p | σ ˜ * t ) , and Δ F is the energy gradient that leads to the new macrostate. The entropy produced through this, possibly partial, relaxation process is D ( p | | p ) —see text.
Figure 3. Irreversible action of the Protocol ψ . (a) At t = 0 we have a macrostate σ ˜ n 0 in equilibrium and the protocol induces a change in the boundary conditions that destabilizes the system eventually making it to jump to a non equilibrium state, producing an amount of entropy β Q ψ ( t ) . Then the system relaxes—maybe not completely—until the next action of the protocol until there is a stable division event and the system relaxes completely. (b) Details of the transition, with the thermodynamic quantities involved. The jump experienced by the system from its previous state is D ( p | σ ˜ t | | p | σ ˜ * t ) , and Δ F is the energy gradient that leads to the new macrostate. The entropy produced through this, possibly partial, relaxation process is D ( p | | p ) —see text.
Life 09 00009 g003
Figure 4. Schematic picture of the conditions for the duplication process to be sustained in time. Duplication events are indicated by dashed circles. A system with n aggregates—gray line—in equilibrium receives the action of the protocol changing the energy landscape. Its Helmoltz free energy increases until a point in which the Helmholtz free energy of a macrostate containing n + 1 aggregates—orange line—is lower than the one for the n aggregates, and a duplication event occurs. If we switch on the protocol again, the system increases its Helmholtz free energy until a point in which, eventually, the Helmholtz free energy differences trigger again a duplication event—blue line. If the protocol is able to destabilize the system from n ( t ) to n ( t ) + 1 aggregates for any t, the duplication process will continue unboundedly in time. In this figure we described a quasi-static approach, which makes use of equilibrium Helmholtz free energies for the sake of clarity. The non-equilibrium case is thoroughly discussed in the text.
Figure 4. Schematic picture of the conditions for the duplication process to be sustained in time. Duplication events are indicated by dashed circles. A system with n aggregates—gray line—in equilibrium receives the action of the protocol changing the energy landscape. Its Helmoltz free energy increases until a point in which the Helmholtz free energy of a macrostate containing n + 1 aggregates—orange line—is lower than the one for the n aggregates, and a duplication event occurs. If we switch on the protocol again, the system increases its Helmholtz free energy until a point in which, eventually, the Helmholtz free energy differences trigger again a duplication event—blue line. If the protocol is able to destabilize the system from n ( t ) to n ( t ) + 1 aggregates for any t, the duplication process will continue unboundedly in time. In this figure we described a quasi-static approach, which makes use of equilibrium Helmholtz free energies for the sake of clarity. The non-equilibrium case is thoroughly discussed in the text.
Life 09 00009 g004

Share and Cite

MDPI and ACS Style

Corominas-Murtra, B. Thermodynamics of Duplication Thresholds in Synthetic Protocell Systems. Life 2019, 9, 9. https://doi.org/10.3390/life9010009

AMA Style

Corominas-Murtra B. Thermodynamics of Duplication Thresholds in Synthetic Protocell Systems. Life. 2019; 9(1):9. https://doi.org/10.3390/life9010009

Chicago/Turabian Style

Corominas-Murtra, Bernat. 2019. "Thermodynamics of Duplication Thresholds in Synthetic Protocell Systems" Life 9, no. 1: 9. https://doi.org/10.3390/life9010009

APA Style

Corominas-Murtra, B. (2019). Thermodynamics of Duplication Thresholds in Synthetic Protocell Systems. Life, 9(1), 9. https://doi.org/10.3390/life9010009

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop