In this Section we briefly present three different, but equivalent, definitions of entropy. By "different" we mean that the definitions do not follow from each other; specifically, neither Boltzmann's definition, nor the one based on Shannon's measure of information (SMI), can be "derived" from Clausius's definition.
By "equivalent" we mean that, for any process for which we can calculate the change in entropy, we obtain the same results by using the three definitions. Note carefully that there are many processes for which we cannot calculate the pertinent entropy changes. Therefore, we cannot claim equivalency in general. However, it is believed that these three definitions are indeed equivalent, although no formal proof of this is available.
In the following sections we shall introduce three definitions of entropy. The first definition originated in the 19th century, stemming from the interest in heat engines. The introduction of entropy into the vocabulary of physics is attributed to Clausius. In reality, Clausius did not define entropy, but rather only changes in entropy. Clausius’s definition, together with the Third Law of Thermodynamics, led to the calculation of “absolute values” of the entropy of many substances.
The second definition is attributed to Boltzmann. This definition is sometimes referred to as the microscopic definition of entropy. It relates the entropy of a system to the total number of accessible micro-states of a thermodynamic system characterized macroscopically by the total energy E, volume V, and total number of particles N [for a multi-component system, N may be reinterpreted as the vector $\mathbf{N} = (N_1, N_2, \ldots)$, where $N_i$ is the number of atoms of type i]. The extension of Boltzmann's definition to systems characterized by independent variables other than (E, V, N) is attributed to Gibbs.
The Boltzmann definition seems to be completely unrelated to the Clausius definition. However, it is found that, for all processes for which entropy changes can be calculated by using Boltzmann's definition, the results agree with entropy changes calculated using Clausius's definition. Although there is no formal proof that Boltzmann's entropy is equal to the thermodynamic entropy, as defined by Clausius, it is widely believed that this is true.
We also note that calculations of entropy changes based on the SMI definition agree with those based on Clausius's, as well as on Boltzmann's, definition. Unlike Boltzmann's definition, however, the SMI definition does not rely on calculations of the number of accessible states of the system. It also provides directly the entropy function of an ideal gas and, by extension, also the entropy function for a system of interacting particles.
In each of the following sections, we shall examine the question of the "range of applicability" and of the possible time-dependence of entropy. As we shall see, the independence of entropy from time is revealed most clearly by the SMI-based definition of entropy. At this point it is appropriate to quote Einstein on Thermodynamics:
“It is the only physical theory of universal content, which I am convinced, that within the framework of applicability of its basic concepts will never be overthrown.”
I fully agree that thermodynamics will never be overthrown. This quotation features in numerous popular science books. It is ironic that most of the authors who quote Einstein on thermodynamics emphasize the word "overthrown," but ignore Einstein's words "framework of applicability." These are exactly the words that many authors overlook when writing about entropy and the Second Law. They apply the concept of entropy and the Second Law in realms where these concepts do not apply.
In this article, I will discuss the misconstrued association of entropy with time. As we shall see, the cause of this association of entropy with time is that thermodynamics was not used "within the framework of its applicability."
1.1. Clausius’s “Definition” of Entropy
I enclose the word "definition" in inverted commas because Clausius did not really define entropy. Instead, he defined a small change in entropy for one very specific process. It should be added that, even before a proper definition of entropy was posited, entropy was confirmed as a state function, which means that whenever the macroscopic state of a system is specified, its entropy is determined [20,21,22,23,24,25,26,27,28,29].
Clausius's definition, together with the Third Law of Thermodynamics, led to the calculation of "absolute values" of the entropy of many substances.
Clausius started from one particular process: the spontaneous flow of heat from a hot to a cold body. Based on this specific process, Clausius defined a new quantity which he called Entropy. Let $dQ$ be a small quantity of heat flowing into a system at a given temperature $T$. The change in entropy is defined as:

$$dS = \frac{dQ}{T} \quad (1)$$

The letter $d$ stands for a very small quantity. $T$ is the absolute temperature, $Q$ has the units of energy, and $T$ has the units of temperature. Therefore, the entropy change has the units of energy divided by units of temperature. Sometimes you might find the subscript "rev" in the Clausius definition, which means that Equation (1) is valid only for a "reversible" process. This is unnecessary. In fact, the additional requirement that the process be "reversible" may even be confusing, since the term "reversible" (or "irreversible") has many quite different meanings.
The quantity of heat, $dQ$, must be very small, such that when it is transferred into, or out of, the system, the temperature $T$ does not change. If $Q$ is a finite quantity of heat, and one transfers it to a system which is initially at a given $T$, the temperature of the system might change, and therefore the change in entropy will depend on both the initial and the final temperature of the system. Note that this equation does not define entropy but only changes in entropy for a particular process, i.e., a small exchange of heat ($dQ > 0$ means heat flows into the system, $dQ < 0$ means heat flows out of the system; correspondingly, $dS$ will be positive or negative when heat flows into or out of the system, respectively). There are many processes which do not involve heat transfer; yet, from Clausius's definition and the postulate that the entropy is a state function, one could devise a path, leading from one state to another, for which the entropy change can be calculated. For some numerical examples, see References [24,26].
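As a concrete illustration of such a devised path (a sketch of ours, not one of the cited numerical examples), consider the free expansion of an ideal gas from $V_1$ to $V_2$ at constant temperature: no heat flows in the actual process, but along a reversible isothermal path connecting the same two states, Clausius's definition gives $\Delta S = nR\ln(V_2/V_1)$.

```python
import math

n = 1.0            # amount of gas, mol (illustrative value)
R = 8.314          # gas constant, J/(mol K)
V1, V2 = 1.0, 2.0  # initial and final volumes (any consistent units)

# Along a reversible isothermal path the gas absorbs Q_rev = n*R*T*ln(V2/V1),
# so the Clausius definition gives Delta S = Q_rev / T = n*R*ln(V2/V1),
# independent of T. The same Delta S is assigned to the free expansion,
# because entropy is a state function.
delta_S = n * R * math.log(V2 / V1)
print(f"Delta S = {delta_S:.3f} J/K")   # about 5.76 J/K
```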
It should be remembered that during the time of Clausius the meaning of entropy was not clear. It was a well-defined quantity, and one could calculate changes in entropy for many processes without bothering with the meaning of entropy. In fact, there are many scientists who use the concept of entropy successfully and who do not care about its meaning, or whether it has a meaning at all.
Entropy is a useful quantity in chemical thermodynamics and engineering regardless of what it means on a molecular level.
To summarize, Clausius’s definition in Equation (1) requires the system to be at thermal equilibrium. However, in order to calculate finite changes in entropy one must carry out the process through a very large number of small steps, while the system is maintained at equilibrium.
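To illustrate what "a very large number of small steps" means in practice, the following sketch (ours, with an assumed constant heat capacity) sums dQ/T over many small quasi-static heating steps and compares the result with the closed-form integral $C\ln(T_2/T_1)$.

```python
import math

C = 75.3               # assumed constant heat capacity, J/K (about one mole of liquid water)
T1, T2 = 300.0, 350.0  # initial and final temperatures, K
steps = 100_000        # number of small quasi-static steps

dT = (T2 - T1) / steps
delta_S = 0.0
T = T1
for _ in range(steps):
    dQ = C * dT                      # small quantity of heat added in this step
    delta_S += dQ / (T + 0.5 * dT)   # Clausius: dS = dQ/T, with T essentially constant per step
    T += dT

print(f"Sum of dQ/T over {steps} steps: {delta_S:.5f} J/K")
print(f"Closed form C*ln(T2/T1):        {C * math.log(T2 / T1):.5f} J/K")
```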
1.2. Boltzmann’s Definition Based on the Total Number of Micro-States
Towards the end of the 19th century the majority of scientists believed that matter consists of small units called atoms and molecules. A few persistently rejected that idea, arguing that there was no proof of the existence of atoms and molecules; no one had ever seen an atom.
On the other hand, the so-called kinetic theory of heat, which was based on the assumption of the existence of atoms and molecules, had scored a few impressive gains. First, the pressure of a gas was successfully explained as arising from the molecules bombarding the walls of the container. Then came the interpretation of temperature in terms of the kinetic energy of the molecules, a remarkable achievement that supported and lent additional evidence to the atomic constitution of matter. While the kinetic theory of heat was successful in explaining the concepts of pressure, temperature and heat, entropy was left lagging behind.
Boltzmann defined the entropy in terms of the total number of accessible micro-states of a system consisting of a huge number of particles, characterized by the macroscopic parameters of energy E, volume V and number of particles N.
What are these “number of micro-states,” and how are they related to entropy?
Consider a gas consisting of N simple particles in a volume V; each particle's micro-state may be described by its location vector $\mathbf{R}_i$ and its velocity vector $\mathbf{v}_i$. By simple particles we mean particles having no internal degrees of freedom. Atoms, such as argon, neon and the like, are considered simple. They all have internal degrees of freedom, but these are assumed to be unchanged in all the processes we discuss here. Assuming that the gas is very dilute, so that interactions between the particles can be neglected, all the energy of the system is simply the sum of the kinetic energies of all the particles.
Imagine that you have microscopic eyes, and you could see the particles rushing incessantly, colliding with each other, and with the walls, from time to time. Clearly, there are infinitely many configurations, or arrangements, or micro-states of the particles which are consistent with the requirements that the total energy is constant, and that they are all contained within the box of volume V. Each particle is specified by its location $\mathbf{R}_i$ and its velocity $\mathbf{v}_i$. Thus, in classical mechanics, a description of all the locations and all the velocities of the particles constitutes a micro-state of the system. In quantum mechanics one usually defines W as the total number of quantum mechanical solutions of the Schrödinger equation for a system described by its energy E, its volume V and the total number of particles N. We use the shorthand notation (E, V, N) to describe the macro-state of the system.
Without getting bogged down with the question of how to estimate the total number of arrangements, it is clear that this is a huge number, far "huger" than the number of particles, which is of the order of $10^{23}$. Boltzmann postulated the relationship which is now known as the Boltzmann entropy:

$$S = k_B \log W \quad (2)$$

where $k_B$ is a constant, now known as the Boltzmann constant ($k_B = 1.380 \times 10^{-23}$ J/K), and $W$ is the number of accessible micro-states of the system. Here, log is the natural logarithm. At first glance, Boltzmann's entropy seems to be completely different from Clausius's entropy. Nevertheless, in all cases for which one can calculate changes of entropy, one obtains agreement between the values calculated by the two methods.
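As a minimal numerical sketch of Equation (2) (ours, with illustrative numbers), consider N independent two-state units, for which $W = 2^N$; the Boltzmann entropy then grows linearly with N, as an extensive quantity should.

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_entropy_two_state(N):
    """S = k_B * log(W) with W = 2**N accessible micro-states.
    We use log(2**N) = N*log(2) to avoid forming the astronomically large W."""
    return k_B * N * math.log(2.0)

for N in (10, 1000, 10**23):
    print(f"N = {N}:  S = {boltzmann_entropy_two_state(N):.3e} J/K")
```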
Boltzmann's entropy was not easy to swallow, not only by those who did not accept the atomic theory of matter, but also by those who accepted it. The criticism was not focused so much on the definition of entropy, but rather on the formulation of the Second Law of Thermodynamics. Boltzmann explained the Second Law as a probabilistic law. In Boltzmann's words [28,29,30]:
“… a system…when left to itself, it rapidly proceeds to disordered, most probable state.”
"Most probable state." This statement was initially shocking to many physicists. Probability was totally foreign to physical reasoning. Physics was built on the foundation of deterministic and absolute laws, with no provisions for exceptions. The macroscopic formulation of the Second Law was absolute; no one had ever observed a single violation of the Second Law. Boltzmann, on the other hand, insisted that the Second Law is only statistical: entropy increases most of the time, not all the time. A decrease in entropy is not an impossibility, but only highly improbable.
There is another quantity which is sometimes also referred to as Boltzmann’s entropy. This is the H-function. We shall discuss this quantity in the next section.
Boltzmann’s entropy, as defined in Equation (2), has raised considerable confusion regarding the question of whether entropy is, or isn’t a subjective quantity.
One example of this confusion which features in many popular science books is the following: Entropy is assumed to be related to our “knowledge” of the state of the system. If we “know” that the system is at some specific state, then the entropy is zero. Thus, it seems that the entropy is dependent on whether one knows or does not know in which state (or states) the system is.
This confusion arose from a misunderstanding of W, which is the total number of accessible micro-states of the system. If $W = 1$, then the entropy of the system is indeed zero (as it is for many substances at 0 K). However, if there are W states and we know in which state the system is, the entropy is still $k_B \log W$ and not zero! This kind of argument led some authors to reject the "informational interpretation" of entropy. For details and examples, see References [24,26].
Although it was not explicitly stated in the definition, the Boltzmann entropy applies to an equilibrium state. In statistical mechanics this entropy applies to the so-called micro-canonical ensemble, i.e., to systems having a fixed energy E, volume V, and number of particles N (N could be a vector comprising the numbers of particles of each species, $N_i$). It is also clear that the entropy of an ideal gas applies to a well-defined system at equilibrium.
In general, the Boltzmann entropy does not provide an explicit entropy function. However, for some specific systems one can derive an entropy function based on Boltzmann's definition. The most famous case is the entropy of an ideal gas, for which one can derive an explicit entropy function. This function was derived by Sackur and by Tetrode in 1912 [31,32], by using the Boltzmann definition of entropy. This function was later re-derived based on Shannon's measure of information. We shall discuss this in Section 2.
1.3. ABN’s Definition of Entropy Based on Shannon’s Measure of Information
In this section we first present the so-called Shannon Measure of Information (SMI). Then we present an outline of the steps leading from the SMI to the entropy. This is a relatively recent definition of entropy. It was originally used as an interpretation of entropy [22,24], but was later turned into a definition of entropy. This definition is superior to both the Clausius and the Boltzmann definitions. Unlike the Clausius definition, which provides only a definition of changes in entropy, the present one provides the entropy function itself. Unlike Boltzmann's definition, which is strictly valid for isolated systems and does not provide a simple intuitive interpretation, the present one is more general and provides a clear, simple and intuitive interpretation of entropy. It is more general in the sense that it relates the entropy to probability distributions, rather than to the number of micro-states. One final "bonus" is afforded by this definition of entropy: it not only removes any trace of mystery associated with entropy, but also expunges the so-called irreversibility paradox.
In this section we shall not discuss the Shannon measure of information (SMI) in any detail. For details, see References [24,26]. We shall only quote the definition of the SMI, then outline the procedure for obtaining the entropy from the SMI.
We shall see that the entropy is, up to a multiplicative constant, nothing but a particular case of the SMI. For more details, see References [22,24].
In 1948, Shannon published a landmark article titled "A Mathematical Theory of Communication" [33].
Here is how Shannon introduced the measure of information. In Section 6 of the article titled: “Choice, Uncertainty and Entropy,” we find:
Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much "choice" is involved in the selection of the event or how uncertain we are of the outcome?
If there is such a measure, say, $H(p_1, p_2, \ldots, p_n)$, it is reasonable to require of it the following properties:
H should be continuous in the $p_i$.
If all the $p_i$ are equal, $p_i = 1/n$, then H should be a monotonic increasing function of n. With equally likely events there is more choice, or uncertainty, when there are more possible events.
If a choice be broken down into two successive choices, the original H should be the weighted sum of the individual values of H.
Then Shannon proved the theorem:
The only H satisfying the three assumptions above has the form:

$$H = -K\sum_{i=1}^{n} p_i \log p_i \quad (3)$$

where $K$ is a positive constant.
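As a worked illustration of the third property listed above (our own added example), split the choice among the probabilities 1/2, 1/3, 1/6 into a first choice between two equally likely alternatives, followed, half of the time, by a second choice with probabilities 2/3 and 1/3:

$$H\left(\tfrac{1}{2},\tfrac{1}{3},\tfrac{1}{6}\right) = H\left(\tfrac{1}{2},\tfrac{1}{2}\right) + \tfrac{1}{2}\,H\left(\tfrac{2}{3},\tfrac{1}{3}\right)$$

With logarithms to the base 2, both sides equal approximately 1.46 bits, as is easily verified from Equation (3) with K = 1.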
Shannon did not seek a measure of the general concept of information, only a measure of the information contained in, or associated with, a probability distribution. This is an important point that one should remember whenever using the term "information," either as a measurable quantity or in connection with the Second Law of Thermodynamics.
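A short numerical sketch (ours, for illustration) of Equation (3), taking K = 1 and logarithms to the base 2: the uniform distribution gives the largest SMI, a sharply peaked distribution gives a small one, and a certain outcome gives zero.

```python
import math

def shannon_H(probs):
    """Shannon measure of information, H = -sum p_i * log2(p_i), in bits (K = 1)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_H([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 events: 2.0 bits
print(shannon_H([0.97, 0.01, 0.01, 0.01]))  # nearly certain outcome: ~0.24 bits
print(shannon_H([1.0, 0.0, 0.0, 0.0]))      # certain outcome: 0 bits
```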
We are ready to define the concept of entropy as a special case of SMI.
In this section we use, for convenience, the natural logarithm $\log_e x$, or $\ln x$. Whenever we want to convert to the SMI, which is defined with the logarithm to the base 2, we need to multiply by $\log_2 e$, i.e., $\log_2 x = (\log_2 e)\ln x$. The overall plan of obtaining the entropy of an ideal gas from the SMI consists of four steps:
First, we calculate the locational SMI associated with the equilibrium distribution of locations of all the particles in the system.
Second, we calculate the velocity SMI associated with the equilibrium distribution of velocities (or momenta) of all the particles.
Third, we add a correction term due to the quantum mechanical uncertainty principle.
Fourth, we add a correction term due to the fact that the particles are indistinguishable.
First step: The locational SMI of a particle in a 1D box of length L
Suppose we have a particle confined to a one-dimensional (1D) "box" of length L. Since there are infinitely many points at which the particle can be within the interval (0, L), the corresponding locational SMI must be infinite. However, we can define, as Shannon did, the following quantity by analogy with the discrete case:

$$H[f] = -\int_0^L f(x)\log f(x)\,dx \quad (4)$$

This quantity might either converge or diverge, but in any case, in practice we shall use only differences between such quantities. It is easy to calculate the density distribution which maximizes the locational SMI, $H[f]$, in Equation (4). The result is:

$$f_{eq}(x) = \frac{1}{L} \quad (5)$$
This is a uniform distribution. The use of the subscript eq (for equilibrium) will be clarified later. The corresponding SMI, calculated by substituting Equation (5) in Equation (4), is:

$$H(\text{location}) = \log L \quad (6)$$
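For completeness, here is a one-line sketch (ours, not in the original text) of why the uniform density maximizes Equation (4): maximizing $-\int_0^L f\log f\,dx$ subject to the normalization $\int_0^L f\,dx = 1$, with a Lagrange multiplier $\lambda$, requires

$$-\log f(x) - 1 + \lambda = 0,$$

so $f(x)$ must be a constant, and normalization fixes it at $f_{eq}(x) = 1/L$, in agreement with Equation (5).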
We now acknowledge that the location of the particle cannot be determined with absolute accuracy, i.e., there exists a small interval $h_x$ within which we do not care where the particle is. Therefore, we must correct Equation (6) by subtracting $\log h_x$. Thus, we write instead of Equation (6):

$$H(X) = \log L - \log h_x = \log\frac{L}{h_x} \quad (7)$$

We recognize that in Equation (7) we effectively defined the SMI for a finite number of intervals $n = L/h_x$. Note that when $h_x \to 0$, $H(X)$ diverges to infinity. Here, we do not take the mathematical limit, but we stop at an $h_x$ small enough but not zero. Note also that in writing Equation (7) we do not have to specify the units of length, as long as we use the same units for L and $h_x$.
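As a quick numerical check of Equation (7) (a sketch of ours, with arbitrary illustrative values of L and h_x), dividing the interval (0, L) into n = L/h_x equal cells and applying the discrete SMI with the natural logarithm to the uniform distribution over the cells reproduces log(L/h_x).

```python
import math

L = 10.0     # length of the 1D box (arbitrary units)
h_x = 0.01   # smallest interval we care to resolve (same units as L)

n = int(L / h_x)   # number of equal cells
p = 1.0 / n        # uniform probability of each cell, i.e. Equation (5) coarse-grained

H_discrete = -sum(p * math.log(p) for _ in range(n))  # discrete SMI over the n cells

print(H_discrete)          # about 6.9078
print(math.log(L / h_x))   # log(L/h_x) = log(1000), also about 6.9078
```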
Second step: The velocity SMI of a particle in a 1D “box” of length L
The mathematical problem is to calculate the probability distribution that maximizes the continuous SMI, subject to two conditions: a normalization condition and a constant variance. For details, see Reference [34]. The result is the Normal distribution:

$$f_{eq}(v_x) = \frac{\exp\left(-v_x^2/2\sigma^2\right)}{\sqrt{2\pi\sigma^2}} \quad (8)$$
The subscript eq, which stands for equilibrium, will be clarified once we realize that this is the equilibrium distribution of velocities. Applying this result to a classical particle having average kinetic energy $\frac{m\langle v_x^2\rangle}{2} = \frac{k_B T}{2}$, and using the relationship between the standard deviation $\sigma$ and the temperature of the system:

$$\sigma^2 = \frac{k_B T}{m} \quad (9)$$

we get the equilibrium velocity distribution of one particle in a 1D system:

$$f_{eq}(v_x) = \sqrt{\frac{m}{2\pi k_B T}}\,\exp\left(-\frac{m v_x^2}{2 k_B T}\right) \quad (10)$$
where $k_B$ is the Boltzmann constant, $m$ is the mass of the particle, and $T$ the absolute temperature. The value of the continuous SMI for this probability density is:

$$H_{max}(\text{velocity in 1D}) = \frac{1}{2}\log\left(\frac{2\pi e k_B T}{m}\right) \quad (11)$$
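Equation (11) is the standard result for the SMI of a normal distribution; since the intermediate step is omitted above, we spell it out here (our addition). Using Equation (8) and $\sigma^2 = k_B T/m$ from Equation (9):

$$-\int_{-\infty}^{\infty} f_{eq}\log f_{eq}\,dv_x = \int_{-\infty}^{\infty} f_{eq}\left[\frac{v_x^2}{2\sigma^2} + \frac{1}{2}\log\left(2\pi\sigma^2\right)\right]dv_x = \frac{1}{2} + \frac{1}{2}\log\left(2\pi\sigma^2\right) = \frac{1}{2}\log\left(\frac{2\pi e k_B T}{m}\right)$$

where we used $\int f_{eq}\, v_x^2\, dv_x = \sigma^2$.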
Similarly, we can write the momentum distribution in 1D, by transforming from $v_x$ to $p_x = m v_x$, to get:

$$f_{eq}(p_x) = \frac{1}{\sqrt{2\pi m k_B T}}\,\exp\left(-\frac{p_x^2}{2 m k_B T}\right) \quad (12)$$

and the corresponding maximum SMI:

$$H_{max}(\text{momentum in 1D}) = \frac{1}{2}\log\left(2\pi e m k_B T\right) \quad (13)$$
As we have noted in connection with the locational SMI, we again recognize the fact that there is a limit to the accuracy with which we can determine the velocity (or the momentum) of the particle. We therefore correct the expression in Equation (13) by subtracting $\log h_p$, where $h_p$ is a small, but finite, interval:

$$H(p_x) = \frac{1}{2}\log\left(2\pi e m k_B T\right) - \log h_p \quad (14)$$

Note, again, that if we choose the units of $h_p$ (units of momentum) to be the same as those of $\sqrt{m k_B T}$, then the whole expression under the logarithm will be a pure number.
Third step: Combining the SMI for the location and momentum of one particle; introducing the uncertainty principle
In the previous two subsections, we derived the expressions for the locational and the momentum SMI of one particle in a 1D system. We now combine the two results. Assuming that the location and the momentum (or velocity) of the particle are independent events, we write:

$$H_{max}(\text{location and momentum}) = H_{max}(\text{location}) + H_{max}(\text{momentum}) = \log\left[\frac{L\sqrt{2\pi e m k_B T}}{h_x h_p}\right] \quad (15)$$

Recall that $h_x$ and $h_p$ were chosen to eliminate the divergence of the SMI. In writing Equation (15) we assumed that the location and the momentum of the particle are independent. However, quantum mechanics imposes restrictions on the accuracy of determining both the location $x$ and the corresponding momentum $p_x$. We must acknowledge that nature imposes on us a limit on the accuracy with which we can determine simultaneously the location and the corresponding momentum. Thus, in Equation (15), $h_x$ and $h_p$ cannot both be arbitrarily small; their product must be of the order of the Planck constant, $h \approx 6.626 \times 10^{-34}$ J s. Thus, we set:

$$h_x h_p \approx h$$

and instead of Equation (15), we write:

$$H_{max}(\text{location and momentum}) = \log\left[\frac{L\sqrt{2\pi e m k_B T}}{h}\right] \quad (16)$$
The SMI of a particle in a box of volume V
We consider again one simple particle in a cubic box of volume V. We assume that the locations of the particle along the three axes x, y and z are independent. Therefore, we can write the SMI of the location of the particle in a cube of edge L, and volume V, as:

$$H(\text{location in 3D}) = 3\,H_{max}(\text{location in 1D}) \quad (17)$$
Similarly, for the momentum of the particle, we assume that the momenta (or the velocities) along the three axes x, y and z are independent. Hence, we write:

$$H(\text{momentum in 3D}) = 3\,H_{max}(\text{momentum in 1D}) \quad (18)$$
We combine the SMI of the locations and momenta of one particle in a box of volume V, taking into account the uncertainty principle. The result is:

$$H_{max}(\text{one particle in 3D}) = 3\log\left[\frac{L\sqrt{2\pi e m k_B T}}{h}\right] = \log\left[V\left(\frac{2\pi e m k_B T}{h^2}\right)^{3/2}\right] \quad (19)$$
Fourth step: The SMI of locations and momenta of N independent and indistinguishable particles in a box of volume V
The next step is to proceed from one particle in a box to N independent particles in a box of volume V. Given the location $\mathbf{R}$ and the momentum $\mathbf{p}$ of one particle within the box, we say that we know the micro-state of the particle. If there are N particles in the box, and if their micro-states are independent, we can write the SMI of N such particles simply as N times the SMI of one particle, i.e.:

$$H(N \text{ particles}) = N \cdot H(\text{one particle}) \quad (20)$$
This equation would have been correct if the micro-states of all the particles were independent. In reality, there are always correlations between the micro-states of the particles: one is due to the indistinguishability of the particles, the second is due to intermolecular interactions between the particles. We shall mention here only the first correction, due to the indistinguishability of the particles. For indistinguishable particles we must correct Equation (20) and write instead:

$$H(N \text{ indistinguishable particles}) = N \cdot H(\text{one particle}) - \log N! \quad (21)$$

Using the Stirling approximation for $\log N!$ (note again that we use here the natural logarithm) in the form:

$$\log N! \approx N\log N - N \quad (22)$$
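The quality of the approximation in Equation (22) is easy to check numerically; the following sketch (ours) compares $N\log N - N$ with $\log N!$ computed via the log-gamma function. The relative error shrinks rapidly with N, and for N of the order of $10^{23}$ it is utterly negligible.

```python
import math

for N in (10, 100, 10_000, 1_000_000):
    exact = math.lgamma(N + 1)        # ln(N!) computed without forming N! itself
    stirling = N * math.log(N) - N    # Stirling approximation, Equation (22)
    rel_err = abs(exact - stirling) / exact
    print(f"N = {N:>9}: ln N! = {exact:.4e}, N ln N - N = {stirling:.4e}, rel. error = {rel_err:.2e}")
```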
We then have the final result for the SMI of N indistinguishable particles in a box of volume V, at temperature T:

$$H(N, V, T) = N\log\left[\frac{V}{N}\left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}\right] + \frac{5}{2}N \quad (23)$$
This is a remarkable result. By multiplying the SMI of N particles in a box of volume V, at temperature T, by a constant factor ($k_B$, if we use the natural logarithm, or $k_B\log_e 2$ if the log is to the base 2), one gets the entropy, the thermodynamic entropy of an ideal gas of simple particles. This equation was derived by Sackur and by Tetrode in 1912, by using the Boltzmann definition of entropy. Here, we have derived the entropy function of an ideal gas from the SMI.
One can convert this expression into the entropy function $S(E, V, N)$, by using the relationship between the temperature of the system and the total kinetic energy of all the particles:

$$E = \frac{3}{2}N k_B T \quad (24)$$

The explicit entropy function of an ideal gas is obtained from Equations (23) and (24):

$$S(E, V, N) = N k_B \log\left[\frac{V}{N}\left(\frac{4\pi m E}{3 N h^2}\right)^{3/2}\right] + \frac{5}{2}N k_B \quad (25)$$
We can use this equation as a definition of the entropy of an ideal gas of simple particles characterized by constant energy, volume and number of particles. Note that when we combine all the terms under the logarithm sign, we must get a dimensionless quantity. It should be noted that this entropy function is a monotonically increasing function of E, V and N, and that its curvature is negative with respect to each of these variables. This is an important property of the entropy function, which we will not discuss here. For more details, see References [22,24].
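As a closing numerical sketch (ours; an argon-like atomic mass and ambient conditions are assumed purely for illustration), Equation (25), or equivalently $k_B$ times Equation (23), gives a molar entropy of about 155 J/(mol K) at 298 K and 1 bar, which is indeed close to the tabulated standard molar entropy of argon.

```python
import math

# Physical constants
k_B = 1.380649e-23      # Boltzmann constant, J/K
h   = 6.62607015e-34    # Planck constant, J s
N_A = 6.02214076e23     # Avogadro constant, 1/mol

# Assumed illustrative conditions: one mole of an argon-like monatomic ideal gas
m = 39.948e-3 / N_A     # mass of one atom, kg
T = 298.15              # temperature, K
P = 1.0e5               # pressure, Pa (1 bar)
N = N_A                 # number of particles (one mole)
V = N * k_B * T / P     # volume from the ideal-gas law, m^3

# Sackur-Tetrode entropy: S = N k_B { ln[(V/N)(2 pi m k_B T / h^2)^(3/2)] + 5/2 }
S = N * k_B * (math.log((V / N) * (2 * math.pi * m * k_B * T / h**2) ** 1.5) + 2.5)
print(f"Molar entropy of an argon-like ideal gas: {S:.1f} J/(mol K)")  # about 154.8
```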
To summarize, in the procedure of obtaining the entropy we started with the SMI associated with the locations and momenta of the particles. We calculated the distribution of the locations and momenta that maximizes the SMI. We referred to this distribution as the equilibrium distribution. The reason is that we know that, for ideal gases, the distribution that maximizes the SMI is the same as the distribution at equilibrium. This is actually the experimental distribution of locations and momenta at equilibrium. Let us denote this distribution of the locations and momenta of all the particles by $f_{eq}$.
Next, we use the equilibrium distribution to calculate the SMI of a system of N particles in a volume V, and at temperature T. This SMI is, up to a multiplicative constant, identical to the entropy of an ideal gas at equilibrium. This also justifies referring to the distribution which maximizes the SMI as the equilibrium distribution.
It should be noted that in the derivation of the entropy we used the SMI twice: first, in calculating the distribution that maximizes the SMI, then in evaluating the maximum SMI corresponding to this distribution. The distinction between the concepts of SMI and entropy is absolutely essential. Referring to the SMI (as many do) as entropy inevitably leads to awkward statements such as: the maximum value of the entropy (meaning the SMI) is the entropy (meaning the thermodynamic entropy). The correct statement is that the SMI associated with locations and momenta is defined for any system, small or large, at equilibrium or far from equilibrium. This SMI, not the entropy, evolves into a maximum value when the system reaches equilibrium. At that state, the SMI becomes proportional to the entropy of the system. The entropy obtained in this procedure is referred to as the Shannon-based, or the ABN, definition of entropy.
Since the entropy is, up to a constant, a special case of the SMI, it follows that whatever interpretation one accepts for the SMI will automatically apply to the concept of entropy. The most important conclusion of this definition is that entropy, being a state function, is not a function of time. Entropy does not change with time, and entropy does not have a tendency to increase. It is very common to say that entropy increases towards its maximum at equilibrium. This is wrong. The correct statement is: the entropy is the maximum! As such, it is not a function of time.
It is now crystal clear that none of the definitions of entropy hints at any possible time dependence of the entropy. This aspect of entropy stands out most clearly in the third definition.
It is also clear that all the equivalent definitions render entropy a state function; meaning that entropy has a unique value for a well-defined thermodynamic system.
Finally, we add a comment on the concept of equilibrium. We understand that there exist states in which the thermodynamic parameters, say temperature, pressure, or density do not change with time. These states are called equilibrium states.
Unfortunately, there is no general definition of equilibrium which applies to all systems. Callen [35,36] introduced the existence of the equilibrium state as a postulate. He also emphasized that any definition of an equilibrium state is necessarily circular.
In practice, we find many systems in which the parameters describing the system seem to be unchanged with time. Yet, they are not equilibrium states. For all our purposes in this article, however, we can assume that every well-defined system, say one having a fixed energy E, volume V, and number of particles N, will tend to an equilibrium state. At this state, the entropy of the system is well defined, and many other thermodynamic relationships are applicable.