1. Introduction
The principle of maximum entropy provides an estimate of the underlying probability distribution of observed data that best corresponds to the currently available information about the system [1,2]. It has been applied to fields as diverse as physics [3,4], biology [5], ecology [4,6], and natural language [7]. The philosophy behind the maximum entropy inference approach is to explain and predict experimental observations by making the fewest assumptions (i.e., constraints) while assuming no explicit underlying mechanisms.
One of the principal and most demanding challenges in applying the principle of maximum entropy to a given system is to identify the relevant constraints that should be imposed on the system [8]. The authors of [9] have suggested that, in situations where the experiments are repeatable, the expected value of the entropy of the likelihood function is relevant information that should be considered a constraint; however, for a given system, its value is largely unknown. Solving the corresponding Lagrange problem leads to the so-called entropic probability distribution [10,11]. Entropic distributions have been exploited mainly within the context of data classification and theoretical physics [12], yet the consequences of such an approach are not fully explored in biology and the life sciences. In the context of biology, owing to the rigid structure of DNA, most experiments must be repeatable.
In the present paper, we adopt the approach of [9] and apply it to a complex multicellular biological system, namely, cone photoreceptor cells in the retina; for an earlier attempt in this direction, see [13]. Cone cells are wavelength-sensitive receptors in the retinas of vertebrate eyes, and their different sensitivities and responses to light of different wavelengths mediate color vision. The spatial distribution of these cells, the so-called cone mosaic [14], varies among species and, in each case, may reflect the evolutionary pressures that gave rise to various adaptations to the lifestyle of a particular species and its specific visual needs. However, in most cases, the adaptive value of a particular cone mosaic is unknown [15]. From the perspective of gene regulatory mechanisms, the most fundamental questions, such as which mechanisms control the mostly random distribution of cone subtypes in the human retina and which migration mechanisms determine the highly regular and ordered patterns of cone subtypes in the zebrafish retina, remain unanswered [16].
In the current work, we show that various forms of distributions of cone cells are controlled by entropy, and we predict the frequency of the appearance of cones in the retina. To this end, we employ the principle of maximum entropy without invoking any specific biological mechanisms or driving forces. In a nutshell, we look for a configuration of sensory cells that maximizes entropy while the expected value of the entropy of the likelihood, which codifies information about the local environment of cells, is imposed as a constraint. One outcome of this approach is that a configuration with a lower entropy has a higher probability of occurrence (i.e., frequency of appearance). This approach enables us to identify a conserved retinal factor, which we call retinal temperature or coldness, in divergent species of rodent, dog, monkey, human, fish, and bird. To our knowledge, this is the first model capable of predicting the probability of the occurrence of cone cells in various species' eyes by tuning a single parameter. For earlier entropic approaches to the study of neuronal mosaics, see [17,18].
The virial equation of state for two-dimensional cellular networks, known as Lemaître’s law [19,20], relates the fraction of hexagons in a given network to the width of the polygon distribution. Here, we demonstrate how, by assuming additional information concerning the topology of the network in the entropy maximization procedure, we can obtain this universal law.
The idea that the organization of biological systems stems from an underlying optimization problem goes back to D’Arcy Thompson, who, in his seminal work On Growth and Form [21], argued for energy minimization, which leads, for example, to the prediction of cellular packing geometries in two-dimensional (2D) networks [22]. The geometric properties, obtained from knowledge of the physical properties of epithelial cells, can be considered factors that control the development and function of a living organism [21]. To Thompson, reducing seemingly different phenomena to a simple governing principle was a manifestation of the universality of form [23]. In essence, here we replace energy minimization with entropy maximization, with the advantage of ignoring the involved forces and physical interactions, which incidentally implies a mathematical (entropic) restriction on the evolution of biological forms.
This paper is organized as follows. In Section 2, we review the problem of entropy maximization as applied in this paper. We study the spatial distributions of cone cells in the retinas of various vertebrates in Section 3 and demonstrate the predictive power of our approach apart from its explanatory nature. In Section 4, we derive Lemaître’s law and examine it in several artificially generated cellular networks and cone mosaics. We summarize and conclude this paper in Section 5.
2. Entropy Maximization
In statistical mechanics, to obtain the Boltzmann distribution from the principle of maximum entropy, one has to assume a constraint on the mean energy value since, in the context of physics, the expected value of energy is crucial information about the system. This approach leads to a formalism in which thermodynamic temperature emerges as a free parameter that should be determined later from experiment [24]. In a general setting, the challenge is to identify the relevant constraints that should be imposed on the system. A. Caticha and R. Preuss in [9] have considered a set of data generated by some experiment, where the only requirement is that the experiment be repeatable. If, for example, the experiment is performed twice, with the corresponding outcomes $x_1$ and $x_2$, then, in the case that we discard the value of $x_2$, the resulting situation should be indistinguishable from having performed the experiment only once. They have argued that a constraint on the expected value of the entropy of the likelihood codifies this information. Inspired by this idea, and since biological experiments must be repeatable because of the robustness of DNA, we adopt this specific approach to entropy maximization and apply it to multicellular biological systems; see also [13] and the references cited therein. Generally, an experiment does not need to be repeatable; for instance, this may be the case at the atomic scale [25]. For non-repeatable experiments, the Einstein fluctuation formula is applicable [9,26]. Note that, in cellular biology, each cell is composed of a large number of atoms, and the experiments are robust and repeatable.
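For comparison, the textbook maximum-entropy route to the Boltzmann distribution, which the construction below parallels, can be sketched as follows (generic notation, stated only to fix the analogy):

```latex
% Maximum entropy subject to normalization and a mean-energy constraint
% (standard textbook derivation; the symbols are generic):
\mathcal{L}_{\mathrm{B}} = -\sum_i p_i \ln p_i
  + \mu\Big(\sum_i p_i - 1\Big) - \beta\Big(\sum_i p_i E_i - \langle E\rangle\Big),
\qquad
\frac{\partial \mathcal{L}_{\mathrm{B}}}{\partial p_i} = 0
\;\Longrightarrow\;
p_i = \frac{e^{-\beta E_i}}{\sum_j e^{-\beta E_j}} .
% The multiplier beta remains free and is fixed by matching <E> to the
% experimental value, which is the sense in which temperature emerges
% as a free parameter.
```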
We denote the sensory neurons by S and their local environment, which consists of other cells, by Y. We assume the following information about the system:

$$\sum_{s\in\mathcal{S}} P(s) = 1, \qquad (1)$$

$$\sum_{s\in\mathcal{S}} P(s)\, S_Y(s) = \sigma, \qquad (2)$$

where $\mathcal{S}$ denotes the support set of S. Equation (1) is a normalization condition on the probability mass function (in this paper, the frequency of the appearance) of the neurons, Equation (2) assumes knowledge of the numerical value $\sigma$ of $\langle S_Y\rangle$, and $S_Y(s) = -\sum_{y\in\mathcal{Y}} P(y|s)\,\ln P(y|s)$, where $\mathcal{Y}$ denotes the support set of Y, is the entropy of the likelihood, defined in terms of the probabilities $P(y|s)$. By the method of Lagrange multipliers, we maximize the Shannon entropy of the neurons, $H(S) = -\sum_{s\in\mathcal{S}} P(s)\,\ln P(s)$, while taking (1) and (2) into account. The corresponding Lagrangian reads

$$\mathcal{L} = H(S) + \mu\Big(\sum_{s\in\mathcal{S}} P(s) - 1\Big) - \beta\Big(\sum_{s\in\mathcal{S}} P(s)\, S_Y(s) - \sigma\Big), \qquad (3)$$
where $\mu$ and $\beta$ are Lagrange multipliers. By solving $\partial\mathcal{L}/\partial P(s) = 0$, we obtain the so-called entropic probability [9,10,11]:

$$P(s) = \frac{e^{-\beta S_Y(s)}}{Z}, \qquad (4)$$

where $Z = \sum_{s\in\mathcal{S}} e^{-\beta S_Y(s)}$. Assuming $\beta > 0$, Equation (4) implies that neurons with a lower entropy $S_Y$ have a higher probability, or frequency of appearance, which is confirmed in the case of cone photoreceptors in Section 3. The probability distribution in (4) is the most likely and the least-biased one, where the only assumed knowledge about the system is the repeatable nature of the experiments. Other available information about the system can be incorporated as additional constraints in (3); an example of such a scenario is given in Section 4.
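A minimal numerical sketch of Equation (4) follows; the conditional probabilities and the value of $\beta$ are hypothetical placeholders standing in for measured local-environment statistics:

```python
import numpy as np

def likelihood_entropy(p_y_given_s):
    """Entropy of the likelihood, S_Y(s) = -sum_y P(y|s) ln P(y|s)."""
    p = np.asarray(p_y_given_s, dtype=float)
    p = p[p > 0]                    # convention: 0 ln 0 = 0
    return -np.sum(p * np.log(p))

def entropic_distribution(likelihoods, beta):
    """Entropic probabilities P(s) = exp(-beta * S_Y(s)) / Z, as in Equation (4)."""
    s_y = np.array([likelihood_entropy(row) for row in likelihoods])
    w = np.exp(-beta * s_y)
    return s_y, w / w.sum()

# Hypothetical local-environment statistics for three neuron (cone) types.
likelihoods = [
    [0.70, 0.20, 0.10],   # relatively ordered environment -> low S_Y
    [0.50, 0.30, 0.20],
    [0.34, 0.33, 0.33],   # nearly uniform environment -> high S_Y
]
s_y, p = entropic_distribution(likelihoods, beta=2.0)
print(s_y)  # likelihood entropies
print(p)    # for beta > 0, lower S_Y receives a higher frequency of appearance
```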
A couple of remarks are in order. The application of the principle of maximum entropy strongly depends on how we specify the system configuration, which by itself depends on the nature of the problem at hand. Different ways of describing the configuration of the same system may lead to different outcomes; for a detailed discussion of this issue, see [27]. The second remark deals with (2). Although we have assumed knowledge of $\langle S_Y\rangle$, in most cases we do not know its value; rather, it is a quantity whose value $\sigma$ should be known, and thus we have formulated our problem as if we had this information. For a detailed discussion of this matter, see [9]. By calculating the free parameter $\beta$ from the experimental data, one can infer the value of $\sigma$. In analogy with statistical mechanics, where thermodynamic temperature emerges as the inverse of the Lagrange multiplier in the derivation of the Boltzmann distribution, we interpret $\beta$ as the biological coldness (the reciprocal of temperature) of the neurons. As in thermodynamics, where energy is an extensive quantity, here entropy is also extensive. Note that thermodynamic temperature is a statistical property of matter in bulk, and thus $\beta$ can be viewed as an emergent quantity at the tissue level.
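One way to extract the coldness from data, under the assumption that the observed frequencies follow Equation (4), is a linear fit of the log-frequencies against the likelihood entropies; the numbers below are placeholders, not measurements:

```python
import numpy as np

# Hypothetical measured frequencies of neuron types and their likelihood entropies.
freq = np.array([0.55, 0.30, 0.15])   # observed frequencies of appearance
s_y = np.array([0.80, 1.03, 1.10])    # S_Y(s) computed from the local environments

# Equation (4) gives ln P(s) = -beta * S_Y(s) - ln Z, i.e. a straight line in S_Y.
slope, intercept = np.polyfit(s_y, np.log(freq), 1)
beta = -slope                         # estimated biological coldness
sigma = float(freq @ s_y)             # inferred constraint value sigma = <S_Y>
print(f"beta ~ {beta:.2f}, sigma ~ {sigma:.2f}")
```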
4. Lemaître’s Law
Lemaître’s law is the virial equation of state for two-dimensional cellular networks, which relates two measures of disorder (i.e., thermodynamic variables), namely, the fraction of hexagons and the width of the polygon distribution [11,19,20,39,40,41]. Although first proposed for two-dimensional foams, it has been shown that a wide range of planar cellular networks in nature obey Lemaître’s law, ranging from biology, such as avian cones [30], epithelial cells [42], and mammalian corneal endothelium [43], to physics, such as amorphous graphene [41], the Ising model [44], Bénard–Marangoni convection [45], silicon nanofoams [46], and silica bilayers [47]. It can be obtained by maximizing the entropy, $H = -\sum_n p_n \ln p_n$, where $p_n$ is the probability, or the frequency of the appearance, of an $n$-sided polygon, while considering the following information:

$$\sum_n p_n = 1, \qquad \sum_n n\, p_n = 6, \qquad \sum_n p_n\, f(n) = \mathrm{const}.$$
The first relation is the normalization condition, and the second one is a consequence of Euler’s relation concerning the topology of the structure, which assumes that only three lines meet at a vertex; networks with higher-order vertices can be transformed into trivalent ones by appropriate transformations [48]. The function $f(n)$ in the last relation depends on the geometry or the underlying dynamics of the cells (polygons). Lemaître and colleagues assumed a specific form of $f(n)$, an empirical observation made by measuring the areas of cells in a two-dimensional mosaic produced by hard discs moving on an air table [19,20]. At first glance, this choice of $f(n)$ seems not applicable in a general setting. Indeed, it was already mentioned in [20] that this particular form of $f(n)$ cannot be valid for all cellular mosaics since, for instance, it is incompatible with the well-known Lewis’ law [49], which states that the average area of polygons is linear in $n$. However, the authors of [20] speculated that the remarkable universality of Lemaître’s law suggests that this particular choice of $f(n)$ probably has a deeper meaning than expected.
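The Lagrange conditions for the three constraints above force the maximizer into the exponential-family form $p_n \propto e^{-\lambda_1 n - \lambda_2 f(n)}$. The sketch below evaluates this family numerically, taking a logarithmic $f(n) = \ln n$ as an illustrative choice, restricting $n$ to the range 3 to 30, and tuning one multiplier so that $\langle n\rangle = 6$; the remaining parameter value is arbitrary:

```python
import numpy as np
from scipy.optimize import brentq

N = np.arange(3, 31)  # polygons with 3 to 30 sides

def maxent_pn(beta, f=np.log):
    """p_n proportional to exp(-lam*n - beta*f(n)), with lam tuned so that <n> = 6."""
    def mean_constraint(lam):
        w = np.exp(-lam * N - beta * f(N))
        return (w / w.sum()) @ N - 6.0
    lam = brentq(mean_constraint, -5.0, 5.0)   # enforce Euler's relation <n> = 6
    w = np.exp(-lam * N - beta * f(N))
    return w / w.sum()

p = maxent_pn(beta=3.0)
p6, mu2 = p[N == 6][0], p @ (N - 6.0) ** 2
print(f"p6 = {p6:.3f}, mu2 = {mu2:.3f}")       # the pair entering Lemaitre's law
```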
Without considering any ad hoc constraint, we derive Lemaître’s law as a special case of our formalism explained in Section 2. To this end, we first generalize the Lagrangian introduced in Equation (3) as [13],

$$\mathcal{L} = H(S) + \mu\Big(\sum_{s\in\mathcal{S}} P(s) - 1\Big) - \beta\Big(\sum_{s\in\mathcal{S}} P(s)\, S_Y(s) - \sigma\Big) - \lambda\Big(\sum_{s\in\mathcal{S}} P(s)\, n_s - \bar{n}\Big), \qquad (9)$$

where we have assumed the following additional information: $\sum_{s\in\mathcal{S}} P(s)\, n_s = \bar{n}$, in which $\bar{n}$ is the average number of cells in the local environment and $\lambda$ is a Lagrange multiplier. By solving $\partial\mathcal{L}/\partial P(s) = 0$, we obtain:

$$P(s) = \frac{1}{Z}\, e^{-\lambda n_s - \beta S_Y(s)}, \qquad (10)$$

where $Z$ is the corresponding normalization factor. We simplify the notation in (10) and write:

$$p_n = \frac{1}{Z}\, e^{-\lambda n - \beta S_n}, \qquad (11)$$

where we have replaced $s$ by $n$. Here, $p_n$ is the probability of having an $n$-sided polygon, or its frequency of appearance, and $S_n$ is the corresponding entropy of the local environment of an $n$-sided cell.
To calculate $S_n$, we consider a general standardized discrete distribution whose density can be expanded as [50],

$$g(x) = \varphi(x)\Big(1 + \sum_{k\ge 3} c_k\, \mathrm{He}_k(x)\Big), \qquad (12)$$

with zero mean and unit variance, where $\varphi(x)$ is the standard normal density, the $c_k$ are constants, and $\mathrm{He}_k$ is the $k$th Hermite polynomial. Note that, as $n\to\infty$, $g$ approaches the standard normal distribution. Now that we have $g$ at our disposal, we can calculate its differential entropy, $-\int g(x)\ln g(x)\,dx$. Since the leading correction term is an odd function of $x$, its integral vanishes, and thus the first nonzero correction is of higher order. We obtain:

$$S_n \simeq \frac{1}{2}\ln(2\pi e) + \delta_n, \qquad (13)$$
where the first term is the entropy of the standard normal distribution and $\delta_n$ collects the $n$-dependent corrections. By plugging (13) into (11), we arrive at

$$p_n = \frac{1}{Z}\, e^{-\lambda n - \beta \ln n}, \qquad (14)$$

where the logarithmic term arises from the $n$ dependence of (13) and we have absorbed the constant terms of (13) into $Z$. Equation (14) sheds light on the origin of the form of $f(n)$ that Lemaître and colleagues had obtained for a specific two-dimensional mosaic [19,20]. Since the calculations leading to (14) only assume a general discrete distribution, the universality of Lemaître’s law becomes evident.
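The entropy estimate behind (13) can be illustrated numerically. The sketch below builds a Hermite-corrected density in the spirit of (12), using the probabilists' Hermite polynomials and arbitrarily chosen small coefficients (an assumption made only for illustration), and compares its differential entropy with that of the standard normal, $\tfrac{1}{2}\ln(2\pi e)$:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from scipy.integrate import quad

# Standard normal density with small Hermite (He_k) corrections, in the spirit of (12);
# the coefficients c3 and c4 are arbitrary small illustrative values, not fitted ones.
c3, c4 = 0.05, 0.03

def g(x):
    phi = np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
    correction = 1.0 + c3 * hermeval(x, [0, 0, 0, 1]) + c4 * hermeval(x, [0, 0, 0, 0, 1])
    return phi * correction

# Differential entropy of g compared with that of the standard normal, 0.5*ln(2*pi*e).
entropy, _ = quad(lambda x: -g(x) * np.log(g(x)), -8.0, 8.0)
print(entropy, 0.5 * np.log(2.0 * np.pi * np.e))  # the correction to the leading term is small
```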
The variance, $\mu_2$, of the distribution $p_n$ in (14) reads

$$\mu_2 = \sum_n (n - 6)^2\, p_n, \qquad (15)$$

where we have used Euler’s relation, $\langle n\rangle = 6$. The second moment of $(n-6)$, $\mu_2$, measures the deviation from the hexagonal configuration and can be interpreted as a measure of topological disorder. By exploiting (14) and (15), Lemaître’s law, as a relation between the two measures of disorder, $p_6$ and $\mu_2$, has been obtained as [11,19,20,41,46],

$$\mu_2\, p_6^2 = \frac{1}{2\pi}, \qquad (16)$$

$$\mu_2 = 1 - p_6. \qquad (17)$$
We present a simple and intuitive derivation of (16) and (17), inspired by and developed in discussions with C. Beenakker and I. Pinelis [51]. For (16) to hold, the fraction of hexagons should be large, and thus $p_n$ in (14) should peak at $n = 6$. This allows us to approximate $p_n$ near $n = 6$ by a normal distribution, $\mathcal{N}(6, \mu_2)$, centered at $n = 6$, while ignoring the discreteness of $n$. We can let $n$ vary from $-\infty$ to $\infty$ since only those $n$s close to the peak have notable contributions, provided that $\mu_2$ is not too small. Thus, we have $p_6 \approx 1/\sqrt{2\pi\mu_2}$, which results in $\mu_2\, p_6^2 \approx 1/(2\pi)$. For (17) to hold, the probabilities $p_n$ for $n \neq 5, 6, 7$ should be negligible compared with those for $n = 5, 6, 7$; as a result, the discreteness of $n$ cannot be neglected in this case, since only three $n$s contribute. The constraint $\langle n\rangle = 6$ implies that $p_n$ should peak sharply at $n = 6$, leading to $p_5 \approx p_7$, and thus $\mu_2 \approx p_5 + p_7 = 1 - p_6$. Note that, although in (9) we have assumed information about the seemingly unrelated quantities $\langle n\rangle$ and $\langle S_Y\rangle$, represented in terms of their corresponding Lagrange multipliers $\lambda$ and $\beta$, the peakedness of $p_n$, and thus of the exponent in (14), at $n = 6$ gives us a relation between $\lambda$ and $\beta$: since the derivative of the exponent in (14) must vanish at $n = 6$, $\lambda$ is fixed in terms of $\beta$.
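The two limiting arguments above can be checked numerically. The sketch below discretizes a Gaussian centred at $n = 6$ for the first branch and uses a three-point distribution with $p_5 = p_7$ for the second; the variances and probabilities are arbitrary test values:

```python
import numpy as np

n = np.arange(-40, 61)  # treat n as effectively unbounded around the peak at 6

def gaussian_branch(mu2_target):
    """Discretized Gaussian centred at n = 6: checks the first branch, Equation (16)."""
    p = np.exp(-(n - 6.0) ** 2 / (2.0 * mu2_target))
    p /= p.sum()
    p6, mu2 = p[n == 6][0], p @ (n - 6.0) ** 2
    return p6 ** 2 * mu2  # should be close to 1/(2*pi) ~ 0.159

print([round(gaussian_branch(v), 3) for v in (1.0, 2.0, 4.0)])

# Sharply peaked case, Equation (17): only n = 5, 6, 7 matter and p5 = p7.
p5 = p7 = 0.05
p6 = 1.0 - p5 - p7
mu2 = p5 * (5 - 6) ** 2 + p7 * (7 - 6) ** 2
print(mu2, 1.0 - p6)  # mu2 equals 1 - p6
```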
To obtain the regions of validity of (16) and (17), numerical analyses are performed, and the results are shown in Figure 11. The left panel illustrates $\mu_2$ as a function of $p_6$, where the red points are obtained from (14), subjected to the constraint $\langle n\rangle = 6$, and the dashed blue and yellow curves correspond to (16) and (17), respectively. Simulations suggest that the known lower bound of validity of (16) can be relaxed; the relaxed range is expressed in (18). In the right panel of Figure 11, we show $\mu_2\, p_6^2$ as a function of $p_6$, where the dashed brown curve represents the constant $1/(2\pi)$ and the green points depict the values obtained from (14), subjected to the constraint $\langle n\rangle = 6$.
As $\beta$ decreases, the peak of $p_n$ shifts from $n = 6$ to $n = 5$ and remains there over a range of $\beta$; see the left panel of Figure 12. Again, we can approximate $p_n$ by a Gaussian, which this time peaks at $n = 5$. In the right panel of Figure 12, we show the values of $p_n$, obtained from (14) and subjected to the constraint $\langle n\rangle = 6$, as red points, and the Gaussian as a dashed blue curve.
By decreasing $\beta$ further, the peak shifts from $n = 5$ to $n = 4$, and eventually $p_n$ becomes monotonically decreasing; see the left panel of Figure 13. For sufficiently small values of $\beta$, $p_n$ becomes a U-shaped distribution, as shown in the right panel of Figure 13.
Most two-dimensional cellular networks in nature have an abundance of hexagons and likely obey (17) and (18). Low values of $p_6$ may correspond to amorphous or artificially generated networks. In the following, we examine several artificially generated mosaics: random fragmentation networks, planar Feynman diagrams, the Poisson network, and a semi-regular Archimedean tiling. We demonstrate that all of these networks still obey $p_n$ in (14) with the constraint $\langle n\rangle = 6$.
In [52], artificial two-dimensional cellular structures are generated by a fragmentation process. One way to construct these networks is to select a cell at random among all cells and then fragment it into two cells by adding an edge at random. The side-number distribution of the cells in this system is obtained by a mean-field model as in [52], Equation (19), which can be solved to give Equation (20). In the top-left corner of Figure 14, we show the solution (20) as a blue curve and $p_n$ in (14) as a dashed orange curve.
Another way to construct such networks is to select an edge at random among all cell edges, choose one of the two cells sharing this edge, and then fragment that cell into two cells as in the previous case [52]. The resulting probability distribution of the number of cell sides is also given in [52]. In the top-right corner of Figure 14, a comparison between this distribution and (14) is shown.
The ensemble of planar Feynman diagrams with a cubic interaction (i.e., planar $\varphi^3$ diagrams with a fixed number of vertices) is equivalent to the ensemble of polygons with trivalent vertices [53,54]. The probability distribution of the number of cell edges is obtained in [53,54]; see the bottom-left corner of Figure 14 for a comparison between this distribution and (14).
The two-dimensional Poisson network studied in [55] can be obtained from a tessellation of the plane based on a Poisson point distribution. The distribution of the number of cell sides is given in [55]; a comparison between this distribution and (14) is shown in the bottom-right corner of Figure 14.
The Archimedean tilings, obtained by Kepler, are the analogs of the Archimedean solids. Eight of them are semi-regular and consist of regular polygons at each vertex [56]. In the left panel of Figure 15, we show one of these semi-regular tilings, known as the truncated hexagonal tiling, which has two dodecagons and one triangle at each vertex. The right panel of Figure 15 shows $p_n$ in (14) in the corresponding limiting regime of $\beta$ and $\lambda$. This plot corresponds to a pattern comprising an abundance of triangles with dodecagons amongst them and is in agreement with the truncated hexagonal tiling.
4.1. Human Cone Mosaics
In this subsection, we examine Lemaître’s law in the case of the human retina, which can be viewed as a natural two-dimensional cellular network. To partition the retinal field of Figure 1 into polygons, we construct the corresponding Voronoi tessellation. Each Voronoi polygon is generated by a cone cell such that all points of a given polygon are closer to its generating cone cell than to any other [57]. In the top row of Figure 16, we show the Voronoi tessellations of the spatial arrangements of the blue, green, and red cones in a living human retina. At the bottom, the Voronoi tessellation of the whole pattern of cones is presented. The fractions of $n$-sided bounded polygons are reported in the figure caption.
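The Voronoi construction and the resulting side-number fractions can be computed along the following lines; here random points stand in for the measured cone coordinates, which are not reproduced in this sketch, and scipy.spatial.Voronoi is one possible tool:

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)
cones = rng.random((500, 2))   # placeholder coordinates; real cone positions would go here

vor = Voronoi(cones)
side_counts = []
for region_index in vor.point_region:
    region = vor.regions[region_index]
    if region and -1 not in region:          # keep only bounded Voronoi polygons
        side_counts.append(len(region))      # number of vertices = number of sides

n_values, counts = np.unique(side_counts, return_counts=True)
p_n = counts / counts.sum()
mu2 = p_n @ (n_values - 6.0) ** 2
p6 = float(p_n[n_values == 6][0]) if 6 in n_values else 0.0
print(dict(zip(n_values.tolist(), np.round(p_n, 3).tolist())), p6, round(float(mu2), 3))
```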
If we assume that a high value of $p_6$ indicates the regularity of the corresponding cone mosaic, Figure 16 demonstrates that the spatial arrangement of blue cones is more random than those of green and red cones, since the value of $p_6$ for the blue cones is smaller than the values for the green and red cones. This finding is in agreement with [28]. Note that, as shown at the bottom of Figure 16, in contrast to the cone subtypes, the whole spatial arrangement of human cones is highly ordered, with a correspondingly high value of $p_6$.
We show Lemaître’s law as applied to the human cone mosaics in Figure 17. In the left panel, the case of the blue cone mosaic, the experimental value of $(p_6, \mu_2)$ is depicted as a blue point, and the dashed dark-gray and light-gray curves correspond to the two branches of Lemaître’s law. The cases of the greens, the reds, and the entire pattern of cones (in black) are shown in the right panel.
As another illustration, the behavior of cones in a different subject is shown in Figure 18. The image in the left panel, adapted from [58], shows human cone mosaics at six different retinal locations: two, four, six, eight, ten, and twelve degrees of retinal eccentricity, temporal to the fovea. The right panel shows the agreement between the human cone mosaics and Lemaître’s law.
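A small helper of the kind used for such comparisons is sketched below; given the side numbers of the bounded Voronoi polygons of one mosaic, it returns the measured pair $(p_6, \mu_2)$ next to the values suggested by the two branches of Lemaître’s law (the input list is placeholder data, and $p_6 > 0$ is assumed):

```python
import numpy as np

def lemaitre_compare(side_counts):
    """Measured (p6, mu2) of one mosaic next to the two branches of Lemaitre's law."""
    n, counts = np.unique(np.asarray(side_counts), return_counts=True)
    p = counts / counts.sum()
    p6 = float(p[n == 6][0]) if 6 in n else 0.0
    mu2 = float(p @ (n - 6.0) ** 2)
    # returns: measured p6, measured mu2, mu2 predicted by (16), mu2 predicted by (17)
    return p6, mu2, 1.0 / (2.0 * np.pi * p6 ** 2), 1.0 - p6

# Placeholder side numbers for a highly ordered mosaic (90 hexagons, 5 pentagons, 5 heptagons).
print(lemaitre_compare([6] * 90 + [5] * 5 + [7] * 5))
```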
4.2. Vertebrate Cone Mosaics: From Rodent to Bird
Here, we apply the approach of Section 4.1 to the rodent, dog, monkey, human, fish, and bird. The results are summarized in Figure 19, Figure 20, Figure 21, Figure 22, Figure 23 and Figure 24. In each case, the experimental value of $(p_6, \mu_2)$ is depicted in the color of its respective cone subtype, and the black point represents the whole pattern of cones in a given retinal field.
5. Concluding Remarks
In this work, we have applied the principle of maximum entropy to explain various forms of retinal cone mosaics in vertebrate eyes and have established a parameter, called retinal temperature or coldness, which is conserved across species as diverse as rodent, dog, monkey, human, fish, and bird, regardless of the details of the underlying mechanisms or the physical and biological forces involved. This approach has enabled us to predict the frequency of the appearance of cone cells by tuning only a single parameter. The only constraint of the Lagrange problem stems from the repeatable nature of experiments in biology.
Lemaître’s law, which relates the fraction of hexagons to the width of the polygon distribution in numerous two-dimensional cellular networks in nature and is usually obtained by assuming an ad hoc constraint, is derived here as a special case of our formalism. We have shown that various networks, whether artificially generated or natural, obey this universal law.
Since we have considered a completely general constraint in the entropy maximization procedure, the approach of the current paper can be exploited to explain other patterns or processes in nature. In the case of failure, this implies either that additional information, stemming from knowledge of the underlying mechanisms, needs to be considered, or that the assumed information is incorrect. Indeed, this is one of the pitfalls of the maximum entropy approach: it is not falsifiable, and there are no criteria for its validity within the approach itself [8,59].
Although in many cases, as in this paper, we can explain and predict the phenomena without knowing the details of the underlying dynamics, the principle of maximum entropy can still lead us to a better understanding of the involved mechanisms by validating the assumed information about the system.