Article

Double Contingency of Communications in Bayesian Learning

Department of Mathematics, Osaka Dental University, Osaka 573-1121, Japan
Symmetry 2022, 14(11), 2456; https://doi.org/10.3390/sym14112456
Submission received: 16 September 2022 / Revised: 8 November 2022 / Accepted: 15 November 2022 / Published: 19 November 2022
(This article belongs to the Special Issue Symmetry and Its Application in Differential Geometry and Topology)

Abstract

In previous work, we described the geometry of Bayesian learning on a manifold. In this paper, inspired by the notion of the modified double contingency of communications from the sociologist Niklas Luhmann, we take two manifolds on an equal footing, together with a potential function on their product, to set up mutual Bayesian learning. In particular, given a parametric statistical model, we consider mutual learning between two copies of the parameter space. Here, we associate the potential with the relative entropy (i.e., the Kullback–Leibler divergence). Although the mutual learning forgets all elements of the model except the relative entropy, it still substitutes for the usual Bayesian estimation of the parameter in a certain case. We propose it as a globalization of the information geometry.

1. Introduction

This is a sequel to the author’s research [1] on the geometry of Bayesian learning. We introduce mutual Bayesian learning by taking two manifolds, each of which serves as the parameter space of a family of density functions on the other. This setting has the following background in sociology, which may seem more ideological than practical.
Talcott Parsons [2] introduced the notion of double contingency in sociology. Here, the contingency is that no event is necessary and no event is impossible. A possible understanding of this definition appeals to probability theory. Specifically, even an event with probability P = 1 does not always occur, and even that with P = 0 sometimes occurs, as a non-empty null set appears, at least conceptually. We consider the contingency as the subjective probabilistic nature of society. In fact, updating the conceptual subjective probability according to Bayes’ rule should be a response to the conventional contingency that the prior probability is not a suitable predictor in reality. However, the double contingency is not straightforward, as it concerns mutually dependent social actions. In this article, we describe the double contingency by means of Bayesian learning. In our description, when one learns from another, the opposite learning also proceeds. This implies that, in contrast to sequential games such as chess, the actions in a double contingency have to be selected at once. Niklas Luhmann [3] leveraged this simultaneity to regard people not as individuals but as a single agent that he called a system. This further enabled him to apply the double contingency to any communications between systems. We introduce a function λ on the product of two manifolds to understand his systems theory.
From a practical perspective, we consider a family $\{h_x : W \to \mathbb{R}_{>0}\}_{x \in X}$ of probability densities on a manifold W and regard the parameter space X as a manifold. The product $X \times X$ carries the function $\varphi : X \times X \to \mathbb{R}_{\geq 0}$ induced from the relative entropy. Recall that the information geometry [4] is a differential geometry on the diagonal set $\Delta \subset X \times X$, which deals with the 3-jet of $\varphi$ at $\Delta$. The author [5] began exploring the global geometry of $(X \times X, \varphi)$. Now we take $\exp(-\varphi)$ as the above function $\lambda$, and show that the mutual Bayesian learning between two copies of X substitutes for the original Bayesian estimation on W in a certain case. We notice that the global geometry of $\varphi$, as well as the information geometry, forgets the original problem on W and addresses a related problem on X. In this regard, our mutual Bayesian learning is a globalization of the information geometry.

2. Mathematical Formulation

2.1. Geometric Bayesian Learning

We work in the $C^\infty$-smooth category. Take a possibly non-compact and possibly disconnected manifold X equipped with a volume form $d\mathrm{vol}_X$. Note that a discrete set is a 0-dimensional manifold, on which a positive function is a volume form. Suppose that each point x of the manifold X presents a possible action of a person. A positive function $f : X \to \mathbb{R}_{>0}$ on X is called a density. If its integral $|f|_{d\mathrm{vol}_X} := \int_X f \, d\mathrm{vol}_X$ is finite, it defines the probability $f/|f|_{d\mathrm{vol}_X}$ on X. Suppose that the selection of an action x is weighted by a density $f_0$ on X. In our story, the person believes that a density $\rho_x : Y \to \mathbb{R}_{>0}$ on another manifold $(Y, d\mathrm{vol}_Y)$ depends on his action x. That is why the person perceives a given point $y_0 \in Y$ by multiplying the density $f_0$ by the function
$$l : X \to \mathbb{R}_{>0} : X \ni x \mapsto \rho_x(y_0) > 0,$$
which is called the likelihood of the datum $y_0 \in Y$. The perception updates the prior density $f_0$ to the posterior density $f_1(x) := l(x) f_0(x) = \rho_x(y_0) f_0(x)$. Indeed, $f_1/|f_1|_{d\mathrm{vol}_X}$ is the Bayesian posterior probability provided that $f_0/|f_0|_{d\mathrm{vol}_X}$ is the prior probability. The only change from the description in [1] is the aim of the learning, i.e., prediction is replaced with action. Although the word action has an active meaning, an activity consisting of countless actions would be a chain of automatic adaptations to the environment.
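To make the update concrete, here is a minimal numerical sketch on a discretized stand-in for X (a finite grid, i.e., a 0-dimensional manifold with the counting volume form). The Gaussian-shaped family $\rho_x$ and the particular datum $y_0$ are illustrative assumptions, not part of the text.

```python
import numpy as np

# Discretize the parameter manifold X into a finite grid (counting volume form).
X = np.linspace(-3.0, 3.0, 121)

# Assumed illustrative model: each action x parametrizes a normal density rho_x on Y = R.
def rho(x, y, s=1.0):
    return np.exp(-(y - x) ** 2 / (2 * s ** 2)) / np.sqrt(2 * np.pi * s ** 2)

f0 = np.ones_like(X)          # prior density f_0 (the constant 1)
y0 = 0.7                      # perceived datum y_0 in Y (arbitrary choice)

likelihood = rho(X, y0)       # l(x) = rho_x(y_0)
f1 = likelihood * f0          # posterior density f_1(x) = rho_x(y_0) f_0(x)

posterior = f1 / f1.sum()     # f_1 / |f_1| is the Bayesian posterior probability
print("posterior mode:", X[np.argmax(posterior)])
```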

2.2. Mutual Learning

It is natural to symmetrize the above setting by exchanging the roles of X and Y. Specifically, we further suppose that a point y of the second manifold Y parameterizes a density $\rho_y : X \to \mathbb{R}_{>0}$ on the first manifold X, and the perception of a datum $x_0 \in X$ by the second person updates a prior density $g_0 : Y \to \mathbb{R}_{>0}$ on the second manifold to the posterior density $g_1(y) = \rho_y(x_0) g_0(y)$. This models the double contingency of Parsons [2]. We further modify it as follows. Fix volume forms $d\mathrm{vol}_X$, $d\mathrm{vol}_Y$, and $d\mathrm{vol}_{X \times Y}$ on X, Y, and $X \times Y$, respectively. Take densities $f_0 : X \to \mathbb{R}_{>0}$, $g_0 : Y \to \mathbb{R}_{>0}$, and $\lambda : X \times Y \to \mathbb{R}_{>0}$. Suppose that the prior densities $f_0$ and $g_0$, respectively, change to the posterior densities
$$f_1 = \lambda(\cdot, y_0)\, f_0 : x \mapsto \lambda(x, y_0)\, f_0(x) \quad \text{and} \quad g_1 = \lambda(x_0, \cdot)\, g_0 : y \mapsto \lambda(x_0, y)\, g_0(y).$$
This models the double contingency of Luhmann [3]. We say that $f_0$ is coupled with $g_0$ in the mutual learning through Luhmann's potential $\lambda$ on the product $X \times Y$. Since the potential $\lambda$ is also a density, it can in turn be coupled with a density $\sigma_0$ on another manifold Z. Specifically, if there is a datum $((x, y)_0, z_0)$ and a density $\tau_0 : (X \times Y) \times Z \to \mathbb{R}_{>0}$, the pair of two persons can change the tendency of its action selection. This mathematics enables us to consider the double contingency not only between persons but also between systems. Here we suppose that the points $(x, y)_0$ and $(x_0, y_0)$ are given as the same point. We emphasize that what we are discussing is not how the datum appears objectively, but how we perceive it or how we learn from it subjectively. We discuss in Section 4 the discordance between $(x, y)_0$ and $(x_0, y_0)$ in order to understand a proposition in Luhmann's systems theory saying that no system is a subsystem.
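The following sketch performs one round of this modified (Luhmann) double contingency on finite grids standing in for X and Y; the particular potential $\lambda$ and the datum indices are assumptions chosen only for illustration. Note that both posteriors are computed from the prior pair at once, not sequentially.

```python
import numpy as np

# Finite grids standing in for the manifolds X and Y (counting volume forms).
X = np.linspace(-3.0, 3.0, 61)
Y = np.linspace(-3.0, 3.0, 61)

# Assumed illustrative Luhmann potential on X x Y: any positive density would do.
lam = np.exp(-(X[:, None] - Y[None, :]) ** 2 / 2.0)

f = np.ones_like(X)   # prior density f_0 on X
g = np.ones_like(Y)   # prior density g_0 on Y

# One communication: the datum (x_0, y_0) is a single point of X x Y,
# and both updates are performed simultaneously.
i0, j0 = 40, 25                            # grid indices of (x_0, y_0), chosen arbitrarily
f, g = lam[:, j0] * f, lam[i0, :] * g      # f_1(x) = lambda(x, y_0) f_0(x), g_1(y) = lambda(x_0, y) g_0(y)
```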

2.3. Relative Entropy

Shannon [6] introduced the notion of entropy in information theory. As for continuous distributions, Jaynes [7] pointed out that the notion of relative entropy
$$D_{d\mathrm{vol}_X}\!\left(\frac{f_1}{|f_1|_{d\mathrm{vol}_X}} \,\middle\|\, \frac{f_0}{|f_0|_{d\mathrm{vol}_X}}\right) := \int_X \frac{f_1}{|f_1|_{d\mathrm{vol}_X}} \log\!\left(\frac{f_1}{|f_1|_{d\mathrm{vol}_X}} \cdot \frac{|f_0|_{d\mathrm{vol}_X}}{f_0}\right) d\mathrm{vol}_X$$
is rather foundational to the notion of entropy
$$H_{d\mathrm{vol}_X}\!\left(\frac{f}{|f|_{d\mathrm{vol}_X}}\right) := -\int_X \frac{f}{|f|_{d\mathrm{vol}_X}} \log\!\left(\frac{f}{|f|_{d\mathrm{vol}_X}}\right) d\mathrm{vol}_X.$$
Indeed, the entropy takes all real values even for normal distributions, whereas the relative entropy is non-negative for any pair of distributions; the non-negativity follows from $\log(1/t) \geq 1 - t$ and is called the Gibbs inequality. Further, if we multiply the volume form by a positive constant, the entropy changes while the relative entropy does not. In any case, putting $f = f_1/f_0$ and using the volume form $f_0\, d\mathrm{vol}_X$, we have
$$H_{f_0\, d\mathrm{vol}_X}\!\left(\frac{f_1/f_0}{|f_1/f_0|_{f_0\, d\mathrm{vol}_X}}\right) = \log |f_0|_{d\mathrm{vol}_X} - D_{d\mathrm{vol}_X}\!\left(\frac{f_1}{|f_1|_{d\mathrm{vol}_X}} \,\middle\|\, \frac{f_0}{|f_0|_{d\mathrm{vol}_X}}\right).$$
Note that we cannot put $f_0 = 1$ unless $d\mathrm{vol}_X$ is finite. If we multiply the volume form by a non-constant density, the relative entropy varies in general. We notice that the choice of the volume forms in the above mutual learning does not affect the result of the learning.
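As a numerical illustration (with assumed, unnormalized densities on a grid), the following sketch checks the facts used above: the Gibbs inequality, the invariance of the relative entropy under multiplying the volume form by a positive constant (while the entropy shifts by its logarithm), and the displayed identity relating the entropy with respect to $f_0\, d\mathrm{vol}_X$ to the relative entropy.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 2001)
dvol = np.full_like(x, x[1] - x[0])          # volume form on the grid

f0 = np.exp(-x ** 2 / 2)                     # assumed densities (not normalized)
f1 = np.exp(-(x - 1) ** 2 / 8)

def D(f1, f0, dvol):
    # relative entropy of f1/|f1| from f0/|f0| with respect to dvol
    p1 = f1 / np.sum(f1 * dvol)
    p0 = f0 / np.sum(f0 * dvol)
    return np.sum(p1 * np.log(p1 / p0) * dvol)

def H(f, dvol):
    # entropy of f/|f| with respect to dvol
    p = f / np.sum(f * dvol)
    return -np.sum(p * np.log(p) * dvol)

c = 3.7
print(D(f1, f0, dvol) >= 0)                                   # Gibbs inequality
print(np.isclose(D(f1, f0, c * dvol), D(f1, f0, dvol)))       # D ignores the constant c
print(np.isclose(H(f1, c * dvol), H(f1, dvol) + np.log(c)))   # H shifts by log c
# H with respect to f0 dvol equals log|f0| - D, as displayed above:
print(np.isclose(H(f1 / f0, f0 * dvol),
                 np.log(np.sum(f0 * dvol)) - D(f1, f0, dvol)))
```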

2.4. Mutual Learning via Relative Entropy

The information geometry [4], as well as its partial globalization by the author [1,5], starts with a family of probability distributions. Slightly more generally, we consider a manifold W equipped with a volume form $d\mathrm{vol}_W$ and a family $\{h_x\}_{x \in X}$ of densities with finite total masses on it. We regard the parameter space X as a manifold, and define the function $\varphi : X \times X \to \mathbb{R}_{\geq 0}$ on its square by
$$\varphi(x, y) := D_{d\mathrm{vol}_W}\!\left(\frac{h_x}{|h_x|_{d\mathrm{vol}_W}} \,\middle\|\, \frac{h_y}{|h_y|_{d\mathrm{vol}_W}}\right).$$
The information geometry focuses on the 3-jet of $\varphi$ at the diagonal set $\Delta \subset X \times X$. From the Gibbs inequality, the symmetric quadratic tensor defined by the 2-jet of $\varphi$ is positive semi-definite. If it is positive definite, it defines a Riemannian metric called the Fisher–Rao metric. Then the symmetric cubic tensor defined by the 3-jet of the anti-symmetrization $\varphi(x, y) - \varphi(y, x)$ directs a line of torsion-free affine connections passing through the Levi-Civita connection of the Fisher–Rao metric. This line of connections is the main subject of the information geometry. On the other hand, developing the global geometry in [1,5], we define Luhmann's potential for mutual learning between two copies of X as
$$\lambda := \exp(-\varphi) : X \times X \to \mathbb{R}_{>0}.$$
We couple a prior density $f_0 : X \to \mathbb{R}_{>0}$ on the first factor with a prior density $g_0 : X \to \mathbb{R}_{>0}$ on the second factor through the potential $\lambda$ on the product $X \times X$. Here, the mutual learning updates $f_0$ and $g_0$, respectively, to the posterior densities
$$f_1 = \lambda(\cdot, y_0)\, f_0 : x \mapsto \lambda(x, y_0)\, f_0(x) \quad \text{and} \quad g_1 = \lambda(x_0, \cdot)\, g_0 : y \mapsto \lambda(x_0, y)\, g_0(y).$$
Note that the function $\varphi$ changes in general if we multiply the volume form $d\mathrm{vol}_W$ by a non-constant density. Thus, the choice of this volume form is crucial. The volume form $d\mathrm{vol}_X$ might be related to the Fisher–Rao metric, although the choice of $d\mathrm{vol}_X$ is in fact irrelevant to the mutual learning. We can also imagine that the other volume forms $d\mathrm{vol}_{X \times X}$ and $d\mathrm{vol}_W$ have been determined in earlier mutual learnings “connected” to the current one.
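As a concrete sketch of this construction, the family below is an assumed example (normal distributions with unit variance on $W = \mathbb{R}$, whose relative entropy has the closed form $(x - y)^2/2$); it tabulates Luhmann's potential on a grid and iterates the learning on the second factor with synthetic data.

```python
import numpy as np

# Assumed illustrative model: h_x = N(x, 1) on W = R, for which
# phi(x, y) = KL(N(x,1) || N(y,1)) = (x - y)^2 / 2 in closed form.
X = np.linspace(-4.0, 4.0, 161)                  # grid standing in for the parameter space
phi = (X[:, None] - X[None, :]) ** 2 / 2.0
lam = np.exp(-phi)                               # Luhmann's potential on X x X

log_g = np.zeros_like(X)                         # formal prior 1 on the second factor (log scale)
rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=20)             # synthetic data x_k perceived by the second factor

for xk in data:
    i = np.abs(X - xk).argmin()                  # snap the datum to the nearest grid point
    log_g += np.log(lam[i, :])                   # g_{k+1}(y) = lambda(x_k, y) g_k(y)

print("MAP of g_n:", X[np.argmax(log_g)], " sample mean:", data.mean())
```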

3. Results

We address the following problem in certain cases below.
Problem 1. 
Does the mutual learning via the relative entropy substitute for the conventional Bayesian estimation of the parameter of the family $\{h_x\}$?
Remark 1. 
The mutual learning uses only the relative entropy, whereas the conventional Bayesian estimation needs all the information about the family. Thus, Problem 1 also asks if the mutual learning can “sufficiently restore” the family from the relative entropy. To clarify this point, we use the constant 1 as the formal prior density in the sequel even when the total volume is infinite. Then one may compare the family with the particular posterior g 1 to see “how much” it is restored.

3.1. Categorical Distributions

Let W be a 0-dimensional manifold with $N + 1$ unit components, i.e., $W = \{0, \ldots, N\}$ with volume form $d\mathrm{vol}_W = 1$. A point x of the open N-simplex
$$X = \{\, x = (x_0, \ldots, x_N) \in \mathbb{R}^{N+1} \mid x_0, \ldots, x_N > 0,\ x_0 + \cdots + x_N = 1 \,\}$$
with the standard volume form $d\mathrm{vol}_X$ presents a categorical distribution (i.e., a finite distribution) on W. We take the product manifold $X \times X$ with Luhmann's potential
$$\lambda(x, y) = \exp\bigl(-x_0 \log(x_0/y_0) - \cdots - x_N \log(x_N/y_N)\bigr).$$
Suppose that the prior densities are the constants $f_0(x) \equiv 1$ and $g_0(y) \equiv 1$ on the first and second factors of $X \times X$. Then, the iteration of mutual Bayesian learning yields
$$f_n(x) = \exp\bigl(n x_0 \overline{\log y_0} + \cdots + n x_N \overline{\log y_N} - n x_0 \log x_0 - \cdots - n x_N \log x_N\bigr), \quad g_n(y) \propto \exp\bigl(n \overline{x_0} \log y_0 + \cdots + n \overline{x_N} \log y_N\bigr),$$
where the overlines denote the arithmetic means of the data, e.g., $\overline{x_0} = \frac{x_0^0 + \cdots + x_0^{n-1}}{n}$, etc.
Proposition 1. 
We have the following maximum a posteriori (MAP) estimations:
$$x_0 : \cdots : x_N = \exp\bigl(\overline{\log y_0}\bigr) : \cdots : \exp\bigl(\overline{\log y_N}\bigr) \iff f_n(x) = \max f_n, \qquad y = \overline{x} = (\overline{x_0}, \ldots, \overline{x_N}) \iff g_n(y) = \max g_n.$$
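The following numerical sketch checks Proposition 1 on synthetic data (drawn here, purely for illustration, from flat Dirichlet distributions): the claimed maximizers of $f_n$ and $g_n$ beat randomly sampled candidate points of the simplex.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 200, 2
x_data = rng.dirichlet(np.ones(N + 1), size=n)   # data x^k perceived by the second factor
y_data = rng.dirichlet(np.ones(N + 1), size=n)   # data y^k perceived by the first factor

mean_log_y = np.log(y_data).mean(axis=0)
xbar = x_data.mean(axis=0)

def log_fn(x):   # log f_n(x) = n (sum_i x_i * mean_k log y_i^k - sum_i x_i log x_i)
    return n * (x @ mean_log_y - np.sum(x * np.log(x)))

def log_gn(y):   # log g_n(y), up to an additive constant: n sum_i xbar_i log y_i
    return n * (xbar @ np.log(y))

map_f = np.exp(mean_log_y) / np.exp(mean_log_y).sum()   # claimed maximizer of f_n
map_g = xbar                                            # claimed maximizer of g_n

candidates = rng.dirichlet(np.ones(N + 1), size=2000)
print(all(log_fn(map_f) >= log_fn(c) for c in candidates))
print(all(log_gn(map_g) >= log_gn(c) for c in candidates))
```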
We notice that the probability $g_n/|g_n|_{d\mathrm{vol}_X}$ for the posterior density $g_n$ on the second factor of $X \times X$ is known as the Dirichlet distribution.
Definition 1. 
The Dirichlet distribution $\mathrm{Dir}(\alpha)$ for $\alpha = (\alpha_0, \ldots, \alpha_N) \in \mathbb{R}_{>0}^{N+1}$ is presented by the probability $f/|f|_{d\mathrm{vol}_X}$ on the open N-simplex $X \subset \mathbb{R}^{N+1}$ for the density
$$f(x) = \exp\bigl((\alpha_0 - 1)\log x_0 + \cdots + (\alpha_N - 1)\log x_N\bigr).$$
In particular, the constant $\mathrm{Dir}(1, \ldots, 1)$ is called the flat Dirichlet distribution.
We identify the set W with the 0-skeleton of the closure $\mathrm{cl}(X)$ of the open N-simplex $X \subset \mathbb{R}^{N+1}$. If the prior is the flat Dirichlet distribution $\mathrm{Dir}(1, \ldots, 1)$, the Bayesian learning from categorical data $x^0, \ldots, x^{n-1} \in W$ yields the posterior $\mathrm{Dir}\bigl((1, \ldots, 1) + x^0 + \cdots + x^{n-1}\bigr)$. This is the conventional Bayesian learning from categorical data. On the other hand, the above probability $g_n/|g_n|_{d\mathrm{vol}_X}$ is the Dirichlet distribution $\mathrm{Dir}\bigl((1, \ldots, 1) + x^0 + \cdots + x^{n-1}\bigr)$ for the data $x^0, \ldots, x^{n-1} \in X$. Here we believe that the data $x^k \in X$ obey the probability $\lambda(\,\cdot\,, y^k)/|\lambda(\,\cdot\,, y^k)|_{d\mathrm{vol}_X}$, which we can consider as a continuous version of the categorical distribution. Imagine that a coarse graining of the data $x^k$ on X yields data $x'^k$ obeying a categorical distribution on the 0-skeleton W of the closure of X. Then the probability $g_n/|g_n|_{d\mathrm{vol}_X}$ for the new data $x'^k$ reaches the posterior probability of the conventional Bayesian learning.
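The identification of $g_n/|g_n|$ with a Dirichlet distribution can be spot-checked numerically; the sketch below uses synthetic data on the open 2-simplex and compares log-density differences against an independent Dirichlet log-density from scipy, so that normalizing constants drop out.

```python
import numpy as np
from scipy.stats import dirichlet

rng = np.random.default_rng(2)
n = 50
x_data = rng.dirichlet(np.ones(3), size=n)       # synthetic data x^k on the open 2-simplex

alpha = np.ones(3) + x_data.sum(axis=0)          # Dir((1,1,1) + x^0 + ... + x^{n-1})

def log_gn(y):
    # log g_n(y) up to an additive constant: n * sum_i mean(x_i) * log y_i
    return n * (x_data.mean(axis=0) @ np.log(y))

y1 = np.array([0.2, 0.3, 0.5])
y2 = np.array([0.6, 0.1, 0.3])
# Log-density differences agree, so g_n / |g_n| is the Dirichlet distribution Dir(alpha).
print(np.isclose(log_gn(y1) - log_gn(y2),
                 dirichlet.logpdf(y1, alpha) - dirichlet.logpdf(y2, alpha)))
```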
The following is the summary of the above.
Theorem 1. 
Instead of the conventional Bayesian learning from categorical data, we consider the mutual learning on the product of two copies of the space of categorical distributions via the relative entropy. Then a coarse graining of the data of the first factor into the 0-skeleton of the closure of the domain deforms the second factor of the mutual learning into the conventional Bayesian learning.
Thus, the answer to Problem 1 is affirmative in this case.

3.2. Normal Distributions

In the case where X is the space of normal distributions, we would like to change the coordinates of the second factor of the product X × X to make the expression simpler, although one can reach the same result through a straightforward calculation.

3.2.1. The Coordinate System

Let X be the upper-half plane $\{(m, s) \mid m \in \mathbb{R},\ s \in \mathbb{R}_{>0}\}$ and W the line $\{w \mid w \in \mathbb{R}\}$. Suppose that any point $(m, s)$ of X presents the normal distribution $N(m, s^2)$ on W with mean m and standard deviation s. The relative entropy is expressed as
$$D_{dw}\bigl(N(m, s^2)\,\big\|\,N(m', s'^2)\bigr) = \frac{(m - m')^2 + s^2 - s'^2}{2 s'^2} - \log\frac{s}{s'} = \frac{(m - m')^2 + (s - s')^2}{2 s'^2} + \frac{s - s'}{s'} - \log\left(1 + \frac{s - s'}{s'}\right).$$
This implies that the Fisher–Rao metric is half of the Poincaré metric. We put
$$d\mathrm{vol}_X := \frac{1}{s^2}\, dm \wedge ds = d\!\left(\frac{1}{s}\right) \wedge dm,$$
and consider the symplectic product $(X, d\mathrm{vol}_X) \times (X, d\mathrm{vol}_X) = (X \times X,\ d\mathrm{vol}_X \ominus d\mathrm{vol}_X)$. In [5], the author fixed the Lagrangian correspondence
$$N = \left\{\, ((m, s), (M, S)) \in X \times X \;\middle|\; \frac{m}{s} + \frac{M}{S} = 0,\ sS = 1 \,\right\},$$
which is the graph of the symplectic involution
$$F : X \to X : (m, s) \mapsto (M, S) = \left(-\frac{m}{s^2},\ \frac{1}{s}\right).$$
Using it, the author took the “stereograph” $\tilde{D} : X \times X \to \mathbb{R}_{\geq 0}$ of the relative entropy as follows. Regard a value D of the relative entropy $D_{dw}(N(m, s^2)\,\|\,N(m', s'^2))$ as a function of the pair of two points $(m, s)$ and $(m', s')$ on the first factor of the product $X \times X$; take the point $(M, S) = \left(-\frac{m'}{s'^2},\ \frac{1}{s'}\right)$ on the second factor, which N corresponds to the point $(m', s')$ on the first factor; and regard the value D as the value of a function $\tilde{D}$ of the point $(m, s, M, S) \in X \times X$. That is, the function $\tilde{D}$ is defined by
$$\tilde{D}(m, s, M, S) := D_{dw}\bigl(N(m, s^2)\,\big\|\,N(-M/S^2,\ 1/S^2)\bigr) = \frac{1}{2}\left\{\left(\frac{M}{S} + sS\,\frac{m}{s}\right)^2 + s^2 S^2 - 1 - \log\bigl(s^2 S^2\bigr)\right\}.$$
The function $\tilde{D}$ enjoys symplectic/contact geometric symmetry as well as the submanifold N. See [1] for the multivariate versions of $\tilde{D}$ and N with Poisson geometric symmetry.
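Since several signs in the formulas above had to be reconstructed, the following numerical spot-check (under exactly those reconstructed conventions, with arbitrarily chosen points) verifies that the stereograph $\tilde{D}$ agrees with the closed-form relative entropy of normal distributions.

```python
import numpy as np

def kl_normal(m1, s1, m2, s2):
    # KL(N(m1, s1^2) || N(m2, s2^2)) in closed form.
    return np.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def d_tilde(m, s, M, S):
    # Stereograph of the relative entropy in the coordinates (m, s, M, S),
    # under the sign conventions reconstructed above.
    return 0.5 * ((M / S + m * S) ** 2 + (s * S) ** 2 - 1 - np.log((s * S) ** 2))

# The correspondence sends (m', s') on the first factor to (M, S) = (-m'/s'^2, 1/s').
m, s = 0.3, 1.7          # first argument of the relative entropy
mp, sp = -0.8, 0.6       # second argument (m', s')
M, S = -mp / sp ** 2, 1 / sp

print(np.isclose(kl_normal(m, s, mp, sp), d_tilde(m, s, M, S)))   # True
```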

3.2.2. The Mutual Learning

In the above setting, we define Luhmann’s potential by
$$\lambda(m, s, M, S) := \exp\bigl(-\tilde{D}(m, s, M, S)\bigr) = sS\,\exp\left(-\frac{1}{2}\left(\frac{M}{S} + sS\,\frac{m}{s}\right)^2 - \frac{s^2 S^2 - 1}{2}\right).$$
Put $f_0(m, s) \equiv 1$ and $g_0(M, S) \equiv 1$. Then, the iteration of the mutual learning yields
$$f_n(m, s) \propto s^n \exp\left(-\frac{n\overline{S^2}}{2}\left\{\left(m + \frac{\overline{M}}{\overline{S^2}}\right)^2 + s^2\right\}\right) \leq s^n \exp\left(-\frac{n\overline{S^2}}{2}\, s^2\right),$$
$$g_n(M, S) \propto S^n \exp\left(-\frac{nS^2}{2}\left\{\left(\frac{M}{S^2} + \overline{m}\right)^2 + \overline{m^2} - \overline{m}^2 + \overline{s^2}\right\}\right) = (s')^{-n} \exp\left(-\frac{n}{2(s')^2}\left\{(m' - \overline{m})^2 + \overline{m^2} - \overline{m}^2 + \overline{s^2}\right\}\right),$$
where $(m', s') = (-M/S^2,\ 1/S)$ is the point of the first factor that N associates with $(M, S)$. Since $\frac{d}{ds}\left(s^n \exp\left(-\frac{n\overline{S^2}}{2}\, s^2\right)\right) = \left(n s^{n-1} - n\overline{S^2}\, s^{n+1}\right)\exp\left(-\frac{n\overline{S^2}}{2}\, s^2\right)$, we see that the density $f_n$ reaches its maximum at $(m, s) = \left(-\frac{\overline{M}}{\overline{S^2}},\ \frac{1}{\sqrt{\overline{S^2}}}\right)$. Similarly, we can see that the density $g_n$ reaches its maximum when $m' = -M/S^2 = \overline{m}$ and $s'^2 = 1/S^2 = \overline{m^2} - \overline{m}^2 + \overline{s^2}$ hold.
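A small numerical check of the claimed maximizer of $f_n$, again under the sign conventions reconstructed above and with synthetic data $(M_k, S_k)$ on the second factor: the closed-form maximizer beats randomly sampled candidates in the upper half-plane.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
M_data = rng.normal(0.0, 1.0, n)          # assumed illustrative data (M_k, S_k)
S_data = rng.uniform(0.5, 2.0, n)

def log_fn(m, s):
    # log f_n up to an additive constant, with D-tilde as reconstructed above.
    return np.sum(np.log(s * S_data)
                  - 0.5 * (M_data / S_data + m * S_data) ** 2
                  - 0.5 * (s * S_data) ** 2)

S2bar = np.mean(S_data ** 2)
m_star, s_star = -M_data.mean() / S2bar, 1.0 / np.sqrt(S2bar)   # claimed maximizer

# The claimed maximizer should beat random candidate points (m, s) with s > 0.
candidates = zip(rng.normal(0, 2, 1000), rng.uniform(0.05, 3.0, 1000))
print(all(log_fn(m_star, s_star) >= log_fn(m, s) for m, s in candidates))
```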
Definition 2. 
The normal-inverse-Gamma distribution $\mathrm{NIG}(\mu, \nu, \alpha, \beta)$ on the upper-half plane $\hat{X} = \{(m, v) \mid (m, s) \in X,\ v = s^2\}$ equipped with the volume form $d\mathrm{vol}_{\hat{X}} = dm \wedge dv$ is the probability density proportional to
$$v^{-\alpha-1}\exp\left(-\frac{\nu(m-\mu)^2}{2v} - \frac{\beta}{v}\right).$$
Its density form is the volume form with unit total mass, which is proportional to
$$v^{-\alpha-1}\exp\left(-\frac{\nu(m-\mu)^2}{2v} - \frac{\beta}{v}\right) d\mathrm{vol}_{\hat{X}}.$$
Using our volume form $d\mathrm{vol}_X$, we can write the density form of $\mathrm{NIG}(\mu, \nu, \alpha, \beta)$ as
$$\mathrm{const} \cdot s^{-2\alpha+1}\exp\left(-\frac{\nu}{2s^2}\left\{(m-\mu)^2 + \frac{2\beta}{\nu}\right\}\right) d\mathrm{vol}_X.$$
This is proportional to $g_n\, d\mathrm{vol}_X$ on the second factor of $X \times X$ when
$$(\mu, \nu, \alpha, \beta) = \left(\overline{m},\ n,\ \frac{n+1}{2},\ \frac{n\bigl(\overline{m^2} - \overline{m}^2 + \overline{s^2}\bigr)}{2}\right).$$
We identify the line W with the boundary of X. The conventional Bayesian learning of the normal data $m_0, \ldots, m_{n-1}$ yields the posterior $\mathrm{NIG}\left(\overline{m},\ n,\ \frac{n+1}{2},\ \frac{n(\overline{m^2} - \overline{m}^2)}{2}\right)$ provided that the prior is formally 1. Thus, we have the following result similar to Theorem 1.
Theorem 2. 
Instead of the conventional Bayesian learning from normal data on $\mathbb{R}$, we consider the mutual learning on the product of two copies of the space X of normal distributions via the relative entropy. Then a coarse graining of the data of the first factor into the boundary $\partial X = \mathbb{R}$ by taking $s \to 0$ deforms the second factor of the mutual learning into the conventional Bayesian learning.
Thus, the answer to Problem 1 is also affirmative in this case.
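The following sketch illustrates Theorem 2 with synthetic normal data, using the closed form of $g_n$ reconstructed above (with $\overline{s^2} = 0$ after the coarse graining): the log-density differences of $g_n$ and of the conventional posterior (formal prior 1) coincide, so the two posteriors agree up to normalization.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n = 25
m_data = rng.normal(1.5, 0.8, n)      # normal data on W = R (coarse grained: s_k = 0)
mbar, m2bar = m_data.mean(), (m_data ** 2).mean()

def log_gn(mp, sp, s2bar=0.0):
    # log g_n, up to an additive constant, written in the coordinates (m', s')
    # of the first factor; s2bar = mean of s_k^2 vanishes after coarse graining.
    V = m2bar - mbar ** 2 + s2bar
    return -n * np.log(sp) - n / (2 * sp ** 2) * ((mp - mbar) ** 2 + V)

def log_conventional(mp, sp):
    # log of the conventional posterior (formal prior 1): sum_k log N(m_k | m', s'^2)
    return norm.logpdf(m_data, mp, sp).sum()

# Differences of log-densities between two points agree, so the posteriors coincide.
a, b = (1.2, 0.9), (2.0, 1.4)
print(np.isclose(log_gn(*a) - log_gn(*b), log_conventional(*a) - log_conventional(*b)))
```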

3.3. Von Mises Distributions with Fixed Concentration in Circular Case

A von Mises distribution $M_k(m)$ with a fixed large concentration $k\ (\gg 1)$ is a circular analogue of a normal distribution with a fixed small variance that is parametrized by a point m of $X = \mathbb{R}/2\pi\mathbb{Z}$. Its density is proportional to the restriction of the function $\exp\bigl(k\cos(m)\,x + k\sin(m)\,y\bigr)$ to the circle $W = \{(x, y) \mid x = \cos w,\ y = \sin w,\ w \in \mathbb{R}/2\pi\mathbb{Z}\}$ with $d\mathrm{vol}_W = dw$. Then, using the easy formula $\int_0^{2\pi} \exp(k\cos x)\sin x\, dx = 0$, we obtain the following expression of the relative entropy:
$$D\bigl(M_k(m)\,\big\|\,M_k(m')\bigr) = \frac{\int_0^{2\pi} \exp\bigl(k\cos(w-m)\bigr)\bigl\{k\cos(w-m) - k\cos(w-m')\bigr\}\, dw}{\int_0^{2\pi} \exp\bigl(k\cos(w-m)\bigr)\, dw} = c\,\bigl(1 - \cos(m - m')\bigr),$$
where c is a positive constant. (When $k \in \mathbb{Z}$, using modified Bessel functions, we have $c = k\, I_1(k)/I_0(k)$.) Thus, Luhmann's potential is $\lambda(m, m') = \exp\bigl(-c\,(1 - \cos(m - m'))\bigr)$. We put $f_0(m) \equiv 1$ and $g_0(m') \equiv 1$. Then, the iteration of mutual Bayesian learning on the torus $X \times X$ yields
$$f_n(m) = \exp\Bigl(-nc\bigl(1 - \overline{\cos m'}\,\cos m - \overline{\sin m'}\,\sin m\bigr)\Bigr), \qquad g_n(m') = \exp\Bigl(-nc\bigl(1 - \overline{\cos m}\,\cos m' - \overline{\sin m}\,\sin m'\bigr)\Bigr).$$
On the other hand, the conventional Bayesian learning on W yields the posterior probability density proportional to $\exp\bigl(nk\,\overline{\cos(m)}\,\cos w + nk\,\overline{\sin(m)}\,\sin w\bigr)$, which looks like $g_n(w)$. This suggests the affirmative answer to Problem 1.
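A final numerical sketch for the circular case, with an assumed concentration k and synthetic von Mises data: the mutual-learning posterior $g_n$ and the conventional posterior are both maximized at the circular mean direction of the data, differing only in the concentration constant (c versus k), which is what "looks like" refers to.

```python
import numpy as np
from scipy.special import i0, i1

rng = np.random.default_rng(5)
k = 8.0                              # fixed concentration (assumed value)
c = k * i1(k) / i0(k)                # the constant c = k I_1(k) / I_0(k)
n = 100
m_data = rng.vonmises(0.7, k, n)     # synthetic data m_k on the circle

theta = np.linspace(-np.pi, np.pi, 2001)

# Mutual-learning posterior g_n on the second factor (up to normalization).
log_gn = -n * c * (1 - np.cos(m_data).mean() * np.cos(theta)
                     - np.sin(m_data).mean() * np.sin(theta))

# Conventional posterior from the same points viewed as data on W (up to normalization).
log_conv = n * k * (np.cos(m_data).mean() * np.cos(theta)
                    + np.sin(m_data).mean() * np.sin(theta))

# Both are maximized at the circular mean direction of the data.
mean_dir = np.arctan2(np.sin(m_data).mean(), np.cos(m_data).mean())
print(theta[np.argmax(log_gn)], theta[np.argmax(log_conv)], mean_dir)
```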

3.4. Conclusions

We have observed that the answer to Problem 1 is affirmative in some cases. Specifically, the mutual Bayesian learning covers at least a non-empty area of parametric statistics. The author expects that it could cover the whole of parametric statistics from some consistent perspective.

4. Discussion

4.1. On Socio-Cybernetics

In our setup of mutual learning, a system must be organized as the product of two manifolds with Luhmann’s potential before each member learns. Further, the potential is the result of an earlier mutual learning in which the system was a member. In Luhmann’s description [3], the unit of society is not the agent of an action but a communication or rather a chain of communications. In mathematics, a manifold is locally a product of manifolds and is characterized as the algebraic system of functions on it. By analogy, Luhmann’s society seems to be a system of relations between certain systems of functions. Some authors criticize his theory for failing to acquire individual identity, but an individual is a relation between identities that are already represented by manifolds.
As a matter of course, reality cannot be explained by theories. Instead, a theory which can better explain something about reality is chosen. In Section 2.2, we assumed that $(x, y)_0$ and $(x_0, y_0)$ are given as the same point in reality. Then there are two possibilities: (1) the potential $\lambda$ is updated by using $(x, y)_0$ as a component of a datum, or (2) the mutual learning from $(x_0, y_0)$ is performed under the potential $\lambda$ left un-updated. The discordance between (1) and (2) does not affect the reality. Further, there is no consistent hierarchy among Luhmann's systems that choose either (1) or (2), and therefore there is no system that is a proper subsystem of another system. Perhaps the social system chooses whichever of (1) and (2) can better explain the “fact” in relation to other “facts” in a story about reality. Granting all of the above, the notion of autopoiesis that Maturana and Varela [8] found in living organisms can be the foundation of Luhmann's socio-cybernetics.

4.2. On the Total Entropy

In objective probability theory, one considers a continuous probability distribution as the limit of a family of finite distributions presented by relative frequency histograms and the entropy of the limit as the limit of the entropies. Since the entropy of a finite distribution whose support is not a singleton is positive, a distribution with negative entropy, e.g., a normal distribution with small variance, does not appear. On the other hand, we take the position of subjective probability theory, and regard a positive function on a manifold that has the unit mass with respect to a fixed volume form as a probability density. From our point of view, the relative entropy between two probability densities is essential as it is non-negative; it presents the information gain; and it does not change (while even the sign of entropy does change) by multiplying the volume form by any positive constant. We notice that an objective probability is a subjective probability, and not vice-versa.
We know that the lowest entropy at the beginning of the universe must be relative to higher entropy in the future. In this regard, the total amount of information decreases as time goes on. However, it is still possible that the amount of consumable information increases, and perhaps that is how this world works. Here we would like to distinguish the world from the universe, even though they concern the same reality and therefore communicate with each other. The world consists of human affairs, including the possible variations of knowledge about facts in the universe: there is no love in the universe, but love is the most important consumable thing in the world. We consider that the notion of complexity in Luhmann's systems theory concerns such consumability as it relates to the coupling of systems. Now the problem is not the total reserve of information, but how to strike it and refine it like oil. At present, autopoiesis is gaining ground against mechanistic cybernetics. Our research goes against this stream: its goal is to invent a learning machine that exploits information resources to be consumed by humans and machines.

4.3. On Geometry

In this paper, we have quickly gone from the general definition of mutual learning to a discussion of the special mutual learning via relative entropy. However, it may be worthwhile to stop and study various types of learning according to purely geometric interests. For example, the result of previous work [1] is apparently related to the geometry of dual numbers, and fortunately this special issue includes a study [9] on a certain pair of dual number manifolds. Considering mutual learning for pairs of related manifolds such as this is something to be investigated in the future.
In addition, in proceeding to the case of the mutual learning via relative entropy, one basic problem was left unaddressed: Given a non-negative function φ on a squared manifold M × M that takes zero on the diagonal set, can we take a family of probability densities with parameter space M so that the relative entropy induces the function φ ?

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Mori, A. Global Geometry of Bayesian Statistics. Entropy 2020, 22, 240.
  2. Parsons, T. The Social System; Free Press: Glencoe, IL, USA, 1951.
  3. Luhmann, N. The autopoiesis of the social system. In Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-Steering Systems; Geyer, R.F., van der Zouwen, J., Eds.; Sage: London, UK, 1986; pp. 172–192.
  4. Amari, S. Information Geometry and Its Applications; Springer: Tokyo, Japan, 2016.
  5. Mori, A. Information geometry in a global setting. Hiroshima Math. J. 2018, 48, 291–305.
  6. Shannon, C. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  7. Jaynes, E. Prior Probabilities. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 227–241.
  8. Maturana, H.; Varela, F. Autopoiesis and Cognition: The Realization of the Living; Boston Studies in the Philosophy and History of Science 42; Reidel: Dordrecht, The Netherlands, 1972.
  9. Li, Y.; Alluhaibi, N.; Abdel-Baky, R.A. One-parameter Lorentzian dual spherical movements and invariants of the axodes. Symmetry 2022, 14, 1930.
