Article

Explicit Information Geometric Calculations of the Canonical Divergence of a Curve

1 Scuola Militare Nunziatella, Via Generale Parisi, 16, 80132 Napoli, Italy
2 SUNY Polytechnic Institute, Albany, NY 12203, USA
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(9), 1452; https://doi.org/10.3390/math10091452
Submission received: 30 March 2022 / Revised: 20 April 2022 / Accepted: 23 April 2022 / Published: 26 April 2022
(This article belongs to the Special Issue New Advances in Differential Geometry and Optimizations on Manifolds)

Abstract

Information geometry concerns the study of a dual structure $(g, \nabla, \nabla^*)$ upon a smooth manifold $M$. Such a geometry is totally encoded within a potential function usually referred to as a divergence or contrast function of $(g, \nabla, \nabla^*)$. Even though infinitely many divergences induce on $M$ the same dual structure, when the manifold is dually flat, a canonical divergence is well defined and was originally introduced by Amari and Nagaoka. In this pedagogical paper, we present explicit non-trivial differential geometry-based proofs concerning the canonical divergence for a special type of dually flat manifold represented by an arbitrary 1-dimensional path $\gamma$. Highlighting the geometric structure of such a particular canonical divergence, our study could suggest a way to select a general canonical divergence by using the information from a general dual structure in a minimal way.

1. Introduction

Information geometry (IG) has, in recent decades, become a very helpful mathematical tool for several branches of science [1,2,3]. It relies on the study of a smooth manifold endowed with a Riemannian metric tensor and a pair of torsion-free affine connections which are dual to each other and whose average leads to the Levi–Civita connection [4]. More precisely, the main object of study in IG is a quadruple $(M, g, \nabla, \nabla^*)$, where $(M, g)$ is a Riemannian manifold [5] and $\nabla$, $\nabla^*$ are torsion-free linear connections on the tangent bundle $TM$, such that
$$g(\nabla_X Y, Z) + g(Y, \nabla^*_X Z) = X\, g(Y, Z) \tag{1}$$
for all sections $X, Y, Z \in \mathcal{T}(M)$ and $\frac{1}{2}(\nabla + \nabla^*) = \overline{\nabla}_{\mathrm{LC}}$, where $\overline{\nabla}_{\mathrm{LC}}$ denotes the Levi–Civita connection of the metric tensor $g$ [6]. Here $\mathcal{T}(M)$ denotes the space of vector fields $X: M \to TM$ from the manifold to the tangent bundle. The quadruple $(M, g, \nabla, \nabla^*)$ is usually referred to as a statistical manifold whenever the affine connections are both torsion-free [6].
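The geometry of a statistical manifold is totally encoded in a distance-like function
$$D: M \times M \to \mathbb{R}, \qquad D(p,q) \geq 0 \ \ \forall\, p, q \in M \quad \text{and} \quad D(p,q) = 0 \ \text{iff} \ p = q, \tag{2}$$
in the following way:
$$g_{ij}(p) = -\left.\partial_i \partial'_j D(\xi_p, \xi_q)\right|_{p=q} = \left.\partial'_i \partial'_j D(\xi_p, \xi_q)\right|_{p=q}, \tag{3}$$
$$\Gamma_{ijk}(p) = -\left.\partial_i \partial_j \partial'_k D(\xi_p, \xi_q)\right|_{p=q}, \qquad \Gamma^*_{ijk}(p) = -\left.\partial'_i \partial'_j \partial_k D(\xi_p, \xi_q)\right|_{p=q}, \tag{4}$$
where the indices $i, j, k$ run from $1$ to $n = \dim M$ and $\Gamma_{ijk} = g(\nabla_{\partial_i}\partial_j, \partial_k)$, $\Gamma^*_{ijk} = g(\nabla^*_{\partial_i}\partial_j, \partial_k)$ are the symbols of the dual connections $\nabla$ and $\nabla^*$, respectively [7]. Here, $\{\xi_p\}$ denotes a coordinate system at $p$, $\partial_i = \partial/\partial\xi_p^i$, and $\partial'_i = \partial/\partial\xi_q^i$. When the matrix $g(p) = [g_{ij}(p)]$ is strictly positive definite for all $p \in M$, such a function is called a divergence function or contrast function of the statistical manifold $(M, g, \nabla, \nabla^*)$ [8].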
Given a torsion-free dual structure $(g, \nabla, \nabla^*)$ on a smooth manifold $M$, there are infinitely many divergence functions which induce on $M$ the same dual structure $(g, \nabla, \nabla^*)$ [9]. However, Amari and Nagaoka showed that a kind of canonical divergence is uniquely defined on a dually flat statistical manifold [4]. More precisely, a dual structure $(g, \nabla, \nabla^*)$ is said to be dually flat when both the Riemann curvature tensors $R(\nabla)$ and $R^*(\nabla^*)$ are zero [5]. In this case, there exist mutually dual coordinate systems $[\theta^i]$, $[\eta_i]$, such that $\nabla_{\partial_i}\partial_j = 0$, $\nabla^*_{\partial^i}\partial^j = 0$ and $g(\partial_i, \partial^j) = \delta_i^j$, where $\delta_i^j = 1$ if $i = j$ and $\delta_i^j = 0$ otherwise. Here, $\partial_i = \partial/\partial\theta^i$ and $\partial^i = \partial/\partial\eta_i$. Moreover, there exists a pair of functions $\varphi$ and $\psi$, such that
$$\frac{\partial\psi}{\partial\theta^i} = \eta_i, \qquad \frac{\partial\varphi}{\partial\eta_i} = \theta^i. \tag{5}$$
It turns out that
$$\varphi = \theta^i \eta_i - \psi, \tag{6}$$
where Einstein’s notation is adopted. Given two points $p, q \in M$, the canonical divergence of the dually flat statistical manifold $(M, g, \nabla, \nabla^*)$ between $p$ and $q$ is then defined by
$$D(p, q) := \psi(p) + \varphi(q) - \theta^i(p)\,\eta_i(q). \tag{7}$$
Relying upon the canonical divergence on a dually flat structure, one can list the basic properties that a divergence must satisfy to be the canonical one of a general dual structure. In [6], the authors required that a canonical divergence be one half the square of the Riemannian distance when $\nabla = \nabla^*$ and coincide with the canonical divergence of Amari and Nagaoka when the dual structure is dually flat. However, several different divergences in the literature meet these requirements (see for instance [10,11,12,13,14]). Hence, the search for a canonical divergence on a general dual structure is still an open problem [10]. Nonetheless, the notion of canonical divergence on a dually flat manifold can illustrate how information geometry modifies the usual Riemannian geometry. In [4], the authors have applied such a notion to any curve within a general statistical manifold. Let $\gamma: [0,1] \to M$ be a curve within a statistical manifold $(M, g, \nabla, \nabla^*)$; we can consider the dual structure $(g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$ induced by $(g, \nabla, \nabla^*)$ on $\gamma$. Such a structure is given by
$$g_\gamma(t) = g(\dot\gamma, \dot\gamma), \qquad \nabla_\gamma = \Gamma_\gamma(t)\,\frac{d}{dt}, \qquad \nabla^*_\gamma = \Gamma^*_\gamma(t)\,\frac{d}{dt}, \tag{8}$$
where
$$\Gamma_\gamma(t) := \frac{g_{\gamma(t)}\left(\nabla_{\dot\gamma}\dot\gamma, \dot\gamma\right)}{g_{\gamma(t)}\left(\dot\gamma, \dot\gamma\right)} \quad \text{and} \quad \Gamma^*_\gamma(t) := \frac{g_{\gamma(t)}\left(\nabla^*_{\dot\gamma}\dot\gamma, \dot\gamma\right)}{g_{\gamma(t)}\left(\dot\gamma, \dot\gamma\right)}. \tag{9}$$
From here on, we denote the scalar product $g_p: T_pM \times T_pM \to \mathbb{R}$ induced by the metric tensor $g$ with $\langle \cdot, \cdot \rangle_p$. Since $(\gamma, g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$ is a 1-dimensional manifold, it is a dually flat manifold. Therefore, in [4] the authors applied the notion of canonical divergence to obtain a divergence of the curve $\gamma$:
$$D(\gamma) = D(\gamma(1), \gamma(0)) = \iint_{0 \le t \le \tau \le 1} g_\gamma(t)\,\frac{\mu(\tau)}{\mu(t)}\,d\tau\,dt, \quad \text{with} \quad \mu(t) := \exp\left(\int_0^t \Gamma_\gamma(\tau)\,d\tau\right). \tag{10}$$
In [4], the authors claimed that the divergence $D(\gamma)$ is independent of the parameterization of $\gamma$ and depends only on the orientation of $\gamma$. This establishes a close relation between the divergence $D(\gamma)$ and its dual function, which is given as follows:
$$D^*(\gamma) = D^*(\gamma(1), \gamma(0)) = \iint_{0 \le t \le \tau \le 1} g_\gamma(t)\,\frac{\mu^*(\tau)}{\mu^*(t)}\,d\tau\,dt, \quad \text{with} \quad \mu^*(t) := \exp\left(\int_0^t \Gamma^*_\gamma(\tau)\,d\tau\right). \tag{11}$$
In particular, we have that
$$D(\tilde\gamma) = D(\gamma(0), \gamma(1)) = D^*(\gamma) = D^*(\gamma(1), \gamma(0)), \tag{12}$$
where $\tilde\gamma(t) = \gamma(1 - t)$ [4].
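To make Equation (10) concrete, the following Python sketch evaluates the double integral with a midpoint rule on the triangle $0 \le t \le \tau \le 1$. It is only an illustration under assumptions that do not come from [4]: the curve is described by a single coordinate $x(t)$ on a 1-dimensional manifold with metric $g(x)$ and connection coefficient $\Gamma(x)$, so that $g_\gamma(t) = g(x)\,\dot{x}^2$ and $\Gamma_\gamma(t) = \ddot{x}/\dot{x} + \Gamma(x)\,\dot{x}$; the function name `divergence_of_curve` is hypothetical. The example takes $g(x) = e^x$, $\Gamma(x) = 0$ and $x(t) = t$, for which the integral reduces to $\int_0^1 e^t (1 - t)\,dt = e - 2$.

```python
import math

def divergence_of_curve(g_gamma, Gamma_gamma, n=400):
    """Midpoint-rule estimate of Eq. (10):
    D(gamma) = iint_{0<=t<=tau<=1} g_gamma(t) * mu(tau)/mu(t) dtau dt,
    where mu(t) = exp(int_0^t Gamma_gamma)."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    # log mu at the cell midpoints, accumulated with the midpoint rule
    log_mu, acc = [], 0.0
    for t in mids:
        G = Gamma_gamma(t)
        log_mu.append(acc + 0.5 * h * G)  # integral of Gamma up to the midpoint
        acc += h * G
    total = 0.0
    for i, t in enumerate(mids):
        gt = g_gamma(t)
        total += 0.5 * gt * h * h         # diagonal cell: only half lies in t <= tau
        for j in range(i + 1, n):         # cells strictly above the diagonal
            total += gt * math.exp(log_mu[j] - log_mu[i]) * h * h
    return total

# Illustrative choice (an assumption, not taken from [4]):
# g(x) = exp(x), Gamma(x) = 0, x(t) = t
g_gamma = lambda t: math.exp(t)   # g(x(t)) * xdot(t)**2
Gamma_gamma = lambda t: 0.0       # xddot/xdot + Gamma(x) * xdot
D = divergence_of_curve(g_gamma, Gamma_gamma)
print(D, math.e - 2)              # numerical estimate next to the closed form e - 2
```

The quadratic cost of the double loop is acceptable for a sketch; because the integrand factorizes as $g_\gamma(t)/\mu(t)$ times $\mu(\tau)$, a linear-time evaluation is also possible.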
Even though the adaptation of the canonical divergence (7) to the 1-dimensional case resulting in the canonical divergence Formula (10) is clear in [4], with this work, we aim to provide differential geometric-based proofs of all statements claimed in [4] on the canonical divergence of a curve $\gamma$. In particular, we want to prove the following statements:
(i) The divergence of a curve $\gamma$ is given by the expression in (10). We refer to it as the canonical divergence of the curve $\gamma$.
(ii) The divergence $D(\gamma)$ is independent of the particular parameterization of $\gamma$.
(iii) If we change the orientation of $\gamma$, we obtain the dual divergence given in (11) and the relation (12).
Finally, in the self-dual case, that is when $\nabla = \nabla^* = \overline{\nabla}_{\mathrm{LC}}$, we prove that
$$D(\gamma) = \frac{1}{2}\left(\int_0^1 \sqrt{\langle \dot\gamma, \dot\gamma \rangle_{\gamma(t)}}\,dt\right)^2, \tag{13}$$
which is claimed in [4] and provides evidence of how information geometry modifies the usual Riemannian geometry.

2. The Canonical Divergence of a Curve

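Let $\gamma: [0,1] \to M$ be a curve within a statistical manifold $(M, g, \nabla, \nabla^*)$. Let us assume that it is a 1-dimensional manifold. Therefore, the statistical manifold $(\gamma, g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$ is always dually flat. This implies that there exists a pair of affine parameters, $\mathbf{t}$ and $\mathbf{t}^*$ (written in bold to distinguish them from the arbitrary parameter $t$). Indeed, consider an arbitrary parameter $t$; owing to the dual flatness of $\gamma$, the 1-forms $d\mathbf{t} = \dot{\mathbf{t}}(t)\,dt$ and $d\mathbf{t}^* = \dot{\mathbf{t}}^*(t)\,dt$ are such that the following holds true [15]:
$$\frac{d}{dt}\dot{\mathbf{t}}(t) - \Gamma_\gamma(t)\,\dot{\mathbf{t}}(t) = 0, \tag{14}$$
$$\frac{d}{dt}\dot{\mathbf{t}}^*(t) - \Gamma^*_\gamma(t)\,\dot{\mathbf{t}}^*(t) = 0, \tag{15}$$
where the coefficients $\Gamma_\gamma(t)$ and $\Gamma^*_\gamma(t)$ are defined in (9). The solutions of Equations (14) and (15) are given by
$$\dot{\mathbf{t}}(t) = \dot{\mathbf{t}}(0)\exp\left(\int_0^t \Gamma_\gamma(\omega)\,d\omega\right), \qquad \dot{\mathbf{t}}^*(t) = \dot{\mathbf{t}}^*(0)\exp\left(\int_0^t \Gamma^*_\gamma(\omega)\,d\omega\right). \tag{16}$$
Then, it straightforwardly follows that
$$\mathbf{t}(t) = \mathbf{t}(0) + \dot{\mathbf{t}}(0)\int_0^t \exp\left(\int_0^\tau \Gamma_\gamma(\omega)\,d\omega\right)d\tau, \tag{17}$$
$$\mathbf{t}^*(t) = \mathbf{t}^*(0) + \dot{\mathbf{t}}^*(0)\int_0^t \exp\left(\int_0^\tau \Gamma^*_\gamma(\omega)\,d\omega\right)d\tau. \tag{18}$$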
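As a small numerical companion to Equations (16) and (17), the sketch below integrates a given coefficient $\Gamma_\gamma(t)$ twice with cumulative trapezoid rules to obtain the affine parameter. The function name `affine_parameter` and the test case are our own illustrative assumptions: on a 1-dimensional manifold whose coordinate $x$ is already affine ($\Gamma(x) = 0$), the curve $x(t) = (t + t^2)/2$ has $\Gamma_\gamma(t) = \ddot{x}/\dot{x} = 2/(1 + 2t)$, and with the normalization $\mathbf{t}(0) = 0$, $\dot{\mathbf{t}}(0) = 1$ the affine parameter should be $t + t^2$, that is, an affine function of $x$.

```python
import math

def affine_parameter(Gamma_gamma, n=2000, t0=0.0, v0=1.0):
    """Sample the affine parameter of Eqs. (16)-(17) on a uniform grid of [0, 1].
    t0 and v0 are the (assumed) initial value and initial velocity."""
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    G = [Gamma_gamma(t) for t in ts]
    log_v = [0.0]
    for a, b in zip(G, G[1:]):
        log_v.append(log_v[-1] + 0.5 * h * (a + b))   # int_0^t Gamma_gamma
    v = [v0 * math.exp(s) for s in log_v]             # Eq. (16)
    aff = [t0]
    for a, b in zip(v, v[1:]):
        aff.append(aff[-1] + 0.5 * h * (a + b))       # Eq. (17)
    return ts, aff

# Assumed example: Gamma(x) = 0 and x(t) = (t + t^2)/2, so Gamma_gamma = 2/(1 + 2t)
ts, aff = affine_parameter(lambda t: 2.0 / (1.0 + 2.0 * t))
max_err = max(abs(a - (t + t * t)) for t, a in zip(ts, aff))
print(max_err)   # deviation from the closed form t + t^2
```

The dual parameter $\mathbf{t}^*$ of Equation (18) is obtained from the same routine by passing $\Gamma^*_\gamma$ instead of $\Gamma_\gamma$.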
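According to the theory developed in [4], we can find two functions $\varphi$ and $\psi$ such that
$$\frac{\partial\psi}{\partial\mathbf{t}} = \mathbf{t}^*, \qquad \frac{\partial\varphi}{\partial\mathbf{t}^*} = \mathbf{t}. \tag{19}$$
These two equations lead to the following relations:
$$\dot\psi = \dot{\mathbf{t}}\,\mathbf{t}^*, \qquad \dot\varphi = \dot{\mathbf{t}}^*\,\mathbf{t}, \tag{20}$$
which have the following solution:
$$\psi(t) = \psi(0) + \int_0^t \dot{\mathbf{t}}(\tau)\,\mathbf{t}^*(\tau)\,d\tau, \qquad \varphi(t) = \varphi(0) + \int_0^t \dot{\mathbf{t}}^*(\tau)\,\mathbf{t}(\tau)\,d\tau. \tag{21}$$
Writing $p = \gamma(0)$ and $q = \gamma(1)$, we identify $\varphi(0) = \varphi(p)$ and $\varphi(1) = \varphi(q)$. Likewise, we have that $\psi(0) = \psi(p)$ and $\psi(1) = \psi(q)$. Therefore, following (7), we can define the divergence of $(\gamma, g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$ between $p$ and $q$ as
$$D_\gamma(p, q) := \psi(p) + \varphi(q) - \mathbf{t}(p)\,\mathbf{t}^*(q), \tag{22}$$
where $\mathbf{t}(p) := \mathbf{t}(0)$ and $\mathbf{t}^*(q) := \mathbf{t}^*(1)$.
At this point, we can try to give a more explicit expression for the divergence $D_\gamma$. Let us interchange the roles of $p$ and $q$. Then, from Equations (16) and (20), we can write
$$\begin{aligned} D_\gamma(q, p) &= \psi(q) + \varphi(p) - \mathbf{t}^*(p)\,\mathbf{t}(q) \\ &= \psi(q) + \varphi(q) - \int_0^1 \dot{\mathbf{t}}^*(t)\,\mathbf{t}(t)\,dt - \mathbf{t}(q)\left[\mathbf{t}^*(q) - \int_0^1 \dot{\mathbf{t}}^*(t)\,dt\right] \\ &= \psi(q) + \varphi(q) - \mathbf{t}^*(q)\,\mathbf{t}(q) - \int_0^1 \dot{\mathbf{t}}^*(t)\,\big(\mathbf{t}(t) - \mathbf{t}(q)\big)\,dt \\ &= \int_0^1 \dot{\mathbf{t}}^*(t)\int_t^1 \dot{\mathbf{t}}(\tau)\,d\tau\,dt, \end{aligned} \tag{23}$$
where the last equality uses $\psi(q) + \varphi(q) - \mathbf{t}^*(q)\,\mathbf{t}(q) = 0$, which is the 1-dimensional form of (6).
Now, we show that the expression (23) is the same as the one in (10).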
Proposition 1.
Let $\gamma: [0,1] \to M$ be a curve endowed with the dualistic structure $(g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$. Let $\mathbf{t}$ and $\mathbf{t}^*$ be the affine parameters with respect to $\nabla_\gamma$ and $\nabla^*_\gamma$, respectively. Then, the divergence $D(\gamma)$ of $\gamma$ can be written as follows:
$$D(\gamma) = \int_0^1 \dot{\mathbf{t}}^*(t)\int_t^1 \dot{\mathbf{t}}(\tau)\,d\tau\,dt. \tag{24}$$
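Proof.
If we assume that $\dot{\mathbf{t}}(0) = 1$, we can see from Equations (10) and (16) that
$$\mu(\tau) = \dot{\mathbf{t}}(\tau).$$
Let us now observe that, from Equation (1), we can write
$$\Gamma^*_\gamma(t) = \frac{\langle \nabla^*_{\dot\gamma}\dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma} = \frac{\frac{d}{dt}\langle \dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma} - \frac{\langle \nabla_{\dot\gamma}\dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma}.$$
Therefore, we have that
$$\begin{aligned} \dot{\mathbf{t}}^*(t) &= \exp\left(\int_0^t \Gamma^*_\gamma(\omega)\,d\omega\right) = \exp\left(\int_0^t \left[\frac{\frac{d}{d\omega}\langle \dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma} - \frac{\langle \nabla_{\dot\gamma}\dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma}\right]d\omega\right) \\ &= \exp\big(\log\langle \dot\gamma(t), \dot\gamma(t) \rangle_\gamma\big)\,\exp\left(-\int_0^t \frac{\langle \nabla_{\dot\gamma}\dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma}\,d\omega\right) = \frac{g_\gamma(t)}{\dot{\mathbf{t}}(t)} = \frac{g_\gamma(t)}{\mu(t)}, \end{aligned}$$
where we assumed that $\dot{\mathbf{t}}^*(0) = 1$ and $\langle \dot\gamma(0), \dot\gamma(0) \rangle_p = 1$. Now, we may observe from Figure 1 that we can represent the domain of integration in the double integral of Equation (10) as the grey colored region.
Therefore, we can split the double integral as follows:
$$\iint_{0 \le t \le \tau \le 1} g_\gamma(t)\,\frac{\mu(\tau)}{\mu(t)}\,d\tau\,dt = \int_0^1 \frac{g_\gamma(t)}{\mu(t)}\,dt\int_t^1 \mu(\tau)\,d\tau,$$
and by plugging into it $\dot{\mathbf{t}}(\tau) = \mu(\tau)$ and $\dot{\mathbf{t}}^*(t) = g_\gamma(t)/\mu(t)$, we obtain the desired result.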
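The splitting of the double integral used in the proof can be checked numerically. The sketch below compares a direct trapezoid sum over the triangle $0 \le t \le \tau \le 1$ with the iterated form $\int_0^1 g_\gamma(t)/\mu(t)\,\big[\int_t^1 \mu(\tau)\,d\tau\big]\,dt$. The function names and the test curve are illustrative assumptions rather than material from [4]: we take a 1-dimensional manifold with $g(x) = e^x$ and $\Gamma(x) = 0$, and the curve $x(t) = (t + t^2)/2$, for which $\mu(t) = 1 + 2t$ in closed form and both expressions should approach $e - 2$.

```python
import math

def double_integral(g, mu, h):
    """Trapezoid sum of g(t) * mu(tau)/mu(t) over the triangle 0 <= t <= tau <= 1."""
    n = len(mu) - 1
    total = 0.0
    for i in range(n):                       # the inner integral vanishes at t = 1
        wi = 0.5 * h if i == 0 else h
        for j in range(i, n + 1):
            wj = 0.5 * h if j in (i, n) else h
            total += wi * wj * g[i] * mu[j] / mu[i]
    return total

def iterated_integral(g, mu, h):
    """int_0^1 g(t)/mu(t) [int_t^1 mu(tau) dtau] dt with the same trapezoid weights."""
    n = len(mu) - 1
    tail = [0.0] * (n + 1)                   # tail[i] approximates int_{t_i}^1 mu
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (mu[i] + mu[i + 1])
    total = 0.0
    for i in range(n):
        wi = 0.5 * h if i == 0 else h
        total += wi * g[i] / mu[i] * tail[i]
    return total

# Assumed example: g(x) = exp(x), Gamma(x) = 0, x(t) = (t + t^2)/2
n = 300
h = 1.0 / n
ts = [i * h for i in range(n + 1)]
g = [math.exp(0.5 * (t + t * t)) * (0.5 + t) ** 2 for t in ts]   # g(x) * xdot^2
mu = [1.0 + 2.0 * t for t in ts]          # exp(int_0^t 2/(1 + 2w) dw)
D_double = double_integral(g, mu, h)
D_iter = iterated_integral(g, mu, h)
print(D_double, D_iter, math.e - 2)
```

Since $\mu(\tau)/\mu(t)$ factorizes, the iterated form costs a linear number of operations instead of a quadratic one, which is why it is the convenient form for numerical work.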
The representation of $D(\gamma)$ in Proposition 1 allows us to show that the divergence of a curve $\gamma$ is independent of the particular parameterization of $\gamma$. The next result might be important to characterize the divergence as a distance-like function.
Proposition 2.
Let $\gamma: [0,1] \to M$ be a smooth curve and let $\tilde\gamma: [a,b] \to M$ be a re-parameterized curve of $\gamma$. Then
$$D(\gamma) = D(\tilde\gamma). \tag{25}$$
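Proof.
Consider $\varphi: [0,1] \to [a,b]$ (with $a := \varphi(0)$, $b := \varphi(1)$) an increasing diffeomorphism, meaning that $\varphi(t_1) < \varphi(t_2)$ if $t_1 < t_2$. From Equation (24), we can write
$$D(\gamma) = \int_0^1 \dot{\mathbf{t}}^*(t)\,\big[\mathbf{t}(1) - \mathbf{t}(t)\big]\,dt. \tag{26}$$
Our aim is to perform the change of variable $t = \varphi^{-1}(\xi)$ within the integral (26), which is ruled by $dt = d\xi/\varphi'(\varphi^{-1}(\xi))$. Here, we use different notations to denote the derivative with respect to the “time” $t$ (a dot, as in $\dot\gamma$) and the derivative with respect to the parameter $\xi$ (a prime, as in $\tilde\gamma'$); $\varphi'$ denotes the derivative of $\varphi$ with respect to its own argument. Let us define $\tilde\gamma(\xi) = \gamma(\varphi^{-1}(\xi))$. Then, we have that
$$\dot\gamma = \varphi'(\varphi^{-1}(\xi))\,\tilde\gamma', \qquad \ddot\gamma = \big(\varphi'(\varphi^{-1}(\xi))\big)^2\,\tilde\gamma'' + \frac{d}{d\xi}\big[\varphi'(\varphi^{-1}(\xi))\big]\,\varphi'(\varphi^{-1}(\xi))\,\tilde\gamma'.$$
Recall that
$$\dot{\mathbf{t}}^*(t) = \exp\left(\int_0^t \Gamma^*_\gamma(\omega)\,d\omega\right)$$
with
$$\Gamma^*_\gamma(\omega) = \frac{\big[\dot\gamma^i(\omega)\,\dot\gamma^j(\omega)\,\Gamma^*_{ijk}(\gamma(\omega)) + \ddot\gamma^j(\omega)\,g_{jk}(\gamma(\omega))\big]\,\dot\gamma^k(\omega)}{g_\gamma(\omega)}$$
and $g_\gamma(\omega) = g_{ij}(\gamma(\omega))\,\dot\gamma^i\dot\gamma^j$.
Since
$$g_\gamma(t) = g_{\tilde\gamma}(\xi)\,\big(\varphi'(\varphi^{-1}(\xi))\big)^2,$$
we obtain
$$\begin{aligned} \Gamma^*_\gamma(t) &= \frac{\big[\tilde\gamma'^i(\xi)\,\tilde\gamma'^j(\xi)\,\Gamma^*_{ijk}(\tilde\gamma(\xi)) + \tilde\gamma''^j(\xi)\,g_{jk}(\tilde\gamma(\xi))\big]\,\varphi'(\varphi^{-1}(\xi))\,\tilde\gamma'^k(\xi)}{g_{\tilde\gamma}(\xi)} + \frac{d}{d\xi}\big[\varphi'(\varphi^{-1}(\xi))\big]\,\frac{\tilde\gamma'^j\,\tilde\gamma'^k\,g_{jk}(\tilde\gamma(\xi))}{g_{\tilde\gamma}(\xi)} \\ &= \Gamma^*_{\tilde\gamma}(\xi)\,\varphi'(\varphi^{-1}(\xi)) + \frac{d}{d\xi}\big[\varphi'(\varphi^{-1}(\xi))\big]. \end{aligned}$$
We can now use Equations (27) and (28) to perform the change of variable in the integral (26):
$$\begin{aligned} D(\gamma) &= \int_0^1 \dot{\mathbf{t}}^*(t)\,\big[\mathbf{t}(1) - \mathbf{t}(t)\big]\,dt = \int_0^1 \exp\left(\int_0^t \Gamma^*_\gamma(\omega)\,d\omega\right)\big[\mathbf{t}(1) - \mathbf{t}(t)\big]\,dt \\ &= \int_a^b \exp\left(\int_a^\xi \left[\Gamma^*_{\tilde\gamma}(\zeta)\,\varphi'(\varphi^{-1}(\zeta)) + \frac{d}{d\zeta}\big[\varphi'(\varphi^{-1}(\zeta))\big]\right]\frac{d\zeta}{\varphi'(\varphi^{-1}(\zeta))}\right)\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,\frac{d\xi}{\varphi'(\varphi^{-1}(\xi))} \\ &= \int_a^b \exp\left(\int_a^\xi \Gamma^*_{\tilde\gamma}(\zeta)\,d\zeta + \int_a^\xi \frac{\frac{d}{d\zeta}\big[\varphi'(\varphi^{-1}(\zeta))\big]}{\varphi'(\varphi^{-1}(\zeta))}\,d\zeta\right)\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,\frac{d\xi}{\varphi'(\varphi^{-1}(\xi))} \\ &= \int_a^b \exp\left(\int_a^\xi \Gamma^*_{\tilde\gamma}(\zeta)\,d\zeta + \log\varphi'(\varphi^{-1}(\xi)) - \log\varphi'(\varphi^{-1}(a))\right)\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,\frac{d\xi}{\varphi'(\varphi^{-1}(\xi))} \\ &= \int_a^b \exp\left(\int_a^\xi \Gamma^*_{\tilde\gamma}(\zeta)\,d\zeta\right)\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,\varphi'(\varphi^{-1}(\xi))\,\frac{d\xi}{\varphi'(\varphi^{-1}(\xi))} \\ &= \int_a^b \exp\left(\int_a^\xi \Gamma^*_{\tilde\gamma}(\omega)\,d\omega\right)\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,d\xi = \int_a^b \dot{\mathbf{t}}^*(\xi)\,\big[\mathbf{t}(b) - \mathbf{t}(\xi)\big]\,d\xi = D(\tilde\gamma), \end{aligned}$$
where we assumed that $\varphi'(0) = 1$, so that $\log\varphi'(\varphi^{-1}(a)) = 0$, and where $\mathbf{t}(\xi)$ and $\dot{\mathbf{t}}^*(\xi)$ denote the affine parameter and the dual affine velocity expressed along $\tilde\gamma$.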
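The invariance stated in Proposition 2 can also be observed numerically. The following sketch is an illustration under our own assumptions, not a construction from [4]: the manifold is 1-dimensional with coordinate $x$, metric $g(x) = e^x$ and a constant connection coefficient $\Gamma(x) = 0.3$, so that along a coordinate curve $x(t)$ one has $g_\gamma = g(x)\,\dot{x}^2$ and $\Gamma_\gamma = \ddot{x}/\dot{x} + \Gamma(x)\,\dot{x}$. The helper `curve_divergence` is a hypothetical name; it evaluates Equation (10) in its iterated form. The same path from $x = 0$ to $x = 1$ is traversed with two parameterizations on $[0,1]$ (so $a = 0$, $b = 1$).

```python
import math

def curve_divergence(x, xd, xdd, g, Gamma, n=4000):
    """D(gamma) of Eq. (10) for a coordinate curve x(t), t in [0, 1], on a
    1-dimensional manifold with metric g(x) and connection coefficient Gamma(x).
    xd and xdd are the first and second derivatives of x (xd must not vanish)."""
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    g_gam = [g(x(t)) * xd(t) ** 2 for t in ts]
    G_gam = [xdd(t) / xd(t) + Gamma(x(t)) * xd(t) for t in ts]
    mu, acc = [1.0], 0.0
    for a, b in zip(G_gam, G_gam[1:]):
        acc += 0.5 * h * (a + b)
        mu.append(math.exp(acc))              # mu(t) = exp(int_0^t Gamma_gamma)
    tail = [0.0] * (n + 1)                    # tail[i] approximates int_{t_i}^1 mu
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (mu[i] + mu[i + 1])
    f = [gg / m * tl for gg, m, tl in zip(g_gam, mu, tail)]
    return h * (sum(f) - 0.5 * (f[0] + f[-1]))   # trapezoid rule in t

# Assumed model: g(x) = exp(x), Gamma(x) = 0.3
g = lambda x: math.exp(x)
Gamma = lambda x: 0.3

# Two parameterizations of the same oriented path from x = 0 to x = 1
D_linear = curve_divergence(lambda t: t, lambda t: 1.0, lambda t: 0.0, g, Gamma)
D_reparam = curve_divergence(lambda s: 0.5 * (s + s * s), lambda s: 0.5 + s,
                             lambda s: 1.0, g, Gamma)
print(D_linear, D_reparam)   # the two values should agree up to discretization error
```

Note that the second parameterization has $\dot{x}(0) = 1/2$, so the agreement does not rely on the normalization $\varphi'(0) = 1$ adopted in the proof; the integrand of Equation (10) is invariant as it stands.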
This result shows that the canonical divergence $D(\gamma)$ of the dual structure $(\gamma, g_\gamma, \nabla_\gamma, \nabla^*_\gamma)$ is independent of the parametrization of the curve. However, it depends on the orientation of the curve. We will shortly show that, by reversing the parameter of $\gamma(t)$, we obtain the dual divergence of $D(\gamma)$ which is defined in Equation (11). In order to accomplish this task, we may note that, by applying the same methods as the ones in the proof of Proposition 1, we can write the dual divergence of $\gamma$ as follows:
$$D^*(\gamma) = D^*_\gamma(q, p) = \int_0^1 \dot{\mathbf{t}}(t)\int_t^1 \dot{\mathbf{t}}^*(\tau)\,d\tau\,dt, \tag{29}$$
for $\gamma: [0,1] \to M$, such that $\gamma(0) = p$ and $\gamma(1) = q$. Then, in the next result, we are going to prove that $D^*(\gamma)$ coincides with the $(g, \nabla)$-divergence of the reversely oriented curve, $\tilde\gamma(t) = \gamma(1 - t)$.
Proposition 3.
Let $\gamma: [0,1] \to M$ be such that $\gamma(0) = p$ and $\gamma(1) = q$. Let $\tilde\gamma(t) = \gamma(1 - t)$ be the reversely oriented curve. Then we have that
$$D(\tilde\gamma) = D_\gamma(p, q) = D^*_\gamma(q, p). \tag{30}$$
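Proof.
Consider Equation (24); by integrating by parts, we can write it as follows:
$$\begin{aligned} D(\gamma) &= \int_0^1 \dot{\mathbf{t}}^*(t)\,\mathbf{t}(1)\,dt - \int_0^1 \dot{\mathbf{t}}^*(t)\,\mathbf{t}(t)\,dt \\ &= \big[\mathbf{t}^*(1) - \mathbf{t}^*(0)\big]\,\mathbf{t}(1) - \mathbf{t}^*(1)\,\mathbf{t}(1) + \mathbf{t}^*(0)\,\mathbf{t}(0) + \int_0^1 \dot{\mathbf{t}}(t)\,\mathbf{t}^*(t)\,dt \\ &= \mathbf{t}^*(0)\,\mathbf{t}(0) - \mathbf{t}^*(0)\,\mathbf{t}(1) + \int_0^1 \dot{\mathbf{t}}(t)\,\mathbf{t}^*(t)\,dt. \end{aligned}$$
By reversing the “time” $t$, namely $t \to (1 - t)$, we get the reversely oriented curve, namely $\tilde\gamma(t) = \gamma(1 - t)$. Therefore, from the last equation, we obtain
$$\begin{aligned} D(\tilde\gamma) &= \mathbf{t}^*(1)\,\mathbf{t}(1) - \mathbf{t}^*(1)\,\mathbf{t}(0) - \int_0^1 \dot{\mathbf{t}}(t)\,\mathbf{t}^*(t)\,dt = \big[\mathbf{t}(1) - \mathbf{t}(0)\big]\,\mathbf{t}^*(1) - \int_0^1 \dot{\mathbf{t}}(t)\,\mathbf{t}^*(t)\,dt \\ &= \int_0^1 \dot{\mathbf{t}}(t)\,\big[\mathbf{t}^*(1) - \mathbf{t}^*(t)\big]\,dt = D^*(\gamma), \end{aligned}$$
where the last equality is obtained from Equation (29). Finally, we can write
$$D_\gamma(p, q) = D(\tilde\gamma) = \int_0^1 \dot{\mathbf{t}}(t)\int_t^1 \dot{\mathbf{t}}^*(\tau)\,d\tau\,dt = D^*(\gamma) = D^*_\gamma(q, p),$$
which proves the statement.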
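A numerical illustration of Proposition 3 is sketched below under assumptions of our own (they are not part of [4]): a 1-dimensional manifold with coordinate $x$, metric $g(x) = e^x$ and connection coefficient $\Gamma(x) = 0$. Equation (1) with $X = Y = Z = \partial_x$ then gives the dual coefficient $\Gamma^*(x) = g'(x)/g(x) - \Gamma(x) = 1$. For the curve $x(t) = t$ from $p$ ($x = 0$) to $q$ ($x = 1$), the closed forms are $D(\gamma) = e - 2$ and $D(\tilde\gamma) = D^*(\gamma) = 1$. The helper name `curve_divergence` is hypothetical.

```python
import math

def curve_divergence(x, xd, xdd, g, Gamma, n=4000):
    """Eq. (10) for a coordinate curve x(t) on a 1-dimensional manifold with metric
    g(x) and connection coefficient Gamma(x); xd, xdd are derivatives of x."""
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    g_gam = [g(x(t)) * xd(t) ** 2 for t in ts]
    G_gam = [xdd(t) / xd(t) + Gamma(x(t)) * xd(t) for t in ts]
    mu, acc = [1.0], 0.0
    for a, b in zip(G_gam, G_gam[1:]):
        acc += 0.5 * h * (a + b)
        mu.append(math.exp(acc))              # mu(t) = exp(int_0^t Gamma_gamma)
    tail = [0.0] * (n + 1)                    # tail[i] approximates int_{t_i}^1 mu
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (mu[i] + mu[i + 1])
    f = [gg / m * tl for gg, m, tl in zip(g_gam, mu, tail)]
    return h * (sum(f) - 0.5 * (f[0] + f[-1]))

# Assumed model: g(x) = exp(x), Gamma(x) = 0, hence Gamma*(x) = g'/g - Gamma = 1
g = lambda x: math.exp(x)
Gamma = lambda x: 0.0
Gamma_star = lambda x: 1.0 - Gamma(x)

zero = lambda t: 0.0
D_gamma = curve_divergence(lambda t: t, lambda t: 1.0, zero, g, Gamma)
D_reversed = curve_divergence(lambda t: 1.0 - t, lambda t: -1.0, zero, g, Gamma)
D_dual = curve_divergence(lambda t: t, lambda t: 1.0, zero, g, Gamma_star)
print(D_gamma, D_reversed, D_dual)   # closed forms: e - 2, 1 and 1
```

The output makes the asymmetry visible: reversing the orientation changes the value of the $(g, \nabla)$-divergence, and the changed value is the $(g, \nabla^*)$-divergence of the original curve.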
To sum up, the $(g, \nabla)$-divergence of an arbitrary path $\gamma$ depends on its orientation but not on its parameterization, and the $(g, \nabla^*)$-divergence of $\gamma$ coincides with the $(g, \nabla)$-divergence of the reversely oriented curve $\tilde\gamma$. For this reason, we can refer to $D(\gamma)$ as a modification of the curve length within information geometry. Further support for this statement comes from the self-dual case, namely when $\nabla = \nabla^*$. In this case, information geometry reduces to the Riemannian geometry [6] and the $(g, \nabla)$-divergence $D(\gamma)$ becomes one half the square of the Riemannian length of the curve $\gamma$.
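Proposition 4.
Let $(M, g, \nabla, \nabla^*)$ be a self-dual statistical manifold, i.e., $\nabla = \nabla^* = \overline{\nabla}_{\mathrm{LC}}$ with $\overline{\nabla}_{\mathrm{LC}}$ the Levi–Civita connection. Let $\gamma: [0,1] \to M$ be a smooth curve, such that $\gamma(0) = p$ and $\gamma(1) = q$. Then the $(g, \nabla)$-divergence (10) becomes
$$\overline{D}(\gamma) = \frac{1}{2}\,\|\gamma\|^2,$$
where $\|\gamma\| = \int_0^1 \sqrt{\langle \dot\gamma, \dot\gamma \rangle_\gamma}\,dt$ is the length of $\gamma$.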
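Proof.
Consider the representation of $D(\gamma)$ given in Equation (10); when $\Gamma_\gamma \equiv \overline{\Gamma}_\gamma$ is the coefficient of the Levi–Civita connection induced onto the curve $\gamma$, the compatibility of $\overline{\nabla}_{\mathrm{LC}}$ with the metric tensor $g$ allows us to write
$$\overline{\Gamma}_\gamma(t) = \frac{\langle \overline{\nabla}^{\mathrm{LC}}_{\dot\gamma}\dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma} = \frac{1}{2}\,\frac{\frac{d}{dt}\langle \dot\gamma, \dot\gamma \rangle_\gamma}{\langle \dot\gamma, \dot\gamma \rangle_\gamma} = \frac{1}{2}\,\frac{\frac{d}{dt}g_\gamma(t)}{g_\gamma(t)}.$$
We then obtain
$$\frac{\overline{\mu}(\tau)}{\overline{\mu}(t)} = \exp\left(\int_t^\tau \overline{\Gamma}_\gamma(\omega)\,d\omega\right) = \exp\left(\int_t^\tau \frac{1}{2}\,\frac{\frac{d}{d\omega}g_\gamma(\omega)}{g_\gamma(\omega)}\,d\omega\right) = \exp\left(\Big[\ln\sqrt{g_\gamma(\omega)}\Big]_t^\tau\right) = \exp\left(\ln\sqrt{\frac{g_\gamma(\tau)}{g_\gamma(t)}}\right) = \frac{\sqrt{g_\gamma(\tau)}}{\sqrt{g_\gamma(t)}}.$$
At this point, we are ready to express the divergence of $\gamma$ in the self-dual case:
$$\overline{D}(\gamma) = \iint_{0 \le t \le \tau \le 1} g_\gamma(t)\,\frac{\overline{\mu}(\tau)}{\overline{\mu}(t)}\,dt\,d\tau = \iint_{0 \le t \le \tau \le 1} g_\gamma(t)\,\frac{\sqrt{g_\gamma(\tau)}}{\sqrt{g_\gamma(t)}}\,dt\,d\tau = \iint_{0 \le t \le \tau \le 1} \sqrt{g_\gamma(t)}\,\sqrt{g_\gamma(\tau)}\,dt\,d\tau.$$
We may now observe from Figure 1 that the integration domain in the integral above is one half of the square $[0,1] \times [0,1]$. Therefore, since the integrand $\sqrt{g_\gamma(t)}\,\sqrt{g_\gamma(\tau)}$ is symmetric under the exchange of $t$ and $\tau$, we obtain that
$$\overline{D}(\gamma) = \frac{1}{2}\int_0^1 \sqrt{g_\gamma(t)}\,dt\int_0^1 \sqrt{g_\gamma(\tau)}\,d\tau = \frac{1}{2}\left(\int_0^1 \sqrt{\langle \dot\gamma(t), \dot\gamma(t) \rangle_{\gamma(t)}}\,dt\right)^2,$$
which proves that the divergence of $\gamma$, in the self-dual case, is one half the square of the length and is therefore independent of the parameterization.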
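The self-dual statement lends itself to a quick numerical check as well. The sketch below rests on assumptions of our own: a 1-dimensional manifold with coordinate $x$ and metric $g(x) = e^x$, whose Levi–Civita coefficient is $\overline{\Gamma}(x) = g'(x)/(2g(x)) = 1/2$, and the curve $x(t) = (t + t^2)/2$ from $x = 0$ to $x = 1$. Its length is $\int_0^1 e^{x/2}\,dx = 2(\sqrt{e} - 1)$, so Equation (10) should return $2(\sqrt{e} - 1)^2$. The helper name `curve_divergence` is hypothetical.

```python
import math

def curve_divergence(x, xd, xdd, g, Gamma, n=4000):
    """Eq. (10) for a coordinate curve x(t) on a 1-dimensional manifold with metric
    g(x) and connection coefficient Gamma(x); xd, xdd are derivatives of x."""
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    g_gam = [g(x(t)) * xd(t) ** 2 for t in ts]
    G_gam = [xdd(t) / xd(t) + Gamma(x(t)) * xd(t) for t in ts]
    mu, acc = [1.0], 0.0
    for a, b in zip(G_gam, G_gam[1:]):
        acc += 0.5 * h * (a + b)
        mu.append(math.exp(acc))              # mu(t) = exp(int_0^t Gamma_gamma)
    tail = [0.0] * (n + 1)                    # tail[i] approximates int_{t_i}^1 mu
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (mu[i] + mu[i + 1])
    f = [gg / m * tl for gg, m, tl in zip(g_gam, mu, tail)]
    return h * (sum(f) - 0.5 * (f[0] + f[-1]))

# Assumed model: g(x) = exp(x) with its Levi-Civita coefficient g'/(2g) = 1/2
g = lambda x: math.exp(x)
Gamma_LC = lambda x: 0.5
x = lambda t: 0.5 * (t + t * t)
xd = lambda t: 0.5 + t
xdd = lambda t: 1.0

D_bar = curve_divergence(x, xd, xdd, g, Gamma_LC)

# Riemannian length of the same curve by the midpoint rule
n = 4000
length = sum(math.sqrt(g(x((i + 0.5) / n))) * xd((i + 0.5) / n) for i in range(n)) / n
print(D_bar, 0.5 * length ** 2, 2 * (math.sqrt(math.e) - 1) ** 2)
```

All three printed numbers should coincide up to discretization error, which is the content of Proposition 4 for this particular curve.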

3. Conclusions

In classical differential geometry, the connection among geodesics, the length of a curve, and the distance provides deep insight into the geometric structure of a Riemannian manifold. For instance, under specific conditions [5], the distance between any two points of a Riemannian manifold is obtained through the geodesic between them by means of the minimization of the length functional along any path between the two points [16,17].
In information geometry, the inverse problem concerns the search for a divergence function $D$ which recovers a given dual structure $(g, \nabla, \nabla^*)$ on a smooth manifold $M$. The Hessian of $D$ allows recovery of the Riemannian metric $g$, while third-order derivatives of $D$ retrieve the two torsion-free connections $\nabla$ and $\nabla^*$ which are dual with respect to $g$. Interestingly, $(M, g, \nabla, \nabla^*)$ is determined only by the third-order Taylor polynomial expansion of $D$. Therefore, in general, $D$ is far from being unique for a given statistical manifold structure. However, in the dually flat case where both $\nabla$ and $\nabla^*$ are flat, it is possible to consider a canonical divergence function from a given $(M, g, \nabla, \nabla^*)$, which was originally proposed by Amari and Nagaoka.
We considered here the case in which the divergence of a curve $\gamma$ in a statistical manifold $(M, g, \nabla, \nabla^*)$ is obtained by applying the notion of the canonical divergence of a dually flat statistical manifold. In this way, the notion of divergence can be considered as the natural quantity to study to understand the geometric structure of a statistical manifold. More specifically, given $(M, g, \nabla, \nabla^*)$, a curve $\gamma$ of $M$ has a natural induced dual structure which is dually flat, since $\gamma$ can be regarded as a 1-dimensional submanifold of $M$ itself. In [4], Amari and Nagaoka used the canonical divergence in a dually flat case to define a “canonical divergence” $D(\gamma)$ of the curve.
In this pedagogical paper, we provided a systematic organization and mathematical proofs of the properties of the divergence in Equation (22) which were originally stated without demonstration in [4]. Specifically, we obtained the following results:
  • In Equation (22), we provided an explicit expression $D(\gamma)$ of the canonical divergence of a 1-dimensional path $\gamma$.
  • In Equation (25) and Proposition 2, we showed that $D(\gamma)$ does not depend on the chosen parameterization of $\gamma$.
  • In Equation (30) and Proposition 3, we demonstrated how $D(\gamma)$ depends on the adopted orientation of $\gamma$.
  • In Equation (32) and Proposition 4, we verified that $D(\gamma)$ equals one half the square length of $\gamma$ in the self-dual case.
By mimicking the classical theory of Riemannian geometry, our work helps to obtain a deeper understanding of the distance-like functional represented by the divergence in Equation (22). Furthermore, our explicit analysis can provide useful insights for identifying a minimum of $D(\gamma)$ through variational methods in more general settings. These methods, in turn, are the natural tools to deal with any sort of application in information geometry [18]. It would be of great interest to select a general canonical divergence via the minimum of $D(\gamma)$ over the set of all paths connecting any two points of a general statistical manifold. However, this problem appears to be highly non-trivial, and its solution requires, for instance, a criterion capable of explaining why one divergence would be better than the others in capturing a given dual structure. In some sense, the divergence must be defined using the information from $(M, g, \nabla, \nabla^*)$ in a minimal way. We believe that the differential geometric-based approach employed here could help in accomplishing such a task, which is, for now, beyond the scope of our current pedagogical setting. We leave this discussion along with a presentation on potential applications of general canonical divergence for non-dually flat scenarios to future scientific efforts.
In conclusion, our work presents new proofs of pedagogical value that, to the best of our knowledge, do not appear anywhere in the literature. In this sense, our work is neither a review nor a simplification of existing proofs. However, our findings could be regarded as preparatory to results of more general applicability in information geometry, including the study of more general examples of statistical manifolds. In addition, as previously mentioned in the paper, our proofs could be further exploited to pursue more research-oriented open questions of a non-trivial nature, including investigating the ranking of different divergences capable of capturing a given dual structure.

Author Contributions

Writing—review & editing, D.F. and C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Amari, S.-I. Information geometry of the EM and em algorithms for neural networks. Neural Netw. 1995, 8, 1379–1408.
  2. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 57, 479–487.
  3. Felice, D.; Cafaro, C.; Mancini, S. Information geometric methods for complexity. Chaos 2018, 28, 032101.
  4. Amari, S.-I.; Nagaoka, H. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000.
  5. Lee, J.M. Riemannian Manifolds: An Introduction to Curvature, 1st ed.; Springer: New York, NY, USA, 1997; p. 176.
  6. Ay, N.; Jost, J.; Van Le, H.; Schwachhöfer, L. Information Geometry, 1st ed.; Springer International Publishing: Berlin, Germany, 2017.
  7. Eguchi, S. Geometry of minimum contrast. Hiroshima Math. J. 1992, 22, 631–647.
  8. Eguchi, S. Second order efficiency of minimum contrast estimators in a curved exponential family. Ann. Statist. 1983, 11, 793–803.
  9. Matumoto, T. Any statistical manifold has a contrast function: On the C3-functions taking the minimum at the diagonal of the product manifold. Hiroshima Math. J. 1993, 23, 327–332.
  10. Felice, D.; Ay, N. Towards a canonical divergence within information geometry. Inf. Geom. 2021, 4, 65–130.
  11. Ciaglia, F.M.; Di Cosmo, F.; Felice, D.; Mancini, S.; Marmo, G.; Pérez-Pardo, J.M. Hamilton-Jacobi approach to potential functions in information geometry. J. Math. Phys. 2017, 58, 063506.
  12. Wong, T.-K.L. Logarithmic divergences from optimal transport and Rényi geometry. Inf. Geom. 2018, 1, 39–78.
  13. Ay, N.; Amari, S.-I. A novel approach to canonical divergences within information geometry. Entropy 2015, 17, 8111–8129.
  14. Henmi, M.; Kobayashi, R. Hooke’s law in statistical manifolds and divergences. Nagoya Math. J. 2000, 159, 1–24.
  15. Jost, J. Riemannian Geometry and Geometric Analysis, 7th ed.; Universitext; Springer International Publishing: Berlin, Germany, 2017.
  16. Calin, O.; Udriste, C. Geometric Modeling in Probability and Statistics; Springer: Berlin, Germany, 2014.
  17. Toponogov, V.A.; Rovenski, V. Differential Geometry of Curves and Surfaces: A Concise Guide; Birkhäuser: Basel, Switzerland, 2006.
  18. Amari, S.-I. Information Geometry and Its Applications, 1st ed.; Springer Publishing Company: Berlin, Germany, 2016.
Figure 1. The domain of the double definite integral in Equation (10) is highlighted in grey color.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Felice, D.; Cafaro, C. Explicit Information Geometric Calculations of the Canonical Divergence of a Curve. Mathematics 2022, 10, 1452. https://doi.org/10.3390/math10091452
