1. Introduction
A triple $(M, g, \nabla)$ is called a statistical manifold if $(M, g)$ is a semi-Riemannian manifold and ∇ is a torsion-free affine connection on M such that
$$(\nabla_X g)(Y, Z) = (\nabla_Y g)(X, Z) \qquad (1)$$
for all vector fields X, Y, Z on M.
This notion was defined by S. Amari in [1], as a geometrical model for some facts in Statistics: M is a parameter space of probability distributions, g is the Rao–Fisher metric deduced from the Boltzmann–Gibbs–Shannon entropy function, and ∇ is a tool for asymptotic estimations.
The geometrization based on both a semi-Riemannian metric and an affine connection was already used (until now, without great success) in different attempts to unify relativistic gravity models with electromagnetic ones (Weyl, Eddington, Einstein, Kaluza, etc., in the first half of the 20th century; see [2] for a recent review). In Statistics, by contrast, the model was considered important and fruitful. Today it constitutes a modern and promising area of active research (see, for example, [3,4,5,6]).
Two connections $\nabla$ and $\nabla^*$ are dual with respect to g if
$$Z\,g(X, Y) = g(\nabla_Z X, Y) + g(X, \nabla^*_Z Y)$$
for all vector fields X, Y, Z on M.
From a differential geometric point of view, the dualistic structure generalizes the invariance of the inner product under parallel translation by metric connections. Moreover, the existence of a dually flat structure on a manifold reveals some topological and geometrical properties of the manifold. The notion traces back to Norden and was adapted to statistical manifolds, where remarkable families of dual connections contain information about the dualistic properties of exponential families of probability distributions [7].
This initial (and already classical) setting was generalized in several ways. We point out here only a direction opened by Kurose and Matsuzoe, who considered statistical manifolds with non-symmetric connections ∇ [8,9,10], intended for quantum field theories. As statistical manifolds have close relations to the geometry of affine immersions, statistical manifolds admitting torsion have relations to the geometry of affine distributions.
In this paper, we review in a creative manner the fundamentals of dual connections and of statistical manifolds and we give new examples (Section 2 and Section 4). The families of these new examples depend on many “parameters” and can thus be fitted to various applications. We establish “controls” over the parameter manifold M and its various affine modules of connections. These “controls” are provided by deformation algebras defined by the difference of two connections, or by some Riemannian Rinehart bi-algebras associated to the Riemannian metric and to the canonical Lie–Rinehart algebra of the manifold. The deformation algebras were intensively studied during 1970–1990, as a natural translation between algebraic and differential geometric properties of differentiable manifolds. The Riemannian Rinehart structures are more recent and constitute a promising area of research. In Section 2, we give some pointers to the literature related to both these algebraic objects. We show that arbitrary pairs of dual connections are determined by pairs of a metric connection and an arbitrary connection, or by triples formed by a metric connection, an arbitrary connection and a function (thus generalizing the pairs of the so-called α-connections).
In Section 3, we prove formulas for the main invariants associated to the dual connections. We determine links between their Bianchi identities and express the Jacobi equation for geodesics in terms of dual connections.
In Section 4, some very general families of statistical manifolds are defined, depending on many parameters. Here, we characterize the semi-Riemannian manifolds admitting flat dual connections with torsion, thus solving a problem suggested in [7].
In Section 5, we define nine new families of statistical manifolds, denoted SMAT$_1$, …, SMAT$_9$, which generalize the known ones, by using some special hypotheses on the curvature and on the torsion tensor fields.
In Section 6, we determine how many independent bi-invariant statistical structures may exist on a compact Lie group and how many independent left invariant statistical structures may exist on an arbitrary Lie group.
Section 7 is devoted to some examples of statistical manifolds, in particular frameworks from Information Geometry.
2. Dual Connections and Controls over Some Affine Modules of Connections
Let $(M, g)$ be a semi-Riemannian manifold with the Levi–Civita connection $\nabla^0$. We denote by and the sets of affine connections and of symmetric (i.e., torsion-free) connections on M, respectively, endowed with the canonical structure of affine -module. We define and . As , we have the following inclusions of (non-void) affine submodules
Remark 1. (i) Each connection may be uniquely written as , where relates to the torsion tensor field by the relation We have the complete determination of the connection ∇ by the (1,2)-tensor field A, and, mutatis mutandis, the determination of the affine module by its direction, the real vector space . We may interpret the semi-Riemannian geometry of as a reference point and “vary” it by affine geometries , acting through the “control” A. Moreover, once A is fixed, gets a structure of -algebra, called the deformation algebra of the pair (), by the multiplication . Translations between the algebraic properties of these deformation algebras and the geometric properties of the ambient manifold were extensively studied (see, for example, in [11,12,13] and the references therein). We must point out here alternative invariants, also studied in the literature: the “cubic forms” and ; we shall not use them in our paper.
(ii) We have the following characterizations:
(iii) A Riemannian Rinehart space is a Lie–Rinehart algebra endowed with a “Riemannian metric”, i.e., a musical (generalized) scalar product (see [14] for details). This construction establishes a purely algebraic framework for many properties which are commonly studied in Riemannian geometry, by analytic and geometric tools. In particular, on the Lie–Rinehart algebra [14], we get a canonical structure of Riemannian Rinehart space, induced by the (generalized) scalar product , canonically associated to the Riemannian metric g.
Fix a connection , with the (fixed) control A. Define, as above, the multiplication . It follows that on we get an additional algebra structure, which combines the properties of the deformation algebra with those from the Lie–Rinehart algebra . We believe that these bi-algebras deserve a closer attention of their own.
For our purposes here, we restrict ourselves to the following result.
Theorem 1. Let be a Riemannian manifold and a connection on M. Then, if and only if the Riemannian Rinehart bi-algebra satisfies the compatibility condition We read this formula as a mutual determinacy, which expresses the obstruction to “self-adjointness”, on the left side, through the obstruction to commutativity, on the right side.
Remark 2. (i) Consider the affine transformation , which associates to each the affine connection , given by The connection is called the dual of ∇ [1,7]; the relation of duality is symmetric, as Φ is an involution. Obviously, is (the only) self-dual connection, as the unique fixed point of Φ. The transformation Φ depends on the Riemannian metric only.
(ii) Suppose and denote the adjoint operators (at the right and at the left, respectively) given by We have also a direct relation between and , which allows us to study only one of these operators, namely, Obviously, if and only if A is commutative.
We have , where . The deformation algebras and are equivalent, i.e., contain the same (algebraic and) geometric information. Obviously, if and only if .
(iii) Another interesting deformation algebra is , which measures another obstruction for ∇ to coincide with (i.e., if and only if if and only if ).
(iv) The tensor fields , , and (and their associated deformation algebras, or their associated Riemannian Rinehart bi-algebras) act as controls over the affine modules of special connections previously studied. The information they carry is, of course, redundant and may be translated and simplified, depending on the context.
(v) Let $(\nabla, \nabla^*)$ be dual connections and their mean connection. It is well known [7,10] that . We remark also that . It is interesting that we also have a strong converse statement (inspired by a suggestion in [7] (p. 51)): consider , with the torsion tensor field of the form , for some skew-symmetric tensor field . Let ∇ be a connection on M, with torsion B. Define . It follows that . Then, and . In conclusion, to any connection with the respective special torsion, we can associate an infinite family of dual connections, such that is the mean connection for every element of this family. In particular, this works for the Levi–Civita connection $\nabla^0$, because in this case. (An elementary comparison: in order to identify a closed interval on the real line, we may specify both its ends, or we may specify one of its ends and its midpoint.)
The next example shows that we may generalize the previous “arithmetic” mean.
Example 1. Consider an arbitrary couple of conjugate connections. We define a 1-parameter family of connections , called the f-connections, such that are dually coupled to the metric, where We get , with and we obtain a family of deformation algebras . In particular, we have the α-connections . These connections generalize the classical ones (see, for example, [7,15]), which are symmetric.
In the general case, a short calculation shows that and
Conversely, start with a connection , satisfying (3) and (4) for some . Consider a function f and the connection . Then, there exists a unique connection such that , and () are conjugate. For , we recover the construction in Remark 2 (v). (An elementary comparison: in order to identify a closed interval on the real line, we may specify both its ends, or we may specify one of its ends and the point which divides the interval in some given “ratio” f.)
Finally, we remark that Formula (3) shows the direct proportionality between the obstruction to the -parallelism of the metric g (on the left side) and the extent to which ∇ differs from (i.e., differs from 0), weighted through the “conformal factor” .
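As a purely illustrative check of the duality relation and of the α-connections mentioned in Example 1, the following sketch works at a single point of a chart. It assumes the coordinate form of duality, ∂_i g_{jk} = Γ_{ij,k} + Γ*_{ik,j}, and Amari's convention ∇^α = ((1+α)/2)∇ + ((1−α)/2)∇* for the α-connections; the sample arrays and the helper names (`dual_coefficients`, `alpha_connection`, `are_dual`) are hypothetical and are not taken from the paper.

```python
from fractions import Fraction
from itertools import product

N = 2  # chart dimension (illustrative choice)
INDICES = list(product(range(N), repeat=3))

# Hypothetical sample data at one point of the chart.
# dg[i, j, k] stands for the derivative d_i g_{jk}; it is symmetric in (j, k).
dg = {(i, j, k): Fraction(i + j + k + j * k) for (i, j, k) in INDICES}
# gamma[i, j, k] stands for g(nabla_{d_i} d_j, d_k); no symmetry is assumed,
# so the connection may have torsion.
gamma = {(i, j, k): Fraction(i + 2 * j - 3 * k + i * j * k) for (i, j, k) in INDICES}


def dual_coefficients(gam, dg):
    """Lowered coefficients of the dual connection: G*_{ij,k} = d_i g_{jk} - G_{ik,j}."""
    return {(i, j, k): dg[i, j, k] - gam[i, k, j] for (i, j, k) in gam}


def alpha_connection(gam, gam_star, alpha):
    """Assumed convention: (1 + alpha)/2 * nabla + (1 - alpha)/2 * nabla^*."""
    a = Fraction(alpha)
    return {t: (1 + a) / 2 * gam[t] + (1 - a) / 2 * gam_star[t] for t in gam}


def are_dual(gam_a, gam_b, dg):
    """Check d_i g_{jk} = Ga_{ij,k} + Gb_{ik,j} for all index triples."""
    return all(gam_a[i, j, k] + gam_b[i, k, j] == dg[i, j, k] for (i, j, k) in gam_a)


gamma_star = dual_coefficients(gamma, dg)
alpha = Fraction(1, 3)
plus = alpha_connection(gamma, gamma_star, alpha)
minus = alpha_connection(gamma, gamma_star, -alpha)
mean = alpha_connection(gamma, gamma_star, 0)

print(are_dual(gamma, gamma_star, dg))  # the pair (nabla, nabla^*)
print(are_dual(plus, minus, dg))        # the pair (nabla^alpha, nabla^(-alpha))
print(are_dual(mean, mean, dg))         # the mean connection is self-dual, i.e., metric
```

Exact rational arithmetic is used so that the three identities can be tested with equality rather than with a tolerance. The last check reflects the remark on the mean connection in Remark 2 (v).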
3. The Main Geometric Invariants Associated to Dual Connections
Let ∇ and $\nabla^*$ be dual connections on a semi-Riemannian manifold $(M, g)$, with the Levi–Civita connection $\nabla^0$. We denote by T, t, R, , , and the torsion tensor, the “mean torsion” one-form, the curvature tensor, the Ricci tensor, the Faraday tensor, and the “pseudo-scalar” curvature of ∇, respectively, defined by and . Similar geometric objects associated to $\nabla^*$ and $\nabla^0$ will be denoted with an upper ∗ or 0, respectively. Denote a local orthonormal basis of vector fields on .
Remark 3. Using (2), we can determine the previous invariants of in terms of ∇ and g: where we denoted ,
Remark 4. We denote . Then, we can express the invariants of ∇ in terms of g and A: If we denote , then we have the equivalent formula Using coordinates associated to the given orthonormal basis on M, one has The Ricci tensor is symmetric iff
The following result is a simple consequence of the previous formulas and may be considered “folklore”.
Proposition 1. Let . Then, (i) if and only if ; (ii) if and only if ; (iii) if and only if ; (iv) if , then ; (v) ∇ and have the same (parameterized) geodesics if and only if the tensor field is skew-symmetric; (vi) Let . If ∇ and have the same (parameterized) geodesics, then , i.e.,
Proposition 2. Let and be dual connections on . Then, the Bianchi identities for ∇ impose the following conditions upon the control A: Similar conditions arise for , replacing A by .
Remark 5. With the notations in the previous proposition, we deduce a consequence of combining the first Bianchi identity for ∇ and The second Bianchi identity leads to a much more complicated compatibility relation and we omit it.
Theorem 2. Let be dual connections on the semi-Riemannian manifold , with and and let be a geodesic of g. Then, the Jacobi fields J along γ are solutions of If, moreover, ∇ is symmetric, then The assertion still holds if we replace ∇ by and A by .
Proof. The Jacobi equation for reads Replacing and as functions of ∇, R and A, we get the identity we were looking for. □
The previous theorem provides formulas for the transversal control of the geodesics behaviour, expressed in terms of the dual connections instead of the metric. Conversely, we may obtain formulas which express the Jacobi equation along the auto-parallel curves of ∇ or (i.e., ∇-“geodesics” or -“geodesics”), in terms of g, , and A or , respectively.
4. Existence and Characterizations of Statistical Structures
A triple is a statistical manifold if . In this case, () is a statistical manifold too. Alternatively, we denote instead of , in order to point out the implicit duality inside.
The centro-affine properties of w.r.t. (together with the metric properties) constitute the geometrical core of the theory of statistical manifolds.
Remark 6. Given the semi-Riemannian manifold (), we point out these five (equivalent) characterizations of a statistical manifold ():
(I) by relation (1) and .
(II) through such that , with
(III) through and such that
(IV) through the dual connection in (2), such that .
(V) through the dual connection in (2), such that both A, are symmetric.
The set of all the invariant statistical structures on a Lie group G will be characterized in Section 6. The Lie algebra will allow us to “count” more easily “how many” statistical structures exist on G.
Example 2. (classical statistical manifolds, i.e., for dual connections without torsion) Consider a fixed n-dimensional semi-Riemannian manifold (). We have the canonical (and trivial) structure of statistical manifold (), with . The set of all the statistical structures on () is parameterized by . This is a large set (see Section 6) and the (1,2)-type deformation tensor fields measure “how far” a statistical structure is from the canonical one. In what follows, we construct particular new statistical structures on (), in a bottom-up hierarchical way.
(i) Fix and denote its dual 1-form w.r.t. g. Define and . Then, and . Thus, on each semi-Riemannian manifold, there always exists an infinite family of (distinct) dual connections, each of them corresponding to a different statistical structure associated to ().
(ii) Suppose M is parallelizable and consider a fixed basis in . We denote , as defined in (i). We have n “independent” statistical structures () with . Moreover, each affine combination of these connections provides a new statistical structure, a “mean” with specified weights, which may control the global measuring in a specific way (w.r.t. the fixed basis, of course).
(iii) If M is not parallelizable, we may consider (if any) a linearly independent set in , , and make a similar construction as in (ii).
(iv) Fix and denote its adjoint w.r.t. g, i.e., . Fix , and denote , their dual vector fields w.r.t. g, i.e., and . Fix two symmetric , , , define and . Then ∇ is a symmetric connection and , where and , are given by and .
We have if and only if In particular, it follows that and Relation (8) may be viewed as an equation in the unknowns , which always has solutions, as depends explicitly on the variables from the right side.
We distinguish the following special cases:
(iv) Suppose . Then, and (8) reads . In particular, for and we get the examples from the family (iii).
(iv) Suppose . Then, and (8) reads
(iv) If , then
(iv) If , then (a) If, moreover, , then In particular, if , then (b) If, moreover, , then In particular, if , then
(iv) If , then (a) If, moreover, , then In particular, if , then (b) If, moreover , then In particular, if , then
All these examples show that, on every semi-Riemannian manifold, there always exist many families of (distinct) dual connections; the choice of the parameters allows a large flexibility and variability of the possible associated statistical models.
(v) Let and such that and is non-degenerate. Then () is a statistical manifold. Here, we used (1) and (We remark that the hypothesis is much weaker than imposing the curvature flatness of ∇.) The dual connection of ∇ is uniquely determined by In particular, if f is a divergence function associated to a parameterized family of probability distributions, then g is the Fisher metric associated to it.
Remark 7. In [7], p. 180, the authors suggest some open problems of interest. The third one is the following: “Let be a Riemannian space. We say that may be flattened if there exists a pair of affine connections ∇ and such that is dually flat. Show whether this is always possible. If not, find the invariant which characterizes those spaces which may be flattened.”
From (6) we see that if and only if . Suppose that . The existence of a dually flat structure is then equivalent, in this case, to the existence of an affine structure on M, which is a longstanding open problem in Differential Geometry. For example, there exist Lie groups which do not admit such a left invariant structure.
If we relax the hypothesis and allow ∇ and $\nabla^*$ to have torsion, then we obtain the following characterization of the spaces which may be flattened.
Theorem 3. (i) Let be a semi-Riemannian space with a pair of flat dual connections, not necessarily symmetric. Then M is parallelizable.
(ii) Conversely, suppose is a parallelizable semi-Riemannian space. Then, there exists a pair of flat dual connections , compatible with g.
Proof. (i) Because ∇ is flat, it follows that M is a parallelizable manifold. This is a purely affine differential result, which has nothing to do with the semi-Riemannian structure of the space. Furthermore, it does not involve any properties related to the torsion, for example the possible symmetry of the connection.
(ii) Consider a global basis of vector fields on . Let and be the Cartan–Schouten connections on M, defined by and , respectively, for all indices . Then, where , are arbitrary vector fields on M. One knows that ; in general, both connections have non-null torsion. We define and , using (2). In the chosen frame, their coefficients are given, respectively, by Then, and are dually flat. It is possible that, in some cases, these two structures coincide. □
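The flatness of the two Cartan–Schouten connections used in the proof can be checked by hand in the simplest non-abelian case. The sketch below is only an illustration under explicit assumptions: the global frame is taken to be a left-invariant frame on a Lie group with Lie algebra so(3) (bracket realized as the cross product on R^3), the (−)-connection is assumed to make every frame field parallel, and the (+)-connection is assumed to be ∇_X Y = [X, Y] on frame fields. The function names are hypothetical.

```python
from itertools import product


def bracket(x, y):
    """Lie bracket of so(3), realized as the cross product on R^3."""
    return (
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    )


def sub(x, y):
    """Componentwise difference of two vectors."""
    return tuple(a - b for a, b in zip(x, y))


ZERO = (0, 0, 0)
BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def nabla_minus(x, y):
    """Assumed (-)-connection on left-invariant fields: the frame is parallel."""
    return ZERO


def nabla_plus(x, y):
    """Assumed (+)-connection on left-invariant fields: nabla_X Y = [X, Y]."""
    return bracket(x, y)


def torsion(nabla, x, y):
    """T(X, Y) = nabla_X Y - nabla_Y X - [X, Y]."""
    return sub(sub(nabla(x, y), nabla(y, x)), bracket(x, y))


def curvature(nabla, x, y, z):
    """R(X, Y)Z = nabla_X nabla_Y Z - nabla_Y nabla_X Z - nabla_[X,Y] Z."""
    first = sub(nabla(x, nabla(y, z)), nabla(y, nabla(x, z)))
    return sub(first, nabla(bracket(x, y), z))


flat_minus = all(curvature(nabla_minus, x, y, z) == ZERO
                 for x, y, z in product(BASIS, repeat=3))
flat_plus = all(curvature(nabla_plus, x, y, z) == ZERO
                for x, y, z in product(BASIS, repeat=3))
print(flat_minus, flat_plus)
print(torsion(nabla_minus, BASIS[0], BASIS[1]), torsion(nabla_plus, BASIS[0], BASIS[1]))
```

Under these assumptions, the vanishing of the curvature of the (+)-connection is just the Jacobi identity, while the two torsions are opposite to each other and non-zero, in line with the remark that both connections have non-null torsion in general.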
Remark 8. The proof of the second part of the previous theorem suggests the following question: on which parallelizable semi-Riemannian manifold does there exist a dually flat structure which, moreover, has both connections with parallel torsion? The Lie groups endowed with left-invariant semi-Riemannian metrics are the first candidates, as then has -parallel torsion (see also Section 6).
6. Invariant Statistical Structures on Lie Groups
Let G be an n-dimensional Lie group and its Lie algebra. A left invariant statistical structure on G is defined by a left invariant semi-Riemannian metric g and a left invariant connection ∇ satisfying (1). A similar definition works for right invariant statistical structures. A statistical structure is called bi-invariant if it is simultaneously left and right invariant. Linearity of the tensorial relations allows simpler expressions of the characteristic properties, when acting on invariant vector fields. For example, for a left invariant statistical structure, relation (1) is equivalent to for all and (2) is equivalent to for all .
The simplest (and trivial) example of bi-invariant statistical structure is given by a bi-invariant semi-Riemannian metric together with its Levi–Civita connection , for all . The real “line” contains only bi-invariant connections, so the dimension of the space of bi-invariant connections is at least 1.
On an n-dimensional abelian Lie group G, any left invariant geometrical object is also bi-invariant. As the set of symmetric left invariant connections may be identified with the set of symmetric type (1,2) tensors on , it follows that, in this case, there exist plenty of bi-invariant statistical structures on G, different from the (previous) trivial ones.
The situation changes drastically as soon as we leave the abelian realm.
Proposition 3. Let g be a bi-invariant semi-Riemannian metric on a compact simple Lie group G. Any bi-invariant statistical structure () is trivial, with the exception of $SU(n)$, for $n \geq 3$, which admits an infinite family corresponding to for any real number α. (By i we denote the imaginary unit.)
Proof. The Levi–Civita connection of g is $\nabla^0_X Y = \tfrac{1}{2}[X, Y]$. Consider a symmetric bi-invariant connection ∇ on G such that the triple () is a bi-invariant statistical structure. Then, , for all , where A is a symmetric bi-invariant type (1,2) tensor on .
In [17], it was proven that all the bi-invariant connections on G are trivial, except on $SU(n)$ (for $n \geq 3$), where there exists a family of connections, depending on two real parameters and , of the form It follows that ∇ must satisfy (11) for any real number . □
Corollary 1. Let g be a bi-invariant semi-Riemannian metric on a compact Lie group G. Suppose , where the center has dimension p and the ’s are the simple ideals in . Let r be the number of ’s () in . Then, the dimension of the space of bi-invariant statistical structures () is given by
Proof. The dimension of the space of all the bi-invariant connections on G was determined in [17] to be For statistical structures one must restrict to symmetric bi-invariant connections only, which leads to the required number.
In particular, when G is simple, we have , and the dimension of the space of bi-invariant symmetric connections is 1 (as stated in Proposition 3). □
Corollary 2. Let g be a bi-invariant semi-Riemannian metric on (). Then, any bi-invariant statistical structure () is given by for arbitrary real numbers .
Proof. For , we have , and the dimension of the space of bi-invariant symmetric connections is 4 (as follows from Corollary 1).
A basis for the bi-invariant connections on is given [17] by where I is the identity matrix. As statistical structures involve only symmetric connections, we see that from the “affine connections frame” we get that may be uniquely expressed as a combination of . (Recall that all the geometric objects here act on the Lie algebra.) We found the required general form of a symmetric bi-invariant connection ∇. □
Remark 11. (i) The group is isomorphic to , so it admits a unique bi-invariant statistical structure (and that is the trivial one).
(ii) On , the space of bi-invariant statistical structures () has only three dimensions, as we have [17] the following relation of linear dependence on and thus for arbitrary real numbers .
(iii) All the symmetric bi-invariant connections on non-abelian compact Lie groups are non-flat, due to a result of Milnor [18].
Proposition 4. Let G be an n-dimensional Lie group, g a left invariant semi-Riemannian metric on G. Then, the space of left invariant statistical structures () has the dimension .
Proof. Any left invariant connection ∇ may be written where A is a left invariant type (1,2) tensor on . As ∇ must be symmetric and subject to (7), it follows that for all . A simple combinatorial argument counts the number of independent tensors A to be which finishes the proof. □
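The combinatorial count in the proof can be illustrated by a short enumeration. The sketch assumes that the two conditions on A (symmetry, and compatibility with g) amount to the total symmetry of the cubic form C(X, Y, Z) = g(A(X, Y), Z) on the Lie algebra; under that assumption, the independent components of A are in bijection with the unordered index triples, whose number is n(n+1)(n+2)/6. The function name is hypothetical.

```python
from itertools import product
from math import comb


def count_free_components(n):
    """Count orbits of index triples (i, j, k) under permutations, i.e., the
    independent components of a totally symmetric cubic form in dimension n."""
    orbits = {tuple(sorted(t)) for t in product(range(n), repeat=3)}
    return len(orbits)


for n in range(1, 6):
    # Compare the enumeration with the closed formula n(n+1)(n+2)/6.
    print(n, count_free_components(n), n * (n + 1) * (n + 2) // 6)
```

For instance, in dimension 2 the enumeration gives 4 free components, and in dimension 3 it gives 10.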
Remark 12. (i) Let G be an n-dimensional Lie group and . As a consequence of Proposition 4, we deduce that the set of all the (semi-Riemannian!) left invariant statistical structures () can be parameterized by the direct product of with an open subset of (corresponding to the symmetric non-singular matrices).
(ii) The left invariant connection involved in the previously considered left invariant statistical structures is not supposed to be flat. In this context, flatness would be a very strong restriction, which might forbid the existence of such structures. Moreover, up to now, the existence of flat symmetric left invariant connections on Lie groups is an open problem.
7. Examples
In this section, we shall use the framework and notations adapted from [7] (Chapters 2 and 3) and [15], where more details may be found.
Consider positive integers and M a connected m-dimensional differentiable manifold. The set is a parametric model for the domain of a family of probability distributions , , , . All integrals have the domain . For an arbitrary function , we denote with a function from M to . We write when local coordinates on M are involved. In many applications, f depends on p and the values of the operator E measure some kind of entropy, thus the notation.
Let , be the log-likelihood function, the Gibbs entropy function, given by , i.e., and the -matrix of the Fisher Riemannian metric, defined by where we denoted . We have [7]
Let be a fixed real number. Then, the connection from Example 1 has the following coefficients, calculated at a point : Here, the coefficients of , with three lower indices, are defined by The coefficients of the Levi–Civita connection of the metric g (also known as the Christoffel symbols of the first kind) are Whenever possible, we shall avoid writing the point in formulas. For example,
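To make the Fisher metric concrete, the following sketch evaluates it numerically for one familiar model. The choices are assumptions made for illustration only: the model is the family of normal densities on the real line with coordinates (μ, σ), the metric is taken in the standard form g_ij = ∫ p ∂_i l ∂_j l dx with l = log p, and the integral is approximated by the trapezoidal rule on a truncated interval. The function names are hypothetical. The closed form for this family is the well-known diagonal matrix with entries 1/σ² and 2/σ².

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def score(x, mu, sigma):
    """Partial derivatives of l = log p with respect to (mu, sigma)."""
    d = x - mu
    return (d / sigma**2, (d * d - sigma**2) / sigma**3)


def fisher_metric(mu, sigma, half_width=12.0, steps=20000):
    """Approximate g_ij = integral of p * d_i l * d_j l by the trapezoidal rule
    on the interval [mu - half_width*sigma, mu + half_width*sigma]."""
    a = mu - half_width * sigma
    b = mu + half_width * sigma
    h = (b - a) / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(steps + 1):
        x = a + s * h
        weight = 0.5 if s in (0, steps) else 1.0
        p = normal_pdf(x, mu, sigma)
        dl = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += weight * h * p * dl[i] * dl[j]
    return g


g = fisher_metric(mu=0.5, sigma=2.0)
print(g)  # to be compared with [[1/sigma^2, 0], [0, 2/sigma^2]]
```

The truncation at twelve standard deviations is a pragmatic choice; the neglected tails are far below double precision for this integrand.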
Example 3. Let ∇ be an arbitrary connection on M, given by , with . Denote and the coefficients (with three lower indices) of a connection ∇.
We shall choose A in order to provide examples for SMAT’s.
(i) If , then .
(ii) Let be another function providing an entropy function , given by , i.e. Many such choices are possible, as many entropy functions were suggested in the last decades (the Tsallis entropy, the von Neumann entropy, the Rényi entropy, etc.) and their various generalizations.
We shall combine the partial derivatives of l and f in , in order to get more examples. We consider a generic where are constants to be determined. (1) If where , , , , then is SMAT. (2) If then is SMAT. For other families of SMAT’s, the calculations are similar but more tedious. We shall follow now another path, under some more restrictive assumptions.
Example 4. With the previous notations, let have , so , This may occur, for example, for exponential families of probability distributions, i.e., [5,7] with . Then, we have the following characterizations: Let us take and One has We obtain Then Let us take Therefore,
If one considers then we have one example of connection with torsion for SMAT and SMAT. Let and If we may consider Let us take the torsion We can consider If we have one example of affine connection with torsion for SMAT and SMAT. where In dimension 2 one can consider a solution for SMAT. In dimension 2, if we may consider In dimension 2, if one can consider
8. Discussions
The paper tries to clarify some notions and results from Differential Geometry, which are motivated by models arising from Statistics, related to statistical manifolds and to dual connections. The main idea is to distinguish, at each level of understanding, which are the appropriate algebraic and/or geometric “controls” for the variability of the models. Thus, we pointed out the deformation algebras and the Riemannian Rinehart bi-algebras , as algebraic invariants underlying the dual connections and statistical manifolds.
Second, we characterized the differentiable manifolds admitting dually flat statistical structures with torsion (Theorem 3) and proved several results which count the number of statistical manifold structures on compact Lie groups (Section 6).
Third, we defined new families of dual connections and of statistical manifolds with and without torsion (including the families SMAT$_1$, …, SMAT$_9$), which impose new assumptions on the curvature and torsion tensor fields. In Section 7 we exemplify them, on particular manifolds of probability distributions.
Several research directions open up: (i) the purely algebraic study of the Riemannian Rinehart bi-algebras and of the deformation algebras, associated to specific control tensor fields on statistical manifolds; (ii) the relevance of the f-connections for statistics, with arbitrary (or specific) functions f, extending the studies when the function is constant; (iii) specific statistical applications for the SMAT structures; and (iv) optimization results on the space of the control tensors A.