Article

Mixture and Exponential Arcs on Generalized Statistical Manifold

by Luiza H. F. De Andrade 1,*, Francisca L. J. Vieira 2, Rui F. Vigelis 3 and Charles C. Cavalcante 4
1 Department of Natural Science, Mathematics and Statistics, Federal Rural University of Semi-Arid Region, Mossoró 59.625-900, Brazil
2 Department of Mathematics, Regional University of Cariri, Juazeiro do Norte 63105-000, Brazil
3 Computer Engineering, Campus Sobral, Federal University of Ceará, Sobral 62010-560, Brazil
4 Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-900, Brazil
* Author to whom correspondence should be addressed.
Entropy 2018, 20(3), 147; https://doi.org/10.3390/e20030147
Submission received: 19 January 2018 / Revised: 18 February 2018 / Accepted: 21 February 2018 / Published: 25 February 2018

Abstract

In this paper, we investigate the mixture arc on generalized statistical manifolds. We show that the generalization of the mixture arc is well defined, and we provide a generalization of the open exponential arc and its properties. We use the model of a $\varphi$-family of distributions to describe our general statistical model.

1. Introduction

In the geometry of statistical models, information geometry [1,2,3] is the part of probability theory dedicated to investigating probability density functions equipped with a differential-geometric structure. A differential-geometric structure for multi-parameter families of distributions was provided in [4]. In the mid-1980s, other topics related to the subject, such as fiber bundle theory and duality of connections of statistical models, were investigated by Amari [5] and Amari and Nagaoka [6], respectively. In the parametric case, the exponential, mixture and $\alpha$-connections, as well as their dual structure, are among the most important geometric objects [6], since the dual structure of the $\alpha$-connections is the key point distinguishing statistical manifolds from arbitrary differentiable manifolds. The divergence function is an essential topic in information geometry, for both the parametric and the non-parametric case, since a metric and dual connections can be induced from a divergence [7,8,9,10]. Finding an information-geometrical foundation for multi-parameter families of probability distributions, with a more general description, is one of the topics of interest in information geometry [11,12,13,14].
Non-parametric statistical models [15] are important in a wide range of areas [16,17]. In the parametric case, the manifold of probability density functions inherits a Euclidean topology from the space of its natural parameters. In the non-parametric case, a major challenge is to define a convenient topology and a notion of convergence. Pistone and Sempi [18] were the first to formulate a rigorous infinite-dimensional extension. In that work, the set $\mathcal{P}_\mu$ of all strictly positive probability densities was endowed with the structure of an exponential Banach manifold, using Orlicz spaces associated with a Young function. In a later work [19], more properties of the statistical manifold were studied, specifically regarding the orthogonality condition.
As in the parametric case, the mixture and exponential connections are among the most important geometric objects in non-parametric models. To find these connections, it is necessary to guarantee the existence of the open arcs, which are the geodesics of the manifold. Using the notion of exponential convergence, Gibilisco and Pistone [20] investigated those connections. In that work, the exponential and mixture connections were built in such a way that the relation between them is the same as in the parametric case. Another approach was used in [21], where the mixture arc was additionally studied. Moreover, Grasselli [21] proved that two probability densities in the same neighborhood are connected by an open mixture arc if and only if the difference between their random variables is bounded.
The exponential statistical manifold was later studied in [22] with another system of charts, through the statistical model $\mathcal{E}(p)$, called the maximal exponential model. Cena and Pistone [22] proved that this model is the set of all positive densities connected to a given positive density $p$ by an open exponential arc, and vice versa. In that work, the open mixture arc and the open exponential arc were used to discuss properties of this model, such as the e-connection and the m-connection, in the same way as in [6]. The exponential model $\mathcal{E}(p)$, together with the open exponential and mixture arcs, was also studied recently by Santacroce et al., 2016 [23] and Santacroce et al., 2017 [24], where a proof of duality properties of statistical models was provided. Examples of applications of non-parametric information geometry to statistical physics using the connection by open arcs were studied in [25].
The generalization of the exponential statistical manifold has been an active topic of research in recent years. Pistone [26] used Kaniadakis' $\kappa$-exponential [27] in the construction of a statistical manifold. Vigelis and Cavalcante [28] proposed a $\varphi$-family of probability distributions $\mathcal{F}_c^\varphi$, which generalizes the exponential family $\mathcal{E}(p)$. This generalization is based on replacing the exponential function by a deformed exponential $\varphi$ that satisfies some properties and provides the set $\mathcal{P}_\mu$ with a Banach manifold structure, the so-called generalized statistical manifold. In [29], a review of non-parametric information geometry, with specific issues of the infinite-dimensional setting, is provided. In that work, the deformed exponential manifold was studied with a deformed exponential function defined in [30], and a model space was built according to the proposal in [28].
Necessary and sufficient conditions for any two probability distributions to be connected by a $\varphi$-arc were given in [31]. In this work, we ensure the existence of a generalized mixture arc for probability distributions in the same $\varphi$-family $\mathcal{F}_c^\varphi$, with a deformed exponential function that satisfies some properties. Moreover, we find a generalization of open exponential arcs and we prove, in the same way as in [22], that the $\varphi$-family $\mathcal{F}_c^\varphi$ is the component connected to a given positive density $p = \varphi(c)$, and vice versa.
The rest of the paper is organized as follows. In Section 2, we revisit results about Musielak–Orlicz spaces and $\varphi$-families of probability distributions. We also briefly recall the subdifferential of a convex function. In Section 3, where we provide our main results, we ensure that the generalized mixture arc is well defined. In Section 4, we discuss the generalized exponential and mixture arcs. Finally, our conclusions and perspectives are stated in Section 5.

2. Preliminary Results

The statistical manifold $\mathcal{P}_\mu$ can be equipped with the structure of a $C^\infty$-Banach manifold, using the Musielak–Orlicz space $L^{\Phi_c}$ associated with the Musielak–Orlicz function $\Phi_c(t,u) = \varphi(t, c(t)+u) - \varphi(t, c(t))$. Each connected component of the statistical manifold gives rise to a $\varphi$-family of probability distributions $\mathcal{F}_c^\varphi$. In this section, we provide an introduction to Musielak–Orlicz spaces and the construction of the $\varphi$-family of probability distributions.

2.1. $\varphi$-Families of Probability Distributions

Let $(T, \Sigma, \mu)$ be a $\sigma$-finite, non-atomic measure space. A function $\Phi\colon T \times [0,\infty) \to [0,\infty]$ is said to be a Musielak–Orlicz function if
(i) $\Phi(t,\cdot)$ is convex and lower semi-continuous for $\mu$-a.e. (almost everywhere) $t \in T$,
(ii) $\Phi(t,0) = \lim_{u \downarrow 0} \Phi(t,u) = 0$ and $\lim_{u \to \infty} \Phi(t,u) = \infty$ for $\mu$-a.e. $t \in T$,
(iii) $\Phi(\cdot,u)$ is measurable for each $u \geq 0$.
We notice that, by (i)–(ii), $\Phi(t,\cdot)$ is not identically equal to $0$ or $\infty$ on the interval $(0,\infty)$.
Let $L^0$ be the linear space of all real-valued, measurable functions on $T$. Given a Musielak–Orlicz function $\Phi$, we define the functional $I_\Phi(u) = \int_T \Phi(t, |u(t)|)\,d\mu$, for any $u \in L^0$. The Musielak–Orlicz space, the Musielak–Orlicz class, and the Morse–Transue space generated by a Musielak–Orlicz function $\Phi$ are defined, respectively, by
$$L^\Phi = \{u \in L^0 : I_\Phi(\lambda u) < \infty \text{ for some } \lambda > 0\},$$
$$\tilde{L}^\Phi = \{u \in L^0 : I_\Phi(u) < \infty\},$$
and
$$E^\Phi = \{u \in L^0 : I_\Phi(\lambda u) < \infty \text{ for all } \lambda > 0\}.$$
The Musielak–Orlicz space $L^\Phi$ is a Banach space when it is equipped with the Luxemburg norm given by
$$\|u\|_\Phi = \inf\left\{\lambda > 0 : I_\Phi\left(\frac{u}{\lambda}\right) \leq 1\right\},$$
or the Orlicz norm, represented as
$$\|u\|_{\Phi,0} = \sup\left\{\left|\int_T u v\,d\mu\right| : v \in \tilde{L}^{\Phi^*} \text{ and } I_{\Phi^*}(v) \leq 1\right\},$$
where $\Phi^*(t,v) = \sup_{u \geq 0}\,(uv - \Phi(t,u))$ is the Fenchel conjugate of $\Phi(t,\cdot)$, which is also a Musielak–Orlicz function. These norms are equivalent and the inequalities $\|u\|_\Phi \leq \|u\|_{\Phi,0} \leq 2\|u\|_\Phi$ hold for all $u \in L^\Phi$ [32]. A Musielak–Orlicz function is said to satisfy the $\Delta_2$-condition, or to belong to the $\Delta_2$-class (denoted by $\Phi \in \Delta_2$), if we can find a constant $\alpha > 0$ and a non-negative function $f \in \tilde{L}^\Phi$ such that
$$\alpha\,\Phi(t,u) \leq \Phi\left(t, \tfrac{1}{2}u\right), \quad \text{for all } u \geq f(t), \text{ and } \mu\text{-a.e. } t \in T.$$
If the Musielak–Orlicz function $\Phi$ satisfies the $\Delta_2$-condition, then $I_\Phi(u) < \infty$ for every $u \in L^\Phi$ [32]. In this case $L^\Phi$, $\tilde{L}^\Phi$ and $E^\Phi$ are equal as sets. Moreover, if the Musielak–Orlicz function $\Phi$ does not satisfy the $\Delta_2$-condition, $E^\Phi$ is a proper subspace of $L^\Phi$. Every function $\Phi$ that satisfies the $\Delta_2$-condition is finite-valued. Indeed, we define
$$b_\Phi(t) = \sup\{u \geq 0 : \Phi(t,u) < \infty\},$$
and, assuming that $b_\Phi(t) < \infty$, we get $\Phi(t, \tfrac{1}{2}u) < \alpha\,\Phi(t,u) = \infty$ for all $b_\Phi(t) < u < 2 b_\Phi(t)$, which implies that $\Phi$ cannot satisfy the $\Delta_2$-condition. For more information see, for instance, [32,33].
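As a rough numerical illustration of the Luxemburg norm defined above, the following sketch evaluates it by bisection. It rests on assumptions that are not part of the paper: the measure space is replaced by a finite set of points with positive weights (which is atomic, unlike the setting of the text), $\Phi$ is taken to be $\Phi(u) = e^u - u - 1$ independently of $t$, and the names `phi_fn`, `modular` and `luxemburg_norm` are hypothetical.

```python
import math

def phi_fn(u):
    """Illustrative Orlicz function exp(u) - u - 1 for u >= 0 (an assumption)."""
    if u > 700.0:            # avoid floating-point overflow; the value is huge anyway
        return math.inf
    return math.expm1(u) - u

def modular(u, weights, scale=1.0):
    """I_Phi(u / scale) on a finite set of points carrying positive masses `weights`."""
    return sum(w * phi_fn(abs(x) / scale) for x, w in zip(u, weights))

def luxemburg_norm(u, weights, tol=1e-12):
    """Approximate inf{lam > 0 : I_Phi(u / lam) <= 1} by bisection."""
    if all(x == 0 for x in u):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(u, weights, hi) > 1.0:   # enlarge hi until it is feasible
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(u, weights, mid) > 1.0:
            lo = mid                        # mid is too small to be feasible
        else:
            hi = mid
    return hi
```

Because the modular is continuous and strictly decreasing in the scaling parameter for this choice of $\Phi$, the value returned satisfies $I_\Phi(u/\|u\|_\Phi) \approx 1$, and the result is positively homogeneous in $u$.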
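We say that a Musielak–Orlicz function $\Phi$ satisfies the $\nabla_2$-condition, or belongs to the $\nabla_2$-class, if we can find a constant $\gamma > 1$ and a non-negative function $f \in \tilde{L}^\Phi$ such that
$$\gamma\,\Phi(t,u) \leq \Phi\left(t, \tfrac{1}{2}\gamma u\right), \quad \text{for all } u > f(t). \tag{3}$$
We notice that, if $\Phi \in \nabla_2$, then
$$d_\Phi = \lim_{u \to \infty} \frac{\Phi(t,u)}{u} = \lim_{u \to \infty} \Phi'_-(t,u) = \lim_{u \to \infty} \Phi'_+(t,u) = \infty.$$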
Example 1.
The function $\Phi\colon [0,\infty) \to [0,\infty)$ defined by
$$\Phi(u) = \exp(u) - u - 1$$
satisfies the $\nabla_2$-condition and does not satisfy the $\Delta_2$-condition.
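The two claims of this example can be probed numerically. The sketch below is only an illustration, not a proof: it evaluates the ratio $\Phi(u/2)/\Phi(u)$, which any admissible $\Delta_2$ constant $\alpha$ would have to stay below for large $u$, and it checks the $\nabla_2$ inequality $\gamma\Phi(u) \leq \Phi(\tfrac{1}{2}\gamma u)$ at a few points for the assumed choice $\gamma = 4$. The function names are hypothetical.

```python
import math

def Phi(u):
    """Phi(u) = exp(u) - u - 1 for u >= 0."""
    return math.expm1(u) - u

def delta2_ratio(u):
    """Phi(u/2) / Phi(u); a Delta_2 constant alpha must not exceed this for large u."""
    return Phi(0.5 * u) / Phi(u)

def nabla2_holds(u, gamma=4.0):
    """Pointwise check of gamma * Phi(u) <= Phi(gamma * u / 2)."""
    return gamma * Phi(u) <= Phi(0.5 * gamma * u)

# The ratio shrinks towards zero, so no fixed alpha > 0 can work for all large u.
ratios = [delta2_ratio(u) for u in (5.0, 10.0, 20.0, 40.0)]
```

With $\gamma = 4$ the $\nabla_2$ inequality reads $4\Phi(u) \leq \Phi(2u)$, which follows term by term from the power series of $\Phi$, since $2^k \geq 4$ for every $k \geq 2$.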
The (topological) dual space of $L^\Phi$ is denoted by $(L^\Phi)^*$ and represented in the following way [32,34,35]:
$$(L^\Phi)^* = L^{\Phi^*} \oplus (L^\Phi)^*_s,$$
where $L^{\Phi^*}$ is the set of the order continuous functionals and $(L^\Phi)^*_s$ is formed by the singular components. If the Musielak–Orlicz function $\Phi \in \Delta_2$, then all functionals in $(L^\Phi)^*$ are order continuous and represented by
$$f_v(u) := \int_T u v\,d\mu, \quad \text{for all } u \in L^\Phi.$$
Otherwise, if $\Phi \notin \Delta_2$, then the functionals $f$ in $(L^\Phi)^*$ can be uniquely expressed as
$$f = f_c + f_s,$$
where $f_c$ is the order continuous component and $f_s$ is the singular component.
While exponential families are based on the exponential function, $\varphi$-families are based on deformed exponential functions. A deformed exponential $\varphi\colon T \times \mathbb{R} \to (0,\infty)$ is a function that satisfies the following properties, for $\mu$-a.e. $t \in T$ [28]:
(i) $\varphi(\cdot)$ is convex and injective;
(ii) $\lim_{u \to -\infty} \varphi(u) = 0$ and $\lim_{u \to \infty} \varphi(u) = \infty$;
(iii) There exists a measurable function $u_0\colon T \to (0,\infty)$ such that
$$\int_T \varphi(c + \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda > 0,$$
for every measurable function $c\colon T \to \mathbb{R}$ for which $\int_T \varphi(c)\,d\mu = 1$.
In de Souza et al. [36], Lemma 1, it was shown that the constraint $\int_T \varphi(c)\,d\mu = 1$ can be replaced by $\int_T \varphi(c)\,d\mu < \infty$. Thus, condition (iii) can be rewritten as:
(iii') There exists a measurable function $u_0\colon T \to (0,\infty)$ such that
$$\int_T \varphi(c + \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda > 0,$$
for every measurable function $c\colon T \to \mathbb{R}$ for which $\int_T \varphi(c)\,d\mu < \infty$.
There are many examples of deformed exponential functions. An example of relevance is the exponential function $\varphi(x) = \exp(x)$, which satisfies (i)–(iii) with $u_0 = 1_T$. Another example is Kaniadakis' $\kappa$-exponential [26,27,28]:
Example 2.
The Kaniadakis' $\kappa$-exponential $\exp_\kappa\colon \mathbb{R} \to (0,\infty)$ for $\kappa \in [-1,1]$ is defined as
$$\exp_\kappa(u) = \begin{cases} \left(\kappa u + \sqrt{1 + \kappa^2 u^2}\right)^{1/\kappa}, & \text{if } \kappa \neq 0, \\ \exp(u), & \text{if } \kappa = 0. \end{cases}$$
The inverse of $\exp_\kappa$ is the Kaniadakis' $\kappa$-logarithm
$$\ln_\kappa(u) = \begin{cases} \dfrac{u^\kappa - u^{-\kappa}}{2\kappa}, & \text{if } \kappa \neq 0, \\ \ln(u), & \text{if } \kappa = 0. \end{cases}$$
One can easily notice that the $\kappa$-exponential satisfies (i)–(iii) [28,36].
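A direct transcription of the two formulas of this example is sketched below, mainly to make the inverse relationship concrete. The function names `exp_kappa` and `ln_kappa` are chosen here for illustration and do not come from the cited works.

```python
import math

def exp_kappa(u, kappa):
    """Kaniadakis kappa-exponential; reduces to exp(u) when kappa == 0."""
    if kappa == 0:
        return math.exp(u)
    return (kappa * u + math.sqrt(1.0 + (kappa * u) ** 2)) ** (1.0 / kappa)

def ln_kappa(x, kappa):
    """Kaniadakis kappa-logarithm for x > 0; reduces to log(x) when kappa == 0."""
    if kappa == 0:
        return math.log(x)
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)
```

Two quick sanity checks follow from the formulas: `ln_kappa(exp_kappa(u, k), k)` returns `u` up to rounding, and `exp_kappa(u, k) * exp_kappa(-u, k)` equals 1, because $(\kappa u + \sqrt{1+\kappa^2u^2})(-\kappa u + \sqrt{1+\kappa^2u^2}) = 1$.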
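The Musielak–Orlicz function
$$\Phi_c(t,u) = \varphi(t, c(t) + u) - \varphi(t, c(t)) \tag{5}$$
for a measurable function $c\colon T \to \mathbb{R}$ such that $\varphi(t, c(t))$ is $\mu$-integrable, was defined in [28]. Thus, the sets $L^{\Phi_c}$, $\tilde{L}^{\Phi_c}$ and $E^{\Phi_c}$ are denoted by $L_c^\varphi$, $\tilde{L}_c^\varphi$ and $E_c^\varphi$, respectively, when $\Phi_c$ is given by (5). Let
$$\mathcal{P}_\mu = \left\{p \in L^0 : p > 0 \text{ and } \int_T p\,d\mu = 1\right\}$$
be the collection of which each $\varphi$-family is a subset, where $L^0$ is the linear space of all real-valued, measurable functions on $T$. For each probability density $p \in \mathcal{P}_\mu$, we have an associated $\varphi$-family of probability densities, $\mathcal{F}_c^\varphi = \boldsymbol{\varphi}_c(\mathcal{B}_c^\varphi) \subseteq \mathcal{P}_\mu$, according to
$$\boldsymbol{\varphi}_c(u) = \varphi(c + u - \psi(u) u_0), \quad \text{for each } u \in \mathcal{B}_c^\varphi, \tag{6}$$
where the set $\mathcal{B}_c^\varphi$ is the intersection of the convex set
$$\mathcal{K}_c^\varphi = \left\{u \in L_c^\varphi : \int_T \varphi(c + \lambda u)\,d\mu < \infty \text{ for some } \lambda > 1\right\}$$
with the closed subspace
$$B_c^\varphi = \left\{u \in L_c^\varphi : \int_T u\,\varphi'_+(c)\,d\mu = 0\right\},$$
that is, $\mathcal{B}_c^\varphi = \mathcal{K}_c^\varphi \cap B_c^\varphi$. The normalizing function $\psi\colon \mathcal{B}_c^\varphi \to [0,\infty)$ is introduced so that expression (6) is a probability distribution in $\mathcal{P}_\mu$. If the Musielak–Orlicz function $\Phi_c$ does not satisfy the $\Delta_2$-condition, then the boundary of $\mathcal{B}_c^\varphi$, the set $\partial\mathcal{B}_c^\varphi$, is not empty. A function $u \in B_c^\varphi$ belongs to $\partial\mathcal{B}_c^\varphi$ if and only if $\int_T \varphi(c + \lambda u)\,d\mu < \infty$ for all $\lambda \in (0,1)$, and $\int_T \varphi(c + \lambda u)\,d\mu = \infty$ for each $\lambda > 1$. The behavior of the normalizing function near the boundary was studied in [33,37].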
It is shown in [28] that the normalizing function $\psi\colon \mathcal{K}_c^\varphi \to \mathbb{R}$ is a convex function. Assuming that $\varphi$ is continuously differentiable, the normalizing function is Gâteaux-differentiable and the expression for its Gâteaux-derivative is
$$\partial\psi(u)v = \frac{\int_T v\,\varphi'(c + u - \psi(u)u_0)\,d\mu}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}, \tag{8}$$
with $u \in \mathcal{K}_c^\varphi$ and $v \in L_c^\varphi$.
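To make the normalizing function and its Gâteaux-derivative more tangible, the sketch below computes both on a toy model. Everything about the setup is an assumption made for illustration: the measure space is a finite set with weights `mu` (atomic, unlike the paper's setting), $\varphi$ is the Kaniadakis exponential with the arbitrary value $\kappa = 0.5$, $u_0 \equiv 1$, and `psi` is found by bisection as the root of $\int_T \varphi(c + u - \psi u_0)\,d\mu = 1$. The sketch does not restrict $u$ to the subspace $B_c^\varphi$.

```python
import math

KAPPA = 0.5  # illustrative deformation parameter (assumption)

def phi(x):
    """Kaniadakis kappa-exponential used as the deformed exponential."""
    return (KAPPA * x + math.sqrt(1.0 + (KAPPA * x) ** 2)) ** (1.0 / KAPPA)

def dphi(x):
    """Derivative of phi: phi(x) / sqrt(1 + kappa^2 x^2)."""
    return phi(x) / math.sqrt(1.0 + (KAPPA * x) ** 2)

def phi_inv(y):
    """Kaniadakis kappa-logarithm, the inverse of phi."""
    return (y ** KAPPA - y ** (-KAPPA)) / (2.0 * KAPPA)

def total_mass(c, u, shift, mu):
    """Discrete analogue of the integral of phi(c + u - shift * u0), with u0 = 1."""
    return sum(m * phi(ci + ui - shift) for ci, ui, m in zip(c, u, mu))

def psi(c, u, mu, tol=1e-13):
    """Normalizing constant: the shift for which the total mass equals 1."""
    lo, hi = -1.0, 1.0
    while total_mass(c, u, lo, mu) < 1.0:   # mass decreases as the shift grows
        lo *= 2.0
    while total_mass(c, u, hi, mu) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_mass(c, u, mid, mu) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dpsi(c, u, v, mu):
    """Discrete analogue of the Gateaux-derivative formula, in the direction v."""
    s = psi(c, u, mu)
    w = [m * dphi(ci + ui - s) for ci, ui, m in zip(c, u, mu)]
    return sum(wi * vi for wi, vi in zip(w, v)) / sum(w)
```

A simple way to gain confidence in the derivative formula is to compare `dpsi(c, u, v, mu)` with a central finite difference of `psi` along `v`; in this toy setting the two agree up to discretization error.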
In the next section, we recall some differentiability properties of convex functions on infinite dimensional spaces.

2.2. The Subdifferential of a Convex Function

In this section, we discuss some properties of extended real-valued convex functions in Banach spaces, i.e., functions with values in $\mathbb{R} \cup \{\pm\infty\}$. Mainly, we recall subdifferentials of lower semicontinuous convex functions and their properties.
Let $E$ be a Banach space and let $f$ be a convex function on $E$, with the epigraph [38]
$$\operatorname{epi} f = \{(x,\alpha) : x \in E,\ \alpha \in \mathbb{R},\ \alpha \geq f(x)\}.$$
If $f(x) > -\infty$ for every $x$ and $f(x) < +\infty$ for at least one value of $x$, we call $f$ a proper function. The set
$$\operatorname{dom} f = \{x \in E : f(x) < \infty\}$$
denotes the effective domain of $f$. A function $f\colon E \to (-\infty,\infty]$ is said to be lower semicontinuous (l.s.c.) if for every $\lambda \in \mathbb{R}$ the set
$$[f \leq \lambda] = \{x \in E : f(x) \leq \lambda\}$$
is closed.
Let $E^*$ be the dual space of $E$. A vector $x^* \in E^*$ is said to be a subgradient of $f$ at $x \in E$ if
$$(x^*, z) \leq f(x + z) - f(x) \quad \text{for all } z \in E.$$
We denote by $\partial f(x)$ the set of subgradients of $f$ at $x$, and the subdifferential of $f$ is the multivalued mapping $x \mapsto \partial f(x)$ from $E$ to $E^*$. By definition, $\partial f(x)$ is always a closed convex subset of $E^*$ for each $x$. Suppose $f$ is a convex function finite at $x$. One has $x^* \in \partial f(x)$ if and only if
$$(z, x^*) \leq f'(x; z), \quad \forall z \in E,$$
where
$$f'(x; z) = \lim_{t \to 0^+} \frac{f(x + tz) - f(x)}{t}$$
is the directional derivative of $f$ at $x$ in the direction $z \in E$. The subdifferential may be empty at points of $\operatorname{dom} f$, so we denote by
$$D(\partial f) = \{x \in E : \partial f(x) \neq \emptyset\}$$
the domain of $\partial f$, and we have that $D(\partial f) \subset \operatorname{dom} f$. We say that $f$ is subdifferentiable at $x$ for all $x \in D(\partial f)$.
Let $f$ be a lower semicontinuous proper convex function; then $\operatorname{int}\operatorname{dom} f \subset D(\partial f)$ ([39], Corollary 2.38). The conjugate of $f$ is the function $f^*\colon E^* \to \overline{\mathbb{R}}$ defined by
$$f^*(x^*) = \sup\{(x, x^*) - f(x) : x \in E\}, \quad x^* \in E^*. \tag{9}$$
Observe that, if $f$ is proper, then the "sup" in Equation (9) may be restricted to the points $x \in \operatorname{dom} f$. The conjugate $f^*$ is a convex and lower semicontinuous function on $E^*$ and, jointly with $f$, satisfies the well-known Young's inequality
$$(x, x^*) \leq f(x) + f^*(x^*), \tag{10}$$
with equality holding if and only if $x^* \in \partial f(x)$. If $f$ is a lower semicontinuous function, the subdifferential $\partial f^*$ of the conjugate function $f^*$ coincides with $(\partial f)^{-1}$ ([39], Proposition 2.33).
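The conjugate and Young's inequality can be checked on a one-dimensional example. The sketch below takes $E = \mathbb{R}$ and $f(u) = e^u - u - 1$, a choice made here purely for illustration; for $v \geq 0$ an elementary computation gives $f^*(v) = (1+v)\ln(1+v) - v$, with the supremum attained at $u = \ln(1+v)$. The numerical supremum is taken over a grid on $[0, 5]$, which suffices for the values of $v$ used because the maximizer lies in that interval.

```python
import math

def Phi(u):
    """f(u) = exp(u) - u - 1."""
    return math.expm1(u) - u

def Phi_star(v):
    """Closed-form conjugate for v >= 0: (1 + v) * log(1 + v) - v."""
    return (1.0 + v) * math.log1p(v) - v

def conj_numeric(v, u_max=5.0, n=20001):
    """Brute-force sup over a grid of u in [0, u_max] of u * v - Phi(u)."""
    step = u_max / (n - 1)
    return max(i * step * v - Phi(i * step) for i in range(n))
```

Young's inequality $uv \leq f(u) + f^*(v)$ then holds for every pair tested, with equality exactly when $v = f'(u) = e^u - 1$, which is the single element of $\partial f(u)$ for this differentiable $f$.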
It is known that, if $f$ is a lower semicontinuous proper convex function, then
$$\operatorname{int}\operatorname{dom} f \subset D(\partial f) \subset \operatorname{dom} f,$$
and it was shown in [40] that $D(\partial f)$ is, in fact, dense in $\operatorname{dom} f$.
Fact 1
(([41], Corollary 2.19), ([42], Corollary 7.2.3)). Suppose $x \in D(\partial f)$. Then $x \in \operatorname{int}\operatorname{dom} f$ if and only if $\partial f$ is locally bounded at $x$.
Fact 2
(([41], Lemma 2.20), ([42], Lemma 7.2.4)). If $\operatorname{int}\operatorname{dom} f \neq \emptyset$ and $x \in D(\partial f) \setminus \operatorname{int}\operatorname{dom} f$, then $\partial f(x)$ is unbounded.
The subdifferential of a convex function is closely related to the Gâteaux-gradient. If the convex function $f$ is Gâteaux-differentiable at $x_0 \in E$, then $\partial f(x_0)$ consists of a single element $x^* = \operatorname{grad} f(x_0)$ ([39], Proposition 2.40), where $\operatorname{grad} f(x)$ is the Gâteaux-gradient of $f$ at $x$.
In the next section, we investigate the subdifferential of the normalizing function ψ . This result will be useful for us to prove that the generalized mixture arc is well defined, which is one of our main goals in this work.

3. Construction of Generalized Mixture Arcs

The normalizing function $\psi\colon \mathcal{K}_c^\varphi \to \mathbb{R}$ is convex and Gâteaux-differentiable, and its derivative is given by Equation (8). With these facts in mind, we can provide the expression for the generalized mixture arc:
$$p(t) = F^{-1}\big((1 - t)F(p) + tF(q)\big), \tag{11}$$
where
$$F(p) = \frac{\varphi'(\varphi^{-1}(p))}{\int_T u_0\,\varphi'(\varphi^{-1}(p))\,d\mu}, \tag{12}$$
and $p, q$ belong to a $\varphi$-family $\mathcal{F}_c^\varphi$. We can rewrite the functional $F(p)$ as
$$\frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}, \quad u \in \mathcal{B}_c^\varphi, \tag{13}$$
with $p = \varphi(c + u - \psi(u)u_0)$, and Equation (13) is the Gâteaux-gradient of $\psi$. Thus, for the generalized mixture arc to be well defined, it is necessary that the set of the functionals in Equation (13) be convex. As mentioned in Section 2.2, the subdifferential and the Gâteaux-gradient are closely related. For this reason, we investigate the subdifferential of $\psi$.
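The map $F$ and the convex combination that appears inside the generalized mixture arc can be illustrated on a toy model. The sketch below assumes a finite weighted set in place of $(T,\Sigma,\mu)$, the Kaniadakis exponential as $\varphi$, and $u_0 \equiv 1$; none of these choices is prescribed by the text, and the helper names are hypothetical. It computes $F(p)$ and the combination $(1-t)F(p) + tF(q)$ only; inverting $F$ to recover $p(t)$ is not attempted here.

```python
import math

def ln_kappa(x, kappa):
    """Inverse of the deformed exponential (log when kappa == 0)."""
    if kappa == 0:
        return math.log(x)
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)

def dexp_kappa(x, kappa):
    """Derivative of the Kaniadakis exponential at x."""
    if kappa == 0:
        return math.exp(x)
    root = math.sqrt(1.0 + (kappa * x) ** 2)
    return (kappa * x + root) ** (1.0 / kappa) / root

def F(p, mu, kappa):
    """Discrete analogue of F(p) = phi'(phi^{-1}(p)) / integral of u0 * phi'(phi^{-1}(p)), u0 = 1."""
    g = [dexp_kappa(ln_kappa(pi, kappa), kappa) for pi in p]
    z = sum(m * gi for m, gi in zip(mu, g))
    return [gi / z for gi in g]

def mixture_in_F_coordinates(p, q, mu, kappa, t):
    """The convex combination (1 - t) F(p) + t F(q) used inside the arc."""
    fp, fq = F(p, mu, kappa), F(q, mu, kappa)
    return [(1.0 - t) * a + t * b for a, b in zip(fp, fq)]
```

For $\kappa = 0$ one has $\varphi' = \varphi = \exp$, so $F(p) = p$ and the construction reduces to the usual mixture $(1-t)p + tq$. For any $\kappa$, every combination keeps the normalization $\int_T u_0\,(\cdot)\,d\mu = 1$.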

3.1. Subdifferential of the Normalizing Function ψ

Since the Musielak–Orlicz function (5) does not satisfy the $\Delta_2$-condition, we have that $\partial\mathcal{B}_c^\varphi$ is non-empty [33]. The effective domain of the normalizing function $\psi$, the set $\operatorname{dom}\psi = \{u \in B_c^\varphi : \psi(u) < \infty\}$, is
$$\operatorname{dom}\psi = \mathcal{B}_c^\varphi \cup \{\partial\mathcal{B}_c^\varphi\}_{<\infty},$$
where $\{\partial\mathcal{B}_c^\varphi\}_{<\infty}$ is the set of points in the boundary of $\mathcal{B}_c^\varphi$ such that $\psi(u) < \infty$. The behavior of the normalizing function $\psi$ near the boundary of $\mathcal{B}_c^\varphi$ was discussed in [33]. We need to know the subdifferentials of $\psi$. To this end, we first prove some properties of $\psi$, beginning with our first result.
Proposition 1.
The normalizing function $\psi\colon B_c^\varphi \to \mathbb{R} \cup \{\infty\}$ is lower semicontinuous.
Proof. 
Given $\alpha \in \mathbb{R}$, let $C_\alpha$ be the set $C_\alpha = \{u \in B_c^\varphi : \psi(u) \leq \alpha\}$. To prove the statement, it suffices to show that $C_\alpha$ is closed. We define the set
$$B = \left\{u \in B_c^\varphi : \int_T \varphi(c + u - \alpha u_0)\,d\mu \leq 1\right\},$$
and we prove that $B$ is a closed set and that $B = C_\alpha$. Let $\{u_n\}$ be a sequence in $B$ such that $\|u_n - u\|_{\Phi_c} \to 0$. Then, passing to a subsequence if necessary, $u_n \to u$ $\mu$-a.e. Since $\varphi$ is a continuous function, we have that $\varphi(c + u_n - \alpha u_0) \to \varphi(c + u - \alpha u_0)$ $\mu$-a.e. From Fatou's Lemma, it follows that
$$\int_T \varphi(c + u - \alpha u_0)\,d\mu = \int_T \liminf_{n \to \infty} \varphi(c + u_n - \alpha u_0)\,d\mu \leq \liminf_{n \to \infty} \int_T \varphi(c + u_n - \alpha u_0)\,d\mu \leq 1;$$
thus, $u \in B$ and $B$ is a closed set. Now, we prove that $B = C_\alpha$. Let $u$ be a function in $C_\alpha$; then $\psi(u) \leq \alpha$. The function $\varphi$ is strictly increasing, so that
$$\int_T \varphi(c + u - \alpha u_0)\,d\mu \leq \int_T \varphi(c + u - \psi(u)u_0)\,d\mu = 1;$$
thus, $u \in B$.
Suppose that there exists $w \in B \setminus C_\alpha$. Then $w \in B$, which implies that $\int_T \varphi(c + w - \alpha u_0)\,d\mu \leq 1$, and $w \notin C_\alpha$, which implies that $\psi(w) > \alpha$. Then
$$\int_T \varphi(c + w - \alpha u_0)\,d\mu > \int_T \varphi(c + w - \psi(w)u_0)\,d\mu = 1,$$
thus $\int_T \varphi(c + w - \alpha u_0)\,d\mu > 1$. This contradicts the assumption that $w \in B$. Therefore, $B = C_\alpha$ and $C_\alpha$ is closed. ☐
The subdifferential of $\psi$ at a function $u \in \operatorname{dom}\psi$ is the set
$$\partial\psi(u) = \left\{u^* \in (L^{\Phi_c})^* : \int_T u^* v\,d\mu \leq \psi(u + v) - \psi(u), \text{ for all } v \in B_c^\varphi\right\},$$
where $(L^{\Phi_c})^*$ denotes the dual space of $L^{\Phi_c}$. We know that, for all $u \in \mathcal{B}_c^\varphi$, the normalizing function $\psi$ is Gâteaux-differentiable and the Gâteaux-gradient is given by Equation (13). Hence, $\partial\psi(u)$ consists of a single element and is given by
$$\partial\psi(u) = \frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}, \quad u \in \mathcal{B}_c^\varphi.$$
In fact, we prove below that Equation (13) belongs to $\partial\psi(u)$, for all $u \in \mathcal{B}_c^\varphi$.
Proposition 2.
Let $u$ be a function in $\operatorname{dom}\psi$. Supposing that the functional
$$\frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} \tag{16}$$
belongs to $L^{\Phi_c^*}$, then (16) belongs to $\partial\psi(u)$.
Proof. 
We have that the functional (16) belongs to $L^{\Phi_c^*}$. Let $v$ be a function in $B_c^\varphi$ such that $\int_T \varphi(c + u + v)\,d\mu < \infty$. In other words, $u + v \in \operatorname{dom}\psi$, so we have that $\int_T \varphi(c + u - \psi(u)u_0)\,d\mu = 1$ and $\int_T \varphi(c + u + v - \psi(u + v)u_0)\,d\mu = 1$. Thus, by the convexity of $\varphi$, we have
$$\int_T \big[-v + (\psi(u + v) - \psi(u))u_0\big]\,\varphi'(c + u - \psi(u)u_0)\,d\mu \geq \int_T \varphi(c + u - \psi(u)u_0)\,d\mu - \int_T \varphi(c + u + v - \psi(u + v)u_0)\,d\mu = 0.$$
Thus,
$$\int_T v\,\varphi'(c + u - \psi(u)u_0)\,d\mu \leq \int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,(\psi(u + v) - \psi(u))\,d\mu,$$
and
$$\frac{\int_T v\,\varphi'(c + u - \psi(u)u_0)\,d\mu}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} \leq \psi(u + v) - \psi(u). \tag{17}$$
If $u + v \in B_c^\varphi \setminus \operatorname{dom}\psi$, then $\psi(u + v) = \infty$, and
$$\frac{\int_T v\,\varphi'(c + u - \psi(u)u_0)\,d\mu}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} < \psi(u + v) - \psi(u).$$
Consequently, Inequality (17) holds for all $v \in B_c^\varphi$ and the result follows. ☐
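We need to find the subdifferential of $\psi$ for $u$ in the set $\{\partial\mathcal{B}_c^\varphi\}_{<\infty}$. We know that $\psi$ is a proper lower semicontinuous convex function, so
$$\operatorname{int}\operatorname{dom}\psi \subset D(\partial\psi) \subset \operatorname{dom}\psi,$$
where $\operatorname{int}\operatorname{dom}\psi = \mathcal{B}_c^\varphi$ and $D(\partial\psi) \setminus \operatorname{int}\operatorname{dom}\psi = \{\partial\mathcal{B}_c^\varphi\}_{<\infty}$. Since $\operatorname{int}\operatorname{dom}\psi \neq \emptyset$, for $u \in D(\partial\psi) \setminus \operatorname{int}\operatorname{dom}\psi$ the set $\partial\psi(u)$ is unbounded.
Since we are interested in proving that the set of functionals in Equation (13) is convex and these functionals are order continuous, we need to analyze only the order continuous part of the subdifferential, i.e., the part of the subdifferential that belongs to $L^{\Phi_c^*}$. We need to investigate whether the functional in Equation (16) belongs to $L^{\Phi_c^*}$ for $u \in \{\partial\mathcal{B}_c^\varphi\}_{<\infty}$. For this, we will use some auxiliary results.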
Lemma 1
([35], Lemma 3.11). Let $\Phi_c$ be a Musielak–Orlicz function that does not satisfy the $\Delta_2$-condition. In addition, assume that $\Phi_c(t, b_{\Phi_c}(t)) = \infty$ for $\mu$-a.e. $t \in T$. Then there exist a strictly increasing sequence $0 < \lambda_n \uparrow 1$, and sequences $\{u_n\}$ and $\{A_n\}$ of finite-valued, non-negative, measurable functions, and pairwise disjoint, measurable sets, respectively, such that
$$I_{\Phi_c}(u_n \chi_{A_n}) = 1, \quad \text{and} \quad I_{\Phi_c}(\lambda_n u_n \chi_{A_n}) \leq 2^{-n}, \quad \text{for all } n \geq 1.$$
Proposition 3
([43], Proposition 2.3). Let $\Phi$ and $\Psi$ be Musielak–Orlicz functions. Suppose that, for constants $\alpha, \lambda > 0$, there exists an integrable function $h\colon T \to [0,\infty)$ such that
$$\alpha\,\Psi(t,u) \leq \Phi(t, \lambda u) + h(t), \quad \text{for all } u \geq 0.$$
Then, for constants $\alpha' \in (0,\alpha)$ and $\lambda' = \lambda$, or $\alpha' = \alpha$ and $\lambda' > \lambda$, a non-negative function $f \in \tilde{L}^\Psi$ can be found such that
$$\alpha'\,\Psi(t,u) \leq \Phi(t, \lambda' u), \quad \text{for all } u > f(t).$$
Lemma 2.
Let $\Phi^*$ and $\Psi^*$ denote the complementary functions to the Musielak–Orlicz functions $\Phi$ and $\Psi$, respectively. Suppose that, for constants $\alpha, \lambda > 0$, there exists a non-negative function $f \in \tilde{L}^\Psi$ such that
$$\alpha\,\Psi(t,u) \leq \Phi(t, \lambda u), \quad \text{for all } u > f(t).$$
Then, for constants $\alpha' = 1/\alpha$ and $\lambda' > \lambda/\alpha$, or $\alpha' \in (0, 1/\alpha)$ and $\lambda' = \lambda/\alpha$, a non-negative function $g \in \tilde{L}^{\Phi^*}$ can be found such that
$$\alpha'\,\Phi^*(t,v) \leq \Psi^*(t, \lambda' v), \quad \text{for all } v > g(t). \tag{19}$$
Proof. 
Defining the function $h(t) = \Psi(t, f(t))$, we can write
$$\alpha\,\Psi(t,u) \leq \Phi(t, \lambda u) + \alpha h(t), \quad \text{for all } u \geq 0.$$
Calculating the Fenchel conjugate of the functions in the inequality above, we obtain
$$\frac{1}{\alpha}\,\Phi^*(t,v) \leq \Psi^*\left(t, \frac{\lambda}{\alpha}v\right) + h(t), \quad \text{for all } v \geq 0.$$
From Proposition 3, we infer that Equation (19) is satisfied. ☐
Lemma 3.
The $\Delta_2$-condition is equivalent to the statement that, for every $\lambda \in (0,1)$, there exist a constant $\alpha_\lambda \in (0,1)$ and a non-negative function $f_\lambda \in \tilde{L}^\Phi$ such that
$$\alpha_\lambda\,\Phi(t,u) \leq \Phi(t, \lambda u), \quad \text{for all } u > f_\lambda(t). \tag{20}$$
The $\nabla_2$-condition is equivalent to the statement that, for any $\lambda \in (0,1)$, there exist a constant $\gamma_\lambda > 1$ and a non-negative function $f_\lambda \in \tilde{L}^\Phi$ such that
$$\gamma_\lambda\,\Phi(t,u) \leq \Phi(t, \lambda\gamma_\lambda u), \quad \text{for all } u > f_\lambda(t). \tag{21}$$
Proof. 
Suppose $\Phi$ satisfies the $\Delta_2$-condition. If the natural number $n \geq 1$ is such that $2^{-n} \leq \lambda$, then $\alpha^n \Phi(t,u) \leq \Phi(t, 2^{-n}u) \leq \Phi(t, \lambda u)$, for all $u > 2^{n-1} f(t)$. Conversely, if $\Phi$ satisfies Equation (20) and the natural number $n \geq 1$ is chosen so that $\lambda^n \leq 1/2$, then $\alpha_\lambda^n \Phi(t,u) \leq \Phi(t, \lambda^n u) \leq \Phi(t, \tfrac{1}{2}u)$, for all $u > \lambda^{-n+1} f_\lambda(t)$.
Assume that Equation (3) is satisfied. Let $n \geq 1$ be a natural number such that $2^{-n} \leq \lambda$. Then $\gamma^n \Phi(t,u) \leq \Phi(t, 2^{-n}\gamma^n u) \leq \Phi(t, \lambda\gamma^n u)$, for all $u > f(t)$. Conversely, if Equation (21) holds and the natural number $n \geq 1$ is chosen so that $\lambda^n \leq 1/2$, then $\gamma_\lambda^n \Phi(t,u) \leq \Phi(t, \lambda^n\gamma_\lambda^n u) \leq \Phi(t, \tfrac{1}{2}\gamma_\lambda^n u)$, for all $u > f(t)$. ☐
The next result follows from Lemmas 2 and 3.
Theorem 1.
A Musielak–Orlicz function $\Phi_c$ satisfies the $\nabla_2$-condition if, and only if, its complementary function $\Phi_c^*$ satisfies the $\Delta_2$-condition.
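Proposition 4.
Let $\Phi_c$ be a Musielak–Orlicz function that does not satisfy the $\Delta_2$-condition and such that $\Phi_c(t, b_{\Phi_c}(t)) = \infty$ for $\mu$-a.e. $t \in T$. Then we can find a non-negative function $u \in \tilde{L}^{\Phi_c}$ such that $I_{\Phi_c^*}\big((\Phi_c)'_+(t, u(t))\big) = \infty$.
Proof. 
Let $\{\lambda_n\}$, $\{u_n\}$ and $\{A_n\}$ be given as in Lemma 1. Select a subsequence $\{\lambda_{n_k}\} \subset \{\lambda_n\}$ for which the series $\sum_{k=1}^\infty (1 - \lambda_{n_k})$ converges, and $(1 - \lambda_{n_k}) + 2^{-n_k} < 1$ for all $k \geq 1$. Because $\lambda \mapsto I_{\Phi_c}(\lambda u_{n_k}\chi_{A_{n_k}})$ is continuous for $\lambda \in [0,1]$, we can find $\lambda'_k \in (\lambda_{n_k}, 1)$ such that $I_{\Phi_c}(\lambda'_k u_{n_k}\chi_{A_{n_k}}) = (1 - \lambda_{n_k}) + 2^{-n_k}$. Define $u = \sum_{k=1}^\infty \lambda'_k u_{n_k}\chi_{A_{n_k}}$. Then, we can write
$$I_{\Phi_c}(u) = \sum_{k=1}^\infty I_{\Phi_c}(\lambda'_k u_{n_k}\chi_{A_{n_k}}) = \sum_{k=1}^\infty \big[(1 - \lambda_{n_k}) + 2^{-n_k}\big] < \infty,$$
and
$$\begin{aligned} \int_T u(t)\,(\Phi_c)'_+(t, u(t))\,d\mu &= \sum_{k=1}^\infty \int_{A_{n_k}} \lambda'_k u_{n_k}(t)\,(\Phi_c)'_+(t, \lambda'_k u_{n_k}(t))\,d\mu \\ &\geq \sum_{k=1}^\infty \frac{\lambda'_k}{\lambda'_k - \lambda_{n_k}} \big[I_{\Phi_c}(\lambda'_k u_{n_k}\chi_{A_{n_k}}) - I_{\Phi_c}(\lambda_{n_k} u_{n_k}\chi_{A_{n_k}})\big] \\ &\geq \sum_{k=1}^\infty \frac{\lambda'_k}{1 - \lambda_{n_k}} \big[(1 - \lambda_{n_k}) + 2^{-n_k} - 2^{-n_k}\big] = \sum_{k=1}^\infty \lambda'_k = \infty. \end{aligned}$$
Hence, it follows that
$$I_{\Phi_c^*}\big((\Phi_c)'_+(t, u(t))\big) = \int_T u(t)\,(\Phi_c)'_+(t, u(t))\,d\mu - I_{\Phi_c}(u) = \infty,$$
which concludes the proof. ☐
The previous proposition makes it clear that we can find $u \in \tilde{L}^{\Phi_c}$ with $(\Phi_c)'_+(t, u(t)) \notin \tilde{L}^{\Phi_c^*}$. Let $u$ be as in Proposition 4; clearly, for $\lambda \in (0,1)$, $I_{\Phi_c}(\lambda u) < \infty$ and, for $\lambda > 1$, $I_{\Phi_c}(\lambda u) = \infty$ ([35], Remark 3.12).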
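Proposition 5.
Let $\Phi_c$ be a Musielak–Orlicz function such that $\Phi_c$ satisfies the $\nabla_2$-condition, does not satisfy the $\Delta_2$-condition, and $\Phi_c(t, b_{\Phi_c}(t)) = \infty$. Then we can find $w \in \partial\mathcal{B}_c^\varphi$ such that
$$\frac{\varphi'(c + w - \psi(w)u_0)}{\int_T u_0\,\varphi'(c + w - \psi(w)u_0)\,d\mu} \notin L^{\Phi_c^*}, \tag{22}$$
where $\tilde{L}^{\Phi_c^*}$ is the Musielak–Orlicz class of $\Phi_c^*$, the conjugate of $\Phi_c$.
Proof. 
Take $k_0 \geq 1$ and denote $B = T \setminus \bigcup_{k=k_0}^\infty A_{n_k}$; then we define $\tilde{u} = \sum_{k=k_0}^\infty \lambda'_k u_{n_k}\chi_{A_{n_k}}$. We can choose $\lambda < 0$ such that
$$w = \lambda u_0 \chi_B + \tilde{u}$$
satisfies $\int_T w\,\varphi'_+(c)\,d\mu = 0$. In other words, $w \in B_c^\varphi$. It is easy to see that $\int_T \varphi(c + \alpha w)\,d\mu < \infty$ for $\alpha \in (0,1)$ and $\int_T \varphi(c + \alpha w)\,d\mu = \infty$ for $\alpha > 1$, so $w \in \partial\mathcal{B}_c^\varphi$. It remains to show Equation (22). From Proposition 4 we have that
$$\int_T w\,(\Phi_c)'_+(t, w(t))\,d\mu = \int_B \lambda u_0(t)\,(\Phi_c)'_+(t, \lambda u_0)\,d\mu + \sum_{k=k_0}^\infty \int_{A_{n_k}} \lambda'_k u_{n_k}(t)\,(\Phi_c)'_+(t, \lambda'_k u_{n_k}(t))\,d\mu = \infty,$$
since, by convexity and $\lambda < 0$,
$$\int_B \lambda u_0(t)\,(\Phi_c)'_+(t, \lambda u_0)\,d\mu \geq \lambda \int_B \big[\Phi_c((\lambda + 1)u_0) - \Phi_c(\lambda u_0)\big]\,d\mu \geq \lambda \int_B \big[\varphi(c + (\lambda + 1)u_0) - \varphi(c + \lambda u_0)\big]\,d\mu > -\infty.$$
Thus,
$$I_{\Phi_c^*}\big((\Phi_c)'_+(t, w(t))\big) = \int_T w(t)\,(\Phi_c)'_+(t, w(t))\,d\mu - I_{\Phi_c}(w) = \infty;$$
consequently, $\varphi'(c + w) \notin \tilde{L}^{\Phi_c^*}$. Since $\Phi_c \in \nabla_2$, we have that $\Phi_c^* \in \Delta_2$ and therefore $\tilde{L}^{\Phi_c^*} = L^{\Phi_c^*}$. We conclude that $\varphi'(c + w) \notin L^{\Phi_c^*}$. Since $L^{\Phi_c^*}$ is a linear set, Equation (22) holds. ☐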
As a consequence of Proposition 5, it is possible to find $u \in \{\partial\mathcal{B}_c^\varphi\}_{<\infty}$ such that
$$\frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} \notin L^{\Phi_c^*},$$
and therefore the functional in Equation (16) does not belong to $\partial\psi(u)$.
We conclude in this section that, if the functional
$$\frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}$$
belongs to $L^{\Phi_c^*}$, then the functional belongs to $\partial\psi(u)$ for $u \in \operatorname{dom}\psi$.
In the next section, we finally prove that the set of functionals formed by the Gâteaux-gradients of the normalizing function $\psi$ that belong to $L^{\Phi_c^*}$ is convex, so we can guarantee that the generalized mixture arc is well defined.

3.2. Convexity of the Functionals Set

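We already know that, for the generalized mixture arc in Equation (11) to be well defined, it is necessary that the set of functionals
$$\left\{\frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} : u \in \operatorname{dom}\psi\right\} \cap L^{\Phi_c^*} \tag{24}$$
be convex. From Proposition 2, the set in Equation (24) is contained in the range of $\partial\psi$, the set given by
$$\operatorname{range}\partial\psi = \bigcup\{\partial\psi(u) : u \in \operatorname{dom}\psi\}. \tag{25}$$
Let $\psi^*$ be the conjugate function of $\psi$. Since $\psi^*$ is an l.s.c. proper convex function, $\operatorname{int}\operatorname{dom}\psi^*$ and $\operatorname{dom}\psi^*$ are convex sets, and the range of $\partial\psi$ is the effective domain of $\partial\psi^*$, since $(\partial\psi)^{-1} = \partial\psi^*$. Thus
$$\operatorname{int}\operatorname{dom}\psi^* \subset D(\partial\psi^*) \subset \operatorname{dom}\psi^* \tag{26}$$
is the same as
$$\operatorname{int}\operatorname{dom}\psi^* \subset \operatorname{range}\partial\psi \subset \operatorname{dom}\psi^*. \tag{27}$$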
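To prove that the set in Equation (24) is convex, we analyze the set in Equation (25) in three cases. Let $u^*, v^*$ be elements in Equation (25).
Case 1.
If $u^*, v^* \in \operatorname{int}\operatorname{dom}\psi^*$, then, by convexity of $\operatorname{int}\operatorname{dom}\psi^*$, for $\lambda \in (0,1)$ we have $\lambda u^* + (1 - \lambda)v^* \in \operatorname{int}\operatorname{dom}\psi^*$.
Case 2.
If $u^* \in \operatorname{int}\operatorname{dom}\psi^*$ and $v^* \in D(\partial\psi^*) \setminus \operatorname{int}\operatorname{dom}\psi^*$, then $\lambda u^* + (1 - \lambda)v^* \in \operatorname{int}\operatorname{dom}\psi^*$, for $\lambda \in (0,1)$ ([41], Fact 2.1).
Case 3.
Let $u^*, v^*$ be elements in Equation (25) belonging to $D(\partial\psi^*) \setminus \operatorname{int}\operatorname{dom}\psi^*$.
We want to prove that, for $\lambda \in (0,1)$, $\lambda u^* + (1 - \lambda)v^*$ belongs to Equation (25). To solve this problem, we are going to prove that $D(\partial\psi^*) = \operatorname{int}\operatorname{dom}\psi^*$. Supposing $\varphi$ is a strictly convex function, then $\psi$ is a strictly convex function. In the next proposition, we show that $\partial\psi^*(u^*)$ is a unitary set (that is, a singleton).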
Proposition 6.
Let $\psi$ be a strictly convex function; then $\partial\psi^*(u^*)$ is a unitary set, where $u^* \in \partial\psi(u)$, with $u^* \in D(\partial\psi^*)$.
Proof. 
Assuming that $\psi$ is a strictly convex function, we have that, for $\lambda \in (0,1)$ and $u_1 \neq u_2 \in \operatorname{dom}\psi$,
$$\psi(\lambda u_1 + (1 - \lambda)u_2) < \lambda\psi(u_1) + (1 - \lambda)\psi(u_2). \tag{28}$$
Suppose that $\partial\psi^*(u^*)$ is not a unitary set, i.e., $\partial\psi^*(u^*) = \{u_1, u_2, \ldots\}$, where $u_i \in D(\partial\psi)$, $i = 1, 2, \ldots$. Take $u_1, u_2 \in \partial\psi^*(u^*)$. By Young's Inequality (10),
$$(\lambda u_1 + (1 - \lambda)u_2, u^*) \leq \psi(\lambda u_1 + (1 - \lambda)u_2) + \psi^*(u^*), \tag{29}$$
where $\lambda \in (0,1)$, and as a consequence of $u_1, u_2 \in \partial\psi^*(u^*)$ we have
$$\psi(u_1) + \psi^*(u^*) = (u_1, u^*), \tag{30}$$
and
$$\psi(u_2) + \psi^*(u^*) = (u_2, u^*). \tag{31}$$
Multiplying Equation (30) by $\lambda$ and Equation (31) by $(1 - \lambda)$, and adding the two resulting equations, we have
$$\lambda\psi(u_1) + (1 - \lambda)\psi(u_2) + \psi^*(u^*) = (\lambda u_1 + (1 - \lambda)u_2, u^*). \tag{32}$$
From Equations (29) and (32), we obtain
$$\lambda\psi(u_1) + (1 - \lambda)\psi(u_2) \leq \psi(\lambda u_1 + (1 - \lambda)u_2),$$
which contradicts Equation (28). This implies that $\partial\psi^*(u^*)$ is a unitary set, which completes the proof. ☐
Thus, the set $\partial\psi^*(u^*)$ is unitary, so $\partial\psi^*$ is locally bounded at $u^* \in D(\partial\psi^*)$ and, therefore, by Fact 1, we conclude that $u^* \in \operatorname{int}\operatorname{dom}\psi^*$, which implies that $D(\partial\psi^*) \subset \operatorname{int}\operatorname{dom}\psi^*$. By Equation (26), we have that
$$\operatorname{range}\partial\psi = D(\partial\psi^*) = \operatorname{int}\operatorname{dom}\psi^*.$$
Therefore, by Fact 2, there exists no functional $u^*$ in Equation (25) such that $u^* \in D(\partial\psi^*) \setminus \operatorname{int}\operatorname{dom}\psi^*$. Thus Equation (25) is a convex set and, as a consequence, the generalized mixture arc is well defined, since the set in Equation (24) is a convex set. Indeed, let $u$, $v$ be functions in $\operatorname{dom}\psi$ such that
$$u^* = \frac{\varphi'(c + u - \psi(u)u_0)}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}$$
and
$$v^* = \frac{\varphi'(c + v - \psi(v)u_0)}{\int_T u_0\,\varphi'(c + v - \psi(v)u_0)\,d\mu}$$
belong to Equation (24). Clearly,
$$\int_T u_0 u^*\,d\mu = \frac{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu}{\int_T u_0\,\varphi'(c + u - \psi(u)u_0)\,d\mu} = 1$$
and
$$\int_T u_0 v^*\,d\mu = \frac{\int_T u_0\,\varphi'(c + v - \psi(v)u_0)\,d\mu}{\int_T u_0\,\varphi'(c + v - \psi(v)u_0)\,d\mu} = 1.$$
We note that the functionals in Equation (24) are the only elements in Equation (25) that satisfy $\int_T u_0 u^*\,d\mu = 1$. For $\lambda \in (0,1)$ we have
$$\int_T u_0\big((1 - \lambda)u^* + \lambda v^*\big)\,d\mu = 1,$$
so there exist functions $w_\lambda \in \operatorname{dom}\psi$ such that
$$\frac{\varphi'(c + w_\lambda - \psi(w_\lambda)u_0)}{\int_T u_0\,\varphi'(c + w_\lambda - \psi(w_\lambda)u_0)\,d\mu} = (1 - \lambda)u^* + \lambda v^*, \quad \text{for each } \lambda \in (0,1).$$
Thus, the set in Equation (24) is a convex set.
In this section, we proved that the generalized mixture arc is well defined for a deformed exponential φ strictly convex. In the next section, we discuss generalized open exponential arcs and generalized open mixture arcs.

4. Generalized Arcs

The concept of arc-connected probability distributions was defined by de Souza et al. [36]. Fixing any deformed exponential $\varphi$, we say that two probability distributions $p, q \in \mathcal{P}_\mu$ are $\varphi$-connected if, for each $\alpha \in [0,1]$, there exists $k(\alpha) := k(\alpha; p, q) \in \mathbb{R}$ such that
$$\int_T \varphi\big(\alpha\varphi^{-1}(p) + (1-\alpha)\varphi^{-1}(q) + k(\alpha)u_0\big)\,d\mu = 1.$$
In [31], necessary and sufficient conditions for any two probability distributions to be $\varphi$-connected were provided. In this section, we discuss what it means for two probability distributions $p, q \in \mathcal{P}_\mu$ to be $\varphi$-connected by open arcs. We generalize open exponential arcs and open mixture arcs, defined in [22] and studied later in [23].
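The normalizing constant $k(\alpha)$ in the definition of $\varphi$-connection can be computed numerically in simple cases. The following is a minimal sketch under several assumptions that are not taken from the text: a finite sample space with counting measure, $u_0 \equiv 1$, and the Kaniadakis $\kappa$-exponential with $\kappa = 0.5$ as the deformed exponential. The function names and the bisection tolerance are hypothetical.

```python
import math

KAPPA = 0.5  # assumed deformation parameter, 0 < kappa <= 1

def exp_kappa(x):
    """Kaniadakis kappa-exponential, used here as the deformed exponential."""
    return (KAPPA * x + math.sqrt(1.0 + (KAPPA * x) ** 2)) ** (1.0 / KAPPA)

def log_kappa(y):
    """Inverse of exp_kappa for y > 0."""
    return (y ** KAPPA - y ** (-KAPPA)) / (2.0 * KAPPA)

def total_mass(p, q, alpha, k):
    """Sum of phi(alpha*phi^{-1}(p) + (1-alpha)*phi^{-1}(q) + k*u0) with u0 = 1."""
    return sum(
        exp_kappa(alpha * log_kappa(pi) + (1.0 - alpha) * log_kappa(qi) + k)
        for pi, qi in zip(p, q)
    )

def normalizing_constant(p, q, alpha, tol=1e-12):
    """Solve total_mass(p, q, alpha, k) = 1 for k by bisection."""
    lo, hi = 0.0, 1.0
    while total_mass(p, q, alpha, hi) < 1.0:  # grow the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_mass(p, q, alpha, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
k_half = normalizing_constant(p, q, 0.5)
print(k_half, total_mass(p, q, 0.5, k_half))
```

Bisection starts from $k = 0$ because convexity of the deformed exponential gives a total mass of at most one there, and the total mass is increasing in $k$.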

4.1. Generalized Open Exponential Arcs

Let us define the generalized open arcs and prove some of their properties.
Definition 1.
For a fixed deformed exponential $\varphi$, we say that $p$ and $q$ in $\mathcal{P}_\mu$ are $\varphi$-connected by an open arc if there exist an open interval $I \supset [0,1]$ and a constant $k(\alpha)$ such that
$$p(\alpha) = \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(q) + k(\alpha)u_0\big)$$
belongs to $\mathcal{P}_\mu$ for every $\alpha \in I$, where $k(\alpha)$ depends on $\alpha$, $p$ and $q$.
In the following proposition, we give an equivalent definition of φ -connection by open arc.
Proposition 7.
The distributions $p, q \in \mathcal{P}_\mu$ are $\varphi$-connected by an open arc if and only if there exist an open interval $I \supset [0,1]$ and a random variable $v \in L^{\varphi}_c$ such that $p(\alpha) \propto \varphi(c + \alpha v)$ belongs to $\mathcal{P}_\mu$ for all $\alpha \in I$, with $p(0) = p$ and $p(1) = q$.
Proof. 
Let us assume that $p, q$ are $\varphi$-connected by an open arc, i.e., $\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(q)\big)\,d\mu < \infty$ for all $\alpha \in I$. Since
$$\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(q)\big)\,d\mu = \int_T \varphi\big(\alpha[\varphi^{-1}(q) - \varphi^{-1}(p)] + \varphi^{-1}(p)\big)\,d\mu = \int_T \varphi(c + \alpha v)\,d\mu,$$
where $v = \varphi^{-1}(q) - \varphi^{-1}(p)$ and $\varphi(c) = p$, we have $v \in L^{\varphi}_c$. Moreover, $p(\alpha) \propto \varphi(c + \alpha v)$ belongs to $\mathcal{P}_\mu$ for every $\alpha \in I$, with $p(0) = \varphi(c) = p$ and $p(1) = q$. The converse follows immediately: if $q = p(1)$, then $\varphi(c + v) = q$, so $v = \varphi^{-1}(q) - \varphi^{-1}(p)$, with $\varphi(c) = p = p(0)$. ☐
The need to define open arcs arises because $v \in L^{\varphi}_c$. As a consequence of Proposition 7, if $p, q \in \mathcal{P}_\mu$ are $\varphi$-connected by an open arc, then the random variable $v$ belongs to $\mathcal{K}^{\varphi}_c$, since $\int_T \varphi(c + \alpha v)\,d\mu < \infty$ for all $\alpha \in (-\varepsilon, 1+\varepsilon)$. With this, we can prove the following results.
Corollary 1.
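Let $p, q \in \mathcal{P}_\mu$, where $p = \varphi(c)$. Then $q \in \mathcal{F}^{\varphi}_c$ if and only if $p$ and $q$ are $\varphi$-connected by an open arc.
Proof. 
Suppose that $q \in \mathcal{F}^{\varphi}_c$. Then $q = \varphi(c + v - \psi(v)u_0)$, where $v \in \mathcal{B}^{\varphi}_c$. Thus, we have $\int_T \varphi(c + \alpha v)\,d\mu < \infty$ for all $\alpha \in (-\varepsilon, 1+\varepsilon)$, and we deduce that $p(\alpha) \propto \varphi(c + \alpha v)$ is an open arc containing $p$ and $q$. Conversely, suppose that $p$ and $q$ are $\varphi$-connected by an open arc. By Proposition 7, there exist an open interval $I \supset [0,1]$ and $v \in \mathcal{K}^{\varphi}_c$ such that $p(\alpha) \propto \varphi(c + \alpha v)$ belongs to $\mathcal{P}_\mu$ with $q = p(1)$. If $v \in \mathcal{B}^{\varphi}_c$, then $q = \varphi(c + v) \in \mathcal{F}^{\varphi}_c$ and the proof is over. Otherwise, let $w$ be such that
$$w = v - \frac{\int_T v\,\varphi'(c)\,d\mu}{\int_T u_0\,\varphi'(c)\,d\mu}\,u_0,$$
so that $\int_T w\,\varphi'(c)\,d\mu = 0$ and $w \in \mathcal{B}^{\varphi}_c$. Hence, we have $q = \varphi(c + v) = \varphi(c + w)$ and $q \in \mathcal{F}^{\varphi}_c$. ☐
With this, we have proved that, for $\varphi(c) = p$, the $\varphi$-family of probability distributions $\mathcal{F}^{\varphi}_c$ is the set of all $q \in \mathcal{P}_\mu$ that are $\varphi$-connected to $p$ by an open arc.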
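The centering step that produces $w$ from $v$ can be checked numerically. The sketch below assumes a finite sample space with counting measure, $u_0 \equiv 1$, and the classical case where the deformed exponential is the ordinary exponential, so that the weight $\varphi'(c)$ coincides with $p$. The helper name `center` and the sample values are hypothetical.

```python
def center(v, weights, u0):
    """Return w = v - (sum(v*weights) / sum(u0*weights)) * u0 on a finite sample space."""
    num = sum(vi * wi for vi, wi in zip(v, weights))
    den = sum(ui * wi for ui, wi in zip(u0, weights))
    return [vi - (num / den) * ui for vi, ui in zip(v, u0)]

# Classical case phi = exp with p = phi(c): the weight phi'(c) equals p itself.
p = [0.5, 0.3, 0.2]
weights = p
u0 = [1.0, 1.0, 1.0]
v = [1.0, -2.0, 4.0]

w = center(v, weights, u0)
residual = sum(wi * pi for wi, pi in zip(w, weights))
print(w, residual)  # the weighted integral of w vanishes up to rounding
```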
Corollary 2.
Let $p = \varphi(c)$ and $q = \varphi(\tilde{c})$ be such that $p, q \in \mathcal{P}_\mu$ are $\varphi$-connected by an open arc. Then the spaces $L^{\varphi}_c$ and $L^{\varphi}_{\tilde{c}}$ are equal as sets.
Proof. 
It follows from Corollary 1 that $p$ and $q$ are in the same $\varphi$-family, so $\tilde{c} = c + u - \psi(u)u_0$, and the result follows from Vigelis and Cavalcante [28], Lemma 5. ☐
Now, we show that the connection by generalized open exponential arcs is an equivalence relation.
Proposition 8.
The relation in Definition 1 is an equivalence relation.
Proof. 
Reflexivity and symmetry follow from the definition, so we only prove transitivity. Consider $p, q, r \in \mathcal{P}_\mu$ and
$$p(t) \propto \varphi(c + tu), \quad r(t) \propto \varphi(c + tv), \quad t \in (-\varepsilon, 1+\varepsilon),$$
with $p(0) = \varphi(c) = p$, $p(1) = \varphi(c + u) = q$, $r(0) = \varphi(c) = p$, $r(1) = \varphi(c + v) = r$ and $u, v \in L^{\varphi}_c$. Thus $p$ is $\varphi$-connected to $q$ and to $r$. We need to prove that $q$ and $r$ are also $\varphi$-connected. The arc
$$q(t) \propto \varphi(c + (1-t)u + tv) \propto \varphi(c + u + t(v - u))$$
is defined with $c + u = \tilde{c}$, that is, $q(t) \propto \varphi(\tilde{c} + t(v - u))$ with $v - u \in L^{\varphi}_c$, and satisfies $q(0) = \varphi(\tilde{c}) = \varphi(c + u) = q$ and $q(1) = \varphi(\tilde{c} + (v - u)) = \varphi(c + v) = r$. Therefore, $q$ and $r$ are $\varphi$-connected. ☐
We know from Corollary 1 that the $\varphi$-family $\mathcal{F}^{\varphi}_c$ coincides with the set of all $q \in \mathcal{P}_\mu$ that are $\varphi$-connected to $p$ by an open arc. We now want to prove that the $\varphi$-family $\mathcal{F}^{\varphi}_c$ is convex for certain deformed exponentials $\varphi$.
Lemma 4.
Let $\varphi$ be a fixed deformed exponential. Assume that $(\varphi^{-1})''(x)$ is continuous and that
$$\alpha\,\frac{\varphi''(\alpha\varphi^{-1}(x) + k)}{\varphi'(\alpha\varphi^{-1}(x) + k)} \geq \frac{\varphi''(\varphi^{-1}(x))}{\varphi'(\varphi^{-1}(x))}. \tag{37}$$
Then $F(x) = \varphi(\alpha\varphi^{-1}(x) + k)$, for some fixed $\alpha > 1$ and $k \in \mathbb{R}$, is a convex function.
Proof. 
We know that, if $F''(x) \geq 0$ for all $\alpha > 1$ and all $x$, then $F(x)$ is a convex function. We have
$$F''(x) = \frac{\alpha^2\varphi''(\alpha\varphi^{-1}(x) + k)\,\varphi'(\varphi^{-1}(x)) - \alpha\,\varphi'(\alpha\varphi^{-1}(x) + k)\,\varphi''(\varphi^{-1}(x))}{[\varphi'(\varphi^{-1}(x))]^3},$$
and, since $\varphi$ is an increasing function, $[\varphi'(\varphi^{-1}(x))]^3 > 0$. Hence, we have $F''(x) \geq 0$ if and only if
$$\alpha^2\varphi''(\alpha\varphi^{-1}(x) + k)\,\varphi'(\varphi^{-1}(x)) - \alpha\,\varphi'(\alpha\varphi^{-1}(x) + k)\,\varphi''(\varphi^{-1}(x)) \geq 0,$$
which follows from Equation (37). ☐
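For the ordinary exponential, the ratio of second to first derivative is identically one, so the condition of Lemma 4 reduces to $\alpha \geq 1$ and $F(x) = e^{k}x^{\alpha}$. The snippet below is a small numerical sanity check of that special case only; the midpoint-convexity test, the sample grid and the chosen values of $\alpha$ and $k$ are assumptions made for illustration.

```python
import math

def F(x, alpha, k):
    """F(x) = phi(alpha * phi^{-1}(x) + k) for phi = exp, i.e. exp(k) * x**alpha."""
    return math.exp(alpha * math.log(x) + k)

def is_midpoint_convex(f, xs, slack=1e-12):
    """Check f((x+y)/2) <= (f(x)+f(y))/2 for all pairs taken from xs."""
    return all(
        f(0.5 * (x + y)) <= 0.5 * (f(x) + f(y)) + slack
        for x in xs for y in xs
    )

xs = [0.05 * i for i in range(1, 21)]  # sample points in (0, 1]

# alpha > 1: the condition holds for phi = exp, and F passes the convexity check.
convex_case = is_midpoint_convex(lambda x: F(x, 1.5, -0.3), xs)
# alpha in (0, 1): the condition fails for phi = exp, and F is concave instead.
concave_case = is_midpoint_convex(lambda x: F(x, 0.5, -0.3), xs)
print(convex_case, concave_case)
```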
Proposition 9.
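Let $p \in \mathcal{P}_\mu$ be such that $\varphi(c) = p$. Assume that $(\varphi^{-1})''(x)$ is continuous and that
$$\alpha\,\frac{\varphi''(\alpha\varphi^{-1}(x) + k)}{\varphi'(\alpha\varphi^{-1}(x) + k)} \geq \frac{\varphi''(\varphi^{-1}(x))}{\varphi'(\varphi^{-1}(x))}$$
for some fixed $\alpha > 1$ and $k \in \mathbb{R}$. Then, the $\varphi$-family of probability distributions $\mathcal{F}^{\varphi}_c$ is convex.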
Proof. 
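Note that, for any $\varphi(\tilde{c}) = r \in \mathcal{F}^{\varphi}_c$, we have $\mathcal{F}^{\varphi}_c = \mathcal{F}^{\varphi}_{\tilde{c}}$. Suppose $q \in \mathcal{F}^{\varphi}_c$, and consider $p(\lambda) = \lambda p + (1-\lambda)q$ for any $\lambda \in [0,1]$. We show that $p(\lambda) \in \mathcal{F}^{\varphi}_c$ for all $\lambda \in [0,1]$ by proving that $\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(p(\lambda))\big)\,d\mu < \infty$ for $\alpha \in (-\varepsilon, 1+\varepsilon)$. In other words, we will show that $p(\lambda)$ and $p$ are $\varphi$-connected for all $\lambda \in [0,1]$.
For $\alpha \in (0,1)$, due to the convexity of $\varphi$, we have
$$\begin{aligned}
\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(p(\lambda))\big)\,d\mu
&\leq \int_T \big[(1-\alpha)\varphi(\varphi^{-1}(p)) + \alpha\varphi(\varphi^{-1}(p(\lambda)))\big]\,d\mu \\
&= \int_T \big[(1-\alpha)p + \alpha p(\lambda)\big]\,d\mu \\
&= (1-\alpha)\int_T p\,d\mu + \alpha\int_T p(\lambda)\,d\mu = 1.
\end{aligned}$$
If $\alpha \in (-\varepsilon, 0)$, according to the convexity of $\alpha\varphi^{-1}(x)$ and $\varphi(x)$, we have
$$\begin{aligned}
\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(p(\lambda))\big)\,d\mu
&\leq \int_T \varphi\big(\lambda\alpha\varphi^{-1}(p) + (1-\lambda)\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu \\
&= \int_T \varphi\big(\lambda[\alpha\varphi^{-1}(p) + (1-\alpha)\varphi^{-1}(p)] + (1-\lambda)[\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)]\big)\,d\mu \\
&\leq \int_T \big[\lambda\varphi\big(\alpha\varphi^{-1}(p) + (1-\alpha)\varphi^{-1}(p)\big) + (1-\lambda)\varphi\big(\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\big]\,d\mu \\
&= \lambda\int_T \varphi(\varphi^{-1}(p))\,d\mu + (1-\lambda)\int_T \varphi\big(\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu \\
&= \lambda + (1-\lambda)\int_T \varphi\big(\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu.
\end{aligned}$$
Since $q \in \mathcal{F}^{\varphi}_c$, we have by Corollary 1 that $q$ and $p$ are $\varphi$-connected. Hence,
$$\int_T \varphi\big((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(p(\lambda))\big)\,d\mu < \infty,$$
so $p(\lambda)$ and $p$ are $\varphi$-connected by an open arc, for all $\alpha \in (-\varepsilon, 0)$.
Now, if $\alpha \in (1, 1+\varepsilon)$, then by Lemma 4, $F(x) = \varphi(\alpha\varphi^{-1}(x) + k)$ is a convex function, so
$$\varphi\big(\alpha\varphi^{-1}(\lambda x + (1-\lambda)y) + k\big) \leq \lambda\varphi\big(\alpha\varphi^{-1}(x) + k\big) + (1-\lambda)\varphi\big(\alpha\varphi^{-1}(y) + k\big),$$
where $\lambda \in [0,1]$ and $k$ is a constant. Taking $k = (1-\alpha)\varphi^{-1}(p)$, we have
$$\begin{aligned}
\int_T \varphi\big(\alpha\varphi^{-1}(p(\lambda)) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu
&\leq \lambda\int_T \varphi\big(\alpha\varphi^{-1}(p) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu + (1-\lambda)\int_T \varphi\big(\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu \\
&= \lambda + (1-\lambda)\int_T \varphi\big(\alpha\varphi^{-1}(q) + (1-\alpha)\varphi^{-1}(p)\big)\,d\mu < \infty,
\end{aligned}$$
since $q \in \mathcal{F}^{\varphi}_c$ and, therefore, $p$ and $q$ are $\varphi$-connected by an open arc. ☐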

4.2. Generalized Open Mixture Arcs

In Section 3.2, we proved that the generalized mixture arc given by
$$p(\alpha) = F^{-1}\big((1-\alpha)F(p) + \alpha F(q)\big) \tag{40}$$
is well defined for $\alpha \in [0,1]$. In this section, our goal is twofold: firstly, to ensure that the open arc is also well defined; and, secondly, to provide some properties of these arcs. To this end, we use Equation (34), which establishes that $D(\partial\psi^*) = \operatorname{range}\partial\psi$ is an open set, so we can extend the convex combination in Equation (40) between $F(p)$ and $F(q)$ beyond these extreme points while maintaining positivity of $(1-\alpha)F(p) + \alpha F(q)$. Indeed, since $D(\partial\psi^*) = \operatorname{range}\partial\psi$ is an open set, there exists $\varepsilon_1 > 0$ such that the open ball $B(F(p), \varepsilon_1)$ of radius $\varepsilon_1$ centered at $F(p)$ satisfies $B(F(p), \varepsilon_1) \subset D(\partial\psi^*)$. Similarly, there exists $\varepsilon_2 > 0$ such that $B(F(q), \varepsilon_2) \subset D(\partial\psi^*)$. Taking $\varepsilon = \min\{\varepsilon_1, \varepsilon_2\}$, we guarantee that the combination $(1-\alpha)F(p) + \alpha F(q)$ in (40) can be extended to $\alpha \in I = (-\varepsilon, 1+\varepsilon) \supset [0,1]$.
Definition 2.
For a fixed deformed exponential $\varphi$, we say that $p$ and $q$ in $\mathcal{P}_\mu$ are $\varphi$-connected by an open mixture arc if there exists an open interval $I \supset [0,1]$ such that
$$p(\alpha) = F^{-1}\big((1-\alpha)F(p) + \alpha F(q)\big) \tag{41}$$
belongs to $\mathcal{P}_\mu$ for every $\alpha \in I$, where $F(p) = \dfrac{\varphi'(\varphi^{-1}(p))}{\int_T u_0\,\varphi'(\varphi^{-1}(p))\,d\mu}$.
In [22], it was shown that densities connected by open mixture arcs have ratios bounded away from zero. Santacroce et al. [23] showed the converse implication, providing a characterization of open mixture models. Here, one can see that the fundamental role in being connected by open mixture arcs is played by the ratios $\frac{F(p)}{F(q)}$, which have to be bounded. The functional $F(p)$ in the definition of the generalized open mixture arc satisfies $F(p) > 0$. Thus the combination $(1-\alpha)F(p) + \alpha F(q)$ in (41) has to satisfy the same property, that is, $(1-\alpha)F(p) + \alpha F(q) > 0$. Assume that $p$ and $q$ are $\varphi$-connected by an open mixture arc, so that $p(\alpha)$ given by (41) belongs to $\mathcal{P}_\mu$ for all $\alpha \in (-\varepsilon_1, 1+\varepsilon_2) \supset [-\varepsilon, 1+\varepsilon]$ with $\varepsilon > 0$. Since $p(-\varepsilon)$ and $p(1+\varepsilon)$ belong to $\mathcal{P}_\mu$, we have
$$F(p(-\varepsilon)) = (1+\varepsilon)F(p) + (-\varepsilon)F(q) > 0,$$
which implies that
$$\frac{F(p)}{F(q)} > \frac{\varepsilon}{1+\varepsilon} \tag{42}$$
and
$$F(p(1+\varepsilon)) = (-\varepsilon)F(p) + (1+\varepsilon)F(q) > 0,$$
which gives
$$\frac{F(p)}{F(q)} < \frac{1+\varepsilon}{\varepsilon}. \tag{43}$$
Combining inequalities (42) and (43), we have
$$\frac{\varepsilon}{1+\varepsilon} < \frac{F(p)}{F(q)} < \frac{1+\varepsilon}{\varepsilon}. \tag{44}$$
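The equivalence between positivity at the two extended endpoints and the two-sided ratio bound can be checked on a small example. The sketch below works directly with the values of $F(p)$ and $F(q)$ on a three-point sample space; in the classical exponential case with $u_0 \equiv 1$ these are just $p$ and $q$. The function names and the sample values are hypothetical, and exact rational arithmetic is used so that boundary cases are decided without rounding.

```python
from fractions import Fraction

def stays_positive(fp, fq, eps):
    """True if the combinations at alpha = -eps and alpha = 1 + eps are both positive."""
    left = all((1 + eps) * a - eps * b > 0 for a, b in zip(fp, fq))
    right = all(-eps * a + (1 + eps) * b > 0 for a, b in zip(fp, fq))
    return left and right

def ratio_within_bounds(fp, fq, eps):
    """True if eps/(1+eps) < F(p)/F(q) < (1+eps)/eps holds pointwise."""
    return all(eps / (1 + eps) < a / b < (1 + eps) / eps for a, b in zip(fp, fq))

fp = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
fq = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]

checks = {
    eps: (stays_positive(fp, fq, eps), ratio_within_bounds(fp, fq, eps))
    for eps in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4))
}
print(checks)
```

For these values the pointwise ratios are $2$, $4/3$ and $1/3$, so both tests succeed exactly when $\varepsilon < 1/2$.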
Conversely, if Equation (44) holds, then $(1-\alpha)F(p) + \alpha F(q) > 0$ and $p(\alpha)$ in Equation (41) belongs to $\mathcal{P}_\mu$. Thus, $p$ and $q$ in $\mathcal{P}_\mu$ are $\varphi$-connected by an open mixture arc if and only if the ratio $\frac{F(p)}{F(q)}$ is bounded. Since $\operatorname{range}\partial\psi$ is an open set, there exists an interval $I \supset [0,1]$ such that $(1-\alpha)F(p) + \alpha F(q)$ belongs to $\operatorname{range}\partial\psi$ and we have
$$\int_T u_0\big[(1-\alpha)F(p) + \alpha F(q)\big]\,d\mu = 1, \quad \text{for all } \alpha \in I \supset [0,1].$$
Then, there exist functions $w_\alpha \in \operatorname{dom}\psi$ such that
$$(1-\alpha)F(p) + \alpha F(q) = \frac{\varphi'(c + w_\alpha - \psi(w_\alpha)u_0)}{\int_T u_0\,\varphi'(c + w_\alpha - \psi(w_\alpha)u_0)\,d\mu},$$
with
$$p(\alpha) = F^{-1}\left(\frac{\varphi'(c + w_\alpha - \psi(w_\alpha)u_0)}{\int_T u_0\,\varphi'(c + w_\alpha - \psi(w_\alpha)u_0)\,d\mu}\right), \quad \text{for all } \alpha \in I \supset [0,1],$$
that is, the convex combination in Equation (41) is also a functional of the type in Equation (12) for all $\alpha \in I$. Hence, the open mixture arc is well defined. Another property of the connection by generalized open mixture arcs is that it is an equivalence relation.
Proposition 10.
The relation in Definition 2 is an equivalence relation.
Proof. 
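Reflexivity and symmetry follow from the definition. As for transitivity, consider $p, q$ and $r \in \mathcal{P}_\mu$ such that $p(\lambda) = F^{-1}\big((1-\lambda)F(p) + \lambda F(q)\big) \in \mathcal{P}_\mu$ and $q(\beta) = F^{-1}\big((1-\beta)F(q) + \beta F(r)\big) \in \mathcal{P}_\mu$ with $\lambda, \beta \in [-\varepsilon, 1+\varepsilon]$ for some $\varepsilon > 0$. We can take $p(-\varepsilon) = F^{-1}\big((1+\varepsilon)F(p) + (-\varepsilon)F(q)\big)$ and $q(-\varepsilon) = F^{-1}\big((1+\varepsilon)F(q) + (-\varepsilon)F(r)\big)$, and define a probability distribution
$$p_1 = F^{-1}\left(\frac{1+\varepsilon}{1+2\varepsilon}F(p(-\varepsilon)) + \frac{\varepsilon}{1+2\varepsilon}F(q(-\varepsilon))\right) = F^{-1}\left(\frac{(1+\varepsilon)^2}{1+2\varepsilon}F(p) - \frac{\varepsilon^2}{1+2\varepsilon}F(r)\right).$$
Similarly, with $p(1+\varepsilon) = F^{-1}\big((-\varepsilon)F(p) + (1+\varepsilon)F(q)\big)$ and $q(1+\varepsilon) = F^{-1}\big((-\varepsilon)F(q) + (1+\varepsilon)F(r)\big)$, we may define a probability distribution as
$$p_2 = F^{-1}\left(\frac{\varepsilon}{1+2\varepsilon}F(p(1+\varepsilon)) + \frac{1+\varepsilon}{1+2\varepsilon}F(q(1+\varepsilon))\right) = F^{-1}\left(-\frac{\varepsilon^2}{1+2\varepsilon}F(p) + \frac{(1+\varepsilon)^2}{1+2\varepsilon}F(r)\right).$$
The generalized open mixture arc $r(\alpha) = F^{-1}\big((1-\alpha)F(p_1) + \alpha F(p_2)\big)$, $\alpha \in (0,1)$, connects $r\left(\frac{\varepsilon^2}{2\varepsilon^2+2\varepsilon+1}\right) = p$ and $r\left(\frac{(1+\varepsilon)^2}{2\varepsilon^2+2\varepsilon+1}\right) = r$. ☐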
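Because every step of the transitivity argument is a linear combination of $F(p)$, $F(q)$ and $F(r)$, the bookkeeping can be verified with exact arithmetic on coefficient triples. The sketch below uses the weights $\frac{1+\varepsilon}{1+2\varepsilon}$ and $\frac{\varepsilon}{1+2\varepsilon}$, which are the ones that cancel the $F(q)$ terms, and then evaluates the arc $(1-\alpha)F(p_1) + \alpha F(p_2)$ at the two parameter values that appear in the proof. The helper `combine` and the choice $\varepsilon = 1/3$ are assumptions made for illustration.

```python
from fractions import Fraction

def combine(a, x, b, y):
    """Coefficient triple of a*x + b*y, where x and y are triples of
    coefficients with respect to (F(p), F(q), F(r))."""
    return tuple(a * xi + b * yi for xi, yi in zip(x, y))

eps = Fraction(1, 3)  # arbitrary positive value chosen for the check
Fp, Fq, Fr = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Endpoints of the two given open arcs, expressed in F-coordinates.
p_minus = combine(1 + eps, Fp, -eps, Fq)  # F(p(-eps))
q_minus = combine(1 + eps, Fq, -eps, Fr)  # F(q(-eps))
p_plus = combine(-eps, Fp, 1 + eps, Fq)   # F(p(1+eps))
q_plus = combine(-eps, Fq, 1 + eps, Fr)   # F(q(1+eps))

d = 1 + 2 * eps
F_p1 = combine((1 + eps) / d, p_minus, eps / d, q_minus)
F_p2 = combine(eps / d, p_plus, (1 + eps) / d, q_plus)

n = 2 * eps**2 + 2 * eps + 1
alpha_small = eps**2 / n        # smaller of the two parameter values
alpha_large = (1 + eps) ** 2 / n  # larger of the two parameter values

at_small = combine(1 - alpha_small, F_p1, alpha_small, F_p2)
at_large = combine(1 - alpha_large, F_p1, alpha_large, F_p2)
print(F_p1, F_p2, at_small, at_large)
```

With the arc written as $(1-\alpha)F(p_1) + \alpha F(p_2)$, the smaller parameter value returns the coefficients of $F(p)$ and the larger one those of $F(r)$; both values lie in $(0,1)$.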

5. Conclusions

In this work, we have generalized the open exponential arc and the open mixture arc for probability distributions. Moreover, we ensured that the generalization of the open mixture arc is well defined when the deformed exponential is strictly convex. Given two $\varphi$-connected probability distributions $p_1$ and $p_2$, we can define the generalized parallel transport $\tau^{(1)}_{p_1,p_2}$ between the tangent spaces $T_{p_1}\mathcal{P}_\mu$ and $T_{p_2}\mathcal{P}_\mu$ given by
$$u \mapsto u - \frac{\int_T u\,\varphi'(\varphi^{-1}(p_2))\,d\mu}{\int_T u_0\,\varphi'(\varphi^{-1}(p_2))\,d\mu}\,u_0,$$
where $T_p\mathcal{P}_\mu \simeq \mathcal{B}^{\varphi}_c$ with $p = \varphi(c)$. A next step is to find a generalized parallel transport $\tau^{(-1)}_{p_1,p_2}$ that is dual to $\tau^{(1)}_{p_1,p_2}$. Another goal is to investigate whether the generalized Rényi divergence $\mathcal{D}_{\varphi}^{\alpha}(\cdot \,\|\, \cdot)$, defined in [36] for two $\varphi$-connected probability distributions, can be related to the statistical divergence associated with $(\tau^{(1)}_{p_1,p_2}, \tau^{(-1)}_{p_1,p_2}, \langle\cdot,\cdot\rangle)$.

Acknowledgments

The authors would like to thank CAPES and CNPq (Procs. 309055/2014-8 and 408609/2016-8) for partial funding of this research.

Author Contributions

All authors contributed equally to the design of the research. The research was carried out by all authors. Rui F. Vigelis and Charles C. Cavalcante gave the central idea of the paper and managed the organization of it. Luiza H.F. de Andrade wrote the paper. All the authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Amari, S.-I. Information geometry on hierarchy of probability distributions. IEEE Trans. Inf. Theory 2001, 47, 1701–1711.
  2. Calin, O.; Udrişte, C. Geometric Modeling in Probability and Statistics; Springer: Cham, Switzerland, 2014.
  3. Amari, S.-I. Information geometry and its applications. In Applied Mathematical Sciences; Springer: Tokyo, Japan, 2016; Volume 194.
  4. Amari, S.-I. Differential Geometry of Curved Exponential Families-Curvatures and Information Loss. Ann. Stat. 1982, 10, 357–385.
  5. Amari, S.-I. Differential-Geometrical Methods in Statistics; Lecture Notes in Statistics; Springer: New York, NY, USA, 1985; Volume 28.
  6. Amari, S.-I.; Nagaoka, H. Methods of information geometry. In Translations of Mathematical Monographs; American Mathematical Society, Providence, RI; Translated from the 1993 Japanese original by Daishi Harada; Oxford University Press: Oxford, UK, 2000; Volume 191.
  7. Amari, S.-I.; Cichocki, A. Information geometry of divergence functions. Bull. Pol. Acad. Sci. Tech. Sci. 2010, 58, 183–195.
  8. Amari, S.-I. α-Divergence Is Unique, Belonging to Both f-Divergence and Bregman Divergence Classes. IEEE Trans. Inf. Theory 2009, 55, 4925–4931.
  9. Zhang, J. Divergence Function, Duality, and Convex Analysis. Neural Comput. 2004, 16, 159–195.
  10. Nielsen, F.; Nock, R. On w-mixtures: Finite convex combinations of prescribed component distributions. arXiv, 2017; arXiv:1708.00568v1.
  11. Amari, S.-I.; Ohara, A.; Matsuzoe, H. Geometry of deformed exponential families: Invariant, dually-flat and conformal geometries. Phys. A Stat. Mech. Appl. 2012, 391, 4308–4319.
  12. Harsha, K.V.; Moosath, K.S.S. Dually flat geometries of the deformed exponential family. Phys. A Stat. Mech. Appl. 2015, 433, 136–147.
  13. Matsuzoe, H. Hessian structures on deformed exponential families and their conformal structures. Differ. Geom. Appl. 2014, 35, 323–333.
  14. Matsuzoe, H.; Wada, T. Deformed Algebras and Generalizations of Independence on Deformed Exponential Families. Entropy 2015, 17, 5729–5751.
  15. Giné, E.; Nickl, R. Mathematical Foundations of Infinite-Dimensional Statistical Models; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 2015.
  16. Townsend, J.; Solomon, B.; Smith, J. The perfect gestalt: Infinite dimensional Riemannian face spaces and other aspects of face perception. In Computational, Geometric and Process Perspectives on Facial Cognition: Contexts and Challenges; Wenger, M.J., Townsend, J.T., Eds.; Society for Mathematical Psychology: Washington, DC, USA, 2001; pp. 39–82.
  17. Trivellato, B. Deformed exponentials and applications to finance. Entropy 2013, 15, 3471–3489.
  18. Pistone, G.; Sempi, C. An infinite-dimensional geometric structure on the space of all the probability measures equivalent to a given one. Ann. Stat. 1995, 23, 1543–1561.
  19. Pistone, G.; Rogantin, M.P. The exponential statistical manifold: Mean parameters, orthogonality and space transformations. Bernoulli 1999, 5, 721–760.
  20. Gibilisco, P.; Pistone, G. Connections on Non-Parametric Statistical Manifolds by Orlicz Space Geometry. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1998, 1, 325–347.
  21. Grasselli, M.R. Dual connections in nonparametric classical information geometry. Ann. Inst. Statist. Math. 2010, 62, 873–896.
  22. Cena, A.; Pistone, G. Exponential statistical manifold. Ann. Inst. Statist. Math. 2007, 59, 27–56.
  23. Santacroce, M.; Siri, P.; Trivellato, B. New results on mixture and exponential models by Orlicz spaces. Bernoulli 2016, 22, 1431–1447.
  24. Santacroce, M.; Siri, P.; Trivellato, B. On Mixture and Exponential Connection by Open Arcs. In Geometric Science of Information; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 577–584.
  25. Pistone, G. Examples of the application of nonparametric information geometry to statistical physics. Entropy 2013, 15, 4042–4065.
  26. Pistone, G. kappa-exponential models from the geometrical viewpoint. Eur. Phys. J. B 2009, 70, 29–37.
  27. Kaniadakis, G. Non-linear kinetics underlying generalized statistics. Phys. A Stat. Mech. Appl. 2001, 296, 405–425.
  28. Vigelis, R.F.; Cavalcante, C.C. On ϕ-families of probability distributions. J. Theoret. Probab. 2013, 26, 870–884.
  29. Pistone, G. Nonparametric information geometry. In Geometric Science of Information; Lecture Notes in Comput. Sci.; Springer: Heidelberg, Germany, 2013; Volume 8085, pp. 5–36.
  30. Naudts, J. Generalised Thermostatistics; Springer-London, Ltd.: London, UK, 2011.
  31. Vigelis, R.F.; de Andrade, L.H.F.; Cavalcante, C.C. On the Existence of Paths Connecting Probability Distributions. In Proceedings of the Geometric Science of Information: Third International Conference, GSI 2017, Paris, France, 7–9 November 2017; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 801–808.
  32. Musielak, J. Orlicz Spaces and Modular Spaces; Lecture Notes in Mathematics; Springer: Berlin, Germany, 1983; Volume 1034.
  33. Vigelis, R.F.; Cavalcante, C.C. The Δ2-condition and ϕ-families of probability distributions. In Geometric Science of Information; Lecture Notes in Comput. Sci.; Springer: Heidelberg, Germany, 2013; Volume 8085, pp. 729–736.
  34. Hudzik, H.; Zbaszyniak, Z. Smoothness in Musielak-Orlicz spaces equipped with the Orlicz norm. Collect. Math. 1997, 48, 543–561.
  35. Vigelis, R.F.; Cavalcante, C.C. Smoothness of the Orlicz norm in Musielak-Orlicz function spaces. Math. Nachr. 2014, 287, 1025–1041.
  36. de Souza, D.C.; Vigelis, R.F.; Cavalcante, C.C. Geometry Induced by a Generalization of Rényi Divergence. Entropy 2016, 18, 407.
  37. De Andrade, L.H.F.; Vigelis, R.F.; Vieira, F.L.J.; Cavalcante, C.C. Normalization and ϕ-function: Definition and Consequences. In Proceedings of the Geometric Science of Information: Third International Conference, GSI 2017, Paris, France, 7–9 November 2017; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 231–238.
  38. Asplund, E.; Rockafellar, R.T. Gradients of convex functions. Trans. Am. Math. Soc. 1969, 139, 443–467.
  39. Barbu, V.; Precupanu, T. Convexity and Optimization in Banach Spaces, 4th ed.; Springer Monographs in Mathematics; Springer: Dordrecht, The Netherlands, 2012.
  40. Brøndsted, A.; Rockafellar, R.T. On the subdifferentiability of convex functions. Proc. Am. Math. Soc. 1965, 16, 605–611.
  41. Bauschke, H.H.; Borwein, J.M.; Combettes, P.L. Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces. Commun. Contemp. Math. 2001, 3, 615–647.
  42. Borwein, J.M.; Vanderwerff, J.D. Convex functions: Constructions, characterizations and counterexamples. In Encyclopedia of Mathematics and Its Applications; Cambridge University Press: Cambridge, UK, 2010; Volume 109.
  43. Vigelis, R.F. On Musielak-Orlicz Spaces and Applications to Information Geometry. Ph.D. Thesis, Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, Brazil, 2011.
