1. Introduction
Some history. In 1988 Tsallis [1] generalized the Boltzmann-Gibbs entropy
$$S = -k \sum_{i=1}^{n} p_i \ln p_i,$$
describing classical thermodynamical ensembles with microstates of probabilities $p_1, p_2, \ldots, p_n$, by the entropy
$$S_q = \frac{k}{q-1}\Bigl(1 - \sum_{i=1}^{n} p_i^{q}\Bigr)$$
for a real parameter $q \neq 1$. The entropy $S_q$ generalizes $S$ in the sense that $\lim_{q \to 1} S_q = S$. Here $k$ is the Boltzmann constant (being a multiplicative constant, it is neglected in the following). Many physicists argue that generalizing the classical entropy was a breakthrough in thermodynamics, since the extension allows a better description of systems out of equilibrium and of systems with strong correlations between microstates. There is, however, also criticism of the application of Tsallis' concept (compare [2,3]). In information theory, pioneered by Shannon, the Boltzmann-Gibbs entropy is one of the central concepts; we follow the usual practice of calling it the Shannon entropy. Also note that Tsallis' entropy concept coincides up to a constant with the Havrda-Charvát entropy [4], given in 1967 in an information-theoretic context. Besides information theory, entropies are used in many fields, among them dynamical systems, data analysis (see e.g. [5]), and fractal geometry [6].
Many axiomatic characterizations of Tsallis entropy have been given, originating in characterizations of the classical Shannon entropy (see below). One important axiom, called (generalized) Shannon additivity, is discussed extensively in this paper and shown to be sufficient in a certain sense.
Tsallis entropy. In the following, let
$$\Delta_n := \Bigl\{(p_1, p_2, \ldots, p_n) \in \mathbb{R}_{\geq 0}^{\,n} \;\Bigm|\; \sum_{i=1}^{n} p_i = 1 \Bigr\}$$
for $n \in \mathbb{N}$ be the set of all $n$-dimensional stochastic vectors and $\Delta := \bigcup_{n \in \mathbb{N}} \Delta_n$ be the set of all stochastic vectors, where $\mathbb{N}$ and $\mathbb{R}_{\geq 0}$ are the sets of natural numbers and of nonnegative real numbers, respectively. Given $q > 0$ with $q \neq 1$, the Tsallis entropy of a stochastic vector $(p_1, p_2, \ldots, p_n) \in \Delta$ of some dimension $n$ is defined by
$$S_q(p_1, p_2, \ldots, p_n) := \frac{1}{q-1}\Bigl(1 - \sum_{i=1}^{n} p_i^{q}\Bigr).$$
In the case $q = 1$, this value is not defined, but its limit as $q$ approaches 1 is
$$\lim_{q \to 1} S_q(p_1, p_2, \ldots, p_n) = -\sum_{i=1}^{n} p_i \ln p_i,$$
which provides the classical Shannon entropy. Insofar, the Tsallis entropy can be considered as a generalization of the Shannon entropy, and so it is not surprising that there have been many attempts to generalize various axiomatic characterizations of the latter to the Tsallis entropy.
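As a minimal numerical illustration (in Python, with ad hoc function names), the following sketch evaluates $S_q$ for a fixed stochastic vector and shows the values approaching the Shannon entropy as $q \to 1$:

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), defined for q != 1."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def shannon_entropy(p):
    """Shannon entropy -sum_i p_i ln p_i (zero components contribute nothing)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

p = [0.5, 0.3, 0.2]               # an arbitrary stochastic vector
for q in [2.0, 1.5, 1.1, 1.01, 1.001]:
    print(f"S_{q}(p) = {tsallis_entropy(p, q):.6f}")
print(f"Shannon  = {shannon_entropy(p):.6f}")   # the limit of S_q(p) as q -> 1
```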
Axiomatic characterizations. One line of characterizations, mainly followed by Suyari [7] and discussed in this paper, has its origin in the Shannon-Khinchin axioms of Shannon entropy (see [8,9]). Note that other characterizations of Tsallis entropy are due to dos Santos [10], Abe [11] and Furuichi [12]. For some general discussion of the axiomatization of entropies see [13].
A map $H: \Delta \to \mathbb{R}$ is the Shannon entropy up to a multiplicative positive constant if it satisfies the following axioms:
- (S1) Continuity: for each $n \in \mathbb{N}$, the restriction of $H$ to $\Delta_n$ is continuous.
- (S2) Maximality: for each $n \in \mathbb{N}$ and each $(p_1, \ldots, p_n) \in \Delta_n$ it holds $H(p_1, \ldots, p_n) \leq H(\tfrac{1}{n}, \ldots, \tfrac{1}{n})$.
- (S3) Expandability: $H(p_1, \ldots, p_n, 0) = H(p_1, \ldots, p_n)$ for each $(p_1, \ldots, p_n) \in \Delta_n$.
- (S4) Shannon additivity: for each $(p_{11}, \ldots, p_{1m_1}, \ldots, p_{n1}, \ldots, p_{nm_n}) \in \Delta$,
$$H(p_{11}, \ldots, p_{nm_n}) = H(p_1, \ldots, p_n) + \sum_{i=1}^{n} p_i\, H\Bigl(\frac{p_{i1}}{p_i}, \ldots, \frac{p_{im_i}}{p_i}\Bigr),$$
where $p_i := \sum_{j=1}^{m_i} p_{ij}$ and terms with $p_i = 0$ are omitted.
Axiom (S4), called Shannon additivity, plays a key role in the characterization of the Shannon entropy, and an interesting result given by Suyari [7] says that its generalization
$$\text{(GS4)} \qquad H(p_{11}, \ldots, p_{nm_n}) = H(p_1, \ldots, p_n) + \sum_{i=1}^{n} p_i^{q}\, H\Bigl(\frac{p_{i1}}{p_i}, \ldots, \frac{p_{im_i}}{p_i}\Bigr)$$
for $q > 0$ provides the Tsallis entropy for this $q$.
More precisely, if $H: \Delta \to \mathbb{R}$ satisfies (S1), (S2), (S3) and (GS4), then $H$ is the Tsallis entropy up to some positive constant $c$. The full result of Suyari, slightly corrected by Ilić et al. [14], includes a characterization of the map $q \mapsto c(q)$ under the assumption that $H$ also depends continuously on $q$. We do not discuss this characterization, but we note here that the results below also provide an immediate simplification of the whole result of Suyari and Ilić et al.
Given $q$, the constant $c$ is determined by any positive value of $H$ at some stochastic vector. If this reference vector is, for example, given by $(\tfrac{1}{2}, \tfrac{1}{2})$, one easily sees that $H(\tfrac{1}{2}, \tfrac{1}{2}) = c\,\frac{1 - 2^{1-q}}{q-1}$ and $c = \frac{q-1}{1 - 2^{1-q}}\, H(\tfrac{1}{2}, \tfrac{1}{2})$.
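As a plausibility check, assuming the form of (GS4) with weights $p_i^{q}$ as stated above, the following sketch verifies numerically that the Tsallis entropy fulfills the generalized Shannon additivity and that the constant $c$ is recovered from the value at the reference vector $(\tfrac12, \tfrac12)$:

```python
import numpy as np

def tsallis_entropy(p, q):
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

q = 0.7
# A refinement: the i-th component p_i is split into the block (p_i1, ..., p_im_i).
blocks = [np.array([0.1, 0.2, 0.1]), np.array([0.35, 0.15]), np.array([0.1])]
p = np.array([b.sum() for b in blocks])    # coarse vector (0.4, 0.5, 0.1)
refined = np.concatenate(blocks)           # refined vector, still sums to 1

lhs = tsallis_entropy(refined, q)
rhs = tsallis_entropy(p, q) + sum(p_i ** q * tsallis_entropy(b / p_i, q)
                                  for p_i, b in zip(p, blocks))
print(np.isclose(lhs, rhs))                # True: S_q fulfills (GS4)

# Recovering the constant c from the reference value H(1/2, 1/2) = c * S_q(1/2, 1/2):
c = 3.0
H_half = c * tsallis_entropy([0.5, 0.5], q)
print(np.isclose(H_half * (q - 1.0) / (1.0 - 2.0 ** (1.0 - q)), c))   # True
```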
The main result. In this paper, we study the role of generalized Shannon additivity in characterizing the Tsallis entropy, where besides (GS4) we also consider a slightly relaxed property (GS4'). It turns out that this property basically is enough for characterizing the Tsallis entropy for $q > 1$, and with a further weak assumption also in the cases $0 < q < 1$ and $q = 1$. As already mentioned, statement (iii) for $q = 1$ is an immediate consequence of a characterization of Shannon entropy by Diderrich [15], simplifying an axiomatization given by Faddeev [16] (see below).
Theorem 1. Let $H: \Delta \to \mathbb{R}$ be given with (GS4) or, a bit weaker, with (GS4'), for some $q > 0$. Then the following holds:
- (i) If $q > 1$, then there is some constant $c \in \mathbb{R}$ with
$$H(p_1, \ldots, p_n) = c\,\frac{1 - \sum_{i=1}^{n} p_i^{q}}{q - 1} \quad \text{for all } (p_1, \ldots, p_n) \in \Delta. \qquad (1)$$
- (ii) If $0 < q < 1$, then the following statements are equivalent:
  - (a) (1) holds for some constant $c \in \mathbb{R}$,
  - (b) $H$ is bounded on $\Delta_2$,
  - (c) $H$ is continuous on $\Delta_2$,
  - (d) $H$ is symmetric on $\Delta_3$,
  - (e) $H$ does not change its sign on $\Delta_2$.
- (iii) If $q = 1$, then the following statements are equivalent:
  - (a) $H$ is a multiple of the Shannon entropy, i.e., the analogue of (1) with $-\sum_{i=1}^{n} p_i \ln p_i$ in place of $\frac{1 - \sum_{i=1}^{n} p_i^{q}}{q - 1}$ holds,
  - (b) $H$ is bounded on $\Delta_2$.

Note that statement (iii) is given here only for reasons of completeness. It follows from a result of Diderrich [15].
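The implications leading from (a) to the other conditions in (ii) are the easy directions, since the Tsallis entropy itself enjoys all of these properties on low-dimensional stochastic vectors. A minimal sketch, checking boundedness, symmetry and constant sign of $t \mapsto S_q(t, 1-t)$ for an arbitrarily chosen $q$ between 0 and 1:

```python
import numpy as np

def tsallis_entropy(p, q):
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

q = 0.5                                    # an assumed value with 0 < q < 1
ts = np.linspace(0.0, 1.0, 1001)
vals = np.array([tsallis_entropy([t, 1.0 - t], q) for t in ts])

print("bounded:      ", np.isfinite(vals).all())
print("symmetric:    ", np.allclose(vals, vals[::-1]))   # S_q(t, 1-t) = S_q(1-t, t)
print("constant sign:", (vals >= -1e-12).all())          # nonnegative on the whole range
```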
The paper is organized as follows.
Section 2 is devoted to the proof of the main result. It will turn out that most of the substantial work is related to stochastic vectors contained in $\Delta_2 \cup \Delta_3$ and that the generalized Shannon additivity acts as a bridge to stochastic vectors longer than 2 or 3. Section 3 completes the discussion. In particular, the Tsallis entropy on rational stochastic vectors is discussed and an open problem is formulated.
2. Proof of the Main Result
We start with investigating the relationship of the values of $H$ on $\Delta_2$ and on $\Delta_3$ for $q > 0$.
Lemma 1. Let $q > 0$ and let $H: \Delta \to \mathbb{R}$ satisfy (GS4'). Then for all two- and three-dimensional stochastic vectors the identity (2) follows, in particular (3) in the case $q = 1$ and (4) in the case $q \neq 1$. Moreover, (5) holds.
Proof. First of all, note that (5) is an immediate consequence of (GS4'). Further, two different applications of (GS4') to a three-dimensional stochastic vector provide two expressions for its value under $H$. Therefore the asserted identities hold if some component vanishes, and since one similarly obtains the remaining degenerate cases, we can assume in the following that all components are positive. Applying (GS4') three times, one obtains Equation (6), and in the same way Equation (7). Solving (7) for the common term and substituting it in (6) provides an identity which is equal to (2). Statements (3) and (4) follow immediately from Equation (2). ☐
In the case $q = 1$, condition (GS4') implies that the order of the components of a stochastic vector does not make a difference for $H$:
Lemma 2. Let $H: \Delta \to \mathbb{R}$ satisfy (GS4') for $q = 1$. Then $H$ is permutation-invariant, meaning that $H(p_1, \ldots, p_n) = H(p_{\pi(1)}, \ldots, p_{\pi(n)})$ for each $(p_1, \ldots, p_n) \in \Delta_n$ and each permutation $\pi$ of $\{1, \ldots, n\}$.
Proof. For $n = 2, 3$ this has been shown in Lemma 1 (see (3)), for $n > 3$ it follows directly from (GS4') and from Lemma 1. ☐
The following lemma provides the substantial part of the proof of Theorem 1.
Lemma 3. For $H: \Delta \to \mathbb{R}$ satisfying (GS4') with $q \neq 1$, the following holds:
- (i) If $q > 1$, then (1) holds for all stochastic vectors in $\Delta_2 \cup \Delta_3$.
- (ii) If $0 < q < 1$, then the following statements are equivalent:
  - (a) (1) holds for all stochastic vectors in $\Delta_2 \cup \Delta_3$,
  - (b) $H$ is symmetric on $\Delta_2 \cup \Delta_3$, meaning that the value of $H$ does not change under permutation of the components,
  - (c) $H$ is continuous on $\Delta_2 \cup \Delta_3$,
  - (d) $H$ is bounded on $\Delta_2 \cup \Delta_3$,
  - (e) $H$ is nonnegative or nonpositive on $\Delta_2 \cup \Delta_3$.
Proof. We first show (i). Let $q > 1$ and let a stochastic vector in $\Delta_2 \cup \Delta_3$ be given. Changing the roles of two of its components in (2), by Lemma 1 one obtains Equation (8). Moreover, one easily sees that (2) transforms to Equation (9). Then (8) and (9) provide Equation (10). Since the coefficient appearing in (10) does not vanish for $q > 1$, statement (i) follows.
In order to show (ii), let $0 < q < 1$ and define auxiliary maps $f$ and $g$ in terms of the values of $H$ on two-dimensional stochastic vectors.
By (4) in Lemma 1, statement (a) is equivalent both to (b) and to a corresponding identity between $f$ and $g$ holding for all arguments. Statement (c) implies (d) by compactness of $\Delta_2$ and $\Delta_3$, and the validity of the implications (a) ⇒ (c) and (a) ⇒ (e) is obvious.
From the relations obtained so far one gets, for the arguments considered, a first identity and by induction Equation (11). Furthermore, $f$ maps the interval under consideration onto an interval, and since this is the case for all admissible arguments, the following holds: Equation (12). Moreover, applying (10) yields Equation (13). Assuming (d), by use of (11), (12) and (13) one obtains the representation required for (a) for all arguments, hence (a). If (e) is valid, then by (4) in Lemma 1 one obtains a uniform bound for $H$ on $\Delta_2$, providing (d). By what has already been shown, (a), (b), (c), (d) and (e) are equivalent. ☐
Now we are able to complete the proof of Theorem 1. Assuming (GS4'), we first show (1) for $q > 1$, and for $H$ bounded and $0 < q < 1$. This provides statement (i) and, together with Lemma 3 (ii), statement (ii) of Theorem 1.
Statement (1) is valid for all stochastic vectors in $\Delta_2 \cup \Delta_3$ by Lemma 3. In order to prove it for stochastic vectors of arbitrary dimension, we use induction. Assuming validity of (1) for all stochastic vectors in $\Delta_k$ with $k \leq n$, where $n \geq 3$, let $(p_1, \ldots, p_{n+1}) \in \Delta_{n+1}$. Choose some grouping of its components admissible for (GS4') with all resulting vectors of dimension at most $n$. Then by (GS4') and Lemma 3 we can express $H(p_1, \ldots, p_{n+1})$ through values of $H$ on lower-dimensional stochastic vectors for which (1) is already established, and the resulting expression again has the form (1). So (1) holds for all stochastic vectors in $\Delta_{n+1}$.
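The bridge from shorter to longer stochastic vectors can be illustrated with the Tsallis entropy itself: merging two components and adding the $q$-weighted entropy of their conditional distribution reproduces the entropy of the longer vector. The following sketch (a special case of the generalized Shannon additivity, not necessarily the exact form of (GS4')) reduces a five-dimensional vector step by step to shorter ones:

```python
import numpy as np

def tsallis_entropy(p, q):
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def tsallis_via_grouping(p, q):
    """Evaluate S_q recursively: merge the first two components (assumed to have
    positive sum), add the q-weighted entropy of their conditional distribution,
    and recurse on the shorter vector."""
    p = np.asarray(p, dtype=float)
    if len(p) <= 3:
        return tsallis_entropy(p, q)           # base case handled directly
    s = p[0] + p[1]
    coarse = np.concatenate(([s], p[2:]))      # vector with the first two components merged
    return tsallis_via_grouping(coarse, q) + s ** q * tsallis_entropy([p[0] / s, p[1] / s], q)

q = 1.7
p = [0.1, 0.25, 0.05, 0.4, 0.2]
print(np.isclose(tsallis_via_grouping(p, q), tsallis_entropy(p, q)))   # True
```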
In order to see (iii), recall a result of Diderrich [15] stating that $H$ is a multiple of the Shannon entropy if $H$ is bounded on $\Delta_2$, permutation-invariant, and satisfies
$$H(p_1, p_2, p_3, \ldots, p_n) = H(p_1 + p_2, p_3, \ldots, p_n) + (p_1 + p_2)\, H\Bigl(\frac{p_1}{p_1 + p_2}, \frac{p_2}{p_1 + p_2}\Bigr)$$
whenever $p_1 + p_2 > 0$, which is weaker than (GS4') with $q = 1$. Since under (GS4') the map $H$ is permutation-invariant by Lemma 2, Diderrich's axioms are satisfied, and we are done.
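For $q = 1$, the recursion quoted from Diderrich's characterization is easily checked numerically for the Shannon entropy; a minimal sketch:

```python
import numpy as np

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

p = np.array([0.2, 0.3, 0.1, 0.4])
s = p[0] + p[1]
lhs = shannon_entropy(p)
rhs = shannon_entropy(np.concatenate(([s], p[2:]))) + s * shannon_entropy([p[0] / s, p[1] / s])
print(np.isclose(lhs, rhs))   # True: the Shannon entropy satisfies the recursion
```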
3. Further Discussion
Our discussion suggests that the case $q \leq 1$ is more complicated than the general one. In order to get some further insights, particularly in the case $0 < q < 1$, let us consider only rational stochastic vectors. So in the following let $\Delta_n^{\mathbb{Q}} := \Delta_n \cap \mathbb{Q}^n$ for $n \in \mathbb{N}$ and $\Delta^{\mathbb{Q}} := \bigcup_{n \in \mathbb{N}} \Delta_n^{\mathbb{Q}}$, with $\mathbb{Q}$ being the rationals. The following proposition states that for $q \neq 1$ the 'rational' generalized Shannon additivity principally provides the Tsallis entropy on the rationals, which particularly provides a proof of the implication (c) ⇒ (a) in Theorem 1 (ii).
Proposition 1. Let $H: \Delta^{\mathbb{Q}} \to \mathbb{R}$ be given with (S4) for $\Delta^{\mathbb{Q}}$ instead of $\Delta$ and $q \neq 1$. Then it holds that $H$ coincides on $\Delta^{\mathbb{Q}}$ with a multiple of the Tsallis entropy $S_q$.
Proof. For the uniform vectors $(\tfrac{1}{nm}, \ldots, \tfrac{1}{nm})$ with $n, m \in \mathbb{N}$, we get from axiom (S4) a relation between $H(\tfrac{1}{nm}, \ldots, \tfrac{1}{nm})$, $H(\tfrac{1}{n}, \ldots, \tfrac{1}{n})$ and $H(\tfrac{1}{m}, \ldots, \tfrac{1}{m})$, implying that the values of $H$ on uniform vectors are determined up to a multiplicative constant. Now consider any rational vector $(\tfrac{k_1}{m}, \ldots, \tfrac{k_n}{m})$ with $k_i \in \mathbb{N}$ for $i = 1, \ldots, n$ satisfying $\sum_{i=1}^{n} k_i = m$. With (S4) we get its value under $H$ from the values on uniform vectors, which yields the asserted representation. ☐
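Both steps of the argument can be mirrored numerically for the Tsallis entropy itself, assuming the block-wise additivity with weights $p_i^{q}$: first the relation between the uniform vectors of dimensions $n$, $m$ and $nm$, then the reduction of a rational vector $(\tfrac{k_1}{m}, \ldots, \tfrac{k_n}{m})$ to uniform vectors:

```python
import numpy as np

def tsallis_entropy(p, q):
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def uniform(n):
    return np.full(n, 1.0 / n)

q, n, m = 0.6, 3, 4

# Step 1: grouping the uniform nm-vector into n blocks of size m gives
#   S_q(u_{nm}) = S_q(u_n) + n * (1/n)^q * S_q(u_m).
lhs = tsallis_entropy(uniform(n * m), q)
rhs = tsallis_entropy(uniform(n), q) + n * (1.0 / n) ** q * tsallis_entropy(uniform(m), q)
print(np.isclose(lhs, rhs))   # True

# Step 2: the uniform M-vector grouped into blocks of sizes k_1, ..., k_n determines
# the rational vector (k_1/M, ..., k_n/M):
#   S_q(k_1/M, ..., k_n/M) = S_q(u_M) - sum_i (k_i/M)^q * S_q(u_{k_i}).
k = [1, 2, 5]
M = sum(k)
lhs = tsallis_entropy([ki / M for ki in k], q)
rhs = tsallis_entropy(uniform(M), q) - sum((ki / M) ** q * tsallis_entropy(uniform(ki), q)
                                           for ki in k)
print(np.isclose(lhs, rhs))   # True
```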
Let us finally compare (ii) and (iii) in Theorem 1 and ask for the role of (c), (d) and (e) of (ii) in (iii). Symmetry is already given by Lemma 2 when only (S4) is satisfied; (S4) and nonnegativity are not sufficient for characterizing the Shannon entropy, as shown in [17]. To our knowledge, there is no proof that (S4) and continuity are enough, but (S4) and analyticity do work. Showing the latter, an argument reducing everything to the rationals, as above, has been used in [18].
We want to conclude with the open problem of whether the further assumptions for $0 < q < 1$ in Theorem 1 are necessary.
Problem 1. Is (1) in Theorem 1 also valid for $0 < q < 1$?